Azure AI-900
True or False: pre-built form recognizer models are only in English
True
OCR
Optical character recognition, used to recognize individual shapes as letters, numerals, punctuation or other elements of text
Read API returns a hierarchy of information that includes:
Pages: one for each page of text, including information about the page size and orientation. Lines: the lines of text on a page. Words: the words in a line of text. Each line and word includes bounding box coordinates indicating its position on the page.
Recall
Percentage of the actual instances of a class that the model correctly identified. For example, if there are 10 images of apples and the model found 7 of them, then recall is 70%
What are the translator text optional configurations?
Profanity filtering and selective translation
What information does the client application need to use a published custom vision model?
Project ID, Model name, Prediction endpoint, Prediction key
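As an illustration, a client could call the published model over REST. The endpoint, key, project ID and model name below are placeholders, and the URL shape is assumed from the Custom Vision 3.0 prediction API:

```python
import requests

# All four values below are placeholders for illustration
prediction_endpoint = "https://<your-resource>.cognitiveservices.azure.com"
prediction_key = "<prediction-key>"
project_id = "<project-id>"
model_name = "<published-model-name>"  # the published iteration name

# URL shape assumed from the Custom Vision 3.0 prediction REST API
url = (f"{prediction_endpoint}/customvision/v3.0/Prediction/"
       f"{project_id}/classify/iterations/{model_name}/image")

with open("test-image.jpg", "rb") as image_file:
    response = requests.post(
        url,
        headers={"Prediction-Key": prediction_key,
                 "Content-Type": "application/octet-stream"},
        data=image_file)

# Each prediction contains a tag name and a probability score
for prediction in response.json()["predictions"]:
    print(prediction["tagName"], prediction["probability"])
```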
What service do you use for image classification?
Custom Vision
Speech Synthesis
The ability to generate spoken output
Data and Compute Management
Cloud-based data storage and compute resources that professional data scientists can use to run data experiment code at scale
If you pass the "en" language code for French text, what will the sentiment score be?
0.5
What is the 3 step process to use the Read API?
1. Submit an image to the API and retrieve an operation ID in response. 2. Use the operation ID to check the status of the image analysis operation and wait until it has completed. 3. Retrieve the results of the operation.
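A minimal sketch of that flow with the requests library, assuming placeholder endpoint/key values and the v3.2 Read REST URL:

```python
import time
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder
headers = {"Ocp-Apim-Subscription-Key": key,
           "Content-Type": "application/octet-stream"}

# 1. Submit the image; the operation URL comes back in a response header
with open("document.jpg", "rb") as f:
    submit = requests.post(f"{endpoint}/vision/v3.2/read/analyze",
                           headers=headers, data=f)
operation_url = submit.headers["Operation-Location"]

# 2. Poll the operation until the asynchronous analysis has completed
while True:
    result = requests.get(operation_url,
                          headers={"Ocp-Apim-Subscription-Key": key}).json()
    if result["status"] in ("succeeded", "failed"):
        break
    time.sleep(1)

# 3. Retrieve the results: pages -> lines -> words
if result["status"] == "succeeded":
    for page in result["analyzeResult"]["readResults"]:
        for line in page["lines"]:
            print(line["text"])
```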
How many pages will be processed for free tier form recognizer?
2 (with the free tier, only the first two pages of a document are processed)
What resource do you need to use Form Recognizer?
A Form Recognizer resource
What models do software use to recognize speech?
An acoustic model (converts audio signals into phonemes) and a language model (maps phonemes to words, usually using a statistical algorithm that predicts the most probable sequence of words based on the phonemes)
Prediction/Forecasting Workloads
Automated Machine Learning, Azure Machine Learning Designer, Data and Compute Management, Pipelines
AP
Average precision, an overall metric that takes into account both precision and recall
What information does an object detection model return?
Class of object identified, probability score of the object classification, coordinates of a bounding box for each object
What resources can you use for OCR?
Computer Vision or Cognitive Services
What resources do you need to create an object detection solution?
Custom Vision (can be either a training or prediction resource) and/or Cognitive Services (can be used for training, prediction, or both)
Form recognizer supports automated document processing through:
Custom models trained with your own forms, and a pre-built receipt model
Which cognitive services can you use for face detection?
Computer Vision, Video Indexer and the Face service
Pipelines
Data scientists, software engineers, and IT operations professionals can define pipelines to orchestrate model training, deployment, and management tasks
Text Analytics uses pre-trained models that can:
Determine the language of a document or text; perform sentiment analysis on text to determine if it's positive or negative; extract key phrases from text that might indicate its main talking points; identify and categorize entities in the text (entities can be people, places, organizations, or even everyday items such as dates, times, quantities and so on)
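A rough sketch of calling two of these capabilities (language detection and sentiment analysis) through the v3.0 Text Analytics REST endpoints; the resource endpoint and key are placeholders:

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder
headers = {"Ocp-Apim-Subscription-Key": key}
documents = {"documents": [{"id": "1", "text": "The hotel was wonderful"}]}

# Language detection: returns the language name, ISO 639-1 code and a confidence score
detected = requests.post(f"{endpoint}/text/analytics/v3.0/languages",
                         headers=headers, json=documents).json()
print(detected["documents"][0]["detectedLanguage"])

# Sentiment analysis: scores near 1 indicate positive sentiment, near 0 negative
sentiment = requests.post(f"{endpoint}/text/analytics/v3.0/sentiment",
                          headers=headers, json=documents).json()
print(sentiment["documents"][0]["sentiment"])
```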
List
Entities that are defined as a hierarchy of lists and sublists. For example, a device list might include sublists for light and fan. You can specify synonyms, such as lamp for light
RegEx Entity
Entities that are defined as a regular expression that describes a pattern - for example, you might define a pattern like [0-9]{3}-[0-9]{3}-[0-9]{4} for telephone numbers
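To illustrate what that pattern matches, here is a quick local check with Python's re module (LUIS evaluates the expression on the service side):

```python
import re

# Same pattern as above: three digits, three digits, four digits, separated by hyphens
phone_pattern = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

print(bool(phone_pattern.search("Call me at 555-123-4567")))  # True
print(bool(phone_pattern.search("Call me tomorrow")))          # False
```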
What are some uses of object detection?
Evaluating the safety of a building by looking for fire extinguishers or other emergency equipment, creating software for self-driving cars or vehicles with lane assist capabilities, medical imaging such as an MRI or x-rays that can detect known objects for medical diagnosis
What resources can be used for Face Service?
Face and/or Cognitive Services
Face Service supports the following functionality:
Face detection, face verification, find similar faces, group faces based on similarities, identify people
Read API
Used for documents with a lot of text. It can automatically determine the proper recognition model to use, taking into consideration the lines of text, and supports images with printed text as well as handwriting recognition
Azure Machine Learning Designer
GUI enabling no-code development of machine learning solutions
What's the difference between object detection and image classification?
Image classification is a machine learning based form of computer vision in which a model is trained to categorize images based on the primary subject matter they contain. Object detection goes further than this to classify individual objects within the image, and to return the coordinates of a bounding box that indicates the object's location
Computer Vision models
Image classification, Object detection, Semantic segmentation, Image analysis, Face detection, analysis and recognition, Optical character recognition
What computer vision APIs can read text?
OCR and Read
Form Recognizer file requirements
JPEG, PNG, BMP, PDF or TIFF format; less than 20 MB; image size from 50x50 up to 10000x10000 pixels; PDF documents up to 17in x 17in
Image requirements for Face Service
JPEG, PNG, GIF or BMP format; 4 MB or smaller; 36x36 to 4096x4096 pixels
What does the Text Analytics service return when used for language detection?
The language name, the ISO 639-1 language code, and a score indicating a level of confidence in the language detection
Anomaly Detection
ML based technique that analyzes data over time and identifies unusual changes. Azure has an Anomaly Detector service
What are the four types of entities?
Machine-learned, List, RegEx and Pattern.any
How many languages does text translator support?
More than 60
If text analytics returns an unknown language name, what is the score?
NaN
Does the order in which you create entities and intents matter?
No
What are the uses of OCR?
Note taking, digitizing forms such as medical records or historical documents, and scanning printed or handwritten checks for bank deposits. The OCR API is designed for quick extraction of small amounts of text in images, operates synchronously to provide immediate results, and can recognize text in numerous languages
What types of transcription does speech-to-text allow?
Real-time (the application listens to a microphone or other audio input source, such as an audio file) and batch (audio recordings from a file share, remote server or Azure Storage)
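A minimal real-time transcription sketch with the Speech SDK for Python; the key, region and file name are placeholders, and the exact class names should be checked against the SDK docs:

```python
import azure.cognitiveservices.speech as speechsdk

# Placeholder credentials for a Speech (or Cognitive Services) resource
speech_config = speechsdk.SpeechConfig(subscription="<key>", region="<region>")

# Real-time transcription from the default microphone
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
result = recognizer.recognize_once()
print(result.text)

# For file-based input, an AudioConfig pointing at a WAV file can be supplied instead
audio_config = speechsdk.AudioConfig(filename="recording.wav")  # assumed helper class
file_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                             audio_config=audio_config)
print(file_recognizer.recognize_once().text)
```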
OCR API returns a hierarchy of information that consists of:
Regions in the image that contain text, Lines of text in each region, Words in each line of text
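For example, a client might walk that hierarchy as follows (placeholder endpoint and key; the v3.1 OCR URL is an assumption):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder

with open("sign.jpg", "rb") as f:
    ocr_result = requests.post(
        f"{endpoint}/vision/v3.1/ocr",
        headers={"Ocp-Apim-Subscription-Key": key,
                 "Content-Type": "application/octet-stream"},
        data=f).json()

# Walk the returned hierarchy: regions -> lines -> words
for region in ocr_result.get("regions", []):
    for line in region["lines"]:
        print(" ".join(word["text"] for word in line["words"]))
```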
Intent
Represents the purpose or goal expressed in a user's utterance. Ex. the intent for "Switch the fan on" is TurnOn
Uses of face detection and analysis
Security, Social media, Intelligent monitoring, Advertising, Missing persons, Identity validation
You are translating a document and don't want a brand name to be translated, what translator text configuration do you use?
Selective translation
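A hedged sketch of how this might look with the Translator REST API: set textType=html and wrap the brand name (a made-up example) in a notranslate span. The key and region values are placeholders:

```python
import requests

# Placeholder key and region for a Translator (or Cognitive Services) resource
key, region = "<subscription-key>", "<resource-region>"

# textType=html lets you mark content that should not be translated
params = {"api-version": "3.0", "from": "en", "to": "fr", "textType": "html"}
body = [{"Text": 'Buy the new <span class="notranslate">Contoso GlowMax</span> lamp today'}]

response = requests.post(
    "https://api.cognitive.microsofttranslator.com/translate",
    params=params,
    headers={"Ocp-Apim-Subscription-Key": key,
             "Ocp-Apim-Subscription-Region": region},
    json=body)

# The brand name inside the notranslate span should be preserved as-is
print(response.json()[0]["translations"][0]["text"])
```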
What APIs does the Speech service include?
Speech-to-text Text-to-speech Speech Translation
What APIs does the Speech cognitive service include?
Speech-to-text and Text-to-speech
Does the Read API work synchronously or asynchronously?
The Read API works asynchronously so as not to block your application while it is reading the content and returning results to your application
True or False: When using translator text you can specify one from language and multiple to languages to simultaneously translate a document into multiple languages
True
What are the 3 main tasks for creating an object detection solution with custom vision?
Uploading and tagging images, training the model, and publishing the model
When using LUIS, how do you handle utterances that do not map any of the utterances you entered?
Use the None intent to provide a generic response to users when their requests don't match any other intent
What are the 3 core concepts you need to take into account to work with LUIS?
Utterances, entities and intents
How are custom vision models evaluated?
With precision, recall and AP
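As a worked example with illustrative numbers only:

```python
# Illustrative counts for one class ("orange") in an evaluation set
true_positives = 8    # images labelled orange that really are oranges
false_positives = 2   # images labelled orange that are something else
false_negatives = 3   # real oranges the model failed to label

# Precision: of the predictions the model made, how many were correct?
precision = true_positives / (true_positives + false_positives)   # 8/10 = 0.8

# Recall: of the actual instances of the class, how many did the model find?
recall = true_positives / (true_positives + false_negatives)      # 8/11 ~= 0.73

print(f"precision={precision:.2f}, recall={recall:.2f}")
# AP (average precision) summarises precision and recall across probability thresholds
```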
What resources can you use for the Speech Service?
You can use either Speech or Cognitive Services
What Azure resources can you use for the Text Analytics service?
You can use either Text Analytics or Cognitive Services
What resources can be used for custom vision?
You can use either Custom Vision or Cognitive Services
What resources do you need to use LUIS?
You need a Language Understanding resource (can be either an authoring or prediction resource) and/or a Cognitive Services resource (can only be used for prediction)
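A sketch of calling a published LUIS app from a client, assuming the v3.0 prediction REST URL and placeholder resource values:

```python
import requests

# Placeholder values for a Language Understanding prediction resource
prediction_endpoint = "https://<your-resource>.cognitiveservices.azure.com"
prediction_key = "<prediction-key>"
app_id = "<luis-app-id>"

# URL shape assumed from the LUIS v3.0 prediction REST API
url = (f"{prediction_endpoint}/luis/prediction/v3.0/apps/{app_id}/"
       "slots/production/predict")
response = requests.get(url, params={"subscription-key": prediction_key,
                                     "query": "switch on the fan"})

result = response.json()["prediction"]
print(result["topIntent"])   # e.g. TurnOn
print(result["entities"])    # e.g. the 'fan' device entity
```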
Entity
an item to which an utterance refers. For example fan in the following utterance "switch on the fan"
What scenarios is the speech-to-text API optimized for?
conversational and dictation
Authoring
creating entities, intents and utterances to train your LUIS model
Automated Machine Learning
enables non-experts to quickly create an effective ML model from data
Computer Vision
enables software engineers to create intelligent solutions that extract information from images; a common task in many artificial intelligence (AI) scenarios
Speech
enables speech to text and speech to speech translation
Machine-Learned entity
entities that are learned by your model during training from context in the sample utterances
Key Phrase Extraction
evaluating the text of a document and identifying the main talking points
What is speech synthesis used for?
Generating spoken responses to user input, creating voice menus for telephone systems, reading email aloud, and broadcasting announcements
What are key considerations for tagging training images?
Having sufficient images taken from multiple angles, and making sure the bounding boxes are tightly defined
Face detection
involves identifying regions of an image that contain a human face, typically by returning bounding box coordinates that form a rectangle around the face
What is an entity?
An item of a particular type or category
What do you need to create client application with form recognizer?
key and endpoint
What information do you need from your resources used for OCR?
key and endpoint
What information do you need to use Face Service?
key and endpoint
Facial recognition
machine learning model used to identify known individuals from their facial features
Form recognizer combines OCR with predictive models that can interpret form data by:
Matching field names to values, processing tables of data, and identifying specific types of fields, such as dates, telephone numbers, addresses, totals and others
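As an illustration, a client might submit a receipt to the pre-built receipt model like this (placeholder endpoint and key; URL shape assumed from the v2.0 Form Recognizer REST API, and newer API versions differ):

```python
import time
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder
headers = {"Ocp-Apim-Subscription-Key": key,
           "Content-Type": "application/octet-stream"}

# Submit a receipt image to the pre-built receipt model
with open("receipt.jpg", "rb") as f:
    submit = requests.post(
        f"{endpoint}/formrecognizer/v2.0/prebuilt/receipt/analyze",
        headers=headers, data=f)
result_url = submit.headers["Operation-Location"]

# Analysis runs asynchronously, so poll for the result
while True:
    analysis = requests.get(result_url,
                            headers={"Ocp-Apim-Subscription-Key": key}).json()
    if analysis["status"] in ("succeeded", "failed"):
        break
    time.sleep(1)

# Extracted fields such as MerchantName, TransactionDate and Total
fields = analysis["analyzeResult"]["documentResults"][0]["fields"]
for name, value in fields.items():
    print(name, value.get("text"))
```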
Face analysis
Moves beyond simple face detection; some algorithms can also return other information, such as facial landmarks (nose, eyes, eyebrows, lips). These can be used as features to train a machine learning model from which you can infer information about a person, such as their gender, age or emotional state.
Face Service
offers pre-built algorithms that can detect, recognize and analyze faces.
How are batch transcription jobs scheduled?
on a best effort basis
How many languages does the Speech service support?
over 60
Speech Translation API
part of the speech service used to translate speech in one language to text or speech in another
Precision
Percentage of the class predictions made by the model that were correct. Ex. if the model predicted that 10 images are oranges, of which 8 were actually oranges, then precision is 80%
If text analytics return a sentiment score of 1, is the sentiment positive or negative?
positive
What is speech recognition used for?
Providing closed captions, creating transcripts, automated note dictation, and determining intended user input for further processing
Form Recognizer
service in Azure that provides intelligent form processing capabilities that you can use to automate the processing of data in documents such as forms, invoices and receipts.
Utterance
something a user might say and your application must interpret. Ex "Switch on the fan"
Translator text
supports text to text translation
What is entity linking?
Text Analytics will provide a link to a specific reference, e.g. a Wikipedia link for the city of Seattle when it is mentioned
Speech Recognition
the ability to detect and interpret spoken input
When you are using translator text you must specify:
the language you are translating to and from using the ISO 639-1 code
Face Service can return:
The rectangle coordinates for any human faces found in an image, as well as a series of attributes related to those faces, such as: head pose, gender, age, emotion, whether there is facial hair, whether the person is wearing glasses, whether makeup is applied, whether the person is smiling, blur, exposure, noise, and occlusion
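For example, a detection call that requests some of those attributes might look like this (placeholder endpoint and key; attribute availability varies by API version):

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder

params = {"returnFaceAttributes": "age,emotion,glasses,facialHair,smile"}
with open("people.jpg", "rb") as f:
    faces = requests.post(
        f"{endpoint}/face/v1.0/detect",
        params=params,
        headers={"Ocp-Apim-Subscription-Key": key,
                 "Content-Type": "application/octet-stream"},
        data=f).json()

# Each detected face includes a bounding rectangle and the requested attributes
for face in faces:
    print(face["faceRectangle"], face["faceAttributes"])
```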
Object Detection
trained to classify individual objects within an image and identify their location with a bounding box
Image classification
trains ML model to classify images based on their contents. Ex. classification can be used in a traffic monitoring solution to classify images based on the type of vehicle they contain.
True or false: Batch transcription should be run in an asynchronous manner
true
What are the type and subtype of the entity "6"?
Type: Quantity; Subtype: Number
Video Indexer
used to detect and identify faces in a video
Speech-to-text
Used to transcribe speech from an audio source to text format (by contrast, Text-to-speech is used to generate spoken audio from a text source, and Speech Translation is used to translate speech in one language to text or speech in another)
What are speech synthesis neural voices?
Voices that leverage neural networks to overcome common limitations in speech synthesis with regard to intonation, resulting in a more natural-sounding voice
What resource do you need to use the translator text and speech services?
You can use the dedicated Translator Text and Speech resources, or a Cognitive Services resource