AI-900 Vocabulary
Classification models
Predicts which class the label belongs to, either binary (dog or cat) or multi-class (dog, cat, or rabbit), based on incoming data (features); Supervised Learning; Is this A or B? Predicts discrete values.
Clustering Models
Predicts which data points belong to which cluster or group. There is no prior knowledge about the data clusters or groups that can be used for prediction; this algorithm type learns common cluster properties first, then calculates the cluster membership probability for each data point; Unsupervised Learning; How is this organized?
Authoring
Process of creating and training a language understanding model. You supply the entities, intents, and utterances.
Regression models
Produces a numeric value prediction for the label, like a game score or a stock price; Supervised Learning; How much or how many? Predicts continuous values.
Computer Vision: Azure Face Service
Provides AI algorithms that detect, recognize, and analyze human faces in images. Face detection, Face verification, Face identification, Find Similar.
NLP: Key Phrase Extraction
Quickly identify the main concepts in text. For example, in the text "The food was delicious and there were wonderful staff", Key Phrase Extraction will return the main talking points: "food" and "wonderful staff". Text analytics (Language) API. It helps to extract the key phrases from the unstructured text. This functionality is beneficial when you need to create a summary or a catalog from the document content or understand the customer reviews' key points.
Recall or True Positive Rate
Classification model analysis that measures the fraction of all relevant (actually positive) instances that the model retrieved. Recall = TP/(TP+FN); defines the percentage of actual class members that the model identifies correctly. For example, if there are ten apple images, and the model identifies only eight, the model recall is 80%.
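Combined Evaluation
Clustering model analysis that combines the per-cluster metrics Average Distance to Other Center (ADOC), Average Distance to Cluster Center (ADCC), Number of Points (NoP), and Maximal Distance to Cluster Center (MDCC) into the combined model evaluation metric.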
Average Distance to Cluster Center
Clustering model analysis that reflects the average distance from each data point in a cluster to the centroid of that cluster.
Maximal Distance to Cluster Center
Clustering model analysis that reflects the cluster spread. The metric's value is the maximum of the distances from each point in the cluster to the cluster's centroid.
Computer Vision: Custom Vision Service
Lets you build, deploy, and improve your own image classifiers. An image classifier is an AI service that applies labels (which represent classes) to images, based on their visual characteristics. As with any classification model, training requires a set of images for each known class or category. The Custom Vision service relies on deep learning techniques that use convolutional neural networks (CNNs). A CNN links pixels to the classes or categories.
Coefficient of Determination
Regression model analysis that reflects the model performance: the closer R2 is to 1, the better the model fits the data.
Root Mean Squared Error (RMSE)
Regression model analysis that represents the square root of the mean of the squared errors between predicted and actual values.
Utterance
The user's input that your model needs to interpret.
Pipelines
visual designer for creating ML tasks workflow
Supervised Learning
Makes predictions based on information from previous outcomes (labeled data); relies on structured data where the input columns or fields are called features, and the output is the label or labels; Regression and Classification.
Object detection
Identifies objects and their boundaries within the image; places each recognizable object in a bounding box with the class name and probability score.
NLP: QnA Maker
A cloud-based Natural Language Processing (NLP) service that allows you to create a natural conversational layer over your data. It is used to find the most appropriate answer for any input from your custom knowledge base (KB) of information. Doesn't store customer data.
NLP: Language Understanding (LUIS)
A cloud-based conversational AI service that applies custom machine-learning intelligence to a user's conversational, natural language text to predict overall meaning, and pull out relevant, detailed information. LUIS provides access through its custom portal, APIs and SDK client libraries. Language API.
NLP: Translator
A cloud-based machine translation service you can use to translate text in near real-time through a simple REST API call. The service uses modern neural machine translation technology and offers statistical machine translation technology. Custom Translator is an extension of Translator, which allows you to build neural translation systems. The customized translation system can be used to translate text with Translator or Microsoft Speech Services.
Experiment
A collection of runs or trials in Azure ML.
Feature
A feature is an input variable: the x variable in simple linear regression. A simple machine learning project might use a single feature, while a more sophisticated machine learning project could use millions of features, specified as: x1, x2, ..., xn
Intent
Action or task the user wants to execute.
Confusion Matrix
Also known as the Error Matrix; provides a tabulated view of predicted and actual values for each class. It is usually used as a performance assessment for Classification models, but it can also be used for fast visualization of Clustering model results.
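A minimal sketch of how such a table can be tabulated, using only the Python standard library. The function name confusion_matrix and the cat/dog/rabbit sample labels are illustrative assumptions, not part of any Azure API.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Return rows of counts: one row per actual class, one column per predicted class."""
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in classes] for a in classes]

# Hypothetical sample data for illustration only.
classes = ["cat", "dog", "rabbit"]
actual = ["dog", "cat", "cat", "rabbit", "dog", "cat"]
predicted = ["dog", "cat", "dog", "rabbit", "dog", "rabbit"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    # Each row shows how the items of one actual class were predicted.
    print(f"actual {label:<6} -> {row}")
```

Counts on the diagonal are correct predictions; everything off the diagonal is an error.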
Examples
An example is a particular instance of data, x, where x is a vector of feature values. Examples fall into two categories: a. labeled examples; b. unlabeled examples.
Image classification
Analyzes images and videos, detects objects and text, extracts descriptions, and creates tags.
Azure Cognitive Services
Azure Cognitive Services are cloud-based services with REST APIs and client library SDKs available to help you build cognitive intelligence into your applications.
NLP: Named Entity Recognition
Can identify and categorize entities in your text as people, places, organizations, and quantities. Well-known entities are also recognized and linked to more information on the web. Text analytics (Language) API.
NLP: Language Detection
Can detect the language an input text is written in and report a single language code for every document submitted on the request in a wide range of languages, variants, dialects, and some regional/cultural languages. The language code is paired with a confidence score. Text analytics (Language) API.
F1 Score
Classification model analysis that combines Precision and Recall as their harmonic mean: F1 = 2 x Precision x Recall / (Precision + Recall).
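A minimal sketch of the usual F1 calculation as the harmonic mean of Precision and Recall. The function name f1_score and the sample values are assumptions chosen for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values: a balanced model and a lopsided one.
balanced = f1_score(precision=0.8, recall=0.8)
lopsided = f1_score(precision=1.0, recall=0.1)

print(f"balanced: {balanced:.3f}")
print(f"lopsided: {lopsided:.3f}")
```

The harmonic mean stays low when either input is low, so a model cannot earn a high F1 Score by maximizing only one of the two metrics.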
AUC
Classification model analysis that measures the area under the ROC curve, which plots the true positive rate on the y axis against the false positive rate on the x axis. This metric is useful because it provides a single number that lets you compare models of different types. It is classification-threshold-invariant: it measures the quality of the model's predictions irrespective of what classification threshold is chosen. One way of interpreting this metric is as the probability that the model ranks a random positive example more highly than a random negative example. A value of 0.5 corresponds to random guessing; if the AUC value is below 0.5, the model performance is worse than random. Ideally, the best-fitted model has a value of 1. Such an ideal model predicts all the values correctly.
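The ranking interpretation can be sketched directly: compare every positive example's score with every negative example's score and count how often the positive one ranks higher. This is a minimal illustration, not how any particular service computes the metric; the function name auc_by_ranking and the sample scores are assumptions.

```python
def auc_by_ranking(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical model scores for six examples (1 = positive, 0 = negative).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = auc_by_ranking(labels, scores)
print(f"AUC = {auc:.3f}")  # 8 of the 9 positive/negative pairs are ordered correctly
```

No classification threshold appears anywhere in the calculation, which is why the metric is threshold-invariant.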
Accuracy
Classification model analysis that measures the goodness of a classification model as the proportion of true results to total cases. Accuracy = (TP+TN)/(TP+TN+FP+FN); defines how many predictions (positive and negative) are actually predicted right.
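A minimal sketch of the accuracy calculation from the four confusion-matrix counts. The function name and the counts are hypothetical values chosen for illustration.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Proportion of all predictions (positive and negative) that were right."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined without any predictions")
    return (tp + tn) / total

# Hypothetical counts: 85 correct predictions out of 100 cases.
score = accuracy(tp=40, tn=45, fp=5, fn=10)
print(score)  # 0.85
```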
Precision
Classification model analysis that measures the proportion of true positive results over all positive predictions. Precision = TP/(TP+FP); defines the percentage of the model's class predictions that are correct. For example, if the model predicts ten images are bananas, and there are actually only seven bananas, the model precision is 70%.
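The banana example, together with the apple example from the Recall entry, can be checked with a short sketch. The function names are assumptions for illustration; the counts come from the two examples in this list.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP): how many positive predictions were right."""
    if tp + fp == 0:
        raise ValueError("precision is undefined when nothing was predicted positive")
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN): how many actual positives were found."""
    if tp + fn == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return tp / (tp + fn)

# Ten images predicted as bananas, only seven are bananas: TP = 7, FP = 3.
banana_precision = precision(tp=7, fp=3)

# Ten apple images, only eight identified: TP = 8, FN = 2.
apple_recall = recall(tp=8, fn=2)

print(banana_precision)  # 0.7
print(apple_recall)      # 0.8
```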
Azure Bot Service
Cloud-based platform for developing and managing bots.
Number of Points
Clustering model analysis that is the number of points assigned to each cluster.
Average Distance to Other Center
Clustering model analysis that reflects the average distance from each data point in a cluster to the centroids of all other clusters.
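A minimal sketch of the three distance-based clustering metrics in this list, assuming Euclidean distance and taking the maximal distance as the largest point-to-centroid distance. All function names and the two small clusters are hypothetical.

```python
import math

def centroid(points):
    """Mean position of the points, one coordinate at a time."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def average_distance_to_center(points, center):
    return sum(math.dist(p, center) for p in points) / len(points)

def maximal_distance_to_center(points, center):
    return max(math.dist(p, center) for p in points)

def average_distance_to_other_centers(points, other_centers):
    distances = [math.dist(p, c) for p in points for c in other_centers]
    return sum(distances) / len(distances)

# Two hypothetical clusters in two dimensions.
cluster_a = [(1, 0), (-1, 0), (0, 3), (0, -3)]
cluster_b = [(10, 0), (10, 2), (12, 0), (12, 2)]

center_a = centroid(cluster_a)  # (0.0, 0.0)
center_b = centroid(cluster_b)  # (11.0, 1.0)

own_avg = average_distance_to_center(cluster_a, center_a)
own_max = maximal_distance_to_center(cluster_a, center_a)
other_avg = average_distance_to_other_centers(cluster_a, [center_b])

print(f"average distance to own center: {own_avg}")
print(f"maximal distance to own center: {own_max}")
print(f"average distance to other center: {other_avg:.2f}")
```

A well-separated cluster shows a small average distance to its own center and a much larger average distance to the other centers.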
Machine Learning Tasks
Data Ingestion, Data preparation and transformation, Feature selection and engineering, Model training, Evaluation, Model deployment, Model management
Inference Clusters
Deployment targets for predictive services that use your trained models.
Compute Instances
Development workstations that data scientists can use to work with data and models.
Decision API: Anomaly Detector
Enables you to monitor and detect abnormalities in your time series data without having to know machine learning. This API's algorithms adapt by automatically identifying and applying the best-fitting models to your data, regardless of industry, scenario, or data volume. Using your time series data, the API determines boundaries for anomaly detection, expected values, and which data points are anomalies.
Data and Compute Management
cloud-based tools for data science professionals
Labeled Example
Includes both feature(s) and the label. That is: {features, label}: (x,y)
Unlabeled Example
Includes features but not the label: {features, ?}: (x, ?)
Average Precision (AP) = F1 Score
A combined metric of both Precision and Recall.
Attached Compute
Links to existing Azure compute resources, such as Virtual Machines or Azure Databricks clusters.
Relative Squared Error (RSE)
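Regression model analysis that is based on the square of the differences between predicted and true values, relative to the squared differences between the true values and their mean. The value is normally between 0 and 1, although it can exceed 1 when the model performs worse than always predicting the mean. The closer this value is to 0, the better the model performance. The relative nature of this metric helps to compare model performance for labels in different units.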
Relative Absolute Error (RAE)
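Regression model analysis that is based on the absolute differences between predicted and true values, relative to the absolute differences between the true values and their mean. The value is normally between 0 and 1, although it can exceed 1 when the model performs worse than always predicting the mean. The closer this value is to 0, the better the model performance. The relative nature of this metric helps to compare model performance for labels in different units.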
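A minimal sketch of both relative metrics, assuming the common definitions in which the model's errors are divided by the errors of a baseline that always predicts the mean of the true values. The function names and sample values are hypothetical. Under these definitions the Coefficient of Determination equals 1 - RSE.

```python
def relative_squared_error(actual, predicted):
    """Sum of squared errors divided by the squared deviations from the mean."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    baseline_error = sum((a - mean_actual) ** 2 for a in actual)
    return model_error / baseline_error  # undefined if all actual values are equal

def relative_absolute_error(actual, predicted):
    """Sum of absolute errors divided by the absolute deviations from the mean."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    baseline_error = sum(abs(a - mean_actual) for a in actual)
    return model_error / baseline_error

# Hypothetical true and predicted label values.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 6.0, 7.0]

rse = relative_squared_error(actual, predicted)   # 2 / 20
rae = relative_absolute_error(actual, predicted)  # 2 / 8
r_squared = 1 - rse

print(f"RSE = {rse:.2f}, RAE = {rae:.2f}, R2 = {r_squared:.2f}")
```

Because both the numerator and the denominator carry the label's units, the units cancel, which is what makes the metrics comparable across labels.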
Mean Absolute Error (MAE)
Regression model analysis that averages the absolute differences between predicted and actual values, producing a score that measures how close the predictions are to the actual values; the lower the score, the better the model performance.
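A minimal sketch comparing MAE with RMSE on the same predictions, using the standard formulas. The function names and the two sample predictions are hypothetical.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average of the squared errors."""
    mean_squared = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_squared)

# Hypothetical values: one small error (1) and one large error (7).
actual = [10.0, 20.0]
predicted = [11.0, 13.0]

mae = mean_absolute_error(actual, predicted)       # (1 + 7) / 2
rmse = root_mean_squared_error(actual, predicted)  # sqrt((1 + 49) / 2)

print(mae)   # 4.0
print(rmse)  # 5.0
```

RMSE comes out higher than MAE here because squaring gives the single large error more weight.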
Compute Clusters
Scalable clusters of virtual machines for on-demand processing of experiment code.
NLP: Text Analytics API
The Text Analytics API is a cloud-based service that provides Natural Language Processing (NLP) features for text mining and text analysis, including: sentiment analysis, opinion mining, key phrase extraction, language detection, and named entity recognition.
Computer Vision: Face Detection
The Detect API detects human faces in an image and returns the rectangle coordinates of their locations. Optionally, face detection can extract a series of face-related attributes, such as head pose, gender, age, emotion, facial hair, and glasses. These attributes are general predictions, not actual classifications.
Computer Vision: Find Similar
The Find Similar API does face matching between target face and a set of candidate faces, finding a smaller set of faces that look similar to the target face. This is useful for doing a face search by image. Two working modes, matchPerson and matchFace, are supported. The matchPerson mode returns similar faces after filtering for the same person by using the Verify API. The matchFace mode ignores the same-person filter.
Computer Vision: Face Grouping
The Group API divides a set of unknown faces into several groups based on similarity. Each group is a disjoint proper subset of the original set of faces. All of the faces in a group are likely to belong to the same person. There can be several different groups for a single person. The groups are differentiated by another factor, such as expression, for example.
Computer Vision: Face Identification
The Identify API also starts with Detection and answers the question, "Can this detected face be matched to any enrolled face in a database?" Because it resembles a face recognition search, it is also called "one-to-many" matching. Candidate matches are returned based on how closely the probe template with the detected face matches each of the enrolled templates.
Computer Vision: Face Verification
The Verify API builds on Detection and addresses the question, "Are these two images the same person?" Verification is also called "one-to-one" matching because the probe image is compared to only one enrolled template. Verification can be used in identity verification or access control scenarios to verify that a picture matches a previously captured image (such as a photo from a government-issued ID card).
NLP: Sentiment Analysis
The feature provides sentiment labels (such as "negative", "neutral" and "positive") based on the highest confidence score found by the service at the sentence and document level. This feature also returns confidence scores between 0 and 1 for each document and the sentences within it for positive, neutral and negative sentiment. You can also run the service on premises using a container. Text Analytics (Language) API.
Loss
The penalty for a bad prediction. This is a number indicating how bad the model's prediction was on a single example. If the model's prediction is perfect, the loss is zero; otherwise, the loss is greater. The goal of training a model is to find a set of weights and biases that have low loss, on average, across all examples.
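A minimal sketch of per-example loss and average loss, assuming squared error as the loss function. That choice is an assumption made for illustration (it is one common option for regression); the function names and sample values are hypothetical.

```python
def squared_loss(prediction: float, label: float) -> float:
    """Loss on a single example: zero when perfect, larger as the error grows."""
    return (prediction - label) ** 2

def mean_loss(predictions, labels):
    """Average loss across all examples, the quantity training tries to lower."""
    losses = [squared_loss(p, y) for p, y in zip(predictions, labels)]
    return sum(losses) / len(losses)

# Hypothetical predictions and labels: errors of 0, 1 and 3.
predictions = [2.0, 5.0, 9.0]
labels = [2.0, 4.0, 6.0]

perfect = squared_loss(predictions[0], labels[0])
average = mean_loss(predictions, labels)  # (0 + 1 + 9) / 3

print(perfect)            # 0.0
print(f"{average:.2f}")
```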
NLP: Speech APIs
The unification of speech-to-text, text-to-speech, and speech-translation into a single Azure subscription. It's easy to speech-enable your applications, tools, and devices with the Speech CLI, Speech SDK, Speech Devices SDK, Speech Studio, or REST APIs.
Computer Vision: Person
This data structure is a list of PersistedFace objects that belong to the same person. It has a unique ID, a name string, and optionally a user data string.
Computer Vision: FaceList or Large FaceList
This data structure is an assorted list of PersistedFace objects. A FaceList has a unique ID, a name string, and optionally a user data string.
Computer Vision: Person Group or Large Person Group
This data structure is an assorted list of Person objects. It has a unique ID, a name string, and optionally a user data string. A PersonGroup must be trained before it can be used in recognition operations.
Decision API: Personalizer
This is a cloud-based service that helps your applications choose the best content item to show your users. You can use the Personalizer service to determine what product to suggest to shoppers or to figure out the optimal position for an advertisement. After the content is shown to the user, your application monitors the user's reaction and reports a reward score back to the Personalizer service. This ensures continuous improvement of the machine learning model, and Personalizer's ability to select the best content item based on the contextual information it receives.
Decision API: Metrics Advisor
This is a part of Azure Cognitive Services that uses AI to perform data monitoring and anomaly detection in time series data. The service automates the process of applying models to your data, and provides a set of APIs and a web-based workspace for data ingestion, anomaly detection, and diagnostics, without needing to know machine learning. Developers can build AIOps, predictive maintenance, and business monitoring applications on top of the service. Analyze multi-dimensional data from multiple data sources; Identify and correlate anomalies; Configure and fine-tune the anomaly detection model used on your data; Diagnose anomalies and help with root cause analysis.
Decision API: Content Moderator
This is an AI service that lets you handle content that is potentially offensive, risky, or otherwise undesirable. It includes the AI-powered content moderation service which scans text, image, and videos and applies content flags automatically, as well as the Review tool, an online moderator environment for a team of human reviewers. API groups include: Text moderation, Custom term lists, Image moderation, Custom image lists, Video moderation.
NLP: Immersive Reader
This is an inclusively designed tool that implements proven techniques to improve reading comprehension for new readers, language learners, and people with learning differences such as dyslexia. With the Immersive Reader client library, you can leverage the same technology used in Microsoft Word and Microsoft OneNote to improve your web applications.
Computer Vision for Digital Asset Management
This is the business process of organizing, storing, and retrieving rich media assets and managing digital rights and permissions. For example, a company may want to group and identify images based on visible logos, faces, objects, colors, and so on. Or, you might want to automatically generate captions for images and attach keywords so they're searchable.
Computer Vision: Spatial Analysis
This service analyzes the presence and movement of people on a video feed and produces events that other systems can respond to.
Computer Vision: Image Analysis
This service extracts many visual features from images, such as objects, faces, adult content, and auto-generated text descriptions.
Computer Vision: Optical Character Recognition (OCR)
This service extracts text from images. You can use the new Read API to extract printed and handwritten text from photos and documents. It uses deep-learning-based models and works with text on a variety of surfaces and backgrounds. These include business documents, invoices, receipts, posters, business cards, letters, and whiteboards.
Computer Vision: DetectedFace
This single face representation is retrieved by the face detection operation. Its ID expires 24 hours after it's created.
Conversational AI
Tools and services for intelligent conversation.
Computer Vision: Form Recognizer
Vision API service that uses machine learning technology to identify and extract key-value pairs and table data from form documents. It then outputs structured data that includes the relationships in the original file. Unsupervised learning allows the model to understand the layout and field data without manual data labeling or intensive coding. You can also do supervised learning with manually labeled data. Models trained with labeled data can perform better and can work with more complicated documents.
Computer Vision
Vision API that gives you access to advanced algorithms that process images and return information based on the visual features you're interested in. OCR, Image Analysis, and Spatial Analysis are the sub services within this product.
Cognitive Services Categories
Vision, Speech, Language, Decision, Search
Computer Vision: Video Indexer
Vision API service that is the Azure Media Services AI solution and part of the Azure Cognitive Services brand. Provides the ability to extract deep insights (with no need for data analysis or coding skills) using machine learning models based on multiple channels (voice, vocals, visual).
Computer Vision: PersistedFace
When DetectedFace objects are added to a group, such as FaceList or Person, they become PersistedFace objects. They can be retrieved at any time and don't expire.
Entity
Word or phrase that is the focus of the utterance.
Azure Machine Learning Designer
a graphical interface for no-code creation of ML solutions
Semantic Segmentation
classifies pixels to the objects they belong to
Fall-Out or False Positive Rate (FPR)
defines the proportion of actual negative cases that the model incorrectly predicted as positive. To calculate this metric, use the following formula: FP/(FP+TN).
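A minimal sketch of the formula, where the result is the share of actual negative cases that the model flagged as positive. The function name and the counts are hypothetical.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): share of actual negatives predicted as positive."""
    actual_negatives = fp + tn
    if actual_negatives == 0:
        raise ValueError("FPR is undefined when there are no actual negatives")
    return fp / actual_negatives

# Hypothetical counts: 50 actual negatives, 5 of them flagged as positive.
fall_out = false_positive_rate(fp=5, tn=45)
print(fall_out)  # 0.1
```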
Face detection, analysis, and recognition
detects, analyzes, and recognizes human faces
Image Analysis
extracts information from images, tags them, and creates a descriptive image summary
Receiver Operating Characteristic (ROC)
is the curve that plots TPR (Recall) against FPR (Fall-out) across classification thresholds. The area under this curve is the Area Under Curve (AUC).
Reinforcement Learning
learns from the outcome and decides the next move based on this knowledge.
Unsupervised Learning
makes predictions without any prior knowledge of the possible outcomes; Clustering Models
Predicted vs. true Chart
presents the differences between predicted and true values. The dotted line outlines the ideal model performance and the solid line reflects the average model predictions. The closer these lines are to each other, the better the model performance.
Residual Histogram
presents the frequency of residual values distribution. Residual is the difference between predicted and actual values. It represents the amount of error in the model. For high performance models, we should expect that most of the errors are small. They will cluster around 0 on the Residual histogram
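A minimal sketch of how the residuals behind such a histogram can be computed and binned, using only the standard library. The function name residual_histogram, the bin width, and the sample values are assumptions for illustration.

```python
from collections import Counter

def residual_histogram(actual, predicted, bin_width=1.0):
    """Count residuals (predicted - actual) per bin, keyed by the bin's center."""
    residuals = [p - a for a, p in zip(actual, predicted)]
    bins = Counter(round(r / bin_width) * bin_width for r in residuals)
    return dict(sorted(bins.items()))

# Hypothetical true and predicted label values.
actual = [10.0, 12.0, 15.0, 20.0, 22.0, 30.0]
predicted = [10.2, 11.9, 15.3, 19.8, 23.1, 27.9]

histogram = residual_histogram(actual, predicted)
for center, count in histogram.items():
    # One '#' per residual that fell into the bin around this center.
    print(f"{center:+5.1f} | {'#' * count}")
```

In this sample most residuals land in the bin around 0, which is the shape expected from a well-performing model.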
False negative (FN)
the number of positive cases that the model predicted as negative.
True Negative (TN)
the number of negative cases that the model predicted correctly.
False positive (FP)
the number of negative cases that the model predicted as positive.
True positive (TP)
the number of positive cases that the model predicted correctly
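A minimal sketch of how the four counts can be tallied for a binary classifier. In this sketch a false positive is an actual negative predicted as positive, and a false negative is an actual positive predicted as negative. The function name confusion_counts and the sample labels are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, TN and FN from paired boolean labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for is_positive, predicted_positive in zip(actual, predicted):
        if is_positive and predicted_positive:
            counts["TP"] += 1  # positive case predicted correctly
        elif is_positive:
            counts["FN"] += 1  # positive case predicted as negative
        elif predicted_positive:
            counts["FP"] += 1  # negative case predicted as positive
        else:
            counts["TN"] += 1  # negative case predicted correctly
    return counts

# Hypothetical labels for eight cases.
actual = [True, True, True, False, False, False, False, True]
predicted = [True, False, True, False, True, False, False, True]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```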