AI-900 Last Minute
Capabilities of Azure Machine Learning
- AutoML - Centralized storage and management of datasets - On-demand compute resources - Visual tools to define pipelines - Integration with common machine learning frameworks such as MLflow - Built-in responsible AI evaluation
Key workloads of AI
- Machine learning - Anomaly detection - Computer vision - Natural language processing - Knowledge mining
2 types of Playgrounds
1. Completions playground 2. Chat playground
4 kinds of resources you can create in Azure Machine Learning Studio
1. Compute Instances 2. Compute Clusters 3. Kubernetes Clusters 4. Attached Compute
Binary Classification Evaluation Metrics
1. Confusion matrix - Accuracy - Recall - Precision - F1-Score 2. Area under the curve (AUC)
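A minimal sketch of how the confusion-matrix metrics above are derived, assuming labels are encoded as 0/1. The function name `binary_metrics` and the sample labels are made up for illustration; AUC is left out because it needs predicted probabilities rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Count confusion-matrix cells for 0/1 labels and derive the usual metrics."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical actual vs. predicted labels
metrics = binary_metrics([0, 0, 1, 1, 1, 0], [0, 1, 1, 1, 0, 0])
print(metrics)
```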
6 Microsoft Principles of Responsible AI
1. Fairness = equal access to tech & avoiding bias; "sociotechnical" (both the technical component & the societal context of deployment)
2. Reliability & Safety = development must be subjected to rigorous testing and deployment management processes
3. Privacy & Security = need to ensure protection of data
4. Inclusiveness = AI should empower people around the world; be intentionally diverse in AI approaches
5. Transparency = gaining user trust & being open about limitations
6. Accountability = solutions work within a governance framework
2 types of Multiclass Classification Training Algorithms
1. One-vs-Rest (OvR) algorithms 2. Multinomial algorithms
3 goals of the Microsoft and OpenAI partnership
1. Use Azure to build enterprise-grade apps 2. Deploy OpenAI model capabilities across MS products beyond Azure AI 3. Use Azure to power OpenAI workloads
Why has AI historically been out of reach for all but the largest technology companies?
1. the large amounts of data required to train models 2. the massive amount of computing power needed 3. the budget to hire specialist programmers
You want to create a model to predict the cost of heating an office building based on its size in square feet and the number of employees working there. What kind of machine learning problem is this?
= Regression
Azure AI Services
AI capabilities that can be built into web or mobile applications in a way that's straightforward to implement
Ex: image recognition, natural language processing, speech, AI-powered search, etc.
Use cases:
- Azure AI Language service can be used for widely known use cases that require minimal tuning (the process of optimizing a model's performance)
- Azure OpenAI Service may be more beneficial for use cases that require highly customized generative models, or for exploratory research
Most Azure AI services are available through REST APIs and client library SDKs in popular development languages, and can also be integrated with tools such as Logic Apps and Power Automate
Automated Machine Learning (AutoML)
Automatically run multiple training jobs using different algorithms and parameters to find the best model - enables non-experts - In Automated Machine Learning, you can select configurations for the primary metric, type of model used for training, exit criteria, and concurrency limits.
Clustering Evaluation Metrics
Average distance to cluster center Average distance to other center Maximum distance to cluster center Silhouette
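A rough sketch of two of the metrics above (average and maximum distance to cluster center), assuming the points, their cluster assignments, and the cluster centers are already known. The function name and sample data are hypothetical; the other two metrics are omitted for brevity.

```python
import math


def cluster_distance_metrics(points, assignments, centers):
    """Return {cluster_index: (average_distance, maximum_distance)} to the own center."""
    result = {}
    for i, center in enumerate(centers):
        # Distances from every point assigned to cluster i to that cluster's center
        dists = [math.dist(p, center)
                 for p, a in zip(points, assignments) if a == i]
        result[i] = (sum(dists) / len(dists), max(dists)) if dists else (0.0, 0.0)
    return result


# Hypothetical 2-D data: two clusters of two points each
points = [(0, 0), (0, 2), (10, 0), (10, 4)]
assignments = [0, 0, 1, 1]
centers = [(0, 1), (10, 2)]
print(cluster_distance_metrics(points, assignments, centers))
# -> {0: (1.0, 1.0), 1: (2.0, 2.0)}
```

Lower average and maximum distances suggest tighter clusters.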
4 Microsoft Azure Natural Language Processing (NLP) Services
Azure AI Language Azure AI Translator Azure AI Speech Azure AI Bot Service
4 Microsoft Azure Computer Vision Services
Azure AI Vision Azure AI Custom Vision Azure AI Face Azure AI Document Intelligence
Challenges/Risks of AI:
Bias can affect results Errors may cause harm Data could be exposed Solutions may not work for everyone Users must trust a complex system Who's liable for AI-driven decisions?
Natural language processing (NLP)
Capability for a computer to interpret written or spoken language, and respond in kind
NLP enables you to create software that can:
- Analyze and interpret text in documents, email messages, and other sources.
- Interpret spoken language, and synthesize speech responses.
- Automatically translate spoken or written phrases between languages.
- Interpret commands and determine appropriate actions.
Data and compute management
Cloud-based data storage and compute resources that professional data scientists can use to run data experiment code at scale
Multiclass Classification Evaluation Metrics
Confusion matrix - Accuracy - Recall - Precision - F1-Score
Kubernetes Clusters
Deployment targets for predictive services that use your trained models. You can access previous versions of "inference clusters" here.
Compute Instances
Development workstations that data scientists can use to work with data and models.
K-Means Clustering
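Feature values are vectorized into n-dimensional coordinates, then you decide on the # of clusters (k) → k centroids are placed in the vector space, each data point is assigned to its nearest centroid, each centroid is moved to the mean of its assigned points, and the assign/move steps repeat until the clusters stabilize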
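A minimal pure-Python sketch of the K-Means loop, not how Azure Machine Learning implements it. One assumption made for repeatability: the starting centroids are passed in by the caller, whereas they are usually chosen at random. The function name `k_means` and the sample points are hypothetical.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Return (final_centroids, assignments) for the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iter):
        # Step 1: assign each point to the index of its nearest centroid
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Step 2: move each centroid to the mean of the points assigned to it
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        # Step 3: stop once the centroids no longer move
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments


# Hypothetical 2-D feature vectors forming two obvious groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, assignments = k_means(points, centroids=[(0, 0), (10, 10)])
print(assignments)  # -> [0, 0, 0, 1, 1, 1]
```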
Observations can be observed attributes ____________ or the known value of the thing you want to train your model to predict ______________
Features; Labels
Computer Vision models and capabilities:
Image classification Object detection Semantic segmentation Image analysis Face detection, analysis, and recognition Optical character recognition (OCR)
How do you train a cluster model?
K-Means Clustering
Attached Compute
Links to existing Azure compute resources, such as Virtual Machines or Azure Databricks clusters.
Function
Machine Learning model
What is the confidence score returned by the language detection service of natural language processing (NLP) for an unknown language name?
NaN (not a number)
What is one action Microsoft takes to support ethical AI practices in Azure OpenAI?
Provides Transparency Notes that share how technology is built and asks users to consider its implications.
Compute Clusters
Scalable clusters of virtual machines for on-demand processing of experiment code
Computer vision
The capability of software to interpret the world visually through cameras, video, and images (visual processing)
Knowledge mining
The capability to extract information from large volumes of often unstructured data to create a searchable knowledge store
Loss Function
To evaluate the aggregate difference between predicted and actual label values
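A short sketch of two common ways to aggregate the difference between predicted and actual labels: mean absolute error and mean squared error. These two are picked here as examples (the card does not name a specific loss), and the sample values are made up.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted labels."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences; larger errors are penalized more."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual vs. predicted label values
actual = [10, 20, 30]
predicted = [12, 18, 33]
print(mean_absolute_error(actual, predicted))  # 7/3
print(mean_squared_error(actual, predicted))   # 17/3
```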
How are classification models validated?
a subset of the training data is held back to validate the trained model (iterative process)
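A minimal sketch of holding back a validation subset, assuming a simple random split. The function name, the 25% fraction, and the fixed seed are illustrative choices, not something the exam material prescribes.

```python
import random


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold back a fraction for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation


# Hypothetical observations (here just row ids)
training, validation = train_validation_split(list(range(8)))
print(len(training), len(validation))  # -> 6 2
```

The model is trained on `training` only, then its predictions for `validation` are compared with the known labels.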
Asset Library Tab
access data assets and components in a pipeline project
Azure AI Language
access features for understanding and analyzing text, training language models that can understand spoken or text-based commands, and building intelligent applications.
Deep Learning
advanced form of machine learning that tries to emulate the way the human brain learns - Key to deep learning is the creation of an artificial neural network
Azure OpenAI
allows you to transition between your work with Azure services and OpenAI, while utilizing Azure's private networking, regional availability, and responsible AI content filtering Components: - Pre-trained GenAI models - Customization - Detect & mitigate harmful use cases (responsible AI) - Security (RBAC & private networks)
Azure AI Vision
analyze images and video, and extract descriptions, tags, objects, and text
Seeing AI
app designed for the blind and low vision community to open up the visual world and describe nearby people, text and objects
Anomaly detection
automatically detect errors or unusual activity in a system - If a measurement occurs outside of the normal expected range, the model reports an anomaly that can be used to inform an alert
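A toy sketch of the "outside the normal expected range" idea, assuming the normal range is the baseline mean plus or minus three standard deviations. This is a simplification for illustration and is not the algorithm used by any Azure service; the function name and readings are hypothetical.

```python
import statistics


def find_anomalies(baseline, new_values, threshold=3.0):
    """Return the new values that fall outside mean +/- threshold * stdev of the baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    low, high = mean - threshold * stdev, mean + threshold * stdev
    return [v for v in new_values if v < low or v > high]


# Hypothetical temperature readings from a sensor
baseline = [20.0, 21.0, 19.0, 20.0, 22.0, 18.0, 20.0]
print(find_anomalies(baseline, [20.5, 35.0, 19.0, 5.0]))  # -> [35.0, 5.0]
```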
Azure Machine Learning Studio
browser-based portal for managing your machine learning resources and jobs
Must assign the workspace you created in the portal to the studio
Capabilities:
- Run code in notebooks
- Create jobs & pipelines
- Train models
- Deploy trained models for on-request and batch inferencing
- Import/export data
Azure OpenAI Studio
build AI models and deploy them for public consumption in software applications Model types: - GPTs - Embeddings - DALL-E
One-vs-Rest (OvR) algorithms
calculates the probability of the observation being a specific class compared to any other class
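A small sketch of One-vs-Rest prediction, assuming one binary logistic model per class on a single feature. The class names, weights, and biases below are invented for illustration; a real model would learn them from training data.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def ovr_predict(x, binary_models):
    """Score x with each 'this class vs. the rest' model and pick the most probable class.

    binary_models maps class label -> (weight, bias) of a one-feature logistic model.
    """
    probabilities = {label: sigmoid(w * x + b)
                     for label, (w, b) in binary_models.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


# Hypothetical models for three classes
models = {"A": (-1.0, 2.0), "B": (0.5, -1.0), "C": (1.0, -6.0)}
label, probabilities = ovr_predict(5.0, models)
print(label)  # -> B
```

Because each model is trained independently, the per-class probabilities do not have to add up to 1.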
Indexes
can then be used for internal only use, or to enable searchable content on public facing internet assets
Image classification
classify images based on their contents
Object detection
classify individual objects within an image, and identify their location with a bounding box
Azure Machine Learning
cloud-based platform for creating, managing, and deploying machine learning models
Image analysis
combine machine learning models with advanced image analysis techniques to extract information from images, including "tags" that could help catalog the image or even descriptive captions that summarize the scene shown in the image
Azure OpenAI Service
enables users to build enterprise-grade solutions with OpenAI models
Capability to generate:
- natural language (trained on tokens; text completion, embeddings)
- code (ex: GPTs, GitHub Copilot)
- images (ex: DALL-E)
OpenAI makes its AI models available to developers
Azure AI Face
enables you to build face detection and facial recognition solutions
Components
encapsulates one step in a machine learning pipeline - Programming function / building block for Azure Machine Learning pipelines
Key Phrase Extraction
evaluating the text of a document and identifying the main talking points
Generative pre-trained transformer (GPT) models
excellent at both understanding and creating natural language - take natural language or code snippets and translate them into code - also summarize functions that are already written, explain SQL queries or tables, and convert a function from one programming language into another - Can specify libraries or language-specific tags
Multiclass Classification (supervised machine learning)
extends binary classification to predict a label that represents one of multiple possible classes
Used to calculate probability values for multiple class labels, enabling a model to predict the most probable class for a given observation.
Examples:
- The species of a penguin (Adelie, Gentoo, or Chinstrap) based on its physical measurements.
- The genre of a movie (comedy, horror, romance, adventure, or science fiction) based on its cast, director, and budget.
TYPICALLY MUTUALLY EXCLUSIVE
- However, there are multilabel classification models, in which there may be more than one valid label for a single observation
- Ex: A movie is both comedy AND horror
Azure AI Document Intelligence
extract information from scanned forms and documents
Text completion
generate and edit text
Azure Machine Learning Designer
graphical interface enabling no-code development of machine learning solutions (drag-and-drop interface)
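Lemmatization (and Stemming)
grouping words together based on their base dictionary form (lemmatization) or by reducing them to a common root (stemming), ex: "power", "powered", and "powerful" are treated as the same token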
Semantic segmentation
individual pixels in the image are classified according to the object to which they belong
GitHub Copilot
integrates the power of OpenAI Codex into a plugin for developer environments like Visual Studio Code - AI pair programmer (OpenAI + GitHub)
Regression (supervised machine learning)
label predicted by the model is a calculated numeric value Examples: - The number of ice creams sold on a given day, based on the temperature, rainfall, and windspeed. - The selling price of a property based on its size in square feet, the number of bedrooms it contains, and socio-economic metrics for its location. - The fuel efficiency (in miles-per-gallon) of a car based on its engine size, weight, width, height, and length.
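A minimal sketch of regression with a single feature, assuming a straight-line relationship fitted by ordinary least squares. The temperature and sales figures are made up so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Training: find slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Inferencing: apply the fitted function to a new feature value."""
    return slope * x + intercept


# Hypothetical past observations: temperature (feature) -> ice creams sold (label)
temperatures = [10, 15, 20, 25, 30]
ice_creams = [20, 30, 40, 50, 60]
slope, intercept = fit_line(temperatures, ice_creams)
print(predict(22, slope, intercept))  # -> 44.0
```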
Face detection, analysis, and recognition
locates human faces in an image
Supervised Machine Learning
machine learning algorithms in which the training data includes both feature values and known label values Uses: - train models by determining a relationship between the features and labels in past observations, so that unknown labels can be predicted for features in future cases Types: - Regression - Classification (Binary & Multiclass)
GPT-3.5
models that can generate natural language and code responses based on prompts. ex: ChatGPT
Embeddings
models that convert text to numeric vectors for analysis - for example comparing sources of text for similarity.
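A small sketch of comparing two embedding vectors, assuming cosine similarity as the measure (a common choice, not the only one). The three-number vectors are toy stand-ins; real embeddings come from a model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for three pieces of text
doc_1 = [0.9, 0.1, 0.3]
doc_2 = [0.8, 0.2, 0.4]   # similar topic to doc_1
doc_3 = [0.1, 0.9, 0.0]   # different topic

print(cosine_similarity(doc_1, doc_2) > cosine_similarity(doc_1, doc_3))  # -> True
```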
DALL-E
models that generate images based on natural language descriptions
GPT-4
models that represent the latest generative models for natural language and code
Clustering
observations are grouped into clusters based on similarities in their data values, or features (purely based on characteristics / no known values to train the model)
Examples:
- Group similar flowers based on their size, number of leaves, and number of petals.
- Identify groups of similar customers based on demographic attributes and purchasing behavior.
Similar to multiclass classification as it categorizes observations into discrete groups - BUT there is no previously known cluster label and the algorithm groups the data observations based purely on similarity of features
***In some cases, clustering is used to determine the set of classes that exist before training a classification model. For example, you might use clustering to segment your customers into groups, and then analyze those groups to identify and categorize different classes of customer (high value - low volume, frequent small purchaser, and so on). You could then use your categorizations to label the observations in your clustering results and use the labeled data to train a classification model that predicts to which customer category a new customer might belong.
Jobs
operations that you run (enable systematic tracking for your testing/workflows)
Multinomial algorithms
output is a vector (an array of values) that contains the probability distribution for all possible classes - with a probability score for each class which when totaled add up to 1.0 Ex: Softmax function
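A minimal sketch of the softmax function mentioned above: it turns a list of raw class scores into a probability distribution that sums to 1.0. The input scores are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that add up to 1.0."""
    highest = max(scores)
    # Subtracting the highest score avoids overflow and does not change the result
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes
probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
print(probabilities.index(max(probabilities)))  # -> 0 (the most probable class)
```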
Transcribing
part of speech recognition, which involves converting speech into a text representation.
Tokenization
part of speech synthesis that involves breaking text into individual words such that each word can be assigned phonetic sounds
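A very simplified sketch of word-level tokenization, assuming lowercase English text split on anything that is not a letter, digit, or apostrophe. Real speech synthesis pipelines do considerably more; the function name is hypothetical.

```python
import re


def tokenize(text):
    """Break text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


print(tokenize("We choose to go to the moon."))
# -> ['we', 'choose', 'to', 'go', 'to', 'the', 'moon']
```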
Training data is made up of _________________
past observations
Playgrounds
place within Azure OpenAI Studio to experiment with OpenAI models
Inferencing
using the trained model to predict label values for new observations (happens after training has defined the function)
Azure Machine Learning workspace
primary resource required for Azure Machine Learning and is provisioned in an Azure subscription (created using the Azure Portal) - manage data, code, models, and other artifacts related to your machine learning workloads - After creation → you can develop solutions with the Azure Machine Learning service either with developer tools or the Azure Machine Learning studio web portal
Azure Cognitive Search (now Azure AI Search)
private enterprise search solution that has tools for building indexes. Utilizes the built-in AI capabilities of Azure AI services to perform knowledge mining of documents. The product's AI capabilities make it possible to index previously unsearchable documents and to extract and surface insights from large amounts of data quickly
Training
process of defining the function
Azure AI Bot Service
provides a platform for conversational AI, the capability of a software "agent" to participate in a conversation - Developers can use the Bot Framework to create a bot and manage it with Azure Bot Service - integrating back-end services like Language, and connecting to channels for web chat, email, Microsoft Teams, and others.
Azure Anomaly Detector
provides an application programming interface (API) that developers can use to create anomaly detection solutions
Azure AI Speech service
provides powerful speech to text and text to speech capabilities, allowing speech to be accurately transcribed into text, or text to natural sounding voice audio
Run configuration
specifies the training script and Azure Machine Learning environment used to run a training job - AzureML maintains a run record for the job
Azure AI Speech
recognize and synthesize speech, and to translate spoken languages.
Artificial Neural Network
simulates electrochemical activity in biological neurons by using mathematical functions
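A tiny sketch of a single artificial neuron, assuming a weighted sum of inputs followed by a ReLU activation (one common choice of activation function). The weights, bias, and inputs are made up; a network learns its weights during training.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a ReLU activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, z)  # ReLU: the neuron only "fires" when the sum is positive


# Hypothetical inputs and weights
print(neuron([1.0, 2.0], [0.5, -1.0], 0.0))  # -> 0.0 (weighted sum is negative)
print(neuron([1.0, 2.0], [0.5, 0.5], 0.1))   # about 1.6
```

A neural network stacks many of these in layers, with each layer's outputs feeding the next.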
2 Types of Machine Learning
supervised and unsupervised
Optical character recognition (OCR)
technique used to detect printed and handwritten text in images Uses: - Note taking - Digitizing forms (medical records, historical docs) - Scanning printed or handwritten checks for bank deposits
AI
the creation of software that imitates human behaviors and capabilities
Machine Learning
the foundation for an AI system, and is the way we "teach" a computer model to make predictions and draw conclusions from data
Binary Classification (supervised machine learning)
the label determines whether the observed item is (or isn't) an instance of a specific class (Mutually exclusive outcomes (true/false, +/-)) Instead of calculating numeric values like a regression model, the algorithms used to train classification models calculate probability values Examples: - Whether a patient is at risk for diabetes based on clinical metrics like weight, age, blood glucose level, and so on. - Whether a bank customer will default on a loan based on income, credit history, age, and other factors.
Classification (supervised machine learning)
the label represents a categorization, or class use a set of features to calculate a probability score for each possible class and predict a label that indicates the most likely class
How to use Azure OpenAI?
through REST APIs, Python SDK, or Azure OpenAI Studio
Azure AI Custom Vision
train custom image classification and object detection models using your own images
Unsupervised Machine Learning
training models using data that consists only of feature values without any known labels Uses: - determine relationships between the features of the observations in the training data Types: - Clustering
Azure AI Translator
translate text between more than 60 languages.
Completions playground
type in prompts, configure parameters, and see responses without having to code
Chat playground
use the assistant setup to instruct the model about how it should behave. The assistant will try to mimic the responses you include in tone, rules, and format you've defined in your system message
Azure AI Content Safety service
used to detect harmful content within text or images, including violent or hateful content, and report on its severity
Azure AI Language service
used to summarize text, classify information, or extract key phrases
Pipelines (Azure ML Designer projects)
you define pipelines to orchestrate model training, deployment, and management tasks - organize, manage, and reuse complex machine learning workflows across projects and users - Starts with the dataset from which you want to train the model