AWS Machine Learning Specialty Exam
AWS Glue Data Catalog
A central repository to store structural and operational metadata for all your data assets. For a given data set, you can store its table definition, physical location, add business relevant attributes, as well as track how this data has changed over time. Use the info in data catalog to create and monitor ETL jobs.
Learning Rate
A constant value used in the Stochastic Gradient Descent (SGD) algorithm. Learning rate affects the speed at which the algorithm reaches (converges to) the optimal weights. The SGD algorithm makes updates to the weights of the linear model for every data example it sees. The size of these updates is controlled by the learning rate.
Amazon Mechanical Turk
A crowdsourced marketplace where anyone can request, and pay for, individuals to complete tasks that computers are unable to.
BlazingText Algorithm
A highly optimized implementation of the Word2vec and text classification algorithms that scales easily to large datasets. It is useful for many downstream natural language processing (NLP) tasks, such as sentiment analysis, named entity recognition, machine translation, etc. Text classification is an important task for applications that perform web searches, information retrieval, ranking, and document classification. Word embedding is a vector representation of a word. Words that are semantically similar correspond to vectors that are close together. That way, word embeddings capture the semantic relationships between words.
What is Kinesis?
A managed alternative to Apache Kafka. Great for "real-time" big data, streaming processing frameworks, application logs, metrics, IoT, clickstreams.
How can you increase the availability of a web app using a SageMaker endpoint to access the model's underlying ML component?
Add SageMaker instances of the same size and use the existing endpoint to host them. Separately, Amazon SageMaker multi-model endpoints provide a scalable and cost-effective solution for deploying large numbers of models. They use a shared serving container that is enabled to host multiple models. This reduces hosting costs by improving endpoint utilization compared with using single-model endpoints. It also reduces deployment overhead because Amazon SageMaker manages loading models in memory and scaling them based on the traffic patterns to them.
Precision
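Of all predicted positives, the fraction that are truly positive: TP / (TP + FP). AKA positive predictive value (percent of relevant results); not the same as the false positive rate, which is FP / (FP + TN). Good metric when you care a lot about false positives (e.g. medical screening, drug testing - would be disastrous to say someone tested positive for using cocaine when they did not).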
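A minimal sketch of the precision calculation from label lists. The data and the function name are made up for illustration, and labels are assumed to be 0 or 1.

```python
def precision(actual, predicted):
    """Precision = TP / (TP + FP), computed from 0/1 label lists."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    if tp + fp == 0:
        return 0.0  # no positive predictions; returning 0.0 is a convention chosen here
    return tp / (tp + fp)


# Hypothetical drug-test results: one person is wrongly flagged as positive.
ACTUAL = [1, 1, 1, 0, 0, 0, 0, 1]
PREDICTED = [1, 1, 0, 1, 0, 0, 0, 1]
print(precision(ACTUAL, PREDICTED))  # 3 true positives, 1 false positive
```

Each false positive lowers the score, which is why precision suits cases where a wrong "positive" is costly.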
Inference Pipeline
An inference pipeline is an Amazon SageMaker model that is composed of a linear sequence of two to fifteen containers that process requests for inferences on data. You use an inference pipeline to define and deploy any combination of pre-trained SageMaker built-in algorithms and your own custom algorithms packaged in Docker containers. You can use an inference pipeline to combine preprocessing, predictions, and post-processing data science tasks. Inference pipelines are fully managed.
Supervised Built-In Algorithm: AutoGluon-Tabular
An open-source AutoML framework that ensembles models and stacks them in multiple layers.
Latent Dirichlet Allocation (LDA)
An unsupervised learning algorithm suitable for determining topics in a set of documents.
What training input does Image Classification expect?
Apache MXNet RecordIO (NOT protobuf) -- adds interoperability with other deep learning frameworks OR raw jpg or png images. Image format requires .lst files to associate image index, class label, and path to image. Augmented Manifest Image Format enables Pipe mode.
Amazon Transcribe
AWS service that makes it easy for customers to convert speech to text. Using Automatic Speech Recognition (ASR) technology, customers can use Amazon Transcribe for a variety of business applications, including transcription of voice-based customer service calls, generation of subtitles on audio/video content, and (text-based) content analysis of audio/video content. Can mask or remove words that you don't want with vocabulary filtering.
Pipe Mode
Accelerates data streaming from S3 storage into SageMaker during training. Better throughput than File mode, which downloads data to EBS before training. Lets training jobs start sooner, finish sooner, and use less disk space, which reduces the overall cost to train ML models on SageMaker. With Pipe input mode, your data is fed on-the-fly into the algorithm container without involving any disk I/O. This approach shortens the lengthy download process and dramatically reduces startup time. It also offers generally better read throughput than File input mode. This is because your data is fetched from Amazon S3 by a highly optimized multi-threaded background process. It also allows you to train on datasets that are much larger than the 16 TB Amazon Elastic Block Store (EBS) volume size limit.
What is Image Classification for?
Assigning label(s) to an image. Does not tell you where objects are, just what objects are in the image.
SageMaker Lifecycle Configuration
Automate customizations to be applied at different phases of the lifecycle of an instance. For example, you can write a script to install a list of libraries and, using the Lifecycle configuration feature, configure the scripts to automatically execute every time your notebook instance is started. Similarly, you can choose to automatically run the script only once when the notebook instance is created.
SageMaker Autopilot
Automates algorithm selection, preprocessing, and model tuning.
Object Detection - MXNet
Detects and classifies objects in images using a single deep neural network. It is a supervised learning algorithm that takes images as input and identifies all instances of objects within the image scene.
Unsupervised Built-In Algorithm: Random Cut Forest (RCF)
Detects anomalous data points within a data set that diverge from otherwise well-structured or patterned data. Anomalies can manifest as unexpected spikes in time series data, breaks in periodicity, or unclassifiable data points. They are easy to describe in that when viewed in a plot, they are often easily distinguishable from the "regular" data. Including these anomalies in a data set can drastically increase the complexity of a machine learning task since the "regular" data can often be described with a simple model. With each data point, RCF associates an anomaly score. Low score values indicate that the data point is considered "normal." High values indicate the presence of an anomaly in the data. The definitions of "low" and "high" depend on the application but common practice suggests that scores beyond three standard deviations from the mean score are considered anomalous.
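The three-standard-deviation rule of thumb can be sketched as a threshold on scores. This is not RCF itself: the scores are hypothetical stand-ins for RCF output, and the function name is illustrative. Only the upper side is checked here, since high scores indicate anomalies.

```python
import statistics


def flag_anomalies(scores, num_std=3.0):
    """Return the indexes of scores more than num_std deviations above the mean."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    cutoff = mean + num_std * std
    return [i for i, score in enumerate(scores) if score > cutoff]


# Hypothetical anomaly scores: twenty "normal" points and one spike at the end.
SCORES = [1.0] * 20 + [5.0]
print(flag_anomalies(SCORES))
```

With very few data points a single outlier cannot exceed three standard deviations, so the rule needs a reasonably large sample.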
Object Detection - TensorFlow
Detects bounding boxes and object labels in an image. It is a supervised learning algorithm that supports transfer learning with available pretrained TensorFlow models.
Amazon Comprehend
Natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience required. The service identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic. You can also use AutoML capabilities in Amazon Comprehend to build a custom set of entities or text classification models that are tailored uniquely to your organization's needs.
Supervised Built-In Algorithm: Object2Vec
A new, highly customizable multi-purpose algorithm used for feature engineering. It can learn low-dimensional dense embeddings of high-dimensional objects to produce features that improve training efficiencies for downstream models. While this is a supervised algorithm, as it requires labeled data for training, there are many scenarios in which the relationship labels can be obtained purely from natural clusterings in data, without any explicit human annotation.
Supervised Built-In Algorithm: KNN
A non-parametric method that uses the k nearest labeled points to assign a label to a new data point for classification or a predicted target value from the average of the k nearest points for regression.
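A small pure-Python sketch of both uses described above: majority vote for classification and the mean of the neighbors for regression. The toy data and function names are illustrative and are not the SageMaker implementation.

```python
import math
from collections import Counter


def k_nearest(train_points, train_values, query, k):
    """Return the values attached to the k training points closest to query."""
    ranked = sorted(
        zip(train_points, train_values),
        key=lambda pair: math.dist(pair[0], query),
    )
    return [value for _, value in ranked[:k]]


def knn_classify(train_points, train_labels, query, k=3):
    """Assign the most common label among the k nearest points."""
    votes = Counter(k_nearest(train_points, train_labels, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_points, train_targets, query, k=3):
    """Predict the average target of the k nearest points."""
    neighbors = k_nearest(train_points, train_targets, query, k)
    return sum(neighbors) / len(neighbors)


POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
LABELS = ["a", "a", "a", "b", "b", "b"]
TARGETS = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

print(knn_classify(POINTS, LABELS, (2, 2)))
print(knn_regress(POINTS, TARGETS, (1.5, 1.5)))
```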
Supervised Built-In Algorithm: Tab Transformer
A novel deep tabular data modeling architecture built on self-attention-based Transformers.
Virtual Private Cloud (VPC)
A private network segment made available to a single cloud consumer within a public cloud. You can connect directly to the SageMaker API or to the SageMaker Runtime through an interface endpoint in your Virtual Private Cloud (VPC) instead of connecting over the internet. When you use a VPC interface endpoint, communication between your VPC and the SageMaker API or Runtime is conducted entirely and securely within the AWS network. The SageMaker API and Runtime support Amazon Virtual Private Cloud (Amazon VPC) interface endpoints that are powered by AWS PrivateLink. Each VPC endpoint is represented by one or more Elastic Network Interfaces with private IP addresses in your VPC subnets. The VPC interface endpoint connects your VPC directly to the SageMaker API or Runtime without an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. The instances in your VPC don't need public IP addresses to communicate with the SageMaker API or Runtime.
SageMaker Neo
A service primarily used to optimize machine learning models to run faster with no accuracy loss for inference on edge devices. Train once, run anywhere. Edge devices: ARM, Intel, NVIDIA processors. Optimizes models from TensorFlow, MXNet, PyTorch, ONNX, and XGBoost for specific target devices. Consists of a compiler and a runtime.
Sequence-to-Sequence Algorithm
A supervised algorithm commonly used for neural machine translation. Input is seq of tokens (text, audio), output is another seq of tokens. Ex: Machine translation, Text summarization, Speech-to-text. Need an input file that maps every word to a number (tokens must be integers). Needs RecordIO-Protobuf format. Can only use GPUs, P3 for ex. Can only use a single machine for training, but can use multi-GPUs on one machine.
Text Classification - TensorFlow
A supervised algorithm that supports transfer learning with available pretrained models for text classification.
Image Classification Hyperparameters
Batch size, learning rate, optimizer (weight decay, beta 1, beta 2, eps, gamma)
Types of Neural Networks
CNN: image classification. RNN: sequences in time (e.g. predict stock prices, understand words in a sentence, translation)
Bias Metrics in Clarify
Class Imbalance - CI - (one facet/demographic group has fewer training vals than others). Difference in Proportions of Labels - DPL - Imbalance of positive outcomes between facet vals. Kullback-Leibler Divergence - KL - and Jensen Shannon Divergence - JS - how much outcome distributions of facets diverge. Lp-Norm - LP - P norm diff between dists of outcomes. Total Variation Distance - TVD - L1 norm diff between dists of outcomes. Kolmogorov-Smirnov -KS- Max divergence between outcomes in dists from facets (ex: measure which college app outcomes represent the most signif disparity in ethnic groups). Conditional Demographic Disparity -CDD- disparity of outcomes between facets as a whole and by subgroups.
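Three of these metrics are simple enough to sketch by hand. The formulas below are the commonly documented definitions as I understand them, so treat them as assumptions: CI = (n_a - n_d) / (n_a + n_d), DPL = q_a - q_d, and TVD as half the L1 distance between outcome distributions. The "facet a" / "facet d" names and the counts are made up.

```python
def class_imbalance(n_a, n_d):
    """CI: normalized difference in the number of members of facets a and d."""
    return (n_a - n_d) / (n_a + n_d)


def difference_in_proportions(pos_a, n_a, pos_d, n_d):
    """DPL: difference between the positive-outcome rates of the two facets."""
    return pos_a / n_a - pos_d / n_d


def total_variation_distance(dist_a, dist_d):
    """TVD: half the L1 distance between two outcome distributions."""
    return 0.5 * sum(abs(a - d) for a, d in zip(dist_a, dist_d))


# Hypothetical training set: facet a has 80 rows (40 positive), facet d has 20 (5 positive).
ci = class_imbalance(80, 20)
dpl = difference_in_proportions(40, 80, 5, 20)
tvd = total_variation_distance([0.5, 0.5], [0.75, 0.25])  # [P(negative), P(positive)]
print(ci, dpl, tvd)
```

Values near 0 suggest balance; values far from 0 suggest one facet is under-represented or receives fewer positive outcomes.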
Non-Linear Activation Functions
Create complex input-output mappings. Allow backpropagation (because they have a useful derivative). Allow for multiple layers (linear functions degenerate to a single layer).
Amazon DMS
Database Migration Service to quickly migrate databases to AWS. Source DB remains available during migration. Must create an EC2 instance to perform data replication tasks between source and target DBs. No data transformation, migration only.
AWS X-Ray
Debugging service that helps developers understand how their application and its underlying services are performing to identify and troubleshoot the root cause of performance issues and errors.
Build a Jeff Bezos Detector
DeepLens -> Rekognition
Activation Functions
Define the output of a node/neuron given its input signals.
Kinesis Streams
Low latency streaming ingest at scale. Real time, automatic scaling with on-demand mode, data storage from min 1 day to max 1 year, replay capability, multi consumers. Cannot transform data.
ELU
Exponential Linear Unit (ReLU variant). Smoother curve on negative side (exponential function).
Custom Entity Recognition
Extends the capability of Amazon Comprehend by enabling you to identify new entity types not supported as one of the preset generic entity types. This means that in addition to identifying entity types such as LOCATION, DATE, PERSON, and so on, you can analyze documents and extract entities like product codes or business-specific entities that fit your particular needs.
Unsupervised Built-In Algorithm: K-Means
Finds discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.
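The grouping idea can be sketched with plain Lloyd's iterations: assign each point to its nearest centroid, then move each centroid to the mean of its group. This is a teaching sketch with hand-picked starting centroids, not the SageMaker K-Means implementation.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(POINTS, centroids=[(1, 1), (8, 8)])
print(final_centroids)
```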
Supervised Built-In Algorithm: DeepAR Forecasting
For forecasting scalar (one-dimensional) time series using recurrent neural networks (RNN). When your dataset contains hundreds of related time series, DeepAR outperforms the standard ARIMA and ETS methods. You can also use the trained model to generate forecasts for new time series that are similar to the ones it has been trained on. Based on historical data for a behavior, you can predict future behavior using DeepAR algorithm. For example, you can predict sales on a new product based on previous sales data. SageMaker DeepAR algorithm specializes in forecasting new product performance.
What are the Object Detection and Image Classification Instance Types?
GPU for training, CPU or GPU for inference
Amazon FSx for Lustre
High-performance distributed file system integrated with S3. Speeds up ML training. Amazon FSx for Lustre speeds up your training jobs by serving your Amazon S3 data to Amazon SageMaker at high speeds. The first time you run a training job, Amazon FSx for Lustre automatically copies data from Amazon S3 and makes it available to Amazon SageMaker. Additionally, the same Amazon FSx file system can be used for subsequent iterations of training jobs on Amazon SageMaker, preventing repeated downloads of common Amazon S3 objects. Because of this, Amazon FSx has the most benefit to training jobs that have training sets in Amazon S3 and in workflows where training jobs must be run several times using different training algorithms or parameters to see which gives the best result.
IAM
IAM enables you to provide access to AWS APIs securely using fine-grained policies. It is one mechanism you can use to have compliant use of Amazon SageMaker. You can assign an IAM policy permissions set to a user or federated IAM role. The user utilizes these permissions to access various Amazon SageMaker components, including notebooks, model training, and model usage. Supports authorization based on resource tags. You can specify allowed or denied actions and resources.
What is object detection for?
Identifying all objects in an image with bounding boxes, Detecting and classifying objects with a single neural network, Designates classes with confidence scores, Can train from scratch or use pre-trained models on ImageNet.
SageMaker Training Compiler
Integrated into AWS Deep Learning Containers (DLCs). Compiles and optimizes training jobs on GPU instances, speeding up training by up to 50%. Converts models into hardware-optimized instructions. Tested with the Hugging Face transformers library, or bring your own model. Incompatible with SageMaker distributed training libraries. Use GPU instances. PyTorch models must use the PyTorch/XLA model save function, or else the saved model won't have the necessary info. Enable the debug flag in the compiler_config parameter to enable debugging.
What training input does Semantic Segmentation expect?
JPG images with PNG annotations for both training and validation, label maps to describe annotations, augmented manifest image format supported for Pipe mode. JPG images accepted for inference.
A Hyperparameter tuning job in SageMaker is using more time and resources than you are willing to spend. What might be one way to make it more efficient?
Restrict the search space by limiting hyperparameter ranges as much as possible.
4 Methods for Encrypting Objects in S3
SSE-S3: encrypts objects using keys managed by AWS. SSE-KMS: Key Mgmt Service manages encryption keys (addtl security and audit trails). SSE-C: manage your own encryption keys. Client Side Encryption.
Sigmoid/Logistic/TanH
Sigmoid/Log scales everything from 0 to 1. TanH scales from -1 to 1. (Hyperbolic Tangent Function - well suited to RNNs). Changes slowly for high or low values (vanishing gradient problem). Computationally expensive. TanH generally preferred over sigmoid.
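A short sketch of why both functions suffer from vanishing gradients: their derivatives shrink toward 0 as the input moves far from 0. Function names are illustrative.

```python
import math


def sigmoid(x):
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, at most 1.0."""
    return 1.0 - math.tanh(x) ** 2


for x in (0.0, 2.0, 10.0):
    print(x, sigmoid(x), math.tanh(x), sigmoid_grad(x), tanh_grad(x))
```

At x = 0 the tanh gradient is four times the sigmoid gradient, which is one reason tanh is often preferred.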
Where does the training code used by SageMaker come from?
All training code deployed to SageMaker training instances comes from ECR (a Docker image registered with ECR).
Amazon Elastic Inference
Allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and SageMaker instances or Amazon ECS tasks, to reduce the cost of running deep learning inference by up to 75%. Amazon Elastic Inference supports TensorFlow, Apache MXNet, PyTorch, and ONNX models.
Amazon Athena
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. It is serverless, and you pay only for the queries you run.
Amazon Lex
Amazon Lex is a service for building conversational interfaces into any application using voice and text. NLP chatbot engine. Built around intents. Utterances invoke intents. Lambda functions are invoked to fulfill intents. Slots specify extra info needed by the intent.
Amazon Polly
Amazon Polly is a service that turns text into lifelike speech. Lexicons to customize pronunciation of specific words & phrases. SSML Speech Synthesis Markup Language for control over emphasis, pronunciation, breathing, whispering, speech rate, pitch, and pauses. Speech Marks to encode when sentence/word starts and ends in the audio stream, useful for lip-synching animation.
Amazon Rekognition
Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces. Has a content moderation feature.
Boto3
An AWS SDK for Python. Enables you to control AWS services through APIs.
Amazon Inspector
An assessment tool used to improve the security of your applications on EC2 instances.
Supervised Built-In Algorithm: Factorization Machine
An extension of a linear model that is designed to economically capture interactions between features within high-dimensional sparse datasets. Use cases: click prediction, item recommendations. Since an individual user doesn't interact with most pages / products, the data is sparse.
Supervised Built-In Algorithm: CatBoost
An implementation of the gradient-boosted trees algorithm good for categorical and ordinal features.
Supervised Built-In Algorithm: Light GBM
An implementation of the gradient-boosted trees algorithm that adds two novel techniques for improved efficiency and scalability: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB).
Supervised Built-In Algorithm: XGBoost
An implementation of the gradient-boosted trees algorithm that combines an ensemble of estimates from a set of simpler and weaker models.
Leaky ReLU
Solves dying ReLU problem by introducing a negative slope below 0.
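A minimal comparison of plain ReLU and Leaky ReLU and their gradients. The 0.01 slope is a common default chosen here as an assumption, not something this card specifies.

```python
def relu(x):
    """Plain ReLU: passes positive inputs through, outputs 0 otherwise."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Gradient is 0 for non-positive inputs, so the neuron stops updating."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: small non-zero slope below 0 keeps some signal flowing."""
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x, negative_slope=0.01):
    """Gradient never reaches 0, so the neuron can recover."""
    return 1.0 if x > 0 else negative_slope


for x in (-2.0, 3.0):
    print(x, relu(x), relu_grad(x), leaky_relu(x), leaky_relu_grad(x))
```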
Early Stopping
Early stopping offers a fail-safe feature wherein it prevents your model from learning further if its accuracy is no longer improving or if it's degrading at succeeding epochs. When you enable early stopping for a hyperparameter tuning job, SageMaker evaluates each training job that the hyperparameter tuning job launches as follows: - After each epoch of training, get the value of the objective metric. - Compute the running average of the objective metric for all previous training jobs up to the same epoch, and then compute the median of all of the running averages. - If the value of the objective metric for the current training job is worse (higher when minimizing or lower when maximizing the objective metric) than the median value of running averages of the objective metric for previous training jobs up to the same epoch, SageMaker stops the current training job. To use early stopping with your own algorithm, you must write your algorithms such that it emits the value of the objective metric after each epoch. You can use the following frameworks for this task: TensorFlow, MXNet, Chainer, PyTorch, and Spark.
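The stopping rule described above can be sketched as follows. This is a simplified reading of the description, not SageMaker's implementation; the function name, the zero-based epoch index, and the loss histories are hypothetical.

```python
import statistics


def should_stop(current_value, epoch, previous_jobs, minimize=True):
    """Compare the current objective with the median of earlier jobs' running averages.

    previous_jobs holds one list of per-epoch objective values per earlier job;
    epoch is the zero-based index of the epoch that just finished.
    """
    running_averages = [
        statistics.mean(history[: epoch + 1])
        for history in previous_jobs
        if len(history) > epoch
    ]
    if not running_averages:
        return False  # nothing to compare against yet
    median = statistics.median(running_averages)
    return current_value > median if minimize else current_value < median


# Hypothetical validation losses from three earlier training jobs.
PREVIOUS = [[1.0, 0.8, 0.6], [0.9, 0.7, 0.5], [1.2, 1.0, 0.9]]
print(should_stop(0.95, epoch=2, previous_jobs=PREVIOUS))  # worse than the median
print(should_stop(0.55, epoch=2, previous_jobs=PREVIOUS))  # better than the median
```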
Supervised Built-In Algorithm: Linear Learner
Learns a linear function for regression or a linear threshold function for classification. Expects RecordIO wrapped protobuf as training input (Float32 data only), CSV (first column = label). Supports File and Pipe mode. Used for preprocessing (training data must be normalized - LL can do this for you) (input data should be shuffled). Uses SGD for training, trains multiple models in parallel and selects most optimal. Tune L1, L2 regularization.
Unsupervised Built-In Algorithm: IP Insights
Learns the usage patterns for IPv4 addresses. It is designed to capture associations between IPv4 addresses and various entities, such as user IDs or account numbers. Learns vector representations for online resources and IP addresses where each point is close together. Accepts inputs formatted as (entity, IP address) pairs.
Kinesis Firehose
Load streams into S3, Redshift, ElasticSearch. Serverless data transformations with lambda. Near real time (lowest buffer is 1 minute). Automated scaling. No data storage. Can do ingestion and transformation. A database is not a valid data source for Amazon Kinesis Data Firehose. This service is only used to ingest data from data sources that continuously produce data.
Logarithmic Scaling
Logarithmic scaling works only for ranges that have values greater than 0. Choose logarithmic scaling when you need a range that spans several orders of magnitude. For example, if you are tuning a linear learner model and you specify a range of values between .0001 and 1.0 for the learning_rate hyperparameter, searching uniformly on a logarithmic scale gives you a better sample of the entire range than searching on a linear scale would. This is because searching on a linear scale would, on average, devote 90 percent of your training budget to only the values between .1 and 1.0, leaving only 10 percent of your training budget for the values between .0001 and .1.
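The 90 percent claim can be checked with a quick simulation. This sketch draws learning rates uniformly on a linear scale and on a log scale and counts how many land at or above 0.1; the sampler names and the fixed seed are illustrative choices.

```python
import math
import random


def sample_linear(low, high, rng):
    """Draw uniformly between low and high."""
    return rng.uniform(low, high)


def sample_log(low, high, rng):
    """Draw uniformly in log10 space, then map back (requires low > 0)."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


def fraction_at_or_above(samples, cutoff):
    return sum(1 for s in samples if s >= cutoff) / len(samples)


rng = random.Random(0)
linear_samples = [sample_linear(0.0001, 1.0, rng) for _ in range(10000)]
log_samples = [sample_log(0.0001, 1.0, rng) for _ in range(10000)]

print(fraction_at_or_above(linear_samples, 0.1))  # close to 0.90
print(fraction_at_or_above(log_samples, 0.1))     # close to 0.25
```

On the log scale each of the four decades between .0001 and 1.0 receives about a quarter of the samples.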
Model Monitor
MLOps tool for identifying drift over time.
Kinesis Video Streams
Meant for streaming video in real-time.
Object Detection Hyperparameters
Mini_batch_size, learning_rate, Optimizer (SGD, adam, rmsprop, adadelta).
Dropout Regularization
Aims to prevent overfitting. It works by randomly choosing nodes to disable during training. This helps "slow-learning" nodes learn more by preventing them from depending on "smart" nodes, making the model as a whole more robust. The nodes to be disabled are re-chosen at each iteration, which in turn changes the effective architecture of the model. Dropout does not reduce model size; when the goal is to trim connections between nodes, use model pruning instead.
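A sketch of the random disabling step, using the "inverted dropout" convention in which surviving activations are scaled up during training so nothing changes at inference time. That convention is an assumption here, and the function name is illustrative.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Randomly zero activations during training; pass them through at inference."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # A fresh random mask is drawn on every call, so different nodes are disabled each step.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)
layer_output = [1.0] * 8
print(dropout(layer_output, 0.5, rng))                  # some entries zeroed, the rest scaled to 2.0
print(dropout(layer_output, 0.5, rng, training=False))  # unchanged
```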
Model Pruning
Reducing model size: Model pruning aims to remove weights that don't contribute much to the training process. Weights are learnable parameters: they are randomly initialized and optimized during the training process. During the forward pass, data passes through the model. The loss function evaluates model output given the labels; during the backward pass, weights are updated to minimize the loss. To do so, the gradients of the loss with respect to the weights are computed, and each weight receives a different update. After a few iterations, certain weights are typically more impactful than others; the goal of pruning is to remove the useless ones without significantly reducing model accuracy.
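One common heuristic, magnitude pruning, can be sketched as below. The card does not name a specific method, so the choice of "smallest absolute value means least useful" is an assumption, and the weights are made up.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    num_to_prune = int(len(weights) * fraction)
    smallest_first = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indexes = set(smallest_first[:num_to_prune])
    return [0.0 if i in pruned_indexes else w for i, w in enumerate(weights)]


WEIGHTS = [0.8, -0.01, 0.3, 0.002, -0.6, 0.05]
print(prune_by_magnitude(WEIGHTS, 0.5))
```

In practice the pruned model is usually evaluated, and often fine-tuned, to confirm accuracy has not dropped significantly.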
AWS Config
Monitors and records your AWS resource configurations. It also allows you to automate the evaluation of recorded configurations against desired configurations.
Choosing an Activation Function
Multi-class: use softmax. RNNs do well with TanH. For everything else: start with ReLU, then try Leaky ReLU, then PReLU or Maxout; use Swish for really deep networks.
SageMaker Canvas
No-code ML for business analysts. Upload CSV data, select the column to predict, build it, make predictions. Can join datasets. Classification or regression. Auto data cleaning. Shares models & datasets with SageMaker Studio.
How is Image Classification used?
ResNet CNN. Full training mode: network initialized with random weights or Transfer learning mode: initialized with pre-trained weights, top fully-connected layer is initialized with random weights, network is fine-tuned with new training data.
Residual Plots
Residuals represent the portion of the target that the model is unable to predict. Positive residual = model is underestimating the target. Negative residual = model is overestimating the target.
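A tiny worked example of the sign convention, taking residual = actual - predicted, which matches the description above. The values are made up.

```python
def residuals(actual, predicted):
    """Residual = actual target minus predicted target."""
    return [a - p for a, p in zip(actual, predicted)]


ACTUAL = [10, 20, 30]
PREDICTED = [8, 20, 35]
print(residuals(ACTUAL, PREDICTED))
# First value is positive (model underestimated); last is negative (model overestimated).
```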
Kinesis Analytics
Perform real-time analytics on streams using SQL.
What is Semantic Segmentation used for?
Pixel-level object classification (different from image class which assigns labels to whole images and different from object detect which assigns labels to bounding boxes). Useful for self-driving vehicles, medical imaging diagnostics, robot sensing. Produces a segmentation mask.
Image Classification - Semantic Segmentation
Provides a fine-grained, pixel-level approach to developing computer vision applications. It tags every pixel in an image with a class label from a predefined set of classes. Tagging is fundamental for understanding scenes, which is critical to an increasing number of computer vision applications such as self-driving vehicles, medical imaging diagnostics, and robot sensing. Because the semantic segmentation algorithm classifies every pixel in an image, it also provides information about the shapes of the objects contained in the image. The segmentation output is represented as a segmentation mask: a grayscale image with the same shape as the input image.
Swish
ReLU variant from Google that benefits very deep networks with 40+ layers.
Maxout
ReLU variant that outputs max of inputs. Doubles parameters that need to be trained.
Parametric ReLU
ReLU, but the slope in the negative part is learned via backpropagation (learning all weights between nodes in NN and learning optimal slope for the negative portion of the ReLU activation function). Computationally expensive.
What training input does Object Detection expect?
RecordIO or image format (jpg or png). For image format, supply a JSON file with annotation data for each image.
ReLU
Rectified Linear Unit: easy and fast to compute. Outputs the input when it is positive and 0 otherwise. When inputs are 0 or negative, the output and the gradient are both 0, so a neuron can stop learning (dying ReLU problem).
Unsupervised Built-In Algorithm: PCA
Reduces the dimensionality (number of features) within a dataset by projecting data points onto the first few principal components. The objective is to retain as much information or variation as possible. For mathematicians, principal components are eigenvectors of the data's covariance matrix.
Image Classification - MXNet
Supervised learning algorithm (uses example data with answers) for image classification. It takes an image as input and outputs one or more labels assigned to that image. It uses a convolutional neural network that can be trained from scratch or with transfer learning. Unlike object detection, it does not locate objects within the image.
Image Classification - TensorFlow
Supervised learning algorithm that uses pretrained TensorFlow Hub models to fine-tune for specific tasks for image classification.
AWS CloudTrail
This service is simply used for tracking AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command-line tools, and other AWS services.
Learning Rate Too Large/Too Small
Too large a learning rate might prevent the weights from approaching the optimal solution. Too small a value results in the algorithm requiring many passes to approach the optimal weights. In Amazon ML, the learning rate is auto-selected based on your data.
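Both failure modes can be seen on a toy linear model trained with SGD. In this sketch the data, the three learning rates, and the epoch counts are illustrative choices, and squared-error loss is assumed.

```python
def sgd_step(weights, features, target, learning_rate):
    """One SGD update for a linear model with squared-error loss."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    # Gradient of 0.5 * error**2 with respect to each weight is error * x.
    return [w - learning_rate * error * x for w, x in zip(weights, features)]


def train(examples, learning_rate, epochs=200):
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for features, target in examples:
            weights = sgd_step(weights, features, target, learning_rate)
    return weights


def weight_error(weights, true_weights=(1.0, 2.0)):
    """Largest absolute distance from the weights that generated the data."""
    return max(abs(w - t) for w, t in zip(weights, true_weights))


# Toy data from y = 1 + 2*x; the first feature is a constant bias term.
EXAMPLES = [([1.0, x], 1.0 + 2.0 * x) for x in (0.0, 1.0, 2.0, 3.0)]

error_too_small = weight_error(train(EXAMPLES, 0.0005))           # still far away
error_moderate = weight_error(train(EXAMPLES, 0.05))              # converges
error_too_large = weight_error(train(EXAMPLES, 0.5, epochs=20))   # moves away
print(error_too_small, error_moderate, error_too_large)
```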
Are people on the phone happy?
Transcribe -> Comprehend
BYO Alexa
Transcribe -> Lex -> Polly
Make a Universal Translator
Transcribe -> Translate -> Polly
EMR Usage
Transient and long-running clusters, connect to master run jobs, submit ordered steps via the console, serverless automatic scaling
Recall
True positives over all actual positives (true positives plus false negatives): TP / (TP + FN). IT IS THE TRUE POSITIVE RATE. Good choice of metric when you care a lot about false negatives (e.g. fraud detection -- you need to flag EVERY instance of fraud, so you want to err on the side of false positives, not false negatives). Medical screening: The company would like to be extra sure that a patient does not have cancer before they pronounce them healthy. This implies that they want fewer false negatives. As false negatives decrease, the model has a higher recall, so recall is the metric to focus on.
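A short sketch of how lowering the decision threshold trades false negatives for false positives and raises recall. The scores, labels, and thresholds are hypothetical.

```python
def recall_at_threshold(scores, labels, threshold):
    """Recall = TP / (TP + FN) when predicting positive for score >= threshold."""
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    return tp / (tp + fn)


# Hypothetical fraud scores with the true labels (1 = fraud).
SCORES = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
LABELS = [1, 1, 0, 1, 0, 0]

print(recall_at_threshold(SCORES, LABELS, 0.70))  # misses the fraud case scored 0.40
print(recall_at_threshold(SCORES, LABELS, 0.35))  # catches every fraud case
```

The looser threshold also flags the legitimate case scored 0.60, which is the false positive you accept in exchange.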
Neural Topic Model (NTM) Algorithm
Unsupervised technique for determining topics in a set of documents based on statistical distribution using a neural network approach. Ex: docs that contain food, hunger, waiter, service, kitchen, chef would likely be about restaurants or dining.
Softmax
Used on the final output layer of a multiple classification problem. Converts outputs to probabilities for each classification. Can't produce more than one label for something (use sigmoid if you need more than one label for something bc this will just pick the highest probability and use a single label).
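A minimal sketch contrasting the two choices: softmax yields one probability distribution over classes, while independent sigmoids can switch on several labels at once. The logits and the 0.5 cutoff are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multi_label(logits, cutoff=0.5):
    """Apply a sigmoid per class and return every class index above the cutoff."""
    return [i for i, z in enumerate(logits) if 1.0 / (1.0 + math.exp(-z)) > cutoff]


LOGITS = [2.0, 1.5, -3.0]
probabilities = softmax(LOGITS)
print(probabilities, probabilities.index(max(probabilities)))  # a single winning class
print(multi_label(LOGITS))                                     # possibly several labels
```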
Multinomial Logistic Regression
Used to predict categorical placement in or the probability of category membership on a dependent variable based on multiple independent variables. It is a simple extension of binary logistic regression that allows for more than two categories of the dependent or outcome variable. Like binary logistic regression, multinomial logistic regression uses maximum likelihood estimation to evaluate the probability of categorical membership.
NLP Preprocessing
Word tokenization, stop word removal, HTML tag removal, stemming, lemmatization, etc.
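A standard-library sketch of several of these steps chained together. The stop-word list is a tiny made-up sample, and naive_stem is a hypothetical suffix stripper standing in for a real stemmer; lemmatization is not shown.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def strip_html(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def naive_stem(token):
    """Strip a few common suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    tokens = tokenize(strip_html(text))
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("<p>The chefs are cooking in the kitchen</p>"))
```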
Amazon CloudWatch
You can monitor Amazon SageMaker using Amazon CloudWatch, which collects raw data and processes it into readable, near real-time metrics. These statistics are kept for 15 months so that you can access historical information and gain a better perspective on how your web application or service is performing. To help you debug your processing jobs, training jobs, endpoints, transform jobs, notebook instances, and notebook instance lifecycle configurations, anything an algorithm container, a model container, or a notebook instance lifecycle configuration sends to stdout or stderr is also sent to Amazon CloudWatch Logs. In addition to debugging, you can use these for progress analysis. You can also use Amazon CloudWatch Events to react to different SageMaker Events.
