AWS ML ExamTopics Dump 60 Full Questions and Answers
Question #7 A Machine Learning Specialist is building a model that will perform time series forecasting using Amazon SageMaker. The Specialist has finished training the model and is now planning to perform load testing on the endpoint so they can configure Auto Scaling for the model variant. Which approach will allow the Specialist to review the latency, memory utilization, and CPU utilization during the load test?
ANSWER: Generate an Amazon CloudWatch dashboard to create a single view for the latency, memory utilization, and CPU utilization metrics that are outputted by Amazon SageMaker.
Question #19 A Machine Learning Specialist is building a logistic regression model that will predict whether or not a person will order a pizza. The Specialist is trying to build the optimal model with an ideal classification threshold. What model evaluation technique should the Specialist use to understand how different classification thresholds will impact the model's performance?
ANSWER: Receiver operating characteristic (ROC) curve
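A minimal sketch of why the ROC curve answers the threshold question: each candidate threshold yields one (false positive rate, true positive rate) pair, and the curve is the set of those pairs. The labels and scores below are hypothetical values made up for illustration, not data from the question.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for each distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Predict "positive" whenever the score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Hypothetical data: 1 = ordered a pizza, 0 = did not; scores are model probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
for threshold, fpr, tpr in points:
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve toward higher TPR at the cost of higher FPR, which is the trade-off the Specialist needs to inspect.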
Question #22 A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined. The model needs to be retrained daily. Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?
ANSWER: Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data Firehose stream that transforms raw record attributes into simple transformed values using SQL.
Question #24 A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The Specialist needs to understand whether the model is more frequently overestimating or underestimating the target. What option can the Specialist use to determine whether it is overestimating or underestimating the target value?
ANSWER: Residual plots
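As a rough illustration, a residual is the actual value minus the predicted value, so a majority of negative residuals means the model mostly overestimates. The numbers below are hypothetical, and the sketch summarizes the residuals as counts rather than drawing the plot itself.

```python
def residual_summary(actual, predicted):
    """Compute residuals (actual - predicted) and count over/underestimates."""
    residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]
    over = sum(1 for r in residuals if r < 0)   # prediction was above the actual value
    under = sum(1 for r in residuals if r > 0)  # prediction was below the actual value
    mean_residual = sum(residuals) / len(residuals)
    return residuals, over, under, mean_residual


# Hypothetical targets and model predictions.
actual = [10.0, 12.0, 9.0, 15.0, 11.0]
predicted = [11.0, 13.5, 8.5, 16.0, 12.0]

residuals, over, under, mean_residual = residual_summary(actual, predicted)
print("residuals:", residuals)
print("overestimates:", over, "underestimates:", under)
print("mean residual:", mean_residual)
```

In a residual plot the same information appears as points sitting mostly below (overestimating) or above (underestimating) the zero line.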
Question #9 A Machine Learning Specialist is developing a custom video recommendation model for an application. The dataset used to train this model is very large with millions of data points and is hosted in an Amazon S3 bucket. The Specialist wants to avoid loading all of this data onto an Amazon SageMaker notebook instance because it would take hours to move and will exceed the attached 5 GB Amazon EBS volume on the notebook instance. Which approach allows the Specialist to use all the data to train the model?
ANSWER: Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training code is executing and the model parameters seem reasonable. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe input mode.
Question #14 A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided. Based on this information, which model would have the HIGHEST accuracy?
ANSWER: Logistic regression ANSWER: Support vector machine (SVM) with non-linear kernel
Question #10 A Machine Learning Specialist has completed a proof of concept for a company using a small data sample, and now the Specialist is ready to implement an end-to-end solution in AWS using Amazon SageMaker. The historical training data is stored in Amazon RDS. Which approach should the Specialist use for training a model using that data?
ANSWER: Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data Pipeline and provide the S3 location within the notebook.
Question #21 A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the deployed SageMaker endpoints, and all errors that are generated when an endpoint is invoked. Which services are integrated with Amazon SageMaker to track this information? (Choose two.)
ANSWER: AWS CloudTrail ANSWER: Amazon CloudWatch
Question #6 A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a corporate VPC. The ML Specialist has important data stored on the Amazon SageMaker notebook instance's Amazon EBS volume, and needs to take a snapshot of that EBS volume. However, the ML Specialist cannot find the Amazon SageMaker notebook instance's EBS volume or Amazon EC2 instance within the VPC. Why is the ML Specialist not seeing the instance visible in the VPC?
ANSWER: Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
Question #17 An employee found a video clip with audio on a company's social media feed. The language used in the video is Spanish. English is the employee's first language, and they do not understand Spanish. The employee wants to do a sentiment analysis. What combination of services is the MOST efficient to accomplish the task?
ANSWER: Amazon Transcribe, Amazon Translate, and Amazon Comprehend
Question #2 A Machine Learning Specialist is designing a system for improving sales for a company. The objective is to use the large amount of information the company has on users' behavior and product preferences to predict which products users would like based on the users' similarity to other users. What should the Specialist do to meet this objective?
ANSWER: Build a collaborative filtering recommendation engine with Apache Spark ML on Amazon EMR.
Question #18 A Machine Learning Specialist is packaging a custom ResNet model into a Docker container so the company can leverage Amazon SageMaker for training. The Specialist is using Amazon EC2 P3 instances to train the model and needs to properly configure the Docker container to leverage the NVIDIA GPUs. What does the Specialist need to do?
ANSWER: Build the Docker container to be NVIDIA-Docker compatible.
Question #12 A Machine Learning Specialist is working with a large company to leverage machine learning within its products. The company wants to group its customers into categories based on which customers will and will not churn within the next 6 months. The company has labeled the data available to the Specialist. Which machine learning model type should the Specialist use to accomplish this task?
ANSWER: Classification
Question #11 A Machine Learning Specialist receives customer data for an online shopping website. The data includes demographics, past visits, and locality information. The Specialist must develop a machine learning approach to identify the customer shopping patterns, preferences, and trends to enhance the website for better service and smart recommendations. Which solution should the Specialist recommend?
ANSWER: Collaborative filtering based on user interactions and correlations to identify patterns in the customer database.
Question #15 A Machine Learning Specialist at a company sensitive to security is preparing a dataset for model training. The dataset is stored in Amazon S3 and contains Personally Identifiable Information (PII). The dataset: ✑ Must be accessible from a VPC only. ✑ Must not traverse the public internet. How can these requirements be satisfied?
ANSWER: Create a VPC endpoint and apply a bucket access policy that restricts access to the given VPC endpoint and the VPC.
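A sketch of what such a bucket policy could look like, built here as a Python dictionary and printed as JSON. The bucket name and VPC endpoint ID are hypothetical placeholders. The statement denies every S3 action unless the request arrives through the named endpoint, using the `aws:SourceVpce` condition key.

```python
import json

# Hypothetical identifiers, for illustration only.
BUCKET_NAME = "example-pii-training-data"
VPC_ENDPOINT_ID = "vpce-1a2b3c4d"

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET_NAME}",
                f"arn:aws:s3:::{BUCKET_NAME}/*",
            ],
            # Deny any request that does not come through the given VPC endpoint.
            "Condition": {"StringNotEquals": {"aws:SourceVpce": VPC_ENDPOINT_ID}},
        }
    ],
}

policy_json = json.dumps(bucket_policy, indent=2)
print(policy_json)
```

Because traffic through a gateway VPC endpoint for S3 stays on the AWS network, this pairing covers both requirements: VPC-only access and no traversal of the public internet.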
Question #25 A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided. Based on this information,
ANSWER: Decision tree
Question #20 An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget. What should the Specialist do to meet these requirements?
ANSWER: Download word embeddings pre-trained on a large corpus.
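To illustrate how pre-trained embeddings feed a nearest neighbor lookup, the sketch below ranks words by cosine similarity. The three-dimensional vectors are invented toy values. Real pre-trained embeddings have hundreds of dimensions and would be loaded from a downloaded file.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(word, embeddings, k=1):
    """Return the k words whose vectors are most similar to the query word's vector."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    others.sort(key=lambda w: cosine_similarity(query, embeddings[w]), reverse=True)
    return others[:k]


# Toy stand-in for pre-trained embeddings (hypothetical values).
embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
    "banana": [0.12, 0.18, 0.88],
}

print(nearest_words("king", embeddings))   # words used in contexts similar to "king"
print(nearest_words("apple", embeddings))
```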
Question #23 A Machine Learning Specialist is building a convolutional neural network (CNN) that will classify 10 types of animals. The Specialist has built a series of layers in a neural network that will take an input image of an animal, pass it through a series of convolutional and pooling layers, and then finally pass it through a dense and fully connected layer with 10 nodes. The Specialist would like to get an output from the neural network that is a probability distribution of how likely it is that the input image belongs to each of the 10 classes. Which function will produce the desired output?
ANSWER: Softmax
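A minimal sketch of softmax applied to the outputs of a 10-node final layer. The logits are hypothetical. Subtracting the maximum before exponentiating is a common numerical-stability step and does not change the result.

```python
import math


def softmax(logits):
    """Map raw scores to a probability distribution that sums to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outputs of the final 10-node dense layer, one per animal class.
logits = [2.0, 1.0, 0.1, -0.5, 0.3, 1.5, -1.2, 0.0, 0.7, -0.3]

probs = softmax(logits)
print([round(p, 3) for p in probs])
print("sum:", sum(probs))
```

Every output lies between 0 and 1 and the outputs sum to 1, so they can be read as class probabilities; the largest logit gets the largest probability.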
Question #16 During mini-batch training of a neural network for a classification problem, a Data Scientist notices that training accuracy oscillates. What is the MOST likely cause of this issue?
ANSWER: The learning rate is very high.
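The effect can be sketched on a toy problem: plain gradient descent on f(w) = w^2, whose gradient is 2w. This is a simplified, deterministic stand-in for mini-batch training, and the learning rates are chosen only for illustration. With a small rate the weight moves steadily toward the minimum at 0. With a rate that is too high each step overshoots the minimum and the weight flips sign on every update.

```python
def gradient_descent_path(learning_rate, start=1.0, steps=6):
    """Run gradient descent on f(w) = w**2 and return every visited weight."""
    w = start
    path = [w]
    for _ in range(steps):
        gradient = 2 * w              # derivative of w**2
        w = w - learning_rate * gradient
        path.append(w)
    return path


small_rate_path = gradient_descent_path(0.1)
large_rate_path = gradient_descent_path(0.95)

print("lr=0.10:", [round(w, 3) for w in small_rate_path])
print("lr=0.95:", [round(w, 3) for w in large_rate_path])
```

In real training the same overshooting shows up as a loss or accuracy curve that bounces up and down instead of improving smoothly.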
Question #1 A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as the cost of churn is far greater than the cost of the incentive. The model produces the following confusion matrix after evaluating on a test dataset of 100 customers: Based on the model evaluation results, why is this a viable model for production?
ANSWER: The model is 86% accurate and the cost incurred by the company as a result of false positives is less than the false negatives.
Question #8 A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which solution requires the LEAST effort to be able to query this data?
ANSWER: Use AWS Glue to catalogue the data and Amazon Athena to run queries.
Question #5 A Data Engineer needs to build a model using a dataset containing customer credit card information. How can the Data Engineer ensure the data remains encrypted and the credit card information is secure?
ANSWER: Use AWS KMS to encrypt the data on Amazon S3 and Amazon SageMaker, and redact the credit card numbers from the customer data with AWS Glue.
Question #3 A Mobile Network Operator is building an analytics platform to analyze and optimize a company's operations using Amazon Athena and Amazon S3. The source systems send data in .CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it on Amazon S3. Which solution takes the LEAST effort to implement?
Question #4 A city wants to monitor its air quality to address the consequences of air pollution. A Machine Learning Specialist needs to forecast the air quality in parts per million of contaminants for the next 2 days in the city. As this is a prototype, only daily data from the last year is available. Which model is MOST likely to provide the best results in Amazon SageMaker?
ANSWER: Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of data with a predictor_type of regressor.