Udemy Practice Test Review
C
A classifier predicts if insurance claims are fraudulent or not. The cost of paying a fraudulent claim is higher than the cost of investigating a claim that is suspected to be fraudulent. Which metric should we use to evaluate this classifier? A. Precision B. Specificity C. Recall D. F1
C
A company needs to access training data in S3 without traversing the internet. How should training data stored in an S3 bucket be accessed from an Amazon SageMaker notebook instance running inside a customer-managed VPC? A. Run the notebook instance in a private subnet and access the S3 bucket via a bastion host B. Run the notebook instance in a private subnet and access the S3 bucket via a NAT gateway C. Access the S3 bucket using gateway VPC endpoints D. Access the S3 bucket using interface VPC endpoints
D
A Machine Learning Engineer is developing a deep learning model that classifies images of dogs, cats, and zebras. The label data is currently represented as dog, cat, or zebra, and the output layer uses softmax. Which data processing step should the ML Engineer apply to the label data prior to training? A. No processing B. Label encoding C. Map dog to 0, cat to 1 and zebra to 2 D. One-hot encoding
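A minimal sketch of the one-hot encoding named in option D, assuming the three class names from the question. The class ordering and the helper name `one_hot` are illustrative choices, not part of the original card.

```python
# Class names come from the question; their ordering here is an arbitrary assumption.
CLASSES = ["dog", "cat", "zebra"]


def one_hot(label, classes=CLASSES):
    """Return a vector with 1.0 at the label's index and 0.0 everywhere else."""
    if label not in classes:
        raise ValueError(f"unknown label: {label}")
    return [1.0 if c == label else 0.0 for c in classes]


labels = ["zebra", "dog", "cat"]
encoded = [one_hot(label) for label in labels]
print(encoded)
```

Each row has exactly one non-zero entry, so no artificial ordering between the classes is implied, unlike the integer mapping in option C.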
B
A Machine Learning Engineer is preparing a data frame for a supervised learning task with the Amazon SageMaker Linear Learner algorithm. The ML Engineer notices that one numerical feature has about 5% of its records missing. This numeric variable has extreme values, but they are considered plausible. What should the ML Engineer do to minimize bias due to missing values? A. Replace the missing values by the mean or median across non-missing values in the same row B. Approximate the missing values using supervised learning based on other features C. Replace each missing value by the mean or median across non-missing values in the column D. Delete the entire column
A
A Machine Learning Specialist built a linear regression model based on a dataset consisting of mortality rate (deaths per 100,000) and recorded water hardness (ppm of calcium). The residuals from the model are shown below. Which of the following conclusions about the residual plot is CORRECT? A. The residual plot indicates the model fits the dataset relatively well B. A linear relationship does not exist between mortality rate and calcium level C. The residual plot indicates a linear model is not appropriate to use for the dataset D. Residuals are computed by adding predicted values and actual values
B
A Machine Learning Specialist created a custom training image to run on SageMaker. What should the Specialist do to enable model training on GPU devices? A. SageMaker doesn't have 'bring your own training image' option B. Only the CUDA toolkit should be included on containers C. The container should be able to run with no modification D. Bundle NVIDIA drivers with the image
C E F
A Machine Learning Specialist is building a deep learning model for image classification. The training error is acceptable, but the model performs poorly on the test dataset. What is the likely problem with this model? (SELECT THREE) A. high bias B. underfitting C. high variance D. low variance E. overfitting F. low bias
A
A Machine Learning Specialist is developing a deep learning model. The model has an accuracy of 97%. However, the accuracy of the validation dataset is 65%. What should the Specialist do to make the model generalize well? A. Use dropouts B. Increase the number of epochs C. increase the mini-batch size D. Add more features
C
A Machine Learning Specialist is developing a machine learning model using the Logistic Regression algorithm. The classification problem requires tuning the model by varying the threshold, as false positives are more costly than false negatives. Which metric can meet the Specialist's need? A. AUC B. RMSE C. ROC D. Accuracy
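A small sketch of why a ROC curve suits threshold tuning: each threshold yields one (false-positive rate, true-positive rate) point, and the curve is the set of those points. The scores and labels below are made-up illustrative values, and `rates_at_threshold` is a hypothetical helper name.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (false_positive_rate, true_positive_rate) for one threshold."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(preds, labels))
    fp = sum(p and y == 0 for p, y in zip(preds, labels))
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


# Made-up predicted probabilities and true labels (1 = positive class).
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0]

for t in (0.3, 0.5, 0.7):
    fpr, tpr = rates_at_threshold(scores, labels, t)
    print(f"threshold={t}: FPR={fpr:.2f}, TPR={tpr:.2f}")
```

Raising the threshold in this toy data lowers the false-positive rate at the expense of the true-positive rate, which is the trade-off the question describes.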
B
A Machine Learning Specialist is developing a model to predict the likelihood of a customer purchasing a book. One of the features in the collected dataset is age. The Specialist determined that a linear correlation between age and the target variable could be useful in improving the model performance. How can the Specialist transform the age variable to achieve linearity between age and the target variable? A. One-hot encode the age variable B. Apply binning on the age variable C. Drop the age variable D. Use the age variable as is
B
A Machine Learning Specialist is developing a predictive model to predict customer churn. The business requirement is that losing a customer is more expensive than taking preventive actions, such as providing discounted rates or free coupons, that can reduce the churn rate. Which performance metric should the ML Specialist focus on while tuning the model? A. False-positive rate B. Recall C. Accuracy D. Precision
A
A Machine Learning Specialist is developing a system that captures streaming video from webcams, analyzes the video to detect faces of celebrities. When a face in the video is matched with the face collection in S3 the matching information is streamed and a text notification is sent to an end-user. Which of the following AWS services can meet the requirement? A. Amazon Kinesis Video Streams and Amazon Rekognition Video and Amazon Kinesis Streams B. Amazon Kinesis Video streams and Amazon Kinesis streams C. Amazon Kinesis Video Streams and Amazon Rekognition Video D. AWS Lambda for sending the text notification and DynamoDB for streaming the text notification
D
A Machine Learning Specialist is exploring data to build machine learning model using linear regression algorithm. The Specialist found out that one of the features has highly skewed distribution. What should the Specialist do to reduce the skew? A. Adding more data will definitely help B. scaling the data can help C. distribution is right skewed and log transformation might help D. distribution is left skewed and box-cox transformation might help
C
A Machine Learning Specialist is required to build a supervised image-recognition model to identify a dog. The ML Specialist performs some tests and records the following results for a neural-based image classifier. Total number of images available = 2000. Test set images = 200. The ML Specialist notices that, in over 85% of the misclassified images, the dog images were upside down. Which technique can be used by the ML Specialist to improve this specific test error? A. Use dropouts on the last fully connected layer B. Increase the learning rate C. Use data augmentation to generate more training data to account for image orientations D. Increase the number of epochs
B C D E
A Machine Learning Specialist needs to deploy a pre-trained TensorFlow model into Amazon SageMaker. The model was trained outside of Amazon SageMaker. Which steps are needed to achieve this goal? (SELECT FOUR) A. Model definitions must be written in the MXNet framework B. The SageMaker model is deployed as an endpoint C. Using the model definitions, artifacts, and the Amazon SageMaker Python SDK, a SageMaker model is created. D. The model is exported and model artifacts that can be understood by Amazon SageMaker are created. E. Model artifacts are uploaded to an Amazon S3 bucket
A
A Machine Learning Specialist needs to train a deep learning model using the TensorFlow framework. As the training is done on large datasets, the Specialist wants to speed up the process by distributing the training across multiple machines or processes in a cluster. Distributed training with TensorFlow must be set up and launched quickly, without the expense and difficulty of directly managing training clusters. Training updates should also be done synchronously. Which framework can meet these requirements? A. Horovod distributed framework B. Distributed training is not possible using TensorFlow so other deep learning frameworks should be used C. EMR cluster D. TensorFlow's native parameter server
B
A Machine Learning Specialist needs to use a metric to evaluate the performance of a classifier model trained on imbalanced data. Which metric works best for evaluating the performance of a model with imbalanced classes? A. AUC B. F1 score C. RMSE D. Accuracy
D
A Machine Learning Specialist needs to use an EMR cluster for an ETL task and train the model using SageMaker. Which of the following frameworks can the Specialist use with minimal to no code to connect the EMR cluster and SageMaker? A. Apache Kafka B. Apache MXNet C. AWS Glue D. Apache Spark
A
A publishing company has embarked on a big data journey and needs to collect pertinent data from more than 200 digital sites in real time. This data gives invaluable insight into the usage of their sites and indicates the most relevant trending topics based on content. The company needs a service to reliably ingest and deliver batched, compressed, and encrypted data to S3. Which service BEST meets the company's requirement? A. Amazon Kinesis Firehose B. AWS Lambda C. Amazon Kinesis Data Analytics D. Amazon Kinesis Streams
A
A convention wishes to install cameras that automatically detect when conference attendees are seen wearing a specific company shirt, as part of a contest. Which is a viable approach? A. Use DeepLens and the DeepLens_Kinesis_Video module to send video streams to a CNN trained with SageMaker using a labeled training set of videos of the company shirts. B. Use DeepLens and the DeepLens_Kinesis_Video module to analyze video in real time using the ImageNet CNN. C. Send raw video feeds into Amazon Rekognition to detect the company shirts D. Use RNN's embedded within DeepLens to detect shirts at the edge
A
A data engineer should not use AWS Glue for one of the following tasks. A. To handle streaming data B. To build end-to-end ETL workflow using multiple jobs C. To discover properties of the data you own, transform it, and prepare it for analytics D. To run your existing Scala or Python code for ETL Jobs
A
A data scientist wants to build a machine learning model using Amazon SageMaker. There is a massive amount of data currently stored in Amazon Redshift cluster. How can the data scientist best perform iterative data transformation to reduce the size of the datasets? A. Use an EMR Cluster running Apache Spark. Interact with the cluster from your Jupyter notebook using Apache Livy. B. Read the datasets directly into your notebook and perform analysis. C. Use AWS Glue to perform data transformation. Perform your analysis by directly connecting your Jupyter notebook to AWS Glue. D. Copy the data to an RDS cluster. Directly connect to the cluster from the notebook and analyze the data.
D
A dataset representing a clinical trial includes many features, including Mean Arterial Pressure (MAP). The various features are not well correlated, and less than 1% of the data is missing MAP information. Apart from some outliers, the MAP data is fairly evenly distributed. All other features contain complete information. Which is the best choice for handling this missing data? A. Populate the missing MAP values with random noise B. Drop the MAP column C. Impute the missing values with the mean MAP value. D. Impute the missing values with the median MAP value.
C
A deep learning model is performing well on the training dataset but tends to generalize poorly. Which of the following steps can improve the performance of this model? A. decreasing the number of GPUs B. increasing the number of epochs C. decreasing the mini-batch size D. increasing the number of layers
A
A file containing the following sentences is used as input to the Amazon BlazingText algorithm in Word2Vec mode. How many vectors will be generated as an output? 'apple', 'day', 'keep', 'doctor', 'away' 'Apple', 'day', 'keep', 'doctor', 'away' 'safety', 'first' A. 3 B. 4 C. 2 D. 5
D
A financial services company needs to automate the analysis of each day's transaction costs, execution reporting, and market performance. They have developed their own Big Data tools to perform this analysis, which require the scheduling and configuration of their underlying computing resources. Which tool provides the simplest approach for configuring the resources and scheduling the data analytic workloads? A. Amazon Simple Workflow Service B. Amazon SQS C. AWS Step Functions D. AWS Batch
C
A large news website needs to produce personalized recommendations for articles to its readers, by training a machine learning model on a daily basis using historical click data. The influx of this data is fairly constant, except during major elections when traffic to the site spikes considerably. Which system would provide the most cost-effective and simplest solution? A. Publish click data into Amazon S3 using Kinesis Streams, and process the data in real time using Splunk on an EMR cluster with spot instances added as needed. Publish the model's results to DynamoDB for producing recommendations in real-time. B. Publish click data into Amazon Elasticsearch using Kinesis Firehose, and query the Elasticsearch data to produce recommendations in real-time. C. Publish click data into Amazon S3 using Kinesis Firehose, and process the data nightly using Apache Spark and MLLib using spot instances in an EMR cluster. Publish the model's results to DynamoDB for producing recommendations in real-time. D. Publish click data into Amazon S3 using Kinesis Firehose, and process the data nightly using Apache Spark and MLLib using reserved instances in an EMR cluster. Publish the model's results to DynamoDB for producing recommendations in real-time.
C
A machine learning model performed well on the training data but failed to generalize well on the validation and test datasets. What should be done to address this issue? A. Apply dimensionality reduction B. add new features C. add regularization D. increase the number of epochs
D
A media company wishes to recommend movies to users based on the predicted rating of that movie for each user, using SageMaker and Factorization Machines. What format should the rating data used for training the model be in? A. LibSVM B. CSV C. RecordIO / protobuf in integer format D. RecordIO/ protobuf in float32 format
C D
A medical company is building a model to predict the occurrence of thyroid cancer. The training data contains 900 negative instances (people who don't have cancer) and 100 positive instances. The resulting model has 90% accuracy, but extremely poor recall. What steps can be used to improve the model's performance? (SELECT TWO) A. Under-sample instances from the positive (has cancer) class B. Generate synthetic samples using SMOTE C. Over-sample instances from the negative (no cancer) class D. Collect more data for the positive case E. Use Bagging
B
A model _______ have to be trained using Amazon SageMaker in order to use Amazon SageMaker Neo to convert the model. A. does B. does not
overfitting
A model that fails to generalize to new data implies _______________. Hence, regularization should be applied.
A C
A regression model on a dataset including many features includes L1 regularization, and the resulting model appears to be underfitting. Which steps might lead to better accuracy? (SELECT TWO) A. Decrease the L1 regularization term B. Remove features C. Try L2 instead of L1 regularization D. Increase the L1 regularization term E. Use L0 regularization
outliers
A rough imputation method such as mean or median can be a reasonable choice when only a handful of values are missing, and there aren't large relationships between features that we might compromise. Due to the _______ mentioned, median is a better choice than mean.
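A short sketch of why outliers favour the median over the mean for imputation. The readings are made-up values (one missing, one extreme), not data from the question.

```python
import statistics

# Made-up MAP readings in mmHg; None marks a missing value and 250 is an outlier.
readings = [88, 92, 95, None, 90, 250, 93]
observed = [r for r in readings if r is not None]

mean_fill = statistics.mean(observed)      # pulled upward by the outlier
median_fill = statistics.median(observed)  # barely affected by the outlier

# Fill the gap with the median, the more robust of the two statistics.
imputed = [median_fill if r is None else r for r in readings]
print(mean_fill, median_fill, imputed)
```

With these values the mean lands well above every typical reading, while the median stays inside the bulk of the data.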
A
An aspiring Data Scientist wants to develop a model that recommends (suggests) relevant tags when a user posts on StackOverflow. Which algorithm is the best choice for this task? A. Latent Dirichlet Allocation (LDA) B. Random Cut Forest C. BlazingText Word2Vec D. K-Means clustering
D
An e-commerce company needs to pre-process large amounts of consumer behavior data stored in HDFS using Apache Spark on EMR prior to analysis on a daily basis. The volume of data is highly seasonal, and can surge during holidays and big sales. What is the most cost-effective option for handling these sporadic demands, without incurring data loss or needing to terminate the entire cluster? A. Use EC2 Spot instances for core and task nodes, and Reserved instances for the master node B. Use reserved instances for task nodes, and spot instances for core nodes. C. Use EC2 spot instances for all node types. D. Use EC2 Spot instances for Spark task nodes only.
encryption
Inter-container ______ is just a checkbox away when creating a training job via the SageMaker console. It can also be specified using the SageMaker API with a little extra work.
No
Is scikit-learn a distributed solution?
C
A Machine Learning Specialist is about to create an Amazon SageMaker notebook instance. The ML Specialist wants to prevent direct internet access from the instance. How can the ML Specialist prevent this? A. Create the notebook instance without providing any VPC information, as the default setting prevents direct internet access B. Create the notebook instance with a customer-attached VPC with direct internet access C. Create the notebook instance with a customer-attached VPC without direct internet access D. Create the notebook instance within the SageMaker-managed VPC
C
Parallel processing with multiple GPUs is an important step in scaling the training of deep models. To achieve this scalability, a Machine Learning Specialist wants to move training of a deep model with mini-batch size 32 and learning rate 0.01 to multiple GPUs. How should the mini-batch size and the learning rate be adjusted if the plan is to use 4 GPUs for distributed training? A. mini-batch size decreased by a factor of 2 and learning rate reduced by a factor of 2 B. mini-batch size increased by a factor of 2 and the learning rate increased by a factor of 2 C. mini-batch size increased by a factor of 4 and the learning rate increased by a factor of 4 D. mini-batch size decreased by a factor of 4 and the learning rate decreased by a factor of 4
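The arithmetic behind option C can be sketched as below. This assumes the common linear-scaling heuristic (multiply both the batch size and the learning rate by the number of GPUs); it is a rule of thumb rather than a guarantee, and `scale_for_gpus` is a hypothetical helper name.

```python
def scale_for_gpus(batch_size, learning_rate, num_gpus):
    """Scale the global mini-batch size and learning rate linearly with GPU count."""
    return batch_size * num_gpus, learning_rate * num_gpus


# Values from the question: batch size 32, learning rate 0.01, 4 GPUs.
new_batch, new_lr = scale_for_gpus(32, 0.01, 4)
print(new_batch, new_lr)
```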
B
When using SageMaker's BlazingText algorithm in Word2Vec mode, which of the following statements are true? A. The order of words does matter, as it uses skip-gram and continuous bag of words (CBOW) architectures. B. The order of words doesn't matter, as it uses skip-gram and continuous bag of words (CBOW) architectures. C. The order of words matters, because it uses LSTM internally D. The order of words does not matter, as it uses a CNN internally.
C
Which is a valid approach for determining the optimal value of k in k-Means clustering? A. Use the "elbow method" on a plot of accuracy as a function of k B. Use k-fold cross validation C. Use the "elbow method" on a plot of the total within-cluster sum of squares (WSS) as a function of k D. Use SGD to converge on k
A B C
Which of the following are the best practices for protecting model and training data at rest? (Choose 3) A. Pass an AWS KMS key to encrypt the attached machine learning (ML) storage volume B. ML data volumes may be encrypted with customer-specified AWS KMS keys C. Use encrypted S3 buckets for model artifacts and data D. If you do not specify an AWS KMS key, Amazon SageMaker encrypts storage volumes with an AWS-managed AWS KMS key
A B D
Which of the following are true about an inference pipeline? (Choose 3) A. Within an inference pipeline model, Amazon SageMaker handles invocations as a sequence of HTTP requests. B. Define and deploy any combination of pre-trained Amazon SageMaker built-in algorithms and your own custom algorithms packaged in Docker containers. C. An existing inference pipeline model can be modified any time. D. Preprocessing, predictions, and post-processing data science tasks can be combined
B C D
Which of the following are true about data preprocessing techniques prior to training a machine learning algorithm? A. Encode nominal categorical features as integers B. Define a hierarchy structure for nominal categorical features with many levels, and group levels by similarity to reduce the number of levels C. Dropping rows with missing values can result in underfitting D. Decision trees and random forests are generally not sensitive to feature scaling
C
Which of the following data preprocessing steps is not applicable to the following text for developing an NLP model? <a href="thas will be gone, too">But this will still be here!</a>" A. Normalization B. Tokenization C. Spelling correction D. Noise removal
A B C
Which of the following mechanisms help in reducing the impact of the exploding or vanishing gradient problem when training neural networks? (Choose three) A. Rectified Linear Activation Unit (ReLU) B. Gradient Clipping C. Weight regularization D. Increasing number of Epochs
D
Which of the following options allows a notebook instance running in a customer-attached VPC without direct internet access to reach the internet? A. It can access the internet back through a virtual gateway & VPC endpoint B. It can access the internet through a VPC endpoint C. The network interface only has a private IP address and it should be in a public subnet to access the internet D. The network interface only has a private IP address and it should be in a private subnet with a NAT to access the internet
D
Which of the following statements is true about protecting communications between ML compute instances in a distributed training job using Amazon SageMaker? A. Inter-container traffic encryption can provide additional data protection between instances B. Enabling inter-container traffic encryption doesn't affect training jobs with a single compute instance C. Enabling inter-container traffic encryption can increase training time if deep learning algorithms are used D. Enabling inter-container traffic encryption can increase the cost for Amazon SageMaker built-in algorithms such as XGBoost, DeepAR, and linear learner
B
You are analyzing Tweets from some public figure, and want to compute an embedding that shows past Tweets that are semantically similar to each other. Which tool would be best suited to this task? A. SageMaker Factorization Machines B. SageMaker Object2Vec C. Amazon Transcribe D. SageMaker BlazingText in word2vec mode
C, E
You are deploying your own custom inference container on Amazon SageMaker. Which of the following are requirements of your container? (SELECT TWO) A. Must be compressed in ZIP format B. Respond to both /invocations and /ping on port 80 C. Respond to both /invocations and /ping on port 8080 D. Respond to GET requests on the /ping endpoint in under 5 seconds. E. Accept all socket connection requests within 250 ms.
C
You are developing a computer vision system that can classify every pixel in an image based on its image type, such as people, buildings, roadways, signs, and vehicles. Which SageMaker algorithm would provide you with the best starting point for this problem? A. Rekognition B. Object Detection C. Semantic Segmentation D. Object2Vec
C
You are developing a machine learning model to predict house sale prices based on features of a house. 10% of the houses in your training data are missing the number of square feet in the home. Your training data set is not very large. Which technique would allow you to train your model while achieving the highest accuracy? A. Drop all rows that contain missing data B. Impute the missing values using deep learning, based on other features such as number of bedrooms C. Impute the missing square footage values using kNN D. Impute the missing values using the mean square footage of all homes
C
You are developing a machine translation model using SageMaker's seq2seq model. What format must your training data be provided in? A. Text in JSON format B. Text in CSV format C. RecordIO-protobuf format with integer tokens D. RecordIO-protobuf with floating point tokens
B
You are developing an autonomous vehicle that must classify images of street signs with extremely low latency, processing thousands of images per second. What AWS-based architecture would best meet this need? A. Use Amazon Rekognition on AWS DeepLens to identify specific street signs in a self-contained manner. B. Develop your classifier with TensorFlow, and compile it for an NVIDIA Jetson edge device using SageMaker Neo, and run it on the edge with IoT GreenGrass. C. Use Amazon Rekognition in edge mode D. Develop your classifier using SageMaker Object Detection, and use Elastic Inference to accelerate the model's endpoints called over the air from the vehicle.
A
You are ingesting a data feed of subway ridership in near-real-time. Your incoming data is timestamped by the minute, and includes the total number of riders at each station for that minute. What is the simplest approach for automatically sending alerts when an unusually high or low number of riders is observed? A. Ingest the data with Kinesis Data Firehose, and use Random Cut Forest in Kinesis Data Analytics to detect anomalies. Use AWS Lambda to process the output from Kinesis Data Analytics, and issue an alert via SNS if needed. B. Publish data directly into S3, and use Glue to detect anomalies and pass on alerts to SNS. C. Ingest the data with Kinesis Firehose, and use Amazon CloudWatch to alert when anomalous data is detected. D. Ingest the data with Kinesis Data Streams directly into S3, and use Random Cut Forest in SageMaker to detect anomalies in real-time. Integrate SageMaker with SNS to issue alarms.
A
You are running SageMaker training jobs within a private VPC with no Internet connectivity, for security reasons. How can your training jobs access your training data in S3 in a secure manner? A. Create an Amazon S3 VPC Endpoint, and a custom endpoint policy to restrict access to S3 B. Use bucket policies to restrict access to your VPC C. Make the S3 bucket containing training data public D. Use NAT translation to allow S3 access
D
You are tasked with developing a machine learning system that can detect the presence of your company's logo in an image. You have a large training set of images that do and do not contain your logo, but they are unlabeled. How might you prepare this data prior to training a supervised learning model with it, with the least amount of development effort? A. Use Amazon Rekognition B. Use Amazon Mechanical Turk C. Use SageMaker Object Detection D. Use Amazon SageMaker Ground Truth
C
You are training SageMaker's supervised BlazingText using file mode. Which is an example of a properly formatted line within the training file? A. __label4 linux ready for prime time, intel says. B. __label__4 Linux ready for prime time, Intel says. C. __label__4 linux ready for prime time , intel says . D. __label__4 linux ready for prime time, intel says.
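A rough sketch of producing a line shaped like option C: a `__label__` prefix, lowercased text, and punctuation split off as separate space-delimited tokens. The regex tokenizer and the helper name `to_blazingtext_line` are simplifying assumptions; a real pipeline would more likely use a proper tokenizer.

```python
import re


def to_blazingtext_line(label, sentence):
    """Format one training example as '__label__<label> tok tok ...'."""
    # Lowercase, then split into word tokens and individual punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    return f"__label__{label} " + " ".join(tokens)


line = to_blazingtext_line(4, "Linux ready for prime time, Intel says.")
print(line)
```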
C
You are training a Linear Learner model in SageMaker, with normalization enabled. However, your training is failing. What might be a probable cause? A. You are attempting to perform classification instead of regression. B. The data was shuffled prior to training. C. The data was not shuffled prior to training. D. Normalization should be disabled.
A E
You are using SageMaker and XGBoost to classify millions of videos into genres, based on each video's attributes. Prior to training your model, the video attribute data must be cleaned and transformed into LibSVM format. Which are viable approaches for pre-processing the data? (SELECT TWO) A. Use Spark on EMR to pre-process the data, and store the processed results in an S3 bucket accessible to SageMaker. B. Use Glue ETL to transform the data into LibSVM format, and then train with SageMaker. C. Use scikit-learn in your SageMaker notebook to pre-process the data, and then train it. D. Use Kinesis Analytics to transform the data as it is received into LibSVM format, then train with SageMaker. E. Use PySpark with the XGBoostSageMakerEstimator to prepare the data using Spark, and then pass off the training to SageMaker.
normality
You can transform your data into __________ using a Box-Cox transformation to unskew a curve
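For reference, the one-parameter Box-Cox transform can be written directly from its definition: (x^λ − 1) / λ for λ ≠ 0, and ln(x) for λ = 0, with x > 0. In the sketch below λ is fixed by hand for illustration; in practice it is normally estimated from the data. The function name `box_cox` is a hypothetical helper.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


# Made-up right-skewed sample; lam = 0 reduces to a plain log transform.
sample = [1, 2, 4, 8, 64, 512]
transformed = [box_cox(x, 0) for x in sample]
print(transformed)
```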
A
You decreased the batch size used to train your deep neural network, and found that the accuracy of the model suddenly suffered as a result. What is a likely cause? A. The small batch size caused training to get stuck in local minima B. The small batch size caused training to overshoot the true minima C. Shuffling should not be used D. Too many layers are being used
A
You have a large set of encyclopedia articles in text format, but do not have topics already assigned to each article to train with. Which tool allows you to automatically assign topics to articles with a minimum of human effort? A. LDA B. Amazon Translate C. Ground Truth D. Random Cut Forest
C
You have created a SageMaker notebook instance using its default IAM role. How is access to data in S3 managed? A. The default IAM role allows access to any S3 bucket, regardless of name B. No buckets are available by default; you must edit the default IAM role to explicitly allow access. C. Any bucket with "sagemaker" in the name is accessible with the default role D. Only S3 buckets with public access enabled are accessible
C
You increased the learning rate on your deep neural network in order to speed up its convergence, but suddenly the accuracy of the model has suffered as a result. What is a likely cause? A. Shuffling should not be used B. Too many layers are being used C. The true minimum of your loss function was overshot while learning D. SGD got stuck in local minima
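A toy illustration of overshooting, assuming plain gradient descent on f(x) = x², whose minimum is at x = 0. The learning rates are arbitrary illustrative values and `gradient_descent` is a hypothetical helper.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimise f(x) = x**2 (gradient 2*x) and return the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # standard update: x <- x - lr * f'(x)
    return x


small_lr_result = gradient_descent(lr=0.1)  # each step shrinks x toward 0
large_lr_result = gradient_descent(lr=1.1)  # each step jumps past 0 and grows
print(small_lr_result, large_lr_result)
```

With the larger rate every update lands farther from the minimum than the previous one, so the loss increases instead of converging.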
B
You've set up a camera in Los Angeles, and want to be notified when a known celebrity is spotted. Which services, used together, could accomplish this with the least development effort? A. SageMaker Object Detection, Lambda, and SNS B. Amazon Rekognition, IAM, and SNS C. SageMaker Semantic Segmentation, SQS, and SNS
B
Your XGBoost model has high accuracy on its training set, but poor accuracy on its validation set, suggesting overfitting. Which hyperparameter would be most likely to improve the situation? A. booster B. subsample C. csv_weights D. grow_policy
B C
Your automatic hyperparameter tuning job in SageMaker is consuming more resources than you would like, and coming at a high cost. What are TWO techniques that might reduce this cost? A. Use more concurrency while tuning B. Use less concurrency while tuning C. Use logarithmic scales on your parameter ranges D. Use inference pipelines E. Use linear scales on your parameter ranges
B
Your company wishes to monitor social media, and perform sentiment analysis on Tweets to classify them as positive or negative sentiment. You are able to obtain a data set of past Tweets about your company to use as training data for a machine learning system, but they are not classified as positive or negative. How would you build such a system? A. Stream both old and new tweets into an Amazon Elasticsearch Service cluster, and use Elasticsearch machine learning to classify the tweets. B. Use SageMaker Ground Truth to label past Tweets as positive or negative, and use those labels to train a neural network on SageMaker. C. Use RANDOM_CUT_FOREST to automatically identify negative tweets as outliers. D. Use Amazon Machine Learning with a binary classifier to assign positive or negative sentiments to the past Tweets, and use those labels to train a neural network on an EMR cluster.
8080, 2, tar
Your inference container responds to port _____, and must respond to ping requests in under ____ seconds. Model artifacts need to be compressed in ____ format, not zip.
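A bare-bones sketch of the two routes the card describes, served on port 8080 with Python's standard library. The echo "model" is a placeholder, and `route`, `Handler`, and `serve` are hypothetical names; a real container would load the model artifacts and run inference in the `/invocations` branch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, body=b""):
    """Return (status, response_body) for the health-check and inference routes."""
    if method == "GET" and path == "/ping":
        return 200, b""  # an empty 200 response signals a healthy container
    if method == "POST" and path == "/invocations":
        return 200, body  # placeholder: echo the payload instead of predicting
    return 404, b"not found"


class Handler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        self.send_response(status)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._reply(*route("GET", self.path))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._reply(*route("POST", self.path, self.rfile.read(length)))


def serve(port=8080):
    """Start the server; this call blocks until the process is stopped."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `serve()` starts listening on port 8080. Keeping the routing logic in a plain function makes it easy to check without opening a socket.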
C
Your machine learning system needs to be re-trained nightly on millions of records. The training involves multiple long-running ETL jobs which need to execute in order and complete successfully, prior to being handed off to a machine learning training workflow. Which is the simplest tool to help manage this process? A. Amazon SQS B. AWS Batch C. AWS Step Functions D. Amazon Simple Workflow Service
A
Your neural network is underfitting, and in response you've added more layers. Upon adding additional layers, your accuracy no longer converges successfully while training. What is the most likely cause? A. Use of a sigmoid activation function is leading to the "vanishing gradient" problem; ReLU may work better. B. The additional layers are now causing your model to over-fit. C. Too many training epochs are being used. D. The learning rate needs to be increased.
Transcribe
______ can convert a customer's speech to text, which could then be fed into Lex for handling the chatbot logic.
LSTM
______ is a specific type of RNN well-suited for music
Deep Learning, KNN
______ is better suited to the imputation of categorical data. _____ is better suited to the imputation of numerical data.
Object2Vec, BlazingText
______ is capable of creating embeddings for arbitrary objects, such as Tweets. ________ can only find relationships between individual words, not entire Tweets.
Random Cut Forest
______ is integrated within Kinesis Data Analytics
RBF
______ kernel is a function whose value depends on the distance from the origin or from some point.
Large
_______ batch sizes lead to faster training
Poisson
_______ distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time
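For reference, the Poisson probability of seeing exactly k events when the average rate is lam per interval is exp(-lam) * lam^k / k!. A minimal sketch (the function name poisson_pmf is made up for this example):

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events in an interval with mean rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


if __name__ == "__main__":
    # Example: an average of 3 events per interval.
    for k in range(6):
        print(k, round(poisson_pmf(k, 3.0), 4))
```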
Random Cut Forest
_______ is Amazon's own algorithm for anomaly detection, and is usually the right choice when anomaly detection is asked for on the exam.
Recall
_______ is TP / (TP+FN)
Precision
_______ is TP / (TP+FP)
RecordIO
_______ is usually the best choice on models that support it, as it allows for efficient processing and the use of Pipe mode.
Pipe
_______ mode allows you to stream in data, instead of copying the entire dataset to every machine you are training on. For large data sets this can make a big difference.
Transfer learning
________ generally involves using an existing model, or adding additional layers on top of one.
Seasonality, trends
________ refers to periodic changes, while _______ are longer-term changes over time. A trend across seasonal data would result in periodic seasonal spikes and valleys increasing or decreasing over time.
Variance
_________ captures how much your classifier changes if you train on a different training set. How "over-specialized" is your classifier to a particular training set (overfitting)?
Decreasing
___________ the mini-batch size can result in a model that generalizes well.
Increasing
____________ the number of epochs can lead to overfitting.
Bias
_____________ is the inherent error that you obtain from your classifier even with infinite training data. This is due to your classifier being "biased" to a particular kind of solution (e.g. linear classifier). In other words, bias is inherent to your model.
Large
_______ batch sizes run the risk of getting stuck in local minima instead of finding the true one.
ReLU
A "vanishing gradient" results from multiplying together many small derivatives of the sigmoid activation function in multiple layers. _____ does not have a small derivative, and avoids this problem.
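To make the multiplication concrete: the sigmoid derivative peaks at 0.25, so chaining it through many layers shrinks the gradient quickly, while the ReLU derivative is 1 for positive inputs. The sketch below is a simplification that ignores the weight terms in the chain rule; the helper names are invented for this example.

```python
import math


def sigmoid_grad(z):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chained_gradient(grad_fn, pre_activations):
    """Multiply the activation derivatives layer by layer (weights ignored)."""
    product = 1.0
    for z in pre_activations:
        product *= grad_fn(z)
    return product


if __name__ == "__main__":
    print(chained_gradient(sigmoid_grad, [0.0] * 10))  # best case for sigmoid
    print(chained_gradient(relu_grad, [1.0] * 10))     # active ReLU units
```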
D
A Data Analyst needs to visualize ROI, investment time and investment size in the same chart. Which chart type BEST meets the DA's need? A. Line Chart B. Scatter Chart C. Pie Chart D. Bubble Chart
B
A Key Performance Indicator (KPI) is usually a single value that relates to a particular area or function and is a reflection of how well a company is doing in that area or function. Which of the following popular KPIs indicates how likely a customer is to recommend a product or service to a friend? A. Conversion rate B. Net Promoter Score (NPS) C. Relative market share D. Customer Profitability Score (CPS)
B D
A Machine Learning Specialist needs to use an Inception v3 network architecture because the performance of the model based on the Amazon SageMaker built-in image classification algorithm is not satisfactory. Which of the following will accomplish this? (Choose two.) A. Customize the built-in image classification algorithm to use Inception and use this for model training. B. Bundle your own container with the new architecture and import it into Amazon Elastic Container Registry C. Contact the support center to make the Inception v3 network architecture available for training D. Customize the Amazon SageMaker TensorFlow container with the new architecture transfer learning code in the TensorFlow framework and import this container into Amazon ECR.
A C F
A Machine Learning Specialist is building an ML model for a classification problem. The specialist discovered that the training error is higher than expected and that tuning the hyper-parameters of the model doesn't improve the performance. What remedies may be used to get better performance? (Select 3) A. Add more features B. Add more data C. Use a more complex model (e.g. kernelize, use non-linear models) D. Bagging E. Reduce model complexity F. Boosting
A
A Machine Learning Specialist wants to build a face detection system using an AWS DeepLens device. The specialist also needs to train her own deep learning model on Amazon SageMaker. Which option requires MINIMUM effort to send video feeds at given time intervals from the AWS DeepLens device for use as training input? A. Use the DeepLens_Kinesis_Video module to integrate DeepLens with Amazon Kinesis Video Streams. Use the data as input for training. B. Use AWS DeepLens with Kinesis Firehose to push data to S3. Use the data as input for training. C. Use AWS Lambda to fetch stream data from AWS DeepLens and put it on S3. Use the data as input for training. D. Use AWS DeepLens to write data directly to S3 and use the data as input for training.
A B D
A data engineer is requested to create an external Hive metastore to help ensure that the Hive metadata store can scale with the implementation and that the metastore persists even if the cluster terminates. What are the options for creating an external Hive metastore for EMR cluster? (Choose three) A. Use Amazon Aurora B. Use the AWS Glue Data Catalog C. Use Redshift Spectrum D. Use Amazon RDS
large
A learning rate that is too ____ may overshoot the true minima
small
A learning rate that is too _____ will slow down convergence.
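Both learning-rate cards can be seen on a toy problem. The sketch below runs plain gradient descent on f(x) = x^2 (gradient 2x); the function name, starting point, and step count are arbitrary choices for illustration.

```python
def gradient_descent(lr, steps=50, x0=5.0):
    """Minimize f(x) = x**2 with a fixed learning rate; returns the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x**2 is 2x
    return x


if __name__ == "__main__":
    print("too large:", gradient_descent(1.1))    # overshoots and diverges
    print("too small:", gradient_descent(0.001))  # barely moves from 5.0
    print("moderate: ", gradient_descent(0.1))    # approaches the minimum at 0
```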
B
A legacy MapReduce job is running on EMR, and must ingest data that has been moved from HDFS to an S3 data lake. Which is a viable option for connecting the S3 data to MapReduce on EMR? A. MapReduce can talk natively to S3 using the s3a:// prefix B. Use EMRFS to connect MapReduce to S3, using the s3:// file prefix. C. Use Apache Hive as an intermediary between MapReduce and S3 D. Use EFS to connect MapReduce to the S3 bucket.
B
A pipeline model _______ be changed, but you can update an inference pipeline by deploying a new one using the UpdateEndpoint operation. A. can B. cannot
A B
After training a deep neural network over 100 epochs, it achieved high accuracy on your training data, but lower accuracy on your test data, suggesting the resulting model is overfitting. What are TWO techniques that may help resolve this problem? A. Use dropout regularization B. Use early stopping C. Use more features in the training data D. Use more layers in the network E. Employ gradient checking
D
Amazon Kinesis Data Firehose allows you to compress your data before delivering it to Amazon S3. The service currently supports GZIP, ZIP, and SNAPPY compression formats. Which format is currently supported if the data is further loaded to Amazon Redshift? A. ZIP B. SNAPPY C. PARQUET D. GZIP
Neo
Amazon SageMaker _____ enables machine learning models to train once and run anywhere in the cloud and at the edge
NTM
Amazon SageMaker _____ is an unsupervised learning algorithm that is used to organize a corpus of documents into topics.
B
An image recognition model using a CNN is capable of identifying flowers in an image, but you need an image recognition model that can identify specific species of flowers as well. How might you accomplish this effectively while minimizing training time? A. Train a new CNN from scratch with only your flower species labels B. Use transfer learning by training a new classification layer on top of the existing model C. Use transfer learning by training the entire model with new labels D. Use incremental training on Amazon Rekognition
Spark
Apache Spark can be used for preprocessing data and Amazon SageMaker for model training and hosting with no additional code since Amazon SageMaker provides an Apache _____ library, in both Python and Scala.
overfitting
Dropout regularization forces the learning to be spread out amongst the artificial neurons, preventing ___________.
EMRFS
EMR extends Hadoop (which includes MapReduce) to use S3 as a storage backend instead of HDFS, using _____.
punctuation
Each line of a BlazingText input file should contain one training sentence along with its labels. Labels must be prefixed with __label__, and the tokens within the sentence, including ____________________, should be space separated.
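A small sketch of producing one such line for BlazingText's supervised (text classification) mode. The function name and the regex tokenizer are assumptions made for this example, and lowercasing is a common preprocessing choice rather than a stated requirement.

```python
import re


def to_blazingtext_line(label, sentence):
    """Format one training example: __label__<label> followed by spaced tokens."""
    # Split into words and individual punctuation marks so that punctuation
    # becomes its own space-separated token.
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    return f"__label__{label} " + " ".join(tokens)


if __name__ == "__main__":
    print(to_blazingtext_line("positive", "Great product, loved it!"))
```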
increase
Enabling inter-container traffic encryption can __________ the cost for Amazon SageMaker built-in algorithms
clipping
Exploding gradients can be managed through gradient _______.
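One common form of this is clipping by norm: if the gradient vector's L2 norm exceeds a threshold, rescale it so the norm equals the threshold. A minimal sketch on a flat list of gradient values (the function name is chosen for illustration; deep learning frameworks ship their own versions):

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale grads so their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already small enough; leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    print(clip_by_norm([3.0, 4.0], 1.0))  # norm 5 is scaled down to norm 1
    print(clip_by_norm([0.3, 0.4], 1.0))  # norm 0.5 is left alone
```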
2 x / +
F1 = (____ x Precision ____ Recall) ___ (Precision ___ Recall)
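The precision, recall, and F1 formulas from these cards can be checked with a few lines of Python. The confusion-matrix counts below are made up for illustration.

```python
def precision(tp, fp):
    """TP / (TP + FP): of the predicted positives, how many were right."""
    return tp / (tp + fp)


def recall(tp, fn):
    """TP / (TP + FN): of the actual positives, how many were found."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Harmonic mean: (2 x Precision x Recall) / (Precision + Recall)."""
    return 2 * p * r / (p + r)


if __name__ == "__main__":
    tp, fp, fn = 8, 2, 8  # hypothetical counts
    p, r = precision(tp, fp), recall(tp, fn)
    print(p, r, f1_score(p, r))
```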
float32, RecordIO
Factorization Machines are unusual in that they expect ______ data, not integers. They support _________.
classification
For ___________ problems, target values should be one-hot encoded prior to training a neural network.
tokenize
For machine translation you first need to _____ your words into integers, which refer to vocabulary files you also must provide.
D
Given the following three data points, which of the following equations represents the closed-form solution for linear regression, and what is the value of Y when x = 2? The three data points (x, Y) are (-1, 0.8), (0, 0.3), (1, -0.2) A. Y = -.5x - 0.3, -0.7 B. Y = .5x - 0.3, 1.3 C. Y = .5x + 0.3, 1.3 D. Y = -.5x + 0.3, -0.7
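The answer can be verified with the standard closed-form least-squares formulas for a single feature: slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2) and intercept = mean_y - slope * mean_x. A short sketch using the three points from the question (the function name fit_line is chosen for this example):

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    slope, intercept = fit_line([(-1, 0.8), (0, 0.3), (1, -0.2)])
    print(slope, intercept, slope * 2 + intercept)
```

Up to floating-point rounding, this gives a slope of -0.5, an intercept of 0.3, and a prediction of -0.7 at x = 2, matching option D.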
transient
If you do not specify an AWS KMS key, Amazon SageMaker encrypts storage volumes with a ___________ key.
D
If you wanted to build your own Alexa-type device that converses with customers using speech, which Amazon services might you use? A. Amazon Polly -> Amazon Lex -> Amazon Transcribe B. Amazon Comprehend -> Amazon Lex -> Amazon Polly C. Amazon Transcribe -> Amazon Comprehend -> Amazon Polly D. Amazon Transcribe -> Amazon Lex -> Amazon Polly
GZIP
If you're loading your data from S3 to Redshift, convert your data to _____________ file type
overfit
Increasing the number of epochs can cause a model to ________
No
Is incremental training something Rekognition supports?
underfitting
If the training error is higher than expected and tuning the hyper-parameters of the model doesn't improve the performance, the model is likely ___________
WSS
K-means is an unsupervised learning method, and the best we can do is try to optimize the tightness of the resulting clusters. ______ is one way to measure that.
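WSS (within-cluster sum of squares) adds up the squared distance from every point to the centroid of its assigned cluster; lower values mean tighter clusters. A minimal sketch with hand-picked points and centroids (all values are illustrative):

```python
def wss(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, centroid))
    return total


if __name__ == "__main__":
    points = [(0, 0), (0, 2), (10, 0), (10, 2)]
    labels = [0, 0, 1, 1]            # cluster assignment per point
    centroids = [(0, 1), (10, 1)]    # mean of each cluster
    print(wss(points, labels, centroids))
```

Plotting WSS against the number of clusters k and looking for the "elbow" is one common way to choose k.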
Underfitting
L1 effectively removes features that are unimportant, and doing this too aggressively can lead to _________.
grouping
Lemmatization in linguistics is the process of ________ together the inflected forms of a word so they can be analysed as a single item.
RNN
Music is fundamentally a time-series problem, which ____ are best suited for.
LibSVM
Neither Glue ETL nor Kinesis Analytics can convert to _____ format
metric
Only single objective ______ is currently supported when using hyper-parameter tuning in SageMaker.
Spark
Parallelizing processing is something _____ is good for.
recurrent neural networks
RNN is the acronym for _______
residual
Randomness in the ______ plot indicates the current model is fitting well.
stemming
Reducing word variants to bases is called _________
concurrency
Since the tuning process learns from each incremental step, too much ___________ can actually hinder that learning.
SageMakerEstimator
The ______________________ classes allow tight integration between Spark and SageMaker for several models including XGBoost, and offers the simplest solution.
duplicates
The number of vectors that BlazingText produces in Word2Vec mode will be the number of rows (even if the rows are ______________)
A C
The graph below plots observations of two distinct classes, represented by blue and green, against two features, represented by the X and Y axes. Which algorithms would be appropriate for learning how to classify additional observations? (SELECT TWO) A. kNN B. SVM with a linear kernel C. SVM with a RBF kernel D. Linear regression E. PCA
B
The graph below plots predicted and actual website views over time. Based on this graph, would you say the prediction model: A. Captures seasonality well, but trends poorly B. Captures seasonality and trends poorly C. Captures seasonality poorly, and trends well D. Captures seasonality and trends well
AUC
The metric ____ compares ROCs to each other
RMSE
The metric _____ is applicable for regression (numbers)
ROC
The metric _______ is applicable for classification with two classes.
recall
The primary goal should be to predict actual positives as positive as much as possible. This is achieved by using __________ as the performance metric.
Normalized, Shuffled
Training data should be __________ and __________.
sagemaker
Unless you add a policy with S3FullAccess permission to the role, it is restricted to buckets with "________" in the bucket name.
B
What are the data processing steps indicated below? a. [cached, caching, caches] ----> [cach] b. [is, was, were] ----> [be] A. One-hot encoding and bag of words B. stemming and lemmatization C. word embeddings & lemmatization D. bag of words & stemming
A
What is an appropriate choice of an instance type for training XGBoost in SageMaker? A. M4 B. P3 C. C4 D. P2
One-hot encoding
When pre-processing data, nominal categorical features shouldn't be encoded as integers (label-encoded). ________________ should be used instead.
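A minimal sketch of one-hot encoding without any libraries: each category gets its own 0/1 column, so no artificial ordering is implied the way integer labels would. The function name and the alphabetical column order are choices made for this example.

```python
def one_hot(values):
    """Return (categories, rows) where each row has a single 1."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this value's category
        rows.append(row)
    return categories, rows


if __name__ == "__main__":
    categories, rows = one_hot(["dog", "cat", "zebra", "cat"])
    print(categories)
    for row in rows:
        print(row)
```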
D
When using Principal Component Analysis (PCA) for dimensionality reduction, what is the maximum number of components you can have? A. at most less than the number of features B. less than the number of features and components are correlated C. More than the number of features D. at most equal to the number of features
C D
Which are best practices for hyperparameter tuning in SageMaker? (CHOOSE TWO) A. Run training jobs on a single instance B. Choose a large number of hyperparameters to tune C. Run only one training job at a time D. Choose the smallest possible ranges for your hyperparameters E. Use linear scales for hyperparameters
A C E
Which of the following statements are true regarding Amazon SageMaker and Amazon SageMaker Neo? Choose 3 A. Amazon SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge B. Amazon SageMaker stores code in ML storage volumes, secured by security groups and always encrypted at rest. C. Only single objective metric is currently supported when using hyper-parameter tuning in SageMaker. D. The infrastructure that Amazon SageMaker runs on can be accessed by the account root user E. A model doesn't necessarily have to be trained using Amazon SageMaker in order to use Amazon SageMaker Neo to convert the model
Core
While you can use Spot instances on any node type, a Spot interruption on the Master node requires terminating the entire cluster, and on a ____ node, it can lead to HDFS data loss.
Logarithmic
With auto-tuning, ___________ ranges tend to find optimal values more quickly than linear ranges
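One way to see why scale matters for a hyperparameter that spans several orders of magnitude (a learning rate between 0.0001 and 1, say): uniform sampling on a linear scale almost never tries the small values, while sampling on a log scale spreads trials evenly across magnitudes. This is only an illustration with random sampling, not SageMaker's tuning algorithm, and the function names are invented for the example.

```python
import math
import random


def sample_linear(low, high, rng):
    """Draw uniformly between low and high."""
    return rng.uniform(low, high)


def sample_log(low, high, rng):
    """Draw uniformly in log space, then map back to the original scale."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def fraction_below(sampler, threshold, n=10000, seed=0):
    """Share of n samples from [1e-4, 1] that fall below threshold."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if sampler(1e-4, 1.0, rng) < threshold)
    return hits / n


if __name__ == "__main__":
    print("linear:", fraction_below(sample_linear, 0.01))
    print("log:   ", fraction_below(sample_log, 0.01))
```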
LibSVM, CSV
XGBoost does not use RecordIO; it takes ______ or ______ input
Memory, M-series
XGBoost is a CPU-only algorithm, and won't benefit from the GPUs of a P-series. It is also ______-bound, making _____ a better choice than C-series
B
You _____ deploy SageMaker to an EMR cluster A. Can B. Can't
A
You are training a distributed deep learning algorithm on SageMaker within a private VPC. Sensitive data is being used for this training, and it must be secured in-transit. How would you meet this requirement? A. Enable inter-container traffic encryption via SageMaker's console when creating the training job B. Enable server-side encryption in the S3 bucket containing your training data C. Use SSE-KMS D. This isn't an option, and you must train on a single host in this case.
B
You are training an XGBoost model on SageMaker with millions of rows of training data, and you wish to use Apache Spark to pre-process this data at scale. What is the simplest architecture that achieves this? A. Use Amazon EMR to pre-process your data using Spark, and then use AWS Data Pipelines to transfer the processed training data to SageMaker B. Use sagemaker_pyspark and XGBoostSageMakerEstimator to use Spark to pre-process, train, and host your model using Spark on SageMaker. C. Use Sparkmagic to pre-process your data within a SageMaker notebook, transform the resulting Spark DataFrames into RecordIO format, and then use Spark's XGBoost algorithm to train the model. D. Use Amazon EMR to pre-process your data using Spark, and use the same EMR instances to host your SageMaker notebook.
A
You want to create AI-generated music, by training some sort of neural network on existing music and getting it to predict additional notes going forward. What architecture might be appropriate? A. RNN B. CNN C. MLP D. ResNet50
B
You wish to categorize terabytes' worth of articles into topics using SageMaker and LDA, but processing that much data at once is leading to difficulties in storage and training the model reliably. What can be done to improve the performance of the system? A. Configure SageMaker to use multiple instances for training LDA B. Convert the articles to RecordIO format, and use Pipe mode C. Convert the articles to CSV format, and use Pipe mode D. Configure SageMaker to use multiple GPUs for training LDA
C
You wish to control access to SageMaker notebooks to specific IAM groups. How might you go about this? A. Restrict access to the specific EC2 instances used to host the notebooks using IAM B. Integrate SageMaker with Active Directory C. Attach tags to the groups of SageMaker resources to be kept private to specific groups, and use ResourceTag conditions in IAM policies. D. Use S3 bucket policies to restrict access to the resources needed by the notebooks
D
You wish to use a model built with TensorFlow for training within a SageMaker notebook. To do so, you have created a Dockerfile with which you'll package your model into a SageMaker container, copying your training code with the command COPY train.py /opt/ml/code/train.py. What further needs to be done to define your train.py as the script entrypoint? A. Nothing; the entrypoint must be named train.py and this is assumed. B. Nothing; any script inside /opt/ml/code will be considered the entrypoint automatically. C. Enter train.py as the entrypoint in the SageMaker console D. Include ENV SAGEMAKER_PROGRAM train.py in the Dockerfile
B
You've developed a custom training model for SageMaker using TensorFlow, but a single GPU can't handle the data it needs to train with. How would you go about scaling your algorithm to use multiple GPUs within SageMaker? A. Deploy your model to multiple EC2 P3 instances, and SageMaker will distribute it automatically B. Write your model with the Horovod distributed training framework, which is supported by SageMaker. C. This isn't possible with TensorFlow; use Apache MXNet instead. D. Wrap your TensorFlow code with PySpark, and use sagemaker-spark to distribute it.
L2
_____ weighs each feature instead of removing them entirely, which can lead to better accuracy.
Bernoulli
__________________ distribution is a special case of the binomial distribution where a single trial is conducted
Semantic Segmentation
_____________________________ produces segmentation masks that identify classifications for each individual pixel in an image. It uses MXNet and the ResNet architecture to do this.