DP-100 practice exam
You plan to optimize a machine learning workflow by using pipelines. You need to create a reference to the data source location to pass as data to the first component. Which class should you use? Select only one answer. InputOutputAmlTokenConfigurationMLClient
Input
You plan to deploy models with batch endpoints for an extract, transform, and load (ETL) project. You need to create the deployment definition. Which class should you use? Select only one answer.BatchDeploymentOnlineDeploymentParallelPipeline Next
BatchDeployment
Your dashboard shows the age feature has a high impact on loan approval predictions. What additional analysis could help determine if age is being used appropriately?
Causal analysis could help determine if age is having an unfair bias or accurate correlation. Causal estimates could show if age impact is justified or if the model needs adjustment to prevent age discrimination.
You are working on a project using automated machine learning. You need to apply one-hot encoding to the categorical features of a dataset. What should you do?
Enable featurization.
You are working on a project using automated machine learning. You need to apply one-hot encoding to the categorical features of a dataset. What should you do? Select only one answer.Enable transform and encode.Enable feature scaling and normalization.Enable featurization.Impute missing values. Next
Enable featurization. Enabling featurization will make sure AutoML tries to apply the encoding transformation. Enabling transform and encode is a technique that will be tried when activating featurization but cannot be enabled on its own. Feature scaling and normalization is enabled by default. Impute missing values is a technique that will be tried when activating featurization but cannot be enabled on its own.
Your model's error analysis shows high error rates for a certain subgroup. What Responsible AI principle does this violate and how can you address it?
High error rates for a subgroup likely violates fairness. The model is not treating that subgroup appropriately. Options to address include: Get more representative data for that group. Adjust the data processing/featurization to handle the subgroup better. Use techniques like re-weighting samples to improve model performance for that group.
Your AutoML experiment completed in only 5 minutes and you know more training time is available. What could you change to potentially achieve a better model?
If the experiment finished quickly, some things to try are: Increase the training time limit constraints. Add more trials within the time limit to try more models. Remove any restrictions on algorithms to open up the search space. Use a larger compute cluster to allow more parallel trials.
You have been tasked to work on a machine learning project. You need to create an Azure Machine Learning workspace using Azure ML Python SDK v2. What should you do first? Select only one answer.Use the Azure portal to create the workspace.Install the Azure ML Python SDK.Import the workspace class with the "from azure.ai.ml.entities import Workspace" statement.Use the "az ml workspace create -w 'aml-workspace' -g 'aml-resources'" command to create the workspace. Next
Install the Azure ML Python SDK.
A company plans to tune the hyperparameters for a machine learning algorithm by using Azure Machine Learning. You need to implement a distribution that uses discrete hyperparameters. Which type of distribution should you use? Select only one answer. LogNormalUniformLogUniformQNormal
QNormal
A company plans to industrialize its machine learning workflow. The company implements MLflow to track the training of a classifier. You need to log the accuracy using MLflow. Which method should you use? Select only one answer. fit log_artifact log_metric log_param
log_metric
You plan to deploy a batch endpoint for an extract, transform, and load (ETL) system. You create a script to configure the batch endpoint. You need to ensure the script processes 50 records at each run. Which parameter should you configure? Select only one answer.instance_countmini_batch_sizeoutput_actionscoring_script Next
mini_batch_size
You are starting a new Azure Machine Learning project. You have a dataset that consists of tabular data. The dataset needs to be accessed frequently. You need to create a data asset type optimized for tabular data using the SDK v2. Which data asset type should you create? Select only one answer.FileDatasetTabularDatasetmltableuri_folder Next
mltable
A company plans to put its machine learning jobs into production with MLflow. You need to save different versions of the models after each training. Which method should you use? Select only one answer.register_modellog_artifactget_runlog_param Next
register_model
A company is working on a new anomaly detection classifier. You plan to use the SVM classifier. You need to evaluate the accuracy of the model using sklearn. Which method should you use? score fit predict log_metric
score
You notice the class imbalance alert triggered in your AutoML experiment. What does this indicate and how can you address it?
The class imbalance alert means there is a significant skew in the ratio of samples for each class. This can bias the model. Options to address it include: Enabling sampling in featurization to balance the classes. Adjusting the evaluation metric to not optimize just for overall accuracy. Using an algorithm like SVM that is less influenced by class imbalance.
Stakeholders want to understand how your loan approval model makes decisions. What dashboard insight would help provide transparency?
The explanations insight that shows feature importance would help explain how the model makes decisions, providing transparency.
Your binary classification model has an accuracy of 90%, but a low precision. What can you conclude?
The low precision means that of the cases predicted as positive, many are actually negative. Even with 90% overall accuracy, the model is likely predicting too many false positives.
You get an authentication error when invoking your managed endpoint. What are two possible ways to authenticate?
The two main authentication methods for managed endpoints are key-based authentication and token-based authentication. Ensure you are passing the correct key or token in the request.
You are collaborating with other data scientists on a project with a shared Git repository. You must clone the Git repository to the local file system on an Azure Machine Learning Notebook instance. What should you do first? Select only one answer. Add the public key to the Git account.Clone the Git repository with the Git clone command.Generate a new Secure Shell Protocol (SSH) key.Open the terminal window in the Azure Machine Learning Notebook tab.
This item tests the candidate's knowledge of integrating an Azure Machine Learning Notebook instance with a Git repository. The integration process starts by opening the terminal window in the AML Notebook tab. Thereafter, a new SSH key must be generated, clone the public key to Git account, and clone the Git repository with the Git clone command.
You have been tasked to train a machine learning model that maximizes accuracy. You need to protect the privacy of all the participants of a case study whose dataset you use. What should you do? Select only one answer. Transform the data to log scale.Delete some records from the dataset.Use differential privacy with very high epsilon value. Use differential privacy with slightly low epsilon value.
To protect the user privacy, differential privacy has to be used with slightly low epsilon value, this ensures that the model trained will be accurate and the user privacy is also not compromised.
You want to score a registered model in an Azure Databricks notebook. How can you load it?
Use mlflow.pyfunc.spark_udf() to load the model as a Spark UDF for batch scoring in a Databricks cluster.
You are working on an Azure Machine Learning project. You must reuse the same Python packages, environment variables, and software settings across all compute instances. You need to configure the project. What should you do? Select only one answer.Use the Workspace object.Apply the script to set up the environment for all compute instances.Use the Environment object.Use the same compute instance for all scripts. Next
Use the Environment object.
You implemented OVR for a 4-class classifier. How many binary classification models were trained?
With OVR, one binary classifier is trained per class. For 4 classes, there would be 4 binary classification models trained with OVR.
A company hires you to lead a team of data scientists. You need to create a custom environment to increase collaboration within the team by using Azure CLI v2. Which command should you use? Select only one answer.az ml environment createml_client.environments.create_or_updateaz ml environment showaz ml environment update Next
az ml environment create
You are configuring a machine learning operations (MLOps) continuous integration (CI) pipeline in Azure DevOps. You need to save the model by using the Azure Machine Learning CLI within an Azure CLI task. Which command should you run? Select only one answer.az ml computetarget createaz ml run submit-scriptaz ml model registeraz ml model deploy Next
az ml model register
You need to verify that a model training script works locally before running the job in the cloud by setting the compute parameter of the command class. Which setting should you use? Select only one answer. compute='local'compute=mlclient.begin_create_or_update(AmlCompute(name='gpu-cluster')).result().namecompute='gpu-cluster'compute='cpu-cluster'
compute='local'
You plan to deploy models with batch endpoints for an extract, transform, and load (ETL) project. An ETL process extracts data from different systems. You need to specify the location of the data source when invoking the batch endpoint. Which parameter should you use? Select only one answer.inputjob-nameoutputsinput-type Next
input
You plan to train a machine learning model by using a script and the Azure ML Python SDK v2. You need to pass a parameter to the script to set the number of estimators in the random forest machine learning algorithm. The value of the parameter should be set to 50. Which parameter of the command class should you use? Select only one answer. outputs code inputs command
inputs
You are building a text analytics pipeline in Azure Machine Learning designer. You need to add a component that removes stop words from the text. Which component should you add? Select only one answer. Extract N-Gram Features from TextConvert Word to VectorSplit DataPreprocess Text
object detection
You are training a model by using automated machine learning. You need to specify an evaluation metric that minimizes the false positive rate for the trained model. Which metric should you use?
precision
ou plan to optimize the hyperparameters for a machine learning model. You need to define the metric to evaluate the hyperparameters in a sweep job. Which parameter should you use? Select only one answer. goalcomputeprimary_metricsampling_algorithm
primary_metric
You are wrangling tabular data to train a new model using pandas on your local laptop. You just learned that there is exponentially more data to process and that it needs to be done as soon as possible. You need to use Azure Machine Learning to wrangle the data with as little overhead as possible. Which tool should you use? Select only one answer. compute instanceNumpystand-alone Spark jobSpark component in a pipeline job
stand-alone Spark job
You write a model training script. You plan to use the script in a run in Azure Machine Learning. You need to load all records from a dataset into a pandas dataframe. Which code segment should you add to the script? Select only one answer.tabular_data = data.to_pandas_dataframe()tabular_data = data.get()tabular_data = data.to_spark_dataframe()tabular_data = data.sample() Next
tabular_data = data.to_pandas_dataframe()
You write a model training script. You plan to use the script in a run in Azure Machine Learning. You need to load all records from a dataset into a pandas dataframe. Which code segment should you add to the script? Select only one answer. tabular_data = data.to_pandas_dataframe()tabular_data = data.get()tabular_data = data.to_spark_dataframe()tabular_data = data.sample()
tabular_data = data.to_pandas_dataframe() The to_pandas_dataframe method loads the tabular dataset as a pandas dataframe. The get method of the Dataset class returns a references Dataset from the Azure Machine Learning workspace. The to_spark_dataframe method loads the tabular dataset as a spark dataframe. The data.sample method generates a sample from the tabular dataset.
ou have submitted a training job to Azure Machine Learning which has been completed. You need to retrieve job metrics in a Jupyter Notebook by using the MLflowClient class in Azure ML Python SDK v2. Which method should you use? log_metric() get_run() update_run() log_artifact()
get_run() This item tests the candidate's knowledge of tracking experiment metrics in Jupyter Notebooks with MLflow. Only the get_run method will return the metadata, including metrics, for a given run. The other options perform unrelated operations on the run.
You plan to optimize the hyperparameters for a machine learning model. You need to sample the hyperparameter space based on the following characteristics: Hyperparameters are discrete. All combinations of the hyperparameter values must be evaluated. Which sampling technique should you use? Select only one answer.grid samplingsobol samplingrandom samplingBayesian sampling Next
grid sampling
You deploy version1 of a model and send 100 percent of incoming requests to that endpoint. You deploy version2 of the model. You need to allocate 10 percent of incoming requests to version2 of the model by using the online-endpoint update command. Which command segment should you use? Select only one answer. --traffic "version1=90, version2=10"--traffic "version1=10, version=90"--mirror-traffic--name
--traffic "version1=90, version2=10" The code segment that includes "version1=90, version2=10" sends 10 percent of incoming requests to version2 and 90 percent to version1. The code segment that includes "version1=10, version=90" sends 90 percent of incoming requests to version2. The code segment that includes --mirror-traffic sends 100 percent of incoming requests to more than one endpoint. The code segment that includes --name renames the endpoint.
A company is starting its first machine learning project to segment its customers. The company decides to use Azure Machine Learning designer. You need to recommend a metric to evaluate the performance of the classification algorithm. Which metric should you recommend? Select only one answer.AUCR SquareMSEMAE
AUC
You configured an AutoML classification experiment with accuracy as the primary metric. The best run shows an accuracy of 0.85. What could you do to potentially improve accuracy?
Adjust featurization like imputation or encoding to better handle the data. Remove restrictions on the algorithms to allow more models to be tried. Increase max trials to train more models to find a better one. Adjust sampling or cross-validation to build models on more representative data splits.
You are training a fraud detection model for a large, multinational financial institution. You need to provision a compute cluster for the training job that contains four instances using the Azure Machine Learning Python SDK v2. Which class should you use? Select only one answer. ComputeInstanceEnvironmentModelAmlCompute
AmlCompute
Question: In what scenario would you prefer to use a Kubernetes Cluster as your compute target in Azure Machine Learning, and what advantages does it offer over other compute types?
Answer: A Kubernetes Cluster is the preferred compute target in scenarios where there's a need for more control over how the compute environment is configured and managed, especially for complex, scalable cloud computing or on-premises workloads. Kubernetes Clusters, based on Kubernetes technology, offer advantages in customization and scalability. They allow for attaching self-managed Azure Kubernetes (AKS) clusters for cloud compute or Arc Kubernetes clusters for on-premises workloads. This flexibility is particularly advantageous for tasks requiring specific configurations or where standard compute resources are insufficient. Kubernetes Clusters provide robust orchestration, containerization benefits, and the ability to handle large-scale, distributed workloads efficiently.
Question: How do Compute Clusters in Azure Machine Learning optimize resource usage for large-scale data processing scripts, and what is the key benefit of their scalability?
Answer: Compute Clusters in Azure Machine Learning offer a cost-effective solution for running scripts that need to process large volumes of data. They are multi-node clusters of virtual machines that automatically scale up or down based on demand. This auto-scaling feature allows for efficient resource usage, as the clusters expand to accommodate the workload when needed and scale down to minimize costs when idle. The key benefit of this scalability is the ability to use parallel processing, distributing the workload across multiple nodes to reduce the time required to run scripts, especially beneficial for training machine learning models or processing large datasets.
Question: Describe the process and considerations for creating and using a Compute Instance in Azure Machine Learning. What are its primary use cases, and how does it differ from Compute Clusters in terms of functionality and workload management?
Answer: Creating a Compute Instance in Azure Machine Learning can be done via the Azure Machine Learning Studio, the Azure command-line interface (CLI), or using the Python SDK. Key considerations include ensuring a unique name across an Azure region and possibly customizing the instance with necessary packages, tools, or software via a script. A Compute Instance is primarily used for running notebooks and is ideal for experimentation and development, particularly with Jupyter notebooks. It differs from Compute Clusters in that it behaves more like a traditional virtual machine, designed for continuous running but not suited for handling parallel workloads or auto-scaling. Unlike Compute Clusters, a Compute Instance is assigned to a single user and can't handle parallel workloads, making it more suitable for individual development and experimentation tasks rather than large-scale data processing or model training.
Question: You trained a binary classification model to predict if a patient has diabetes using a dataset of medical records. The model has an accuracy of 80%, precision of 67%, and recall of 40% using the default 0.5 probability threshold. What change could improve recall without reducing precision?
Answer: Lowering the probability threshold for predicting the positive class (diabetes) could increase recall without reducing precision. This would allow more positive predictions, detecting more actual positive cases (increasing recall) while not increasing the number of incorrect positive predictions (maintaining precision).
Your model is registered in Azure Databricks but you want to deploy to Azure ML. What should you do?
Answer: Re-register the model in the Azure ML model registry by setting the MLflow registry URI to point to Azure ML.
You want to track experiments in your Azure ML workspace only. What do you need to configure?
Answer: Set the MLflow tracking URI to point exclusively to your Azure ML workspace tracking URI.
Question: You trained a regression model to predict property sale prices based on features like square footage, number of bedrooms, etc. The model has a MAE of $50,000, RMSE of $75,000, RSE of 0.12, and R2 of 0.85 on a validation set. What metric indicates the model may have high variance in the errors?
Answer: The RMSE of $75,000 compared to the MAE of $50,000 indicates there is high variance in the individual errors. When RMSE is substantially higher than MAE, it signals high variance in the errors. Some predictions are very close while others are farther off.
You notice model artifacts are not logged to your workspace. What could be the issue?
Answer: The tracking URI may not be configured properly to point to the Azure ML workspace, so model artifacts are not logged there.
You want to log the AUC metric for a run. What MLflow function should you use?
Answer: mlflow.log_metric() should be used to log the AUC metric value. log_metric() logs numeric metrics with each run.
You want to track a PyTorch model. What do you need to enable?
Answer: mlflow.pytorch.autolog() needs to be called to enable autologging for PyTorch models to automatically log metrics and artifacts.
You have been tasked to train a machine learning model using Azure Machine Learning. Coding is not required. You need to select the development approach to train the best-performing classification model. Which development approach should you use? Select only one answer. Azure Machine Learning Designer PipelinesAzure CLIVS Code extensions for Azure MLAzure AutoML
Azure AutoML
You train a machine learning model. You need to deploy the model as a real-time inferencing model for production based on the following requirements: High performance High scalability Secure Which model deployment endpoint should you use? Select only one answer.Local ServiceCompute InstanceAzure Kubernetes ServiceAzure Container Instances Next
Azure Kubernetes Service
A company needs to build machine learning models. The company has the following requirements: Build models by using a GUI tool. Use Azure Machine Learning features. You need to recommend a solution that meets the requirements. What should you recommend? Select only one answer. Azure Machine Learning designerAzure Machine Learning SDK for PythonAzure CLIAzure Machine Learning studio
Azure Machine Learning designer Azure Machine Learning designer is a no code, drag and drop interface. The Azure Machine Learning SDK is a code only interface in Python. The Azure CLI is a code only interface and does not support all the features of Azure Machine Learning. Azure Machine Learning studio is a web-based tool for managing an Azure Machine Learning instance.
Your company starts a new Azure Machine Learning project. The source data is stored in Parquet files, in a hierarchical fashion to allow partitioning. You need to create a new datastore. Which class should you use? Select only one answer.AzureBlobDatastoreAzureDataLakeGen2DatastoreKubernetesComputeSpark Next
AzureDataLakeGen2Datastore The class AzureDataLakeGen2Datastore creates a datastore linking to an Azure Data Lake Gen2, which can store Parquet files hierarchically. An AzureBlobDatastore can also store Parquet file, but not with a hierarchy. Spark and KubernetesCompute are not datastores.
A company hired you to debug its Azure Machine Learning pipelines. You need to monitor the status of the pipeline after submitting it from the authoring page. What should you do? Select only one answer.Check the pipeline job link in the submission list.Monitor the status in the authoring page.Publish the pipeline as an endpoint.Refresh the authoring page. Next
Check the pipeline job link in the submission list. This item tests the candidate's knowledge of monitoring pipeline runs. A submitted job will appear in the submission list, along with its status. However, the pipeline job status and results are not filled back to the authoring page. Finally, the authoring page does not include details about pipeline job status.
You train a machine learning model by using Azure Machine Learning. You need to retrain the model to prevent the model from becoming obsolete. What should you do? Select only one answer.Preprocess the data by using a manual process.Preprocess the data by using an automated process.Compare the outputs of the new model to the outputs of the old model.Replace the old model based on a predefined replacement criterion. Next
Compare the outputs of the new model to the outputs of the old model.
A company trains a new pipeline for customer classification in Azure Machine Learning designer. You need to deploy the model as a real-time endpoint. Which step must you perform before deploying the model? Select only one answer. Send data to the URI. Ensure AUC is above 0.9. Execute a Python Script component. Convert the training pipeline to an inference pipeline.
Convert the training pipeline to an inference pipeline.
You are working on an Azure Machine Learning project. You have the following requirements: You need to create a reference to the data in its storage, with a copy of its metadata. No data should be copied. What should you do? Select only one answer. Register the datastore where your data is located. Create a dataset. Create a pipeline to transfer the metadata only. Use Azure Data Factory to ingest the data.
Create a dataset. Creating a dataset will create a reference to the data, with a copy of its metadata. It will not copy the data. Registering a datastore will not create a reference to the data with a copy of its metadata. Creating a pipeline is used to create a machine learning workflow, not to create a reference to data in a datastore. Using Azure Data Factory is for ETL processes when moving the data is needed, not for creating a reference to the data.
A company has a team of data scientists working on an application. A data scientist tries to replicate an experiment and an error message related to the versions of libraries is displayed. You need to recommend a solution to avoid this error in the future. The solution must minimize redundant environments. Which solution should you recommend? Select only one answer. Create and register an environment while running the first experiment. Then, specify the environment while replicating the experiment. Save a Dockerfile for the first experiment. Share the Dockerfile with other data scientists that need to replicate the experiment. Create and register an environment while running the first experiment. Then, create a duplicate environment while replicating the experiment. Save a requirements.txt file for the initial experiment. Then, use the file to create a new environment while replicating the experiment.
Create and register an environment while running the first experiment. Then, specify the environment while replicating the experiment. Registering an environment creates an environment one time and reuse it for future runs of the experiment. Passing a Dockerfile around is both error prone and creates a new environment for each experiment run. Creating and registering an environment while running the first experiment creates a redundant environment. Saving a requirements.txt file for the initial experiment will take less time than trial and error, since the correct version is specified, but is more complex since it creates redundant environments.
You are wrangling data from an existing Azure Blob Storage container for a pricing model. You need to determine how to load and process the data in a notebook in as few steps as possible. What should you do? Select only one answer. Load the data using Azure CLI.Load the data using Azure Storage Explorer.Load the data directly in the notebook using the data's URI and SAS token. .Load the data using Azure Machine Learning Python SDK v2.
Load the data directly in the notebook using the data's URI and SAS token.
You train a model to predict housing prices based on house features. You need to determine why the model made a prediction on a specific house in the dataset. Which method should you use? Select only one answer. PrecisionGlobal feature importanceGini coefficientLocal feature importance
Local feature importance
You plan to deploy an online endpoint for price forecasting. You need to specify the name of the endpoint, its instance type, its environment, and its code configuration. Which class should you use? Select only one answer.ManagedOnlineDeploymentNetworkSettingsOnlineEndpointModel Next
ManagedOnlineDeployment
You are working on a machine learning project to predict a person's height based on their weight given in pounds. You plan to implement Machine Learning Operations (MLOPS) practices by triggering the machine learning model training pipeline to retrain the model if the weight is in kilograms. You need to recommend an event type to listen to that will be used to trigger the pipeline. Which event type should you recommend? Select only one answer.Microsoft.MachineLearningServices.DatasetDriftDetectedMicrosoft.MachineLearningServices.RunStatusChangedMicrosoft.MachineLearningServices.RunCompletedMicrosoft.MachineLearningServices.ModelDeployed Next
Microsoft.MachineLearningServices.DatasetDriftDetected
You create a new pipeline by using the Azure Machine Learning designer. You need to choose an algorithm that makes a prediction between several categories of data. Which algorithm should you select? Select only one answer. Bayesian Linear RegressionBoosted Decision Tree RegressionK-meansMulticlass Decision Forest
Multiclass Decision Forest The Multiclass Decision Forest algorithm will make a prediction between several categories. The Bayesian Linear Regression algorithm will forecast a numeric value, not make a prediction between several categories. The Boosted Decision Tree Regression algorithm will forecast a numeric value, not make a prediction between several categories. The K-means algorithm will separate similar data points into groups, not make a prediction between several categories. Types of supported algorithm: Classification, regression, time-series forecasting, NLP, computer vision
You plan to create anomaly detection for a company's purchasing system. You need to train an autoencoder on a GPU compute. Which virtual machine (VM) series should you use? Select only one answer.DSv3ESv3HBv3NVv3 Next
NVv3
You are working on an Azure Machine Learning project. The source data files are more than 1 GB. You need to recommend a file format to optimize data processing. Which file format should you choose? Select only one answer.PythonParquetCSVxlsx Next
Parquet
You deployed a new model version but see no change in predictions. What could cause this?
Possible causes include: Traffic is still routed 100% to the old deployment. Update traffic allocation. The new deployment failed. Check deployment logs in the studio. The new model is not registered correctly. Verify model path/ID.
Your endpoint intermittently returns timeouts. What could be the issue?
Possible issues include: Insufficient instance count to handle load spikes. Scale up instances. Using too small of a VM size. Use a larger VM size. Scoring script has slow performance. Optimize scoring script.
You plan to apply Machine Learning Operations (MLOPS) practices. You need to trigger a machine learning model training pipeline that is configured to trigger by detecting a change in a repo. What should you do? Select only one answer. Clone the GitHub/Azure DevOps Repository.Raise a Pull Request (PR) to GitHub/Azure DevOps Repository.Create a new branch in GitHub/Azure DevOps Repository.Stash the changes done in GitHub/Azure DevOps Repository.
Raise a Pull Request (PR) to GitHub/Azure DevOps Repository.
You are optimizing hyperparameters by using Azure Machine Learning. You have the following requirements: Use a large search space of parameters for a model. Find optimal parameters quickly. Minimize run time. Which sampling strategy should you use? GridParameterSamplingBanditPolicy RandomParameterSampling TruncationSelectionPolicy
RandomParameterSampling
You have a large machine learning project. The project team writes Jupyter notebooks. You need to enable automatic code quality checks in the continuous integration (CI) process. What should you do? Select only one answer.Refactor the Jupyter notebooks into .py files.Write a series of checks for the Jupyter notebooks.Publish a pipeline that has some checks.Refactor the Jupyter notebooks into R scripts. Next
Refactor the Jupyter notebooks into .py files.
A company plans to implement automated machine learning. You need to train a model. Which two algorithms can you use? Each correct answer presents a complete solution. Select all answers that apply.RegressionDimensionality reductionClusteringClassification Next
Regression and Classification The regression and the classification algorithms are supported by automated machine learning. The dimensionality reduction and the clustering algorithms not supported by automated machine learning.
You are working on a project to decrease customer churn. You need to perform a counterfactual what-if analysis to view which features about a given customer you could minimally change to turn a churning customer into a returning one. Which solution should you use? Select only one answer. MLflowCompute InstanceResponsible AI DashboardML Pipelines
Responsible AI Dashboard
You trained a risk scoring model in Azure Machine Learning. You need to provide the mean absolute error per cohort of age in your dataset using Azure Machine Learning. Which service should you choose? Select only one answer.Azure Machine Learning designerResponsible AI scorecardbatch endpointsAzure Blob Datastore Next
Responsible AI scorecard
A company is using MLflow to manage experiments in Azure Machine Learning. You need to log metrics, model parameters, and model artifacts automatically when training a model. What should you do? Select only one answer.Use the print() command to log all metrics.Use the MLflow model registry.Run the mlflow.autolog() command.Run the mlflow.set_tracking_uri() command. Next
Run the mlflow.autolog() command.
You need to stop a compute instance every day with the least amount of effort. What should you do? Select only one answer. Use Azure Machine Learning studio to stop the compute instance.Create a script that uses the Azure Machine Learning SDK for Python. Schedule the script to run.Set up a schedule in Azure Machine Learning studio and include a shutdown time. .Set the autoscale of the compute instance to 0.
Set up a schedule in Azure Machine Learning studio and include a shutdown time.
You have a machine learning project. You install the azureml-mlflow package and create an Azure Machine Learning workspace. You need to use MLflow to track local experiments. What should you do first? Select only one answer. Set up the tracking environment Set the experiment name. Start the training run. Create an Azure Databricks resource.
Set up the tracking environment. To track a local run, the tracking environment must be set up with MLflow' s tracking URI. Setting the experiment name comes after setting the setting of the tracking environment. Starting the training run comes last, after setting up the tracking environment and the experiment name. Creating an Azure Databricks is not needed in this scenario.
You built a multiclass image classification model to detect cat, dog or rabbit. The confusion matrix shows 10 false positives for cats. What does this mean?
The 10 false positives for cats means there were 10 cases where the model predicted cat but the actual label was dog or rabbit. False positives refer to cases incorrectly predicted as positive.
You are working on an Azure Machine Learning project. You must reuse the same Python packages, environment variables, and software settings across all compute instances. You need to configure the project. What should you do? Select only one answer. Use the Workspace object. .Apply the script to set up the environment for all compute instances. Use the Environment object. Use the same compute instance for all scripts.
Use the Environment object. Using the Environment object will guarantee you are reusing the same environment every time. Using the Workspace object will allow you to create a workspace but not to save the environment. Applying the script to set up your environment will not guarantee the environment is the same every time because it is not a systematic approach. Using the same compute instance will not allow you to reuse the environment across all compute instances.
You are training a computer vision model using Azure AutoML to return bounding box coordinates of cars in images of a parking lot. You need to define the task by using the Azure Machine Learning Python SDK v2. Which task should you use? Select only one answer. azure.ai.ml.automl.image_classification_multilabelazure.ai.ml.automl.image_classificationazure.ai.ml.automl.image_object_detectionazure.ai.ml.automl.image_instance_segmentation
azure.ai.ml.automl.image_object_detection This item tests the candidate's knowledge of computer vision tasks supported by Azure AutoML via the Azure Machine Learning python SDK v2. Image_classification_multilabel is used for predicting multiple tags for a given image. Image_classification is for predicting a single tag for a given image. Image_instance_segmentation is for making a prediction for each pixel in the input image. Only image_object_detection is used to predict bounding box coordinates around objects in a given image.
You are working on a deep learning (DL) project. You plan to create a pipeline to process data and then train the DL model. You need to recommend where to import the pipeline class. What should you recommend? Select only one answer. azure.ai.ml.dsl .azure.ai azure.ai.ml azure.pipeline
azure.ai.ml.dsl
A company creates a machine learning pipeline with the Azure Machine Learning designer. The company wants to improve data cleaning in the pipeline with a new Python script. You need to use the Execute Python Script component to implement a specific method to run. Which method should you implement? load_data train azureml_main predict
azureml_main An Execute Python Script component must implement an azureml_main method. A Create Python Model component must implement a train and predict methods. The load_data method is used to construct a data object but is not part of the Execute Python Script component.
You are working on a natural language processing (NLP) project. You need to map tokens with specific tags by using Azure AutoML. Which format of data and NLP task should you use? Select only one answer. data in the .txt format and named entity recognition (NER) task data in the .csv format and named entity recognition (NER) task data in the .csv format and multi-class text classification task data in the .csv format and multi-label text classification task
data in the .txt format and named entity recognition (NER) task