DP-100 Data Science Questions Topic 5


DRAG DROP - You have been tasked with moving data into Azure Blob Storage for the purpose of supporting Azure Machine Learning. Which of the following can be used to complete your task? Answer by dragging the correct options from the list to the answer area. Select and Place: - AzCopy - Bulk Copy Program (BCP) - SSIS - Bulk Insert SQL Query - Azure Storage Explorer

AzCopy SSIS Azure Storage Explorer You can move data to and from Azure Blob storage using different technologies: ✑ Azure Storage-Explorer ✑ AzCopy ✑ Python ✑ SSIS Reference: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/move-azure-blob

You are using Azure Machine Learning designer to construct an experiment. After dividing a dataset into training and testing sets, you configure the algorithm to be Two-Class Boosted Decision Tree. You are preparing to ascertain the Area Under the Curve (AUC). Which of the following is a sequential combination of the modules required to achieve your goal? A. Train, Score, Evaluate. B. Score, Evaluate, Train. C. Evaluate, Export Data, Train. D. Train, Score, Export Data.

Correct Answer: A
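The Evaluate step in the answer above is where AUC is produced from the scored test set. The following is a minimal sketch of that calculation in plain Python, not the designer's own implementation; the function name `auc_from_scores` and the sample labels and scores are made up for illustration.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical output of a Score step: true labels and scored probabilities.
test_labels = [1, 0, 1, 1, 0, 0]
test_scores = [0.9, 0.2, 0.65, 0.4, 0.5, 0.1]
auc = auc_from_scores(test_labels, test_scores)
```

Training must come first because scoring needs a fitted model, and evaluation needs the scores, which is why only the Train, Score, Evaluate order works.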

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation. You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice. Recommendation: You configure the use of the value k=3. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/cross-validate-model

HOTSPOT - You need to consider the underlined segment to establish whether it is accurate. Hot Area: The >>1<< visualization can be used to reveal outliers in your data. >>1<<: Venn diagram, Box plot, Gradient descent, Violin plot

Correct Answer: Box plot A box plot can be used to display outliers. Reference: https://medium.com/analytics-vidhya/what-is-an-outliers-how-to-detect-and-remove-them-which-algorithm-are-sensitive-towards-outliers-2d501993d59
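A box plot usually flags points that fall beyond its whiskers. As a rough sketch, assuming the common 1.5 x IQR whisker rule (the source does not state which rule is meant), the same points can be found numerically; `iqr_outliers` and the sample readings are hypothetical.

```python
import statistics


def iqr_outliers(values, whisker=1.5):
    """Return the values lying beyond the box-plot whiskers (1.5 * IQR by default)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low = q1 - whisker * iqr
    high = q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample with one obviously extreme reading.
readings = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]
outliers = iqr_outliers(readings)
```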

DRAG DROP - You build a binary classification model using the Azure Machine Learning Studio Two-Class Neural Network module. You are preparing to configure the Tune Model Hyperparameters module for the purpose of tuning accuracy for the model. Which of the following are valid parameters for the Two-Class Neural Network module? Answer by dragging the correct options from the list to the answer area. Select and Place: Options: - Depth of the tree - Random number seed - Optimization tolerance - The initial learning weights diameter - Lambda - Number of learning iterations - Project to the unit sphere

Correct Answer: - Random number seed - The initial learning weights diameter - Number of learning iterations Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/two-class-neural-network

DRAG DROP - You are planning to host practical training to acquaint staff with Docker for Windows. Staff devices must support the installation of Docker. Which of the following are requirements for this installation? Answer by dragging the correct options from the list to the answer area. Select and Place: - 2 GB of system RAM - 4 GB of system RAM - BIOS-enabled virtualization - Microsoft Hardware-Assisted Virtualization Detection Tool - Windows 10 64-bit - Windows 10 32-bit

Correct Answer: 4 GB of system RAM, BIOS-enabled virtualization, Windows 10 64-bit Reference: https://docs.docker.com/toolbox/toolbox_install_windows/ https://blogs.technet.microsoft.com/canitpro/2015/09/08/step-by-step-enabling-hyper-v-for-use-on-windows-10/ https://docs.docker.com/docker-for-windows/install/

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with constructing a machine learning model that translates language text into a different language text. The machine learning model must be constructed and trained to learn the sequence of the. Recommendation: You make use of Recurrent Neural Networks (RNNs). Will the requirements be satisfied? A. Yes B. No

Correct Answer: A Note: RNNs are designed to take sequences of text as inputs or return sequences of text as outputs, or both. They're called recurrent because the network's hidden layers have a loop in which the output and cell state from each time step become inputs at the next time step. This recurrence serves as a form of memory. It allows contextual information to flow through the network so that relevant outputs from previous time steps can be applied to network operations at the current time step. Reference: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571
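The recurrence described above can be illustrated with a toy example. This is a deliberately simplified sketch: a single scalar hidden state with a tanh update and made-up weights, not an LSTM with a separate cell state and not a translation model. It only shows how the state from one time step feeds the next, so that input order matters.

```python
import math


def run_rnn(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Run a one-unit recurrent layer over a sequence and return every hidden state.

    w_x, w_h and bias are arbitrary illustrative weights, not trained values.
    """
    hidden = 0.0
    states = []
    for x in inputs:
        # The previous hidden state is fed back in alongside the current input.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# Same values, opposite order: the final state differs because of the memory.
forward = run_rnn([1.0, 0.0, -1.0])
backward = run_rnn([-1.0, 0.0, 1.0])
```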

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of carrying out feature engineering on a dataset. You want to add a feature to the dataset and fill the column value. Recommendation: You must make use of the Edit Metadata Azure Machine Learning Studio module. Will the requirements be satisfied? A. Yes B. No

Correct Answer: A Alternate Answer: B (Edit metadata can change existing column names not add new ones) Typical metadata changes might include marking columns as features. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/edit-metadata https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/join-data https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-categorical-values

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of carrying out feature engineering on a dataset. You want to add a feature to the dataset and fill the column value. Recommendation: You must make use of the Join Data Azure Machine Learning Studio module. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B The Add Columns module needs to be used. Join Data is needed only for database-style joins. Reference: https://docs.microsoft.com/bs-cyrl-ba/azure/machine-learning/component-reference/add-columns https://docs.microsoft.com/bs-cyrl-ba/azure/machine-learning/component-reference/join-data

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are planning to make use of Azure Machine Learning designer to train models. You need to choose a suitable compute type. Recommendation: You choose Attached compute. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Alternative Comment: (Training on attached compute is not possible when using designer according to this: https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target) Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-compute-studio

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with constructing a machine learning model that translates language text into a different language text. The machine learning model must be constructed and trained to learn the sequence of the. Recommendation: You make use of Generative Adversarial Networks (GANs). Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Comments: (GANs used for image translation)

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with constructing a machine learning model that translates language text into a different language text. The machine learning model must be constructed and trained to learn the sequence of the. Recommendation: You make use of Convolutional Neural Networks (CNNs). Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Comments: (Use a Recurrent Neural Network (RNN) for translation. CNNs are used for image classification.)

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values. You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset. Recommendation: You make use of the Custom substitution value option. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Alternate Answer: A (answer should be YES, because we can use '0' for numeric and 'na' for text columns) Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

You are in the process of constructing a deep convolutional neural network (CNN). The CNN will be used for image classification. You notice that the CNN model you constructed displays hints of overfitting. You want to make sure that overfitting is minimized, and that the model is converged to an optimal fit. Which of the following is TRUE with regards to achieving your goal? A. You have to add an additional dense layer with 512 input units, and reduce the amount of training data. B. You have to add L1/L2 regularization, and reduce the amount of training data. C. You have to reduce the amount of training data and make use of training data augmentation. D. You have to add L1/L2 regularization, and make use of training data augmentation. E. You have to add an additional dense layer with 512 input units, and add L1/L2 regularization.

Correct Answer: B Alternative Answer: D YES (all responses) ("data augmentation simply means increasing size of the data that is increasing the number of images present in the dataset.. using data augmentation a lot of similar images can be generated. This helps in increasing the dataset size and thus reduce overfitting." https://www.kdnuggets.com/2019/12/5-techniques-prevent-overfitting-neural-networks.html) B: Weight regularization provides an approach to reduce the overfitting of a deep learning neural network model on the training data and improve the performance of the model on new data, such as the holdout test set. Keras provides a weight regularization API that allows you to add a penalty for weight size to the loss function. Three different regularizer instances are provided; they are: ✑ L1: Sum of the absolute weights. ✑ L2: Sum of the squared weights. ✑ L1L2: Sum of the absolute and the squared weights. Because a fully connected layer occupies most of the parameters, it is prone to overfitting. One method to reduce overfitting is dropout. At each training stage, individual nodes are either "dropped out" of the net with probability 1-p or kept with probability p, so that a reduced network is left; incoming and outgoing edges to a dropped-out node are also removed. By avoiding training all nodes on all training data, dropout decreases overfitting. Reference: https://machinelearningmastery.com/how-to-reduce-overfitting-in-deep-learning-with-weight-regularization/ https://en.wikipedia.org/wiki/Convolutional_neural_network
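The penalties and the dropout step named in the explanation above can be written out directly. This is a sketch in plain Python rather than the Keras API; the function names are hypothetical, and the rescaling of kept activations by 1/p ("inverted dropout") is an assumption that the source does not mention.

```python
import random


def l1_penalty(weights, rate):
    """L1 term added to the loss: rate * sum of absolute weights."""
    return rate * sum(abs(w) for w in weights)


def l2_penalty(weights, rate):
    """L2 term added to the loss: rate * sum of squared weights."""
    return rate * sum(w * w for w in weights)


def dropout(activations, keep_prob, rng):
    """Keep each activation with probability keep_prob, otherwise zero it.

    Kept values are scaled by 1 / keep_prob (an assumed, commonly used variant)
    so the expected activation stays the same during training.
    """
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


layer_weights = [1.0, -2.0, 0.5]
l1 = l1_penalty(layer_weights, rate=0.1)
l2 = l2_penalty(layer_weights, rate=0.01)
l1_l2 = l1 + l2  # the combined L1L2 penalty
dropped = dropout([0.3, 0.9, 0.1, 0.7], keep_prob=0.5, rng=random.Random(0))
```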

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Deep Learning Virtual Machine (DLVM) Windows edition. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B DLVM is a template on top of DSVM image. In terms of the packages, GPU drivers etc are all there in the DSVM image. Mostly it is for convenience during creation where we only allow DLVM to be created on GPU VM instances on Azure. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are planning to make use of Azure Machine Learning designer to train models. You need to choose a suitable compute type. Recommendation: You choose Inference cluster. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-compute-studio

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values. You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset. Recommendation: You make use of the Replace with median option. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

You have been tasked with designing a deep learning model, which accommodates the most recent edition of Python, to recognize language. You have to include a suitable deep learning framework in the Data Science Virtual Machine (DSVM). Which of the following actions should you take? A. You should consider including Rattle. B. You should consider including TensorFlow. C. You should consider including Theano. D. You should consider including Chainer.

Correct Answer: B Reference: https://www.infoworld.com/article/3278008/what-is-tensorflow-the-machine-learning-library-explained.html

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B The Azure Geo AI Data Science VM (Geo-DSVM) delivers geospatial analytics capabilities from Microsoft's Data Science VM. Specifically, this VM extends the AI and data science toolkits in the Data Science VM by adding ESRI's market-leading ArcGIS Pro Geographic Information System. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

You need to consider the underlined segment to establish whether it is accurate. To transform a categorical feature into a binary indicator, you should make use of the Clean Missing Data module. Select "No adjustment required" if the underlined segment is accurate. If the underlined segment is inaccurate, select the accurate option. A. No adjustment required. B. Convert to Indicator Values C. Apply SQL Transformation D. Group Categorical Values

Correct Answer: B Use the Convert to Indicator Values module in Azure Machine Learning Studio. The purpose of this module is to convert columns that contain categorical values into a series of binary indicator columns that can more easily be used as features in a machine learning model. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-indicator-values
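What the module described above produces can be sketched in a few lines: one 0/1 column per distinct category. This is an illustration in plain Python, not the module's implementation; `to_indicator_columns` and the sample column are hypothetical.

```python
def to_indicator_columns(values):
    """Turn one categorical column into a dict of binary indicator columns."""
    categories = sorted(set(values))
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical categorical feature.
colors = ["red", "green", "red", "blue"]
indicators = to_indicator_columns(colors)
```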

You have been tasked with ascertaining if two sets of data differ considerably. You will make use of Azure Machine Learning Studio to complete your task. You plan to perform a paired t-test. Which of the following are conditions that must apply to use a paired t-test? (Choose all that apply.) A. All scores are independent from each other. B. You have matched pairs of scores. C. The sampling distribution of d is normal. D. The sampling distribution of x1 - x2 is normal.

Correct Answer: BC Alternative Answer: ABC YES Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/test-hypothesis-using-t-test
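In the conditions above, d is the per-pair difference, which is why matched pairs are required. A minimal sketch of the paired t statistic, t = mean(d) / (sd(d) / sqrt(n)), follows; the function name and the before/after scores are invented for illustration, and no p-value is computed.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for matched pairs of scores."""
    if len(before) != len(after):
        raise ValueError("a paired t-test needs matched pairs of scores")
    diffs = [a - b for a, b in zip(after, before)]  # d for each pair
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of d
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical scores for the same four subjects measured twice.
before_scores = [10, 12, 9, 11]
after_scores = [12, 14, 10, 14]
t_value, dof = paired_t_statistic(before_scores, after_scores)
```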

You need to implement a Data Science Virtual Machine (DSVM) that supports the Caffe2 deep learning framework. Which of the following DSVM should you create? A. Windows Server 2012 DSVM B. Windows Server 2016 DSVM C. Ubuntu 16.04 DSVM D. CentOS 7.4 DSVM

Correct Answer: C Caffe2 is supported by Data Science Virtual Machine for Linux. Microsoft offers Linux editions of the DSVM on Ubuntu 16.04 LTS and CentOS 7.4. However, only the DSVM on Ubuntu is preconfigured for Caffe2. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

You make use of Azure Machine Learning Studio to develop a linear regression model. You perform an experiment to assess various algorithms. Which of the following is an algorithm that reduces the variances between actual and predicted values? A. Fast Forest Quantile Regression B. Poisson Regression C. Boosted Decision Tree Regression D. Linear Regression

Correct Answer: C Mean absolute error (MAE) measures how close the predictions are to the actual outcomes; thus, a lower score is better. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/boosted-decision-tree-regression https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/linear-regression
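The mean absolute error mentioned above is the average of |actual - predicted|. A small sketch, with hypothetical values, shows why a lower score means predictions are closer to the actual outcomes.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical regression outputs.
actual_values = [3.0, 5.0, 2.0]
predicted_values = [2.5, 5.0, 4.0]
mae = mean_absolute_error(actual_values, predicted_values)
```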

You want to train a classification model using data located in a comma-separated values (CSV) file. The classification model will be trained via the Automated Machine Learning interface using the Classification task type. You have been informed that only linear models need to be assessed by the Automated Machine Learning. Which of the following actions should you take? A. You should disable deep learning. B. You should enable automatic featurization. C. You should disable automatic featurization. D. You should set the task type to Forecasting.

Correct Answer: C Reference: https://econml.azurewebsites.net/spec/estimation/dml.html https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-automated-ml-for-ml-models

You are planning to host practical training to acquaint learners with data visualization creation using Python. Learner devices are able to connect to the internet. Learner devices are currently NOT configured for Python development. Also, learners are unable to install software on their devices as they lack administrator permissions. Furthermore, they are unable to access Azure subscriptions. It is imperative that learners are able to execute Python-based data visualization code. Which of the following actions should you take? A. You should consider configuring the use of Azure Container Instance. B. You should consider configuring the use of Azure BatchAI. C. You should consider configuring the use of Azure Notebooks. D. You should consider configuring the use of Azure Kubernetes Service.

Correct Answer: C Reference: https://notebooks.azure.com/

HOTSPOT - Complete the sentence by selecting the correct option in the answer area. Hot Area: >>1<< is a data cleaning option of the Clean Missing Data module that does not require predictors for each column. >>1<<: - Probabilistic PCA - Median - SMOTE - Custom substitution value

Correct Answer: Probabilistic PCA Replace using Probabilistic PCA: Compared to other options, such as Multiple Imputation using Chained Equations (MICE), this option has the advantage of not requiring the application of predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore, it might offer better performance for datasets that have missing values in many columns. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation. You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice. Recommendation: You configure the use of the value k=1. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B

You are preparing to train a regression model via automated machine learning. The data available to you has features with missing values, as well as categorical features with few discrete values. You want to make sure that automated machine learning is configured as follows: ✑ missing values must be automatically imputed. ✑ categorical features must be encoded as part of the training task. Which of the following actions should you take? A. You should make use of the featurization parameter with the 'auto' value pair. B. You should make use of the featurization parameter with the 'off' value pair. C. You should make use of the featurization parameter with the 'on' value pair. D. You should make use of the featurization parameter with the 'FeaturizationConfig' value pair.

Correct Answer: A Featurization str or FeaturizationConfig Values: 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Column type is automatically detected. Based on the detected column type preprocessing/featurization is done as follows: Categorical: Target encoding, one hot encoding, drop high cardinality categories, impute missing values. Numeric: Impute missing values, cluster distance, weight of evidence. DateTime: Several features such as day, seconds, minutes, hours etc. Text: Bag of words, pre-trained Word embedding, text target encoding. Reference: https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Data Science Virtual Machine (DSVM) Windows edition. Will the requirements be satisfied? A. Yes B. No

Correct Answer: A In the DSVM, your training models can use deep learning algorithms on hardware that's based on graphics processing units (GPUs). PostgreSQL is available for the following operating systems: Linux (all recent distributions); macOS (OS X) version 10.6 and newer (64-bit installers available); Windows (installers available for the 64-bit version; tested on the latest versions and back to Windows 2012 R2). Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation. You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice. Recommendation: You configure the use of the value k=10. Will the requirements be satisfied? A. Yes B. No

Correct Answer: A Leave One Out (LOO) cross-validation: Setting K = n (the number of observations) yields n-fold and is called leave-one-out cross-validation (LOO), a special case of the K-fold approach. LOO CV is sometimes useful but typically doesn't shake up the data enough. The estimates from each fold are highly correlated and hence their average can have high variance. This is why the usual choice is K=5 or 10. It provides a good compromise for the bias-variance tradeoff.
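How k controls the splits can be seen in a short sketch. This is an unshuffled, plain-Python illustration with a hypothetical helper name, not the Cross Validate Model module. It rejects k=1 because a single fold would leave no data to train on, and k equal to the number of samples gives leave-one-out.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds_5 = k_fold_indices(10, 5)     # a usual choice
folds_loo = k_fold_indices(10, 10)  # leave-one-out: K = n
```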

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are planning to make use of Azure Machine Learning designer to train models. You need to choose a suitable compute type. Recommendation: You choose Compute cluster. Will the requirements be satisfied? A. Yes B. No

Correct Answer: A Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-compute-studio

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values. You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset. Recommendation: You make use of the Remove entire row option. Will the requirements be satisfied? A. Yes B. No

Correct Answer: A Remove entire row: Completely removes any row in the dataset that has one or more missing values. This is useful if the missing value can be considered randomly missing. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data
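Two of the cleaning modes that appear in this question series can be compared side by side. The sketch below is plain Python with hypothetical helper names and made-up rows, using None for a missing value; it is not the Clean Missing Data module itself.

```python
import statistics


def remove_rows_with_missing(rows):
    """Drop every row that has one or more missing (None) values."""
    return [row for row in rows if all(value is not None for value in row)]


def replace_with_median(rows, column):
    """Fill missing values in one numeric column with that column's median."""
    present = [row[column] for row in rows if row[column] is not None]
    median = statistics.median(present)
    return [
        [median if (i == column and value is None) else value for i, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical dataset with gaps in both columns.
data = [[1.0, 2.0], [None, 3.0], [5.0, None], [3.0, 4.0]]
complete_rows = remove_rows_with_missing(data)
median_filled = replace_with_median(data, column=0)
```

Removing rows guarantees no missing values remain but shrinks the dataset, while median replacement keeps every row and only applies to numeric columns.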

You need to consider the underlined segment to establish whether it is accurate. To increase the number of low incidence cases in a dataset, you should make use of the SMOTE module. Select "No adjustment required" if the underlined segment is accurate. If the underlined segment is inaccurate, select the accurate option. A. No adjustment required. B. Remove Duplicate Rows C. Join Data D. Edit Metadata

Correct Answer: A Use the SMOTE module in Azure Machine Learning Studio to increase the number of underrepresented cases in a dataset used for machine learning. SMOTE is a better way of increasing the number of rare cases than simply duplicating existing cases. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote
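The difference from duplicating rows is that SMOTE creates new points between existing minority cases. The following is a heavily simplified sketch of that idea, assuming a single nearest neighbor instead of SMOTE's k nearest neighbors; the function name and sample points are hypothetical, and the sample points are assumed to be distinct.

```python
import random


def synthesize_minority(samples, n_new, rng):
    """Create n_new synthetic points by interpolating toward the nearest neighbor."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Nearest other minority sample by squared Euclidean distance.
        neighbor = min(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 2.0)]
new_points = synthesize_minority(minority, n_new=4, rng=random.Random(42))
```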

HOTSPOT - Complete the sentence by selecting the correct option in the answer area. Hot Area: To move a large dataset from Azure Machine Learning Studio to a Weka environment, the data must be converted to this format: - CSV - DOCX - ARFF - TXT

Correct Answer: ARFF Use the Convert to ARFF module in Azure Machine Learning Studio, to convert datasets and results in Azure Machine Learning to the attribute-relation file format used by the Weka toolset. This format is known as ARFF. The ARFF data specification for Weka supports multiple machine learning tasks, including data preprocessing, classification, and feature selection. In this format, data is organized by entities and their attributes, and is contained in a single text file. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-arff
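To make the "entities and their attributes in a single text file" layout concrete, here is a small sketch that writes an ARFF document by hand. It is not the Convert to ARFF module; `to_arff` and the sample data are hypothetical, and it does not handle quoting of values that contain spaces or commas.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text from attribute definitions and data rows.

    attributes is a list of (name, kind) pairs, where kind is either a type
    string such as "NUMERIC" or a list of allowed nominal values.
    """
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "iris",
    [("sepal_length", "NUMERIC"), ("sepal_width", "NUMERIC"),
     ("species", ["setosa", "versicolor"])],
    [(5.1, 3.5, "setosa"), (7.0, 3.2, "versicolor")],
)
```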

This question is included in a number of questions that depicts the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of carrying out feature engineering on a dataset. You want to add a feature to the dataset and fill the column value. Recommendation: You must make use of the Group Categorical Values Azure Machine Learning Studio module. Will the requirements be satisfied? A. Yes B. No

Correct Answer: B

You have been tasked with creating a new Azure pipeline via the Machine Learning designer. You have to make sure that the pipeline trains a model using data in a comma-separated values (CSV) file that is published on a website. A dataset for this file does not exist. Data from the CSV file must be ingested into the designer pipeline with the least possible administrative effort. Which of the following actions should you take? A. You should make use of the Convert to TXT module. B. You should add the Copy Data object to the pipeline. C. You should add the Import Data object to the pipeline. D. You should add the Dataset object to the pipeline.

Correct Answer: D Alternate Answer: C YES (There is a component called Import Data and it is in fact the correct answer to this question. Check here: https://docs.microsoft.com/en-us/azure/machine-learning/component-reference/import-data) The preferred way to provide data to a pipeline is a Dataset object. The Dataset object points to data that lives in or is accessible from a datastore or at a web URL. The Dataset class is abstract, so you will create an instance of either a FileDataset (referring to one or more files) or a TabularDataset that's created from one or more files with delimited columns of data. Example:
from azureml.core import Dataset
iris_tabular_dataset = Dataset.Tabular.from_delimited_files([(def_blob_store, 'train-dataset/iris.csv')])
Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline

You construct a machine learning experiment via Azure Machine Learning Studio. You would like to split data into two separate datasets. Which of the following actions should you take? A. You should make use of the Split Data module. B. You should make use of the Group Categorical Values module. C. You should make use of the Clip Values module. D. You should make use of the Group Data into Bins module.

Correct Answer: D Alternate Answer: A (all comments) The Group Data into Bins module supports multiple options for binning data. You can customize how the bin edges are set and how values are apportioned into the bins. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins

You make use of Azure Machine Learning Studio to create a binary classification model. You are preparing to carry out a parameter sweep of the model to tune hyperparameters. You have to make sure that the sweep allows for every possible combination of hyperparameters to be iterated. Also, the computing resources needed to carry out the sweep must be reduced. Which of the following actions should you take? A. You should consider making use of the Selective grid sweep mode. B. You should consider making use of the Measured grid sweep mode. C. You should consider making use of the Entire grid sweep mode. D. You should consider making use of the Random grid sweep mode.

Correct Answer: D Maximum number of runs on random grid: This option controls the number of iterations over a random sampling of parameter values, but the values are not generated randomly from the specified range; instead, a matrix of all possible combinations of parameter values is created and a random sample is taken over that matrix. This method is more efficient and less prone to regional oversampling or undersampling. If you are training a model that supports an integrated parameter sweep, you can also set a range of seed values and iterate over the random seeds as well. This is optional, but can be useful for avoiding bias introduced by seed selection. C: Entire grid: When you select this option, the module loops over a grid predefined by the system to try different combinations and identify the best learner. This option is useful when you don't know what the best parameter settings might be and want to try all possible combinations of values. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/tune-model-hyperparameters
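
The difference between the two modes described above can be sketched in plain Python: build the matrix of all possible combinations, then draw a random sample from it. This is a minimal sketch of the idea, not the module's implementation; the function name `random_grid_sweep`, the `max_runs` parameter, and the hyperparameter names and values are all hypothetical.

```python
import itertools
import random


def random_grid_sweep(param_grid, max_runs, seed=0):
    """Return (full_grid, sampled) for a dict of candidate values.

    full_grid holds every combination (what an entire-grid sweep tries);
    sampled holds up to max_runs distinct combinations drawn from it.
    """
    names = sorted(param_grid)
    # Matrix of all possible combinations of parameter values.
    full_grid = [
        dict(zip(names, values))
        for values in itertools.product(*(param_grid[n] for n in names))
    ]
    rng = random.Random(seed)
    k = min(max_runs, len(full_grid))
    # Sampling without replacement: no combination is tried twice.
    return full_grid, rng.sample(full_grid, k)


# Hypothetical hyperparameters, for illustration only.
param_grid = {
    "learning_rate": [0.05, 0.1, 0.2],
    "num_leaves": [8, 32],
    "num_trees": [50, 100],
}
full_grid, sampled = random_grid_sweep(param_grid, max_runs=5)
print(len(full_grid), len(sampled))
```

Here the entire grid has 3 x 2 x 2 = 12 combinations, while the random grid trains only 5 of them, which is where the saving in compute comes from.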

You have recently concluded the construction of a binary classification machine learning model. You are currently assessing the model. You want to make use of a visualization that allows for precision to be used as the measurement for the assessment. Which of the following actions should you take? A. You should consider using Venn diagram visualization. B. You should consider using Receiver Operating Characteristic (ROC) curve visualization. C. You should consider using Box plot visualization. D. You should consider using the Binary classification confusion matrix visualization.

Correct Answer: D Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#confusion-matrix

HOTSPOT - Complete the sentence by selecting the correct option in the answer area. Hot Area: >>SSD, FPGA, GPU, PowerBI<< is required for a Deep Learning Virtual Machine (DLVM) to support Compute Unified Device Architecture (CUDA) computations.

Correct Answer: GPU A Deep Learning Virtual Machine is a pre-configured environment for deep learning using GPU instances.

DRAG DROP - You are in the process of constructing a regression model. You would like to make it a Poisson regression model. To achieve your goal, the label values need to meet certain conditions. Which of the following are relevant conditions with regard to the label data? Answer by dragging the correct options from the list to the answer area. Select and Place: Options - It must be whole numbers - It must be a negative value - It must be fractions - It must be non-discrete - It must be a positive value

Correct Answer: It must be whole numbers, It must be a positive value Poisson regression is intended for use in regression models that are used to predict numeric values, typically counts. Therefore, you should use this module to create your regression model only if the values you are trying to predict fit the following conditions: ✑ The response variable has a Poisson distribution. ✑ Counts cannot be negative. The method will fail outright if you attempt to use it with negative labels. ✑ A Poisson distribution is a discrete distribution; therefore, it is not meaningful to use this method with non-whole numbers. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/poisson-regression
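
The label conditions listed above (no negative values, whole numbers only) can be checked before training. The following is a minimal sketch in plain Python; the helper names are hypothetical, and treating zero as a valid count is an assumption based on the statement that counts cannot be negative.

```python
def is_valid_poisson_label(value):
    """True if value is a non-negative whole number (hypothetical helper)."""
    if isinstance(value, bool):
        # bool is a subclass of int, but it is not a count.
        return False
    if isinstance(value, int):
        return value >= 0
    if isinstance(value, float):
        # Accept floats such as 2.0 that hold a whole number.
        return value >= 0 and value.is_integer()
    return False


def invalid_poisson_labels(labels):
    """Return the labels that violate the Poisson conditions."""
    return [v for v in labels if not is_valid_poisson_label(v)]


labels = [0, 3, 2.0, -1, 1.5]
print(invalid_poisson_labels(labels))  # prints [-1, 1.5]
```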

DRAG DROP - You have been tasked with evaluating the performance of a binary classification model that you created. You need to choose evaluation metrics to achieve your goal. Which of the following are the metrics you would choose? Answer by dragging the correct options from the list to the answer area. Select and Place: Options: - Precision - Accuracy - Relative Squared Error - Coefficient of determination - Relative Absolute Error

Correct Answer: Precision, Accuracy The evaluation metrics available for binary classification models are: Accuracy, Precision, Recall, F1 Score, and AUC. Note: A very natural question is: 'Out of the individuals whom the model predicted as positive, how many were classified correctly (TP)?' This question can be answered by looking at the Precision of the model, which is the proportion of predicted positives that are classified correctly. The other three options (Relative Squared Error, Coefficient of determination, Relative Absolute Error) are regression metrics. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio/evaluate-model-performance
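
How these metrics follow from the confusion-matrix counts can be shown with a short worked example. This is a sketch in plain Python using the standard definitions; the function name `binary_metrics` and the sample labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: of everything predicted positive, how much was right.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of everything actually positive, how much was found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these labels TP = 2, FP = 1, FN = 2, and TN = 3, so accuracy is 5/8, precision is 2/3, and recall is 1/2.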
