DP-100 Data Scientist Questions

You plan to run a script as an experiment using a Script Run Configuration. The script uses modules from the scipy library as well as several Python packages that are not typically installed in a default conda environment. You plan to run the experiment on your local workstation for small datasets and scale out the experiment by running it on more powerful remote compute clusters for larger datasets. You need to ensure that the experiment runs successfully on local and remote compute with the least administrative effort. What should you do? Options: A. Do not specify an environment in the run configuration for the experiment. Run the experiment by using the default environment. B. Create a virtual machine (VM) with the required Python configuration and attach the VM as a compute target. Use this compute target for all experiment runs. C. Create and register an Environment that includes the required packages. Use this Environment for all experiment runs. D. Create a config.yaml file defining the conda packages that are required and save the file in the experiment folder. E. Always run the experiment with an Estimator by using the default packages.

Answer: C Explanation: If you have an existing Conda environment on your local computer, you can use the service to create an environment object from it. With this strategy, you can reuse your local interactive environment on remote runs. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments

HOTSPOT Complete the sentence by selecting the correct option in the answer area. >>1<< is required for a Deep Learning Virtual Machine (DLVM) to support Compute Unified Device Architecture (CUDA) computations. >>1<<: SSD, FPGA, GPU, PowerBI

Answer: GPU Explanation: CUDA computations run on GPUs; an SSD has no bearing on CUDA support. A Deep Learning Virtual Machine is a pre-configured environment for deep learning using GPU instances.

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Data Science Virtual Machine (DSVM) Windows edition. Will the requirements be satisfied? Options: A. Yes B. No

Answer: A Explanation: In the DSVM, your training models can use deep learning algorithms on hardware that's based on graphics processing units (GPUs). PostgreSQL is available for the following operating systems: Linux (all recent distributions), macOS (64-bit installers for OS X version 10.6 and newer), and Windows (64-bit installers; tested on the latest versions and back to Windows 2012 R2). Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation. You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice. Recommendation: You configure the use of the value k=10. Will the requirements be satisfied? Options: A. Yes B. No

Answer: A Explanation: Setting K = n (the number of observations) yields n-fold cross-validation, known as leave-one-out (LOO) cross-validation, a special case of the K-fold approach. LOO CV is sometimes useful but typically doesn't shake up the data enough: the estimates from each fold are highly correlated, so their average can have high variance. This is why the usual choice is K = 5 or 10, which provides a good compromise for the bias-variance tradeoff.
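To make the role of k concrete, here is a minimal sketch of how a k-fold split partitions sample indices, written in plain Python. The function name `k_fold_indices` and the sample count of 25 are assumptions for illustration only; this is not the implementation used by Azure Machine Learning or any particular library.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds.

    Hypothetical helper for illustration: each sample lands in exactly
    one test fold, and fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# k = 10 is one of the usual choices; k = n_samples would be leave-one-out.
folds = list(k_fold_indices(n_samples=25, k=10))
for train, test in folds:
    print(len(train), len(test))
```

With k equal to the number of samples, every test fold holds a single observation, which is the leave-one-out case described in the explanation.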

HOTSPOT Complete the sentence by selecting the correct option in the answer area. To move a large dataset from Azure Machine Learning Studio to a Weka environment, the data must be converted to the >>1<< format. >>1<<: CSV, DOCX, ARFF, TXT

Answer: ARFF Explanation: Use the Convert to ARFF module in Azure Machine Learning Studio to convert datasets and results in Azure Machine Learning to the attribute-relation file format (ARFF) used by the Weka toolset. The ARFF data specification for Weka supports multiple machine learning tasks, including data preprocessing, classification, and feature selection. In this format, data is organized by entities and their attributes, and is contained in a single text file. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-arff
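As a rough illustration of what an ARFF file looks like, the sketch below builds one by hand in plain Python. The helper `to_arff`, the relation name `houses`, and the sample rows are all hypothetical; this is not the Convert to ARFF module itself, only a picture of the single-text-file layout it targets (a relation header, attribute declarations, then data rows, with `?` marking a missing value).

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    Hypothetical helper for illustration. `attributes` is a list of
    (name, type) pairs, where type is an ARFF type such as NUMERIC or a
    nominal set like {a,b}. None values are written as '?'.
    """
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Made-up sample data, used only to show the layout.
attributes = [
    ("sqft", "NUMERIC"),
    ("city", "{seattle,boston}"),
    ("price", "NUMERIC"),
]
rows = [
    (1200, "seattle", 350000),
    (None, "boston", 410000),
]
arff_text = to_arff("houses", attributes, rows)
print(arff_text)
```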

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with evaluating your model on a partial data sample via k-fold cross-validation. You have already configured a k parameter as the number of splits. You now have to configure the k parameter for the cross-validation with the usual value choice. Recommendation: You configure the use of the value k=3. Will the requirements be satisfied? Options: A. Yes B. No

Answer: B Explanation: The usual choice for k is 5 or 10, so k=3 does not meet the requirement.

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Deep Learning Virtual Machine (DLVM) Windows edition. Will the requirements be satisfied? Options: A. Yes B. No

Answer: B Explanation: The DLVM is a template on top of the DSVM image. The packages, GPU drivers, and so on are all present in the DSVM image. The DLVM exists mostly for convenience at creation time, because a DLVM can only be created on GPU VM instances on Azure. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

You have been tasked with designing a deep learning model, which accommodates the most recent edition of Python, to recognize language. You have to include a suitable deep learning framework in the Data Science Virtual Machine (DSVM). Which of the following actions should you take? Options: A. You should consider including Rattle. B. You should consider including TensorFlow. C. You should consider including Theano. D. You should consider including Chainer.

Answer: B Reference: https://www.infoworld.com/article/3278008/what-is-tensorflow-the-machine-learning-library-explained.html

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You have been tasked with employing a machine learning model, which makes use of a PostgreSQL database and needs GPU processing, to forecast prices. You are preparing to create a virtual machine that has the necessary tools built into it. You need to make use of the correct virtual machine type. Recommendation: You make use of a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition. Will the requirements be satisfied? Options: A. Yes B. No

Answer: B Explanation: The Azure Geo AI Data Science VM (Geo-DSVM) delivers geospatial analytics capabilities from Microsoft's Data Science VM. Specifically, this VM extends the AI and data science toolkits in the Data Science VM by adding ESRI's market-leading ArcGIS Pro Geographic Information System. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

This question is included in a number of questions that depict the identical set-up. However, every question has a distinctive result. Establish if the recommendation satisfies the requirements. You are in the process of creating a machine learning model. Your dataset includes rows with null and missing values. You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset. Recommendation: You make use of the Replace with median option. Will the requirements be satisfied? Options: A. Yes B. No

Answer: B Reference: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data
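For reference, median replacement fills each missing value in a numeric column with the median of that column's observed values. The sketch below shows the idea in plain Python; the function `replace_with_median` and the sample column are hypothetical, and this is not the Clean Missing Data module's own code.

```python
from statistics import median


def replace_with_median(column):
    """Return a copy of `column` with None replaced by the column median.

    Hypothetical helper for illustration. The median is computed from
    the observed (non-missing) values only.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        # Nothing to compute a median from; leave the column unchanged.
        return list(column)
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Made-up numeric column with two missing entries.
ages = [34, None, 29, 41, None, 50]
cleaned = replace_with_median(ages)
print(cleaned)
```

Every missing entry receives the same single value, so the option treats each column independently and does not use the other columns to estimate what is missing.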

You need to implement a Data Science Virtual Machine (DSVM) that supports the Caffe2 deep learning framework. Which of the following DSVMs should you create? Options: A. Windows Server 2012 DSVM B. Windows Server 2016 DSVM C. Ubuntu 16.04 DSVM D. CentOS 7.4 DSVM

Answer: C Explanation: Caffe2 is supported by Data Science Virtual Machine for Linux. Microsoft offers Linux editions of the DSVM on Ubuntu 16.04 LTS and CentOS 7.4. However, only the DSVM on Ubuntu is preconfigured for Caffe2. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

