Deep Learning Quiz #1

Match the term with the definition using the following values in a classification task : Accuracy

(TP + TN) / (TP + TN + FP + FN)

Match the method of normalization with the result new_score = (old_score - mean) / standard_deviation

0 mean and unit variance
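As a minimal sketch of the formula on this card, the following rescales a list of scores to zero mean and unit variance. The function name `z_score_normalize` and the sample scores are illustrative assumptions, and the population standard deviation is assumed here.

```python
from statistics import mean, pstdev


def z_score_normalize(scores):
    """Return (score - mean) / standard_deviation for every score."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot normalize")
    return [(s - mu) / sigma for s in scores]


# Hypothetical scores with mean 5 and standard deviation 2.
normalized = z_score_normalize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After the call, `normalized` has a mean of 0 and a standard deviation of 1, up to floating-point error.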

Match the cross-validation technique with the description : Nested cross-validation

A cross-validation

Match the advanced machine learning concept to an example of it : Clustering

Babies learning that 'r'

Here is an analogy: "Rose" is to "Flower" as "Porsche" is to "Automobile", because the first word is a type of the second word. "North" is to "South" as "Black" is to "White", because the second word is the opposite of the first word. And so on. A similar analogy can be drawn for four important concepts in machine learning. Fill in the blank: Classification is to regression in supervised learning as _____________ is to dimensionality reduction in unsupervised learning. Or, more succinctly: Classification is to regression as ___________________ is to dimensionality reduction.

Clustering

A friend in your machine learning class created a movie rating prediction system that judges how many stars (out of 5) a person would rate a movie they haven't seen yet given their ratings for other movies. They stated their rating system is 100% accurate according to their data. What is the best question to ask them?

Did you remember to separate your training set from your test set?

After asking a thousand people hundreds of questions about their personalities, which technique can you use to find a small set of numbers that may approximate the "Big 5" personality characteristics?

Dimensionality reduction

Normalization is not important in k-NN classification because the features with the larger range should always have a larger influence on the k-NN distance metric than other features.

False
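To illustrate why the statement is false, here is a small sketch in which an unscaled large-range feature dominates the Euclidean distance and changes which neighbor is nearest. The two features (height in meters, income in dollars), the points, and all function names are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def min_max_scale(point, lows, highs):
    """Rescale each feature to [0, 1] using the training minimum and maximum."""
    return tuple((x - lo) / (hi - lo) for x, lo, hi in zip(point, lows, highs))


def nearest(query, points):
    """Name of the training point closest to the query (1-NN)."""
    return min(points, key=lambda name: euclidean(query, points[name]))


# Hypothetical data: (height in meters, income in dollars).
train = {"A": (1.60, 50000.0), "B": (1.95, 50500.0)}
query = (1.62, 50400.0)

# Without scaling, the income column dominates the distance.
raw_nearest = nearest(query, train)

# With min-max scaling, both features contribute on the same scale.
lows = tuple(min(p[i] for p in train.values()) for i in range(2))
highs = tuple(max(p[i] for p in train.values()) for i in range(2))
scaled_train = {name: min_max_scale(p, lows, highs) for name, p in train.items()}
scaled_nearest = nearest(min_max_scale(query, lows, highs), scaled_train)
```

On the raw values the query is closest to B because of a 100-dollar income difference. After scaling it is closest to A, whose height nearly matches.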

When cross-validation is performed on the validation set, the score of the best-fitting hyperparameters on that set is, on average, lower than the score of that same model on a separate test set.

False

When you use cross-validation to select the right hyperparameter, you need a separate set of data to properly measure the accuracy of the model with that hyperparameter (due to potential overfitting). Unfortunately, many scientists don't do this (although they should)

True

When you use cross-validation to select the right hyperparameters, you do not need a separate set of test data to properly measure the quality of the model because cross-validation already separates training from testing.

False

Match the advanced machine learning concept to an example of it : Feature selection

Including "IQ"

K-fold cross-validation will lead to lower accuracies than expected with the full training set because only a fraction (K-1)/K of the data is used for training (e.g. 4/5 for K=5). The way to improve this is to increase K. But what is a problem with increasing K?

K models have to be trained which takes more time as K increases

There are three kinds of people who build machine learning models. Person A doesn't separate training from testing, and just fits the model to all the data. Person B uses cross-validation over the entire data set to pick the best hyperparameters and reports the quality of the model on that data set. Person C uses cross-validation on a validation set for hyperparameters and uses a separate test set for evaluating the model.

Person C

The proportion of correctly identified samples of class A, among the test samples that were identified as belonging to class A, is called...

Precision

Select all scenarios that are examples of supervised learning

Predicting a buyer's chance of clicking on an online advertisement based on the previous behavior of similar online shoppers.
Netflix using their database of user ratings to predict how you would rate a movie you haven't seen.

In a given binary classification problem, out of all samples of class A in the test set, the proportion that the classifier correctly identifies as class A is called...

Recall

Match the cross-validation technique with the description : Leave one out cross-validation

Same as K-fold cross-validation where K = the size of the data set

If I want to test my voice recognition software to see how well it will work on a new person it has not yet been trained on, what type of cross-validation would give me the best sense of accuracy?

Subject-wise cross-validation
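One way to sketch the idea is to hold out every sample from one subject at a time, so that the model is always tested on a person it has never seen. The function name `leave_one_subject_out` and the speaker labels are hypothetical, and holding out exactly one subject per fold is an assumption of this sketch.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test


# Hypothetical recordings: entry i names the speaker of sample i.
subject_ids = ["ann", "ann", "bob", "bob", "bob", "cy"]
splits = list(leave_one_subject_out(subject_ids))
```

No speaker appears in both the training and the test indices of any split, which is what distinguishes this from an ordinary sample-wise K-fold.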

Match the term with the definition using the following values in a classification task : Specificity (Recall for the negative case)

TN / (TN + FP)

Match the term with the definition using the following values in a classification task : Sensitivity (Recall for the positive case)

TP / (TP + FN)

Match the term with the definition using the following values in a classification task : Precision (for the positive case)

TP / (TP + FP)

Match the term with the definition using the following values in a classification task : F1 Score

The harmonic mean of precision and recall
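The metric cards above can be collected into one small sketch that computes each value from the four confusion-matrix counts. The function name and the counts are hypothetical, and the sketch assumes every denominator is nonzero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity: recall for the positive case
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),  # recall for the negative case
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts from a test set of 100 samples.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these counts, accuracy is 0.85, sensitivity is 0.8, and specificity is 0.9.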

Weather forecasters in Denton decided to build a model that predicts tomorrow's high temperature from the previous 30 days' high temperatures. To do this, they used the past year's weather data to train the model. They had perfect accuracy when using last year's data for testing. However, when they applied the same model to predict the next day's weather, they found it was off by 10 degrees. Select all statements that are likely to apply to their model.

The model is overfitting (it is too complex, too many variables).
They should have used separate sets of data for training and for testing to pick the right model.

What is the purpose of regularization in linear regression?

To improve prediction accuracy on a future test set relative to ordinary linear regression.
To decrease the coefficient values for irrelevant terms in the regression model.
To diminish the contribution of irrelevant features to the resulting model, effectively performing automated feature selection during learning.
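As a rough illustration of how regularization shrinks coefficients, the sketch below fits a one-feature, no-intercept linear model with an L2 (ridge) penalty. The penalty type, the function name `ridge_slope`, and the data are all assumptions; the card itself does not say which form of regularization is meant. Minimizing sum((y - w*x)^2) + lam * w^2 gives w = sum(x*y) / (sum(x*x) + lam).

```python
def ridge_slope(xs, ys, lam):
    """Slope of a no-intercept linear fit with an L2 penalty of strength lam."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

ols_slope = ridge_slope(xs, ys, lam=0.0)      # ordinary least squares
shrunk_slope = ridge_slope(xs, ys, lam=14.0)  # pulled toward zero
```

An L2 penalty shrinks coefficients without setting them exactly to zero. The feature-selection behavior mentioned in the answer is more typical of an L1 (lasso) penalty.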

Dimensionality reduction is useful to lower the number of features in a systematic way. Which is NOT a reason why it may be useful to reduce the dimensionality of your feature set?

To project the data into a higher dimensional space to create a linear separating hyperplane

Match the cross-validation technique with the description : K-fold cross-validation

Train your model on K-1 groups of the data set, and test on the Kth group. Repeat the process, letting each of the other K-1 groups serve as the test set in turn.
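The procedure can be sketched as an index generator, under the simplifying assumption that the data is not shuffled and folds are contiguous. The function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Five folds over ten samples: each fold trains on 8 and tests on 2.
folds = list(k_fold_indices(10, 5))
```

Setting `k` equal to `n_samples` gives one test sample per fold, which is leave-one-out cross-validation. It also means training `n_samples` models, which is the cost of a large K.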

Match the cross-validation technique with the description : Subject-wise cross-validation

When you use data

Match the advanced machine learning concept to an example of it : Deep Learning

A type of machine learning that automatically creates features from the data to help with more complex decisions. Often this is done with neural networks with multiple layers. Useful for difficult machine learning problems like speech recognition and visual object identification. Currently the new hotness in complex machine learning problems.

Match the advanced machine learning concept to an example of it : Feature Engineering

creating 'day of the week' as a feature from date strings ("Sep 20, 2017") when trying to predict if someone will be driving to work that day.
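A minimal sketch of this feature, assuming date strings always follow the "Sep 20, 2017" pattern and that month abbreviations are parsed in the default (English) locale. The function name `day_of_week` is hypothetical.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(date_string):
    """Derive the weekday name from a date string such as 'Sep 20, 2017'."""
    parsed = datetime.strptime(date_string, "%b %d, %Y")
    return DAY_NAMES[parsed.weekday()]  # weekday(): Monday is 0


feature = day_of_week("Sep 20, 2017")
```

The raw date string carries little signal for a model, while the derived weekday separates workdays from weekends directly.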

If you are picking among many different model variants you generally split your data into three different groups. Match the group to a property of that group : Training set

data used to create the models

If k-nearest neighbors was your model of how you make decisions, which value of k would be more likely to be superstitious (lead to poor generalization, fit to the noise, be "too complex" of a model...)?

k=1
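The sketch below shows the effect on a toy one-dimensional data set containing a single noisy label. With k=1 the prediction copies the noise, while k=3 outvotes it. The data and the function name `knn_predict` are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training points closest to the query."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 1-D data: low values are "a", high values are "b",
# except for one noisy "b" sitting at 3.0 among the "a" points.
train = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (4.0, "a"),
         (5.0, "a"), (9.0, "b"), (10.0, "b")]

prediction_k1 = knn_predict(train, 3.1, k=1)  # follows the noisy point
prediction_k3 = knn_predict(train, 3.1, k=3)  # neighbors outvote the noise
```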

If you are picking among many different model variants you generally split your data into three different groups. Match the group to a property of that group : Validation set

like a test set but it is used to decide which model variant is selected to later apply to the test set

Match the method of normalization with the result new_score = (old_score - min) / (max - min)

range between 0 and 1
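A minimal sketch of this formula, with a hypothetical function name and sample scores:

```python
def min_max_normalize(scores):
    """Return (score - min) / (max - min) for every score."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("all scores are equal; cannot normalize")
    return [(s - lo) / (hi - lo) for s in scores]


# The smallest score maps to 0 and the largest to 1.
scaled = min_max_normalize([50.0, 60.0, 75.0, 100.0])
```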

If you are picking among many different model variants you generally split your data into three different groups. Match the group to a property of that group : Test set

the data set used to evaluate the selected model that has been trained on the other two data sets
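The three groups can be sketched as a shuffled three-way split. The 60/20/20 proportions, the fixed seed, and the function name `train_val_test_split` are assumptions for illustration, not values from the quiz.

```python
import random


def train_val_test_split(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into train, validation, and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                 # evaluates the selected model
    val = shuffled[n_test:n_test + n_val]    # chooses among model variants
    train = shuffled[n_test + n_val:]        # fits the models
    return train, val, test


train, val, test = train_val_test_split(range(100))
```

The test portion is set aside first and is only used once, after the validation set has decided which model variant to keep.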

