Deep Learning Midterm 2


The Jaccard Coefficient is represented by: σ(X,Y)= (|X∩Y|)/(|X∪Y|) σ(X,Y)= (|X∪Y|)/(|X∩Y|) σ(X,Y)= (|X-Y|)/(|X∪Y|) σ(X,Y)= (|X∩Y|)/(|X-Y|)

σ(X,Y)= (|X∩Y|)/(|X∪Y|)
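A quick way to check the formula is to compute it on two small sets. This is a minimal sketch in plain Python; the function name `jaccard` and the convention of returning 1.0 for two empty sets are assumptions made for illustration, not part of the original card.

```python
def jaccard(x, y):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y| for two iterables."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(x & y) / len(union)


# 2 shared elements (b, c) out of 4 distinct elements overall.
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```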

As a default contrib.learn.DNNClassifier(hidden_units=[300,100], n_classes=10, feature_columns=feature_cols) will use A max value classifier A softmax classifier

A softmax classifier
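A softmax output layer turns the network's raw scores (logits) into class probabilities that sum to 1, and the predicted class is the one with the highest probability. The sketch below is plain Python, not the internal code of DNNClassifier, and the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])           # hypothetical logits for 3 classes
predicted_class = probs.index(max(probs))  # index of the most probable class
print(probs, predicted_class)
```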

Manifold assumption, also called the manifold hypothesis, holds that most real-world X-dimensional datasets lie close to a much Y-dimensional manifold. (X=high, Y=high) (X=low, Y=low) (X=low, Y=high) (X=high, Y=low)

(X=high, Y=low)

In an LSTM 1) A ______________ gate controls which parts of the long-term state should be erased. 2) A ______________ gate controls which parts should be added to the long-term state. 3) A _______________ gate controls which parts of the long-term state should be read and output at this time step.

1) forget 2) input 3) output
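The three gates can be seen at work in one time step of a toy LSTM cell. This is a sketch under simplifying assumptions: a single unit with scalar input and state, and a hypothetical parameter dictionary `p` whose key names are invented here. It follows the standard LSTM equations, not any particular library's implementation. Note that the cell carries two states: the short-term state `h` and the long-term state `c`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell with scalar input and state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate values
    c = f * c_prev + i * g  # forget gate erases, input gate adds
    h = o * math.tanh(c)    # output gate decides what is read out
    return h, c


# Hypothetical parameters: all weights and biases zero, so every gate is 0.5.
names = ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg"]
params = {name: 0.0 for name in names}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=params)
print(h, c)  # half of the old long-term state survives in c
```

Setting the forget-gate bias `bf` to a large negative value drives `f` toward 0, which erases almost all of the previous long-term state.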

Match the technique with the picture: (tsne=t-SNE, iso=Isomap, mds=Multidimensional scaling) 1) Swirl 2) Stretched parallelogram 3) Abstract fractal

1) mds 2) iso 3) tsne

If we are training the skip-gram model with a context window of size 2, then for each center word, how many training examples are typically generated? (choose the best answer) 2 4 6 8
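One way to reason about this is to enumerate the (center, context) pairs directly. The sketch below assumes the plain skip-gram pairing scheme, where every word within `window` positions on either side of the center word yields one training example. The helper name `skipgram_pairs` and the sample sentence are made up for illustration; implementations that subsample words or randomly shrink the window will produce fewer pairs.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every position in tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # the center word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


tokens = "the quick brown fox jumps".split()
pairs = skipgram_pairs(tokens, window=2)
brown_pairs = [p for p in pairs if p[0] == "brown"]
print(brown_pairs)  # 2 words on the left + 2 on the right = 4 pairs
```

A center word with a full window on both sides produces 2 x window pairs; words near the start or end of the text produce fewer.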

You're given a collection of images and asked to build a model that classifies each image into 1 of 100 classes. You decide to build a DNN for this. Each image is 30x30 pixels and the training batch size is 50. 1) The number of neurons in the passthrough (input) layer is: _______ 2) If you have 1000 neurons in the first hidden layer, then the weights matrix at this layer has _______ by _______ dimensions 3) If you have 800 neurons in the second hidden layer, then the weights matrix at this layer has _______ by _______ dimensions 4) The number of neurons in the output layer is _______

1) 900 2) 900 by 1000 3) 1000 by 800 4) 100
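The shapes follow from chaining the layer sizes: each weights matrix has one row per input to the layer and one column per neuron in the layer. The sketch below assumes single-channel (grayscale) images, so that 30x30 pixels gives 900 inputs, and the inputs-by-neurons convention (X times W). The helper name `layer_weight_shapes` is hypothetical. The batch size affects the shape of the activations, not of the weights.

```python
def layer_weight_shapes(n_inputs, hidden_units, n_outputs):
    """Return the (rows, cols) shape of each layer's weights matrix."""
    sizes = [n_inputs] + list(hidden_units) + [n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


n_inputs = 30 * 30  # assumes one value per pixel (grayscale)
shapes = layer_weight_shapes(n_inputs, [1000, 800], 100)
print(shapes)  # [(900, 1000), (1000, 800), (800, 100)]

batch_size = 50
# Output of each layer for one batch: (batch_size, neurons in that layer).
activation_shapes = [(batch_size, n_out) for _, n_out in shapes]
print(activation_shapes)  # [(50, 1000), (50, 800), (50, 100)]
```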

The vanishing/exploding gradients problem in RNNs is controlled using which of the following tricks: (Select all possible answers) Good parameter initialization Non-saturating activation function (e.g. ReLU) Batch Normalization Gradient Clipping Faster Optimizers

All of these are correct
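Of the tricks listed, gradient clipping is the easiest to show in a few lines. The sketch below clips by global norm: if the gradient vector is longer than a threshold, it is rescaled to that length while keeping its direction. The function name `clip_by_norm` and the numbers are illustrative assumptions; deep learning frameworks provide their own clipping utilities.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale grads so that their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already small enough, leave untouched
    scale = max_norm / norm
    return [g * scale for g in grads]


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)  # original norm is 5
print(clipped)  # approximately [0.6, 0.8]: same direction, norm 1
```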

Select all the Unsupervised Learning Algorithms from the list k-Means Clustering Apriori Logistic Regression Multinomial Naive Bayes Principal Component Analysis

k-means clustering, apriori, pca

In RNNs "Dynamic Unrolling Through Time" is used to Avoid 'out of memory' errors Improve the accuracy of the network Avoid the vanishing gradient problem Avoid the exploding gradient problem All of the above None of the above

Avoid 'out of memory' errors

The use of a DropoutWrapper is a TF remedy to Avoid the vanishing gradient problem Avoid the exploding gradient problem Avoid overfitting All of the above None

Avoid overfitting

When training RNNs on long sequences, they may suffer from the [blank] gradient problem Vanishing Exploding Both A and B Neither

Both A and B

True or False. A Manifold Hypothesis holds that most real world high-dimensional datasets lie very far away from a much lower-dimensional manifold based on the same data. (An example to think about would be multidimensional scaling).

False

True or False. The lower the dimensionality of the word embeddings vectors, the higher will be the accuracy on various semantic analogy tasks

False

True or False. Truncated Back Propagation through Time is when you unroll an RNN for an unlimited number of time steps during training.

False
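Truncated backpropagation through time limits the number of time steps the network is unrolled over. One simple way to picture this, sketched below, is to cut a long sequence into windows of at most `n_steps` items and train on each window. The helper name `truncate_sequence` is hypothetical, and this shows only the data-splitting idea, not a full training loop.

```python
def truncate_sequence(sequence, n_steps):
    """Split a long sequence into consecutive windows of at most n_steps items."""
    return [sequence[i:i + n_steps] for i in range(0, len(sequence), n_steps)]


chunks = truncate_sequence(list(range(10)), n_steps=4)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The trade-off is that the model cannot learn dependencies longer than a single window.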

The higher the dimensionality of the word embeddings vectors, the higher will be the accuracy on various semantic analogy tasks True False

False (Accuracy is parabolic with respect to dimensionality of word embedding vectors)

In Scikit-Learn's GridSearchCV, all you need to do is tell it which _______________ you want it to experiment with, and what values to try out, and it will evaluate all the possible combinations of ___________________ values, using cross-validation.:

hyperparameters, hyperparameter
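"All the possible combinations" means the Cartesian product of the value lists. The sketch below enumerates that product with the standard library to show how quickly the number of fits grows. It is not Scikit-Learn's implementation, and the hyperparameter names and values in `param_grid` are hypothetical.

```python
from itertools import product

# Hypothetical grid: 3 values for one hyperparameter, 4 for another.
param_grid = {"n_estimators": [3, 10, 30], "max_features": [2, 4, 6, 8]}


def all_combinations(grid):
    """Return one dict per combination of hyperparameter values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = all_combinations(param_grid)
n_folds = 5                      # assumed number of cross-validation folds
n_fits = len(combos) * n_folds   # each combination is trained once per fold
print(len(combos), n_fits)       # 12 60
```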

The best activation function to use with Linear Regression is: Sigmoid ReLU TanH None of the above

None of the above

Which of the following statements are true: 1) A high-frequency term in the corpus has a low TF-IDF score 2) High Frequency of a term in an individual document has no impact on its TF-IDF score Only 1 Only 2 Both 1 and 2 Neither

Only 1
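Both statements can be checked on a toy corpus. The sketch below uses the basic unsmoothed definition, tf x log(N / df), where a term that appears in every document gets a score of 0. The three sample documents are made up, and libraries often use smoothed variants whose numbers differ from these.

```python
import math

# Hypothetical corpus of three tokenized documents.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]


def tf(term, doc):
    """Term frequency: share of the document's tokens equal to term."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """Inverse document frequency; term must occur in at least one document."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)


def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)


print(tf_idf("the", docs[0], docs))  # 0.0: "the" occurs in every document
print(tf_idf("mat", docs[0], docs))  # positive: "mat" occurs in one document
```

Statement 2 is false because the tf factor grows with the term's count inside the document, so a higher in-document frequency raises the score whenever the idf factor is nonzero.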

Your classifier identifies 9 females in a scene where there are 11 females and 13 males. But only 6 are correct and the remaining 3 are males. What is the precision and recall? Precision is 9/11, recall is 9/13 Precision is 9/13, recall is 9/11 Precision is 6/13, recall is 6/11 Precision is 6/9, recall is 6/11

Precision is 6/9, recall is 6/11
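The arithmetic can be laid out from the counts in the question, treating "female" as the positive class. Exact fractions are used so the results match the answer choices; the variable names are chosen here for illustration.

```python
from fractions import Fraction

predicted_positive = 9  # people the classifier labeled female
true_positive = 6       # of those, actually female
actual_positive = 11    # females really in the scene

false_positive = predicted_positive - true_positive  # 3 males labeled female
false_negative = actual_positive - true_positive     # 5 females missed

precision = Fraction(true_positive, true_positive + false_positive)
recall = Fraction(true_positive, true_positive + false_negative)
print(precision, recall)  # 2/3 6/11 (6/9 reduces to 2/3)
```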

Which of the following is not a Dimensionality Reduction Approach (select all that apply) Projection Preservation Manifold Learning Principal Component Analysis Distribution

Preservation, Distribution

By default, contrib.learn.DNNClassifier(hidden_units=[300,100], n_classes=10, feature_columns=feature_cols) will use ReLU Sigmoid activation

ReLU

The curves/lines above can represent the softmax values for four inputs to the softmax function impossible! surely for some values not for four, but for two inputs (blue and green)

impossible!

Which of the following are not applications of RNNs (select all that apply) Time Series Analysis Shortest Distance Computation Parts of Speech Tagging Text Summarization Handwriting Recognition

Shortest Distance Computation

Unrolling an RNN through time and backpropagating through the unrolled network, known as backpropagation through time (BPTT), is an important step in [blank] of an RNN Training Testing Validation Compression

Training

RNNs can achieve high image recognition accuracy (>98% on MNIST data). True or False

True

True or False. PCA requires the whole training set to fit in memory in order for the

True

True or False. The GRU cell is a simplified version of the LSTM cell

True

True or False. Under the hood, the DNNClassifier class creates all the neuron layers, based on the ReLU activation function

True

In TF, when using e.g. the command lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_units=n_neurons) LSTM cells manage how many separate state vectors? One num_units Two Any number declared by a separate command

Two

The command: contrib.learn.DNNClassifier(hidden_units=[300,100], n_classes=10, feature_columns=feature_cols) will create a neural network with Two deep layers and no output layer Two layers and an output layer One hundred layers of 300 neurons and an output layer Three hundred layers of 100 neurons and an output layer

Two layers and an output layer

To analyze time series data, such as stock prices, you would use recurrent neural networks (RNN) convolutional neural networks (CNN) Either Neither

recurrent neural networks (RNN)

A ___________ node and its _____________ method are used to save models in TensorFlow

saver, save

In TensorFlow, when you evaluate [blank], TensorFlow automatically determines the set of nodes that it depends on.
