Final - Deep Learning


What is a vector?

1-dimensional tensor
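
For a concrete picture (a minimal NumPy sketch, since this set already assumes NumPy): a vector is just a tensor with a single axis.

import numpy as np

v = np.array([1.0, 2.0, 3.0])   # a vector: a 1-dimensional tensor
print(v.ndim)    # 1 -> one axis
print(v.shape)   # (3,)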

What is the value of the expression 2 + 3*4 in Python?

14
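
The value follows from Python's operator precedence: * binds tighter than +. A quick check:

print(2 + 3 * 4)    # 14, because 3 * 4 is evaluated first
print((2 + 3) * 4)  # 20, parentheses change the grouping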

What is the wrong description? a. A max-pooling layer has many parameters, whereas a convolutional layer doesn't have any. b. The main innovations in AlexNet compared to LeNet-5 are that it is much larger and deeper, and it stacks convolutional layers directly on top of each other, instead of stacking a pooling layer on top of each convolutional layer. c. The main innovation in GoogLeNet is the introduction of inception modules, which make it possible to have a much deeper net than previous CNN architectures, with fewer parameters. d. ResNet's main innovation is the introduction of skip connections, which make it possible to go well beyond 100 layers.

A) A max-pooling layer has many parameters, whereas a convolutional layer doesn't have any

What is a wrong statement? a. Python is computationally fast, so it doesn't need Numpy. b. Numpy functions return either views or copies. c. Views share data with the original array. d. np.copy makes explicit copies.

A) Python is computationally fast, so it doesn't need Numpy
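
A small illustrative sketch of the view/copy behavior from options b-d (array contents are arbitrary):

import numpy as np

a = np.arange(5)
view = a[1:4]        # slicing returns a view that shares data with a
view[0] = 99
print(a)             # [ 0 99  2  3  4] -> the original array changed

b = np.copy(a)       # np.copy makes an explicit copy
b[0] = -1
print(a[0])          # 0 -> modifying the copy does not touch a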

What is not the benefit of batch normalization? a. Increase learning rate b. Include more dropout c. Reduce L2 weight decay d. Remove local response normalization

B) Include more dropout

What is a disadvantage of TensorFlow? a. Python + Numpy b. Not many pre-trained models c. A computational graph is cool d. Easy to modify nets

B) Not many pre-trained models

What is a wrong statement? a. Pandas stands for Python Data Analysis Library. b. Pandas works with homogeneous numerical data. c. Pandas is capable of offering an in-memory 2D table object called DataFrame. d. Like Numpy, Pandas is one of the most widely used Python libraries in data science.

B) Pandas works with homogeneous numerical data
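
For context on why option b is the wrong one: a Pandas DataFrame happily mixes column dtypes, unlike a homogeneous Numpy array. A minimal sketch with made-up columns:

import pandas as pd

df = pd.DataFrame({
    "name": ["Ada", "Bo"],      # strings
    "age": [36, 41],            # integers
    "score": [0.91, 0.87],      # floats
})
print(df.dtypes)   # object, int64, float64 -> heterogeneous columns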

"If you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea." Please find the wrong description about it. a. This quote is by Antoine de Saint-Exupery. b. RNN can't be trained to write this abstract quote above. c. If you wish to build a ship, do not give directions and technical advice to others. d. It is a good guideline for being a manager, supervisor, or group lead.

B) RNN can't be trained to write this abstract quote above

What are the two main difficulties when training RNNs? a. Unstable gradients (exploding or vanishing) and available long-term memory. b. Unstable gradients and limited short-term memory c. Stable gradients and enough short-term memory d. Stable gradients and enough long-term memory

B) Unstable gradients and limited short-term memory

What is a characteristic of TensorFlow or Theano (compared to Keras)? a. The interface of TensorFlow or Theano b. Very flexible c. So easy to learn and use

B) Very flexible

What is the wrong description? a. Because consecutive layers are only partially connected and because it heavily reuses its weights, CNN has many fewer parameters than a fully connected DNN, which makes it much faster to train, reduces the risk of overfitting, and requires much less training data. b. When a DNN has learned a kernel that can detect particular features, it can detect that feature anywhere in the image. In contrast, when a CNN learns a feature in one location, it can detect it only in that particular location. c. Since images typically have very repetitive features, CNNs are able to generalize much better than DNNs for image processing tasks such as classification, using fewer training examples. d. A DNN has no prior knowledge of how pixels are organized: it does not know that nearby pixels are close.

B) When a DNN has learned a kernel that can detect particular features, it can detect that feature anywhere in the image. In contrast, when a CNN learns a feature in one location, it can detect it only in that particular location.

Common image formats use the RGB channel order. What channel order does OpenCV use?

BGR
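
A minimal OpenCV sketch; the filename image.jpg is just a placeholder and assumed to exist:

import cv2

bgr = cv2.imread("image.jpg")                 # OpenCV loads images in BGR order
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)    # convert to the usual RGB order, e.g. before plotting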

What is not the way to prevent overfitting? a. Get more data b. Use a model with the right capacity c. Use one single model d. Use dropout/drop connect/batch norm

C) Use a single model

Which one was not the limitation of a neural network having many hidden layers, according to Geoffrey Hinton's finding? a. Our labeled datasets were thousands of times too small. b. Our computers were millions of times too slow. c. We initialized the weights in a smart way. d. We used the wrong type of non-linearity.

C) We initialized the weights in a smart way

What is the wrong description? a. You can apply RBM on adjacent two layers as a pre-training step. b. There is no need to use complicated RBM for weight initialization today. c. Xavier and He's initializations are one of the batch normalizations. d. Xavier/He initialization makes the weights "just right," not too small or big.

C) Xavier and He's initializations are one of the batch normalizations

Which function needs to be minimized to optimize a machine learning model?

Cost Function

Find the wrong description. a. Batch gradient descent: we use the entire training dataset to compute the gradient. b. Stochastic gradient descent: the gradient is computed from each training sample, one by one. c. Mini-batch gradient descent: common mini-batch sizes range between 50 and 256 (but can vary). d. Adam is one of the generalization approaches.

D) Adam is one of the generalization approaches
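
An illustrative NumPy sketch of mini-batch gradient descent on a linear-regression cost, with made-up data; setting batch_size to len(X) would give batch gradient descent (option a) and setting it to 1 would give stochastic gradient descent (option b):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

w = np.zeros(3)
lr, batch_size = 0.1, 32                  # a common mini-batch size (option c)
for epoch in range(100):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)   # gradient of the MSE on this mini-batch
        w -= lr * grad
print(w)   # close to the true weights [1.5, -2.0, 0.5]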

What is the wrong description? a. A "feature detector" (kernel) slides over the inputs to generate a feature map. b. Feature map size = (input width - kernel width + 2*padding)/stride + 1 c. The option "same" padding (more on padding) in CNN makes the hidden layer have the same height and width as the original images. d. In CNN, pooling layers have learnable parameters too.

D) In CNN, pooling layers have learnable parameters too.
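
The formula in option b can be checked with a small helper (feature_map_size is a hypothetical name written just for this card):

def feature_map_size(input_size, kernel_size, padding=0, stride=1):
    # (input width - kernel width + 2*padding) / stride + 1
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(feature_map_size(28, 5, padding=0, stride=1))  # 24: a 5x5 kernel on a 28x28 input, no padding
print(feature_map_size(28, 3, padding=1, stride=1))  # 28: "same" padding keeps the width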

What is not the solution to avoid overfitting? a. Obtaining more training data b. Reducing the number of input features c. Using diverse regularization d. Increasing the number of hidden layers

D) Increasing the number of hidden layers

Which statement is not true for Keras? a. Keras offers a Sequential model. b. Keras offers a functional API. c. Keras supports model subclassing. d. Keras is a low-level API.

D) Keras is a low-level API
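
The three model-building styles behind options a-c, sketched with tf.keras (layer sizes and shapes are arbitrary):

from tensorflow import keras

# a. Sequential model
seq = keras.Sequential([keras.layers.Dense(10, activation="softmax", input_shape=(784,))])

# b. Functional API
inputs = keras.Input(shape=(784,))
outputs = keras.layers.Dense(10, activation="softmax")(inputs)
func = keras.Model(inputs, outputs)

# c. Model subclassing
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.out = keras.layers.Dense(10, activation="softmax")
    def call(self, x):
        return self.out(x)

sub = MyModel()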

What is the wrong description of LSTM? a. LSTM was invented because RNNs had serious memory leaks. b. LSTMs forget by using forget gates. c. LSTMs remember by using input gates. d. LSTMs keep long-term memory using a hidden state.

D) LSTMs keep long-term memory using a hidden state

What is a wrong statement? a. Numpy and its ndarray object provide efficient storage and manipulation of dense arrays. b. Numpy has limitations in attaching labels to data and working with missing data. c. Pandas has two main objects: Series and DataFrame. d. Pandas is usually faster than Numpy.

D) Pandas is usually faster than Numpy

It addresses the problem of learning hierarchical representation with a single algorithm. What is it?

Deep Learning

What is the wrong description of LSTM? a. The cell has a short-term state vector and a long-term state vector. b. At each time step, the inputs and the previous short-term state are fed to a simple RNN cell and three gates. c. The forget gate decides what to remove from the long-term state. d. The input gate decides which part of the output of the simple RNN cell should be added to the long-term state. e. The output gate decides which part of the short-term state should be output at this time step after going through the ReLU activation function.

E) The output gate decides which part of the short-term state should be output at this time step after going through the ReLU activation function.
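
A minimal NumPy sketch of one LSTM time step, following the gate roles in options c and d; the correct version of option e applies tanh (not ReLU) before the output gate. The lstm_step helper and the parameter shapes are assumptions for illustration only:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    # W, U, b each stack parameters for the [forget, input, candidate, output] blocks
    z = W @ x + U @ h + b
    f, i, g, o = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)   # the three gates
    g = np.tanh(g)                                 # candidate from the simple RNN cell
    c_new = f * c + i * g                          # forget gate removes, input gate adds to the long-term state
    h_new = o * np.tanh(c_new)                     # output gate filters tanh of the long-term state
    return h_new, c_new

n_in, n_h = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * n_h, n_in))
U = rng.normal(size=(4 * n_h, n_h))
b = np.zeros(4 * n_h)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), W, U, b)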

In RNN, which case uses one to many (one input and many outputs)?

Image captioning

In LSTM, which gate decides what new information we're going to store in the cell state?

Input gate

What is a multiplatform data visualization library built on Numpy arrays and designed to work with the broader SciPy stack?

Matplotlib
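
A minimal example of Matplotlib drawing directly from Numpy arrays:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x))     # Matplotlib consumes Numpy arrays directly
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.show()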

Is it OK to initialize all the weights to the same value as long as that value is selected randomly using He initialization?

No -> if all the weights start with the same value, every neuron in a layer computes the same output and receives the same gradient update, so the symmetry is never broken

In a Recurrent Neural Network, a different function and the same set of parameters are used at every time step.

No -> the same function and the same set of parameters are used at every time step
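
The point is easy to see in a bare-bones NumPy RNN loop: the same weights Wx and Wh and the same update rule are applied at every time step (all sizes are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(size=(8, 3))    # input-to-hidden weights, shared across time
Wh = rng.normal(size=(8, 8))    # hidden-to-hidden weights, shared across time
b = np.zeros(8)

h = np.zeros(8)
inputs = rng.normal(size=(5, 3))     # 5 time steps, 3 features each
for x_t in inputs:                   # same function, same parameters at every step
    h = np.tanh(Wx @ x_t + Wh @ h + b)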

In Convolutional Neural Networks, the same weights are used for different patches of the input image. What is this called?

Parameter sharing

In CNN, a single element in the feature map is connected to only a small patch of pixels. What is this called?

Sparse connectivity
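
Both ideas, parameter sharing and sparse connectivity, show up in a naive 2-D convolution loop (an illustrative sketch: no padding, stride 1):

import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]      # sparse connectivity: each output sees only a small patch
            out[i, j] = np.sum(patch * kernel)     # parameter sharing: the same kernel is reused everywhere
    return out

print(conv2d(np.ones((5, 5)), np.ones((3, 3))))    # 3x3 feature map filled with 9s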

An RNN layer must have three-dimensional inputs of shape [1st, 2nd, 3rd]. What is the 2nd dimension?

Time steps
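
In Keras terms the three dimensions are [batch size, time steps, features], which a tiny SimpleRNN layer confirms (all sizes are arbitrary):

import numpy as np
from tensorflow import keras

x = np.zeros((4, 10, 3))              # [batch size, time steps, features]
layer = keras.layers.SimpleRNN(8)
print(layer(x).shape)                 # (4, 8): one hidden vector per sequence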

Which deep learning framework is considered a kind of white box?

Torch

Can RNN be used for natural language processing?

Yes

In CNNs, do regularization and ensembling help to enhance the accuracy of deep learning performance?

Yes

Is it OK to initialize a network's bias terms to 0?

Yes
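
A hedged Keras sketch combining this card with the earlier He-initialization cards: random He-initialized weights and zero biases (layer size arbitrary):

from tensorflow import keras

layer = keras.layers.Dense(
    64,
    activation="relu",
    kernel_initializer="he_normal",   # weights: random He initialization (not all equal)
    bias_initializer="zeros",         # biases: starting at 0 is fine
)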

Is a tuple an immutable object in Python?

Yes
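
A quick confirmation in Python:

t = (1, 2, 3)
try:
    t[0] = 99            # tuples cannot be modified in place
except TypeError as e:
    print("immutable:", e)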

Partial convolution uses the same structure as the U-Net architecture, but each layer is replaced with a partial convolution layer.

Yes

Can a CNN be used for forecasting?

Yes, it can, with some limitations

Which datatype results from this Numpy operation: uint64 + uint32?

uint64
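
A quick Numpy check (array values are arbitrary):

import numpy as np

a = np.array([1], dtype=np.uint64)
b = np.array([2], dtype=np.uint32)
print((a + b).dtype)    # uint64: the result takes the larger unsigned type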

