CS 254 Exam 2

Which of the following is not true about using convolutional neural networks (CNNs) for image analysis?

A CNN can be trained for unsupervised learning tasks, whereas an ordinary neural net cannot.

Having a max pooling layer between two convolutional layers always decreases the number of parameters (weights).

FALSE

If you increase the number of hidden layers in a multilayer perceptron (fully connected network), the classification error on test data always decreases.

FALSE

In backpropagation learning, we should start with a small learning rate parameter and slowly increase it during the learning process.

FALSE

Suppose a convolutional neural network is trained on the MNIST dataset (handwritten digits). This trained model is then given a completely black image (all-zero input). The output probabilities for this input would be zero for all classes.

FALSE

Suppose a convolutional neural network is trained on the MNIST dataset (handwritten digits). This trained model is then given a completely white image as input. The output probabilities for this input would be equal for all classes.

FALSE

Suppose we have a neural network with one hidden layer, where the hidden layer works as a dimensionality reducer. Now, instead of using this hidden layer, we replace it with a dimensionality reduction technique such as PCA. Does the network that uses the dimensionality reduction technique always give the same output as the network with the hidden layer?

FALSE

The number of neurons in the output layer must match the number of classes (where the number of classes is greater than 2) in a supervised learning task.

FALSE

Weight sharing can occur in both a convolutional neural network and a fully connected neural network (multilayer perceptron).

FALSE

What are the steps for using a gradient descent algorithm in training neural networks?

1. Initialize weights and biases with random values.
2. Pass an input through the network and get values from the output layer.
3. Calculate the error between the actual value and the predicted value.
4. Go to each neuron that contributes to the error and change its respective values to reduce the error.
5. Reiterate until you find the best weights for the network.
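
A minimal sketch of these steps in Python/NumPy, assuming a single linear neuron trained with squared error on made-up toy data (the learning rate and step count are illustrative, not canonical):

import numpy as np

# Toy data for y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[1.0], [3.0], [5.0], [7.0]])

rng = np.random.default_rng(0)
w = rng.normal(size=(1, 1))          # step 1: random weight
b = rng.normal(size=(1,))            # step 1: random bias

lr = 0.05                            # illustrative learning rate
for _ in range(500):                 # step 5: reiterate
    pred = X @ w + b                 # step 2: forward pass
    err = pred - y                   # step 3: error vs. actual value
    w -= lr * (X.T @ err) / len(X)   # step 4: adjust each parameter
    b -= lr * err.mean(axis=0)       #         to reduce the error

print(w.ravel(), b)                  # approaches [2.] and [1.]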

Batch Normalization is helpful because

It normalizes (changes) all the inputs to a layer before sending them to the next layer.
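
A minimal NumPy sketch of the training-time batch-norm computation; gamma, beta, and eps are the usual learnable scale/shift and numerical-stability constant, with illustrative defaults here:

import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch dimension...
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # ...then rescale and shift before passing to the next layer.
    return gamma * x_hat + beta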

In a neural network, knowing the weight and bias of each neuron is the most important step. If you can somehow get the correct value of weight and bias for each neuron, you can approximate any function. What would be the best way to approach this?

Iteratively check, after assigning a value, how far you are from the best values, and slightly change the assigned values to make them better.

Which of the following gives non-linearity to a neural network?

Rectified Linear Unit
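
As a quick NumPy sketch: ReLU is just an element-wise max with zero, and this element-wise non-linearity (not the weights or biases, which are linear operations) is what lets the network form non-linear mappings:

import numpy as np

def relu(x):
    # Element-wise non-linearity: negative inputs are clipped to 0.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 3.0])))  # [0. 0. 0. 3.]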

Suppose you want to redesign the AlexNet architecture to reduce the number of arithmetic operations required for each backprop update. Which one of these choices will reduce the number of arithmetic operations the most:

Removing a convolutional layer

A multi-layer neural network with linear activation functions is equivalent to a single-layer perceptron that uses the same error function on the output layer and has the same number of inputs.

TRUE
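
A quick NumPy check of why this holds: composing linear layers is itself a linear map, so W2(W1 x) equals (W2 W1) x. The matrix shapes below are illustrative:

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first "layer" (linear activation)
W2 = rng.normal(size=(2, 4))   # second "layer"
x = rng.normal(size=(3,))

two_layer = W2 @ (W1 @ x)      # multi-layer net with linear activations
one_layer = (W2 @ W1) @ x      # equivalent single-layer weights
print(np.allclose(two_layer, one_layer))  # True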

A neural network with multiple hidden layers and sigmoid nodes can form non-linear decision boundaries.

TRUE

A perceptron is guaranteed to perfectly learn a given linearly separable function within a finite number of training steps.

TRUE
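
A minimal sketch of the classic perceptron learning rule on a linearly separable toy problem (the data is made up; labels are +1/-1). The perceptron convergence theorem guarantees this loop terminates:

import numpy as np

# Linearly separable toy data: the AND function with +1/-1 labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])

w = np.zeros(2)
b = 0.0
converged = False
while not converged:
    converged = True
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:   # misclassified point
            w += yi * xi             # perceptron update
            b += yi
            converged = False

print(w, b)                          # a separating hyperplane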

In a neural network, data augmentation, dropout, and regularization all deal with overfitting.

TRUE

In a neural network, dropout, regularization, and batch normalization all deal with overfitting.

TRUE

It is possible to use 1x1 convolution filters.

TRUE
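
A sketch of why 1x1 filters are useful: a 1x1 convolution is a per-pixel linear map across channels, often used to reduce channel depth. The shapes below are illustrative:

import numpy as np

fmap = np.random.rand(64, 8, 8)     # feature map: 64 channels, 8x8 grid
filters = np.random.rand(16, 64)    # sixteen 1x1 filters over 64 channels

# A 1x1 convolution is a channel-mixing matrix multiply at every pixel
out = np.einsum('oc,chw->ohw', filters, fmap)
print(out.shape)                    # (16, 8, 8): channels reduced 64 -> 16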

Using momentum with gradient descent helps to find solutions more quickly.

TRUE
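
A sketch of the standard momentum update on a toy objective f(w) = w^2; the hyperparameters are illustrative (mu around 0.9 is typical). The velocity term accumulates consistent gradient directions, which speeds up convergence:

mu, lr = 0.9, 0.01       # momentum coefficient and learning rate
w, v = 5.0, 0.0
for _ in range(200):
    g = 2 * w            # gradient of w^2
    v = mu * v - lr * g  # velocity accumulates past gradients
    w += v               # step along the smoothed direction
print(w)                 # close to the minimum at 0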

When a pooling layer is added to a convolutional neural network, translation invariance is preserved.

TRUE
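
A small NumPy illustration of the idea: with 2x2 max pooling, shifting the input by one pixel can leave the pooled output unchanged. The toy values are chosen to show this:

import numpy as np

def max_pool_2x2(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.array([[0, 9, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 5],
              [0, 0, 0, 0]], dtype=float)
b = np.roll(a, 1, axis=0)   # shift the whole image down by one pixel

print(max_pool_2x2(a))
print(max_pool_2x2(b))      # same pooled output: small shifts are absorbed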

A dead unit in a neural network is a unit that doesn't get updated during training by any of its neighbours.

TRUE
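
A common concrete case is a "dead" ReLU: if a unit's pre-activation is negative for every input, its output and its gradient are both zero, so gradient descent never updates it. A toy NumPy sketch (the weights are deliberately chosen to kill the unit):

import numpy as np

x = np.array([1.0, 2.0, 3.0])   # every input to this unit is positive
w, b = -1.0, -0.5               # weights that keep the pre-activation < 0

z = w * x + b                   # pre-activations: all negative
out = np.maximum(0.0, z)        # ReLU output: 0 for every input
grad = (z > 0) * x              # gradient through ReLU: also 0 everywhere
print(out, grad)                # the unit never fires and never updates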

With mini-batch gradient descent, the model update frequency is higher than with batch gradient descent, which allows for more robust convergence, avoiding local minima.

TRUE
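
A sketch contrasting the two schedules on a made-up linear regression problem: full-batch descent makes one update per epoch, while the mini-batch loop below makes ten noisier updates per epoch (batch size and learning rate are illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(3)
lr, batch_size = 0.1, 100
for epoch in range(20):
    idx = rng.permutation(len(X))
    # 10 parameter updates per epoch here, versus 1 for full-batch descent;
    # the extra, noisier updates are what helps escape shallow local minima.
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / batch_size
        w -= lr * grad
print(w)  # close to [1.0, -2.0, 0.5]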

For a classification task, instead of random weight initializations in a neural network, we set all the weights to zero. Which of the following statements is true?

The neural network will train but all the neurons will end up recognizing the same thing
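
A NumPy sketch of the underlying symmetry problem: with constant initialization (zero is the degenerate case), every hidden unit computes the same output and receives the same gradient, so the units remain clones of each other no matter how long you train. The one-hidden-layer shapes and constant 0.1 are illustrative:

import numpy as np

x = np.array([0.5, -1.0])
W1 = np.full((3, 2), 0.1)      # every hidden unit starts identical
W2 = np.full((1, 3), 0.1)

h = np.tanh(W1 @ x)            # identical outputs for all 3 hidden units
delta = W2.ravel() * (1 - h**2)   # backprop, assuming dL/dy = 1
grad_W1 = np.outer(delta, x)
print(grad_W1)                 # every row identical: the units get the same
                               # update, so they keep recognizing the same thing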

Which of the following is NOT correct about ReLU (Rectified Linear Unit) activation function?

Zero-centered output.

