CSC 177 Quiz 3
Stochastic gradient descent is a process that
updates the weights after each training case
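A minimal sketch of the per-case update, assuming a one-variable linear model and squared error. The function name `sgd_fit`, the learning rate, and the toy data are illustrative assumptions, not part of the quiz.

```python
def sgd_fit(data, lr=0.05, epochs=500):
    """Fit y = w*x + b, updating the weights after every single training case."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            error = (w * x + b) - y   # prediction error on this one case
            w -= lr * error * x       # gradient of 0.5*error**2 w.r.t. w
            b -= lr * error           # gradient of 0.5*error**2 w.r.t. b
    return w, b

# Toy, noise-free data generated from y = 2x + 1 (made up for illustration).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = sgd_fit(data)
```

Batch gradient descent would instead sum the gradients over all cases before making one update per pass.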
According to the intuition behind CNNs, there are ____ steps.
4
Bagging is an abbreviation of
Bootstrap Aggregating
For the given R demo ANN program and the data set Churn_modeling.csv, with set.seed(123) and using h2o, run the following two experiments: E1 has hidden = c(5, 5) and E2 has hidden = c(6, 6). Based on the confusion matrix information, compare the accuracy of E1 and E2.
E1 is lower
What are the benefits of ensemble learning?
Enriched hypothesis space without incurring much additional effort. Improved prediction accuracy of a weak learning algorithm through boosting. Less likely to misclassify than a single hypothesis.
Benefits of ensemble learning include
Improved prediction accuracy of a weak learning algorithm. Enriched hypothesis space without incurring much additional effort. Less likely to misclassify than a single hypothesis.
Bagging fits classification or regression models to bootstrap samples from the data and combines them by voting (classification) or averaging (regression). True/False?
True
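A minimal sketch of bagging for classification, assuming a 1-nearest-neighbour base learner on one numeric feature. The helper names, the seed, and the toy data are illustrative assumptions.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) cases from data with replacement."""
    return [rng.choice(data) for _ in data]

def fit_1nn(sample):
    """Return a predictor that copies the label of the nearest sampled case."""
    def predict(x):
        return min(sample, key=lambda case: abs(case[0] - x))[1]
    return predict

def bagging_predict(models, x):
    """Combine the component models by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

rng = random.Random(123)
# Toy (feature, label) pairs, made up for illustration.
data = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.4, "a"),
        (1.6, "b"), (1.7, "b"), (1.8, "b"), (1.9, "b")]
models = [fit_1nn(bootstrap_sample(data, rng)) for _ in range(25)]
```

For regression, the vote in `bagging_predict` would be replaced by the average of the component predictions.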
A tensor is __________
a higher-order matrix
The idea of back-propagation is that a hidden node is responsible for some fraction of the error in each of the output nodes to which it connects. After propagating the input forward through the network and computing the output of every unit in the network, there are three steps left to propagate the errors backward through the network. The first of the three steps is:
assess the blame for error caused by weights from hidden layer to output layer
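A minimal sketch of how blame is passed backward, assuming a network with one input, one sigmoid hidden node, one sigmoid output node, and squared-error loss. The function names and this tiny architecture are assumptions for illustration; the sketch does not try to reproduce the course's exact wording of the three steps.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(x, target, w_hidden, w_out):
    """Squared-error loss of the tiny network on one training case."""
    h = sigmoid(w_hidden * x)
    o = sigmoid(w_out * h)
    return 0.5 * (o - target) ** 2

def backprop_grads(x, target, w_hidden, w_out):
    # Forward pass: compute the output of every unit.
    h = sigmoid(w_hidden * x)
    o = sigmoid(w_out * h)
    # Blame for the hidden-to-output weight.
    delta_out = (o - target) * o * (1 - o)
    grad_w_out = delta_out * h
    # The hidden node takes a share of the output error in
    # proportion to the weight that connects it to the output node.
    delta_hidden = delta_out * w_out * h * (1 - h)
    grad_w_hidden = delta_hidden * x
    return grad_w_out, grad_w_hidden
```

A weight update would then subtract a learning rate times each gradient from the corresponding weight.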
Diversity in _________ is obtained by using bootstrapped training data
bagging
What is the benefit of using the ReLU function in a CNN?
break linearity
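A minimal sketch of what "breaking linearity" means, assuming the usual definition ReLU(z) = max(0, z). A linear function f would satisfy f(a + b) = f(a) + f(b); the example values are chosen to show that ReLU does not.

```python
def relu(z):
    """Rectifier: pass positive values through, clamp negatives to zero."""
    return max(0.0, z)

a, b = -1.0, 2.0
of_sum = relu(a + b)          # relu(1.0)
sum_of = relu(a) + relu(b)    # 0.0 + 2.0
```

Here `of_sum` is 1.0 while `sum_of` is 2.0, so the function is not linear.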
Keys to ensemble learning success do not include
component hypotheses agree with each other
To recognize images in different orientations and with different distortions, we need pooling. The major benefits of pooling do not include preventing model overfitting. True/False?
false
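A minimal sketch of 2x2 max pooling with stride 2, assuming the feature map is a nested list with even dimensions. The function name and the toy feature maps are illustrative assumptions. The two inputs differ by a one-pixel shift of each feature, yet pool to the same output, and the pooled map has a quarter as many values to pass on to later layers.

```python
def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [
        [max(image[r][c], image[r][c + 1],
             image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]

feature_map_a = [[0, 9, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 5]]
# Same features, each shifted one pixel to the left.
feature_map_b = [[9, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 5, 0]]

pooled_a = max_pool_2x2(feature_map_a)
pooled_b = max_pool_2x2(feature_map_b)
```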
You are hired by the city of Sacramento to construct a bird recognition system using a machine learning algorithm. After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the citizens data. Apparently the citizens of Sacramento are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm. You should NOT add the citizens data to the training set, because this will cause the training and dev/test set distributions to become different, thus hurting dev and test set performance. True/False?
false
Based on the needs of feature detection, CNN designers can choose different feature detectors. Which term in the following list is equivalent to feature detector?
filter kernel
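A minimal sketch of sliding a feature detector over an image to build a feature map, assuming no padding and a stride of 1. The function name, the toy image, and the vertical-edge kernel are illustrative assumptions.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + di][j + dj] * kernel[di][dj]
             for di in range(kh) for dj in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]

# Toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Responds where brightness increases from left to right.
vertical_edge = [[-1, 1],
                 [-1, 1]]

feature_map = apply_filter(image, vertical_edge)
```

The feature map is largest in the middle column, where the kernel sits on the dark-to-bright boundary.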
The random forest (RF) ML algorithm grows a forest of many trees. Grow each tree on an independent bootstrap sample from the training data. At each node, select m variables at random out of all M possible variables and find the best split on the selected m variables. Then
grow the trees to maximum depth and vote/average the trees to get prediction for new data.
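A minimal sketch of the per-node step only, assuming numeric features and Gini impurity as the split criterion (the quiz does not name a criterion). The function names and the toy data are illustrative assumptions; growing full trees and voting are left out.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels, features):
    """Return (weighted_gini, feature, threshold) for the best split found."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def random_node_split(rows, labels, m, rng):
    """Pick m of the M variables at random, then split on the best of them."""
    M = len(rows[0])
    candidates = rng.sample(range(M), m)
    return candidates, best_split(rows, labels, candidates)

# Toy data: 4 cases, M = 3 variables (made up for illustration).
rows = [[1, 5, 0], [2, 5, 1], [3, 5, 0], [4, 5, 1]]
labels = ["a", "a", "b", "b"]
candidates, split = random_node_split(rows, labels, 2, random.Random(0))
```

Because each node sees only a random subset of the variables, the trees differ from one another even beyond the differences introduced by the bootstrap samples.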
In logistic regression, the _______ function is used.
sigmoid
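A minimal sketch of how logistic regression uses the sigmoid, assuming the standard form p = sigmoid(w · x + b). The function name `predict_proba` and its arguments are illustrative assumptions.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

p = predict_proba([1.0, -1.0], 0.0, [2.0, 2.0])  # linear score is 0
```

A linear score of 0 maps to a probability of 0.5, which is the usual decision boundary.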
The ReLU function is also called the rectifier function. True/False?
true
The softmax function is a normalization function used in CNNs to convert the network's real-valued outputs into probabilities that add up to 1. True/False?
true
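A minimal sketch of softmax, assuming the standard definition exp(s_i) / sum_j exp(s_j). Subtracting the maximum score first is a common numerical-stability choice and does not change the result. The example scores are made up.

```python
import math

def softmax(scores):
    """Convert real-valued scores into probabilities that sum to 1."""
    m = max(scores)                          # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, -1.0])
```

The largest score receives the largest probability, and the ordering of the scores is preserved.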