quiz 1 ~ 5
Reinforcement learning is a form of supervised learning. True False
False
Reinforcement learning is a form of unsupervised learning. True False
False
A quantitative variable is best modeled by which type of distribution? Answers: multinomial binomial none of these Gaussian
Gaussian
In R, comments start with: /* // # **
#
What is the output of this code sequence: v1 <- c(1, 2, 3); 5*v1 Answers: 5 5 5; 1 2 3; 5 10 15; none of these
5 10 15
What is the assignment operator in R? <- <= =< ==
<-
The conjugate prior of a multinomial is which kind of distribution? Answers: Gaussian Dirichlet Gamma Beta
Dirichlet
In R, the '.' indicates a method, as in obj.method(). True False
False
In vector v<-c(1, 2, 3), v[0] refers to the '1'. True False
False
Linear regression tends to overfit the data. True False
False
Log-odds range from [0, 1] Answers: True False
False
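A minimal Python sketch (the helper name log_odds is an assumption for illustration, not from the quiz) of why the statement is false: log-odds are log(p / (1 - p)), which is negative for p below 0.5, zero at p = 0.5, and grows without bound as p approaches 1.

```python
import math

def log_odds(p):
    """Return log(p / (1 - p)) for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Illustrative probabilities only; any value strictly inside (0, 1) works.
for p in (0.01, 0.5, 0.99):
    print(p, round(log_odds(p), 3))
```

Probabilities are the quantity confined to [0, 1]; log-odds can be any real number.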
Logistic regression has high variance. Answers: True False
False
Logistic regression in R can handle multi-class classification well. Answers: True False
False
Naive Bayes tends to outperform logistic regression on large data sets. Answers: True False
False
R is a compiled, not interpreted, language. True False
False
Which metric adjusts accuracy for the possibility of classifying correctly by chance? Answers: none of these answers; Kappa; sensitivity; specificity
Kappa
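One way to see the chance adjustment is to compute Cohen's kappa by hand. The Python sketch below is illustrative: the function name cohen_kappa and the two label lists are made up for the example, and it assumes the expected agreement is less than 1.

```python
def cohen_kappa(actual, predicted):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(actual)
    labels = set(actual) | set(predicted)
    # Observed agreement is plain accuracy.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Expected agreement if labels were assigned independently at the same rates.
    expected = sum(
        (actual.count(label) / n) * (predicted.count(label) / n)
        for label in labels
    )
    # Assumes expected < 1; otherwise kappa is undefined (division by zero).
    return (observed - expected) / (1 - expected)

# Made-up labels for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(round(cohen_kappa(actual, predicted), 3))
```

Here accuracy is 0.8, but chance agreement alone would give 0.52, so kappa comes out well below the raw accuracy.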
Variables that correlate with both the predictor and the target are called: Answers: confounding variables; hidden variables; dummy variables; none of the above
confounding variables
R is case sensitive. True False
True
The logistic function has an S-shaped curve. Answers: True False
True
The logistic, or sigmoid, function maps real number inputs to range [0, 1]. Answers: True False
True
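A short Python sketch of the logistic function, with sigmoid as an assumed helper name. Mathematically the outputs lie strictly between 0 and 1 and only approach the endpoints in the limit; the inputs below are kept moderate so that math.exp does not overflow.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative inputs; large negative, zero, and large positive.
for x in (-10, 0, 10):
    print(x, sigmoid(x))
```

The values rise monotonically from near 0 through 0.5 at x = 0 to near 1, which is the S-shaped curve the previous question refers to.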
What is "i" after the following line of code is executed? i <- sample(1:nrow(df), 0.8*nrow(df), replace=FALSE) Answers: a vector of row indices; a test data frame; a training data frame; a vector of training values
a vector of row indices
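For comparison, a rough Python analogue of the same idea (not the R code itself): draw 80% of the row positions without replacement, then use the remaining positions as the test set. The row count, the seed, and the variable names are assumptions, and Python positions start at 0 whereas R indices start at 1.

```python
import random

n_rows = 10                      # hypothetical number of rows in the data set
rng = random.Random(1234)        # fixed seed so the split is repeatable

# Sample 80% of the row positions without replacement.
train_idx = rng.sample(range(n_rows), int(0.8 * n_rows))

# Every position not chosen for training goes to the test set.
chosen = set(train_idx)
test_idx = [i for i in range(n_rows) if i not in chosen]

print(sorted(train_idx), test_idx)
```

As in the R line, the sampled object is only a collection of indices; the training and test data are obtained afterwards by subsetting with it.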
A column can be called a feature or a(n) ____________________. categorical quantitative attribute observation
attribute
The conjugate prior of a binomial is which type of distribution? Answers: Gaussian beta multinomial Dirichlet
beta
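Conjugacy means the posterior stays in the same family as the prior. A minimal Python sketch, assuming a Beta(alpha, beta) prior on a success probability and made-up counts (the function name is hypothetical):

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures

# Uniform Beta(1, 1) prior, then 7 successes and 3 failures (illustrative counts).
alpha_post, beta_post = beta_binomial_update(1, 1, 7, 3)
posterior_mean = alpha_post / (alpha_post + beta_post)
print(alpha_post, beta_post, round(posterior_mean, 3))
```

The update only adds the observed counts to the prior parameters, which is what makes the beta prior convenient for binomial data.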
A qualitative variable with 2 levels is best modeled by which distribution? Answers: Gaussian binomial none of these multinomial
binomial
Another term for a qualitative column is a factor or _______________________ data. categorical quantitative attribute observation
categorical
The logistic regression algorithm is used for: Answers: either classification or regression; regression; none of these answers; classification
classification
This term refers to a type of machine learning in which the target is a qualitative variable. unsupervised classification supervised regression
classification
A(n) __________________ is an R data object that can store two-dimensional data in which columns can be of different types. Answers: data frame; vector; none of these; matrix
data frame
A factor variable will show up in the summary() with levels as: Answers: dummy variables; none of these answers is correct; confounding variables; hidden variables
dummy variables
The quantity P(X|Y) in the naive Bayes formula quantifies: Answers: how likely it is that we would see the data we see, given the class; the likelihood of the class, given the data; the probability of the class; none of these
how likely it is that we would see the data we see, given the class
Naive Bayes is called 'naive' because: Answers: it makes the naive assumption that each predictor is independent of the others; it makes the naive assumption that the classes are evenly distributed; it was created by Thomas Bayes, a Presbyterian minister; none of these
it makes the naive assumption that each predictor is independent of the others
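The independence assumption is what lets the likelihood P(X|Y) be written as a product of per-predictor probabilities. The Python sketch below uses made-up priors and conditional probabilities for two hypothetical classes and two hypothetical binary features; none of these numbers come from the quiz.

```python
# Made-up numbers for illustration only.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": 0.7, "all_caps": 0.5},
    "ham": {"has_link": 0.2, "all_caps": 0.1},
}

def naive_bayes_posterior(priors, likelihoods, observed_features):
    """Posterior P(Y|X), treating each observed feature as independent given the class."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for feature in observed_features:
            # Naive step: multiply per-feature probabilities together.
            score *= likelihoods[label][feature]
        scores[label] = score
    total = sum(scores.values())  # marginal probability of the observed data
    return {label: score / total for label, score in scores.items()}

posterior = naive_bayes_posterior(priors, likelihoods, ["has_link", "all_caps"])
print(posterior)
```

Each class score is prior times likelihood, and dividing by the total turns the scores into posterior probabilities that sum to 1.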
How does linear regression fit in the bias-variance tradeoff? Answers: linear regression has high variance; none of these answers is correct; linear regression has both high bias and high variance; linear regression has high bias
linear regression has high bias
A(n) ________________ is a two-dimensional data object containing elements of the same type. Answers: matrix; data frame; none of these; vector
matrix
A qualitative variable with more than 2 levels is best modeled by which type of distribution? Answers: Gaussian multinomial binomial none of these
multinomial
Another term for a row in a data set is a(n) ___________________. categorical quantitative attribute observation
observation
The quantity P(Y|X) in the naive Bayes theorem is called the: Answers: marginal likelihood prior posterior
posterior
In logistic regression, the target variable is: Answers: none of these answers; quantitative; qualitative; either qualitative or quantitative
qualitative
In linear regression, is the target variable quantitative or qualitative? Answers: quantitative; none of these answers is correct; qualitative; it can be either
quantitative
This term refers to a type of machine learning in which the target is a quantitative variable. supervised classification unsupervised regression
regression
Another term for the target is a(n) ___________________. categorical quantitative attribute response
response
In this type of learning, one column is the target and all other columns are predictors. supervised classification unsupervised regression
supervised
The coefficient of a predictor x in a logistic regression model represents: Answers: the change in y for a 1-unit change in x; the change in the log odds of y for a 1-unit change in x; the change in the odds of y for a 1-unit change in x; the change in the probability of y for a 1-unit change in x
the change in the log odds of y for a 1-unit change in x
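This can be checked numerically. The Python sketch below assumes a one-predictor model with made-up coefficients (intercept -1.0, slope 0.8) and hypothetical helper names; it compares the log odds at x = 2 and x = 3.

```python
import math

def predicted_probability(intercept, slope, x):
    """P(y = 1 | x) under a one-predictor logistic regression model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def log_odds_of(p):
    """Log odds log(p / (1 - p)) of a probability."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, chosen only for illustration.
intercept, slope = -1.0, 0.8

p_at_2 = predicted_probability(intercept, slope, 2)
p_at_3 = predicted_probability(intercept, slope, 3)

log_odds_change = log_odds_of(p_at_3) - log_odds_of(p_at_2)
probability_change = p_at_3 - p_at_2
print(round(log_odds_change, 6), round(probability_change, 6))
```

The log odds move by the slope for every 1-unit step in x, while the change in probability depends on where on the curve the step is taken.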
Why is add-one or Laplace smoothing used in computing the Naive Bayes formula? Answers: to balance out the classes; to smooth out the distribution of factors for each class; to avoid multiplying by a zero probability; none of these
to avoid multiplying by a zero probability
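A small Python sketch of the idea, with a hypothetical function name and made-up counts: a factor level that was never seen with a class would get probability 0 and wipe out the whole product, so a pseudo-count (alpha, here 1) is added to every level.

```python
def smoothed_probability(count, class_total, n_levels, alpha=1):
    """Laplace-smoothed estimate of P(level | class)."""
    return (count + alpha) / (class_total + alpha * n_levels)

# Made-up counts: a 3-level factor observed 10 times within one class.
counts = [0, 4, 6]
class_total = sum(counts)

unsmoothed = [c / class_total for c in counts]
smoothed = [smoothed_probability(c, class_total, len(counts)) for c in counts]
print(unsmoothed)
print(smoothed)
```

The unseen level moves from 0 to a small positive probability, and the smoothed estimates still sum to 1.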
Clustering is an example of this type of machine learning. supervised classification regression unsupervised
unsupervised
A(n) ____________________ is a sequence of data elements of the same type. Answers: vector; data frame; none of these; matrix
vector