BADM 211 - Test 2 Quizzes
What could be the objectives behind fitting a regression model? 1. Explanatory or descriptive: assessing the average impact of inputs on an outcome 2. Explanatory or descriptive: generating descriptive statistics among a set of predictors 3. Predictive: given a set of input values for an observation, predicting the outcome variable for that observation 1 and 3 1 and 2 1, 2, and 3
1 and 3
Which of the following statements is/are true with respect to the equation 6.1 in the book? 1. 𝛽0, 𝛽1, ... 𝛽𝑝 are the coefficients estimated by the model 2. ε is the unexplained part and is called noise 3. 𝑋0, 𝑋1, ... 𝑋𝑝 are the regressors or covariates Only 3 Only 2 1, 2, and 3 Only 1
1, 2, and 3
Which of the following is/are true about cross-validation? 1. It is an alternative approach to data partitioning 2. It's immensely useful when the dataset is not large enough for partitioning into train, test and validation set Only 1 Neither 1 nor 2 Both 1 and 2 Only 2
Both 1 and 2
Which of the following statement(s) is/are true with respect to Tables 6.3 and 6.4? 1. Table 6.3 reports the errors on the training data 2. Table 6.4 reports the errors on the validation data Neither 1 nor 2 Both 1 and 2 Only 2 Only 1
Both 1 and 2
What is another name for confusion matrix? coefficient Correlation matrix Classification matrix
Classification matrix
A classifier that performs well will have a ROC curve that is a perfectly diagonal line. True False
False
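The point above can be illustrated with scikit-learn (the scores below are made up): a perfectly diagonal ROC curve corresponds to an AUC of 0.5, i.e. random guessing, while a well-performing classifier's curve bows toward the top-left corner with AUC near 1.

```python
# Hypothetical scores: a diagonal ROC curve means AUC = 0.5 (random guessing);
# a well-performing classifier bows toward the top-left (AUC near 1).
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 0, 1, 1, 1]
good_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]  # separates the classes well
flat_scores = [0.5] * 6                        # no separation at all

print(roc_auc_score(y_true, good_scores))  # 1.0
print(roc_auc_score(y_true, flat_scores))  # 0.5 -> the diagonal line
```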
A higher RMSE means better performance True False
False
If for variable i, p_i < 0.05, then it is not significantly associated with the outcome variable. True False
False
Sensitivity is defined as the ability of a classifier to rule out unimportant classes accurately. True False
False
In the used Toyota Corolla cars example in the book (Page no. 165), which of the variables mentioned in Table 6.1 can NOT be modeled as outcome variable in linear regression? Age Kilometers Price Fuel Type
Fuel Type
Which of the following is not a feature selection algorithm? Backward Elimination Exhaustive Search Multiple Linear Regression Forward Selection
Multiple Linear Regression
Which of the following best describes 𝛽1 in equation 6.1? 1. A unit change in Y is associated with a 𝛽1-unit change in 𝑋1 2. A unit change in 𝑋1 is associated with a 𝛽1-unit change in Y 3. Predictor 𝑋1 helps to decrease the prediction error in Y by 𝛽1 units Only 3 Only 2 Only 1 None of them
Only 2
Which of the following is/are true in the context of Linear Regression models? 1. It fits a relationship between multiple numerical outcome variables and a set of predictors at once. 2. It fits a relationship between a numerical outcome and a set of predictors. 3. It attempts to fit multiple relationships between a numerical outcome and a set of predictors simultaneously. Only 2 Only 1 Only 3 All: 1, 2, 3
Only 2
Which of these statements are true? 1. Interpreting regression coefficients plays an important role in predictive modeling 2. Interpreting regression coefficients plays an important role in explanatory modeling Both 1 and 2 Neither 1 nor 2 Only 2 Only 1
Only 2
Which of the following is true about PCA? PCA is an unsupervised dimension reduction algorithm PCA stands for principal correspondence analysis. PCA is a supervised learning algorithm
PCA is an unsupervised dimension reduction algorithm
Similar MAE on the training and validation sets implies __
The model did not overfit the training data
If for variable i, beta_i > 0, then there is a positive association between variable i and the outcome variable. True False
True
If the assumptions of MLR hold, then the MEAN ERROR (ME) = 0. True False
True
PCA is most effective when original features are correlated. TRUE/FALSE
True
Sensitivity and specificity are important accuracy measures when it is more important to predict membership in one class. True False
True
The PCs are uncorrelated. TRUE/FALSE
True
To estimate future classification error, the confusion matrix is computed from the Training set Validation set Entire dataset
Validation set
kc_df.shape → (21613, 20) Based on the above code and its output, there are ______________ records in the data. a. 21613 b. 2
a. 21613
MLR assumes a. A linear relationship between the predictors and outcome variable b. A nonlinear relationship between the predictors and outcome variable
a. A linear relationship between the predictors and outcome variable
Which of the following is TRUE about LASSO? a. By increasing the penalty parameter lambda, we are shrinking more features to zero b. By increasing the penalty parameter lambda, we are shrinking fewer features to zero
a. By increasing the penalty parameter lambda, we are shrinking more features to zero
Assume the data is named credit_df; which of the following outputs the number of rows and columns in the data? a. credit_df.shape b. credit_df.count() c. credit_df.describe() d. credit_df.head()
a. credit_df.shape
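A quick sketch on a made-up DataFrame (the column names and values are illustrative) contrasting the four options: only `.shape` returns the (rows, columns) tuple directly.

```python
import pandas as pd

# Made-up data for illustration
credit_df = pd.DataFrame({"income": [45, 60, 52], "loan": [10, 25, 15]})

print(credit_df.shape)  # (3, 2) -> 3 rows, 2 columns
# credit_df.count()     -> number of non-missing values per column
# credit_df.describe()  -> summary statistics per column
# credit_df.head()      -> first 5 rows of the data
```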
Which of the following is a classification problem? a. Identification of digits ( 0-9) using images of handwritten digits. b. Predicting lifetime value of customers. c. Forecasting next day sales d. Forecasting number of people watching Super Bowl to decide whether to advertise or not on TV
a. Identification of digits ( 0-9) using images of handwritten digits.
The main limitation of exhaustive search is that a. It is computationally infeasible b. It eliminates predictors that perform well
a. It is computationally infeasible
We estimate beta coefficients by a. Minimizing error in the training set b. Maximizing error in the training set c. Minimizing error in the validation set d. Maximizing error in the validation set
a. Minimizing error in the training set
In this problem we are trying to predict the loan amount to approve a customer. This is a/an a. Regression problem b. Classification problem c. Unsupervised learning problem
a. Regression problem
In LASSO, increasing the penalty parameter lambda results in a. Shrinking more beta coefficients towards zero b. Shrinking fewer beta coefficients towards zero
a. Shrinking more beta coefficients towards zero
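A sketch on synthetic data with scikit-learn's `Lasso` (which names the penalty `alpha` rather than lambda): as the penalty grows, more coefficients are shrunk to exactly zero.

```python
# Synthetic data: y depends strongly on feature 0, weakly on feature 1,
# and not at all on features 2-4.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)

# Increasing the penalty drives more coefficients to exactly zero.
for alpha in [0.01, 0.5, 5.0]:
    coefs = Lasso(alpha=alpha).fit(X, y).coef_
    print(alpha, int((coefs == 0).sum()))  # count of features shrunk to zero
```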
The naive predictor in regression is a. The mean of the outcome variable in the validation set b. The median of the outcome variable in the validation set c. The variance of the outcome variable in the validation set
a. The mean of the outcome variable in the validation set
Which of the following is TRUE about LASSO? a. The penalty parameter lambda acts as a tradeoff between the number of features and prediction error b. The penalty parameter lambda acts as a tradeoff between the number of records and prediction error
a. The penalty parameter lambda acts as a tradeoff between the number of features and prediction error
The dimensionality of data corresponds to the number of a. Variables in the data b. Records in the data c. Variables and records combined in the data
a. Variables in the data
What is the naïve rule for classifying? a. Randomly assign a class, where the probability of being assigned one of m classes is based on the frequency of that class in the dataset. b. Classify the record as a member of the majority class c. Randomly assign a class, where the probability of being assigned one of m classes is 1/m.
b. Classify the record as a member of the majority class
Which of the following error metrics cannot be used to assess predictive performance? a. RMSE b. Mean error c. Mean absolute error
b. Mean error
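A sketch with made-up errors showing why mean error fails as a performance metric: positive and negative errors cancel, so ME can be zero even when every prediction is wrong, while MAE and RMSE cannot cancel.

```python
import numpy as np

# Hypothetical actual vs. predicted values, each prediction off by +/-5
actual = np.array([10, 20, 30, 40])
predicted = np.array([15, 15, 35, 35])
errors = actual - predicted

print(errors.mean())                  # ME = 0.0, despite every prediction being wrong
print(np.abs(errors).mean())          # MAE = 5.0
print(np.sqrt((errors ** 2).mean()))  # RMSE = 5.0
```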
In this problem we are trying to predict the loan amount to approve a customer. This is a/an a. Explanatory modeling b. Predictive modeling
b. Predictive modeling
Two popular ways of normalizing the data are 1) subtracting the mean and dividing by the standard deviation, and 2) __ a. Take the logarithm (base 2) of the variable b. Rescale the variable to a uniform range c. Multiply the data by a constant
b. Rescale the variable to a uniform range
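A sketch of the two normalization schemes on a toy column (the values are made up): z-score standardization (subtract the mean, divide by the standard deviation) and min-max rescaling to a uniform [0, 1] range.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])  # hypothetical variable

z = (x - x.mean()) / x.std()                  # z-score: mean 0, std 1
minmax = (x - x.min()) / (x.max() - x.min())  # rescaled to [0, 1]

print(z)
print(minmax)  # [0, 1/3, 2/3, 1]
```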
Before we build a predictive model, we need to divide the data into training and validation set. The main reason for doing this is a. To have a simpler model b. To avoid overfitting c. To have a more balanced data
b. To avoid overfitting
Which dataset is used to create the predictive model? a. Validation set b. Training set c. Both training and validation set
b. Training set
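A minimal partitioning sketch with scikit-learn's `train_test_split` (a hypothetical 60/40 split on toy arrays): the model is then fit on the training portion only, and the held-out validation portion estimates performance on new data, guarding against overfitting.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 records, 2 predictors
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 40% of the records as the validation set
train_X, valid_X, train_y, valid_y = train_test_split(
    X, y, test_size=0.4, random_state=1)

print(train_X.shape, valid_X.shape)  # (6, 2) (4, 2)
```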
A propensity score is used in combination with a _____ to determine class membership. a. Confusion matrix b. cutoff value c. ROC curve
b. cutoff value
If you have 4 variables/predictors, then in exhaustive search you need to evaluate a. 8 models b. 12 models c. 15 models d. 20 models
c. 15 models
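The count follows from subset arithmetic: exhaustive search fits one model per non-empty subset of the p predictors, and a set of p elements has 2^p − 1 non-empty subsets.

```python
# Number of models exhaustive search must evaluate for p predictors
for p in [2, 3, 4, 5]:
    print(p, 2 ** p - 1)  # p = 4 -> 15 models
```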
What is the minimum number of Principal components needed to explain at least 60% of the variance in the data? a. 1 b. 2 c. 3 d. 4
c. 3
Misclassification error arises when ____ a. An outlier is erroneously left in the dataset. b. A categorical variable is misclassified as numerical. c. A record belongs to one class but is classified as another.
c. A record belongs to one class but is classified as another.
Which of the following is TRUE about Multiple Linear Regression (MLR)? a. MLR assumes a non-linear relationship between outcome and predictors. b. Estimated beta coefficients are the ones that maximize prediction error in the training set. c. Estimated beta coefficients are the ones that minimize prediction error in the training set. d. Estimated beta coefficients are the ones that maximize prediction error in the validation set.
c. Estimated beta coefficients are the ones that minimize prediction error in the training set.
I want to visualize the relationship between Income and Loan_Amount. Therefore, I need to use a/an a. Histogram b. Bar chart c. Scatter plot d. Heatmap
c. Scatter plot
What do the off-diagonal cells in a confusion matrix tell us? a. Total number of records b. The number of correct classifications c. The number of misclassifications
c. The number of misclassifications
PCA is most useful when _______. a. There are categorical predictors b. There are many rows but only a few predictors c. There is high correlation among predictors d. There is no correlation among predictors
c. There is high correlation among predictors
Would you expect root mean square error (RMSE) to be higher for the validation data than for the training data? a. Yes, there are more observations in the validation data, thus RMSE is expected to be higher for the validation data b. No, there are more observations in the train data, thus RMSE is expected to be higher for the train data c. Yes, predictions are likely to be worse for the validation data than the training data d. No, the model can't be validated if the error is not smallest for the validation data
c. Yes, predictions are likely to be worse for the validation data than the training data
Suppose you have 5 features in the data and you are transforming your data using PCA. How many PCs do you have? a. 1 b. 2 c. 3 d. 5 e. 10
d. 5
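A sketch with scikit-learn on synthetic data: by default `PCA` produces one component per original feature, so 5 features yield 5 PCs, ordered so that PC1 explains the most variance.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

pca = PCA().fit(X)
print(pca.n_components_)              # 5 -> one PC per original feature
print(pca.explained_variance_ratio_)  # sorted in decreasing order: PC1 first
```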
In table 6.3 the Python command "pd.get_dummies" ______. a. Converts numerical predictors into dummy variables b. Converts a numerical outcome into a categorical outcome c. Partitions the data d. Converts categorical predictors into dummy variables
d. Converts categorical predictors into dummy variables
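A sketch of `pd.get_dummies` on a made-up Toyota-style DataFrame: each category of the categorical column becomes its own 0/1 dummy column, while numeric columns pass through unchanged.

```python
import pandas as pd

# Hypothetical rows loosely modeled on the Toyota Corolla example
cars = pd.DataFrame({"Age": [23, 24, 26],
                     "Fuel_Type": ["Diesel", "Petrol", "CNG"]})

dummies = pd.get_dummies(cars)
print(list(dummies.columns))
# ['Age', 'Fuel_Type_CNG', 'Fuel_Type_Diesel', 'Fuel_Type_Petrol']
```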
What is regularization in the context of linear models? a. It's a data science philosophy that prescribes testing our model's performance on irregular data, such as outliers b. It's a data science philosophy that prescribes regular updates to our models using the latest data c. It's a method to shrink the data so that model performance can be improved on the training data d. It's a method to drop a few predictors from the model in order to simplify the model
d. It's a method to drop a few predictors from the model in order to simplify the model
Which of the following is a feature extraction method? a. Exhaustive search b. Subset selection c. Regularization d. PCA
d. PCA
Which of the following is not an assumption of Linear Regression models? a. There is a linear relationship between the predictors and the outcome variable b. The cases (records) are independent of each other c. The variability in Y for a given set of predictors is the same regardless of the values of the predictors d. The noise follows a Poisson distribution
d. The noise follows a Poisson distribution
Which of the following is not a reason for dimensionality reduction? a. Cost of obtaining extra features b. To avoid multicollinearity in the data c. To avoid overfitting d. To avoid imbalance in the outcome variable
d. To avoid imbalance in the outcome variable
Which of the following is TRUE about PCA? a. The first PC explains the least variability in the data. b. The first PC is a non-linear combination of original features. c. If the data has 5 original features, then we have 2 PCs. d. You can reduce redundancy in the data by using PCA.
d. You can reduce redundancy in the data by using PCA.
Dimensionality reduction means _______________ . a. reducing the number of records in the data b. reducing the number of missing values in the data c. reducing the number of outliers in the data d. reducing the number of features in the data
d. reducing the number of features in the data
Suppose you have 5 features in the data and you are transforming your data using PCA. Which PC explains the most variance in the data? a. PC5 b. PC4 c. PC3 d. PC2 e. PC1
e. PC1
How many cells will a confusion matrix have if there are m classes? 4 (m-1)^2 m^2
m^2
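A sketch with made-up labels: for m classes, scikit-learn's `confusion_matrix` is m × m, giving m² cells; diagonal cells count correct classifications and off-diagonal cells count misclassifications.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical 3-class labels
actual    = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "b", "b", "b", "c", "a"]

cm = confusion_matrix(actual, predicted)
print(cm.shape)  # (3, 3) -> 3**2 = 9 cells for m = 3 classes
print(cm)        # diagonal = correct, off-diagonal = misclassified
```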
Principal component analysis is a/an __________ method for dimensionality reduction. supervised unsupervised
unsupervised