DSCI 4520 Exam 2


A decision tree for classification provides a set of IF-THEN rules. A) True B) False

A)

Evaluation of classification models depends on our choice of performance metric, which in turn depends on the problem that we are trying to solve. A) True B) False

A)

In evaluating a predictive model with a numerical target, the root mean squared error (RMSE) has the same unit as the predicted variable. A) True B) False

A)
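
As a rough illustration of why RMSE shares the unit of the target: the errors are squared, averaged, and then square-rooted, which brings the result back to the original unit. The prices below are made-up values and `rmse` is a hypothetical helper, not part of the course material.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean of the squared errors."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical house prices in dollars (illustrative values only).
actual_prices = [200_000, 250_000, 300_000]
predicted_prices = [210_000, 240_000, 290_000]

# Every prediction is off by 10,000 dollars, so the RMSE is 10,000 dollars.
print(rmse(actual_prices, predicted_prices))  # 10000.0
```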

In the confusion matrix the term "actual" refers to the observed labels of the data. A) True B) False

A)

In the logistic regression model the target variable is: A) A categorical variable B) A numeric variable C) A number between 0 and 1 D) Either a numeric or a binary variable

A)

Input variables (features) of the logistic regression model cannot be categorical. A) True B) False

B)

Statistical independence for two events is present when the outcome of the first event has no impact on the probability of the second event. A) True B) False

A)

The decision tree algorithm repeatedly splits the observations into two or more subgroups to achieve the most homogeneity. A) True B) False

A)

The following chart shows the prediction error of a decision tree based on the training set and validation set as functions of the number of splits. What phenomenon is causing the gap between the two curves at higher numbers of splits? A) Model overfitting B) Model underfitting C) Model variability D) Model instability

A)

The idea behind the random forest model is that the combination of multiple simple decision trees can reduce the probability of misclassification. A) True B) False

A)
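
A minimal sketch of the voting argument, under the idealized assumption that every tree is correct with the same probability and that the trees err independently (real trees in a forest only approximate this). The function name `majority_vote_error` is hypothetical.

```python
from math import comb

def majority_vote_error(n_voters, p_correct):
    """Probability that the majority of n independent voters is wrong.

    Assumes an odd number of voters, each correct with probability
    p_correct, and statistically independent errors.
    """
    majority = n_voters // 2 + 1
    p_majority_correct = sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(majority, n_voters + 1)
    )
    return 1 - p_majority_correct

# With voters that beat a random guess, more voters means a lower error.
for n in (1, 5, 25):
    print(n, round(majority_vote_error(n, 0.6), 3))

# With voters worse than a random guess, the same vote makes things worse.
print(round(majority_vote_error(5, 0.4), 3))
```

Under these assumptions the misclassification probability shrinks as voters are added only when each voter is better than a random guess.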

The overall goal of building a decision tree for classification is to create leaves that are purer in terms of class labels. A) True B) False

A)

What is the predicted variable in the logistic regression model? A) Probability of class membership B) Confusion matrix C) RMSE D) A number between -1 and 1

A)

Which of the following statements is INCORRECT about the logistic regression model? A) In the logistic regression, the intercept cannot be zero because of the natural logarithm function B) Logistic regression can be developed for a binary or a multi-class target variable C) Logistic regression uses odds and the natural logarithm function D) Logistic regression can be used for classification

A)

Which statement about Entropy and the Gini index is correct? A) Smaller values of both metrics indicate higher purity of a node B) Larger values of the Gini index and smaller values of Entropy indicate higher purity of a node C) Larger values of both metrics indicate higher purity of a node D) Larger values of Entropy and smaller values of the Gini index indicate higher purity of a node

A)
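
For a quick check of this behavior, the sketch below computes both metrics from class proportions using the usual definitions (Gini index as one minus the sum of squared proportions, entropy with a base-2 logarithm). The node proportions are made up for illustration.

```python
import math

def gini(proportions):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1 - sum(p**2 for p in proportions)

def entropy(proportions):
    """Entropy in bits; classes with zero proportion contribute nothing."""
    return sum(-p * math.log2(p) for p in proportions if p > 0)

# Hypothetical two-class nodes, from perfectly pure to evenly mixed.
nodes = {
    "pure": [1.0, 0.0],
    "mostly one class": [0.9, 0.1],
    "evenly mixed": [0.5, 0.5],
}
for name, props in nodes.items():
    print(name, round(gini(props), 3), round(entropy(props), 3))
```

Both values are 0 for the pure node and reach their two-class maximum (0.5 for Gini, 1 bit for entropy) at the evenly mixed node.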

Which statement explains the issues when linear regression is used to model binary target variables? A) Predicted probabilities can be >1 or <0 leading to model interpretation difficulties B) Complexity of logarithmic calculations C) Large intercept value associated with the linear regression models D) Instability of the model coefficients

A)

Which statement is INCORRECT about a CART trained to predict a numerical target? A) Impurity of the leaves can be measured with the Gini index or Entropy B) Prediction is computed as the average of numerical target variable at the leaves C) Training procedure is similar to training a CART for classification D) Pruning procedures and techniques are similar to those for the classification tree

A)

Which statement is INCORRECT about the structure of decision trees? A) Numerical attributes cannot be tested in the tree B) Each branch is a test on the predictor C) Each leaf is a terminal node with prediction D) Each node represents a test result on a predictor

A)

With the Naive Bayes classification method, the zero frequency problem occurs if a given scenario for a single predictor has not been observed. A) True B) False

A)
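
The problem can be seen in a small sketch: Naive Bayes multiplies the conditional probabilities of the predictor values, so a single value that was never observed with a class forces the whole product to zero. All counts, names, and the prior below are hypothetical.

```python
# Hypothetical training counts for the class "default = yes" (made-up data).
# The job value "student" was never observed together with this class.
class_count = 10
job_counts = {"employed": 6, "unemployed": 4, "student": 0}
income_counts = {"low": 7, "high": 3}

def conditional(counts, value, total):
    """Relative-frequency estimate of P(value | class)."""
    return counts[value] / total

prior = 0.25  # hypothetical P(default = yes)

# Score for a new record with job = "student" and income = "low".
score = (
    prior
    * conditional(job_counts, "student", class_count)
    * conditional(income_counts, "low", class_count)
)
print(score)  # 0.0, regardless of how strong the other evidence is
```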

How can we turn the logistic regression model into the classification model? A) By setting a cutoff value and comparing the predicted odds with it B) By introducing inverse natural log function to the model C) By setting a cutoff value and comparing the predicted probability with it D) By setting a cutoff value and comparing the predicted log odds with it

C)
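
A minimal sketch of this step, assuming the model has already produced a predicted probability for each record. The probabilities are made up, and treating a probability equal to the cutoff as class 1 is a convention chosen here, not something the source specifies.

```python
def classify(probabilities, cutoff=0.5):
    """Assign class 1 when the predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]

# Hypothetical predicted probabilities of class membership.
predicted_probabilities = [0.15, 0.40, 0.55, 0.90]

print(classify(predicted_probabilities))              # [0, 0, 1, 1]
print(classify(predicted_probabilities, cutoff=0.3))  # [0, 1, 1, 1]
```

Lowering the cutoff moves the second record into class 1, which is why the cutoff is usually chosen by assessing model performance rather than fixed in advance.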

In the following confusion matrix, which cell is the FALSE POSITIVE? A) a B) b C) c D) d

C)

Maximizing which performance metric reduces both type I and type II classification errors? A) Specificity B) Miss Rate C) AUC ROC D) Fall-out

C)

We are building a decision tree to predict loan default with four predictors: Age, Income, Gender, and Credit Score. For the first split, we have calculated the Gini index of each test. Based on the following information, which predictor is the best for the first split? A) Credit B) Age C) Income D) Gender

C)

We have trained two classification models, A and B, and their ROC curves are shown below. Given that the Area Under the Curve (AUC) is our performance metric, which model is performing better? A) We cannot determine it from these curves alone B) B C) A D) Both models perform equally well

C)

What can cause the over-fitting problem in k-NN classifiers? A) Splitting the data set B) Incorrect distance function C) Too small values of k D) Too large values of k

C)

What is the fall-out score of the following confusion matrix given that "1" is positive? (rounded to 2 places) A) 0.45 B) 0.53 C) 0.47 D) 0.36

C)

What is the main method of unifying the output of an ensemble of decision trees in a random forest model? A) Logistic regression B) Sequential selection C) Voting D) Random selection

C)

What is the sensitivity score of the following confusion matrix given that "1" is positive? (rounded to 2 decimal places) A) 0.29 B) 0.50 C) 0.71 D) 0.31

C)

What is the specificity score of the following confusion matrix given that "1" is positive? (rounded to 2 places) A) 0.88 B) 0.19 C) 0.81 D) 0.21

C)
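
The confusion-matrix scores asked about in this set follow directly from the four cell counts. The sketch below uses made-up counts with class "1" as positive; they are not the matrices from the questions, whose figures are not reproduced here, and `confusion_metrics` is a hypothetical helper.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Common rates derived from the four cells of a binary confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),    # true positive rate (recall)
        "specificity": tn / (tn + fp),    # true negative rate
        "fall_out": fp / (fp + tn),       # false positive rate = 1 - specificity
        "miss_rate": fn / (fn + tp),      # false negative rate = 1 - sensitivity
        "error_rate": (fp + fn) / total,  # share of all records misclassified
    }

# Hypothetical counts: 50 actual positives and 50 actual negatives.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(name, round(value, 2))
```

With these counts the sensitivity is 0.8, the specificity 0.9, the fall-out 0.1, the miss rate 0.2, and the error rate 0.15.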

Which statement is INCORRECT about the k-nearest neighbor (k-NN) method? A) Different k values can change the performance of the classifier B) Too small a value for k may lead to over-fitting C) When k=1 (closest record) the classifier performance is maximum D) k is an arbitrary number that can be selected by trial-and-error

C)

Which one is NOT one of the advantages of Naive Bayes classifiers? A) Ability to handle both categorical and numerical variables B) Robustness to irrelevant features C) Assumption of independence of features D) Ease of building and understanding

C)

What is a propensity score? A) An indicator of the correct cut-off value B) An arbitrary number assigned to each record C) A measure that shows the accuracy of the model D) Predicted probability of class membership

D)

What is the error rate of the following confusion matrix? (rounded to 2 decimal places) A) 0.58 B) 0.46 C) 0.59 D) 0.41

D)

What is the primary method to avoid underfitting when you are training a classification and regression tree model? A) Substituting multi-level categorical variables with binary dummy variables B) Increasing the entropy or the Gini index of the leaves C) Converting numerical variables to categorical D) Increasing the number of tests (splits)

D)

Which one is NOT a necessary condition for building an ensemble model that outperforms its base models? A) There is a diverse group of base classifiers B) Base classifiers are built independently C) Each base classifier (voter) is better than a random guess D) All base classifiers are trained on the entire input data set

D)

Which statement is correct about the cutoff value of the probability calculated by a logistic regression model to be used for classification? A) Larger cutoff values result in higher model performance B) Smaller cutoff values result in higher model performance C) The cutoff value must always be set to 0.5 D) The cutoff value is an arbitrary value determined by model performance assessment

D)

A decision tree can be pre- or post-pruned to avoid underfitting a classification and regression tree. A) True B) False

B)

Consider two models A and B. If the prediction accuracy of Model A is higher than that of Model B for the training dataset, we can say that Model A is definitely better than Model B. A) True B) False

B)

If events A and B are statistically independent, what is P(A|B), that is, the conditional probability of A given B? A) P(B|A) B) P(A) C) P(B) D) P(A) * P(B)

B)
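
A short derivation of this answer, assuming P(B) > 0 and using the definition of conditional probability together with the product rule for independent events:

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)\,P(B)}{P(B)} = P(A)
```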

In building a decision tree, a common strategy for selecting an attribute for splitting is to choose the attribute (feature) that results in the lowest degree of certainty after the split. A) True B) False

B)

In evaluating a predictive model with a numerical target, the mean absolute error (MAE) can be negative or positive but the mean error (ME) is always positive. A) True B) False

B)
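
A small sketch of the difference, assuming the error is defined as actual minus predicted (the sign convention varies between textbooks). The values are made up for illustration.

```python
def mean_error(actual, predicted):
    """Average of signed errors; positive and negative errors can cancel."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

def mean_absolute_error(actual, predicted):
    """Average of absolute errors; never negative."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

# Hypothetical values; the signed errors are -2, -5, and 1.
actual = [10, 20, 30]
predicted = [12, 25, 29]

print(mean_error(actual, predicted))                     # -2.0
print(round(mean_absolute_error(actual, predicted), 3))  # 2.667
```

The mean error is negative here because the model over-predicts on average, while the mean absolute error stays non-negative.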

The cost of misclassification is always the same for false negative and false positive cases. A) True B) False

B)

The following chart shows the prediction error of a decision tree based on the training set and validation set as functions of the number of splits. To avoid overfitting, what is the best number of splits? A) 16 B) 5 C) 10 D) 3

B)

The main difference between k-NN classifiers and k-NN regression models is that the former does not need a distance function, while the latter uses the Euclidean distance function. A) True B) False

B)

To build a random forest model that performs better than each of its decision trees, it is necessary that the base models are dependent. A) True B) False

B)

What statement is correct about the point of maximum entropy/Gini index? A) Node impurity is independent of the Gini index and entropy B) Impurity of a node is maximized C) Impurity of a node is minimized D) Probability of misclassification is minimized

B)

Which of the following techniques is NOT useful for preventing over-fitting a decision tree? A) Pruning a tree B) Adding duplicate records C) Early stopping D) Cross-validation

B)

Which one is NOT one of the stopping criteria for splitting in decision tree training? A) When all attributes have been used B) When the tree becomes asymmetric (skewed) C) When additional splits obtain no information gain D) When the tree reaches the specified number of nodes or level of depth

B)

Which statement is INCORRECT about the Naïve Bayes classifier? A) It identifies the dependent variable's level (i.e. event) that increases the probability of the desired target class label B) It computes and includes prior probability of predictors C) It examines the existing evidence to predict the probability of target levels D) It returns the event for which the joint probability of that event and the predictors is maximized

B)

With the k-NN model for classification, after we have determined the k nearest neighbors of a new data record, how is the class predicted? A) Average of the neighbors B) Majority vote determines the predicted class C) Through a linear combination of neighbors D) Through a logistic regression between the neighbors

B)
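
A minimal sketch of k-NN classification by majority vote, assuming Euclidean distance and a tiny made-up training set. The function `knn_predict` is hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(training_records, new_point, k):
    """Predict the class of new_point by majority vote of its k nearest records.

    training_records is a list of (features, label) pairs. Ties in the vote
    are resolved in favor of the label encountered first among the neighbors.
    """
    by_distance = sorted(
        training_records, key=lambda record: math.dist(record[0], new_point)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-feature records with class labels "A" and "B".
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 5.5), "B"),
]

print(knn_predict(training, (2.0, 2.0), k=3))  # A
print(knn_predict(training, (6.5, 6.0), k=3))  # B
```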

In the random forest algorithm, how are different subsets of features and data records selected for training each of the base decision trees? A) Voting B) Random selection C) Sequential selection D) Natural selection

B)

