Module 5


The default likelihood threshold for determining outcome class in logistic regression is -- 0.5 -- 1 -- 0.4 -- 0.6

0.5
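As a minimal sketch of how the 0.5 default is applied, the snippet below turns log-odds scores into probabilities and then into class labels. The function names, the sample scores, and the choice to assign a probability of exactly 0.5 to class 1 are assumptions made for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(probability: float, threshold: float = 0.5) -> int:
    """Return 1 when the probability reaches the threshold, else 0.

    Treating a tie (probability == threshold) as class 1 is an assumption.
    """
    return 1 if probability >= threshold else 0

# Hypothetical log-odds scores for three observations.
scores = [-2.0, 0.0, 1.5]
probabilities = [sigmoid(z) for z in scores]
labels = [classify(p) for p in probabilities]
stricter_labels = [classify(p, threshold=0.9) for p in probabilities]
```

Raising the threshold, as in `stricter_labels`, makes the model more reluctant to predict the positive class.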

To decrease the size of the generated tree in decision tree analysis, one should ________________ A) increase the minimum size for split B) increase the minimum size for leaf C) use the gain ratio algorithm D) A & B

D) A & B: increase the minimum size for split and increase the minimum size for leaf
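The effect of both settings can be seen on a toy example. The sketch below grows a greedy one-variable tree on hypothetical data and counts its leaves. It is a simplified illustration, not the algorithm of any particular tool, and the parameter names `min_split` and `min_leaf` are assumptions.

```python
from collections import Counter

def misclassified(group):
    """Number of points in the group that are not in its majority class."""
    counts = Counter(label for _, label in group)
    return len(group) - max(counts.values())

def count_leaves(points, min_split=2, min_leaf=1):
    """Count the leaves of a greedily grown tree on (x, label) points."""
    labels = {label for _, label in points}
    # Stop when the node is too small to split or already pure.
    if len(points) < min_split or len(labels) == 1:
        return 1
    points = sorted(points, key=lambda pt: pt[0])
    best = None
    # Only consider cut positions that leave min_leaf points on each side.
    for i in range(min_leaf, len(points) - min_leaf + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot cut between two identical x values
        score = misclassified(points[:i]) + misclassified(points[i:])
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return 1
    cut = best[1]
    return (count_leaves(points[:cut], min_split, min_leaf)
            + count_leaves(points[cut:], min_split, min_leaf))

# Hypothetical data: eight points with alternating class labels.
points = [(x, x % 2) for x in range(8)]
full_tree = count_leaves(points)
larger_leaf = count_leaves(points, min_leaf=2)
larger_split = count_leaves(points, min_split=8)
```

With the defaults, the tree keeps splitting until every leaf holds a single point. Raising either minimum stops growth earlier and yields fewer leaves.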

If it is not specified, logistic regression usually refers to __________ -- Multinomial Logistic Regression -- Polynomial Logistic Regression -- Binomial Logistic Regression -- None of the above

Binomial Logistic Regression

Decision Trees analysis can only be used to predict a continuous dependent variable. -- True -- False

False

Logistic Regression is the same as polynomial regression. -- True -- False

False

Multinomial logistic regression is the same as linear regression. -- True -- False

False

You need to let the data inform you about the appropriate decision threshold -- True -- False

False (No. It is highly contextual and has to be determined mostly based on business understanding.)

Logistic Regression is also known as __________ -- Logarithmic Regression -- Logit Regression -- Binary Regression -- Linear Regression

Logit Regression
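The name comes from the logit (log-odds) function, which logistic regression models as a linear function of the predictors. A short sketch of the logit and its inverse follows; the function names are chosen here for illustration.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z: float) -> float:
    """Convert log-odds z back to a probability (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))

even_odds = logit(0.5)                   # equal odds give log-odds of zero
round_trip = inverse_logit(logit(0.8))   # recovers the original probability
```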

Which one is not a statistical measure used in logistic regression model evaluation? -- Hit Rate -- R-Squared -- Nagelkerke R-Squared -- -2LL

R-Squared
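For reference, the three measures that do apply can be computed from the observed outcomes and the predicted probabilities. The sketch below uses the standard definitions of -2LL, Nagelkerke R-Squared (Cox and Snell R-Squared rescaled by its maximum), and hit rate at a 0.5 threshold. The data and function names are hypothetical, and the code assumes every probability lies strictly between 0 and 1.

```python
import math

def neg2_log_likelihood(y, p):
    """-2 times the log-likelihood of binary outcomes y given probabilities p."""
    return -2.0 * sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def nagelkerke_r2(y, p):
    """Cox and Snell R-Squared divided by its maximum attainable value."""
    n = len(y)
    base_rate = sum(y) / n
    # Null model: every observation gets the overall base rate.
    d_null = neg2_log_likelihood(y, [base_rate] * n)
    d_model = neg2_log_likelihood(y, p)
    cox_snell = 1.0 - math.exp((d_model - d_null) / n)
    max_cox_snell = 1.0 - math.exp(-d_null / n)
    return cox_snell / max_cox_snell

def hit_rate(y, p, threshold=0.5):
    """Share of observations classified correctly at the given threshold."""
    hits = sum(1 for yi, pi in zip(y, p) if (pi >= threshold) == (yi == 1))
    return hits / len(y)

# Hypothetical outcomes and fitted probabilities.
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]

deviance = neg2_log_likelihood(y, p)
pseudo_r2 = nagelkerke_r2(y, p)
accuracy = hit_rate(y, p)
```

A lower -2LL and a higher Nagelkerke R-Squared both indicate a better fit than the null model.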

Decision tree algorithm starts with __________ predictor and then branches out to __________ -- The neutral, better predictors -- The weakest, better predictors -- The best, weaker predictors -- The worst, stronger predictors

The best, weaker predictors
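One common way to rank predictors for the root is information gain. This is one criterion among several, and the course tool may use another, such as gain ratio. The sketch below picks the root attribute on a small hypothetical data set; the attribute names and rows are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

def information_gain(rows, attribute, label):
    """Reduction in label entropy obtained by splitting rows on attribute."""
    base = entropy([row[label] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[label] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

gains = {name: information_gain(rows, name, "play") for name in ("outlook", "windy")}
root = max(gains, key=gains.get)
```

The attribute with the highest gain becomes the root, and the weaker predictors are considered further down the tree.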

Logistic Regression is similar to Linear regression in that -- They both only take one independent variable -- They both only take one dependent variable -- They both estimate an imaginary curvilinear line that fits the data the best -- They both estimate an imaginary straight line that fits the data the best

They both only take one dependent variable

Anything in the world can be modeled using binary outcomes. -- True -- False

True

Decision tree analysis can detect and model localities within our data. -- True -- False

True

Decision tree analysis is not sensitive to missing values and outliers. -- True -- False

True

A decision tree is also known as a classification tree. -- True -- False

True

Using the same training data we can generate one decision tree model for the purpose of understanding and explaining and another one for the purpose of best predictive performance. -- True -- False

True

Considering the decision tree model, nodes _______________ -- show us the distribution of categories from the label attribute -- are an alternative for leaves -- represent all of the independent variables in our model -- are attributes which serve as predictors for the dependent attribute

are attributes which serve as predictors for the dependent attribute

Why is it recommended that scoring data fall within the ranges of the training data in classification/prediction tasks? -- because computer algorithms are very sensitive to data boundaries -- because computer algorithms also learn by experience; they cannot accurately classify/predict something they have not yet experienced -- there is no specific reason; it is tradition -- because scoring data is part of the training data and cannot exceed it

Because computer algorithms also learn by experience; they cannot accurately classify/predict something they have not yet experienced.
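A simple way to act on this advice is to compare each scoring value with the minimum and maximum seen in training. The sketch below flags out-of-range values. The function name, the row layout (one dictionary of numeric attributes per row), and the sample data are assumptions.

```python
def out_of_range(training_rows, scoring_rows):
    """Return (row index, attribute) pairs whose value lies outside the training range."""
    ranges = {}
    for row in training_rows:
        for name, value in row.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))

    flagged = []
    for index, row in enumerate(scoring_rows):
        for name, value in row.items():
            if name in ranges:
                low, high = ranges[name]
                if not low <= value <= high:
                    flagged.append((index, name))
    return flagged

# Hypothetical training and scoring data.
training = [{"age": 25, "income": 40}, {"age": 60, "income": 120}]
scoring = [{"age": 30, "income": 90}, {"age": 72, "income": 50}]

flagged = out_of_range(training, scoring)
```

Flagged rows can then be reviewed or excluded before the model is applied to them.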

An overfitted decision tree model -- has many leaves with the minimum number of allowed observations -- performs great on training/testing data -- is a fit tree with not many branches, it looks fit and slim -- is sensitive to outliers since it is overly fit -- has too many branches -- is sensitive to missing values -- fails on scoring data

has many leaves with the minimum number of allowed observations; performs great on training/testing data; has too many branches; fails on scoring data

Which statements are true about Random Forest analysis? -- it randomly chooses a model as the best one -- it easily falls into the overfitting problem -- it takes time to grow a forest, it is not good for quick tasks -- it uses a voting mechanism to select the best model -- it is an ensemble of decision tree models -- it is an extended decision tree model with many roots and branches -- it generates different trees by changing model parameters randomly -- models are uncorrelated

it uses a voting mechanism to select the best model, it is an ensemble of decision tree models, it generates different trees by changing model parameters randomly, models are uncorrelated
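The voting step can be sketched in a few lines: each tree in the ensemble makes its own prediction and the most frequent class wins. The three "trees" below are hypothetical one-rule stand-ins rather than trained models, and the attribute names are invented.

```python
from collections import Counter

def majority_vote(trees, observation):
    """Return the class predicted by the largest number of trees."""
    votes = [tree(observation) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for three independently grown trees.
trees = [
    lambda obs: "yes" if obs["income"] > 50 else "no",
    lambda obs: "yes" if obs["age"] > 30 else "no",
    lambda obs: "yes" if obs["income"] > 80 else "no",
]

prediction = majority_vote(trees, {"income": 60, "age": 25})
```

Here the trees vote "yes", "no", and "no", so the ensemble predicts "no". An odd number of trees avoids ties in a two-class problem.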

