MSIS 4263 - Exam 2

In building a decision tree, the stopping rule determines ___________

When the splitting should stop

In a decision tree, each leaf represents a _____.

a class

Association Rules (Market Basket) is a predictive data mining approach.

False

Classification trees are used for predicting dependent variables that are continuous in nature, whereas regression trees are used for predicting categorical dependent variables.

False

A decision tree is an unsupervised learning algorithm.

False

Decision tree construction is performed in a bottom-up manner.

False

If the probability of an outcome is $p = \frac{\text{outcome of interest}}{\text{all possible outcomes}}$, then the probability of a coin flip is $p(\text{heads}) = \frac{2}{6} = 0.333$.

False
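The statement is false because a coin flip has two possible outcomes, only one of which is heads. A worked version of the same formula, assuming a fair coin:

```latex
p(\text{heads}) = \frac{\text{outcome of interest}}{\text{all possible outcomes}} = \frac{1}{2} = 0.5
```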

In interpreting the results of a logistic regression model, when the coefficient is negative, this means that the odds ratio is more than 1.

False
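A minimal sketch of why this is false: the odds ratio is the exponential of the coefficient, so a negative coefficient gives an odds ratio below 1. The coefficient values below are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient: exp(coefficient)."""
    return math.exp(coefficient)


# Hypothetical coefficients, not taken from any fitted model.
for beta in (-0.7, 0.0, 0.7):
    ratio = odds_ratio(beta)
    if ratio < 1:
        direction = "below 1"
    elif ratio > 1:
        direction = "above 1"
    else:
        direction = "exactly 1"
    print(f"coefficient {beta:+.1f} -> odds ratio {ratio:.3f} ({direction})")
```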

Logistic regression uses a straight line to model the probabilities of the predicted values.

False

Odds can range from 0 to 1, whereas a probability is a ratio of two probabilities ranging from 0 to infinity.

False
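The statement swaps the two definitions: a probability lies between 0 and 1, while odds are the ratio p / (1 - p) and can grow without bound. A small sketch of the conversion in both directions (the function names are illustrative, not from any library):

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability in [0, 1) to odds = p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Convert non-negative odds back to a probability = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# Odds exceed 1 as soon as the probability exceeds 0.5.
for p in (0.1, 0.5, 0.9, 0.99):
    print(p, odds_from_probability(p))

# Odds of 2 correspond to a probability of 2/3: the event is twice as
# likely to happen as not to happen.
print(probability_from_odds(2.0))
```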

The association rule A --> B suggests that "if A occurs, then B occurs."

True

One of the following is a common algorithm for generating association rules.

Apriori

The module in SAS EM that deals with text mining processes is the _____

Text mining

In text mining, the data preprocessing concept known as Filtering refers to _____.

- Removing unique terms - Removing common words

In preparing to perform text mining, identify some of the issues one encounters in converting unstructured data to structured data.

- Tokenization - Stemming - Filtering - Parsing

Identify the types of data used for classification predictions.

- discrete - categorical - nominal

Consider the transactions below. What is the support of the rule {Milk} --> {Sugar}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

20%

Consider the transactions below. What is the support of the rule {Bread} --> {Eggs}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

30%

Consider the transactions below. What is the support of the rule {Bread} --> {Milk}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

40%
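The three support answers above can be checked with a short sketch. It treats the support of a rule as the fraction of all transactions that contain every item on both sides of the rule; the function name `support` is illustrative.

```python
# The ten transactions from the support questions above.
transactions = [
    {"Bread", "Eggs", "Milk"},
    {"Beer", "Bread", "Milk"},
    {"Apples", "Bread", "Eggs"},
    {"Bread", "Milk", "Sugar"},
    {"Eggs", "Milk"},
    {"Bread"},
    {"Bread", "Milk", "Sugar"},
    {"Beer", "Bread"},
    {"Apples", "Eggs", "Sugar"},
    {"Bread", "Eggs"},
]


def support(antecedent, consequent, transactions):
    """Fraction of transactions containing all items of the rule."""
    items = set(antecedent) | set(consequent)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


print(support({"Milk"}, {"Sugar"}, transactions))   # 0.2
print(support({"Bread"}, {"Eggs"}, transactions))   # 0.3
print(support({"Bread"}, {"Milk"}, transactions))   # 0.4
```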

An Odds ratio can be interpreted as ______

A comparison between two Odds

In logistic regression, the target is a/an ___

Binary variable

The measure of strength of an association rule is called _____

Confidence
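As a sketch of how confidence is computed, assuming the usual definition (of the transactions that contain the left-hand side, the fraction that also contain the right-hand side) and reusing the ten transactions from the support questions above:

```python
transactions = [
    {"Bread", "Eggs", "Milk"},
    {"Beer", "Bread", "Milk"},
    {"Apples", "Bread", "Eggs"},
    {"Bread", "Milk", "Sugar"},
    {"Eggs", "Milk"},
    {"Bread"},
    {"Bread", "Milk", "Sugar"},
    {"Beer", "Bread"},
    {"Apples", "Eggs", "Sugar"},
    {"Bread", "Eggs"},
]


def confidence(antecedent, consequent, transactions):
    """count(antecedent and consequent) / count(antecedent)."""
    lhs = set(antecedent)
    both = lhs | set(consequent)
    lhs_count = sum(1 for t in transactions if lhs <= t)
    if lhs_count == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both_count = sum(1 for t in transactions if both <= t)
    return both_count / lhs_count


# {Milk} --> {Sugar}: Milk appears in 5 transactions, 2 of which contain Sugar.
print(confidence({"Milk"}, {"Sugar"}, transactions))   # 0.4
# {Bread} --> {Milk}: Bread appears in 8 transactions, 4 of which contain Milk.
print(confidence({"Bread"}, {"Milk"}, transactions))   # 0.5
```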

Which of the following is considered the right-hand side of a rule?

Consequent

In text mining, the data preprocessing concept known as lemmatization refers to _____.

Dealing with base words and their related terms

What is the term used to describe the additional information required to predict an event, measured in bits?

Entropy
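A minimal sketch of entropy measured in bits, assuming the Shannon definition with base-2 logarithms. A pure node (one class with probability 1) has entropy 0, and an evenly split two-class node has entropy 1 bit.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: sum of p * log2(1 / p) over non-zero p."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


print(entropy_bits([1.0]))        # pure node: 0.0 bits
print(entropy_bits([0.5, 0.5]))   # maximally impure two-class node: 1.0 bit
print(entropy_bits([0.9, 0.1]))   # mostly one class: between 0 and 1 bit
```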

The association rule A <-- B suggests that "if A occurs, then B occurs."

False

Text mining is the process of applying data mining algorithms to categorical data.

False. Text mining is the process of applying data mining algorithms and approaches to textual data.

Roger is a data scientist working in a marketing consulting company. He has been asked to build a model that identifies loyal and non-loyal customers for one of their biggest clients. What type of classification model is Roger planning to use?

Logistic regression

A field of study that gives computers the ability to learn without being explicitly programmed

Machine learning

Unlike linear regression, logistic regression uses ___ to fit the model.

Maximum Likelihood Estimation

In classification, when you test a model, you are _______.

Measuring the model's ignorance

A classifier that performs well on the training data, but poorly on real or new data is known as _____

Overfitting

Identify the type of task that a data scientist is performing when he/she is measuring the model's ignorance.

Partitioning the available data

When one term relates to multiple concepts, this is known as _____

Polysemy

Which of the following is true of supervised learning?

Requires training

The measure of relevance of an association rule is called _____

Support

The use of multiple synonyms to represent the same concept is known as ____

Synonymy

Logistic regression forces predicted values to fall between 0 and 1.

True
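This follows from the logistic (sigmoid) function that maps the linear predictor to a probability. A small sketch, written in a numerically stable form so that large negative inputs do not overflow; the input values are arbitrary:

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z is negative, so exp(z) is at most 1
    return e / (1.0 + e)


# Arbitrary linear-predictor values; every output stays between 0 and 1.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(z, round(sigmoid(z), 4))
```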

In the classification process, which data partition is usually used in constructing the classification model?

Training data

A corpus is a collection of documents

True

Deep Learning is a subset of Machine Learning

True

Entropy provides us the information required to predict an event with certainty.

True

Latent Semantic Analysis is a text mining algorithm used for topic extraction

True

The Odds of 2 means that a person is twice as likely to experience an event as not to experience it.

True

In the classification process, which data partition is usually used in fine-tuning and assessing the performance of the classification model?

Validation data
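A minimal sketch of partitioning the available data into a training set (used to construct the model) and a validation set (used to fine-tune and assess it). The 70/30 split, the seed, and the function name are assumptions for illustration, not values from the course material.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows and split them into (training, validation) lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


training, validation = partition(range(100))
print(len(training), len(validation))  # 70 30
```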

In a decision tree, each branch represents a _____.

a decision

The term greedy algorithm refers to _______

algorithms that choose the locally optimal solution at each step

Classification is a(n) _____ method

predictive

When a node in a decision tree is impure, this means that _____

we are not certain

