MSIS 4263 - Exam 2

In building a decision tree, the stopping rule determines ___________

When the splitting should stop

In a decision tree, each leaf represents a _____.

a class

Association Rules (Market Basket) is a predictive data mining approach.

False

Classification trees are used for predicting dependent variables that are continuous in nature, whereas regression trees are used for predicting categorical dependent variables.

False

A decision tree is an unsupervised learning algorithm.

False

Decision tree construction is performed in a bottom-up manner.

False

If the probability of an outcome is $p = \frac{\text{outcome of interest}}{\text{all possible outcomes}}$, then the probability of a coin flip is $p(\text{heads}) = \frac{2}{6} = 0.333$.

False

In interpreting the results of a logistic regression model, when the coefficient is negative, this means that the odds ratio is more than 1.

False
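
The link between a coefficient's sign and the odds ratio can be checked numerically. The sketch below assumes the usual logistic regression interpretation, in which the odds ratio for a one-unit increase in a predictor is exp(coefficient); the function name `odds_ratio` and the sample coefficients are illustrative only.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Return exp(coefficient): the multiplicative change in the odds
    for a one-unit increase in the predictor (standard logit link assumed)."""
    return math.exp(coefficient)


# Illustrative coefficients, not taken from any fitted model.
for beta in (-0.5, 0.0, 0.5):
    print(f"coefficient = {beta:+.1f} -> odds ratio = {odds_ratio(beta):.3f}")
```

A negative coefficient gives an odds ratio below 1, a zero coefficient gives exactly 1, and a positive coefficient gives a value above 1.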

Logistic regression uses a straight line to model the probabilities of the predicted values.

False

Odds can range from 0 to 1, whereas a probability is a ratio of two probabilities ranging from 0 to infinity.

False
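
The statement has the two ranges swapped: probabilities lie between 0 and 1, while odds are the ratio p / (1 - p) and can grow without bound. A minimal sketch of the conversion, with hypothetical helper names:

```python
def odds_from_probability(p: float) -> float:
    """Convert a probability in [0, 1) to odds = p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Convert non-negative odds back to a probability = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# Odds grow without bound as the probability approaches 1.
for p in (0.1, 0.5, 0.9, 0.99):
    print(f"p = {p:.2f} -> odds = {odds_from_probability(p):.2f}")
```

For example, odds of 2 correspond to a probability of 2/3: the event is twice as likely to happen as not.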

The association rule A --> B suggests that "if A occurs, then B occurs."

True

One of the following is a common algorithm for generating association rules.

Apriori

The module in SAS EM that deals with text mining processes is the _____

Text mining

In text mining, the data preprocessing concept known as Filtering refers to _____.

- Removing unique terms
- Removing common words

In preparing to perform text mining, identify some of the issues one encounters in converting unstructured data to structured data.

- Tokenization
- Stemming
- Filtering
- Parsing

Identify the types of data used for classification predictions.

- discrete
- categorical
- nominal

Consider the following transactions. What is the support of the rule {Milk} --> {Sugar}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

20%

Consider the following transactions. What is the support of the rule {Bread} --> {Eggs}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

30%

Consider the following transactions. What is the support of the rule {Bread} --> {Milk}?

TID  Items
1    Bread, Eggs, Milk
2    Beer, Bread, Milk
3    Apples, Bread, Eggs
4    Bread, Milk, Sugar
5    Eggs, Milk
6    Bread
7    Bread, Milk, Sugar
8    Beer, Bread
9    Apples, Eggs, Sugar
10   Bread, Eggs

40%
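
The three support answers above can be reproduced with a short script. This is a sketch that assumes the usual definition of rule support: the fraction of all transactions that contain every item in both the antecedent and the consequent. The function name `support` is illustrative.

```python
# The ten transactions from the questions above, one set of items per TID.
transactions = [
    {"Bread", "Eggs", "Milk"},
    {"Beer", "Bread", "Milk"},
    {"Apples", "Bread", "Eggs"},
    {"Bread", "Milk", "Sugar"},
    {"Eggs", "Milk"},
    {"Bread"},
    {"Bread", "Milk", "Sugar"},
    {"Beer", "Bread"},
    {"Apples", "Eggs", "Sugar"},
    {"Bread", "Eggs"},
]


def support(antecedent, consequent, transactions):
    """Fraction of transactions containing all items of the rule."""
    items = set(antecedent) | set(consequent)
    matches = sum(1 for t in transactions if items <= t)
    return matches / len(transactions)


print(support({"Milk"}, {"Sugar"}, transactions))   # TIDs 4 and 7
print(support({"Bread"}, {"Eggs"}, transactions))   # TIDs 1, 3 and 10
print(support({"Bread"}, {"Milk"}, transactions))   # TIDs 1, 2, 4 and 7
```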

An Odds ratio can be interpreted as ______

A comparison between two Odds

In logistic regression, the target is a/an ___

Binary variable

The measure of strength of an association rule is called _____

Confidence
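
As an illustration, confidence can be computed for the ten grocery transactions used in the support questions of this set. The sketch assumes the standard definition: of the transactions that contain the antecedent, the fraction that also contain the consequent. The function name `confidence` is illustrative.

```python
# The ten transactions from the support questions, one set of items per TID.
transactions = [
    {"Bread", "Eggs", "Milk"},
    {"Beer", "Bread", "Milk"},
    {"Apples", "Bread", "Eggs"},
    {"Bread", "Milk", "Sugar"},
    {"Eggs", "Milk"},
    {"Bread"},
    {"Bread", "Milk", "Sugar"},
    {"Beer", "Bread"},
    {"Apples", "Eggs", "Sugar"},
    {"Bread", "Eggs"},
]


def confidence(antecedent, consequent, transactions):
    """Share of antecedent-containing transactions that also hold the consequent."""
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    if with_antecedent == 0:
        raise ValueError("antecedent never occurs in the transactions")
    with_both = sum(1 for t in transactions if both <= t)
    return with_both / with_antecedent


print(confidence({"Milk"}, {"Sugar"}, transactions))  # 2 of the 5 Milk baskets
print(confidence({"Bread"}, {"Milk"}, transactions))  # 4 of the 8 Bread baskets
```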

Which of the following is considered the right-hand side of a rule?

Consequent

In text mining, the data preprocessing concept known as lemmatization refers to _____.

Dealing with base words and their related terms

What is the term used to describe the additional information required to predict an event, measured in bits?

Entropy
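
A small sketch of entropy in bits, assuming the Shannon definition H = -sum(p * log2(p)) over the outcome probabilities. The function name `entropy` and the example distributions are illustrative.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# A fair coin is maximally uncertain for two outcomes; a biased coin is less so.
print(entropy([0.5, 0.5]))
print(entropy([0.9, 0.1]))
```

A fair coin needs exactly one bit, and four equally likely classes need two bits; a node whose records all share one class has zero entropy, which is why decision trees favor splits that reduce it.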

The association rule A <-- B suggests that "if A occurs, then B occurs."

False

Text mining is the process of applying data mining algorithms to categorical data.

False. Text mining is the process of applying data mining algorithms and approaches to textual data.

Roger is a data scientist working in a marketing consulting company. He has been asked to build a model that identifies loyal and non-loyal customers for one of their biggest clients. What type of classification model is Roger planning to use?

Logistic regression

A field of study that gives computers the ability to learn without being explicitly programmed

Machine learning

Unlike linear regression, logistic regression uses ___ to model

Maximum Likelihood Estimation

In classification, when you test a model, you are _______.

Measuring the model's ignorance

A classifier that performs well on the training data, but poorly on real or new data is known as _____

Overfitting

Identify the type of task that a data scientist is performing when he/she is measuring the model's ignorance.

Partitioning the available data

When one term relates to multiple concepts, this is known as _____

Polysemy

Which of the following is true of supervised learning?

Requires training

The measure of relevance of an association rule is called _____

Support

When multiple terms represent the same concept, this is known as ____

Synonymy

Logistic regression forces predicted values to fall between 0 and 1.

True
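
The bound comes from the logistic (sigmoid) function, which maps any real-valued linear score to a value strictly between 0 and 1 along an S-shaped curve rather than a straight line. A minimal sketch, with an illustrative function name and inputs:

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative linear scores; the outputs stay inside (0, 1).
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"z = {z:+5.1f} -> predicted probability = {sigmoid(z):.5f}")
```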

In the classification process, which data partition is usually used in constructing the classification model?

Training data

A corpus is a collection of documents

True

Deep Learning is a subset of Machine Learning

True

Entropy provides us the information required to predict an event with certainty.

True

Latent Semantic Analysis is a text mining algorithm used for topic extraction

True

The Odds of 2 means that a person is twice as likely to experience an event as not to experience it.

True

In the classification process, which data partition is usually used in fine-tuning and assessing the performance of the classification model?
