MSIS 4263: Exam 2 Quizzes

True

Odds of 2 mean that a person is twice as likely to experience an event as not to experience it. (T/F)
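As a quick check on the card above, odds are the probability of an event divided by the probability of its complement. The sketch below is only illustrative; the function names `odds` and `probability_from_odds` are made up for this example.

```python
def odds(p: float) -> float:
    """Odds of an event that has probability p (requires 0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


def probability_from_odds(o: float) -> float:
    """Inverse mapping: probability of an event that has odds o (o >= 0)."""
    if o < 0:
        raise ValueError("odds must be non-negative")
    return o / (1 + o)


# An event with probability 2/3 has odds of about 2:
# it is twice as likely to happen as not to happen.
print(odds(2 / 3))
print(probability_from_odds(2))
```

Note that odds range from 0 to infinity, while probabilities stay between 0 and 1.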

Entropy

What is the term used to describe the additional information required to predict an event, measured in bits?
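For reference, Shannon entropy in bits can be computed from a list of class probabilities. This is a minimal sketch; the function name `entropy_bits` is an assumption for illustration, not something from the course materials.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution.

    Terms with probability 0 contribute nothing and are skipped.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)


# A fair coin needs 1 bit; a certain outcome needs 0 bits.
print(entropy_bits([0.5, 0.5]))
print(entropy_bits([1.0]))
```

In decision tree terms, a pure node (all cases in one class) has entropy 0, and an evenly mixed two-class node has entropy 1.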

we are not certain

When a node in a decision tree is impure, this means that _____

False

Association Rules (Market Basket) is a predictive data mining approach. (T/F)

Filtering Stemming Tokenization Parsing

In preparing to perform text mining, identify some of the issues one encounters in converting unstructured data to structured data. •Filtering •Stemming •Tokenization •Parsing

Removing common words Removing unique terms

In text mining, the data preprocessing concept known as Filtering refers to _____. •Removing common words •Removing unique terms •Dealing with base words and their related terms •Removal of term suffixes
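The two filtering steps named in this card can be sketched roughly as follows. The stop-word set, the sample documents, and the `min_docs` threshold are all hypothetical choices for illustration; real tools use much larger stop lists.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_terms(docs, stop_words, min_docs=2):
    """Drop common words (stop words) and terms unique to fewer than min_docs documents."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    return [
        [t for t in tokens if t not in stop_words and doc_freq[t] >= min_docs]
        for tokens in tokenized
    ]


docs = ["The cat sat on the mat", "The cat ate the fish", "A dog sat"]
stop_words = {"the", "a", "on"}  # hypothetical stop list
print(filter_terms(docs, stop_words))
```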

Dealing with base words and their related terms

In text mining, the data preprocessing concept known as lemmatization refers to _____.

False

Text mining is the process of applying data mining algorithms to categorical data. (T/F)

False

(T/F)

Overfitting

A classifier that performs well on the training data, but poorly on real or new data is known as _____

True

A corpus is a collection of documents (T/F)

Machine learning

A field of study that gives computers the ability to learn without being explicitly programmed

A comparison between two Odds

An Odds ratio can be interpreted as ______

predictive

Classification is a(n) _____ method

False

Classification trees are used for predicting dependent variables that are continuous in nature, whereas regression trees are used for predicting categorical dependent variables. (T/F)

20%

Consider the transactions in this image. What is the support of the rule {Milk} --> {Sugar}?

30%

Consider the transactions in this image. What is the support of the rule {Bread} --> {Eggs}?

40%

Consider the transactions in this image. What is the support of the rule {Bread} -->{Milk}?

False

Decision Tree is an unsupervised learning algorithm (T/F)

False

Decision tree construction is performed in a bottom-up manner. (T/F)

True

Deep Learning is a subset of Machine Learning (T/F)

True

Entropy provides us the information required to predict an event with certainty. (T/F)

Partitioning the available data

Identify the type of task that a data scientist is performing when he/she is measuring the model's ignorance.

Categorical Nominal Discrete

Identify the types of data used for classification predictions. • categorical • ratio • nominal • discrete

a decision

In a decision tree, each branch represents a _____.

class

In a decision tree, each leaf represents a _____.

When the splitting should stop

In building a decision tree, the stopping rule determines ___________

Measuring the model's ignorance

In classification, when you test a model, you are _______.

False

In interpreting the results of a logistic regression model, when the coefficient is negative, this means that the odds ratio is more than 1. (T/F)
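To see why the statement above is false: the odds ratio for a one-unit increase in a predictor is the exponential of its logistic regression coefficient, so a negative coefficient gives an odds ratio below 1. A minimal sketch, with illustrative coefficient values that are not from any fitted model:

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient: exp(coefficient)."""
    return math.exp(coefficient)


# Illustrative coefficients only.
for coef in (-0.7, 0.0, 0.7):
    print(coef, odds_ratio(coef))
```

A coefficient of 0 corresponds to an odds ratio of exactly 1 (no effect), and a positive coefficient to an odds ratio above 1.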

Binary variable

In logistic regression, the target is a/an ___

Training data

In the classification process, which data partition is usually used in constructing the classification model?

Validation data

In the classification process, which data partition is usually used in fine-tuning and assessing the performance of the classification model?

True

Latent Semantic Analysis is a text mining algorithm used for topic extraction (T/F)

True

Logistic regression forces predicted values to fall between 0 and 1. (T/F)

False

Logistic regression uses a straight line to model the probabilities of the predicted values. (T/F)
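The two logistic regression cards above both follow from the logistic (sigmoid) function, which maps any linear score to a value strictly between 0 and 1 along an S-shaped curve rather than a straight line. A minimal sketch, assuming moderate input values (very large negative inputs would overflow `math.exp` in this simple form):

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))


# The output flattens toward 0 and 1 at the extremes and equals 0.5 at z = 0.
for z in (-10, -1, 0, 1, 10):
    print(z, sigmoid(z))
```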

Synonymy

The use of multiple terms to represent the same concept is known as ____

False

Odds can range from 0 to 1, whereas a probability is a ratio of two probabilities ranging from 0 to infinity (T/F)

Apriori

One of the following is a common algorithm for generating association rules.

Logistic regression

Roger is a data scientist working in a marketing consulting company. He has been asked to build a model that identifies loyal and non-loyal customers for one of their biggest clients. What type of classification model is Roger planning to use?

Support

The measure of relevance of an association rule is called _____

Confidence

The measure of strength of an association rule is called _____
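Support and confidence for a rule A --> B can be computed directly from a list of transactions. The transactions below are hypothetical and are not the ones from the image referenced in the earlier cards; the function names are also made up for this sketch.

```python
def support(transactions, antecedent, consequent):
    """Fraction of all transactions that contain both sides of the rule."""
    items = set(antecedent) | set(consequent)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that also contain the consequent."""
    lhs = set(antecedent)
    both = lhs | set(consequent)
    lhs_count = sum(1 for t in transactions if lhs <= t)
    both_count = sum(1 for t in transactions if both <= t)
    return both_count / lhs_count if lhs_count else 0.0


# Hypothetical market baskets for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "eggs", "milk"},
    {"milk", "sugar"},
    {"bread", "eggs"},
    {"eggs", "sugar"},
]
print(support(transactions, {"bread"}, {"milk"}))     # 2 of 5 baskets
print(confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```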

Text mining

The module in SAS EM that deals with text mining processes is the _____

algorithms that choose the locally optimal solution at each step

The term greedy algorithm refers to _______

True

This association rule A --> B, suggests that "if A occurs, then B occurs" (T/F)

False

This association rule A <-- B, suggests that "if A occurs, then B occurs" (T/F)

Maximum Likelihood Estimation

Unlike linear regression, logistic regression uses ___ to model

Polysemy

When one term relates to multiple concepts, this is known as _____

Consequent

Which of the following is considered the right-hand side of a rule?

Requires training

Which of the following is true of supervised learning?

