Data Mining Week 1

Which of the following IS one of the last two steps in the decision-making process?

Choose an alternative

The father of information theory is _____________.

Claude Shannon

Which of the following is NOT one of the last three steps in a CRISP-DM project?

Data scoring

From the bottom of the data mining pyramid to the top, what is the sequence?

Data-Information-Knowledge-Decisions

What is the purpose of CRISP-DM?

Decision Making

What is the Shannon definition of information?

Shannon defined information as surprise.
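The "surprise" idea can be made concrete with the standard self-information formula, I(x) = log2(1 / p(x)), measured in bits. The following is a minimal sketch; the function name `surprisal_bits` is an illustrative choice and does not come from the course material.

```python
import math


def surprisal_bits(p: float) -> float:
    """Self-information, in bits, of an event that occurs with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    # Rare events (small p) give large values; certain events give zero.
    return math.log2(1.0 / p)


print(surprisal_bits(0.5))       # a fair coin flip
print(surprisal_bits(1.0))       # a certain event carries no surprise
print(surprisal_bits(1 / 1024))  # a rare event is highly informative
```

The less probable the event, the larger the value, which matches the intuition that only unexpected data can teach us something new.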

The person credited with coining the three V's defining Big Data is _________.

Doug Laney

Rattle requires that we click on the ___________ button at each stage of processing.

Execute

The problem-solving process includes all of the decision-making process plus which steps?

Implement the alternative and evaluate the results.

Data is the raw material of the _________.

Information Age

Information is the input for _________.

Knowledge

Which of the following is NOT one of the first three steps in the decision-making process?

Obtain the data

Analytics goes back at least as far as the work of _____________.

Sir Ronald A. Fisher

What made 15th-century printing important?

It spread knowledge and allowed information to travel faster.

Data transformation refers to the process of altering or converting the original data to a different form or scale, often to meet the assumptions of a ___________ analysis, improve interpretability, or enhance the performance of a model.

Statistical

What is data mining?

The process of discovering useful information from data.

What is a sample?

A subset of the population

Data are facts acquired through ______, ________, experimentation and ________.

conveyance, theory, and computation

A numerical variable is discrete if it results from a _________.

count

According to Fayyad et al., which of the following is true regarding the relationship between data mining and knowledge discovery in databases?

data mining is a subset of knowledge discovery in databases

Analytics enables us to begin with ______ and end with _______.

data | knowledge

Why are outliers important?

Their impact on the analysis must be determined.

Knowledge is required for _________.

good decisions

What makes a variable numeric?

if meaningful arithmetic can be performed on it

According to Fayyad et al., which of the following is true regarding the relationship between data mining and model building?

model building is a subset of data mining

According to Fayyad et al., knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable __________.

patterns in data

Managing data is essential to _________.

produce information

Why is the standard deviation important?

It quantifies the spread or dispersion of data points around the mean.
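As a small illustration of what "spread" means here, the sketch below compares two made-up data sets with the same mean but very different dispersion. The data values and the helper name `mean_and_spread` are assumptions chosen for the example, and the population standard deviation is used for simplicity.

```python
import statistics


def mean_and_spread(values):
    """Return (mean, population standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.pstdev(values)


tight = [9, 10, 10, 11, 10]  # values clustered near 10
wide = [2, 18, 10, 1, 19]    # values scattered far from 10

tight_mean, tight_sd = mean_and_spread(tight)
wide_mean, wide_sd = mean_and_spread(wide)

# Both sets have the same mean, so the mean alone cannot tell them apart.
print(tight_mean, wide_mean)
# The standard deviation is much larger for the scattered set.
print(round(tight_sd, 3), round(wide_sd, 3))
```

The mean describes where the data sit; the standard deviation describes how far typical points fall from that center.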

Data is the ____________ of the Information Age.

raw material

The purpose of a model in data mining is to ____________________.

separate the systematic part of the phenomenon from the unsystematic part

DMRR mentions confusion matrices in chapter 2. Which of the following is NOT a term referring to correct predictions?

specificity

The Summary provided by Rattle and R has all of Tukey's five-number summary values plus the ________.

mean
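Tukey's five-number summary consists of the minimum, lower quartile, median, upper quartile, and maximum. The sketch below computes it in Python rather than R; the function name `five_number_summary` and the sample data are illustrative assumptions. Quartile conventions differ between tools, so other software may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    data = sorted(values)
    # The "inclusive" method interpolates between data points and treats
    # the smallest and largest observations as the 0th and 100th percentiles.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(scores))  # (1, 3.0, 5.0, 7.0, 9)
```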

Good decisions generate _________.

successful outcomes

For data to be the source for new knowledge requires ___________.

surprise

Data is _______________.

the new oil

What is indicated when the median is smaller than the mean, or vice versa?

The distribution is skewed, often because there are outliers.

In which direction is a distribution skewed when there are a few very large values?

To the right (positive skew)
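The relationship between the mean, the median, and very large values can be checked with a short sketch. Comparing the mean with the median is only a rule of thumb for the direction of skew, not a formal skewness statistic, and the function name `skew_direction` and the data are assumptions made for illustration.

```python
import statistics


def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"  # a few large values pull the mean upward
    if mean < median:
        return "left"   # a few small values pull the mean downward
    return "none"


with_large_outlier = [1, 2, 3, 4, 100]
with_small_outlier = [1, 97, 98, 99, 100]
symmetric = [1, 2, 3, 4, 5]

print(skew_direction(with_large_outlier))  # right
print(skew_direction(with_small_outlier))  # left
print(skew_direction(symmetric))           # none
```

The median barely moves when one extreme value is added, while the mean shifts toward it, which is why a gap between the two hints at outliers.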

The definition of Big Data's three V's does NOT include which of the following?

variance

