Predictive Analytics Ch. 8

Affinity analysis and the discovery of rule attribute characteristics that often occur together in a given data set are the definition for which data mining tool?

Association

Which of the following are categories of Shmueli, Patel, and Bruce's analytics tools or techniques?

Association, Clustering, Prediction

What does correlation in data mining require?

Calculated probabilities, not certainty.

What are the two types of prediction variables?

Categorical & Continuous

______ tools allow us to predict a class of objects whose label is unknown to us.

Classification

Which of the following are the general types of patterns we would like to search for from Shmueli, Patel, and Bruce's taxonomy of analytics tools?

Clustering & Classification

What position has a job description that includes making sense of mounds of data by probing the data for patterns?

Data scientist

T/F The clustering tool is the most commonly used prediction method in data mining.

False

What is overfitting?

Overfitting occurs when a model includes too many attributes, including some unrelated to the target.

What is nonrivalry?

No one person's use diminishes the value another person can extract from the data.

The four categories of analytics tools available in data mining are _____.

Prediction, Classification, Clustering & Association

What are the five steps identified by SAS for the data mining process?

Sample, Explore, Modify, Model & Assess

What is the definition of the second big data characteristic of velocity?

The rate at which the data arrives.

What is the definition of the first big data characteristic of volume?

The size of the data set.

What does data mining refer to?

The tools and techniques that are used in the large scale, or big data, arena.

What is the definition of the third big data characteristic of variety?

The types of data available.

What is the definition of the fourth big data characteristic of value?

The valuation of data.

What is the job of the data scientist?

To isolate the important patterns in a mass of data so that a firm can take actions to their benefit.

T/F Big data provides a mesh of data in which a few mistakes may not affect the outcomes predicted by the preponderance of data.

True

What is an example of the clustering (segmentation) analysis tool?

University students are identified who have special needs.

Which of these are characteristics of big data?

Value, Volume, Variety

What new mindset is needed to begin data mining using big data?

We need to be open to finding relationships and patterns we never imagined existed in the data we are about to examine.

Which of the following is correct regarding how data mining differs from standard business forecasting?

[1] Business forecasting forecasts seasonality and trends that are known to exist but data mining looks for those patterns that were not known to exist. [2] Business forecasting uses specific models to estimate and forecast known patterns but data mining extracts valuable information from data.

Which of the following are true about clustering (segmentation) analysis tools?

[1] Clustering (segmentation) analysis tools group objects based upon maximizing the intraclass similarity and/or minimizing the interclass similarity. [2] The clusters (segments) are not input by the user.

Which of the following are correct when defining the term "data warehouse"?

[1] Collective information on every aspect of what has happened in the past. [2] A firm's central repository of integrated historical data. [3] The "memory" of the firm.

Data has what characteristics?

[1] Data is used and it creates value. [2] Data has the property of "nonrivalry". [3] Data's full value is rarely extracted with its first use.

Which of the following statements are true regarding data mining and statistical forecasting model terminology?

[1] Data mining calls a forecasting model an algorithm. [2] Data mining calls a dependent variable an output variable or target variable. [3] Data mining calls a forecast a score.

Which of the following would be referred to as the explore activity in a data mining process?

[1] Examining the data graphically. [2] Creating summary statistics of the attributes. [3] Data cleansing

Which of the following are queries that could be used in data mining?

[1] Find all the customers that are likely to purchase recreational vehicle insurance in the next six months. [2] Group all the customers with similar buying habits.

Which of the following statements are true regarding data mining and business forecasting patterns?

[1] In business forecasting, the expectation is that the data will contain some level of variation, whereas in data mining patterns are not pre-specified. [2] Both measure the certainty or trustworthiness associated with the patterns discovered. [3] In data mining you search for different kinds of patterns in parallel, but in business forecasting you search for set patterns.

What are the reasons for sampling/partitioning in data mining?

[1] In most cases, the entire data set is not needed to build a model. [2] It has its roots in the "holdouts" or "holdbacks" used for standard forecasting models. [3] Partitioning and testing for accuracy are standard practice in analytics.
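
As a rough illustration of partitioning, the sketch below randomly splits a data set into a training partition and a holdout (validation) partition. The function name `partition`, the 60/40 split, and the fixed seed are assumptions made for the example, not values taken from the chapter.

```python
import random

def partition(records, train_fraction=0.6, seed=0):
    """Randomly split records into a training set and a holdout set."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(records)       # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical data: ten record IDs.
records = list(range(10))
train, validation = partition(records)
```

The model would be built on `train` only, and its accuracy would then be checked on `validation`, which it has never seen.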

Which of the following correctly describes a lift chart?

[1] It helps determine how effectively the model can reorder the data set. [2] The lift chart and its resulting lift calculation are the standard for accuracy in data mining.
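
One common way to compute lift, sketched here under assumptions, is to sort records by model score and compare the response rate in the top slice with the overall response rate. The scores, the outcomes, and the helper name `lift_at` are invented for illustration.

```python
def lift_at(scores, actuals, fraction):
    """Lift for the top `fraction` of records when sorted by model score."""
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(actual for _, actual in ranked[:top_n]) / top_n
    overall_rate = sum(actuals) / len(actuals)
    return top_rate / overall_rate

# Hypothetical model scores and true outcomes (1 = responder).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
actuals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

top_lift = lift_at(scores, actuals, 0.2)   # top 20% of records
full_lift = lift_at(scores, actuals, 1.0)  # whole data set
```

A lift above 1 for the top slice suggests the model reorders the data better than random selection; over the whole data set the lift is 1 by construction.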

Which of the following would best describe the characteristic of a "data mart"?

[1] It holds information that is specialized and has been grouped or chosen specifically. [2] A subset of a data warehouse.

Which of the following would best describe the receiver operating characteristic (ROC) curve?

[1] It was developed during WWII by radar engineers. [2] It provides a way to compare competing algorithms with a single number. [3] It is another method of explaining lift.
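
The "single number" is commonly the area under the ROC curve (AUC); the card does not name it, so treat that as an assumption. A minimal sketch computes AUC as the share of positive/negative pairs in which the positive record receives the higher score. The data and the function name `auc` are invented for illustration.

```python
def auc(scores, actuals):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for s, a in zip(scores, actuals) if a == 1]
    negatives = [s for s, a in zip(scores, actuals) if a == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical model scores and true outcomes (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
actuals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

model_auc = auc(scores, actuals)
```

An AUC near 0.5 corresponds to random ordering and an AUC near 1.0 to near-perfect ordering, so two competing algorithms can be compared on this one value.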

Data mining has come to be referenced by which of the following terms?

[1] Machine learning [2] Business intelligence, analytics, analysis [3] Data-driven discovery

Which of the following are some of the problems forecasters had previously and are currently having with data?

[1] Previously, forecasters were limited by the lack of data collected intelligently by businesses. [2] Previously, forecasters were limited to few pieces of data and only limited observations on that data existed. [3] Today, forecasters are overwhelmed with too much data.

Which of the following would correctly contrast data mining with database management?

[1] Queries are well defined in database management but less structured in data mining. [2] A query in database management would be "find all customers in Atlanta". In data mining it would be "group all customers with similar buying habits". [3] Data mining is more forward-looking, whereas database management is more past-focused.

Data mining could often be described as which of the following?

[1] Seeking the discovery of new knowledge from the data. [2] Allowing data itself to reveal the patterns within, rather than imposing the patterns on the data.

Which of the following are considered part of the definition of data mining?

[1] The analysis of databases, data warehouses, and data marts. [2] Extraction of useful information from large, often unstructured databases. [3] Extraction of knowledge or information from large amounts of data.

For data mining, the primary goal for the data mining model would be which of the following?

[1] To have accuracy and fit as characteristics of the model. [2] To do a good job of representing our known data set.

The model phase of a data mining process would include which of the following?

[1] To select an appropriate algorithm. [2] To determine the data mining task required. [3] To set the parameters necessary to execute the process.

Which of the following are examples of association rules discovery?

[1] You being given coupons at a grocery store checkout based on your purchasing patterns. [2] You being recommended movies by Netflix based upon movies you have watched in the past.
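
Association rules are usually scored with support and confidence. These are standard measures, but they are not named in these cards. The sketch below computes both for a hypothetical rule "bread implies milk"; the baskets and item names are invented.

```python
# Hypothetical market baskets; item names are illustrative only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
```

Here bread and milk appear together in half of the baskets, and two of the three baskets containing bread also contain milk.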

Which of the following would be scenarios where data mining could be helpful?

[1] You know the characteristics of potential customers that are likely to continue to purchase your products or services. [2] You know which customers are likely to default on their loans.

Association rules discovery is _____.

affinity analysis

The statistical forecasting model term for forecasting model would be called _____ in data mining.

an algorithm

The final step in a data mining process is _____.

assess

Volume, velocity, variety, and value are all characteristics of _____.

big data

The modify step in a data mining process involves _____.

creating, selecting, or transforming data.

Classification tools distinguish between _____.

data classes or concepts

Machine learning is a technique used in _____.

data mining

The data mining term of "score" is known as a(n) _____ in statistical terminology.

forecast

The assess step in a data mining process involves _____.

making a choice among a number of different algorithms chosen as candidates for the analysis.

A(n) _____ in statistical terminology is referred to as a record in data mining terminology.

observation

The first step in a data mining process is _____.

sample

The "iron law of dummy variables" states that _____.

the maximum number of dummy variables must be one less than the number of states of nature.
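
A small sketch of that rule, assuming a categorical attribute with four states (seasons): one state is chosen as the baseline and gets no column, so four states yield three dummy variables. The helper `dummy_encode` and the `is_` column prefix are hypothetical names.

```python
def dummy_encode(values, baseline):
    """Encode a categorical attribute with one dummy per non-baseline state."""
    categories = sorted(set(values) - {baseline})
    rows = [
        {f"is_{category}": int(value == category) for category in categories}
        for value in values
    ]
    return categories, rows

# Hypothetical attribute with four states of nature.
seasons = ["winter", "spring", "summer", "fall", "winter"]
columns, encoded = dummy_encode(seasons, baseline="winter")
```

A baseline record is represented by zeros in every dummy column, which is why a fourth column would be redundant.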

