Data mining Final

Which of the following weights are not typically initialized to random values?

-0.25

Which of the following are general guidelines for choosing a network architecture?

A number of hidden layers, size of hidden layers, number of output nodes

Which of the following algorithms can be used for basket analysis?

A priori, Frequent pattern growth

With sufficient iterations, a neural net can easily overfit the data. To avoid overfitting, all of the following should be done EXCEPT:

Add nodes to capture complexity

Which of the following are characteristics of Average Distance?

Also called average linkage, Distance between two clusters is the average of all possible pair-wise distances, Most popular with Centroid distance measurement

All of the following are examples of data mining paradigms EXCEPT:

Analysis

(blank) is the study of what goes with what.

Association Rules

Benefits of Market Basket Analysis are

Better location of items in store to promote impulse buys, Selection of items for joint promotions and marketing programs, Enable categorization of shopper purpose and motivations

Which of the following are benefits of Market Basket Analysis?

Better location of items, Joint promotions and market programs, Enables categorization

Which of the following options are part of the network structure of neural nets?

Bias values, Weights, nodes, multiple layers.

Big errors lead to ___ changes in weights

Big

The center of a class is known as the:

Centroid

Which of the following are data mining paradigms?

Classification, prediction, association

Discriminant analysis was used for which of the following before data mining?

Classifying organisms into species, skulls, and fingerprint analysis all used discriminant analysis for classification long before data mining.

Which of the following is not a classification technique for discriminant analysis before data mining was invented?

Cluster analysis

What is the percentage of antecedent transactions that also have the consequent item set?

Confidence
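The relationship between support, confidence, and lift on the cards in this set can be sketched in a few lines. This is only an illustration: the transactions are made-up toy data, and the function names are hypothetical rather than taken from any particular library.

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of antecedent transactions that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence divided by the support of the consequent (a ratio, not a difference)."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

# Rule: IF bread THEN milk
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)
```

In this toy data the rule "IF bread THEN milk" has a lift below 1, so knowing that bread was bought does not help in finding milk.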

_________ shows the rate at which consequents will be found

Confidence

A _________ is a visual representation of the cluster hierarchy.

Dendrogram

A net with a single output node and several hidden layers, where g is the identity function, takes the same form as a linear regression model

False

Bias values are subject to iterative adjustment

False

Classification Rules are also called market basket analysis.

False

Confidence is the number or percentage of times this rule occurred over the total transactions.

False

Discriminant Analysis is best suited for large data sets.

False

During the initial pass through a network, the error is propagated back and distributed to the first hidden node and used to update its weight

False

For batch updating all records in the training set are fed to the network after updating takes place

False

For the terms, the "IF" part is known as the consequent and the "THEN" part is known as the antecedent.

False

In a neural net structure, model "coefficients" are tweaked only a few times.

False

In case updating, completion of all records through the network is one source.

False

It is best to use classes with unequal frequencies in your data set

False

Market business analysis uses the frequency of single objects to suggest business rules.

False

Support is the support for the rule over the support for the antecedent.

False

The 3 layers of the network structure are the input, middle, and output layer.

False

The confidence of a rule can be calculated without first knowing the support.

False

The first step of the algorithm for discriminant analysis is to create a classification score that reflects the distance from each class.

False

The goal of clustering is to form groups of different records

False

The goal of the Initial Pass through the Network is to find weights that yield the best errors.

False

The lift ratio shows the rate at which consequents will be found.

False

True or false: A neural network contains multiple layers; these layers include the input layer, the output layer, and the weight layer.

False

The most successful applications in data mining of neural networks have been multilayer feed-backward networks.

False

In order to measure lift, you must subtract the probability of the outcome from the confidence.

False.

True or False: Some disadvantages of neural networks are that they have good predictive ability, can capture complex relationships, and do not require a model to be specified.

False.

Which is NOT part of the Multiple Layers

Horizontal Layer

All of the following are layers in the network structure besides:

Known Layers

To prevent overfitting the data which of the following should be done?

Limit the number of training epochs, do not overtrain data, examine the performance on the validation set.

Cluster analysis can be used for which of the following?

Market Segmentation, Industry Analysis, Market Structure Analysis

In maximum distance clustering, or complete linkage, the distance between two clusters is the __________ distance between pairs of records, one from each cluster.

Maximum
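The linkage definitions in this set (maximum distance here, average distance on the earlier card) can be compared with a small sketch. It assumes Euclidean distance and uses made-up points; the function names are hypothetical.

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distance for every pair with one record from each cluster."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def complete_linkage(cluster_a, cluster_b):
    """Maximum distance: the farthest pair of records defines the cluster distance."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Average distance: the mean of all pairwise distances."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

# Hypothetical two-dimensional records, invented for illustration only.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

d_complete = complete_linkage(cluster_a, cluster_b)
d_average = average_linkage(cluster_a, cluster_b)
```

Because the average of a set of distances can never exceed their maximum, the average linkage distance is always at most the complete linkage distance.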

Which of the following are steps in the algorithm for discriminant analysis?

Measuring distance, Classification functions, converting to probabilities.

Which of the following is an example of cluster analysis?

Periodic table of elements, classification of species, grouping securities in portfolios

What is the goal of the initial pass through network?

To find weights that yield best predictions

A lift value greater than 1 indicates that a rule is useful in finding consequent item sets.

True

A neural network is a model for classification and prediction.

True

Association rules, or Affinity analysis, constitute a study of "what goes with what."

True

Dependent Variable uses categorical variables

True

Discriminant Analysis is a classical statistical technique that was used for classification long before data mining.

True

In a neural network, the error associated with the weights begins to decrease due to thousands of updates being performed

True

In case updating, weights are updated after each record is run through the network

True
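The difference between case updating and batch updating can be sketched for a single linear node. This is a simplified illustration, not a full backpropagation routine: it assumes one output node with no hidden layers and no bias term, a delta-rule style update, and a hypothetical `learning_rate` parameter. Summing the corrections in the batch version is also an assumption.

```python
def predict(weights, inputs):
    """Weighted sum of the inputs for a single linear node."""
    return sum(w * x for w, x in zip(weights, inputs))

def case_update(weights, records, learning_rate=0.1):
    """Case updating: adjust the weights after every single record."""
    for inputs, target in records:
        error = target - predict(weights, inputs)
        # The change is proportional to the error: big errors, big changes.
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    return weights

def batch_update(weights, records, learning_rate=0.1):
    """Batch updating: run all records first, then adjust the weights once."""
    corrections = [0.0] * len(weights)
    for inputs, target in records:
        error = target - predict(weights, inputs)
        corrections = [c + error * x for c, x in zip(corrections, inputs)]
    return [w + learning_rate * c for w, c in zip(weights, corrections)]

# Hypothetical records: (inputs, target), invented for illustration only.
records = [((1.0,), 2.0), ((1.0,), 2.0)]
after_case = case_update([0.0], records, learning_rate=0.5)
after_batch = batch_update([0.0], records, learning_rate=0.5)
```

In the case version the second record already sees the weight corrected by the first, so its error and its update are smaller.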

Inter-cluster distances are maximized between clusters

True

Lift is another metric used to determine the most interesting rules

True

Neural Networks are known to be "black boxes."

True

Small errors leave weights relatively unchanged

True

Step 2 in discriminant analysis is classification functions.

True

The goal of cluster analysis is to form groups of similar records

True

The idea behind neural networks is to combine the input information in a very flexible way that captures complicated relationships among these variables and between them and the response variable.

True

The most successful applications in data mining of neural networks have been multilayer feed-forward networks.

True

The parable of "beer and diapers" has proven there is a trend of when customers buy diapers they also buy beer.

True

Training the model means estimating the weights that lead to the best possible predictive results.

True

When using logistic regression you can use either categorical variables or interval variables.

True

Clustering is used for segmenting markets into groups of similar customers

True

The following are all types of layers in a network structure except:

Weight layer

A net with a single output node and no hidden layers, where g is the identity function, takes the same form as

a linear regression model
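A short sketch shows why this holds. With no hidden layers, the output node computes g(bias + sum of weight times input); when g is the identity, that is an intercept plus a weighted sum of the predictors. The function name and the numbers below are hypothetical.

```python
def single_node_output(inputs, weights, bias, g=lambda s: s):
    """Output of one node fed directly by the inputs: g(bias + sum(w_i * x_i)).

    With the default identity g, this is the linear regression form
    y = intercept + b_1 * x_1 + ... + b_p * x_p.
    """
    return g(bias + sum(w * x for w, x in zip(weights, inputs)))

# Hypothetical values: the bias plays the role of the intercept and the
# weights play the role of the regression coefficients.
y_hat = single_node_output(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=3.0)
```

Passing a different g, such as a logistic function, changes the output to a nonlinear transformation of the same weighted sum.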

Confidence is the percent of ________ transactions that also have the consequent item set

antecedent

The center of a class is called a(n):

centroid

Discriminant analysis tends to be considered more of a ____________ method than a data mining method.

statistical classification

Which of the following are Association Rules?

Study of "what goes with what", transaction-based or event-based

Using the Euclidean distance has ____ drawbacks?

three

Which criteria should be used to stop updating the neural network?

When the weight change is negligible, when the misclassification rate reaches a required threshold, when a limit on runs is reached

