BUAD 2070 Final Exam - Chapter 4
An analysis of items frequently co-occurring in transactions (such as purchases) is known as _____.
market basket analysis
Which of the following is true of bottom-up hierarchical clustering?
It starts with each observation in its own cluster and then iteratively combines the two most similar clusters.
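A minimal sketch of the bottom-up procedure, assuming one-dimensional observations and single linkage as the similarity measure. The function name `agglomerate`, the stopping parameter `target_clusters`, and the sample values are illustrative assumptions, not from the course material.

```python
def agglomerate(points, target_clusters):
    """Bottom-up clustering of 1-D points using single linkage (illustrative)."""
    clusters = [[p] for p in points]  # each observation starts in its own cluster
    while len(clusters) > target_clusters:
        best = None  # (distance, i, j) of the closest pair of clusters found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the two closest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the two most similar clusters
        del clusters[j]
    return clusters

result = agglomerate([1.0, 1.5, 5.0, 5.2, 9.0], 3)
print(result)  # [[1.0, 1.5], [5.0, 5.2], [9.0]]
```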
Jaccard's coefficient is different from the matching coefficient in that the former:
does not count matching zero entries while the latter does.
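The difference can be checked on a small example. This sketch assumes each observation is a list of 0/1 values; the function names and the vectors `u` and `v` are made up for illustration.

```python
def matching_coefficient(u, v):
    # share of variables on which the two observations agree (1-1 and 0-0 both count)
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    # same idea, but matching zero entries are left out of the count entirely
    # (undefined when every entry is a matching zero)
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    return both_one / (len(u) - both_zero)

u = [1, 0, 0, 1, 0]
v = [1, 0, 0, 0, 0]
print(matching_coefficient(u, v))  # 0.8
print(jaccard_coefficient(u, v))   # 0.5
```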
Single linkage measures dissimilarity between two clusters by considering:
only the two closest observations in these clusters.
The lift ratio of an association rule with a confidence value of 0.88 and in which the consequent occurs in 60 out of 100 cases is:
1.47
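The arithmetic behind this answer: lift ratio = confidence / support of the consequent = 0.88 / (60/100). A short check, with `lift_ratio` as an illustrative helper name:

```python
def lift_ratio(confidence, consequent_count, total_transactions):
    # support of the consequent = share of all transactions that contain it
    consequent_support = consequent_count / total_transactions
    return confidence / consequent_support

lift = lift_ratio(0.88, 60, 100)
print(round(lift, 2))  # 1.47
```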
_____ is the vector of the averages computed for each variable across all cluster observations.
Centroid
_____ is a category of data-mining techniques that detect patterns and relationships in the data.
Descriptive data-mining
K-means clustering is the process of
organizing observations into one of k groups based on a measure of similarity.
_____ measures dissimilarity between two clusters by using the distance between the two cluster centroids.
Centroid linkage
_____ measures dissimilarity between two clusters by considering only the two most distant observations in these clusters.
Complete linkage
Which of the following reasons is responsible for the increase in the use of data-mining techniques in business?
The ability to electronically warehouse data
The data-mining method that can be used in market segmentation to divide consumers into different homogeneous groups is _____.
cluster analysis
A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a _____.
dendrogram
The simplest measure of similarity between observations consisting solely of categorical variables is given by _____.
matching coefficient
The endpoint of a k-means clustering algorithm occurs when:
no further changes are observed in cluster structure and number.
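A minimal sketch of k-means on one-dimensional data, showing the stopping rule: the loop ends once an iteration leaves every cluster assignment unchanged. The function name `k_means_1d`, the starting centroids, and the data are assumptions for illustration; real implementations usually pick starting centroids at random.

```python
def k_means_1d(values, initial_centroids, max_iter=100):
    centroids = list(initial_centroids)  # copy so the caller's list is not modified
    assignments = None
    for _ in range(max_iter):
        # assign each observation to its nearest centroid
        new_assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        if new_assignments == assignments:
            break  # no change in cluster structure: the algorithm has converged
        assignments = new_assignments
        # recompute each centroid as the mean of its members
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments, centroids

labels, centers = k_means_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [1.0, 2.0])
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centers)  # [2.0, 11.0]
```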
An observation refers to the
set of recorded values of variables associated with a single entity.
A _____ refers to the number of times that a collection of items occurs together in a transaction data set.
support count
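As a quick illustration, a support count can be tallied by checking each transaction for the full item set. The transactions and the helper name `support_count` below are made up for the example.

```python
def support_count(transactions, itemset):
    # number of transactions that contain every item in the item set
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

transactions = [
    {"bread", "milk"},
    {"bread", "jelly", "peanut butter"},
    {"bread", "milk", "jelly"},
    {"milk"},
    {"bread", "jelly"},
]
count = support_count(transactions, {"bread", "jelly"})
print(count)  # 3
```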
Average group linkage measures dissimilarity between two clusters by considering:
the average distance over all pairs of observations between these clusters.
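The linkage measures in this set (single, complete, centroid, and the all-pairs average) can be compared side by side. This is a sketch assuming Euclidean distance and two small made-up clusters `a` and `b`; the function names are illustrative, and `math.dist` requires Python 3.8 or later.

```python
import math

def single_linkage(a, b):
    # distance between the two closest observations
    return min(math.dist(p, q) for p in a for q in b)

def complete_linkage(a, b):
    # distance between the two most distant observations
    return max(math.dist(p, q) for p in a for q in b)

def average_linkage(a, b):
    # average distance over all pairs of observations between the clusters
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def centroid(cluster):
    # vector of per-variable averages across the cluster's observations
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def centroid_linkage(a, b):
    # distance between the two cluster centroids
    return math.dist(centroid(a), centroid(b))

a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (5.0, 0.0)]
print(single_linkage(a, b))
print(complete_linkage(a, b))
print(average_linkage(a, b))
print(centroid_linkage(a, b))
```

For any pair of clusters, the single-linkage value is never larger than the all-pairs average, which in turn is never larger than the complete-linkage value.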
In the theory of association rules in data mining, by confidence we mean an estimated probability that
the consequent occurs given that the antecedent occurs.
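In terms of counts, confidence = (transactions containing both antecedent and consequent) / (transactions containing the antecedent). A small sketch with made-up transactions; `confidence` and `baskets` are illustrative names.

```python
def confidence(transactions, antecedent, consequent):
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    # transactions containing the antecedent
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    # transactions containing the antecedent and the consequent together
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_antecedent

baskets = [
    {"bread", "milk"},
    {"bread", "jelly", "peanut butter"},
    {"bread", "milk", "jelly"},
    {"milk"},
    {"bread", "jelly"},
]
conf = confidence(baskets, {"bread"}, {"jelly"})
print(conf)  # 0.75
```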
If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two observations?
the hypotenuse
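For two observations with two variables each, the differences in each variable form the legs of the right triangle, and the Euclidean distance is the hypotenuse. A short check with made-up points `u` and `v`:

```python
import math

u = (1.0, 2.0)
v = (4.0, 6.0)

dx = v[0] - u[0]  # one leg of the triangle: 3.0
dy = v[1] - u[1]  # the other leg: 4.0

# the hypotenuse is the Euclidean distance between u and v
distance = math.hypot(dx, dy)
print(distance)  # 5.0
```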