Stats 2 Final Chapter 4
Descriptive data-mining
A category of data mining techniques that detect patterns and relationships in the data
dendrogram
A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a
market basket analysis
An analysis of items frequently co-occurring in transactions (such as purchases) is known as
the hypotenuse
If the Euclidean distance between two observations were represented in a right triangle, which side would be considered the distance between the two observations
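A minimal sketch of the hypotenuse idea, assuming two made-up observations measured on two numeric variables (the names `obs_a`, `obs_b`, and `euclidean_distance` are illustrative only):

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two observations with numeric variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical observations (not from the course material).
obs_a = (1.0, 2.0)
obs_b = (4.0, 6.0)

# The legs of the right triangle are the per-variable differences (3 and 4);
# the hypotenuse is the Euclidean distance between the observations.
distance = euclidean_distance(obs_a, obs_b)
print(distance)  # 5.0
```

With more than two variables the same formula applies; the triangle picture is just the two-variable case.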
the consequent occurs given that the antecedent occurs
In association rules in data mining, confidence is the estimated probability that
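As a rough illustration of confidence, the sketch below estimates it as (transactions containing both antecedent and consequent) / (transactions containing the antecedent). The transaction data and the function names `support_count` and `confidence` are invented for this example.

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from transaction counts."""
    n_antecedent = support_count(transactions, antecedent)
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs in the transactions")
    n_both = support_count(transactions, set(antecedent) | set(consequent))
    return n_both / n_antecedent

# Hypothetical market baskets (assumed data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "jelly", "milk"},
    {"bread", "jelly"},
    {"milk"},
]

# Rule: if bread, then milk. Bread appears in 3 baskets, bread and milk in 2.
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_confidence)  # 2/3, about 0.667
```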
This is true of bottom-up hierarchical clustering
It starts with each observation in its own cluster and then iteratively combines the two most similar clusters
The set of recorded values of variables associated with a single entity
An observation refers to the
Cluster Analysis
The data mining method that can be used in market segmentation to divide consumers into different homogeneous groups is
matching coefficient
The simplest measure of similarity between observations consisting solely of categorical variables is given by
The ability to electronically warehouse data
This reason is responsible for the increase in the use of data-mining techniques in business
Jaccard's coefficient is different from the matching coefficient in that the former
does not count matching zero entries while the latter does.
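To make the difference concrete, here is a small sketch for observations made of 0/1 (binary categorical) variables. The data and function names are assumptions for illustration, and returning 0.0 when both observations are all zeros is a convention chosen here, not something the notes specify.

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Like the matching coefficient, but 0-0 matches are ignored entirely."""
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    denominator = len(u) - both_zero
    # Assumption: treat the all-zero case as similarity 0.0 to avoid dividing by zero.
    return both_one / denominator if denominator else 0.0

# Hypothetical observations on five binary variables.
obs_a = [1, 0, 0, 1, 0]
obs_b = [1, 0, 0, 0, 0]

matching = matching_coefficient(obs_a, obs_b)  # 4 agreements out of 5
jaccard = jaccard_coefficient(obs_a, obs_b)    # 1 shared "1" out of 2 non-zero-zero variables
print(matching, jaccard)  # 0.8 0.5
```

The three shared zeros raise the matching coefficient but have no effect on Jaccard's coefficient.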
Complete linkage
measures dissimilarity between two clusters by considering only the two most distant observations in these clusters
Centroid linkage
measures dissimilarity between two clusters by using the distance between the two cluster centroids
The endpoint of a k-means clustering algorithm occurs when
no further changes are observed in cluster structure and number
Single linkage measures dissimilarity between two clusters by considering
only the two closest observations in these clusters
The k-means clustering is the process of
organizing observations into one of k groups based on a measure of similarity.
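The k-means cards above can be sketched roughly as follows. This is a simplified version under stated assumptions: Euclidean distance as the similarity measure, the first k observations as starting centroids (real software usually initializes more carefully), and made-up data. It stops when cluster assignments no longer change.

```python
import math

def mean_vector(points):
    """Centroid: the average of each variable across the given observations."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def k_means(points, k, max_iter=100):
    # Assumption: naive initialization using the first k observations.
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assign each observation to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Endpoint: no observation changed cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Recompute each centroid from its current members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = mean_vector(members)
    return assignments, centroids

# Hypothetical observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
assignments, centroids = k_means(points, k=2)
print(assignments)  # [0, 0, 1, 1, 0, 1]
```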
Average group linkage measures dissimilarity between two clusters by considering
the average distance over all pairs of observations between these clusters
Centroid
the vector of the averages computed for each variable across all cluster observations
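The centroid and the linkage definitions on the cards above can be compared side by side on a toy example. This is only a sketch: the two clusters are invented, Euclidean distance is assumed as the dissimilarity measure, and the function names are illustrative. The all-pairs average follows the wording of the average group linkage card.

```python
import math
from itertools import product

def centroid(cluster):
    """Vector of per-variable averages across all observations in the cluster."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def pairwise_distances(c1, c2):
    """Distances for every pair with one observation from each cluster."""
    return [math.dist(p, q) for p, q in product(c1, c2)]

def single_linkage(c1, c2):
    return min(pairwise_distances(c1, c2))   # two closest observations

def complete_linkage(c1, c2):
    return max(pairwise_distances(c1, c2))   # two most distant observations

def average_pairwise_linkage(c1, c2):
    distances = pairwise_distances(c1, c2)
    return sum(distances) / len(distances)   # average over all pairs

def centroid_linkage(c1, c2):
    return math.dist(centroid(c1), centroid(c2))  # distance between centroids

# Hypothetical clusters with two observations each.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

print(centroid(cluster_a))                             # (0.0, 1.0)
print(single_linkage(cluster_a, cluster_b))            # 3.0
print(complete_linkage(cluster_a, cluster_b))          # sqrt(13), about 3.606
print(average_pairwise_linkage(cluster_a, cluster_b))  # between the two values above
print(centroid_linkage(cluster_a, cluster_b))          # 3.0
```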