BA Exam 3 Ch 4

Which of the following is true of Euclidean distances?

It is commonly used as a method of measuring dissimilarity between quantitative observations.

Which statement is true of an association rule?

It is ultimately judged on how actionable it is and how well it explains the relationship between item sets.

Jaccard's coefficient is different from the matching coefficient in that the former

does not count matching zero entries while the latter does.
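The difference between the two coefficients can be sketched in a few lines. This is a minimal illustration, not taken from the study set: the function names and the two sample 0-1 vectors are hypothetical, and returning 0.0 when both vectors are all zeros is an assumption (the Jaccard coefficient is undefined in that case).

```python
def matching_coefficient(u, v):
    """Share of positions where two 0-1 vectors agree (1-1 and 0-0 both count)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)


def jaccard_coefficient(u, v):
    """Like the matching coefficient, but 0-0 matches are left out entirely."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    denominator = len(u) - both_zero
    # Assumption: report 0.0 when every position is a 0-0 match.
    return both_one / denominator if denominator else 0.0


# Hypothetical observations encoded as dummy variables.
u = [1, 0, 0, 1, 0]
v = [1, 0, 0, 0, 0]

print(matching_coefficient(u, v))  # 0.8 (4 of 5 positions agree)
print(jaccard_coefficient(u, v))   # 0.5 (1 shared 1 out of 2 non-0-0 positions)
```

The three shared zeros raise the matching coefficient but are ignored by Jaccard's coefficient, which is why the two values differ.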

The lift ratio of an association rule with a confidence value of 0.45 and in which the consequent occurs in 6 out of 10 cases is

0.75
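The arithmetic behind this answer is the rule's confidence divided by the support of the consequent (6 / 10 = 0.6). A small sketch, with a hypothetical function name:

```python
def lift_ratio(confidence, consequent_count, total_count):
    """Lift ratio = confidence / support of the consequent."""
    consequent_support = consequent_count / total_count
    return confidence / consequent_support


lift = lift_ratio(0.45, 6, 10)
print(round(lift, 2))  # 0.75
```

A lift ratio below 1 suggests the antecedent makes the consequent less likely than it is in the data overall.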

Suppose we had a data set from a call center where customers were asked to choose between the following three options: hear account information, billing questions, and customer service. Using the given order of the three options, and using 0-1 dummy variables to encode the categorical variables, which of the following combinations would yield an entry "customer service"?

001
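One way to see why the answer is 001: each option gets its own 0-1 dummy variable, in the stated order, and only the chosen option is set to 1. The sketch below assumes that one-dummy-per-option scheme; the names `OPTIONS` and `encode_choice` are hypothetical.

```python
# Options in the order given by the question.
OPTIONS = ["hear account information", "billing questions", "customer service"]


def encode_choice(choice, options=OPTIONS):
    """Return the 0-1 dummy encoding of a choice as a string such as '001'."""
    if choice not in options:
        raise ValueError(f"unknown option: {choice!r}")
    return "".join("1" if option == choice else "0" for option in options)


print(encode_choice("customer service"))  # 001
```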

The strength of a cluster can be measured by comparing the average distance within a cluster to the distance between cluster centroids. One rule of thumb is that the ratio of between-cluster distance to within-cluster distance should exceed what value for useful clusters?

1

Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer who spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer who spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.

75.39
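The answer follows from the square root of the summed squared differences: sqrt((53 - 25)^2 + (420 - 350)^2) = sqrt(784 + 4900) = sqrt(5684). A short check using the vectors u = (25, 350) and v = (53, 420), with a hypothetical function name:

```python
import math


def euclidean_distance(u, v):
    """Square root of the sum of squared coordinate differences."""
    if len(u) != len(v):
        raise ValueError("observations must have the same number of variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


distance = euclidean_distance((25, 350), (53, 420))
print(round(distance, 2))  # 75.39
```

Because spending is on a much larger scale than age, it dominates the result; this is one reason variables are often standardized before clustering.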

__________ is a method of calculating dissimilarity between clusters by considering only the two most dissimilar observations in the two clusters.

Complete linkage

An analysis of items frequently co-occurring in transactions is known as

Market basket analysis

When clustering only by dummy variables that represent categorical variables, the simplest measure of similarity between two observations is called the

Matching coefficient

A collection of text documents to be analyzed is called a ___________.

corpus

k-means clustering is the process of

organizing observations into distinct groups based on a measure of similarity.

Suppose the dissimilarity between clusters A and C has the value 24 and the dissimilarity between clusters B and C has the value 12. Use McQuitty's method to determine the dissimilarity between cluster C and the cluster formed by merging A and B.

18. Using McQuitty's method, the dissimilarity between cluster C and the merged cluster AB is calculated as the average of the dissimilarity between A and C and the dissimilarity between B and C. The calculated value is (12 + 24) / 2 = 18.
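The calculation on this card is a plain average of the two given dissimilarities, 24 and 12. A minimal sketch with hypothetical function and parameter names:

```python
def mcquitty_dissimilarity(first_dissimilarity, second_dissimilarity):
    """McQuitty's update as used on the card: average the two dissimilarities."""
    return (first_dissimilarity + second_dissimilarity) / 2


print(mcquitty_dissimilarity(24, 12))  # 18.0
```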

A cluster's __________ can be measured by the difference between the distance value at which a cluster is originally formed and the distance value at which it is merged with another cluster in a dendrogram.

Durability

Complete linkage can be used to measure the distance between clusters that are the __________ in cluster analysis.

Most different

Single linkage can be used to measure the distance between clusters that are the __________ in cluster analysis.

Most similar
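Complete and single linkage differ only in which pair of observations they pick: the farthest pair or the closest pair across the two clusters. The sketch below uses two small made-up clusters of 2-D points (not from the study set) and Euclidean distance via `math.dist`, which requires Python 3.8 or later.

```python
import math
from itertools import product


def single_linkage(cluster_1, cluster_2):
    """Distance between the two most similar (closest) observations."""
    return min(math.dist(p, q) for p, q in product(cluster_1, cluster_2))


def complete_linkage(cluster_1, cluster_2):
    """Distance between the two most different (farthest) observations."""
    return max(math.dist(p, q) for p, q in product(cluster_1, cluster_2))


# Hypothetical clusters for illustration only.
cluster_a = [(0, 0), (0, 1)]
cluster_b = [(3, 0), (4, 0)]

print(single_linkage(cluster_a, cluster_b))              # 3.0
print(round(complete_linkage(cluster_a, cluster_b), 3))  # 4.123
```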

If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two observations of a cluster?

The hypotenuse

In the text mining process, the text is first preprocessed by deriving a smaller set of _________ from the larger set of words contained in a collection of documents.

Tokens

__________ can be used to partition observations in a manner to obtain clusters with the least amount of information loss due to the aggregation.

Ward's method

In preparing categorical variables for analysis, it is usually best to

convert the categories to binary, dummy variables.

