Business Analytics Ch. 8


Typically, a silhouette score ranges between _____.

+1 and −1

Arrange the steps involved in running k-means clustering in the correct order of occurrence. (Place the first step at the top.)

1. The analyst decides on k initial subgroups to experiment with the number of clusters.
2. The k-means clustering algorithm randomly assigns observations to clusters.
3. The k-means clustering algorithm calculates cluster centroids.
4. The k-means algorithm reassigns observations based on their proximity to the cluster centroid.
5. The k-means clustering algorithm calculates an overall group mean and assigns observations to the group they are closest to based on their features.
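The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`kmeans`, `centroid`, `sq_dist`), the sample points, the fixed seed, and the handling of empty clusters are all made up for the example. It starts from a random assignment, as the card describes; many libraries start from randomly chosen centroids instead.

```python
import random

def centroid(points):
    """Mean of each coordinate across a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 2: randomly assign each observation to one of k clusters.
    labels = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        # Step 3: compute the centroid (mean) of each cluster.
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Assumption: an empty cluster is reseeded with a random point.
            centroids.append(centroid(members) if members else rng.choice(points))
        # Step 4: reassign each observation to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Step 5: repeat until the assignments stop changing.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical data: two well-separated groups of three observations each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Step 1 (choosing k) stays with the analyst: the sketch takes `k` as an argument and would be rerun for each candidate value.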

market

A group of people who come together to buy and sell products and/or services is known as a(n) _____.

Identify the commonly used approaches to hierarchical clustering. (Check all that apply.)

Agglomerative clustering
Divisive clustering

Match the ways to measure similarity between observations with hierarchical clustering (in the left column) with their descriptions (in the right column).

Euclidean distance: This function measures the distance between two observations as the true straight line distance between two points.
Manhattan distance: In this function, the distance between two points is not straight, rather it is a path with right turns as if one is walking a grid in a city.
Matching coefficient: This method measures the similarity between two observations with values that represent the minimum differences between two points.
Jaccard's coefficient: This method measures the similarity between two observations based on how dissimilar they are from each other.
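A short Python sketch may make the four measures more concrete. The function names and sample values are hypothetical. For the matching and Jaccard coefficients, the code assumes the standard definitions for binary (0/1) attributes, which the card's descriptions only summarize loosely: the matching coefficient counts every agreement, while Jaccard's coefficient ignores attributes where both observations are 0.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two numeric observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of absolute differences per variable."""
    return sum(abs(x - y) for x, y in zip(a, b))

def matching_coefficient(a, b):
    """Share of binary attributes on which the two observations agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def jaccard_coefficient(a, b):
    """Shared 1s divided by attributes where at least one is 1 (0-0 ignored)."""
    both = sum(x == 1 and y == 1 for x, y in zip(a, b))
    either = sum(x == 1 or y == 1 for x, y in zip(a, b))
    return both / either if either else 0.0

# Hypothetical numeric observations.
print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7

# Hypothetical binary observations (e.g. bought / did not buy four products).
print(matching_coefficient((1, 0, 1, 0), (1, 1, 0, 0)))  # 0.5
print(jaccard_coefficient((1, 0, 1, 0), (1, 1, 0, 0)))
```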

True or false: Hierarchical clustering is similar to k-means clustering in that it can work only with numerical data.

False

average linkage

In hierarchical clustering, the ___ ___ method of linking individual observations both within and between clusters defines similarity by the group average of observations from one cluster to all observations from another cluster.

silhouette score

In the context of running k-means clustering, ___ ___ is a way to identify the optimal number of clusters for the data by determining the average distance between each observation in the cluster and the cluster centroid.
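As a rough sketch of how such a score can be computed: the card's wording is a simplification, and the code below assumes the commonly used definition instead. For each observation, a is its average distance to the other members of its own cluster and b is its average distance to the members of the nearest other cluster; the observation's score is (b − a) / max(a, b), which is why the result falls between −1 and +1. The function name, the sample data, and the convention of scoring single-member clusters as 0 are assumptions for illustration.

```python
import math

def silhouette_score(points, labels):
    """Average silhouette over all observations (needs at least 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for single-member clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it adds nothing to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical data: two tight, well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])  # sensible grouping
bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])   # groups mixed together
print(good, bad)
```

To pick the number of clusters, the score would be computed for each candidate k and the highest value preferred.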

Ward's method

In the context of the four methods of measuring distances with hierarchical clustering, which of the following methods involves choosing two clusters to combine based on which combination minimizes the within-cluster sum of squares across all clusters?

Which of the following is true of divisive clustering as a common approach to hierarchical clustering?

It begins with a single cluster containing all observations (for example, 100) and ends with each observation in its own cluster (100 different clusters).

Which of the following is true of issues that need to be considered when running k-means clustering?

It can be applied only to numerical data.

Which of the following is true of the use of cluster analysis in practice?

It enables marketers to identify hidden structures and patterns in data.

Which of the following is true of cluster analysis?

It explores different types of relationships using algorithms, and then develops smaller groups from larger populations based on similar attributes.

Identify an issue that needs consideration when running k-means clustering.

It should be executed with data that has been standardized using either min-max or z-scores.
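The two standardization options named on this card can be sketched as follows. The function names and the income figures are hypothetical, and the z-score version assumes the population standard deviation (the sample standard deviation is also common). Neither function handles a column whose values are all identical, which would cause a division by zero.

```python
import statistics

def min_max(values):
    """Rescale values to the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical variable on a large scale that would otherwise dominate
# distance calculations against, say, an age variable.
income = [30000, 45000, 60000, 90000]
print(min_max(income))   # [0.0, 0.25, 0.5, 1.0]
print(z_scores(income))
```

Without this step, a variable measured in tens of thousands would outweigh one measured in tens when distances are computed.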

Which of the following is true of issues that need to be considered when executing hierarchical clustering?

It should be executed with standardized data developed using either min-max or z-scores.

How does clustering help marketers? (Check all that apply.)

Its results enable companies to develop targeted marketing campaigns and tactics.
It helps marketers identify hidden structures and patterns in the data.

cluster analysis

Segmenting a market using shared characteristics is known as ___ ___.

Identify a true statement about the ways to measure similarity between observations with hierarchical clustering.

Similarity is most often measured for numerical variables using the Euclidean distance or Manhattan distance.

Which of the following are true of agglomerative clustering as a common approach to hierarchical clustering? (Check all that apply.)

Smaller clusters are merged into larger clusters using a linkage method. At the end of this process, all observations are included in a single cluster.

market segmentation

The process of dividing markets into subgroups is called _____.

In the context of measuring distances with hierarchical clustering, ___ ___ applies a measure of the sum of squares within the clusters summed over all variables.

Ward's method

K-means clustering

___ ___ ___ is a type of cluster analysis that uses the mean value for each cluster and minimizes the distance to individual observations.

Hierarchical clustering

___ ___ is a method of clustering that produces solutions in which the data is grouped into a hierarchy of clusters; individual observations are combined into subgroups using a measure of distance between observations.

In the context of the methods of linking individual observations both within and between clusters in hierarchical clustering, the ______ method defines similarity by the shortest distance from an object in a cluster to an object in another cluster and the _____ method defines similarity by the maximum distance between observations in two different clusters.

single linkage; complete linkage
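The linkage methods differ only in how they summarize the pairwise distances between two clusters, which a few lines of Python can show. The function names and the two sample clusters are hypothetical, and Euclidean distance is assumed as the underlying measure. Average linkage is included for comparison.

```python
import math
from itertools import product

def pairwise(c1, c2):
    """All distances between a point in c1 and a point in c2."""
    return [math.dist(p, q) for p, q in product(c1, c2)]

def single_linkage(c1, c2):
    """Shortest distance between any two points in different clusters."""
    return min(pairwise(c1, c2))

def complete_linkage(c1, c2):
    """Largest distance between any two points in different clusters."""
    return max(pairwise(c1, c2))

def average_linkage(c1, c2):
    """Mean of all between-cluster pairwise distances."""
    d = pairwise(c1, c2)
    return sum(d) / len(d)

# Hypothetical clusters lying on a line, so the distances are easy to check.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(4.0, 0.0), (6.0, 0.0)]
print(single_linkage(a, b))    # 3.0
print(complete_linkage(a, b))  # 6.0
print(average_linkage(a, b))   # 4.5
```

In agglomerative clustering, the pair of clusters with the smallest linkage value would be merged at each step.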

