Analytics Exam 2
_____ refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
Interaction Rationale: Interaction refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
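A minimal sketch of interaction, assuming a hypothetical model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 with made-up coefficients (they come from no data set). With the product term in the model, the change in y per unit of x1 depends on the value of x2:

```python
# Hypothetical coefficients chosen only for illustration.
b0, b1, b2, b3 = 10.0, 2.0, 1.0, 0.5

def predict(x1, x2):
    """Prediction from a model that includes the interaction term x1 * x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_of_x1(x2):
    """Change in y for a one-unit increase in x1, holding x2 fixed."""
    return predict(1, x2) - predict(0, x2)

slope_at_x2_0 = slope_of_x1(0)  # equals b1
slope_at_x2_4 = slope_of_x1(4)  # equals b1 + b3 * 4
print(slope_at_x2_0, slope_at_x2_4)
```

If b3 were 0, both slopes would be equal and there would be no interaction.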
Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer that spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer that spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.
75.39 Rationale: The Euclidean distance between these two observations is calculated using the formula d(u, v) = sqrt((25 - 53)^2 + (350 - 420)^2) = sqrt(784 + 4,900) = sqrt(5,684) ≈ 75.39.
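The calculation can be checked with a short sketch. The function name euclidean_distance is a hypothetical helper; the vectors u = (25, 350) and v = (53, 420) are taken from the question:

```python
import math

def euclidean_distance(u, v):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

u = (25, 350)  # age 25, spent $350
v = (53, 420)  # age 53, spent $420

distance = euclidean_distance(u, v)
print(round(distance, 2))  # 75.39
```

Note that the dollar amounts dominate the result because they are on a larger scale than age; no standardization is applied here.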
Complete linkage can be used to measure the distance between _____ in cluster analysis.
clusters Rationale: Complete linkage measures the dissimilarity between two clusters by considering only the two most dissimilar observations, one from each cluster.
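As a rough sketch, complete linkage is the largest pairwise distance between an observation in one cluster and an observation in the other. The clusters below are made up, Euclidean distance is assumed as the dissimilarity measure, and complete_linkage is a hypothetical helper name:

```python
import math

def complete_linkage(cluster_a, cluster_b):
    """Distance between the two most dissimilar observations across clusters."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)

# Hypothetical clusters of 2-D observations.
cluster_a = [(0, 0), (1, 0)]
cluster_b = [(4, 0), (5, 0)]

linkage = complete_linkage(cluster_a, cluster_b)
print(linkage)  # 5.0, the distance from (0, 0) to (5, 0)
```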
The data preparation technique used in market segmentation to divide consumers into different homogeneous groups is called _____.
Cluster Analysis Rationale: Cluster analysis can be used as a data preparation step to identify variables or observations that can be aggregated or removed from consideration. It is commonly used in marketing to divide consumers into different homogeneous groups, a process known as market segmentation.
Which statement is true of an association rule?
It is ultimately judged on how actionable it is and how well it explains the relationship between item sets. Rationale: An association rule is ultimately judged on how actionable it is and how well it explains the relationship between item sets.
Which of the following regression models is used to model a nonlinear relationship between the independent and dependent variables by including the independent variable and the square of the independent variable in the model?
Quadratic regression model Rationale: A quadratic regression model is a regression model in which a nonlinear relationship between the independent and dependent variables is fit by including the independent variable and the square of the independent variable in the model.
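One way to see that a quadratic regression is still a least squares fit, just on the columns 1, x, and x^2, is the sketch below. The data are synthetic (generated from y = 1 + 2x + 3x^2 with no noise), and det3 and fit_quadratic are hypothetical helper names. The 3x3 normal equations are solved with Cramer's rule only to keep the example dependency-free:

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_quadratic(x, y):
    """Least squares estimates (b0, b1, b2) for y = b0 + b1*x + b2*x^2."""
    # Sums of powers of x (x^0 .. x^4) form the matrix X'X.
    s = [sum(xi ** p for xi in x) for p in range(5)]
    # Sums of y * x^p (p = 0 .. 2) form the vector X'y.
    t = [sum(yi * xi ** p for xi, yi in zip(x, y)) for p in range(3)]
    a = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    d = det3(a)
    coefficients = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = t[r]  # Cramer's rule: replace one column with X'y
        coefficients.append(det3(m) / d)
    return coefficients

# Synthetic, noise-free data from a known quadratic.
x = [-2, -1, 0, 1, 2]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]

b0, b1, b2 = fit_quadratic(x, y)
print(b0, b1, b2)
```

Because the data lie exactly on the curve, the fit recovers the generating coefficients; with real, noisy data the estimates would only approximate them.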
A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a _____.
dendrogram Rationale: A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a dendrogram.
The strength of the association rule is known as _____ and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.
lift Rationale: The strength of the association rule is known as lift and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.
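A small sketch of the lift calculation, assuming the usual definitions: confidence is the support count of the antecedent and consequent together divided by the support count of the antecedent, and benchmark confidence is the support count of the consequent divided by the total number of transactions. The counts are made up and lift is a hypothetical helper name:

```python
def lift(count_both, count_antecedent, count_consequent, n_transactions):
    """Ratio of a rule's confidence to its benchmark confidence."""
    confidence = count_both / count_antecedent
    benchmark_confidence = count_consequent / n_transactions
    return confidence / benchmark_confidence

# Hypothetical counts: 5 transactions, antecedent in 4, consequent in 2, both in 2.
rule_lift = lift(count_both=2, count_antecedent=4,
                 count_consequent=2, n_transactions=5)
print(round(rule_lift, 2))  # 1.25
```

A lift above 1 suggests the antecedent makes the consequent more likely than it is in the data set overall.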
k-means clustering is the process of _____.
organizing observations into distinct groups based on a measure of similarity Rationale: k-means clustering is the process of organizing observations into one of k groups based on a measure of similarity.
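The idea can be sketched with a bare-bones version of the standard assign-then-update iteration. This is an illustration under assumptions, not a production implementation: Euclidean distance is assumed as the similarity measure, the points and starting centroids are made up, and k_means is a hypothetical helper name:

```python
import math

def k_means(points, centroids, n_iterations=10):
    """Assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(n_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D observations forming two obvious groups (k = 2).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

Results generally depend on the starting centroids; they are fixed here so the example is deterministic.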
In a simple linear regression model, y = β0 + β1x + ε, the parameter β1 represents the _____.
slope of the true regression line Rationale: β0, read "beta zero," is the intercept parameter, and β1, read "beta one," is the parameter that represents the slope of the true regression line.
The least squares regression line minimizes the sum of the _____.
squared differences between actual and predicted y values Rationale: The least squares regression line minimizes the sum of the squared differences between actual and predicted y values.
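A minimal sketch of this for simple linear regression, using the standard closed-form estimates b1 = Sxy / Sxx and b0 = ybar - b1 * xbar. The data are made up, and least_squares_line and sse are hypothetical helper names:

```python
def least_squares_line(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sse(x, y, b0, b1):
    """Sum of squared differences between actual and predicted y values."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
best_sse = sse(x, y, b0, b1)
print(round(b0, 2), round(b1, 2), round(best_sse, 2))  # 2.2 0.6 2.4
```

Nudging either estimate, for example sse(x, y, b0 + 0.1, b1), gives a larger sum of squared differences.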
_____ refers to the number of times a collection of items occurs together in a transaction data set.
Support Count Rationale: The number of times that a collection of items occurs together in a transaction data set is known as the support count.
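Counting support can be sketched in a few lines. The transactions are made up for illustration and support_count is a hypothetical helper name:

```python
# Hypothetical transaction data set: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "jelly", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "jelly"},
]

def support_count(item_set):
    """Number of transactions that contain every item in item_set."""
    return sum(1 for t in transactions if set(item_set) <= t)

count_bread_milk = support_count({"bread", "milk"})
print(count_bread_milk)  # 2
```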
In the graph of the simple linear regression equation, the parameter β1 is the _____ of the true regression line.
slope Rationale: In the graph of the simple linear regression equation, the parameter β1 is the slope of the true regression line.
The process of extracting useful information from text data is known as _____.
text mining Rationale: The process of extracting useful information from text data is known as text mining.
A procedure for using sample data to find the estimated regression equation is _____.
the least squares method Rationale: The least squares method is a procedure for using sample data to find the estimated regression equation.
The degree of correlation among independent variables in a regression model is called _____.
Multicollinearity Rationale: Multicollinearity is the degree of correlation among independent variables in a regression model.
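One common first check for multicollinearity is the sample correlation between pairs of independent variables. The sketch below assumes that approach; the two variables are made up and pearson is a hypothetical helper name:

```python
import math

def pearson(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)

# Hypothetical independent variables that move almost in lockstep.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]

r = pearson(x1, x2)
print(round(r, 3))
```

A correlation this close to 1 would suggest that x1 and x2 carry largely overlapping information in a regression model.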
A visual representation of a document or set of documents in which the size of the word is proportional to the frequency with which the word appears is called a _____.
Word Cloud Rationale: A word cloud is a visual representation of a document or set of documents in which the size of the word is proportional to the frequency with which the word appears.
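The quantity behind a word cloud is a word-frequency count. The sketch below computes it for a made-up sentence and derives a display size proportional to each count; base_size is an arbitrary, hypothetical scaling constant, and no image is drawn:

```python
import re
from collections import Counter

# Hypothetical document text.
text = "Great service and great food. The food was great."

# Lowercase the text and split it into words.
words = re.findall(r"[a-z']+", text.lower())
frequencies = Counter(words)

# Display size proportional to frequency.
base_size = 10
sizes = {word: base_size * count for word, count in frequencies.items()}
print(frequencies.most_common(2))  # [('great', 3), ('food', 2)]
```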
A variable used to model the effect of categorical independent variables in a regression model is known as a _____.
dummy variable Rationale: A variable used to model the effect of categorical independent variables in a regression model is known as a dummy variable.
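As an illustration, a categorical variable with k levels is typically encoded with k - 1 dummy variables, with one level left as the baseline. The region values and the helper name dummy_encode are hypothetical:

```python
def dummy_encode(values, baseline):
    """Encode categories as 0/1 dummy variables, omitting the baseline level."""
    levels = sorted(set(values) - {baseline})
    return [{f"is_{level}": int(v == level) for level in levels}
            for v in values]

# Hypothetical categorical independent variable with three levels.
regions = ["north", "south", "west", "south"]
encoded = dummy_encode(regions, baseline="north")
print(encoded[0])  # {'is_south': 0, 'is_west': 0}
print(encoded[1])  # {'is_south': 1, 'is_west': 0}
```

An observation in the baseline level has all dummy variables equal to 0.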
The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model is referred to as the ______.
knot Rationale: The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model is referred to as the knot or breakpoint.
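A sketch of how a knot enters a piecewise linear model, assuming the common form y = b0 + b1*x + b2*(x - knot)*k, where k is 1 when x exceeds the knot and 0 otherwise. The coefficients and the knot value are made up, and piecewise_predict is a hypothetical helper name:

```python
def piecewise_predict(x, b0, b1, b2, knot):
    """Piecewise linear prediction: slope b1 below the knot, b1 + b2 above it."""
    above_knot = 1 if x > knot else 0
    return b0 + b1 * x + b2 * (x - knot) * above_knot

# Hypothetical coefficients and knot, for illustration only.
params = dict(b0=5.0, b1=2.0, b2=-1.5, knot=10)

y_at_8 = piecewise_predict(8, **params)    # below the knot
y_at_10 = piecewise_predict(10, **params)  # at the knot
y_at_12 = piecewise_predict(12, **params)  # above the knot
print(y_at_8, y_at_10, y_at_12)  # 21.0 25.0 26.0
```

Below the knot, y rises by 2 per unit of x; above it, by only 0.5, and the two segments meet at x = 10.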