MKTG 3228 EXAM 1
_____ acts as a representative of the population.
A sample Rationale: A subset of the population is known as a sample, and acts as a representative of the population.
_____ uses a weighted average of past time series values as the forecast.
exponential smoothing
A time series with a seasonal pattern can be modeled by treating the season as a _____.
dummy variable
Any data value with a z-score less than -3 or greater than +3 is considered to be a(n) _____.
outlier Rationale: Any data value with a z-score less than -3 or greater than +3 is treated as an outlier.
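The z-score rule above can be sketched in a few lines. This is a minimal illustration, assuming the sample standard deviation is used and that the data list is hypothetical; the function names `z_scores` and `outliers` are not from the course material.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the z-score of each value, using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def outliers(values, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical data: nineteen ordinary values and one extreme value.
data = [10] * 19 + [100]
print(outliers(data))
```

Note that in a very small sample no value can reach a z-score of 3, so the rule is only informative for reasonably sized data sets.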
A positive forecast error indicates that the forecasting method _____ the dependent variable.
underestimated
In the graph of the simple linear regression equation, the parameter β1 is the _____ of the true regression line.
slope Rationale: In the graph of the simple linear regression equation, the parameter β1 is the slope of the true regression line.
With reference to exponential forecasting models, a parameter that provides the weight given to the most recent time series value in the calculation of the forecast value is known as the _____.
smoothing constant
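As a rough sketch of how the smoothing constant enters the calculation, the forecast for the next period is alpha times the most recent actual value plus (1 - alpha) times the most recent forecast. The code below assumes the common convention that the first forecast equals the first observed value; the series and the function name are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for periods 2 through n+1.

    Assumes the forecast for period 2 equals the actual value in period 1.
    alpha is the smoothing constant (0 <= alpha <= 1).
    """
    forecasts = [series[0]]
    for actual in series[1:]:
        # Weight alpha on the newest actual value, (1 - alpha) on the prior forecast.
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical three-period series with alpha = 0.2.
print(exponential_smoothing([17, 21, 19], 0.2))
```

A larger alpha makes the forecast react faster to recent values; a smaller alpha smooths more heavily.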
Suppose for a particular week, the forecasted sales were $4,000. The actual sales were $3,000. What is the value of the mean absolute percentage error?
33.3%
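The arithmetic behind this answer can be checked with a short sketch. It assumes forecast error is defined as actual minus forecast and that each percentage error is taken relative to the actual value; the function name `mape` is illustrative only.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent."""
    pct_errors = [abs((a - f) / a) * 100 for a, f in zip(actuals, forecasts)]
    return sum(pct_errors) / len(pct_errors)

# One week: actual sales 3,000, forecasted sales 4,000.
# Forecast error = 3000 - 4000 = -1000, so |error| / actual = 1000 / 3000.
print(round(mape([3000], [4000]), 1))  # 33.3
```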
In the moving averages method, the order k determines the _____.
number of time series values under consideration
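To illustrate the role of k, here is a minimal sketch in which the forecast for the next period is the average of the k most recent values. The series and the function name are hypothetical.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the k most recent values."""
    if k < 1 or k > len(series):
        raise ValueError("k must be between 1 and the length of the series")
    return sum(series[-k:]) / k

# With k = 3, only the last three values (21, 19, 23) are used.
print(moving_average_forecast([17, 21, 19, 23], 3))  # 21.0
```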
Which of the following is not present in a time series?
operational variations
The simplest measure of variability is the _____.
range Rationale: The simplest measure of variability is the range.
A _____ is a graphical presentation of the relationship between two quantitative variables.
scatter chart Rationale: A scatter chart is a graphical presentation of the relationship between two quantitative variables.
A procedure for using sample data to find the estimated regression equation is _____.
the least squares method Rationale: The least squares method is a procedure for using sample data to find the estimated regression equation.
A visual representation of a document or set of documents in which the size of the word is proportional to the frequency with which the word appears is called a _____.
word cloud Rationale: A word cloud is a visual representation of a document or set of documents in which the size of the word is proportional to the frequency with which the word appears.
Which of the following graphs provides information on outliers and IQR of a data set?
Box plot Rationale: A box plot is a graphical summary of the distribution of data, and it is developed from the quartiles for a data set. Therefore, the information on the outliers and IQR can be obtained from a box plot.
Which of the following best exemplifies big data?
Cellphone owners around the world generate vast amounts of data by calling, texting, tweeting, and browsing the Web on a daily basis. Rationale: Big data is simply a set of data that cannot be managed, processed, or analyzed with commonly available software in a reasonable amount of time.
_____ refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
Interaction Rationale: Interaction refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
Which statement is true of an association rule?
It is ultimately judged on how actionable it is and how well it explains the relationship between item sets. Rationale: An association rule is ultimately judged on how actionable it is and how well it explains the relationship between item sets.
Which of the following regression models is used to model a nonlinear relationship between the independent and dependent variables by including the independent variable and the square of the independent variable in the model?
Quadratic regression model Rationale: A quadratic regression model is a regression model in which a nonlinear relationship between the independent and dependent variables is fit by including the independent variable and the square of the independent variable in the model.
Which graph represents a negative linear relationship between x and y?
Rationale: A negative relationship means that if one variable gets bigger, the other variable tends to get smaller.
A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a _____.
dendrogram Rationale: A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a dendrogram.
Data dashboards are a type of _____ analytics.
descriptive Rationale: Descriptive analytics encompass the set of techniques that describes what has happened in the past.
The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model is referred to as the ______.
knot Rationale: The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model is referred to as the knot or breakpoint.
Autoregressive models _____.
occur whenever all the independent variables are previous values of the time series
k-means clustering is the process of _____.
organizing observations into distinct groups based on a measure of similarity Rationale: k-means clustering is the process of organizing observations into one of k groups based on a measure of similarity.
A _____ is an interval estimate of an individual y value, given values of the independent variables.
prediction interval
In a simple linear regression model, y = β0 + β1x + ε, the parameter β1 represents the _____.
slope of the true regression line Rationale: β0, read "beta zero," is the intercept parameter; and β1, read "beta one," is the parameter that represents the slope of the true regression line.
The least squares regression line minimizes the sum of the _____.
squared differences between actual and predicted y values Rationale: The least squares regression line minimizes the sum of the squared differences between actual and predicted y values.
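For simple linear regression, the estimates that minimize this sum have a closed form: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept follows from the means. The sketch below is illustrative only, with hypothetical data and function names.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1), the least squares intercept and slope for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_squared_errors(xs, ys, b0, b1):
    """Sum of squared differences between actual and predicted y values."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data that lie exactly on y = 1 + 2x.
b0, b1 = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```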
A _____ decision involves higher-level issues and is concerned with the overall direction of the organization, defining the overarching goals and aspirations for the organization's future.
strategic Rationale: A strategic decision involves higher-level issues and is concerned with the overall direction of the organization, defining the overarching goals and aspirations for the organization's future.
The decisions concerning an organization's goals and future plans are called _____.
strategic decisions Rationale: Strategic decisions involve higher-level issues concerned with the overall direction of the organization.
The data preparation technique used in market segmentation to divide consumers into different homogeneous groups is called _____.
cluster analysis Rationale: Clustering can be employed during the data preparation step to identify variables or observations that can be aggregated or removed from consideration. Cluster analysis is commonly used in marketing to divide consumers into different homogeneous groups, a process known as market segmentation.
The correlation coefficient will always take values _____.
between -1 and +1 Rationale: The correlation coefficient will always take values between -1 and +1.
The degree of correlation among independent variables in a regression model is called _____.
multicollinearity Rationale: Multicollinearity is the degree of correlation among independent variables in a regression model.
The process of extracting useful information from text data is known as _____.
text mining Rationale: The process of extracting useful information from text data is known as text mining.
Compute the mode for the following data. 12, 16, 19, 10, 12, 11, 21, 12, 21, 10
12
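The count behind this answer can be verified with a short sketch using the Python standard library; the helper name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

data = [12, 16, 19, 10, 12, 11, 21, 12, 21, 10]
# 12 occurs three times; 10 and 21 occur twice each.
print(modes(data))  # [12]
```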
The College Board reported that the mean Math Level 2 SAT subject test score was 686 with a standard deviation of 96. Assuming scores follow a bell-shaped distribution, use the empirical rule to find the percentage of students who scored less than 494.
2.5% Rationale: z-score = (494 - 686) / 96 = -2. Recall that 95% of observations fall within two standard deviations of the mean, which means 2.5% of observations fall in each tail. Since we want to know the percentage of students who scored less than 494, we essentially want to know the percentage of observations that fall below -2 standard deviations. 2.5% of observations fall below -2 standard deviations.
_____ are visual methods of displaying data.
Charts Rationale: Charts are used to display and analyze data.
_____ is the most critical step of the decision-making process.
Identifying and defining the problem Rationale: Step 1 of decision making, identifying and defining the problem, is the most critical. Only if the problem is well-defined, with clear metrics of success or failure (step 2), can a proper approach for solving the problem (steps 3 and 4) be devised. Decision making concludes with the choice of an alternative (step 5).
DJ needs to display data over time. Which of the following charts should he use?
Line chart Rationale: Line charts are very useful for time series data collected over a period of time (minutes, hours, days, years, etc.). The line chart connects the points of the scatter chart. The addition of lines between the points suggests continuity, and it is easier for the reader to interpret changes over time.
Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer that spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer that spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.
75.39 Rationale: The Euclidean distance between these two observations is calculated using the formula d(u, v) = √((25 − 53)² + (350 − 420)²) = √5684 ≈ 75.39.
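The calculation can be sketched as follows, taking the coordinates exactly as written in the vectors u = (25, 350) and v = (53, 420) and leaving the variables unscaled. The function name is illustrative.

```python
import math

def euclidean_distance(u, v):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

u = (25, 350)  # (age, dollars spent)
v = (53, 420)
# (25 - 53)^2 + (350 - 420)^2 = 784 + 4900 = 5684
print(round(euclidean_distance(u, v), 2))  # 75.39
```

Because dollars spent has a much larger range than age, it dominates the distance here; in practice the variables are often standardized before clustering.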
The strength of the association rule is known as _____ and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.
lift Rationale: The strength of the association rule is known as lift and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.
_____ refers to the number of times a collection of items occurs together in a transaction data set.
Support count Rationale: The number of times that a collection of items occurs together in a transaction data set is known as the support count.
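Support count, confidence, and lift fit together as in the sketch below. It assumes each transaction is a set of items, that confidence is the support count of antecedent and consequent together divided by the support count of the antecedent, and that the benchmark confidence is the share of all transactions containing the consequent. The transactions and function names are hypothetical.

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support_count(transactions, both) / support_count(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence of the rule divided by the benchmark confidence."""
    benchmark = support_count(transactions, consequent) / len(transactions)
    return confidence(transactions, antecedent, consequent) / benchmark

# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "jelly"},
    {"bread", "milk", "jelly"},
    {"milk"},
    {"bread", "milk"},
]
print(support_count(baskets, {"bread", "milk"}))  # 3
print(lift(baskets, {"bread"}, {"milk"}))
```

A lift above 1 suggests the antecedent makes the consequent more likely than it is overall; a lift below 1 suggests the opposite.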
Complete linkage can be used to measure the distance between _____ in cluster analysis.
clusters Rationale: Complete linkage measures the dissimilarity between two clusters by considering only the two most dissimilar observations, one from each cluster.
Assessing the regression model on data other than the sample data that was used to generate the model is known as _____.
cross-validation
A variable used to model the effect of categorical independent variables in a regression model is known as a _____.
dummy variable Rationale: A variable used to model the effect of categorical independent variables in a regression model is known as a dummy variable.
_____ is the amount by which the predicted value differs from the observed value of the time series variable.
forecast error
A _____ is a graphical summary of data previously summarized in a frequency distribution.
histogram Rationale: A common graphical presentation of quantitative data is a histogram. This graphical summary can be prepared for data previously summarized in a frequency, a relative frequency, or a percent frequency distribution.
The letter grades (A, B, C, D, F) of business analysis students are recorded by a professor. This variable's classification _____.
is categorical data Rationale: If arithmetic operations cannot be performed on the data, they are considered categorical data.
Data-ink is the ink used in a table or chart that _____.
is necessary to convey the meaning of the data to the audience Rationale: Data-ink is the ink used in a table or chart that is necessary to convey the meaning of the data to the audience.
Tables should be used instead of charts when _____.
the values being displayed have different units or very different magnitudes Rationale: Tables should be used when the reader needs to refer to specific numerical values, when the reader needs to make precise comparisons between different values and not just relative comparisons, and when the values being displayed have different units or very different magnitudes.
