Predictive Analytics Ch. 5

The correlation along the main diagonal in a correlation matrix is _____.

1.00
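
Why the diagonal is always 1.00 follows from the definition of the Pearson correlation coefficient: a variable's covariance with itself is its variance. A short worked step, using generic notation (s_x for the sample standard deviation of x, assumed nonzero) rather than the book's own symbols:

```latex
r_{xx} = \frac{\operatorname{cov}(x, x)}{s_x \, s_x} = \frac{s_x^2}{s_x^2} = 1
```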

According to the Institute of Business Forecasting (IBF) study by Chaman L. Jain in 2015, the forecast errors of demand planners at the "category" level average between _____.

15% and 26%

If P = 3, then based on the Iron Rule of Dummy Variables, the number of dummy variables we would need is _____.

2

According to the Institute of Business Forecasting (IBF) study by Chaman L. Jain in 2015, the forecast errors of demand planners at the "aggregate" level average between _____.

27% and 15%

According to the Institute of Business Forecasting (IBF) study by Chaman L. Jain in 2015, the forecast errors of demand planners at the SKU level average between _____.

27% and 37%

If P = 4, then based on the Iron Rule of Dummy Variables, the number of dummy variables we would need is _____.

3

What percent of a company's performance comes about as a result of factors outside the firm?

85%

How does a bivariate regression model differ from a multiple regression model?

A bivariate regression has one dependent variable and only one independent variable, whereas a multiple regression has one dependent variable and two or more independent variables.

How does a regression plane differ from a regression line?

A regression plane represents a 3-Dimensional space (1 dependent & 2 independent variables) whereas a regression line represents a 2-Dimensional space (1 dependent & 1 independent variable).
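
Written as equations, the difference is one extra term. The symbols here (b_0 for the intercept, b_1 and b_2 for the slopes) are generic notation assumed for illustration and may not match the book's:

```latex
\begin{aligned}
\text{Regression line:}  \quad & \hat{Y} = b_0 + b_1 X \\
\text{Regression plane:} \quad & \hat{Y} = b_0 + b_1 X_1 + b_2 X_2
\end{aligned}
```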

What is a data consolidator?

A service that collects external time series and makes them available to data scientists.

What is a dummy variable?

A variable that can be used as a sort of proxy measure for influences that do not have a direct numerical measure.

What is the most common reason for serial correlation?

An important explanatory variable has been omitted.

What did the book mention is the 5th step for evaluating a multiple regression model?

Check for multicollinearity.

Once the dependent variable is determined when building a bivariate or multiple-regression model, what is the next step?

Determine what factors contribute to the change in the dependent variable.

The _____ is a test of the overall significance of the estimated multiple regression.

F-test
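
For reference, the F-statistic can be written in terms of R-squared. This is the standard textbook form, with n observations and k independent variables; the notation is assumed here and may differ from the book's. The null hypothesis is that all slope coefficients are zero, so a large F suggests the regression as a whole has explanatory power.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
```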

T/F When considering the set of independent variables to use, you should find ones that are highly correlated with one another.

False

What is the most difficult problem for a forecaster using multiple causal regression?

Finding relevant independent variables with the right periodicity and covering the historic period matching the data.

In which of the following scenarios would a dummy variable be most useful?

How have Valentine's Day greeting cards affected our monthly sales for the past 2 years?

The Iron Rule of Dummy Variables states that:

If we have P states of nature, we cannot use more than P-1 dummy variables.
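
A minimal sketch of the rule in practice, assuming quarterly seasonality (P = 4) as the example. The function name `encode_dummies` and the choice of Q4 as the baseline are hypothetical; any one state can serve as the baseline, which is represented by all dummies being 0.

```python
def encode_dummies(values, baseline):
    """Encode a categorical series with P states as P-1 dummy (0/1) columns.

    The baseline state gets no column of its own; it is the case where
    every dummy equals 0.
    """
    # Keep the non-baseline states in first-seen order.
    states = [s for s in dict.fromkeys(values) if s != baseline]
    return {s: [1 if v == s else 0 for v in values] for s in states}


quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"]
dummies = encode_dummies(quarters, baseline="Q4")

print(len(dummies))  # number of dummy columns created for P = 4 states
for name, column in dummies.items():
    print(name, column)
```

Adding a fourth column for Q4 would make the dummies sum to 1 in every row, duplicating the intercept, which is why the rule stops at P-1.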

Which factors affect a company's performance?

Internal and External Factors

A dummy variable has which one of the following characteristics?

It can be used to account for seasonality or other qualitative attributes.

Which one of the following statements is correct when you are choosing the type of forecasting model to use for creating forecasts of the future?

Leading indicators need to be used.

A Durbin-Watson statistic value of close to 2 (between 1.5 & 2.5) would signify what?

No serial correlation
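
The reason 2 is the benchmark can be seen from the usual definition of the statistic. In the sketch below, e_t is the residual in period t and \hat{\rho} is the lag-1 autocorrelation of the residuals; both symbols are generic notation assumed for illustration. When \hat{\rho} is near 0, the statistic is near 2.

```latex
DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \approx 2\,(1 - \hat{\rho})
```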

If an individual's income doubles, the amount spent on food usually less than doubles. What extension of the multiple-regression model would you use?

Nonlinear Terms
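
One common way to add a nonlinear term is to include the square of the independent variable, so the fitted curve can flatten as income rises. The sketch below is an illustration under assumptions, not the book's procedure: the data are hypothetical and generated from a known concave curve, and the helper names `solve_linear_system` and `fit_quadratic` are made up for this example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: income in $10,000s; food spending from a concave curve.
income = [2, 3, 4, 5, 6, 7, 8]
food = [2 + 0.5 * x - 0.02 * x * x for x in income]

b0, b1, b2 = fit_quadratic(income, food)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

A negative coefficient on the squared term (b2) is what captures spending rising more slowly than income.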

What is it called in a multiple-regression analysis when you desire to suspend a 3D plane among the observations in such a way that the plane best represents the observations?

Regression plane

What is it called in a multiple-regression analysis when you desire to suspend a three-dimensional plane among the observations in such a way that the plane best represents the observations?

Regression plane

In comparing the process of building a bivariate regression with a multiple-regression model, each begins by identifying what variable?

The dependent variable

According to the book, what should you consider if you are selecting independent variables for a multiple regression model?

The selection should be based on an understanding of the nature of the situation.

T/F In practice, for dummy variables, a lower confidence level is used since the variables measure qualitative attributes.

True

A dummy variable has the following characteristics:

[1] A dummy variable will be 1 if the condition does exist for an observation. [2] A dummy variable will be 0 if the condition does not exist for an observation. [3] A dummy variable takes on a value of either 0 or 1.

Which of the following model-specification statistics can be used in selecting the "correct" independent variables?

[1] Bayesian Information Criterion [2] R-Squared [3] Akaike Information Criterion

Which of the following statements is correct when communicating your forecast model to actual users?

[1] Complex models are more difficult to communicate to others. [2] As more causal variables are used, the cost of maintaining the needed database increases in both time and money. [3] Users are less likely to trust a model that they do not understand than a simpler model that they do understand.

Which of the following statements are correct when you are choosing the type of forecasting model to use for creating forecasts of the future?

[1] Do not choose the forecasting model based solely on the model's "fit to history". [2] Leading indicators need to be used.

Deviations between the predicted values based on the sample regression and the actual values of the dependent variable for each observation are called?

[1] Errors [2] Residuals
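
A minimal sketch of how residuals are obtained, assuming a bivariate regression fitted by ordinary least squares on made-up data. The function name `fit_line` is hypothetical, and the sign convention used is actual minus predicted.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x (bivariate regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_line(xs, ys)
predicted = [b0 + b1 * x for x in xs]
# Residual = actual value minus the value predicted by the sample regression.
residuals = [y - p for y, p in zip(ys, predicted)]
print([round(e, 3) for e in residuals])
```

When the regression includes an intercept, the residuals sum to zero, which is a quick sanity check on the fit.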

Which of the following statements are true if a demand planner had models with very good fit statistics?

[1] Forecast accuracy will almost always be worse and often much worse than the fit of a model to historic data. [2] The MAPEs were low. [3] The coefficients of multiple determination (R-squared) were high.

Which of the following is true about when to use the adjusted R-Squared and MAPE for evaluating multiple-regression models?

[1] MAPE should be focused on for use on actual forecasts. [2] R-squared relates to the past (in-sample) period, but may not work so well in forecasting. [3] The AIC and BIC measures should be used to select appropriate independent variables.
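
Both measures are simple to compute. The sketch below uses the standard definitions with hypothetical numbers; the function names are made up for illustration, and MAPE as written assumes no actual value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (no actual value may be 0)."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """R-squared penalized for the number of independent variables."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


actual = [100, 120, 80, 110]    # hypothetical demand in the forecast period
forecast = [90, 126, 88, 99]    # hypothetical forecasts for the same periods

forecast_mape = mape(actual, forecast)
adj_r2 = adjusted_r_squared(0.90, n_obs=24, n_predictors=3)
print(round(forecast_mape, 2))
print(round(adj_r2, 4))
```

Computing MAPE on periods the model did not see during estimation is what makes it a measure of forecast accuracy rather than of fit.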

In forecasting using multiple-regression models, which of the following statements are correct?

[1] Multiple-regression analysis can be helpful in improving the forecaster's level of expertise. [2] No one should ever rely solely on some quantitative procedure in developing a forecast. [3] Knowing the relationship between the dependent variable and some set of independent variables is necessary.

Which of the following statements are correct about outside (external) data?

[1] Outside data can come from public government statistics. [2] Outside data can come from economic statistics.

Which are the usual interpretations if a regression failed the Durbin-Watson test?

[1] Seasonality in the data has not been fully accounted for by the variables included. [2] This represents the effect of an omitted or unobservable variable (or variables) on the dependent variable.

Which of the following statements are correct about serial correlation?

[1] Serial correlation occurs when there is a significant time pattern in the error terms of a regression analysis. [2] The Durbin-Watson statistic is most frequently used in the evaluation of serial correlation. [3] Serial correlation violates the assumption that errors are independent over time.

Research on the best model to use for alternative-variable selection found which of the following was correct?

[1] The AIC chose the correct model in 45% of the cases. [2] The BIC chose the correct model in 46% of the cases. [3] The study determined that economic interpretation of why a variable is included should be strongly considered.

Which of the following statements is correct about the Akaike information criterion (AIC)?

[1] The AIC is constructed so that, as the number of independent variables increases, the AIC has a tendency to increase as well. [2] The AIC involves both the use of a measure of the accuracy of the estimate and a measure of the principle of parsimony. [3] The AIC selects the best model by considering the accuracy of the estimation and the "best" approximation to reality.

Which of the following statements is correct about the Bayesian information criterion (BIC)?

[1] The BIC uses Bayesian arguments about the prior probability of the true model to suggest the correct model. [2] The BIC and AIC often lead to the same model choice. [3] The BIC and AIC are quite similar.
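
A minimal sketch of how the two criteria trade accuracy against parsimony, assuming the common least-squares forms AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n). The book's exact formulas may differ, and the candidate models and their SSE values are hypothetical. Lower values are preferred under both criteria.

```python
import math


def aic(sse, n_obs, n_params):
    """AIC in an assumed least-squares form: n*ln(SSE/n) + 2k."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


def bic(sse, n_obs, n_params):
    """BIC in the matching assumed form: n*ln(SSE/n) + k*ln(n)."""
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


# Hypothetical candidates: (label, sum of squared errors, parameters estimated).
n_obs = 40
candidates = [("2 variables", 120.0, 3), ("5 variables", 118.0, 6)]

for label, sse, k in candidates:
    print(label, round(aic(sse, n_obs, k), 2), round(bic(sse, n_obs, k), 2))

best_by_aic = min(candidates, key=lambda c: aic(c[1], n_obs, c[2]))[0]
best_by_bic = min(candidates, key=lambda c: bic(c[1], n_obs, c[2]))[0]
print(best_by_aic, best_by_bic)
```

In this made-up case the three extra variables barely reduce the error, so the penalty terms dominate and both criteria pick the smaller model.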

Which of the following statements are true regarding multicollinearity?

[1] You assume that the independent variables are not highly linearly correlated with each other or other combinations. [2] The correlation matrix for the independent variables can be helpful to spot the cause. [3] Regression results show that one or more independent variables appear not to be statistically significant when theory suggests they should be.

Which of the following statements are correct about the Keep It Simple (KIS) principle when developing a multiple regression model?

[1] You must be able to forecast all the independent variables. [2] There may be a trade-off between explanatory power and the number of independent variables used.

The 5th step in evaluating multiple-regression models is to:

check for multicollinearity

The third thing you should do in evaluating multiple-regression models is to:

conduct an evaluation of the coefficient of determination for the regression results.

The fourth thing you should do in evaluating multiple-regression models is to:

conduct the Durbin-Watson test for serial correlation.

The second thing to consider when evaluating multiple-regression models is to:

determine whether these results are statistically significant at your desired level of confidence.

If multicollinearity exists, you should:

drop all but one of the highly correlated variables.
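
To decide which variables overlap, one simple check is to scan the pairwise correlations among the independent variables. The sketch below is an illustration only: the data are hypothetical, the helper names are made up, and the 0.8 cutoff is an assumed rule of thumb, not a threshold taken from the book.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(variables, threshold=0.8):
    """Return (name_a, name_b, r) for pairs whose |r| exceeds the threshold."""
    names = list(variables)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(variables[a], variables[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical independent variables for a sales model.
predictors = {
    "income": [30, 35, 40, 45, 50, 55],
    "gdp": [61, 69, 82, 90, 99, 111],  # moves almost in step with income
    "price": [9, 12, 8, 11, 10, 9],
}

flagged = highly_correlated_pairs(predictors)
for a, b, r in flagged:
    print(a, b, round(r, 3))
```

Here only the income and gdp pair is flagged, so one of the two would be a candidate to drop.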

If a plane has more than two independent variables it is called a _____.

hyperplane

If most of the actual data points were far above or below the regression plane, the adjusted R-squared of the equation would be _____ it otherwise would be.

lower than what

When using a multiple regression model, the dependent variable (Y) is modeled as a function of _____ independent variable(s).

more than one

When using a multiple regression model, the dependent variable is modeled as a function of _____ independent variable(s).

more than one

When there is an overlap in the way two or more independent variables influence the dependent variable, you will have _____.

multicollinearity

A Durbin-Watson statistic value of 4 would signify a _____.

negative serial correlation

A Durbin-Watson statistic value of 0 would signify a _____.

positive serial correlation
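
A minimal sketch of the statistic itself, computed as the sum of squared successive differences of the residuals divided by the sum of squared residuals. The two residual series are hypothetical and chosen to show the extremes; the function name `durbin_watson` is made up for this example.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a series of regression residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals: neighbors alike (positive serial correlation).
drifting = [2.0, 1.5, 1.0, 0.5, -0.5, -1.0, -1.5, -2.0]
# Hypothetical residuals: neighbors opposite (negative serial correlation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(round(dw_drifting, 3))
print(round(dw_alternating, 3))
```

The slowly drifting series lands near 0 and the sign-flipping series lands near 4, matching the two cards above.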

Dummy variables can be useful in regression models because they can account for _____.

seasonality

The first thing you should do in evaluating multiple-regression models is to:

see whether the signs on the coefficients make sense.

A bivariate regression model is _____ dimensional.

two

If all the actual data points were to lie very close to the regression plane, the adjusted R-squared of the equation would be _____.

very high

