ORMS 3310 MOD 10
In regression analysis, ANOVA is used for which of the following?
Test of joint significance.
Which of the following can be used to test for a statistically significant relationship between the explanatory variables and the response variable? Select all that apply.
Tests of individual regression coefficients. A test of regression coefficients jointly.
Which of the following are true about the ANOVA table? Select all that apply.
The ANOVA table facilitates the calculation of the test statistic. The ANOVA table summarizes the error explained and unexplained by the regression model. The F statistic is found by MSR / MSE.
What is a residual in regression analysis?
The difference between observed and predicted y values.
In the sample regression equation, yˆŷ = b0 + b1x, what does b0 represent?
The estimated intercept.
In the sample regression equation, yˆŷ = b0 + b1x, what does b1 represent?
The estimated slope.
Which of the following are assumptions of regression analysis? Select all that apply.
The expected value of the error term ε is zero. The error term ε is normally distributed.
All else being equal, if three competing models have adjusted R2 values equal to 0.45, 0.72 and 0.86, respectively, which model should be selected?
The model with adjusted R2 = 0.86.
If three competing models have standard errors of the estimate, se, equal to 0.45, 1.72 and 2.86, respectively, which model should be selected?
The model with se = 0.45.
In the sample regression equation, yˆŷ = b0 + b1x, what does yˆŷ represent?
The predicted value.
Why is the stochastic model used in regression analysis in place of the deterministic model?
The relationship between the response and the explanatory variables is inexact.
A company found that a strong positive linear relationship exists between Y, its sales, and X, advertising expense. Predicting Y for values of X outside the range of the sample data is risky for which of the following reasons?
The relationship may not be linear for values of X outside the range of the sample data.
Predicting y for values of x outside the range of the sample data is risky for which of the following reasons?
The relationship may not be linear for values of x outside the range of the sample data.
For which of the following situations is a simple linear regression model appropriate?
The response variable, y, is influenced by one explanatory variable.
In evaluating a regression model, why is a scatterplot a useful tool?
The scatterplot can be used to assess the linearity of the relationship.
What is the unfortunate consequence of heteroskedasticity?
The t and F tests are unreliable.
Heteroskedasticity is a violation of which of the assumptions underlying regression analysis?
The variance of the error term ε is the same for all observations.
With multiple regression analysis, multiple individual tests at a given α are not equivalent to a joint test for which of the following reasons?
The α for the joint test will be smaller than the α for each of the individual tests.
Suppose you estimate the model y = β0+β1x+ε. If it is determined that β1≠ 0, then it can be concluded that
There is a linear relationship between x and y.
Which of the following is true about residual plots? Select all that apply.
They provide an informal way to examine assumptions of regression analysis. They often plot residuals on the y-axis and the explanatory variables or predicted values along the x-axis. They help detect outliers.
Which of the following are true about the ANOVA table? Select all that apply.
Total df = Regression df + Residual df. Total df is n - 1.
True or false: Regression analysis can be used to build a model that uses information on an explanatory variable to predict changes in a response variable.
True
True or false: With simple linear regression, the test of joint significance is the same as the test of individual significance.
True
When is the multiple regression model used?
When the researcher believes that two or more explanatory variables influence the response variable.
Consider the multiple regression equation: y = β0+β1x1+β2x2+ε, If a joint test of significance leads to rejection of the null hypothesis, then
at least one explanatory variable is significant.
Which of the following is true regarding the calculation of the sample regression equation?
b1 must be calculated before b0.
Heteroskedasticity can be corrected by
correcting standard errors of the estimated coefficients.
In a regression model, the Multiple R is the
correlation between the response variable and its predicted value.
In regression analysis, the response variable is also called the
dependent variable.
When the response variable is uniquely determined by the explanatory variable, the relationship is _____
deterministic.
A possible solution for multicollinearity is
dropping a redundant explanatory variable from the model.
One way to detect multicollinearity is to
examine the correlations between the explanatory variables.
Regression analysis is used to
examine the relationship between two or more variables.
True or false: The only necessary test with multiple regression models is for joint significance of the regression coefficients.
false
In the presence of multicollinearity, the ordinary least squares estimators of the intercept and slopes are ______.
imprecise and difficult to interpret.
In regression analysis, the explanatory variable is also called the
independent variable.
In simple linear regression, the p-value of the F test identical to that of the t test because there
is only one slope coefficient being tested.
If a residual plot reveals that all points are randomly dispersed around the zero value of the residuals, then
it is likely that none of the assumptions has been violated
When two regression models applied on the same data set have the same response variable but a different number of explanatory variables, the model that would provide the better fit is the one with the
lower se and higher adjusted R2.
If the correlation between the response variable and the explanatory variables is sufficiently low, then adjusted R2
may be negative.
Which of the following is degrees of freedom for the test statistic for the test of individual significance?
n-k-1
The standard error of the estimate is the standard deviation of the
residuals.
A numerical measure that describes the dispersion of the data points from the sample regression equation is referred to as the
standard error of the estimate.
The method of least squares minimizes the
sum of squared errors.
Which of the following is the test statistic for the test of individual significance?
tdf = bj−βjsbj
All of the following are goodness-of-fit measures for linear regression models EXCEPT
the coefficient of variation.
The residual e represents
the difference between an observed and predicted value of the response variable at a given value of the explanatory variable.
In regression analysis, a large F statistic indicates that
the explanatory variables are explaining a large portion of variation in the response variable
Unlike R2, adjusted R2 explicitly accounts for
the sample size and the number of explanatory variables.
SST represents
the total variation in y.
SSR represents
the variation in the response variable explained by the model.
Serial correlation is typically observed in ______.
time series data
The purpose of the test of joint statistical significance is to
to determine if the group of explanatory variables have a statistical influence on the response variable.
Consider the simple linear regression model: y = β0 + β1x + ε. What is the notation for the explanatory variable?
x
Consider the simple linear regression model y = β0+β1x+ε. What is the implication of β1 = 0?
x does not have a significant linear influence on y.
In a regression model, the residual e is calculated as
y - yˆ
Which of the following is the correct form of the sample multiple regression equation?
yˆŷ = b0 + b1x1 + b2x2 + ... + bkxk
Consider the simple linear regression model: y = β0 + β1x + ε. What is the notation for the population slope coefficient?
β1
Consider the following simple linear regression model: y = β0+ β1x + ε. When determining whether there is a negative linear relationship between x and y, the alternative hypothesis takes the form
β1 < 0.
Consider the following simple linear regression model: y = β0+β1x + ε. When determining whether there is a positive linear relationship between x and y, the alternative hypothesis takes the form
β1 > 0.
Consider the simple linear regression model: y = β0 + β1x + ε. What is the notation for the random error term?
ε
The standard error of the estimate is calculated as
√ErrorSumofSquaresn−k−1
In a simple linear regression model, if all of the data points fall on the sample regression line, then the standard error of the estimate is
0
What values can the coefficient of determination, R2, assume?
0 ≤ R2 ≤ 1.
In a regression, if the Multiple R equals 0.80, then R2 equals
0.64.
Which of the following correlation coefficients between two explanatory variables would be a concern that multicollinearity exists?
0.92 -0.83
In simple linear regression, a downward sloping trend line suggests which of the following?
A negative linear relationship between x and y.
In a simple linear regression, an upward sloping trend line suggests which of the following?
A positive linear relationship between x and y.
If the residuals are plotted across values of an explanatory variable, what is true with regard to identifying nonlinear patterns?
A trend in the plot of residuals indicates a nonlinear pattern.
If the residuals are plotted across values of an explanatory variable resembles a 'U,' what might this indicate?
A violation of the linearity assumption.
In a plot of residuals versus the each explanatory value or the predicted value, what would indicate changing variability?
An increase or decrease in the spread of residuals.
The coefficient of determination can assume which of the following values?
Between zero and one.
Which of the following is a goodness-of-fit measure of linear regression models?
Coefficient of determination
Which of the following is another name for R2?
Coefficient of determination
Which of the following is a possible remedy for multicollinearity?
Collect more data
Which of the following defines multicollinearity?
Correlation between the explanatory variables
Which one of the following is true about the sample multiple regression equation?
Each explanatory variable has its own coefficient.
True or false: If the assumption of constant variability is violated, then the least squares estimators are biased.
False
True or false: Multiple linear regression is an extension of simple linear regression in that more than one response variable is used.
False
True or false: The adjusted R2 always goes up as more explanatory variables are added to the regression model.
False
rue or false: If the assumption of constant variability is violated, then the least squares estimators are biased.
False
If the sample regression equation is found to be yˆŷ = 10 + 2x1 - 3x2, which of the following is true?
For every unit change in x1, the predicted value of the response variable goes up by 2, assuming x2 is held constant.
Consider the multiple regression equation: SP = β0+β1(SF)+β2(AGE)+ε, where SP = selling price of the house, SF= square footage of the house and AGE = age of the house. To test whether SF and AGE have a joint influence on the selling price of the house, which null hypothesis is correct?
H0: β1= β2 = 0
Consider the multiple regression equation: SP = β0+β1(SF)+β2(AGE)+ε, where SP = selling price of the house, SF= square footage of the house and AGE = age of the house. To test whether SF and AGE have a joint influence on the selling price of the house, which alternative hypothesis is correct?
HA: At least 1 βj ≠ 0
Consider the multiple regression equation: SP = β0+β1(SF)+β2(AGE)+ε, where SP = selling price of the house, SF= square footage of the house and AGE = age of the house. In a hypothesis test to determine whether AGE has a significant linear influence on the selling price of the house, what is the alternative hypothesis?
HA: β2 ≠ 0
Which one of these is a disadvantage of R2 as a goodness-of-fit measure?
It can be inflated by adding explanatory variables with no predictive value.
In the sample regression equation, yˆŷ = b0 + b1x, how is b1 interpreted?
It is the change in the response variable for every unit change in the explanatory variable.
In the sample regression equation, yˆŷ = b0 + b1x, how is b0 interpreted?
It is the predicted value of the response variable when the explanatory variable is 0.
Which of the following are accurate regarding the standard error of the estimate? Select all that apply.
It is the standard deviation of the residuals. It measures the spread of the points around the sample regression line.
The test statistic for a test of joint significance is assumed to follow the Fdf1,df2 distribution and its value is calculated as
MSRMSEMSRMSE.
Which of the following approaches is used to calculate the estimated intercept and slope? Select all that apply.
Method of Least Squares Ordinary Least Squares
In an attempt to predict y, three models are estimated. The coefficient of determination for Model 1 equals 0.10, for Model 2 equals 0.64, and for Model 3 equals 0.87. According to the coefficient of determination, which model provides the best fit?
Model 3
Suppose that the slope parameter in a simple linear regression model is β1 = -5.12. What does this indicate about the nature of the relationship between x and y?
Negative linear relationship
How many explanatory variables does a simple linear regression model have?
One
Which of the following is NOT one of the assumptions of regression analysis?
Perfect multicollinearity exists among the explanatory variables.
Suppose that the slope parameter in a simple linear regression model is β1 = 3.52. What does this incidate about the nature of the relationship between x and y?
Positive linear relationship
In a multiple regression model, which of the following do we employ in a test of joint significance of the slope coefficients?
Right-tailed F test
