LS 15
True or false: The overall F test can tell us which independent variables in a multiple regression model are significant.
False
In the multiple regression model having two independent variables, the mean level μy = β0 + β1x1 + β2x2 is represented graphically as ______.
a plane
When testing the significance of an independent variable xj, what is the test statistic used?
t = (bj) /(sbj)
The overall F test can tell us ________.
that at least one independent variable is significant
The least squares prediction equation can be used to estimate the mean value of the dependent variable and _____
to predict an individual value of the dependent variable.
When testing the significance of the independent variables, the provided regression analysis is obtained. Which variables can we conclude is significantly related to y at the 1% level of significance?
x1
Which of the following is a multiple regression model to predict y on the basis of k > 2 independent variables?
y = β0 + β1x1 + β2x2 +... + βkxk + ε
Adjusted R2 is preferred in multiple regression because ______
R2 tends to overestimate the importance of the independent variables.
Dummy variables usually assume the values of ______.
0 to 1
A multiple regression model with 3 independent variables, based on 30 observations, gives SSE = 1.4318. Calculate the approximate standard error, s.
0.23
If the least squares predication equation is found to be ŷŷ = 10 + 2x1 - 3x2, what is the predicted value of y when x1 = 4 and x2 = 1?
15
If a qualitative independent variable has four levels, how many dummy variables are needed to represent these levels in a regression model?
3
A multiple regression model with 2 independent variables, based on 10 observations, gives SSE = 323.9113. Calculate the approximate mean square error, s2.
46
Which of the following are assumptions underlying multiple regression?
At any values of x1, x2,..., xk, the population of potential errors has a normal distribution. The error terms corresponding to different observations of y are independent.
How should influential outliers be handled?
Discard them, and rerun the regression. If they are due to a recording error, correct them and rerun the regression.
The multiple coefficient of determination, R2, is calculated as ______
Explained variation / Total variation
To test the significance of a multiple regression model, what is the test statistic used?
F(model) = (Explained variation/k) / (Unexplained variation/[n−(k+1)])
When testing the significance of an independent variable xj, what are the typical competing hypotheses?
H0: βj = 0 versus Ha: βj ≠ 0
Suppose a contractor specializing in installing hardwood floors uses a multiple regression model to predict the cost of installing a new floor. If he uses the independent variables x1 = square footage of the room and x2 = number of linear feet of baseboard installed, what would it mean if he estimated that b0 = 500, b1 = 20, and b2 = 10?
He estimates it will cost $800 more to install flooring in a 340 sq ft room than in a 300 sq ft room if the rooms use equal amounts of baseboard. He estimates it will cost $100 more to do a job using 70 linear feet of baseboard than to do one using 60 feet, if the rooms are the same size.
Line graphs of y versus x1, for two different levels of x2, are plotted on the same grid. How can we determine if interaction exists?
Interaction exists if the lines are not parallel.
Suppose the tenth studentized residual = 1.35 while tenth studentized deleted residual is 4.07. What does this imply?
Observation 10 is probably an outlier with respect to its y value.
Suppose that Cook's D for observation 5 in a multiple regression data set is much larger than F.50. What does this indicate?
Observation 5 has an influential impact on the model's estimated parameters.
Why do we have to be wary of outliers when deciding on a regression model?
Outliers may substantially change the standard error. Outliers can substantially change the least squares point estimates. Outliers may inflate our confidence and prediction intervals.
Suppose a multiple regression model using x1 and x2 has R2 = 0.85 and adjusted R2 =0.833. If you add a third variable x3 to the model and discover b3= 0, what happens to the model and these two quantities?
SSE will not change. R2 will remain at 0.85.
________ residuals are used to identify values that are outlying with respect to their y values.
Studentized
Before considering a multiple regression analysis, what should be examined?
The correlations between y and each independent variable. Plots of y versus each independent variable.
What does it mean if the leverage value hi corresponding to the ith observation is large?
The ith observation is outlying in terms of its x values. The ith observation may influence the prediction equation substantially.
In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β1 interpreted
The mean change in y associated with a one-unit increase in x1
In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β2 interpreted?
The mean change in y associated with a one-unit increase in x2
In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β0 interpreted?
The mean y value when x1 = 0 and x2 = 0
R2 is the multiple coefficient of determination. What is R?
The multiple correlation coefficient
If influential outliers are present due to a recording error, correct them and rerun the regression.
True
If the correlations between y and each independent variable is strong, it is reasonable to consider multiple regression analysis.
True
The leverage value hi corresponding to the ith observation is outlying in terms of its x values.
True
True or false: y = β0 + β1x1 + β2x2 + ε is a multiple regression model to predict y on the basis of two independent variables.
True
True or false: The closer the multiple coefficient of determination is to 1, the better the model is at predicting y.
True
True or false: The provided graph indicates that there is an interaction between age and dosage on the number of seizures.
True
True or false: When predicting the value of y, given k independent variables, we use a value of 0 for the error term, ε.
True
Suppose we are testing the significance of an independent variable xj and the p-value of our test statistic is 0.021. Which of the following correctly reflects the strength of our evidence that xj is significantly related to y in the regression model?
We have strong evidence.
A 100(1 - α) percent confidence interval for βj is ______
[bj ± tα/2 sbj]
When R2 is modified to take into consideration the number of observations and the number of ______ variables, the result is called the adjusted R2.
independent
Suppose the tenth studentized residual = 1.35 while tenth studentized deleted residual is 4.07. This implies Observation 10 _________
is probably an outlier with respect to its y value.
To test the significance of a multiple regression model, using the overall F test, the test statistic follows the F distribution with ______ degrees of freedom.
k numerator and n - (k + 1) denominator
Intuitively, the best point estimates for the regression parameters β0, β1, ..., βk, are those that ______
minimize the residuals.
Multiple regression models that use _____ independent variable(s).
more than one
Regression models that use more than one independent variable are called ______ regression models.
multiple
Dummy variables are used to model the effects of different levels of a ______ variable.
qualitative independent
The 100(1 - α) percent confidence interval for βj is based on ______
the t distribution with n - (k + 1) degrees of freedom.
When testing the significance of an independent variable xj, the test statistic is based on ______
the t distribution with n - (k + 1) degrees of freedom.
In the multiple regression model having ______ independent variables, the mean level μy = β0 + β1x1 + β2x2 is represented graphically as a plane
two
Studentized residuals are used to identify values that are outlying with respect to their _______.
y values.
Consider the model y = β0 + β1x + β2D + ε, where x is a quantitative variable and D is a dummy variable. For D = 1, the predicted value of y is computed as ______
ŷ = (b0 + b2) + b1x
Consider the model y = β0 + β1x + β2D + ε, where x is a quantitative variable and D is a dummy variable. For D = 0, the predicted value of y is computed as ______
ŷ = b0 + b1x
Consider the model y = β0 + β1x + β2D + β3xD + ε, where x is a quantitative variable and D is a dummy variable. For D = 1, the predicted value of y is computed as ______
ŷŷ = (b0 + b2) + (b1 + b3)x
________ may substantially change the standard error.
Outliers
The least squares prediction equation that best fits observed data gives the smallest possible ______.
SSE
Dummy variables are also known as ______ variables.
indicator