BA 2120 - SB 15.1-15.6 & 15.8-15.11


Which of the following are assumptions underlying multiple regression?

- At any values of x1, x2,..., xk, the population of potential errors has a normal distribution.
- The error terms corresponding to different observations of y are independent.

Suppose a contractor specializing in installing hardwood floors uses a multiple regression model to predict the cost of installing a new floor. If he uses the independent variables x1 = square footage of the room and x2 = number of linear feet of baseboard installed, what would it mean if he estimated that b0 = 500, b1 = 20, and b2 = 10?

- He estimates it will cost $100 more to do a job using 70 linear feet of baseboard than to do one using 60 feet, if the rooms are the same size.
- He estimates it will cost $800 more to install flooring in a 340 sq ft room than in a 300 sq ft room if the rooms use equal amounts of baseboard.
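The two differences above follow directly from the prediction equation ŷ = b0 + b1x1 + b2x2. A minimal sketch, assuming the estimates b0 = 500, b1 = 20, b2 = 10 from the question; the function name and the specific room sizes held fixed are illustrative choices, not part of the original problem:

```python
def predicted_cost(sq_ft, baseboard_ft, b0=500.0, b1=20.0, b2=10.0):
    """Point prediction y-hat = b0 + b1*x1 + b2*x2 for the flooring example."""
    return b0 + b1 * sq_ft + b2 * baseboard_ft

# Same room size (300 sq ft is an arbitrary choice), 70 vs 60 ft of baseboard.
baseboard_diff = predicted_cost(300, 70) - predicted_cost(300, 60)

# Same baseboard length (60 ft is an arbitrary choice), 340 vs 300 sq ft.
area_diff = predicted_cost(340, 60) - predicted_cost(300, 60)

print(baseboard_diff)  # 100.0
print(area_diff)       # 800.0
```

Because the model has no interaction term, the differences do not depend on which value the other variable is held at.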

How should influential outliers be handled?

- If they are due to a recording error, correct them and rerun the regression.
- Discard them, and rerun the regression.

A multiple regression model with 3 independent variables, based on 30 observations, gives SSE = 1.4318. Calculate the approximate standard error, s.

0.23 (s = √(SSE / (n - (k+1))) = √(1.4318 / 26) ≈ 0.2347)
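The calculation can be checked with a short script. This is a sketch only; the function names are made up for illustration, and k is the number of independent variables:

```python
import math

def mean_square_error(sse, n, k):
    """s^2 = SSE / (n - (k + 1)), where k is the number of independent variables."""
    return sse / (n - (k + 1))

def standard_error(sse, n, k):
    """s is the square root of the mean square error."""
    return math.sqrt(mean_square_error(sse, n, k))

s = standard_error(1.4318, n=30, k=3)
print(round(s, 2))  # 0.23
```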

When R2 is modified to take into consideration the number of observations and the number of independent variables, the result is called the _______ R2.

Adjusted

Studentized residuals are used to identify values that are outlying with respect to their _______.

y values.

Consider the model y = β0 + β1x + β2D + β3xD + ε, where x is a quantitative variable and D is a dummy variable. For D = 0, the predicted value of y is computed as ______

ŷ = b0 + b1x

Consider the model y = β0 + β1x + β2D + ε, where x is a quantitative variable and D is a dummy variable. For D = 0, the predicted value of y is computed as ______

ŷ = b0 + b1x

Suppose you are considering two different models for predicting y. Model #1 uses independent variables x1, x2, and x3 and has SSE = 10.8. Model #2 uses the 3 independent variables from #1 plus a fourth variable x4. Model #2 has SSE = 4.7. Both models are based on the same data set with n = 25 and total sum of squares = 122. Which statement(s) below is(are) correct?

- It appears x4 contains useful information about y not already found in x1, x2, and x3.
- In Model #2, R2 = .961 and adjusted R2 = .954.

Suppose you are considering two different models for predicting y. Model #1 uses 3 independent variables and has SSE = 103.5. Model #2 uses the 3 independent variables from #1 plus a fourth variable x4. Model #2 has SSE = 98.7. Both models are based on the same data set with n = 20 and total sum of squares = 1050. Which statement(s) below is(are) correct?

- Model #1 will have the smaller value of s.
- Model #1 has a higher adjusted R2 than Model #2, so we might prefer Model #1.
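The answers to the two model comparisons above can be reproduced from SSE, the total sum of squares, n, and k. The sketch below assumes the standard definitions R2 = 1 - SSE/SST, s = √(SSE / (n - (k+1))), and adjusted R2 = 1 - s2 / (SST / (n - 1)); the helper name `fit_summary` is hypothetical:

```python
import math

def fit_summary(sse, sst, n, k):
    """Return (R2, adjusted R2, s) for a model with k independent variables."""
    df_error = n - (k + 1)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / df_error) / (sst / (n - 1))
    s = math.sqrt(sse / df_error)
    return r2, adj_r2, s

# First comparison (n = 25, total sum of squares = 122), Model #2 with x4 added.
r2_a, adj_a, s_a = fit_summary(4.7, 122, n=25, k=4)

# Second comparison (n = 20, total sum of squares = 1050).
_, adj_1, s_1 = fit_summary(103.5, 1050, n=20, k=3)  # Model #1
_, adj_2, s_2 = fit_summary(98.7, 1050, n=20, k=4)   # Model #2

print(round(r2_a, 3), round(adj_a, 3))  # 0.961 0.954
print(s_1 < s_2, adj_1 > adj_2)         # True True
```

In the second comparison the drop in SSE from adding x4 is too small to offset the lost degree of freedom, which is why s rises and adjusted R2 falls.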

Why do we have to be wary of outliers when deciding on a regression model?

- Outliers may inflate our confidence and prediction intervals.
- Outliers may substantially change the standard error.
- Outliers can substantially change the least squares point estimates.

Dummy variables usually assume the values of ______.

0 or 1.

If the least squares prediction equation is found to be ŷ = 10 + 2x1 - 3x2, what is the predicted value of y when x1 = 4 and x2 = 1?

15

A regression model is obtained to relate sales (in number of cases) to shelf display height (bottom, middle, top). The dummy variable Middle is defined to be 1 if the product is displayed on the middle shelf and 0 otherwise. (The dummy variable Bottom is defined similarly). Looking at the provided regression analysis, which of the following conclusions is correct?

25.7

A multiple regression model with 2 independent variables, based on 10 observations, gives SSE = 323.9113. Calculate the approximate mean square error, s2.

46 (s2 = SSE / (n - (k+1)) = 323.9113 / 7 ≈ 46.27)

A regression model is obtained to relate sales (in number of cases) to shelf display height (bottom, middle, top). The dummy variable Middle is defined to be 1 if the product is displayed on the middle shelf and 0 otherwise. (The dummy variable Bottom is defined similarly). Looking at the provided regression analysis, predict sales when the products are displayed on the bottom shelf ______ cases.

55.8 (51.5 + 4.3: intercept + Bottom coefficient)

A regression model is obtained to relate sales (in the number of cases) to shelf display height (bottom, middle, top). The dummy variable Middle is defined to be 1 if the product is displayed on the middle shelf and 0 otherwise. (The dummy variable Bottom is defined similarly). Looking at the provided regression analysis, predict sales when the products are displayed on the middle shelf ______cases.

77.2 (51.5 + 25.7: intercept + Middle coefficient)
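The shelf predictions can be sketched in code. The regression output itself is not reproduced in this set, so the coefficient values below are an assumption inferred from the answers given here (top shelf as the baseline with intercept 51.5, Bottom coefficient 4.3, Middle coefficient 25.7); the function name is illustrative:

```python
def predicted_sales(shelf, b0=51.5, b_bottom=4.3, b_middle=25.7):
    """y-hat = b0 + b_bottom*Bottom + b_middle*Middle; the top shelf is the baseline."""
    bottom = 1 if shelf == "bottom" else 0  # dummy variable Bottom
    middle = 1 if shelf == "middle" else 0  # dummy variable Middle
    return b0 + b_bottom * bottom + b_middle * middle

for shelf in ("top", "bottom", "middle"):
    print(shelf, round(predicted_sales(shelf), 1))
```

With three shelf levels only two dummy variables are needed: the top shelf corresponds to both dummies equal to 0, so its prediction is the intercept alone.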

What is a multiple regression model?

A regression model that uses more than one independent variable.

Which of the following are assumptions underlying multiple regression?

At any given combination of values of x1, x2,..., xk, the population of potential error term values has a mean equal to 0.

Given the multiple regression analysis below, which of the following conclusions is true with regard to the overall F test (p-value = 0.0028) at the 5% significance level?

At least one of the independent variables is significantly related to the dependent variable.

The multiple coefficient of determination, R2, is calculated as ______.

Explained Variation / Total Variation

When testing the significance of an independent variable xj, what are the typical competing hypotheses?

H0: βj = 0 versus Ha: βj ≠ 0

The equation that best fits the observed data, by giving the smallest possible SSE, is known as the ______ ______ prediction equation.

Least Squares

Intuitively, the best point estimates for the regression parameters β0, β1, ..., βk, are those that ______

Minimize the residuals, that is, make the sum of squared residuals (SSE) as small as possible.

Regression models that use more than one independent variable are called ______ regression models.

Multiple

R2 is the multiple coefficient of determination. What is R?

Multiple Correlation Coefficient.

Given the multiple regression analysis below, which of the following conclusions is true with regard to the overall F test (p-value = 0.3821) at the 5% significance level?

Neither of the independent variables is significantly related to the dependent variable.

________ may substantially change the standard error.

Outliers

The least squares prediction equation that best fits observed data gives the smallest possible ______.

SSE

A regression model is obtained to relate sales (in number of cases) to shelf display height (bottom, middle, top). The dummy variable Middle is defined to be 1 if the product is displayed on the middle shelf and 0 otherwise. (The dummy variable Bottom is defined similarly). Looking at the provided regression analysis, which of the following conclusions is correct?

Sales are 4.3 cases greater when products are displayed on the bottom shelf compared to the top shelf.

________ residuals are used to identify values that are outlying with respect to their y values.

Studentized

If influential outliers are present due to a recording error, correct them and rerun the regression.

TRUE

True or false: When predicting the value of y, given k independent variables, we use a value of 0 for the error term, ε.

TRUE. Since the mean of all possible error terms is 0, using 0 for ε is a reasonable choice.

True or false: The closer the multiple coefficient of determination is to 1, the better the model is at predicting y.

TRUE. The closer the multiple coefficient of determination is to 1, the larger the proportion of total variation that is explained by the model.

True or false: The least squares prediction equation can be used to estimate the mean value of the dependent variable AND to predict an individual value of the dependent variable.

TRUE. The equation can be used for both point estimation and point prediction.

In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β1 interpreted?

The mean change in y associated with a one-unit increase in x1 when x2 is held constant.

In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β2 interpreted?

The mean change in y associated with a one-unit increase in x2 when x1 is held constant.

In the multiple regression model, y = β0 + β1x1 + β2x2 + ε, how is β0 interpreted?

The mean y value when x1 = 0 and x2 = 0

The variance inflation factor for the independent variable xj is calculated as ______

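VIFj = 1 / (1 - Rj2), where Rj2 is the multiple coefficient of determination from regressing xj on the other independent variables.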
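A small sketch of the VIF formula follows. The function name and the example values are hypothetical; the example assumes a two-predictor model, where Rj2 equals the squared simple correlation between the two predictors:

```python
def vif(r2_j):
    """Variance inflation factor from R_j^2 (x_j regressed on the other predictors)."""
    if not 0 <= r2_j < 1:
        raise ValueError("R_j^2 must be in [0, 1)")
    return 1 / (1 - r2_j)

# Hypothetical case: two predictors with simple correlation r = 0.9.
r = 0.9
print(round(vif(r ** 2), 2))  # 5.26
print(vif(0.0))               # 1.0 (x_j unrelated to the other predictors)
```

The VIF grows without bound as Rj2 approaches 1, so larger values signal stronger multicollinearity.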

Suppose we are testing the significance of an independent variable xj and the p-value of our test statistic is 0.021. Which of the following correctly reflects the strength of our evidence that xj is significantly related to y in the regression model?

We have strong evidence.

The adjusted multiple coefficient of determination is calculated as ______

[R2 - k/(n-1)] × [(n-1) / (n-(k+1))]
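As a check on this formula, the sketch below applies it to the earlier Model #2 example in this set (SSE = 4.7, total sum of squares = 122, n = 25, k = 4). The function name is illustrative, and R2 is assumed to be computed as 1 - SSE / total variation:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 = (R2 - k/(n-1)) * ((n-1) / (n-(k+1)))."""
    return (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))

r2 = 1 - 4.7 / 122  # unadjusted R2 for the example, about 0.961
adj = adjusted_r2(r2, n=25, k=4)
print(round(adj, 3))  # 0.954
```

Algebraically this is the same as 1 - (1 - R2)(n - 1) / (n - (k+1)), a form that some texts and software use instead.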

Given a correlation matrix, multicollinearity is considered severe if at least one simple correlation coefficient between independent variables is ______.

at least 0.9.

The variance inflation factor for x4 in a model employing 4 independent variables tells us

how strongly x4 is related to the other three independent variables in the model.

Multicollinearity exists when ______

independent variables are correlated.

Suppose a bicycle manufacturer wants to predict y, a rider's comfort on a bike, from arm length, leg length, and overall height. Because of the proportionality in the human body, we should expect that a regression model based on these three independent variables will be characterized by

multicollinearity.

Stepwise regression adds the best predictors, one at a time, until _________.

no statistically significant predictors remain

Dummy variables are used to model the effects of different levels of a ______ variable.

qualitative independent

When deciding which independent variables to include in a regression equation, adding the best predictors, one at a time, until no statistically significant predictors remain, is a method known as ______.

stepwise regression

When testing the significance of an independent variable xj, what is the test statistic used?

t = (bj / sbj)

