Chapter 14

Ace your homework & exams now with Quizwiz!

What level of correlation between two independent variables in a regression model generally does not cause multicollinearity problems.

A correlation coefficient between -0.7 and +0.7.

Autocorrelation can be identified by examining which of the following types of plots?

A scatter diagram of residuals versus fitted values.

The global test of the regression model examines the ratio of two variances. Which of these is the correct description of the test statistic?

F = MSR/MSE

What are the hypotheses for the Global test of the multiple regression model with three independent variables?

H0: β1= β2= β3 = 0, H1: Not all β's are 0

What is the purpose of the Adjusted Coefficient of Determination?

It adjusts R2 to reflect the the number independent variables used..

Which statement(s) correctly describe the Coefficient of Multiple Determination (R2)? Select all that apply.

It is the percent of explained variation. It can range from 0 to 1.

How do you interpret the "Standard Error" in a multiple regression output table?

It is the typical "error" when the regression equation is used to predict Y.

In an ANOVA table, how is the "Residual" related to the regression equation?

It is the variation in the value of Y not explained by the regression.

Which of the following are reasons to avoid correlation between independent variables (multicollinearity)? Select all that apply.

It may lead to erroneous results in hypothesis tests of independent variables. It is difficult to make inferences about the individual regression coefficients.

Define step-wise regression.

Method used to denote the process that builds a regression equation one independent variable at a time

One of the requirements of regression analysis is called the multicollinearity assumption. How is multicollinearity defined?

Multicollinearity exists when independent variables are correlated.

Multicollinearity can have many adverse effects on a multiple regression model. Which of these could be one of them?

Removing a non-significant variable results in drastic changes in the values of the remaining coefficients.

What are the two kinds of plots that allow us to visually evaluate the "linearity assumption"? Select all that apply.

Residual Plot Scatter Diagram

Nominal level variables can be used in regression analysis, in which case they are known as qualitative variables. Identify the qualitative variables from this list. Select all that apply.

Right or left handedness Male or Female Whether or not a car has air conditioning.

What happens in a regression analysis if the number of independent variables is equal to the sample size?

The coefficient of determination becomes 1.

What would you expect to see in a residual plot if the linearity assumption is correct? Select all that apply.

The points are scattered and there is no obvious pattern. The positive and negative values are evenly spread across the whole range.

The normal probability plot shows each residual plotted according to the percentile it represents in the set of residuals. What does it look like if the normality assumption is true?

The points closely approximate a line with positive slope.

Regression analysis makes several assumptions. Which of these best describes the "linearity assumption"?

The relationship between the dependent and individual independent variables is a straight line.

What is meant by "homoscedasticity" in regard to a multiple regression model?

The variation around the regression line is the same for all values of the independent variables.

What does the independent observation assumption mean for the residuals plot?

There is no pattern to the residuals.

Most software packages provide a histogram of residuals as part of regression analysis. How would you use this?

To visually evaluate the normality assumption

Which of the following is characteristic of multiple regression but not simple linear regression?

Two or more independent variables.

Suppose someone is building a model to predict the sales price of a house and they would like to include a variable to indicate whether or not the house has a pool. Which of the following could be used to model that variable.

Use 0 if there is no pool and 1 if there is one.

In the population model Y = α + β1X1+ β2X2+β3X1X2 what is the interaction term?

X1,X2

What is the term used to denote the process that builds a regression equation one independent variable at a time, starting with the one the most highly correlated and keeping only terms with significant coefficients?

stepwise

What effect does increasing the number of independent variables in the regression have on the coefficient of determination?

It makes it larger.

What is a normal probability plot? What is it used for? Select all that apply.

It plots the percentiles vs. the residuals. It is used to check the normality assumption of regression.

When you run a "stepwise regression" or "best subset regression", the software may work "too hard" to find an equation that fits the quirks of your data set. What characteristics should you look for in the regression equation?

It should make sense, based on your knowledge of the connection among the variables. It should be simple and logical.

How does the backward elimination method build a regression model?

It starts with all variables in the model and insignificant ones out one at a time.

When you use the global test for the multiple regression model, what are you testing?

It tests the null hypothesis that all population coefficients are zero.

Multicollinearity can have many adverse effects on a multiple regression equation. Which of these could be one of them?

A variable known to be an important predictor has a non-significant coefficient. The value or sign of one or more coefficients violates common sense.

If a regression equations predicts a Y-value of 15 with a standard error of 5, what does this mean?

About 68% of the sample Y-values are between 10 and 20.

What is the formula for the test statistic used to test individual coefficients of the multiple regression equation?

t = bi−0sbi

Many situations can occur when studying interactions among variables. Which of these situations are valid examples of an interaction?

An interaction among three variables. A nominal scale variable interacting with a ratio-scale variable.

Which method of building a regression model starts will all independent variables and removes one variable at a time until all remaining variables are significant?

Backward elimination method

Suppose k independent variables are being considered for building a regression model. Which method builds the best 1-variable model, the best 2-variable, model, ..., the best k-variable model?

Best-subset method

How are the coefficients found for the multiple regression equation?

By the least squares method using a statistical software package.

What is the impact of correlated independent variables?

Correlated independent variables make inferences about individual regression coefficients difficult.

Which entry in a multiple regression output table is used to draw the conclusion for the Global Test?

"Significance F" in the ANOVA table,

What is a "dummy variable" in the context of regression analysis?

A qualitative variable that has been replaced by a number, usually 0 or 1.

What is an "interaction term" in the context of regression analysis?

A new variable created by multiplying two independent variables.

What should you do if your regression analysis finds that b1 and b3 are significant but b2 is not (t=-0.83, p = 0.15) and neither is b4 (t=-0.25, p = 0.34).

Drop X4 and rerun the regression.

On a residual plot the points are close to zero on the left side but widely scattered on the right side. This indicates a possible violation of which multiple regression assumption?

Homoscedasticity

On a residual plot there are many, widely scattered points in the middle, but only a few points close to the line at either end. This indicates a possible violation of which multiple regression assumption?

Homoscedasticity

What is the term used for the assumption that the variation around the regression line will appear to be the same for the whole range of the residual plot?

Homoscedasticity

Data which is collected over time (time series data) often violates which of these regression assumptions?

Independent Observations

Many situations can occur when studying interactions among variables. Which of these situations is not a valid example of an interaction?

Interaction occurring as the sum of two variables.

As more independent variables are added to a regression model, the coefficient of determination tends to increase. How is this bad?

It can lead to adding variables with no predictive power.

What is the main advantage of using stepwise regression?

It is an efficient way to find a regression equation with only significant coefficients.

Nominal level variables can be used in regression analysis, in which case they are known as qualitative variables. Identify the qualitative variables from this list. Select all that apply.

Marital status Whether or not a home has a garage. Gender

Which one of the following entries from the output tables of a multiple regression model could be used to reject the null hypothesis of equal coefficients in the global Test?

Significance F = 0.002

What is the difference between simple linear regression and multiple regression?

Simple linear regression has one independent variable and multiple regression has two or more.

What distribution is used with the global test of the regression model to reject the null hypothesis?

The F-distribution.

The formula for the variance inflation factor is VIF = 11−Rj211-Rj2. If VIF>10 then multicollinearity is excessive. What is the meaning of Rj2 ?

The coefficient of determination of a regression with Xj as the dependent variable against the other independent variables

One of the assumptions of regression analysis is that the distribution of the Y values about the regression line is approximately normal. Which of these tools can you use to check this?

The histogram of residuals

If the pattern of residuals seems to cluster around a line with mostly positive values on the left and mostly negative values on the right, what regression assumption is violated?

The independent observations assumption.

Your experience tells you that an independent variable is positively correlated to the dependent variable, but a multiple regression model gives it a negative coefficient. What could cause this?

The model may have correlated independent variables.

Which of the following is not an advantage of stepwise regression models?

The model selected always has the highest R2.

Describe the sampling distribution of the test statistic for testing regression coefficients.

The t-distribution with n - (k + 1) degrees of freedom.

When testing individual coefficients of a multiple regression model, what is the sampling distribution?

The t-distribution with n - (k + 1) degrees of freedom.


Related study sets

BTT Vocabulary/Comprehension:Chapters 9-10

View Set

Quiz Chapter 8: The Global Classroom

View Set

Strength Training Conditioning & Programming

View Set

SL American Government Quiz 2 (Topics 4-7)

View Set

Businesss Foundations Seminar Mid Term PCR S7&8 Quiz Review

View Set

shoulder and arm bony landmarks, muscles, joint/ligaments

View Set