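VIF =

(s² × (n - 1) × SE²) / MSE, where s² is the sample variance of the independent variable and SE is the standard error of its slope coefficient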

R²

(also called the coefficient of determination or the multiple coefficient of determination) is the amount of variation in the dependent variable that is explained by the independent variables. It is calculated as SSR/SST (sum of squares due to the regression model/total sum of squares). It is also equal to (multiple R)².

A regression analysis between sales (Y) and advertising (X) resulted in the following equation: ŷ = 500 + 3580x. What does this equation imply?

An increase of $1 in advertising is associated with an increase of $3,580 in sales, on average

A decision maker has five potential independent variables with which to build a regression model to explain the variation in the dependent variable. At step 1, variable x3 enters the regression model. Which of the following indicates which of the four remaining independent variables will be next to enter the model?

The variable with the highest coefficient of partial determination

What does it mean if R² equals .78 in terms of x and y?

78% of the variation in y is explained by x.

In a multiple regression analysis involving 15 independent variables and 200 observations, SST = 800 and SSE = 240. The adjusted coefficient of determination is

0.68 (adjusted R² = 1 - (SSE/SST) × (n - 1)/(n - k - 1) = 1 - 0.30 × 199/184 ≈ 0.676)
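
The adjusted R² arithmetic for this card can be checked with a short sketch. The function name `adjusted_r_squared` is made up for illustration; the inputs are the numbers given in the question (SST = 800, SSE = 240, n = 200, k = 15).

```python
def adjusted_r_squared(sst, sse, n, k):
    """Adjusted R^2 = 1 - (SSE / SST) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than k + 1")
    r_squared = 1 - sse / sst          # ordinary R^2 = SSR / SST
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Numbers from the question: 15 independent variables, 200 observations.
value = adjusted_r_squared(sst=800, sse=240, n=200, k=15)
print(round(value, 3))
```

With these inputs the ordinary R² is 0.70 and the adjusted value comes out a little lower, at roughly 0.676.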

What are the steps Excel uses to determine which variables are entered for a forward selection stepwise regression model?

1. Enter the variable most correlated with Y. 2. Go through the remaining variables and enter the one that adds the most to R². 3. Repeat step 2 until no significant variables are left.

Spurious correlation (definition)

Two data sets that have nothing to do with each other but are nonetheless correlated

If we have a model with the variables x, x², and x³, what type of model has been developed?

3rd degree polynomial

If a decision maker wishes to develop a regression model in which the University Class Standing is a categorical variable with 5 possible levels of response, then he will need to include how many dummy variables?

4

1. What is the general rule for the sample size in a multiple regression model?

The sample size should be at least 4 times the number of independent variables

A decision maker is considering including two additional variables into a regression model that has as the dependent variable, Total Sales. The first additional variable is the region of the country (North, South, East, or West) in which the company is located. The second variable is the type of business (Manufacturing, Financial, Information Services, or Other). Given this, how many additional variables will be incorporated into the model?

6
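
The dummy-variable counts in the cards above follow one rule: a categorical variable with m levels needs m - 1 dummies, with one level left out as the baseline. The sketch below illustrates that rule; the helper names and the `is_` column prefix are hypothetical, and choosing the first level as the baseline is an arbitrary assumption.

```python
def dummy_columns(levels):
    """Names of the m - 1 dummy variables; the first level is the baseline."""
    return [f"is_{level}" for level in levels[1:]]


def encode(value, levels):
    """0/1 coding of one observation; the baseline level is all zeros."""
    if value not in levels:
        raise ValueError(f"unknown level: {value}")
    return [1 if value == level else 0 for level in levels[1:]]


regions = ["North", "South", "East", "West"]
business_types = ["Manufacturing", "Financial", "Information Services", "Other"]

# Two 4-level variables contribute 3 + 3 = 6 dummy variables in total.
added_variables = len(dummy_columns(regions)) + len(dummy_columns(business_types))
print(added_variables, encode("East", regions), encode("North", regions))
```

Including a dummy for every level would make the dummies sum to the intercept column, which is why one level is dropped.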

1. In a study of employees at a local company, the human resource manager wants to develop a multiple regression model to explain the difference in employee wage rates. She is thinking of including a variable, degree status, in which the following categories exist: no degree, H.S. degree, some college, junior college degree, bachelor degree, graduate degree, post-graduate degree. One other variable is being considered; Years with the Company. Given this, what is the appropriate number of independent variables to include in the model?

7

Which of the following would best describe the situation that a second-degree polynomial regression equation would be used to model?

A parabola

An industry study was recently conducted in which the sample correlation between units sold and marketing expenses was 0.57. The sample size for the study included 15 companies. Based on the sample results, test to determine whether there is a significant positive correlation between these two variables. Use an alpha = 0.05

Because t = 2.50 > 1.7709, reject the null hypothesis. There is sufficient evidence to conclude there is a positive linear relationship between sales units and marketing expense for companies in this industry.
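
The test statistic in this answer comes from t = r / sqrt((1 - r²)/(n - 2)) with n - 2 degrees of freedom. A minimal sketch follows; the function name is made up for illustration, and the critical value 1.7709 is copied from the card rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), with n - 2 degrees of freedom."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r / math.sqrt((1 - r * r) / (n - 2))


# Critical value quoted on the card (one-tailed, alpha = 0.05, df = 13).
T_CRITICAL = 1.7709

t_value = correlation_t_statistic(r=0.57, n=15)
reject_null = t_value > T_CRITICAL
print(round(t_value, 2), reject_null)
```

The same function gives about 2.80 for r = 0.468 with n = 30 and about 1.85 for r = 0.40 with n = 20, which matches the other correlation-test cards in this set.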

a. Adding variables that have a low correlation with the dependent variable will cause the R-square value to decline. t/f

F

a. The adjusted R-square might be higher or lower than the value of the R-square.

F

a. The coefficient of determination in a multiple regression model will be equal to the square of the highest correlation in the correlation matrix. t/f

F

correlation = causation t/f

FALSE

SSE/(n-k-1) =

MSE

SSR/k =

MSR

F Stat =

MSR/MSE
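
The mean squares and the F statistic can be tied together in a few lines, using the standard definitions MSR = SSR/k and MSE = SSE/(n - k - 1). This is only a sketch: the function name is hypothetical, and the inputs reuse the SST = 800, SSE = 240, n = 200, k = 15 example from the adjusted R² card.

```python
import math


def anova_summary(ssr, sse, n, k):
    """Return (MSR, MSE, F) for a regression with k independent variables."""
    msr = ssr / k                 # regression mean square, k degrees of freedom
    mse = sse / (n - k - 1)       # error mean square, n - k - 1 degrees of freedom
    return msr, mse, msr / mse


sst, sse = 800, 240
msr, mse, f_stat = anova_summary(ssr=sst - sse, sse=sse, n=200, k=15)
standard_error = math.sqrt(mse)   # standard error of the estimate
print(round(msr, 2), round(mse, 3), round(f_stat, 2), round(standard_error, 3))
```

In simple linear regression k = 1, so MSR equals SSR and the error degrees of freedom reduce to n - 2.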

A regression equation that predicts the price of homes in thousands of dollars is ŷ = 24.6 + 0.055x1 - 3.6x2, where x2 is a dummy variable that represents whether the house is on a busy street or not. Here x2 = 1 means the house is on a busy street and x2 = 0 means it is not. Based on this information, which of the following statements is true?

On average, homes that are on busy streets are worth $3600 less than homes that are not on busy streets.

Which of the following is not considered to be a stepwise regression technique?

Optimal variable entry and removal regression

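Coefficient of partial determination

Partial R²: the share of the variation left unexplained by the variables already in the model that is explained by adding the new variable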

SSR/SST =

R²

R² =

SSR/SST

Adding insignificant variables will cause the adjusted R-square value to decline. t/f

T

a. The sum of the residuals computed for the least squares regression equation will be zero. t/f

T
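
The zero-sum property of least squares residuals is easy to check numerically. The sketch below fits a simple linear regression by the usual closed-form formulas; the data values are invented purely for illustration.

```python
def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, not taken from the study set.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)
print(abs(residual_sum) < 1e-9)
```

The sum is zero up to floating-point rounding because the fitted line always passes through the point (x̄, ȳ) when the model includes an intercept.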

b1 / SE(b1) =

t statistic

Which of the following is a correct interpretation for the regression slope coefficient?

The average change in y for a one-unit change in x will be b1 units.

In a multiple regression, the dependent variable is house value (in '000$) and one of the independent variables is a dummy variable, which is defined as 1 if a house has a garage and 0 if not. The coefficient of the dummy variable is found to be 5.4 but the t-test reveals that it is not significant at the 0.05 level. Which of the following is true?

The house value remains the same with or without a garage.

Under what circumstances does the variance inflation factor signal that multicollinearity may be a problem?

When the VIF is greater than or equal to 5
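
For intuition about the threshold, the VIF of a variable equals 1 / (1 - R_j²), where R_j² comes from regressing that variable on the other independent variables. With only two independent variables, R_j² is just their squared correlation, which is the special case sketched below. The function name and the data are hypothetical, and the cutoff of 5 is the one stated on the card.

```python
def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - r^2) when the model has exactly two predictors."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 * s12 / (s11 * s22)   # squared correlation of x1 and x2
    if r_squared >= 1:
        raise ValueError("predictors are perfectly collinear")
    return 1 / (1 - r_squared)


# Hypothetical predictor values, not taken from the study set.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]

vif = vif_two_predictors(x1, x2)
flag_multicollinearity = vif >= 5
print(round(vif, 2), flag_multicollinearity)
```

Here the squared correlation is 0.6, so the VIF is 2.5 and falls below the cutoff; a squared correlation of 0.8 or more would push it to 5 or higher.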

Second-order polynomial models:

can curve upward or downward depending on the data.

A correlation of -0.9 indicates a weak linear relationship between the variables. t/f

false

A perfect correlation between two variables will always produce a correlation coefficient of +1.0 t/f

false

A research study has stated that the taxes paid by individuals are correlated at a .78 value with the age of the individual. Given this, the scatter plot would show points that fall on a straight line with a slope equal to .78. t/f

false

Both a scatter plot and the correlation coefficient can distinguish between a curvilinear and a linear relationship. t/f

false

If two variables are highly correlated, it not only means that they are linearly related, it also means that a change in one variable will cause a change in the other variable. t/f

false

If two variables are spuriously correlated, it means that the correlation coefficient between them is near zero. t/f

false

In a study of 30 customers' utility bills in which the monthly bill was the dependent variable and the number of square feet in the house is the independent variable, the resulting regression model is ŷ = 23.40 + 0.4x. Based on this model, the expected utility bill for a customer with a home with 2,300 square feet is approximately $92.00. t/f

false (ŷ = 23.40 + 0.4 × 2,300 = $943.40)

In developing a scatter plot, the decision maker has the option of connecting the points or not. t/f

false

The difference between a scatter plot and a scatter diagram is that the scatter plot has the independent variable on the x-axis while the independent variable is on the Y-axis in a scatter diagram. t/f

false

The following regression model has been computed based on a sample of twenty observations: ŷ = 34.2 + 19.3x. The first observations in the sample for y and x were 300 and 18, respectively. Given this, the residual value for the first observation is approximately 81.6. t/f

false (ŷ = 34.2 + 19.3 × 18 = 381.6, so the residual is 300 - 381.6 = -81.6)
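
The prediction and residual cards all use the same two formulas: ŷ = b0 + b1·x and residual = y - ŷ. A small sketch with hypothetical helper names, using the coefficients quoted in the questions:

```python
def predict(b0, b1, x):
    """Predicted value from a simple linear regression equation."""
    return b0 + b1 * x


def residual(b0, b1, x, y):
    """Observed value minus predicted value."""
    return y - predict(b0, b1, x)


utility_bill = predict(23.40, 0.4, 2300)          # utility bill card
first_residual = residual(34.2, 19.3, 18, 300)    # residual card
prediction_at_40 = predict(34.2, 19.3, 40)        # same model, x = 40

print(round(utility_bill, 2), round(first_residual, 1), round(prediction_at_40, 1))
```

The residual is negative because the observed y of 300 lies below the fitted value, which is why the card claiming a residual of 81.6 is false.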

Two variables have a correlation coefficient that is very close to zero. This means that there is no relationship between the two variables. t/f

false

When a correlation is found between a pair of variables, this always means that there is a direct cause and effect relationship between the variables. t/f

false

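What causes multicollinearity? How can we measure it? What threshold do we usually look for? What are some of the symptoms associated with multicollinearity?

Highly correlated independent variables or redundant information. We can measure it with the variance inflation factor (VIF); a VIF of 5 or more signals that multicollinearity may be a problem.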

If we determine that the model is significant, what specifically have we decided in simple linear regression?

For simple linear regression, if the model is significant, then the slope coefficient is significant, the R² value is significant, and the correlation between x and y is significant. NOTE: for a multiple regression model, a significant model means at least one of the independent variables is significant.

Will the standard error of the estimate always equal the standard error of the slope in simple linear regression?

i. No, they are not the same value.

• What does curvilinear relationship mean for the correlation coefficient?

Nothing! The correlation coefficient will still be a measure of the LINEAR relationship only.

• Are there any values that will be exactly the same in simple linear regression?

i. Yes--in simple linear regression, the p-value for the slope is equal to the p-value for the model (significance F).

Adjusted R²

is a measure of the variation in y that the model explains that is adjusted to take into account the relationship between the sample size and the number of independent variables in the regression model. It is especially helpful when you add or subtract variables to your model: the higher the adjusted R², the better.

What can we learn from scatter plots?

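Whether the relationship is linear or curvilinear; whether there is a lack of correlation; whether the points are evenly distributed or include outliers; whether the relationship is positive or negative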

residual DF =

n-k-1

What can we learn from a sample correlation coefficient?

r is the sample correlation coefficient. It always lies between -1 and 1; the closer it is to -1 or 1, the stronger the correlation. It measures only the linear relationship.

What assumptions do we make when we run a regression?

Residuals are normally distributed; residuals are independent of each other; the Xs are independent of each other; the relationship between Y and X is linear; the residuals have equal variance

Multiple R

is always equal to the positive square root of R². It only has meaning in a simple linear regression model, where it is the absolute value of the correlation coefficient between y and x. To figure out whether the correlation coefficient (r) is positive or negative, look at the sign of the slope coefficient.

Standard error of a model =

Square root of MSE

If a sample of n = 30 people is selected and the sample correlation between two variables is r = 0.468, what is the test statistic value for testing whether the true population correlation coefficient is equal to zero?

t ≈ 2.80 (t = r / sqrt((1 - r²)/(n - 2)) with r = 0.468 and n = 30)

A bank is interested in determining whether its customers' checking balances are linearly related to their savings balances. A sample of n = 20 customers was selected and the correlation was calculated to be +0.40. If the bank is interested in testing to see whether there is a significant linear relationship between the two variables using a significance level of 0.05, the value of the test statistic is approximately t = 1.8516. t/f

true

A dependent variable is the variable that we wish to predict or explain in a regression model. t/f

true

If a set of data contains no values of x that are equal to zero, then the regression coefficient, b0, has no particular meaning. t/f

true

If it is known that a simple linear regression model explains 56 percent of the variation in the dependent variable and that the slope on the regression equation is negative, then we also know that the correlation between x and y is approximately -0.75. t/f

true
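
In simple linear regression the correlation can be recovered from R² and the sign of the slope: r is the square root of R², carrying the slope's sign. The sketch below uses a hypothetical function name, and the slope value of -2.0 is a made-up placeholder since the card only says the slope is negative.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """r = sqrt(R^2) with the sign of the slope (simple linear regression only)."""
    if not 0 <= r_squared <= 1:
        raise ValueError("R^2 must be between 0 and 1")
    return math.copysign(math.sqrt(r_squared), slope)


# R^2 = 0.56 from the card; the slope value itself is a hypothetical placeholder.
r = correlation_from_r_squared(0.56, slope=-2.0)
print(round(r, 2))
```

Only the sign of the slope matters here; its magnitude depends on the units of x and y and does not affect r.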

If the correlation between two variables is known to be statistically significant at the 0.05 level, then the regression slope coefficient will also be significant at the 0.05 level. t/f

true

In a university statistics course a correlation of -0.8 was found between numbers of classes missed and course grade. This means that the fewer classes students missed, the higher the grade. t/f

true

In multiple regression analysis, the model will be developed with one dependent variable and two or more independent variables. t/f

true

The following regression model has been computed based on a sample of twenty observations: ŷ = 34.2 + 19.3x. Given this model, the predicted value for y when x = 40 is 806.2. t/f

true

The multiple coefficient of determination measures the percentage of variation in the dependent variable that is explained by the independent variables in the model. t/f

true

The scatter plot is a two dimensional graph that is used to graphically represent the relationship between two variables. t/f

true

The sum of the residuals computed for the least squares regression equation will be zero in a multiple regression model t/f

true

When constructing a scatter plot, the dependent variable is placed on the vertical axis and the independent variable is placed on the horizontal axis. t/f

true

When independent variables are highly correlated, multicollinearity will occur. t/f

true

What does it mean if the correlation coefficient between the number of wins and the number of fumbles is equal to -.82?

As the number of fumbles goes up, the number of wins tends to go down. Or, the fewer fumbles, the more wins.

What is the difference between the coefficient of determination, the multiple coefficient of determination, and multiple R?

• The coefficient of determination is R². • The multiple coefficient of determination is the official name for R² in a multiple regression model (meaning we have more than 1 independent variable). • Multiple R is not the same thing: it is the positive square root of R².

