Chapter 13: Correlation and Linear Regression

The coefficient of determination must be between

0 and 1, or 0% and 100%

Place the following steps in correlation analysis in the order that makes the most sense

1. Make a scatter diagram. 2. Calculate a correlation coefficient. 3. Draw a least squares fit line.

The correlation between the size of a house and its sale price was found to be r = .77. What percentage of the variation in sale price can be predicted from the size of a house using a regression line?

59%, since R^2 = 0.77^2 = 0.5929

The correlation between wait time on a help line and customer satisfaction was found to be r = -.85. What percentage of the variation in customer satisfaction can be predicted by wait time using a regression line?

72%, since R^2 = (-0.85)^2 = 0.7225
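The arithmetic in the two cards above can be checked with a short Python sketch. The function name `percent_explained` is hypothetical, chosen only for this illustration; the two r values come from the cards.

```python
def percent_explained(r: float) -> float:
    """Return the coefficient of determination, R^2, as a percentage."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return 100.0 * r * r


# r values taken from the two cards above.
print(round(percent_explained(0.77)))   # house size vs. sale price: 59
print(round(percent_explained(-0.85)))  # wait time vs. satisfaction: 72
```

Squaring removes the sign, so a negative correlation explains just as much variation as a positive one of the same size.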

CORRELATION ANALYSIS

A group of techniques to measure the relationship between two variables

LEAST SQUARES PRINCIPLE

A mathematical procedure that uses the data to position a line with the objective of minimizing the sum of the squares of the vertical distances between the actual y values and the predicted values of y.
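A minimal Python sketch of the principle, using made-up data points (not from the chapter) and hypothetical helper names `sse` and `least_squares`. It fits the line with the usual closed-form least squares formulas, then checks that nudging the intercept or the slope only increases the sum of squared vertical distances.

```python
def sse(x, y, a, b):
    """Sum of squared vertical distances between y and the line a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def least_squares(x, y):
    """Return (a, b) for the least squares line yhat = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)
best = sse(x, y, a, b)

# Any other line should give a larger sum of squares.
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    assert sse(x, y, a + da, b + db) > best

print(f"a = {a:.2f}, b = {b:.2f}, SSE = {best:.2f}")
# prints: a = 2.20, b = 0.60, SSE = 2.40
```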

INDEPENDENT VARIABLE

A variable that provides the basis for estimation.

REGRESSION EQUATION

An equation that expresses the linear relationship between two variables.

Suppose we believe that X and Y are positively correlated. Which of the following is a valid null hypothesis for this test of significance of the correlation coefficient?

H0: ρ <= 0. This makes the alternate hypothesis H1: ρ > 0; the null hypothesis must contain the equality.

If we wanted to test whether there was a negative correlation between two variables, which one of the following would be the correct alternative hypothesis?

H1: ρ < 0

STANDARD ERROR OF ESTIMATE

s_yx = sqrt( Σ(y - yhat)^2 / (n - 2) ), where s_yx is the standard error of Y for a given X (the standard error of estimate), yhat is the estimated Y for a given X, y is an observed value of Y, and n - 2 is the degrees of freedom (sample size minus 2).
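As a rough illustration of the formula, the sketch below computes the standard error of estimate in Python. The data are made up, the function name `standard_error_of_estimate` is hypothetical, and the line yhat = 2.2 + 0.6x is assumed to be the least squares line for these particular points.

```python
import math


def standard_error_of_estimate(y, yhat):
    """sqrt of the sum of squared residuals divided by n - 2."""
    n = len(y)
    if n <= 2:
        raise ValueError("need more than 2 observations")
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    return math.sqrt(sse / (n - 2))


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Assumed regression line for these points: yhat = 2.2 + 0.6x.
yhat = [2.2 + 0.6 * xi for xi in x]

s_yx = standard_error_of_estimate(y, yhat)
print(round(s_yx, 4))  # 0.8944
```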

Which of the following tests gives the same result as a test of the regression line slope?

The t-test for the correlation coefficient. They are mathematically the same.

DEPENDENT VARIABLE

The variable that is being predicted or estimated.

Y-INTERCEPT

a = ybar - b(xbar), where a is the y-intercept, b is the slope of the regression line, ybar is the mean of the dependent variable, and xbar is the mean of the independent variable.

CORRELATION COEFFICIENT

A measure of the strength of the linear relationship between two variables, with -1 <= r <= 1. If there is absolutely no relationship between the two sets of variables, Pearson's r is zero. A correlation coefficient r close to 0 (say, .08) shows that the linear relationship is quite weak. r = Σ(x - xbar)(y - ybar) / [(n - 1)(sx)(sy)]
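The formula for r can be sketched in Python as follows. The data are made up, and `pearson_r` is a hypothetical helper name; `statistics.stdev` is used because it returns the sample standard deviation (divisor n - 1), which is what sx and sy are assumed to mean here.

```python
import statistics


def pearson_r(x, y):
    """Sum of cross-deviations divided by (n - 1) * sx * sy."""
    n = len(x)
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations
    cross = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return cross / ((n - 1) * sx * sy)


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```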

A study of the time spent taking a test and the final score on the test found a correlation coefficient of r = .13. How would you describe this relationship?

a weak positive correlation

SLOPE OF THE REGRESSION LINE

b = r(sy/sx), where r is the correlation coefficient, sy is the standard deviation of y (the dependent variable), and sx is the standard deviation of x (the independent variable).
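The slope formula on this card, together with the y-intercept formula a = ybar - b(xbar), is enough to build a regression line from summary statistics alone. The Python sketch below shows this; the function name `line_from_summary` and all of the numbers are hypothetical.

```python
def line_from_summary(r, sx, sy, xbar, ybar):
    """Return (a, b) for yhat = a + b*x from summary statistics."""
    b = r * (sy / sx)      # slope: b = r(sy/sx)
    a = ybar - b * xbar    # y-intercept: a = ybar - b(xbar)
    return a, b


# Hypothetical summary statistics, chosen only for illustration.
a, b = line_from_summary(r=0.8, sx=2.0, sy=5.0, xbar=10.0, ybar=50.0)
print(f"yhat = {a:.1f} + {b:.1f}x")  # yhat = 30.0 + 2.0x
```

Because sx and sy are always positive, the slope always has the same sign as r.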

If X is the size (in square feet) of a home, Y is its sales price, and the regression equation relating them is Yhat = $92,000 + 86x, what is the predicted sales price of a home when x = 0? Assume that all homes used to build the model were between 1,800 and 2,500 square feet.

Cannot estimate. x = 0 is outside the range of x-values used to build the model, and x = 0 means there is no home.
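One way to make this point concrete is a prediction function that refuses to extrapolate. The sketch below uses the equation and the 1,800 to 2,500 square foot range from the card; the names `predict_price`, `X_MIN`, and `X_MAX` are hypothetical.

```python
X_MIN, X_MAX = 1800, 2500  # range of home sizes used to build the model


def predict_price(sqft):
    """Predict sales price with yhat = 92,000 + 86x, inside the data range only."""
    if not X_MIN <= sqft <= X_MAX:
        raise ValueError(
            f"x = {sqft} is outside the range {X_MIN} to {X_MAX} used to build the model"
        )
    return 92_000 + 86 * sqft


print(predict_price(2000))  # 264000

try:
    predict_price(0)
except ValueError as err:
    print("cannot estimate:", err)
```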

A test for the slope of the regression line uses the hypotheses H0: β = 0, H1: β != 0. What are we seeking to discover with this test?

if the regression line has predictive power for the dependent variable

Which one of the following demonstrates the correct identification of the independent and dependent variables?

Independent variable: the size of a house. Dependent variable: the sales price of a house.

If two variables are correlated with each other, which of the following are characteristics of the dependent variable? Select all that apply.

It is usually shown on the vertical axis of a scatter diagram. In a cause-and-effect relationship, it is the effect.

Which of the following is usually the first step in a correlation analysis?

making a scatter diagram

The equation to estimate Y on the basis of X is referred to as the

regression equation

Which one of these tools allows one to examine the relationship between two variables of interval- or ratio-level measurement?

scatter diagram

A study found a correlation of r = .68 between the weekly sales of ice cream and the number of car accidents. What can you reasonably conclude?

Something else is related to both ice cream sales and car accidents. For example, ice cream sales may go up during the summer months, and more traveling takes place during the summer months.

How is the standard error of the estimate calculated from ANOVA information?

sqrt(MSE) = sqrt(SSE / (n - 2))

TEST FOR SLOPE OF REGRESSION LINE

t = (b - 0) / s_b, where s_b is the standard error of the slope and b is the sample slope, b = r(sy/sx).

t TEST FOR THE CORRELATION COEFFICIENT

t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom, where t is the t-distribution test statistic, r is the sample correlation, and n is the sample size.
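A short Python sketch of this test statistic follows. The function name `correlation_t` is hypothetical, and the sample size n = 25 is an assumption made only for the example; the cards do not give a sample size for any of the studies.

```python
import math


def correlation_t(r, n):
    """Return (t, degrees of freedom) for the test of H0: rho = 0."""
    if n <= 2:
        raise ValueError("need n > 2 so that n - 2 degrees of freedom is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t, n - 2


# r = 0.77 with an assumed sample size of 25.
t, df = correlation_t(0.77, 25)
print(f"t = {t:.2f} with {df} degrees of freedom")
# prints: t = 5.79 with 23 degrees of freedom
```

The computed t is then compared with a critical value from the t distribution with n - 2 degrees of freedom.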

What is the term that is used for the proportion of the total variation in Y that is explained by the variation in X?

the coefficient of determination

If the standard error of estimate for a regression line is large, what would you expect for the coefficient of determination?

The coefficient of determination should be small (a large error means a small predictive ability).

How do you calculate the coefficient of determination?

The coefficient of determination, R^2, is the square of the correlation coefficient, r.
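The two descriptions of the coefficient of determination in these cards (the proportion of the total variation in Y that is explained, and the square of r) give the same number for a simple linear regression. The Python sketch below checks this on made-up data; `r_squared_two_ways` is a hypothetical name, and 1 - SSE/SS total is assumed as the way to express "proportion explained".

```python
def r_squared_two_ways(x, y):
    """Return (proportion of variation explained, r squared)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)  # total variation in y

    # Least squares line and its unexplained variation (SSE).
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    explained = 1 - sse / syy
    # Same as the (n - 1)(sx)(sy) form of r, since the (n - 1) factors cancel.
    r = sxy / (sxx * syy) ** 0.5
    return explained, r * r


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

explained, r_sq = r_squared_two_ways(x, y)
print(round(explained, 4), round(r_sq, 4))  # 0.6 0.6
```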

A line is drawn through the points on a scatter diagram. Which three of the following are not likely to be a least squares fit?

The line passes through the largest and smallest data points. All of the data points are above the line. Nearly all of the data points are below the line.
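One reason a least squares line cannot have every point above it is that its residuals (observed y minus predicted y) sum to zero, so some must be positive and some negative unless the fit is perfect. The Python sketch below checks this on made-up data; `residuals_of_fit` is a hypothetical helper name. It does not address the first answer, about the line passing through the largest and smallest points.

```python
def residuals_of_fit(x, y):
    """Fit yhat = a + b*x by least squares and return the residuals y - yhat."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

res = residuals_of_fit(x, y)

# The residuals cancel out, up to floating-point rounding.
print(abs(sum(res)) < 1e-9)  # True

# So there are points both above and below the line.
print(any(e > 0 for e in res) and any(e < 0 for e in res))  # True
```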

Which of the following is the correct null hypothesis for the test of a sample correlation?

The population correlation is zero. H0: ρ = 0, H1: ρ != 0

In evaluating a regression equation, what does it mean if the standard error of estimate is small?

The predicted y will have small error. The data are close to the regression line.

Which of the following are statistics that regression analysis provides to evaluate the predictive ability of the regression equation?

The standard error of the estimate. The coefficient of determination.

When a line is drawn on a scatter diagram using the least squares principle, what is the quantity that is minimized?

the sum of the squared vertical differences between the data points and the line

Which of the following illustrate the connection between the tests of β and ρ? Select all that apply.

Their t-statistics are the same. Their p-values are the same. Their degrees of freedom are the same.

Compare the test for the slope of the regression line and the test for the correlation coefficient

they are mathematically the same and give the same result
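The equivalence can be checked numerically. The Python sketch below computes both t statistics on made-up data. Two assumptions are not stated in these cards: the function name `slope_and_correlation_t` is hypothetical, and the standard error of the slope is taken from the standard formula s_b = s_yx / sqrt(Σ(x - xbar)^2).

```python
import math


def slope_and_correlation_t(x, y):
    """Return (t for the slope test, t for the correlation test)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)

    # Slope test: t = (b - 0) / s_b.
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))   # standard error of estimate
    s_b = s_yx / math.sqrt(sxx)       # assumed formula for the slope's standard error
    t_slope = (b - 0) / s_b

    # Correlation test: t = r * sqrt(n - 2) / sqrt(1 - r^2).
    r = sxy / math.sqrt(sxx * syy)
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t_slope, t_corr


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

t_slope, t_corr = slope_and_correlation_t(x, y)
print(round(t_slope, 4), round(t_corr, 4))  # 2.1213 2.1213
```

Both tests also use n - 2 degrees of freedom, so the same t value leads to the same p-value.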

GENERAL FORM OF LINEAR REGRESSION EQUATION

yhat = a + bx, where yhat is the estimated value of the y variable for a selected x value; a is the y-intercept, the estimated value of Y when x = 0 (where the regression line crosses the y-axis); b is the slope; and x is any value of the independent variable that is selected.

