Chapter 4 Learning Curve
True or False: Using a linear regression equation, the closer the correlation, r, is to zero, the less accurate the prediction of y from x is.
True → r² gives the percentage of variation in y that is explained by the least squares regression line. So if r is close to zero, then r² is also close to zero, indicating that very little of the variation in y is explained by x.
True or False: A change of one standard deviation in x corresponds to a change of r standard deviations in y.
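True → Slope = r(s_y / s_x), where s_x and s_y are the standard deviations of x and y. A change of one standard deviation in x therefore corresponds to a change of r standard deviations in the predicted y. When r = 1, the change in the predicted y (in standard deviation units) is the same as the change in x.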
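The relationship between the slope and r can be checked numerically. The sketch below uses a small made-up data set (not from the chapter) and computes the least squares slope two ways: directly, and as r times the ratio of the standard deviations. It then moves x up by one standard deviation to see how far the prediction moves in standard deviation units of y.

```python
from statistics import mean, stdev

# Made-up illustrative data (not from the chapter).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

b = sxy / sxx                 # least squares slope
a = y_bar - b * x_bar         # least squares intercept
r = sxy / (sxx * syy) ** 0.5  # correlation coefficient

s_x, s_y = stdev(xs), stdev(ys)
slope_from_r = r * s_y / s_x  # slope written as r * (s_y / s_x)

# Increase x by one standard deviation and measure the change in the
# prediction, expressed in standard deviations of y.
x0 = 2  # arbitrary starting value
change_in_prediction = (a + b * (x0 + s_x)) - (a + b * x0)
change_in_sd_units = change_in_prediction / s_y

print("slope:", round(b, 4), "r * s_y / s_x:", round(slope_from_r, 4))
print("r:", round(r, 4), "change in predicted y (SD units):", round(change_in_sd_units, 4))
```

The two printed slopes should agree, and the change in the prediction (in standard deviation units) should match r.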
An outlier in only the y direction typically has influence on the computation of the ________.
correlation → Outliers in only the y direction typically influence only the correlation because they increase the scatter about the regression line.
The general form for a linear equation is given as: y = a + bx. a represents the y-______.
intercept
An observation with an unusually large (in absolute value) positive or negative residual is classified as a(n) ________________.
outlier → An outlier has an unusually large (in absolute value) positive or negative residual.
The vertical distance between a data point and the regression line is called the ______.
residual → The value of a residual tells us the vertical distance between a data point and the regression line.
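A residual is the observed y minus the predicted y, so its sign also shows whether the point lies above or below the line. The sketch below assumes five made-up points and the line y = 2.2 + 0.6x, which is the least squares line for those points; none of these numbers come from the chapter.

```python
# Hypothetical data for illustration; a = 2.2 and b = 0.6 are the least
# squares coefficients for these five made-up points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6


def predict(x):
    """Predicted y on the regression line y = a + b*x."""
    return a + b * x


# Residual = observed y - predicted y (signed vertical distance).
residuals = [y - predict(x) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    side = "above" if e > 0 else "below"
    print(f"x={x}: observed {y}, predicted {predict(x):.1f}, "
          f"residual {e:+.1f} ({side} the line)")
```

For a least squares line the residuals sum to zero, apart from floating-point rounding.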
Expressing the regression equation in terms of the x variable instead of the y variable will cause the y intercept and ____ to change.
slope
The general form for a linear equation is given as: y = a + bx. In the regression equation, b is called the ______.
slope → b is the symbol for the slope. It is the coefficient of x in the equation.
Which regression line will give the best predictions?
y = 5.8 + 0.15x, r² = 0.9 → This line has the highest r² and hence will give the best predictions.
Which one of the following r² values is associated with the line explaining the most variation in y?
98% → r² gives the percentage of variation in y that is explained by the least squares regression line. 98% is the largest of these r² values; it is associated with the line explaining the most variation in y.
True or False: The general form for a linear equation is given as: y = a + bx. In this equation, x is the slope.
False → The coefficient of x is the slope; here, that is b.
True or False: The regression model y = a + bx is only reasonable when r > 0.7.
False → The correlation coefficient, r, measures the strength of the linear relationship. It does not tell us whether a linear model is reasonable. A linear relationship is considered strong when r > 0.7, but the linear model is also appropriate for weaker relationships as long as the pattern is generally linear.
True or False: The response variable, y, and the explanatory variable, x, can be interchanged in the least squares regression line equation.
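False → The response variable, y, and the explanatory variable, x, cannot be interchanged in the least squares regression line equation. The line minimizes the vertical (y-direction) distances to the points, so regressing x on y generally produces a different line.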
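One way to see why x and y cannot be swapped is to fit both regressions on the same data and compare. The sketch below uses made-up data and a hypothetical helper named fit; it fits y on x, fits x on y, and then solves the second line for y to compare slopes.

```python
def fit(explanatory, response):
    """Least squares line for response = a + b * explanatory.

    Returns (intercept, slope).
    """
    n = len(explanatory)
    e_bar = sum(explanatory) / n
    r_bar = sum(response) / n
    s_er = sum((e - e_bar) * (r - r_bar) for e, r in zip(explanatory, response))
    s_ee = sum((e - e_bar) ** 2 for e in explanatory)
    slope = s_er / s_ee
    return r_bar - slope * e_bar, slope


# Made-up illustrative data (not from the chapter).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_yx, b_yx = fit(xs, ys)  # y = a_yx + b_yx * x
a_xy, b_xy = fit(ys, xs)  # x = a_xy + b_xy * y

# Rewriting x = a_xy + b_xy * y as a line in y gives slope 1 / b_xy.
slope_if_swapped = 1 / b_xy

print("slope of y on x:", round(b_yx, 4))
print("slope implied by x on y:", round(slope_if_swapped, 4))
```

The two slopes differ for this data, so the two regressions describe different lines. They would coincide only if the points fell exactly on a line.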
What does r² tell us?
The percentage of variation in the y's that is explained by the least squares regression of y on x. → r² gives the percentage of variation in y that is explained by the least squares regression of y on x.
True or False: The slope of a least squares regression line tells us about the strength of the relationship between x and y.
False → The slope tells us the direction and rate of change of y as x increases by one unit. Different bivariate quantitative data sets can have the same slope in their respective least squares regression models, but very different measures of strength.
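The point that equal slopes can hide very different strengths can be illustrated with two made-up data sets. In the sketch below, the second set is the first plus scatter chosen so that the least squares slope stays the same; the helper name slope_and_r is hypothetical.

```python
def slope_and_r(xs, ys):
    """Return (least squares slope, correlation coefficient)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


# Made-up illustrative data (not from the chapter).
xs = [1, 2, 3, 4, 5]
ys_tight = [2, 4, 6, 8, 10]  # points exactly on y = 2x
ys_loose = [5, 1, 6, 5, 13]  # same trend with added scatter

b_tight, r_tight = slope_and_r(xs, ys_tight)
b_loose, r_loose = slope_and_r(xs, ys_loose)

print("tight: slope", round(b_tight, 4), "r", round(r_tight, 4))
print("loose: slope", round(b_loose, 4), "r", round(r_loose, 4))
```

Both fits have the same slope, but the correlation of the scattered set is noticeably weaker.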
The general form for a linear equation is given as: y = a + bx. This regression model is appropriate in which situation?
Only for linear relationships. → The regression line is only valid when the relationship between x and y is linear.
Which of the following measures the strength of fit of a regression line in terms that are easily explained?
Square of the correlation coefficient (r²)
A quantity that measures the amount of variation in y explained by a regression model is the ____________ of the correlation coefficient.
square → r² measures the amount of variation in y that is explained by the regression model.
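The "variation explained" reading of the squared correlation can be checked by comparing the leftover variation around the line with the total variation in y. The sketch below uses made-up data and the standard identity that, for a least squares line, 1 minus (sum of squared residuals / total sum of squares) equals the square of r.

```python
# Made-up illustrative data (not from the chapter).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

b = sxy / sxx          # least squares slope
a = y_bar - b * x_bar  # least squares intercept
r = sxy / (sxx * syy) ** 0.5

# Variation left unexplained: squared residuals around the line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

explained_fraction = 1 - sse / syy

print("r squared:", round(r ** 2, 4))
print("fraction of variation explained:", round(explained_fraction, 4))
```

The two printed values should agree, which is why the squared correlation is read as the share of the variation in y accounted for by the line.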