Ch 9.2: Linear Regression
*y* represents the
*observed* y-value for a data point
Is it appropriate to use a regression line to predict y-values for x-values that are not in (or close to) the range of x-values found in the data?
No. The regression line models the trend of the given data, and it is not known whether that trend continues beyond the range of those data.
Given a set of data and a corresponding regression line, describe all values of x that provide meaningful predictions for y.
Prediction values are meaningful only for x-values in (or close to) the range of the original data.
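A minimal sketch of that range check, in Python. The function name `is_meaningful_x` and the `margin` parameter are hypothetical; how far outside the data still counts as "close to" the range is a judgment call, so the margin here is only an assumption.

```python
def is_meaningful_x(x_new, xs, margin=0.0):
    """Return True if x_new lies in (or within `margin` of) the range of xs.

    `margin` is an assumed tolerance for "close to" the data range.
    """
    lo, hi = min(xs), max(xs)
    return lo - margin <= x_new <= hi + margin


# Made-up sample x-values, for illustration only
xs = [1, 2, 3, 4, 5]

print(is_meaningful_x(3.5, xs))              # inside the range of the data
print(is_meaningful_x(12, xs))               # far outside: extrapolation
print(is_meaningful_x(5.4, xs, margin=0.5))  # just outside, within the margin
```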
In order to predict y-values using the equation of a regression line, what must be true about the correlation coefficient of the variables?
The correlation between the variables must be significant.
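As a rough illustration, the correlation coefficient r can be computed as below. The data are made up, and the t-statistic shown is one common way to test significance (an assumption, since these notes do not say which test is used); its value would still need to be compared against a critical value from a table at the chosen significance level.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up sample data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)

# Assumed test: t-statistic with n - 2 degrees of freedom
n = len(xs)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(r, 4), round(t, 4))
```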
*x(bar)* represents the
average of *all x-values*
*y(bar)* represents the
average of *all y-values*
*y(hat)* represents the
predicted y-value (on a regression line)
residual (*d_i*)
the difference between the *observed* y-value and the *predicted* y-value (can be positive, negative, or 0)
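A short sketch of computing residuals in Python. The data and the line y(hat) = 0.6x + 2.2 are made-up values chosen for illustration.

```python
# Made-up sample data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Assumed slope and y-intercept of the line being checked
m, b = 0.6, 2.2

# residual d_i = observed y minus predicted y(hat)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]

for x, y, d in zip(xs, ys, residuals):
    print(f"x={x}  observed={y}  predicted={m * x + b:.1f}  residual={d:+.1f}")
```

For this sample the residuals come out to about -0.8, +0.6, +1.0, -0.6, and -0.2: negative where the point lies below the line and positive where it lies above.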
regression line (*line of best fit*)
the line for which the sum of the squares of the residuals is a minimum
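To see the "minimum" in action, the sketch below compares the sum of squared residuals for a few candidate lines. The data are made up; (0.6, 2.2) is the least-squares slope and intercept for this particular sample, and the other two candidates are arbitrary nearby lines.

```python
def sum_squared_residuals(m, b, xs, ys):
    """Sum of the squares of (observed y - predicted y) for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up sample data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate (slope, intercept) pairs; the first is the least-squares line here
candidates = [(0.6, 2.2), (0.7, 2.2), (0.6, 2.0)]

for m, b in candidates:
    print(m, b, round(sum_squared_residuals(m, b, xs, ys), 2))
```

The first candidate gives the smallest sum (about 2.4); nudging the slope or the intercept makes the sum larger.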
the regression line *always passes through point* ...
(x(bar), y(bar))
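One way to see why, assuming the standard least-squares formula for the y-intercept, b = y(bar) - m x(bar) (that formula is not stated in these notes): substituting x = x(bar) into the line gives back y(bar).

```latex
\hat{y} = mx + b, \qquad b = \bar{y} - m\bar{x}
\quad\Longrightarrow\quad
\hat{y}\,\big|_{x=\bar{x}} = m\bar{x} + \left(\bar{y} - m\bar{x}\right) = \bar{y}
```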
equation of a regression line
y(hat) = mx + b, where m is the slope and b is the y-intercept
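A minimal sketch of finding m and b and then predicting, assuming the standard least-squares formulas (the notes do not give them). The names `fit_line` and `predict` are hypothetical, and the data are made up.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y(hat) = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # slope from the summation form of the least-squares formula
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # intercept: b = y(bar) - m * x(bar)
    b = sum_y / n - m * sum_x / n
    return m, b


def predict(x, m, b):
    """Predicted y-value, y(hat), for a given x."""
    return m * x + b


# Made-up sample data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
print(f"y(hat) = {m:.2f}x + {b:.2f}")
print(f"prediction at x = 3.5: {predict(3.5, m, b):.2f}")
```

Here x = 3.5 was picked because it falls inside the range of the sample x-values, where a prediction is meaningful.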