3.2
The least squares regression line always passes through the point...
(x̅, ȳ)
What are the three limitations of correlation and regression?
1) the distinction between explanatory and response variables is important in regression (it does not matter in correlation) 2) correlation and regression lines describe only linear relationships 3) correlation and least squares regression lines are not resistant (outliers and influential points can change them strongly)
Slope
for every one-unit increase in x, the predicted value of y changes by b units (b is the slope in ŷ=a+bx)
If a least squares regression line fits the data well, what characteristics should the residual plot exhibit?
if the form of the association and the form of the model are the same, the residual plot should show no pattern, only random scatter around zero
Given r², what can be concluded about the relationship between x and y?
r² gives the percent of the variability in y that is explained by the linear relationship between x and y (r itself gives only the direction and strength of the linear relationship)
Residual
the difference (vertical distance) between an observed value and the predicted value, residual = y - ŷ; positive if the point is above the line and negative if it is below the line; the residuals from a least squares regression line sum to 0
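A minimal Python sketch of the residual calculation. The x and y values are made up for illustration (they are not from this section), and the line is fit with the standard least squares formulas. It computes each residual as y - ŷ and checks that they sum to zero, up to rounding.

```python
from statistics import mean

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Least squares slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b = sxy / sxx
a = y_bar - b * x_bar

# Residual = observed y minus predicted y (y - y_hat).
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(residuals)       # positive: point above the line; negative: below
print(sum(residuals))  # zero, up to floating-point rounding
```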
Coefficient of Determination, r²
the fraction of the variation in the values of y that is accounted for by the least squares regression line of y on x
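One way to see this definition in numbers, sketched in Python on made-up data (an assumption for illustration): r² is computed as 1 minus the leftover squared residuals divided by the total variation in y, and compared with the square of the correlation r.

```python
from statistics import mean

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares line y_hat = a + b*x.
b = sxy / sxx
a = y_bar - b * x_bar

# Variation in y left over after using the line (sum of squared residuals).
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / syy          # fraction of variation accounted for
r = sxy / (sxx * syy) ** 0.5       # correlation

print(r_squared, r ** 2)           # the two values agree, up to rounding
```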
Least-Squares Regression Line
the line that makes the sum of the squared residuals as small as possible
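A small Python check of this property, using made-up data (an assumption for illustration). The helper name `sse` is hypothetical. The sketch compares the sum of squared residuals for the least squares line with the same sum for lines whose intercept or slope has been nudged slightly.

```python
from statistics import mean

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sse(a, b):
    """Sum of squared residuals for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

best = sse(a, b)

# Nudge the intercept and/or slope a little in every direction.
nearby = [
    sse(a + da, b + db)
    for da in (-0.1, 0, 0.1)
    for db in (-0.1, 0, 0.1)
    if (da, db) != (0, 0)
]

print(best, min(nearby))  # every nudged line has a larger sum of squares
```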
Y-Intercept
the predicted value of y when x=0
Predicted Value
the value of y predicted by the model, written ŷ
Extrapolation
the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line; such predictions are often not accurate
Standard Deviation of the Residuals, s
gives the approximate size of a typical prediction error (residual)
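A short Python sketch of the calculation, assuming the usual formula s = sqrt(sum of squared residuals / (n - 2)) and made-up data (both the data and the variable names are illustrative assumptions).

```python
from math import sqrt
from statistics import mean

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard deviation of the residuals: divide by n - 2, then take the root.
s = sqrt(sum(res ** 2 for res in residuals) / (n - 2))

print(s)  # typical distance between an observed y and its prediction
```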
Regression Line
ŷ=a+bx, where ŷ is the predicted value of the response variable, a is the y-intercept, b is the slope, and x is the explanatory variable; a line that describes how a response variable y changes as an explanatory variable x changes, used to predict values of y
How to find the regression line.
slope: b = r(s_y/s_x); y-intercept: a = ȳ - b·x̅ (s_x and s_y are the standard deviations of x and y)
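These two formulas can be sketched in Python as follows. The data are made up for illustration, and the function name `predict` is hypothetical. The last lines check that the resulting line passes through (x̅, ȳ).

```python
from statistics import mean, stdev

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)   # sample standard deviations

# Correlation r: sum of products of standardized values, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)) / (n - 1)

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # y-intercept

def predict(x_value):
    """Predicted value y_hat = a + b*x."""
    return a + b * x_value

print(a, b)
print(predict(x_bar), y_bar)  # the line passes through (x_bar, y_bar)
```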
Outliers and Influential Points
OUTLIER: an observation that lies outside the overall pattern of the other observations; points that are outliers in the y direction but not the x direction of a scatterplot have large residuals. INFLUENTIAL POINT: an observation is influential if removing it would markedly change the result of the calculation; points that are outliers in the x direction of a scatterplot are often influential for the least squares regression line.
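A Python sketch of the difference, on made-up data (all values and the helper name `fit` are illustrative assumptions). One extra point is an outlier only in the y direction, placed at the mean of x; the other is an outlier in the x direction. The sketch refits the line with each point added and compares slopes.

```python
from statistics import mean

def fit(x, y):
    """Return (intercept a, slope b) of the least squares line."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
    return y_bar - b * x_bar, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a0, b0 = fit(x, y)

# Outlier in the y direction only: x = 3 is the mean of x, y = 10 is unusually high.
a_y, b_y = fit(x + [3], y + [10])
residual_y_outlier = 10 - (a_y + b_y * 3)

# Outlier in the x direction: x = 20 is far from the other x values.
a_x, b_x = fit(x + [20], y + [2])

print(b0)                        # slope without either extra point
print(b_y, residual_y_outlier)   # slope barely moves, residual is large
print(b_x)                       # slope changes markedly (here it changes sign)
```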
Residual Plot
a scatterplot of the (x, residual) pairs; isolated points or a pattern of points in the residual plot indicate potential problems with the linear model
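As a rough illustration in Python, the sketch below fits a line to made-up data that actually follow a curve (y = x², an assumption chosen for the example) and lists the (x, residual) pairs that a residual plot would show. Plotting is omitted to keep the example to the standard library.

```python
from statistics import mean

# Made-up curved data (y = x squared) for illustration only.
x = [1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

# The pairs a residual plot would display.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
for xi, res in zip(x, residuals):
    print(xi, res)
```

The residuals run positive, negative, negative, negative, positive: a curved pattern rather than random scatter, which signals that a line is the wrong form of model for these data.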
Why does association not imply causation?
a strong association between two variables is not enough to draw conclusions about cause and effect, because other (lurking) variables may be responsible for the association