Ch.8: Linear Regression

ŷ = b0 + b1x

- b0 = y-intercept
- b1 = slope

relating the equation to the standardized graph

- moving one standard deviation away from the mean in x moves our estimate r standard deviations away from the mean in y
- moving any number of standard deviations in x moves r times that number of standard deviations in y

residual mean is always

0

"Best Fit" means

Least squares: the line for which the sum of the squared residuals is the smallest
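
As a rough illustration of the least-squares idea, the sketch below fits a line to five made-up points with the usual formulas b1 = Sxy/Sxx and b0 = ȳ - b1·x̄, then checks that nudging the line away from that fit increases the sum of squared residuals. The data and the helper names (`fit_line`, `sum_squared_residuals`) are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope
    b0 = mean_y - b1 * mean_x   # intercept: the line passes through (mean_x, mean_y)
    return b0, b1


def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
# Any other line, such as this slightly shifted one, should do worse.
nudged = sum_squared_residuals(xs, ys, b0 + 0.1, b1 - 0.05)

print(b0, b1, best, nudged)
```

Comparing against one nudged line does not prove the fit is minimal; it only shows the direction of the claim on this small example.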

Linear Model (Line of Best Fit)

an equation of a straight line through the data

R^2

between 0 and 100%

changing units

doesn't change the correlation but changes the standard deviations
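
A small sketch of this point, assuming made-up measurements in inches that are converted to centimetres: the correlation stays the same while the standard deviation of the rescaled variable is multiplied by the conversion factor. The data and the `correlation` helper are hypothetical.

```python
import math
import statistics


def correlation(xs, ys):
    """Pearson correlation coefficient r, computed from deviations."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values, not from the source.
x_inches = [60, 62, 65, 68, 70]
y_values = [110, 120, 140, 155, 160]
x_cm = [x * 2.54 for x in x_inches]   # change of units for x only

r_inches = correlation(x_inches, y_values)
r_cm = correlation(x_cm, y_values)
sd_inches = statistics.stdev(x_inches)
sd_cm = statistics.stdev(x_cm)

print(r_inches, r_cm)    # the two correlations agree
print(sd_inches, sd_cm)  # the standard deviation scales by 2.54
```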

regression to the mean (regression line)

each predicted y tends to be closer to its mean (in standard deviations) than its corresponding x was

predicted value

estimate made from a model (ŷ)

1 - r^2

fraction of original variation left in residuals
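
To make the 1 - r^2 statement concrete, the sketch below fits a least-squares line to five made-up points and compares the sum of squared residuals with the total variation in y. The data are an assumption for illustration; the formulas are the standard ones for simple regression.

```python
import math

# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / math.sqrt(sxx * syy)             # correlation
b1 = sxy / sxx                             # least-squares slope
b0 = mean_y - b1 * mean_x                  # least-squares intercept

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
leftover = sum(e ** 2 for e in residuals) / syy   # fraction left in the residuals

print(r ** 2, leftover, 1 - r ** 2)
```

The last two printed values should agree up to rounding, and the first is the complementary fraction, the part of the variation accounted for by the line.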

squared correlation (r^2)

fraction of the original variation in y accounted for by the model

equation of the line of standardized points

ẑy = r·zx (zx and zy are the z-scores of x and y)
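
The relationship for standardized points can be checked numerically. This sketch, which uses made-up data and a hypothetical `z_scores` helper, standardizes x and y, fits a least-squares line to the z-scores, and compares its slope with r. It also shows the regression-to-the-mean effect: a point 2 standard deviations above the mean in x is predicted to be only 2r standard deviations above the mean in y.

```python
import statistics


def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

zx = z_scores(xs)
zy = z_scores(ys)
n = len(xs)

# Correlation as the (n - 1)-averaged product of z-scores.
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Least-squares line through the standardized points.
slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
intercept = statistics.mean(zy) - slope * statistics.mean(zx)

# Prediction for a point 2 standard deviations above the mean in x.
predicted_zy = r * 2.0

print(r, slope, intercept, predicted_zy)
```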

regression assumptions and conditions

- linearity assumption (straight enough condition)
- equal variance assumption (does the plot thicken? condition)
- outlier condition

standard deviation of the residuals (se)

measure of how much the points spread around the regression line

data =

model + residual

line of standardized scores

must go through the origin, which in standardized units corresponds to the point (mean of x, mean of y)

slope of the regression line

- over one standard deviation in x, up r standard deviations in ŷ
- b1 = r·sy/sx
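
A quick numerical check of the slope formula, assuming made-up data: the slope computed as r times sy over sx should match the slope computed directly from the deviations (Sxy/Sxx). The intercept then follows from the line passing through the point of means.

```python
import statistics

# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = statistics.mean(xs)
mean_y = statistics.mean(ys)
sx = statistics.stdev(xs)   # sample standard deviation of x
sy = statistics.stdev(ys)   # sample standard deviation of y

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

r = sxy / ((n - 1) * sx * sy)   # correlation coefficient
b1 = r * sy / sx                # slope from r, sy and sx
b0 = mean_y - b1 * mean_x       # intercept from the point of means
direct_slope = sxy / sxx        # slope from the deviations directly

print(b1, direct_slope, b0)
```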

percentage of variability in y explained by x

r squared

what can go wrong

see page 189

what have we learned

see page 190

terms

see page 190-191

skills

see page 191

y-intercept

often serves only as a starting value for predictions; it is a meaningful predicted value only when x = 0 is a plausible value for the data

scatterplot of residuals versus x-values

should be the most boring scatterplot you've ever seen - no distinctive features (direction, shape)

standard deviation of the residuals

se = sqrt(Σe² / (n - 2)), the square root of the sum of squared residuals over n minus 2
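
The formula can be sketched in a few lines. The data below are made up, the coefficients b0 = 2.2 and b1 = 0.6 are the least-squares values for those particular points, and `residual_sd` is a hypothetical helper name.

```python
import math


def residual_sd(xs, ys, b0, b1):
    """Standard deviation of the residuals: sqrt(sum(e^2) / (n - 2))."""
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return math.sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))


# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares coefficients for these five points (assumed precomputed).
s_e = residual_sd(xs, ys, b0=2.2, b1=0.6)
print(s_e)
```

The divisor is n - 2 rather than n - 1 because two quantities, the slope and the intercept, are estimated from the data.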

finding residuals

subtract the predicted value from the observed one (e = y - ŷ); a negative residual means the actual value is below the line, a positive one means it is above
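
As a small worked sketch, the code below computes residuals as observed minus predicted for made-up data, labels each point as above or below the line, and checks that the residuals of a least-squares line average to zero. The coefficients b0 = 2.2 and b1 = 0.6 are the least-squares values for these particular points and are hard-coded as an assumption.

```python
# Made-up illustrative data, not from the source.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares coefficients for these five points (assumed precomputed).
b0, b1 = 2.2, 0.6

predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]   # observed minus predicted

for x, y, e in zip(xs, ys, residuals):
    side = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"x={x}: observed {y}, residual {e:+.2f} ({side} the line)")

# Zero up to floating-point rounding for a least-squares line.
mean_residual = sum(residuals) / len(residuals)
print(mean_residual)
```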

slope of a regression line

b1 = r·sy/sx; it equals the correlation coefficient (r) only when x and y are standardized (z-scores)

residual

the difference between an observed value of the response variable and the value predicted by the regression line

units of slope

units of y per unit of x

model

a simplified description of a set of data points; here, a line together with its equation

