Simple and Multiple Linear Regression
How to Judge Whether a Prediction is Good
The best prediction is the one that yields the smallest errors between predicted outcomes and actual outcomes.
Least Squares Criterion
▪Prediction errors are squared, and the best-fitting regression line is the one that has the smallest sum of squared errors.
Residuals
▪Errors in prediction are also called residuals: the difference between an actual score and a predicted score.
Standard Error of the Estimate
▪The standard deviation of the residual scores; a measure of error in regression.
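The steps above can be sketched directly: given actual and predicted Y scores, compute the residuals, sum their squares, and take the standard error of the estimate. A minimal Python sketch; the small data set is hypothetical, and the n − 2 degrees of freedom is the usual choice for simple regression:

```python
# Hypothetical actual Y scores and the Y' scores predicted by a regression line
y_actual = [4.0, 6.0, 7.0, 9.0]
y_pred   = [4.5, 5.5, 7.5, 8.5]

# Residuals: actual score minus predicted score
residuals = [y - yp for y, yp in zip(y_actual, y_pred)]

# Least squares criterion: the best-fitting line minimizes this sum
sse = sum(e ** 2 for e in residuals)

# Standard error of the estimate (using n - 2 degrees of freedom)
n = len(y_actual)
see = (sse / (n - 2)) ** 0.5

print(residuals)       # [-0.5, 0.5, -0.5, 0.5]
print(sse)             # 1.0
print(round(see, 3))   # 0.707
```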
Standard Error of the Estimate
Errors of prediction are at their maximum when r = 0.
▪No correlation means no improvement in prediction.
With perfect correlation (r = ±1), we have perfect prediction.
▪No errors of prediction (i.e., s(Y−Y′) = 0).
Think of the standard error of the estimate as the average residual score: the average difference between the actual Y scores and the predicted Y scores.
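The perfect-correlation case can be verified directly: when every point lies exactly on a line, the predicted scores reproduce the actual scores and every residual is zero. A hypothetical sketch (the line Y = 3X + 1 is invented for illustration):

```python
# With perfect correlation, every Y lies exactly on the line Y = 3X + 1,
# so regression prediction has no error at all.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0 * xi + 1.0 for xi in x]        # perfectly linear data: r = +1

y_pred = [3.0 * xi + 1.0 for xi in x]   # the best-fitting line recovers Y exactly
residuals = [yi - yp for yi, yp in zip(y, y_pred)]

print(residuals)                         # [0.0, 0.0, 0.0, 0.0]
print(sum(e ** 2 for e in residuals))    # 0.0  -> s(Y-Y') = 0
```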
Simple Linear Regression
Goal of linear regression:
▪Obtain an equation for a line that best fits the data in our scatterplot.
▪This line can be used to predict scores on one variable based on scores from another variable.
Simple linear regression should be used only with a statistically significant Pearson r.
Errors in Regression
Standard Error of the Estimate
▪The standard deviation of the residual scores; a measure of error in regression.
Prediction Interval
▪The range around Y′ within which there is some certainty that a case's real value of Y falls.
▪Calculation of the interval is based on the estimated Y score and the standard error of the estimate.
▪The smaller the standard error of the estimate, the narrower the prediction interval, and therefore the better the prediction.
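As a rough illustration of the idea above, an approximate 95% prediction interval can be formed as Y′ ± 2 × SEE. This is a classroom simplification, not the exact formula (exact intervals use a t critical value and widen for X values far from the mean); the Y′ and SEE values below are hypothetical:

```python
# Approximate 95% prediction interval: Y' +/- 2 * standard error of the estimate.
# (A simplification; exact intervals use a t critical value and a leverage term.)
def prediction_interval(y_prime, see, multiplier=2.0):
    half_width = multiplier * see
    return (y_prime - half_width, y_prime + half_width)

# Hypothetical predicted score Y' = 7.5 with SEE = 0.707
low, high = prediction_interval(y_prime=7.5, see=0.707)
print(round(low, 3), round(high, 3))   # 6.086 8.914
```

Note how a smaller SEE shrinks `half_width` and therefore narrows the interval, matching the last bullet above.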
Multiple Regression
The difference between Scenario A and Scenario B is the difference between simple regression and multiple regression.
Simple Linear Regression
▪Prediction in which Y′ is predicted from a single independent variable.
Multiple Linear Regression
▪Prediction in which multiple independent variables are combined to predict a dependent variable.
▪Adds together the unique predictive power of each variable.
The Linear Regression Equation
Three factors need to be known to apply the regression line formula, Y′ = bX + a:
1) the X value for which one wants to predict a Y value
2) the slope, b
3) the Y-intercept, a
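The slope and Y-intercept can be computed from data with the standard least squares formulas (b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)², a = Ȳ − bX̄), after which any X value can be plugged in. A minimal sketch with hypothetical paired scores:

```python
# Hypothetical paired scores
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope: covariation of X and Y over variation in X
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
    / sum((xi - mean_x) ** 2 for xi in x)

# Y-intercept: the line passes through the point (mean_x, mean_y)
a = mean_y - b * mean_x

# Predict Y' for any X value
def predict(x_new):
    return b * x_new + a

print(round(b, 3), round(a, 3))   # 0.6 2.2
print(round(predict(6.0), 3))     # 5.8
```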
Slope
Understanding Slope
Tilt of the line; rise over run; how much change (up or down) in Y is predicted for each 1-unit change in X:
▪If the slope is positive, the line is moving up and to the right.
▪If the slope is negative, the line is moving down and to the right.
▪If the slope is zero, the line is horizontal.
Y intercept
Understanding the Y-Intercept
Indicates where the regression line would pass through the Y-axis:
▪If the Y-intercept is positive, the line passes through the Y-axis above zero.
▪If the Y-intercept is negative, the line passes through the Y-axis below zero.
▪The bigger the absolute value of the Y-intercept, the farther from zero the regression line passes through the Y-axis.
Linear Regression
When we have data for two variables, X and Y, we can graph the data in a scatterplot.
With regression, we obtain the equation for the best-fitting straight line for predicting Y from X.
The regression line is the line that minimizes the squared deviations of each data point from the line.
▪Also known as the "least squares" criterion.
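The least squares claim can be checked directly: the fitted line yields a smaller sum of squared deviations than any competing line drawn through the same points. A sketch with hypothetical data (the fitted values b = 0.6, a = 2.2 come from the standard least squares formulas for these points):

```python
# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def sse(slope, intercept):
    """Sum of squared deviations of each data point from the given line."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

# Least squares solution for these points: b = 0.6, a = 2.2
best = sse(0.6, 2.2)

# Any competing line has a larger sum of squared errors
competitors = [sse(1.0, 1.0), sse(0.5, 2.5), sse(0.0, 4.0)]

print(round(best, 3))                      # 2.4
print(competitors)                         # [4.0, 2.5, 6.0]
print(all(best < c for c in competitors))  # True
```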
Using a Regression line for Predictions
With our regression line, we can engage in prediction: not merely describing the data points we have, but estimating what other values could hypothetically be.
Y Prime
▪The value of Y predicted from X by a regression equation.
▪Abbreviated Y′.
R2 in Multiple Regression
r², the percentage of variability in the dependent variable that is accounted for by the independent variable(s), is called R² in multiple regression.
Better prediction means that a higher percentage of variability is accounted for with multiple regression than with simple regression.
▪Multiple regression is therefore a more powerful technique than simple regression.
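To illustrate this comparison, the sketch below fits a two-predictor regression by solving the two normal equations for the slopes, computes R² as 1 − SSE/SStotal, and compares it with the r² from using X1 alone. The data are hypothetical, and real analyses would use statistical software rather than hand-solved equations:

```python
# Hypothetical data: predict Y from two independent variables X1 and X2
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 3.0, 2.0, 4.0]
y  = [4.0, 4.0, 8.0, 7.0, 11.0]

n = len(y)
m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

# Centered sums of squares and cross-products
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((a - m2) ** 2 for a in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s1y = sum((a - m1) * (b - my) for a, b in zip(x1, y))
s2y = sum((a - m2) * (b - my) for a, b in zip(x2, y))

# Solve the two normal equations for the slopes b1 and b2
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
a = my - b1 * m1 - b2 * m2

# R^2: proportion of variability in Y accounted for by X1 and X2 together
ss_total = sum((yi - my) ** 2 for yi in y)
ss_error = sum((yi - (a + b1 * v1 + b2 * v2)) ** 2
               for yi, v1, v2 in zip(y, x1, x2))
r2_multiple = 1 - ss_error / ss_total

# Simple regression using X1 alone: r^2 = S1y^2 / (S11 * SStotal)
r2_simple = s1y ** 2 / (s11 * ss_total)

print(round(r2_multiple, 3))  # 0.992
print(round(r2_simple, 3))    # 0.83
```

Adding X2 raises the proportion of variability accounted for, which is exactly the sense in which multiple regression is the more powerful technique.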