Prediction
Cohen's conventions for R^2 for multiple regression
- .02 = small effect size
- .13 = medium effect size
- .26 = large effect size
Criterion Variable (usually Y)
- The variable that is predicted, in prediction
How can you find the correlation coefficient?
- By taking the square root of the PRE. But since a square root can be positive or negative, you have to look at the pattern of the numbers to determine whether the correlation is positive or negative.
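In symbols (a minimal statement; PRE and r^2 are defined later in this set):

```latex
r = \pm\sqrt{\text{PRE}} = \pm\sqrt{r^2}
```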
Error and Proportionate Reduction in Error
- Can estimate how accurate your prediction rule would have been if you had used it to make "predictions" for the scores you used to figure the linear prediction rule and correlation coefficient
What hypothesis tests do you have in multiple regression?
- Can test the significance of the multiple correlation (and the squared multiple correlation) using a procedure in which the null hypothesis is that in the population the multiple correlation is 0. This tests whether the predictor variables as a whole are associated with the criterion variable.
- Can test the significance of each individual predictor variable using a test like the one used with bivariate prediction. Each test shows whether the regression coefficient for that variable is significantly different from 0, focusing on whether this predictor variable adds to the prediction above what the other predictor variables already predict.
- Another hypothesis test checks whether the regression constant is significantly different from 0. Psychologists usually do not pay much attention to this test, because the actual value of the regression constant (and whether it is different from 0) is usually not of great importance in most areas of research.
Least squares criterion
- Finding the regression line that gives the lowest sum of squared errors between the actual scores on the criterion variable (Y) and the predicted scores on the criterion variable (Y hat).
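In symbols, the least squares criterion chooses a and b to minimize the sum of squared errors:

```latex
SS_{\text{Error}} = \sum \left( Y - \hat{Y} \right)^2, \qquad \hat{Y} = a + bX
```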
Linear prediction rule (or linear prediction model)
- Formula for predicting a person's score on a criterion variable based on the person's score on one or more predictor variables
Finding a and b for the least square linear prediction rule
- Formulas give the values of a and b for the linear prediction rule with a smaller sum of squared errors than any other possible linear prediction rule
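Written out (these match the step-by-step procedures later in this set; M_X and M_Y are the means of X and Y):

```latex
b = \frac{\sum (X - M_X)(Y - M_Y)}{\sum (X - M_X)^2}, \qquad a = M_Y - b\,M_X
```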
Proportionate Reduction in Error
- Helps you think about the accuracy of a prediction rule by comparing the amount of squared error using your prediction rule to the amount of squared error you would have without the prediction rule
1. Figure the amount of squared error using the prediction rule -> the sum of the squared errors (SSError): add up the squared errors of the individuals in the study
2. Figure the squared error you would make predicting without the prediction rule -> the most accurate prediction for anyone is the criterion variable's mean, so the amount of squared error when predicting without a rule is the amount of squared error when predicting each score to be the mean. When the predicted score is the mean, error is the actual score minus the mean (in general, error is the actual score on the criterion variable minus the predicted score). The sum of these squared errors is the total squared error when predicting from the mean (SSTotal, the same as SSY, the sum of squared deviations from the mean)
3. Compare the two amounts of squared error -> shows how much using the prediction rule reduces the squared error you would make using the mean
-> If SSError = SSTotal, the prediction rule has reduced the error by zero (a numerator of 0) and by 0% of the total error (0/SSTotal). SSError can't be worse than SSTotal.
-> If SSError = 0, the prediction rule has reduced the error by 100%
- Equal to the correlation coefficient squared -> r^2 is typically used as the symbol for the proportionate reduction in error with bivariate prediction (R^2 for multiple regression). In an actual research situation, you should use the simpler procedure of squaring the correlation coefficient to determine the accuracy of a prediction rule rather than figuring the PRE directly.
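As a formula:

```latex
\text{PRE} = \frac{SS_{\text{Total}} - SS_{\text{Error}}}{SS_{\text{Total}}} = r^2
```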
Linear prediction rule graphed
- Horizontal axis is for values of the predictor variable (X)
- Vertical axis is for predicted scores on the criterion variable (Y hat)
- Line is a regression line, which shows the relation between values of the predictor variable and the predicted values of the criterion variable
In multiple regression, why are the standardized regression coefficients for each predictor variable often smaller than the ordinary correlation coefficient of that predictor variable with the criterion variable?
- In multiple regression, a predictor variable's association with the criterion variable usually overlaps with the other predictor variables' association with the criterion variable. - Thus, the unique association of a predictor variable with the criterion variable (as shown by the standardized regression coefficient) is usually smaller than the ordinary correlation of the predictor variable with the criterion variable.
What is a particularly important difference between multiple regression and bivariate prediction?
- In ordinary bivariate prediction, the standardized regression coefficient (B) is the same as the correlation coefficient
-> In multiple regression, the standardized regression coefficient for each predictor variable is not the same as the ordinary correlation coefficient (r) of that predictor with the criterion variable. Usually, a B will be closer to 0 than r
-> Because in multiple regression, both the standardized and the regular regression coefficient are about the unique, distinctive contribution of the variable, excluding any overlap with other predictor variables
Ex: predicting mood using just the number of hours slept, B was the same as the correlation coefficient of .85. With multiple regression, the B turns out to be .74, because part of what makes number of hours slept predict mood overlaps with what makes sleeping well and number of dreams predict mood.
Slope of the Regression Line
- Is the exact value of b, the regression coefficient -> how much the line moves up (or down) on the criterion variable for every unit of the predictor variable
Regression line
- Line on a graph such as a scatter diagram showing the predicted value of the criterion variable (Y hat) for each value of the predictor variable (X)
- Visual display of the linear prediction rule
What is a major practical application of statistical methods?
- Making predictions. Ex: how much is a reading program likely to help a particular third grader?
- Understanding how various factors affect outcomes of interest. Ex: what factors in people who marry predict whether they will be happy and still together 10 years later?
Regression coefficient (b)
- Number (coefficient) multiplied by a person's score on a predictor variable as part of a linear prediction rule
Multiple regression linear prediction rule with three variables
- Only one regression constant and each predictor variable has its own regression coefficient
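Written out, following the bivariate rule's notation:

```latex
\hat{Y} = a + (b_1)(X_1) + (b_2)(X_2) + (b_3)(X_3)
```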
Bivariate prediction (or bivariate regression)
- Prediction of scores on one variable (criterion variable) based on scores of one other variable (predictor variable)
Multiple regression
- Procedure for predicting scores on a criterion variable from scores on two or more predictor variables
Prediction in research articles
- Rare for bivariate linear prediction rules to be reported in psychology research articles -> instead, simple correlations are reported. Sometimes there will be regression lines from bivariate predictions, usually when there is more than one group and the researchers want to illustrate the difference in the linear prediction rule between the groups. Ex: likeliness of studying abroad and disappointment if they were never able to study abroad, with three groups (positive fantasy group about studying abroad, contrast group (middle group), and negative fantasy group about studying abroad); the graph has a regression line for each of the three experimental groups.
- Multiple regression results are common in research articles and often reported in tables
Tables:
- Lay out the regression coefficient for each predictor variable -> whether regular unstandardized (b), standardized (B), or both, along with the overall R or R^2 in a note at the bottom of the table
- Also the correlation coefficient (r) for each predictor variable with the criterion variable -> lets you compare the unique association of each predictor with the criterion variable (the regression coefficient) to the overall association (the correlation coefficient)
- Usually give the statistical significance for the various statistics reported
Standardized regression coefficient
- Regression coefficient in standard deviation units
- Shows the predicted amount of change, in standard deviation units, of the criterion variable if the value of the predictor variable increases by one standard deviation. i.e. a standardized regression coefficient of .63 means that for every increase of 1 standard deviation on X, we predict an increase of .63 standard deviations on Y
- In studies in which scores on a criterion variable are predicted based on scores from one predictor variable, the standardized regression coefficient (B) has the same value as the correlation coefficient (r) between the two variables. If there is more than one predictor variable, the standardized regression coefficient is not the same as the correlation coefficient.
Why do we use the standardized regression coefficient ?
- Researchers in psychology often use slightly different measures, or the same measure with a different scale (i.e. 1 to 5 versus 10 to 100), which can make it hard to compare linear prediction rules for the same type of effect -> the scale used for the predictor and criterion variables will affect the value of b (the regression coefficient) in the linear prediction rule (a will also be affected, but researchers are more interested in the value of b)
- Can use a formula for changing a regression coefficient into a standardized regression coefficient (beta, the Greek letter b)
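One standard form of this conversion (equivalent versions are written with sums of squared deviations rather than standard deviations):

```latex
\beta = (b)\,\frac{SD_X}{SD_Y}
```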
SSTotal (total squared error when predicting from the mean)
- Sum, over all participants, of the squared differences between each person's score on the criterion variable and that person's predicted criterion score when predicting from the mean
SSError (sum of the squared errors)
- Sum, over all participants, of the squared differences between each person's score on the criterion variable and that person's predicted criterion variable score
In any given situation, how do we find the right linear prediction rule?
- The closer a regression line comes to matching the actual results (the dots on a scatter plot), the better the job it does as a prediction rule
- There is only one best linear prediction rule
Multiple correlation coefficient (R)
- The correlation between the criterion variable and all the predictor variables taken together, in multiple regression -> however, we usually use R^2, the squared multiple correlation, because the usual overlap among predictor variables makes the multiple correlation smaller than the sum of the individual r's of each predictor with the criterion variable. The squared multiple correlation (R^2) gives the proportionate reduction in error, or proportion of variance accounted for, in the criterion variable by all the predictor variables taken together.
- An R^2 of x means that the predictor variables together account for x% of the variation in the scores on the criterion variable
- R^2 is the measure of effect size for multiple regression
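A minimal sketch of figuring R^2 with two predictors in Python; scikit-learn and the data here are illustrative assumptions, not part of the original card:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: hours slept and number of dreams (predictors), mood (criterion)
X = np.array([[5, 1], [7, 3], [6, 2], [8, 4], [9, 3], [6, 1]])  # two predictor columns
y = np.array([4, 7, 6, 9, 8, 5])                                # criterion scores

model = LinearRegression().fit(X, y)

# R^2: proportion of variance in the criterion accounted for by all predictors together
r_squared = model.score(X, y)
print(f"R^2 = {r_squared:.2f}")
print("regression constant a:", model.intercept_)
print("regression coefficients b:", model.coef_)
```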
Error
- The difference between a person's predicted score on the criterion variable and the person's actual score on the criterion variable in prediction
Regression constant (a) or just constant
- The particular fixed number added into the prediction, in a linear prediction rule - The prediction constant -> a fixed value that you always use when making a prediction
Intercept (or Y intercept)
- The point where the regression line crosses the vertical axis
- The regression constant (a) -> the baseline number you always add in
- The predicted score on the criterion variable (Y) when the score on the predictor variable (X) is 0
The Least Squared Error Principle
- The principle that the best prediction rule is the rule that gives the lowest sum of squared errors between the actual scores on the criterion variable and the predicted scores on the criterion variable
- A way to come up with the one best prediction rule: a regression line that comes closest to the true scores on the criterion variable, making predictions that are as little off from the true scores as possible
- Want as little error as possible -> the smallest sum of errors, or really the smallest sum of squared errors, since errors can come out negative (the rule predicts too high) or positive (the rule predicts too low) and would otherwise cancel out
What are the two issues in prediction?
- The standardized regression coefficient - Hypothesis testing and prediction
What is not a good idea to do in actual prediction situations?
- Predicting from scores on the predictor variable that are much higher or lower than those in the study you used to figure the original correlation -> this can lead you to predict scores beyond the scale of measurement. When using a prediction rule, you should make predictions within the same range of scores for the predictor variable that was used to come up with the original correlation on which the prediction rule is based
Predictor variable (usually X)
- Variable that is used to predict scores of individuals on another variable, in prediction
How do we evaluate how good a prediction rule is?
- We figure the sum of the squared errors
1. Take each amount of error and square it
2. Add up the squared errors
- Squaring penalizes large errors more than small errors and makes it easier to do more advanced computations
Hypothesis Testing and Prediction
- When predicting a criterion variable from one predictor variable the standardized regression coefficient is the same as the correlation coefficient. Thus, you can use a t test (& t statistics) for the correlation between the two variables -> t test for the prediction of the criterion variable from predictor variable - The hypothesis test for a regression coefficient tests whether the regression coefficient is significantly different from 0 -> a regression coefficient of 0 means that knowing a person's score on the predictor variable does not give you any useful information for predicting that person's score on the criterion variable
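A minimal sketch of this t test in Python using scipy's linregress; the data are hypothetical, and the p-value linregress reports is for the null hypothesis that the slope (and hence the correlation) is 0:

```python
import numpy as np
from scipy import stats

# Hypothetical data: hours slept (predictor) and mood (criterion)
hours_slept = np.array([5, 7, 6, 8, 9, 6, 7, 4])
mood = np.array([4, 7, 6, 9, 8, 5, 7, 3])

result = stats.linregress(hours_slept, mood)

# result.slope is b; result.pvalue tests H0: regression coefficient = 0
print(f"b = {result.slope:.2f}, a = {result.intercept:.2f}")
print(f"p = {result.pvalue:.4f}")  # if p < .05, b is significantly different from 0
```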
What is the value of a prediction rule?
- How much less error you make using the prediction rule (SSError) compared to using the mean (SSTotal) -> with a good prediction rule, SSError should be much smaller than SSTotal, because using the mean to predict is not a very precise method (it leaves a lot of error).
Why does the procedure for figuring the proportionate reduction in error tell you about the accuracy of the prediction rule?
- It tells you the proportion of the total error (the error you would make if just predicting from the mean) that you reduce by using the prediction rule (where your error is based on predicting from the prediction rule).
- The larger the proportion of total error you reduce, the more accurate your prediction rule. Perfect prediction would be a 100% reduction.
Steps for figuring the regression coefficient (b)
1. Change the scores for each variable to deviation scores
- Figure the mean of each variable
- Subtract each variable's mean from each of its scores
2. Figure the product of the deviation scores for each pair of scores
- For each pair of scores, multiply the deviation score on one variable by the deviation score on the other variable
3. Add up all the products of the deviation scores
4. Square each deviation score for the predictor variable (X)
5. Add up the squared deviation scores for the predictor variable (X)
6. Divide the sum of the products of deviation scores from Step 3 by the sum of squared deviations for the predictor variable (X) from Step 5
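A minimal sketch of these six steps in Python with hypothetical scores:

```python
import numpy as np

# Hypothetical scores on the predictor (X) and criterion (Y)
X = np.array([5.0, 7.0, 6.0, 8.0, 9.0, 6.0])
Y = np.array([4.0, 7.0, 6.0, 9.0, 8.0, 5.0])

# Step 1: change the scores for each variable to deviation scores
dev_x = X - X.mean()
dev_y = Y - Y.mean()

# Steps 2-3: multiply each pair of deviation scores, then add up the products
sum_of_products = (dev_x * dev_y).sum()

# Steps 4-5: square each deviation score for X and add them up
sum_of_squares_x = (dev_x ** 2).sum()

# Step 6: divide the sum of products by the sum of squared deviations for X
b = sum_of_products / sum_of_squares_x
print(f"b = {b:.2f}")
```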
Steps for drawing the Regression Line
1. Draw and label the axes for a scatter diagram -> predictor variable goes on the horizontal axis (X); criterion variable goes on the vertical axis (Y)
2. Figure the predicted value on the criterion variable for a low value of the predictor variable and mark the point on the graph -> using the linear prediction rule Y hat = a + (b)(X)
3. Figure the predicted value on the criterion variable for a high value of the predictor variable and mark the point -> best to pick a value of X much higher than in Step 2, because the dots are farther apart and the drawing is thus more accurate
4. Draw a line that passes through the two marks -> the regression line
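A minimal sketch of these steps with matplotlib; the values of a and b are hypothetical:

```python
import matplotlib.pyplot as plt

a, b = 1.0, 0.8  # hypothetical regression constant and coefficient

# Step 1: draw and label the axes (predictor on X, criterion on Y)
plt.xlabel("Predictor variable (X)")
plt.ylabel("Predicted criterion score (Y hat)")

# Steps 2-3: figure Y hat = a + (b)(X) for one low and one high value of X
x_low, x_high = 2, 10
y_low, y_high = a + b * x_low, a + b * x_high

# Step 4: draw the line through the two marks -> the regression line
plt.plot([x_low, x_high], [y_low, y_high], marker="o")
plt.title("Regression line")
plt.show()
```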
Steps for figuring the proportionate reduction in error
1. Figure the sum of squared errors using the mean to predict -> take each score minus the mean, square it, and add these up -> SSTotal
2. Figure the sum of squared errors using the prediction rule -> take each score minus the predicted score for that person, square it, and add these up -> SSError
3. Figure the reduction in squared error -> Step 1 (SSTotal) - Step 2 (SSError)
4. Figure the proportionate reduction in squared error -> Step 3 (SSTotal - SSError) / Step 1 (SSTotal)
When predicting Y from X, the prediction rule, based on the scores from our group of N people, provides an x% reduction in error over using the mean to predict.
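A minimal sketch of these four steps in Python, reusing the hypothetical scores from the regression-coefficient sketch:

```python
import numpy as np

# Hypothetical scores; a and b figured with the least squares formulas
X = np.array([5.0, 7.0, 6.0, 8.0, 9.0, 6.0])
Y = np.array([4.0, 7.0, 6.0, 9.0, 8.0, 5.0])
b = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
a = Y.mean() - b * X.mean()

# Step 1: sum of squared errors predicting from the mean -> SSTotal
ss_total = ((Y - Y.mean()) ** 2).sum()

# Step 2: sum of squared errors using the prediction rule -> SSError
y_hat = a + b * X
ss_error = ((Y - y_hat) ** 2).sum()

# Step 3: reduction in squared error
reduction = ss_total - ss_error

# Step 4: proportionate reduction in error (equals r^2)
pre = reduction / ss_total
print(f"PRE = {pre:.2f}")
```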
Steps for figuring the regression constant (a)
1. Multiply the regression coefficient, b, by the mean of the X variable 2. Subtract the result of Step 1 from the mean of the Y variable
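In Python, continuing the hypothetical numbers from the earlier sketches:

```python
import numpy as np

X = np.array([5.0, 7.0, 6.0, 8.0, 9.0, 6.0])  # hypothetical predictor scores
Y = np.array([4.0, 7.0, 6.0, 9.0, 8.0, 5.0])  # hypothetical criterion scores
b = 0.8  # hypothetical regression coefficient figured earlier

# Step 1: multiply b by the mean of X; Step 2: subtract the result from the mean of Y
a = Y.mean() - b * X.mean()
print(f"a = {a:.2f}")
```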
What are three advantages of using squared errors, as opposed to unsquared errors, when figuring the best linear prediction rule?
1. Unlike unsquared errors, squared errors do not cancel each other out. 2. Using squared errors penalizes large errors more than small errors; 3. Squared errors make it easier to do more advanced computations.
What is there some debate about in psychological research literature?
1. Whether research studies should present their results in terms of unstandardized regression coefficients (bs), standardized regression coefficients (Bs, the Greek letter beta), or both types of coefficients
-> Some researchers: list only regular unstandardized regression coefficients when the study is purely applied, only standardized regression coefficients when the study is purely theoretical, and both coefficients in all other cases.
- But when is a study purely applied or purely theoretical? Also, standardized regression coefficients are sometimes inconsistent across different samples because they are influenced by the range (and variance) of the scores in the sample, while unstandardized coefficients are not = general recommendation is to present the results of your own research studies in terms of both unstandardized and standardized regression coefficients.
- When comparing the size of the regression coefficients for each of several predictor variables in a multiple regression, you should compare the standardized regression coefficients (the Bs), not the unstandardized ones, because a large value of b for one predictor variable compared to another may simply reflect the different scales of the variables.
2. How to judge the relative importance of each predictor variable in predicting the criterion variable
- Should we use the standardized regression coefficients (the Bs) of the predictor variables from the overall regression equation? Or the bivariate correlation coefficients (the rs) of each predictor variable with the criterion variable? Both are standardized -> adjusted for the variation in the variables used. But although the two are the same in bivariate regression, they are not the same thing in multiple regression. In multiple regression, a regression coefficient tells you the unique contribution of the predictor variable to the prediction, over and above all other predictors. When using the ordinary correlation r, a predictor variable may seem to have a quite different importance relative to the other variables. Ex: the Bs for three predictors could be .2, .3, and .4, but the rs for these three predictors could be .6, .4, and .5. Different coefficients change the perceived importance of the predictors.
-> Many approaches to this problem, but all are controversial. Most experts recommend considering both the rs and the Bs, keeping in mind the difference in what they tell you.
r: tells you the overall association of the predictor variable with the criterion variable
B: tells you the unique association of this predictor variable, over and above the other predictor variables, with the criterion variable
What are two differences between correlation and prediction?
1. With prediction we have to decide which variable is being predicted from and which variable is being predicted -> variable being predicted from: predictor variable (X); variable being predicted: criterion variable (Y). X predicts Y.
2. With prediction we can predict based on raw scores or Z scores (the correlation coefficient requires Z scores). In real research situations, it is more common to conduct statistical analyses for prediction using raw scores.
Limitations of prediction
All of the limitations for correlation apply to prediction
- Procedures are inaccurate if the correlation is curvilinear, the group studied is restricted in range, the measures are unreliable, or there are outliers -> leads to the regression coefficients being smaller than they should be to reflect the true association of the predictor variables with the criterion variable
- Prediction procedures by themselves do not tell you anything about the direction of causality
-> Prior to doing prediction and correlation analyses, researchers often check to see if any of the preceding limitations are relevant in their research study. Ex: looking at a scatter diagram to identify outliers, then conducting one analysis including the outliers and another excluding them from the figuring, which shows the researcher how much the outliers affect the results.
Multiple correlation
Correlation of a criterion variable with two or more predictor variables
Assumptions of Prediction
Similar to the assumptions for the significance test of a correlation coefficient - there is an equal distribution of each variable at each point of the other variable - the relationship between the variables is linear - the people (or cases) are independent. - the error scores (the difference between the prediction rule's predicted scores on the criterion variable and people's actual scores on the criterion variable) are normally distributed. - also we assume that both the predictor and the criterion variable are equal-interval numeric variables (although it is possible to use other types of predictor and criterion variables in prediction)
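One of these assumptions — normally distributed error scores — can be spot-checked with a sketch like the following; the Shapiro-Wilk test is one common choice, and the data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical predictor and criterion scores
X = np.array([5.0, 7.0, 6.0, 8.0, 9.0, 6.0, 7.0, 4.0])
Y = np.array([4.0, 7.0, 6.0, 9.0, 8.0, 5.0, 7.0, 3.0])

# Fit the linear prediction rule, then figure the error scores (actual minus predicted)
result = stats.linregress(X, Y)
errors = Y - (result.intercept + result.slope * X)

# Shapiro-Wilk test: a small p-value suggests the errors are not normally distributed
stat, p = stats.shapiro(errors)
print(f"Shapiro-Wilk W = {stat:.2f}, p = {p:.3f}")
```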
Slope
Steepness of the angle of a regression line in a graph of the relation of scores on a predictor variable and predicted scores on a criterion variable
- Number of units the line goes up for every unit it goes across
Sum of the squared errors
Sum of the squared differences between each predicted score and actual score on the criterion variable
What is the relationship between the regression line and the linear prediction rule?
The regression line is a visual display of the linear prediction rule.
When predicting scores on a criterion variable from scores on one predictor variable, the standardized regression coefficient has the same value as what other statistic?
The standardized regression coefficient has the same value as r, the correlation coefficient between the two variables.
If two things are not correlated
Then knowing about one does not help you predict the other
Why should the linear prediction rule be used for making predictions only within the same range of scores in the group of people studied that was the basis for forming the particular prediction rule?
Using scores outside the range for the predictor variable may give unrealistic (or even impossible) predicted scores for the criterion variable.
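A minimal illustration of enforcing this rule in code; predict_within_range is a hypothetical helper, not a standard function:

```python
def predict_within_range(x, a, b, x_min, x_max):
    """Apply the linear prediction rule Y hat = a + b*x, but only inside
    the range of predictor scores used to figure the rule."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} is outside the range [{x_min}, {x_max}] the rule was based on"
        )
    return a + b * x

# Hypothetical rule figured from predictor scores ranging from 4 to 9
print(predict_within_range(6, a=1.0, b=0.8, x_min=4, x_max=9))  # fine: 5.8
# predict_within_range(15, a=1.0, b=0.8, x_min=4, x_max=9)  # would raise ValueError
```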