PSYC 2005 chapter 12 ACA
regression coefficient
- The number, b, that you multiply by the person's score on the predictor variable; it is called a coefficient because a coefficient is a number you multiply by something.
Hypothesis Testing and Prediction
- In Chapter 11, you learned that hypothesis testing with a correlation coefficient meant examining whether the coefficient was significantly different from 0 (no correlation). - You learned how to use a t test to determine whether a correlation coefficient was statistically significant. Since the standardized regression coefficient is the same as the correlation coefficient (when predicting a criterion variable from one predictor variable), the t test for the correlation between the two variables also acts as the t test for the prediction of the criterion variable from the predictor variable. The standardized regression coefficient is just another way of presenting the regular regression coefficient, so the t test for the correlation applies to both types of regression coefficients. In terms of prediction, the hypothesis test for a regression coefficient (for both b and β) tests whether the regression coefficient is significantly different from 0. A regression coefficient of 0 means that knowing a person's score on the predictor variable does not give you any useful information for predicting that person's score on the criterion variable. However, if the hypothesis testing result shows that the regression coefficient is significantly different from 0, then knowing a person's score on the predictor variable gives you useful information for predicting that person's score on the criterion variable.
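Below is a minimal sketch of this test in Python using SciPy's linregress, which reports the two-sided p value for the t test of whether the slope differs from 0. The six students' scores are assumed here for illustration; they reproduce the b = 1, a = -3, and r = .85 values cited in these notes.

```python
# Sketch: t test for the regression coefficient with one predictor.
# scipy.stats.linregress tests H0: slope (b) = 0, which with one predictor
# is the same test as the t test for the correlation between X and Y.
from scipy import stats

hours_slept = [5, 7, 8, 6, 6, 10]  # predictor variable (X), assumed scores
happy_mood = [2, 4, 7, 2, 3, 6]    # criterion variable (Y), assumed scores

result = stats.linregress(hours_slept, happy_mood)
print(f"b = {result.slope:.2f}, a = {result.intercept:.2f}, r = {result.rvalue:.2f}")
print(f"two-sided p value for the test of b = 0: {result.pvalue:.3f}")
```

For these assumed scores the p value comes out to about .03, significant at the .05 level.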
finding the best linear prediction rule
- In any given situation, how do we find the right linear prediction rule, the correct numbers for the regression constant, a, and the regression coefficient, b? - Whether we think of it as a rule in words, a formula, or a regression line on a graph, we still have to know these two numbers. Figure 12-6 shows the scatter diagram for the six students in the hours of sleep and happy mood example. - Four different regression lines (each showing a different linear prediction rule) are shown in the figure. In the figure, the dots show the actual scores of the six students, and the lines show different prediction rules for predicting happy mood from hours of sleep. So the closer a line comes to matching the actual results (the dots), the better the job it does as a prediction rule. In reality, there is only one best linear prediction rule, but for learning purposes, imagine you have four rules to pick from. Which rule in Figure 12-6 will do the best job of predicting happy mood scores from the number of hours slept? - Table 12-3 summarizes how well each of the four rules predicts the actual happy mood scores of the six students in this example. The table shows the hours slept (X), the actual happy mood score (Y), and the predicted happy mood score (Ŷ) for each student using each of the four prediction rules. The table shows that Rules 3 and 4 have predicted values of the happy mood criterion variable (Ŷ) that are much closer to the actual values of the criterion variable (Y) than is the case for Rules 1 and 2.
The Standardized Regression Coefficient (β) and the Correlation Coefficient (r)
- In the examples we have used so far in this chapter, in which scores on a criterion variable are predicted based on scores from one predictor variable, the standardized regression coefficient (β) has the same value as the correlation coefficient (r) between the two variables. - So, the β of .85 for the sleep and mood example is the same as the r of .85 between sleep and mood we figured in Chapter 11 (see Table 11-3). However, in formula (12-4), we gave a more general method for figuring a standardized regression coefficient, because the standardized regression coefficient is not the same as the correlation coefficient when scores on a criterion variable are predicted based on scores from more than one predictor variable (a situation we examine in the next main section).
the least squared error principle
- One way to come up with a prediction rule is by eyeballing and trial and error. Using such a method, we might come up with Rule 3 or 4 in Figure 12-6 or a similar rule (each of which would be a lot better than Rule 1 or Rule 2). However, what we really need is a method of coming up with the precise, very best linear prediction rule (that is, the best regression line); and this method should not be subjective or based on eyeballing (where different researchers may decide on different lines). Remember, we are trying to find the one best rule; we considered four rules for the mood and sleep example in Figure 12-6 only to show the logic of linear prediction rules. - Coming up with the best prediction rule means we first have to decide on what we mean by "best." In terms of a regression line, we basically mean the line that comes closest to the true scores on the criterion variable, the line that makes predictions that are as little off from the true scores as possible. The difference between a prediction rule's predicted score on the criterion variable and a person's actual score on the criterion variable is called error. - We, of course, want as little error as possible over the whole range of scores we would predict, so what we want is the smallest sum of errors. But there is one more problem here. Sometimes the errors will be positive (the rule will predict too low), and sometimes they will be negative (the rule will predict too high). The positive and negative errors will cancel each other out. So, to avoid this problem, we use squared errors. That is, we take each amount of error and square it (multiply it by itself); then we add up the squared errors. (This is the same solution we used in a similar situation in Chapter 2 when figuring the variance and standard deviation.) Using squared errors also has various other statistical advantages over alternative approaches. For example, it penalizes large errors more than small errors, and it makes it easier to do more advanced computations.
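The cancellation problem is easy to see with a few hypothetical error scores; a quick sketch in Python:

```python
# Hypothetical errors: two predictions too low (+), two too high (-).
errors = [2.0, -2.0, 1.0, -1.0]

print(sum(errors))                  # 0.0 -- raw errors cancel, rule looks perfect
print(sum(e ** 2 for e in errors))  # 10.0 -- squared errors reveal the misses
```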
error
- The difference between a prediction rule's predicted score on the criterion variable and a person's actual score on the criterion variable - in prediction, the difference between a person's predicted score on the criterion variable and the person's actual score on the criterion variable
how to draw the regression line
- The first steps are setting up the axes and labels of your graph—the same steps that you learned in Chapter 11 for setting up a scatter diagram. The regression line is a straight line and thus shows a linear prediction. Thus, to draw the regression line, you only have to figure the location of any two points and draw the line that passes through them
the standardized regression coefficient explanation/extension
- The linear prediction rule you have learned provides a useful and practical way of making predictions on a criterion variable from scores on a predictor variable. Recall from the hours of sleep and happy mood example that the prediction rule for predicting mood from sleep was: Ŷ = -3 + (1)(X). Now, imagine that another group of researchers carries out the same study and finds the prediction rule to be Ŷ = -3 + (2)(X). In this new study, the regression coefficient (of 2) is much larger than the regression coefficient (of 1) in the original study. Recall that the regression coefficient tells you the slope of the regression line. That is, the regression coefficient is the predicted amount of increase in units for the criterion variable when the predictor variable increases by one unit. So the original study shows a 1-unit increase in sleep is associated with a 1-unit increase in mood. The new study, however, shows a 1-unit increase in sleep is associated with a 2-unit increase in mood. With this in mind, you might conclude that the larger regression coefficient in the new study shows that sleep has a much stronger effect on mood. - At this point, however, we should note one important difference between the two studies of sleep and happy mood. In the original study, happy mood was measured on a scale going from 0 to 8. However, the researchers in the new study used a scale of 0 to 20. Thus, in the first study, a 1-hour increase in sleep predicts a 1-point increase in mood on a scale from 0 to 8, but in the second study a 1-hour increase in sleep predicts a 2-point increase in mood on a scale of 0 to 20. Thus, the second study might actually have found a slightly smaller effect of sleep! - Researchers in psychology quite often use slightly different measures or the same measure with a different scale (1 to 5 versus 10 to 100, for example) for the same variable. This can make it hard to compare the linear prediction rule for the same type of effect (such as the effect of sleep on happy mood) across studies. This is because the scale used for the predictor and criterion variables will affect the value of b (the regression coefficient) in the linear prediction rule. (The value of a, the regression constant, will also be affected by the scales used. But researchers in psychology are usually more interested in the actual value of b than of a.) - Does this type of problem seem familiar to you? Recall from Chapter 3 that you learned how to change a raw score into a Z score, which is a score on a scale of standard deviation units. For example, a Z score of 1.23 is a score that is 1.23 standard deviation units above the mean of the scores, and a Z score of -.62 is .62 standard deviation units below the mean. This allowed you to compare the score on one variable relative to the score on another variable, even when the two variables used different scales. In the case of the regression coefficient in this chapter, we need a type of regression coefficient that can be compared across studies (that may have used different scales for the same variable). - It turns out that, just as there was a formula for changing a raw score to a Z score, there is a formula for changing a regression coefficient into what is known as a standardized regression coefficient.
predictor variable
- The variable being predicted from - in prediction, the variable (usually X) that is used to predict scores of individuals on another variable.
Suppose we want to predict students' GPA at a particular college from their SAT scores.
- We could go about predicting college GPA by ignoring the SAT scores and just predicting that everyone will have an average level of college GPA. - But we would not be taking advantage of knowing the SAT scores. Another possibility would be to use the information we have about SAT and GPA from recent students at this college to set up a complicated set of rules about what GPA we predict for each possible SAT score. For example, suppose in the past few years that students who came to this college with an SAT of 580 had an average GPA at graduation of 2.62; for students with an SAT of 590, the average GPA was 2.66; for students with an SAT of 600, the average GPA was 2.70; and so forth. We could then use these numbers to set up a rule that, for future students who come in with an SAT of 580, we would predict they would graduate with a GPA of 2.62; for those who come in with an SAT of 590, we would predict they would graduate with a GPA of 2.66; and so forth. This would be a pretty good rule and probably would make reasonably accurate predictions. - However, the problem with this kind of rule is that it is quite complicated. Also, because some SAT scores may have only a few students, it might not be very accurate for students with those SAT scores.
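As a sketch, this complicated rule amounts to a lookup table; the SAT-to-GPA averages below are the ones quoted in the example:

```python
# The lookup-table style of rule described above. Written out in full it
# needs one entry per possible SAT score, and it fails for unseen scores.
past_averages = {580: 2.62, 590: 2.66, 600: 2.70}  # SAT -> mean graduating GPA

def predict_gpa_lookup(sat_score):
    return past_averages[sat_score]  # KeyError for any SAT score not on file

print(predict_gpa_lookup(590))  # 2.66
```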
criterion variable (usually Y)
- in prediction, a variable that is predicted.
regression line
- line that shows the relation between values of the predictor variable and the predicted values of the criterion variable. - line on a graph such as a scatter diagram showing the predicted value of the criterion variable for each value of the predictor variable; visual display of the linear prediction rule.
regression constant (just constant)
- the formal name for the baseline number, a - in a linear prediction rule, the particular fixed number added into the prediction. - It has the name "constant" because it is a fixed value that you always use when making a prediction. - Regression is another name statisticians use for prediction; that is why it is called the regression constant.
intercept
- the point where the regression line crosses the vertical axis; the regression constant (a). - The point at which the regression line crosses (or "intercepts") the vertical axis is called the intercept (or sometimes the Y intercept). (This assumes you have drawn the vertical axis so it is at the 0 point on the horizontal axis.) - The intercept is the predicted score on the criterion variable (Ŷ) when the score on the predictor variable (X) is 0. It turns out that the intercept is the same as the regression constant. - This works because the regression constant is the number you always add in, a kind of baseline number, the number you start with. And it is reasonable that the best baseline number would be the number you predict from a score of 0 on the predictor variable. - In Figure 12-1, the line crosses the vertical axis at .3. That is, when a person has an SAT score of 0, they are predicted to have a college GPA of .3. In fact, the intercept of the line is exactly a, the regression constant. Another way of thinking of this is in terms of the linear prediction rule formula, Ŷ = a + (b)(X). If X is 0, then whatever the value of b, when you multiply it by X you get 0. Thus, if b multiplied by X comes out to 0, all that is left of the prediction formula is Ŷ = a + 0. That is, if X is 0, then Ŷ = a. - For the hours of sleep and happy mood example, the regression constant, a, was -3. If you were to extend the regression line in Figure 12-2, it would cross the vertical axis at -3.
What does a standardized regression coefficient show?
A standardized regression coefficient shows how much of a standard deviation the predicted value of the criterion variable changes when the predictor variable changes by one standard deviation.
Figure 12-1 shows the regression line for the SAT scores (predictor variable) and college GPA (criterion variable) example.
By following the regression line, you can find the GPA score that is predicted from any particular SAT score. The dotted lines show the prediction for having an SAT score of 700. From the figure, the predicted GPA score for an SAT score of 700 is a little more than 3.0, which is consistent with the precise predicted value of 3.1 we found earlier when using the linear prediction rule formula. So, as you can see, the regression line acts as a visual display of the linear prediction rule formula.
example of linear prediction rules
In the example of using SAT scores to predict college GPA, the regression constant (a) was .3 and the regression coefficient (b) was .004. So, to predict a person's GPA, you start with a score of .3 and add .004 multiplied by the SAT score. In terms of the linear prediction rule formula - Ŷ = a + (b)(X) = .3 + (.004)(X) - Predicted GPA = .3 + (.004)(SAT score). - Applying this formula to predicting GPA from an SAT score of 700: Predicted GPA = .3 + (.004)(700) = .3 + 2.80 = 3.10 - You may be interested to know that research studies have consistently shown that SAT scores (as well as high school GPA) predict students' GPA in college (e.g., Schmitt et al., 2009). As you can imagine (and may know from personal experience), SAT scores are not the only factor that predicts students' college GPA.
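A minimal sketch of this rule in Python (the function name predict is an illustration, not anything from the chapter):

```python
# Linear prediction rule: Y-hat = a + (b)(X).
def predict(a, b, x):
    """Predicted criterion score (Y-hat) for a predictor score x."""
    return a + b * x

print(predict(0.3, 0.004, 700))  # 3.1, matching the worked example above
```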
another example of linear prediction rules
Recall the example from Chapter 11 in which we considered the relationship between the number of hours of sleep and happy mood the next day for six students. In this example, hours of sleep is the predictor variable (X) and happy mood is the criterion variable (Y). The regression constant (a) in this example is -3 and the regression coefficient (b) is 1. (You will learn in a later section how to figure these values of a and b.) So, to predict a person's mood level, you start with a score of -3 and add 1 multiplied by the number of hours of sleep. In terms of the linear prediction rule formula: Ŷ = -3 + (1)(X). In terms of the variable names: predicted mood = -3 + (1)(hours of sleep). Applying this formula to predicting mood after having 9 hours of sleep: predicted mood = -3 + (1)(9) = -3 + 9 = 6. - You can see in this example that when a person sleeps a very small number of hours, you predict negative scores on the mood scale, which is impossible (the scale goes from 0 to 8); and when the person sleeps a great many hours, you predict scores on the mood scale that are higher than the limits of the scale. (In fact, it does not even make sense; sleeping 16 hours would probably not make you incredibly happy.) This is why a prediction rule should be used only for making predictions within the same range of scores for the predictor variable that were used to come up with the original correlation on which the prediction rule is based.
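The same sketch applied to the sleep and mood rule shows why predicting outside the studied range breaks down (mood scores below 0 or above 8 are impossible on this scale):

```python
# Sleep/mood rule: Y-hat = -3 + (1)(X).
def predict(a, b, x):
    return a + b * x

print(predict(-3, 1, 9))   # 6: a sensible prediction
print(predict(-3, 1, 1))   # -2: below the 0-to-8 mood scale
print(predict(-3, 1, 16))  # 13: above the 0-to-8 mood scale
```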
The regression line for the hours slept last night (predictor variable) and happy mood (criterion variable) example is shown in Figure 12-2.
The dotted lines show that having 9 hours of sleep gives a predicted happy mood score of 6, which is the same value we found when using the linear prediction rule formula.
b (the regression coefficient)
The regression coefficient is the sum, over all the people in the study, of the product of each person's two deviation scores, divided by the sum of everyone's squared deviation scores on the predictor variable: b = ∑[(X - MX)(Y - MY)] / SSX. - b is the regression coefficient - X - MX is the deviation score for each person on the X (predictor) variable - Y - MY is the deviation score for each person on the Y (criterion) variable - (X - MX)(Y - MY) is the product of deviation scores for each person - ∑[(X - MX)(Y - MY)] is the sum of the products of deviation scores over all the people in the study - SSX is the sum of squared deviations for the X variable.
a (the regression constant)
The regression constant is the mean of the criterion variable minus the result of multiplying the regression coefficient by the mean of the predictor variable: a = MY - (b)(MX). - Notice that you need to know the value of b in order to figure the value of a. So you first use formula (12-2) to find the value of b and then use formula (12-3) to figure the value of a.
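A worked sketch of formulas (12-2) and (12-3) in Python, using scores assumed for the six-student sleep and mood example (they reproduce the b = 1 and a = -3 values these notes cite):

```python
# Formula (12-2): b = sum[(X - MX)(Y - MY)] / SSX
# Formula (12-3): a = MY - (b)(MX)   (b must be figured first)
X = [5, 7, 8, 6, 6, 10]  # hours slept (predictor), assumed scores
Y = [2, 4, 7, 2, 3, 6]   # happy mood (criterion), assumed scores

mx = sum(X) / len(X)  # mean of X
my = sum(Y) / len(Y)  # mean of Y

sum_of_products = sum((x - mx) * (y - my) for x, y in zip(X, Y))
ss_x = sum((x - mx) ** 2 for x in X)  # sum of squared deviations for X

b = sum_of_products / ss_x
a = my - b * mx

print(b, a)  # 1.0 -3.0
```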
When predicting scores on a criterion variable from scores on one predictor variable, the standardized regression coefficient has the same value as what other statistic?
The standardized regression coefficient has the same value as r, the correlation coefficient between the two variables.
How is hypothesis testing carried out with a regression coefficient?
When predicting a criterion variable based on scores on one predictor variable, the hypothesis test for a regression coefficient is the same as the hypothesis test for the correlation between the two variables. The test uses a t statistic and tests whether the regression coefficient is significantly different from 0. A statistically significant regression coefficient means that knowing a person's score on the predictor variable provides useful information for predicting a person's score on the criterion variable.
It turns out that you can do prediction based on
either Z scores or raw scores. - In other words, you can use Z scores for a predictor variable to predict Z scores for a criterion variable, or you can use raw scores for a predictor variable to predict raw scores for a criterion variable. In real research situations, it is more common to conduct the statistical analyses for prediction using raw scores, so that is what we focus on in this chapter.
We then briefly introduce procedures
for situations in which predictions about one variable, such as college GPA, are made based on information about two or more other variables, such as using both SAT scores and high school GPA. Finally, as an Advanced Topic, we discuss how to estimate the expected accuracy of the predictions we make using these procedures
linear prediction rule (or linear prediction model)
formula for making predictions; that is, formula for predicting a person's score on a criterion variable based on the person's score on one or more predictor variables.
You can visualize a linear prediction rule as a line on a
graph in which the horizontal axis is for values of the predictor variable (X) and the vertical axis is for predicted scores for the criterion variable (Ŷ). (The graph is set up like the scatter diagrams you learned to make in Chapter 11.)
Statistical prediction also plays a major part in
helping research psychologists understand how various factors affect outcomes of interest - For example, what factors in people who marry predict whether they will be happy and together 10 years later; what are the factors in childhood that predict depression and anxiety in adulthood; what are the circumstances of learning something that predict good or poor memory for it years later; or what are the various kinds of support from friends and family that predict how quickly or poorly someone recovers from the death of a loved one?
Psychologists of various kinds are called on to make informed (and precise) guesses about such things as
how well a particular job applicant is likely to perform if hired, how much a reading program is likely to help a particular third grader, how likely a particular patient is to commit suicide, or how likely a potential parolee is to commit a violent crime if released
learning the intricacies of statistical prediction deepens your insight into
other statistical topics and prepares you for central themes in more advanced statistics courses
An important thing to notice from Table 12-2 is that in actual prediction situations it is not a good idea to
predict from scores on the predictor variable that are much higher or lower than those in the study you used to figure the original correlation.
In formulas the
predictor variable is usually labeled X, and the criterion variable is usually labeled Y. - That is, X predicts Y. In the example we just considered, SAT scores would be the predictor variable or X, and college grades would be the criterion variable or Y
standardized regression coefficient (beta)
regression coefficient in standard deviation units. It shows the predicted amount of change, in standard deviation units, of the criterion variable if the value of the predictor variable increases by one standard deviation. - The standardized regression coefficient is equal to the regular, unstandardized regression coefficient multiplied by the result of dividing the square root of the sum of squared deviations for the predictor variable by the square root of the sum of squared deviations for the criterion variable: β = (b)(√SSX / √SSY). - The β (beta), which is the standardized regression coefficient, is entirely separate from the beta you learned in Chapter 6. In that chapter, beta referred to the probability of making a Type II error, the probability of not getting a significant result when the research hypothesis is true. Also, note that the use of the term β for the standardized regression coefficient is an exception to the rule that Greek letters refer to population parameters. - This formula has the effect of changing the regular (unstandardized) regression coefficient (b), the size of which is related to the specific scales for the predictor and criterion variables, to a standardized regression coefficient (β) that shows the relationship between the predictor and criterion variables in terms of standard deviation units.
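A sketch of this formula on the same assumed sleep and mood scores; with one predictor, β should come out equal to r (about .85 here):

```python
# beta = (b)(sqrt(SSX) / sqrt(SSY))
import math

X = [5, 7, 8, 6, 6, 10]  # assumed predictor scores
Y = [2, 4, 7, 2, 3, 6]   # assumed criterion scores
mx, my = sum(X) / len(X), sum(Y) / len(Y)

ss_x = sum((x - mx) ** 2 for x in X)
ss_y = sum((y - my) ** 2 for y in Y)
b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / ss_x

beta = b * (math.sqrt(ss_x) / math.sqrt(ss_y))
print(round(beta, 2))  # 0.85, the same as r when there is one predictor
```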
slope
steepness of the angle of a regression line in a graph of the relation of scores on a predictor variable and predicted scores on a criterion variable; number of units the line goes up for every unit it goes across - The steepness of the angle of the regression line, called its slope, is the amount the line moves up for every unit it moves across. - In the example in Figure 12-1, the line moves up .004 on the GPA scale for every additional point on the SAT. In fact, the slope of the line is exactly b, the regression coefficient. (We don't usually think of SAT scores increasing by as little as 1 point—say, from 600 to 601. So instead of thinking about a 1-point increase in SAT giving a .004-point increase in GPA, it may be easier to think in terms of a 100-point increase in SAT giving a .4-point increase in GPA.) - For the hours of sleep and happy mood example shown in Figure 12-2, the value of the regression coefficient, b, is 1. So, the line moves up 1 on the happy mood scale for every additional hour of sleep.
4 steps on how to draw the regression line
step 1: Draw and label the axes for a scatter diagram. Remember to put the predictor variable on the horizontal axis. step 2: Figure the predicted value on the criterion variable for a low value of the predictor variable and mark the point on the graph. You make the prediction using the linear prediction rule you learned earlier: Ŷ = a + (b)(X). step 3: Do the same thing again, but for a high value on the predictor variable. It is best to pick a value of the predictor variable (X) that is much higher than the value you used in Step ❷, because this will make the two marks fairly far apart, so that your drawing will be more accurate. step 4: Draw a line that passes through the two marks. This is the regression line. You can check the accuracy of your line by finding any third point; it should also fall on the line.
an example of drawing the regression line
step 1: Draw and label the axes for a scatter diagram. Note that we labeled the Y axis "Predicted" college GPA, as the regression line shows the relationship between actual scores of the predictor variable (X) and predicted scores of the criterion variable (Ŷ). step 2: Figure the predicted value on the criterion variable for a low value of the predictor variable and mark the point on the graph. Recall from earlier that the linear prediction rule formula for predicting college GPA from an SAT score is Ŷ = .3 + (.004)(X). So, for an SAT score of 200 (a low SAT score), the predicted college GPA is 1.1. Thus, you mark this point (X = 200, Ŷ = 1.1) on the graph, as shown in Figure 12-3. step 3: Do the same thing again, but for a high value on the predictor variable. We saw earlier that if a person has an SAT score of 700 (a high SAT score), we predict a college GPA of 3.1 [that is, .3 + (.004)(700) = 3.1]. Thus, you mark this point (X = 700, Ŷ = 3.1) on the graph, as shown in Figure 12-3. step 4: Draw a line that passes through the two marks.
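The four steps translate directly into a short matplotlib sketch (the axis labels and the 200/700 values follow the example above):

```python
# Drawing the regression line for Y-hat = .3 + (.004)(X).
import matplotlib.pyplot as plt

a, b = 0.3, 0.004
x_low, x_high = 200, 700                       # a low and a high predictor value
y_low, y_high = a + b * x_low, a + b * x_high  # steps 2 and 3: 1.1 and 3.1

plt.scatter([x_low, x_high], [y_low, y_high])  # mark the two points
plt.plot([x_low, x_high], [y_low, y_high])     # step 4: line through the marks
plt.xlabel("SAT score (X)")
plt.ylabel("Predicted college GPA (Y-hat)")
plt.show()
```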
the steps for figuring the regression constant, a
step 1: Multiply the regression coefficient, b, by the mean of the X variable. step 2: Subtract the result of Step ❶ from the mean of the Y variable. This gives the regression constant, a.
steps for figuring the regression coefficient, b
step 1: Change the scores for each variable to deviation scores. Figure the mean of each variable. Then subtract each variable's mean from each of its scores. step 2: Figure the product of the deviation scores for each pair of scores. That is, for each pair of scores, multiply the deviation score on one variable by the deviation score on the other variable. step 3: Add up all the products of the deviation scores. step 4: Square each deviation score for the predictor variable (X). step 5: Add up the squared deviation scores for the predictor variable (X). step 6: Divide the sum of the products of deviation scores from Step ❸ by the sum of squared deviations for the predictor variable (X) from Step ❺. This gives the regression coefficient, b.
The main part of this chapter considers in some detail the logic and procedures for making predictions about one variable,
such as predicting college grade point average (GPA), based on information about another variable, such as SAT scores.
sum of the squared errors
sum of the squared differences between each predicted score and actual score on the criterion variable - Thus, to evaluate how good a prediction rule is, we figure the sum of the squared errors that we would make using that rule. Table 12-4 gives the sum of squared errors for each of the four prediction rules shown in Figure 12-6 and Table 12-3. To avoid making Table 12-4 too complex, we show only the actual figuring of the sum of squared errors for Rule 1 and Rule 4. The scores in each "Error" column show the result of subtracting the predicted score on the criterion variable (Ŷ) using the rule from the actual score on the criterion variable (Y) for each of the six students in the example (error = Y - Ŷ). We then squared each error score for each rule. The sum of these squared error scores was 73.33 for Rule 1 and 6.00 for Rule 4. We repeated this process for Rules 2 and 3. Overall, the results for the sum of squared errors for Rules 1, 2, 3, and 4 were 73.33, 22.00, 7.50, and 6.00, respectively. Because the goal is to come up with a linear prediction rule (values for a and b in the formula Ŷ = a + (b)(X)) that creates the smallest sum of squared errors, we would in this case pick Rule 4.
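A quick sketch of the figuring for Rule 4 (a = -3, b = 1), again on the assumed sleep and mood scores; it reproduces the 6.00 reported for that rule:

```python
# Sum of squared errors: sum of (Y - Y-hat)^2 over all people in the study.
X = [5, 7, 8, 6, 6, 10]  # assumed predictor scores
Y = [2, 4, 7, 2, 3, 6]   # assumed criterion scores

def sum_squared_errors(a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))

print(sum_squared_errors(-3, 1))  # 6.0, matching Rule 4 in Table 12-4
```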
In picking this linear prediction rule, we have used what statisticians call
the least squares criterion. - That is, we found the regression line that gave the lowest sum of squared errors between the actual scores on the criterion variable (Y) and the predicted scores on the criterion variable (Ŷ). In the next section, you learn how to find the values of a and b for the linear prediction rule that gives the smallest sum of squared errors possible. - Remember, the regression line is a visual display of the linear prediction rule.
But with prediction we have to decide which
variable is being predicted from and which variable is being predicted.
Ideally, we would like a prediction rule that is not only simpler than this kind of complicated rule but also does not depend on only a few cases for each prediction. The solution that is favored by research psychologists is a rule of the form "to predict a person's score on Y, start with some baseline number,
which we will call a, then add to it the result of multiplying a special predictor value, which we will call b, by the person's score on X." For our SAT and GPA example, the rule might be "to predict a person's graduating GPA, start with .3 and add the result of multiplying .004 by the person's SAT score." That is, the baseline number (a) would be .3 and the predictor value (b) would be .004. Thus, if a person had an SAT of 600, we would predict the person would graduate with a GPA of 2.7. That is, the baseline number of .3 plus the result of multiplying the predictor value of .004 by 600 gives .3 plus 2.4, which equals 2.7. For a student with an SAT of 700, we would predict a graduating GPA of 3.1 [that is, .3 + (.004)(700) = 3.1]. - This is an example of a linear prediction rule (or linear prediction model). We will see in the next main section why it is called "linear." For now it is enough just to note that "linear" here means the same thing as it did when we considered correlation in the last chapter: Lows go with lows and highs with highs (or, for a negative correlation, lows with highs and highs with lows). In our SAT score and college GPA example, a low score on the predictor variable (SAT score) predicts a low score on the criterion variable (college GPA), and a high score on the predictor variable predicts a high score on the criterion variable.
One of the ways correlation and prediction look different is this:
with correlation it did not matter much which variable was which
if two variables are correlated it means that
you can predict one from the other. - So if sleep the night before is correlated with happiness the next day, this means that you should be able, to some extent, to predict how happy a person will be the next day from knowing how much sleep the person got the night before
But if two variables are not correlated, then knowing about one does not help
you predict the other. - So if shoe size and income have a zero correlation, knowing a person's shoe size does not allow you to predict anything about the person's income. As we proceed, we will be referring again to the connections of correlation with prediction. But for now let us turn to prediction before we come back to correlation.
linear prediction rules (formula)
Ŷ = a + (b)(X) - A person's predicted score on the criterion variable equals the regression constant, plus the result of multiplying the regression coefficient by that person's score on the predictor variable. - Ŷ is the person's predicted score on the criterion variable - a is the regression constant - b is the regression coefficient, and X is the person's score on the predictor variable - The ^ symbol over the Y means "predicted value of" and is called a hat (so, Ŷ is said "Y hat")