stats- chapter 12: prediction

here are the steps for figuring the regression coefficient, b:

1. change the scores for each variable to deviation scores: figure the mean of each variable, then subtract each variable's mean from each of its scores
2. figure the product of the deviation scores for each pair of scores: that is, for each pair of scores, multiply the deviation score on one variable by the deviation score on the other variable
3. add up all the products of the deviation scores
4. square each deviation score for the predictor variable (x)
5. add up the squared deviation scores for the predictor variable (x)
6. divide the sum of the products of deviation scores from step 3 by the sum of squared deviations for the predictor variable (x) from step 5. this gives the regression coefficient, b
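
The six steps can be sketched in Python. This is only an illustration, not code from the textbook: the function name `regression_coefficient` is made up here, and the scores are the four x and y values from the practice problem elsewhere in this set.

```python
def regression_coefficient(xs, ys):
    """Figure b by following the six steps (illustrative helper, not from the text)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Step 1: change the scores to deviation scores
    dev_x = [x - mx for x in xs]
    dev_y = [y - my for y in ys]
    # Step 2: product of the deviation scores for each pair of scores
    products = [dx * dy for dx, dy in zip(dev_x, dev_y)]
    # Step 3: add up all the products
    sum_products = sum(products)
    # Step 4: square each deviation score for the predictor variable
    squared_dev_x = [dx ** 2 for dx in dev_x]
    # Step 5: add up the squared deviation scores (SSx)
    ss_x = sum(squared_dev_x)
    # Step 6: divide the sum of products by SSx
    return sum_products / ss_x


xs = [4, 6, 7, 3]  # predictor scores from the practice problem
ys = [6, 8, 3, 7]  # criterion scores from the practice problem
b = regression_coefficient(xs, ys)
print(round(b, 2))  # -0.6
```

The result matches the b of -.6 in the practice problem's prediction rule, y = 9 - (.6)(x).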

notice that you need to know the value of b in order to figure the value of

a

1. what does a standardized regression coefficient show

a standardized regression coefficient shows how much of a standard deviation the predicted value of the criterion variable changes when the predictor variable changes by one standard deviation

standardized regression coefficient

β = (b)(√SSx / √SSy)

finding a and b for the least squares linear prediction rule

b = Σ[(x-mx)(y-my)] / SSx. the regression coefficient is the sum, over all the people in the study, of the product of each person's two deviation scores, divided by the sum of everyone's squared deviation scores on the predictor variable

in this formula,

b is the regression coefficient. x-mx is the deviation score for each person on the x (predictor) variable and y-my is the deviation score for each person on the y (criterion) variable; (x-mx)(y-my) is the product of deviation scores for each person; and Σ[(x-mx)(y-my)] is the sum of the products of deviation scores over all the people in the study. SSx is the sum of squared deviations for the x variable.

multiple correlation

correlation of a criterion variable with two or more predictor variables

linear prediction rule

formula for making predictions; that is, formula for predicting a person's score on a criterion variable based on the person's score on one or more predictor variables

3. write the formula for the linear prediction rule and define each of the symbols

formula for the linear prediction rule: Y = a + (b)(X). Y is the predicted score on the criterion variable; a is the regression constant; b is the regression coefficient; and X is the score on the predictor variable

8. list four conditions that may lead to regression procedures being inaccurate

four conditions that may lead to regression procedures being inaccurate: (i) restriction in range, (ii) curvilinear associations, (iii) unreliable measurement, (iv) outliers

regression constant (a)

in a linear prediction rule, particular fixed number added into the prediction

4. in multiple regression, why are the standardized regression coefficients for each predictor variable often smaller than the ordinary correlation coefficient of that predictor variable with the criterion variable?

in multiple regression, a predictor variable's association with the criterion variable usually overlaps with the other predictor variables' association with the criterion variable. Thus, the unique association of a predictor variable with the criterion variable (as shown by the standardized regression coefficient) is usually smaller than the ordinary correlation of the predictor variable with the criterion variable

multiple correlation coefficient (R)

in multiple regression, the correlation between the criterion variable and all the predictor variables taken together

criterion variable (usually y)

in prediction, a variable that is predicted (variable being predicted)

error

in prediction, the difference between a person's predicted score on the criterion variable and the person's actual score on the criterion variable

predictor variable (usually x)

in prediction, variable that is used to predict scores of individuals on another variable (variable being predicted from)

regression line

line on a graph such as a scatter diagram showing the predicted value of the criterion variable for each value of the predictor variable; visual display of the linear prediction rule

remember, the regression line is a visual display of the

linear prediction rule

draw the regression line for x and y where a = 4 and b = 1.33 (put values from 0 to 12 on the x axis and values of 0 to 20 on the y axis)

look at page 502
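
The drawing itself is on the textbook page and is not reproduced here. As a rough aid, the two points needed to draw the line can be figured with the linear prediction rule. The choice of x = 0 and x = 12 as the low and high values is an assumption based on the x axis range given in the card.

```python
a = 4      # regression constant (intercept), from the card
b = 1.33   # regression coefficient (slope), from the card


def predict(x):
    """Linear prediction rule: y = a + (b)(x)."""
    return a + b * x


# A low and a high value of x, taken from the ends of the x axis (0 to 12)
for x in (0, 12):
    print(f"x = {x}: predicted y = {predict(x):.2f}")
```

The line passes through (0, 4) and roughly (12, 19.96), which fits inside the 0 to 20 range asked for on the y axis.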

1. what is multiple regression

multiple regression is the procedure for predicting a criterion variable from a prediction rule that includes more than one predictor variable

regression coefficient (b)

number multiplied by a person's score on a predictor variable as part of a linear prediction rule

4. in a particular prediction rule, a= -1.23 and b = 6.11. What is the predicted score on the criterion variable if the score on the predictor variable is (a) 2.00; (b) 4.87; (c) -1.92?

predicted scores: (a) y = a + (b)(x) = -1.23 + (6.11)(2.00) = 10.99; (b) -1.23 + (6.11)(4.87) = 28.53; (c) -1.23 + (6.11)(-1.92) = -12.96
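
The three predictions can be checked with a short script. This is a minimal sketch; the function name `predict` is just a label chosen here.

```python
a = -1.23  # regression constant from the problem
b = 6.11   # regression coefficient from the problem


def predict(x):
    """Linear prediction rule: y = a + (b)(x)."""
    return a + b * x


for x in (2.00, 4.87, -1.92):
    print(f"x = {x:.2f} -> predicted y = {predict(x):.2f}")
```

Rounded to two decimals, this gives 10.99, 28.53, and -12.96, the same values as in the answer.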

bivariate prediction

prediction of scores on one variable based on scores of one other variable. Also called bivariate regression

multiple regression

procedure for predicting scores on a criterion variable from scores on two or more predictor variables

standardized regression coefficient (beta)

regression coefficient in standard deviation units. it shows the predicted amount of change in standard deviation units of the criterion variable if the value of the predictor variable increases by one standard deviation

it turns out that the intercept is the same as the

regression constant

slope

steepness of the angle of a regression line in a graph of the relation of scores on a predictor variable and predicted scores on a criterion variable; number of units the line goes up for every unit it goes across

sum of the squared errors

sum of the squared differences between each predicted score and actual score on the criterion variable; used to evaluate how good a prediction rule is

7. what are the assumptions of the significance test for prediction

the assumptions of the significance test for prediction are: there is an equal distribution of each variable at each point of the other variable; the relationship between the variables is linear; the people (or cases) are independent; and the error scores follow a normal distribution

6. what are the effect size conventions for multiple regression

the effect size conventions for multiple regression (in terms of R²) are .02, a small effect size; .13, a medium effect size; and .26, a large effect size

1. what is the least squared error principle

the least squared error principle is the principle that the best prediction rule is the rule that gives the lowest sum of squared errors between the actual scores on the criterion variable and the predicted scores on the criterion variable

4. how is hypothesis testing carried out with a regression coefficient

when predicting a criterion variable based on scores on one predictor variable, the hypothesis test for a regression coefficient is the same as the hypothesis test for the correlation between the two variables. The test uses a t statistic and tests whether the regression coefficient is significantly different from 0. A statistically significant regression coefficient means that knowing a person's score on the predictor variable provides useful information for predicting a person's score on the criterion variable

that is,

x predicts y

a multiple regression linear prediction rule with three predictor variables goes like this:

y = a + (b1)(x1) + (b2)(x2) + (b3)(x3)
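
A multiple regression prediction rule with any number of predictors can be sketched as follows. The function name and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow; they do not come from the chapter.

```python
def predict_multiple(a, coefficients, scores):
    """y = a + (b1)(x1) + (b2)(x2) + ... for any number of predictors."""
    if len(coefficients) != len(scores):
        raise ValueError("need one score for each regression coefficient")
    return a + sum(b * x for b, x in zip(coefficients, scores))


# Hypothetical values for illustration only
a = 2.0                  # regression constant
bs = [0.5, 1.0, -2.0]    # b1, b2, b3
xs = [4.0, 3.0, 1.0]     # one person's scores x1, x2, x3

print(predict_multiple(a, bs, xs))  # 2.0 + 2.0 + 3.0 - 2.0 = 5.0
```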

all linear prediction rules have this formula:

y = a + (b)(x). a person's predicted score on the criterion variable equals the regression constant, plus the result of multiplying the regression coefficient by that person's score on the predictor variable

4. (a) find the linear prediction rule for predicting y scores from x scores based on the following numbers. (b) figure the sum of squared errors when using the linear prediction rule from part (a) to predict the scores for the four people in this study. (c) repeat part (b), this time using the prediction rule y = 9 - (.7)(x). (d) why must your answer to (b) be lower than your answer to (c)? x values- 4,6,7,3; y values= 6,8,3,7

(a) the figuring is shown in table 12-6. the linear prediction rule is y = 9 - (.6)(x). (b) as shown in table 12-7, the sum of squared errors is 10.40. (c) as shown in table 12-7, the sum of squared errors is 11.50. (d) the linear prediction rule figured in part (a) is the rule that gives the smallest sum of squared errors; so the sum of squared errors for any other rule will be larger than this sum
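
Tables 12-6 and 12-7 are not included in this set, so here is a short sketch that redoes the figuring for parts (a) through (c). The helper name `sum_squared_errors` is an assumption made for this example.

```python
def sum_squared_errors(a, b, xs, ys):
    """Sum of squared differences between actual and predicted y scores."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [4, 6, 7, 3]
ys = [6, 8, 3, 7]

# Part (a): least squares values of b and a
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
ss_x = sum((x - mx) ** 2 for x in xs)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ss_x
a = my - b * mx

# Parts (b) and (c): squared error for the best rule and for y = 9 - (.7)(x)
sse_best = sum_squared_errors(a, b, xs, ys)
sse_other = sum_squared_errors(9, -0.7, xs, ys)

print(round(a, 2), round(b, 2))  # 9.0 -0.6
print(round(sse_best, 2))        # 10.4
print(round(sse_other, 2))       # 11.5
```

The rule from part (a) gives the smaller sum (10.40 versus 11.50), as part (d) says it must.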

3. (a) write the formula for the regression coefficient, b, and define each of the symbols. (b) write the formula for the regression constant, a, and define each of the symbols

(a) the formula for the regression coefficient: b = Σ[(x-mx)(y-my)]/SSx. b is the regression coefficient; Σ is the symbol for sum of: add up all the scores that follow (in this formula, you add up all the products of deviation scores that follow); x-mx is the deviation score for each person on the predictor (x) variable; y-my is the deviation score for each person on the criterion (y) variable; SSx is the sum of squared deviations for the predictor variable. (b) a = my-(b)(mx). a is the regression constant; my is the mean of the criterion variable; b is the regression coefficient; mx is the mean of the predictor variable

2. (a) write the formula for the standardized regression coefficient, β, and define each of the symbols. (b) figure the value of β when b = -1.21, SSx = 2.57, and SSy = 7.21

(a) the formula for the standardized regression coefficient is β = (b)(√SSx / √SSy). β (beta) is the standardized regression coefficient; b is the regular, unstandardized regression coefficient; SSx is the sum of squared deviations for the predictor variable (x); and SSy is the sum of squared deviations for the criterion variable (y). (b) β = (-1.21)(√2.57 / √7.21) = (-1.21)(1.60 / 2.69) = -.72
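
Part (b) can be figured directly from the formula. This is a minimal sketch; the function name `standardized_coefficient` is made up for the example.

```python
import math


def standardized_coefficient(b, ss_x, ss_y):
    """beta = (b)(sqrt(SSx) / sqrt(SSy))."""
    return b * math.sqrt(ss_x) / math.sqrt(ss_y)


beta = standardized_coefficient(b=-1.21, ss_x=2.57, ss_y=7.21)
print(round(beta, 2))  # -0.72
```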

4. (a) what is the intercept of the regression line? (b) what is it equivalent to in the linear prediction rule?

(a) the intercept of the regression line is the point at which the regression line crosses the vertical axis (assuming the vertical axis is at 0 on the horizontal axis). (b) in the linear prediction rule, it is equivalent to a, the regression constant

3. (a) what is the slope of the regression line? (b) what is it equivalent to in the linear prediction rule?

(a) the slope of the regression line is the amount the line goes up for every one unit it moves across. (b) in the linear prediction rule, it is equivalent to b, the regression coefficient

earlier in the chapter, we told you that the regression coefficient, b, for the hours of sleep and mood example was 1, and the regression constant, a, was -3. Now let's see how we figured those values. The figuring for the regression coefficient, b, is shown in Table 12-5. Using the steps,

1. change the scores for each variable to deviation scores: figure the mean of each variable, then subtract each variable's mean from each of its scores. the deviation scores are shown in the x-mx and y-my columns in table 12-5
2. figure the product of the deviation scores for each pair of scores: that is, for each pair of scores, multiply the deviation score on one variable by the deviation score on the other variable
3. add up all the products of the deviation scores: adding up all the products of the deviation scores, as shown in the final column of table 12-5, gives a sum of 16
4. square each deviation score for the predictor variable (x)
5. add up the squared deviation scores for the predictor variable (x). as shown in the (x-mx)^2 column of table 12-5, the sum of squared deviations for the sleep variable is 16
6. divide the sum of the products of deviation scores from step 3 by the sum of squared deviations for the predictor variable (x) from step 5. dividing 16 by 16 gives a result of 1. this is the regression coefficient. in terms of the formula, b = Σ[(x-mx)(y-my)]/SSx = 16/16 = 1

how to draw the regression line

1. draw and label the axes for a scatter diagram: remember to put the predictor variable on the horizontal axis
2. figure the predicted value on the criterion variable for a low value of the predictor variable and mark the point on the graph: you make the prediction using the linear prediction rule you learned earlier, y = a + (b)(x)
3. do the same thing again, but for a high value on the predictor variable: it is best to pick a value of the predictor variable (x) that is much higher than the one you used in step 2. this makes the dots fairly far apart, so that your drawing will be more accurate
4. draw a line that passes through the two marks: this is the regression line

here are the steps for figuring the regression constant, a:

1. multiply the regression coefficient, b, by the mean of the x variable
2. subtract the result of step 1 from the mean of the y variable: this gives the regression constant, a
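
The two steps can be sketched as a small function. The name `regression_constant` is an assumption made here. The inputs come from the practice problem in this set: b = -.6, and the means of the x values (4, 6, 7, 3) and y values (6, 8, 3, 7) are 5 and 6.

```python
def regression_constant(b, mx, my):
    """Figure a = my - (b)(mx) in two steps."""
    # Step 1: multiply the regression coefficient by the mean of x
    b_times_mx = b * mx
    # Step 2: subtract that result from the mean of y
    return my - b_times_mx


a = regression_constant(b=-0.6, mx=5, my=6)
print(round(a, 2))  # 9.0
```

This agrees with the regression constant of 9 in that problem's prediction rule.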

summary

1. prediction (or regression) involves making predictions about scores on a criterion variable based on scores on a predictor variable
2. the linear prediction rule for predicting scores on a criterion variable from scores on a predictor variable is y = a + (b)(x), where y is the predicted score on the criterion variable, a is the regression constant, b is the regression coefficient, and x is the score on the predictor variable
3. a regression line, which is drawn in the same kind of graph as a scatter diagram, shows the predicted criterion variable value (y) for each value of the predictor variable (x). the slope of this line equals b; a is where this line crosses the vertical axis (the intercept). a regression line is a visual display of a linear prediction rule
4. the best linear prediction rule is the rule that gives the lowest sum of squared errors between the actual scores on the criterion variable and the predicted scores on the criterion variable. there are formulas for figuring the regression constant (a) and the regression coefficient (b) that will give the linear prediction rule with the smallest sum of squared errors
5. a standardized regression coefficient (beta) shows how much of a standard deviation the predicted value of the criterion variable changes when the predictor variable changes by one standard deviation. the standardized regression coefficient can be figured from the regular regression coefficient and the sum of squared deviations for the predictor and criterion variables. when predicting scores on the criterion variable from scores on one other variable (bivariate prediction), the standardized regression coefficient (beta) is the same as the correlation coefficient (r) between the two variables
6. in bivariate prediction, the hypothesis test for a regression coefficient is the same as the hypothesis test for the correlation between the two variables
7. in multiple regression, a criterion variable is predicted from two or more predictor variables. in a multiple regression linear prediction rule, there is a regression constant, the score for each predictor variable is multiplied by its own regression coefficient, and the results are added up to make the prediction. each regression coefficient tells you the unique relation of the predictor to the criterion variable in the context of the other predictor variables. the multiple correlation coefficient (R) is the overall degree of association between the criterion variable and the predictor variables taken together. R² is the overall proportionate reduction in error for multiple regression
8. the measure of effect size for multiple regression is R². the assumptions for the significance test for prediction are as follows: there is an equal distribution of each variable at each point of the other variable; the relationship between the variables is linear; the people (or cases) are independent; and the error scores follow a normal distribution. bivariate prediction and multiple regression have the same limitations as ordinary correlation. in addition, in multiple regression there is ambiguity in interpreting the relative importance of the predictor variables
9. bivariate prediction results are rarely described directly in research articles, but regression lines are sometimes shown when prediction rules for more than one group are being compared. multiple regressions are commonly reported in articles, often in a table that includes the regression coefficients and overall proportionate reduction in error (R²)

so you first use formula (12-2) to find the value of b and then use formula (12-3) to figure the value of a

12-2: b = Σ[(x-mx)(y-my)] / SSx
12-3: a = my - (b)(mx)

2. write the multiple regression linear prediction rule with two predictors and define each of the symbols

the multiple regression linear prediction rule is y = a + (b1)(x1) + (b2)(x2). y is the predicted score on the criterion variable; a is the regression constant; b1 is the regression coefficient for the first predictor variable; x1 is the person's score on the first predictor variable; b2 is the regression coefficient for the second predictor variable; and x2 is the person's score on the second predictor variable

intercept

the point where the regression line crosses the vertical axis; the regression constant (a)

a = my - (b)(mx)

the regression constant is the mean of the criterion variable minus the result of multiplying the regression coefficient by the mean of the predictor variable

2. what is the relationship between the regression line and the linear prediction rule

the regression line is a visual display of the linear prediction rule

1. what does the regression line show

the regression line shows the relationship between the predictor variable (x) and predicted values of the criterion variable (y)

3. when predicting scores on a criterion variable from scores on one predictor variable, the standardized regression coefficient has the same value as what other statistic?

the standardized regression coefficient has the same value as r, the correlation coefficient between the two variables
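
This can be checked numerically. The sketch below uses the four x and y scores from the practice problem in this set. The correlation is figured with the usual deviation-score formula, r = Σ[(x-mx)(y-my)] / √(SSx · SSy), which is assumed here because it is not written out in this set.

```python
import math

xs = [4, 6, 7, 3]
ys = [6, 8, 3, 7]

mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
sum_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
ss_x = sum((x - mx) ** 2 for x in xs)
ss_y = sum((y - my) ** 2 for y in ys)

# Regular and standardized regression coefficients
b = sum_products / ss_x
beta = b * math.sqrt(ss_x) / math.sqrt(ss_y)

# Correlation coefficient from the same deviation scores
r = sum_products / math.sqrt(ss_x * ss_y)

print(round(beta, 3), round(r, 3))  # -0.507 -0.507
```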

1. fill in the blanks: the variable being predicted from is called the ___ variable and the variable being predicted is called the ___ variable

the variable being predicted from is called the predictor variable and the variable being predicted is called the criterion variable

5. what are the different hypothesis tests for multiple regression

there is a hypothesis test to test the significance of the multiple correlation. Also, a hypothesis test can be carried out for each predictor variable to test whether its regression coefficient is significantly different from 0. Finally, there is a hypothesis test to test whether the regression constant is significantly different from 0

2. what is the linear prediction rule in words?

to predict a person's score on a criterion variable (y), start with a particular regression constant and add to it the result of multiplying the particular regression coefficient by the person's score on the predictor variable (x)

2. give three advantages of using squared errors, as opposed to unsquared errors, when figuring the best linear prediction rule

unlike unsquared errors, squared errors do not cancel each other out. using squared errors penalizes large errors more than small errors; they also make it easier to do more advanced computations

why should the linear prediction rule be used for making predictions only within the same range of scores in the group of people studied that was the basis for forming the particular prediction rule?

using scores outside the range for the predictor variable may give unrealistic (or even impossible) predicted scores for the criterion variable
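
One way to respect this limit in code is to refuse predictions for x values outside the range that was studied. The sketch below is only an illustration: the function name and the decision to raise an error are choices made here, and the rule y = 9 - (.6)(x) and x scores come from the practice problem in this set.

```python
def predict_within_range(x, a, b, observed_xs):
    """Apply y = a + (b)(x) only if x lies within the studied range of x."""
    low, high = min(observed_xs), max(observed_xs)
    if not low <= x <= high:
        raise ValueError(
            f"x = {x} is outside the studied range {low} to {high}"
        )
    return a + b * x


observed_xs = [4, 6, 7, 3]  # predictor scores the rule was based on

print(predict_within_range(5, a=9, b=-0.6, observed_xs=observed_xs))  # 6.0

try:
    # With this rule, x = 20 would predict 9 - 12 = -3, which may be an impossible score
    predict_within_range(20, a=9, b=-0.6, observed_xs=observed_xs)
except ValueError as error:
    print(error)
```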

