Chapter 9: Correlation and Regression
Section 2
Linear Regression
residuals (p. 486)
On a scatter plot, the differences between the observed *y* - value and the predicted *y* - value for a given *x* - value. For a given *x* - value, *d* = (observed *y* - value) - (predicted *y* - value)
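As a small illustration of this definition, the sketch below computes the residuals for the study-hours data from Exercise 2 (p. 488), using the rounded regression line *ŷ* = 6.343*x* + 37.14 from that exercise's answers. The function name `residuals` is an illustrative choice and does not come from the textbook.

```python
def residuals(xs, ys, slope, intercept):
    """Return d = (observed y) - (predicted y) for each data pair."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Data from Exercise 2 (p. 488): hours studied and test scores.
hours = [0, 1, 2, 3, 4, 5]
scores = [38, 45, 52, 48, 62, 73]

d = residuals(hours, scores, slope=6.343, intercept=37.14)
for x, y, resid in zip(hours, scores, d):
    # A positive residual means the point lies above the line.
    print(f"x = {x}, observed y = {y}, residual = {resid:.3f}")
```

A point with a negative residual lies below the regression line, and a point with a positive residual lies above it.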
*Using Table 11 for the Correlation Coefficient, "ρ"* (p. 476)
*In Words:* *1.)* Determine the number of pairs of data in the sample. *2.)* Specify the level of significance. *3.)* Find the critical value. *4.)* Decide whether the correlation coefficient is significant. *5.)* Interpret the decision in the context of the original claim. *In Symbols:* *1.)* Determine *n*. *2.)* Identify *α*. *3.)* (*Use Table 11 in Appendix B.*) *4.)* If |*r*| is greater than the critical value, then the correlation coefficient is significant. Otherwise, there is *not* enough evidence to conclude that the correlation coefficient is significant.
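The decision rule in step 4 can be sketched in Python as follows. Table 11 itself is not reproduced in this set, so the dictionary below holds only the three *α* = 0.01 critical values that appear in the exercise answers here (*n* = 8, 11, 12). The names `CRITICAL_R_ALPHA_01` and `is_significant` are hypothetical.

```python
# Excerpt of critical values for alpha = 0.01, taken from the exercise
# answers in this set. The real Table 11 has many more rows and columns.
CRITICAL_R_ALPHA_01 = {8: 0.834, 11: 0.735, 12: 0.708}


def is_significant(r, n, table=CRITICAL_R_ALPHA_01):
    """Return True when |r| is greater than the critical value for n pairs."""
    critical_value = table[n]  # raises KeyError if n is not in the excerpt
    return abs(r) > critical_value


print(is_significant(0.979, 11))   # vocabulary data, Exercise 2 (p. 471)
print(is_significant(0.051, 8))    # height and IQ data, Exercise 4 (p. 474)
print(is_significant(-0.976, 12))  # half squat and sprint data, Exercise 5 (p. 475)
```

Taking the absolute value first means a strong negative correlation is treated the same way as a strong positive one.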
*Calculating a Correlation Coefficient* (p. 474)
*In Words:* *1.)* Find the sum of the *x* - values. *2.)* Find the sum of the *y* - values. *3.)* Multiply each *x* - value by its corresponding *y* - value and find the sum. *4.)* Square each *x* - value and find the sum. *5.)* Square each *y* - value and find the sum. *6.)* Use these five sums to calculate the correlation coefficient. *In Symbols:* *1.)* ∑*x* *2.)* ∑*y* *3.)* ∑*xy* *4.)* ∑*x²* *5.)* ∑*y²* *6.)* *r* = (*n*∑*xy* - (∑*x*)(∑*y*)) / (√(*n*∑*x²* - (∑*x*)²) × √(*n*∑*y²* - (∑*y*)²))
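The six steps above can be sketched in Python as follows. This is a minimal illustration, not the textbook's code; the function name `correlation_coefficient` is an assumption, and the sample data are the study-hours pairs from Exercise 2 (p. 488).

```python
import math


def correlation_coefficient(xs, ys):
    """Compute the sample correlation coefficient r from the five sums."""
    n = len(xs)
    sum_x = sum(xs)                              # step 1
    sum_y = sum(ys)                              # step 2
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # step 3
    sum_x2 = sum(x * x for x in xs)              # step 4
    sum_y2 = sum(y * y for y in ys)              # step 5
    # Step 6: combine the five sums.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


hours = [0, 1, 2, 3, 4, 5]
scores = [38, 45, 52, 48, 62, 73]
print(round(correlation_coefficient(hours, scores), 3))
```

Both square roots belong in the denominator, so the numerator is divided by their product.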
*Using the "t" - Test for the Correlation Coefficient, "ρ"* (p. 478)
*In Words:* *1.)* Identify the null and alternative hypotheses. *2.)* Specify the level of significance. *3.)* Identify the degrees of freedom. *4.)* Determine the critical value(s) and the rejection region(s). *5.)* Find the standardized test statistic. *6.)* Make a decision to reject or fail to reject the null hypothesis. *7.)* Interpret the decision in the context of the original claim. *In Symbols:* *1.)* State *H₀* and *Hₐ*. *2.)* Identify *α*. *3.)* d.f. = *n* - 2 *4.)* (*Use Table 5 in Appendix B.*) *5.)* *t* = (*r*)/√((1 - *r²*)/(*n* - 2)) *6.)* If *t* is in the rejection region, then reject *H₀*. Otherwise, fail to reject *H₀*.
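Steps 3 through 6 can be sketched in Python as follows, assuming a two-tailed test (*H₀*: *ρ* = 0, *Hₐ*: *ρ* ≠ 0) and a critical value *t₀* that the caller has already looked up in Table 5. The function names are hypothetical. In the usage lines, *n* = 6 and *t₀* = 2.776 come from Exercise 7 (p. 479), while *r* ≈ 0.622 is a value computed here from that exercise's table; it is not stated in the source.

```python
import math


def t_statistic(r, n):
    """Standardized test statistic t = r / sqrt((1 - r^2) / (n - 2))."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


def t_test_decision(r, n, critical_t):
    """Return (t, reject_h0) for a two-tailed test with critical value t0.

    critical_t must be read from a t-table for d.f. = n - 2 by the caller.
    """
    t = t_statistic(r, n)
    reject_h0 = abs(t) > critical_t  # t falls in a rejection region
    return t, reject_h0


t, reject_h0 = t_test_decision(r=0.622, n=6, critical_t=2.776)
print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

Because the test is two-tailed, the comparison uses |*t*|, which covers both rejection regions at once.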
*The "t" - Test for the Correlation Coefficient* (p. 478)
A *"t" - test* can be used to test whether the correlation between two variables is significant. The *test statistic* is *r*, and the *standardized test statistic* *t* = (*r*)/(*σᵣ*) = (*r*)/√((1 - *r²*)/(*n* - 2)) follows a *t* - distribution with *n* - 2 degrees of freedom, where *n* is the number of pairs of data. (Note that there are *n* - 2 degrees of freedom because one degree of freedom is lost for each variable.)
correlation coefficient (p. 473)
A measure of the strength and the direction of a linear relationship between two variables. The symbol *r* represents the sample correlation coefficient. A formula for *r* is: *r* = (*n*∑*xy* - (∑*x*)(∑*y*)) / (√(*n*∑*x²* - (∑*x*)²) × √(*n*∑*y²* - (∑*y*)²)) (*Sample correlation coefficient*) where *n* is the number of pairs of data. The *population correlation coefficient* is represented by *ρ* (the lowercase Greek letter rho, pronounced *"row"*).
Correlation
A relationship between two variables. The data can be represented by the ordered pairs (*x*, *y*), where *x* is the *independent* (or *explanatory*) *variable*, and *y* is the *dependent* (or *response*) *variable*. (p. 470)
*(Y.T.I.): Exercise 2 (p. 471):* The accompanying table shows the ages (in years) of 11 children and the numbers of words in their vocabulary. Complete parts 1 (a) through 5 (d) below. *Age (x)* = *Vocabulary size (y)* *1* = *8* *2* = *260* *3* = *560* *4* = *1,200* *5* = *2,100* *6* = *2,500* *3* = *640* *5* = *2,200* *4* = *1,300* *6* = *2,300* *(I didn't copy the accompanying table of critical values for the Pearson correlation coefficient because it's too big.)* *Part 1 (a):* Display the data in a scatter plot. Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatter plot has a horizontal axis labeled *Age (years)* from *0 to 8* in increments of *1*, and a vertical axis labeled *Vocabulary size* from *0 to 3,000* in increments of *300*. The following 11 points are plotted (*Age (years)*, *Vocabulary size*): *(1, 0); (2, 250); (2, 300); (3, 550); (3, 650); (4, 1200); (4, 1300); (5, 2100); (5, 2200); (6, 2300); (6, 2500)*. From left to right, the points follow a general trend of *rising* from left to right at a constant rate. All vertical coordinates are approximate. B.) A scatter plot has a horizontal axis labeled *Age (years)* from *0 to 8* in increments of *1*, and a vertical axis labeled *Vocabulary size* from *0 to 3,000* in increments of *300*. The following 11 points are plotted (*Age (years)*, *Vocabulary size*): *(1, 2500); (2, 2300); (2, 2200); (3, 2100); (3, 1300); (4, 1200); (4, 650); (5, 550); (5, 300); (6, 250); (6, 0)*. The points follow a general trend of *falling* from left to right at a constant rate. All vertical coordinates are approximate. C.) A scatter plot has a horizontal axis labeled *Age (years)* from *0 to 3,000* in increments of *300*, and a vertical axis labeled *Vocabulary size* from *0 to 8* in increments of *1*. 
The following 11 points are plotted (*Age (years)*, *Vocabulary size*): *(0, 1); (250, 2); (300, 2); (550, 3); (650, 3); (1200, 4); (1300, 4); (2100, 5); (2200, 5); (2300, 6); (2500, 6)*. The points follow a general trend of *rising* from left to right at a constant rate. All horizontal coordinates are approximate. D.) A scatter plot has a horizontal axis labeled *Age (years)* from *0 to 3,000* in increments of *300*, and a vertical axis labeled *Vocabulary size* from *0 to 8* in increments of *1*. The following 11 points are plotted (*Age (years)*, *Vocabulary size*): *(0, 6); (250, 6); (300, 5); (550, 5); (650, 4); (1200, 4); (1300, 3); (2100, 3); (2200, 2); (2300, 2); (2500, 1)*. The points follow a general trend of *falling* from left to right at a constant rate. All horizontal coordinates are approximate. *Part 2 (b):* Calculate the sample correlation coefficient, *r*. (Round answer to *three* decimal places.) *r* = *___* *Part 3 (c):* Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is *______________________* linear correlation. *Part 4 (c):* Interpret the correlation. Choose the correct answer below. A.) Based on the correlation, there does not appear to be a linear relationship between children's ages and the number of words in their vocabulary B.) Aging causes the number of words in children's vocabulary to increase. C.) As age increases, the number of words in children's vocabulary tends to decrease. D.) Aging causes the number of words in children's vocabulary to decrease. E.) Based on the correlation, there does not appear to be any relationship between children's ages and the number of words in their vocabulary. F.) As age increases, the number of words in children's vocabulary tends to increase. *Part 5 (d):* Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let *α* = 0.01. 
(Round answer(s) to *three* decimal places.) The critical value is *__(1)__*. Therefore, there *_(2)_* sufficient evidence at the 1% level of significance to conclude that *____________________(3)____________________* between children's ages and the number of words in their vocabulary.
Correct Answers: *Part 1 (a):* A.) A scatter plot has a horizontal axis labeled *Age (years)* from *0 to 8* in increments of *1*, and a vertical axis labeled *Vocabulary size* from *0 to 3,000* in increments of *300*. The following 11 points are plotted (*Age (years)*, *Vocabulary size*): *(1, 0); (2, 250); (2, 300); (3, 550); (3, 650); (4, 1200); (4, 1300); (5, 2100); (5, 2200); (6, 2300); (6, 2500)*. From left to right, the points follow a general trend of *rising* from left to right at a constant rate. All vertical coordinates are approximate. *Part 2 (b):* *.979* *Part 3 (c):* *a strong positive* *Part 4 (c):* F.) As age increases, the number of words in children's vocabulary tends to increase. *Part 5 (d):* *(1):* *.735* *(2):* *is* *(3):* *there is a significant linear correlation*
*(Y.T.I.): Exercise 4 (p. 474):* The accompanying table shows the heights (in inches) of 8 school girls and their scores on an IQ test. Complete parts 1 (a) through 5 (d) below. *Height (x)* = *IQ score (y)* *62* = *107* *58* = *97* *66* = *104* *68* = *110* *59* = *94* *66* = *107* *65* = *116* *56* = *123* *(I didn't copy the accompanying table of critical values for the Pearson correlation coefficient because it's too big.)* *Part 1 (a):* Display the data in a scatter plot. Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatter plot has a horizontal axis labeled *Height (inches)* from *90 to 130* in increments of *5*, and a vertical axis labeled *IQ score* from *52 to 72* in increments of *2*. The following 8 points are plotted (*Height*, *IQ score*): *(94, 59); (97, 58); (104, 66); (107, 62); (107, 66); (110, 68); (116, 65); (123, 56)*. There is *no "obvious" pattern* in the points. B.) A scatter plot has a horizontal axis labeled *Height (inches)* from *90 to 130* in increments of *5*, and a vertical axis labeled *IQ score* from *52 to 72* in increments of *2*. The following 8 points are plotted (*Height*, *IQ score*): *(94, 68); (97, 66); (104, 66); (107, 65); (107, 62); (110, 59); (116, 58); (123, 56)*. The points *follow a general trend* of *"falling"* from left to right. C.) A scatter plot has a horizontal axis labeled *Height (inches)* from *52 to 72* in increments of *2*, and a vertical axis labeled *IQ score* from *90 to 130* in increments of *5*. The following 8 points are plotted (*Height*, *IQ score*): *(56, 123); (58, 97); (59, 94); (62, 107); (65, 116); (66, 104); (66, 107); (68, 110)*. There is *no "obvious" pattern* in the points. D.) A scatter plot has a horizontal axis labeled *Height (inches)* from *52 to 72* in increments of *2*, and a vertical axis labeled *IQ score* from *90 to 130* in increments of *5*. 
The following 8 points are plotted (*Height*, *IQ score*): *(56, 94); (58, 97); (59, 104); (62, 107); (65, 107); (66, 110); (66, 116); (68, 123)*. The points *follow a general trend* of *"rising"* from left to right. *Part 2 (b):* Calculate the sample correlation coefficient, *r*. (Round answer to *three* decimal places.) *r* = *___* *Part 3 (c):* Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is *__* linear correlation. *Part 4 (c):* Interpret the correlation. Choose the correct answer below. A.) Increases in high school girls' heights cause their IQ scores to increase. B.) As high school girls' heights increase, their IQ scores tend to increase. C.) Based on the correlation, there does not appear to be any relationship between high school girls' heights and their IQ scores. D.) Based on the correlation, there does not appear to be a linear relationship between high school girls' heights and their IQ scores. E.) Increases in high school girls' heights cause their IQ scores to decrease. F.) As high school girls' heights increase, their IQ scores tend to decrease. *Part 5 (d):* Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let *α* = 0.01. (Round answer(s) to *three* decimal places.) The critical value is *__(1)__*. Therefore, there *___(2)___* sufficient evidence at the 1% level of significance to conclude that *____________________(3)____________________* between high school girls' heights and their IQ scores.
Correct Answers: *Part 1 (a):* C.) A scatter plot has a horizontal axis labeled *Height (inches)* from *52 to 72* in increments of *2*, and a vertical axis labeled *IQ score* from *90 to 130* in increments of *5*. The following 8 points are plotted (*Height*, *IQ score*): *(56, 123); (58, 97); (59, 94); (62, 107); (65, 116); (66, 104); (66, 107); (68, 110)*. There is *no "obvious" pattern* in the points. *Part 2 (b):* *.051* *Part 3 (c):* *no* *Part 4 (c):* D.) Based on the correlation, there does not appear to be a linear relationship between high school girls' heights and their IQ scores. *Part 5 (d):* *(1):* *.834* *(2):* *is not* *(3):* *there is a significant linear correlation*
*(Y.T.I.): Exercise 5 (p. 475):* The accompanying table shows the maximum weights (in kilograms) for which one repetition of a half squat can be performed, and the times (in seconds) to run a 10-meter sprint for 12 international soccer players. Complete parts 1 (a) through 5 (d) below. *Maximum weight (x)* = *Time (y)* *170* = *1.87* *170* = *1.86* *145* = *2.14* *205* = *1.51* *145* = *2.14* *180* = *1.7* *180* = *1.8* *160* = *2* *180* = *1.68* *175* = *1.74* *155* = *2.05* *165* = *1.97* *(I didn't copy the accompanying table of critical values for the Pearson correlation coefficient because it's too big.)* *Part 1 (a):* Display the data in a scatter plot. Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatter plot has a horizontal axis labeled *Max Weight (kg)* from *1.4 to 2.2* in increments of *0.1*, and a vertical axis labeled *Time (seconds)* from *140 to 220* in increments of *10*. The following 12 points are plotted (*Max Weight (kg)*, *Time (seconds)*): *(1.51, 205); (1.68, 180); (1.7, 180); (1.74, 175); (1.8, 180); (1.86, 170); (1.87, 170); (1.97, 165); (2, 160); (2.05, 155); (2.14, 145); (2.14, 145)*. The points *follow a general trend* of *"falling"* from left to right at a constant rate. B.) A scatter plot has a horizontal axis labeled *Max Weight (kg)* from *1.4 to 2.2* in increments of *0.1*, and a vertical axis labeled *Time (seconds)* from *140 to 220* in increments of *10*. The following 12 points are plotted (*Max Weight (kg)*, *Time (seconds)*): *(1.51, 145); (1.68, 145); (1.7, 155); (1.74, 160); (1.8, 165); (1.86, 170); (1.87, 170); (1.97, 175); (2, 180); (2.05, 180); (2.14, 180); (2.14, 205)*. The points *follow a general trend* of *"rising"* from left to right at a constant rate. C.) 
A scatter plot has a horizontal axis labeled *Max Weight (kg)* from *140 to 220* in increments of *10*, and a vertical axis labeled *Time (seconds)* from *1.4 to 2.2* in increments of *0.1*. The following 12 points are plotted (*Max Weight (kg)*, *Time (seconds)*): *(145, 1.51); (145, 1.68); (155, 1.7); (160, 1.74); (165, 1.8); (170, 1.86); (170, 1.87); (175, 1.97); (180, 2); (180, 2.05); (180, 2.14); (205, 2.14)*. The points *follow a general trend* of *"rising"* from left to right at a constant rate. D.) A scatter plot has a horizontal axis labeled *Max Weight (kg)* from *140 to 220* in increments of *10*, and a vertical axis labeled *Time (seconds)* from *1.4 to 2.2* in increments of *0.1*. The following 12 points are plotted (*Max Weight (kg)*, *Time (seconds)*): *(145, 2.14); (145, 2.14); (155, 2.05); (160, 2.00); (165, 1.97); (170, 1.86); (170, 1.87); (175, 1.74); (180, 1.68); (180, 1.70); (180, 1.80); (205, 1.51)*. The points *follow a general trend* of *"falling"* from left to right at a constant rate. *Part 2 (b):* Calculate the sample correlation coefficient, *r*. (Round answer to *three* decimal places.) *r* = *___* *Part 3 (c):* Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is *_____________________* linear correlation. *Part 4 (c):* Interpret the correlation. Choose the correct answer below. A.) Increases in the maximum weight for which one repetition of a half squat can be performed cause time to run a 10-meter sprint to increase. B.) As the maximum weight for which one repetition of a half squat can be performed increases, time to run a 10-meter sprint tends to decrease. C.) As the maximum weight for which one repetition of a half squat can be performed increases, time to run a 10-meter sprint tends to increase. D.) Increases in the maximum weight for which one repetition of a half squat can be performed cause time to run a 10-meter sprint to decrease. E.) 
Based on the correlation, there does not appear to be any relationship between the maximum weight for which one repetition of a half squat can be performed and time to run a 10-meter sprint. F.) Based on the correlation, there does not appear to be a linear relationship between the maximum weight for which one repetition of a half squat can be performed and time to run a 10-meter sprint. *Part 5 (d):* Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let *α* = 0.01. (Round answer(s) to *three* decimal places.) The critical value is *__(1)__*. Therefore, there *_(2)_* sufficient evidence at the 1% level of significance to conclude that *____________________(3)____________________* between the maximum weight for which one repetition of a half squat can be performed and time to run a 10-meter sprint.
Correct Answers: *Part 1 (a):* D.) A scatter plot has a horizontal axis labeled *Max Weight (kg)* from *140 to 220* in increments of *10*, and a vertical axis labeled *Time (seconds)* from *1.4 to 2.2* in increments of *0.1*. The following 12 points are plotted (*Max Weight (kg)*, *Time (seconds)*): *(145, 2.14); (145, 2.14); (155, 2.05); (160, 2.00); (165, 1.97); (170, 1.86); (170, 1.87); (175, 1.74); (180, 1.68); (180, 1.70); (180, 1.80); (205, 1.51)*. The points *follow a general trend* of *"falling"* from left to right at a constant rate. *Part 2 (b):* *-.976* *Part 3 (c):* *a strong negative* *Part 4 (c):* B.) As the maximum weight for which one repetition of a half squat can be performed increases, time to run a 10-meter sprint tends to decrease. *Part 5 (d):* *(1):* *.708* *(2):* *is* *(3):* *there is a significant linear correlation*
*(Y.T.I.): Exercise 1 (p. 487):* Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. (The pair of variables have a significant correlation.) Then use the regression equation to predict the value of *y* for each of the given *x* - values, if meaningful. The table below shows the heights (in feet) and the number of stories of six notable buildings in a city. *Height (x)* = *Stories (y)* *768* = *52* *628* = *48* *518* = *44* *511* = *41* *491* = *39* *478* = *38* *Part 3 (a): x = 502 feet* *Part 4 (b): x = 648 feet* *Part 5 (c): x = 802 feet* *Part 6 (d): x = 725 feet* *Part 1:* Find the regression equation. (Round the *slope (1)* to *three* decimal places, and round the *"y" - intercept (2)* to *two* decimal places.) *ŷ* = *__(1)__*x + (*__(2)__*) *Part 2:* Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatterplot has a horizontal axis labeled *Height (feet)* from *0 to 800* in increments of *200*, and a vertical axis labeled *Stories* from *0 to 60* in increments of *10*. The following *6* points are plotted (*Height (feet)*, *Stories*): *(758, 34), (608, 33), (488, 29), (461, 41), (331, 29), (278, 38)*. A trend line that *falls* from left to right passes through the points *(200, 46)*, and *(600, 42)*. All coordinates are approximate. B.) A scatterplot has a horizontal axis labeled *Height (feet)* from *0 to 800* in increments of *200*, and a vertical axis labeled *Stories* from *0 to 60* in increments of *10*. The following *6* points are plotted (*Height (feet)*, *Stories*): *(748, 32), (608, 38), (498, 34), (491, 31), (471, 29), (458, 28)*. A trend line that *falls* from left to right passes through the points *(200, 31)*, and *(600, 28)*. All coordinates are approximate. C.) 
A scatterplot has a horizontal axis labeled *Height (feet)* from *0 to 800* in increments of *200*, and a vertical axis labeled *Stories* from *0 to 60* in increments of *10*. The following *6* points are plotted (*Height (feet)*, *Stories*): *(778, 57), (638, 53), (528, 49), (521, 46), (501, 44), (488, 43)*. A trend line that *rises* from left to right passes through the points *(200, 37)*, and *(600, 55)*. All coordinates are approximate. D.) A scatterplot has a horizontal axis labeled *Height (feet)* from *0 to 800* in increments of *200*, and a vertical axis labeled *Stories* from *0 to 60* in increments of *10*. The following *6* points are plotted (*Height (feet)*, *Stories*): *(768, 52), (628, 48), (518, 44), (511, 41), (491, 39), (478, 38)*. A trend line that *rises* from left to right passes through the points *(200, 27)*, and *(600, 45)*. All coordinates are approximate. *Part 3 (a):* Predict the value of *y* for *x = 502*. Choose the correct answer below. A.) 51 B.) 41 C.) 47 D.) not meaningful *Part 4 (b):* Predict the value of *y* for *x = 648*. Choose the correct answer below. A.) 41 B.) 54 C.) 47 D.) not meaningful *Part 5 (c):* Predict the value of *y* for *x = 802*. Choose the correct answer below. A.) 51 B.) 54 C.) 47 D.) not meaningful *Part 6 (d):* Predict the value of *y* for *x = 725*. Choose the correct answer below. A.) 54 B.) 51 C.) 41 D.) not meaningful
Correct Answers: *Part 1:* *(1):* *.046* *(2):* *17.49* *Part 2:* D.) A scatterplot has a horizontal axis labeled *Height (feet)* from *0 to 800* in increments of *200*, and a vertical axis labeled *Stories* from *0 to 60* in increments of *10*. The following *6* points are plotted (*Height (feet)*, *Stories*): *(768, 52), (628, 48), (518, 44), (511, 41), (491, 39), (478, 38)*. A trend line that *rises* from left to right passes through the points *(200, 27)*, and *(600, 45)*. All coordinates are approximate. *Part 3 (a):* B.) 41 *Part 4 (b):* C.) 47 *Part 5 (c):* D.) not meaningful *Part 6 (d):* B.) 51
*(Y.T.I.): Exercise 3 (p. 489):* Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. (The pair of variables have a significant correlation.) Then use the regression equation to predict the value of *y* for each of the given *x* - values, if meaningful. The caloric content and the sodium content (in milligrams) for 6 beef hot dogs are shown in the table below. *Calories (x)* = *Sodium (y)* *160* = *415* *180* = *465* *120* = *350* *130* = *370* *80* = *270* *190* = *520* *Part 3 (a): x = 170 calories* *Part 4 (b): x = 100 calories* *Part 5 (c): x = 140 calories* *Part 6 (d): x = 220 calories* *Part 1:* Find the regression equation. (Round both answers to *three* decimal places.) *ŷ* = *__(1)__*x + (*__(2)__*) *Part 2:* Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatterplot has a horizontal axis labeled *"Calories"* from *0 to 200* in increments of *20*, and a vertical axis labeled *"Sodium (mg)"* from *0 to 560* in increments of *40*. The following *6* points are plotted (*"Calories"*, *"Sodium (mg)"*): *(60, 170), (100, 80), (110, 270), (130, 315), (140, 365), (150, 420)*. A trend line *falls* from left to right and passes through the points *(0, 445)* and *(211, 0)*. All coordinates are approximate. B.) A scatterplot has a horizontal axis labeled *"Calories"* from *0 to 200* in increments of *20*, and a vertical axis labeled *"Sodium (mg)"* from *0 to 560* in increments of *40*. The following *6* points are plotted (*"Calories"*, *"Sodium (mg)"*): *(90, 320), (100, 285), (130, 400), (140, 420), (180, 470), (90, 320)*. A trend line *rises* from left to right and passes through the points *(50, 207)* and *(100, 313)*. All coordinates are approximate. C.) 
A scatterplot has a horizontal axis labeled *"Calories"* from *0 to 200* in increments of *20*, and a vertical axis labeled *"Sodium (mg)"* from *0 to 560* in increments of *40*. The following *6* points are plotted (*"Calories"*, *"Sodium (mg)"*): *(80, 270), (120, 350), (130, 370), (160, 415), (180, 465), (190, 520)*. A trend line *rises* from left to right and passes through the points *(50, 201)* and *(100, 307)*. All coordinates are approximate. D.) A scatterplot has a horizontal axis labeled *"Calories"* from *0 to 200* in increments of *20*, and a vertical axis labeled *"Sodium (mg)"* from *0 to 560* in increments of *40*. The following *6* points are plotted (*"Calories"*, *"Sodium (mg)"*): *(0, 250), (30, 110), (40, 315), (40, 265), (60, 170), (70, 220)*. A trend line *falls* from left to right and passes through the points *(0, 245)* and *(116, 0)*. All coordinates are approximate. *Part 3 (a):* Predict the value of *y* for *x = 170*. Choose the correct answer below. A.) 560.290 B.) 391.250 C.) 454.640 D.) not meaningful *Part 4 (b):* Predict the value of *y* for *x = 100*. Choose the correct answer below. A.) 560.290 B.) 306.730 C.) 391.250 D.) not meaningful *Part 5 (c):* Predict the value of *y* for *x = 140*. Choose the correct answer below. A.) 454.640 B.) 306.730 C.) 391.250 D.) not meaningful *Part 6 (d):* Predict the value of *y* for *x = 220*. Choose the correct answer below. A.) 454.640 B.) 306.730 C.) 560.290 D.) not meaningful
Correct Answers: *Part 1:* *(1):* *2.113* *(2):* *95.43* *Part 2:* C.) A scatterplot has a horizontal axis labeled *"Calories"* from *0 to 200* in increments of *20*, and a vertical axis labeled *"Sodium (mg)"* from *0 to 560* in increments of *40*. The following *6* points are plotted (*"Calories"*, *"Sodium (mg)"*): *(80, 270), (120, 350), (130, 370), (160, 415), (180, 465), (190, 520)*. A trend line *rises* from left to right and passes through the points *(50, 201)* and *(100, 307)*. All coordinates are approximate. *Part 3 (a):* C.) 454.640 *Part 4 (b):* B.) 306.730 *Part 5 (c):* C.) 391.250 *Part 6 (d):* D.) not meaningful
*(Y.T.I.): Exercise 2 (p. 488):* Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. (The pair of variables have a significant correlation.) Then use the regression equation to predict the value of *y* for each of the given *x* - values, if meaningful. The number of hours 6 students spent studying for a test and their scores on that test are shown below. *Hours spent studying (x)* = *Test score (y)* *0* = *38* *1* = *45* *2* = *52* *3* = *48* *4* = *62* *5* = *73* *Part 3 (a): x = 2 hours* *Part 4 (b): x = 4.5 hours* *Part 5 (c): x = 13 hours* *Part 6 (d): x = 2.5 hours* *Part 1:* Find the regression equation. (Round the *slope (1)* to *three* decimal places, and round the *"y" - intercept (2)* to *two* decimal places.) *ŷ* = *__(1)__*x + (*__(2)__*) *Part 2:* Choose the correct graph below. (Since I don't have Quizlet+, I can't insert the images of the actual scatter plots; ergo, I pasted their descriptions.) A.) A scatterplot has a horizontal axis labeled *Hours studying* from *0 to 8* in increments of *1*, and a vertical axis labeled *Test score* from *0 to 80* in increments of *10*. The following *6* points are plotted (*Hours studying*, *Test score*): *(1, 41), (2, 50), (3, 59), (4, 53), (5, 67), (6, 78)*. A trend line that *rises* from left to right passes through the points *(2, 56)*, and *(6, 81)*. All coordinates are approximate. B.) A scatterplot has a horizontal axis labeled *Hours studying* from *0 to 8* in increments of *1*, and a vertical axis labeled *Test score* from *0 to 80* in increments of *10*. The following *6* points are plotted (*Hours studying*, *Test score*): *(0, 38), (1, 45), (2, 52), (3, 48), (4, 62), (5, 73)*. A trend line that *rises* from left to right passes through the points *(2, 50)*, and *(6, 75)*. All coordinates are approximate. C.) 
A scatterplot has a horizontal axis labeled *Hours studying* from *0 to 8* in increments of *1*, and a vertical axis labeled *Test score* from 0 to 80 in increments of *10*. The following *6* points are plotted (*Hours studying*, *Test score*): *(4, 37), (3, 44), (3, 51), (2, 48), (2, 59), (1, 63)*. A trend line that *falls* from left to right passes through the points *(2, 54)*, and *(6, 29)*. All coordinates are approximate. D.) A scatterplot has a horizontal axis labeled *Hours studying* from *0 to 8* in increments of *1*, and a vertical axis labeled *Test score* from *0 to 80* in increments of *10*. The following *6* points are plotted (*Hours studying*, *Test score*): *(3, 36), (3, 44), (0, 51), (1, 47), (2, 61), (3, 72)*. A trend line that *falls* from left to right passes through the points *(2, 39)*, and *(6, 14)*. All coordinates are approximate. *Part 3 (a):* Predict the value of *y* for *x = 2*. Choose the correct answer below. A.) 49.8 B.) 53.0 C.) 65.7 D.) not meaningful *Part 4 (b):* Predict the value of *y* for *x = 4.5*. Choose the correct answer below. A.) 119.6 B.) 65.7 C.) 53.0 D.) not meaningful *Part 5 (c):* Predict the value of *y* for *x = 13*. Choose the correct answer below. A.) 119.6 B.) 49.8 C.) 65.7 D.) not meaningful *Part 6 (d):* Predict the value of *y* for *x = 2.5*. Choose the correct answer below. A.) 119.6 B.) 53.0 C.) 49.8 D.) not meaningful
Correct Answers: *Part 1:* *(1):* *6.343* *(2):* *37.14* *Part 2:* B.) A scatterplot has a horizontal axis labeled *Hours studying* from *0 to 8* in increments of *1*, and a vertical axis labeled *Test score* from *0 to 80* in increments of *10*. The following *6* points are plotted (*Hours studying*, *Test score*): *(0, 38), (1, 45), (2, 52), (3, 48), (4, 62), (5, 73)*. A trend line that *rises* from left to right passes through the points *(2, 50)*, and *(6, 75)*. All coordinates are approximate. *Part 3 (a):* A.) 49.8 *Part 4 (b):* B.) 65.7 *Part 5 (c):* D.) not meaningful *Part 6 (d):* B.) 53.0
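The predictions in Parts 3 through 6 can be checked with a short Python sketch. It assumes that "not meaningful" means the *x* - value lies outside the range of the observed *x* - values (0 to 5 hours here), which is consistent with the answers above but is not stated explicitly in this set. The function name `predict` is hypothetical, and the rounded slope and intercept are taken from Part 1.

```python
def predict(x, xs, slope, intercept):
    """Return the predicted y, or None when x is outside the observed x range."""
    if x < min(xs) or x > max(xs):
        return None  # treated as "not meaningful" under the stated assumption
    return slope * x + intercept


hours = [0, 1, 2, 3, 4, 5]
for x in (2, 4.5, 13, 2.5):
    y_hat = predict(x, hours, slope=6.343, intercept=37.14)
    if y_hat is None:
        print(f"x = {x}: not meaningful")
    else:
        print(f"x = {x}: predicted y = {y_hat:.1f}")
```

Returning `None` instead of a number keeps an out-of-range request from being mistaken for a valid prediction.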
*(Y.T.I.): Exercise 7 (p. 479):* The weights (in pounds) of 6 vehicles and the variability of their braking distances (in feet) when stopping on a dry surface are shown in the table. Can you conclude that there is a significant linear correlation between vehicle weight and variability in braking distance on a dry surface? Use *α* = 0.05. *Weight (x)* = *Variability in braking distance (y)* *5,960* = *1.76* *5,360* = *1.93* *6,500* = *1.93* *5,100* = *1.63* *5,890* = *1.62* *4,800* = *1.50* *(I didn't copy the accompanying table of critical values for Student's "t" - distribution because it's too big.)* *Part 1:* Set up the hypotheses for the test. *H₀*: *ρ* *_(1)_* 0 *Hₐ*: *ρ* *_(2)_* 0 *Part 2:* Identify the critical value(s). Select the correct choice below and fill in any answer boxes within your choice. (Round answer(s) to *three* decimal places.) A.) The critical value is *___*. B.) The critical values are *−t₀* = *____* and *t₀* = *____*. *Part 3:* Calculate the test statistic. (Round answer to *three* decimal places.) *t* = *___* *Part 4:* What is your conclusion? There *___(1)___* enough evidence at the 5% level of significance to conclude that there *_(2)_* a significant linear correlation between vehicle weight and variability in braking distance on a dry surface.
Correct Answers: *Part 1:* *(1):* *=* *(2):* *≠* *Part 2:* B.) The critical values are *−t₀* = *−2.776* and *t₀* = *2.776*. *Part 3:* *1.590* *Part 4:* *(1):* *is not* *(2):* *is*
Section 1
Correlation
*The Equation for a Regression Line* (p. 487)
The equation of a regression line for an independent variable *x* and a dependent variable *y* is: *ŷ* = *mx* + *b* where *ŷ* is the predicted *y* - value for a given *x* - value. The slope *m* and *y* - intercept *b* are given by: *m* = (*n*∑*xy* - (∑*x*)(∑*y*)) / (*n*∑*x²* - (∑*x*)²) and *b* = *ȳ* - *m*·*x̄* = ((∑*y*)/*n*) - *m*·((∑*x*)/*n*) where *ȳ* is the mean of the *y* - values in the data set, *x̄* is the mean of the *x* - values, and *n* is the number of pairs of data. The regression line always passes through the point (*x̄*, *ȳ*).
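The slope and intercept formulas can be sketched in Python as follows. The function name `regression_line` is an illustrative choice, and the sample data are the study-hours pairs from Exercise 2 (p. 488), so the printed values can be compared with that exercise's answers.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)  # b = y-bar - m * x-bar
    return m, b


hours = [0, 1, 2, 3, 4, 5]
scores = [38, 45, 52, 48, 62, 73]
m, b = regression_line(hours, scores)
print(f"y-hat = {m:.3f}x + {b:.2f}")  # y-hat = 6.343x + 37.14
```

Computing *b* from the unrounded slope, as done here, avoids carrying a rounding error into the intercept.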
regression line
The line for which the sum of the squares of the residuals is a minimum. (Also called *line of best fit*.) (p. 486)
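To see what "minimum" means here, the sketch below compares the sum of squared residuals for the least-squares line with that of a few slightly different lines, using the study-hours data from Exercise 2 (p. 488). The slope 666/105 and intercept 53 - (666/105)(2.5) are the unrounded least-squares values for that data; the nudged lines and the function name are hypothetical and chosen only for illustration.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed y - predicted y)^2 over all data pairs."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


hours = [0, 1, 2, 3, 4, 5]
scores = [38, 45, 52, 48, 62, 73]

# Unrounded least-squares slope and intercept for this data set.
best_slope = 666 / 105
best_intercept = 53 - best_slope * 2.5

best = sum_squared_residuals(hours, scores, best_slope, best_intercept)
steeper = sum_squared_residuals(hours, scores, best_slope + 0.5, best_intercept)
higher = sum_squared_residuals(hours, scores, best_slope, best_intercept + 1.0)

print(f"least-squares line: {best:.3f}")
print(f"steeper line:       {steeper:.3f}")
print(f"shifted-up line:    {higher:.3f}")
```

Any change to the slope or the intercept makes the total larger, which is why the regression line is also called the line of best fit.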
