chapter 9 ex & hw with calculator instructions

Use the value of the linear correlation coefficient to calculate the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation? r = −0.696

(a) Calculate the coefficient of determination. r² = (−0.696)² ≈ 0.484 (b) What does this tell you about the explained variation of the data about the regression line? 48.4% of the variation can be explained by the regression line. (c) About the unexplained variation? 100 − 48.4 = 51.6, so 51.6% of the variation is unexplained and is due to other factors or to sampling error.

Use the value of the linear correlation coefficient to calculate the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation? r = 0.638

(a) Calculate the coefficient of determination. r² = (0.638)² ≈ 0.407 (b) What does this tell you about the explained variation of the data about the regression line? 40.7% of the variation can be explained by the regression line. (c) About the unexplained variation? 100 − 40.7 = 59.3, so 59.3% of the variation is unexplained and is due to other factors or to sampling error.
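
The arithmetic in the two cards above can be checked with a short Python sketch. The function name `variation_split` is made up for illustration; it squares r and splits 100% into the explained and unexplained shares.

```python
def variation_split(r):
    """Return (r squared, explained %, unexplained %) for a correlation r."""
    r_squared = r ** 2                      # coefficient of determination
    explained_pct = 100 * r_squared         # share of variation explained by the line
    unexplained_pct = 100 - explained_pct   # remainder: other factors or sampling error
    return r_squared, explained_pct, unexplained_pct


for r in (0.638, -0.696):
    r2, explained, unexplained = variation_split(r)
    print(f"r = {r}: r^2 = {r2:.3f}, "
          f"explained = {explained:.1f}%, unexplained = {unexplained:.1f}%")
```

The sign of r does not matter once it is squared, which is why a negative r still gives a positive r².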

The coefficient of determination r² is the ratio of which two types of variations? What does r² measure? What does 1 − r² measure?

(a) The coefficient of determination r² is the ratio of which two types of variations? The coefficient of determination is the ratio of the explained variation to the total variation. (b) What does r² measure? The coefficient of determination is the percent of variation of y that is explained by the relationship between x and y. (c) What does 1 − r² measure? The coefficient of determination, r², is the percent of variation of y that is explained by the relationship between x and y. Therefore, 1 − r² is the percent of variation of y that is unexplained by the relationship between x and y.

Describe the explained variation about a regression line in words and in symbols. Choose the correct answer below. A. The explained variation is the sum of the squares of the differences between the y-values of each ordered pair and the mean of the y-values of the ordered pairs. B. The explained variation is the sum of the squares of the differences between the predicted y-values and the mean of the y-values of the ordered pairs. C. The explained variation is the sum of the squares of the differences between the observed y-values and the predicted y-values.

(a) The explained variation is the sum of the squares of the differences between the predicted y-values and the mean of the y-values of the ordered pairs. (b) In symbols: explained variation = Σ(ŷᵢ − ȳ)².
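
To see the three kinds of variation side by side, here is a minimal Python sketch. It borrows the ten fish-quality observations quoted in this set and fits the least-squares line from scratch. The variable names are illustrative, not part of the exercise.

```python
# Fish data from this set: hours before icing (x) and quality score (y).
time_hours = [0, 0, 2, 3, 5, 6, 7, 9, 11, 12]
quality = [8.5, 8.4, 8.0, 8.1, 7.8, 7.6, 7.3, 7.0, 6.8, 6.7]

n = len(time_hours)
x_bar = sum(time_hours) / n
y_bar = sum(quality) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(time_hours, quality))
s_xx = sum((x - x_bar) ** 2 for x in time_hours)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * x for x in time_hours]

# Total = explained + unexplained variation.
total = sum((y - y_bar) ** 2 for y in quality)
explained = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(quality, predicted))
r_squared = explained / total

print(f"total={total:.4f} explained={explained:.4f} unexplained={unexplained:.4f}")
print(f"r^2 = {r_squared:.3f}")
```

The explained and unexplained sums add up to the total (up to floating-point rounding), and their ratio to the total gives r² and 1 − r².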

In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. The sample data are shown​ below, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). The​ least-squares regression equation is y=8.4425−​0.1495x, where x is​ "Time" and y is predicted​ "Fish Quality." What is the residual for the first​ observation, (0,8.5)? Time 0 0 2 3 5 6 7 9 11 12 Fish Quality 8.5 8.4 8.0 8.1 7.8 7.6 7.3 7.0 6.8 6.7

0.0575
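
A residual is the observed value minus the predicted value. The sketch below applies that to the first fish observation using the regression equation given in the card. The function names are illustrative.

```python
def predicted_quality(time_hours):
    """Predicted fish quality from the card's equation y = 8.4425 - 0.1495x."""
    return 8.4425 - 0.1495 * time_hours


def residual(time_hours, observed_quality):
    """Observed minus predicted."""
    return observed_quality - predicted_quality(time_hours)


first_residual = residual(0, 8.5)
print(round(first_residual, 4))  # 0.0575
```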

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser (duration) and the time between when that eruption ends and the next eruption begins (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. The least-squares regression equation is y=33.967+11.358x, where y is the interval from the end of the current eruption to the beginning of the next eruption and x is the duration of current eruption. For a duration of 4 minutes, y=75.4 minutes. What does this mean? Select all that apply. A. For eruptions that last 4 minutes, it is estimated that a visitor will have to wait 75.4 minutes after the current eruption ends before the next eruption begins. B. The average wait-time until the next eruption for all eruptions that last 4 minutes is 75.4 minutes. C. Assuming that the current eruption lasts 4 minutes, the next eruption will always occur 75.4 minutes later. D. A visitor will have to wait exactly 75.4 minutes after the current eruption ends before the next eruption begins.

A. For eruptions that last 4 minutes, it is estimated that a visitor will have to wait 75.4 minutes after the current eruption ends before the next eruption begins. B. The average wait-time until the next eruption for all eruptions that last 4 minutes is 75.4 minutes.

Using the scatter plot of the registered nurse salary data​ shown, what type of​ correlation, if any do you think the data​ have? Explain. A. There appears to be a strong positive linear correlation. As the years of experience of the registered nurses​ increase, salaries tend to increase. B. There appears to be a strong positive linear correlation. As the years of experience of the registered nurses​ increase, salaries tend to decrease. C. There appears to be a strong negative linear correlation. As the years of experience of the registered nurses​ increase, salaries tend to decrease. D. There does not appear to be any linear correlation.

A. There appears to be a strong positive linear correlation. As the years of experience of the registered nurses​ increase, salaries tend to increase.

9.1 ex 17 In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. Review the accompanying sample data and​ scatterplot, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). Is it appropriate to use the correlation coefficient to describe the strength of the relationship between​ "Time" and​ "Fish Quality"? A. ​Yes, because both variables are quantitative and there is a linear relationship between the two variables. B. ​No, because there is very little scatter of the points around a line fit between the points. C. ​No, because there are a couple of influential outliers. D. ​No, because​ "Fish Quality" is not quantitative.

A. ​Yes, because both variables are quantitative and there is a linear relationship between the two variables.

In a​ scatterplot, what is an observation that stands away from the rest of the observations​ called? A residual Leverage A lurking observation Extrapolation An outlier

An outlier

Is it appropriate to use a regression line to predict​ y-values for​ x-values that are not in​ (or close​ to) the range of​ x-values found in the​ data? Choose the correct answer below. A. It is appropriate because the regression line models a​ trend, not the actual​ points, so although the prediction of the​ y-value may not be exact it will be precise. B. It is not appropriate because the regression line models the trend of the given​ data, and it is not known if the trend continues beyond the range of those data. C. It is not appropriate because the correlation coefficient of the regression line may not be significant. D. It is appropriate because the regression line will always be​ continuous, so a​ y-value exists for every​ x-value on the axis.

B. It is not appropriate because the regression line models the trend of the given​ data, and it is not known if the trend continues beyond the range of those data.

Given a set of data and a corresponding regression line, describe all values of x that provide meaningful predictions for y. A. Prediction values are meaningful for all x-values that are realistic in the context of the original data set. B. Prediction values are meaningful only for x-values in (or close to) the range of the original data. C. Prediction values are meaningful only for x-values that are not included in the original data set.

B. Prediction values are meaningful only for x-values in (or close to) the range of the original data.

18 A linear equation is an equation of the form y=ax+b, and a power equation is an equation of the form y=ax^b. The linear equation and power equation for the accompanying data are provided below. Determine which equation is a better model for the data. Explain your reasoning. x 1 2 3 4 5 6 7 8 y 695 407 239 108 81 77 69 75 Choose the correct answer below. A. The linear equation is a better model for the data because the graph of the linear equation fits the data better than the graph of the power equation. B. The power equation is a better model for the data because the graph of the power equation fits the data better than the graph of the linear equation. C. The linear equation is a better model for the data because the graph of the linear equation passes through more data points than the graph of the power equation. D. The power equation is a better model for the data because the graph of the power equation has more data points above the line than the graph of the linear equation.

B. The power equation is a better model for the data because the graph of the power equation fits the data better than the graph of linear equation.

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser (duration) and the time between when that eruption ends and the next eruption begins (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. The least-squares regression equation is y=33.967+11.358x, where y is the interval from the end of the current eruption to the beginning of the next eruption and x is the duration of current eruption. In this equation, what is 11.358? Choose the correct answer below. A. The predicted value of y B. The slope of the least-squares regression line C. The duration of the current eruption D. The y-intercept of the least-squares regression line

B. The slope of the​ least-squares regression line

9.1 ex 10 Which of the following statements best describes this scatterplot? A. There are two clusters of points. The relationship between X and Y for each cluster is strong and positive, but one is stronger than the other because it is more vertical. B. There are two clusters of points. The relationship between X and Y for each cluster is strong and positive. C. The relationship between X and Y is strong and positive with a number of outliers. D. The relationship between X and Y is weak and positive with a number of outliers. E. There are two clusters of points. The relationship between X and Y for one cluster is strong and positive, but the other is weak and positive.

B. There are two clusters of points. The relationship between X and Y for each cluster is strong and positive.

9.1 ex 16 Gina calculated a correlation coefficient between hours studied and grade point average as +0.75. Which of the following is a correct statement based on this correlation coefficient? A. Grade point average is predicted to increase by 0.75 points for every additional hour studied. B. There is a fairly strong positive relationship between hours studied and grade point average, indicating that grade point averages tend to be higher for students who study more. C. There is a weak positive relationship between hours studied and grade point average, indicating that grade point averages tend to be higher for students who study more. D. Students who study only one hour are predicted to have a grade point average of 0.75. E. There is a fairly strong positive relationship between hours studied and grade point average, indicating that grade point averages tend to be lower for students who study more.

B. There is a fairly strong positive relationship between hours studied and grade point average, indicating that grade point averages tend to be higher for students who study more.

Suppose the equation of a least-squares regression line is y=−3.17−2.4x. What can be said about the y-intercept? Choose the correct answer below. A. It is 3.17. B. It is 2.4. C. It is −3.17. D. It is −2.4. E. It is dependent on the value of x.

C. It is −3.17.

Two variables have a positive linear correlation. Is the slope of the regression line for the variables positive or negative? A. The slope is negative. As the independent variable increases the dependent variable also tends to increase. B. The slope is positive. As the independent variable increases the dependent variable tends to decrease. C. The slope is positive. As the independent variable increases the dependent variable also tends to increase. D. The slope is negative. As the independent variable increases the dependent variable tends to decrease.

C. The slope is positive. As the independent variable increases the dependent variable also tends to increase.

Suppose the equation of a least-squares regression line is y=−3.17−2.4x. What can be said about the correlation coefficient? Choose the correct answer below. A. It is 0. B. It is −3.17. C. It is −2.4. D. It is negative, but its exact value cannot be determined from the given information. E. None of the above. More information needs to be given to say anything about the correlation coefficient.

D. It is​ negative, but its exact value cannot be determined from the given information.

Explain how to predict​ y-values using the equation of a regression line. Choose the correct answer below. A. Use the graph of the regression line to determine the​ x-value that corresponds to the​ y-value for which you are solving. B. Substitute the correlation coefficient into the equation and solve for y. C. Substitute a value of y into the equation of a regression line and solve for x. D. Substitute a value of x into the equation of a regression line and solve for y.

D. Substitute a value of x into the equation of a regression line and solve for y.

In order to predict​ y-values using the equation of a regression​ line, what must be true about the correlation coefficient of the​ variables? Choose the correct answer below. A. The correlation between variables must be an​ x-value of a point on the graph. B. The correlation between variables must be greater than zero. C. The correlation between variables must be a​ y-value of a point on the graph. D. The correlation between variables must be significant.

D. The correlation between variables must be significant.

9.1 ex 13 April calculated a correlation coefficient between sex and GPA as −0.25. She said there is a weak correlation between a​ person's sex and their GPA. Which of the following is an appropriate comment about​ April's statement? Choose the correct answer below. A. A correlation coefficient of −0.25 indicates a strong relationship. B. It is not possible to get a negative correlation coefficient. C. ​April's statement is correct. D. The correlation coefficient does not make sense to describe the relationship between a categorical and quantitative variable.

D. The correlation coefficient does not make sense to describe the relationship between a categorical and quantitative variable.

An analyst used the regression line for the data to the right to predict the annual salary for a registered nurse with 28 years of experience. Is this a valid​ prediction? Explain your reasoning. A. ​Yes, the prediction is meaningful because x=28 is not part of the original data set. B. ​Yes, the prediction is meaningful because x=28 makes sense in the context of the original data set. C. ​No, the prediction is not meaningful because the regression line may not be used to generate meaningful predictions. D. ​No, the prediction is not meaningful because x=28 is outside the range of the original data set.

D. ​No, the prediction is not meaningful because x=28 is outside the range of the original data set.

9.1 ex 12 What does a correlation coefficient of 0​ indicate? A. It indicates a​ non-linear relationship between the two quantitative variables. B. It indicates a calculation​ error, as the correlation coefficient cannot be 0. C. There is a strong relationship between the two quantitative variables. D. There is a weak relationship between the two quantitative variables. E. There is no linear relationship between the two quantitative variables.

E. There is no linear relationship between the two quantitative variables.

9.1 ex 18 In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. Review the accompanying sample data and​ scatterplot, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). What is the correlation​ coefficient? (Try to figure out the correct answer without calculating the correlation​ coefficient.) A. +0.35 B. −0.75 C. +0.94 D. −0.25 E. −0.99

E. −0.99

In​ regression, what is predicting outside the range of the​ x-values from the sample data​ called?

Extrapolation

Review the accompanying scatterplots. Which of the four scatterplots correspond to positive R²-values? (SEE IMAGE)

I, III, IV

Review the accompanying scatterplots. Which of the four scatterplots corresponds to the smallest R²-value? (SEE IMAGE)

II

Review the accompanying scatterplots. Which of the four scatterplots corresponds to the highest R²-value? (SEE IMAGE)

III

Researchers wondered if brain size has an effect on a​ person's IQ. From a sample of 20​ individuals, the equation of the​ least-squares regression line is y=71.8+​0.0286x, where x represents the size of a brain in cubic centimeters and y represents IQ. What is the interpretation of the​ y-intercept?

IQ is predicted to be 71.8 for a brain size of 0 cubic centimeters.

Researchers wondered if brain size has an effect on a​ person's IQ. From a sample of 20​ individuals, the equation of the​ least-squares regression line is y=71.8+​0.0286x, where x represents the size of a brain in cubic centimeters and y represents IQ. What is the interpretation of the​ slope?

IQ is predicted to increase by 0.0286 for every 1 cubic centimeter increase in brain size.

If an observation has a residual of​ 0, which of the following statements is​ true?

Its predicted value is the same as its observed value.

Review the accompanying scatterplots. Which of the four scatterplots correspond to negative R²-values? (SEE IMAGE)

NONE OF THE ABOVE

In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. Review the accompanying sample​ data, scatterplot, and Minitab output from a simple linear regression​ analysis, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). Is it appropriate to use this regression model to predict​ "Fish Quality" for a fish put in storage 20 hours after it was​ caught?

No; the linear pattern that exists in the model from the sample data may not continue in the same fashion outside the range of the​ x-values from the sample data.
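
A quick way to flag extrapolation is to compare the new x-value with the smallest and largest x-values in the sample. The sketch below uses the sample times from the fish data quoted in this set. The helper name `is_extrapolation` is an assumption for illustration, and it uses a strict range check (it does not try to judge what counts as "close to" the range).

```python
# Sample x-values (hours before icing) from the fish data in this set.
time_hours = [0, 0, 2, 3, 5, 6, 7, 9, 11, 12]


def is_extrapolation(x_new, x_sample):
    """True when x_new lies outside the range of the sampled x-values."""
    return x_new < min(x_sample) or x_new > max(x_sample)


print(is_extrapolation(20, time_hours))  # True: 20 hours is beyond the 0-12 range
print(is_extrapolation(5, time_hours))   # False: 5 hours is inside the range
```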

5. The following table shows the weights (in pounds) and the number of hours slept in a day by a random sample of infants. Test the claim that M≠0. Use α=0.01. Then interpret the results in the context of the problem. If convenient, use technology to solve the problem. Weight, x 8.1 10.2 9.8 7.3 6.9 11.1 10.9 15.1 Hours slept, y 14.8 14.5 14.2 14.2 13.8 13.2 13.9

STAT EDIT LinRegTTest 2 tailed (a) Identify the null and alternative hypotheses. H0​: M = 0 HA​: M ≠ 0 (claim) (b) calculate the test statistic STAT EDIT L1 & L2 input data into L1 & L2 STAT TEST E: LinRegTTest t = -2.782 calculate the P-value P = .0319 (c) State the conclusion Fail to reject H0. There is insufficient evidence at the 1​% level of significance to support the claim that there is a linear relationship between weight and number of hours slept. EXTRA STUFF STAT CALC 4: LinReg (ax+b) a = -.218 b = 16.038 ŷ = -.218x+16.038 calculate Se STAT EDIT at the top of list "YHAT" -.218*L1+16.038 ENTER at top of L3 (L2-YHAT)^2 ENTER PRGM θ ENTER Se = .546

The following table shows the weights​ (in pounds) and the number of hours slept in a day by a random sample of infants. Test the claim that M≠0. Use α=0.01. Then interpret the results in the context of the problem. If​ convenient, use technology to solve the problem. Weight, x 8.1 10.3 9.9 7.1 6.9 11.1 11.1 14.9 Hours​ slept, y 14.9 14.5 14.2 14.1 13.7 13.2 13.9 12.6

STAT EDIT LinRegTTest 2 tailed (a) Identify the null and alternative hypotheses. ​Ho: M = 0 Ha​: M ≠ 0 (claim) (b) calculate the test statistic STAT EDIT L1 & L2 input data into L1 & L2 STAT TEST E: LinRegTTest t = -2.17 calculate the P-value P = .0731 (c) State the conclusion Fail to reject H0. There is insufficient evidence at the 1​% level of significance to support the claim that there is a linear relationship between weight and number of hours slept.
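
For readers without a TI calculator, the test statistic can be reproduced in Python. In simple linear regression the t statistic for the slope equals r·√(n−2)/√(1−r²), so this sketch computes r first. The critical value 3.707 (two-tailed, α = 0.01, 6 degrees of freedom) is taken from a standard t-table and is not given in the card. The helper name `pearson_r` is illustrative.

```python
import math

weight = [8.1, 10.3, 9.9, 7.1, 6.9, 11.1, 11.1, 14.9]
hours_slept = [14.9, 14.5, 14.2, 14.1, 13.7, 13.2, 13.9, 12.6]


def pearson_r(xs, ys):
    """Sample correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


n = len(weight)
r = pearson_r(weight, hours_slept)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

critical_t = 3.707  # assumption: t-table value for alpha = 0.01 (two-tailed), df = 6
reject_h0 = abs(t_stat) > critical_t

print(f"r = {r:.3f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Comparing |t| with the critical value leads to the same decision as comparing the P-value with α.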

9.1 hw 11 The accompanying table shows the height​ (in inches) of 8 high school girls and their scores on an IQ test. Complete parts​ (a) through​ (d) below. ​Height, x 60 56 64 65 58 64 64 54 IQ​ score, y 109 98 106 113 93 109 116 128

STAT EDIT LinReg (ax+b) (a) Display the data in a scatter plot STAT EDIT L1 & L2 zoom 9 (see image) (b) Calculate the sample correlation coefficient r STAT CALC 4: LinReg (ax+b) r = -.006 (c) Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is no linear correlation. (d) Interpret the correlation Based on the correlation, there does not appear to be a linear relationship between high school girls' heights and their IQ scores. (e) Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let α=0.01. The critical value is .834. Therefore, there is not sufficient evidence at the 1% level of significance to conclude that there is a significant linear correlation between high school girls' heights and their IQ scores.

9.1 hw 13 The accompanying table shows eleven altitudes​ (in thousands of​ feet) and the speeds of sound​ (in feet per​ second) at these altitudes. Complete parts​ (a) through​ (d) below. ​Altitude, x 0 5 10 15 20 25 30 35 40 45 50 Speed of​ sound, y 1116.8 1095.8 1077.2 1057.1 1037.4 1016.3 996.3 970.8 967.7 967.7 967.7

STAT EDIT LinReg (ax+b) (a) Display the data in a scatter plot STAT EDIT L1 & L2 zoom 9 (see image) (b) Calculate the sample correlation coefficient r STAT CALC 4: LinReg (ax+b) r = -.974 (c) Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is a strong negative linear correlation. (d) Interpret the correlation As altitude increases, speeds of sound tend to decrease. (e) Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let α=0.01. The critical value is .735. Therefore, there is sufficient evidence at the 1% level of significance to conclude that there is a significant linear correlation between altitude and speed of sound.

9.1 hw 12 The accompanying table shows the earnings per share​ (in dollars) and the dividends per share​ (in dollars) for 6 companies in a recent year. Complete parts​ (a) through​ (d) below. Earnings per​ share, x 0.96 3.93 3.41 7.97 1.74 2.77 Dividends per​ share, y 0.95 0.38 2.15 1.02 0.69 1.32

STAT EDIT LinReg (ax+b) (a) Display the data in a scatter plot STAT EDIT L1 & L2 zoom 9 (see image) (b) Calculate the sample correlation coefficient r STAT CALC 4: LinReg (ax+b) r = .024 (c) Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is no linear correlation. (d) Interpret the correlation Based on the correlation, there does not appear to be a linear relationship between companies' earnings per share and their dividends per share. (e) Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let α=0.01. The critical value is .917. Therefore, there is not sufficient evidence at the 1% level of significance to conclude that there is a significant linear correlation between companies' earnings per share and their dividends per share.

9.1 hw 10 The accompanying table shows the ages​ (in years) of 11 children and the numbers of words in their vocabulary. Complete parts​ (a) through​ (d) below. ​Age, x 1 2 3 4 5 6 3 5 2 4 6 Vocabulary​ size, y 5 270 530 1100 1900 2700 560 2100 240 1400 2400

STAT EDIT LinReg (ax+b) (a) Display the data in a scatter plot STAT EDIT L1 & L2 zoom 9 (see image) (b) Calculate the sample correlation coefficient r STAT CALC 4: LinReg (ax+b) r = .978 (c) Describe the type of correlation, if any, and interpret the correlation in the context of the data. There is a strong positive linear correlation. (d) Interpret the correlation As age increases, the number of words in children's vocabulary tends to increase. (e) Use the table of critical values for the Pearson correlation coefficient to make a conclusion about the correlation coefficient. Let α=0.01. The critical value is .735. Therefore, there is sufficient evidence at the 1% level of significance to conclude that there is a significant linear correlation between children's ages and the number of words in their vocabulary.
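
The calculator steps above can be mirrored in a few lines of Python: compute r from the paired data, then compare |r| with the critical value from the Pearson table. This is only a sketch. The helper name `pearson_r` is illustrative, and the critical value 0.735 is the one quoted in the card for n = 11 and α = 0.01.

```python
import math

age = [1, 2, 3, 4, 5, 6, 3, 5, 2, 4, 6]
vocabulary = [5, 270, 530, 1100, 1900, 2700, 560, 2100, 240, 1400, 2400]


def pearson_r(xs, ys):
    """Sample correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(age, vocabulary)
critical_value = 0.735  # from the card: Pearson table, n = 11, alpha = 0.01
significant = abs(r) > critical_value

print(f"r = {r:.3f}, significant at the 1% level: {significant}")
```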

9.1 hw 15 10. The maximum weights (in kilograms) for which one repetition of a half squat can be performed and the times (in seconds) to run a 10-meter sprint for 12 international soccer players are shown in the attached data table with a sample correlation coefficient r of −0.956. A 13th data point was added to the end of the data set for an international soccer player who can perform the half squat with a maximum of 205 kilograms and can sprint 10 meters in 2.01 seconds. Describe how this affects the correlation coefficient r. Use technology. Maximum weight, x 170 175 160 205 155 185 185 155 195 185 160 165 205 Time, y 1.82 1.77 2.06 1.43 2.04 1.61 1.72 1.89 1.59 1.63 1.99 1.92 2.01

STAT EDIT LinReg (ax+b) STAT EDIT L1 & L2 STAT CALC 4: LinReg (ax+b) r = -.657 The new correlation coefficient r gets weaker, going from -0.956 to -0.657.
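
The effect of the extra player can be checked by computing r with and without the 13th point. The sketch below does this in plain Python. The helper name `pearson_r` is illustrative, and the data are copied from the card.

```python
import math

max_weight = [170, 175, 160, 205, 155, 185, 185, 155, 195, 185, 160, 165, 205]
sprint_time = [1.82, 1.77, 2.06, 1.43, 2.04, 1.61, 1.72, 1.89, 1.59, 1.63,
               1.99, 1.92, 2.01]


def pearson_r(xs, ys):
    """Sample correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r_12 = pearson_r(max_weight[:12], sprint_time[:12])  # original 12 players
r_13 = pearson_r(max_weight, sprint_time)            # with the added player

print(f"12 players: r = {r_12:.3f}")
print(f"13 players: r = {r_13:.3f}")
```

The added player is strong but slow, which runs against the trend in the other twelve points, so a single observation pulls |r| down noticeably.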

9.1 hw 14 The ages​ (in years) of 10 men and their systolic blood pressures​ (in millimeters of​ mercury) are shown in the attached data table with a sample correlation coefficient r of 0.915. Remove the data entry for the man who is 49 years old and has a systolic blood pressure of 199 millimeters of mercury from the data set and find the new correlation coefficient. Describe how this affects the correlation coefficient r. Use technology. Age, x 17 27 37 44 49 63 68 32 57 23 Systolic blood​ pressure, y 111 122 143 134 199 184 198 132 176 117

STAT EDIT LinReg (ax+b) STAT EDIT L1 & L2 remove 49 & 199 STAT CALC 4: LinReg (ax+b) r = .976 The new correlation coefficient r gets stronger, going from 0.915 to 0.976.

9.1 hw 18 The maximum weights​ (in kilograms) for which one repetition of a​ half-squat can be performed and the jump heights​ (in centimeters) for 12 international soccer players are given in the accompanying table. The correlation​ coefficient, rounded to three decimal​ places, is r=0.734. At α=0.05​, is there enough evidence to conclude that there is a significant linear correlation between the​ variables? Maximum​ weight, x 190 185 155 180 175 170 150 160 160 180 190 210 Jump​ height, y 61 57 53 59 56 65 51 50 49 58 58 63

STAT EDIT PRGM InvT (custom) LinRegTtest (a) Determine the null and alternative hypotheses. Ho: ρ = 0 Ha​: ρ ≠ 0 2 tailed test (b) Identify the critical​ value(s). Select the correct choice below and fill in any answer boxes within your choice. PRGM InvT AREA LEFT: .025 (.05/2) DF: 10 (n-2) −tₒ = −2.228 and tₒ = 2.228 (c) Calculate the test statistic. STAT TEST E: LinRegTtest (y=a+bx) t = 3.421 (d) Conclusion Reject H0. There is enough evidence at the 5​% level of significance to conclude that there is a significant linear correlation between the maximum weight for one repetition of a half squat and the jump height.
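
When only r and n are known, the test statistic can be computed directly from t = r·√(n−2)/√(1−r²). The sketch below uses the rounded r = 0.734 from the card, so it lands slightly below the calculator's 3.421, which works from the unrounded data. The function name `t_from_r` is illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_stat = t_from_r(0.734, 12)
critical_t = 2.228  # from the card: two-tailed, alpha = 0.05, df = 10
reject_h0 = abs(t_stat) > critical_t

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```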

9.1 hw 16 4. The weights​ (in pounds) of 6 vehicles and the variability of their braking distances​ (in feet) when stopping on a dry surface are shown in the table. Can you conclude that there is a significant linear correlation between vehicle weight and variability in braking distance on a dry​ surface? Use α=0.05. ​Weight, x 5920 5370 6500 5100 5870 4800 Variability in braking​ distance, y 1.71 1.95 1.91 1.56 1.62 1.50

STAT EDIT PRGM InvT (custom) LinRegTtest (a) Setup the hypothesis for the test. Ho: ρ = 0 Ha​: ρ ≠ 0 2 tailed test (b) Identify the critical​ value(s). PRGM InvT AREA LEFT: .025 (.05/2) DF: 4 (n-2) −tₒ = −2.776 and tₒ = 2.776. (c) Calculate the test statistic. STAT TEST E: LinRegTtest (y=a+bx) t = 1.484 (d) Conclusion There is not enough evidence at the 5​% level of significance to conclude that there is a significant linear correlation between vehicle weight and variability in braking distance on a dry surface.

9.1 hw 17 11. The number of hours 10 students spent studying for a test and their scores on that test are shown in the table. Is there enough evidence to conclude that there is a significant linear correlation between the​ data? Use α=0.01. ​ Hours, x 0 1 2 4 4 5 5 6 7 8 Test​ score, y 38 40 55 53 63 66 73 72 81 91

STAT EDIT PRGM InvT (custom) LinRegTtest (a) Setup the hypothesis for the test. Ho: ρ = 0 Ha: ρ ≠ 0 2 tailed test (b) Identify the critical value(s). Select the correct choice below and fill in any answer boxes within your choice. PRGM InvT AREA LEFT: .005 (.01/2) DF: 8 (n-2) −tₒ = −3.355 and tₒ = 3.355. (c) Calculate the test statistic. STAT TEST E: LinRegTtest (y=a+bx) t = 10.65 (d) Conclusion Reject H0. There is enough evidence at the 1% level of significance to conclude that there is a significant linear correlation between hours spent studying and test score.

Use the data in the table below to complete parts​ (a) through​ (d). x 37 34 41 45 42 50 60 56 52 y 25 23 27 32 30 30 28 24 27

STAT EDIT STAT CALC (a) Find the equation of the regression line. STAT EDIT enter data into L1 & L2 STAT CALC 4: LinReg (ax+b) ENTER ŷ = .072x + (24.013) (b) Construct a scatter plot of the data and draw the regression line. Plot the​ x-values on the horizontal axis and the​ y-values on the vertical axis. ZOOM 9 A. (see image) (c) Construct a residual plot. Plot the​ x-values on the horizontal axis and the residuals on the vertical axis. Choose the correct graph below. STAT CALC 8: LinReg (a+bx) ENTER ZOOM 9 B. (see image) ​(d) Determine if there are any patterns in the residual plot and explain what they suggest about the relationship between the variables. The residual plot shows a pattern because the residuals do not fluctuate about 0. This implies the regression line is not a good representation of the relationship between the variables.
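
As a cross-check on the calculator output, the regression line and its residuals can be computed by hand in Python. This is a minimal sketch using the standard least-squares formulas. The variable names are illustrative.

```python
xs = [37, 34, 41, 45, 42, 50, 60, 56, 52]
ys = [25, 23, 27, 32, 30, 30, 28, 24, 27]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y, one per data point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"y-hat = {slope:.3f}x + {intercept:.3f}")
for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.3f}")
```

Plotting the residuals against x (for instance with a plotting library of your choice) gives the residual plot asked for in part (c).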

13. Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. The table shows the shoe size and heights​ (in.) for 6 men. Shoe size, x 6.5 7.5 8.5 12.0 13.0 13.5 ​Height, y 65.5 66.5 72.5 73.5 73.5 72.5

STAT EDIT STAT CALC (a) Find the regression equation. STAT EDIT input data into L1 & L2 STAT CALC 4: LinReg (y=ax+b) ŷ = 1.011x+60.389 (b) choose the correct graph (see image)

Use the data shown in the table. Replace each​ x-value and​ y-value in the table with its logarithm. Find the equation of the regression line for the transformed data. Then construct a scatter plot of (log x,logy) and sketch the regression line with it. What do you​ notice? x 1 2 3 4 5 6 7 8 y 1064 462 305 209 146 103 94 66

STAT EDIT STAT CALC (a) Find the equation of the regression line of the transformed data. STAT EDIT input data into L1 & L2; at the top of L3, LOG L1 ENTER; at the top of L4, LOG L2 ENTER. STAT CALC 4: LinReg (ax+b) L3, L4 ENTER log y = −1.311 log x + 3.063 (b) Construct a scatterplot of (log x, log y) with the values of log x on the horizontal axis and the values of log y on the vertical axis and sketch the regression line with it. D (see image) (c) What do you notice? (The comparison graph in the question is a scatterplot of the untransformed data, with the x-values on the horizontal axis and the y-values on the vertical axis, along with the resulting regression line.) A linear model is more appropriate for the transformed data than for the untransformed data.

17 A power equation is a nonlinear regression equation of the form y=ax^b. Use a technology tool to find and graph the power equation for the data below. Include a scatter plot in your graph. Note that you can also find this model by solving the equation logy = m(log x)+b. x 1 2 3 4 5 6 7 8 y 681 415 255 120 92 72 61 77

STAT EDIT STAT CALC (a) Find the graph. STAT EDIT input data into L1 & L2 ZOOM 9 D (see image) (b) Find the power equation. STAT CALC A: PwrReg y=a*x^b ENTER y = 789.99x^(−1.25)
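As the question notes, the power model can also be found by fitting a line to (log x, log y) and converting back: the slope is b and 10 raised to the intercept is a. A minimal Python sketch of that route, assuming base-10 logarithms (fit_power is a hypothetical helper name, not a calculator command):

```python
import math

def fit_power(x, y):
    """Fit y = a * x**b by least squares on (log10 x, log10 y)."""
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    n = len(x)
    mean_lx = sum(log_x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((u - mean_lx) ** 2 for u in log_x)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
    b = sxy / sxx                          # slope of the log-log line is the exponent
    a = 10 ** (mean_ly - b * mean_lx)      # 10**intercept is the coefficient
    return a, b

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [681, 415, 255, 120, 92, 72, 61, 77]

a, b = fit_power(x, y)
print(round(a, 2), round(b, 2))
```

The result should be close to the PwrReg output in part (b).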

2. Complete parts​ (a) through​ (c) using the following data. Row 1 2 3 3 3 3 5 5 6 7 7 Row 2 90 84 77 72 90 75 79 80 63 62

STAT EDIT STAT CALC LineRegR (a) Find the equation of the regression line for the given​ data, letting Row 1 represent the​ x-values and Row 2 the​ y-values. Sketch a scatter plot of the data and draw the regression line. STAT EDIT input data into L1 & L2 STAT CALC 4: LinReg (ax+b) ENTER ŷ = -4.039x + (94.974) (b) choose the correct graph (see image) ​(c) Find the equation of the regression line for the given​ data, letting Row 2 represent the​ x-values and Row 1 the​ y-values. Sketch a scatter plot of the data and draw the regression line. STAT CALC 8: LinReg (a+bx) ENTER ŷ = -.145x + (15.558) (d)​choose the correct graph (see image) (e) ​What effect does switching the explanatory and response variables have on the regression​ line? The sign of m is​ unchanged, but the values of m and b change.
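The effect described in part (e) can be reproduced by fitting the line twice with the roles of the rows swapped. A small Python sketch, assuming an ordinary least-squares fit (fit_line is an illustrative helper):

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y-hat = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

row1 = [2, 3, 3, 3, 3, 5, 5, 6, 7, 7]
row2 = [90, 84, 77, 72, 90, 75, 79, 80, 63, 62]

slope_a, intercept_a = fit_line(row1, row2)  # Row 1 as x, Row 2 as y
slope_c, intercept_c = fit_line(row2, row1)  # Row 2 as x, Row 1 as y
print(round(slope_a, 3), round(intercept_a, 3))
print(round(slope_c, 3), round(intercept_c, 3))
```

Both slopes are negative, but their values and the intercepts differ, so the second line is not simply the first one solved for x.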

6.The accompanying data are the number of wins and the earned run averages​ (mean number of earned runs allowed per nine innings​ pitched) for eight baseball pitchers in a recent season. Find the equation of the regression line. Then construct a scatter plot of the data and draw the regression line. Then use the regression equation to predict the value of y for each of the given​ x-values, if meaningful. If the​ x-value is not meaningful to predict the value of​ y, explain why not. ​(a) x=5 wins ​(b) x=10 wins ​(c) x=19 wins ​(d) x=15 wins ​Wins, x 20 18 17 16 14 12 11 9 Earned run​ average, y 2.79 3.27 2.66 3.82 3.86 4.26 3.72 5.11

STAT EDIT STAT CALC LineRegR (a) find the regression equation STAT EDIT input data into L1 & L2 STAT CALC 4: LinReg (ax+b) ENTER ŷ = -.18x + (6.35) (b) choose the correct graph C (see image) (c)​Predict value of y for x=5. PRGM LineRegR A=-.18 B=6.35 X=5 It is not meaningful to predict this value of y because x=5 is well outside the range of the original data. (d) Predict value of y for x=10 PRGM LineRegR A=-.18 B=6.35 X=10 ŷ = 4.55 (e) Predict value of y for x=19 PRGM LineRegR A=-.18 B=6.35 X=19 ŷ = 2.93 (f) Predict value of y for x=15 PRGM LineRegR A=-.18 B=6.35 X=15 ŷ = 3.65
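The predictions in parts (c) through (f) follow from substituting x into ŷ = −.18x + 6.35. The Python sketch below does the same arithmetic; it treats any x outside the observed range of wins as not meaningful, which is a simplifying assumption for this example (the card only says "well outside the range"), and predict is a made-up helper name.

```python
def predict(slope, intercept, x_value, x_data):
    """Return the predicted y, or None when x_value lies outside the observed x range."""
    if not (min(x_data) <= x_value <= max(x_data)):
        return None  # extrapolation: treated here as not meaningful
    return slope * x_value + intercept

wins = [20, 18, 17, 16, 14, 12, 11, 9]
slope, intercept = -0.18, 6.35  # rounded coefficients from part (a)

predictions = {x: predict(slope, intercept, x, wins) for x in (5, 10, 19, 15)}
for x, y_hat in predictions.items():
    print(x, "not meaningful" if y_hat is None else round(y_hat, 2))
```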

Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line.​ (The pair of variables have a significant​ correlation.) Then use the regression equation to predict the value of y for each of the given​ x-values, if meaningful. The table below shows the heights​ (in feet) and the number of stories of six notable buildings in a city. Height, x 758 621 518 510 492 483 ​Stories, y 51 47 46 43 37 36 ​(a) x=503 feet ​(b) x=650 feet​ (c) x=315 feet ​(d) x=735 feet

STAT EDIT STAT CALC LineRegR (a) find the regression equation STAT EDIT input data into L1 & L2 STAT CALC 4: LinReg (ax+b) ENTER ŷ = .046x + (17.51) (b) choose the correct graph C (see image) (c) Predict value of y for x=503. PRGM LineRegR A=.046 B=17.51 X=503 ŷ = 41 (d) Predict value of y for x=650 PRGM LineRegR A=.046 B=17.51 X=650 ŷ = 47 (e) Predict value of y for x=315 PRGM LineRegR A=.046 B=17.51 X=315 It is not meaningful to predict this value of y because x=315 is well outside the range of the original data. (f) Predict value of y for x=735 PRGM LineRegR A=.046 B=17.51 X=735 ŷ = 51

1. The number of initial public offerings of stock issued in a​ 10-year period and the total proceeds of these offerings​ (in millions) are shown in the table. The equation of the regression line is y=46.418x+18,589.89. Complete parts a and b. ​Issues, x 415 472 689 488 481 394 60 72 196 158 Proceeds, y 19672 29442 42747 30845 65649 65315 20928 11798 30746 27740

STAT EDIT STAT CALC PRGM StdErEst (custom) (a) Find the coefficient of determination and interpret the result. STAT EDIT input data into L1 & L2 STAT CALC 4: LinReg (y=ax+b) r^2 = .282 explained = 28.2% unexplained = 71.8% (b) How can the coefficient of determination be interpreted? The coefficient of determination (28.2%) is the fraction of the variation in proceeds that can be explained by the variation in issues. The remaining fraction (71.8%) of the variation is unexplained and is due to other factors or to sampling error. (c) Find the standard error of estimate se and interpret the result. STAT EDIT at the top of list "YHAT" 46.418*L1+18589.89 ENTER at top of L3 (YHAT-L2)^2 ENTER PRGM StdErEst ENTER Se = 16424.066 (d) How can the standard error of estimate be interpreted? The standard error of estimate of the proceeds for a specific number of issues is about $16,424,066,000 (proceeds are in millions of dollars). Therefore, 28.2% of the variation in proceeds can be explained by the regression line y=46.418x+18,589.89, and the standard error of estimate of the proceeds for a specific number of issues is about $16,424,066,000.

The table shows the amounts of crude oil​ (in thousands of barrels per​ day) produced by a certain country and the amounts of crude oil​ (in thousands of barrels per​ day) imported by the same country for seven years. The equation of the regression line is y=−1.326x+17,011.27. Complete parts​ (a) and​ (b) below. ​Produced, x 5,833 5,700 5,637 5,385 5,238 5,167 5,092 Imported, y 9,323 9,154 9,682 10,045 10,166 10,190 10,061

STAT EDIT STAT CALC PRGM θ (custom) (a) Find the coefficient of determination and interpret the result. STAT EDIT L1 & L2 STAT CALC 4: LinReg (y=ax+b) r^2 = .82 (b) How can the coefficient of determination, r2=0.82, be interpreted? The fraction of the variation in the amount of imported crude oil that can be explained by the variation in the amount of produced crude oil is r2. The remaining fraction 1−r2 of the variation is unexplained and is due to other factors or to sampling error. (c) Find the standard error of estimate se and interpret the result. STAT EDIT at the top of list "YHAT" -1.326*L1+17011.27 ENTER at top of L3 (L2-YHAT)^2 ENTER PRGM θ ENTER Se = 196.658 (d) How can the standard error of estimate be interpreted? The standard error of estimate of the amount of imported crude oil for a specific amount of produced crude oil is about se thousand barrels per day.
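Both quantities on this card come from the sum of squared residuals: r² = 1 − Σ(y−ŷ)²/Σ(y−ȳ)² and se = √(Σ(y−ŷ)²/(n−2)). The Python sketch below is one way to check them. It refits the line from the data instead of using the rounded coefficients from the card, so small differences in the last digits are possible; the function name is illustrative.

```python
import math

def r_squared_and_se(x, y):
    """Return (r^2, s_e) for the least-squares line through the data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    total_variation = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: sum of squared residuals about the regression line.
    unexplained = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - unexplained / total_variation
    se = math.sqrt(unexplained / (n - 2))
    return r2, se

produced = [5833, 5700, 5637, 5385, 5238, 5167, 5092]
imported = [9323, 9154, 9682, 10045, 10166, 10190, 10061]

r2, se = r_squared_and_se(produced, imported)
print(round(r2, 3), round(se, 3))
```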

The table shows the average weekly wages​ (in dollars) for state government employees and federal government employees for 8 years. The equation of the regression line is y=1.423x−24.155. Complete parts​ (a) and​ (b) below. Average Weekly Wages​ (state), x 748 777 781 815 835 889 917 941 Average Weekly Wages​ (federal), y 1009 1051 1117 1154 1197 1253 1273

STAT EDIT STAT CALC PRGM θ (custom) (a) Find the coefficient of determination and interpret the result. STAT EDIT L1 & L2 STAT CALC 4: LinReg (y=ax+b) r^2 = .934 (b) How can the coefficient of determination, r2=0.934, be interpreted? The coefficient of determination is the fraction of the variation in average weekly wages for federal government employees that can be explained by the variation in average weekly wages for state government employees and is represented by r^2. The remaining fraction of the variation, 1−r2, is unexplained and is due to other factors or to sampling error. (c) Find the standard error of estimate. STAT EDIT at the top of list "YHAT" 1.423*L1-24.155 ENTER at top of L3 (YHAT-L2)^2 ENTER PRGM θ ENTER Se = 28.86 (d) How can the standard error of estimate, se=28.86, be interpreted? The standard error of estimate of the average weekly wage for federal government employees for a specific average weekly wage for state government employees is about se dollars.

14 12. Use the data in the table below to complete parts​ (a) through​ (c). x 4 6 9 11 13 16 20 46 y 26 31 27 31 21 22 22 7

STAT EDIT STAT CALC (a) construct a scatterplot STAT EDIT enter data into L1 & L2 STAT CALC 4: LinReg (ax+b) ENTER ŷ = -.532x + (31.69) ZOOM 9 D. (see image) (b) identify any possible outliers A. The point (46,7) may be an outlier. (c) Determine if the point is influential. The change in slope or intercept is significant if it is larger than 10%. STAT EDIT modify L1 & L2 by deleting the outlier then, STAT CALC 4: LinReg (ax+b) ENTER ŷ = -.493x + (31.28) equation with outlier is ŷ = -.532x + (31.69) (from above) equation w/out outlier is ŷ = -.493x + (31.28) The point is not an influential point because the slopes with the point included and without the point included are not significantly​ different, and the intercepts are not significantly different.
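The influence check in part (c) compares the fitted line with and without the suspected outlier. A minimal Python sketch follows; it applies the card's 10% threshold, and it measures each change relative to the coefficient fitted with the point included, which is an assumption made for this example.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y-hat = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

x = [4, 6, 9, 11, 13, 16, 20, 46]
y = [26, 31, 27, 31, 21, 22, 22, 7]

slope_all, intercept_all = fit_line(x, y)               # with the point (46, 7)
slope_trim, intercept_trim = fit_line(x[:-1], y[:-1])   # without the point (46, 7)

# Relative change in each coefficient when the suspected outlier is removed.
slope_change = abs(slope_trim - slope_all) / abs(slope_all)
intercept_change = abs(intercept_trim - intercept_all) / abs(intercept_all)
influential = slope_change > 0.10 or intercept_change > 0.10
print(round(slope_all, 3), round(slope_trim, 3), influential)
```

Both changes come out below 10%, which matches the card's conclusion that the point is not influential.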

An exponential equation is a nonlinear regression equation of the form y=ab^x. Use technology to find and graph the exponential equation for the accompanying data, which shows the number of bacteria present after a certain number of hours. Include the original data in the graph. Note that this model can also be found by solving the equation log y=mx + b for y. Number of hours, x 1 2 3 4 5 6 7 Number of bacteria, y 167 279 469 781 1313 1923

STAT EDIT STAT CALC Input data into L1 & L2 STAT CALC 0: ExpReg ENTER ŷ = 93.65(1.71)^x ZOOM 9 Choose the correct graph (see image)

The coefficient of determination r2 is the ratio of which two types of​ variations? What does r2 measure? What does 1−r2 ​measure?

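The coefficient of determination is the ratio of the explained variation to the total variation. r2 measures the percent of variation of y that is explained by the relationship between x and y. 1−r2 measures the percent of variation of y that is unexplained by the relationship between x and y.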

9.1 ex 5 What is the definition of the correlation​ coefficient?

The correlation coefficient is a measure that describes the direction and strength of the linear relationship between two quantitative variables.

Two variables have a positive linear correlation. Does the dependent variable increase or decrease as the independent variable​ increases?

The dependent variable increases.

Identify the explanatory variable and the response variable. A teacher wants to determine if the teaching method used by her students can be used to predict the students' test scores.

The explanatory variable is the teaching method. The response variable is the students' test scores.

Identify the explanatory variable and the response variable. A farmer wants to determine if the temperature received by similar crops can be used to predict the harvest of the crop.

The explanatory variable is the temperature. The response variable is the harvest of the crop.

What does it mean to say​ "correlation does not imply​ causation"?

The fact that two variables are strongly correlated does not in itself imply a​ cause-and-effect relationship between the variables.

The line that fits best between the points in a scatterplot is the line that gives the​ _______ sum of the squared​ _______ distances between each point and the line.

The line that fits best between the points in a scatterplot is the line that gives the smallest sum of the squared vertical distances between each point and the line.

Describe the range of values for the correlation coefficient.

The range of values for the correlation coefficient is −1 to​ 1, inclusive.

9.1 ex 6 When looking at a scatterplot of two quantitative​ variables, what do we typically look​ for?

The relationship between the two variables and whether there are any deviations from the pattern (outliers or clusters of points, for example).

9.1 ex 8 Which of the following statements best describes this​ scatterplot?

There is a​ negative, moderately strong relationship between X and Y with one outlier.

9.1 ex 9 Which of the following statements best describes this​ scatterplot?

There is a​ non-linear relationship between X and Y with two outliers.

What is the coefficient of determination for two variables that have perfect positive linear correlation or perfect negative linear​ correlation? Interpret your answer.

Two variables that have a perfect positive or a perfect negative linear correlation have a correlation coefficient of 1 or −1. The coefficient of determination is this value squared. In either​ case, this value is 1. The coefficient of determination is the fraction of the variation in the response variable explained by the variation in the explanatory variable.

In regression, what is the difference between an observed value of the response variable and its predicted value called?

A residual

9.1 ex 1 Determine if the following statement is true or false. Allie calculated a correlation coefficient of −0.5. She made a mistake in her calculation since the correlation coefficient cannot be negative.

false

9.1 ex 14 Determine if the following statement is true or false. Audrey examined a scatterplot and saw that there was no relationship between height and grade point average.​ Therefore, Audrey should not use the correlation coefficient to describe the strength of this relationship.

false

9.1 ex 15 Steve calculated a correlation coefficient between gas price and miles driven as −0.15. Steve said there was a strong negative association between gas price and miles driven. Is this statement true or​ false?

false

9.1 ex 3 Determine if the following statement is true or false. Benjamin was investigating the relationship between outside temperature on a given day and number of hours spent outside that day. After sampling 25 people on 25 different​ days, he obtained the displayed scatterplot. He should use the correlation coefficient to describe the strength of the relationship between temperature and hours spent outside.

false

9.1 ex 4 Determine if the following statement is true or false. A correlation coefficient close to 1 is evidence of a​ cause-and-effect relationship between the two variables.

false

Researchers wondered if brain size has an effect on a​ person's IQ. The response variable is brain size. Is this statement true or​ false?

false

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser (duration) and the time between when that eruption ends and the next eruption begins (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. The least-squares regression equation is y=33.967+11.358x, where y is the interval from the end of the current eruption to the beginning of the next eruption and x is the duration of the current eruption. For a duration of 4 minutes, y=79.4 minutes. This means that a visitor will have to wait exactly 79.4 minutes after the current eruption ends before the next eruption begins. Is this statement true or false?

false

In​ regression, what can be said about the sum of the residuals of all the​ observations?

it will always be zero

9.2 When performing a linear regression analysis, it is important that the relationship between the two quantitative variables be _______.

linear

When analyzing two quantitative​ variables, what is the first thing that should be​ done?

make a scatterplot

Discuss the difference between r and ρ.

r represents the sample correlation coefficient. ρ represents the population correlation coefficient.

Which value of r indicates a stronger​ correlation: r=0.763 or r=−0.897​? Explain your reasoning.

r=−0.897 represents a stronger correlation because |−0.897| > |0.763|. The strength of a correlation depends on the absolute value of r, not on its sign.

In​ regression, what is the proportion of variation in the response variable that is explained by the regression model

r^2

7. For each of the scatter plots​ below, determine whether there is a perfect positive linear​ correlation, a strong positive linear​ correlation, a perfect negative linear​ correlation, a strong negative linear​ correlation, or no linear correlation between the variables.

see image

9.1 ex 7 Which of the following scatterplots indicates a strong negative linear relationship between X and​ Y?

see image

Match the description with its symbol.

see image

Match the regression equation with the appropriate graph.

see image

The scatter plots below show the results of a survey of 20 randomly selected males ages 24-35.Using age as the explanatory​ variable, match each graph with the appropriate description.

see image

9.1 ex 11 Determine if the following statement is true or false. A correlation coefficient can be 0.

true

9.1 ex 2 Determine if the following statement is true or false. Alex calculated a correlation coefficient of −1.5. He made a mistake in his calculation since the correlation coefficient has to be between −1 and 1.

true

In​ regression, a residual can be negative. Is this statement true or​ false?

true

