Stat 042 chapter 7 homework

Question 22 Researchers hoping to find ways to make a good estimate of a person's body fat percentage immersed 20 male subjects in water, then measured their waists and recorded their weights. The results are shown in the accompanying table. The linear model used to predict % Body Fat from Weight has R² = 47.1% and s_e = 6.9%. Would a model that uses the person's Waist size be able to predict the % Body Fat more accurately than one that uses Weight? Create and analyze that model. 1. Write the equation of the regression line.

Intercept -62, slope 2.21: predicted % Body Fat = -62 + 2.21 × Waist

Question 3 A least squares regression line was calculated to relate the length (cm) of newborn boys to their weight in kg. (The regression equation is given in the Pearson problem.) A newborn was 48 cm long and weighed 3 kg. According to the regression model, what was his residual? What does that say about him?
1. What was his residual?
2. What does that say about him? Select the correct choice and fill in any answer boxes to complete your answer.
A. The newborn weighs ____ kg more than the weight predicted by the regression equation.
B. The newborn weighs ____ kg less than the weight predicted by the regression equation.
C. The newborn weighs the same as the weight predicted by the regression equation.

1. -0.092 kg 2. B

A least squares regression line was calculated to relate the length (cm) of newborn boys to their weight in kg. The line is predicted weight = -5.28 + 0.1694 × length. Explain in words what this model means. Should new parents (who tend to worry) be concerned if their newborn's length and weight don't fit this equation?
1. What does the given model mean?
A. The weight of a newborn boy will always equal -5.28 kg plus 0.1694 kg per cm of length.
B. The minimum length of a newborn baby boy can be no less than 31.1688 cm.
C. The weight of a newborn boy can be predicted as -5.28 kg plus 0.1694 kg per cm of length.
D. The length of a newborn boy can be predicted as -5.28 cm plus 0.1694 cm per kg of weight.
2. Should new parents (who tend to worry) be concerned if their newborn's length and weight don't fit this equation?
A. No, because this is a model fit to divide the data. All newborn weights above the line are normal and all newborn weights below the line are a matter for concern.
B. Yes, because any newborn whose length and weight do not fit the model is far outside the normal weight to length proportion.
C. Yes, because 97% of newborn babies fit into this linear model perfectly.
D. No, because this is a model fit to data. No particular baby should be expected to fit this model exactly.

1. C 2. D

Question 18 Suppose the entering freshmen at a certain college have a mean combined SAT score of 1218, with a standard deviation of 126. In the first semester, these students attained a mean GPA of 2.62, with a standard deviation of 0.52. A scatterplot showed the association to be reasonably linear, and the correlation between SAT score and GPA was 0.48. What SAT score would you predict a freshman who attained a first-semester GPA of 2.9 would have gotten? Note that in this case, the explanatory variable is the student's GPA and the response variable is their SAT score.

1251
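A minimal sketch of how the 1251 can be reproduced from the summary statistics in the question, using slope = r × s_y / s_x and the fact that the least squares line passes through the point of means. The function name predict_sat is only an illustrative choice.

```python
# Summary statistics from the question (x = GPA, y = SAT).
mean_gpa, sd_gpa = 2.62, 0.52
mean_sat, sd_sat = 1218, 126
r = 0.48

def predict_sat(gpa):
    """Predict SAT from GPA using the line built from summary statistics."""
    slope = r * sd_sat / sd_gpa              # b1 = r * s_y / s_x
    intercept = mean_sat - slope * mean_gpa  # line passes through (mean x, mean y)
    return intercept + slope * gpa

print(round(predict_sat(2.9)))  # 1251
```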

Question 10 Analysis of the relationship between the fuel economy (mpg) and engine size (liters) for 35 models of cars produces a regression model (see the original question for the equation). If a car has a 5-liter engine, what does this model suggest the gas mileage would be?

17 mpg

The correlation between a car's engine size and its fuel economy (in mpg) is r = -0.735. What fraction of the variability in fuel economy is accounted for by the engine size?

54%
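The 54% is simply the square of the correlation. A short check, assuming nothing beyond the r given in the question:

```python
r = -0.735          # correlation from the question
r_squared = r ** 2  # fraction of variability accounted for by the linear model

print(f"{r_squared:.1%}")  # 54.0%
```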

Question 8 What is the value of R²? What is the interpretation of R²? The value of R² = 98.89% indicates the percentage of the variability in the price of these disk drives that can be accounted for by a linear model on the capacity of the drives. (Round to two decimal places as needed.)

98.89

Question 12 (See the problem on Pearson.)
A. The true potassium contents of cereals vary from the predicted amounts with a standard deviation of 30.03 milligrams.
B. The true potassium contents of cereals vary from the predicted amounts with a variance of 30.03 milligrams.
C. The true fiber contents of cereals vary from the predicted amounts with a standard deviation of 30.03 grams.
D. The true fiber contents of cereals vary from the predicted amounts with a variance of 30.03 grams.

A

Question 5 A CEO complains that the winners of his "rookie junior executive of the year" award often turn out to have less impressive performance the following year. He wonders whether the award actually encourages them to slack off. Can you offer a better explanation? Which of the following is a better explanation for why the winners of the "rookie junior executive of the year" award often turn out to have less impressive performance the following year?
A. Performance is often random. If a junior executive had a good year their first year, then odds say it is unlikely that they will have a repeat performance the next year.
B. Perhaps they weren't really better than other rookie executives, but just happened to have a lucky year.
C. The winners were considered the best of the year, so naturally they reached the maximum level of performance that year and it is impossible to improve upon that.
D. No, the CEO stated it perfectly; the award actually encourages them to slack off.

B

Question 14 Players in any sport who are having great seasons, turning in performances that are much better than anyone might have anticipated, often are pictured on the cover of Sports Illustrated. Frequently, their performances then falter somewhat, leading some athletes to believe in a "Sports Illustrated jinx." Similarly, it is common for phenomenal rookies to have less stellar second seasons, the so-called "sophomore slump." While fans, athletes, and analysts have proposed many theories about what leads to such declines, a statistician might offer a simpler (statistical) explanation. Explain.
A. The slope of the linear regression, predicting performance from years in the sport, must be negative because an athlete's performance always decreases over time. No matter how well an athlete performed one year, they must perform worse the next year.
B. People on the cover are usually there for outstanding performances. Because they are so far from the mean, the performance in the next year is likely to be closer to the mean.
C. People on the cover are usually considered the best of the year, so naturally they reached the maximum level of athletic performance that year and it is impossible to improve upon that.
D. Once an athlete has made the cover of Sports Illustrated, they have reached their ultimate goal as an athlete and lack motivation to try the following year.

B
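Regression to the mean can be seen in standardized form: the least squares prediction for next year's z-score is r times this year's z-score, so whenever |r| < 1 the prediction sits closer to the mean. The sketch below uses a made-up correlation of 0.5 and a made-up standout season of z = 2.5; neither number comes from the questions.

```python
def predicted_z_next(z_now, r):
    """Least squares prediction in standard units: z_y_hat = r * z_x."""
    return r * z_now

# Hypothetical values for illustration only.
r_hypothetical = 0.5  # assumed year-to-year correlation in performance
z_this_year = 2.5     # a season 2.5 standard deviations above the mean

z_next = predicted_z_next(z_this_year, r_hypothetical)
print(z_next)  # 1.25, still above average but closer to the mean
```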

Question 17 A group of high school seniors took a scholastic aptitude test. The resulting math scores had a mean of 473.4 with a standard deviation of 179.3, verbal scores had a mean of 508.4 with a standard deviation of 171.5, and the correlation between verbal and math scores was r = 0.707.
a) What is the correlation? The correlation is ____.
b) Write the equation of the line of regression predicting verbal scores from math scores.
c) In general, what would a positive residual mean in this context?
A. A positive residual means the student has the exact verbal score that the linear model would predict.
B. A positive residual means the student has a higher verbal score than the linear model would predict.
C. A positive residual means the student has a lower verbal score than the linear model would predict.
d) A person tells you her math score was 394. Predict her verbal score. The student is expected to have a verbal score of ____.
e) Using the predicted verbal score from part (d) and the regression equation, predict the student's math score.

a) 0.707 b) y = 188.382 + 0.676x c) B d) 454.726 e) 433.735
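A sketch of how parts (b), (d), and (e) fit together. It assumes the answer key rounds each slope and intercept to three decimals before predicting, which is what reproduces 188.382, 454.726, and 433.735; the helper name fit_line is illustrative. Part (e) uses a second line that regresses math on verbal rather than solving the first equation backwards, which is why the result is not 394.

```python
# Summary statistics from the question.
mean_math, sd_math = 473.4, 179.3
mean_verbal, sd_verbal = 508.4, 171.5
r = 0.707

def fit_line(mean_x, sd_x, mean_y, sd_y, r, digits=3):
    """Return (intercept, slope) for predicting y from x, rounded like the answer key."""
    slope = round(r * sd_y / sd_x, digits)
    intercept = round(mean_y - slope * mean_x, digits)
    return intercept, slope

# (b) and (d): predict verbal from math.
b0_v, b1_v = fit_line(mean_math, sd_math, mean_verbal, sd_verbal, r)
verbal_hat = b0_v + b1_v * 394

# (e): predict math from verbal with a separate regression line.
b0_m, b1_m = fit_line(mean_verbal, sd_verbal, mean_math, sd_math, r)
math_hat = b0_m + b1_m * verbal_hat

print(b0_v, b1_v)            # 188.382 0.676
print(round(verbal_hat, 3))  # 454.726
print(round(math_hat, 3))    # 433.735
```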

Question 4
a) Find the slope estimate, b1.
b) What does this b1 value mean, in this context?
A. It means that, on average, the total sales will equal b1 × $1000 multiplied by the number of sales people working.
B. It means that, on average, an additional increase of b1 × 1000 sales people working is associated with each additional dollar in sales.
C. It means that, on average, an additional increase of b1 × $1000 in sales is associated with each additional sales person working.
D. It means that, on average, the number of sales people working is approximately b1 × 1000 multiplied by the total sales.
c) Find the intercept, b0. b0 = ?
d) What does this b0 value mean, in this context? Does this value of b0 make sense?
A. It means that, on average, an additional increase of b0 × $1000 in sales is associated with each additional sales person working. It does not make sense, because the value of b0 is much larger than b1.
B. It would mean that, on average, the minimum amount of sales made from 2 sales people working is b0 × $1000. It makes sense, because b0 is greater than zero.
C. It means that, on average, the expected sales is b0 × $1000 with 0 sales people working. It does not make sense, because it is unlikely that any sales would be made with zero sales people working.
D. It means that, on average, the total sales will equal b0 × $1000 multiplied by the number of sales people working. It makes sense, because b0 is greater than zero.
e) Write down the equation that predicts Sales from Number of Sales People Working, using the variable x to represent Number of Sales People Working. sales = ____ + ____ x
f) If 19 people are working, what sales (in dollars) do you predict? The predicted sales are ____ dollars.
g) If sales are actually $24,000, what is the value of the residual? residual = ____ dollar(s)
h) Has the original estimate from part f overestimated or underestimated the sales?
A. overestimated
B. underestimated
C. neither overestimated nor underestimated

a) 0.782 b) C c) 8.776 d) C e) sales = 8.776 + 0.782x f) 23,634 g) 366 h) B (underestimated)
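Parts (f) through (h) follow directly from the fitted equation, remembering that sales are in thousands of dollars. A small sketch using the coefficients from the answer above (the function name is illustrative):

```python
b0, b1 = 8.776, 0.782  # intercept and slope, in thousands of dollars

def predicted_sales_dollars(people):
    """Predicted sales in dollars for a given number of sales people working."""
    return (b0 + b1 * people) * 1000

predicted = predicted_sales_dollars(19)
residual = 24_000 - predicted  # residual = observed - predicted

print(round(predicted), round(residual))  # 23634 366
# A positive residual means the model underestimated the actual sales.
```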

Question 15 A regression analysis of 117 homes for sale produced a model (given in the Pearson problem), where price is in thousands of dollars and size is in square feet.
a) Explain what the slope of the line says about housing prices and house size.
A. For every additional square foot of area of a house, the price is predicted to increase by $0.067.
B. For every additional square foot of area of a house, the price is predicted to increase by $67.
C. For every $1000 increase in price of a house, the size is predicted to increase by 0.067 square feet.
D. For every $1 increase in price of a house, the size is predicted to increase by 67 square feet.
b) What price would you predict for a 2500-square-foot house in this market?
c) A real estate agent shows a potential buyer a 1200-square-foot house, saying that the asking price is $6000 less than what one would expect to pay for a house of this size. What is the asking price?
d) What is the $6000 called?
A. Intercept
B. Residual
C. Slope
D. Predicted value

a) B b) $215,350 c) $122,250 d) B (residual)

Question 9
a) Select all the assumptions or conditions that are violated.
A. The Quantitative Variables Condition is violated.
B. The Outlier Condition is violated.
C. The Does the Plot Thicken Condition is violated.
D. The Straight Enough Condition is violated.
E. There are no assumptions or conditions that are violated.
b) Choose the correct answer below.
A. The capacity should be expressed in megabytes instead of terabytes and the regression performed again.
B. The prices should be expressed in cents instead of dollars and the regression performed again.
C. The high influence point should be removed and the regression performed again.
D. There are no issues with the regression.

a) B,D b) C

Question 16
a) Is the linear model appropriate here? Explain.
A. The linear model could be appropriate because the Tar coefficient is close to zero, meaning the Nicotine value does not vary much.
B. The linear model is not appropriate because the constant coefficient is not equal to zero.
C. The linear model could be appropriate. There is some curvature to the residuals but not enough to completely disregard the linear model. Some more data points may be required.
D. The linear model is not appropriate because the residuals are constantly decreasing.
b) Explain the meaning of R² in this context.
A. The predicted nicotine content is equal to some constant plus 92.4% of the tar content.
B. Around 92.4% of the data points fit the linear model.
C. Around 92.4% of the data points have a residual with magnitude less than the constant coefficient.
D. The linear model on tar content accounts for 92.4% of the variability in nicotine content.

a) C b) D

Question 19 The accompanying scatterplot shows the relationship between the percentage of teenagers who had used marijuana and the percentage of teenagers who had used other drugs in 11 countries. Summary statistics showed that the mean percent that had used marijuana was 23.7%, with a standard deviation of 15.8%. An average of 11.9% of teens had used other drugs, with a standard deviation of 10.0%.
a) Do you think a linear model is appropriate? Explain.
A. No. There are outliers in the plot.
B. Yes. While the relationship is weak, there is no reason to think that the linear model is not appropriate.
C. Yes. The plot shows a positive, linear, fairly strong relationship.
D. No. The plot shows a nonlinear pattern.
b) For this regression, R² is 79.6%. Interpret this statistic in this context. A linear model on the percentage of ____ use accounts for ____% of the variation in the percent use of ____.
c) Write the equation you would use to estimate the percentage of teens who use Other Drugs from the percentage who have used Marijuana.
d) Explain in context what the slope of this line means. The slope indicates that ____ increases, on average, by ____ for each percent increase in ____.
e) Do these results confirm that marijuana is a "gateway drug," that is, that marijuana use leads to the use of other drugs?
A. Since the value of R² is small, the results do not indicate that marijuana leads to other drug use.
B. The results indicate an association between marijuana and other drug use; however, association does not imply causation.
C. Since the value of R² is large, the results confirm that marijuana leads to other drug use.
D. The results do not show a strong association between marijuana and other drug use.

a) C b) marijuana, 79.6, other drugs c) predicted Other Drugs % = -1.573 + 0.565 × Marijuana % d) other drug use, 0.565, marijuana use e) B

Question 1 Determine if the following statements are True or False. If False, explain briefly.
a) Choose the linear model that passes through the most data points on the scatterplot.
A. True. Choose the linear model that passes through the most data points on the scatterplot.
B. False. The linear model line usually passes through exactly half of the data points.
C. False. All of the data points either touch the line or fall below the line.
D. False. The line usually touches none of the points. Minimize the sum of the squared errors.
b) The residuals are the observed y-values minus the y-values predicted by the linear model.
A. True. The residuals are the observed y-values minus the y-values predicted by the linear model.
B. False. The residuals are the observed x-values minus the x-values predicted by the linear model.
C. False. The residuals are the predicted y-values minus the y-values observed by the linear model.
D. False. The residuals are the observed y-values minus the mean y-value.
c) Least squares means that the square of the largest residual is as small as it could possibly be.
A. True. Least squares means that the square of the largest residual is as small as it could possibly be.
B. False. Least squares means that the product of the squares of all the residuals is minimized.
C. False. Least squares means that the sum of the squares of all the residuals is minimized.
D. False. Least squares means that the square of the median residual is minimized.

a) D b) A c) C
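To make parts (b) and (c) concrete, here is a small sketch on made-up data (the five points below are hypothetical, not from any question). It fits the least squares line, computes residuals as observed minus predicted, and checks that nudging the intercept or slope away from the fitted values only increases the sum of squared residuals.

```python
# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def least_squares(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sse(b0, b1):
    """Sum of squared residuals, where residual = observed y - predicted y."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

b0, b1 = least_squares(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
best = sse(b0, b1)

# Any nudge to the fitted line makes the sum of squares larger.
nudged = [sse(b0 + d0, b1 + d1)
          for d0, d1 in [(0.1, 0), (-0.1, 0), (0, 0.05), (0, -0.05)]]
print(best, min(nudged))
```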

Question 20
a) Using this information, describe the association between the costs of a cappuccino and a third of a liter of water. The association is (strong/moderate/weak), (positive/negative) and shaped ____.
b) The correlation is 0.655. Find and interpret the value of R². Select the correct choice below and fill in the answer box within your choice. (Round to one decimal place as needed.)
A. For every $1 increase in the price of a regular cappuccino, the price of a third of a liter of water increases, on average, by $____.
B. About ____% of the variation in cappuccino prices can be explained by using a linear model on water prices.
C. About ____% of the variation in water prices can be explained by using a linear model on cappuccino prices.
D. For every $1 increase in the price of a third of a liter of water, the price of a regular cappuccino increases, on average, by $____.
c) The regression equation predicting the cost of a cappuccino from the cost of a third of a liter of water is given in the Pearson problem. In a certain city, a third of a liter of water costs $0.53 and a cappuccino is $1.19. Calculate and interpret the residual for this city. The residual = ____. Interpret the residual: the price of a ____ in this city is ____ cents ____ than predicted.

a) moderate, positive, mostly linear but curving at the highest water prices b) 0.429, 42.9 c) -0.90, cappuccino, 90, less

Question 21
a) What is the correlation between CO2 and Temperature? r = ?
b) Explain the meaning of R-squared in this context.
A. A linear model on mean temperature accounts for 33.1% of the variation in CO2 levels.
B. A linear model on mean temperature accounts for 66.9% of the variation in CO2 levels.
C. A linear model on CO2 levels accounts for 66.9% of the variation in mean temperature.
D. A linear model on CO2 levels accounts for 33.1% of the variation in mean temperature.
d) What is the meaning of the slope of this equation?
A. For every 0.003 ppm increase in CO2 levels, the mean temperature increases by 1 °C.
B. For every degree that the mean temperature increases, CO2 levels increase by 0.003 ppm.
C. For every 1 ppm increase in CO2 levels, the mean temperature increases by 0.003 °C.
D. The slope does not have a meaningful interpretation in the context of this problem.
e) What is the meaning of the y-intercept of this equation?
A. For every 1 ppm increase in CO2 levels, the mean temperature increases by 0.003 °C.
B. When the global mean temperature is 0 °C, the CO2 level is 15.606 ppm.
C. When the CO2 level is 0 ppm, the global mean temperature will be 15.606 °C.
D. The y-intercept does not have a meaningful interpretation in the context of this problem.
f) View the accompanying scatterplot of the residuals vs. CO2. Does the scatterplot of the residuals vs. CO2 show evidence of the violation of any assumptions behind the regression?
A. Yes, the outlier condition is violated.
B. Yes, all the assumptions are violated.
C. Yes, the linearity and equal variance assumptions are violated.
D. Yes, the equal variance assumption is violated.
E. Yes, the linearity assumption is violated.
F. No, all assumptions are okay.
g) Suppose CO2 levels reach 362 ppm this year. What mean temperature does the regression predict from this information? 16.692 °C
h) Does the answer in part g mean that when CO2 levels hit 362 ppm, the temperature will reach the predicted level? Explain briefly.
A. No. The actual temperature will be 15.606 °C.
B. No. The actual temperature will be significantly higher than the predicted level.
C. No. The actual temperature is likely to be different than the predicted level.
D. Yes. The temperature will reach the predicted level when CO2 levels hit 362 ppm.

a) r = 0.575 b) D c) 15.606, 0.003 d) C e) D f) F g) 16.692 h) C
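Parts (a) and (g) can be checked from the numbers that appear in the options and answers: R² = 33.1%, intercept 15.606, slope 0.003. A brief sketch, taking the positive square root because the slope is positive:

```python
import math

b0, b1 = 15.606, 0.003  # intercept (°C) and slope (°C per ppm) from the answers
r_squared = 0.331       # R-squared from option D in part (b)

r = math.sqrt(r_squared)  # positive root, since the slope is positive
temp_hat = b0 + b1 * 362  # predicted mean temperature at 362 ppm

print(round(r, 3), round(temp_hat, 3))  # 0.575 16.692
```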

Question 6 (See the problem on Pearson.)
a) What are the units of the residuals? The residuals are in terms of: ____
b) Which residual contributes the most to the sum that was minimized according to the least squares criterion to find this regression? The residual that contributes the most to the sum is ____.
c) Which residual contributes the least to the sum that was minimized according to the least squares criterion to find this regression? The residual that contributes the least to the sum is ____.

a) thousands of dollars b) 2.81 c) .07

Question 7 (See the problem on Pearson.)
a) Which drive capacity contributes the most to the sum that is minimized by the least squares criterion? The drive with a capacity of ____ TB contributes the most to the sum of squared residuals.
b) Two of the residuals are negative. What does that mean about those drives? Be specific and use the correct units. A negative residual means that the drive costs less than what might be expected from this model and its capacity. A residual of -15.58 indicates a drive that costs $15.58 less than expected.

a) 3

