AP STAT -- unit 2

Data were collected on two variables, x and y, to create a model to predict y from x. A scatterplot of the collected data revealed a curved pattern with a possible cubic relationship (y = ax^3, where a is a constant) between the variables. Which of the following transformations would be most appropriate for creating linearity between the variables? A Taking the cube of y B Taking the cube root of y C Taking the cube root of both y and x D Taking the log of y E Taking the log of both y and x

Answer E Correct. Variables related by a power, such as y = x^3, are best transformed by taking the log of both variables. For y = ax^3, log(y) = log(a) + 3 log(x), which is linear in log(x).
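A minimal sketch of the log-log transformation, assuming made-up data that follow y = ax^3 exactly with a = 2 (the constant, the x-values, and the helper `least_squares` are illustrative, not from the question). After taking the log of both variables, the fitted slope should come out close to the power, 3.

```python
import math

# Hypothetical data following y = a * x**3 with a = 2 (illustrative values only).
a = 2.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [a * x**3 for x in xs]

# Transform both variables with the natural log.
log_x = [math.log(x) for x in xs]
log_y = [math.log(y) for y in ys]

def least_squares(u, v):
    """Return (slope, intercept) of the least-squares line v = intercept + slope * u."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    slope = sxy / sxx
    return slope, mean_v - slope * mean_u

slope, intercept = least_squares(log_x, log_y)

# The slope estimates the power; exp(intercept) estimates the constant a.
print(slope, math.exp(intercept))
```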

A researcher studying a specific type of tree creates a least-squares regression line for relating the height and the diameter, both in meters, of a fully grown tree. The results are shown in the following computer output. Which of the following values represents the predicted change in the height of the tree for each one-meter increase in the diameter of the tree? A 30 B 5 C 4 D 2.5 E 1/30

Answer A Correct. The value 30 is the slope of the model; it represents the predicted change in height for each one-meter increase in diameter.

A set of bivariate data was used to create a least-squares regression line. Which of the following is minimized by the line? A The sum of the residuals B The sum of the squared residuals C The sum of the absolute values of the residuals D The influence of outliers E The slope

Answer B Correct. The least-squares regression line minimizes the sum of the squares of the residuals.
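A short sketch of what "minimizes the sum of the squared residuals" means, assuming a small made-up data set (the values and the function name `sum_squared_residuals` are illustrative). It fits the line with the closed-form formulas, then checks that nudging the slope or intercept only makes the sum larger.

```python
# Hypothetical bivariate data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def sum_squared_residuals(b, a):
    """Sum of squared residuals for the line y = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

best = sum_squared_residuals(slope, intercept)

# Nudge the slope and intercept; every nearby line should do worse.
nearby = [
    sum_squared_residuals(slope + db, intercept + da)
    for db in (-0.1, 0.0, 0.1)
    for da in (-0.1, 0.0, 0.1)
    if (db, da) != (0.0, 0.0)
]
print(best, min(nearby))
```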

Which of the following statements about a least-squares regression analysis is true? I) A point with a large residual is an outlier. II) A point with high leverage has a y-value that is not consistent with the other y-values in the set. III) The removal of an influential point from a data set could change the value of the correlation coefficient. A I only B II only C III only D I and III only E I, II, and III

Answer C Correct. An influential point is one for which its removal from the set can have a substantial effect on the correlation.
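To illustrate statement III, here is a sketch with invented points, where the last point sits far from the rest in the x-direction. The data and the helper `correlation` are assumptions made for the example; the idea is only that dropping one point can change r substantially.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data; (20, 15) is the candidate influential point.
points = [(1, 2), (2, 1), (3, 3), (4, 2), (5, 3), (20, 15)]

r_all = correlation([p[0] for p in points], [p[1] for p in points])

# Remove the last point and recompute.
trimmed = points[:-1]
r_trimmed = correlation([p[0] for p in trimmed], [p[1] for p in trimmed])

print(round(r_all, 2), round(r_trimmed, 2))
```

With these made-up values the correlation is much weaker once the far-right point is removed.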

The relationship between carbon dioxide emissions and fuel efficiency of a certain car can be modeled by the least-squares regression equation ln(ŷ) = 7 − 0.045x, where x represents the fuel efficiency, in miles per gallon, and ŷ represents the predicted carbon dioxide emissions, in grams per mile. Which of the following is closest to the predicted carbon dioxide emissions, in grams per mile, for a car of this type with a fuel efficiency of 20 miles per gallon? A 1.8 B 6.1 C 446 D 2,697 E 1,250,000

Answer C Correct. When 20 is substituted for x, the resulting value on the right side of the equation is 6.1. The value of approximately 446 results from raising e to the power of 6.1 (that is, e^6.1).
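The back-transformation can be checked with a few lines of Python. This is a sketch of the arithmetic for the model ln(ŷ) = 7 − 0.045x; the function name `predicted_emissions` is just an illustrative label.

```python
import math

def predicted_emissions(mpg):
    """Predicted CO2 emissions (grams per mile) from ln(y-hat) = 7 - 0.045 * mpg."""
    ln_y_hat = 7 - 0.045 * mpg
    # Undo the natural log by exponentiating.
    return math.exp(ln_y_hat)

print(round(predicted_emissions(20)))  # 446
```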

The height h and collar size c, both in centimeters, measured from a sample of boys were used to create the regression line ĉ = −94 + 0.9h. The line is used to predict collar size from height, both in centimeters, for boys' shirt collars. Which of the following has no logical interpretation in context? A The predicted collar size of a boy with height 140 cm B The h values in the sample C The c values in the sample D The slope of the regression line E The c-intercept of the regression line

Answer E Correct. The c-intercept of the regression line can be determined when the value of h is 0 centimeters. It is not meaningful to predict the collar size of a boy with a height of 0 centimeters.

The following data were collected from a random sample of people on their favorite types of leisure activities and their age. The results are shown in the two-way table below. What proportion of the people aged 7 to 18 years gave watching television as their favorite type of leisure activity? A 300/2,200 B 200/900 C 100/1,300 D 640/3,500 E 300/640

Answer A Correct. There are 900 + 1,300 = 2,200 people aged 7 to 18 years. 200 of those aged 7 to 12 years and 100 of those aged 13 to 18 years gave watching television as their favorite leisure activity, so 200 + 100 = 300 of the people aged 7 to 18 years gave watching television as their favorite leisure activity. Thus the proportion of people aged 7 to 18 years who gave watching television as their favorite leisure activity is 300/2,200.

The least-squares regression line Ŝ = 0.5 + 1.1L models the relationship between the listing price and the actual sales price of 12 houses, with both amounts given in hundred-thousands of dollars. Let L represent the listing price and S represent the sales price. Which of the following is the best interpretation of the slope of the regression line? A For each hundred-thousand-dollar increase in the listing price, the sales price will increase by $1.1. B For each hundred-thousand-dollar increase in the listing price, the sales price will increase by $110,000. C For each hundred-thousand-dollar increase in the listing price, the sales price will decrease by $110,000. D For each hundred-thousand-dollar increase in the listing price, the sales price is predicted to increase by $1.1. E For each hundred-thousand-dollar increase in the listing price, the sales price is predicted to increase by $110,000.

Answer E

A researcher collected data on the age, in years, and the growth of sea turtles. The following graph is a residual plot of the regression of growth versus age. Does the residual plot support the appropriateness of a linear model? A Yes, because there is a clear pattern displayed in the residual plot. B Yes, because about half the residuals are positive and the other half are negative. C Yes, because as age increases, the residuals increase. D No, because the points appear to be randomly distributed. E No, because the graph displays a U-shaped pattern.

Answer E Correct. A U-shaped pattern is evidence that a linear model is not appropriate.

The following table shows the data collected from students in grades 3 and 4 in an elementary school about their favorite types of pets. Which of the following statements is supported by the table? A Dogs were the type of pet chosen most often by the students at the elementary school. B There were more students surveyed in grade 3 than in grade 4. C The percentage of students who chose cats as their favorite pet was 60%. D The percentage of students in grade 3 who chose dogs as their favorite pet was 55%. E Birds were the type of pet chosen least often by the students in grade 4.

Answer A Correct. The number of dogs (110) is greater than the number of cats (60), birds (20), or other pets (25), so dogs were chosen most often.

A tennis ball was thrown in the air. The height of the ball from the ground was recorded every millisecond from the time the ball was thrown until it reached the height from which it was thrown. The correlation between the time and height was computed to be 0. What does this correlation suggest about the relationship between the time and height? A There is no relationship between time and height. B There is no linear relationship between time and height. C The distance the ball traveled upward is the same as the distance the ball traveled downward. D The correlation suggests that there is measurement or calculation error. E The correlation suggests that more measurements should be taken to better understand the relationship.

Answer B Correct. A correlation of 0 suggests that there is no linear relationship. There may still be a non-linear relationship, perhaps a quadratic relationship in this example.
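A sketch of why a perfectly quadratic path can still give a correlation of 0. The height function h(t) = 20t − 5t^2 and the sampling times are assumptions chosen for illustration (the question gives no numbers); the ball starts and ends at the same height, so the path is symmetric in time.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical flight: h(t) = 20t - 5t^2, sampled from launch (t = 0)
# until the ball is back at its launch height (t = 4).
times = [0.5 * k for k in range(9)]
heights = [20 * t - 5 * t**2 for t in times]

r = correlation(times, heights)
print(r)  # essentially zero, even though height is an exact function of time
```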

Dairy farmers are aware there is often a linear relationship between the age, in years, of a dairy cow and the amount of milk produced, in gallons per week. The least-squares regression line produced from a random sample is predicted Milk = 40.8 − 1.1(Age). Based on the model, what is the difference in predicted amounts of milk produced between a cow of 5 years and a cow of 10 years? A A cow of 5 years is predicted to produce 5.5 fewer gallons per week. B A cow of 5 years is predicted to produce 5.5 more gallons per week. C A cow of 5 years is predicted to produce 1.1 fewer gallons per week. D A cow of 5 years is predicted to produce 1.1 more gallons per week. E A cow of 5 years and a cow of 10 years are both predicted to produce 40.8 gallons per week.

Answer B Correct. The difference of 5 years produces a 5.5 gallon per week difference in favor of the younger cow.
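The difference can be verified directly from the regression line. This is a small sketch of that arithmetic; the function name `predicted_milk` is an illustrative label.

```python
def predicted_milk(age_years):
    """Predicted milk (gallons per week) from the line 40.8 - 1.1 * age."""
    return 40.8 - 1.1 * age_years

# Positive result means the 5-year-old cow is predicted to produce more.
difference = predicted_milk(5) - predicted_milk(10)
print(round(difference, 1))  # 5.5
```

The intercept cancels in the subtraction, so the difference is just the slope times the age gap: 1.1 × 5 = 5.5.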

The following is a residual plot for a linear regression of y versus x. What is indicated by the plot? A A linear model is appropriate. B A linear model is not appropriate. C Variability in y is constant for all values x. D At least one point is influential with respect to the regression. E At least one point is an outlier with respect to the regression.

Answer B Correct. The pattern in the plot indicates that the linear model is not appropriate.

A restaurant manager collected data to predict monthly sales for the restaurant from monthly advertising expenses. The model created from the data showed that 36 percent of the variation in monthly sales could be explained by monthly advertising expenses. What was the value of the correlation coefficient? A 0.64 B 0.60 C 0.40 D 0.36 E 0.13

Answer B Correct. The proportion of the variation explained by the explanatory variable is the coefficient of determination r^2, and the correlation coefficient is the square root of r^2. In this case, the coefficient of determination is 36 percent, meaning r^2 = 0.36, so r = √0.36 = 0.60. The correlation coefficient has the value 0.60.
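A quick check of the square-root step in Python. Note the assumption labeled in the code: the square root only gives the magnitude of r, and the sign has to come from the direction of the association, which is taken to be positive here because every answer choice is positive.

```python
import math

r_squared = 0.36  # proportion of variation in sales explained by advertising

# The square root gives the magnitude of the correlation.
r_magnitude = math.sqrt(r_squared)

# Assumption: the association is positive, so r takes the positive root.
r = +r_magnitude
print(round(r, 2))  # 0.6
```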

The following data were collected from a random sample of people, who identified their favorite type of juice. The results are shown in the following two-way table. What proportion of the children identified orange as their favorite type of juice? A 400/1,000 B 400/700 C 400/2,000 D 600/1,300 E 1,000/2,000

Answer B Correct. There are 700 children, and 400 of the 700 children identified orange as their favorite type of juice, so 400/700 is the proportion of children who identified orange as their favorite type of juice.

A restaurant manager collected data on the number of customers in a party in the restaurant and the time elapsed until the party left the restaurant. The manager computed a correlation of 0.78 between the two variables. What information does the correlation provide about the relationship between the number of customers in a party at the restaurant and the time elapsed until the party left the restaurant? A The relationship is linear because the correlation is positive. B The relationship is not linear because the correlation is positive. C The parties with a larger number of customers are associated with the longer times elapsed until the party left the restaurant. D The parties with a larger number of customers are associated with the shorter times elapsed until the party left the restaurant. E There is no relationship between the number of customers in a party at a table in the restaurant and the time required until the party left the restaurant.

Answer C Correct. A positive correlation indicates that as values of one variable increase, the values of the other variable tend to increase.

A researcher in Alaska measured the age (in months) and the weight (in pounds) of a random sample of adolescent moose. When the least-squares regression analysis was performed, the correlation was 0.59. Which of the following is the correct way to label the correlation? A 0.59 months per pound B 0.59 pounds per month C 0.59 D 0.59 months times pounds E 0.59 month pounds

Answer C Correct. The correlation r is unit-free.

The following segmented bar chart shows the number of flights that were either on time or delayed at three different airports on one day. Which of the following statements is supported by the bar chart? A Airport T has the greatest percentage of on-time flights compared to the other two airports. B Airport R has the least percentage of on-time flights compared to the other two airports. C The number of on-time flights at Airport S is half the number of on-time flights at Airport T. D The number of on-time flights at Airport R is less than the number of on-time flights at Airport S. E The number of flights at Airport T is equal to the total number of flights at Airports R and S combined.

Answer C Correct. Airport S has 100 on-time flights, and Airport T has 200 on-time flights. Since 100 is one-half of 200, Airport S has one-half the number of on-time flights that Airport T has.

A study was conducted on three types of home siding and the type of damage done to the siding by woodpeckers. Each hole made by a woodpecker was classified as either drumming (territorial signaling), foraging (looking for food), or nesting. The following bar chart shows the relative frequency of the holes for each type of siding. Which of the following statements is supported by the bar chart? A The proportion of holes created for drumming is the same for all three siding types. B The proportion of holes created for drumming is greatest for grooved plywood. C The proportion of holes created for drumming is least for grooved plywood. D The number of holes created for drumming is least for grooved plywood. E The number of holes created for drumming is greatest for nonwood.

Answer C Correct. The proportion of holes created by woodpeckers for drumming is around 30% for clapboard, 10% for grooved plywood, and 70% for nonwood. Therefore, the proportion of holes created for drumming is least for grooved plywood.

For a random sample of 20 professional athletes, there is a strong, linear relationship between the number of hours they exercise per week and their resting heart rate. For the athletes in the sample, those who exercise more hours per week tend to have lower resting heart rates than those who exercise less. Which of the following is a reasonable value for the correlation between the number of hours athletes exercise per week and their resting heart rate? A 0.71 B 0.00 C −0.14 D −0.87 E −1.00

Answer D Correct. A correlation of −0.87 suggests a strong negative relationship.

An engineer believes that there is a linear relationship between the thickness of an air filter and the amount of particulate matter that gets through the filter; that is, less pollution should get through thicker filters. The engineer tests many filters of different thickness and fits a linear model. If a linear model is appropriate, what should be apparent in the residual plot? A There should be a positive, linear association in the residual plot. B There should be a negative, linear association in the residual plot. C All of the points must have residuals of 0. D There should be no pattern in the residual plot. E The residuals should have a small amount of variability for low values of the predictor variable and larger amounts of variability for high values of the predictor variable.

Answer D Correct. Apparent randomness in a residual plot for a linear model is evidence of a linear association between the variables.

An exponential relationship exists between the explanatory variable and the response variable in a set of data. The common logarithm of each value of the response variable is taken, and the least-squares regression line has an equation of log(ŷ) = 7.3 − 1.5x. Which of the following is closest to the predicted value of the response variable for x = 4.8? A 0.1 B 0.68 C 1.105 D 1.26 E 14.5

Answer D Correct. Substituting x = 4.8 into the equation gives log(ŷ) = 7.3 − 1.5(4.8), or log(ŷ) = 0.1. To solve for ŷ, raise 10 to the power of 0.1 to get 1.26.
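Because the model uses the common (base-10) logarithm, the prediction is recovered with a power of 10 rather than e. A sketch of the calculation, with `predicted_response` as an illustrative function name:

```python
def predicted_response(x):
    """Predicted response from log10(y-hat) = 7.3 - 1.5 * x."""
    log10_y_hat = 7.3 - 1.5 * x
    # Undo the common log by raising 10 to that power.
    return 10 ** log10_y_hat

print(round(predicted_response(4.8), 2))  # 1.26
```

Using e instead of 10 here would give about 1.105, which is distractor C.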

The following bar chart displays the relative frequency of responses of students, by grade level, when asked, "Do you volunteer in a community-service activity?" Which of the following statements is not supported by the bar chart? A More than 60% of both tenth-grade and eleventh-grade students responded yes. B Twelfth-grade students had the least percentage of students respond yes. C Less than 40% of tenth-grade students responded no. D The number of tenth-grade students who responded yes was greater than the number of ninth-grade students who responded yes. E The percentage of eleventh-grade students who responded no was less than the percentage of ninth-grade students who responded no.

Answer D Correct. The statement is not supported by the bar chart. The graph only gives information about the percentage of students who responded yes or no. The graph gives no information on the number of students who responded.

The following table shows data that were collected from a random sample of people, who indicated their age and their favorite sporting event to watch on television. Based on the results above, what proportion of the randomly sampled people are over age 12 years? A 900/3,500 B 1,300/3,500 C 1,200/3,500 D 2,300/3,500 E 1,000/3,500

Answer D Correct. There are 1,300 people aged 13 to 18 years and 1,000 people aged 19 years and older, so there are 1,300 + 1,000 = 2,300 people in the sample who are over age 12. The proportion of people over age 12 in the sample is 2,300/3,500.

The following scatterplot shows two variables, x and y, along with a least-squares model. Which of the following is a high leverage point with respect to the regression? A (5,8) B (20,31) C (27,22) D (30,60) E (80,70)

Answer E

A botanist found a correlation between the length of an aspen leaf and its surface area to be 0.94. Why does the correlation value of 0.94 not necessarily indicate that a linear model is the most appropriate model for the relationship between length of an aspen leaf and its surface area? A The value must be exactly 1 or −1 to indicate a linear model is the most appropriate model. B The value must be 0 to indicate a linear model is the most appropriate model. C A causal relationship should be established first before determining the most appropriate model. D The value of 0.94 implies that only 88% of the data have a linear relationship. E Even with a correlation value of 0.94, it is possible that the relationship could still be better represented by a nonlinear model.

Answer E Correct. A value close to 1 or −1 does indicate a strong linear relationship; however, it does not necessarily mean that a linear model is the best fit for the data (e.g., an exponential or quadratic model might be more appropriate).

In a study to determine whether miles driven is a good predictor of trade-in value, 11 cars of the same age, make, model, and condition were randomly selected. The following scatterplot shows trade-in value and mileage for those cars. Five of the points are labeled A, B, C, D, and E, respectively. Which of the five labeled points is the most influential with respect to a regression of trade-in value versus miles driven? A A B B C C D D E E

Answer E Correct. Point E does not follow the trend with respect to the other data and is probably an outlier. The value of the car is much higher than other cars with similar miles driven.

A penalty kick in soccer involves two players from different teams, the shooter and the goalie. During the penalty kick the shooter will try to score a goal by kicking a soccer ball to the left or right of the goal area. To prevent the shooter from scoring a goal, the goalie will move to the left or right of the goal area. The following table summarizes the directions taken by the shooter and the goalie for 372 penalty kicks. Which of the following indicates an association between the shooter's choice of direction and the goalie's choice of direction? A The marginal relative frequencies for the shooter and the goalie are equal. B The marginal relative frequencies for the shooter and the goalie are not equal. C The row totals are not equal. D For the goalie, the relative frequency of a direction is equal to the relative frequency conditioned on the shooter's direction. E For the goalie, the relative frequency of a direction is not equal to the relative frequency conditioned on the shooter's direction.

Answer E Correct. The goalie moved left 212 times, giving a relative frequency of 212/372 ≈ 0.57. If there were no association, the relative frequency of moving left conditioned on the direction of the shooter should be very close to 0.57. However, 80/165 ≈ 0.48 and 132/207 ≈ 0.64. It is likely that there is an association between the two variables.
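The comparison of marginal and conditional relative frequencies can be sketched as follows. The counts come from the explanation above, but the table itself is not reproduced here, so which shooter direction goes with 165 kicks and which with 207 is unknown; the keys "direction 1" and "direction 2" are placeholders.

```python
# Counts of kicks where the goalie moved left, split by the shooter's direction.
# "direction 1" / "direction 2" are placeholder labels (the table is not shown).
goalie_left = {"direction 1": 80, "direction 2": 132}
shooter_totals = {"direction 1": 165, "direction 2": 207}

# Marginal relative frequency: goalie moved left, ignoring the shooter.
marginal = sum(goalie_left.values()) / sum(shooter_totals.values())

# Conditional relative frequencies: goalie moved left, given each shooter direction.
conditional = {
    direction: goalie_left[direction] / shooter_totals[direction]
    for direction in shooter_totals
}

print(round(marginal, 2))
for direction, freq in conditional.items():
    print(direction, round(freq, 2))
```

If the two conditional values both matched the marginal value, that would suggest no association; here they differ noticeably from it.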

The table shows several values of x and their corresponding values of y. Which of the following is closest to the correlation between x and y? A −0.98 B −0.95 C 0.20 D 0.95 E 0.98

Answer E Correct. Using technology or the formula for correlation gives this answer.

