AP STAT Chapter 3
The table shows several values of x and their corresponding values of y. Which of the following is closest to the correlation between x and y? A −0.98 B −0.95 C 0.20 D 0.95 E 0.98
Answer E (0.98) Correct. Using technology or the formula for correlation gives this value.
There is a linear relationship between the number of chirps made by the striped ground cricket and the air temperature. A least squares fit of some data collected by a biologist gives the model ŷ = 25.2 + 3.3x for 9 < x < 25, where x is the number of chirps per minute and ŷ is the estimated temperature in degrees Fahrenheit. What is the estimated increase in temperature that corresponds to an increase of 5 chirps per minute? A 3.3 °F B 16.5 °F C 25.2 °F D 28.5 °F E 41.7 °F
B 16.5 ° F
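A minimal sketch of the arithmetic behind this answer, using only the model given in the question. The function name `predict_temp` and the two chirp rates are illustrative choices, not part of the original problem.

```python
# Model from the question: y_hat = 25.2 + 3.3 * x  (x = chirps per minute)
INTERCEPT = 25.2
SLOPE = 3.3

def predict_temp(chirps_per_min: float) -> float:
    """Predicted temperature (degrees F) for a given chirp rate."""
    return INTERCEPT + SLOPE * chirps_per_min

# Any two chirp rates that are 5 apart (inside 9 < x < 25) differ by slope * 5.
increase = predict_temp(20) - predict_temp(15)
print(round(increase, 1))  # 16.5
```

The intercept cancels in the subtraction, which is why only the slope matters for a change in x.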
Consider n pairs of numbers (x1,y1), (x2,y2), ..., and (xn, yn). The mean and standard deviation of the x-values are x̄ =5 and sx = 4, respectively. The mean and standard deviation of the y-values are ȳ = 10 and sy = 10 respectively. Of the following, which could be the least squares regression line? A ŷ = -5.0 + 3.0x B ŷ = 3.0x C ŷ = 5.0 + 2.5x D ŷ = 8.5 + 0.3x E ŷ = 10.0 + 0.4x
D ŷ = 8.5 + 0.3x
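One way to check the choices, sketched under two standard facts: a least-squares line passes through (x̄, ȳ), and its slope equals r·sy/sx, so its magnitude cannot exceed sy/sx. The helper name `could_be_lsrl` is made up for this illustration.

```python
# Summary statistics from the question.
x_bar, s_x = 5.0, 4.0
y_bar, s_y = 10.0, 10.0

# Answer choices as (intercept, slope).
candidates = {
    "A": (-5.0, 3.0),
    "B": (0.0, 3.0),
    "C": (5.0, 2.5),
    "D": (8.5, 0.3),
    "E": (10.0, 0.4),
}

def could_be_lsrl(intercept: float, slope: float) -> bool:
    # The line must pass through the point of means (x_bar, y_bar).
    passes_through_means = abs(intercept + slope * x_bar - y_bar) < 1e-9
    # slope = r * s_y / s_x with -1 <= r <= 1, so |slope| <= s_y / s_x.
    slope_in_range = abs(slope) <= s_y / s_x + 1e-9
    return passes_through_means and slope_in_range

feasible = [label for label, (a, b) in candidates.items() if could_be_lsrl(a, b)]
print(feasible)  # ['D']
```

Choice A also passes through (5, 10), but its slope of 3.0 would require r = 1.2, which is impossible.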
For which of the following scatterplots is the correlation between x and y closest to 0 ?
E
Which of the following scatterplots could represent a data set with a correlation coefficient of r = -1?
A
A researcher studying a specific type of tree creates a least-squares regression line for relating the height and the diameter, both in meters, of a fully grown tree. The results are shown in the following computer output. Which of the following values represents the predicted change in the height of the tree for each one-meter increase in the diameter of the tree? A 30 B 5 C 4 D 2.5 E 130
Answer A Correct. The value 30 is the slope of the model; it represents the predicted change in height for each one-meter increase in diameter.
Dairy farmers are aware there is often a linear relationship between the age, in years, of a dairy cow and the amount of milk produced, in gallons per week. The least-squares regression line produced from a random sample is predicted Milk = 40.8 − 1.1(Age). Based on the model, what is the difference in predicted amounts of milk produced between a cow of 5 years and a cow of 10 years? A A cow of 5 years is predicted to produce 5.5 fewer gallons per week. B A cow of 5 years is predicted to produce 5.5 more gallons per week. C A cow of 5 years is predicted to produce 1.1 fewer gallons per week. D A cow of 5 years is predicted to produce 1.1 more gallons per week. E A cow of 5 years and a cow of 10 years are both predicted to produce 40.8 gallons per week.
Answer B Correct. The difference of 5 years produces a 5.5 gallon per week difference in favor of the younger cow.
The least-squares regression line ŷ = 1.8 − 0.2x summarizes the relationship between velocity, in feet per second, and depth, in feet, in measurements taken for a certain river, where x represents velocity and y represents the depth of the river. What is the predicted value of y, in feet, when x = 5? A −16 B −1 C −0.2 D 0.8 E 1.8
Answer D Correct. By substituting 5 for x, the value of 1.8 − 0.2(5) is equal to 0.8.
The following scatterplot shows two variables, x and y, along with a least-squares model. Which of the following is a high leverage point with respect to the regression? A (5, 8) B (20, 31) C (27, 22) D (30, 60) E (80, 70)
Answer E Correct. A high leverage point is one that has a substantially larger or smaller x-value than the other observations. The x-value of 80 is substantially larger than the other x-values that occur between 5 and 40.
A field researcher who studies lions conjectured that the more time a cub spends playing, the sooner the cub will begin to hunt. Observational data were collected from 20 lion cubs. The researcher recorded how long they spent playing and the age when they began hunting. Because male and female lions have different hunting behaviors, the researcher recorded the data for males and females separately. The two scatterplots show the data for the 10 female lions and the 10 male lions. A For female cubs only B For male cubs only C For both male cubs and female cubs, with equal evidence D For both male cubs and female cubs, with more evidence for female cubs than for male cubs E For neither male cubs nor female cubs
A For female cubs only
A set of bivariate data was used to create a least-squares regression line. Which of the following is minimized by the line? A The sum of the residuals B The sum of the squared residuals C The sum of the absolute values of the residuals D The influence of outliers E The slope
Answer B Correct. The least-squares regression line minimizes the sum of the squares of the residuals.
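A small sketch of what "minimizes the sum of squared residuals" means in practice. The data values are made up for illustration, and the helper names `fit_least_squares` and `sse` are hypothetical; the slope and intercept formulas are the standard ones.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for the line intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, roughly linear.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_least_squares(xs, ys)
best = sse(xs, ys, a, b)

# Nudge the intercept and slope in every direction; each nudged line does worse.
nudged = [
    sse(xs, ys, a + da, b + db)
    for da in (-0.5, 0.0, 0.5)
    for db in (-0.2, 0.0, 0.2)
    if (da, db) != (0.0, 0.0)
]
print(all(best < other for other in nudged))  # True
```

Note that the plain sum of residuals is not what is minimized: for the least-squares line that sum is zero, which is a consequence of the fit rather than its criterion.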
The following is a residual plot for a linear regression of y versus x. What is indicated by the plot? A A linear model is appropriate. B A linear model is not appropriate. C Variability in y is constant for all values of x. D At least one point is influential with respect to the regression. E At least one point is an outlier with respect to the regression.
Answer B Correct. The pattern in the plot indicates that the linear model is not appropriate.
A researcher in Alaska measured the age (in months) and the weight (in pounds) of a random sample of adolescent moose. When the least-squares regression analysis was performed, the correlation was 0.59. Which of the following is the correct way to label the correlation? A 0.59 months per pound B 0.59 pounds per month C 0.59 D 0.59 months times pounds E 0.59 month pounds
Answer C Correct. The correlation r is unit-free.
For a random sample of 20 professional athletes, there is a strong, linear relationship between the number of hours they exercise per week and their resting heart rate. For the athletes in the sample, those who exercise more hours per week tend to have lower resting heart rates than those who exercise less. Which of the following is a reasonable value for the correlation between the number of hours athletes exercise per week and their resting heart rate? A 0.71 B 0.00 C −0.14 D −0.87 E −1.00
Answer D Correct. A correlation of −0.87 suggests a strong negative relationship.
An engineer believes that there is a linear relationship between the thickness of an air filter and the amount of particulate matter that gets through the filter; that is, less pollution should get through thicker filters. The engineer tests many filters of different thickness and fits a linear model. If a linear model is appropriate, what should be apparent in the residual plot? A There should be a positive, linear association in the residual plot. B There should be a negative, linear association in the residual plot. C All of the points must have residuals of 0. D There should be no pattern in the residual plot. E The residuals should have a small amount of variability for low values of the predictor variable and larger amounts of variability for high values of the predictor variable
Answer D Correct. Apparent randomness in a residual plot is evidence that a linear model is appropriate for the relationship between the variables.
An exponential relationship exists between the explanatory variable and the response variable in a set of data. The common logarithm of each value of the response variable is taken, and the least-squares regression line has an equation of log(ŷ) = 7.3 − 1.5x. Which of the following is closest to the predicted value of the response variable for x = 4.8? A 0.1 B 0.68 C 1.105 D 1.26 E 14.5
Answer D Correct. Substituting x = 4.8 into the equation gives log(ŷ) = 7.3 − 1.5(4.8), or log(ŷ) = 0.1. To solve for ŷ, raise 10 to the power of 0.1 to get 1.26.
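The back-transformation can be sketched in a couple of lines. The variable names are illustrative; the numbers come straight from the question.

```python
# Fitted line on the log scale: log10(y_hat) = 7.3 - 1.5 * x
x = 4.8
log10_y_hat = 7.3 - 1.5 * x

# Undo the common (base-10) logarithm to return to the original units.
y_hat = 10 ** log10_y_hat
print(round(y_hat, 2))  # 1.26
```

Stopping at 0.1 (choice A) is the common slip: that value is log(ŷ), not ŷ.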
The least-squares regression line Ŝ = 0.5 + 1.1L models the relationship between the listing price and the actual sales price of 12 houses, with both amounts given in hundred-thousands of dollars. Let L represent the listing price and S represent the sales price. Which of the following is the best interpretation of the slope of the regression line? A For each hundred-thousand-dollar increase in the listing price, the sales price will increase by $1.1. B For each hundred-thousand-dollar increase in the listing price, the sales price will increase by $110,000. C For each hundred-thousand-dollar increase in the listing price, the sales price will decrease by $110,000. D For each hundred-thousand-dollar increase in the listing price, the sales price is predicted to increase by $1.1. E For each hundred-thousand-dollar increase in the listing price, the sales price is predicted to increase by $110,000.
Answer E Correct. Nondeterministic language (predicted, on average, estimated, etc.) is used, and the correct increase is indicated.
A botanist found a correlation between the length of an aspen leaf and its surface area to be 0.94. Why does the correlation value of 0.94 not necessarily indicate that a linear model is the most appropriate model for the relationship between length of an aspen leaf and its surface area? A The value must be exactly 1 or −1 to indicate a linear model is the most appropriate model. B The value must be 0 to indicate a linear model is the most appropriate model. C A causal relationship should be established first before determining the most appropriate model. D The value of 0.94 implies that only 88% of the data have a linear relationship. E Even with a correlation value of 0.94, it is possible that the relationship could still be better represented by a nonlinear model.
E Correct. A value close to 1 or −1 does indicate a strong linear relationship; however, it does not necessarily mean that a linear model is the best fit for the data (e.g., an exponential or quadratic model might be more appropriate).
Clear-cut harvesting of wood from forests creates long periods of time when certain animals cannot use the forests as habitats. Partial-cut harvesting is increasingly used to lessen the effects of logging on the animals. The following scatterplot shows the relationship between the density of red squirrels, in squirrels per plot, 2 to 4 years after partial-cut harvesting, and the percent of trees that were harvested in each of 11 forests. Which of the following is the best description of the relationship displayed in the scatterplot? A Negative, linear, and strong B Positive, linear, and weak C Negative, nonlinear, and strong D Positive, nonlinear, and weak E Positive, nonlinear, and strong
A Negative, linear, and strong
A roadrunner is a desert bird that tends to run instead of fly. While running, the roadrunner uses its tail as a balance. A sample of 10 roadrunners was taken, and the birds' total length, in centimeters (cm), and tail length, in cm, were recorded. The output shown in the table is from a least-squares regression to predict tail length given total length. Suppose a roadrunner has a total length of 59.0 cm and tail length of 31.1 cm. Based on the residual, does the regression model overestimate or underestimate the tail length of the roadrunner? A Underestimate, because the residual is positive. B Underestimate, because the residual is negative. C Overestimate, because the residual is positive. D Overestimate, because the residual is negative. E Neither, because the residual is 0.
A Underestimate, because the residual is positive.
A researcher collected data on the age, in years, and the growth of sea turtles. The following graph is a residual plot of the regression of growth versus age. Does the residual plot support the appropriateness of a linear model? A Yes, because there is a clear pattern displayed in the residual plot. B Yes, because about half the residuals are positive and the other half are negative. C Yes, because as age increases, the residuals increase. D No, because the points appear to be randomly distributed. E A U-shaped pattern is evidence that a linear model is not appropriate.
Answer E Correct. A U-shaped pattern is evidence that a linear model is not appropriate.
Which of the following is the best description of a positive association between two variables? A The values will create a line when graphed on a scatterplot. B The values will create a line with positive slope when graphed on a scatterplot. C As the value of one of the variables increases, the value of the other variable tends to decrease. D As the value of one of the variables increases, the value of the other variable tends to increase. E All values of both variables are positive.
Answer D Correct. A positive association indicates a tendency for both variables to move in the positive direction.
For a specific species of fish in a pond, a wildlife biologist wants to build a regression equation to predict the weight of a fish based on its length. The biologist collects a random sample of this species of fish and finds that the lengths vary from 0.75 to 1.35 inches. The biologist uses the data from the sample to create a single linear regression model. Would it be appropriate to use this model to predict the weight of a fish of this species that is 3 inches long? A Yes, because 3 inches falls above the maximum value of lengths in the sample. B Yes, because the regression equation is based on a random sample. C Yes, because the association between length and weight is positive. D No, because 3 inches falls above the maximum value of lengths in the sample. E No, because there may not be any 3-inch fish of this species in the pond.
Answer D Correct. The length 3 inches is beyond the interval of lengths from the sample. This is extrapolation, and the predicted value would not be reliable.
A restaurant manager collected data on the number of customers in a party in the restaurant and the time elapsed until the party left the restaurant. The manager computed a correlation of 0.78 between the two variables. What information does the correlation provide about the relationship between the number of customers in a party at the restaurant and the time elapsed until the party left the restaurant? A The relationship is linear because the correlation is positive. B The relationship is not linear because the correlation is positive. C The parties with a larger number of customers are associated with the longer times elapsed until the party left the restaurant. D The parties with a larger number of customers are associated with the shorter times elapsed until the party left the restaurant. E There is no relationship between the number of customers in a party at a table in the restaurant and the time required until the party left the restaurant.
C Correct. A positive correlation indicates that as values of one variable increase, the values of the other variable tend to increase. The parties with a larger number of customers are associated with the longer times elapsed until the party left the restaurant.
A factory has two machines, A and B, making the same part for refrigerators. The number of defective parts produced by each machine during the first hour of operation was recorded on 19 randomly selected days. The scatterplot below shows the number of defective parts produced by each machine on the selected days. Which statement gives the best comparison between the number of defective parts produced by the machines during the first hour of operation on the 19 days? A Machine A always produced the same number of defective parts as machine B. B Machine A always produced fewer defective parts than machine B. C Machine A always produced more defective parts than machine B. D Machine A usually, but not always, produced fewer defective parts than machine B. E Machine A usually, but not always, produced more defective parts than machine B.
D Machine A usually, but not always, produced fewer defective parts than machine B.
Exercise physiologists are investigating the relationship between lean body mass (in kilograms) and the resting metabolic rate (in calories per day) in sedentary males. Based on the computer output above, which of the following is the best interpretation of the value of the slope of the regression line? A For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 22.563 calories per day. B For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 264.0 calories per day. C For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 144.9 calories per day. D For each additional calorie per day for the resting metabolic rate, the lean body mass increases on average by 22.563 kilograms. E For each additional calorie per day for the resting metabolic rate, the lean body mass increases on average by 264.0 kilograms.
A For each additional kilogram of lean body mass, the resting metabolic rate increases on average by 22.563 calories per day.
A tennis ball was thrown in the air. The height of the ball from the ground was recorded every millisecond from the time the ball was thrown until it reached the height from which it was thrown. The correlation between the time and height was computed to be 0. What does this correlation suggest about the relationship between the time and height? A There is no relationship between time and height. B There is no linear relationship between time and height. C The distance the ball traveled upward is the same as the distance the ball traveled downward. D The correlation suggests that there is measurement or calculation error. E The correlation suggests that more measurements should be taken to better understand the relationship.
Answer B Correct. A correlation of 0 suggests that there is no linear relationship. There may still be a non-linear relationship, perhaps a quadratic relationship in this example.
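A sketch of why a perfectly quadratic relationship can still give r = 0. The height function h(t) = 20t − 5t² and the half-second sampling are assumptions chosen for illustration (the question gives no numbers), and `pearson_r` is a hand-written helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical throw: the ball leaves at t = 0 and returns to the same height at t = 4.
times = [i * 0.5 for i in range(9)]            # 0.0, 0.5, ..., 4.0
heights = [20 * t - 5 * t ** 2 for t in times]  # symmetric about t = 2

r = pearson_r(times, heights)
print(abs(r) < 1e-9)  # True
```

Height is completely determined by time here, yet the rising half and the falling half cancel in the correlation, so r only reports the absence of a linear trend.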
A restaurant manager collected data to predict monthly sales for the restaurant from monthly advertising expenses. The model created from the data showed that 36 percent of the variation in monthly sales could be explained by monthly advertising expenses. What was the value of the correlation coefficient? A 0.64 B 0.60 C 0.40 D 0.36 E 0.13
Answer B Correct. The proportion of the variation explained by the explanatory variable is the coefficient of determination r², and the magnitude of the correlation coefficient is the square root of r². In this case, the coefficient of determination is 36 percent, meaning r² = 0.36, so |r| = √0.36 = 0.60. Of the choices listed, only 0.60 has this magnitude, so the correlation coefficient has the value 0.60.
Researchers observed the grouping behavior of deer in different regions. The following scatterplot shows data collected on the size of the group and the percent of the region that was woodland. The relationship between group size and percent woodland appears to be negative and nonlinear. Which of the following statements explains such a relationship? A As the percent of woodland increases, the number of deer observed in a group decreases at a fairly constant rate. B As the percent of woodland increases, the number of deer observed in a group increases at a fairly constant rate. C As the percent of woodland increases, the number of deer observed in a group decreases quickly at first and then more slowly. D As the percent of woodland increases, the number of deer observed in a group increases quickly at first and then more slowly. E As the percent of woodland increases, the number of deer observed in a group remains fairly constant.
Answer C Correct. A negative relationship indicates a tendency for one variable to decrease as the other increases. Nonlinear indicates that the rate of decrease is not constant.
Which of the following statements about a least-squares regression analysis is true? I. A point with a large residual is an outlier. II. A point with high leverage has a y-value that is not consistent with the other y-values in the set. III. The removal of an influential point from a data set could change the value of the correlation coefficient. A I only B II only C III only D I and III only E I, II, and III
Answer C Correct. An influential point is one for which its removal from the set can have a substantial effect on the correlation.
Data were collected on the fiber diameter and the fleece weight of wool taken from a sample of 20 sheep. The data are shown in the following graphs. Graph 1 is a scatterplot of fleece weight versus fiber diameter with the respective least-squares regression line shown. Graph 2 is the associated plot of the residuals versus the predicted values. One point is circled on graph 1. Five points labeled A, B, C, D, and E are identified on graph 2. Which point on graph 2 represents the residual for the circled point on graph 1? A A B B C C D D E E
Answer C Correct. The circled point in Graph 1 corresponds to the sample value that has a fiber diameter of approximately 26 and a predicted fleece weight of approximately 10. For that point, the value of the residual fleece weight can be found using values for the observed fleece weight and predicted fleece weight from Graph 1. The value of the residual is given by residual = observed − predicted ≈ 5 − 10 ≈ −5. Point C is the point on Graph 2 that has a predicted fleece weight of approximately 10 and a residual fleece weight of approximately −5.
The height and age of each child in a random sample of children was recorded. The value of the correlation coefficient between height and age for the children in the sample was 0.8. Based on the least-squares regression line created from the data to predict the height of a child based on age, which of the following is a correct statement? A On average, the height of a child is 80% of the age of the child. B The least-squares regression line of height versus age will have a slope of 0.8. C The proportion of the variation in height that is explained by a regression on age is 0.64. D The least-squares regression line will correctly predict height based on age 80% of the time. E The least-squares regression line will correctly predict height based on age 64% of the time.
Answer C Correct. The coefficient of determination, r², is the proportion of the variation in height that is explained by the least-squares regression line. The value of the coefficient of determination is r² = (0.8)² = 0.64, so the proportion of the variation in height that is explained by a regression on age is 0.64.
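To see why r² is read as "proportion of variation explained", the sketch below checks on made-up data that r² equals 1 − SSE/SST for a least-squares line. The ages and heights are hypothetical and do not reproduce the r = 0.8 in the question.

```python
import math

# Hypothetical ages (years) and heights (cm), for illustration only.
ages = [2, 4, 6, 8, 10]
heights = [85, 100, 118, 125, 140]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(heights) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, heights))
sxx = sum((x - x_bar) ** 2 for x in ages)
sst = sum((y - y_bar) ** 2 for y in heights)   # total variation in height

# Correlation and least-squares line.
r = sxy / math.sqrt(sxx * sst)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Variation left unexplained by the line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ages, heights))
explained = 1 - sse / sst

print(abs(r ** 2 - explained) < 1e-9)  # True
```

With r = 0.8, the same identity gives 0.8² = 0.64 of the variation in height explained by age.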
The relationship between carbon dioxide emissions and fuel efficiency of a certain car can be modeled by the least-squares regression equation ln(ŷ) = 7 − 0.045x, where x represents the fuel efficiency, in miles per gallon, and ŷ represents the predicted carbon dioxide emissions, in grams per mile. Which of the following is closest to the predicted carbon dioxide emissions, in grams per mile, for a car of this type with a fuel efficiency of 20 miles per gallon? A 1.8 B 6.1 C 446 D 2,697 E 1,250,000
Answer C Correct. When 20 is substituted for x, the resulting value on the right side of the equation is 6.1. The value of approximately 446 results from raising e to the power of 6.1 (that is, e^6.1).
In a study to determine whether miles driven is a good predictor of trade-in value, 11 cars of the same age, make, model, and condition were randomly selected. The following scatterplot shows trade-in value and mileage for those cars. Five of the points are labeled A, B, C, D, and E, respectively. Which of the five labeled points is the most influential with respect to a regression of trade-in value versus miles driven? A A B B C C D D E E
Answer E Correct. Point E does not follow the trend with respect to the other data and is probably an outlier. The value of the car is much higher than other cars with similar miles driven.
A family would like to build a linear regression equation to predict the amount of grain harvested per acre of land on their farm. They subdivide their land into several smaller plots of land for testing and would like to select an explanatory variable they can control. Which of the following is an appropriate explanatory variable that the family could use to create a linear regression equation? A The total amount of rainfall recorded at their farm B The type of crop planted in the plot the previous year C The average daily temperature at their farm D The variety of grain planted in the plot E The amount of fertilizer applied to each plot of land
Answer E Correct. The amount of fertilizer is quantitative, could be controlled by the farmers, and could help predict the amount of grain.
The height h and collar size c, both in centimeters, measured from a sample of boys were used to create the regression line ĉ = −94 + 0.9h. The line is used to predict collar size from height, both in centimeters, for boys' shirt collars. Which of the following has no logical interpretation in context? A The predicted collar size of a boy with height 140 cm B The h values in the sample C The c values in the sample D The slope of the regression line E The c-intercept of the regression line
Answer E Correct. The c-intercept of the regression line can be determined when the value of h is 0 centimeters. It is not meaningful to predict the collar size of a boy with a height of 0 centimeters.
Data were collected on two variables, x and y, to create a model to predict y from x. A scatterplot of the collected data revealed a curved pattern with a possible cubic relationship (y = ax³, where a is a constant) between the variables. Which of the following transformations would be most appropriate for creating linearity between the variables? A Taking the cube of y B Taking the cube root of y C Taking the cube root of both y and x D Taking the log of y E Taking the log of both y and x
Answer E Correct. Variables related by a power, such as y = x³, are best transformed by taking the log of both variables.
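A sketch of why logging both variables straightens a power relationship: if y = ax³, then log y = log a + 3 log x, which is a line in (log x, log y) with slope 3. The constant a = 2.0 and the x-values are assumptions for illustration.

```python
import math

a = 2.0                      # hypothetical constant in y = a * x**3
xs = [1, 2, 3, 4, 5]
ys = [a * x ** 3 for x in xs]

# Transform both variables with the common logarithm.
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

# Slope between each pair of consecutive points on the log-log scale.
slopes = [
    (log_y[i + 1] - log_y[i]) / (log_x[i + 1] - log_x[i])
    for i in range(len(xs) - 1)
]
print([round(s, 6) for s in slopes])  # [3.0, 3.0, 3.0, 3.0]
```

A constant slope between every pair of points means the transformed data lie on a straight line, and that slope recovers the exponent.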
The computer output below shows the result of a linear regression analysis for predicting the concentration of zinc, in parts per million (ppm), from the concentration of lead, in ppm, found in fish from a certain river. Which of the following statements is a correct interpretation of the value 19.0 in the output? A On average there is a predicted increase of 19.0 ppm in concentration of lead for every increase of 1 ppm in concentration of zinc found in the fish. B On average there is a predicted increase of 19.0 ppm in concentration of zinc for every increase of 1 ppm in concentration of lead found in the fish. C The predicted concentration of zinc is 19.0 ppm in fish with no concentration of lead. D The predicted concentration of lead is 19.0 ppm in fish with no concentration of zinc. E Approximately 19% of the variability in zinc concentration is predicted by its linear relationship with lead concentration.
B On average there is a predicted increase of 19.0 ppm in concentration of zinc for every increase of 1 ppm in concentration of lead found in the fish.
At a large airport, data were recorded for one month on how many baggage items were unloaded from each flight upon arrival as well as the time required to deliver all the baggage items on the flight to the baggage claim area. A scatterplot of the two variables indicated a strong, positive linear association between the variables. Which of the following statements is a correct interpretation of the word "strong" in the description of the association? A A least-squares model predicts that the more baggage items that are unloaded from a flight, the greater the time required to deliver the items to the baggage claim area. B The actual time required to deliver all the items to the baggage claim area based on the number of items unloaded will be very close to the time predicted by a least-squares model. C The time required to deliver an item to the baggage claim area is relatively constant, regardless of the number of baggage items unloaded from a flight. D The variability in the time required to deliver all items to the baggage claim area is about the same for all flights, regardless of the number of items unloaded from a flight. E The time required to unload baggage items from a flight is related to the time required to deliver the items to the baggage claim area.
B The actual time required to deliver all the items to the baggage claim area based on the number of items unloaded will be very close to the time predicted by a least-squares model.
An agriculturalist working with Australian pine trees wanted to investigate the relationship between the age and the height of the Australian pine. A random sample of Australian pine trees was selected, and the age, in years, and the height, in meters, was recorded for each tree in the sample. Based on the recorded data, the agriculturalist created the following regression equation to predict the height, in meters, of the Australian pine based on the age, in years, of the tree. predicted height = 0.29 + 0.48(age) Which of the following is the best interpretation of the slope of the regression line? A The height increases, on average, by 1 meter each 0.48 year. B The height increases, on average, by 0.48 meter each year. C The height increases, on average, by 0.29 meter each year. D The height increases, on average, by 0.29 meter each 0.48 year. E The difference between the actual height and the predicted height is, on average, 0.48 meter for each year.
B The height increases, on average, by 0.48 meter each year.
In a recent survey, high school students and their parents were asked to rate 60 recently released movies. The ratings were on a scale from 1 to 9, where 1 was "horrible" and 9 was "excellent". For each movie, the average rating by the students and the average rating by their parents was calculated and the scatterplot below was constructed. The horizontal axis represents the student rating, and the vertical axis represents the parent rating. Thus, an individual data point would represent the rating of a single movie. Which of the following statements is justified by the scatterplot? A The movies that the students liked the best also tended to be the movies that the parents liked the best, but the students tended to give lower scores. B The movies that the students liked the best also tended to be the movies that the parents liked the best, but the students tended to give higher scores. C The movies that the students liked the best also tended to be the movies that the parents liked the best, but each group tended to give the same scores. D The movies that the students liked the best tended to be the movies that the parents liked the least, but the students tended to give lower scores. E The movies that the students liked the best tended to be the movies that the parents liked the least, but the students tended to give higher scores.
B The movies that the students liked the best also tended to be the movies that the parents liked the best, but the students tended to give higher scores.
A scatterplot of student height, in inches, versus the corresponding arm span length, in inches, is shown below. One of the points in the graph is labeled A. If the point labeled A is removed, which of the following statements would be true? A The slope of the least squares regression line is unchanged and the correlation coefficient increases. B The slope of the least squares regression line is unchanged and the correlation coefficient decreases. C The slope of the least squares regression line increases and the correlation coefficient increases. D The slope of the least squares regression line increases and the correlation coefficient decreases. E The slope of the least squares regression line decreases and the correlation coefficient increases.
C The slope of the least squares regression line increases and the correlation coefficient increases.
An experiment was conducted to investigate the relationship between the dose of a pain medication and the number of hours of pain relief. Twenty individuals with chronic pain were randomly assigned to one of five doses—0.0, 0.5, 1.0, 1.5, 2.0—in milligrams (mg) of medication. The results are shown in the scatterplot below. The data were used to fit a least-squares regression line to predict the number of hours of pain relief for a given dose. Which of the following would be revealed by a plot of the residuals of the regression versus the dose? A The sum of the residuals is less than 0. B The sum of the residuals is greater than 0. C There are outliers associated with the lower doses. D The variation in the hours of pain relief is not the same across the doses. E There is a positive linear relationship between the residuals and the dose.
D The variation in the hours of pain relief is not the same across the doses.
Suppose a certain scale is not calibrated correctly, and as a result, the mass of any object is displayed as 0.75 kilogram less than its actual mass. What is the correlation between the actual masses of a set of objects and the respective masses of the same set of objects displayed by the scale? A −1 B −0.75 C 0 D 0.75 E 1
E 1
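A quick check of this answer on made-up masses: subtracting a constant shifts every point by the same amount, so the displayed values are an exact linear function of the actual values with positive slope. The masses and the `pearson_r` helper are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical actual masses in kilograms.
actual = [1.2, 2.5, 3.1, 4.8, 6.0]
# The miscalibrated scale shows 0.75 kg less than the actual mass.
displayed = [m - 0.75 for m in actual]

r = pearson_r(actual, displayed)
print(round(r, 6))  # 1.0
```

The size of the offset (0.75) never enters the result, which is why choices B and D are distractors.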