AP STATS
The owner of a chain of supermarkets notices that there is a positive correlation between the sales of beer and the sales of ice cream over the course of the previous year. During seasons when sales of beer were above average, sales of ice cream also tended to be above average. Likewise, during seasons when sales of beer were below average, sales of ice cream also tended to be below average. Which of the following would be a valid conclusion from these facts?
It is likely that sales of both beer and ice cream are confounded with a lurking variable, such as seasonal variation in temperature.
Scenario 12-10 Use of the Internet worldwide increased steadily from 1990 to 2002. A residual plot for the regression of worldwide Internet Users (in millions) on Year is shown below. Use Scenario 12-10. Suppose we use the regression whose residuals are shown here to predict the number of Internet users in 1991. Which of the following best describes the accuracy of the prediction?
The prediction would probably underestimate the true number of Internet users in 1991
The stemplot below shows the number of home runs hit in 2008 by members of the Philadelphia Phillies, who won Major League Baseball's World Series that year. (Each of the 13 players who appeared in at least half the Phillies' games that year is included.) Note that 4|8 represents 48 home runs. The five-number summary for these data is:
0, 6.5, 11, 28.5, 48
A lobster fisherman is keeping track of the productivity of a set of traps he has placed in a favorite location. Below are the numbers of lobsters in these traps over the course of 12 different hauls. According to the 1.5x IQR rule, which values in the above distribution are outliers?
14 only
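The 1.5 x IQR rule can be sketched in a few lines. The haul counts below are made up for illustration (the original data are not reproduced on this card), and the quartiles are taken as the medians of the lower and upper halves, which is an assumption about the convention the course uses.

```python
from statistics import median

def iqr_outliers(values):
    """Return the values flagged as outliers by the 1.5 x IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    # Quartiles as medians of the lower and upper halves (the overall
    # median is left out of both halves when the count is odd).
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in data if v < low_fence or v > high_fence]

# Hypothetical haul counts, NOT the data from the question above.
hauls = [0, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 14]
print(iqr_outliers(hauls))  # [14]
```

Here Q1 = 3 and Q3 = 6, so the fences sit at -1.5 and 10.5, and only 14 falls outside them.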
Use Scenario 1-4. The interquartile range for the number of AP courses is
2
Use Scenario 12-10. A scatterplot of the natural logarithm of Internet Users (in millions) versus Year is strongly linear, suggesting that linear regression on the transformed data may be more appropriate. Below is a computer regression analysis of the transformed data (note that natural logarithms are used). What is the predicted number of Internet users (in millions) in 1991, based on this model?
4.92
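Because the model is fit to ln(Internet users), its prediction has to be exponentiated to get back to millions of users. The sketch below shows that back-transformation step only; the function name and its parameters are hypothetical, and the actual intercept and slope come from the regression output, which is not reproduced on this card.

```python
import math

def predict_users(intercept, slope, year):
    """Back-transform a prediction from a model fit to ln(users) versus year."""
    predicted_log = intercept + slope * year  # prediction on the ln scale
    return math.exp(predicted_log)            # undo the natural log

# The card's answer of 4.92 million corresponds to a predicted ln(users) of:
print(round(math.log(4.92), 3))  # 1.593
```

A common mistake is to report the value on the ln scale (about 1.593) instead of exponentiating it.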
A sample of production records for an automobile manufacturer shows the following figures for production per shift.... the variance of the sample is
50.00.
Which of the following statements concerning residuals is true?
All of the above
Use Scenario 1-4. Which of the following is the correct boxplot?
B (skewed left and no median line)
Scenario 4-1 A sports writer wants to know how strongly Lafayette residents support the local minor league baseball team, the Lafayette Leopards. She stands outside the stadium before a game and interviews the first 20 people who enter the stadium. Use Scenario 4-1. The intended population for the survey is
all residents of Lafayette
Scenario 4-6 Does caffeine improve exam performance? Suppose all students in the 8:30 section of a course are given a "treatment" (two cups of coffee) and all students in the 9:30 section are not permitted to have any caffeine before a midterm exam. Use Scenario 4-6. Unfortunately, any systematic difference between the two sections on the exam might be due to the fact that the 8:30 and 9:30 classes have different instructors. This is an example of
confounding
Scenario 3-2 The following table and scatter plot present data on wine consumption (in liters per person per year) and death rates from heart attacks (in deaths per 100,000 people per year) in 19 developed Western countries. Use Scenario 3-2. The scatterplot shows that
countries that drink more wine have lower death rates from heart disease
A double-blind experiment was conducted to evaluate the effectiveness of the Salk polio vaccine. The purpose of keeping the diagnosing physicians ignorant of the treatment status of the experimental subjects was to
eliminate a possible source of bias
Consider the following scatter plot of two variables, X and Y. We may conclude that the correlation between X and Y
is close to 1, even though the relationship is not linear
X and Y are two categorical variables. The best way to determine if there is a relation between them is to
make a two-way table of the X and Y values.
There are three children in a room, ages three, four, and five. If a four-year-old child enters the room, the
mean age will stay the same but the variance will decrease
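A quick check of this answer with Python's standard `statistics` module, using the ages given in the question. The sample variance (divisor n - 1) is used here; the population variance (divisor n) also decreases, from about 0.67 to 0.5.

```python
from statistics import mean, variance

before = [3, 4, 5]     # ages of the three children
after = before + [4]   # a four-year-old enters the room

# The new value equals the mean, so the mean stays at 4.
print(mean(before), mean(after))

# The new value adds nothing to the sum of squared deviations but
# increases the divisor, so the variance drops from 1 to about 0.67.
print(variance(before), variance(after))
```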
A stemplot of a set of data is roughly symmetric, but the data do not even approximately follow the 68-95-99.7 rule. We conclude that the data are
not normal
Use Scenario 3-1. If the data point (65,70) were removed from this study, how would the value of the correlation r change?
r would be larger, since this point does not fit the pattern of the rest of the data
You plan to give a math achievement test to samples of 15-year-olds from both the U.S. and Korea in order to compare mathematics knowledge in the two countries. In each country, you will randomly choose 300 students from low-income families, 400 students from middle-income families, and 200 students from high-income families. The sample from Korea is a
stratified random sample
A set of data has a mean that is much larger than the median. Which of the following statements is most consistent with this information?
the distribution is skewed right
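A small illustration of why a long right tail pulls the mean above the median. The values are hypothetical and chosen only to show the effect.

```python
from statistics import mean, median

# Hypothetical values with one long right tail.
values = [1, 2, 2, 3, 3, 4, 30]

# The single large value raises the mean (45 / 7, about 6.4)
# but leaves the median at the middle value, 3.
print(mean(values) > median(values))  # True
```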
The correlation between the age and the height of children is found to be about r=0.7. Suppose we use the age x of a child to predict the height y of the child. We conclude that
the fraction of the variation in heights explained by the least-squares regression line of y on x is 0.49.
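The answer follows from squaring the correlation: 0.7 squared is 0.49. The sketch below also checks, on made-up age and height values (not data from the question), that r squared matches the fraction of variation explained by the least-squares line, computed as 1 - SSE/SST. The function name is hypothetical.

```python
from statistics import mean

def r_and_explained_fraction(xs, ys):
    """Return (r, 1 - SSE/SST) for the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y (SST)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / (sxx * syy) ** 0.5
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Variation left unexplained by the line (SSE).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return r, 1 - sse / syy

# The card's arithmetic: r = 0.7 gives r squared = 0.49.
print(round(0.7 ** 2, 2))  # 0.49

# Hypothetical ages (years) and heights (inches), for illustration only.
ages = [2, 4, 6, 8, 10]
heights = [34, 41, 43, 52, 50]
r, explained = r_and_explained_fraction(ages, heights)
print(abs(r ** 2 - explained) < 1e-9)  # True
```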
Which of the following statements is not true?
the median is always greater than the mean
The bar graph below summarizes responses of dog owners to the question "where in the car do you let your dog ride?"
these data could also be presented in a pie chart
You measure the age, marital status, and earned income of an SRS of 1463 women. The number and type of variables you have measured is
three; one categorical and two quantitative
Scenario 3-1 The height (in feet) and volume (in cubic feet) of usable lumber of 32 cherry trees are measured by a researcher. The goal is to determine if volume of usable lumber can be estimated from the height of a tree. Use Scenario 3-1. In this study, the response variable is
volume of lumber
Scenario 3-8 A fisheries biologist studying whitefish in a Canadian Lake collected data on the length (in centimeters) and egg production for 25 female fish. A scatter plot of her results and computer regression analysis of egg production versus fish length are given below. Use Scenario 3-8. The equation of the least-squares regression line is
Eggs = -142.74 + 39.25 (Length)
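Once the line is read off the output, a prediction is just a substitution into the equation. The function name below is hypothetical, and the 30 cm length is an arbitrary example value; the card does not state the units of egg production, so none are assumed here.

```python
def predicted_eggs(length_cm):
    """Predicted egg production from the least-squares line on this card."""
    return -142.74 + 39.25 * length_cm

# Arbitrary example length, not a value from the original data.
print(round(predicted_eggs(30), 2))

# The slope is the predicted change in egg production per extra centimeter.
print(round(predicted_eggs(31) - predicted_eggs(30), 2))
```

Predictions are only trustworthy for lengths within the range of the 25 fish that were measured.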
Different writers have different styles. One way to quantify this difference is to compare the distributions of word lengths in their work. Below are parallel boxplots describing the distributions of word lengths for the first 60 words in....
The median word length for Rowling is longer than for either Starnes or James