AP Stats MC Question Bank

B. The specified mean length of 3.5 cm is not within a 90 percent confidence interval. Explanation: The right-tailed test is used, so there is a 5 percent area in the rejection region in the right tail of the sampling distribution. A 90 percent confidence interval also leaves 5 percent in each tail, so the test and the interval use the same critical value. If the null hypothesis is rejected at a 5 percent level of significance, then the test statistic fell in the rejection region. In other words, the hypothesized value of the mean does not belong to the 90 percent confidence interval.

A botanist is interested in testing H0: μ = 3.5 cm versus Ha: μ > 3.5, where μ = the mean petal length of one variety of flowers. A random sample of 50 petals gives significant results at a 5 percent level of significance. Which of the following statements about the confidence interval to estimate the mean petal length is true? A. The specified mean length of 3.5 cm is within a 90 percent confidence interval. B. The specified mean length of 3.5 cm is not within a 90 percent confidence interval. C. The specified mean length of 3.5 cm is below the lower limit of a 90 percent confidence interval. D. The specified mean length of 3.5 cm is below the lower limit of a 95 percent confidence interval. E. Not enough information is available to answer the question.

E. No, because the possibility of picking 5 cars with an average of 32 mpg or lower from a fleet of cars of known 35 mpg efficiency is very small, only 4 out of 160, or 0.025. Explanation: The consumer group randomly picked 5 of the cars and found an average efficiency of only 32 mpg. The chance of picking 5 cars with an average of 32 mpg or lower from a fleet of cars of real 35 mpg efficiency is so small, only 4 out of 160, or 0.025, that it seems reasonable to conclude that the consumer group's set of 5 cars was not picked from a fleet of cars with 35 mpg efficiency and that the company's claim is wrong.

A car manufacturer claims that the cars it sells have an average fuel efficiency of 35 mpg. A consumer group believes that the true figure is lower. The consumer group obtains a random sample of 5 of the company's cars and determines an average of 32 mpg. In response to the consumer group's findings, the company runs a simulation by randomly picking 5 cars 160 times from a fleet of cars of known 35 mpg efficiency and calculating the resulting averages to show that 32 mpg was possible. The company makes the dotplot shown. Does the company's argument seem reasonable given this dotplot? A. Yes, because the dotplot shows that anything is possible in the real world. B. Yes, because the dotplot shows that the average was not 35 for every 5-car set that the company randomly picked. C. Yes, because the dotplot shows that 32 mpg is a possible average of 5 cars from a fleet of cars of known 35 mpg efficiency. D. No, because the dotplot shows that anything is possible in the real world. E. No, because the possibility of picking 5 cars with an average of 32 mpg or lower from a fleet of cars of known 35 mpg efficiency is very small, only 4 out of 160, or 0.025.

E. 1 - (1 - 0.025)(1 - 0.034)(1 - 0.02) Explanation: The probabilities of each pump not failing are 1 − 0.025, 1 − 0.034, and 1 − 0.02, respectively. The probability of none failing is the product of (1 − 0.025)(1 − 0.034)(1 − 0.02). So the probability of at least one failing is 1 − (1 − 0.025)(1 − 0.034)(1 − 0.02).

A city water supply system involves three pumps, the failure of any one of which crashes the system. The probabilities of failure for each pump in a given year are 0.025, 0.034, and 0.02, respectively. Assuming the pumps operate independently of each other, what is the probability that the system does crash during the year? A. 0.025 + 0.034 + 0.02 B. 1 - (0.025 + 0.034 + 0.02) C. 1 - (0.025)(0.034)(0.02) D. (1 - 0.025)(1 - 0.034)(1 - 0.02) E. 1 - (1 - 0.025)(1 - 0.034)(1 - 0.02)
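As a quick numerical check of the complement rule in the pump question, here is a minimal Python sketch. The function name `p_at_least_one_failure` is made up for illustration, and the calculation assumes the pumps fail independently, as the question states.

```python
from math import prod

def p_at_least_one_failure(failure_probs):
    """P(at least one failure) for independent components, via the complement rule."""
    p_none_fail = prod(1 - p for p in failure_probs)
    return 1 - p_none_fail

# Yearly failure probabilities of the three pumps from the question.
p_crash = p_at_least_one_failure([0.025, 0.034, 0.02])
print(round(p_crash, 4))
```

The result is slightly smaller than the plain sum 0.025 + 0.034 + 0.02, because the sum counts years with more than one failure more than once.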

E. It is possible for both the company report to be true and the investigator's claim to be correct. Explanation: It is possible for both to be correct, for example, if there were 11 secretaries (10 women, 3 of whom receive raises, and 1 man who receives a raise) and 11 executives (10 men, 1 of whom receives a raise, and 1 woman who does not receive a raise). Then 100% of the male secretaries receive raises while only 30% of the female secretaries do; and 10% of the male executives receive raises while 0% of the female executives do. At the same time, of all the employees, 3 out of 11 women receive raises while only 2 out of 11 men receive raises. This is an example of Simpson's paradox.

A company employs both men and women in its secretarial and executive positions. In reports filed with the government, the company shows that the percentage of female employees who receive raises is higher than the percentage of male employees who receive raises. A government investigator claims that the percentage of male secretaries who receive raises is higher than the percentage of female secretaries who receive raises and that the percentage of male executives who receive raises is higher than the percentage of female executives who receive raises. Is this possible? A. No, either the company report is wrong or the investigator's claim is wrong. B. No, if the company report is correct, either a greater percentage of female secretaries than of male secretaries receive raises or a greater percentage of female executives than of male executives receive raises. C. No, if the investigator is correct, by summation of the corresponding numbers, the total percentage of male employees who receive raises would have to be greater than the total percentage of female employees who receive raises. D. All of the above are true. E. It is possible for both the company report to be true and the investigator's claim to be correct.
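The counts given in the explanation can be checked directly. The sketch below is only an illustration; the dictionary layout and the helper name `raise_rate` are assumptions, while the counts themselves come from the explanation.

```python
# (raises, total) for each job and sex, taken from the explanation.
counts = {
    ("secretary", "female"): (3, 10),
    ("secretary", "male"): (1, 1),
    ("executive", "female"): (0, 1),
    ("executive", "male"): (1, 10),
}

def raise_rate(sex, job=None):
    """Share receiving raises for one sex, within one job or pooled over both jobs."""
    rows = [v for (j, s), v in counts.items() if s == sex and (job is None or j == job)]
    raises = sum(r for r, _ in rows)
    total = sum(n for _, n in rows)
    return raises / total

for job in ("secretary", "executive"):
    print(job, raise_rate("male", job), raise_rate("female", job))
print("overall", raise_rate("male"), raise_rate("female"))
```

Within each job the male rate is higher, yet the pooled female rate (3/11) exceeds the pooled male rate (2/11), which is the reversal Simpson's paradox describes.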

D. There are four levels of a single explanatory variable. Explanation: Octane is the only explanatory variable, and it is being tested at four levels. Miles per gallon is the single response variable.

A consumer product agency tests miles per gallon for a sample of automobiles using each of four different octanes of gasoline. Which of the following is true? A. There are four explanatory variables and one response variable. B. There is one explanatory variable with four levels of response. C. Miles per gallon is the only explanatory variable, but there are four response variables corresponding to the different octanes. D. There are four levels of a single explanatory variable. E. Each explanatory level has an associated level of response.

A. 0.0057 Explanation: The sample of 40 jars is large enough for us to apply the central limit theorem and get an approximately normal distribution of the sample mean.

A filling machine puts an average of four ounces of coffee in jars, with a standard deviation of 0.25 ounces. Forty jars filled by this machine are selected at random. What is the probability that the mean amount per jar filled in the sampled jars is less than 3.9 ounces? A. 0.0057 B. 0.0225 C. 0.0250 D. 0.0500 E. 0.3446
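The calculation behind answer A can be sketched as follows. This assumes the normal approximation from the central limit theorem, and the helper `normal_cdf` is a hand-written illustration built on the standard library's `math.erf`.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 4.0, 0.25, 40
standard_error = sigma / sqrt(n)        # SD of the sample mean
z = (3.9 - mu) / standard_error         # about -2.53
p_below = normal_cdf(z)
print(round(z, 2), round(p_below, 4))
```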

A. E (Fine tuning) = 1.3 hr SD(Fine tuning) = 0.15 hr Explanation: Means and variances add.

A home-theater projector system technician knows that based on past experience, for unpacking, assembly, and fine tuning, the mean total setup time is 5.6 hours with a standard deviation of 0.886 hours. The mean and standard deviation for the unpacking time are 1.5 hours and 0.2 hours, and for the assembly time are 2.8 hours and 0.85 hours, respectively. If the times for the three steps are independent, what are the mean and standard deviation for the fine tuning time? A. E (Fine tuning) = 1.3 hr SD(Fine tuning) = 0.15 hr B. E (Fine tuning) = 1.3 hr SD(Fine tuning) = 1.24 hr C. E (Fine tuning) = 2.15 hr SD(Fine tuning) = 0.164 hr D. E (Fine tuning) = 2.15 hr SD(Fine tuning) = 1.24 hr E. E (Fine tuning) = 4.61 hr SD(Fine tuning) = 0.15 hr
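"Means and variances add" can be made concrete with a short sketch. The variable names are illustrative, and the subtraction of variances relies on the independence stated in the question.

```python
from math import sqrt

total_mean, total_sd = 5.6, 0.886
known_steps = [(1.5, 0.2), (2.8, 0.85)]  # (mean, SD) for unpacking and assembly

# Means add, so the missing mean is the total minus the known means.
fine_mean = total_mean - sum(m for m, _ in known_steps)
# Variances (not SDs) add for independent steps, so subtract variances.
fine_var = total_sd**2 - sum(s**2 for _, s in known_steps)
fine_sd = sqrt(fine_var)
print(round(fine_mean, 2), round(fine_sd, 2))
```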

C. Selection bias makes this a poorly designed survey. Explanation: Surveying people coming out of any church results in a very unrepresentative sample of the adult population, especially given the question under consideration. Using chance and obtaining a high response rate will not change the selection bias and make this into a well-designed survey.

A researcher plans a study to examine the depth of belief in God among the adult population. He obtains a simple random sample of 100 adults as they leave church one Sunday morning. All but one of them agree to participate in the survey. Which of the following is a true statement? A. Proper use of chance as evidenced by the simple random sample makes this a well-designed survey. B. The high response rate makes this a well-designed survey. C. Selection bias makes this a poorly designed survey. D. The validity of this survey depends on whether or not the adults attending this church are representative of all churches. E. The validity of this survey depends upon whether or not similar numbers of those surveyed are male and female.

D. The predicted noise level is greater than the measured noise level. Explanation: Residual = Measured − Predicted, so if the residual is negative, the predicted must be greater than the measured (observed).

A rural college is considering constructing a windmill to generate electricity but is concerned over noise levels. A study is performed measuring noise levels (in decibels) at various distances (in feet) from the campus library, and a least squares regression line is calculated with a correlation of 0.74. Which of the following is a proper and most informative conclusion for an observation with a negative residual? A. The measured noise level is 0.74 times the predicted noise level. B. The predicted noise level is 0.74 times the measured noise level. C. The measured noise level is greater than the predicted noise level. D. The predicted noise level is greater than the measured noise level. E. The slope of the regression line at that point must also be negative.

C. A stratified random sample Explanation: The four classes (freshmen, sophomores, juniors, and seniors) are strata.

A school psychologist wants to investigate student depression. She selects, at random, samples of 30 freshmen, 30 sophomores, 30 juniors, and 30 seniors to interview. Which of the following best describes the psychologist's sampling plan? A. A convenience sample B. A simple random sample C. A stratified random sample D. A cluster sample E. A systematic sample

D. A more meaningful study would be to compare an SRS from each of the two groups of 100 students. Explanation: Using only a sample from the observations gives less information. It may well be that very bright students are the same ones who both choose to take AP Statistics and have high college GPAs. If students could be randomly assigned to take or not take AP Statistics, the results would be more meaningful. Of course, ethical considerations might make it impossible to isolate the confounding variable in this way.

A study is made to determine whether taking AP Statistics in high school helps students achieve higher GPAs when they go to college. In comparing records of 200 college students, half of whom took AP Statistics in high school, it is noted that the average college GPA is higher for those 100 students who took AP Statistics than for those who did not. Based on this study, guidance counselors begin recommending AP Statistics for college-bound students. Which of the following is incorrect? A. While this study indicates a relation, it does not prove causation. B. There could well be a confounding variable responsible for the seeming relationship. C. Self-selection here makes drawing the counselors' conclusion difficult. D. A more meaningful study would be to compare an SRS from each of the two groups of 100 students. E. This is an observational study, not an experiment.

E. a mistake in arithmetic has been made. Explanation: The correlation r cannot take a value greater than 1.

A study of department chairperson ratings and student ratings of the performance of high school statistics teachers reports a correlation of r = 1.15 between the two ratings. From this information we can conclude that A. chairpersons and students tend to agree on who is a good teacher. B. chairpersons and students tend to disagree on who is a good teacher. C. there is little relationship between chairperson and student ratings of teachers. D. there is strong association between chairperson and student ratings of teachers, but it would be incorrect to infer causation. E. a mistake in arithmetic has been made.

D. As home size increases by 1000 square feet, the selling price tends to change by a constant amount, on average. Explanation: A linear association means that as the explanatory variable (home size here) changes by a constant amount, the response variable (selling price here) also changes by a constant amount, on average. Unless there was perfect linear correlation, the points will not line up on a straight line. No distinct pattern in the residual plot just means that there is no obvious better model out there, but it doesn't necessarily say that the data are linear. The coefficient of determination indicates something about the strength of the relationship but does not define linearity. Choice (E) is an interpretation of the slope, but again, not a definition of linearity.

A study of selling prices of homes in a southern California community (in $1000) versus size of the homes (in 1000s of square feet) shows a moderate positive linear association. The least squares regression equation is: Predicted selling price = 35.3 + 214.1(Size). What does "linear" mean in this context? A. The points in the scatterplot line up in a straight line. B. There is no distinct pattern in the residual plot. C. The coefficient of determination, r^2, is large (close to 1). D. As home size increases by 1000 square feet, the selling price tends to change by a constant amount, on average. E. Each increase of 1000 square feet in home size gives an increase of 214.1($1000) in selling price.

D. Regions with a higher percentage of seat belt usage tend to have lower numbers of highway deaths due to failure to wear seat belts. Explanation: The word "negative" in the phrase "strong negative linear association" means that generally as one variable increases, the other variable decreases. Thus, regions with a higher percentage of seat belt usage tend to have lower numbers of highway deaths due to failure to wear seat belts. Choice (A) is an interpretation of the y intercept. Correlation is a measure of the strength of a linear relationship but does not by itself explain the meaning of "positive" or "negative." While the overall pattern is of a negative association, anything can be true about two points on the scatterplot. Choice (E) is an interpretation of the slope.

A study of the number of highway deaths due to failure to wear seat belts in a year versus percentage seat belt usage in that year in 300 national regions shows a strong negative linear association. The least squares regression equation is: Predicted number of deaths = 11,100 - 305.1(Belt usage). What does "negative" mean in this context? A. If no one wore seat belts in a given region, it is predicted that there will be 11,100 highway deaths due to failure to wear seat belts. B. The correlation, r , between the number of highway deaths due to failure to wear seat belts and the percentage of seat belt usage is negative. C. If a given region has a lower percentage of seat belt usage than a second region, the given region will have a higher number of highway deaths due to failure to wear seat belts than the second region. D. Regions with a higher percentage of seat belt usage tend to have lower numbers of highway deaths due to failure to wear seat belts. E. If a region has a one percent gain in seat belt usage, then it will have a reduction of 305 highway deaths due to failure to wear seat belts, on average.

A. 87 Explanation: The sum of the scores in one class is 20 × 92 = 1840, while the sum in the other is 25 × 83 = 2075. The total sum is 1840 + 2075 = 3915. There are 20 + 25 = 45 students, and so the average score is 3915/45=87.

A teacher is teaching two AP Statistics classes. On the final exam, the 20 students in the first class averaged 92, while the 25 students in the second class averaged only 83. If the teacher combines the classes, what will the average final exam score be? A. 87 B. 87.5 C. 88 D. None of the above. E. More information is needed to make this calculation.
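The combined average is a weighted mean, which a few lines of Python can confirm. The function name `combined_mean` is an illustrative choice.

```python
def combined_mean(groups):
    """Weighted mean from (group size, group mean) pairs."""
    total = sum(n * mean for n, mean in groups)
    count = sum(n for n, _ in groups)
    return total / count

avg = combined_mean([(20, 92), (25, 83)])
print(avg)
```

Note that the result differs from 87.5, the unweighted average of 92 and 83, because the larger class pulls the combined mean toward its own average.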

D. 95% Explanation: 54.9 and 93.7 are two standard deviations below and above the mean, respectively. By the empirical rule, 95% of the data are in this interval.

About what percent of the years are between 54.9 and 93.7 (points B and F)? A. 17% B. 34% C. 68% D. 95% E. 99.7%

A. 17% Explanation: 64.6 is one standard deviation below the mean. By the empirical rule, 68% of the data is between one standard deviation below and above the mean. This leaves 34% outside this interval and 17% in each tail.

About what percent of the years are less than 64.6 (point C)? A. 17% B. 34% C. 68% D. 95% E. 99.7%

A. 10(0.15)^2(0.85)^3 Explanation: This is a binomial with n = 5 and p = 0.15, and there are 10 ways to choose which two of the five people responded.

According to a CBS/New York Times poll taken in 1992, 15% of the public have responded to a telephone call-in poll. In a random group of five people, what is the probability that exactly two have responded to a call-in poll? A. 10(0.15)^2(0.85)^3 B. 5(0.15)^2(0.85)^3 C. (0.15)^2(0.85)^3 D. (0.15)^2 E. 5(0.15)^2
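The binomial probability in answer A can be evaluated with a small sketch. The helper `binomial_pmf` is written here for illustration, using `math.comb` from the standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

coefficient = comb(5, 2)                 # number of ways to pick the two responders
p_two = binomial_pmf(2, 5, 0.15)
print(coefficient, round(p_two, 4))
```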

B. This happened because each bar shows a complete distribution. Explanation: In a complete distribution, the probabilities sum to 1 and the relative frequencies total 100%.

All three bars have a height of 100%. Which of the following is true? A. This is a coincidence. B. This happened because each bar shows a complete distribution. C. This happened because there are three bars each divided into three segments. D. This happened because of the nature of musical patterns. E. None of the above is true.

B. Allan's IQ was greater than the average IQ in Pennsylvania and less than the average IQ in California. Explanation: Removing a value that is greater than the average will always lower the average, and adding in a value that is less than the average will also always lower the average.

Allan famously quipped that when he moved from Pennsylvania to California, the average IQ dropped in both states. What would have had to be true for this to happen? A. Allan's IQ was greater than the average IQ in both Pennsylvania and California. B. Allan's IQ was greater than the average IQ in Pennsylvania and less than the average IQ in California. C. Allan's IQ was less than the average IQ in Pennsylvania and greater than the average IQ in California. D. Allan's IQ was less than the average IQ in both Pennsylvania and California. E. There is no way for Allan's statement to be true.

D. E (Total) = $122,500 SD(Total) = $4066 Explanation: Expected values and variances can be added.

An auto dealer offers discounts averaging $2450 with a standard deviation of $575. If 50 autos are sold during one month, what is the expected value and standard deviation for the total discounts given? A. E (Total) = $17,324 SD(Total) = $170 B. E (Total) = $17,324 SD(Total) = $4066 C. E (Total) = $122,500 SD(Total) = $170 D. E (Total) = $122,500 SD(Total) = $4066 E. E (Total) = $122,500 SD(Total) = $28,750
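For the total of 50 sales, the mean scales by 50 while the standard deviation scales by the square root of 50. The sketch below assumes the 50 discounts are independent, which the answer's use of added variances implies.

```python
from math import sqrt

mean_discount, sd_discount, n_sales = 2450, 575, 50

expected_total = n_sales * mean_discount     # means add
var_total = n_sales * sd_discount**2         # variances add for independent sales
sd_total = sqrt(var_total)
print(expected_total, round(sd_total))
```

Multiplying the standard deviation by 50 instead would give choice E's $28,750, which overstates the spread.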

D. $350 Explanation: The expected payout is 1000(0.2) + 5000(0.05) = 450, so the expected profit is 800 − 450 = 350.

An insurance company charges $800 annually for car insurance. The policy specifies that the company will pay $1000 for a minor accident and $5000 for a major accident. If the probability of a motorist having a minor accident during the year is 0.2 and of having a major accident is 0.05, how much can the insurance company expect to make on a policy? A. $200 B. $250 C. $300 D. $350 E. $450
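The expected profit can be written as premium minus expected payout. This is a minimal sketch with illustrative variable names; it treats the two accident types as the only payouts, as the question does.

```python
premium = 800
payouts = [(1000, 0.2), (5000, 0.05)]   # (payout, probability) for minor and major accidents

expected_payout = sum(amount * prob for amount, prob in payouts)
expected_profit = premium - expected_payout
print(round(expected_payout), round(expected_profit))
```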

A. Voluntary response bias makes the survey meaningless. Explanation: This survey provides a good example of voluntary response bias, which often overrepresents negative opinions. The people who chose to respond were most likely parents who were very unhappy, and so there is very little chance that the 10,000 respondents were representative of the population. Knowing more about her readers, or taking a sample of the sample, would not have helped.

Ann Landers, who wrote a daily advice column appearing in newspapers across the country, once asked her readers, "If you had it to do over again, would you have children?" Of the more than 10,000 readers who responded, 70% said no. What does this show? A. Voluntary response bias makes the survey meaningless. B. No meaningful conclusion is possible without knowing something more about the characteristics of her readers. C. The survey would have been more meaningful if she had picked a random sample of the 10,000 readers who responded. D. The survey would have been more meaningful if she had used a control group. E. This was a legitimate sample, randomly drawn from her readers and of sufficient size to allow the conclusion that most of her readers who are parents would have second thoughts about having children.

A. No, because 0.475 ≠ 0.506. Explanation: If independent, these would have been equal.

Are "GPA between 2.0 and 3.0" and "skipped few classes" independent? A. No, because 0.475 ≠ 0.506. B. No, because 0.475 ≠ 0.890. C. No, because 0.450 ≠ 0.475. D. Yes, because of conditional probabilities. E. Yes, because of the product rule.

B. $352 Explanation: E(X) = μ_X = Σ x_i p_i = $700(0.05) + $540(0.25) + $260(0.7) = $352

At a warehouse sale, 100 customers are invited to choose one of 100 unopened, identical boxes, each containing one item. Five boxes contain $700 flat-screen television sets, 25 boxes contain $540 smartphones, and the remaining boxes contain $260 cameras. What should a customer be willing to pay to participate in the sale? A. $260 B. $352 C. $500 D. $540 E. $699
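The same expected value can be computed from the box counts instead of the probabilities. The dictionary name `boxes` is illustrative; the 70 camera boxes follow from 100 − 5 − 25.

```python
# Prize value in dollars mapped to the number of boxes containing it.
boxes = {700: 5, 540: 25, 260: 100 - 5 - 25}

total_boxes = sum(boxes.values())
expected_value = sum(value * count for value, count in boxes.items()) / total_boxes
print(total_boxes, expected_value)
```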

A. Yes, because the corresponding segments of the three bars have different lengths. Explanation: The different lengths of corresponding segments show that in different geographic regions, different percentages of people prefer each of the music categories.

Based on the given segmented bar chart, does there seem to be a relationship between geographic location and music preference? A. Yes, because the corresponding segments of the three bars have different lengths. B. Yes, because the heights of the three bars are identical. C. Yes, because there are three segments and three bars. D. No, because the heights of the three bars are identical. E. No, because summing the corresponding segments for classical, summing the corresponding segments for country, and summing the corresponding segments for pop or rock all give approximately the same total.

D. If the times are arranged in order, the middle time would be between 15 and 20 minutes. Explanation: The minimum time is somewhere between 5 and 10 minutes but might not be exactly 5 minutes. Similarly, the maximum time is somewhere between 30 and 35 minutes. With 155 times, the middle time will be the 78th time if the times are arranged in order. There are 70 times less than 15 minutes and 40 times between 15 and 20 minutes, so the 78th time must be between 15 and 20 minutes.

Based on the histogram, which of the following must be true? A. The minimum time taken by any of these students was 5 minutes. B. The maximum time taken by any of these students was 35 minutes. C. If the times are arranged in order, the middle time would be between 10 and 15 minutes. D. If the times are arranged in order, the middle time would be between 15 and 20 minutes. E. The same number of students took less than 15 minutes as took over 20 minutes.

D. More students had a GPA under 3.0 than a GPA 3.0 or higher. Explanation: In mosaic plots, the area of a box is proportional to the count corresponding to that box. Choice (D) is true because the three boxes corresponding to students with GPAs under 3.0 have a total area greater than the three boxes corresponding to students with GPAs 3.0 or higher, or note that along the vertical axis, "GPA under 3.0" has a greater length than "GPA 3.0 or higher."

Based on this plot, which of the following is a true statement? A. The number of students with GPAs under 3.0 who were banned from car use is greater than the number of students with GPAs 3.0 or higher who were banned from car use. B. More students had a GPA 3.0 or higher and were either yelled at or grounded than had a GPA under 3.0 and were yelled at. C. Of the students who were grounded, a greater proportion had GPAs 3.0 or higher than GPAs under 3.0. D. More students had a GPA under 3.0 than a GPA 3.0 or higher. E. More students were yelled at than were grounded.

E. Among the students living on campus, the proportion who study humanities is less than the proportion who study science. Explanation: In mosaic plots, the area of a box is proportional to the count corresponding to that box. The horizontal axis in this plot indicates the proportions of humanities versus science students. The vertical axis in this plot indicates the proportions of living on campus versus living off campus for humanities students and for science students. Choice (E) is not true because of the two boxes corresponding to students living on campus, the humanities box has greater area than the science box.

Based on this plot, which of the following statements is NOT true? A. The number of humanities students who live on campus is greater than the number of science students who live on campus. B. The proportion of humanities students who live on campus is less than the proportion of science students who live on campus. C. The number of humanities students at this college is greater than the number of science students at this college. D. The number of students who live on campus is less than the number living off campus. E. Among the students living on campus, the proportion who study humanities is less than the proportion who study science.

B. E (Donut hole) = 0.3 oz SD(Donut hole) = 0.02 oz Explanation: Means and variances add.

Boxes of 50 donut holes weigh an average of 16.0 ounces with a standard deviation of 0.245 ounces. If the empty boxes alone weigh an average of 1.0 ounce with a standard deviation of 0.2 ounces, what are the mean and standard deviation of donut hole weights? A. E (Donut hole) = 0.3 oz SD(Donut hole) = 0.0063 oz B. E (Donut hole) = 0.3 oz SD(Donut hole) = 0.02 oz C. E (Donut hole) = 0.3 oz SD(Donut hole) = 0.142 oz D. E (Donut hole) = 15 oz SD(Donut hole) = 0.142 oz E. E (Donut hole) = 15 oz SD(Donut hole) = 0.445 oz
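The donut hole answer takes two steps: remove the empty box, then split what remains across 50 holes. The sketch below assumes the box weight and the individual hole weights are independent, which is what "variances add" requires.

```python
from math import sqrt

n_holes = 50
full_mean, full_sd = 16.0, 0.245     # filled box
box_mean, box_sd = 1.0, 0.2          # empty box

# Step 1: contents = filled box minus empty box (subtract means and variances).
contents_mean = full_mean - box_mean
contents_var = full_sd**2 - box_sd**2
# Step 2: the contents are a sum of 50 independent holes, so divide both by 50.
hole_mean = contents_mean / n_holes
hole_sd = sqrt(contents_var / n_holes)
print(round(hole_mean, 2), round(hole_sd, 2))
```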

A. Yes. Explanation: The probabilities 7/24, 8/24, and 9/24 are all nonnegative, and they add up to 1.

Can the function f(x) = (x+6)/24, for x = 1, 2, and 3, be the probability distribution for some random variable taking the values 1, 2, and 3? A. Yes. B. No, because probabilities cannot be negative. C. No, because probabilities cannot be greater than 1. D. No, because the probabilities do not sum to 1. E. Not enough information is given to answer the question.
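The two conditions for a valid probability distribution can be checked exactly with fractions. This is an illustrative sketch; the function `f` simply encodes the formula from the question.

```python
from fractions import Fraction

def f(x):
    """Candidate probability for the value x, from the question's formula."""
    return Fraction(x + 6, 24)

probs = [f(x) for x in (1, 2, 3)]
all_valid = all(0 <= p <= 1 for p in probs)   # each probability lies in [0, 1]
total = sum(probs)                            # must equal 1
print(probs, all_valid, total)
```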

E. No value for n can make r = 1. Explanation: A scatterplot readily shows that while the first three points lie on a straight line, the fourth point does not lie on this line. Thus, no matter what the fifth point is, all the points cannot lie on a straight line, and so r cannot be 1.

Consider the set of points {(2, 5), (3, 7), (4, 9), (5, 12), (10, n )}. What should n be so that the correlation between the x and y values is 1? A. 21 B. 24 C. 25 D. A value different from any of the above. E. No value for n can make r = 1.
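A correlation of 1 requires every point to lie on one line, so it is enough to test the first four points. The sketch below uses a cross-product test; the helper name `collinear` is an assumption for illustration.

```python
def collinear(p, q, r):
    """True if the three points lie on one straight line (zero cross product)."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    return (x2 - x1) * (y3 - y1) == (y2 - y1) * (x3 - x1)

pts = [(2, 5), (3, 7), (4, 9), (5, 12)]
first_three_on_line = collinear(pts[0], pts[1], pts[2])
fourth_on_line = collinear(pts[0], pts[1], pts[3])
print(first_three_on_line, fourth_on_line)
```

Because (5, 12) is already off the line through the first three points, no choice of n for the fifth point can repair it.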

A. 6 Explanation: The regression line, also called the least squares regression line, minimizes the sum of the squares of the vertical distances between the points and the line. In this case (2, 10), (3, 19), and (4, 28) are on the line, and so the minimum sum is (10 − 11)^2 + (19 − 17)^2 + (28 − 29)^2 = 6.

Consider the three points (2, 11), (3, 17), and (4, 29). Given any straight line, we can calculate the sum of the squares of the three vertical distances from these points to the line. What is the smallest possible value this sum can be? A. 6 B. 9 C. 29 D. 57 E. None of these values
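The minimum sum of squares can be verified by fitting the least squares line by hand. The sketch uses the usual slope and intercept formulas; the variable names are illustrative.

```python
points = [(2, 11), (3, 17), (4, 29)]
n = len(points)
x_bar = sum(x for x, _ in points) / n
y_bar = sum(y for _, y in points) / n

# Least squares slope and intercept.
slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
    (x - x_bar) ** 2 for x, _ in points
)
intercept = y_bar - slope * x_bar

# Sum of squared vertical distances (residuals) from the fitted line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
print(slope, intercept, sse)
```

The fitted line is y = 9x − 8, which passes through (2, 10), (3, 19), and (4, 28) as the explanation says.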

E. The value 60 is an outlier, and there may be others at the high end of the data set. Explanation: Outliers are any values below Q1 − 1.5(IQR) = 25 − 1.5(38 − 25) = 5.5 or above Q3 + 1.5(IQR) = 38 + 1.5(38 − 25) = 57.5. With a minimum of 10 > 5.5, there are no outliers on the low end; however, with a maximum of 60 > 57.5, the maximum is an outlier and so are any other values falling between 57.5 and 60.

Dieticians are concerned about sugar consumption in teenagers' diets (a 12-ounce can of soft drink typically has 10 teaspoons of sugar). In a random sample of 55 students, the number of teaspoons of sugar consumed for each student on a randomly selected day is tabulated. Summary statistics are noted below: Min = 10 Max = 60 First quartile = 25 Third quartile = 38 Median = 31 Mean = 31.4 n = 55 s = 11.6 Which of the following is a true statement? A. None of the values are outliers. B. The value 10 is an outlier, and there can be no others. C. The value 60 is an outlier, and there can be no others. D. Both 10 and 60 are outliers, and there may be others. E. The value 60 is an outlier, and there may be others at the high end of the data set.
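The 1.5 × IQR rule from the explanation can be sketched in a few lines. The function name `outlier_fences` is an illustrative choice, and only the summary statistics from the question are used.

```python
def outlier_fences(q1, q3):
    """Lower and upper fences under the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

lower, upper = outlier_fences(25, 38)
min_is_outlier = 10 < lower
max_is_outlier = 60 > upper
print(lower, upper, min_is_outlier, max_is_outlier)
```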

A. No, because the volunteers know whether they are drinking a blue or a green drink. Explanation: Blinding does have to do with whether or not the subjects know which treatment (color in this experiment) they are receiving. However, drinking out of solid black thermoses makes no sense when the beverages are identical except for color and the point of the experiment is the teenagers' reaction to color. Blinding has nothing to do with blocking (sports team participation in this experiment).

Do teenagers prefer sports drinks colored blue or green? Two different colorings, which have no effect on taste, are used on an identical drink to result in either a blue or a green beverage. Volunteer teenagers are randomly assigned to drink one or the other colored beverage, and the volunteers then rate the beverage on a one to ten scale. Because of concern that sports interest may affect the outcome, the volunteers are first blocked by whether or not they play on a high school sports team. Is blinding possible in this experiment? A. No, because the volunteers know whether they are drinking a blue or a green drink. B. No, because the volunteers know whether or not they play on a high school sports team. C. Yes, by having the experimenter in a separate room randomly pick one of two containers and remotely having a drink poured from that container. D. Yes, by having the statistician analyzing the results not knowing which volunteer sampled which drink. E. Yes, by having the volunteers drink out of solid black thermoses so that they don't know the color of the drink they are tasting.

E. The actual incidence of skin cancer at a given latitude will be very close to what is predicted by a least squares model. Explanation: "Strong" means that the points in the related scatterplot fall close to the least squares regression line. In context, this means that the actual incidence of skin cancer at a given latitude will be very close to what is predicted by the least squares model. Association does not imply causation. While (C) refers to the existence of an association, and (B) and (D) further refer to the direction of the association, none of these statements refer to the strength of the association.

Does lower sun exposure, as measured by greater distance from the equator, result in lower incidence of skin cancer? A study of skin cancer mortality versus latitude of 100 northern hemisphere locales shows a strong negative linear association. What does "strong" mean in this context? A. More sun exposure causes greater numbers of skin cancer deaths. B. More sun exposure is associated with greater numbers of skin cancer deaths. C. A locale's incidence of skin cancer deaths has a linear association with its latitude. D. A least squares model predicts that the greater the latitude, the lower the average incidence of skin cancer. E. The actual incidence of skin cancer at a given latitude will be very close to what is predicted by a least squares model.

E. At least 200 Explanation: Look closely at the chart. For example, the point (160, 0.4) indicates that on 40 percent of the days, up to 160 shots were given. To find the number of shots needed to meet the demand on 95 percent of the days, we need to find the 95th percentile. Draw a horizontal line from the cumulative probability of 0.95 to the curve. From the point at which the line meets the curve, draw a vertical line down to the x-axis, and then read the number of flu shots given there. It should be approximately 200.

During flu season, a city medical center needs to keep a large supply of flu shots. A nurse's aide compiles data on the number of flu shots given per day in the past few years during flu season. A cumulative probability chart of the collected data is shown. How many flu shots should the center store every day to meet the demand on 95 percent of the days? A. At most 190 B. At most 140 C. Exactly 170 D. At least 150 E. At least 200

E. No, because not each group of 60 players has the same chance of being selected. Explanation: In a simple random sample, every possible group of the given size has to be equally likely to be selected, and this is not true here. For example, with this procedure it would be impossible for all the players of one team to be together in the final sample. This procedure is an example of stratified sampling, but stratified sampling does not result in simple random samples.

Each of the 30 MLB teams has 25 active roster players. A sample of 60 players is to be chosen as follows. Each team will be asked to place 25 cards with its players' names into a hat and randomly draw out two names. The two names from each team will be combined to make up the sample. Will this method result in a simple random sample of the 750 baseball players? A. Yes, because each player has the same chance of being selected. B. Yes, because each team is equally represented. C. Yes, because this is an example of stratified sampling, which is a special case of simple random sampling. D. No, because the teams are not chosen randomly. E. No, because not each group of 60 players has the same chance of being selected.

D. $790.17

For an advertising promotion, an auto dealer hands out 1000 lottery tickets with a prize of a new car worth $25,000. For someone with a single ticket, what is the standard deviation for the amount won? A. $7.07 B. $25.00 C. $49.95 D. $790.17 E. $624,375
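The $790.17 answer can be checked from the definition of variance. The following is a minimal Python sketch, assuming exactly one of the 1000 tickets wins the car and every other ticket pays nothing; the variable names are illustrative only.

```python
import math

prize = 25_000      # value of the car in dollars
p_win = 1 / 1000    # assumption: exactly one winning ticket among 1000

# Expected winnings for a single ticket.
mean = prize * p_win

# Variance: probability-weighted squared distance from the mean,
# summed over the two outcomes (win the car, win nothing).
variance = (prize - mean) ** 2 * p_win + (0 - mean) ** 2 * (1 - p_win)
sd = math.sqrt(variance)

print(f"mean = ${mean:.2f}, variance = {variance:,.0f}, SD = ${sd:.2f}")
```

The variance is 624,375, which appears as distractor E; the standard deviation is its square root.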

(C) The residual = 0.59 − 0.562 = 0.028.

For one player, the winning game proportions were 0.55 and 0.59 for "facing" and "back," respectively. What was the associated residual? A. -0.0488 B. -0.028 C. 0.028 D. 0.0488 E. 0.3608

D. 75 Explanation: Relative frequencies must be equal. You can use either rows or columns: the rows give 20/70 = 30/(30 + n), and the columns give 20/50 = 50/(50 + n). Solving for n in either proportion works: n/30 = 50/20 or n/50 = 30/20. In both cases, n = 75.

For what value of n does the following table show perfect independence? A. 10 B. 40 C. 60 D. 75 E. 100
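The table itself is not reproduced here, so the sketch below assumes the cell layout implied by the explanation: a first row of 20 and 50, and a second row of 30 and n. Under that assumption, exact fractions confirm that n = 75 makes both the row and the column proportions match.

```python
from fractions import Fraction

# Assumed layout (inferred from the explanation, not from the missing table):
#   row 1: a = 20, b = 50
#   row 2: c = 30, n = unknown
a, b, c = 20, 50, 30

# Independence requires a/b = c/n, so n = b*c/a.
n = Fraction(b * c, a)

# Row check: the first-column share is the same in both rows.
row_ok = Fraction(a, a + b) == c / (c + n)
# Column check: the first-row share is the same in both columns.
col_ok = Fraction(a, a + c) == b / (b + n)

print(n, row_ok, col_ok)
```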

D. No, because the events are not independent. Explanation: P(E ∩ F) = P(E)P(F) only if the events are independent. In this case, it is well known that women, on average, live longer than men, and so the events are not independent.

Given that 52% of the U.S. population are female and 15% are older than age 65, can we conclude that (0.52)(0.15) = 7.8% are women older than age 65? A. Yes, by the multiplication rule. B. Yes, by conditional probabilities. C. Yes, by the law of large numbers. D. No, because the events are not independent. E. No, because the events are not mutually exclusive.

E. None must be true. Explanation: The median is somewhere between 20 and 30 but not necessarily at 25. Even a single very large score can result in a mean over 30 and a standard deviation over 10.

If quartiles Q1 = 20 and Q3 = 30, which of the following must be true? I. The median is 25. II. The mean is between 20 and 30. III. The standard deviation is at most 10. A. I only B. II only C. III only D. All must be true. E. None must be true.

B. the placebo effect. Explanation: The desire of the workers for the study to be successful led to a placebo effect. That is, they saw that the lighting was being changed, and they realized they were being observed. They then assumed that the lighting would be causing a change in production and responded by making this assumption come true.

In a 1927-1932 Western Electric Company study on the effect of lighting on worker productivity, productivity increased with each increase in lighting but then also increased with every decrease in lighting. If it is assumed that the workers knew that they were being observed and that a study was in progress, this is an example of A. the effect of a treatment unit. B. the placebo effect. C. the control group effect. D. sampling error. E. voluntary response bias.

B. 0.5 Explanation: The probability of the next child being a girl is independent of the sex of the previous children. Before she had any children, if the question had been about the probability of having eight girls in a row, then the answer would have been (0.5)^8, or 1 in 256.

In a 1974 "Dear Abby" letter, a woman lamented that she had just given birth to her eighth child and all were girls! Her doctor had assured her that the chance of the eighth child being a girl was less than 1 in 100. What was the real probability that the eighth child would be a girl? A. 0.0039 B. 0.5 C. (0.5)^7 D. (0.5)^8 E. ((0.5)^7 + (0.5)^8)/2

E. To obtain higher statistical precision because variability of responses within a state is likely less than variability of responses found in the overall population Explanation: Each of the 50 states, with its own longtime past state standards and different regional culture, is considered a homogeneous stratum.

In a study of successes and failures in adopting Common Core standards, a random sample of high school principals will be selected from each of the 50 states. Selected individuals will be asked a series of evaluative questions. Why is stratification used here? A. To minimize response bias B. To minimize nonresponse bias C. To minimize voluntary response bias D. Because each state is roughly representative of the U.S. population as a whole E. To obtain higher statistical precision because variability of responses within a state is likely less than variability of responses found in the overall population

B. to reduce variation. Explanation: Blocking divides the subjects into groups, such as men and women, or political affiliations, and thus reduces variation. That is, when we group similar individuals together into blocks and then randomize within the blocks, much of the variability due to the differences between the blocks is accounted for and so comparison of the treatment groups is clearer.

In designing an experiment, blocking is used A. to reduce bias. B. to reduce variation. C. as a substitute for a control group. D. as a first step in randomization. E. to control the level of the experiment.

D. A cluster sample Explanation: The dorms are clusters.

Many colleges are moving toward coed dormitories. A Residential Life director plans to sample student attitudes toward this arrangement. He randomly selects three of the 12 dorms and sends a questionnaire to all residents living in them. Which of the following best describes the director's sampling plan? A. A convenience sample B. A simple random sample C. A stratified random sample D. A cluster sample E. A systematic sample

E. the law of large numbers. Explanation: While the outcome of any single play on a roulette wheel or the age at death of any particular person is uncertain, the law of large numbers gives that the relative frequencies of specific outcomes in the long run tend to become closer to numbers called probabilities.

Mathematically speaking, casinos and life insurance companies make a profit because of A. their understanding of sampling error and sources of bias. B. their use of well-designed, well-conducted surveys and experiments. C. their use of simulation of probability distributions. D. the central limit theorem. E. the law of large numbers.

B. Binomial with n = 5 and p = 1/6 Explanation: The sample has five adults, so n = 5, and each adult has a one-in-six chance of having experienced cyberbullying, so p = 1/6.

One in six adults in the workplace has experienced cyberbullying. In a random sample of five adults in the workplace, what is the distribution for the number who have experienced cyberbullying? A. Binomial with n = 6 and p = 1/6 B. Binomial with n = 5 and p = 1/6 C. Binomial with n = 5 and p = 1/5 D. Geometric with p = 1/6 E. Geometric with p = 1/5

D. 84.0 years. Explanation: Point E appears to be one standard deviation above the mean. 74.3 + 9.7 = 84.0. Note that point E (and point C) are points where the slope is steepest.

Point E on this normal curve corresponds to A. 54.9 years. B. 64.6 years. C. 74.3 years. D. 84.0 years. E. 93.7 years.

(C=100) Among the 250 students, there are 100 students in classes of size 25 and 150 students in a class of size 150. The average size of their history class is (100 × 25 + 150 × 150)/250 = 25,000/250 = 100.

Refer to the following: 250 students are taking a college history course. There is one class of size 150 and four classes of size 25. Among the 250 students, what is the average size, per student, of their history class? A. 35 B. 50 C. 100 D. 125 E. 137.5
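The per-student average differs from the per-class average because the large class is counted once for each of its 150 students. A short Python sketch of both calculations, using the class sizes from the question (variable names are illustrative):

```python
class_sizes = [150, 25, 25, 25, 25]
students = sum(class_sizes)              # 250 students in total

# Average over classes: each class counts once.
per_class = students / len(class_sizes)

# Average over students: each class size is weighted by the number
# of students who experience it, so a class of size s contributes s * s.
per_student = sum(s * s for s in class_sizes) / students

print(per_class, per_student)
```

The first value is the average class size among the five classes; the second is the average class size as experienced by a student.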

(B=50) There are 5 classes, and their average size is 250/5 = 50.

Refer to the following: 250 students are taking a college history course. There is one class of size 150 and four classes of size 25. What is the average class size among the five classes? A. 35 B. 50 C. 100 D. 125 E. 137.5

C. The median will be unchanged, but the mean will decrease. Explanation: The high outlier is further from the bulk of values than is the low outlier, so removing both will decrease the mean. However, removing the lowest and highest values will not change what value is in the middle, so the median will be unchanged.

Removing both outliers will probably effect what changes, if any, on the mean and median costs for this state's four-year institutions of higher learning? A. Both the mean and the median will be unchanged. B. The median will be unchanged, but the mean will increase. C. The median will be unchanged, but the mean will decrease. D. The mean will be unchanged, but the median will increase. E. Both the mean and median will change.

E. The first question probably showed 23% and the second question probably showed 18% because of response bias due to the wording of the questions. Explanation: The wordings "creating a level playing field" and "a right to express their individuality" are nonneutral and clearly leading phrasings.

School uniforms are being adopted by U.S. public schools in increasing numbers. Two possible wordings for a question on whether or not students should have to wear school uniforms are as follows: I. Many educators believe in creating a level playing field to reduce socioeconomic disparities. Do you believe that students should have to wear school uniforms? II. Many sociologists believe that students have a right to express their individuality. Do you believe that students should have to wear school uniforms? One of these questions showed that 18% of the population favors school uniforms, while the other question showed that 23% of the population favors school uniforms. Which question probably produced which result and why? A. The first question probably showed 23% of the population favors school uniforms, and the second question probably showed 18% because of the lack of randomization in the choice of pro-uniform and anti-uniform arguments as evidenced by the wording of the questions. B. The first question probably showed 18% and the second question probably showed 23% because of stratification in the wording of the questions. C. The first question probably showed 23% and the second question probably showed 18% because of the lack of a neutral cluster in the sample. D. The first question probably showed 18% and the second question probably showed 23% because of response bias due to the wording of the questions. E. The first question probably showed 23% and the second question probably showed 18% because of response bias due to the wording of the questions.

C. E (Diff) = $185 SD(Diff) = $158 Explanation: For a set of differences, means subtract, but variances add (assuming the two amounts are independent): 650 − 465 = 185 and √(130² + 90²) = √25,000 ≈ 158.

Science majors in college pay an average of $650 per year for books with a standard deviation of $130, whereas English majors pay an average of $465 per year for books with a standard deviation of $90. What is the mean difference and standard deviation between the amounts paid for books by science and English majors? A. E (Diff) = $92.50 SD(Diff) = $15 B. E (Diff) = $185 SD(Diff) = $110 C. E (Diff) = $185 SD(Diff) = $158 D. E (Diff) = $185 SD(Diff) = $220 E. E (Diff) = $557.50 SD(Diff) = $110
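The rule "means subtract, but variances add" can be verified numerically. This is a minimal sketch that assumes the two amounts are independent, which the question does not state explicitly; variable names are illustrative.

```python
import math

science_mean, science_sd = 650, 130
english_mean, english_sd = 465, 90

# The mean of a difference is the difference of the means.
mean_diff = science_mean - english_mean

# Assumption: independence, so the variances add even for a difference.
sd_diff = math.sqrt(science_sd ** 2 + english_sd ** 2)

print(f"E(Diff) = ${mean_diff}, SD(Diff) = ${sd_diff:.0f}")
```

Adding the standard deviations directly (130 + 90 = 220) gives distractor D, which is the common mistake this question targets.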

A. An experiment with a single factor Explanation: This study is an experiment because a treatment (periodic removal of a pint of blood) is imposed. There is no blinding because the subjects clearly know whether or not they are giving blood. There is no blocking because the subjects are not divided into blocks before random assignment to treatments. For example, blocking would have been used if the subjects had been separated by gender or age before random assignment to give or not give blood donations. There is a single factor—giving or not giving blood.

Some researchers believe that too much iron in the blood can raise the level of cholesterol. The iron level in the blood can be lowered by making periodic blood donations. A study is performed by randomly selecting half of a group of volunteers to give periodic blood donations while the rest do not. Is this an experiment or an observational study? A. An experiment with a single factor B. An experiment with control group and blinding C. An experiment with blocking D. An observational study with comparison and randomization E. An observational study with little, if any, bias

D. z = 2.40 Explanation: Multiplying every value in a set by the same constant (in this case, by 1/60) multiplies both the mean and the standard deviation by the same constant. Standardized scores (the number of standard deviations from the mean) are unchanged and without units.

Students in an algebra class were timed in seconds while solving a series of mathematical brainteasers. One student's time had a standardized score of z = 2.40. If the times are all changed to minutes, what will then be the student's standardized score? A. z = 0.04 B. z = 0.4 C. z = 1.80 D. z = 2.40 E. The new standardized score cannot be determined without knowing the class mean.

B. 10% of the resulting data will lie between 90 and 130. Explanation: Increasing every value by 5 gives 10% between 45 and 65, and then doubling gives 10% between 90 and 130.

Suppose 10% of a data set lies between 40 and 60. If 5 is first added to each value in the set and then each result is doubled, which of the following is true? A. 10% of the resulting data will lie between 85 and 125. B. 10% of the resulting data will lie between 90 and 130. C. 15% of the resulting data will lie between 80 and 120. D. 20% of the resulting data will lie between 45 and 65. E. 30% of the resulting data will lie between 85 and 125.

B. E (broken) = 30 SD(broken) = 1.8 Explanation: Expected values and variances add: E = 100(0.3) = 30, and the variance is 100(0.18)² = 3.24, so SD = √3.24 = 1.8.

Suppose a retailer knows that the mean number of broken eggs per carton is 0.3 with a standard deviation of 0.18. In a shipment of 100 cartons, what is the expected number of broken eggs and what is the standard deviation? Assume independence between cartons. A. E (broken) = 3 SD(broken) = 1.8 B. E (broken) = 30 SD(broken) = 1.8 C. E (broken) = 30 SD(broken) = 18 D. E (broken) = 300 SD(broken) = 18 E. E (broken) = 300 SD(broken) = 180
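For a sum of independent, identically distributed counts, the mean scales with the number of cartons while the standard deviation scales with its square root. A brief Python sketch using the figures from the question (names are illustrative):

```python
import math

cartons = 100
mean_per_carton = 0.3
sd_per_carton = 0.18

# Expected values add across cartons.
expected_broken = cartons * mean_per_carton

# Variances add across independent cartons; take the square root at the end.
sd_broken = math.sqrt(cartons * sd_per_carton ** 2)

print(f"E(broken) = {expected_broken:.1f}, SD(broken) = {sd_broken:.2f}")
```

Multiplying the standard deviation by 100 instead of by √100 = 10 yields 18, which is the error behind distractors C and D.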

A. Both the mean and median will be unchanged. Explanation: Subtracting 10 from one value and adding 5 to two values leaves the sum of the values unchanged, so the mean will be unchanged. Exactly what values the outliers take will not change what value is in the middle, so the median will be unchanged.

Suppose follow-up testing determines that the low outlier should be 10 grams per kilometer less and the two high outliers should each be 5 grams per kilometer greater. What effect, if any, will these changes have on the mean and median CO2 levels? A. Both the mean and median will be unchanged. B. The median will be unchanged, but the mean will increase. C. The median will be unchanged, but the mean will decrease. D. The mean will be unchanged, but the median will increase. E. Both the mean and median will change.

E. (0.02)(0.98)/((0.02)(0.98)+(0.98)(0.03)) Explanation: P(cancer | positive test)

Suppose that 2% of a clinic's patients are known to have cancer. A blood test is developed that is positive in 98% of patients with cancer but is also positive in 3% of patients who do not have cancer. If a person who is chosen at random from the clinic's patients is given the test and it comes out positive, what is the probability that the person actually has cancer? A. 0.02 B. 0.02 + 0.03 C. (0.02)(0.98) D. (0.02)(0.98) + (0.98)(0.03) E. (0.02)(0.98)/((0.02)(0.98)+(0.98)(0.03))
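The expression in answer E is Bayes' rule: the probability of a true positive divided by the probability of any positive. A small Python sketch that evaluates it with the numbers from the question (variable names are illustrative):

```python
p_cancer = 0.02               # prevalence among the clinic's patients
p_pos_given_cancer = 0.98     # test is positive when cancer is present
p_pos_given_healthy = 0.03    # test is positive when cancer is absent

# The two branches of the tree that end in a positive test.
true_positive = p_cancer * p_pos_given_cancer
false_positive = (1 - p_cancer) * p_pos_given_healthy

# P(cancer | positive) = true positives / all positives.
p_cancer_given_pos = true_positive / (true_positive + false_positive)

print(f"P(cancer | positive) = {p_cancer_given_pos:.2f}")
```

Even with a test this accurate, fewer than half of the positive results come from patients who actually have cancer, because the disease is rare in this population.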

D. 0.15/0.39 Explanation: P(famine | plague)

Suppose that in a certain part of the world, in any 50-year period the probability of a major plague is 0.39, the probability of a major famine is 0.52, and the probability of both a plague and a famine is 0.15. What is the probability of a famine given that there is a plague? A. 0.39 - 0.15 B. 0.15/0.52 C. 0.52 - 0.15 D. 0.15/0.39 E. 0.39 + 0.52 - 0.15

B. No, because (0.4)(0.35) ≠ 0.3. Explanation: If E and F are independent, then P(E ∩ F) = P(E)P(F); however, in this problem, P(E)P(F) = (0.4)(0.35) = 0.14, which is not equal to P(E ∩ F) = 0.3.

Suppose that, for any given year, the probabilities that the stock market declines, that women's hemlines are lower, and that both events occur are, respectively, 0.4, 0.35, and 0.3. Are the two events independent? A. Yes, because (0.4)(0.35) ≠ 0.3. B. No, because (0.4)(0.35) ≠ 0.3. C. Yes, because 0.4 > 0.35 > 0.3. D. No, because 0.5(0.3 + 0.4) = 0.35. E. There is insufficient information to answer this question.

E. 625 and 125 Explanation: When every value is multiplied by the same constant, both the mean and the standard deviation are multiplied by that constant. Graphically, increasing each value by 25% (multiplying by 1.25) both moves and spreads out the distribution.

Suppose the average score on a national test is 500 with a standard deviation of 100. If each score is increased by 25%, what are the new mean and standard deviation, respectively? A. 500 and 100 B. 525 and 100 C. 625 and 100 D. 625 and 105 E. 625 and 125

C. Mean = 525 and SD = 100 Explanation: Adding the same constant to every value increases the mean by that same constant; however, the distances between the increased values and the increased mean stay the same, and so the standard deviation is unchanged. Graphically, you should picture the whole distribution as moving over by a constant; the mean moves, but the standard deviation (which measures spread) doesn't change.

Suppose the average score on a national test is 500 with a standard deviation of 100. If each score is increased by 25, what are the new mean and standard deviation? A. Mean = 500 and SD = 100 B. Mean = 500 and SD = 125 C. Mean = 525 and SD = 100 D. Mean = 525 and SD = 105 E. Mean = 525 and SD = 125

B. It slopes up to the right, and the correlation is +0.57. Explanation: The slope and the correlation coefficient have the same sign. Multiplying every y value by −1 changes this sign.

Suppose the correlation between two variables is -0.57. If each of the y scores is multiplied by -1, which of the following is true about the new scatterplot? A. It slopes up to the right, and the correlation is -0.57. B. It slopes up to the right, and the correlation is +0.57. C. It slopes down to the right, and the correlation is -0.57. D. It slopes down to the right, and the correlation is +0.57. E. None of the above is true.

C. 0.23 Explanation: The correlation is not changed by adding the same number to every value of one of the variables, by multiplying every value of one of the variables by the same positive number, or by interchanging the x and y variables.

Suppose the correlation between two variables is r = 0.23. What will the new correlation be if 0.14 is added to all values of the x variable, every value of the y variable is doubled, and the two variables are interchanged? A. 0.74 B. 0.37 C. 0.23 D. -0.23 E. -0.74
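The invariance described in the explanation can be demonstrated on any data set. The sketch below uses a small made-up data set (purely hypothetical, not from the question) and a hand-written Pearson correlation helper, then applies the three changes from the question: adding 0.14 to x, doubling y, and interchanging the variables.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r_original = pearson_r(x, y)

# Shift x by 0.14, double y, then swap the roles of the two variables.
x_shifted = [v + 0.14 for v in x]
y_doubled = [2 * v for v in y]
r_transformed = pearson_r(y_doubled, x_shifted)

print(r_original, r_transformed)
```

The two printed values agree up to floating-point rounding, whatever data are used, as long as the multiplier is positive.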

E. I, II, and III Explanation: A negative correlation shows a tendency for higher values of one variable to be associated with lower values of the other; however, given any two points, anything is possible.

Suppose the correlation is negative. Given two points from the scatterplot, which of the following is possible? I. The first point has a larger x value and a smaller y value than the second point. II. The first point has a larger x value and a larger y value than the second point. III. The first point has a smaller x value and a larger y value than the second point. A. I only B. II only C. III only D. I and III only E. I, II, and III

B. No, because your expected winnings are only $0.14.

Suppose you are one of 7.5 million people who send in their name for a drawing with 1 top prize of $1 million, 5 second-place prizes of $10,000, and 20 third-place prizes of $100. Is it worth the $0.55 postage it costs you to send in your name? A. Yes, because 1,000,000/0.55 = 1,818,182, which is less than 7,500,000. B. No, because your expected winnings are only $0.14. C. Yes, because 7,500,000/(1+5+20) = 288,462 D. No, because 1,052,000 < 7,500,000. E. Yes, because 1,052,000/26 = 40,462
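The $0.14 figure is the total prize money divided by the number of entrants. A minimal Python sketch, assuming every entrant has the same chance at each prize (names are illustrative):

```python
entrants = 7_500_000
postage = 0.55

# Each entry is (number of prizes, dollar value of each prize).
prizes = [(1, 1_000_000), (5, 10_000), (20, 100)]

total_prize_money = sum(count * value for count, value in prizes)

# Assumption: every entrant is equally likely to win each prize,
# so the expected winnings are the prize pool shared over all entrants.
expected_winnings = total_prize_money / entrants

print(f"total = ${total_prize_money:,}, expected = ${expected_winnings:.2f}")
print("worth the postage:", expected_winnings > postage)
```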

E. The probability that the next toss will again be heads is 0.5. Explanation: Coins have no memory. So, the probability that the next toss will be heads is 0.5, and the probability that it will be tails is 0.5. The law of large numbers says that as the number of tosses becomes larger, the percentage of heads tends to become closer to 0.5.

Suppose you toss a fair coin ten times and it comes up heads every time. Which of the following is a true statement? A. By the law of large numbers, the next toss is more likely to be tails than another heads. B. By the properties of conditional probability, the next toss is more likely to be heads given that ten tosses in a row have been heads. C. Coins actually do have memories, and thus what comes up on the next toss is influenced by the past tosses. D. The law of large numbers tells how many tosses will be necessary before the percentages of heads and tails are again in balance. E. The probability that the next toss will again be heads is 0.5.

A. Census Explanation: The main office at your school should be able to give you the class sizes of every math and English class. If need be, you can check with every math and English teacher.

Suppose you wish to compare the average class size of mathematics classes to the average class size of English classes in your high school. Which is the most appropriate technique for gathering the needed data? A. Census B. Sample survey C. Experiment D. Observational study E. None of these methods is appropriate.

C. E (Cats) = $114 SD(Cats) = $30 Explanation: Means and variances add, so the mean for cats is 212 − 98 = 114 and the standard deviation is √(39² − 25²) = √896 ≈ 30.

The auditor working for a veterinary clinic calculates that the mean annual cost of medical care for dogs is $98 with a standard deviation of $25 and that the mean annual cost of medical care for pets is $212 with a standard deviation of $39 for owners who have one dog and one cat. Assuming expenses for dogs and cats are independent for those owning both a dog and a cat, what is the mean annual cost of medical care for cats, and what is the standard deviation? A. E (Cats) = $114 SD(Cats) = $14 B. E (Cats) = $114 SD(Cats) = $46 C. E (Cats) = $114 SD(Cats) = $30 D. E (Cats) = $155 SD(Cats) = $32 E. E (Cats) = $188 SD(Cats) = $30
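Here the total (dog plus cat) is known and one component must be recovered, so the additions run in reverse. A short Python sketch, relying on the independence stated in the question and treating the $39 figure as a standard deviation (names are illustrative):

```python
import math

dog_mean, dog_sd = 98, 25
total_mean, total_sd = 212, 39   # one dog plus one cat

# Means add: total = dog + cat, so cat = total - dog.
cat_mean = total_mean - dog_mean

# Variances add under independence: total_var = dog_var + cat_var,
# so the cat variance is the difference of the two variances.
cat_sd = math.sqrt(total_sd ** 2 - dog_sd ** 2)

print(f"E(Cats) = ${cat_mean}, SD(Cats) = ${cat_sd:.2f}")
```

Subtracting the standard deviations directly (39 − 25 = 14) gives distractor A.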

D. 0.0036 Explanation: The chance of an accident depends on whether or not the weather is wet, so the weather condition is our first branching on the tree. The probabilities at the next branch are conditional. There are two final branches on the tree that conclude with an accident: P(wet ∩ accident) = 0.2 × 0.010 = 0.0020 and P(dry ∩ accident) = 0.80 × 0.002 = 0.0016. Therefore the total probability of an accident is 0.0020 + 0.0016 = 0.0036.

The probability that there will be an accident on Highway 48 each day depends on the weather. If the weather is dry that day, there is a 0.2% chance of an accident on Highway 48; if the weather is wet that day, there is a 1.0% chance of an accident. Today, the weather station announced that there is a 20% chance of the weather being wet. What is the probability that there will be an accident on Highway 48 today? A. 0.0004 B. 0.0016 C. 0.0020 D. 0.0036 E. 0.0060
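The tree diagram described in the explanation amounts to the law of total probability. A small Python sketch with the numbers from the question (variable names are illustrative):

```python
p_wet = 0.20
p_dry = 1 - p_wet

p_accident_given_wet = 0.010   # 1.0% chance on a wet day
p_accident_given_dry = 0.002   # 0.2% chance on a dry day

# Multiply along each branch of the tree, then add the branches
# that end in an accident.
p_wet_and_accident = p_wet * p_accident_given_wet
p_dry_and_accident = p_dry * p_accident_given_dry
p_accident = p_wet_and_accident + p_dry_and_accident

print(f"P(accident) = {p_accident:.4f}")
```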

B. High leverage and a small residual Explanation: The point X has high leverage because its x value is much greater than the mean x value. Point X has a small residual because the regression line would pass close to it.

The scatterplot has one point labeled X. Does this point have high leverage, a large residual, both, or neither? A. High leverage and a large residual B. High leverage and a small residual C. Low leverage and a large residual D. Low leverage and a small residual E. This cannot be answered without calculating the regression line.

C. Low leverage and a large residual Explanation: The point X has low or no leverage because its x value appears to be close to the mean x value. The point X has a large residual because it is far above the line of best fit.

The scatterplot has one point labeled X. Does this point have high leverage, a large residual, both, or neither? A. High leverage and a large residual B. High leverage and a small residual C. Low leverage and a large residual D. Low leverage and a small residual E. This cannot be answered without calculating the regression line.

D. 500 times for the first game, and 50 for the second Explanation: The probability of throwing heads is 0.5. By the law of large numbers, the more times you flip the coin, the more the relative frequency tends to become closer to this probability. With fewer tosses, there is a greater chance of wide swings in the relative frequency.

There are two games involving flipping a fair coin. In the first game, you win a prize if you can throw between 40% and 60% heads. In the second game, you win if you can throw more than 75% heads. For each game, would you rather flip the coin 50 times or 500 times? A. 50 times for each game B. 500 times for each game C. 50 times for the first game, and 500 for the second D. 500 times for the first game, and 50 for the second E. The outcomes of the games do not depend on the number of flips.
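The law-of-large-numbers argument can be backed up with exact binomial probabilities. The sketch below assumes "between 40% and 60%" includes the endpoints and "more than 75%" is strict; the helper name is illustrative.

```python
from math import comb

def prob_heads_between(n, lo, hi):
    """P(lo <= number of heads <= hi) in n tosses of a fair coin."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

# Game 1: win with 40% to 60% heads (endpoints assumed included).
game1 = {
    50: prob_heads_between(50, 20, 30),
    500: prob_heads_between(500, 200, 300),
}

# Game 2: win with strictly more than 75% heads.
game2 = {
    50: prob_heads_between(50, 38, 50),
    500: prob_heads_between(500, 376, 500),
}

for n in (50, 500):
    print(f"n = {n}: game 1 -> {game1[n]:.6f}, game 2 -> {game2[n]:.3e}")
```

More tosses help in the first game, where the target range contains 50%, and hurt in the second, where winning requires a wide swing away from 50%.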

E. A systematic sample Explanation: With a random starting point and by picking every kth person for the sample, the method is called systematic sampling.

To evaluate their catering service with regard to culinary excellence, airline executives plan to pick a random passenger to start with and then survey every tenth passenger departing from a Beijing to San Francisco flight. Which of the following best describes the executives' sampling plan? A. A convenience sample B. A simple random sample C. A stratified random sample D. A cluster sample E. A systematic sample

C. Too high, because of undercoverage bias Explanation: It is most likely that the apartments at which the interviewer had difficulty finding someone home were apartments with fewer students living in them. Replacing these with other randomly picked apartments most likely replaces smaller-occupancy apartments with larger-occupancy ones.

To find out the average occupancy size of student-rented apartments, a researcher picks a simple random sample of 100 such apartments. Even after one follow-up visit, the interviewer is unable to make contact with anyone in 27 of these apartments. Concerned about nonresponse bias, the researcher chooses another simple random sample and instructs the interviewer to continue this procedure until contact is made with someone in a total of 100 apartments. The average occupancy size in the final 100-apartment sample is 2.78. Is this estimate probably too low or too high? A. Too low, because of undercoverage bias B. Too low, because convenience samples overestimate average results C. Too high, because of undercoverage bias D. Too high, because convenience samples overestimate average results E. Too high, because voluntary response samples overestimate average results

E. No, because not every sample of the intended size has an equal chance of being selected. Explanation: In a simple random sample, every possible group of the given size has to be equally likely to be selected, and this is not true here. For example, with this procedure it will be impossible for all the early arrivals to be together in the final sample. This procedure is an example of systematic sampling, but systematic sampling does not result in simple random samples.

To survey the opinions of bleacher fans at Wrigley Field, the home stadium of the Cubs baseball team, a surveyor plans to select every one-hundredth fan entering the bleachers one afternoon. Will this result in a simple random sample of Cub fans who sit in the bleachers? A. Yes, because each bleacher fan has the same chance of being selected. B. Yes, but only if there is a single entrance to the bleachers. C. Yes, because the 99 out of 100 bleacher fans who are not selected will form a control group. D. Yes, because this is an example of systematic sampling, which is a special case of simple random sampling. E. No, because not every sample of the intended size has an equal chance of being selected.

(A) The value 50 seems to split the area under the histogram in two, so the median is about 50. Furthermore, the histogram is skewed to the left with a tail from 0 to 30.

To which of the boxplots does the histogram correspond? A B C D E

(B) When looking at areas under the curve, Q1 appears to be around 20, the median is around 30, and Q3 is about 40.

To which of the boxplots does the histogram correspond? A B C D E

(A) A cumulative relative frequency plot that rises slowly at first, then quickly in the middle, and finally slowly again at the end corresponds to a histogram with little area under the curve on the ends and much greater area in the middle.

To which of the five histograms does the cumulative relative frequency plot correspond? A B C D E

(C) A cumulative relative frequency plot that rises at a constant rate to start and then slowly at the end corresponds to a histogram that is horizontal to start and then has little area under the curve at the end.

To which of the five histograms does the cumulative relative frequency plot correspond? A B C D E

(D) A cumulative relative frequency plot that rises slowly at first and then rises at a constant rate corresponds to a histogram with little area under the curve early on and then a horizontal section at the end.

To which of the five histograms does the cumulative relative frequency plot correspond? A B C D E

(C) The boxplot indicates that 25% of the data lie in each of the intervals 10-20, 20-35, 35-40, and 40-50. Counting boxes, only histogram C has this distribution.

To which of the histograms does the boxplot correspond? A B C D E

(E) The boxplot indicates that 25% of the data lie in each of the intervals 10-20, 20-30, 30-40, and 40-50. Counting boxes, only histogram E has this distribution.

To which of the histograms does the boxplot correspond? A B C D E

C. The first study is an observational study, while the second is an experiment. Explanation: In the first study, the families were already in the housing units, while in the second study, one of two treatments was applied to each family.

Two studies are run to compare the experiences of families living in high-rise public housing to those of families living in townhouse subsidized rentals. The first study interviews 25 families who have been in each government program for at least 1 year, while the second randomly assigns 25 families to each program and interviews them after 1 year. Which of the following is a true statement? A. Both studies are observational studies because of the time period involved. B. Both studies are observational studies because there are no control groups. C. The first study is an observational study, while the second is an experiment. D. The first study is an experiment, while the second is an observational study. E. Both studies are experiments.

(C) The median corresponds to the 0.5 cumulative proportion, which here is 2.4. The 0.25 and 0.75 cumulative proportions correspond to Q1 = 1.8 and Q3 = 2.8, respectively, and so the interquartile range is 2.8 − 1.8 = 1.0.

What are the median grade point average and the IQR? A. Median = 0.8, IQR = 1.8 B. Median = 2.0, IQR = 2.8 C. Median = 2.4, IQR = 1.0 D. Median = 2.5, IQR = 1.0 E. Median = 2.6, IQR = 1.8

D. Greater than one Explanation: The distribution is clearly skewed right, so the mean is greater than the median and the ratio is greater than one.

What can be said about the ratio (mean household income/median household income)? A. Approximately zero B. Less than one, but definitely above zero C. Approximately one D. Greater than one E. Cannot be answered without knowing the standard deviation
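The claim that a right-skewed distribution pulls the mean above the median can be checked with a small sketch. The income values below are hypothetical and chosen only to have a long right tail; they are not the data behind this question.

```python
from statistics import mean, median

# Hypothetical household incomes in thousands of dollars (not from the source):
# most values are modest, with one very large value forming a right tail.
incomes = [20, 25, 30, 35, 40, 50, 60, 250]

income_mean = mean(incomes)
income_median = median(incomes)

# The large value drags the mean upward but barely moves the median.
ratio = income_mean / income_median
print(income_mean, income_median, ratio > 1)
```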

(E) r = square root of r^2 = 0.993. (The sign is the same as the sign of the slope, which in this case is positive.)

What is the correlation? A. -0.986 B. -0.984 C. 0.984 D. 0.986 E. 0.993

A. 25/460 Explanation: Of the 460 healthy people, 25 tested positive.

What is the false-positive rate? That is, what is the probability of testing positive given that the person does not have HIV? A. 25/460 B. 25/60 C. 35/40 D. 25/500 E. (35+25+5)/500

C. 3.7 Explanation: This is a binomial with n = 10 and p = 0.37, and so the mean is np = 10(0.37) = 3.7.

What is the mean of X? A. 0.37 B. 0.63 C. 3.7 D. 6.3 E. None of the above
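As a rough check of the np shortcut, the sketch below computes the binomial mean two ways, using the n = 10 and p = 0.37 stated in the explanation. The variable names are illustrative.

```python
from math import comb

n, p = 10, 0.37  # values stated in the explanation

# Shortcut formula for the mean of a binomial random variable.
mean_formula = n * p

# The same mean from the definition E[X] = sum over k of k * P(X = k).
mean_definition = sum(
    k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)
)

print(round(mean_formula, 4), round(mean_definition, 4))
```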

A. 35/500 Explanation: 35 out of the entire population of 500 both have HIV and tested positive.

What is the predictive value of the test? That is, what is the probability that a person has HIV and tests positive? A. 35/500 B. 35/60 C. 35/40 D. 35/(35+25+5) E. (35+25+5)/500

A. 5,592,012/9,664,994 Explanation: Conditional probability P(over 70 | at least 20)

What is the probability that a 20-year-old will survive to be 70? A. 5,592,012/9,664,994 B. 5,592,012/10,000,000 C. 9,664,994/10,000,000 D. 1-5,592,012/9,664,994 E. (1-5,592,012/9,664,994)/10,000,000
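The conditional probability can be written out step by step. This sketch assumes the three counts in the answer choices are survivors out of 10,000,000 births; the life table itself is not reproduced on this card.

```python
from fractions import Fraction

# Counts quoted in the answer choices. Reading them as "alive at age 20/70
# out of 10,000,000 births" is an assumption, since the table is not shown.
births = 10_000_000
alive_at_20 = 9_664_994
alive_at_70 = 5_592_012

p_reach_20 = Fraction(alive_at_20, births)
p_reach_70 = Fraction(alive_at_70, births)

# Anyone alive at 70 was also alive at 20, so P(70 and 20) = P(70),
# and P(70 | 20) = P(70) / P(20).
p_70_given_20 = p_reach_70 / p_reach_20
print(float(p_70_given_20))
```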

D. 0.892

What is the probability that a senator is under 70 years old given that he or she is at least 50 years old? A. 0.580 B. 0.624 C. 0.643 D. 0.892 E. 0.969

E. 475/1,000 Explanation: Column total divided by table total

What is the probability that a student has a GPA between 2.0 and 3.0? A. 25/110 B. 450/890 C. 25/1,000 D. 450/1,000 E. 475/1,000

C. 80/1,000 Explanation: Cell value divided by table total

What is the probability that a student has a GPA under 2.0 and has skipped many classes? A. 80/110 B. 80/255 C. 80/1,000 D. (110+255)/1,000 E. (110+255-80)/1,000

A. 80/110 Explanation: Cell value divided by row total

What is the probability that a student has a GPA under 2.0 given that he has skipped many classes? A. 80/110 B. 80/255 C. 110/255 D. 80/1,000 E. 110/1,000

E. (110+255-80)/1,000 Explanation: Probability of a union: P(A or B) = P(A) + P(B) − P(A and B), which here is 110/1,000 + 255/1,000 − 80/1,000 = 285/1,000.

What is the probability that a student has a GPA under 2.0 or has skipped many classes? A. 80/110 B. 80/255 C. 80/1,000 D. (110+255)/1,000 E. (110+255-80)/1,000
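The joint, conditional, and union probabilities on the GPA cards all come from the same four counts. The sketch below assumes the marginal totals are 110 (skipped many classes) and 255 (GPA under 2.0), as the other answers on these cards imply; the full table is not reproduced here.

```python
from fractions import Fraction

total = 1000          # all students in the table
skipped_many = 110    # students who skipped many classes (assumed marginal total)
gpa_under_2 = 255     # students with GPA under 2.0 (assumed marginal total)
both = 80             # students in both groups

# "and": cell value divided by table total.
p_joint = Fraction(both, total)

# "given": cell value divided by the total of the conditioning group.
p_conditional = Fraction(both, skipped_many)

# "or": add the two totals, then subtract the overlap counted twice.
p_union = Fraction(skipped_many + gpa_under_2 - both, total)

print(p_joint, p_conditional, p_union)
```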

(C) Predicted winning percentage = 44 + 0.0003(34,000) = 54.2, and Residual = Observed − Predicted = 55 − 54.2 = 0.8.

What is the residual if a team has a winning percentage of 55% with an average attendance of 34,000? A. -11.0 B. -0.8 C. 0.8 D. 11.0 E. 23.0
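The residual calculation can be sketched in a few lines. The regression line 44 + 0.0003 × attendance is the one quoted in the explanation, and the function name is illustrative.

```python
def predicted_win_pct(attendance):
    # Regression line quoted in the explanation: 44 + 0.0003 * attendance.
    return 44 + 0.0003 * attendance

observed = 55  # observed winning percentage

# Residual = observed - predicted; positive means the team did better
# than the line predicts for its attendance.
residual = observed - predicted_win_pct(34_000)
print(round(residual, 1))
```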

E. 35/40 Explanation: Of the 40 people with HIV, 35 tested positive.

What is the sensitivity of the test? That is, what is the probability of testing positive given that the person has HIV? A. (35+25+5)/500 B. 35/(35+25+5) C. 35/500 D. 35/60 E. 35/40

E. 435/460 Explanation: Of the 460 healthy people, 435 tested negative.

What is the specificity of the test? That is, what is the probability of testing negative given that the person does not have HIV? A. 5/40 B. 35/60 C. (35+25+435)/500 D. 435/500 E. 435/460
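The sensitivity, specificity, and false-positive rate on the HIV cards all come from one two-way table. This sketch uses the counts given in the explanations (500 people, 40 with HIV and 460 without). The variable names are illustrative.

```python
# Counts taken from the explanations on the HIV cards.
hiv_pos, hiv_neg = 35, 5            # people with HIV, by test result
healthy_pos, healthy_neg = 25, 435  # people without HIV, by test result

# Sensitivity: P(test positive | has HIV).
sensitivity = hiv_pos / (hiv_pos + hiv_neg)

# Specificity: P(test negative | does not have HIV).
specificity = healthy_neg / (healthy_pos + healthy_neg)

# False-positive rate: P(test positive | does not have HIV).
false_positive_rate = healthy_pos / (healthy_pos + healthy_neg)

print(sensitivity, specificity, false_positive_rate)
```

Note that the false-positive rate and the specificity condition on the same group (the 460 healthy people), so they sum to 1.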

(E) There were 15 + 10 + 25 = 50 administrators, and 25 of them picked strict as most important: 25/50 = 0.5 or 50%.

What percentage of administrators picked strict as most important? A. 5% B. 10% C. 20% D. 25% E. 50%

C. 39.9% Explanation: The coefficient of determination r^2 gives the proportion of the variation in y that is accounted for by the linear relationship with x. In this case r^2 = (0.632)^2 = 0.399 or 39.9%.

What percentage of the variation in GPAs can be accounted for by looking at the linear relationship between GPAs and SAT scores? A. 0.161% B. 16.1% C. 39.9% D. 63.2% E. This value cannot be computed from the information given.
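The arithmetic behind the answer is a single squaring step, sketched here with the correlation r = 0.632 quoted in the explanation.

```python
r = 0.632  # correlation between GPA and SAT score, as quoted in the explanation

# Coefficient of determination: the share of variation in GPA accounted for
# by the linear relationship with SAT score.
r_squared = r ** 2
percent_explained = round(100 * r_squared, 1)
print(percent_explained)
```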

(E) There were 150 + 50 + 10 = 210 people picking enthusiastic as most important, and 150 of them were students: 150/210 = 0.714, or 71.4%.

What percentage of those picking enthusiastic as most important were students? A. 30% B. 42% C. 50% D. 60% E. 71.4%

(A) In the bar corresponding to the Northeast, the segment corresponding to country music stretches from the 50% level to the 70% level, indicating a length of 20%.

What percentage of those surveyed from the Northeast prefer country music? A. 20% B. 30% C. 40% D. 50% E. 70%

(A) Of the 500 people surveyed, 125 both picked challenging as most important and were teachers: 125/500 = 0.25, or 25%.

What percentage of those surveyed picked challenging as most important and were teachers? A. 25% B. 38% C. 40% D. 62.5% E. 65.8%

(E) Of the 500 people surveyed, 50 + 150 + 50 = 250 were students, and 250/500 = 0.5, or 50%.

What percentage of those surveyed were students? A. 10% B. 20% C. 30% D. 40% E. 50%

C. This study was an experiment in which the subjects were used as their own controls. Explanation: In experiments on people, the subjects can be used as their own controls, with responses noted before and after the treatment. However, with such designs there is always the danger of a placebo effect. Thus, the design of choice would involve a separate control group to be used for comparison.

When the estrogen-blocking drug tamoxifen was first introduced to treat breast cancer, there was concern that it would cause osteoporosis as a side effect. To test this concern, cancer subjects were randomly selected and given tamoxifen, and their bone density was measured before and after treatment. Which of the following is a true statement? A. This study was an observational study. B. This study was a sample survey of randomly selected cancer patients. C. This study was an experiment in which the subjects were used as their own controls. D. With the given procedure, there cannot be a placebo effect. E. Causation cannot be concluded without knowing the survival rates.

C. Administrator Explanation: The percentages of students, teachers, and administrators picking strict as most important were 20%, 12.5%, and 50%, respectively.

Which group of people were most likely to pick strict as most important? A. Student B. Teacher C. Administrator D. Teacher and administrator, equally E. Student, teacher, and administrator, equally
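The percentages quoted on the survey cards can be reproduced from a two-way table. The table itself is not shown on these cards, so the counts below are reconstructed from the explanations (for example, 25 teachers picking strict follows from 12.5% of 200 teachers) and should be treated as an assumption.

```python
# Counts reconstructed from the explanations on these cards; the original
# table is not reproduced in the source, so this layout is an assumption.
counts = {
    "student":       {"challenging": 50,  "enthusiastic": 150, "strict": 50},
    "teacher":       {"challenging": 125, "enthusiastic": 50,  "strict": 25},
    "administrator": {"challenging": 15,  "enthusiastic": 10,  "strict": 25},
}

def pct_picking(group, trait):
    # Conditional percentage: share of one group that picked the given trait.
    row = counts[group]
    return 100 * row[trait] / sum(row.values())

strict_pct = {group: pct_picking(group, "strict") for group in counts}
most_likely = max(strict_pct, key=strict_pct.get)
surveyed = sum(sum(row.values()) for row in counts.values())

print(strict_pct, most_likely, surveyed)
```

Comparing groups requires conditional percentages within each group, not raw counts: students and administrators have very different group sizes.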

B. Smaller mean, a; smaller standard deviation, b Explanation: Curve a appears to have a mean of 6 and a standard deviation of 2, while curve b appears to have a mean of 18 and a standard deviation of 1.

Which has the smaller mean, and which has the smaller standard deviation? A. Smaller mean, a; smaller standard deviation, a B. Smaller mean, a; smaller standard deviation, b C. Smaller mean, b; smaller standard deviation, a D. Smaller mean, b; smaller standard deviation, b E. Smaller mean, a; same standard deviation

E. Schools with lower percentages of students taking the exam tend to have higher average combined SAT scores. Explanation: The negative value of the slope (−2.84276) gives that, on average, the predicted combined SAT score of a school is 2.84 points lower for each one unit higher in the percentage of students taking the exam. Choices (A) through (D) are incorrect for the following reasons. The variable column indicates the independent (explanatory) variable. The sign of the correlation is the same as the sign of the slope (negative here). In this example, the y intercept is meaningless (predicted SAT result if no students take the exam). There can be a strong linear relation with a high r^2 value but still a distinct pattern in the residual plot indicating that a nonlinear fit may be even stronger.

Which of the following is a correct conclusion? A. "SAT" in the variable column indicates that SAT score is the dependent (response) variable. B. The correlation is 0.875. C. The y intercept indicates the mean combined SAT score if percent of students taking the exam has no effect on combined SAT scores. D. The r^2 value indicates that the residual plot does not show a strong pattern. E. Schools with lower percentages of students taking the exam tend to have higher average combined SAT scores.

B. Sampling error reflects natural variation between samples, is always present, and can be described using probability. Explanation: Different samples give different sample statistics, all of which are estimates of a population parameter. Sampling error (also called sampling variability) relates to natural variation between samples, can never be eliminated, can be described using probability, and is generally smaller if the sample size is larger. Furthermore, it is not an error or mistake on anyone's part!

Which of the following is a true statement about sampling error? A. Sampling error can be eliminated only if a survey is both extremely well designed and extremely well conducted. B. Sampling error reflects natural variation between samples, is always present, and can be described using probability. C. Sampling error is generally larger when the sample size is larger. D. Sampling error implies an error, possibly very small, but still an error on the part of the surveyor. E. Sampling error is higher when bias is present.

D. Convenience samples often lead to undercoverage bias. Explanation: If there is bias, taking a larger sample just magnifies the bias on a larger scale. If there is enough bias, the sample can be worthless. Even when the subjects are chosen randomly, there can be bias due, for example, to nonresponse or to the wording of the questions. Convenience samples, like shopping mall surveys, are based on choosing individuals who are easy to reach, and they typically miss a large segment of the population. Voluntary response samples, like radio call-in surveys, are based on individuals who offer to participate, and they typically overrepresent persons with strong opinions.

Which of the following is a true statement? A. If bias is present in a sampling procedure, it can be overcome by dramatically increasing the sample size. B. There is no such thing as a "bad sample." C. Sampling techniques that use probability techniques effectively eliminate bias. D. Convenience samples often lead to undercoverage bias. E. Voluntary response samples often underrepresent people with strong opinions.

C. Stemplots can show symmetry, gaps, clusters, and outliers. Explanation: Stemplots are not used for categorical data sets, are too unwieldy to be used for very large data sets, and show every individual value. Stems should never be skipped over—gaps are important to see.

Which of the following is a true statement? A. Stemplots are useful for both quantitative and categorical data sets. B. Stemplots are equally useful for small and very large data sets. C. Stemplots can show symmetry, gaps, clusters, and outliers. D. Stemplots may or may not show individual values. E. Stems may be skipped if there is no data value for a particular stem.

B. In histograms, frequencies can be determined from relative heights. Explanation: Histograms give information about relative frequencies (relative areas correspond to relative frequencies) and may or may not have an axis with actual frequencies. Symmetric histograms can have any number of peaks. Choice of width and number of classes changes the appearance of a histogram. Stemplots clearly show outliers; however, in histograms outliers may be hidden in large class widths.

Which of the following is an incorrect statement? A. In histograms, relative areas correspond to relative frequencies. B. In histograms, frequencies can be determined from relative heights. C. Symmetric histograms may have multiple peaks. D. Two students working with the same set of data may come up with histograms that look different. E. Displaying outliers may be more problematic when using histograms than when using stemplots.

E. It is impossible to determine the answer without knowing the actual numbers of people involved. Explanation: The given bar chart shows percentages, not actual numbers.

Which of the following is greatest? A. The number of people in the Northeast who prefer pop or rock B. The number of people in the West who prefer classical C. The number of people in the South who prefer country D. The above are all equal. E. It is impossible to determine the answer without knowing the actual numbers of people involved.

B. The percentage of those from the West who prefer country Explanation: Based on lengths of indicated segments, the percentage from the West who prefer country is the greatest.

Which of the following is greatest? A. The percentage of those from the Northeast who prefer classical B. The percentage of those from the West who prefer country C. The percentage of those from the South who prefer pop or rock D. The above are all equal. E. It is impossible to determine the answer without knowing the actual numbers of people involved.

D. Blocking results in increased accuracy because the blocks have smaller size than the original group. Explanation: Unnecessary blocking detracts from accuracy because of smaller sample sizes. Blocking in experiment design first divides the subjects into representative groups called blocks, just as stratification in sampling design first divides the population into representative groups called strata. This procedure can control certain variables by bringing them directly into the picture, and thus conclusions are more specific. The paired comparison design is a special case of blocking in which each pair can be considered a block. In a block design, subjects within each block are randomly assigned treatments. One can think of blocking as running parallel experiments before combining the results.

Which of the following is incorrect? A. Blocking is to experiment design as stratification is to sampling design. B. By controlling certain variables, blocking can make conclusions more specific. C. The paired comparison design is a special case of blocking. D. Blocking results in increased accuracy because the blocks have smaller size than the original group. E. In a randomized block design, the randomization occurs within the blocks.

B. The distribution is skewed left and right. Explanation: There is no such thing as being skewed both left and right.

Which of the following is not a correct statement about this distribution? A. The distribution is roughly bell-shaped. B. The distribution is skewed left and right. C. The center is around 60. D. The spread is from 22 to 90. E. There are no outliers.

B. 1.1 mph Explanation: With bell-shaped data, the empirical rule applies. Given that the spread from 92 to 98 is roughly 6 standard deviations, one standard deviation is about (98 − 92)/6 = 1 mph, and 1.1 mph is the closest choice.

Which of the following is the best estimate of the standard deviation of these speeds? A. 0.5 mph B. 1.1 mph C. 1.6 mph D. 2.2 mph E. 6.0 mph
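The empirical-rule estimate can be sketched as follows, taking the spread of 92 to 98 mph from the explanation and picking the closest answer choice. The variable names are illustrative.

```python
low, high = 92, 98  # approximate spread of the speeds, from the explanation
choices = [0.5, 1.1, 1.6, 2.2, 6.0]

# For bell-shaped data nearly all values fall within 3 standard deviations
# of the mean, so the full spread covers about 6 standard deviations.
sd_estimate = (high - low) / 6

# Pick the answer choice nearest to the rough estimate.
best_choice = min(choices, key=lambda c: abs(c - sd_estimate))
print(sd_estimate, best_choice)
```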

C. Correlation is not affected by which variable is called x and which is called y. Explanation: If the points lie on a straight line, r = ±1. x and y are interchangeable, and r does not depend on which variable is called x or y. However, since means and standard deviations can be strongly influenced by outliers, r too can be strongly affected by extreme values. While r = 0.75 indicates a better fit with a linear model than r = 0.25 does, we cannot say that the linearity is threefold.

Which of the following statements about correlation r is true? A. A correlation of 0.2 means that 20% of the points are highly correlated. B. Perfect correlation, that is, when the points lie exactly on a straight line, results in r = 0. C. Correlation is not affected by which variable is called x and which is called y. D. Correlation is not affected by extreme values. E. A correlation of 0.75 indicates a relationship that is 3 times as linear as one for which the correlation is only 0.25.
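Three of the points in the explanation (symmetry in x and y, r = ±1 on a line, sensitivity to extreme values) can be checked numerically. The data below are hypothetical and `pearson_r` is an illustrative helper, not a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    # Pearson correlation computed directly from its definition.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Swapping the roles of x and y leaves r unchanged.
r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)

# Points exactly on an upward-sloping line give r = 1.
r_line = pearson_r(x, [3 * v + 1 for v in x])

# A single extreme point can change r substantially, here even its sign.
r_outlier = pearson_r(x + [20], y + [-10])

print(r_xy, r_yx, r_line, r_outlier)
```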

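D. I, II, and III Explanation: The sum and thus the mean of the residuals are always zero. The residuals are also uncorrelated with x, so the regression line for a residual plot is horizontal. In a good straight-line fit, the residuals show a random pattern; a definite pattern suggests that a nonlinear model would fit better.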

Which of the following statements about residuals from the least squares line are true? I. The mean of the residuals is always zero. II. The regression line for a residual plot is a horizontal line. III. A definite pattern in the residual plot is an indication that a nonlinear model will show a better fit to the data than the straight regression line. A. I and II only B. I and III only C. II and III only D. I, II, and III E. None of the above gives the complete set of true responses.
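Statements I and II can be verified on any data set. The sketch below fits a least squares line to hypothetical data, then fits a second line to the residuals. `least_squares` is an illustrative helper written for this example.

```python
def least_squares(xs, ys):
    # Slope and intercept of the least squares line of ys on xs.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept = least_squares(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Statement I: the residuals average to zero (up to rounding error).
mean_residual = sum(residuals) / len(residuals)

# Statement II: regressing the residuals on x gives a horizontal line at zero.
resid_slope, resid_intercept = least_squares(x, residuals)

print(mean_residual, resid_slope, resid_intercept)
```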

B. Slope b = 0.377 Coefficient of determination r^2 = 97.0% Standard deviation of the residuals s = 0.617 Explanation: The point (35, 14) appears to follow the linear trend of the rest of the data. The slope wouldn't change, but with another point added to the trend, r^2 will increase, and s, the measure of typical deviation from the line, will decrease.

With the addition of a data point at (35, 14), which one of the following choices gives the most likely new regression statistics? A. Slope b = 0.377 Coefficient of determination r^2 = 97.0% Standard deviation of the residuals s = 0.663 B. Slope b = 0.377 Coefficient of determination r^2 = 97.0% Standard deviation of the residuals s = 0.617 C. Slope b = 0.377 Coefficient of determination r^2 = 93.0% Standard deviation of the residuals s = 0.640 D. Slope b = 0.377 Coefficient of determination r^2 = 93.0% Standard deviation of the residuals s = 0.663 E. Slope b = 0.397 Coefficient of determination r^2 = 95.0% Standard deviation of the residuals s = 0.640

