Probability & Statistics #6
The pooled t-procedures can always be used to compare the means of two populations.
False
A new approach for teaching mathematics was introduced that engages students in group investigations and mathematical modeling. After field tests in 36 high schools over a three-year period, researchers compared the performance of students taught with the new approach with that of students taught using a traditional curriculum. During the study, students in both the new-approach classes and the traditional classes took an Algebra test that did not allow them to use calculators. The accompanying table shows the results. Are the mean scores of the two groups significantly different? a) Write appropriate hypotheses. Let μN be the mean Algebra test score for students in classes that use the new approach. Let μT be the mean Algebra test score for students in traditional classes. b) Do you think the assumptions for inferences are satisfied? Explain. c) Refer to the computer output for the hypothesis test. Explain what the P-value means in this context. Choose the correct answer below. d) State the conclusion of this test.
A. H0: μN−μT=0 HA: μN−μT≠0 B. Yes. The groups are independent, though it is not certain if students were randomly assigned to each class. However, the sample sizes are large enough that the Central Limit Theorem applies. C. If the means for the two classes are actually equal, there is less than a 1 in 10,000 chance of seeing a difference as large as or larger than the observed difference just from natural sampling variation. D. On average, students who learn in the new class method do significantly worse on Algebra tests that do not allow them to use calculators than students who learn by traditional methods.
Determine which of the scenarios should be analyzed as paired data. a) Workers at a grocery store undergo shelf-stocking efficiency training. The store supervisor's satisfaction ratings for shelf stocking on the day of the training and three months after the training are compared. b) 20 male and 20 female students in a class take a midterm. Their scores are compared. c) A tour group of prospective freshmen is asked about the quality of the university cafeteria. A second tour group is asked about the same cafeteria, one month later. Do prospective students' opinions change based on the time of year?
A. This scenario should be analyzed using paired data because the groups are dependent and have a natural pairing. B. This scenario should not be analyzed using paired data because the groups are independent and do not have a natural pairing. C. This scenario should not be analyzed using paired data because the groups are independent and do not have a natural pairing.
Are people who buy organic produce more likely to use alternative medicine (such as acupuncture) than those who don't? A student finds a 95% confidence interval for p(organic)− p(conventional) to be (0.045,0.081). What can we conclude from the interval?
Both endpoints of the interval are positive so p(organic) is larger than p(conventional) with 95% confidence.
An insurance agent randomly selected 10 of his clients and checked online price quotes for their policies. His summaries of the data are shown, where Diff is Local−Online.
Variable  Count  Mean     StdDev
Local     10     871.400  187.201
Online    10     828.100  317.306
Diff      10     43.3000  183.369
Test an appropriate hypothesis to see if there is evidence that drivers might save money by switching to an online agent. Use α=0.05.
H0: μd=0 HA: μd>0. Yes, all three conditions for matched pairs inference are satisfied. Test statistic: t = 0.75. P-value = 0.237. Fail to reject H0. There is insufficient evidence that drivers would save money by switching.
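The test statistic and P-value in this answer can be checked from the Diff row of the summary alone. The following is a minimal sketch using only the Python standard library; the helper names `t_pdf` and `t_upper_tail` are illustrative, and the tail area is approximated by numerical integration (Simpson's rule) rather than taken from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary statistics for Diff = Local - Online (from the table in the question).
n = 10
mean_diff = 43.3
sd_diff = 183.369

se = sd_diff / math.sqrt(n)           # standard error of the mean difference
t_stat = mean_diff / se               # paired t statistic, df = n - 1
p_value = t_upper_tail(t_stat, n - 1) # one-sided, matching HA: mu_d > 0

print(f"t = {t_stat:.2f}, one-sided P-value = {p_value:.3f}")
```

Because the P-value is well above α=0.05, the sketch leads to the same decision as the answer: fail to reject H0.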
Suppose an advocacy organization surveys 960 citizens of Country A and 198 of them reported being born in another country. Similarly, 178 out of 1450 citizens of Country B reported being foreign-born. This information was used to create a 95% two-proportion confidence interval for the difference between Country A citizens and Country B citizens who were born in foreign countries. For the organization's interval, explain what "95% confidence" means. 95% confidence interval for pA−pB is (5.28%,11.42%)
If the organization were to take repeated samples of Country A citizens and Country B citizens, they would expect 95% of the intervals to contain the true difference in the proportion of foreign-born citizens.
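The "repeated samples" reading of 95% confidence can be illustrated with a small simulation. This is only a sketch: the true proportions `TRUE_A` and `TRUE_B` are made-up values chosen for illustration (the real ones are unknown), and `two_prop_ci` is a hypothetical helper implementing the usual two-proportion z-interval.

```python
import random
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x_a, n_a, x_b, n_b, level=0.95):
    """Two-proportion z-interval for p_A - p_B from counts and sample sizes."""
    p_a, p_b = x_a / n_a, x_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (p_a - p_b) - z * se, (p_a - p_b) + z * se

# The organization's interval, from the counts given in the question.
low, high = two_prop_ci(198, 960, 178, 1450)
print(f"Observed interval: ({low:.2%}, {high:.2%})")

# ASSUMPTION (illustration only): pretend the true proportions are known.
TRUE_A, TRUE_B = 0.20, 0.12
random.seed(1)
reps = 1000
hits = 0
for _ in range(reps):
    # Draw a fresh sample from each country and rebuild the interval.
    x_a = sum(random.random() < TRUE_A for _ in range(960))
    x_b = sum(random.random() < TRUE_B for _ in range(1450))
    lo, hi = two_prop_ci(x_a, 960, x_b, 1450)
    hits += lo <= TRUE_A - TRUE_B <= hi
coverage = hits / reps
print(f"Share of simulated intervals covering the true difference: {coverage:.3f}")
```

The coverage share should land near 0.95, which is what the confidence level describes: the long-run success rate of the procedure, not the probability that any one interval is right.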
Suppose we did an experiment to gather data on the number of un-popped kernels of microwave popcorn. Half of the students in your class took home a bag, put it in the freezer overnight, and popped it the next day. The other half took the bag, left it on their kitchen counter overnight, and popped it the next day. Each group counted the number of un-popped kernels. Are these data paired or independent?
Independent
Do women spend more time talking on the phone than men? A student collects data from students at her university and reports a 90% confidence interval for μ(men)−μ(women) to be (−150,30), where μ(men) is the mean number of minutes per month for men. Based on this confidence interval, can it be concluded that μ(women) is higher than μ(men)?
No
On a final project in an introductory statistics class, a student reports a p-value of 0.0893. She states, "There is an 8.93% chance that my null hypothesis is true." Is her conclusion correct?
No
A student measuring how gasoline prices change records the cost of gas at 10 randomly selected stations in her hometown. One week later, she records the price again at the same 10 stations. Are these data paired or independent?
Paired
Suppose we did a new experiment to gather data on the number of un-popped kernels of microwave popcorn. Each student in your class brings home two bags. One is put in the freezer overnight, while the other is left on the kitchen counter overnight. Each bag is popped the next day and the number of un-popped kernels is recorded. Are these data paired or independent?
Paired
Do consumers spend more on a trip to Store A or Store B? Suppose researchers interested in this question collected a systematic sample for 85 Store A customers and 85 Store B customers by asking customers for their purchase amount as they left the store. The data collected is summarized in the given table. To perform inference on these two samples, what conditions must be met? Are they? Explain.
One must assume that the samples were random or otherwise independent of each other. It must also be assumed that the distributions are roughly normal, so it would be a good idea to check a histogram to make sure there is not strong skewness or outliers.
886 randomly sampled teens were asked which of several personal items of information they thought it okay to share with someone they had just met. 44% said it was okay to share their e-mail addresses, but only 29% said they would give out their cell phone numbers. A researcher claims that a two-proportion z-test could tell whether there was a real difference among all teens. Explain why that test would not be appropriate for these data.
The responses are not from two independent groups.
In order to use the methods of this chapter to compare two proportions, we need to check our assumptions. Which of the following conditions do we need to check?
The samples are not more than 10% of the populations. The two groups are independent. There are at least 10 successes and 10 failures in each group. The data is drawn independently and at random.
A student measuring how gasoline prices change records the cost of gas at 10 randomly selected stations in her hometown. One week later, she records the price again at the same 10 stations. She wants to estimate the mean price increase and subtracts week 2 from week 1. Her 90% confidence interval is (−$0.23,−$0.06). What can she conclude?
These are both negative numbers so the week 2 prices are on average higher than the week 1 prices.
Suppose an experiment was done to gather data on the number of unpopped kernels of microwave popcorn. Half of the students in your class took home a bag, put it in the freezer overnight, and popped it the next day. The other half took the bag, left it on their kitchen counter overnight, and popped it the next day. Each group counted the number of unpopped kernels. The class constructs a 95% confidence interval for μ(freezer)−μ(counter) and finds it to be (−117,−32). What can be concluded?
These are both negative numbers, so μ(counter) is larger than μ(freezer) with 95% confidence.
In the July 2007 issue, Consumer Reports examined the calorie content of two kinds of hot dogs: meat (usually a mixture of pork, turkey, and chicken) and all beef. The researchers purchased samples of several different brands. The meat hot dogs averaged 111.7 calories, compared to 135.4 for the beef hot dogs. A test of the null hypothesis that there's no difference in mean calorie content yields a P-value of 0.124. Would a 95% confidence interval for μMeat−μBeef include 0? Explain.
Yes. The P-value of 0.124 is larger than 0.05, so we lack evidence of a difference at the 5% level. That means 0 is a plausible value for μMeat−μBeef, so it lies inside the 95% confidence interval.
A student measuring how gasoline prices change records the cost of gas at 10 randomly selected stations in her hometown. One week later, she records the price again at the same 10 stations. She wants to test the hypothesis that prices have increased and subtracts week 2 from week 1. What is an appropriate alternative hypothesis?
μ(diff)<0
A 95% confidence interval for the difference in mean fat content for meat vs. all-beef hot dogs is (−6.6,−1.4) grams for μMeat−μBeef. Explain why you think each of the following statements is true or false. a) If one eats a meat hot dog instead of a beef dog, there's a 95% chance they'll consume less fat. Choose the correct answer below. b) 95% of meat hot dogs have between 1.4 and 6.6 grams less fat than a beef hot dog. Choose the correct answer below. c) One is 95% confident that meat hot dogs average 1.4-6.6 grams less fat than the beef hot dogs. Choose the correct answer below. d) If one were to get more samples of both kinds of hot dogs, 95% of the time the meat hot dogs would average 1.4-6.6 grams less fat than the beef hot dogs. e) If one tested many samples, they'd expect about 95% of the resulting confidence intervals to include the true difference in mean fat content between the two kinds of hot dogs.
A. False, the confidence interval is about means, not about the individual hot dogs. B. False, the confidence interval is about means, not about the individual hot dogs. C. True. D. False, the confidence interval estimates the true difference in population means. There is no reason to expect other samples to conform to this result. E. True
A new vaccine was tested to see if it could prevent the ear infections that many infants suffer from. Babies about a year old were randomly divided into two groups. One group received vaccinations, and the other did not. The following year, only 352 of 2460 vaccinated children had ear infections, compared to 439 of 2454 unvaccinated children. A positive 99.9% confidence interval for the difference in the rates of ear infection was used to examine the effectiveness of a vaccine against ear infections in babies and was found to be (0.1%,7.0%). Suppose that instead you had conducted a hypothesis test. a) Write appropriate hypotheses. b) State a conclusion based on the given confidence interval. Choose the correct answer below. c) If that conclusion is wrong, which type of error did you make? d) What would be the consequences of such an error?
A. H0:pV−pNV=0 HA:pV−pNV<0 B. Because 0 is not in the confidence interval, reject the null hypothesis. There is sufficient evidence to conclude that the vaccine reduces the rate of ear infections. C. Type I D. Babies would be given ineffective vaccinations.
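The quoted 99.9% interval can be reproduced from the counts in the question. This is a minimal sketch of the standard two-proportion z-interval; the direction of the difference (unvaccinated minus vaccinated) is an assumption, chosen because it yields the positive interval the question describes.

```python
from math import sqrt
from statistics import NormalDist

infected_v, n_v = 352, 2460      # vaccinated babies
infected_nv, n_nv = 439, 2454    # unvaccinated babies

p_v = infected_v / n_v
p_nv = infected_nv / n_nv

# ASSUMPTION: difference taken as unvaccinated minus vaccinated.
diff = p_nv - p_v
se = sqrt(p_v * (1 - p_v) / n_v + p_nv * (1 - p_nv) / n_nv)

# Critical value for 99.9% confidence (0.05% in each tail).
z_star = NormalDist().inv_cdf(1 - 0.001 / 2)

low, high = diff - z_star * se, diff + z_star * se
print(f"99.9% CI for p_NV - p_V: ({low:.1%}, {high:.1%})")
```

Since the whole interval sits above 0, a test of H0: pV−pNV=0 at a matching significance level rejects the null, which is the conclusion in part B.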
Researchers collected samples of water from streams in a mountain range to investigate the effects of acid rain. They measured the pH (acidity) of the water and classified the streams with respect to the kind of substrate (type of rock over which they flow). A lower pH means the water is more acidic. The plot to the right shows the pH of the streams by substrate (limestone, mixed, or shale). Selected parts of a software analysis comparing pH of streams with mixed and limestone substrates are shown below. Complete parts a through c. 2-Sample t-test of μ1−μ2=0, Difference Between Means=0.540 t-Statistic=6.30 w/134 df, P≤0.0001 a) State the null and alternative hypotheses for this test. Choose the correct answer below. b) From the information you have, do the assumptions and conditions appear to be met? Which of the following conditions are satisfied for the given data? c) What conclusion would you draw?
A. H0: μM−μL=0; HA: μM−μL≠0 B. Independent groups assumption; nearly normal condition; independence assumption. C. Reject H0. There is strong evidence that the streams with mixed substrates have mean pH levels different from those of streams with limestone substrates.
A researcher has data on the city and highway fuel efficiency of 316 cars and 316 trucks. a) Would it be appropriate to use paired t methods to compare the cars and the trucks? b) Would it be appropriate to use paired t methods to compare the city and highway fuel efficiencies of these vehicles? c) A histogram of the differences (highway−city) is given. Are the conditions for inference satisfied?
A. No, the vehicles have no natural pairing. B. Possibly; the data are quantitative and paired by vehicle. C. No. There are several outliers.
Researchers examined whether people stay at home more on Friday the 13th. The data are the number of cars passing two intersections for consecutive Fridays (the 6th and 13th) for four different periods, along with summaries of two possible analyses. a) Which of the tests is appropriate for these data? b) Using the test selected in part a), state the proper conclusion. Use α=0.05. c) Are the assumptions and conditions for inference met?
A. The paired t-test should be used because the data for the 6th and 13th of each month are not independent. B. There is evidence that people tend to stay home on Friday the 13th because the P-value of the appropriate hypothesis test is less than α. C. There is not enough information.
Values for the labor force participation rate of women (LFPR) are published by the U.S. Bureau of Labor Statistics. A researcher is interested in whether there was a difference between female participation in 1968 and 1972, a time of rapid change for women. The researcher checks LFPR values for 19 randomly selected cities for 1968 and 1972, with the accompanying software output results for two possible tests. a) Which of these tests is appropriate for these data? Explain. b) Using the selected test, state the appropriate conclusion. Use α=0.05.
A. The paired t-test is appropriate because the data are taken from the same cities in different periods. B. There is a significant difference in the labor force participation rate for women in these cities; women's participation seems to have increased between 1968 and 1972.
A study examined the fat content (in grams) for samples of beef and meat hot dogs. The resulting 93% confidence interval for μMeat−μBeef is (−2.8,1.8). a) The endpoints of this confidence interval have opposite signs. What does that indicate? b) What does the fact that the confidence interval contains 0 indicate? c) If we use this confidence interval to test the hypothesis that μMeat−μBeef=0, what's the corresponding alpha level?
A. The type of hot dog with a higher mean fat content cannot be determined. B. The difference in the two sample means is not statistically significant. C. 7%
Researchers comparing the effectiveness of two pain medications randomly selected a group of patients who had been complaining of a certain kind of joint pain. They randomly divided these people into two groups, then administered the pain killers. Of the 123 people in the group who received medication A, 87 said it was effective. Of the 116 in the other group, 63 said pain reliever B was effective. a) Write a 95% confidence interval for the percent of people who may get relief from this kind of joint pain by using medication A. Interpret your interval. b) Write a 95% confidence interval for the percent of people who may get relief by using medication B. Interpret your interval. c) Do the intervals for A and B overlap? What do you think this means about the comparative effectiveness of these medications? d) Find a 95% confidence interval for the difference in the proportions of people who may find these medications effective. Interpret your interval. e) Does this interval contain zero? What does that mean? f) Why do the results in parts c and e seem contradictory? If we want to compare the effectiveness of these two pain relievers, which is the correct approach?
A. We are 95% confident, based on this study, that between 62.7% and 78.7% of patients with joint pain will find medication A effective. B. We are 95% confident, based on this study, that between 45.3% and 63.3% of patients with joint pain will find medication B effective. C. Yes, they overlap. This might indicate no difference in the effectiveness of the medications, although this is not a proper test. D. We are 95% confident that the proportion of patients who will find A effective is 4.2% to 28.6% higher than the proportion who will find B effective. E. No, it does not contain zero. That means that there is evidence of a difference in the effectiveness of the medications. F. To find the variability in the difference of proportions, add the variances. Comparing the two one-sample intervals does not do this; it effectively adds the standard deviations instead, which overstates the variability. The two-sample method is the correct approach.
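The apparent contradiction between parts c and e can be seen numerically. The sketch below uses the standard large-sample formulas with the counts from the question; with this formula the endpoints agree with the quoted ones to within about 0.1 percentage point, so small rounding differences are to be expected.

```python
from math import sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

x_a, n_a = 87, 123   # medication A: number reporting relief, group size
x_b, n_b = 63, 116   # medication B: number reporting relief, group size
p_a, p_b = x_a / n_a, x_b / n_b

# One-sample standard errors and intervals.
se_a = sqrt(p_a * (1 - p_a) / n_a)
se_b = sqrt(p_b * (1 - p_b) / n_b)
ci_a = (p_a - z * se_a, p_a + z * se_a)
ci_b = (p_b - z * se_b, p_b + z * se_b)
overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Two-sample interval: the variances add, not the standard deviations.
se_diff = sqrt(se_a**2 + se_b**2)
ci_diff = (p_a - p_b - z * se_diff, p_a - p_b + z * se_diff)

print(f"A: ({ci_a[0]:.1%}, {ci_a[1]:.1%})  B: ({ci_b[0]:.1%}, {ci_b[1]:.1%})")
print(f"One-sample intervals overlap: {overlap}")
print(f"Difference A - B: ({ci_diff[0]:.1%}, {ci_diff[1]:.1%})")
print(f"SE of difference {se_diff:.4f} vs. sum of SEs {se_a + se_b:.4f}")
```

The standard error of the difference is smaller than the sum of the two one-sample standard errors, which is why the one-sample intervals can overlap even though the interval for the difference excludes zero.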
Do consumers spend more on a trip to Store A or Store B? Suppose researchers interested in this question collected a systematic sample for 80 Store A customers and 80 Store B customers by asking customers for their purchase amount as they left the store. Using the given summary statistics, researchers calculated a 95% confidence interval for the mean difference between Store A and Store B purchase amounts. The interval was (−$19.94,−$8.06). Explain in context what this interval means.
With 95% confidence, the mean purchase amount at Store A is between $8.06 and $19.94 less than the mean purchase amount at Store B.
Suppose you were interested in testing whether older students had higher GPAs on average than younger students. Could you use the methods of this chapter to answer this question?
Yes
The data below shows the sugar content in grams of several brands of children's and adults' cereals. Create and interpret a 95% confidence interval for the difference in the mean sugar content, μC−μA. Be sure to check the necessary assumptions and conditions. (Note: Do not assume that the variances of the two data sets are equal.) Full data set Children's cereal: 40.2, 55.9, 49.3, 40.5, 54.4, 46.9, 51.3, 43.7, 41.6, 41.3, 48.6, 43.5, 36.2, 57.1, 49.9, 50.2, 36.2, 57.2, 43.5, 31.2 Adults' cereal: 21.2, 27.7, 0.3, 9, 0.5, 21, 19.2, 14, 22.6, 7.4, 9, 12.7, 16.5, 13.9, 4.9, 17.4, 4.1, 0.7, 0.9, 6, 10.5, 3.9, 3.1, 1.5, 7.2, 3.6, 17, 8.9, 19.8, 11.1
(31.04, 39.80). Based on these samples, with 95% confidence, children's cereals average between 31.04 and 39.80 more grams of sugar than adults' cereals.
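The interval can be checked directly from the listed data with a two-sample t-interval that does not pool variances (Welch's method). The sketch below sticks to the Python standard library, so the t critical value is found numerically; `t_pdf`, `t_cdf`, and `t_quantile` are illustrative helpers, not library functions. Depending on how the degrees of freedom are rounded, the endpoints may differ from the quoted ones by about 0.01 gram.

```python
import math
from statistics import mean, variance

children = [40.2, 55.9, 49.3, 40.5, 54.4, 46.9, 51.3, 43.7, 41.6, 41.3,
            48.6, 43.5, 36.2, 57.1, 49.9, 50.2, 36.2, 57.2, 43.5, 31.2]
adults = [21.2, 27.7, 0.3, 9, 0.5, 21, 19.2, 14, 22.6, 7.4,
          9, 12.7, 16.5, 13.9, 4.9, 17.4, 4.1, 0.7, 0.9, 6,
          10.5, 3.9, 3.1, 1.5, 7.2, 3.6, 17, 8.9, 19.8, 11.1]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) for x >= 0 using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the approximate CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Per-group variance of the sample mean (s^2 / n).
v_c = variance(children) / len(children)
v_a = variance(adults) / len(adults)

diff = mean(children) - mean(adults)
se = math.sqrt(v_c + v_a)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v_c + v_a) ** 2 / (v_c**2 / (len(children) - 1) + v_a**2 / (len(adults) - 1))

t_star = t_quantile(0.975, df)
low, high = diff - t_star * se, diff + t_star * se
print(f"diff = {diff:.3f}, SE = {se:.3f}, df = {df:.1f}")
print(f"95% CI for mu_C - mu_A: ({low:.2f}, {high:.2f})")
```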
Suppose an advocacy organization surveys 900 citizens of Country A and 193 of them reported being born in another country. Similarly, 172 out of 1150 citizens of Country B reported being foreign-born. If this information is to be used to make inferences about the proportion of all citizens of Country A and all citizens of Country B born in other countries, what conditions must be met before proceeding? Are they met? Explain. Assume Country A and Country B are large countries. a) What conditions must be met before proceeding? Select all that apply. b) Are the conditions met? Explain. Select all that apply.
A. The sample sizes must be less than 10% of the population. There must be at least 10 national-born and foreign-born citizens for both groups. The citizens of Country B must have been selected independently of Country A. The data must have been collected randomly. B. Yes, all the conditions are met. Some of them require assumptions, but none of those assumptions are unreasonable, and the others are clearly met based on the given information.
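Of the listed conditions, only the success/failure counts can be verified from the numbers given; randomization, independence of the two samples, and the 10% condition rest on assumptions about how the survey was run. A minimal sketch of the count check, with a hypothetical helper name, might look like this:

```python
def success_failure_counts(x_a, n_a, x_b, n_b):
    """Return the four counts used in the success/failure condition."""
    return {
        "A foreign-born": x_a,
        "A native-born": n_a - x_a,
        "B foreign-born": x_b,
        "B native-born": n_b - x_b,
    }

counts = success_failure_counts(193, 900, 172, 1150)

# Each count should be at least 10 for the normal approximation to be reasonable.
condition_met = all(c >= 10 for c in counts.values())

for name, c in counts.items():
    print(f"{name}: {c}")
print("Success/failure condition met:", condition_met)
```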
Suppose we did an experiment that gathered data on the time a person could stand balanced on one leg. Subjects in our experiment are timed standing on one leg with their eyes open, and also with their eyes closed. Are these data paired or independent?
Paired
Suppose an advocacy organization surveys 950 citizens of Country A and 198 of them reported being born in another country. Similarly, 173 out of 1250 citizens of Country B reported being foreign-born. This information was used to create a 95% two-proportion confidence interval for the difference between Country A citizens and Country B citizens who were born in foreign countries. Interpret this interval with a sentence in context. 95% confidence interval for pA−pB is (3.79%,10.22%)
The organization is 95% confident that, based on these data, the proportion of Country A citizens who are foreign-born is between 3.79 and 10.22 percentage points higher than the proportion of Country B citizens who are foreign-born.