AP Statistics Semester 2 Quiz/Checkpoint Questions
A random sample of 85 adults found that average calorie consumption was 2,100 per day. Previous research has found a standard deviation of 450 calories, and you use this value for σ. A researcher wants to estimate a 95% confidence interval and is willing to accept a margin of error of ± 50 calories. She knows it will cost $50 to survey each member of the sample. Given this information, how much will it cost to survey the minimum number of people?
$15,600 (remember to round up)
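The arithmetic behind this answer can be sketched in Python. This is a minimal illustration, assuming the usual formula n ≥ (z*σ/E)^2; the function name `min_sample_size` is made up for this example.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, margin, confidence):
    """Smallest n whose z-interval margin of error is at most `margin`."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would exceed the allowed margin.
    return math.ceil((z_star * sigma / margin) ** 2)

n = min_sample_size(sigma=450, margin=50, confidence=0.95)
total_cost = n * 50  # $50 per person surveyed
print(n, total_cost)
```

Rounding up matters here: the unrounded value is a little over 311, and 311 people would leave the margin of error slightly above 50 calories.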
Teachers in a medium-sized suburban school district have an average salary of $47,500 per year, with a standard deviation of $4,600. After negotiating with the school district, teachers receive a 5% raise and a one-time $500 bonus. What are the new mean and standard deviation of the teachers' salaries during the year in which they receive the bonus?
$50,375; $4,830
Elite Foods, a supermarket, is losing customers and suspects it's because another neighborhood market has lower prices. To determine whether this is the case, researchers from Elite Foods take a random sample of ten items from their store and compare their prices to prices for the same items at the other store:
Item | Elite Foods | Neighborhood Markets
1 | 1.65 | 1.49
2 | 2.19 | 2.00
3 | 1.99 | 2.09
4 | 3.49 | 2.99
5 | .99 | .99
6 | 1.59 | 1.79
7 | 2.89 | 2.39
8 | 4.50 | 4.25
9 | 1.19 | .99
10 | 1.99 | 1.79
If you take the difference in each pair, what set of numbers do you get for analysis in a single-sample procedure? (Calculate differences so that if the Elite Foods price is higher, the result is a negative number.)
(-.16, -.19, .1, -.5, 0, .2, -.5, -.25, -.2, -.2)
A whale-watching company noticed that many customers wanted to know whether it was better to book an excursion in the morning or the afternoon. To test this question, the company collected the following data on 15 randomly selected days over the past month:
Number of Whales Spotted
Morning (A) | Afternoon (B) | (B - A)
8 | 8 | 0
9 | 10 | 1
7 | 9 | 2
9 | 8 | -1
10 | 9 | -1
13 | 11 | -2
10 | 8 | -2
8 | 10 | 2
2 | 4 | 2
5 | 7 | 2
7 | 8 | 1
7 | 9 | 2
6 | 6 | 0
8 | 6 | -2
7 | 9 | 2
What's the 95% confidence interval for the true population mean difference for these data?
(-.507, 1.307)
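A matched-pairs interval is just a one-sample t interval on the differences. The sketch below uses the morning and afternoon counts from the whale-watching data. The critical value 2.145 (df = 14, 95%) is copied from a t table and hard-coded, because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

morning   = [8, 9, 7, 9, 10, 13, 10, 8, 2, 5, 7, 7, 6, 8, 7]
afternoon = [8, 10, 9, 8, 9, 11, 8, 10, 4, 7, 8, 9, 6, 6, 9]

# Afternoon minus morning, so a positive value means more whales in the afternoon.
diffs = [b - a for a, b in zip(morning, afternoon)]

T_STAR = 2.145  # assumed table value: df = 15 - 1 = 14, 95% confidence

d_bar = mean(diffs)
se = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
lower, upper = d_bar - T_STAR * se, d_bar + T_STAR * se
print(diffs)
print(lower, upper)
```

The interval contains 0, so these data give no clear evidence that either time of day is better.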
The following is a list of differences in first and second semester final grades (the first semester grade was subtracted from the second): -6, 10, -3, 5, 4, 8, -4, 9, -2, 0, -5, -5, 5, 12, 10. Compute a 95% confidence interval for the mean difference between first and second semester grades.
(-1.012, 6.078)
A statistics professor has been teaching first- and second-semester statistics for five years. She wants to know if the grades in her first-semester classes differ significantly from those of her second-semester classes. She takes a random sample of final grades from her two most recent first- and second-semester classes and gets these data: First semester: (89, 98, 78, 86, 95, 83, 90, 87, 85, 80, 96, 93, 90, 91, 81, 87, 93, 90, 87, 88) Second semester: (87, 85, 89, 78, 79, 77, 82, 90, 91, 93, 88, 87, 86, 83, 85, 81, 90, 86, 86, 84) She computes these sample statistics: x̅1 = 88.35, s1 = 5.314, n1 = 20 x̅2 = 85.35, s2 = 4.368, n2 = 20 Using your TI-83 calculator, construct a 99% confidence interval for the difference between these two populations. (Hint: Go to STAT TEST and choose 0: 2-SampTInt. Don't pool your variances.)
(-1.179, 7.1791)
A statistics professor has been teaching first- and second-semester statistics for five years. She wants to know if the grades in her first-semester classes differ significantly from those of her second-semester classes. She takes a random sample of final grades from her two most recent first- and second-semester classes and gets these data: x̅1 = 88.35, s1 = 5.314, n1 = 20 x̅2 = 85.35, s2 = 4.368, n2 = 20 Construct a 99% confidence interval for the difference between these two populations. Use the conservative method to calculate your degrees of freedom, k, and find your critical t value on a table. (Hint: Don't pool and don't use your calculator.)
(-1.4, 7.4)
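The conservative method can be sketched as follows: take k as the smaller of n1 - 1 and n2 - 1, look up t* for k degrees of freedom, and use the unpooled standard error. The value t* = 2.861 (df = 19, 99%) is a table lookup hard-coded as an assumption, and the function name is illustrative.

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """Unpooled confidence interval for mu1 - mu2, given a critical t value."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - t_star * se, diff + t_star * se

k = min(20 - 1, 20 - 1)  # conservative degrees of freedom
T_STAR = 2.861           # assumed table value for df = 19, 99% confidence

lower, upper = two_sample_t_interval(88.35, 5.314, 20, 85.35, 4.368, 20, T_STAR)
print(k, lower, upper)
```

The result is slightly wider than the calculator's 2-SampTInt interval, because the calculator uses a larger, non-integer df and therefore a smaller t*.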
Suppose we compare the data on two samples, A and B, and come up with the following data: nA = nB = 10, x̅A = 25, sA = 3.21, x̅B = 22.2, sB = 3.09. Use the conservative method to find a 99% confidence interval for the difference between the means. (Don't pool.)
(-1.78, 7.38) (use the conservative method, don't find df using calculator, here t* = 3.250)
Here's a list of differences in golf scores for players playing two rounds of golf. The score of round one is subtracted from the score of round two. 5, -5, 2, -6, -5, -5, 6, -16, 4, 3, -3, 1 Give a 90% confidence interval for the difference between round two and round one.
(-4.88, 1.55)
A random sample of 384 people in a mid-sized city (city one) revealed 112 individuals who worked at more than one job. A second random sample of 432 workers from another mid-sized city (city two) found 91 people who work at more than one job. Find a 99% confidence interval for the difference between the proportions of workers in the two cities who work at more than one job.
(.003, .159)
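A two-proportion z interval like this one can be checked with a short sketch. The function name is illustrative, and z* comes from the standard normal quantile function rather than a table.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

lower, upper = two_proportion_interval(112, 384, 91, 432, 0.99)
print(round(lower, 3), round(upper, 3))
```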
A researcher is interested in estimating the mean blood alcohol content (BAC) of people arrested for driving under the influence. The sample consists of 250 individuals with a mean BAC of .145. Based on past data, the researcher assumes a population standard deviation of .065. What's the 95% confidence interval for this scenario?
(.137, .153)
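When σ is treated as known, the interval is x̅ ± z*(σ/√n). A minimal sketch, with `z_interval` as an illustrative name:

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is assumed known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = z_interval(x_bar=0.145, sigma=0.065, n=250, confidence=0.95)
print(round(lower, 3), round(upper, 3))
```

The same function covers the other known-σ interval questions in this set by changing the four arguments.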
You want to test your newly created Web site, so you have 250 people access it from random locations at random times. Of the people accessing the site, 75 of them experience computer crashes. Construct a 95% confidence interval for the proportion of crashes.
(.243, .3568)
Use the computer output for linear regression shown here to create a 90% confidence interval for the slope of the population regression line.
(.2694, 2.6354)
A company wants to find out what kinds of transportation its employees use to get to work. It conducts a survey of 537 employees, and 243 say they ride the bus. Construct a 95% confidence interval for the proportion of employees who ride the bus to and from work.
(.411, .495)
A seaside resort is publishing a brochure and wants to include a statement about the proportion of clear days during its peak season. Out of a random sample of 150 days over the last two seasons, 117 days were recorded as clear. Construct a 90% confidence interval for the proportion of clear days.
(.7244, .8356)
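One-proportion intervals follow p̂ ± z*√(p̂(1 - p̂)/n). The sketch below reproduces the clear-days interval; the function name is an assumption for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_interval(successes, n, confidence):
    """Large-sample z interval for a single population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = one_proportion_interval(117, 150, 0.90)
print(round(lower, 4), round(upper, 4))
```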
Construct a 95% confidence interval for the slope of the regression line
(.8584, 1.013)
A whale-watching company noticed that many customers wanted to know whether it was better to book an excursion in the morning or the afternoon. To test this question, the company collected the following data on 15 randomly selected days over the past month. (Note: Though in the table they're labeled 1 through 15, the days aren't consecutive.) Number of Whales Spotted Day Morning Afternoon 1 8 8 2 9 10 3 7 9 4 9 8 5 10 9 6 13 11 7 10 8 8 8 10 9 2 4 10 5 7 11 7 8 12 7 9 13 6 6 14 8 6 15 7 9 If you take the difference between each pair, what set of numbers do you get for analysis in a single-sample procedure? (Calculate differences so that if the afternoon number is higher, the result is a positive number.)
(0, 1, 2, -1, -1, -2, -2, 2, 2, 2, 1, 2, 0, -2, 2)
In a deck of 52 playing cards, there are 12 face cards (4 Jacks, 4 Queens, 4 Kings). If you draw cards one at a time, replacing and shuffling the cards between draws, what's the probability of getting your first face card on the third draw? Choose the best answer.
(1 - 3/13)^2(3/13)
A random sample of 85 adults found that average calorie consumption was 2,100 per day. Previous research has found a standard deviation of 450 calories, and you use this value for σ. Construct a 99% confidence interval for the population mean.
(1,974.3, 2,225.7)
A manufacturer claims Fertilizer X will cover an average of 2,000 square feet per bag, with a standard deviation of 250 square feet. A sample of 80 bags was tested, and the mean coverage was 2,050 square feet. What's the 95% confidence interval?
(1,995.2, 2,104.8)
El Burrito, a nationwide fast food chain, is attempting to institute a standardization process in the production of food items. The management has determined that the average Grande Bean Burrito should weigh 1.2 lb. The company is concerned that if burritos are smaller than 1.2 lb customers will be dissatisfied and take their business to a competitor, La Enchilada. They're also concerned that if the burritos are too large the profit margin will decline and they'll go out of business. The corporate office is interested in checking their burrito-making process. They randomly select 42 Grande Bean Burritos from El Burrito stores around the country and weigh them. The mean weight is 1.4 lb and the standard deviation is .5 lb. Using the information provided for this activity, compute the 95% confidence interval for the estimate of the population mean weight based on the sample of 42 burritos.
(1.25, 1.55) x̅ ± 1.96(.5/sqrt(42))
A researcher collects infant mortality rate (IMR) data from a random sample of 200 villages in a large country. The mean IMR across these villages is 15.7 deaths per 1,000 infants born in a year. Based on previous research, the researcher assumes the population standard deviation is 6.5. What's the 90% confidence interval for the population IMR for villages in this country?
(14.94, 16.46)
A teacher administers a standardized math test to his class of 75 students. The mean score (out of 300 possible points) is 235. From previous studies, you know the population standard deviation is 28. Using the sample data given, calculate a 95% confidence interval for the population mean.
(228.7, 241.3)
According to a poll, approximately 30% of Americans spent over $750 on holiday presents in 1998. Suppose you took a random sample of 85 shoppers and found that 23 spent over $750 on holiday presents in 1999. What test statistic would you use to conduct a hypothesis test to see if the percentage of people spending more than $750 on holiday presents has decreased?
((23/85) - .3) / √((.3)(.7)/85) (one-proportion z formula)
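To see the one-proportion z statistic evaluated, here is a short sketch with the hypothesized proportion p0 = .30. The standard error uses p0, not p̂, because the test assumes the null hypothesis is true.

```python
import math

def one_proportion_z(successes, n, p0):
    """Test statistic for H0: p = p0 in a one-proportion z test."""
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)  # standard deviation of p-hat under H0
    return (p_hat - p0) / se_null

z = one_proportion_z(23, 85, 0.30)
print(round(z, 3))
```

The statistic is negative because the sample proportion 23/85 is below .30.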
What's this interval?
(59.07, 63.27)
As part of a promotional campaign, a gas station hands out game pieces to customers at a rate of one per visit. Each piece has a letter or an exclamation point on it, and if you collect all seven game pieces they spell "you win!" If you've collected six of the seven game pieces, and all you need is the letter w to get all seven, what's the probability you'll get your piece on the fifth trip to the gas station? Assume there are equal numbers of each piece.
(6/7)^4(1/7)
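The geometric formula P(X = k) = (1 - p)^(k-1) p is what the TI-83's geometpdf computes. A minimal Python sketch, keeping the calculator's argument order (p first, then k); the function name is illustrative.

```python
def geometric_pmf(p, k):
    """Probability that the first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p

# Four trips without the w, then the w on the fifth trip.
prob = geometric_pmf(1 / 7, 5)
print(round(prob, 4))
```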
Find the mean and standard deviation of the following data.
Value | Frequency
3 | 2
4 | 1
6 | 9
7 | 4
9 | 4
10 | 6
(7.23, 2.18) (you can just use 1-Var Stats w/ freq list)
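Outside the calculator, one simple way to handle a frequency table is to expand it into a flat list. This sketch uses the sample standard deviation (the calculator's Sx), which is what matches the 2.18 in the answer.

```python
from statistics import mean, stdev

# value -> frequency, copied from the table in the question
freq_table = {3: 2, 4: 1, 6: 9, 7: 4, 9: 4, 10: 6}

# Repeat each value as many times as its frequency.
data = [value for value, count in freq_table.items() for _ in range(count)]

x_bar = mean(data)
s = stdev(data)  # sample standard deviation (divides by n - 1)
print(len(data), round(x_bar, 2), round(s, 2))
```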
Consider a binomial event with B(75, .6). Which of the following represents the probability of getting exactly 50 successes?
C(75, 50)(.6)^50(.4)^25, where C(75, 50) is the binomial coefficient "75 choose 50"
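The binomial formula C(n, k) p^k (1 - p)^(n-k) translates directly into Python with `math.comb`. The function name mirrors the calculator's binompdf but is otherwise an illustrative choice.

```python
from math import comb

def binom_pmf(n, p, k):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 50 successes for B(75, .6)
prob_50 = binom_pmf(75, 0.6, 50)
print(prob_50)
```

The same function reproduces other binompdf answers in this set, for example `binom_pmf(10, 1/6, 3)` for three fives in ten rolls.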
Find a 99% confidence interval for the population mean for the data set (120, 80, 83, 115, 95, 90, 75, 145, 130, 105, 118, 78, 86, 113, 98, 93, 79, 147, 136, 108).
(90.17, 119.23)
A company developing a new video game wants to know how many minutes it takes a player to beat the game. Researchers randomly select twenty players and record the following win times for each (in minutes): 120, 80, 83, 115, 95, 90, 75, 145, 130, 105, 118, 78, 86, 113, 98, 93, 79, 147, 136, and 108. Find a 95% confidence interval for the mean number of minutes it takes to beat this game.
(94.07, 115.33)
Find a 90% confidence interval for the population mean for the data set (120, 80, 83, 115, 95, 90, 75, 145, 130, 105, 118, 78, 86, 113, 98, 93, 79, 147, 136, 108).
(95.92, 113.48)
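The three intervals for this data set differ only in the critical value. The sketch below computes all of them at once. The t* values for df = 19 are copied from a t table and hard-coded as an assumption, since the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

times = [120, 80, 83, 115, 95, 90, 75, 145, 130, 105,
         118, 78, 86, 113, 98, 93, 79, 147, 136, 108]

# Assumed table values for df = 20 - 1 = 19
T_STARS = {0.90: 1.729, 0.95: 2.093, 0.99: 2.861}

x_bar = mean(times)
se = stdev(times) / math.sqrt(len(times))

intervals = {level: (x_bar - t * se, x_bar + t * se)
             for level, t in T_STARS.items()}
for level, (lo, hi) in intervals.items():
    print(level, round(lo, 2), round(hi, 2))
```

All three intervals share the same center, x̅ = 104.7, and get wider as the confidence level rises.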
The following frequency plots represent observed frequency data gathered from probability experiments. Which of these indicates an experiment in a geometric setting?
(Note that this doesn't look exactly like a plot of a geometric distribution. That's because it's a plot of observed values, not theoretical ones. If you ran the experiment for a large enough number of trials, you'd see a regular geometric distribution, with the highest frequency at n = 1. Since the probability for each successive n is a fraction of the previous n, the probabilities will continue to get smaller as n gets larger.)
margin of error formula
(critical z)√(p̂(1 - p̂)/n)
Which of the following is not a condition that must be met before you can use a z-procedure to conduct a hypothesis test about a single population proportion?
(po)(p hat) > 10
In our sample of 42 burritos, we found that the average weight was 1.4 lb and the standard deviation was .5. In a test of the hypothesis H0: µ = 1.2 lb, where Ha: µ > 1.2 lb and α = .01, for what values of x̅ would you reject the null hypothesis?
(x̅ < 1.002, or x̅ > 1.4) (this is the area outside the acceptance region (of the 99% two-sided confidence interval)) 1.2 ± 2.58(.5/sqrt42)
Researchers randomly assign subjects to two groups, with each group receiving a different asthma medication. For one month, researchers record the number of asthma attacks suffered by each subject. What's the correct formula for constructing a confidence interval for the difference between these two groups without pooling?
(x̅1 - x̅2) ± (t*)√(s1^2/n1 + s2^2/n2)
You draw two simple random samples from two distinct populations and calculate the following: x̅1 = 23.4, s1 = 4.2, n1 = 25 x̅2 = 25.3, s2 = 3.9, n2 = 27 Using the formula for a confidence interval for the difference between two means, (estimate) ± (margin of error), what's the correct value for (estimate)?
-1.9 (23.4 - 25.3 = -1.9)
The probability that a given 80-year-old person will die in the next year is .27. What's the probability that between 10 and 15 (inclusive) of 40 80-year-olds will die in the next year?
.6191 (binomcdf(40, .27, 15) - binomcdf(40, .27, 9))
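For an inclusive range such as 10 through 15, the probability is the cumulative probability up to 15 minus the cumulative probability up to 9. A sketch with hand-rolled equivalents of binompdf and binomcdf (the function names are illustrative):

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for a binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): sum of the individual probabilities from 0 through k."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# P(10 <= X <= 15) = P(X <= 15) - P(X <= 9)
prob = binom_cdf(40, 0.27, 15) - binom_cdf(40, 0.27, 9)
print(round(prob, 4))
```

Subtracting the CDF at 10 instead of 9 would wrongly drop the case X = 10.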
You draw two random samples from two distinct populations and calculate the following sample statistics: x̅1 = 52.6, s1 = 3.2, n1 = 28 x̅2 = 49.3, s2 = 2.9, n2 = 29 k = df = 27 t = 4.07 Using tcdf on your TI-83 calculator, compute a P-value for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2. (Note: Don't use 2-sampTTest for this question.)
.0002 (P(t > 4.07) = tcdf(4.07, 100, 27) = .0002)
Margaret's Sock Emporium sells hand-knit wool socks. Margaret claims the mean amount of wool per pair is 2 oz, with a standard deviation of .2 oz. You take a random sample of 40 socks and find that the mean weight is 1.9 oz. What's the P-value for getting a sample mean of 1.9 oz or less?
.0008
Suppose biologists have been studying the question of whether squirrels produce more offspring in colder climates. The theory goes like this: in colder climates, fewer offspring will live to maturity, so the squirrels need to produce more offspring to make up for the additional loss. In a study on one species, scientists collected data on the number of offspring for squirrels in two states, California (CA) and Oregon (OR). Here are the summary statistics. (The study and statistics are hypothetical.)
Group | Sample size n | Mean x̅ (# per year) | Standard deviation s
CA Squirrels | 73 | 11.49 | 4.28
OR Squirrels | 49 | 13.57 | 3.71
What's the P-value of this test? (For this question, use the two-sample t test on your TI-83.)
.0026 (Ha:(µ1 < µ2))
A researcher believes a population of fish has a normal distribution with a mean weight of 3.4 kilograms and a standard deviation of .8 kilograms. She takes a simple random sample of 30 fish and finds that the mean is 3 kilograms. What's the P-value for a sample mean of 3 or less? (Hint: Remember to use the distribution of x̅.)
.0031
In a blind ESP test, a person correctly identifies whether a tossed coin comes up heads or tails in 63 trials out of 100. Use the normal approximation (without the continuity correction) to calculate the probability of correctly identifying 63 or more.
.0047
In a sample of 42 burritos, we found a sample mean 1.4 lb and assumed that σ = .5. In a test of the hypothesis H0: µ = 1.2 lb, with Ha: µ > 1.2 lb and α = .01, and where the actual burrito weight (the true population mean) is 1.6 lb, what's the probability of a Type II error and what's the power of the test?
.0047, .9953 β = normalcdf(1.002, 1.4, 1.6, .5/sqrt42) = .0047 Power = 1 - β = .9953
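The normalcdf call in this answer can be mirrored with `statistics.NormalDist`. The sketch takes the card's acceptance region (1.002 < x̅ < 1.4) as given and asks how likely a sample mean is to land there when the true mean is 1.6 lb.

```python
import math
from statistics import NormalDist

se = 0.5 / math.sqrt(42)  # standard deviation of x-bar for n = 42

# Sampling distribution of x-bar if the true mean is really 1.6 lb
true_dist = NormalDist(mu=1.6, sigma=se)

# Type II error: x-bar falls in the acceptance region although H0 is false.
beta = true_dist.cdf(1.4) - true_dist.cdf(1.002)
power = 1 - beta
print(beta, power)
```

Setting `mu` to 1.25 in the same sketch gives the much larger β for the case where the true mean is close to the hypothesized 1.2 lb.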
A sampling distribution (n = 20) of the number of penalties served last season by members of the Hitemhard hockey team has a mean of 200 with a standard deviation of 48. What's the probability of obtaining a sample with x̅ < 170?
.0057
A researcher is interested in estimating the mean blood alcohol content (BAC) of people arrested for driving under the influence. The sample consists of 250 individuals with a mean BAC of .145. Based on past data, the researcher assumes a population standard deviation of .065. What's the margin of error for a 90% confidence interval in this scenario?
.0068
The staff at an aquarium with 500,000 annual visitors wants to know whether there's a difference in the proportion of its patrons who come specifically for workshops and those who come for other reasons. The staff takes a random exit poll of 1,238 visitors, and 751 say they came for workshops. For the test statistic used in a test of the hypotheses Ho: p = .5, Ha: p ≠ .5, what's the standard deviation of p hat?
.0142
Management at a sparkling water factory has designed a new system it thinks will lower the proportion, currently at .12, of incorrectly labeled bottles. In the year following the implementation of the new system, quality control engineers randomly sample 378 bottles and find 40 that are labeled incorrectly. What standard deviation of p hat would you use for a test statistic in the hypothesis test Ho: p = .12, Ha: p < .12?
.0167
In a sampling distribution with a mean of zero, what's the associated p-value for P(z ≥ 2.04)?
.0207 (normalcdf(2.04, 1E99, 0, 1))
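An upper-tail probability is 1 minus the cumulative probability, so no stand-in for infinity is needed. A minimal sketch:

```python
from statistics import NormalDist

# P(z >= 2.04) for the standard normal distribution
p_value = 1 - NormalDist().cdf(2.04)
print(round(p_value, 4))
```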
In a particular roulette game, there's a 1/36 chance of winning. In a single day, a gambler plays 100 rounds, and wins in 7 of them. What's the P-value for winning 7 or more out of 100 rounds? (Hint: Be sure you know which kind of probability distribution you're dealing with before you do the calculation.)
.0217
You perform a X^2 goodness-of-fit test to see if the number of birthdays occurring each month matches the expected number (assuming each month is equally likely to be the birth month for any given individual). You get 20.5 as your X^2 value. What is the P-value for this test?
.025 < p <.05 (12 months in the year, so df = 11)
In a normal sampling distribution with a mean of 0, what's the P-value for z ≥ 1.89?
.0294
A survey finds that 55 people out of 170 favor increasing property taxes to help pay for a new library. If this data is used to estimate the population proportion who favor new taxes, the standard error of the estimate is:
.036 (p hat = 55/170 = .3235, SE of p hat = sqrt((.3235)(.6765)/170) = .036)
In a table with 11 categories, you calculated a chi-square statistic of 17.8. The P-value associated with this is:
.05 < P ≤ .1
A sample of size 10, drawn from a population with a standard deviation of 5, generates a confidence interval (26.9, 33.1). Which of the following hypothesis tests corresponds to this interval?
.05, two-sided.
Jeff is a snake charmer who can charm snakes with .78 probability of success. In a typical snake-charming season, Jeff attempts to charm 400 snakes. Using the exact binomial calculation, find the probability that he'll charm fewer than 300 snakes.
.067
If you flip three coins simultaneously, what's the probability you'll get three heads for the first time on the fifth toss?
.0733 (geometpdf(1/8, 5))
A manufacturer claims that the batteries it makes will last 18 hours, with a standard deviation of 1.5 hours. If the durations of the batteries are normally distributed, what proportion of batteries would be expected to last less than 16 hours?
.0918
Imagine a country where only one of every 5 births is a girl. To increase their chances of having a girl, a family is willing to have many children. What is the probability that the first girl they have is the fourth baby?
.1024 (This is a geometric setting with p = .2, so G(4) = (.2)(.8)^3 = .1024.)
A market claims the average weight of a package of hamburger in its meat department is one pound, with a standard deviation of .18 lb. A manager decides to test H0: µ = 1 against the two-sided alternative Ha: µ ≠ 1. The manager decides to reject H0 if the mean of a sample of 35 packages differs from 1 by more than 1.5 standard deviations. What's the probability of a Type I error? (Hint: Use your calculator.)
.13 (The probability of a Type I error is the probability of rejecting the null hypothesis when it's true. Since we've decided to reject the null whenever we're more than 1.5 standard deviations from the mean in any direction, the probability of rejecting the null is 1 - normalcdf(-1.5,1.5) = .1336.)
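The Type I error rate for this decision rule is the area in both tails beyond ±1.5 standard deviations. A short sketch of the same calculation:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

# Reject H0 when the standardized sample mean is outside (-1.5, 1.5).
alpha = 1 - (z.cdf(1.5) - z.cdf(-1.5))
print(round(alpha, 4))
```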
Compute the p-value for a random sample with n = 23, s = 2.33, x̅ = 15.5 for the test Ho: µ = 14.75, Ha: µ ≠ 14.75. Also give a conclusion for a test where alpha = .05.
.1378, don't reject Ho
The probability that a given 80-year-old person will die in the next year is .27. What's the probability that exactly 10 of 40 80-year-olds will die in the next year?
.1385 (this is binompdf(40,.27,10))
Suppose you roll a six-sided die 10 times. What's the probability of getting three fives in those 10 rolls?
.155 (binompdf(10, 1/6, 3))
A market claims the average weight of a package of hamburger in its meat department is one pound, with a standard deviation of .18 lb. A manager decides to test H0: µ = 1 against the one-sided alternative Ha: µ < 1 at the .01 level of significance by weighing a sample of 35 packages of hamburger. The manager will reject the null hypothesis if x̅ < .93. Suppose the mean weight of all hamburger packages is really .9 lb. Find the probability of a Type II error and the power of the test.
.16, .84 (We're given x̅ < .93 for our decision rule. If the true mean is .9 lb, the area of this region is given by normalcdf(0, .93, .9, .18/sqrt(35)) = .84. Since this is the rejection region, .84 is the power of the test and 1 - .84 = .16 is the probability of a Type II error.)
A researcher believes the mean for a certain normally distributed population is 102. The standard deviation is known to be 15. What's the P-value of drawing a random sample of 50 and getting a mean of 100 or less? Would this be strong, statistically significant evidence that the population mean is less than 102?
.1727, not strong evidence
According to the manufacturer, the average proportion of red candies in a package is 20%. An 8 oz. package contains about 250 candies. What's the probability that a randomly selected 8 oz. bag contains less than 45 red candies?
.212
You're interested in determining whether people prefer orange or pink food. You make a large batch of vanilla pudding and dye one half orange and the other half pink. You randomly select students to participate in a taste test, and you randomize which pudding people try first. Out of 132 subjects, 72 prefer the orange pudding. You use the hypotheses Ho: p = .5, Ha: p ≠ .5. What's the P-value of the test statistic?
.303
Here are some summary statistics for two variables: x̅ = 40.4, sx = 5.22, y̅ = 25.2, sy = 3.4, r = .52. Find the value of b1, the slope of the regression line.
.3387
What's the probability of a sample of 10 students getting an average score of 510 or more on a standardized test if the test's scores are normally distributed with a mean of 505 and a standard deviation of 50?
.3759
About 25% of all dogs live more than 10 years. Out of a random sample of 80 dogs, what's the probability that between 15 and 20 dogs will live more than 10 years?
.40
Imagine a country where only one of every 5 births is a girl. To increase their chances of having a girl, a family is willing to have 5 children. What is the probability that they will have exactly one girl?
.4096
Leroy and Fred play chess at a club every Wednesday. The probability that Leroy will lose is .3, that he will stalemate is .5, and that he will win is .2. The probability that Fred will lose is .25, that he will stalemate is .4, and that he will win is .35. What is the probability that at least one of them wins on a given Wednesday?
.48
Given a set of ordered pairs (x, y) with sx = 2.5, sy = 1.9, r = .63, what is the slope of the regression line of y on x?
.48 (.63)(1.9/2.5) = .48
There are 533 successes in a random sample of size 1,000. How would you calculate a 90% confidence interval for this sample?
.533 ± 1.645√((.533)(.467)/1000)
Let x be a binomial random variable with n = 15 and p = .5. Using the exact binomial calculation and the normal approximation with the continuity correction, find P(X > 6).
.6964, .6972
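Both numbers in this answer can be reproduced side by side. The sketch computes the exact binomial probability P(X ≥ 7) and the normal approximation with the continuity correction, which treats "more than 6" as "more than 6.5".

```python
import math
from math import comb
from statistics import NormalDist

n, p = 15, 0.5

# Exact: P(X > 6) = 1 - P(X <= 6)
exact = 1 - sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(7))

# Normal approximation with mean np and standard deviation sqrt(np(1 - p))
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = 1 - approx_dist.cdf(6.5)  # continuity correction: start at 6.5

print(round(exact, 4), round(approx, 4))
```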
You have a rather strange die: three faces are marked with the letter A, two faces with the letter B, and one face with the letter C. What is the probability that you will get a B in three or fewer rolls?
.704 (The first success could occur on the first, second, or third roll. These outcomes are mutually exclusive, so you add their probabilities. Thus, P(first or second or third) = (1/3) + (2/3)(1/3) + (2/3)(2/3)(1/3) = 19/27 = .704.)
In a survey, 1,000 mothers and fathers are asked about the importance of sports for boys and girls. Of the parents interviewed, 75% said genders are equal and should have equal opportunities to participate in sports. Assuming that .75 is correct for the population, what are the mean, standard deviation, and shape of the distribution of the sample proportion p hat for n = 100?
.75, .0433, approximately normal
For the following 6 questions, use this data. You collect the following data for explanatory variable A and response variable B:
A (x) | B (y)
1 | 14.2
2 | 14.9
3 | 15.5
4 | 16.8
5 | 17.8
6 | 18.9
7 | 20.1
8 | 20.9
9 | 21.5
10 | 22
You calculate the following summary statistics: x̅ = 5.5, sx = 3.03, y̅ = 18.26, sy = 2.85, r = .995. Calculate the slope of the line y hat = b0 + b1x.
.9359 (b1 = (r)(sy/sx))
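The slope formula b1 = r(sy/sx) is easy to check from the summary statistics alone. A minimal sketch, with `regression_slope` as an illustrative name:

```python
def regression_slope(r, s_x, s_y):
    """Slope of the least-squares line of y on x from summary statistics."""
    return r * s_y / s_x

b1 = regression_slope(r=0.995, s_x=3.03, s_y=2.85)
print(round(b1, 4))
```

Note the order of the ratio: the response variable's standard deviation goes on top.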
A commercial crabber catches more than 1,000 crabs and measures the shells, and finds the mean length is 6.8 inches with a standard deviation of 3.2 inches. Assuming these measures are true for the population, if the crabber takes many random samples of size 50, what proportion of the sample means would we expect to be greater than 6 inches?
.9615 (P(x̅ > 6) = P[z > (6 - 6.8) / (3.2 / sqrt(50))] = P(z > -1.77) = .9615; a z table with z rounded to -1.77 gives .9616)
What's your σ(sub x̅) value?
.968
In a sample of 42 burritos, we found a sample mean 1.4 lb and assumed that σ = .5. In a test of the hypothesis H0: µ = 1.2 lb, with Ha: µ > 1.2 lb and α = .01, and where the actual burrito weight (the true population mean) is 1.25 lb, what's the probability of a Type II error and what's the power of the test?
.9734, .0266 β = normalcdf(1.002, 1.4, 1.25, .5/sqrt42) = .9734 Power = 1 - β = .0266
You have a table of standard normal probabilities that gives you the area of the curve from the left tail to the z-score of interest. When using this type of table, what area of the curve would you use to find the corresponding z-score for a 95% confidence interval?
.975
Samples of size 49 are drawn from a distribution that's highly skewed to the right with a mean of 70 and a standard deviation of 14. What's the probability of getting a sample mean between 71 and 73?
0.2417 (normalcdf(71, 73, 70, 14/sqrt(49)))
When rolling a 6-sided die, what's the probability of having to roll 6 times before you get a 4?
0.067 (geometpdf(1/6, 6))
If you flip two coins simultaneously, what's the probability you'll have to flip them four times before the first occurrence of two heads?
0.11 (geometpdf(1/4, 4))
For the game of roulette, the mean winnings for one bet is approximately minus 0.0526 with a standard deviation of about 0.9986. What's the probability that you come out ahead (win a positive amount) if you play 100 times?
0.2992
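This answer relies on the central limit theorem: the total of 100 independent bets is approximately normal with mean 100µ and standard deviation σ√100. A sketch of that calculation:

```python
import math
from statistics import NormalDist

n = 100
mu_bet, sigma_bet = -0.0526, 0.9986  # per-bet mean and standard deviation

# Approximate distribution of total winnings over n bets
total = NormalDist(mu=n * mu_bet, sigma=sigma_bet * math.sqrt(n))

p_ahead = 1 - total.cdf(0)  # P(total winnings > 0)
print(round(p_ahead, 4))
```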
A bag of candy has equal numbers of candies in eight colors: blue, red, brown, green, yellow, orange, pink, and black. If you eat them one by one, what's the probability of getting your first red candy on or before the fifth pick?
0.487 (geometcdf(1/8, 5))
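"On or before trial k" is the complement of k failures in a row, so geometcdf(p, k) equals 1 - (1 - p)^k. A minimal sketch, with an illustrative function name:

```python
def geometric_cdf(p, k):
    """Probability that the first success occurs on or before trial k."""
    return 1 - (1 - p) ** k

prob = geometric_cdf(1 / 8, 5)
print(round(prob, 3))
```

The same function reproduces the other geometcdf answers in this set, such as `geometric_cdf(0.308, 3)` and `geometric_cdf(1/6, 8)`.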
In the regression output shown here, identify the standard error of the slope.
0.6452 (SE(b1) is in the SE of Coeff column in the variable row)
Frank is a champion fish charmer who can charm fish with 0.62 probability of success. If he attempts to charm 100 fish, what's the probability he'll charm between 60 and 80 inclusive? Solve using two methods: normal approximation without the continuity correction, and normal approximation with the continuity correction.
0.660, 0.697
Suppose you attend a baseball game late in the season. The Seattle Mariners are playing, and Ken Griffey Jr.'s batting average is 0.308. What's the probability he'll get his first hit of the game on or before his 3rd at bat?
0.669 (geometcdf(.308, 3))
If you roll two dice, what's the probability of rolling a seven (the numbers on the dice add up to 7) on or before the eighth roll?
0.767 (geometcdf(1/6, 8))
You take a 100-question multiple-choice test. Each question has five choices, and you guess at each question. Which of the following calculator commands would give you the probability of getting at least 30 questions correct?
1 - binomcdf(100,.2,29) (Binomcdf(n, p, x) will sum the individual probabilities from X = 0 through X = x. Since you want the sum of all possibilities starting at 30, you must subtract the sum of all possibilities less than 30 from 1.)
Which of the following are key questions in a test of statistical significance? 1. If random chance is the only factor, what's the chance I'll get that result? 2. Is my result so unlikely that something other than chance must be a factor? 3. Am I asking a significant research question?
1 and 2
A researcher rejects a null hypothesis µ = 280 based on a 95% confidence interval. Which of the following must also be true? 1. A significance test would also reject the null at α = .1. 2. A significance test would also reject the null at α = .05. 3. A significance test would also reject the null at α = .01.
1 and 2 (If a 95% confidence interval rejects the null, then a significance test would also reject the null at α = (1 - C), or (1 - .95) = .05. If the test would reject at .05, then it would reject at any weaker significance level, such as .1. However, without computing a P-value, we can't be sure whether the test would also reject the null at a stronger significance level, such as .01.)
Which of the following are true? 1. A binomial event has exactly two possible outcomes. 2. If the population size is at least 20 times the sample size, the independence criterion for a binomial setting has been met. 3. As long as the probability for each trial is clearly specified, different trials can have different probabilities of success.
1 and 2 only
Which of the following are valid ways of testing a hypothesis? 1. Compute a confidence interval and check to see if the hypothesized value of the population parameter is in the interval. 2. Compute a p-value and use it as the basis of your decision. 3. Compute the magnitude of the difference between the expected and obtained values, and use this number as the basis of your decision.
1 and 2 only (Computing a confidence interval (then checking to see if it contains the hypothesized population value) and basing a decision on a computed p-value are both good approaches. On its own, the magnitude of the difference between observed and expected values won't help you much.)
Which of the following statements are true? 1. Boxplot A has a greater range than boxplot B. 2. Boxplot A has a smaller IQR than boxplot B. 3. Boxplot B contains more data than boxplot A.
1 and 2 only (The range on boxplot A is about 39-40, and the range on boxplot B is about 32-33. The IQR of boxplot A is about 11, and the IQR of boxplot B is about 13 or 14. Since the data are broken into quartiles, there's no way to know the quantity of data in either boxplot.)
Which of the following statements are true of hypothesis tests? 1. You must state null and alternative hypotheses in the context of the problem. 2. You must state a significance level so you can decide if a given P-value gives you evidence to reject the null hypothesis. 3. You must state a conclusion in the context of the problem.
1 and 3
To use a t procedure, which of the following must be true? 1. The standard deviation of the parent population must be unknown. 2. The sample size must be small (less than 30). 3. The parent population must be normally distributed.
1 and 3 only
Which of the following are true of the components of the chi-square statistic? 1. ∑(O - E) = 0 2. ∑(O - E)^2 = χ^2 3. ∑(O - E)^2 = 0 when your data exactly match expected values.
1 and 3 only
Which of the following statements about pooling are true? 1. In statistics, to pool is to create an estimate of a common variance for two samples. Pooling gives narrower confidence intervals and lower P-values, so it yields a more precise final result with a higher power of the test. 2. Pooling can only be used in the rare instances when you can assume the populations are of equal size. 3. Without requiring that strong and often unrealistic conditions be met, computer software (or a calculator) will give a final result without pooling that's nearly as precise as the result you'd get if you pooled. 4. You can use pooling when the sample standard deviations are the same.
1 and 3 only
Which of the following are common to all cases of statistical inference? 1. Sample statistics are used to estimate population parameters. 2. Two means are compared. 3. A sample proportion is used to estimate a population proportion.
1 only
Eric is a statistician-Viking. He walks into a tavern and makes three statements about the continuity correction. Which are true? 1. You can find single probabilities (such as x = 4) using the normal approximation to the binomial. Do this by finding the probability for the range from .5 below to .5 above the number. (For x = 4 you'd find 3.5 < x < 4.5) 2. Using the normal approximation, x < 4 is the same as x ≤ 4. Using the exact binomial, x < 4 is not the same as x ≤ 4. 3. The continuity correction removes some of the error introduced when modeling a discrete distribution with a continuous one. The correction increases accuracy by acting as though a number occupies the interval from 0.5 below it to 0.5 above it.
1, 2, and 3
Which of the following are true? 1. A sampling distribution of a statistic consists of all possible random samples of the same size from a given population. 2. Regardless of the shape of the original population, for samples of size 2, μx̅ = μ and σx̅ = σ/sqrt(n). 3. Unless there was extreme skewness or outliers, we can assume that a sampling distribution of a sample mean was approximately normal for samples of size 40.
1, 2, and 3
Which of the following are true? 1. Before and after scores on the same individual are dependent. 2. For a two-sample t procedure, you must assume independent samples are drawn from distinct parent populations. 3. You should use one-sample t procedures if samples are dependent.
1, 2, and 3
Which of the following statements about matched pairs analysis are true? 1. In a matched pairs analysis, you use a single-sample procedure. The sample is made up of the differences between pairs of observations. 2. In a matched pairs analysis, the two original samples are said to be dependent on each other. 3. In a matched pairs analysis, you start with two sets of observations on a single variable. 4. In a matched pairs analysis, each of the two original samples is analyzed separately before comparing their means.
1, 2, and 3
A researcher collects infant mortality rate (IMR) data from a random sample of 200 villages in a large country. The mean IMR across these villages is 15.7 deaths per 1,000 infants born in a year. Based on previous research, the researcher assumes the population standard deviation is 6.5. Then the researcher decides to take a larger sample so he can estimate the mean IMR with a margin of error of .5 infants per 1,000 born in a year at a 99% confidence interval. How many villages would the researcher have to sample?
1,122 (1,121 was wrong; round up)
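The round-up rule can be checked with a minimal sketch. It assumes the usual formula n = (z*σ/m)^2, with z* taken from the standard normal distribution; the helper name `sample_size_for_mean` is hypothetical and not part of the course materials.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest whole n with z* * sigma / sqrt(n) <= margin (hypothetical helper)."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: a fractional village (or person) cannot be sampled
    return math.ceil((z_star * sigma / margin) ** 2)


villages = sample_size_for_mean(sigma=6.5, margin=0.5, confidence=0.99)
print(villages)
```

The unrounded value is a little above 1,121, which is why 1,121 falls short and 1,122 is the minimum.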
If you roll a six-sided die 10 times, what's P(x > 3)?
1-binomcdf(10,1/6,3)
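For readers without a TI calculator, the same quantity can be sketched in Python. The functions below are stand-ins written for this note that borrow the calculator's names and argument order (n, p, k); they are not a real library API.

```python
from math import comb


def binompdf(n, p, k):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomcdf(n, p, k):
    """P(X <= k): add up the point probabilities from 0 through k."""
    return sum(binompdf(n, p, i) for i in range(k + 1))


# P(X > 3) is the complement of P(X <= 3)
p_more_than_3 = 1 - binomcdf(10, 1 / 6, 3)
print(round(p_more_than_3, 4))
```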
You want to conduct a hypothesis test for the slope, and your null hypothesis is that there's no relationship between the explanatory and response variables (regression coefficient, or slope = 0). Using the regression output provided here, what's your t statistic?
1.16 (t-ratio column on variable row)
What's z* for a 78% confidence interval?
1.22
A sample of size 54 has s = 5.5 and yields this confidence interval for the mean: (103.6, 105.42). What's the t* for this confidence interval?
1.22 (Margin of error = (105.42 - 103.6)/2 = 1.82/2 = .91. Then .91 = t*(5.5/sqrt(54)) = t*(.748), so t* = .91/.748 = 1.22.)
Suppose we compare the data on two samples, A and B, and come up with the following data: nA = nB = 10, x̅A = 25, sA = 3.21, x̅B = 3.21, sB = 3.09. What's the standard error of the difference between the means (that is, what's the standard error of x̅1 - x̅2)? (For this question, don't pool.)
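The back-calculation can be sketched as follows: half the interval width is the margin of error, and dividing it by the standard error s/sqrt(n) leaves the critical value. The function name is hypothetical.

```python
import math


def critical_value_from_interval(lower, upper, s, n):
    """Recover t* from a reported interval (hypothetical helper)."""
    margin = (upper - lower) / 2        # half the width of the interval
    standard_error = s / math.sqrt(n)   # s / sqrt(n)
    return margin / standard_error


t_star = critical_value_from_interval(103.6, 105.42, s=5.5, n=54)
print(round(t_star, 2))
```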
1.41
What's the critical z-value for an 85% confidence interval?
1.44
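Critical z-values like this one are normally read from a table. As a cross-check, here is a minimal sketch using `statistics.NormalDist` from the Python standard library; the wrapper name `z_star` is an assumption of this note.

```python
from statistics import NormalDist


def z_star(confidence):
    """Upper critical z-value for a two-sided interval (hypothetical helper)."""
    # A C-level interval leaves (1 - C) / 2 in the upper tail
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


print(round(z_star(0.85), 2))
```

A one-sided critical value uses the whole of α in one tail, so for α = .05 the lookup is `NormalDist().inv_cdf(0.95)`.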
Compute a test statistic for a random sample with n = 23, s = 2.33, x̅ = 15.5 for the test Ho: µ = 14.75, Ha: µ ≠ 14.75.
1.54
Two independent random variables x and y have µx = 5.2, σx = .72, µy = 9.3, σy = 1.44. What is the standard deviation of x + y (equivalently, of x - y)?
1.61
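The answer 1.61 is consistent with adding the variances of two independent variables and taking the square root. The sketch below assumes that is the quantity being asked for, since the symbol is missing from the question.

```python
import math

sigma_x = 0.72
sigma_y = 1.44

# For independent variables the variances add, whether the variables
# themselves are added or subtracted; standard deviations do not add.
sigma_combined = math.sqrt(sigma_x**2 + sigma_y**2)
print(round(sigma_combined, 2))
```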
Which of the following is the upper critical z-score for a one-sided hypothesis test at the .05 level?
1.645
What critical t value, or t*, would you need to have for a 90% confidence interval based on a sample of size 22?
1.721
A backpack manufacturer wants to know if students in high school carry more books than college students do. Company researchers take a simple random sample from each group and record the number of textbooks each subject is carrying. They get the following data: High school: (5, 3, 2, 5, 6, 4, 7, 6, 5, 4, 3, 2, 1, 4, 3, 0, 2) College: (5, 3, 2, 4, 1, 0, 0, 3, 6, 2, 1, 3, 1, 2, 4, 4, 2) Using high-school students as sample one and college students as sample two, what's your test statistic for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2? Let: x̅1 = the sample mean for high school students. x̅2 = the sample mean for college students.
1.808
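A minimal sketch of the unpooled two-sample t statistic, computed from the raw data with Python's `statistics` module. The function name is hypothetical. With unrounded sample standard deviations the result is about 1.81, in line with the card's 1.808.

```python
import math
from statistics import mean, variance

high_school = [5, 3, 2, 5, 6, 4, 7, 6, 5, 4, 3, 2, 1, 4, 3, 0, 2]
college = [5, 3, 2, 4, 1, 0, 0, 3, 6, 2, 1, 3, 1, 2, 4, 4, 2]


def two_sample_t(sample1, sample2):
    """Unpooled two-sample t statistic for H0: mu1 = mu2 (hypothetical helper)."""
    # variance() is the sample variance (divides by n - 1)
    se = math.sqrt(variance(sample1) / len(sample1)
                   + variance(sample2) / len(sample2))
    return (mean(sample1) - mean(sample2)) / se


t_stat = two_sample_t(high_school, college)
print(round(t_stat, 2))
```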
The upper critical t value (t*) for a 90% confidence interval with 8 degrees of freedom is:
1.86
What is the upper critical z-value for a two-sided significance test at .06?
1.88 (the lower is -1.88)
You want to know whether birthdays are equally distributed through all 12 months, so you take a random sample of 149 people and record each person's birth month. What would you use as the expected value for a chi-square goodness-of-fit test of these data?
12.417
Calculate the y-intercept of the line y = b0 + b1x.
13.11 (b0 = y bar - b1(x bar))
You want to know if birthdays are equally distributed through all 12 months. You take a random sample of 149 people and record the following number of birthdays in each month: Jan: 14 Feb: 15 Mar: 13 Apr: 13 May: 14 Jun: 7 Jul: 5 Aug: 17 Sep: 13 Oct: 9 Nov: 11 Dec: 18 You calculate an expected value of 12.417 people for each month. Calculate χ^2 for the goodness-of-fit test for the following hypotheses: H0: Each month has a proportion of .083 birthdays. Ha: At least one month does not have a proportion of .083 birthdays.
13.12
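The statistic can be reproduced with a short sketch: each month contributes (O - E)^2 / E, where E is the same for every month under H0. Variable names are illustrative.

```python
# Observed birthdays, January through December
observed = [14, 15, 13, 13, 14, 7, 5, 17, 13, 9, 11, 18]

# Under H0 every month is equally likely, so E = n / 12 for each month
expected = sum(observed) / len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over all categories
chi_square = sum((o - expected) ** 2 / expected for o in observed)
print(round(expected, 3), round(chi_square, 2))
```

The statistic has 12 - 1 = 11 degrees of freedom.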
You're going to construct a confidence interval for the difference between two population means. The sample sizes are 15 and 17. How many degrees of freedom do you base your analysis on if you don't pool your variances? How many if you do pool your variances?
14, 30
Suppose you generate a two-sample t statistic of 2.59 by hand in a one-sided hypothesis test. If you were testing the hypothesis at α = .01, and if the samples are of equal size, how small could each sample be and still result in rejecting the null hypothesis?
17
Given the tree diagram above, what is P(A | C)?
17/27 (There are 17 + 10 = 27 outcomes for C. Of these, 17 came from A. Thus, P(A | C) = 17/27.)
The Acme Whoopee Cushion Company claims that each of its unbreakable whoopee cushions consists of 18 oz of rubber. You're a quality control engineer for Acme, and you know there's some variation in the manufacturing process so that not all whoopee cushions will weigh the same. You want to know if the population mean is consistent with the claim of 18 oz. Which of the following would be appropriate for testing the company's claim? 1. A significance test with a one-sided alternative hypothesis 2. A significance test with a two-sided alternative hypothesis 3. A confidence interval
2 and 3 (A significance test with a two-sided alternative and a confidence interval will both test whether the population mean is within a certain range on either side of the hypothesized mean.)
Which of the following are true of all sampling distributions? 1. μx̅ = μ and σx̅ = σ/sqrt(n) 2. It is a probability distribution of a statistic. 3. All samples must be the same size.
2 and 3 only
Which of the following are true? 1. In order to use t procedures, you must know the population standard deviation. 2. t procedures are considered to be robust. 3. The standard error of the mean is an estimate of the standard deviation of the sampling distribution of a sample mean.
2 and 3 only
Which of the following statements is true of hypothesis testing? 1. You must always state a significance level when doing a hypothesis test unless you are using a confidence interval. 2. To say a finding is significant means that it is unlikely to have occurred by chance. 3. A confidence interval can be used as evidence in a two-sided hypothesis test.
2 and 3 only
To determine the health benefits of walking, researchers conduct a study in which they compare the cholesterol levels of women who walk at least 10 miles per week to those of women who do not exercise at all. The study finds that the average cholesterol level for the walkers is 198, and that the level for those who don't exercise is 223. Which of the following statements is true? 1. This study provides good evidence that walking is effective in controlling cholesterol. 2. This is an observational study, not an experiment. 3. Although the study was conducted only on women, we can confidently generalize the results to men in the same age group.
2 only
Which of the following are not essential characteristics of a binomial event? 1. Each outcome must be independent. 2. The sample size must be at least 20. 3. A trial can have only two possible outcomes.
2 only
Which of the following is not a matched pairs situation? 1. Students are paired according to gender, height, weight, and physical capability. Within each pair, one student is given a treatment and the other is given a placebo. 2. Researchers take two random samples: one of Ohio factory workers and another of Michigan factory workers. A researcher compares their mean ages. 3. A researcher selects a random sample of students and measures the weight of each before and after an exercise program.
2 only
Which of the following statements are true? 1. When generating confidence intervals and when doing significance tests we use the same expression for the standard error. 2. If zero is in the confidence interval for the difference between two proportions, we have evidence that the two population proportions could be the same. 3. When we do a significance test for the difference between two proportions, we're justified in pooling our estimates of the population proportion only if the sample sizes are large.
2 only
Which of the following is not true of the least-squares regression line? 1. The y-intercept of the line is the value of y when x = 0. 2. The mean of y for each value of x is the same. 3. Each value of y in the data set is made up of the residual and the fit of the line. 4. The line must pass through the point (x bar, y bar)
2 only (The mean of y changes in a linear fashion with every value of x.)
A set of 7,500 scores on a test are distributed normally, with a mean of 23 and a standard deviation of 4. To the nearest integer value, how many scores are there between 21 and 25?
2,873 (area between 21 and 25 = .3830, .3830 x 7500 = 2872.5)
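As a cross-check on the table lookup, here is a minimal sketch using `statistics.NormalDist` (an assumption of this note; the card itself works from a z table). The unrounded area comes out slightly below the table value of .3830, so the unrounded count is just under 2,872 rather than 2,872.5.

```python
from statistics import NormalDist

scores = NormalDist(mu=23, sigma=4)

# Area between 21 and 25, i.e. within half a standard deviation of the mean
area = scores.cdf(25) - scores.cdf(21)

# Expected number of the 7,500 scores that fall in that range
count = 7500 * area
print(round(area, 4), round(count, 1))
```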
Use your TI-83 to determine the necessary t* value for a 95% confidence interval based on a sample size of 55 and a sample standard deviation of 7.42.
2.005
You draw two simple random samples from two distinct populations and calculate the following: x̅1 = 23.4, s1 = 4.2, n1 = 25 x̅2 = 25.3, s2 = 3.9, n2 = 27 The estimate of the degrees of freedom, k, equals n1 - 1, or 24. Using your table, find the critical t value for a 95% confidence interval.
2.064
What's the margin of error?
2.1 inches
What's your value for z*?
2.17
What's the critical t value you'd use for a 95% confidence interval for the slope of this regression line?
2.306 (df = n - 2)
You draw two simple random samples from two distinct populations and calculate the following: x̅1 = 23.4, s1 = 4.2, n1 = 25 x̅2 = 25.3, s2 = 3.9, n2 = 27 The estimate of the degrees of freedom, k, equals n1 - 1, or 24, and t* equals 2.064. For the formula for a confidence interval, what's the correct value for margin of error?
2.325
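The margin of error here is t* times the unpooled standard error of the difference. A minimal sketch, with a hypothetical helper name and t* = 2.064 taken from the question:

```python
import math


def unpooled_standard_error(s1, n1, s2, n2):
    """Standard error of x̅1 - x̅2 without pooling (hypothetical helper)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


t_star = 2.064  # table value for 24 degrees of freedom, as given in the question
margin = t_star * unpooled_standard_error(4.2, 25, 3.9, 27)
print(round(margin, 3))
```

The same helper gives the denominator of a two-sample t statistic when only summary statistics are available.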
You want to conduct a one-sided test of H0 at α = .01. What's the critical z-value for this test?
2.33 (A one-sided test with α = .01 uses the 1% area at the end of the right tail of the normal curve. The corresponding z-score is z = 2.33.)
You want to construct a 99% confidence interval for a sample of size 498. What's your critical z-value (z*)?
2.576
What is σx for the discrete probability distribution here? x: 4, 10, 5; p: .5, .3, .2
2.65 (The mean is 4(.5) + 10(.3) + 5(.2) = 6. The variance is (4-6)^2(.5) + (10-6)^2(.3) + (5-6)^2(.2) = 7, so σx = sqrt(7) = 2.65.)
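The calculation can be sketched in a few lines: find the mean first, then the probability-weighted squared deviations, then take the square root. Variable names are illustrative.

```python
import math

values = [4, 10, 5]
probs = [0.5, 0.3, 0.2]

# Mean of a discrete distribution: sum of x * p(x)
mu = sum(x * p for x, p in zip(values, probs))

# Variance: sum of (x - mu)^2 * p(x); the standard deviation is its square root
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
sigma = math.sqrt(var)
print(round(mu, 2), round(var, 2), round(sigma, 2))
```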
Use the table of t distribution critical values to find the necessary t* value for a 99% confidence interval for a sample mean based on a sample of size 28.
2.771
You have a rather strange die: Three faces are marked with the letter A, two faces with the letter B, and one face with the letter C. You roll the die until you get a B. What is the probability that a B does not appear during the first three rolls?
.296 ((4/6)^3 = .296)
Suppose you flip a coin n times, and the probability of getting heads 15 times is .0148. What's n?
20 (If you flip a coin 20 times, there's a .0148 probability that you'll get 15 heads. You can solve this in one of two ways: 1. Use the binomial formula, plug in the values you know for X, p, and P(X = 15), and solve for n. 2. Using binompdf on your calculator, try all answer choices and choose the one that gives you a probability of .0148.)
You want to determine whether boys or girls at your school spend more time online. You randomly sample 25 girls and 35 boys from your school, and ask them how many minutes they spent online during the previous day. To conduct a hypothesis test to see if there's a difference in the mean number of minutes girls and boys spent online, what would you use for the degrees of freedom? Assume you're conducting this hypothesis test by hand.
24
What are the mean and the standard deviation of a sampling distribution consisting of samples of size 16? These samples were drawn from a population whose mean is 25 and whose standard deviation is 5.
25, 1.25
Management at a seaside resort is publishing a brochure and wants to include a statement about the proportion of clear days during their peak season. Out of a random sample of 150 days from over the last two peak seasons, 117 days were recorded as clear. They want to estimate the proportion of clear days to within a 5% margin of error with a 95% confidence interval. What's the sample size necessary to construct this interval?
264
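A minimal sketch of the sample-size formula for a proportion, n = (z*/m)^2 · p(1 - p), rounded up. The helper name is hypothetical; when no prior estimate of p exists, the conservative choice p = .5 is used as the default.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p_guess=0.5):
    """Smallest whole n that meets the margin of error (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p_guess * (1 - p_guess) is largest at 0.5, hence the conservative default
    return math.ceil(z_star**2 * p_guess * (1 - p_guess) / margin**2)


# Resort example: 117 clear days out of 150 gives the prior estimate
clear_days_n = sample_size_for_proportion(0.05, 0.95, p_guess=117 / 150)
print(clear_days_n)
```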
You draw two random samples from two distinct populations and calculate the following sample statistics: x̅1 = 52.6, s1 = 3.2, n1 = 28 x̅2 = 49.3, s2 = 2.9, n2 = 29 Using the conservative method, what are the degrees of freedom for a test of the difference between the two population means?
27
You're asked to analyze a two-variable data set consisting of 30 pairs of data. In this example, how many degrees of freedom are there for the value of s (the estimate of the standard deviation of the residuals)?
28
Which of the following is NOT one of the conditions that must be met before you can use a z-procedure for a confidence interval for the difference between two proportions? 1. Both samples must have at least 10 failures. 2. Both samples must have at least 10 successes. 3. Samples can't show any skewness or outliers. 4. At least one of the populations should be at least 10 times greater than the sample drawn from it.
3 and 4 only
Which of the following are true of the margin of error for a single population proportion? 1. Multiply the margin of error by z* to get the standard error. 2. The margin of error is a calculation that describes the error introduced into a study when the sample isn't truly random. 3. The margin of error describes a possible random sampling error that occurs within truly random samples.
3 only
Which of the following is NOT true about samples that can be treated as matched pairs data? 1. Data collected on these samples is analyzed by calculating the differences between paired observations and using that data in one-sample tests. 2. Each value in the first group of numbers has a relationship to a value in the second group. 3. It's best to analyze related samples by conducting hypothesis tests on the mean from the first and the second groups, in that order.
3 only
Which of the following statements are correct? 1. If two categorical variables are associated, they're also independent. 2. Two categorical variables are independent if the values of one variable can be predicted from the values of the other. 3. Two categorical variables are associated if the values of one variable can be predicted from the values of the other.
3 only
Which of the following statements is false? 1. We use one-sample procedures when our samples are equal in size but aren't independent. 2. Everything else being equal, a confidence interval based on 15 degrees of freedom will be narrower than one based on 10 degrees of freedom. 3. We can pool our estimates of the population variance if we want a narrower confidence interval for a given confidence level.
3 only
A significance test allows you to reject at the .05 level of significance. Which of the following are true? 1. You can reject H0 at the .01 level of significance. 2. Ha can be rejected at the .05 level of significance. 3. H0 can be rejected at the .10 level of significance
3 only (If you can reject at a significance level of .05, you can also reject at any significance level greater than .05.)
A school designs two experimental courses (course A and course B) and randomly assigns 50 students so that there are 25 in each course. At the end of the year, the average final exam score in course A was 75 with a standard deviation of 7, and in course B it was 72 with a standard deviation of 5. Which of the following gives a 99% confidence interval for the difference between the final exam scores of course A and course B?
3 ± 2.797 sqrt(7^2/25 + 5^2/25)
Jeff typically makes three out of nine attempted free throws. What's the average waiting-time for Jeff to make his first basket, and what's the probability he'll make a basket on or before the very last attempt within his average waiting-time?
3, 0.70 (1/(1/3) = 3, and geometcdf(1/3, 3) = .70)
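Both parts follow from the geometric distribution with p = 3/9. The sketch below uses a Python stand-in named after the calculator's `geometcdf`; it is written for this note and is not a library function.

```python
def geometcdf(p, k):
    """P(first success occurs on or before trial k) for a geometric setting."""
    # The complement is k failures in a row
    return 1 - (1 - p) ** k


p = 3 / 9                 # Jeff makes three out of nine attempts
mean_wait = 1 / p         # expected number of attempts until the first basket
prob_by_mean_wait = geometcdf(p, 3)
print(round(mean_wait), round(prob_by_mean_wait, 2))
```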
For this two-way table, what are the number of degrees of freedom, chi-squared, and the P-value corresponding to the chi-squared statistic with that number of degrees of freedom? (You should use the chi-squared test feature on your calculator here.) 35 147 101 629 28 222 4 34
3, 6.16, .104
A sample of size 38 has s = 15.5 and yields this confidence interval for the mean: (304.67, 320.48). What's the t* for this confidence interval?
3.14 (Margin of error = (320.48 - 304.67)/2 = 15.81/2 = 7.905. Then 7.905 = t*(15.5/sqrt(38)) = t*(2.514), so t* = 7.905/2.514 = 3.14.)
A teacher administers a standardized math test to his class of 75 students. The mean score (out of 300 possible points) is 235. From previous studies, you know the population standard deviation is 28. The principal has decided that she wants to estimate the average score to within 4 points (margin of error = 4) with 99% confidence. If she can only administer the test to one random sample of students, how large should this sample be to achieve the desired margin of error and confidence level?
326 (Since you can't sample part of a student, round up your sample size to the next whole number. (325 was wrong))
El Burrito wants to estimate the average weight of the Grande Bean Burrito to within .05 lb with 95% confidence. Using the sample standard deviation as an estimate of the population standard deviation, how large must the sample be to achieve this margin of error?
385 (n = ((1.96 x .5)/.05)^2 = 384.16; round up)
Which of the following is the best estimate of the standard deviation of the normal distribution shown here?
4 (The distance from 30 to 54 is 24. According to the 68-95-99.7 rule, most of the area under a normal curve is within three standard deviations of the mean, or a range of six standard deviations. Thus we get 24/6 = 4.)
A researcher draws two random samples from two distinct populations and calculates the following sample statistics: x̅1 = 52.6, s1 = 3.2, n1 = 28 x̅2 = 49.3, s2 = 2.9, n2 = 29 Compute a test statistic for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2
4.07
The standard deviation of SAT scores is 100 points. A researcher decides to take a sample of 500 students' scores to estimate the mean score of students in your state. What is the standard deviation of the sample mean?
4.47
The spinner for a board game has eight colors the arrow can land on. What's the minimum number of spins you'd need to be able to perform a chi-square goodness-of-fit test?
40
You want to use a five-number summary of a univariate data set, (10, 15, 25, 45, 95), to construct a (modified) boxplot. The length of the left-side whisker is 5. Using the accepted rule for determining outliers, what is the maximum length of the right-side whisker?
45 (The maximum length of a whisker in a modified boxplot is 1.5 x IQR = 1.5(45 - 15) = 45.)
Suppose population data suggests that 20% of applicants to a statistical surveying job will have prior surveying experience. How many candidates would have to be interviewed, on average, to find someone with prior surveying experience?
5 (1/.2)
Calculate the test statistic and P-value for the significance test for H0: β1 = 0 (knowing x is not helpful in predicting y), and Ha: β1 ≠ 0 (knowing x is helpful in predicting y).
t = 27.85, p < .0005
A store manager wants to know if there's a difference between packing containers for eggs. One type of container holds 30 cartons of eggs. Cartons shipped in these containers contain an average of 0.75 broken eggs, with a standard deviation of 0.1. A second type of container holds 32 cartons. These contain an average of 0.6 broken eggs per carton, with a standard deviation of 0.12. What's the two-sample t statistic you'd use to conduct a hypothesis test in this situation?
5.36
In a binomial distribution with n = 140 and p = .62, what is the expected standard deviation of the distribution, to the nearest hundredth?
5.74
In the regression output shown here, identify the value of s, the estimate of the standard deviation of the residuals calculated from the data.
5.954 (s is before the df)
You want to test your newly created Web site, so you have 250 people access it from random locations at random times. Of the people accessing the site, 75 of them experience computer crashes. You want to estimate the proportion of crashes within a margin of error of 4% at a 95% confidence interval. What sample size do you need?
505
In a binomial distribution with sample size n = 65, and probability of success p = .8, what would the approximate mean of the distribution be?
52
Your friend says she has an unfair die: the probability of getting a one or a six is 1/3 for each, and the probability of getting a two, three, four, or five is 1/12 for each. You want to test her statement. What's the minimum number of times you have to roll the die to use a chi-square goodness-of-fit test here?
60
You create a Web site, and you want to estimate the proportion of people who experience computer crashes when they access the site. You want to ensure a 4% margin of error for a 95% confidence level, but you don't have any basis for an estimate of the population proportion. Calculate the minimum sample size you'd need for this estimate.
601
For the following 4 questions, refer to this data: To estimate the mean height of female high school juniors, you take a random sample of 30 female students and get these results (in inches): 72, 51, 67, 68, 61, 69, 58, 56, 60, 56 66, 61, 60, 59, 59, 54, 58, 53, 68, 63 57, 62, 63, 64, 56, 62, 58, 67, 57, 70 If the σ is 5.3, based on past research, and you want to construct a 97% confidence interval using (estimate) ± (margin of error) = (estimate) ± z*(σ/√n), what's your point estimate of µ?
61.17
In a one-way table with 8 categories, how many degrees of freedom will the chi-square statistic for goodness-of-fit have?
7
The spinner for a board game has eight colors the arrow can land on. What are the degrees of freedom for a goodness-of-fit test of the fairness of this spinner?
7
You've decided to research whether the number of people running by your house between 6:00 PM and 7:00 PM is the same each day of the week. You collect data for four weeks and organize it in a table: Day Number Monday 17 Tuesday 10 Wednesday 18 Thursday 9 Friday 6 Saturday 9 Sunday 11 Total 80 What's the expected value for each category?
80/7
Consider this two way table: x x X 164 x x x 236 103 91 206 Which of the following is the correct expected value for the shaded cell?
84.46
The weight of adult Golden Retrievers is normally distributed with a mean of 65 lb and a standard deviation of 3 lb. A random sample of 15 Golden Retrievers has an average weight of 66 lb. The percentile rank of this sample is:
90
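The percentile comes from the sampling distribution of the mean, which has standard deviation σ/sqrt(n), not σ. A minimal sketch using `statistics.NormalDist` in place of a z table (an assumption of this note):

```python
import math
from statistics import NormalDist

# Sampling distribution of the mean weight for samples of 15 dogs
sampling_dist = NormalDist(mu=65, sigma=3 / math.sqrt(15))

# Percentile rank of a sample mean of 66 lb
percentile = 100 * sampling_dist.cdf(66)
print(round(percentile))
```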
A group of dieters at a weight clinic are losing weight. Nine patients have lost the following number of pounds in the past month: 26.2, 17.2, 13.6, 13.4, 12.8, 11, 9.9, 6.2, 7.9. Find a 95% confidence interval for the mean weight loss for the population of dieters.
<8.60, 17.67>, t* = 2.306
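The interval can be reproduced with a short sketch. The critical value t* = 2.306 (8 degrees of freedom) is taken from the card rather than computed, since the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

losses = [26.2, 17.2, 13.6, 13.4, 12.8, 11, 9.9, 6.2, 7.9]
t_star = 2.306  # table value for df = 9 - 1 = 8, as given on the card

x_bar = mean(losses)
standard_error = stdev(losses) / math.sqrt(len(losses))  # s / sqrt(n)
margin = t_star * standard_error

lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))
```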
To test a new diet drug, researchers divide their 96 subjects by gender into two groups (there are equal numbers of men and women). They then randomly subdivide these into three groups each so they can test three different dosages. Which of the following best describes the design of this experiment?
A comparative randomized design, blocked by gender.
Which of these is an example of a matched pairs situation?
A cosmetics company wants to measure the effectiveness of its moisturizer. Company researchers take two measurements on a random sample of people: one for skin condition before using the product and the other for skin condition after using the product.
You compute a 99% confidence interval with a sample of size A and a margin of error of ± 5 units. You now wish to compute a 90% confidence interval with the same margin of error with a sample of size B. For this situation, which of the following is true?
A is greater than B. (To achieve the same margin of error (that is, 5 units) at a lower confidence level, it's not necessary for your sample to be as large as it was for a higher confidence level.)
Which of the following situations satisfies all the conditions of a binomial setting? (These conditions are: we know the number of repetitions, the outcome of each trial can be considered either a success or a failure, we know the probability of success or failure of any trial, and the probability doesn't change from trial to trial.)
A jar contains 500 balls: 300 red and 200 white. Ten balls are randomly selected from the jar, and the number X of red balls is recorded.
Which of the following is most important for accuracy in inferential statistics?
A properly drawn random sample
Which of the following scenarios is not a matched pairs situation?
A sample of 50 people agree to participate in a study on vitamin C. Half the group take vitamin C supplements, while the other half take a placebo. During a six-month period, researchers record the number of colds each subject has.
We draw two independent simple random samples from two distinct normally distributed populations. The means and standard deviations of the populations are unknown, but we're interested in comparing the two populations to see if the means are the same. Which of the following could we use to get this information?
A two-sample t confidence interval
Which of the following isn't mandatory in a hypothesis-testing procedure using a test statistic? - state null and alternative hypothesis in the context of the problem - justify the use of a particular test statistic - correctly compute the test statistic of interest - give a conclusion in the context of the problem
All of the above
The null hypothesis is: There's no association between race of respondent and political preference. A 3 X 2 table is constructed to test this hypothesis. The computed value of chi-squared is 7.50. Which of the following is the best conclusion we can make based on this information? (Hint: The best conclusion is given in the context of the problem.)
At the .05 level, we have good evidence that race of respondent and political preference are associated.
What's the probability of getting 1 or 3 fives on 10 rolls of a fair die? Hint: Remember the rule for finding P(A or B).
B(10, 1/6, 1) + B(10, 1/6, 3) (P(X = 1) is binompdf(10, 1/6, 1) and P(x = 3) is binompdf(10, 1/6, 3). To combine the two probabilities using or, simply add them: P(X = 1) + P(X = 3) = .323 + .155 = .478.)
Fred is a weightlifter who can lift 800 pounds on 45% of his attempts. Which of these expressions represents the probability Fred will make 30 lifts out of 60?
B(60, .45, 30)
Which of the following is not a necessary component of a valid experimental design?
Blocking
High Voltage, Inc., a light bulb manufacturer, has gathered data for a matched pairs analysis of whether there's a difference in the life span between its light bulbs and those of a competitor. They have two data sets: the number of hours its bulbs lasted on various lamps and the number of hours its competitor's bulbs lasted on various lamps. Observations are paired by lamp (there are two observations for each type of lamp; one observation for a High Voltage bulb and another observation for a competitor's bulb). What should High Voltage do next?
Calculate the difference within each pair of observations.
A cosmetics company wants to measure the effectiveness of its moisturizer. Company researchers take two measurements on a random sample of people: one for skin condition before using the product and the other for skin condition after using the product. Which of these is the company's next step in this study?
Calculate the differences (after moisture content minus before moisture content) and use the mean and standard deviation of the sample of differences to construct a confidence interval of the true population difference.
What kind of data would you analyze with a chi-square statistic?
Categorical data
Which of the following is NOT true of chi-square distributions and the goodness-of-fit test?
Chi-square distributions approximate normal distributions.
Researchers calculated the following summary statistics from two simple random samples from two distinct populations: x̅1 = 3.647, s1 = 1.9, n1 = 17 x̅2 = 2.529, s2 = 1.7, n2 = 17 When the researchers computed the degrees of freedom by hand, they got k = 16, and for the P-value, they got p = .0447. However, their computer software gave them k = 31.60 and p = .0402. Which of the following is most likely true about the difference in P-values?
Computer software gives a larger value for k and a lower P-value, which gives you a higher power of the test. (Results generated using software more accurately represent a t distribution, and so give you a smaller P-value. This is because software uses a greater value for k, the degrees of freedom. A smaller P-value reduces the chance of a Type II error, which increases the power of your test.)
You draw two simple random samples from two distinct populations and calculate the following: x̅1 = 23.4, s1 = 4.2, n1 = 25 x̅2 = 25.3, s2 = 3.9, n2 = 27 Using the conservative method (no pooling and no calculator or computer software), what are the degrees of freedom for this problem? What if you were to use pooling?
Conservative: 24; pooled: 50
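The three ways of getting degrees of freedom mentioned in the last two questions can be sketched in a few lines. This is only an illustration: the function names are made up here, and the third formula is the Welch-Satterthwaite approximation, which is assumed to be what the software in the earlier question used.

```python
def conservative_df(n1, n2):
    # Smaller of n1 - 1 and n2 - 1 (the by-hand method).
    return min(n1, n2) - 1

def pooled_df(n1, n2):
    # Pooling assumes equal population standard deviations.
    return n1 + n2 - 2

def welch_df(s1, n1, s2, n2):
    # Welch-Satterthwaite approximation (assumed software method).
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

cons = conservative_df(25, 27)           # question with n1 = 25, n2 = 27
pooled = pooled_df(25, 27)
software_k = welch_df(1.9, 17, 1.7, 17)  # question with n1 = n2 = 17
print(cons, pooled, round(software_k, 2))
```

The software-style value always falls between the conservative and pooled values, which is why the by-hand method gives the larger P-value.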
Which of the following conditions doesn't need to be met before you can use a two-sample procedure?
Data in two samples are matched together in pairs that are compared.
A manufacturer claims Fertilizer X will cover an average of 2,000 square feet per bag, with a standard deviation of 250 square feet. A sample of 80 bags was tested, and the mean coverage was 2,050 square feet. Using a 95% confidence interval, which of the following would be an acceptable conclusion?
Don't reject the null that µ = 2,000.
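As a rough check of the fertilizer answer, the interval can be computed directly from the numbers in the question. The variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu0, sigma, n, xbar = 2000, 250, 80, 2050

se = sigma / math.sqrt(n)                # standard error of the mean
z_star = NormalDist().inv_cdf(0.975)     # critical value for 95% confidence
ci = (xbar - z_star * se, xbar + z_star * se)

# The null value is rejected only if it falls outside the interval.
reject = not (ci[0] <= mu0 <= ci[1])
print(ci, reject)
```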
Management of El Burrito decides that if their burritos are larger than the standard weight of 1.2 lb, drink sizes will have to be reduced to compensate for the loss of profits. For a one-sided hypothesis test where H0: µ = 1.2 lb, Ha: µ > 1.2 lb, and α = .01, which of these statements represents a Type II error in this scenario?
Drink sizes are kept the same because the null hypothesis is not rejected, but the actual weight of burritos is greater than 1.2 lb
True or False: As long as there are no empty cells, we're justified in using the chi-squared statistic to test a null hypothesis of independence between two categorical variables.
False.
You'd like to give an estimate for the mean difference in the length between the right foot and the left foot on people within a certain population. Which of the following is the best way to do this?
Find the differences in the lengths of the left and right feet for each person you measured. Use the mean and standard deviation of your sample to construct a confidence interval for the difference.
For which of these scenarios could you use the normal approximation to the binomial?
Flip a coin 100 times and count the number of heads.
Suppose you weigh each rabbit in a group of 100 before and after they're put on a special protein diet. Starting with these data, what would you do next to determine whether the rabbits had gained weight while on this diet?
For each rabbit, calculate the weight difference (subtract the before weight from the after weight), then analyze the set of differences as a single sample.
Which of the following is not true about two-sample t procedures?
For two-sample t procedures, you can use a pooled estimator of the variance if both samples are assumed to be normal. (You can use a pooled estimator of the variance only when you can assume both samples are drawn from populations with equal standard deviations. Because you can rarely make this assumption, and because we now have wide access to computer programs and calculators that can compute more precise estimates of degrees of freedom without pooling, it's usually not a good idea to pool.)
A teacher believes that girls will perform better on a test of spatial ability than boys will, and is designing a study to test this hypothesis. Which of the following statements is correct?
Generating a confidence interval isn't a valid technique in this context.
The following is a list of differences between first and second semester final grades in chemistry (the first semester grade was subtracted from the second): -6, 10, -3, 5, 4, 8, -4, 9, -2, 0, -5, -5, 5, 12, 10. Interpret the hypotheses in the one-sided test Ho: µ(2 - 1) = 0 and Ha: µ(2 - 1) > 0
H0: µ(2-1) = 0 (The mean difference between the first and second semester grades is 0.) Ha: µ(2-1) > 0 (The mean second semester grade is higher than the mean first semester grade.)
Which of these are proper null and alternative hypotheses for a two-sided significance test about the slope of the regression line?
H0: β1 = 0 Ha: β1 ≠ 0
Management at High Voltage, a light bulb manufacturer, wants to know if there's a difference in mean life span between its product and that of a competitor. To answer this question, what would be an appropriate test using matched pairs single-sample analysis?
High Voltage should set up an experiment where randomly selected bulbs from both companies are paired, with each pair consisting of a High Voltage bulb and a competitor's bulb. Researchers should place each pair of bulbs in identical light fixtures and measure the difference in their life spans. (In a matched pairs analysis, observations are always paired in some way. In this situation, researchers compare the life spans of each bulb in a pair and record the difference as a single number.)
Researchers randomly assign subjects to two groups, with each group receiving a different asthma medication. For one month, researchers record the number of asthma attacks suffered by each subject and compute the summary statistics as follows: x̅1 = 3.8, s1 = 1.7, n1 = 23 x̅2 = 4.2, s2 = 1.5, n2 = 24 Which pair of hypotheses test whether the first mean is less than the second?
Ho: (µ1 - µ2) = 0, Ha: (µ1 - µ2) < 0
You're interested in determining whether there's a difference in people's preferences for orange or pink food. To see if there's a difference, you make a large batch of vanilla pudding and dye one half orange and the other half pink. You randomly select 132 students to participate in a taste test, and you randomize which pudding people try first. You're planning to record the proportion of people who prefer pink. What will your null and alternative hypotheses be?
Ho: p = .5 Ha: p ≠ .5 (always stated in terms of population parameters)
Management at a sparkling water factory has designed a new system it thinks will lower the proportion, currently at .12, of incorrectly labeled bottles. In the year following the implementation of the new system, quality control engineers randomly sample 378 bottles and find 40 that are labeled incorrectly. Which of these hypotheses tests whether the proportion of incorrectly labeled bottles has decreased?
Ho: p = .12 Ha: p < .12
The staff at an aquarium with 500,000 annual visitors wants to know whether there's a difference in the proportion of its patrons who come specifically for workshops and those who come for other reasons. The staff takes a random exit poll of 1,238 visitors, and 751 say they came for workshops. What are the hypotheses for your test?
Ho: p = .5 Ha: p ≠ .5
Which of the following is a proper set of hypotheses for a test about a single population proportion?
Ho: p = po Ha: p < po
You've decided to research whether the number of people running by your house between 6:00 PM and 7:00 PM is the same each day of the week. You collect data for four weeks and organize it in a table: Day Number Monday 17 Tuesday 10 Wednesday 18 Thursday 9 Friday 6 Saturday 9 Sunday 11 Total 80 Assuming that p represents the true proportion of runners each day, what will your null and alternative hypothesis be?
Ho: p(m) = p(t) = p(w) = p(th) = p(f) = p(sa) = p(su) = 1/7 Ha: proportion of runners on each day are not all 1/7
High Voltage, Inc., a light bulb manufacturer, wants to know if there's a difference in mean life span between its product and that of a competitor. They collect paired data and calculate the difference in each pair to create one set of numbers that represents the differences within each pair. What notation should High Voltage use to construct its null and alternative hypotheses?
Ho: µ(H.volt - Comp.) = 0 (The mean of the differences in life spans equals 0) Ha: µ(H.volt - Comp.) ≠ 0 (The mean of the differences in life spans doesn't equal 0)
If you're conducting a significance test for the difference between the means of two independent samples, what's your null hypothesis?
Ho: µ1 - µ2 = 0
Which of the following is a list of common steps to inference?
Identify the study, be sure the study and your sample are valid, calculate probabilities or confidence intervals, test for significance
Which of the following statements are true?
If sample sizes are large enough, and if the population is large compared to the sample, we can compare two proportions using a normal approximation to the binomial.
The 99.7% confidence interval for the mean length of frog jumps is (12.64 cm, 14.44 cm). Which of the following statements is a correct interpretation of 99.7% confidence? - Of the total number of frogs in your area of the country, 99.7% can jump between 12.64 cm and 14.44 cm. - There's a 99.7% chance that the mean length of frog jumps falls between 12.64cm and 14.44 cm. - If we were to repeat this sampling many times, 99.7% of the confidence intervals we could construct would contain the true population mean. - 99.7% of the confidence intervals we could construct after repeated sampling would go from 12.64 cm to 14.44 cm. - There's a 99.7% chance that any particular frog I catch can jump between 12.64 cm and 14.44 cm.
If we were to repeat this sampling many times, 99.7% of the confidence intervals we could construct would contain the true population mean.
A researcher computes a 90% confidence interval for the mean weight (in lbs) of widgets produced in a factory. The interval is (7.2, 8.9). Which of these is a correct interpretation of this interval?
If you drew many samples of size n and constructed a confidence interval from each sample, 90% of the intervals would contain the true population value.
An advice columnist asks readers to write in about their marriages. The results indicate that 79% of respondents would not marry the same partner if they could do it all over again. Which of the following statements is true?
It is likely that this percentage is higher than the true population proportion, since people who aren't happy in their marriages are more likely to respond than those who are happy. (The tendency in voluntary response surveys is for the people who feel most strongly about an issue to respond in disproportionate numbers. In this case, if people are very unhappy in their marriage, they are more likely to respond.)
slope = .03849, p = .611 Given the linear regression analysis above, which of the following conclusions can we draw?
It is unlikely that the slope of the regression line is different from zero.
For the line y hat = 36.5 + .48x, how do you interpret the value .48?
It's the amount of change in y when x increases by one unit. (The value .48 is the slope of the line. The slope is interpreted as the amount of change in y when x increases by one unit.)
Which of the following best describes the power of a test?
It's the probability that a test will successfully reject a false hypothesis.
If a p-value is statistically significant, this means that:
It's unlikely that this result occurred as a result of random variation
Which of the following is true about the chi-square distribution?
It's usually right-skewed and can't have negative values.
Which of the following is a characteristic of the t distribution?
Its shape depends on the number of degrees of freedom.
You're at a Seattle Mariners baseball game late in the season, when Ken Griffey Jr.'s batting average is 0.308. You want to calculate the probability, using a binomial setting, that he'll get his first hit of the game on or before his third at bat. What assumption(s) do you have to make to get your answer?
Ken Griffey Jr.'s 0.308 batting average won't change (at least significantly) in each at bat.
You're designing a new mouse for computers. You want to see if it's faster for left-handed people to use it with their right or their left hand. Which of the following best describes a matched pairs scenario for this study?
Left-handed employees are asked to use the mouse to complete a series of timed tasks on each hand. For each employee, the starting hand is selected randomly. (In this case, the experimental units (hands) are paired. You analyze the difference in performance between hands for each pair of hands. This scenario also uses randomization of the starting hand to reduce bias or confounding.)
Out of a population of 10,000 voters, 60% vote for Carl Fredrick Gauss. You take many, many samples of size 30 from this population and, for each sample, count the number of people who vote for Gauss. What's the approximate distribution of counts of votes for Gauss?
N(18, 2.68)
The B(225, 1/5) distribution can be approximated by what normal distribution?
N(45, 6)
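Both of the normal approximations above come from the same two formulas, mean = np and standard deviation = sqrt(np(1 - p)). A small sketch (the helper name is hypothetical):

```python
import math

def normal_approx(n, p):
    # Mean and standard deviation of the normal approximation to B(n, p).
    return n * p, math.sqrt(n * p * (1 - p))

gauss_votes = normal_approx(30, 0.6)   # counts of Gauss voters in samples of 30
b_225 = normal_approx(225, 1 / 5)      # the B(225, 1/5) distribution
print(gauss_votes, b_225)
```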
What probability model would you use for the following scenario? Pennies currently in circulation have a mean age of about 8 years, and a standard deviation of about 8 years. You gather a sample of 40 pennies and find their mean age to be 9.2 years.
N(8, 8/sqrt(40))
Researchers are testing a treatment for tumors. They've found that tumors are shrinking in patients receiving the new treatment. Their calculations suggest that if chance were the only factor, there's a 2% chance they'd get the same results. Using a significance level α of .01, they conclude the results aren't significant. Would the results be more convincing if they'd used an α of .05? Why?
No. Raising the significance level α wouldn't lower the P-value.
Researchers are testing a treatment for tumors. They've found that tumors are shrinking in patients receiving the new treatment. Their calculations suggest that if chance were the only factor, there's a .9% chance they'd get the same results. At a .01 significance level, do the researchers have proof the new treatment is working?
No. They may have one piece of evidence, but they don't have proof.
Suppose you want to estimate the proportion of polydactyl cats (cats with extra toes). You sample 419 randomly selected cats, and find 56 polydactyls. You have no prior assumption to test; you only want to estimate the proportion. Should you carry out a hypothesis test using the sample proportion 56/419 as your null hypothesis? Select the best answer.
No. When all you want to do is estimate a population parameter, you should construct a confidence interval.
A researcher is looking for evidence that some people can predict the outcome of a roll of dice. Out of 500 subjects, 3 have results significantly better than random guessing would produce, meaning these 3 subjects had p-values less than .01. What can you conclude?
Not much. Using the laws of chance, you'd expect 5 people on average out of 500 to do that much better than guessing. Having 3 people in that range isn't very far from what you'd expect.
When do you pool the standard errors of proportion estimates?
Only when calculating test statistics for significance tests about the difference between two proportions
Joe is a fish thrower who throws fish into a chute at a processing plant. Joe misses the chute 20% of the time. Using the normal approximation with the continuity correction, which of these correctly represents the probability Joe will miss fewer than 80 times if he throws 500 fish?
P(X < 79.5) for N(100, 8.94)
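To see how well the continuity-corrected approximation does for Joe's misses, it can be compared with the exact binomial probability P(X ≤ 79). This is a sketch using only the numbers in the question.

```python
import math
from statistics import NormalDist

n, p = 500, 0.2
mean, sd = n * p, math.sqrt(n * p * (1 - p))

# Normal approximation with continuity correction: P(X < 79.5).
approx = NormalDist(mean, sd).cdf(79.5)

# Exact binomial: "fewer than 80" means X = 0, 1, ..., 79.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(80))
print(round(sd, 2), approx, exact)
```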
Which of these represents the probability of getting doubles (getting the same number on two dice) on or before the seventh roll of two six-sided dice?
P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) (this equals geometcdf(1/6, 7))
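The sum above can be checked against the closed form 1 - (1 - p)^7, which is what a geometric cdf function computes. A minimal sketch:

```python
import math

p = 1 / 6  # probability of doubles on one roll of two dice

# Sum of geometric probabilities P(X = k) = (1 - p)^(k - 1) * p for k = 1..7.
pmf_sum = sum((1 - p) ** (k - 1) * p for k in range(1, 8))

# Closed form: 1 minus the probability of seven straight failures.
closed_form = 1 - (1 - p) ** 7
print(pmf_sum, closed_form)
```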
A friend of yours can shoot free throws with 70% accuracy (she makes 70% of her shots). If she attempts 25 free throws, what's the probability she'll make fewer than 14?
P(x < 14)
A friend of yours can shoot free throws with 75% accuracy (she makes 75% of her shots). If she attempts 25 free throws, what's the probability she'll make at least 20?
P(x = 20) + P(x = 21) + P(x = 22) + P(x = 23) + P(x = 24) + P(x = 25)
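Both free-throw answers are sums of binomial probabilities, so they can be evaluated directly. The helper name `binom_pmf` is just for illustration.

```python
from math import comb

def binom_pmf(n, p, k):
    # P(X = k) for X ~ B(n, p).
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 70% shooter, 25 attempts, fewer than 14 makes: k = 0..13.
fewer_than_14 = sum(binom_pmf(25, 0.70, k) for k in range(14))

# 75% shooter, 25 attempts, at least 20 makes: k = 20..25.
at_least_20 = sum(binom_pmf(25, 0.75, k) for k in range(20, 26))
print(fewer_than_14, at_least_20)
```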
In a blind ESP test, a person correctly identifies whether a tossed coin comes up heads or tails in 63 trials out of 200. Using the normal approximation (without the continuity correction), which of the following would you use to calculate the probability of correctly identifying 63 or more?
P(z > -5.233)
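The z-value in this answer follows from the normal approximation to B(200, .5), assuming the subject is only guessing. A quick check:

```python
import math

n, p, observed = 200, 0.5, 63

mean = n * p
sd = math.sqrt(n * p * (1 - p))
z = (observed - mean) / sd   # no continuity correction, as the question states
print(round(z, 3))
```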
Suppose voters from a simple random sample of 500 (N > 1,000,000) are interviewed and asked which presidential candidate they're going to vote for. Of these, 35% say they'll vote for the Statistics Party. You want to know the probability, assuming this proportion is correct for the population, that more than 40% of a random sample of 500 people will vote for the Statistics Party. Which of these shows the three ways you could find your answer?
Proportion, using normal approximation: normalcdf(.40, E99, .35, .0213) Exact binomial count: 1 - binomcdf(500, .35, 200) Binomial count, using normal approximation: normalcdf(200, E99, 175, 10.66)
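The three approaches can be mirrored outside a calculator. The sketch below assumes p = .35 and n = 500 from the question; the two normal approximations should agree with each other, and the exact binomial value should be close to both.

```python
import math
from statistics import NormalDist

n, p = 500, 0.35

# 1. Sample proportion, normal approximation: P(p-hat > .40).
se_prop = math.sqrt(p * (1 - p) / n)
by_proportion = 1 - NormalDist(p, se_prop).cdf(0.40)

# 2. Exact binomial count: P(X > 200) = 1 - P(X <= 200).
exact = 1 - sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(201))

# 3. Count, normal approximation: P(X > 200).
sd_count = math.sqrt(n * p * (1 - p))
by_count = 1 - NormalDist(n * p, sd_count).cdf(200)
print(by_proportion, exact, by_count)
```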
A chi-squared test for independence is performed on a 3x4 two-way table. Chi-squared is found to equal 13.3 Which of the following can we do next?
Reject the independence hypothesis at the .05 level of significance with 6 degrees of freedom.
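The degrees of freedom and the decision can be verified without tables. This sketch relies on a closed form for the chi-square upper-tail probability that holds only when the degrees of freedom are even; the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    # Upper-tail probability of a chi-square variable; valid for even df only.
    assert df % 2 == 0
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

rows, cols = 3, 4
df = (rows - 1) * (cols - 1)
p_value = chi2_sf_even_df(13.3, df)
print(df, p_value, p_value < 0.05)
```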
You draw two random samples from two distinct populations and calculate the following sample statistics: x̅1 = 52.6, s1 = 3.2, n1 = 28 x̅2 = 49.3, s2 = 2.9, n2 = 29 k = df = 27 t = 4.07 p = .0002 State a conclusion, based on the P-value and alpha = .05, for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2.
Reject the null hypothesis in favor of the alternative, which states that the mean of population one is greater than the mean of population two.
In a particular roulette game, there's a 1/36 chance of winning. In a single day, a gambler plays 100 rounds and wins 7. You think the gambler may be cheating. At a significance level of .05, are your results significant, and do you have evidence the gambler may be cheating?
Results are significant and there's evidence the gambler may be cheating. (The results are significant at the .05 level, and significance means you do have evidence that the gambler may be cheating.)
Which of the following isn't necessary to compute the sample size appropriate for a given confidence level and margin of error?
Sample mean x-bar
The following is a list of differences in first and second semester final grades. (To calculate each number, a first semester grade was subtracted from a second semester grade): -6, 10, -3, 5, 4, 8, -4, 9, -2, 0, -5, -5, 5, 12, 10. Give a conclusion for the one-sided test H0: µ(2-1) = 0, Ha: µ(2-1) > 0, where p = .074 and α = .05.
Second semester grades aren't higher than the first semester grades. (Since p = .074 and α = .05, we can't reject the null hypothesis that µ = 0, where µ is the true population mean difference between first and second semester grades. To reject the null hypothesis in a case where you've been given a significance level, you need the P-value to be less than alpha.)
Which of the following statements is true?
Smaller sample sizes produce larger margins of error because smaller samples are more susceptible to random variability.
Management of El Burrito decides that if their burritos are larger than the standard weight of 1.2 lb, drink sizes will have to be reduced to compensate for the loss of profits. For a one-sided hypothesis test where H0: µ = 1.2 lb, Ha: µ > 1.2 lb, and α = .01, which of these statements represents a Type I error in this scenario, and what's the probability of making such an error?
The average burrito is actually 1.2 lb, but the test concludes that the burrito is larger than 1.2 lb. The company decreases drink sizes. P(Type I error) = .01. (same as α)
Which statement best describes the meaning of the term average waiting-time?
The average number of trials required to get the first success
Remember that El Burrito is concerned that customers will be dissatisfied if its Grande Bean Burrito is averaging less than 1.2 lb. But if its Grande Bean Burrito is averaging more than 1.2 lb, its profits will suffer. Conduct a one-sided hypothesis test for burrito weight where H0: µ = 1.2 lb and Ha: µ > 1.2 lb. Which of these statements represents a logical conclusion based on the results of the test?
The company's profits may suffer because the burritos are too large. The computed z-value, z = 2.6, is greater than the critical z-value for α = .01, which is 2.33 (one-sided!)
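The one-sided critical value quoted in this answer can be reproduced from the standard normal distribution. The z-value of 2.6 is taken as given, since the underlying sample data aren't shown here.

```python
from statistics import NormalDist

alpha = 0.01
z = 2.6                                   # computed z-value given in the answer

z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
p_value = 1 - NormalDist().cdf(z)         # upper-tail P-value
reject = z > z_crit
print(round(z_crit, 2), p_value, reject)
```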
You draw two simple random samples from two distinct populations and calculate the following: x̅1 = 23.4, s1 = 4.2, n1 = 25 x̅2 = 25.3, s2 = 3.9, n2 = 27 The estimate of the degrees of freedom, k, equals n1 - 1, or 24, and t* equals 2.064, and m, the margin of error, is 2.325. Construct a 95% confidence interval for the difference between these two populations and draw a conclusion based on this confidence interval.
The confidence interval is (-4.225, .425); because the interval contains 0, there's no significant difference between the two population means.
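The margin of error and interval in this question can be rebuilt from the summary statistics, taking t* = 2.064 as given in the question. A sketch:

```python
import math

xbar1, s1, n1 = 23.4, 4.2, 25
xbar2, s2, n2 = 25.3, 3.9, 27
t_star = 2.064                          # given for k = 24 at 95% confidence

se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # unpooled standard error
m = t_star * se                          # margin of error
diff = xbar1 - xbar2
ci = (diff - m, diff + m)
contains_zero = ci[0] <= 0 <= ci[1]
print(round(m, 3), [round(v, 3) for v in ci], contains_zero)
```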
Using a random sample of 2,000 students, you compute a 95% confidence interval to estimate the mean calories consumed by eighth graders. You decide to compute another 95% confidence interval using a different sample, this time with only 1,000 students. What change would you expect from the first confidence interval to the second?
The confidence interval will be wider.
An advertisement claims HairBuilder cures male pattern baldness. Your father, who wouldn't mind having more hair, is evaluating the advertisement's claim. If he makes a Type II error in his evaluation, which of the following will be true?
The cure doesn't work, but your father believes the claim and buys the medication.
An advertisement claims HairBuilder cures male pattern baldness. Your father, who wouldn't mind having more hair, is evaluating the advertisement's claim. If he makes a Type I error in his evaluation, which of the following will be true?
The cure really works, but your father doesn't believe the claim and doesn't buy the medication.
Your friend says she has an unfair die: the probability of getting a one or a six is 1/3 for each, and the probability of getting a two, three, four, or five is 1/12 for each. You want to test her statement. If you roll the die 96 times, what are the expected values for each number on the die?
The expected values are 32 for both ones and sixes, and 8 for twos, threes, fours and fives.
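Expected counts for a goodness-of-fit test are just the number of trials times each claimed probability. The sketch below uses exact fractions so the counts come out as whole numbers.

```python
from fractions import Fraction

rolls = 96
# Claimed probabilities for the unfair die.
probs = {face: Fraction(1, 3) if face in (1, 6) else Fraction(1, 12)
         for face in range(1, 7)}

expected = {face: rolls * p for face, p in probs.items()}
print(expected)
```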
In a particular roulette game, there's a 1/36 chance of winning. In a single day, a gambler plays 100 rounds and wins in 7 of them. You think the gambler may be cheating. If you were to test this theory, what would be an acceptable null hypothesis?
The gambler wins about 1 out of every 36 rounds; the wins are due to chance alone. (The null is a simple statement about what you'd expect if nothing significant were happening. It usually states what you'd expect from chance alone.)
Elite Foods, a supermarket, is losing customers and suspects it's because another neighborhood market has lower prices. To determine whether this is the case, researchers from Elite Foods collected data on ten items found in both stores. They created a single set of numbers by subtracting their prices from those of the neighborhood market. For the test Ho: µ(B - A) = 0 and Ha: µ(B - A) < 0, where µ(B - A) represents the mean difference in price between the two stores (A = Elite, B = neighborhood market), researchers found that t = -2.237, p = .021. What can you conclude from these results if α = .05?
The neighborhood market has lower prices than Elite Foods.
Which of the following represents a geometric setting?
The number of random telephone numbers you dial until you get an answer (This meets the criteria for a geometric setting: each trial has just two outcomes (usually success and failure), and the probability of success is the same for each trial (usually referred to as p). In the geometric situation, however, the random variable X of interest is the number of trials required to obtain the first success. In this case, someone either answers or doesn't, and the probability p of success should be the same each time.)
Suppose you calculated a p-value of .0417. How would you interpret this value?
The p-value is statistically significant.
A p-value tells you:
The probability that you'd get results as extreme as you did, from random variation alone
According to a poll, approximately 49% of Americans spent over $500 on holiday presents in 1998. Suppose you took a random sample of 120 shoppers and found that 71 spent over $500 on holiday presents in 1999. Using a null hypothesis that the proportion is .49 and an alternative that the proportion is greater than .49, what would a hypothesis test suggest about the proportion of people spending over $500 on holiday presents?
The proportion has probably gone up since 1998, since the P-value for the sample proportion is around .014.
In a recent public opinion poll of 750 adults, 40% said that, if they had to decide between watching a rerun of Gilligan's Island and watching the State of the Union address, they'd choose the rerun. The poll reported a margin of error of 5%. Which of the following best describes what is meant by a 5% margin of error?
The true population percentage is likely to be within 5% of the sample percentage.
In the geometric setting the trials are independent, each trial has just two possible outcomes (success and failure), the probability of success is the same for each trial (referred to as p), and the random variable X is the number of trials required to get the first success. Which of the following scenarios meets the requirements of a geometric setting?
There are 10 different prizes in boxes of Googily-Snaps. Prize four is the most valuable among collectors. What's the probability that I'll get prize four without having to buy more than 4 boxes? (Each trial is independent, each trial has the same probability of success or failure, and you're interested in the probability of success within a certain number of trials.)
High Voltage, Inc., a light bulb manufacturer, wants to know if there's a difference in mean life span between its product and that of a competitor. The company has collected data for a matched pairs analysis, calculated the differences in the pairs of data to make one set of values, and set up the test Ho: µ(H.volt - Comp.) = 0 Ha: µ(H.volt - Comp.) ≠ 0 where µ(H.volt - Comp.) stands for the population mean difference between the life spans of the two brands of light bulbs. What conclusion can be drawn if p = .001 and α = .05?
There's a difference between the mean life span of High Voltage's bulbs and those of its competitor.
A statistics professor has been teaching first- and second-semester statistics for five years. She wants to know if the grades in her first-semester classes differ significantly from those of her second-semester classes. She takes a random sample of final grades from her two most recent first- and second-semester classes and gets these data: x̅1 = 88.35, s1 = 5.314, n1 = 20 x̅2 = 85.35, s2 = 4.368, n2 = 20 Using her TI-83 calculator, she finds the following 99% confidence interval: (-1.179, 7.1791) Draw a conclusion about the difference between these population means based on this confidence interval. Use Ho:(µ1 - µ2) = 0.
There's no difference between the average grades of the first- and second-semester classes.
The 95% confidence interval for the mean of the differences between first and second semester chemistry grades is (-1.501, 5.23). What can you conclude from this confidence interval?
There's no difference in grades between the first and second semester. (When we test for a difference, we're testing the null hypothesis that µ = 0. Since 0 is included in this interval, the null hypothesis can't be rejected.)
A whale-watching company noticed that many customers wanted to know whether it was better to book an excursion in the morning or the afternoon. To test this question, the company recorded the number of whales seen in the morning and afternoon on 15 randomly selected days over the past month, calculated differences between the observations in each pair (afternoon minus morning), and then constructed a 95% confidence interval for the true population mean difference. What conclusion can you draw from the 95% confidence interval (-.507, 1.307) for µ(B - A), the population mean difference?
There's no significant difference between the number of whales seen in the morning and the number seen in the afternoon. (Since zero is contained in the confidence interval for µ(B - A), the true population mean difference, the null hypothesis µ(B - A) = 0 can't be rejected.)
What is µX for the discrete probability distribution here? X 2 3 4 6 p .2 .2 .3 .2
This is not a valid probability distribution. (the probs. don't add up to 1)
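A quick way to catch this kind of problem is to add up the probabilities before computing a mean. A minimal sketch:

```python
import math

values = [2, 3, 4, 6]
probs = [0.2, 0.2, 0.3, 0.2]

total = sum(probs)
is_valid = all(0 <= p <= 1 for p in probs) and math.isclose(total, 1.0)
print(total, is_valid)
```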
Suppose a study showed that the four most popular pizzas are preferred in the following percentages: Topping Percent Cheese 24 Pepperoni 32 Sausage 29 Veggie 15 You decide to keep track of the next 160 pizzas of these types ordered at the restaurant where you work to see if the percentages hold true. What are the expected numbers for each type of pizza in your study?
Topping Expected Number Cheese 38.4 Pepperoni 51.2 Sausage 46.4 Veggie 24
You want to know whether high-school females score differently from high-school males on a standardized test. You take a random sample of high-school males and a random sample of high-school females from across the country and record their scores. What method should you use to analyze the data?
Two-sample procedure for the difference between means
You want to know whether high-school females are more likely to pass an AP Calculus class than high-school males. You take a random sample of males and females from high-school AP Calculus courses across the country and record whether each receives a passing or failing grade. What method should you use to analyze these data?
Two-sample procedure for the difference between proportions
You're designing a study and want to know the sample size needed to achieve a confidence interval for the population proportion with a given margin of error at a given confidence level. Although there have been previous studies like the one you're conducting, you think the population proportion has changed significantly, and you don't know what value to use for p*. How should you go about finding the sample size needed to achieve your desired margin of error?
Use p* = .5 to get a conservative estimate of the sample size.
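The reason p* = .5 is conservative is that p*(1 - p*) is largest there, so it gives the largest required n. The sketch below uses a hypothetical 95% confidence level and a hypothetical margin of error of .05, neither of which comes from the question.

```python
import math
from statistics import NormalDist

def sample_size(margin, confidence, p_star=0.5):
    # n = (z* / m)^2 * p*(1 - p*), rounded up to a whole subject.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

n_conservative = sample_size(0.05, 0.95)           # hypothetical inputs
n_smaller = sample_size(0.05, 0.95, p_star=0.2)    # any other guess needs fewer
print(n_conservative, n_smaller)
```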
Based on the information you've been given for this activity, and assuming for this question that σ and s are equivalent, which of the following is a proper conclusion for the two-sided hypothesis test where H0: µ = 1.2?
We can reject H0 at α = .05. The computed z-value, z = (x̅ - µ) / (σ / sqrt(n)) = 2.6, is greater than the critical z-value, z* = 1.96.
Suppose we have a hypothesized population mean µ = 50, and we use our sample data to construct a 98% confidence interval (48, 54). Which of these statements is true if our alternative hypothesis is µ ≠ 50?
We can't reject the null hypothesis at α = .02. (Since the confidence interval contains the hypothesized mean, we can't reject the null. And since the confidence level is 98% (or .98), the significance level at which our null hypothesis can't be rejected is (1 - C) = (1 - .98) = .02.)
A backpack manufacturer wants to know if students in high school carry more books than college students do. Company researchers take a simple random sample from each group and record the number of textbooks each subject is carrying. They compute the following summary statistics: x̅1 = 3.647, s1 = 1.9, n1 = 17 x̅2 = 2.529, s2 = 1.7, n2 = 17 Using software, the researchers get p = .0402 for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2. Draw a conclusion for this hypothesis test using alpha = .05.
We have significant evidence to reject the null in favor of the alternative hypothesis that high-school students carry more books.
Based on the computer output for linear regression shown here, what conclusion could you reach for a test of the null hypothesis that the value of the slope is 0? Use alpha = .05.
We'd conclude that knowing x isn't helpful in predicting y because the P-value is .2669. (the two-sided p-value is in the p column on the variable row)
When you're estimating population means, or when you're doing hypothesis tests about population means, when is it appropriate to use the t procedures?
When you don't know your population standard deviation and your sample shows no signs of extreme skewness or outliers.
Which of the following is true of the degrees of freedom, k, if you find them by taking the smaller n1 - 1 and n2 - 1 when n1 ≠ n2?
When you use this estimate for k, you're less likely to reject a false null hypothesis.
Suppose a study showed that the four most popular pizzas are preferred in the following percentages: Cheese 24%, Pepperoni 32%, Sausage 29%, Veggie 15%. You decide to keep track of the next 150 pizzas of these types ordered at the restaurant where you work to see if the percentages hold true. This is the data you get at your restaurant (observed number): Cheese 41, Pepperoni 55, Sausage 36, Veggie 18. Find the chi-square statistic and the associated P-value for a hypothesis test where the null hypothesis is that the distribution of pizza toppings is as reported in the study, and where the alternative hypothesis is that the distribution of pizza toppings is different from that reported in the study.
X^2 = 3.908, p = .272
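A minimal Python sketch of this goodness-of-fit calculation, assuming the expected counts are the study percentages times the 150 pizzas observed. The function names are illustrative, and the P-value helper uses a closed-form upper tail that is valid only for 3 degrees of freedom (four categories minus one).

```python
import math

def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf_df3(x):
    """Upper-tail probability for a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

percents = [0.24, 0.32, 0.29, 0.15]    # cheese, pepperoni, sausage, veggie
observed = [41, 55, 36, 18]
n = sum(observed)                       # 150 pizzas
expected = [p * n for p in percents]    # expected counts under the null

stat = chi_square_stat(observed, expected)
p_value = chi_square_sf_df3(stat)
print(round(stat, 3), round(p_value, 3))
```

On a TI-83 the same P-value would typically come from the chi-square cdf with 3 degrees of freedom.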
You conduct a hypothesis test and find a P-value of .025. Which of the following is true?
You can reject H0 at α = .05. (The computed p-value of .025 is less than the .05 significance level.)
Which of the following is true for a two-sided significance test but isn't true for a confidence interval?
You know the P-value (the likelihood) of your sample mean. (The probability of a Type I error is the same as the significance level α, which is equal to 1 - C. So if you reject the null at α = .05, or if you reject it because your hypothesized mean lies outside of a 95% confidence interval, the probability of rejecting a true null is .05.)
You're going to create a confidence interval for a population mean using z* = 2.58. Which of the following is true?
You must assume the sample is a simple random sample from a normal population.
In a fund-raising game for your school, you bet $1 to roll two dice. If your total is 8, 9, 10, or 11, you win $2. If your total is 12, you win $6. If your total is 7 or less, you lose the dollar you bet. How much, on average, do you expect to win or lose with each dollar bet?
You will lose 5.6 cents.
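One way to check this answer is to enumerate all 36 dice outcomes. The sketch below assumes the $2 and $6 prizes include the returned $1 stake (so the net gains are $1 and $5); that reading is an assumption, chosen because it reproduces the stated answer. The function name is illustrative.

```python
from fractions import Fraction
from itertools import product

def net_winnings(total):
    """Net result of a $1 bet, assuming prizes include the returned stake."""
    if total == 12:
        return 6 - 1
    if 8 <= total <= 11:
        return 2 - 1
    return -1  # a total of 7 or less loses the dollar

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
expected = sum(Fraction(net_winnings(a + b), len(rolls)) for a, b in rolls)
print(expected, float(expected))
```

The exact expected value is a small negative fraction of a dollar, which rounds to a loss of about 5.6 cents per bet.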
Suppose you want to know if comedy movies tend to be shorter than action movies. You take a random sample of five movies from each genre, and compare the average lengths of both samples. What would you need to know if you wanted to use a two-sample t procedure here?
You'd need to know that running lengths of movies are approximately normally distributed.
A manufacturer of parking lot sealant claims its product will cover, on average, 1,900 square feet per drum with a standard deviation of 250 square feet. You draw a sample of 50 drums and find a mean coverage of 1,800 square feet. Which of the following two methods will lead you to the same conclusion regarding whether to reject the null?
a) A two-sided significance test at alpha = .05 b) A 95% confidence interval (A two-sided significance test at level alpha will give the same conclusion as a confidence interval at confidence level (1 - alpha).)
A manufacturer of parking lot sealant claims its product will cover, on average, 1,900 square feet per drum with a standard deviation of 250 square feet. You draw a sample of 50 drums and find a mean coverage of 1,800 square feet. Which of the following are two correct conclusions?
a) Reject the null, based on a two-sided significance test at α = .05. b) Reject the null, based on a 95% confidence interval.
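A short sketch of both methods for the sealant data, assuming the claimed standard deviation of 250 is treated as the known σ. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n, xbar = 1900, 250, 50, 1800
se = sigma / math.sqrt(n)                 # standard deviation of the sample mean

# Method a: two-sided z test at alpha = .05
z = (xbar - mu0) / se
z_star = NormalDist().inv_cdf(0.975)      # critical value, about 1.96
reject_by_test = abs(z) > z_star

# Method b: 95% confidence interval around the sample mean
ci = (xbar - z_star * se, xbar + z_star * se)
reject_by_interval = not (ci[0] <= mu0 <= ci[1])

print(round(z, 3), [round(v, 1) for v in ci], reject_by_test, reject_by_interval)
```

Both flags agree because the interval excludes 1,900 exactly when |z| exceeds z*.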
If a success is defined as getting a three on a six-sided die, what's P(x < 3) if you roll the die 10 times?
binomcdf(10,1/6,2)
A twenty-sided die, with the faces numbered 1 to 20, is rolled 100 times. What's the exact probability of getting more than 15 elevens but at most 30 elevens?
binomcdf(100, 1/20, 30) - binomcdf(100, 1/20, 15)
What's the probability of getting exactly 1 five when we roll a fair die 10 times?
binompdf(10, 1/6, 1) = .323
Which of the following indicates the value of (8 choose 2)(.3)^2(.7)^6?
binompdf(8,.3,2)
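The calculator entries in the binomial questions above can be mirrored in Python. This is a sketch: `binompdf` and `binomcdf` here are hand-written stand-ins that copy the TI-83 argument order (n, p, k), not library functions.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomcdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

p_fewer_than_3_threes = binomcdf(10, 1 / 6, 2)    # P(x < 3) = P(x <= 2)
p_exactly_1_five = binompdf(10, 1 / 6, 1)
p_16_to_30_elevens = binomcdf(100, 1 / 20, 30) - binomcdf(100, 1 / 20, 15)
p_2_of_8 = binompdf(8, 0.3, 2)                    # (8 choose 2)(.3)^2(.7)^6

print(round(p_exactly_1_five, 3))
```

Note that "more than 15 but at most 30" subtracts the cdf at 15, not at 16, because the cdf already includes its endpoint.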
The spinner in a board game has eight colors the arrow can land on. To test the fairness of the spinner, you spin the arrow 75 times and get the following results: Green: 3 Blue: 13 Red: 9 Orange: 9 Brown: 16 Yellow: 14 White: 8 Black: 3 Calculate the chi-square statistic for these data and use a table to find the P-value.
chi-squared = 17.267, .02 > p > .01
Your friend says she has an unfair die: the probability of getting a one or a six is 1/3 for each, and the probability of getting a two, three, four, or five is 1/12 for each. You want to test her statement, so you roll the die 96 times and get the following results: 1 : 26 2 : 6 3 : 7 4 : 13 5 : 10 6 : 34 The expected values are 32 for both ones and sixes, and 8 for twos, threes, fours, and fives. Calculate your chi-square statistic and draw a conclusion for a test of the following hypotheses with alpha = .1: Ho: The probability of getting a one or a six is 1/3 for each, and the probability of getting a two, three, four or five is 1/12 for each. Ha: At least one of the hypothesized probabilities is incorrect.
chi-squared equals 5.5. We can't reject the null hypothesis that the probability of getting a one or six is 1/3 for each and that the probability of getting a two, three, four, or five is 1/12 for each. With six categories, a chi-squared of 5.5 or greater will occur due to chance alone more than 25% of the time.
In an experiment, the purpose of randomization is to:
equalize treatment groups.
A researcher wants to know if a topical cream reduces the number of pimples on teenagers' faces. She recruits 40 teenagers and counts the number of pimples on each subject. Subjects use the cream for 30 days, after which the researcher again counts the number of pimples on each teen's face. She wants to know if the number of pimples after using the cream is less than the number before using the cream. True or False: The researcher must use a two-sample t test to find out whether the number of pimples decreased after students used the cream.
false
True or False: Higher p-values indicate stronger statistical significance.
false
True or False: For a population whose mean is 100 and whose standard deviation is 15, 1000 random samples of size 20 are enough to generate a sampling distribution.
false (1000 random samples may give you a very good simulation of the sampling distribution, but the sampling distribution is composed of all possible random samples of a given size. To emulate a sampling distribution we use either simulations of sampling distributions or the laws of probability.)
True or False: A 95% confidence interval is narrower than a 90% confidence interval for the same data set.
false (A 95% confidence interval is wider than a 90% confidence interval because we need to include more possible values in our estimate to increase our confidence that we've captured the population mean.)
True or False: The sampling distribution for samples of size 10, with p near .2, would be approximately normal.
false (Since np = (10)(.2) = 2 < 10, the assumptions needed to use the normal approximation aren't satisfied: np and n(1 - p) must both be at least 10. (You might have a textbook that says that np and n(1 - p) need to be greater than or equal to 5, but this Tutorial follows the standard that they should be greater than or equal to 10.))
A fisheries report states that the average length of a population of adult fish is 25.8 inches, with a standard deviation of 6 inches. Dr. Jones draws a sample of 45 fish at random and finds that the average is 24.5 inches. She wants to know whether the true mean length has changed. True or False: If Dr. Jones used a 99% confidence interval, she'd reject the null hypothesis that µ = 25.8 inches.
false (The null value of 25.8 inches is within the 99% confidence interval, so she wouldn't reject the null. The confidence interval formula gives an interval of (22.2, 26.8).)
True or False: A B(225, 1/5) distribution can be approximated by an N(225, 1/5) distribution.
false (it would be an N(45, 6) distribution)
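A small sketch of where N(45, 6) comes from: the approximating normal curve uses mean np and standard deviation √(np(1 - p)). The function name is illustrative.

```python
import math

def normal_approximation(n, p):
    """Return (mean, sd) of the normal curve approximating B(n, p)."""
    return n * p, math.sqrt(n * p * (1 - p))

n, p = 225, 1 / 5
mean, sd = normal_approximation(n, p)

# Rule of thumb used in these notes: np and n(1 - p) both at least 10
conditions_met = n * p >= 10 and n * (1 - p) >= 10

print(round(mean, 2), round(sd, 2), conditions_met)
```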
True or False: The standard error of the sample proportion and the standard deviation of the sample proportion are usually the same.
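false (The standard deviation of the sample proportion uses the true p, while the standard error substitutes p hat, which is only estimating p.)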
You want to know whether high-school females are more likely to pass an AP Calculus class than high-school males. You take a random sample of males and females from high-school AP Calculus courses across the country and record whether each receives a passing or failing grade. Which confidence interval would you use to analyze the data?
(p̂1 - p̂2) ± (critical z)√(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2) (formula for two-proportion confidence interval)
What's the formula for the test statistic used in a significance test of the difference between proportions?
z = (p̂1 - p̂2) / √(p̂(1 - p̂)(1/n1 + 1/n2)), where p̂ is the pooled sample proportion (formula for two-proportion z test)
Suppose you're playing a game where you roll two dice. If you get doubles on or before the fourth roll you win the game. What's the probability of winning?
geometcdf(1/6, 4)
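The geometric cdf has a simple closed form, so the calculator entry can be checked by hand or in Python. This is a sketch; the function below is a hand-written stand-in that copies the TI-83 argument order (p, k).

```python
def geometcdf(p, k):
    """P(first success occurs on or before trial k) = 1 - (1 - p)^k."""
    return 1 - (1 - p) ** k

p_doubles = 6 / 36              # six doubles among 36 equally likely rolls
p_win = geometcdf(p_doubles, 4)
print(round(p_win, 4))
```

The complement (5/6)^4 is the chance of rolling no doubles in four tries.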
A backpack manufacturer wants to know if students in high school carry more books than college students do. Company researchers take a simple random sample from each group and record the number of textbooks each subject is carrying. They get the following data: High school: (5, 3, 2, 5, 6, 4, 7, 6, 5, 4, 3, 2, 1, 4, 3, 0, 2) College: (5, 3, 2, 4, 1, 0, 0, 3, 6, 2, 1, 3, 1, 2, 4, 4, 2) Using high-school students as sample one and college students as sample two, the researchers compute the following sample statistics and t statistic: x̅1 = 3.647, s1 = 1.9, n1 = 17; x̅2 = 2.529, s2 = 1.7, n2 = 17; t = 1.808. Calculate the degrees of freedom, k, using the conservative method, and use tcdf on your TI-83 calculator to find the P-value for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2.
k = 16, p = .0447
A backpack manufacturer wants to know if students in high school carry more books than college students do. Company researchers take a simple random sample from each group and record the number of textbooks each subject is carrying. They get the following data: High school: (5, 3, 2, 5, 6, 4, 7, 6, 5, 4, 3, 2, 1, 4, 3, 0, 2) College: (5, 3, 2, 4, 1, 0, 0, 3, 6, 2, 1, 3, 1, 2, 4, 4, 2) Using high-school students as sample one and college students as sample two, the researchers compute the following sample statistics and t statistic: x̅1 = 3.647, s1 = 1.9, n1 = 17; x̅2 = 2.529, s2 = 1.7, n2 = 17; t = 1.808. Using 2-SampTTest in the STAT TESTS menu of your TI-83, calculate the degrees of freedom and P-value for the hypothesis test Ho: µ1 = µ2, Ha: µ1 > µ2. Don't pool.
k = 31.60, p = .0402
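The two degrees-of-freedom answers (16 and about 31.6) can be compared side by side. The sketch below works from the rounded summary statistics, so the unpooled value differs slightly in the second decimal from the calculator's 31.60, which is computed from the raw data. The unpooled formula shown is the usual Welch-Satterthwaite approximation; the P-values themselves would still come from tcdf.

```python
import math

x1, s1, n1 = 3.647, 1.9, 17    # high school (sample one)
x2, s2, n2 = 2.529, 1.7, 17    # college (sample two)

v1, v2 = s1**2 / n1, s2**2 / n2            # per-sample variance of the mean
t = (x1 - x2) / math.sqrt(v1 + v2)         # unpooled two-sample t statistic

k_conservative = min(n1, n2) - 1           # smaller of n1 - 1 and n2 - 1
k_unpooled = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(round(t, 3), k_conservative, round(k_unpooled, 2))
```

The conservative k is smaller, which gives a larger P-value (.0447 versus .0402) for the same t.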
You get a speeding ticket and demand a jury trial. The jury is under instructions to assume you are innocent unless proven guilty. You actually are guilty but hope that the jury:
makes a Type II error. (you want them to accept the null when the alternative is true)
You have an SRS of 300 students selected from over 100,000 college students. Of your sample, 35% said they had fallen asleep in their English class at least once during the previous semester. The mean and standard deviation of this statistic are:
mean = .35 and standard deviation = .028
When conducting a significance test for paired data, you need to determine the p-value for which one of the following?
n - 1 degrees of freedom
You want to estimate the population proportion of pine trees over 90 years old in a forest near where you live. You need a margin of error of no more than plus-or-minus 2 %. With no more information to go on, what's the simplest formula for calculating your minimum sample size?
n = (critical z/2m)^2
You want to estimate the proportion of adults who make over $50,000 a year in your county. According to a study done two years ago, the proportion is .18, and you suspect the proportion hasn't changed. You need a margin of error of no more than ± 2%. Which of these expressions would give you your minimum sample size?
n = (critical z/m)^2 · .18(1 - .18)
For which of these values of n and p, can you use the normal approximation to the binomial distribution?
n = 60, and p = .4
A company wants information about the mode of transportation its employees use to get to and from work. It conducts a survey of 537 workers, and 243 say they take the bus. To plan parking space distribution, management wants to estimate the proportion of employees who take the bus to within a 3% margin of error at a 90% confidence level. What's the sample size necessary to construct this interval?
n = 745
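A sketch of the sample-size formula n = (z*/m)^2 · p*(1 - p*), using the survey result 243/537 as the guess p*. The function name is illustrative, and z* comes from the standard normal inverse cdf rather than a table.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_star, margin, confidence):
    """Smallest n whose margin of error is at most `margin`, rounded up."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

n_bus = sample_size_for_proportion(243 / 537, 0.03, 0.90)

# With no prior estimate, p* = .5 gives the largest (most cautious) n
n_conservative = sample_size_for_proportion(0.5, 0.03, 0.90)

print(n_bus, n_conservative)
```

Always round up: rounding down would leave the margin of error slightly above the target.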
Which of the following statements is (are) true? 1. Confidence intervals and hypothesis tests are interchangeable. 2. A 99% confidence interval is more accurate than a hypothesis test with α = .05. 3. Hypothesis tests and confidence intervals are always two-sided.
none of these
The probability of finding a mistake on an Income Tax Return is about .23. An employee of the IRS plans to inspect 100 random returns. Using the normal approximation to the binomial and the continuity correction, you want to calculate the probability that the workers find less than 20 mistakes. Which matches most closely what you'll enter into your calculator?
normalcdf(-E99, 19.5, 23, 4.21)
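The numbers in this calculator entry can be rebuilt step by step. A sketch using Python's `statistics.NormalDist` in place of normalcdf: the mean is np, the standard deviation is √(np(1 - p)), and "fewer than 20" becomes "at most 19.5" under the continuity correction.

```python
import math
from statistics import NormalDist

n, p = 100, 0.23
mean = n * p                               # 23 expected mistakes
sd = math.sqrt(n * p * (1 - p))            # about 4.21

# P(X < 20) = P(X <= 19), so the corrected upper bound is 19.5
p_fewer_than_20 = NormalDist(mean, sd).cdf(19.5)

print(round(mean, 2), round(sd, 2), round(p_fewer_than_20, 3))
```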
John is an expert horseshoe thrower who only misses 15% of the time. Choose the expression that correctly represents the probability John will miss fewer than 50 times if he throws 400 horseshoes.
normalcdf(-E99, 50, 60, 7.14)
Joe is a fish thrower who throws fish into a chute at a processing plant. Joe misses the chute 20% of the time. Using the normal approximation with the continuity correction, how would you calculate the probability Joe will miss fewer than 80 times if he throws 500 fish?
normalcdf(-e99, 79.5, 100, 8.94)
Under what conditions can you use a normal distribution to approximate the binomial distribution?
np ≥ 10 and n(1 - p) ≥ 10
A research firm wants to determine whether there's a difference in married couples between what the husband earns and what the wife earns. The firm takes a random sample of married couples and measures the annual salary of each husband and wife. What procedure should the firm use to analyze the data for the mean difference in salary within married couples?
one-sample t procedure (The data should be analyzed as matched pairs. Though they're collected as two lists of numbers—husbands' salaries and wives' salaries—the data are paired according to each married couple. Within each pair, the husband's salary can be subtracted from the wife's salary to get one number that represents the difference between the two. The list of differences can be analyzed as a single sample.)
Which of these do we not need to assume is true in order to carry out a hypothesis test about a single proportion?
p hat = .5
The formula for the test statistic for a one-sample significance test of a population proportion is z = (p hat - p0) / √(p0(1 - p0)/n). Identify the components of this formula.
p hat is the sample proportion, p0 is the hypothesized population proportion, n is the sample size, and √(p0(1 - p0)/n) is the standard deviation of p hat under the hypothesized population proportion.
The sample mean, x̅, is called a __________ of the population mean µ.
point estimate
Which of the following is the correct confidence interval for a single population proportion?
p̂ ± (critical z)(SEp̂)
In an experiment, the purpose of replication is to:
reduce variability by repeating the experiment on many subjects.
What's the correct t statistic for the difference between means for a one-sample matched pairs procedure?
t = (x̅(sub n1 - n2) - µ0) / (s / √n)
What's the correct t statistic for the difference between means of independent samples (without pooling)?
t = (x̅1 - x̅2) / √(s1^2/n1 + s2^2/n2)
You collect a random sample of 28 adult golfers and record two scores for each: one taken before the subject receives professional coaching and one taken after. What test statistic should you use in a significance test for the difference between the before-coaching scores and the after-coaching scores?
t = (x̅1 - µ) / (s/√n)
Researchers randomly assign subjects to two groups, with each group receiving a different asthma medication. For one month, researchers record the number of asthma attacks suffered by each subject and compute the summary statistics as follows: x̅1 = 3.8, s1 = 1.7, n1 = 23; x̅2 = 4.2, s2 = 1.5, n2 = 24. Compute a t statistic and P-value for the hypothesis test Ho: (µ1 - µ2) = 0, Ha: (µ1 - µ2) < 0. Use the conservative method to calculate your estimate for degrees of freedom, and don't pool or use your calculator. What did you get for your t statistic and your P-value, and what's your conclusion based on alpha = .05?
t = -.854, p = .2012, do not reject the null hypothesis that there's no difference between the two means.
Elite Foods, a supermarket, is losing customers and suspects it's because another neighborhood market has lower prices. To determine whether this is the case, researchers from Elite Foods collected the following data on ten items found in both stores (Elite Foods, Neighborhood Markets, B - A): (1.65, 1.49, -.16); (2.19, 2.00, -.19); (1.99, 2.09, .1); (3.49, 2.99, -.5); (.99, .99, 0); (1.59, 1.79, .2); (2.89, 2.39, -.5); (4.50, 4.25, -.25); (1.19, .99, -.2); (1.99, 1.79, -.2). Compute the test statistic and P-value for Ho: µ(B - A) = 0 and Ha: µ(B - A) < 0, where µ(B - A) represents the true population mean difference in price between the two stores.
t = -2.37, p = .021
The following list shows first and second semester final student grades for a chemistry class (student: 1st, 2nd): 1: 96, 90; 2: 80, 90; 3: 83, 80; 4: 80, 85; 5: 90, 94; 6: 70, 78; 7: 74, 70; 8: 80, 89; 9: 82, 80; 10: 90, 90; 11: 95, 90; 12: 85, 80; 13: 70, 75; 14: 80, 92; 15: 70, 80. Calculate the test statistic and P-value for the one-sided test H0: µ(2-1) = 0, Ha: µ(2-1) > 0.
t = 1.533, p = .074
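A sketch of the matched pairs calculation for the chemistry grades: take second-semester minus first-semester differences, then run a one-sample t statistic on those differences. The P-value is not computed here; it would come from tcdf with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

first = [96, 80, 83, 80, 90, 70, 74, 80, 82, 90, 95, 85, 70, 80, 70]
second = [90, 90, 80, 85, 94, 78, 70, 89, 80, 90, 90, 80, 75, 92, 80]

diffs = [b - a for a, b in zip(first, second)]   # (2nd - 1st) for each student
n = len(diffs)

# One-sample t statistic on the differences, with hypothesized mean 0
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
df = n - 1

print(round(t, 3), df)
```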
A sample of size 36 has s = 3 and yields the confidence interval (23.5, 26.5). What's t*?
t* = 3 (The margin of error is t*(3/sqrt(36)) = 26.5 - 25 = 1.5, so t* = 3.)
For a left skewed histogram:
the mean is less than the median
We have a null hypothesis that mu = 3.75, but we are reluctant to accept this as the population value in case the true value is different. An analysis tells us that the probability of rejecting the null hypothesis, if mu is really 3.95, is .75. This probability is:
the power of the test against the alternative mu = 3.95.
After choosing a simple random sample from a population having unknown mean µ and an estimated standard deviation σ, you use the following formula to construct a confidence interval for µ: x̅ ± z*(σ/√n). In this formula, which of the following is represented by ± z*?
the z-score equivalent for the confidence interval of µ
A researcher thinks a population of fish has a mean weight of 3.4 kilograms and a standard deviation of .8 kilograms. She takes a simple random sample of 30 fish and finds that the mean is 3 kilograms. True or False: Using a significance level of .05, there's significant evidence that the population mean is actually lower than 3.4 kilograms.
true
True or False. If we take a properly drawn sample and calculate the mean and standard deviation, we can estimate the mean of the population and we can come up with a probability that the true mean falls within a certain interval.
true
True or False: Given that there are only 10 different possible samples of size two that can be selected from a population of five values, the sampling distribution of the mean would be composed of the means of these 10 samples.
true
True or False: It's unlikely you'll know whether the two populations from which you're sampling have the same variance, so it's usually not a good idea to use pooled variances.
true
True or False: Random sampling is important in studies where you'll be calculating p-values.
true
True or False: The true least-squares regression line for the population goes through µ(sub y), the mean of the possible y values for each x value.
true
A confidence interval for the difference between two population means is (.56, 1.65). True or False: This provides good evidence that the population means are different.
true (0 isn't contained in the interval)
True or False: The following situation could be considered a binomial experiment: In 1999, 25,000 students took the AP Statistics Exam. The probability that a randomly selected student from this group passed the exam was about .6. A statistician wants to know the likelihood that more than 650 out of 1000 students randomly selected from this group passed the exam.
true (Even though this situation doesn't strictly fit the definition of a binomial event (there are no independent trials), it's still a binomial event, since the population is more than 20 times greater than the sample size.)
True or False: A geometric probability distribution is skewed.
true (Geometric distributions are always skewed, though some are less skewed than others. There's a limit on one end and a long tail of diminishing probabilities on the other end.)
Thirty couples applying for marriage licenses are asked to give their ages. The researcher wants to know if husbands tend to be older than wives. True or False: This scenario could be analyzed using a matched pairs procedure.
true (In this case, the subjects are paired. To answer this question, the researcher will analyze the age differences between each husband and wife pair. Note that members of each pair aren't matched on as many characteristics as possible, so this isn't a typical matched pairs situation. However, it's appropriate to treat it as such because you're looking at the difference between two measurements of the variable age.)
True or False: Increasing the sample size will decrease the margin of error in your confidence interval.
true (Larger sample sizes reduce variability, so your estimate will be more accurate and your margin of error will be smaller.)
Researchers randomly assign subjects to two groups, with each group receiving a different asthma medication. For one month, researchers record the number of asthma attacks suffered by each subject. Which of these is the correct procedure for measuring the difference in the number of asthma attacks between the two groups?
two sample t procedure
A market claims the average weight of a package of hamburger in its meat department is one pound, with a standard deviation of .18 lb. A manager decides to test H0: µ = 1 against the one-sided alternative Ha: µ ≠ 1 (sic) at the .01 level of significance. To do this, he randomly selects 35 packages of meat. For what range of values of x̅ would you reject the null hypothesis?
x̅ < .93 (At the .01 level, for a one-sided test, z* = -2.33. For H0: µ = 1, setting (x̅ - 1) / (.18 / sqrt(35)) = -2.33 gives x̅ = .93. Any value of x̅ < .93 will cause the null hypothesis to be rejected.)
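The cutoff can be found by solving the z formula for x̅. A sketch assuming a lower-tail test, as the answer does; the critical value is taken from the standard normal inverse cdf instead of a table.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 1.0, 0.18, 35, 0.01

z_star = NormalDist().inv_cdf(alpha)             # lower-tail critical value
cutoff = mu0 + z_star * sigma / math.sqrt(n)     # reject H0 when x-bar < cutoff

print(round(z_star, 2), round(cutoff, 2))
```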
In a one-sided hypothesis test where H0: µ = 1.2 lb, Ha: µ > 1.2 lb, and α = .01, what's the rejection region we'd use in computing the Type II error?
x̅ > 1.38 (The inequality is > because the one-sided hypothesis being tested is Ha: µ > 1.2.)
You want to know if high-school females score differently than high-school males on a standardized test. You take a random sample of 28 males and 27 females from high schools across the country and record their scores. Which confidence interval would you use to analyze these data?
(x̅1 - x̅2) ± (t star)√(s1^2/n1 + s2^2/n2)
A statistics professor has been teaching first- and second-semester statistics for five years. She wants to know if the grades in her first-semester classes differ significantly from those of her second-semester classes. She takes a random sample of final grades from her two most recent first- and second-semester classes and gets these data: First semester: (89, 98, 78, 86, 95, 83, 90, 87, 85, 80, 96, 93, 90, 91, 81, 87, 93, 90, 87, 88) Second semester: (87, 85, 89, 78, 79, 77, 82, 90, 91, 93, 88, 87, 86, 83, 85, 81, 90, 86, 86, 84) Compute the statistics necessary to construct a confidence interval for the difference between these two populations.
x̅1 = 88.35, s1 = 5.314, n1 = 20 x̅2 = 85.35, s2 = 4.368, n2 = 20
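These summary statistics can be reproduced with Python's `statistics` module, where `stdev` is the sample standard deviation (n - 1 in the denominator), matching the calculator's Sx. A minimal sketch:

```python
from statistics import mean, stdev

first = [89, 98, 78, 86, 95, 83, 90, 87, 85, 80,
         96, 93, 90, 91, 81, 87, 93, 90, 87, 88]
second = [87, 85, 89, 78, 79, 77, 82, 90, 91, 93,
          88, 87, 86, 83, 85, 81, 90, 86, 86, 84]

x1, s1, n1 = mean(first), stdev(first), len(first)
x2, s2, n2 = mean(second), stdev(second), len(second)

print(round(x1, 2), round(s1, 3), n1)
print(round(x2, 2), round(s2, 3), n2)
```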
A research firm wants to determine whether there's a difference in married couples between what the husband earns and what the wife earns. The firm takes a random sample of married couples and measures the annual salary of each husband and wife. What formula should the researchers use to find the confidence interval for the difference in means?
x̅1(sub n1 - n2) ± (t*)(s/√n)
Construct the least-squares regression line that best fits these data.
y hat = 13.11 + .9359x
Here are some summary statistics for two variables: x̅ = 40.4, sx = 5.22, y̅ = 25.2, sy = 3.4, r = .52 For these statistics, what's the equation for the least-squares regression line?
y hat = 22.147 + .215x
Management at a sparkling water factory has designed a new system it thinks will lower the proportion, currently at .12, of incorrectly labeled bottles. In the year following the implementation of the new system, quality control engineers randomly sample 378 bottles and find 40 that are labeled incorrectly. What are the test statistic and the P-value for the significance test Ho: p = .12, Ha: p < .12? (Find the P-value for this test using your normal probability table.)
z = -.8484 p = .1977
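A sketch of the one-proportion z test for the bottle data. The standard deviation uses the hypothesized p0, not p hat. Because the code keeps z unrounded, its P-value differs slightly in the third decimal from the table value, which is looked up at z = -.85.

```python
import math
from statistics import NormalDist

p0, n, count = 0.12, 378, 40
p_hat = count / n

z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = NormalDist().cdf(z)       # lower tail, since Ha: p < .12

print(round(z, 4), round(p_value, 3))
```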
The staff at an aquarium with 500,000 annual visitors wants to know whether there's a difference in the proportion of its patrons who come specifically for workshops and those who come for other reasons. The staff takes a random exit poll of 1,238 visitors, and 751 say they came for workshops. For a significance test of Ho: p = .5, Ha: p ≠ .5, what are your test statistic and your P-value? Use your TI-83 calculator to get your P-value.
z = 7.5 p approx. = 0
Which of the following represents the information you need in order to find the sample size necessary for a confidence interval for a single population proportion with a given margin of error and confidence level?
z star, m, p star
A fisheries report states that the average length of a population of adult fish is 26 inches, with a standard deviation of 6 inches. Dr. Jones draws a sample of 45 fish at random and finds that the average is 24.5 inches. She wants to know whether the true mean length is 26 inches at a significance level of α = .01. If she were to use a confidence interval that would give the same result as a two-sided significance test at α = .01, what would she use for a critical z-value?
z* = 2.58
A random sample of 85 adults found that average calorie consumption was 2,100 per day. Previous research has found a standard deviation of 450 calories, and you assume this value for σ. A researcher has been given $10,000 to conduct a similar survey, which costs $50.00 per person surveyed. She must compute a 95% confidence interval. Given her budget restriction, what would the minimum margin of error be for the confidence interval for the population mean?
± 62.4 calories
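The budget fixes the largest affordable sample, and the margin of error follows from m = z*(σ/√n). A minimal sketch, with illustrative variable names:

```python
import math
from statistics import NormalDist

budget, cost_per_person, sigma = 10_000, 50, 450

n = budget // cost_per_person              # largest sample the budget allows
z_star = NormalDist().inv_cdf(0.975)       # 95% confidence, about 1.96
margin = z_star * sigma / math.sqrt(n)

print(n, round(margin, 1))
```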
What parameter does the statistic y hat = b0 + b1x estimate?
µy = β0 + β1x
Which of the following statements is correct?
β0 is a parameter and b0 is a statistic
How do you calculate the chi-square statistic?
∑ (observed - expected)^2 / expected
On average, a fair die should give you a five on one roll out of six, or 1/6 of the time. You have a die that was run over by a car, and you want to see if it's producing the expected number of fives. You roll the die 60 times and get 7 fives. What's the value of the standard deviation of the sample proportion you'd expect if the die were fair?
√((1/6)(5/6) / 60) (standard error formula)
Suppose you record 83 successes out of a random sample of 200 drawn from a population that yielded a proportion of successes of .40 in a previous study. If you were to calculate a standardized test statistic for a hypothesis test about the population proportion, where Ho: po = .40, what would your standard deviation of p hat look like?
√((.4)(.6) / 200)
Suppose you record 83 successes out of a random sample of 200 drawn from a population that yielded a proportion of successes of .40 in a previous study. If you were to construct a confidence interval for the population proportion of your sample, what would your standard error of p hat look like?
√((0.415)(0.585) / 200)
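The last two questions differ only in which proportion goes under the square root. A short sketch comparing them: the hypothesis test uses the hypothesized p0, while the confidence interval uses the sample p hat.

```python
import math

successes, n, p0 = 83, 200, 0.40
p_hat = successes / n                                   # 0.415

# Hypothesis test: standard deviation of p hat, assuming H0 is true
sd_under_null = math.sqrt(p0 * (1 - p0) / n)

# Confidence interval: standard error, estimated from the sample itself
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

print(round(sd_under_null, 4), round(standard_error, 4))
```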