Statistics Final Exam FILTERED
Professor Gill is designing a multiple-choice test. There are to be 10 questions. Each question is to have five choices for answers. The choices are to be designated by the letters a, b, c, d, and e. Professor Gill wishes to use a random-number table to determine which letter choice should correspond to the correct answer for a question. Using the number correspondence 1 for a, 2 for b, 3 for c, 4 for d, and 5 for e, use a random-number table to determine the letter choice for the correct answer for each of the 10 questions.
1.2.17
Greg took a random sample of size 100 from the population of current season ticket holders to State College men's basketball games. Then he took a random sample of size 100 from the population of current season ticket holders to State College women's basketball games. (a) What sampling technique (stratified, systematic, cluster, multistage, convenience, random) did Greg use to sample from the population of current season ticket holders to all State College basketball games played by either men or women? (b) Is it appropriate to pool the samples and claim to have a random sample of size 200 from the population of current season ticket holders to all State College home basketball games played by either men or women? Explain.
1.2.7
A study involves three variables: income level, hours spent watching TV per week, and hours spent at home on the Internet per week. List some ways the variables might be confounded.
1.3.1
Zane is examining two studies involving how different generations classify specified items as either luxuries or necessities. In the first study, the Echo generation is defined to be people ages 18-29. The second study defined the Echo generation to be people ages 20-31. Zane notices that the first study was conducted in 2006 while the second one was conducted in 2008. (a) Are the two studies inconsistent in their description of the Echo generation? (b) What are the birth years of the Echo generation?
1.3.5
Which technique for gathering data (observational study or experiment) do you think was used in the following studies? (a) The Colorado Division of Wildlife netted and released 774 fish at Quincy Reservoir. There were 219 perch, 315 blue gill, 83 pike, and 157 rainbow trout. (b) The Colorado Division of Wildlife caught 41 bighorn sheep on Mt. Evans and gave each one an injection to prevent heartworm. A year later, 38 of these sheep did not have heartworm, while the other three did. (c) The Colorado Division of Wildlife imposed special fishing regulations on the Deckers section of the South Platte River. All trout under 15 inches had to be released. A study of trout before and after the regulation went into effect showed that the average length of a trout increased by 4.2 inches after the new regulation. (d) An ecology class used binoculars to watch 23 turtles at Lowell Ponds. It was found that 18 were box turtles and 5 were snapping turtles.
1.3.7
How long does it take to finish the 1161-mile Iditarod Dog Sled Race from Anchorage to Nome, Alaska (see Viewpoint)? Finish times (to the nearest hour) for 57 dogsled teams are shown below. 261 271 236 244 279 296 284 299 288 338 360 341 333 261 266 287 296 313 299 303 277 283 304 305 288 290 288 332 330 309 328 307 328 285 291 295 310 318 318 320 333 321 323 324 327 Use five classes. (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetrical, bimodal, skewed left, or skewed right. (f) Draw an ogive.
2.1.15
Certain kinds of tumors tend to recur. The following data represent the lengths of time, in months, for a tumor to recur after chemotherapy. Use five classes. 19 18 50 1 14 45 38 40 27 20 17 1 21 22 54 46 25 49 59 39 43 39 5 9 38 18 54 59 46 50 29 12 19 36 43 41 10 50 41 25 19 39 (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetrical, bimodal, skewed left, or skewed right. (f) Draw an ogive.
2.1.17
A data set with whole numbers has a low value of 20 and a high value of 82. Find the class width and class limits for a frequency table with 7 classes.
2.1.5
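The class-width question above (low value 20, high value 82, 7 classes) can be checked with a short sketch. It assumes the common textbook convention of dividing the range by the number of classes and rounding up to the next whole number; the function names `class_width` and `class_limits` are illustrative, not from the source.

```python
import math

def class_width(low, high, num_classes):
    """Range divided by number of classes, rounded up to a whole number."""
    return math.ceil((high - low) / num_classes)

def class_limits(low, width, num_classes):
    """Lower and upper class limits for whole-number data."""
    return [(low + i * width, low + (i + 1) * width - 1)
            for i in range(num_classes)]

width = class_width(20, 82, 7)
limits = class_limits(20, width, 7)

print("class width:", width)
for lower, upper in limits:
    print(f"{lower}-{upper}")
```

Some texts also bump the width up by one when the division comes out exact; that case does not arise here.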
You are manager of a specialty coffee shop and collect data throughout a full day regarding waiting time for customers from the time they enter the shop until the time they pick up their order. (a) What type of distribution do you think would be most desirable for the waiting times: skewed right, skewed left, mound-shaped symmetrical? Explain. (b) What if the distribution for waiting times were bimodal? What might be some explanations?
2.1.7
It is costly in both time and money to go to college. Does it pay off? According to the Bureau of the Census, the answer is yes. The average annual income (in thousands of dollars) of a household headed by a person with the stated education level is as follows: 21.6 if ninth grade is the highest level achieved, 39.6 for high school graduates, 56.8 for those holding associate degrees, 75.6 for those with bachelor's degrees, 91.7 for those with master's degrees, and 120.9 for those with doctoral degrees. Make a bar graph showing household income for each education level.
2.2.5
The Boston Marathon is the oldest and best-known U.S. marathon. It covers a route from Hopkinton, Massachusetts, to downtown Boston. The distance is approximately 26 miles. Search the marathon site to find a wealth of information about the history of the race. In particular, the site gives the winning times for the Boston Marathon. They are all over 2 hours. The following data are the minutes over 2 hours for the winning male runners: 1961-1980: 23 23 18 19 16 17 15 22 13 10 18 15 16 13 9 20 14 10 9 12 1981-2000: 9 8 9 10 14 7 11 8 9 8 11 8 9 7 9 9 10 7 9 9 (a) Make a stem-and-leaf display for the minutes over 2 hours of the winning times for the years 1961 to 1980. Use two lines per stem. (b) Make a stem-and-leaf display for the minutes over 2 hours of the winning times for the years 1981 to 2000. Use two lines per stem. (c) Interpretation: Compare the two distributions. How many times under 15 minutes are in each distribution?
2.3.5
Consider the numbers 2 3 4 5 5 (a) Compute the mode, median, and mean. (b) If the numbers represent codes for the colors of T-shirts ordered from a catalog, which average(s) would make sense? (c) If the numbers represent one-way mileages for trails to different lakes, which average(s) would make sense? (d) Suppose the numbers represent survey responses from 1 to 5, with 1 disagree strongly, 2 disagree, 3 agree, 4 agree strongly, and 5 agree very strongly. Which averages make sense?
3.1.13
In this problem, we explore the effect on the mean, median, and mode of multiplying each data value by the same number. Consider the data set 2, 2, 3, 6, 10. (a) Compute the mode, median, and mean. (b) Multiply each data value by 5. Compute the mode, median, and mean. (c) Compare the results of parts (a) and (b). In general, how do you think the mode, median, and mean are affected when each data value in a set is multiplied by the same constant? (d) Suppose you have information about average heights of a random sample of airplane passengers. The mode is 70 inches, the median is 68 inches, and the mean is 71 inches. To convert the data into centimeters, multiply each data value by 2.54. What are the values of the mode, median, and mean in centimeters?
3.1.17
The Grand Canyon and the Colorado River are beautiful, rugged, and sometimes dangerous. Thomas Myers is a physician at the park clinic in Grand Canyon Village. Dr. Myers has recorded (for a 5-year period) the number of visitor injuries at different landing points for commercial boat trips down the Colorado River in both the Upper and Lower Grand Canyon: Upper Canyon: Number of Injuries per Landing Point Between North Canyon and Phantom Ranch 2 3 1 1 3 4 6 9 3 1 3 Lower Canyon: Number of Injuries per Landing Point Between Bright Angel and Lava Falls 8 1 1 0 6 7 2 14 3 0 1 13 2 1 (a) Compute the mean, median, and mode for injuries per landing point in the Upper Canyon. (b) Compute the mean, median, and mode for injuries per landing point in the Lower Canyon. (c) Compare the results of parts (a) and (b). (d) The Lower Canyon stretch had some extreme data values. Compute a 5% trimmed mean for this region, and compare this result to the mean for the Upper Canyon computed in part (a).
3.1.21
For mallard ducks and Canada geese, what percentage of nests are successful (at least one offspring survives)? Studies in Montana, Illinois, Wyoming, Utah, and California gave the following percentages of successful nests. x: Percentage success for mallard duck nests 56 85 52 13 39 y: Percentage success for Canada goose nests 24 53 60 69 18 (a) Use a calculator to verify that Σx = 245; Σx² = 14,755; Σy = 224; and Σy² = 12,070. (b) Use the results of part (a) to compute the sample mean, variance, and standard deviation for x, the percent of successful mallard nests. (c) Use the results of part (a) to compute the sample mean, variance, and standard deviation for y, the percent of successful Canada goose nests.
3.2.19 A, B, C ONLY
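The nest-success problem above relies on the computation formula for the sample variance, s² = (Σx² − (Σx)²/n) / (n − 1). A minimal sketch of that calculation follows; the helper name `sample_stats_from_sums` is hypothetical, and the cross-check against `statistics.variance` is only there as a sanity test.

```python
import math
import statistics

x = [56, 85, 52, 13, 39]  # mallard duck nest success (%)
y = [24, 53, 60, 69, 18]  # Canada goose nest success (%)

def sample_stats_from_sums(sum_v, sum_v2, n):
    """Sample mean, variance, and standard deviation from the sums alone."""
    mean = sum_v / n
    variance = (sum_v2 - sum_v ** 2 / n) / (n - 1)
    return mean, variance, math.sqrt(variance)

sums_x = (sum(x), sum(v * v for v in x))
sums_y = (sum(y), sum(v * v for v in y))

mean_x, var_x, sd_x = sample_stats_from_sums(*sums_x, len(x))
mean_y, var_y, sd_y = sample_stats_from_sums(*sums_y, len(y))

# The defining formula (via the statistics module) should agree.
assert math.isclose(var_x, statistics.variance(x))
assert math.isclose(var_y, statistics.variance(y))

print(sums_x, round(mean_x, 2), round(var_x, 2), round(sd_x, 2))
print(sums_y, round(mean_y, 2), round(var_y, 2), round(sd_y, 2))
```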
Consider the data set 2 3 4 5 6 (a) Find the range. (b) Use the defining formula to compute the sample standard deviation s. (c) Use the defining formula to compute the population standard deviation.
3.2.5
Each of the following data sets has a mean of x̅ = 10. (i) 8 9 10 11 12 (ii) 7 9 10 11 13 (iii) 7 8 10 12 13 (a) Without doing any computations, order the data sets according to increasing value of standard deviations. (b) Why do you expect the difference in standard deviations between data sets (i) and (ii) to be greater than the difference in standard deviations between data sets (ii) and (iii)? Hint: Consider how much the data in the respective sets differ from the mean.
3.2.9
Consider the following ordered data: 2 5 5 6 7 7 8 9 10 (a) Find the low, Q1, median, Q3, high. (b) Find the interquartile range. (c) Make a box-and-whisker plot.
3.3.5
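For the ordered data set above, the five-number summary can be sketched as below. This assumes the convention in which Q1 and Q3 are the medians of the lower and upper halves, with the overall median left out of both halves when the count is odd. Other quartile conventions (including the defaults of `statistics.quantiles`) can give different values, so treat the function `five_number_summary` as one illustrative choice.

```python
import statistics

def five_number_summary(data):
    """Low, Q1, median, Q3, high using the median-of-halves convention."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], statistics.median(lower), statistics.median(values),
            statistics.median(upper), values[-1])

low, q1, med, q3, high = five_number_summary([2, 5, 5, 6, 7, 7, 8, 9, 10])
iqr = q3 - q1

print(low, q1, med, q3, high)
print("interquartile range:", iqr)
```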
At Center Hospital there is some concern about the high turnover of nurses. A survey was done to determine how long (in months) nurses had been in their current positions. The responses (in months) of 20 nurses were 23 2 5 14 25 36 27 42 12 8 7 23 29 26 28 11 20 31 8 36 Make a box-and-whisker plot of the data. Find the interquartile range.
3.3.7
What percentage of the general U.S. population have bachelor's degrees? The Statistical Abstract of the United States, 120th Edition, gives the percentage of bachelor's degrees by state. For convenience, the data are sorted in increasing order. 17 18 18 18 19 20 20 20 21 21 21 21 22 22 22 22 22 22 23 23 24 24 24 24 24 24 24 24 25 26 26 26 26 26 26 27 27 27 27 27 28 28 29 31 31 32 32 34 35 38 (a) Make a box-and-whisker plot and find the interquartile range. (b) Illinois has a bachelor's degree percentage rate of about 26%. Into what quartile does this rate fall?
3.3.9
Consider a family with 3 children. Assume the probability that one child is a boy is 0.5 and the probability that one child is a girl is also 0.5, and that the events "boy" and "girl" are independent. (a) List the equally likely events for the gender of the 3 children, from oldest to youngest. (b) What is the probability that all 3 children are male? Notice that the complement of the event "all three children are male" is "at least one of the children is female." Use this information to compute the probability that at least one child is female.
4.1.11
Consider the following events for a driver selected at random from the general population: A = driver is under 25 years old; B = driver has received a speeding ticket. Translate each of the following phrases into symbols. (a) The probability the driver has received a speeding ticket and is under 25 years old (b) The probability a driver who is under 25 years old has received a speeding ticket (c) The probability a driver who has received a speeding ticket is 25 years old or older (d) The probability the driver is under 25 years old or has received a speeding ticket (e) The probability the driver has not received a speeding ticket or is under 25 years old
4.2.13
M&M plain candies come in various colors. According to the M&M/Mars Department of Consumer Affairs, the distribution of colors for plain M&M candies is: Purple 20% Yellow 20% Red 20% Orange 10% Green 10% Blue 10% Brown 10% Suppose you have a large bag of plain M&M candies and you choose one candy at random. Find: (a) P(green candy or blue candy). Are these outcomes mutually exclusive? Why? (b) P(yellow candy or red candy). Are these outcomes mutually exclusive? Why? (c) P(not purple candy)
4.2.15
You roll two fair dice, a green one and a red one. (a) Are the outcomes on the dice independent? (b) Find P(5 on green die and 3 on red die). (c) Find P(3 on green die and 5 on red die). (d) Find P ((5 on green die and 3 on red die) or (3 on green die and 5 on red die)).
4.2.17
This problem involves a deck of 52 playing cards. There are four suits of 13 cards each. The four suits are: hearts, diamonds, clubs, spades. The 26 cards included in hearts and diamonds are red in color. The 26 cards included in clubs and spades are black in color. The 13 cards in each suit are: 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King, and Ace. This means there are four Aces, four Kings, four Queens, four 10's, etc., down to four 2's in each deck. You draw two cards from a standard deck of 52 cards without replacing the first one before drawing the second. (a) Are the outcomes on the two cards independent? Why? (b) Find P(Ace on 1st card and King on 2nd). (c) Find P(King on 1st card and Ace on 2nd). (d) Find the probability of drawing an Ace and a King in either order.
4.2.21
Wing Foot is a shoe franchise commonly found in shopping centers across the United States. Wing Foot knows that its stores will not show a profit unless they gross over $940,000 per year. Let A be the event that a new Wing Foot store grosses over $940,000 its first year. Let B be the event that a store grosses over $940,000 its second year. Wing Foot has an administrative policy of closing a new store if it does not show a profit in either of the first 2 years. The accounting office at Wing Foot provided the following information: 65% of all Wing Foot stores show a profit the first year; 71% of all Wing Foot stores show a profit the second year (this includes stores that did not show a profit the first year); however, 87% of Wing Foot stores that showed a profit the first year also showed a profit the second year. Compute the following: (a) P(A) (b) P(B) (c) P(B|A) (d) P(A and B) (e) P(A or B) (f) What is the probability that a new Wing Foot store will not be closed after 2 years? What is the probability that a new Wing Foot store will be closed after 2 years?
4.2.33
Assume A and B are events such that 0 < P(A) < 1 and 0 < P(B) < 1. Answer true or false and give a brief explanation for each answer. (a) P(A and complement of A) = 0 (b) P(A | complement of A) = 1 (c) P(A and B) ≤ P(A)
4.2.37, 4.2.39, 4.2.43
What is the income distribution of super shoppers? In the following table, income units are in thousands of dollars, and each interval goes up to but does not include the given high value. The midpoints are given to the nearest thousand dollars. Income range: 5-15 15-25 25-35 35-45 45-55 55+ Midpoint x: 10 20 30 40 50 60 % of super shoppers: 21% 14% 22% 15% 20% 8% (a) Using the income midpoints x and the percent of super shoppers, do we have a valid probability distribution? Explain. (b) Use a histogram to graph the probability distribution of part (a). (c) Compute the expected income μ of a super shopper. (d) Compute the standard deviation σ for the income of super shoppers.
5.1.11
The following data are based on information taken from Daily Creel Summary, published by the Paiute Indian Nation, Pyramid Lake, Nevada. Movie stars and U.S. presidents have fished Pyramid Lake. It is one of the best places in the lower 48 states to catch trophy cutthroat trout. In this table, x = number of fish caught in a 6-hour period. The percentage data are the percentages of fishermen who catch x fish in a 6-hour period while fishing from shore. x 0 1 2 3 4 or more % 44% 36% 15% 4% 1% (a) Convert the percentages to probabilities and make a histogram of the probability distribution. (b) Find the probability that a fisherman selected at random fishing from shore catches one or more fish in a 6-hour period. (c) Find the probability that a fisherman selected at random fishing from shore catches two or more fish in a 6-hour period. (d) Compute μ, the expected value of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4). (e) Compute σ, the standard deviation of the number of fish caught per fisherman in a 6-hour period (round 4 or more to 4)
5.1.13
Consider the probability distribution shown below: x 0 1 2 p(x) 0.25 0.60 0.15 Compute the expected value and the standard deviation of the distribution.
5.1.7
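The expected value and standard deviation of a discrete distribution such as the one above follow from μ = Σ x·p(x) and σ = √(Σ (x − μ)²·p(x)). A minimal sketch, with `mean_and_sd` as an illustrative helper name:

```python
import math

def mean_and_sd(distribution):
    """Expected value and standard deviation of a discrete distribution.

    `distribution` maps each value x to its probability p(x).
    """
    total = sum(distribution.values())
    if not math.isclose(total, 1.0):
        raise ValueError("probabilities must sum to 1")
    mu = sum(x * p for x, p in distribution.items())
    variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
    return mu, math.sqrt(variance)

mu, sigma = mean_and_sd({0: 0.25, 1: 0.60, 2: 0.15})
print(round(mu, 2), round(sigma, 3))
```

The same helper applies to the super-shopper and Pyramid Lake tables once the percentages are converted to probabilities.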
Consider a binomial experiment with n = 7 trials where the probability of success on a single trial is p = 0.30. (a) Find P(r=0). (b) Find P (r≥1) by using the complement rule.
5.2.11
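For the binomial problem above (n = 7, p = 0.30), the binomial formula P(r) = C(n, r)·pʳ·(1 − p)ⁿ⁻ʳ and the complement rule can be sketched as follows. The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 7, 0.30
p_zero = binomial_pmf(0, n, p)

# Complement rule: P(r >= 1) = 1 - P(r = 0).
p_at_least_one = 1 - p_zero

print(round(p_zero, 4), round(p_at_least_one, 4))
```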
Consider a binomial experiment with n = 6 trials where the probability of success on a single trial is p = 0.85. (a) Find P(r ≤ 1). (b) Interpretation: If you conducted the experiment and got fewer than 2 successes, would you be surprised? Why?
5.2.13
A fair quarter is flipped three times. For each of the following probabilities, use the formula for the binomial distribution and a calculator to compute the requested probability. Next, look up the probability in Table 3 of Appendix II and compare the table result with the computed result. (a) Find the probability of getting exactly three heads. (b) Find the probability of getting exactly two heads. (c) Find the probability of getting two or more heads. (d) Find the probability of getting exactly three tails.
5.2.15
Sociologists say that 90% of married women claim that their husband's mother is the biggest bone of contention in their marriages (sex and money are lower-rated areas of contention). Suppose that six married women are having coffee together one morning. What is the probability that (a) all of them dislike their mother-in-law? (b) none of them dislike their mother-in-law? (c) at least four of them dislike their mother-in-law? (d) no more than three of them dislike their mother-in-law?
5.2.19
Approximately 75% of all marketing personnel are extroverts, whereas about 60% of all computer programmers are introverts. (a) At a meeting of 15 marketing personnel, what is the probability that 10 or more are extroverts? What is the probability that 5 or more are extroverts? What is the probability that all are extroverts? (b) In a group of 5 computer programmers, what is the probability that none are introverts? What is the probability that 3 or more are introverts? What is the probability that all are introverts?
5.2.23
Suppose you are a hospital manager and have been told that there is no need to worry that respirator monitoring equipment might fail because the probability any one monitor will fail is only 0.01. The hospital has 20 such monitors and they work independently. Should you be more concerned about the probability that exactly one of the 20 monitors fails, or that at least one fails? Explain.
5.2.5
According to the college registrar's office, 40% of students enrolled in an introductory statistics class this semester are freshmen, 25% are sophomores, 15% are juniors, and 20% are seniors. You want to determine the probability that in a random sample of five students enrolled in introductory statistics this semester, exactly two are freshmen. (a) Describe a trial. Can we model a trial as having only two outcomes? If so, what is success? What is failure? What is the probability of success? (b) We are sampling without replacement. If only 30 students are enrolled in introductory statistics this semester, is it appropriate to model 5 trials as independent, with the same probability of success on each trial? Explain. What other probability distribution would be more appropriate in this setting?
5.2.9
Old Friends Information Service is a California company that is in the business of finding addresses of long-lost friends. Old Friends claims to have a 70% success rate. Suppose that you have the names of six friends for whom you have no addresses and decide to use Old Friends to track them. (a) Make a histogram showing the probability of r = 0 to 6 friends for whom an address will be found. (b) Find the mean and standard deviation of this probability distribution. What is the expected number of friends for whom addresses will be found? (c) Quota Problem: How many names would you have to submit to be 97% sure that at least two addresses will be found?
5.3.13
Consider a binomial experiment with n=8 trials and p=0.20. (a) Find the expected value and the standard deviation of the distribution. (b) Interpretation Would it be unusual to obtain 5 or more successes? Explain. Confirm your answer by looking at the binomial probability distribution table.
5.3.3
What percentage of the area under the normal curve lies (a) to the left of μ? (b) between μ - σ and μ + σ? (c) between μ - 3σ and μ + 3σ?
6.1.5
Assuming that the heights of college women are normally distributed with mean 65 inches and standard deviation 2.5 inches, answer the following questions. (a) What percentage of women are taller than 65 inches? (b) What percentage of women are shorter than 65 inches? (c) What percentage of women are between 62.5 inches and 67.5 inches? (d) What percentage of women are between 60 inches and 70 inches?
6.1.7
Sketch the areas under the standard normal curve over the indicated intervals and find the specified areas. (a) To the left of z = 0.45 (b) To the right of z = -1.22 (c) Between z = -2.18 and z = 1.3
6.2.17, 6.2.21, 6.2.25
Let z be a random variable with a standard normal distribution. Find the indicated probability, and shade the corresponding area under the standard normal curve. A) P(z ≤ -0.13) B) P(-1.20 ≤ z ≤ 2.64)
6.2.33, 6.2.41
A normal distribution has μ = 30 and σ = 5. (a) Find the z score corresponding to x = 25. (b) Find the z score corresponding to x = 42. (c) Find the raw score corresponding to z = -2. (d) Find the raw score corresponding to z = 1.3.
6.2.5
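The conversions in the problem above use z = (x − μ)/σ and its inverse x = μ + zσ. A small sketch with μ = 30 and σ = 5 (function names are illustrative):

```python
def z_score(x, mu, sigma):
    """Standard score of a raw value x."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Raw value corresponding to a standard score z."""
    return mu + z * sigma

mu, sigma = 30, 5
for x in (25, 42):
    print(f"x = {x}: z = {z_score(x, mu, sigma):.2f}")
for z in (-2, 1.3):
    print(f"z = {z}: x = {raw_score(z, mu, sigma):.2f}")
```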
Find the z value described and sketch the area described: Find z such that 55% of the standard normal curve lies to the left of z.
6.3.17
A person's blood glucose level and diabetes are closely related. Let x be a random variable measured in milligrams of glucose per deciliter (1/10 of a liter) of blood. After a 12-hour fast, the random variable x will have a distribution that is approximately normal with mean μ=85 and standard deviation σ=25. Note: After 50 years of age, both the mean and standard deviation tend to increase. What is the probability that, for an adult (under 50 years old) after a 12-hour fast, (a) x is more than 60? (b) x is less than 110? (c) x is between 60 and 110? (d) x is greater than 140 (borderline diabetes starts at 140)?
6.3.25
Assume that x has a normal distribution with the specified mean and standard deviation. Find the indicated probabilities. P(3 ≤ x ≤ 6); μ= 4, σ=2 P(8 ≤ x ≤ 12); μ= 15, σ=3.2 P (x ≥ 30); μ= 20, σ=3.4
6.3.5, 6.3.9, 6.3.11
Suppose x has a distribution with μ=15 and σ=14. (a) If a random sample of size n=49 is drawn, find μ of x̅, σ of x̅, and P(15 ≤ x̅ ≤ 17). (b) If a random sample of size n=64 is drawn, find μ of x̅, σ of x̅, and P(15 ≤ x̅ ≤ 17). (c) Why should you expect the probability of part (b) to be higher than that of part (a)? Hint: Consider the standard deviations in parts (a) and (b).
6.5.11
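For the sampling-distribution problem above, the sample mean x̅ has mean μ and standard deviation σ/√n. The sketch below assumes x̅ is approximately normal (reasonable here by the central limit theorem, since both sample sizes are at least 30) and uses Python's `statistics.NormalDist`. The helper `prob_mean_between` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= x-bar <= hi), treating x-bar as Normal(mu, sigma / sqrt(n))."""
    xbar = NormalDist(mu, sigma / sqrt(n))
    return xbar.cdf(hi) - xbar.cdf(lo)

p_49 = prob_mean_between(15, 17, mu=15, sigma=14, n=49)
p_64 = prob_mean_between(15, 17, mu=15, sigma=14, n=64)

print(f"n = 49: sd of x-bar = {14 / sqrt(49):.2f}, P = {p_49:.4f}")
print(f"n = 64: sd of x-bar = {14 / sqrt(64):.2f}, P = {p_64:.4f}")
```

The larger sample has the smaller standard deviation, so more of the x̅ distribution falls inside the same interval.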
Let x be a random variable that represents the level of glucose in the blood (milligrams per deciliter of blood) after a 12-hour fast. Assume that for people under 50 years old, x has a distribution that is approximately normal, with mean μ=85 and estimated standard deviation σ=25. A test result x < 40 is an indication of severe excess insulin, and medication is usually prescribed. (a) What is the probability that, on a single test, x < 40? (b) Suppose a doctor uses the average x̅ for two tests taken about a week apart. What can we say about the probability distribution of x̅? Hint: See Theorem 6.1. What is the probability that x̅ < 40? (c) Repeat part (b) for n=3 tests taken a week apart. (d) Repeat part (b) for n=5 tests taken a week apart. (e) Interpretation: Compare your answers to parts (a), (b), (c), and (d). Did the probabilities decrease as n increased? Explain what this might imply if you were a doctor or a nurse. If a patient had a test result of x̅ < 40 based on five tests, explain why either you are looking at an extremely rare event or (more likely) the person has a case of excess insulin.
6.5.15
Let x be a random variable that represents the weights in kilograms (kg) of healthy adult female deer (does) in December in Mesa Verde National Park. Then x has a distribution that is approximately normal, with mean μ=63.0 kg and standard deviation σ=7.1 kg. Suppose a doe that weighs less than 54 kg is considered undernourished. (a) What is the probability that a single doe captured (weighed and released) at random in December is undernourished? (b) If the park has about 2200 does, what number do you expect to be undernourished in December? (c) To estimate the health of the December doe population, park rangers use the rule that the average weight of n=50 does should be more than 60 kg. If the average weight is less than 60 kg, it is thought that the entire population of does might be undernourished. What is the probability that the average weight x̅ for a random sample of 50 does is less than 60 kg (assume a healthy population)? (d) Interpretation: Compute the probability that x̅ < 64.2 kg for 50 does (assume a healthy population). Suppose park rangers captured, weighed, and released 50 does in December, and the average weight was x̅ = 64.2 kg. Do you think the doe population is undernourished or not? Explain.
6.5.17
List two unbiased estimators and their corresponding parameters.
6.5.3
Suppose x has a distribution with a mean of 8 and a standard deviation of 16. Random samples of size n=64 are drawn. (a) Describe the x̅ distribution and compute the mean and standard deviation of the distribution. (b) Find the z value corresponding to x̅=9. (c) Find P(x̅ >9). (d) Interpretation: Would it be unusual for a random sample of size 64 from the x distribution to have a sample mean greater than 9? Explain.
6.5.5
Check that it is appropriate to use the normal approximation to the binomial. Then use the normal distribution to estimate the requested probabilities. The Denver Post stated that 80% of all new products introduced in grocery stores fail (are taken off the market) within 2 years. If a grocery store chain introduces 66 new products, what is the probability that within 2 years (a) 47 or more fail? (b) 58 or fewer fail? (c) 15 or more succeed? (d) fewer than 10 succeed?
6.6.11
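The normal approximation in the grocery-product problem above can be sketched as follows. It assumes the usual textbook requirement np > 5 and nq > 5 and applies a continuity correction of 0.5; the function names are illustrative, and `statistics.NormalDist` stands in for a printed z table, so results may differ slightly from table-based answers.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_ok(n, p):
    """Textbook check that both np and nq exceed 5."""
    return n * p > 5 and n * (1 - p) > 5

def prob_at_least(r, n, p):
    """Approximate P(at least r successes) with a continuity correction."""
    approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return 1 - approx.cdf(r - 0.5)

n, p_fail = 66, 0.80
ok = normal_approx_ok(n, p_fail)
p_47_or_more = prob_at_least(47, n, p_fail)

print("approximation appropriate:", ok)
print(f"P(47 or more fail) is about {p_47_or_more:.4f}")
```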
Based on long experience, an airline has found that about 6% of the people making reservations on a flight from Miami to Denver do not show up for the flight. Suppose the airline overbooks this flight by selling 267 ticket reservations for an airplane with only 255 seats. (a) What is the probability that a person holding a reservation will show up for the flight? (b) Let n = 267 represent the number of ticket reservations. Let r represent the number of people with reservations who show up for the flight. Which expression represents the probability that a seat will be available for everyone who shows up holding a reservation? P(255≤r) P(r≤255) P(r≤267) P(r=255) (c) Use the normal approximation to the binomial distribution and part (b) to answer the following question: What is the probability that a seat will be available for every person who shows up holding a reservation?
6.6.15
Suppose we have a binomial experiment in which success is defined to be a particular quality or attribute that interests us. (a) Suppose n = 100 and p = 0.23. Can we safely approximate the p̂ distribution by a normal distribution? Why? Compute μ of p̂ and σ of p̂. (b) Suppose n = 20 and p = 0.23. Can we safely approximate the p̂ distribution by a normal distribution? Why or why not?
6.6.21
You need to compute the probability of 5 or fewer successes for a binomial experiment with 10 trials. The probability of success on a single trial is 0.43. Since this probability of success is not in the table, you decide to use the normal approximation to the binomial. Is this an appropriate strategy? Explain.
6.6.5
Check that it is appropriate to use the normal approximation to the binomial. Then use the normal distribution to estimate the requested probabilities. More than a decade ago, high levels of lead in the blood put 88% of children at risk. A concerted effort was made to remove lead from the environment. Now, according to the Third National Health and Nutrition Examination Survey (NHANES III) conducted by the Centers for Disease Control and Prevention, only 9% of children in the United States are at risk of high blood-lead levels. (a) In a random sample of 200 children taken more than a decade ago, what is the probability that 50 or more had high blood-lead levels? (b) In a random sample of 200 children taken now, what is the probability that 50 or more have high blood-lead levels?
6.6.7
Check that it is appropriate to use the normal approximation to the binomial. Then use the normal distribution to estimate the requested probabilities. It is estimated that 3.5% of the general population will live past their 90th birthday (Statistical Abstract of the United States, 112th Edition). In a graduating class of 753 high school seniors, what is the probability that (a) 15 or more will live beyond their 90th birthday? (b) 30 or more will live beyond their 90th birthday? (c) between 25 and 35 will live beyond their 90th birthday? (d) more than 40 will live beyond their 90th birthday?
6.6.9
True or false: The value z_c is a value from the standard normal distribution such that P(-z_c < x < z_c) = c.
7.1.1
Suppose x has a normal distribution with σ=6. A random sample of size 16 has sample mean 50. (a) Check Requirements Is it appropriate to use a normal distribution to compute a confidence interval for the population mean μ? Explain. (b) Find a 90% confidence interval for μ. (c) Explain the meaning of the confidence interval you computed.
7.1.11
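When σ is known and x is normal, as in the problem above, a confidence interval for μ has the form x̅ ± z_c·σ/√n. The sketch below obtains z_c from `statistics.NormalDist` rather than a table, so the endpoints can differ in the last digit from table-based answers; `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_c * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin

lower, upper, margin = z_interval(xbar=50, sigma=6, n=16, confidence=0.90)
print(f"E = {margin:.2f}, interval: {lower:.2f} < mu < {upper:.2f}")
```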
Suppose x has a mound-shaped distribution with σ=3. (a) Find the minimal sample size required so that for a 95% confidence interval, the maximal margin of error is E=0.4. (b) Check Requirements Based on this sample size, can we assume that the x distribution is approximately normal? Explain.
7.1.13
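The minimal sample size in the problem above comes from n = (z_c·σ/E)², rounded up to the next whole number. A short sketch, again taking z_c from `statistics.NormalDist` instead of a table (the name `min_sample_size` is illustrative):

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, margin, confidence):
    """Smallest n whose margin of error is at most `margin`."""
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z_c * sigma / margin) ** 2)

n_required = min_sample_size(sigma=3, margin=0.4, confidence=0.95)
print("minimal sample size:", n_required)
```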
Total plasma volume is important in determining the required plasma component in blood replacement therapy for a person undergoing surgery. Plasma volume is influenced by the overall health and physical activity of an individual. Suppose that a random sample of 45 male firefighters are tested and that they have a plasma volume sample mean of x̅ = 37.5 ml/kg (milliliters plasma per kilogram body weight). Assume that σ = 7.50 ml/kg for the distribution of blood plasma. (a) Find a 99% confidence interval for the population mean blood plasma volume in male firefighters. What is the margin of error? (b) What conditions are necessary for your calculations? (c) Interpret your results in the context of this problem. (d) Find the sample size necessary for a 99% confidence level with maximal margin of error E = 2.50 for the mean plasma volume in male firefighters.
7.1.17
A random sample is drawn from a population with σ = 12. The sample mean is 30. (a) Compute a 95% confidence interval for μ based on a sample of size 49. What is the value of the margin of error? (b) Compute a 95% confidence interval for μ based on a sample of size 100. What is the value of the margin of error? (c) Compute a 95% confidence interval for μ based on a sample of size 225. What is the value of the margin of error? (d) Compare the margins of error for parts (a) through (c). As the sample size increases, does the margin of error decrease? (e) Critical Thinking Compare the lengths of the confidence intervals for parts (a) through (c). As the sample size increases, does the length of a 90% confidence interval decrease?
7.1.21
True or False: A larger sample size produces a longer confidence interval for μ.
7.1.5
Sam computed a 95% confidence interval for μ from a specific random sample. His confidence interval was 10.1 < μ < 12.2. He claims that the probability that μ is in this interval is 0.95. What is wrong with his claim?
7.1.9
Suppose x has a mound-shaped distribution. A random sample of size 16 has sample mean 10 and sample standard deviation 2. (a) Check Requirements Is it appropriate to use a Student's t distribution to compute a confidence interval for the population mean μ? Explain. (b) Find a 90% confidence interval for μ. (c) Explain the meaning of the confidence interval you computed.
7.2.11
At Burnt Mesa Pueblo, the method of tree-ring dating gave the following years A.D. for an archaeological excavation site: 1189 1271 1267 1272 1268 1316 1275 1317 1275 (a) Use a calculator with mean and standard deviation keys to verify that the sample mean year is x̅ = 1272, with sample standard deviation s = 37 years. (b) Find a 90% confidence interval for the mean of all tree-ring dates from this archaeological site. (c) Interpretation What does the confidence interval mean in the context of this problem?
7.2.13
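For the tree-ring problem above, the calculator check in part (a) and the interval in part (b) can be sketched as below. The critical value 1.860 is an assumption taken from a standard Student's t table (c = 0.90, d.f. = 8); the standard library has no t distribution, so it is hard-coded.

```python
from math import sqrt
from statistics import mean, stdev

years = [1189, 1271, 1267, 1272, 1268, 1316, 1275, 1317, 1275]

n = len(years)
xbar = mean(years)   # sample mean
s = stdev(years)     # sample standard deviation (n - 1 in the denominator)

# Assumed table value: t_c for c = 0.90 with d.f. = n - 1 = 8.
T_90_DF8 = 1.860

margin = T_90_DF8 * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(round(xbar), round(s))
print(round(lower), round(upper))
```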
Over the past several months, an adult patient has been treated for tetany (severe muscle spasms). This condition is associated with an average total calcium level below 6 mg/dl. Recently, the patient's total calcium tests gave the following readings (in mg/dl). 9.3 8.8 10.1 8.9 9.4 9.8 10.0 9.9 11.2 12.1 (a) Use a calculator to verify that x̅ = 9.95 and s= 1.02. (b) Find a 99.9% confidence interval for the population mean of total calcium in this patient's blood. (c) Based on your results in part (b), does it seem that this patient still has a calcium deficiency? Explain.
7.2.17
Consider a 90% confidence interval for μ. Assume σ is not known. For which sample size, n =10 or n=20, is the critical value tc larger?
7.2.7
Lorraine computed a confidence interval for μ based on a sample of size 41. Since she did not know σ, she used s in her calculations. Lorraine used the normal distribution for the confidence interval instead of a Student's t distribution. Was her interval longer or shorter than one obtained by using an appropriate Student's t distribution? Explain.
7.2.9
Isabel Myers was a pioneer in the study of personality types. The following information is taken from A Guide to the Development and Use of the Myers-Briggs Type Indicator by Myers and McCaulley (Consulting Psychologists Press). In a random sample of 62 professional actors, it was found that 39 were extroverts. (a) Let p represent the proportion of all actors who are extroverts. Find a point estimate for p. (b) Find a 95% confidence interval for p. Give a brief interpretation of the meaning of the confidence interval you have found. (c) Check Requirements Do you think the conditions np > 5 and nq > 5 are satisfied in this problem? Explain why this would be an important consideration.
7.3.11
A random sample of 5792 physicians in Colorado showed that 3139 provide at least some charity care (i.e., treat poor people at no cost). These data are based on information from State Health Care Data: Utilization, Spending, and Characteristics (American Medical Association). (a) Let p represent the proportion of all Colorado physicians who provide some charity care. Find a point estimate for p. (b) Find a 99% confidence interval for p. Give a brief explanation of the meaning of your answer in the context of this problem. (c) Is the normal approximation to the binomial justified in this problem? Explain.
7.3.15
In a marketing survey, a random sample of 730 women shoppers revealed that 628 remained loyal to their favorite supermarket during the past year (i.e., did not switch stores). (a) Let p represent the proportion of all women shoppers who remain loyal to their favorite supermarket. Find a point estimate for p. (b) Find a 95% confidence interval for p. Give a brief explanation of the meaning of the interval. (c) As a news writer, how would you report the survey results regarding the percentage of women supermarket shoppers who remained loyal to their favorite supermarket during the past year? What is the margin of error based on a 95% confidence interval?
7.3.19
What percentage of your campus student body is female? Let p be the proportion of women students on your campus. (a) If no preliminary study is made to estimate p, how large a sample is needed to be 99% sure that a point estimate p̂ will be within a distance of 0.05 from p? (b) The Statistical Abstract of the United States, 112th Edition, indicates that approximately 54% of college students are female. Answer part (a) using this estimate for p.
7.3.25
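The two sample-size calculations in the campus problem above can be sketched as follows, assuming the formulas n = (1/4)(zc/E)² with no preliminary estimate and n = p(1 - p)(zc/E)² with one. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(E, c, p_estimate=None):
    """Sample size so that p-hat is within E of p with confidence c."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # With no preliminary estimate, p(1 - p) is replaced by its maximum, 1/4.
    pq = 0.25 if p_estimate is None else p_estimate * (1 - p_estimate)
    return ceil(pq * (z_c / E) ** 2)

n_no_estimate = sample_size_for_proportion(E=0.05, c=0.99)
n_with_estimate = sample_size_for_proportion(E=0.05, c=0.99, p_estimate=0.54)
print(n_no_estimate, n_with_estimate)
```

A hand calculation with the table value zc = 2.58 gives slightly larger sample sizes than this sketch, which uses an unrounded critical value.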
Results of a poll of a random sample of 3003 American adults showed that 20% did not know that caffeine contributes to dehydration. The poll was conducted for the Nutrition Information Center and had a margin of error of +/-1.4%. (a) Does the margin of error take into account any problems with the wording of the survey question, interviewer errors, bias from sequence of questions, and so forth? (b) What does the margin of error reflect?
7.3.3
Consider n = 100 binomial trials with r = 30 successes. (a) Is it appropriate to use a normal distribution to approximate the p̂ distribution? (b) Find a 90% confidence interval for the population proportion of successes p. (c) Explain the meaning of the confidence interval you computed.
7.3.7
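Parts (a) and (b) of the binomial-trials problem above can be sketched as below, assuming the large-sample interval p̂ ± zc·√(p̂q̂/n) and the check np̂ > 5, nq̂ > 5. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, r, c = 100, 30, 0.90

p_hat = r / n
q_hat = 1 - p_hat

# Requirement check, using p-hat because p itself is unknown.
normal_ok = n * p_hat > 5 and n * q_hat > 5

z_c = NormalDist().inv_cdf((1 + c) / 2)
margin = z_c * sqrt(p_hat * q_hat / n)
lower, upper = p_hat - margin, p_hat + margin

print(normal_ok, round(lower, 3), round(upper, 3))
```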
Inorganic phosphorus is a naturally occurring element in all plants and animals, with concentrations increasing progressively up the food chain (fruit, vegetables, cereals, nuts, corpse). Geochemical surveys take soil samples to determine phosphorus content (in ppm, parts per million). A high phosphorus content may or may not indicate an ancient burial site, food storage site, or even a garbage dump. The Hill of Tara is a very important archaeological site in Ireland. It is by legend the seat of Ireland's ancient high kings. Independent random samples from two regions in Tara gave the following phosphorus measurements (in ppm). Assume the population distributions of phosphorus are mound-shaped and symmetric for these two regions. Region I (x1; n1 = 12): 540 810 790 790 340 800 970 720 890 860 820 640. Region II (x2; n2 = 16): 750 870 700 810 635 955 710 890 895 850 280 993 965 350 520 650. (a) Use a calculator to verify that x̅1 = 747.5, s1 = 170.4, x̅2 = 738.9, and s2 = 212.1. (b) Let μ1 be the population mean for x1 and let μ2 be the population mean for x2. Find a 90% confidence interval for μ1 - μ2. (c) Interpretation Explain what the confidence interval means in the context of this problem. Does the interval consist of numbers that are all positive? all negative? of different signs? At the 90% level of confidence, is one region more interesting than the other from a geochemical perspective? (d) Check Requirements Which distribution (standard normal or Student's t) did you use? Why?
7.4.11
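For the Hill of Tara problem above, the calculator check in part (a) and one possible interval for part (b) can be sketched as below. Two assumptions are made: the degrees of freedom are taken conservatively as the smaller of n1 - 1 and n2 - 1, and the critical value 1.796 is a hard-coded Student's t table entry for c = 0.90, d.f. = 11. Software using Satterthwaite's approximation for the degrees of freedom would give a somewhat shorter interval.

```python
from math import sqrt
from statistics import mean, stdev

region_1 = [540, 810, 790, 790, 340, 800, 970, 720, 890, 860, 820, 640]
region_2 = [750, 870, 700, 810, 635, 955, 710, 890,
            895, 850, 280, 993, 965, 350, 520, 650]

n1, n2 = len(region_1), len(region_2)
xbar1, xbar2 = mean(region_1), mean(region_2)
s1, s2 = stdev(region_1), stdev(region_2)

# Assumed table value: t_c for c = 0.90 with d.f. = min(n1, n2) - 1 = 11.
T_90_DF11 = 1.796

# Standard error of xbar1 - xbar2 for independent samples, sigmas unknown.
margin = T_90_DF11 * sqrt(s1**2 / n1 + s2**2 / n2)
diff = xbar1 - xbar2
lower, upper = diff - margin, diff + margin

print(round(xbar1, 1), round(s1, 1), round(xbar2, 1), round(s2, 1))
print(round(lower, 1), round(upper, 1))
```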
Isabel Myers was a pioneer in the study of personality types. She identified four basic personality preferences, which are described at length in the book A Guide to the Development and Use of the Myers-Briggs Type Indicator by Myers and McCaulley (Consulting Psychologists Press). Marriage counselors know that couples who have none of the four preferences in common may have a stormy marriage. Myers took a random sample of 375 married couples and found that 289 had two or more personality preferences in common. In another random sample of 571 married couples, it was found that only 23 had no preferences in common. Let p1 be the population proportion of all married couples who have two or more personality preferences in common. Let p2 be the population proportion of all married couples who have no personality preferences in common. (a) Check Requirements Can a normal distribution be used to approximate the p̂1 - p̂2 distribution? Explain. (b) Find a 99% confidence interval for p1 - p2. (c) Interpretation Explain the meaning of the confidence interval in part (b) in the context of this problem. Does the confidence interval contain all positive, all negative, or both positive and negative numbers? What does this tell you (at the 99% confidence level) about the proportion of married couples with two or more personality preferences in common compared with the proportion of married couples sharing no personality preferences in common?
7.4.17
A random sample of size 20 from a normal distribution with σ=4 produced a sample mean of 8. (a) Is the x̅ distribution normal? Explain. (b) Compute the sample test statistic z under the null hypothesis H0: μ = 7. (c) For H1: μ does not equal 7, estimate the P-value of the test statistic. (d) For a level of significance of 0.05 and the hypotheses of parts (b) and (c), do you reject or fail to reject the null hypothesis? Explain.
8.1.13
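Parts (b) through (d) of the hypothesis-test problem above can be sketched as below, assuming the usual statistic z = (x̅ - μ0)/(σ/√n) and a two-tailed P-value. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sigma, xbar = 20, 4, 8
mu_0 = 7          # value of mu under H0
alpha = 0.05

# Sample test statistic under H0: mu = 7.
z = (xbar - mu_0) / (sigma / sqrt(n))

# Two-tailed P-value for H1: mu != 7.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```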
Bill Alther is a zoologist who studies Anna's hummingbird. Suppose that in a remote part of the Grand Canyon, a random sample of six of these birds was caught, weighed, and released. The weights (in grams) were 3.7 2.9 3.8 4.2 4.8 3.1 The sample mean is x̅ = 3.75 grams. Let x be a random variable representing weights of Anna's hummingbirds in this part of the Grand Canyon. We assume that x has a normal distribution and σ = 0.70 gram. It is known that for the population of all Anna's hummingbirds, the mean weight is μ = 4.55 grams. Do the data indicate that the mean weight of these birds in this part of the Grand Canyon is less than 4.55 grams? Use α = 0.01.
8.1.21
Suppose the P-value in a right-tailed test is 0.0092. Based on the same population, sample, and null hypothesis, what is the P-value for a corresponding two-tailed test?
8.1.9
Weatherwise is a magazine published by the American Meteorological Society. One issue gives a rating system used to classify Nor'easter storms that frequently hit New England and can cause much damage near the ocean. A severe storm has an average peak wave height of μ = 16.4 feet for waves hitting the shore. Suppose that a Nor'easter is in progress at the severe storm class rating. Peak wave heights are usually measured from land (using binoculars) off fixed cement piers. Suppose that a reading of 36 waves showed an average wave height of x̅ = 17.3 feet. Previous studies of severe storms indicate that σ = 3.5 feet. Does this information suggest that the storm is (perhaps temporarily) increasing above the severe rating? Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Explain the rationale for your choice of sampling distribution. Compute the value of the sample test statistic. (c) Estimate the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.2.11
A random sample of 46 adult coyotes in a region of northern Minnesota showed the average age to be x̅ = 2.05 years, with sample standard deviation s = 0.82 years (based on information from the book Coyotes: Biology, Behavior and Management by M. Bekoff, Academic Press). However, it is thought that the overall population mean age of coyotes is μ = 1.75. Do the sample data indicate that coyotes in this region of northern Minnesota tend to live longer than the average of 1.75 years? Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Explain the rationale for your choice of sampling distribution. Compute the value of the sample test statistic. (c) Estimate the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.2.13
Let x be a random variable that represents red blood cell (RBC) count in millions of cells per cubic millimeter of whole blood. Then x has a distribution that is approximately normal. For the population of healthy female adults, the mean of the x distribution is about 4.8. Suppose that a female patient has taken six laboratory blood tests over the past several months and that the RBC count data sent to the patient's doctor are: 4.9 4.2 4.5 4.1 4.4 4.3 i. Use a calculator to verify that x̅ = 4.40 and s = 0.28. ii. Do the given data indicate that the population mean RBC count for this patient is lower than 4.8? Use α = 0.05. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Explain the rationale for your choice of sampling distribution. Compute the value of the sample test statistic. (c) Estimate the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.2.17
When using the Student's t distribution to test μ, what value do you use for the degrees of freedom?
8.2.3
Consider a test for μ. If the P-value is such that you can reject H0 for α = 0.01, can you always reject H0 for α = 0.05? Explain.
8.2.5
A random sample of 25 values is drawn from a mound-shaped and symmetrical distribution. The sample mean is 10 and the sample standard deviation is 2. Use a level of significance of 0.05 to conduct a two-tailed test of the claim that the population mean is 9.5. (a) Check Requirements Is it appropriate to use a Student's t distribution? Explain. How many degrees of freedom do we use? (b) What are the hypotheses? (c) Compute the sample test statistic t. (d) Estimate the P-value for the test. (e) Do we reject or fail to reject H0? (f) Interpret the results.
8.2.9
To use the normal distribution to test a proportion p, the conditions np > 5 and nq > 5 must be satisfied. Does the value of p come from H0, or is it estimated by using p̂ from the sample?
8.3.1
The U.S. Department of Transportation, National Highway Traffic Safety Administration, reported that 77% of all fatally injured automobile drivers were intoxicated. A random sample of 27 records of automobile driver fatalities in Kit Carson County, Colorado, showed that 15 involved an intoxicated driver. Do these data indicate that the population proportion of driver fatalities related to alcohol is less than 77% in Kit Carson County? Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Do you think the sample size is sufficiently large? Explain. Compute the value of the sample test statistic. (c) Find the P-value of the test statistic. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.3.11
The following is based on information from The Wolf in the Southwest: The Making of an Endangered Species by David E. Brown (University of Arizona Press). Before 1918, the proportion of female wolves in the general population of all southwestern wolves was about 50%. However, after 1918, southwestern cattle ranchers began a widespread effort to destroy wolves. In a recent sample of 34 wolves, there were only 10 females. One theory is that male wolves tend to return sooner than females to their old territories where their predecessors were exterminated. Do these data indicate that the population proportion of female wolves is now less than 50% in the region? Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Do you think the sample size is sufficiently large? Explain. Compute the value of the sample test statistic. (c) Find the P-value of the test statistic. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.3.13
Are most student government leaders extroverts? According to Myers-Briggs estimates, about 82% of college student government leaders are extroverts. Suppose that a Myers-Briggs personality preference test was given to a random sample of 73 student government leaders attending a large national leadership conference and that 56 were found to be extroverts. Does this indicate that the population proportion of extroverts among college student government leaders is different (either way) from 82%? Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Do you think the sample size is sufficiently large? Explain. Compute the value of the sample test statistic. (c) Find the P-value of the test statistic. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.3.21
A random sample of 30 binomial trials resulted in 12 successes. Test the claim that the population proportion of successes does not equal 0.50. Use a level of significance of 0.05. (a) Check Requirements Can a normal distribution be used for the p̂ distribution? Explain. (b) State the hypotheses. (c) Compute p̂ and the corresponding standardized sample test statistic. (d) Find the P-value of the test statistic. (e) Do you reject or fail to reject H0? Explain. (f) Interpretation What do the results tell you?
8.3.5
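The binomial-trials test above can be sketched as below, assuming the standard statistic z = (p̂ - p)/√(pq/n) with p and q taken from H0: p = 0.50. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, r = 30, 12
p_0 = 0.50        # proportion claimed in H0
alpha = 0.05
q_0 = 1 - p_0

# Requirement check uses p from H0, not p-hat.
normal_ok = n * p_0 > 5 and n * q_0 > 5

p_hat = r / n
z = (p_hat - p_0) / sqrt(p_0 * q_0 / n)

# Two-tailed P-value for H1: p != 0.50.
p_value = 2 * NormalDist().cdf(-abs(z))

reject_h0 = p_value <= alpha
print(normal_ok, round(p_hat, 2), round(z, 2), round(p_value, 4), reject_h0)
```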
Benford's Law claims that numbers chosen from very large data files tend to have "1" as the first nonzero digit disproportionately often. In fact, research has shown that if you randomly draw a number from a very large data file, the probability of getting a number with "1" as the leading digit is about 0.301. Now suppose you are an auditor for a very large corporation. The revenue report involves millions of numbers in a large computer file. Let us say you took a random sample of n = 215 numerical entries from the file and r = 46 of the entries had a first nonzero digit of 1. Let p represent the population proportion of all numbers in the corporate file that have a first nonzero digit of 1. i. Test the claim that p is less than 0.301. Use α = 0.01. (a) What is the level of significance? State the null and alternate hypotheses. (b) Check Requirements What sampling distribution will you use? Do you think the sample size is sufficiently large? Explain. Compute the value of the sample test statistic. (c) Find the P-value of the test statistic. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application. ii. If p is in fact less than 0.301, would it make you suspect that there are not enough numbers in the data file with leading 1's? Could this indicate that the books have been "cooked" by "pumping up" or inflating the numbers? Comment from the viewpoint of a stockholder. Comment from the perspective of the Federal Bureau of Investigation as it looks for money laundering in the form of false profits. iii. Comment on the following statement: "If we reject the null hypothesis at level of significance α, we have not proved H0 to be false. We can say that the probability is α that we made a mistake in rejecting H0." Based on the outcome of the test, would you recommend further investigation before accusing the company of fraud?
8.3.7
Are data that can be paired independent or dependent?
8.4.1
In environmental studies, sex ratios are of great importance. Wolf society, packs, and ecology have been studied extensively at different locations in the United States and foreign countries. Sex ratios for eight study sites in northern Europe are shown in the following table. Location (% males in winter, % males in summer): Finland (72, 53); Finland (47, 51); Finland (89, 72); Lapland (55, 48); Lapland (64, 55); Russia (50, 50); Russia (41, 50); Russia (55, 45). It is hypothesized that in winter, "loner" males (not present in summer packs) join the pack to increase survival rate. Use a 5% level of significance to test the claim that the average percentage of males in a wolf pack is higher in winter. (a) What is the level of significance? State the null and alternate hypotheses. Will you use a left-tailed, right-tailed, or two-tailed test? (b) Check Requirements What sampling distribution will you use? What assumptions are you making? Compute the value of the sample test statistic. (c) Find (or estimate) the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.4.13
The following data are based on information taken from the book Navajo Architecture: Forms, History, Distributions by S. C. Jett and V. E. Spencer (University of Arizona Press). A survey of houses and traditional hogans was made in a number of different regions of the modern Navajo Indian Reservation. The following table is the result of a random sample of eight regions on the Navajo Reservation. Area (number of inhabited houses, number of inhabited hogans): Bitter Springs (18, 13); Rainbow Lodge (16, 14); Kayenta (68, 46); Red Mesa (9, 32); Black Mesa (11, 15); Canyon de Chelly (28, 47); Cedar Point (50, 17); Burnt Water (50, 18). Does this information indicate that the population mean number of inhabited houses is greater than that of hogans on the Navajo Reservation? Use a 5% level of significance. (a) What is the level of significance? State the null and alternate hypotheses. Will you use a left-tailed, right-tailed, or two-tailed test? (b) Check Requirements What sampling distribution will you use? What assumptions are you making? Compute the value of the sample test statistic. (c) Find (or estimate) the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.4.15
Do professional golfers play better in their first round? Let row B represent the score in the fourth (and final) round, and let row A represent the score in the first round of a professional golf tournament. A random sample of finalists in the British Open gave the following data for their first and last rounds in the tournament. B (last round): 73 68 73 71 71 72 68 68 74. A (first round): 66 70 64 71 65 71 71 71 71. Do the data indicate that the population mean score on the last round is higher than that on the first? Use a 5% level of significance. (a) What is the level of significance? State the null and alternate hypotheses. Will you use a left-tailed, right-tailed, or two-tailed test? (b) Check Requirements What sampling distribution will you use? What assumptions are you making? Compute the value of the sample test statistic. (c) Find (or estimate) the P-value. Sketch the sampling distribution and show the area corresponding to the P-value. (d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level α? (e) Interpret your conclusion in the context of the application.
8.4.19
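The paired-difference statistic for the golf problem above can be sketched as below. Two assumptions are made: differences are taken as d = B - A (last round minus first round), so the test is right-tailed, and the critical value 1.860 is a hard-coded Student's t table entry for a one-tailed area of 0.05 with d.f. = 8.

```python
from math import sqrt
from statistics import mean, stdev

last_round = [73, 68, 73, 71, 71, 72, 68, 68, 74]    # row B
first_round = [66, 70, 64, 71, 65, 71, 71, 71, 71]   # row A

# Paired differences d = B - A for each golfer.
d = [b - a for b, a in zip(last_round, first_round)]

n = len(d)
d_bar = mean(d)
s_d = stdev(d)

# Test statistic for H0: mu_d = 0 against H1: mu_d > 0, with d.f. = n - 1.
t = d_bar / (s_d / sqrt(n))

# Assumed table value: one-tailed alpha = 0.05, d.f. = 8.
T_CRIT = 1.860
reject_h0 = t > T_CRIT

print(d_bar, s_d, round(t, 3), reject_h0)
```

Comparing t with a table critical value gives the same decision as comparing the P-value with α; finding the P-value itself would need a t table or a statistics library.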
When testing the difference of means for paired data, what is the null hypothesis?
8.4.3
For a random sample of 36 data pairs, the sample mean of the differences was 0.8. The sample standard deviation of the differences was 2. At the 5% level of significance, test the claim that the population mean of the differences is different from 0. (a) Check Requirements Is it appropriate to use a Student's t distribution for the sample test statistic? Explain. What degrees of freedom are used? (b) State the hypotheses. (c) Compute the sample test statistic. (d) Estimate the P-value of the sample test statistic. (e) Do we reject or fail to reject the null hypothesis? Explain. (f) Interpretation What do your results tell you?
8.4.7