STAT 311 Practice Problems, Quizzes, and Exams
A researcher believes that the ankle circumference for adult females in Europe can be considered to have a normal distribution with a mean of 20 cm. If his belief is correct, which of the following ranges of ankle sizes will have the largest proportion of members of this population?
17 to 23 (Q2)
Elon Musk is working on a new hybrid engine designed specifically for SUV's and he wants to know how these engines would sell in Orange County, California. As such, he wants to estimate the proportion of cars in Orange County, California that would be considered SUV's. Using the county's vehicle registration records they randomly select 220 vehicles. From that list they determined that 90 of them were SUV's. The sample proportion is 0.409, and the 95% confidence interval for the proportion of cars in Orange County that are SUV's is: (0.344, 0.474). If you created a 90% confidence interval instead of the 95% confidence interval the margin of error would? If you had a random sample of 98 vehicles instead of 220 the margin of error would? If you created a 99% confidence interval instead of the 95% confidence interval the margin of error would? If you had a random sample of 320 vehicles instead of 220 the margin of error would?
90% confidence level: decrease compared to the interval above. Sample of 98 vehicles: increase compared to the interval above. 99% confidence level: increase compared to the interval above. Sample of 320 vehicles: decrease compared to the interval above.
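The four answers above follow from the margin-of-error formula for a proportion, z* × sqrt(p̂(1 − p̂)/n). A minimal Python sketch, assuming the usual large-sample interval; the helper name `margin_of_error` is illustrative and not part of the course materials:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Margin of error z* * sqrt(p_hat * (1 - p_hat) / n) for a proportion."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

p_hat = 90 / 220
baseline = margin_of_error(p_hat, 220, 0.95)  # the 95% interval with n = 220

# Compare each alternative scenario against the baseline margin of error.
scenarios = [("90% level", 220, 0.90), ("n = 98", 98, 0.95),
             ("99% level", 220, 0.99), ("n = 320", 320, 0.95)]
for label, n, confidence in scenarios:
    me = margin_of_error(p_hat, n, confidence)
    print(label, "increase" if me > baseline else "decrease", round(me, 3))
```

A lower confidence level or a larger sample shrinks the margin of error; a higher confidence level or a smaller sample widens it.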
At a large university it is known that 40% of the students live on campus. The director of student life is going to take a random sample of 200 students. What is the probability that more than half of the sampled students live on campus? Give your answer to 4 decimal places
.0019 (q4)
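The answer above can be checked by treating the sample proportion as approximately Normal with mean p and standard deviation sqrt(p(1 − p)/n). This is a sketch under that assumption (here np = 80 and nq = 120, so the approximation is reasonable); the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 200
sd_p_hat = sqrt(p * (1 - p) / n)             # SD of the sample proportion
sampling_dist = NormalDist(mu=p, sigma=sd_p_hat)

# P(p-hat > 0.50) is the area to the right of 0.50.
prob_over_half = 1 - sampling_dist.cdf(0.50)
print(round(sd_p_hat, 4), round(prob_over_half, 4))
```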
In a Midwestern state an automobile insurance company has a large number of customers. From company files it is known that 86% of the customers have only the state minimums for insurance. An official with the state board of insurance is going to take a random sample of 100 accounts to review. Find the standard deviation of the sample proportion in this situation. Give your answer to 4 decimal places
.0347 (q3)
The CEO of a large corporation asked her head HR representative to take a random sample of 97 employees. From his sample, he finds that the proportion of their employees that have children is 0.85. What is the standard error of the proportion in this situation? Give your answer to 3 decimal places
.036 (q5)
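The two previous answers use the same formula with different inputs: the standard deviation uses the known population proportion p, while the standard error substitutes the sample proportion p̂. A short sketch (the function name `sd_of_proportion` is an illustrative choice):

```python
from math import sqrt

def sd_of_proportion(p, n):
    """sqrt(p * (1 - p) / n); pass p-hat instead of p to get a standard error."""
    return sqrt(p * (1 - p) / n)

sd_insurance = sd_of_proportion(0.86, 100)  # population p known: standard deviation
se_children = sd_of_proportion(0.85, 97)    # only p-hat known: standard error
print(round(sd_insurance, 4), round(se_children, 3))
```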
In a sample of 100 students, what is the probability that over 35% of the sample has AP credit? Give your answer to 4 decimal places
.1379 (q3)
In a sample of 200 students, what is the probability that over 33% of the sample has AP credit? Give your answer to 4 decimal places
.1762 (q3)
In a specific, mid-size, warm-weathered city in southern California, there are 2 million homes. As part of an environmental status survey, it was desired to estimate the proportion of homes in this city which contain lead based paints. A simple random sample of 140 households revealed that 37 homes had lead based paints in at least one room. What is the sample proportion? Give your answer to 3 decimal places.
.264 (q5)
In psychology, there is a particular Mental Development Index (MDI) used in the study of infants. The scores on the MDI have approximately a normal distribution with a mean of 100 and standard deviation of 16. We are going to randomly select 64 children and average their MDI scores. What is the probability that the average is under 104? Give your answer to 4 decimal places.
.9772 (q4)
When engineers design products, it is important to consider the weights of people so that airplanes or elevators aren't overloaded. Based on data from the National Health Survey, we can assume the weight of adult males in the US has a mean weight of 197 pounds and standard deviation of 32 pounds. We randomly select 50 adult males. What is the probability that the average weight of these 50 adult males is over 195 pounds? Give your answer to 4 decimal places
0.6700 (q4)
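Both sample-mean questions above use the same idea: the mean of n observations has standard deviation σ/sqrt(n). A minimal sketch, assuming the sampling distribution of the mean is approximately Normal; `sample_mean_dist` is a hypothetical helper name:

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_dist(mu, sigma, n):
    """Approximate sampling distribution of the mean of n observations."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))

# MDI scores: mean 100, SD 16, n = 64, so the SD of the mean is 2.
p_mdi_under_104 = sample_mean_dist(100, 16, 64).cdf(104)

# Adult male weights: mean 197, SD 32, n = 50.
p_weight_over_195 = 1 - sample_mean_dist(197, 32, 50).cdf(195)

print(round(p_mdi_under_104, 4), round(p_weight_over_195, 4))
```

Software carries the unrounded z-score, so the weight result can differ in the third or fourth decimal place from a Table Z lookup with z rounded to −0.44.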
The Mental Development Index (MDI) of the Bayley Scales of Infant Development is a standardized measure used in longitudinal follow-up of high-risk infants. The scores on the MDI have approximately a normal distribution with a mean of 100 and standard deviation of 16. What proportion of children have MDI of at least 80?
0.8944 (Q2)
For a normal distribution, what standard score (Z-score) has 10% of the distribution above it? Find the closest value listed on the table. 2 decimal places
1.28 (Q2)
A state administered standardized reading exam is given to eighth grade students. The scores on this exam for all students statewide have a normal distribution with a mean of 531 and a standard deviation of 66. A local Junior High principal has decided to give an award to any student who scores in the top 10% of statewide scores. How high should a student score be to win this award? Give your answer to the nearest integer
615 (Q2)
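The two percentile answers above can be reproduced by inverting the Normal CDF. The sketch below rounds z to two decimals to mimic a printed z-table, which is an assumption about how the answer was obtained:

```python
from statistics import NormalDist

# z-score with 10% of the distribution above it (90% below it),
# rounded to two decimals to match a printed z-table.
z_star = round(NormalDist().inv_cdf(0.90), 2)

# Convert the z-score back to the exam scale: mean 531, SD 66.
cutoff = 531 + z_star * 66
print(z_star, round(cutoff))
```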
Which of the following scenarios would it be appropriate to use normal approximation for the sampling distribution of the proportion? Select all that apply. Select one or more: A researcher wishes to find the probability that less than 25% of a sample of undergraduate students from Winston Salem State University will be male. He randomly samples 42 undergraduate students from the student database. The population proportion of undergraduate males at WSSU is known to be 29.3%. A researcher wishes to find the probability that more than 11% of a sample of undergraduate students from East Carolina State University will be between the ages of 25 and 34. He samples the first 105 students that walk into the gym on Monday morning. The population proportion of undergraduates between the ages of 25 and 34 is 9.9%. A grad student at UNC Asheville wants to know how likely it is that a group of students would be made up of more than 5% graduate students. She will ask 40 of her friends if they are a graduate student or an undergraduate student. The population proportion of grad students at UNC Asheville is 1.3%. A full time student at UNC Wilmington wants to know how likely it is that a group of students would be made up of more than 90% full time students. She will randomly ask 60 students their enrollment status. The population proportion of full time students at UNC Wilmington is 85%.
A researcher wishes to find the probability that less than 25% of a sample of undergraduate students from Winston Salem State University will be male. He randomly samples 42 undergraduate students from the student database. The population proportion of undergraduate males at WSSU is known to be 29.3%. (q3)
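The sample-size part of this check can be automated: the Normal approximation needs at least 10 expected successes and 10 expected failures. The sketch below covers only that arithmetic; whether the sample is random still has to be judged from the description. The function name `large_enough` is hypothetical:

```python
def large_enough(n, p, threshold=10):
    """Rule of thumb: n*p and n*(1 - p) must both reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# (school, n, p) for the two scenarios that use random samples
print("WSSU", large_enough(42, 0.293))            # 12.3 and 29.7 expected
print("UNC Wilmington", large_enough(60, 0.85))   # only about 9 expected failures
```

The East Carolina and UNC Asheville scenarios fail before any arithmetic, because neither sample is random.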
In which of the following scenarios would it be appropriate to construct a 95% confidence interval for the parameter of interest? Select all that apply. Select one or more: A state safety official wants to estimate the proportion of licensed cars in NC that have worn tires. He randomly selects 5 mechanic shops and he looks at the first 100 car records he finds to see what work was done. He finds that 45% of them have worn tires. A state congressman wants to know the proportion of NC voters who support higher taxes to pay for state parks. He randomly selects 200 voters, and he asks them if they support higher taxes for state parks. 15% of the selected voters supported the higher taxes. A police officer wants to know the proportion of licensed drivers who have had a speeding ticket within the last five years. He randomly selects 30 drivers from a database of all licensed drivers, and checks their records. He finds that 30% of the people had speeding tickets within the last five years. A politician wants to know the proportion of registered women voters that are registered with the Democratic Party. He looks at voter registration, randomly selects 100 women, and sees that 56% of women are registered with the Democratic Party.
A state congressman wants to know the proportion of NC voters who support higher taxes to pay for state parks. He randomly selects 200 voters, and he asks them if they support higher taxes for state parks. 15% of the selected voters supported the higher taxes. A politician wants to know the proportion of registered women voters that are registered with the Democratic Party. He looks at voter registration, randomly selects 100 women, and sees that 56% of women are registered with the Democratic Party. (q3)
Which of the following are NOT likely to be well modeled by a normal distribution because the distribution is NOT likely to be symmetric? (Hint: Sketch what you think the histogram would look like based on the information given.) Select one or more: a. The scores from a university's mathematics placement exam in which the minimum score is 0 and the maximum score is 100. Although there were scores throughout the entire range, more than half of the students scored over 85. b. The number of hours students at a large university work per week at outside jobs. More than 75% of the students worked less than 10 hours but about 5% had jobs in which they worked over 30 hours outside the university. c. The height (in centimeters) of 10 year old boys in the U.S. Rarely are values lower than 120 cm or over 160 cm and the majority are between 135 cm and 145 cm. d. The amounts of time people wait at a particular bus stop for the bus. About 60% of the time they wait less than 7 minutes, however, occasionally because of traffic issues they wait as long as 40 minutes.
A, B, D (Q2)
After taking an aptitude test, the computer told Bob that he had a z-score of 1.08. If scores on the aptitude test are normally distributed, which of the following statements can Bob conclude from his score? Select all that apply. Select one or more: Bob scored within 2 standard deviations of the mean score. Bob did better than the mean score. Bob scored within 1 standard deviation of the mean score. Bob did worse than the mean score. About 14% of students taking the aptitude test did better than Bob. About 14% of students taking the aptitude test did worse than Bob.
Bob scored within 2 standard deviations of the mean score. Bob did better than the mean score. About 14% of students taking the aptitude test did better than Bob. (q2)
A college professor stops at McDonald's every morning for 10 days to get a number 1 value meal costing $5.39. On the 11th day he orders a number 8 value meal costing $4.38. Which of the following are true? Select all that apply. Select one or more: During the first 10 days the professor's standard deviation was more than 0. During the first 10 days the professor's standard deviation was less than 0. During the first 10 days, the professor's standard deviation was 0. It is impossible to tell anything about the professor's standard deviation for the first 10 days. Considering all 11 days, the professor's standard deviation was lower than the standard deviation of the first 10 days. Considering all 11 days, the professor's standard deviation was higher than the standard deviation of the first 10 days. Considering all 11 days, the professor's standard deviation was the same as the standard deviation of the first 10 days. Considering all 11 days, It is impossible to tell anything about the professor's standard deviation compared to the first 10 days.
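During the first 10 days, the professor's standard deviation was 0. Considering all 11 days, the professor's standard deviation was higher than the standard deviation of the first 10 days. (Q2)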
A large company knows that 7% of their employees are paid sales commissions as part of their compensation. The director of human resources is going to take a random sample of 25 employees to complete a survey about compensation. What do we know about the shape of the sampling distribution of the sample proportion of employees who are paid commission? Select one: It will be very close to a normal distribution with a mean of 0.07. It will be somewhat skewed to the right. It will be very close to a normal distribution with a mean of 0.50. It will be distinctly bimodal with a peak at 7% and another at 93%.
It will be somewhat skewed to the right. (q3)
An instructor in a college class recently gave an exam that was worth a total of 100 points. The instructor inadvertently made the exam harder than he had intended. The scores were very symmetric, but the average score for his students was 61 and the standard deviation of the scores was 3 points. The instructor is considering 2 different strategies for rescaling the exam results: Method 1: Multiply everyone's score by 1.2. Method 2: Add 9 points to everyone's score. Which of the following are true? Select all that apply. Select one or more:
Method 1 will increase the standard deviation of the students' scores (Q2)
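The effect of each rescaling method can be seen on a small set of made-up scores. The `scores` list below is hypothetical, chosen only to have a mean of 61; it is not data from the problem:

```python
from statistics import mean, pstdev

scores = [57, 59, 61, 61, 63, 65]        # hypothetical scores with mean 61
method_1 = [1.2 * s for s in scores]     # multiply every score by 1.2
method_2 = [s + 9 for s in scores]       # add 9 points to every score

for label, data in [("original", scores), ("method 1", method_1), ("method 2", method_2)]:
    print(label, round(mean(data), 2), round(pstdev(data), 3))
```

Multiplying scales both the mean and the standard deviation by 1.2, while adding a constant shifts the mean and leaves the standard deviation unchanged.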
At a large university it is known that 35% of the students live on campus. The director of student life is going to take a random sample of 200 students. Which of the following is most likely to occur. Select one: The sample proportion falls between 0.15 and 0.35 The sample proportion falls between 0.35 and 0.55 The sample proportion falls between 0.25 and 0.45 The sample proportion falls between 0.3 and 0.4
The sample proportion falls between 0.25 and 0.45 (q3)
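One way to compare the four intervals is to compute the probability of each under the Normal approximation with mean 0.35 and standard deviation sqrt(0.35 × 0.65/200). A sketch under that assumption, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.35, 200
dist = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

intervals = [(0.15, 0.35), (0.35, 0.55), (0.25, 0.45), (0.30, 0.40)]
# Probability that the sample proportion lands in each interval.
probs = {iv: dist.cdf(iv[1]) - dist.cdf(iv[0]) for iv in intervals}
most_likely = max(probs, key=probs.get)

for iv, prob in probs.items():
    print(iv, round(prob, 3))
print("most likely:", most_likely)
```

The interval from 0.25 to 0.45 wins because it is centered at the mean and spans roughly three standard deviations on each side.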
According to the most recent census, the average income of households in Wake County is $58,500. It is also known that the distribution of household income in Wake County is strongly skewed to the right with a standard deviation of $14,000. A researcher is going to randomly select a sample of 5 households from Wake County. Which of the following is true? Select all that apply. Select one or more: The sampling distribution of the mean will have a smaller standard deviation than the population. We know that the shape of the sampling distribution of the mean will be right skewed. The sampling distribution of the mean will have a larger standard deviation than the population. We can not tell what the shape of the sampling distribution of the mean will look like. We know that the shape of the sampling distribution of the mean will be approximately symmetric. The sampling distribution of the mean will have the same standard deviation as the population.
The sampling distribution of the mean will have a smaller standard deviation than the population. We know that the shape of the sampling distribution of the mean will be right skewed. (q4)
At North Carolina State University it is known that 56% of undergraduates are male. If a sample of 160 undergraduate students was taken, which of the following would accurately describe the sampling distribution? Select all that apply. Select one or more: The sampling distribution will be approximately normal. The sampling distribution will be skewed right. The sampling distribution will be skewed left. The mean of the sampling distribution will be equal to 50% The mean of the sampling distribution will be equal to 56% We can not determine the mean of the sampling distribution from the given information. The standard deviation of the sampling distribution will be 0.0392. The standard deviation of the sampling distribution will be 0.0015 The standard deviation of the sampling distribution will be 0.4964. We can not determine the standard deviation of the sampling distribution from the given information.
The sampling distribution will be approximately normal. The mean of the sampling distribution will be equal to 56%. The standard deviation of the sampling distribution will be 0.0392. (q3)
Which of the following is true about a parameter? a) It is typically known b) It's a numerical summary of the population c) It's a numerical summary of the sample d) Both a and b
Unit 1 Practice Problems b) It's a numerical summary of the population
True or False: Non-response bias occurs when a subset of the sample cannot be contacted or does not respond.
Unit 1 Practice Problems True
The Chief Financial Officer (CFO) at a national retailer would like to estimate the average amount spent by all Americans on Valentine's Day. He surveys 5,819 people on the company's catalog mailing list, asking them to report the amount they spent on gifts, food, and entertainment this past Valentine's Day. He finds that each person spent an average of $130.97. i) What is the population of interest? ii) What is the sample? iii) What is the parameter of interest? iv) What is the statistic? v) Is the sample representative of the population? Explain your reasoning
Unit 1 Practice Problems i) All Americans. ii) The 5,819 people on the company's mailing list that were surveyed. iii) The average amount spent by all Americans on gifts, food, and entertainment for Valentine's Day. iv) $130.97, which is the average amount spent on gifts, food, and entertainment for Valentine's Day by the 5,819 surveyed people on the company's mailing list. v) No, as the sample only includes people who are on the company's catalog list, which especially nowadays is not representative of all Americans. Additionally, we do not even know what kind of retailer this CFO runs, so it could be that the customers of that retailer aren't representative of all Americans to begin with.
True or False: Cluster sampling involves dividing the sampling frame into groups, and then randomly sampling within each group.
Unit 1 Practice problems False. This is stratified sampling! Cluster sampling randomly selects entire groups and includes everyone in the selected groups.
True or false: A 95% confidence interval means that 95% of all sample statistics will be in our interval.
Unit 5 false. A correct statement is "A 95% confidence interval means we believe the PARAMETER is within our interval". Remember, we use the statistic to learn about the parameter, because the parameter is ultimately what we are interested in.
A local radio station is interested in determining how North Carolina residents feel about marijuana legalization. The station set up a special phone number which could be called by people who wished to voice their opinion. The station found that 67% of the 1,624 callers support legalization. i) What is the parameter of interest? ii) What is the population? iii) What is the statistic? iv) What is the sample? v) Is the sample representative of the population?
Unit 1 Practice problems i) The proportion of all North Carolina residents who support marijuana legalization. ii) All North Carolina residents. iii) 67%, the proportion of callers who support the legalization of marijuana. iv) The 1,624 people who called into the radio station and voiced their opinions. v) No, this is a voluntary response sample, which is biased because those who participate tend to have stronger opinions than the general public. In addition, it's possible that people who are not residents of North Carolina participated in the survey.
True or False: A parameter is a numerical summary of a variable for a sample.
Unit 1 Practice problems: False. A parameter is a numerical summary of a variable for a population. A statistic is a numerical summary of a variable for a sample.
The amount of data used by mobile customers has skyrocketed in recent years. Of great concern is how to provide fast and reliable data service at a reasonable cost to customers. Some mobile service companies (like AT&T) are considering charging app providers for the data their app uses. Suppose that AT&T wants to find out the proportion of developers who provide apps to the iPhone app store that are willing to pay for data usage. i) What is the population of interest? ii) What is the parameter of interest? iii) Suppose AT&T takes a random sample of app developers and asks them this question: Would you be willing to pay a charge for the mobile data used by your app if it would result in a substantial number of new customers? What type of bias is most likely to affect the results of this survey? Explain your reasoning.
Unit 1 practice Problems i) App developers who provide apps to the iPhone app store. ii) The proportion of developers who provide apps to the iPhone app store that are willing to pay for data usage. iii) Response bias because of the question wording (substantial number of new customers).
A campaign for a local politician wants to estimate the proportion of voters in a large urban county that plan to vote in an upcoming election. To do so, they take a sample of 150 people who "liked" the candidate on Facebook. i) What is the population of interest? ii) What is the sample? iii) What is the parameter of interest? iv) Is it appropriate to conduct inference in this context? Explain your reasoning.
Unit 1 practice problems i) Voters in a large urban county. ii) The 150 people they selected who like the candidate on Facebook. iii) The proportion of voters in a large urban county that plan to vote in an upcoming election. iv) No, the sample is not representative of the population (all voters in a large urban county) since it only includes people who like the candidate on Facebook, who are probably more likely to vote and may also include people who don't live in the county of interest.
a) Vanessa scored 83 on her History final. The class average was 74, and the standard deviation was 8 points. What is Vanessa's standardized score? b) Marcus is also in Vanessa's class. His standardized score was 1.75. How many points did he score on the exam? c) Julian's standardized score was -.85. Did he score above or below the class average? d) (Modules 2.6, 2.7) Another student, Harper, scored an 89, on the exam. If we assume that the Normal distribution is a good model for the scores in this class, did Harper score in the top 10% of the class? What grade is needed to be in the top 10%?
Unit 2 Practice Problems a) $z = \frac{y - \mu}{\sigma} = \frac{83 - 74}{8} = 1.125$ b) Marcus scored 1.75 standard deviations above the mean: $y = \mu + 1.75\sigma = 74 + 1.75 \times 8 = 88$. You can show this by solving the z-score formula for $y$. c) Below. His standardized score is negative, so he scored less than the mean. Specifically, Julian scored 0.85 standard deviations below the mean: $\mu - 0.85\sigma = 74 - 0.85 \times 8 = 67.2$. d) First, we need to calculate Harper's z-score: $z = \frac{y - \mu}{\sigma} = \frac{89 - 74}{8} = 1.875 \approx 1.88$ (rounded to 2 decimal places). When we look up 1.88 in Table Z, we see that about 97.0% of students scored less than Harper, so she is in the top 10%. To find the score needed to be in the top 10%, find the value in the body of Table Z closest to 0.9000. This gives a z-score of 1.28, so you must score at least 1.28 standard deviations above the mean to be in the top 10% of the class: $\mu + 1.28\sigma = 74 + 1.28 \times 8 = 84.24$.
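The conversions in this answer go both ways: from a raw score to a z-score and back. A small sketch using the class mean of 74 and standard deviation of 8; the helper names `z_score` and `raw_score` are illustrative:

```python
from statistics import NormalDist

MU, SIGMA = 74, 8   # class average and standard deviation from the problem

def z_score(y):
    """Standardized score for a raw exam score y."""
    return (y - MU) / SIGMA

def raw_score(z):
    """Raw exam score that corresponds to standardized score z."""
    return MU + z * SIGMA

vanessa_z = z_score(83)
marcus_score = raw_score(1.75)
julian_score = raw_score(-0.85)
harper_percentile = NormalDist().cdf(z_score(89))  # share of class below Harper
top_10_cutoff = raw_score(1.28)                    # 1.28 is the table z for the top 10%

print(vanessa_z, marcus_score, round(julian_score, 1),
      round(harper_percentile, 3), round(top_10_cutoff, 2))
```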
Would you expect distributions of these variables to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why. a) Ages of people at a Little League game. b) Number of siblings of people in your class. c) Pulse rates of college-age males. d) Number of times each face of a die shows in 100 tosses
Unit 2 Practice Problems a) Bimodal because you have players and parents. It may also be skewed to the right, since parents' ages can be higher than the mean more easily than lower. b) Unimodal and skewed to the right. There are probably many students with 0 or 1 sibling and some with 2 or more. c) Unimodal and symmetric. It will be unusual to have either very high or low pulse rates. d) Uniform. Each face of the die has the same chance of coming up, so the number of times should be about the same for each face.
A meteorologist preparing a talk about global warming compiled a list of weekly low temperatures (in degrees Fahrenheit) he observed at his southern Florida home last year. The coldest temperature for any week was 36°F, but he inadvertently recorded the Celsius value of 2°. Assuming that he correctly listed all the other temperatures, explain how this error will affect these summary statistics: a) measures of center: mean and median. b) measures of spread: range, IQR, and standard deviation
Unit 2 Practice Problems a) The mean will be smaller; the median will not be affected. b) The range and standard deviation will be larger; the IQR won't change
A large organization with membership consisting of professionals with or without an M.D. degree wanted to know the average income of its members. Name the type of sampling plan they used in each of the following scenarios: a) They numbered all the members using an alphabetical list and generated 1000 random numbers. Members corresponding to the numbers are selected for the sample. b) They randomly selected 500 members from a list of all professionals with M.D. degree and 500 members from a list of all without M.D. degree (for a total of 1000). They then surveyed those 1000 members. c) They randomly selected ten cities from all cities in which its members lived, then surveyed all members in those cities. d) They randomly choose a starting point from the first 50 names in an alphabetical list of members, then chose every 50th member in the list from that point. e) They posted a survey on their website and collected responses from all members who completed the survey.
Unit 2 practice Problems a) Simple random sample. b) Stratified sample (the strata are members with and without an M.D. degree). c) Cluster sample (the clusters are cities). d) Systematic sample. e) Voluntary response sample.
A company that markets build-it-yourself furniture sells a computer desk that is advertised with the claim "less than an hour to assemble." However, through post purchase surveys the company has learned that only 25% of its customers succeeded in building the desk in under an hour. One way the company could solve this problem would be to change the advertising claim. What assembly time should the company quote in order that 60% of its customers succeed in finishing the desk by then? The company assumes that consumer assembly time follows a Normal model with a mean of 1.29 hours and a standard deviation of 0.43 hours.
Unit 2 practice Problems We want to rewrite the claim so that 60% of customers finish assembling the desk within the quoted time. In Table Z, the z-score with about 60% of the distribution below it is z = 0.25. We want to find the assembly time y that corresponds to this z-score, so we solve the following equation: $z = \frac{y - \mu}{\sigma} \Rightarrow 0.25 = \frac{y - 1.29}{0.43}$. Solving for y, we find that $y = 1.29 + 0.25 \times 0.43 \approx 1.4$ hours.
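The table lookup in this answer can be replaced by the inverse Normal CDF. A sketch assuming the stated Normal model (mean 1.29 hours, standard deviation 0.43 hours); the variable names are illustrative:

```python
from statistics import NormalDist

assembly_time = NormalDist(mu=1.29, sigma=0.43)   # hours, as assumed by the company

# Time by which 60% of customers are expected to have finished.
quote_for_60_percent = assembly_time.inv_cdf(0.60)

# Sanity check of the model: share of customers finishing within 1 hour.
share_under_one_hour = assembly_time.cdf(1.0)

print(round(quote_for_60_percent, 2), round(share_under_one_hour, 2))
```

The model's share finishing within one hour is close to the 25% the company observed in its surveys, which supports using it for the new claim.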
A company's customer service hotline handles many calls relating to orders, refunds, and other issues. The company's records indicate that the median length of calls to the hotline is 4.4 minutes with an IQR of 2.3 minutes. a) If the company were to describe the duration of these calls in seconds instead of minutes, what would the median and IQR be? b) In an effort to speed up the customer service process, the company decides to streamline the series of push-button menus customers must navigate, cutting the time by 24 seconds. What will the median and IQR of the length of hotline calls become?
Unit 3 Practice Problems a) Median: 264 seconds, IQR: 138 seconds b) Median: 240 seconds, IQR: 138 seconds
Information on a packet of seeds claims that the germination rate is 92%. What's the probability that more than 95% of the 160 seeds in the packet will germinate? Be sure to discuss any assumptions that you make and check the conditions that support your model.
Unit 3 Practice Problems First we need to assume that this is a random sample. Additionally, we need to check the 'Rule of Thumb' to see if a Normal distribution would be appropriate. Here $np = 160 \times 0.92 = 147.2 \ge 10$ and $nq = 160 \times 0.08 = 12.8 \ge 10$. Therefore a Normal distribution is appropriate. As in the problem above, we need to first find a standard score and then use the Table Z chart in the book to find an appropriate probability for the given values. In this problem, we are given $\hat{p} = 0.95$, $p = 0.92$, and $\sigma_{\hat{p}} = \sqrt{\frac{0.92 \times 0.08}{160}} = 0.0214$. Thus $\Pr(\hat{p} > 0.95) = \Pr\left(\frac{\hat{p} - p}{\sigma_{\hat{p}}} > \frac{0.95 - 0.92}{0.0214}\right) = \Pr(z > 1.40) = 1 - 0.9192 = 0.0808$.
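The same steps (check the rule of thumb, compute the standard deviation, find the tail area) can be sketched in Python. This assumes the seeds behave like a random sample, as the solution does; the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

n, p = 160, 0.92

# Rule of thumb: at least 10 expected successes and 10 expected failures.
conditions_ok = n * p >= 10 and n * (1 - p) >= 10

sd_p_hat = sqrt(p * (1 - p) / n)
z = (0.95 - p) / sd_p_hat
prob_over_95 = 1 - NormalDist().cdf(z)   # area to the right of z

print(conditions_ok, round(sd_p_hat, 4), round(z, 2), round(prob_over_95, 3))
```

Because the z-score is not rounded before the lookup, the result can differ slightly in the later decimals from the Table Z value.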
In a really large bag of M&M's, students randomly selected 500 candies, and 12% of them were green. Note: There are 6 different colors of M&M's in each bag, and the company claims that each color is equally likely. a) Is it appropriate to use a Normal model to describe the distribution of the proportion of green M&M's they might expect? b) Is this an unusually small proportion of green M&M's? Explain.
Unit 3 Practice Problems a) Note first that the probability of selecting a green M&M is $\frac{1}{6} = 0.167$. Then, $np = 500 \times 0.167 = 83.3 \ge 10$ and $nq = 500 \times 0.833 = 416.7 \ge 10$. Therefore, we have satisfied the 'Rule of Thumb'. b) Because we know that a Normal distribution is appropriate here, we need to first find a standard score and then use the Table Z chart in the book to find an appropriate probability for the given values. In this problem, we are given $\hat{p} = 0.12$, $p = 0.167$, and $\sigma_{\hat{p}} = \sqrt{\frac{0.167 \times 0.833}{500}} = 0.0167$. Thus $\Pr(\hat{p} < 0.12) = \Pr\left(\frac{\hat{p} - p}{\sigma_{\hat{p}}} < \frac{0.12 - 0.167}{0.0167}\right) = \Pr(z < -2.81) = 0.0025$. If each color is equally likely to occur, then the probability we would find a sample with less than 12% green M&M's is approximately 0.0025. Since the probability is so small, this indicates that the sample proportion is unusually small given what we would expect to see based on the sampling distribution.
It's believed that 4% of children have a gene that may be linked to juvenile diabetes. Researchers hoping to track 20 of these children for several years test 732 newborns for the presence of this gene. What's the probability that they find at least 20 subjects for their study?
Unit 3 Practice Problems Solution First, we need to check the 'Rule of Thumb' to see if a Normal distribution would be appropriate. Here $np = 732 \times 0.04 = 29.28 \ge 10$ and $nq = 732 \times 0.96 = 702.72 \ge 10$. Therefore a Normal distribution is appropriate. Again, we need to first find a standard score and then use the Table Z chart in the book to find an appropriate probability for the given values. Finding at least 20 subjects among 732 newborns means a sample proportion of at least $\hat{p} = \frac{20}{732} = 0.027$. We also have $p = 0.04$ and $\sigma_{\hat{p}} = \sqrt{\frac{0.04 \times 0.96}{732}} = 0.007243$. Thus $\Pr(\hat{p} > 0.027) = \Pr\left(\frac{\hat{p} - p}{\sigma_{\hat{p}}} > \frac{0.027 - 0.04}{0.007243}\right) = \Pr(z > -1.79) = 1 - 0.0367 = 0.9633$.
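As a rough check of this calculation, the count of 20 can be converted to a proportion and pushed through the same Normal approximation without intermediate rounding. This is a sketch with illustrative variable names, not the textbook's procedure:

```python
from math import sqrt
from statistics import NormalDist

n, p, needed = 732, 0.04, 20

p_hat_needed = needed / n               # 20 of 732 newborns
sd_p_hat = sqrt(p * (1 - p) / n)
z = (p_hat_needed - p) / sd_p_hat
prob_at_least_20 = 1 - NormalDist().cdf(z)

print(round(p_hat_needed, 4), round(sd_p_hat, 6), round(z, 2), round(prob_at_least_20, 2))
```

Carrying the unrounded proportion gives a z-score near −1.75 rather than −1.79, so the probability lands close to 0.96 either way; the small gap comes only from rounding 20/732 to 0.027.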
According to a recent report the average income of all people in Adair County Missouri is $31,023. Which of the following is more likely? Pick one and explain. (a) We take a random sample of 50 people from this county and find that the average is over $50,000. (b) We take a random sample of 200 people from this county and find that the average is over $50,000. (c) We have no basis for predicting which is more likely to have an average over $50,000.
Unit 4 (a) With smaller sample sizes, there is more sampling variability, so we would be more likely to observe extreme values of a sample mean.
A swimsuit manufacturer wants to test the speed of its newly designed suit. The company designs an experiment by having 6 randomly selected Olympic swimmers swim as fast as they can with their old swimsuit first and then swim the same event again with the new, expensive swimsuit. a) What is the explanatory variable? Is it quantitative or categorical? b) What is the response variable? Is it quantitative or categorical? c) Is this a completely randomized, blocked or matched design? d) Criticize the experiment and point out some of the problems with generalizing the results to the general public
Unit 7 a) The explanatory variable here is swimsuit type. It is a categorical variable. b) The response variable is the swimmer's racing speed. It is quantitative. c) This is a matched pairs design since each subject (swimmer) underwent both treatments. d) There are a number of issues with this experiment. To begin with, we need to randomize the order in which the swims are performed. If everyone swims in the old swimsuit first and then the new swimsuit, we may be introducing a systematic bias into the experiment. Additionally, there is no way to blind the test. The swimmers will know which kind of suit they are wearing, so that may have an impact on their performance. Lastly, the study only included Olympic swimmers which is not representative of the general public.
Some IQ tests are standardized to a Normal model, with a mean of 100 and a standard deviation of 16. a) About what percent of people should have IQ scores below 72? b) About what percent of people should have IQ scores between 68 and 84 c) About what percent of people should have IQ scores above 116?
Unit 3 Practice Problems a) 0.0401 b) 0.1359 c) 0.1587
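These three areas can be checked without a Z table. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Normal model from the problem: mean 100, standard deviation 16.
iq = NormalDist(mu=100, sigma=16)

below_72 = iq.cdf(72)                    # a) area to the left of 72
between_68_84 = iq.cdf(84) - iq.cdf(68)  # b) area between 68 and 84
above_116 = 1 - iq.cdf(116)              # c) area to the right of 116

print(round(below_72, 4), round(between_68_84, 4), round(above_116, 4))
```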
A political action committee wanted to estimate the proportion of county residents who support the installation of red light cameras throughout the county. They took a random sample of 600 county residents and found that the proportion who wanted to install these cameras was 32% with a margin of error of +/- 4% (with 95% confidence). This implies: Select one: We believe that the true proportion of county residents who want the law changed is between 28% and 36%. There is a 95% chance that the true parameter is 32%. If we take many other samples from this population 95% of them will have a sample proportion that is between 28% and 36%. If we took another sample of 600 residents the sample proportion would definitely be between 28% and 36%. We cannot conclude anything about the population parameter since this is only a sample.
We believe that the true proportion of county residents who want the law changed is between 28% and 36%. (q5)
A west-coast tech company knows the average age of its employees is 34 years. They also know that the standard deviation of the ages of these employees is 8 years. We know that the population of employee ages will have a right skewed distribution. A manager from human resources is going to randomly select a sample of 100. Which of the following is true? Select all that apply. Select one or more: We know that the shape of the sampling distribution of the mean will be right skewed. We know that the shape of the sampling distribution of the mean will be approximately symmetric. We cannot tell what the shape of the sampling distribution of the mean will look like. The sampling distribution of the mean will have a smaller standard deviation than the population. The sampling distribution of the mean will have the same standard deviation as the population. The sampling distribution of the mean will have a larger standard deviation than the population.
We know that the shape of the sampling distribution of the mean will be approximately symmetric. The sampling distribution of the mean will have a smaller standard deviation than the population. (q4)
According to a recent report it was found that 48.7% of residents in Franklin County, Ohio are registered to vote. Which of the following is more likely? Select one: We take a random sample of 30 people from this county and find that the proportion is less than 45%. We take a random sample of 50 people from this county and find that the proportion is less than 45%. We take a random sample of 100 people from this county and find that the proportion is less than 45%. We take a random sample of 500 people from this county and find that the proportion is less than 45%. We have no basis for predicting which is more likely to have a proportion less than 45%.
We take a random sample of 30 people from this county and find that the proportion is less than 45%. (q3)
We are interested in the average height of 5 year old children, so we did a survey. The resulting data (in inches) are 24, 26, 34, 38, 29, 33. The sample mean is 30.7 and the sample standard deviation is 5.3. [Learning Objectives C4, D6, E10, F16, F18] a) How would the mean and standard deviation change if we measured heights in cm instead of inches? (1 inch = 2.5 cm) b) Suppose that heights for 5 year old children follow an approximately normal distribution with mean 32 inches and standard deviation 3 inches. What proportion of children will be greater than 34 inches tall? c) Supposing the same population values as in (b), what is the probability that in a sample of 6 children, the average height will be greater than 34 inches tall? d) Using the sample statistics given in the problem set-up, find a 90% confidence interval for the average height. e) Suppose instead we found the 90% confidence interval for average height based on a sample of 40 children instead of 6. Would our confidence interval be wider or narrower?
a) They would both increase by a factor of 2.5 because multiplication (or division) affects both measures of center and measures of spread. Mean: 30.7 × 2.5 = 76.75 cm, SD: 5.3 × 2.5 = 13.25 cm. b) \(z = \frac{34 - 32}{3} = 0.67\). Looking this up in Table Z, we find that the proportion of values in the Z distribution less than 0.67 is 0.7486. Since we are interested in the upper end of the distribution, the answer is 1 − 0.7486 = 0.2514. c) \(z = \frac{34 - 32}{3/\sqrt{6}} = 1.63\), which corresponds to 0.9484 in Table Z, so the probability is 1 − 0.9484 = 0.0516. d) If we assume the distribution is symmetric (which seems reasonable since the mean and median are very close), we can calculate the confidence interval. We use the t-distribution with 5 degrees of freedom (n = 6, so we use n − 1 df) to get the multiplier for a 90% interval, which is 2.015. So the interval is: \(\bar{y} \pm t \cdot \frac{s}{\sqrt{n}} \Rightarrow 30.7 \pm 2.015 \cdot \frac{5.3}{\sqrt{6}} \Rightarrow 30.7 \pm 2.015 \cdot 2.164 \Rightarrow 30.7 \pm 4.36 \Rightarrow (26.34, 35.06)\). We believe that the true average height for all 5 year old children is between 26.34 inches and 35.06 inches. e) If we had a sample of 40 children, our confidence interval would be narrower because our standard error would be smaller, since the sample size (n) is in the denominator (\(s/\sqrt{40}\) is smaller than \(s/\sqrt{6}\)). Note: Our t multiplier will also be slightly smaller since we have more degrees of freedom, but the effect of sample size on standard error is much greater and much more important!
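The interval in part d) can be reproduced with a short helper. This is a sketch only: the function name `t_interval` is hypothetical, and the t multiplier is passed in from the t table (2.015 for 5 df at 90%) rather than computed, since the standard library has no t distribution.

```python
from math import sqrt

def t_interval(ybar, s, n, t_mult):
    """Confidence interval for a mean: ybar +/- t * s / sqrt(n) (hypothetical helper)."""
    se = s / sqrt(n)    # standard error of the sample mean
    moe = t_mult * se   # margin of error
    return ybar - moe, ybar + moe

# Part d): sample mean 30.7, sample SD 5.3, n = 6, table multiplier 2.015.
lo, hi = t_interval(ybar=30.7, s=5.3, n=6, t_mult=2.015)
print(round(lo, 2), round(hi, 2))
```

The same helper applies to any one-sample interval for a mean, as long as the multiplier is looked up for the correct degrees of freedom and confidence level.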
For each of the following, tell whether the population parameter of interest is µ or p. The U.S. Bureau of Labor and Statistics sampled fifty people in Arkansas asking them their age. Their average age is 49.4 years old with a standard deviation of 17.1 years. What is the average age of all According to a recent survey of 850 Generation Y web users (people born between 1978 and 1983), 449 reported using the internet to download music. What proportion of all Gen Y web users download music from the internet? In a recent study of 62 Agricultural Inspectors from North Carolina it was found that on average they made $41,250 a year. What is the average salary of all Agricultural Inspectors in the state of North Carolina?
µ, p, µ (q4)
Hoping to lure more shoppers downtown, a city builds a new public parking garage in the central business district. The city plans to pay for the structure through parking fees. During a two-month period (41 weekdays), daily fees collected averaged $126, with a sample standard deviation of $15. a) Write a 90% confidence interval for the mean daily income this parking garage will generate. b) The consultant who advised the city on this project predicted that parking revenues would average $130 per day. Based on your confidence interval, do you think that consultant was correct? Why?
unit 6 (a) The 90% confidence interval is as follows: \(\bar{y} \pm t \cdot \frac{s}{\sqrt{n}} \Rightarrow 126 \pm 1.684 \cdot \frac{15}{\sqrt{41}} \Rightarrow 126 \pm 3.94 \Rightarrow (122.06, 129.94)\) (b) The interval provides us with a range of reasonable values for the true average. Since the interval does not contain the value of 130, we would believe that the consultant is not correct.
Statistics from Cornell's Northeast Regional Climate Center indicate that Ithaca, NY, gets an average of 35.4" of rain each year, with a standard deviation of 4.2". Assume that a Normal model applies. (a) During what percentage of years does Ithaca get more than 40" of rain? (b) Less than how much rain falls in the driest 20% of all years?
unit 4 (a) 0.1357 (b) 31.872
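Both parts can be checked in a few lines: part (a) is a forward lookup (value to probability) and part (b) is an inverse lookup (probability to value). The sketch assumes the standard-library `statistics.NormalDist`. The results differ slightly from the answers above because the table method rounds z to two decimals (1.10 and −0.84).

```python
from statistics import NormalDist

# Normal model from the problem: mean 35.4 inches, SD 4.2 inches.
rain = NormalDist(mu=35.4, sigma=4.2)

p_over_40 = 1 - rain.cdf(40)         # (a) proportion of years with more than 40 inches
driest_cutoff = rain.inv_cdf(0.20)   # (b) rainfall value with 20% of years below it

print(round(p_over_40, 4), round(driest_cutoff, 3))
```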
The finishing times of a 10k race are believed to have a skewed right population with mean 59 minutes with a standard deviation of 8 minutes. (a) Can we estimate the probability that a runner will beat her goal time of 45 minutes? If so, calculate the probability. If not, explain why not. (b) If we were to take a random sample of 15 people, can we estimate the probability that the average time for the group will be longer than 1 hour? If so, calculate the probability. If not, explain why not. (c) If we were to take a random sample of 60 people, can we estimate the probability that the average time for the group will be longer than 1 hour? If so, calculate the probability. If not, explain why not.
unit 4 (a) No, the parent population is skewed and therefore NOT Normal. (b) No, since the parent population is skewed we need a sample size of at least 30 to use the CLT. Thus, the sampling distribution is NOT Normal. (c) Yes, since we took a random sample and n = 60 > 30, the conditions are satisfied and the sampling distribution is approximately Normal. \(P(\bar{y} > 60) = P\left(\frac{\bar{y} - 59}{8/\sqrt{60}} > \frac{60 - 59}{8/\sqrt{60}}\right) = P(Z > 0.97) = 1 - 0.8340 = 0.166\)
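The decision in parts (b) and (c) can be written as a guard in code. This is a minimal sketch: `prob_mean_greater` is a hypothetical name, and the cutoff of 30 is the rule quoted in the solution, not a universal constant.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_greater(mu, sigma, n, cutoff, population_normal=False):
    """P(sample mean > cutoff), refusing when the CLT rule of thumb fails."""
    # If the population is not Normal, the solution requires n of at least 30.
    if not population_normal and n < 30:
        raise ValueError("population not Normal and n < 30; cannot use a Normal model")
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    return 1 - NormalDist(mu, se).cdf(cutoff)

# Part (c): skewed population, mean 59, SD 8, sample of 60, cutoff 60 minutes.
p_c = prob_mean_greater(mu=59, sigma=8, n=60, cutoff=60)
print(round(p_c, 3))
```

Calling the function with n = 15 raises `ValueError`, which mirrors the "No" answer in part (b).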
A university expects that the proportion of all undergraduate students enrolled who own a laptop is .70. (a) If we were to take a random sample of 10 undergraduate students, can we estimate the probability that less than half of the students sampled own a laptop? If so, calculate the probability. If not, explain why not. (b) If we were to take a random sample of 45 undergraduate students, can we estimate the probability that more than three-quarters of the students sampled own a laptop? If so, calculate the probability. If not, explain why not. (c) What is the minimum sample size needed to guarantee that the 'rule of thumb' has been satisfied?
unit 4 (a) No, we must have \(np \ge 10\) and \(nq \ge 10\), but \(nq = 10 \times 0.3 = 3 < 10\). (b) Yes, we have a random sample and \(np = 45 \times 0.7 = 31.5 \ge 10\) and \(nq = 45 \times 0.3 = 13.5 \ge 10\), so we can use the Normal distribution. \(P(\hat{p} > 0.75) = P\left(\frac{\hat{p} - 0.7}{\sqrt{\frac{0.7 \times 0.3}{45}}} > \frac{0.75 - 0.7}{\sqrt{\frac{0.7 \times 0.3}{45}}}\right) = P(Z > 0.73) = 1 - 0.7673 = 0.2327\) (c) We need both \(np \ge 10\) and \(nq \ge 10\) where p = 0.7. Since q is smaller, we'll work with nq. So we need \(n \times 0.3 \ge 10 \Rightarrow n \ge \frac{10}{0.3} \Rightarrow n \ge 33.33 \Rightarrow n = 34\)
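Part (c) generalizes to any p: divide 10 by the smaller of p and q and round up. The sketch below uses a hypothetical helper name and exact fractions, an implementation choice made here so that floating-point error cannot push the ceiling up by one.

```python
from fractions import Fraction
from math import ceil

def min_n_rule_of_thumb(p, threshold=10):
    """Smallest n with n*p >= threshold and n*q >= threshold (hypothetical helper)."""
    p_exact = Fraction(str(p))            # e.g. 0.7 becomes exactly 7/10
    smaller = min(p_exact, 1 - p_exact)   # the binding condition uses the smaller of p and q
    return ceil(threshold / smaller)

n_min = min_n_rule_of_thumb(0.7)
print(n_min)
```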
For each of the following items, write the proper formula for the standard score. Refer to the formulas on the Formula Sheet. Note: Just write out the proper formula including all the appropriate values as if you were going to calculate the standard score. Do not worry about calculating the standard score. (a) Research suggests that red kangaroos have an average height of 63 in with a standard deviation of 4 in. What is the probability that a random sample of 45 kangaroos has an average of height that is greater than 70 inches. (b) We believe that approximately 3 out of 4 people prefer statistics to all other subjects. In a random sample of 100 people, what is the probability that less than half of those sampled prefer statistics to all other subjects. (c) IQ scores are believed to follow a normal distribution with mean 90 and standard deviation 10. What is the probability that a person will have an IQ score that is greater than 95
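unit 4 (a) \(z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} = \frac{70 - 63}{4/\sqrt{45}}\) (b) \(z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.5 - 0.75}{\sqrt{\frac{0.75 \times 0.25}{100}}}\) (c) \(z = \frac{y - \mu}{\sigma} = \frac{95 - 90}{10}\)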
An institution that ranks colleges and universities along several dimensions is interested in learning the proportion of undergraduate students at Mid-South State University who have had at least one alcoholic drink in the past week. The institution randomly selects 100 undergrads at MSSU and asks them to fill out an anonymous survey. From their sample, they find that 38% of the undergrads at MSSU have had a least one alcoholic drink in the past week. (Modules 5.1 - 5.3) [Learning Objectives F1, F2, F6, F7, F8, F9] (a) Calculate the standard error based on this data. Write a sentence interpreting this value. (b) What is the appropriate multiplier to use for a 95% confidence interval? (c) Calculate the margin of error based on this data. (d) Check the necessary conditions for appropriate inference using a confidence interval for a population proportion. (e) Calculate the 95% confidence interval for the population proportion based on this data. Write a sentence interpreting this interval.
unit 5 (a) The standard error of \(\hat{p}\) is \(\sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(0.38)(1 - 0.38)}{100}} = \sqrt{\frac{(0.38)(0.62)}{100}} = 0.049\). This is an estimate of the variability of the sampling distribution for the sample proportion. (b) The appropriate multiplier for a 95% confidence level is z = 1.96. (c) The margin of error is the multiplier times the standard error: \(\text{MOE} = z \times \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \times \sqrt{\frac{(0.38)(0.62)}{100}} = 0.096\) (using the rounded standard error 0.049). (d) The conditions are that we need a large, random sample. We are told the sample is random. To check that it is large enough, we need to use the rules of thumb: \(n\hat{p} = (100)(0.38) = 38\) and \(n\hat{q} = n(1 - \hat{p}) = (100)(0.62) = 62\). Both of these quantities are at least 10, so the sample is large enough. Thus, both necessary conditions are met. (e) \(0.38 \pm 1.96 \times \sqrt{\frac{(0.38)(0.62)}{100}}\). The endpoints are: (0.284, 0.476). We believe that the true proportion of all undergrads at MSSU who have had at least one drink in the past week is somewhere between 0.284 and 0.476.
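The standard error, margin of error, and interval can be computed together. This is a sketch with a hypothetical helper name, `proportion_ci`. It keeps full precision for the standard error, so the margin of error comes out near 0.095 and the endpoints differ from the solution in the third decimal, because the solution rounds the standard error to 0.049 first.

```python
from math import sqrt

def proportion_ci(p_hat, n, z=1.96):
    """Return (standard error, margin of error, (lower, upper)) for a proportion."""
    # Large-sample condition from the solution: n*p-hat and n*q-hat at least 10.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the Normal-based interval")
    se = sqrt(p_hat * (1 - p_hat) / n)
    moe = z * se
    return se, moe, (p_hat - moe, p_hat + moe)

se, moe, (lower, upper) = proportion_ci(p_hat=0.38, n=100)
print(round(se, 3), round(moe, 3), round(lower, 3), round(upper, 3))
```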
FirstUSA, a major credit card company, is planning a new offer for their current cardholders. The offer will give double airline miles on purchases for the next 6 months if the cardholder goes online and registers for the offer. To test the effectiveness of the campaign, FirstUSA recently sent out offers to a random sample of 50,000 cardholders. Of those, 1184 registered. a) Give a 95% confidence interval for the true proportion of those cardholders who will register for the offer. b) If fewer than 2% will register for the offer, the campaign won't be worth the expense. Given the confidence interval you found, would you suggest the company go forward with the campaign?
unit 5 a) \(\hat{p} = \frac{1184}{50000} \approx 0.024\), so the confidence interval is \(\hat{p} \pm z\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow 0.024 \pm 1.96\sqrt{\frac{0.024 \times 0.976}{50000}} \approx 0.024 \pm 0.00133 = (0.0227, 0.0253)\). We believe the true percent of cardholders who will register for the offer is between 2.27% and 2.53%. b) Since the entire interval from part a) is above 2%, we suggest that the company go forward with the campaign.
True or false: A larger sample size will produce a larger margin of error.
unit 5 false because a larger sample size will produce a smaller margin of error.
True or false: The confidence level is an expression of our confidence in the procedure that was used to create the interval of interest.
unit 5 true. "A 95% confidence level means that using the same procedure repeatedly on different samples of the same size, about 95% of the intervals will contain the parameter".
True or false: A higher level of confidence will produce a larger margin of error.
unit 5 true because a higher level of confidence corresponds to a larger standard score (draw a picture to see this). Intuitively, to have more confidence that our interval contains the parameter, the interval needs to include more values.
An undergraduate French instructor gives an exam, and the exam scores are lower than he had hoped. He decides to curve the scores by adding 10 points to everyone's exam score. Which of the following will be affected by the curve: a) Mean b) Standard deviation c) IQR
unit 6 a) Mean: Yes b) Standard deviation: No, the standard deviation is not affected by adding the same value to everyone's score, as addition (and subtraction) simply shift the distribution along the number line but do not change the shape. c) IQR: No, the IQR is not affected by adding the same value to everyone's score
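A quick numerical check makes the point concrete. The scores below are invented for illustration and are not from the problem; the sketch uses the standard-library `statistics` module, whose `quantiles` function defaults to the "exclusive" quartile method.

```python
from statistics import mean, stdev, quantiles

def summary(data):
    """Return (mean, standard deviation, IQR) for a list of numbers."""
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points
    return mean(data), stdev(data), q3 - q1

scores = [52, 61, 67, 70, 74, 78, 85]  # hypothetical exam scores
curved = [s + 10 for s in scores]      # add 10 points to everyone

m1, sd1, iqr1 = summary(scores)
m2, sd2, iqr2 = summary(curved)
print(m2 - m1, sd2 - sd1, iqr2 - iqr1)  # mean shifts by 10; the spread measures do not change
```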
Public relations staff at State U. phoned 850 randomly selected local residents. After identifying themselves, the callers asked the survey participants their ages, whether they had attended college, and whether they had a favorable opinion of the university. a) Identify the variables, classify each as categorical or quantitative, and identify the appropriate sample statistic - e.g. mean or proportion. b) They found that 650 of the residents attended college. Calculate a 95% confidence interval for the corresponding parameter. Be sure to clearly communicate about the parameter of interest when reporting your interval. c) State the conditions that go along with the confidence interval in part (b). Indicate if they are met.
unit 6 a) i. Age - Quantitative - Mean ii. College Attendance - Categorical - Proportion iii. Favorable Opinion of the University - Categorical - Proportion b) From the prompt, \(\hat{p} = \frac{650}{850} \approx 0.76\). The 95% confidence interval is as follows: \(\hat{p} \pm z \cdot \sqrt{\frac{\hat{p} \cdot \hat{q}}{n}} \Rightarrow 0.76 \pm 1.96 \cdot \sqrt{\frac{0.76 \cdot 0.24}{850}} \Rightarrow 0.76 \pm 0.0287 \Rightarrow (0.7313, 0.7887)\) We believe the true percent of local residents who have attended college is between 73% and 79%. c) The conditions are given on the formula sheet: • Random sample (true because it says a random sample was taken) • \(n\hat{p} \ge 10\): \(n\hat{p} = 850 \cdot \frac{650}{850} = 650 \ge 10\), so this condition is satisfied • \(n\hat{q} \ge 10\): \(n\hat{q} = n(1 - \hat{p}) = 850 - 650 = 200 \ge 10\), so this condition is satisfied
True or false: A t statistic is used to construct confidence intervals for means in order to account for additional uncertainty in estimating the standard deviation.
unit 6 true
What standard score (z ∗ ) has 4.95% of values above it?
unit 6 z = 1.65
Software analysis of the salaries of a random sample of 288 Nevada teachers produced the 90% confidence interval of ($38,944, $42,893). Which statement(s) is(are) correct? What's wrong with the other(s)? a) If we took many random samples of 288 Nevada teachers, about 90% of them would produce this confidence interval. b) If we took many random samples of Nevada teachers, about 90% of them would produce a confidence interval that contained the mean salary of all Nevada teachers. c) About 9 out of 10 Nevada teachers earn between $38,944 and $42,893. d) We are 100% confident that the average amount the teachers surveyed earn is between $38,944 and $42,893. e) We are 90% confident that the average teacher salary in the United States is between $38,944 and $42,893.
unit 6 a) Incorrect. Ninety percent of the intervals will contain the true mean salary, but different samples will produce different intervals. b) This is correct. c) Incorrect. The interval is for the population mean, not individual teachers. d) This is correct, because we know the sample average is in the middle of the confidence interval. e) Incorrect. The interval addresses only Nevada teachers, not the entire country.
Sony is interested in estimating the proportion of US households that have Blueray players. A researcher randomly selects 500 households, and discovers 115 own Blueray players. [Learning Objectives A1, A4, E5] a) What is the population? The parameter of interest? b) What kind of sample is used? What is the sample size? c) What is an appropriate sample statistic that estimates the parameter of interest? d) Suppose the true proportion of households with Blueray players is 0.2. Describe the sampling distribution for the statistic in c as specifically as possible. e) Which of the following histograms could represent the sampling distribution described in d?
unit 6 a) The population is all US households. The parameter of interest is the proportion of US households that own Blueray players. b) A simple random sample is used. The sample size is 500. c) The sample proportion \(\hat{p} = \frac{115}{500} = 0.23\). d) To describe a distribution, we need to specify center, spread, and shape. If the true proportion p is 0.2, the sampling distribution is centered at (its mean of) p = 0.2 and will have a standard deviation of \(\sqrt{\frac{0.2 \times 0.8}{500}} = 0.0179\). As for the shape, we need to check the conditions to see if it is normally distributed or not. A random sample was taken, and both 500 × 0.2 = 100 and 500 × 0.8 = 400 are greater than 10. Since all three of these conditions are satisfied, the shape of the distribution is approximately normal. e) Histogram 2, because it is centered at 0.2 and is symmetric and bell-shaped (like the normal distribution). For extra practice, the median of 1 is about 250, the median of 2 is about 0.2, and the median of 3 is slightly less than 0.2. Note that these histograms have relative frequency on the Y axis.
An experiment showed that subjects fed the DASH diet were able to lower their blood pressure by an average of 6.7 points compared to a group fed a "control diet." All meals were prepared by dietitians. a) Why were the subjects randomly assigned to the diets instead of letting people pick what they wanted to eat? b) Why were the meals prepared by dieticians? c) Why did the researchers need the control group? If the DASH diet group's blood pressure was lower at the end of the experiment than at the beginning, wouldn't that prove the effectiveness of that diet? d) What additional information would you want to know in order to decide whether the average reduction in blood pressure of 6.7 points was a significant difference?
unit 7 a) Self-selection could systematically introduce lurking variables, resulting in groups that were very different at the start of the experiment, making it impossible to attribute the differences in the results to their diet. b) This assured that all subjects in each group received comparable treatments. c) The researchers can compare the change in blood pressure observed in the DASH group to the control group. They need to rule out the possibility that external variables (e.g. the season, news events) affected everyone's blood pressure. These are potential confounders. d) It would be helpful to know the standard deviation of the changes. If this change is very large compared to the standard deviations, then 6.7 might appear to be significant. Practically we would want some sort of reference point to know whether or not a change of this magnitude is actually clinically relevant.
When spending large amounts to purchase advertising time, companies want to know what audience they'll reach. In January 2007, a poll asked 1008 randomly selected American adults whether they planned to watch the upcoming Super Bowl. Men and women were asked separately whether they were looking forward more to the football game or to watching the commercials. Among the men, 16% were planning to watch and were looking forward primarily to the commercials. Among women, 30% were looking forward primarily to the commercials. a) Was this a stratified sample or a blocked experiment? b) Was the design of the study appropriate for the advertisers' question?
unit 7 a) This is a stratified sample. The question was about population values (the proportions of men and women who look forward more to the commercials). There was no treatment applied or manipulation of explanatory variables, so it is not an experiment. b) Yes, this design was appropriate to answer the advertisers' question.
Coffee stations in offices often just ask users to leave money in a tray to pay for their coffee, but many people cheat. On randomly selected days, researchers at Newcastle University replaced the picture of flowers on the wall behind the coffee station with a picture of staring eyes. They found that the average contribution increased significantly above the well-established standard when people felt they were being watched, even though the eyes were patently not real. (New York Times 12/10/06). a) Is this an observational study or an experiment? b) What is the explanatory variable? Is it quantitative or categorical? c) What is the response variable? Is it quantitative or categorical? d) Is it correct to conclude from this study that using a picture of staring eyes will cause people to pay more? Explain.
unit 7 a) This is an experiment. b) The explanatory variable is the picture that was used. This is a categorical variable. c) The response variable is the average amount that was contributed. This is a quantitative variable. d) Yes, by randomly selecting the days for which the picture of the eyes was displayed, the researchers have helped to eliminate the issue of lurking variables.
Is diet or exercise effective in combating insomnia? Some believe that cutting out desserts can help alleviate the problem, while others recommend exercise. Forty volunteers suffering from insomnia agreed to participate in a month-long test. Half were randomly assigned to a special no-desserts diet; the others continued desserts as usual. Half of the people in each of these groups were randomly assigned to an exercise program, while the others did not exercise. Those who ate no desserts and engaged in exercise showed the most improvement. a) Is this an observational study or an experiment? b) What are the explanatory variables? Are they quantitative or categorical? c) What is the response variable? Is it quantitative or categorical? d) Is this a completely randomized, blocked or matched design? e) Why didn't the investigators just have everyone exercise and see if their ability to sleep improved? f) Identify the subjects in this study g) Is it correct to conclude from this study that eating no desserts and engaging in exercise reduces insomnia? Explain.
unit 7 a) This is an experiment. b) The explanatory variables are desserts and exercise (each with 2 levels). They are each categorical. c) The response variable is improvement in insomnia. Depending on how insomnia is measured, it could be either categorical or quantitative. If an insomnia score is calculated using a number of different factors, then we might consider it as quantitative. However, it is more likely that insomnia improvement is a categorical variable. d) This is a completely randomized design. We know it's a completely randomized design because none of the divisions of the groups are made because of some characteristic. If we first divided the group into males and then females and then randomly assigned participants within each group to the diet and exercise groups, then it would be a blocked design. However, because we just randomly assign people to diet and exercise groups, it is a completely randomized design. e) It is important to include a control in your experimental design. It could be that the amount of daylight has an effect on insomnia, and by not considering a group that doesn't exercise, we lose the ability to actually identify whether or not exercise plays a role in helping to combat insomnia. f) The subjects in this study are the forty volunteers who were suffering from insomnia. g) Yes, by randomizing the participants between the groups, we should equalize any unknown factors (e.g. diet, exercise, genetics) in a way that will allow us to infer causality from the study results.
Some people who race greyhounds give the dogs large doses of vitamin C in the belief that the dogs will run faster. Investigators at the University of Florida tried three different diets in random order on each of five racing greyhounds. They were surprised to find that when the dogs ate high amounts of vitamin C they ran more slowly. (Science News, 07/20/02) a) Is this an observational study or an experiment? b) What is the explanatory variable? Is it quantitative or categorical? c) What is the response variable? Is it quantitative or categorical? d) Is this a completely randomized, blocked or matched design? e) Identify the subjects in this study. f) Is it correct to conclude from this study that a diet high in vitamin C will result in a slower running pace?
unit 7 a) This is an experiment. b) The explanatory variable is the diet. There are three different levels of the factor. This is a categorical variable. c) The response variable is the dog's racing speed. It is quantitative. d) This is a matched design since each dog tried each of the three diets, and within each of the dogs, the order of the diets is randomized. e) The subjects in the study are the five greyhounds. f) Yes, because of the experimental nature, and the random assignment of the diet, it is correct to conclude that a diet high in vitamin C will result in a slower running pace.
Researchers who examined health records of thousands of males found that men who died of myocardial infarction (heart attack) tended to be shorter than men who did not. a) Is this an observational study or an experiment? b) What is the explanatory variable? Is it quantitative or categorical? c) What is the response variable? Is it quantitative or categorical? d) Is it correct to conclude that shorter men are at higher risk for heart attack? Explain.
unit 7 a) This is an observational study because it's not manipulating any of the variables. They are using previously collected data and considering relationships between the variables. b) The explanatory variable is height. This could be either quantitative or categorical depending on how the data were coded. If height was recorded as an actual inch amount, then it would be quantitative, but if we consider men to be short if they are shorter than 5.5 feet, then the variable would be considered categorical, indicating whether or not a male is short. c) The response variable is whether or not the patient suffered a myocardial infarction. This is a categorical variable. d) It is not correct to conclude that shorter men are at higher risk for heart attack. We cannot infer causation from an observational study due to potential lurking variables such as a patient's age.
The weight of potato chips in a medium-size bag is stated to be 10 ounces. The amount that the packaging machine puts in these bags is believed to have a Normal model with mean 10.2 ounces and standard deviation 0.12 ounces. (a) Can we estimate the probability that a bag sold will be underweight? If so, calculate the probability. If not, explain why not. (b) If we were to take a random sample of 3 bags, can we estimate the probability that the mean weight of the sample is underweight? If so, calculate the probability. If not, explain why not. (c) If we were to take a random sample of 24 bags, can we estimate the probability that the average weight of the sample is greater than 10.4 ounces? If so, calculate the probability. If not, explain why not.
unit 4 (a) 0.0475 (b) 0.0019 (c) ≈ 0
Direct mail advertisers send solicitations (a.k.a. "junk mail") to thousands of potential customers in the hope that some will buy the company's product. The acceptance rate is usually quite low. Suppose a company wants to test the response to a new flyer, and sends it to 1000 people randomly selected from their mailing list of over 200,000 people. They get orders from 123 of the recipients. a) Create a 90% confidence interval for the percentage of people the company contacts who buy something. b) Interpret the interval calculated in part a. Be sure to clearly communicate about the parameter of interest. c) Write an interpretation of what this 90% confidence level means. d) What percentage of 90% confidence intervals would we expect to miss the true parameter? Why does this happen? e) The company must decide whether to now do a mass mailing. The mailing won't be cost-effective unless at least 17% of those contacted buy something. What does your confidence interval suggest?
unit 5 a) \(\hat{p} = \frac{123}{1000} = 0.123\), so the confidence interval is \(\hat{p} \pm z\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow 0.123 \pm 1.64\sqrt{\frac{0.123 \times 0.877}{1000}} = 0.123 \pm 0.017 = (0.106, 0.140)\) b) We believe that the percentage of people the company contacts who buy something is between 10.6% and 14%. c) If we were to take numerous random samples of the same size and calculate a confidence interval for each sample, approximately 90% of the confidence intervals would contain the true population parameter. d) If we expect 90% of the intervals to contain the true parameter (per our interpretation in the previous part), then about 10% should miss it. This happens because these intervals are based on sample statistics that come from the extremes of the sampling distribution (i.e. are either extremely small or extremely large); thus 10% of the confidence intervals are too far away from the true parameter to contain it. e) Because 17% is above the confidence interval calculated in part a), the confidence interval suggests that we should not do the mass mailing.
A demographer recently collected a random sample of 25 people to find out the average number of children that households in Durham, NC have. It is believed that the population has a normal distribution. They found the average value of their sample was 2.4 children with a sample standard deviation of 1. [Learning Objectives F1, F14, F15, F16, F17] (a) Calculate the standard error based on this data. (b) What is the appropriate multiplier to use for a 95% confidence interval for the population mean? (c) Calculate the margin of error based on this data. (d) Check the necessary conditions for appropriate inference using a confidence interval for a population mean. (e) Calculate the 95% confidence interval for the population mean based on this data. Write a sentence interpreting this interval.
unit 6 (a) Standard error \(= \frac{s}{\sqrt{n}} = \frac{1}{\sqrt{25}} = 0.2\) (b) \(t_{n-1} = t_{24} = 2.064\) (c) Margin of error \(= t \times \frac{s}{\sqrt{n}} = 2.064 \times 0.2 = 0.413\) (d) We need a random sample and, since n = 25 < 30, a normally distributed population. Both conditions are satisfied. (e) CI: \(\bar{y} \pm t \times \frac{s}{\sqrt{n}} \Rightarrow 2.4 \pm 0.413 \Rightarrow (1.987, 2.813)\) Communication: We believe that the true average number of children that households in Durham have is between 1.99 and 2.81.
Find the appropriate t multiplier for the following situations: a) A 95% confidence interval with n = 8 b) A 99% confidence interval with n = 11
unit 6 a) n = 8 ⇒ df = n − 1 = 7, so t = 2.365 b) n = 11 ⇒ df = 10, so t = 3.169
True or false: We use a t distribution to calculate the margin of error for sample proportions.
unit 6 false. When calculating confidence intervals for proportions, we use a normal distribution.
A sample of 900 high school seniors were randomly selected for a national survey. Among the survey participants, 372 students were interested in pursuing a liberal arts degree within the next year. The sample proportion is 0.413. What is the margin of error for a 90% confidence interval for this sample? What is the lower endpoint for the 90% confidence interval?
Margin of error ≈ 0.027; lower endpoint ≈ 0.386 (q5)