STATS Test 3
CI
A study dealing with health care issues plans to take a sample survey of 1500 Americans to estimate the proportion who have health insurance and the mean dollar amount that Americans spent on health care this past year. Identify two population parameters that this study will estimate? The population proportion who have health insurance. The population mean dollar amount spent on health care this past year.
Identify two statistics that can be used to estimate these parameters? The sample mean. The sample proportion.
Mention two population parameters that this survey is trying to estimate? The proportion of adults in the country who primarily watch content time-shifted. The mean weekly time adults in the country spend watching content over the Internet.
Mention two corresponding statistics that will estimate these population parameters with the help of the survey? The proportion of adults in the sample who primarily watch content time-shifted. The mean weekly time that adults in the sample spend watching content over the Internet.
The sample mean number of hours watched online is an unbiased estimator for what parameter? It is an unbiased estimator for the mean weekly time adults in the country spend watching content over the Internet. An unbiased estimator is centered at the parameter it tries to estimate.
Use this example to explain why a point estimate alone is usually insufficient for statistical inference? An interval estimate gives us a sense of the accuracy of the point estimate whereas a point estimate alone does not.
Interpret the confidence interval in context? This is the interval containing the most believable values for the mean number of days that people have felt lonely in the last 7 days.
Compared to the interval for females, is there much evidence of a difference between the means? There is not much evidence of a difference between the means because the sample means and standard deviations are very similar, and the two intervals overlap significantly.
What assumptions are needed to construct a 95% confidence interval for μ? The data are obtained by randomization, and the population distribution is approximately normal.
Point out any assumptions that seem questionable? The data do not appear to be chosen from an approximately normal population distribution because the boxplot is skewed to the right.
Name two things you could do to get a narrower interval than the one in part a? Decrease the confidence level or increase the sample size.
Why is the 99% confidence interval wider than the 95% interval? The t-distribution critical value is larger with a higher confidence level.
On what assumptions is the interval in part a based? The data are obtained by randomization and the population distribution is approximately normal.
What must we assume to use these data to find a 95% confidence interval for the population mean cell phone price? The data are obtained by randomization and the population distribution is approximately normal.
The table shows the way software reports results. How was the standard error of the mean (SE Mean) obtained? Divide the standard deviation by the square root of the sample size.
Use the MINITAB report to explain how to interpret the 95% confidence interval in context? There is 95% confidence that the population mean cell phone price is between $614.293 and $647.263.
Is there evidence that the mean price is higher when purchased new? Yes, because all plausible values for the mean price of new phones are higher than the plausible values for the mean price of used phones.
Does this change your answer to part d? No, because all plausible values for the mean price of new phones are still higher than the plausible values for the mean price of used phones.
Is this a concern for the validity of the confidence interval? No, because the sample size is very large so the central limit theorem applies.
Interpret the confidence interval from the previous step? There is 90% confidence that the population mean for the number of hours per week people spend sending and answering e-mail is between these two values.
Explain why the population distribution may be skewed right? Since there will be many women that are at least 80 years of age who do not use e-mail at all but some who use e-mail frequently, the distribution is likely to be skewed right.
If the population distribution is skewed right, is the interval you obtained in b useless, or is it still valid? Valid.
What effect does sample size have on the margin of error? As sample size increases, the margin of error becomes smaller.
Does it seem plausible that the population distribution of this variable is normal? No, because there will be many students who do not read a newspaper but some students who read at least one newspaper every day. The distribution is likely to be skewed right.
Explain the implications of the term "robust" regarding the normality assumption made to conduct this analysis? The term "robust" means that even if the normality assumption is not completely met, this analysis is still likely to produce valid results.
A point estimate is the value of a statistic that estimates the value of a parameter.
The level of confidence represents the expected proportion of intervals that will contain the parameter if a large number of different samples of size n is obtained. It is denoted (1 − α) × 100%.
How does increasing the sample size affect the margin of error, E? As the sample size increases, the margin of error decreases.
How does increasing the level of confidence affect the size of the margin of error, E? As the level of confidence increases, the margin of error increases, so the interval gets wider.
Could we have computed the confidence intervals in parts (a)-(c) if the population had not been normally distributed? No, the population needs to be normally distributed.
If the sample size is 15, what conditions must be satisfied to compute the confidence interval? The sample data must come from a population that is normally distributed with no outliers.
Provide two recommendations for decreasing the margin of error of the interval? Increase the sample size. Decrease the confidence level.
With 95% confidence, the limits of the confidence interval contain the proportion of healthy people aged 18-49 who are vaccinated with the vaccine but still develop the illness.
What do the numbers in this interval represent? The numbers represent the most believable values for the population proportion.
The data must be obtained randomly, and the expected numbers of successes and failures must both be at least 15.
The data must be obtained randomly, the number of successes must be at least 15, and the number of failures must be at least 15.
With 95% confidence, the interval 0.598 to 0.643 contains the population proportion of adults in the country who were in favor of the death penalty.
Explain what the "95% confidence" refers to, by describing the long-run interpretation? If the same method is used to estimate the same population proportion many times, then about 95% of the intervals would contain the population proportion.
Is it safe to conclude that more than half of all adults in the country were in favor? Yes, since the confidence interval lies completely above 0.5.
The "Sample p" is the proportion of all respondents in the sample who believe stem cell research has merit, 1523/2109 ≈ 0.72. The "95% CI" is the 95% confidence interval and it means that we can be 95% confident that the interval 0.7030 to 0.7413 contains the population proportion.
The 99% confidence interval would be wider than a 95% confidence interval.
Treating the sample as a random sample from the population of all voters, would you predict the winner? The winner cannot be predicted because 0.50 does not fall outside of the 95% confidence interval.
Base your decision on a 99% confidence interval? The winner cannot be predicted because 0.50 does not fall outside of the 99% confidence interval.
Explain why you need stronger evidence to make the prediction when you want greater confidence? The more confident you want to be about the results, the wider the confidence interval will be.
A smaller sample results in a greater standard error, which results in a greater margin of error for the same proportions and confidence level, meaning less information is provided.
Is the interpretation reasonable? The interpretation is flawed. The interpretation provides no interval about the population proportion.
The interpretation is flawed. The interpretation indicates that the level of confidence is varying.
The interpretation is reasonable.
The interpretation is flawed. The interpretation suggests that this interval sets the standard for all the other intervals, which is not true.
There is 85% confidence that the proportion of the adult citizens of the nation that dreaded Valentine's Day is between 0.103 and 0.317.
Determine the population of interest? The population is all adults 19 years of age or older.
The variable of interest is bringing one's cell phone every trip to the bathroom. This variable is qualitative with two outcomes because individuals are classified based on a characteristic.
Why is the point estimate found in part (c) a statistic? Its value is based on a sample.
Why is the point estimate found in part (c) a random variable? Its value may change depending on the individuals in the survey.
What is the source of variability in the random variable? The individuals selected to be in the study.
We are 95% confident the proportion of adults 19 years of age or older who bring their cell phone every trip to the bathroom is between 0.206 and 0.258.
What ensures that the results of this study are representative of all adults 19 years of age or older? Random sampling.
The results are close because 0.54(1 − 0.54) = 0.2484 is very close to 0.25.
What does it mean to say the race was too close to call? The margin of error suggests candidate A may receive between 44% and 50% of the popular vote and candidate B may receive between 43% and 49% of the popular vote. Because the poll estimates overlap when accounting for margin of error, the poll cannot predict the winner.
What does "98% confidence" mean in a 98% confidence interval? If 100 different confidence intervals are constructed, each based on a different sample of size n from the same population, then we expect 98 of the intervals to include the parameter and 2 to not include the parameter.
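The stem cell card above reports 1523 of 2109 respondents and a 95% interval of 0.7030 to 0.7413. A minimal sketch of how such an interval could be computed follows. It assumes the software used the usual large-sample (Wald) formula, which the cards do not state, and `proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Wald) confidence interval for a population proportion.

    Assumes random sampling with at least 15 successes and 15 failures.
    """
    p_hat = successes / n
    # z critical value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low95, high95 = proportion_ci(1523, 2109, confidence=0.95)
low99, high99 = proportion_ci(1523, 2109, confidence=0.99)
print(f"95% CI: {low95:.4f} to {high95:.4f}")
print(f"99% CI: {low99:.4f} to {high99:.4f}")
```

The 99% interval comes out wider than the 95% interval because the critical value grows with the confidence level, which matches the cards on interval width.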
Probability
Find the probability that an observation is at least 1 standard deviation above the mean? 0.159
Find the probability that an observation is at least 1 standard deviation below the mean? 0.159
Find the probability that an observation is within 1 standard deviation of the mean? 0.683
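These three answers can be checked with the standard normal distribution. The sketch below assumes the observations follow a normal distribution, which the cards imply but do not state; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_above = 1 - z.cdf(1)           # at least 1 SD above the mean
p_below = z.cdf(-1)              # at least 1 SD below the mean
p_within = z.cdf(1) - z.cdf(-1)  # within 1 SD of the mean

print(round(p_above, 3), round(p_below, 3), round(p_within, 3))
# prints: 0.159 0.159 0.683
```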
Distribution
Is the population distribution of the duration of your phone calls likely to be bell shaped, right-, or left-skewed? Since there is a minimum but no maximum value, the distribution is skewed to the right.
You are on a shared wireless plan with your parents, who are statisticians. They look at some of your recent monthly statements that list each call and its duration and randomly sample 45 calls from the thousands listed there. They construct a histogram of the duration to look at the data distribution. Is this distribution likely to be bell shaped, right-, or left-skewed? Since there is a minimum but no maximum value, the distribution is skewed to the right.
From the sample of n=45 calls, your parents compute the mean duration. Is the sampling distribution of the sample mean likely to be bell shaped, right-, or left-skewed, or is it impossible to tell? Since the sample size is large, the distribution is approximately normal.
The distribution is approximately normal.
The number of people in a household.
The variable X is quantitative, because each observation is a numerical value that represents a magnitude of the variable.
It is probably skewed right because the company probably has many lower-level employees with lower incomes and a few upper-level employees with very high incomes.
The data distribution is probably skewed right because the population distribution is probably skewed right.
It is approximately normal because the sampling distribution of the sample mean is approximately normal when the sample size is sufficiently large.
It would not be unusual to observe an individual earning more than $100,000 because this is well within three population standard deviations of the mean.
It would be highly unusual to observe a sample mean income above $100,000 for a random sample of 100 people because this is well beyond three sampling distribution standard deviations from the mean.
Use technology to create a sampling distribution for the sample mean using sample size n=2. Take N=9000 repeated samples of size 2, and observe the histogram of the sample means. What shape does this sampling distribution have? The sampling distribution is triangular.
Now take N=9000 repeated samples of size 8. Explain how the variability and the shape of the sampling distribution change as n increases from 2 to 8? The sampling distribution is more normal, and the variability is smaller.
Now take N=9000 repeated samples of size 25. Explain how the variability and the shape of the sampling distribution change as n increases from 2 to 25? The sampling distribution is more normal, and the variability is much smaller.
Compare the results from parts a through c to the displayed example curves? The distributions from parts a through c roughly match the displayed curves.
Explain how the central limit theorem describes what has been observed in this problem? The sampling distribution of the mean became more and more normal as the sample size increased from 2 to 8 to 25, which the central limit theorem says should happen.
Suppose a simple random sample of size n is drawn from a large population with mean μ and standard deviation σ. The sampling distribution of x̄ has mean μx̄=______ and standard deviation σx̄=______? μx̄ = μ and σx̄ = σ/√n
The standard deviation of the sampling distribution of x̄, denoted σx̄, is called the standard error of the mean.
The distribution of the sample mean, x̄, will be normally distributed if the sample is obtained from a population that is normally distributed, regardless of the sample size. True
To cut the standard error of the mean in half, the sample size must be doubled? False. The sample size must be increased by a factor of four to cut the standard error in half.
Does the population need to be normally distributed for the sampling distribution of x̄ to be approximately normally distributed? Why? No, because the Central Limit Theorem states that regardless of the shape of the underlying population, the sampling distribution of x̄ becomes approximately normal as the sample size, n, increases.
What must be true regarding the distribution of the population? The population must be normally distributed.
What effect does increasing the sample size have on the probability? If the population mean is less than 71 minutes, then the probability that the sample mean of the time between eruptions is greater than 71 minutes decreases because the variability in the sample mean decreases as the sample size increases.
The population mean may be greater than 60. The population mean is 60, and this is just a rare sampling.
To compute probabilities regarding the sample mean using the normal model, what size sample would be required? The sample size needs to be greater than 30.
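The repeated-sampling exercise above (N=9000 samples of sizes 2, 8, and 25) can be imitated with a short simulation. The cards do not say which population the applet used; the sketch assumes a uniform(0, 1) population, which is consistent with the triangular shape reported for n=2. The seed and all names are illustrative.

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(0)  # fixed seed so the run is repeatable

POP_SD = sqrt(1 / 12)  # standard deviation of the assumed uniform(0, 1) population
N_REPEATS = 9000       # number of repeated samples, as in the exercise


def simulate_sample_means(n, repeats=N_REPEATS):
    """Return the means of `repeats` random samples of size n."""
    return [sum(random.random() for _ in range(n)) / n for _ in range(repeats)]


results = {}
for n in (2, 8, 25):
    means = simulate_sample_means(n)
    simulated_se = pstdev(means)       # spread of the simulated sample means
    theoretical_se = POP_SD / sqrt(n)  # sigma / sqrt(n)
    results[n] = (simulated_se, theoretical_se)
    print(f"n={n:2d}  simulated SE={simulated_se:.4f}  sigma/sqrt(n)={theoretical_se:.4f}")
```

The simulated spread should track σ/√n closely, and going from n=2 to n=8 (four times the sample size) halves the theoretical standard error, in line with the factor-of-four card.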
The normal curve is symmetric about its mean, μ.
The statement is true. The normal curve is a symmetric distribution with one peak, which means the mean, median, and mode are all equal. Therefore, the normal curve is symmetric about the mean, μ.
The area under the normal curve to the right of μ equals? 1/2
The histogram is not bell-shaped, so a normal distribution could not be used as a model for the variable.
What happens to the graph of the normal curve as the mean increases? The graph of the normal curve slides right.
What happens to the graph of the normal curve as the standard deviation decreases? The graph of the normal curve compresses and becomes steeper.
The notation zα is the z-score such that the area under the standard normal curve to the right of zα is? α
Describe the shape, mean, and standard deviation of the sampling distribution of the player's batting average after a season of 600 at-bats. Describe the shape of the sampling distribution? The distribution is bell-shaped, centered on a mean of 0.303, and the majority of the distribution lies within three standard deviations of the mean.
Explain why a batting average of 0.284 or 0.322 would not be especially unusual for this player's year-end batting average? Year-end batting averages of 0.322 and 0.284 lie one standard deviation from the mean. Therefore, it is not unlikely that a player with a career batting average of 0.303 would have a year-end batting average of 0.322 or 0.284.
Explain how the results in (b) indicate that the sample proportion is closer to the population proportion when the sample size is larger? When n is larger, the standard deviation is smaller, so the interval is smaller.
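The batting-average card says 0.284 and 0.322 lie one standard deviation from 0.303. A quick check, assuming each of the 600 at-bats is an independent trial with success probability 0.303 (an assumption the card implies but does not spell out):

```python
from math import sqrt

p = 0.303  # career batting average, treated as the population proportion
n = 600    # at-bats in the season

sd = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
one_sd_below = p - sd
one_sd_above = p + sd

print(f"SD = {sd:.4f}")
print(f"one SD below = {one_sd_below:.3f}, one SD above = {one_sd_above:.3f}")
```

Rounded to three decimals, the two endpoints match the 0.284 and 0.322 quoted in the card.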
Sampling distribution
What does this sampling distribution represent? It represents the probability distribution of the sample proportion of full-time students in a random sample of 325 students.
Choose the correct description of the mean of the sampling distribution? The expected value for the mean of a sample of size 36.
Choose the correct description of the standard deviation? The variability of the mean for samples of size 36.
The sample proportion, denoted p̂, is given by the formula p̂ = x/n, where x is the number of individuals with a specified characteristic in a sample of n individuals.
The population proportion and sample proportion always have the same value? False
The mean of the sampling distribution of p̂ is p? True
Suppose the random sample of 100 people is asked, "Are you satisfied with the way things are going in your life?" Is the response to this question qualitative or quantitative? The response is qualitative because the responses can be classified based on the characteristic of being satisfied or not.
Explain why the sample proportion, p̂, is a random variable. What is the source of the variability? The sample proportion p̂ is a random variable because the value of p̂ varies from sample to sample. The variability is due to the fact that different people feel differently regarding their satisfaction.
Choose the phrase that best describes the shape of the sampling distribution of p̂ below? Approximately normal because n ≤ 0.05N and np(1 − p) ≥ 10.
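The two conditions in the last card can be checked mechanically. In the sketch below, `describe_p_hat` is a hypothetical helper, and the values of p, n, and N in the examples are made up for illustration because the cards do not give them.

```python
from math import sqrt


def describe_p_hat(p, n, population_size):
    """Summarize the sampling distribution of the sample proportion."""
    independent = n <= 0.05 * population_size  # sample is at most 5% of the population
    large_enough = n * p * (1 - p) >= 10       # enough expected successes and failures
    return {
        "mean": p,                             # centered at the population proportion
        "sd": sqrt(p * (1 - p) / n),           # standard error of the sample proportion
        "approximately_normal": independent and large_enough,
    }


# Hypothetical inputs, not taken from the cards
ok_case = describe_p_hat(p=0.5, n=100, population_size=10_000)
small_case = describe_p_hat(p=0.05, n=100, population_size=10_000)
print(ok_case)
print(small_case)
```

In the second call np(1 − p) is only 4.75, so the normal approximation would not be justified even though the sample is a small fraction of the population.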