Chapter 3 STAT 1312
A politician asked his constituents via a Facebook poll to help him decide how he should vote on a bill that would establish 150m Protester Exclusion Zones around abortion clinics in New South Wales, Australia. On his Facebook page, Philip Donato stated "I believe this is a matter of conscience and the views of my electorate should supersede my own." (e) Is it correct to say that, based on the final Facebook poll, we are 95% confident that 78% ±0.7% of all of Philip Donato's constituents believe that Protester Exclusion Zones should be established in New South Wales? Explain your answer. Be careful not to confuse your personal opinion with the statistical issues.
It is not correct to make this statement because the Facebook poll yields a voluntary response sample and is likely biased. Extended Answer: It is not correct to make this statement because it is not possible to generalize to all of Donato's constituents: the poll was not a random sample of all of Donato's constituents. In general, a Facebook poll yields a voluntary response sample and is likely to be biased. Generalizations about a population cannot be made unless the sample is drawn randomly from the entire population. Even if the margin of error is smaller in the second poll, the margin of error formula requires a random sample from the population of interest to be valid. A large sample will not yield statistically accurate results if the sample is nonrandomly drawn from the population. Had the sample been random, the confidence statement would read, "We are 95% confident that between 77.3% and 78.7% of Donato's constituents believe that Protester Exclusion Zones should be established in New South Wales." But, in general, confidence statements cannot be made if the data were not drawn randomly.
A Gallup Poll conducted from February 28‑March 1, 2015, asked 1015 randomly selected adults, "How important is it that parents get their children vaccinated—extremely important, very important, somewhat important, or not at all important?" Gallup found that 54% of respondents said "extremely important." Even though 54% describes the relatively small sample of 1015 adults, compared to the 274 million adults in the United States, Gallup feels that on the basis of this one sample, they can conclude that the majority (more than 50%) of American adults feel that it is extremely important for parents to get their children vaccinated. To see why, we need to understand the variability of random samples of size 1015 drawn from the same population. In thinking about Gallup's sample of size 1015, we asked, "Could it happen that one random sample finds that 54% of adults feel that childhood vaccination is extremely important and a second random sample finds that only 42% do?" Look at the figure, which shows the results of 1000 samples of this size when the population truth is 𝑝 = 0.5, or 50%. Use the figure to support your reasoning. Would it be surprising if a sample from this population gave 54%?
No, because a value of 54% is not an unlikely sample proportion based on the histogram. Extended Answer: Proportions are surprising when they fall outside the range of expected values. A sample that gives a percentage of 54% would not be surprising because it is not an unlikely value and falls inside the range of expected values. A sample that gives a percentage of 42% would be surprising because it is an unlikely value and falls outside the range of expected values. The histogram shows the proportions obtained from 1000 samples of size 1015. If a specific proportion was not obtained in one of these 1000 samples, that proportion would be viewed as an extreme value. A proportion that is not obtained in one of the 1000 samples would also be unexpected and surprising. Common proportions are proportions obtained in one of the 1000 samples in the histogram. Therefore, proportions between 0.45 and 0.55 would be seen as not extreme and would not be a surprising result. Any proportions smaller than 0.45 or larger than 0.55 would be seen as extreme and would be a surprising result.
A politician asked his constituents via a Facebook poll to help him decide how he should vote on a bill that would establish 150m Protester Exclusion Zones around abortion clinics in New South Wales, Australia. On his Facebook page, Philip Donato stated "I believe this is a matter of conscience and the views of my electorate should supersede my own." (c) Using the information you calculated in parts (a) and (b), which snapshot of the poll do you think most accurately reflects the views of Donato's constituents? Explain your answer.
The snapshot in part (b) ought to be more accurate because the snapshot in part (b) included a greater number of Donato's constituents. Extended Answer: The snapshot in part (b) is more accurate because it has a larger sample size. Larger samples are more accurate because they more closely represent the whole population. The accuracy of a poll cannot be determined by the sample statistic. Earlier results of a poll tend to be less accurate because they have smaller sample sizes. The snapshot in part (a) resulted in a larger margin of error, not a smaller margin of error.
Would it be surprising if a sample gave 42%?
Yes, because a value of 42% is an unlikely sample proportion based on the histogram. Extended Answer: Proportions are surprising when they fall outside the range of expected values. A sample that gives a percentage of 54% would not be surprising because it is not an unlikely value and falls inside the range of expected values. A sample that gives a percentage of 42% would be surprising because it is an unlikely value and falls outside the range of expected values. The histogram shows the proportions obtained from 1000 samples of size 1015. If a specific proportion was not obtained in one of these 1000 samples, that proportion would be viewed as an extreme value. A proportion that is not obtained in one of the 1000 samples would also be unexpected and surprising. Common proportions are proportions obtained in one of the 1000 samples in the histogram. Therefore, proportions between 0.45 and 0.55 would be seen as not extreme and would not be a surprising result. Any proportions smaller than 0.45 or larger than 0.55 would be seen as extreme and would be a surprising result.
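The behavior of repeated random samples can be approximated with a short simulation. The sketch below is an illustration only and is not how the textbook figure was produced; the function name, the seed, and the use of Python's standard random module are assumptions. It draws 1000 simple random samples of size 1015 from a population with 𝑝 = 0.5 and reports how the sample proportions spread.

```python
import random


def simulate_sample_proportions(p=0.5, n=1015, num_samples=1000, seed=1):
    """Return sample proportions from repeated simple random samples.

    Each sample is n independent draws that are "yes" with probability p.
    The seed is an arbitrary choice so that repeated runs agree.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Count the "yes" answers in one simulated sample of size n.
        yes_count = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(yes_count / n)
    return proportions


if __name__ == "__main__":
    props = simulate_sample_proportions()
    inside = sum(1 for x in props if 0.45 <= x <= 0.55)
    print(f"smallest sample proportion: {min(props):.3f}")
    print(f"largest sample proportion:  {max(props):.3f}")
    print(f"samples between 0.45 and 0.55: {inside} of {len(props)}")
```

In a run like this, nearly all of the simulated proportions should land between 0.45 and 0.55, which is the reasoning behind calling 54% plausible and 42% surprising.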
A state representative wants to know how voters in his district feel about enacting a statewide smoking ban in all enclosed public places, including bars and restaurants. His staff mails a questionnaire to a simple random sample of 800 voters in his district. Of the 800 questionnaires mailed, 152 were returned. Of the 152 returned questionnaires, 101 support the enactment of a statewide smoking ban in all enclosed public places. What is the statistic?
101/152 = 66.4% Extended Answer: Of the 152 questionnaires returned, 101 support the smoking ban. 66.4% is a statistic that estimates the true population percentage of voters who support the smoking ban. A parameter describes an entire population, not a sample. 19.0% (152/800) is the response rate, the percentage of the voters contacted who returned the questionnaire; it says nothing about support for the ban. Statistics and parameters need to be logical quantities, otherwise they have no meaning. 12.6% (101/800) is the percentage of responses supporting the ban out of all voters contacted, a meaningless quantity because 648 of the voters contacted did not respond.
In July 2018, the Gallup Poll asked a random sample of 1033 American adults, "Has drug abuse ever been a cause of trouble in your family?" The poll found that 30% of respondents said "yes," a record high percentage since the question started being asked in 1995. Suppose that the sample size had been 3500 rather than 1033. Find the margin of error for 95% confidence for the larger sample. Give your answer to three decimal places.
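The three percentages that can be formed from the smoking-ban survey counts are easy to mix up, so it may help to compute them side by side. This is a minimal sketch; the function name and dictionary keys are hypothetical and chosen only for illustration.

```python
def survey_percentages(mailed, returned, in_favor):
    """Compute the percentages that can be formed from the survey counts."""
    return {
        # The statistic: support among the questionnaires actually returned.
        "statistic": round(100 * in_favor / returned, 1),
        # The response rate: returned questionnaires out of those mailed.
        "response_rate": round(100 * returned / mailed, 1),
        # Voters who were contacted but never answered.
        "nonrespondents": mailed - returned,
    }


if __name__ == "__main__":
    print(survey_percentages(mailed=800, returned=152, in_favor=101))
```

Only the first value describes the sample that was actually observed; the response rate is useful mainly as a warning about possible nonresponse bias.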
Margin of Error: 0.017 Extended Answer: For 95% confidence, the margin of error is approximately equal to 1/√𝑛, where 𝑛 is the size of the observed sample. If the sample size is 3500, the margin of error is 1/√3500 ≈ 1/59.161 ≈ 0.017 (that is, 1.7%). The margin of error means that for results based on a sample of 3500 adults, one can say with 95% confidence that the error attributable to sampling and other random effects could be plus or minus 1.7 percentage points for all American adults.
In July 2018, the Gallup Poll asked a random sample of 1033 American adults, "Has drug abuse ever been a cause of trouble in your family?" The poll found that 30% of respondents said "yes," a record high percentage since the question started being asked in 1995. What is the approximate margin of error for 95% confidence? Give your answer to three decimal places.
Margin of Error: 0.031 Extended Answer: The margin of error occurs due to variability from sample to sample. Apply a quick and approximate method to find the margin of error. Use the sample proportion 𝑝̂ from a simple random sample of size 𝑛 to estimate an unknown population proportion 𝑝. The margin of error for 95% confidence is approximately equal to 1/√𝑛. Substitute 1033 for the sample size and compute the margin of error. 1/√1033 ≈ 1/32.14 ≈ 0.031 (that is, 3.1%). That means if many samples are taken, 95% of the samples would give a result within plus or minus 3.1 percentage points of the truth about all American adults.
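The quick method is a one-line calculation. The following sketch, with a hypothetical function name, applies it to the two sample sizes from the Gallup drug-abuse question (1033 and 3500); it assumes a simple random sample and 95% confidence, as the quick method does.

```python
import math


def quick_margin_of_error(n):
    """Approximate 95% margin of error for a proportion from an SRS of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)


if __name__ == "__main__":
    for n in (1033, 3500):
        # Prints about 0.031 for n=1033 and about 0.017 for n=3500.
        print(f"n={n}: margin of error is about {quick_margin_of_error(n):.3f}")
```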
The National Health Interview Survey (NHIS) telephone survey is conducted annually in the United States. Of the first 100 numbers dialed, 55 numbers were for wireless telephones. This is not surprising, because, as of the second half of 2016, 50.8% of all U.S. households had only wireless telephones. Classify each of the two numbers as a parameter or a statistic.
Parameter: 50.8% Statistic: 55% Extended Answer: A parameter is a fixed number that describes the population. The population of interest is all U.S. households. 50.8% is a parameter because this result was obtained from the population. A statistic is a number that describes a sample. The purpose of sampling is to use a sample to gain information about a population. In this way a sample statistic is often used to estimate an unknown population parameter. The observed sample is 100 households. The information collected from this sample can be used to draw conclusions about all U.S. households. Because 55 of 100 (that is, 55%) answers were collected through wireless phones, this value is a sample statistic.
Just before a presidential election, a national opinion poll increases the size of its weekly random sample from the usual 1000 people to 4000 people. Does the larger random sample reduce the bias and the variability of the poll result?
The larger sample reduces variability but not bias. Both samples are unbiased because random sampling was used. Extended Answer: Recall the definitions of bias and variability. Bias is consistent, repeated deviation of the sample statistic from the population parameter in the same direction when many samples are taken. In other words, bias is a systematic overestimate or underestimate of the population parameter. Variability describes how the values of the sample statistic will vary when many samples are taken. Large variability means that the result of sampling is not repeatable. Biased sampling can be caused by favoritism of the sampler or self‑selection of respondents. No matter how large the sample size is, if some part of the population is eliminated from consideration, the result of the study can be far from the true population parameter. To reduce bias, use random sampling. Variability shows how spread out the values of the statistic are from sample to sample. The larger the sample, the more closely those values cluster around the truth. So, variability depends on the sample size. To reduce the variability, use a larger sample. The margin of error is due to variability from sample to sample. According to the estimation rule, to cut the margin of error in half, a sample four times as large should be used. So, the statement that increasing the sample size decreases the margin of error but not the variability is incorrect. Therefore, the correct statement is "The larger sample reduces variability but not bias. Both samples are unbiased because random sampling was used."
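The claim that a sample four times as large halves the variability, while leaving bias unchanged, can be checked with a small simulation. This is a sketch under stated assumptions: the population proportion 𝑝 = 0.5, the number of repetitions, the seed, and all function names are hypothetical choices for illustration, not values from the poll.

```python
import random
import statistics


def sample_proportions(p, n, num_samples, rng):
    """Simulate num_samples random samples of size n; return each sample's proportion."""
    return [
        sum(1 for _ in range(n) if rng.random() < p) / n
        for _ in range(num_samples)
    ]


def compare_sample_sizes(p=0.5, sizes=(1000, 4000), num_samples=400, seed=7):
    """Return {n: (mean, standard deviation)} of simulated sample proportions."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        props = sample_proportions(p, n, num_samples, rng)
        # The mean shows bias (distance from p); the spread shows variability.
        summary[n] = (statistics.mean(props), statistics.pstdev(props))
    return summary


if __name__ == "__main__":
    for n, (center, spread) in compare_sample_sizes().items():
        print(f"n={n}: mean={center:.4f}, standard deviation={spread:.4f}")
```

Both sample sizes should give a mean close to the assumed 𝑝, since random sampling is unbiased at any size, while the standard deviation for 𝑛 = 4000 should be roughly half of that for 𝑛 = 1000.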
In July 2018, the Gallup Poll asked a random sample of 1033 American adults, "Has drug abuse ever been a cause of trouble in your family?" The poll found that 30% of respondents said "yes," a record high percentage since the question started being asked in 1995. How does the margin of error for 95% confidence for the sample size of 3500 compare with the margin of error for a sample of size 1033?
The margin of error for 95% confidence, when the sample size is 3500, is SMALLER THAN the margin of error when the sample size is 1033. Extended Answer: In the previous steps, the margin of error for both samples was computed using the approximate method. Compare these values. 1/sqrt(1033) > 1/sqrt(3500) The margin of error for 95% confidence when the sample size is 3500 is smaller than the margin of error when the sample size is 1033.
An online store contacts 1500 customers from its list of customers who have purchased in the last year and asks the customers if they are very satisfied with the store's website. One thousand (1000) customers respond, and 696 of the 1000 say that they are very satisfied with the store's website. What is the parameter?
The percentage of all customers who purchased in the last year who would have replied they are very satisfied with the store's website. Extended Answer: Of all the customers in the last year, the percentage who were very satisfied is the parameter because it is a fixed value describing the entire population. The 696 very satisfied responses out of the 1000 total responses is an approximation, or estimate, of the parameter. The statistic (69.6%) is subject to sampling variability: if the store conducted another survey, it would probably get a slightly different result, since the second survey would contact different customers. If enough random samples are surveyed, the average of the sample statistics will closely approximate the parameter with very little bias. The unknown percentage of very satisfied responses out of the 1500 customers contacted does not represent either a parameter or a statistic.
A politician asked his constituents via a Facebook poll to help him decide how he should vote on a bill that would establish 150m Protester Exclusion Zones around abortion clinics in New South Wales, Australia. On his Facebook page, Philip Donato stated "I believe this is a matter of conscience and the views of my electorate should supersede my own." (d) Consider the following statement that appeared in the Herald article by Labor MP Penny Sharpe: "I would encourage everyone in the Central West who thinks that women should be able to go to the doctor with privacy and without interference to support the poll and encourage Mr. Donato to vote for the bill."
The poll is likely less accurate because the statement encouraged more people to support the bill by rewording what the bill says. Extended Answer: The poll is less accurate because the statement encouraged more people to support the bill by rewording what the bill stated. Abortion clinics can be a very polarizing topic. By not mentioning "abortion clinic" in her statement, Sharpe was most likely able to get more individuals to support the bill and answer the poll. This biases the results of the poll, since some people may have supported Sharpe's statement but not the specifics of the bill. Biased results are less accurate. This will likely result in an overestimate of the true proportion of the population who support the bill. This statement most likely increased the number of people who answered the poll. However, because the respondents were not a random sample, the added responses do not increase the accuracy of the poll. It is possible for newspapers to influence Facebook polls, since individuals read newspapers and use Facebook.
The Ministry of Health in the Canadian province of Ontario wants to know whether the national health care system is achieving its goals in the province. Much information about health care comes from patient records, but that source does not allow us to compare people who use health services with those who do not. So the Ministry of Health conducted the Ontario Health Survey, which interviewed a random sample of 61,239 people who live in the province of Ontario.
The population for this sample survey is all people living in the province of Ontario and the sample is the 61,239 people who answered the survey. Extended Answer: The population in a statistical study is the entire group of individuals about which the researchers want information. A sample is the part of the population from which the researchers actually gather information and is used to draw conclusions about the whole. The Ministry of Health was interested in the percentage of residents of the province of Ontario who used any kind of medical service, so the population is all people living in the province of Ontario. The sample is the 61,239 people who answered the survey.
A November 2017 Gallup Poll of 1028 U.S. adults found that 627 are satisfied with the total cost they pay for their health care. The announced margin of error is ±4 percentage points. The announced confidence level is 95%. Make a confidence statement about the population parameter 𝑝.
We are 95% confident that the percentage of all U.S. adults who are satisfied with the total cost they pay for their health care is between 57% and 65%. Extended Answer: The conclusion of a confidence statement always applies to the population, not to the sample. A confidence statement has two parts: a margin of error and a level of confidence. The margin of error states how close the sample statistic lies to the population parameter. However, it is not completely certain that the true population parameter differs from the estimate by no more than the margin of error. The level of confidence states what percentage of all possible samples satisfy the margin of error. Here the sample percentage is 627/1028 ≈ 61% and the announced margin of error is ±4 percentage points, so the interval runs from 61% − 4% = 57% to 61% + 4% = 65%.
The Ministry of Health in the Canadian province of Ontario wants to know whether the national health care system is achieving its goals in the province. Much information about health care comes from patient records, but that source does not allow us to compare people who use health services with those who do not. So the Ministry of Health conducted the Ontario Health Survey, which interviewed a random sample of 61,239 people who live in the province of Ontario. The survey found that 76% of males and 86% of females in the sample had visited a general practitioner at least once in the past year. Do you think these estimates are close to the truth about the entire population?
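A confidence statement combines the sample percentage, the announced margin of error, and the confidence level. The sketch below assembles one from the Gallup health care counts; the function name and the wording of the returned sentence are hypothetical, and the margin is taken as announced rather than computed.

```python
def confidence_statement(successes, n, margin_points, level=95):
    """Build a confidence statement from a sample count and an announced margin.

    margin_points is the announced margin of error in percentage points.
    """
    percent = round(100 * successes / n)  # sample percentage, to the nearest point
    low, high = percent - margin_points, percent + margin_points
    text = (
        f"We are {level}% confident that between {low}% and {high}% "
        "of the population fall in the category of interest."
    )
    return low, high, text


if __name__ == "__main__":
    low, high, text = confidence_statement(successes=627, n=1028, margin_points=4)
    print(text)
```

For 627 satisfied adults out of 1028 and a margin of ±4 points, the interval runs from 57% to 65%. Note that the statement is about the population, not about the 1028 people sampled.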
Yes, because the sample was randomly chosen. Extended Answer: Large random samples almost always give an estimate that is close to the truth. The sample was chosen at random from the population, so the results of the sample are accurate. The sample was large, so the results of the sample are precise. Looking at many such samples would show a pattern of low bias and low variability, which is exactly what researchers want when estimating facts about a population based on a sample.
The figure shows the behavior of a sample statistic in many samples in four situations. The heights of the bars show how often the sample statistic took various values in many samples from the same population. The true value of the population parameter is marked on each graph. Label each of the graphs in the figure as showing high or low bias and as showing high or low variability.
high bias, high variability: scattered, and centered away from the population parameter; low bias, high variability: widely scattered, but centered on the population parameter; low bias, low variability: tightly clustered around the population parameter; high bias, low variability: tightly clustered on one side of the population parameter. Extended Answer: Bias is consistent, repeated deviation of the sample statistic from the population parameter in the same direction when we take many samples. In other words, bias is a systematic overestimate or underestimate of the population parameter. Variability describes how the values of the sample statistic will vary when we take many samples. Large variability means that the result of sampling is not repeatable. A good sampling method has both small bias and small variability. In the first histogram, the center of the sample statistic values is far to the left of the actual population parameter, which indicates high bias. The values of the sample statistic vary widely, so the variability is also high. In the second histogram, the center of the sample statistic values is close to the actual population parameter, so the bias is low. The values of the sample statistic do not vary much, which indicates low variability. In the third histogram, the center of the sample statistic values is close to the actual population parameter, which indicates low bias. The values of the sample statistic vary widely, so the variability is high. In the fourth histogram, the center of the sample statistic values is far to the right of the actual population parameter, so the bias is high. The values of the sample statistic do not vary much, which indicates low variability.
A politician asked his constituents via a Facebook poll to help him decide how he should vote on a bill that would establish 150m Protester Exclusion Zones around abortion clinics in New South Wales, Australia. On his Facebook page, Philip Donato stated "I believe this is a matter of conscience and the views of my electorate should supersede my own." (b) After the poll closed, the final Facebook poll results had 78% of the 19600 voters indicating support for the Protester Exclusion Zones. Use our quick and approximate method to determine the margin of error for the final Facebook poll results. Give your answer as a percentage precise to two decimal places.
margin of error: 0.71% Extended Answer: The quick approximation of the margin of error is one divided by the square root of the sample size. The sample size is the 19600 voters who participated in the Facebook poll. Calculate the margin of error as 1/√𝑛 = 1/√19600 = 1/140 ≈ 0.00714. To express it as a percentage, multiply the margin of error by 100: 0.00714 ⋅ 100 = 0.714, which rounds to 0.71%.
The Ministry of Health in the Canadian province of Ontario wants to know whether the national health care system is achieving its goals in the province. Much information about health care comes from patient records, but that source does not allow us to compare people who use health services with those who do not. So the Ministry of Health conducted the Ontario Health Survey, which interviewed a random sample of 61239 people who live in the province of Ontario. Estimate the margin of error for conclusions having 95% confidence about the entire adult population of Ontario. Give your answer to three decimal places.
margin of error: 0.004 Extended Answer: The margin of error occurs due to variability from sample to sample. For 95% confidence, it is approximately equal to 1/√𝑛, where 𝑛 is the size of the observed sample. 1/√61239 ≈ 1/247.465 ≈ 0.004 (that is, 0.4%). The margin of error means that if a large number of samples were obtained using the same method, the proportion of people who use health services among all Ontario residents would fall within plus or minus 0.4 percentage points of the sample result 95% of the time.
A politician asked his constituents via a Facebook poll to help him decide how he should vote on a bill that would establish 150m Protester Exclusion Zones around abortion clinics in New South Wales, Australia. On his Facebook page, Philip Donato stated "I believe this is a matter of conscience and the views of my electorate should supersede my own." (a) According to The Sydney Morning Herald, as of 1:00 p.m. June 6, 2018 (the day before Donato's Facebook poll closed), 74% of the 7700 poll voters supported the Protester Exclusion Zones. Use our quick and approximate method to determine the margin of error for the poll results as of 1:00 p.m. on June 6, 2018. Give your answer as a percentage precise to two decimal places.
margin of error: 1.14% Extended Answer: The quick approximation of the margin of error is one divided by the square root of the sample size. The sample size is the 7700 voters who participated in the Facebook poll. Calculate the margin of error as 1/√𝑛 = 1/√7700 ≈ 0.0114. To express it as a percentage, multiply the margin of error by 100: 0.0114 ⋅ 100 = 1.14.
In October 2017, the Gallup Poll asked a sample of 1028 U.S. adults, "Are you in favor of the death penalty for a person convicted of murder?" Suppose that the margin of error needs to be half as large for this poll. How many people must be interviewed? Note that the size of the new sample is denoted by 𝑛. Give your answer as a whole or exact number.
𝑛 = 4112 Extended Answer: To estimate the sample size, use the approximation rule. A quick and approximate method for the margin of error is to use the sample proportion 𝑝̂ from a simple random sample of size 𝑛 to estimate an unknown population proportion 𝑝. The margin of error for 95% confidence is approximately equal to 1/√𝑛. Denote the new sample size by 𝑛. Write the equation using the fact that the margin of error for the new sample should be half of the value for the initial sample: 1/√𝑛 = (1/2) ⋅ (1/√1028). Taking reciprocals gives √𝑛 = 2 ⋅ √1028, and squaring both sides gives 𝑛 = 4 ⋅ 1028 = 4112. So, 4112 people must be interviewed.
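Because the quick margin of error is 1/√𝑛, dividing the margin by some factor requires multiplying the sample size by the square of that factor. A minimal sketch of that rule follows; the function and parameter names are hypothetical.

```python
import math


def required_sample_size(current_n, shrink_factor):
    """Sample size needed to divide the quick margin of error by shrink_factor.

    Since the quick margin is 1/sqrt(n), dividing it by k means multiplying n by k**2.
    """
    if current_n <= 0 or shrink_factor <= 0:
        raise ValueError("sample size and shrink factor must be positive")
    # Round up so the new margin is never larger than requested.
    return math.ceil(current_n * shrink_factor ** 2)


if __name__ == "__main__":
    # Halving the margin for the sample of 1028 adults prints 4112.
    print(required_sample_size(current_n=1028, shrink_factor=2))
```

The same function shows why precision gets expensive: cutting the margin to one third would take nine times as many interviews.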
A November 2017 Gallup Poll of 1028 U.S. adults found that 627 are satisfied with the total cost they pay for their health care. The announced margin of error is ±4 percentage points. The announced confidence level is 95%. What is the value of the sample proportion 𝑝̂ who say they are satisfied with the total cost they pay for their health care? Give your answer to three decimal places.
𝑝̂ = 0.610 Extended Answer: According to the study, 627 out of the sample of 1028 U.S. adults were satisfied with the total cost of their health care. Find the sample proportion as the ratio of the number of adults who are satisfied with the total cost of health care to all adults in the sample. 𝑝̂ = 627/1028 ≈ 0.6099, which rounds to 0.610. Thus, the sample proportion is about 0.610, or 61%.