05.04 Confidence Intervals for Proportions
margin of error
the maximum expected difference between the true population parameter and a sample estimate of that parameter
point estimate
the statistic itself, such as the sample mean, sample median, or sample proportion given as an estimate of the population parameter of interest
In a confidence interval for a proportion, the margin of error is one piece of the formula itself:
±z* sqrt(p̂(1−p̂)/n)
Constructing Confidence Interval example 2 An AP Statistics teacher asks her students whether they text while driving. In a sample of 50 students (sampled randomly from her 700 students), 35 say they text while driving. Construct a 95% confidence interval for the true proportion of the students who text while driving.
1. Parameter: Estimate the population proportion, p, of the students who text while driving.
2. Conditions: Simple random sample: The problem states the sample was chosen at random. Independence: There are more than 10(50) = 500 students. (You know there are 700 students.) Normality: p̂ = 35/50 = 0.70; np̂ = 50(0.70) = 35 and 35 ≥ 10; n(1−p̂) = 50(0.30) = 15 and 15 ≥ 10. Because the conditions check, you calculate using a one-sample z-interval.
3. Calculations: p̂ ± z* sqrt(p̂(1−p̂)/n) = 0.70 ± 1.960 sqrt[(0.70)(1−0.70)/50] = 0.70 ± 0.1270, which gives (0.573, 0.827).
4. Conclusion: We are 95% confident that the true population proportion, p, of the students who text while driving is between 0.573 and 0.827.
Constructing Confidence Interval example 1 Suppose a sample of 100 Ford Mustangs is chosen at random. In the sample, there are 12 defective airbags. Construct a 98% confidence interval for the proportion of defective airbags Ford produces.
1. Parameter: You want to estimate the population proportion, p, of defective airbags in Ford Mustangs that Ford Motor Company produces.
2. Conditions: Simple random sample: The problem states the sample was chosen at random. Independence: You can assume there are more than 10(100) = 1,000 Ford Mustangs that Ford Motor Company produces. Normality: p̂ = 12/100 = 0.12; np̂ = 100(0.12) = 12 and 12 ≥ 10; n(1−p̂) = 100(0.88) = 88 and 88 ≥ 10. Because the conditions check, you need to name the procedure that you'll be using. In this case, you have only one sample proportion with which to work, so the procedure is a one-sample z-interval. If you had two proportions to compare, it would be appropriate to use the two-sample z-interval (more about that procedure later).
3. Calculations: To construct the one-proportion z-interval by hand, your work should look like this: p̂ ± z* sqrt(p̂(1−p̂)/n) = 0.12 ± 2.326 sqrt[(0.12)(1−0.12)/100] = 0.12 ± 0.0756, which gives (0.0444, 0.1956). To construct the one-proportion z-interval using a calculator, your work should look like this: Step 1: Select [STAT] and select TESTS. Scroll down to choice A:1-PropZInt. Step 2: Where prompted by the calculator, enter x, the number of successes; n, the number of trials; and C-Level, the confidence level (in decimal form). Step 3: Select Calculate.
4. Conclusion: We are 98% confident that the true population proportion, p, of defective Ford Mustang airbags that Ford Motor Company produces is between 0.0444 and 0.1956.
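If you want to check the hand or calculator work in software, the same inputs the calculator asks for (x, n, and C-Level) are enough. The sketch below is only an illustration, not the calculator's actual routine: the function name one_prop_z_int is made up for this example, and it assumes Python's standard-library NormalDist is an acceptable stand-in for the Standard Normal table.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_int(x, n, c_level):
    """Hypothetical helper: takes the same inputs as 1-PropZInt
    (x successes, n trials, C-Level as a decimal)."""
    p_hat = x / n  # point estimate
    # Area to the left of z* = confidence level plus the lower tail.
    z_star = NormalDist().inv_cdf(c_level + (1 - c_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - margin, p_hat + margin


# Example 1: 12 defective airbags in a sample of 100, 98% confidence.
lower, upper = one_prop_z_int(x=12, n=100, c_level=0.98)
print(f"({lower:.4f}, {upper:.4f})")  # prints (0.0444, 0.1956)
```

The code uses the unrounded critical value rather than 2.326, so later decimal places can differ slightly from hand work.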
When creating and interpreting confidence intervals, it is always important to follow these four steps:
1. Parameter: You must first identify the population of interest and define the parameter of interest being estimated. For example, "I want to estimate the proportion, p, of [context of problem]."
2. Conditions: There are three conditions (assumptions) you must always verify: simple random sample, independence, and Normality. (Need a quick way to remember these three items? Just think, "It's a SIN not to check the conditions!") Simple random sample: If a simple random sample is stated in the problem, you must state it in your response. If a simple random sample is not stated in the problem, you must note, "We are not told whether there is a simple random sample of all [context of problem], so we will proceed with caution." Independence: The population must be greater than 10n. You must note, "We can assume there are more than 10(n) [context of the problem]." Normality: You must verify and note that the Normal conditions of np̂ ≥ 10 and n(1−p̂) ≥ 10 have been met. After you check the conditions, you must always name the appropriate confidence interval.
3. Calculations: You must always show your work! If you use the formula, you must show all the calculations and equations. If you use a calculator, you must name the process you are using, and show and describe all input information and all output information. If data are provided, you must always graph the data and describe them.
4. Conclusion: You must always interpret the results of the confidence interval in the context of the problem and make a connection to the given information. For example, you would note, "I am [confidence level]% confident the true population proportion, p, of [context of problem] is between [lower value] and [upper value]."
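The two numerical conditions in step 2 can be checked mechanically. This is a minimal sketch under stated assumptions: the name check_conditions and its arguments are invented for illustration, and the simple random sample condition cannot be computed, so it is passed in as something you read from the problem.

```python
def check_conditions(x, n, population_size=None, random_sample=True):
    """Hypothetical helper for the SIN check on a one-sample z-interval.

    x: number of successes, n: sample size.
    population_size: None when the problem does not give it.
    """
    return {
        # S: read from the problem statement, not computed.
        "simple_random_sample": random_sample,
        # I: population must be greater than 10n; None means "must be assumed".
        "independence": None if population_size is None else population_size > 10 * n,
        # N: n*p-hat equals the count of successes x, and n*(1 - p-hat)
        # equals the count of failures n - x.
        "normality": x >= 10 and (n - x) >= 10,
    }


# Example 2: 35 of 50 students, sampled from 700 students.
print(check_conditions(x=35, n=50, population_size=700))
```

When the result for independence is None, you still need to write the "We can assume there are more than 10(n)..." sentence in your response.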
Example 3 In a poll, McDougall's fast-food restaurant asked 1,000 randomly selected customers whether they requested sauce with their meals. Of the sample, 72% said they did.
1. Find the margin of error if you want to be 90% confident in your estimate of the percentage of all customers who order sauce with their meals.
2. If McDougall's wants to be 98% confident, would the confidence interval get wider or narrower?
3. McDougall's margin of error is ±2%. If it wants it to be ±1%, would the confidence level be higher or lower?
4. If McDougall's polls more people, but keeps the 90% confidence level, is the confidence interval's margin of error larger or smaller?
1. With n = 1,000 and p̂ = 0.72, σp̂ = sqrt(p̂(1−p̂)/n) = sqrt(0.72(1−0.72)/1,000) = 0.0142. For a 90% confidence level, z* = 1.645, so the margin of error (or ME) is Margin of error = (Critical value)(Standard deviation) = 1.645(0.0142) = 0.02336. The margin of error you need to use to be 90% confident in your estimate of the percentage of all customers who order sauce with their meals is ±2.34%.
2. The confidence interval is wider because the critical value changes to z* = 2.326. When you multiply this critical value by the same standard deviation, you get a larger margin of error, which means the confidence interval gets wider.
3. The confidence level is lower. With the same sample, the standard deviation does not change, so a smaller margin of error requires a smaller critical value, and a smaller critical value corresponds to a lower confidence level.
4. The confidence interval's margin of error is smaller because using a larger value for n in the formula results in a smaller standard deviation, which makes the margin of error smaller.
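The four answers can be checked numerically. The sketch below is illustrative only: the function names are made up, the larger sample of 4,000 in part 4 is an assumed value (the problem only says "more people"), and the unrounded critical values are used, so results differ slightly from work done with 1.645 and 0.0142.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, c_level):
    """Hypothetical helper: (critical value) x (standard deviation of p-hat)."""
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


def implied_confidence_level(p_hat, n, margin):
    """Hypothetical helper: confidence level that yields a given margin."""
    z = margin / sqrt(p_hat * (1 - p_hat) / n)
    return 2 * NormalDist().cdf(z) - 1


me_90 = margin_of_error(0.72, 1000, 0.90)            # part 1
me_98 = margin_of_error(0.72, 1000, 0.98)            # part 2: wider
level_2pct = implied_confidence_level(0.72, 1000, 0.02)
level_1pct = implied_confidence_level(0.72, 1000, 0.01)  # part 3: lower
me_90_bigger_n = margin_of_error(0.72, 4000, 0.90)   # part 4: assumed n = 4,000

print(f"90% ME: {me_90:.4f}, 98% ME: {me_98:.4f}")
print(f"level for 2% ME: {level_2pct:.3f}, level for 1% ME: {level_1pct:.3f}")
print(f"90% ME with n = 4000: {me_90_bigger_n:.4f}")
```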
Practice 2 You are going to construct a 95% confidence interval for p. Which critical value will you use?
1.96
Practice 4 US Weekly magazine claims that 32% of its readers believe Angelina Jolie was best dressed at the Academy Awards. To verify this claim, how large a sample is needed for a 0.05 margin of error at a 95% confidence level?
335
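The answer comes from solving the margin of error formula for n. A worked version, using the claimed 32% as the value of p and z* = 1.96 for 95% confidence:

```latex
\[
ME = z^{*}\sqrt{\frac{p(1-p)}{n}}
\quad\Longrightarrow\quad
n = \left(\frac{z^{*}}{ME}\right)^{2} p(1-p)
  = \left(\frac{1.96}{0.05}\right)^{2}(0.32)(0.68)
  \approx 334.4
\]
```

Because a sample size must be a whole number and rounding down would give a margin of error slightly larger than 0.05, round up to 335.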
IMPORTANT
A confidence level, such as 95%, describes the uncertainty associated with a sampling method, whereas a confidence interval, expressed as a range of values, describes the amount of uncertainty associated with a sample estimate of a population parameter.
Critical Value example An online retailer promises to deliver orders to premium members within two days. One thousand randomly selected customer packages were tracked. The results show that a 95% confidence interval for the proportion of orders arriving on time is 90% ± 4%. What does this mean?
Based on these results, you can be 95% confident that between 86% and 94% of packages get delivered on time.
Example 4 Customers purchase about 60% of Gorge clothing through its website. The company is preparing to introduce a new line of jeans and wants to generate a 95% confidence interval for the proportion of customers who will purchase the product online. They want to be accurate within 3%. How many customers does the company need to sample?
Because you have a target margin of error, 3%, this problem can be solved by substituting the percentage for the margin of error and then solving for n, the sample size: 0.03 = 1.960 sqrt[(0.60)(0.40)/n], so n = (1.960/0.03)²(0.60)(0.40) ≈ 1,024.4267. Gorge needs to sample 1,025 customers. (Because there is no such thing as 0.4267 of a person, round up the answer to 1,025 customers to ensure a margin of error of no more than 3%.)
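The same "solve for n, then round up" reasoning can be written as a short function. This is a sketch, not a standard routine: required_sample_size is a made-up name, and it uses the unrounded critical value instead of 1.960, which changes the decimal part of n but not the rounded-up answer here.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(p_guess, margin, c_level):
    """Hypothetical helper: smallest n whose margin of error is at most `margin`.

    p_guess is the prior estimate of the proportion (0.60 for Gorge).
    """
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return ceil(n_exact)  # always round up, never to the nearest integer


print(required_sample_size(p_guess=0.60, margin=0.03, c_level=0.95))  # prints 1025
```

Calling it with p_guess=0.32 and margin=0.05 at 95% confidence gives 335, which matches Practice 4.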
What if you were interested in finding the critical value to construct a 95% confidence interval? There are two ways to find the critical values: using a table (review Table A from the AP Resource Packet) or a calculator.
Calculator: Recall that invNorm works backward from an area under a Normal curve to the value that cuts off that area. To find the critical value on the calculator, follow these steps.
Step 1: Select 2nd VARS [DISTR] and scroll down to 3:invNorm( and press [ENTER].
Step 2: Enter the area under the curve to the left of z*, the mean, and the standard deviation. The standard format is invNorm(area, µ, σ). For a Standard Normal curve, the mean is always zero and the standard deviation is one. For the area, you need to think a bit. In the case of a 95% confidence interval, the shaded region has an area of 0.95 (or 95%). This means 5% remains outside the shaded area: 2.5% in each of the left and right tails. You are looking for the critical value that corresponds with the upper limit on the graph. When you put the area in the calculator, be sure to account for the entire area below the upper value in the confidence interval. This means you need to account for the remaining 2.5% (or 0.025) in the lower tail of the curve. So using a calculator, enter the total area of the shaded region plus the left tail: 0.95 + 0.025 = 0.975. You will get the value 1.959963986, or about 1.96.
By Hand: To find the critical value, you can also use the AP Resource Packet and look at Table B. To use this table, look at the bottom row and find the value that is above the confidence level in which you are interested. In this case, a 95% confidence interval gives the value 1.960 (the same value you found using a calculator). The critical value you found for a 95% confidence interval is z* = 1.96. In this example, the critical value (z*) tells you that 95% of a Normal model is found within ±1.96 standard deviations of the mean.
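The invNorm reasoning (shaded area plus the left tail) carries over directly to software. The sketch below assumes Python's standard-library NormalDist, whose inv_cdf method plays the role of invNorm(area, µ, σ); the function name critical_value is made up for this example.

```python
from statistics import NormalDist


def critical_value(c_level):
    """Hypothetical helper: z* for a confidence level given as a decimal."""
    lower_tail = (1 - c_level) / 2        # e.g. 0.025 for 95%
    area_left_of_z = c_level + lower_tail  # e.g. 0.95 + 0.025 = 0.975
    # Standard Normal curve: mean 0, standard deviation 1.
    return NormalDist(mu=0, sigma=1).inv_cdf(area_left_of_z)


for level in (0.90, 0.95, 0.98):
    print(level, round(critical_value(level), 3))
# prints:
# 0.9 1.645
# 0.95 1.96
# 0.98 2.326
```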
A confidence interval formula is composed of two parts: the point estimate of a specific statistic and a margin of error, in the form (Statistic) ± (Margin of error). The margin of error is itself composed of two parts: the critical value (z* or t*) and the standard deviation of the statistic (also known as the "standard error").
Confidence interval = Statistic ± (Critical value) • (Standard deviation of statistic)
Confidence Interval for Proportions: CI = p̂ ± z* sqrt(p̂(1−p̂)/n)
CI: the confidence interval
p̂: the statistic
z*: the critical value
sqrt(p̂(1−p̂)/n): the standard deviation of the statistic
Practice 3 A 95% confidence interval for the true proportion of people who prefer Coca Cola over Pepsi is found to be (0.55, 0.65). Which of the following statement(s) is/are correct? I. The probability is 0.95 that the proportion of people who prefer Coca Cola over Pepsi is between 0.55 and 0.65. II. The interval constructed by this procedure will capture the true population proportion 95% of the time. III. This interval provides evidence that more people prefer Coca Cola than Pepsi. IV. We are 95% confident that the true proportion of people who prefer Coca Cola over Pepsi is between 0.55 and 0.65.
II, III, and IV only
Practice 1 Quick check! Do you remember the four steps you need to use every time you calculate a confidence interval? Put the steps in the correct order before you go any further in the practice problems.
Parameter Conditions Calculation Conclusion
IMPORTANT
Pay close attention to the relationships among margin of error, confidence level, confidence interval width, and sample size! For a fixed sample size, the margin of error (and so the width of the confidence interval) increases as the confidence level increases. For a fixed confidence level, the margin of error decreases as the sample size increases; it is proportional to 1/sqrt(n).
Sample size increases → Margin of error decreases
Confidence level increases → Margin of error and confidence interval width increase
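To see how strongly sample size matters, compare the margin of error for a sample of size 4n with the margin of error for a sample of size n, holding p̂ and the confidence level fixed (the factor of 4 is just a convenient choice for illustration):

```latex
\[
\frac{ME_{4n}}{ME_{n}}
= \frac{z^{*}\sqrt{\hat{p}(1-\hat{p})/(4n)}}{z^{*}\sqrt{\hat{p}(1-\hat{p})/n}}
= \sqrt{\frac{1}{4}}
= \frac{1}{2}
\]
```

In other words, you have to quadruple the sample size to cut the margin of error in half.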
Interpretation of the confidence interval is of the utmost importance. It should always be given in terms of the context of the question. You always want to note, "We are __% confident that the true population proportion, p, of [context of problem] is between [lower value] and [upper value]."
This means the process you used to generate the interval captures the true population value __% of the time. This is not a probability statement about the interval. You are not sure whether the interval contains the true population value, but you do know, on average, the specified percentage of the intervals constructed contain the true value.
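A simulation makes the "percentage of the intervals constructed" idea concrete. The sketch below is an illustration under assumed values: the true proportion of 0.70, the sample size of 200, and the 2,000 repetitions are arbitrary choices, and coverage_rate is a made-up name. It draws many samples from a population whose true proportion is known and counts how often the one-sample z-interval captures it.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(true_p, n, c_level, trials, seed=0):
    """Hypothetical helper: fraction of simulated intervals containing true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(true_p=0.70, n=200, c_level=0.95, trials=2000)
print(f"Fraction of intervals capturing the true proportion: {rate:.3f}")
```

The fraction should land near 0.95, though not exactly on it, because the Normal model is an approximation and the simulation itself has sampling variability. Each individual interval either contains 0.70 or it does not; the 95% describes the procedure over many samples.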
critical value (z*)
a factor used to compute margin of error; represents the z-score associated with the level of confidence; how many standard deviations you must move away from the mean to correspond to a specific level of confidence
confidence interval
a range of values that describes the amount of uncertainty associated with a sample estimate of a population parameter