Exam 3 Statistics


Section 9.1- Null Hypothesis and Alternative Hypothesis Example 1:

Boxes of cereal state that their containers are 20 ounces. An inspector thinks that the mean weight may be less. State the hypotheses for a test of mean weight. Solution: Remember that we initially assume the original claim is true. That means that our null hypothesis is H0: μ = 20 (read "H-naught"). The alternative hypothesis, or the alternative we are going to pursue, is written as HA: μ < 20. You should get into the habit of writing your hypotheses in stacked vertical format:
H0: μ = 20
HA: μ < 20
Note that this is a left-tailed test (how do you know?). □

Confidence Interval

a range of values used to estimate the value of a population parameter.

Section 9.4 - Rare Events, the sample and the conclusion

This section begins to discuss the further steps of performing a hypothesis test. We will now begin to collect our "evidence" to use to make the decision to either reject the null or fail to reject the null. I will introduce the stages of the hypothesis test with this example: A teacher learns that the mean score for students taking the math portion of the SAT is 530. The teacher is determined to show his students do better, so he believes the mean score should be higher. We can write this in symbols as follows:
H0: μ = 530
HA: μ > 530
We see that this is a right-tailed test. In order to prove his point, the teacher draws a random sample of 100 of his students and personally tutors them. After the students take the SAT, their mean score turns out to be x̄ = 562. We want to know if this sample mean is strong enough evidence against the population mean of μ = 530, or in other words, is the mean of this sample sufficient to say that the true population mean score is greater than 530? Hopefully you noticed that this is an example of the Central Limit Theorem in action. We have a large sample and now we can compute a z-score. Recall the formula is z = (x̄ − μ)/(σ/√n). We will call this particular z-score the test statistic. Suppose we know that the population standard deviation is σ = 116. From this, the test statistic is z = (562 − 530)/(116/√100) ≈ 2.76. We can visualize the scenario using the sampling distribution: The region shaded in red is an area, and thus a probability. Since it is far in the right tail, we can visually say that this appears to be fairly strong evidence against the null hypothesis. Remember that we are looking for evidence that the mean (or initial claim that we assume to be true) is incorrect. Since our sample produced such a small probability, we can likely reject that claim in favor of our alternative (which the sample supports). The question now becomes, how much evidence is sufficient to reject the claim?
When performing a hypothesis test, we need to be able to draw some threshold on what z-value should separate values that are likely to occur from ones that are not. This idea takes us back to confidence intervals. When constructing a confidence interval, we declared a level of confidence that the parameter lies between two boundaries. For hypothesis testing, we say α is the significance level for the test. This is the boundary line we will use to determine if we should reject a null hypothesis or not. There are two ways of looking at the significance level. The first approach to completing a full hypothesis test is using the (1) Critical Value Method. This method looks at the test statistic computed from the sample, and rejects the null hypothesis if the test statistic lies within the critical region which is created by the critical value. The second way to perform a full hypothesis test is to use the (2) p-value Method. The p-value of the test is the area/probability that is created by the test statistic. The red shaded region in the sketch above is the p-value. We will proceed with the previous example, utilizing both methods. For the purpose of this example, we will use significance level α = 0.05.

Characteristic of the Chi-Square Distribution

- The chi-square distribution is skewed to the right.
- The chi-square distribution is nonsymmetrical.
- The mean of the chi-square distribution is located to the right of the peak.
- The total area under the χ² curve is equal to 1.
- As the degrees of freedom increase, the chi-square curves look more and more like a normal curve.
- The chi-square curve approaches but never touches the positive horizontal axis because the long tail never reaches 0.

p -method example 2

1. H0: μ = 74, HA: μ ≠ 74
2. We are given significance level α = 0.05.
3. The test statistic is again z = (x̄ − μ)/(σ/√n) = (76 − 74)/(8/√80) ≈ 2.24.
4. The p-value is the area in the tails that is created by the test statistic: p-value = (area in left tail) + (area in right tail). The area in the left tail is normalcdf(−1×10^99, −2.24, 0, 1) ≈ 0.0125. By symmetry, the area in the right tail is also approximately 0.0125. Thus we have p-value = 0.0125 + 0.0125 = 0.025.
5. We see that p-value < significance level, or 0.025 < 0.05. We reject the null hypothesis H0 at level α = 0.05.
6. The concluding statement is the same: "There is sufficient evidence to suggest that the mean level of employee satisfaction has changed since the new policy." □
Example 2 The mean annual tuition and fees for a sample of 14 private colleges in California was $37,900 with a standard deviation of $7,200. A dot plot shows that it is reasonable to assume that the population is approximately normal. Can you conclude that the mean tuition and fees for all private institutions in California is more than $35,000?
Part A: Critical Value Method
1. For our hypotheses, we begin with the assumption that the actual mean is $35,000. We are testing the claim that the mean is greater than $35,000. H0: μ = 35000, HA: μ > 35000. We see that this is a right-tailed test.
2. We are not provided a significance level, so we use α = 0.05.
3. It is important to note that we have a small sample of n = 14. Even though our sample size is n < 30, we are told that the population is approximately normal. As a result, we will conduct our test using the t-distribution. If α = 0.05, then we use the t-table with 14 − 1 = 13 degrees of freedom to find that t_α = t_0.05 ≈ 1.771. The shaded region represents the critical region. Any test statistic that falls within this region would provide significant evidence against the null hypothesis.
4. The test statistic is t = (x̄ − μ0)/(s/√n) = (37900 − 35000)/(7200/√14) ≈ 1.507.
5. We label the test statistic on the curve: The test statistic is clearly outside of the shaded critical region (1.507 < 1.771), thus we do not have sufficient evidence to say the null is untrue at level α = 0.05.
6. The concluding statement is: "We fail to reject the claim that the average tuition and fees of California private colleges is $35,000. There is not sufficient evidence that the average is higher."
Part B: p-value Method
The hypotheses, significance level, and test statistic are the same as in the critical value method.
1. H0: μ = 35000, HA: μ > 35000
2. The significance level is α = 0.05.
3. The test statistic is t = (x̄ − μ0)/(s/√n) = (37900 − 35000)/(7200/√14) ≈ 1.507.
4. To find the p-value, we need the area under the curve to the right of t = 1.507 on the t-distribution. We will need to use the graphing calculator to compute this area. Here are the instructions: STAT → TESTS → 2: T-Test... The calculator asks if we want to use "Data" or "Stats." We are given statistics, not raw data, so be sure that "Stats" is highlighted. We also have μ0 = 35000, x̄ = 37900, Sx = 7200, n = 14. Since this is a right-tailed test, highlight the option μ: >μ0. Then press "Calculate." You will notice that the T-Test gives our p-value of approximately 0.0779.
5. We see that 0.0779 > 0.05, thus we fail to reject the null.
6. The concluding statement is the same: "We fail to reject the claim that the average tuition and fees of California private colleges is $35,000. There is not sufficient evidence that the average is higher." □
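The tuition example can also be checked without a graphing calculator. The following is a minimal sketch, not the calculator's own algorithm: it assumes a hand-written density for the t-distribution and Simpson's rule for the area, and the helper names t_pdf and t_right_tail are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Area to the right of t (for t >= 0), using Simpson's rule on [0, t].

    The curve is symmetric, so the area right of 0 is 0.5; subtracting the
    area between 0 and t leaves the right tail. steps must be even.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values from the tuition example (right-tailed test).
x_bar, mu0, s, n = 37900, 35000, 7200, 14
alpha = 0.05

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_right_tail(t_stat, n - 1)
reject = p_value < alpha

print(round(t_stat, 3), round(p_value, 4), reject)
```

The decision rule is the same as in step 5: reject only when the p-value falls below α.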

example 1

A 2010 survey polled 830 people aged 18-29 and found that 166 of them have at least one tattoo. Can we conclude that the percentage of people aged 18-29 that have at least one tattoo is less than 25%? Use significance level α = 0.01. Solution: Notice that there are only two categories here: (1) having at least one tattoo or (2) not having at least one tattoo. Having only these two outcomes allows us to continue with the other properties of the binomial distribution. A sample of 830 is plenty large.
1) The claim is made that 25% of adults aged 18-29 have at least one tattoo. We will test this claim with the sample provided. The hypotheses of this left-tailed test are H0: p = 0.25, HA: p < 0.25. You should always convert your percentages to decimals when performing computations with proportions.
2) The significance level α = 0.01 is provided. A significance level this small means that we require stronger evidence before rejecting the null.
3) We compute our sample proportion to be p̂ = 166/830 = 0.20. We also have p0 = 0.25 and q0 = 1 − p0 = 0.75. The test statistic is z = (p̂ − p0)/√(p0·q0/n) = (0.20 − 0.25)/√(0.25(0.75)/830) ≈ −3.33.
4) We can use the calculator or z-table to compute the p-value. The calculator syntax would be p-value = normalcdf(−1×10^99, −3.33, 0, 1) ≈ 0.00043.
5) Our p-value is very small and is less than α = 0.01. This is strong evidence against the null H0, so we reject it.
6) Our concluding statement could be: "There is sufficient evidence at level α = 0.01 that the proportion of 18-29 year olds who have at least one tattoo is less than 25%." □
Once you understand how to compute the p-value and its relation to the normal curve, you may want to consider exploring how to conduct the test in the calculator. STAT → TESTS → 5: 1-PropZTest
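As a cross-check on the tattoo example, here is a short sketch of the same left-tailed proportion test using only Python's standard library. It is an illustration, not the calculator's 1-PropZTest; because it keeps the unrounded z-score, the p-value may differ slightly from the 0.00043 obtained with z rounded to −3.33.

```python
from math import sqrt
from statistics import NormalDist

# Values from the tattoo example (left-tailed test of H0: p = 0.25).
x, n = 166, 830
p0 = 0.25
alpha = 0.01

p_hat = x / n                                # sample proportion
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic
p_value = NormalDist().cdf(z)                # area in the left tail
reject = p_value < alpha

print(round(p_hat, 2), round(z, 2), round(p_value, 5), reject)
```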

final thoughts

A few final thoughts: You should notice that the lower bound contains the value of the greater critical value. This is not a mistake. The margin of error is already within the boundaries. The exponent in χ2 has no mathematical meaning-- it's just notation. We will not discuss a TI-option for computing confidence intervals about a standard deviation. Make sure you understand the trends of this curve of this distribution. You will be asked to compute a confidence interval about a standard deviation in Forum 5.

Section 8.3 - estimating a population proportion Example 5:

A survey was taken of 806 random people in an airport. They were asked "what is your favorite seat on the airplane?" Assume 492 chose the window seat. Construct a 99% confidence interval for the proportion of flyers who prefer the window seat. Solution: Notice that this fits the setting of the binomial distribution. That is, there are only two possible outcomes-- success (liking the window seat) or failure (not liking the window seat). It is also a large sample. Here is a summary of what we know: p̂ = 492/806 ≈ 0.61, q̂ = 1 − p̂ = 1 − 0.61 = 0.39, n = 806, x = 492 (number of "successes"). Now we follow the steps from the table.
1. satisfied
2. satisfied
3. The critical values come from the same z-table we used in Chapter 6. Since the confidence level is 99%, we have α = 0.01 and α/2 = 0.01/2 = 0.005. This means that on the normal curve, there is a total area of 0.01 in the two tails combined, and 0.005 in a single tail. Using either the table or invnorm(0.005, 0, 1), we get a critical value of z_(α/2) ≈ ±2.575.
4. The margin of error is E = 2.575·√((0.61)(0.39)/806) ≈ 0.044.
5. Now we add and subtract the margin of error from the point estimate and obtain the boundaries for the confidence interval. The interval is 0.566 < p < 0.654.
6. The sentence summary is: "We are 99% confident that the true proportion of airline flyers who prefer the window seat is between 0.566 and 0.654." □
Here is an example of how to choose an appropriate sample size necessary to construct a confidence interval.
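The interval in the window-seat example can be reproduced with a few lines of Python. This is a sketch under the same large-sample assumptions as the example; it keeps p̂ and the critical value unrounded, so the endpoints may differ from the hand computation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values from the window-seat example.
x, n = 492, 806
confidence = 0.99

p_hat = x / n
q_hat = 1 - p_hat
alpha = 1 - confidence
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # positive critical value
margin = z_crit * sqrt(p_hat * q_hat / n)      # margin of error E
lower, upper = p_hat - margin, p_hat + margin

print(f"{lower:.3f} < p < {upper:.3f}")
```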

Section 9.6 full test examples part 2

Before the summary of the steps to perform a full test, let's revisit the notation from the binomial distribution:
p ≡ population proportion of subjects in a category
p0 ≡ population proportion specified by H0
x ≡ the number of subjects in the sample within the category (number of "successes")
n ≡ sample size
p̂ ≡ sample proportion, such that p̂ = x/n.
We want the sampling distribution of p̂ to be approximately normal. There are some other conditions we want to satisfy before conducting a test, namely:
- The collected sample is simple and random.
- The population is at least 20 times larger than the sample.
- Members of the population fit into two categories only (success and failure-- however those are defined).
- The values of n·p0 and n·q0 are at least 10.

Example 3:

Consider Example 1 and the test of H0:μ=20 vs. HA:μ<20. If the inspector does not reject the null, what is the conclusion that is made? Solution: Remember that the inspector thought the mean number of ounces was less than 20. If the inspector fails to reject the null hypothesis, we are saying there is not sufficient evidence in favor of the alternative to reject the null. In other words, there is not sufficient evidence to conclude that the mean weight is less than 20 ounces.

Critical Method

Critical Value Method: If we are using significance level α = 0.05, we can find the critical value z_α that creates an area of 0.05 (or 5%) in the right tail of the distribution. Said differently, if the significance level is α = 0.05, then we are going to complete the test saying that anything in the top 5% would be unusual or less likely to occur. Any test statistic that does not lie in the top 5% would not be strong enough evidence against the null. Remember there are two ways to find a z-score when given an area. The calculator syntax is z_α = invNorm(0.95) ≈ 1.645. You can also look up 0.95 in the body of the table. The critical value z_α = 1.645 creates the critical region in purple. What we are saying is that any test statistic that is within the critical region would be sufficient evidence against the null at level α = 0.05. Notice that if the significance level were different, the rejection region would also be different. Recall that the test statistic for our example was z = (562 − 530)/(116/√100) ≈ 2.76. This value is well inside the critical region, thus very strong evidence against the null. Here is our conclusion: "There is sufficient evidence at level α = 0.05 that the true mean score of the math portion of the SAT is greater than 530." Notice that we are not "accepting" anything. We are simply stating that we have found evidence in favor of the alternative. If the test statistic were anywhere to the left of (less than) the critical value, we would fail to reject the null.
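The critical value comparison for the SAT example can be sketched in Python. Here NormalDist().inv_cdf from the standard library plays the role of invNorm; the variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

# Values from the SAT example (right-tailed test of H0: mu = 530).
mu0, sigma = 530, 116
x_bar, n = 562, 100
alpha = 0.05

z_crit = NormalDist().inv_cdf(1 - alpha)      # area of alpha in the right tail
z_stat = (x_bar - mu0) / (sigma / sqrt(n))    # test statistic
reject = z_stat > z_crit                      # is it inside the critical region?

print(round(z_crit, 3), round(z_stat, 2), reject)
```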

example 1

Example 1 Find the critical values for a 90% confidence interval for a variance with n = 25. Solution: Remember that when we're doing confidence intervals, there are always two tails. So we are looking for the values χ²_(α/2) and χ²_(1−α/2). Since we are using 90% confidence, we have α = 1 − 0.90 = 0.10 and α/2 = 0.05. Remember that the values at the top of the table are areas to the RIGHT of the critical values (body of the table). For the critical value on the left side of the curve, which has an area of 95% to the right, we will use the table with df = 25 − 1 = 24, and look up 0.95 on the top line. We see that the critical value is χ²_0.95 ≈ 13.848. In a similar fashion, we find the critical value with a 5% area to the right using the table. The second critical value is χ²_0.05 ≈ 36.415. The critical values separate the top and bottom 5% as shown in the curve below with 24 degrees of freedom: □ Now that we know how to find the critical values, suppose I want to be able to estimate the value of a population standard deviation using a confidence interval. Here is a summary of the appropriate procedure. Steps for Constructing Confidence Interval for Standard Deviation
1) The sample should be simple and random and the population must be normally distributed.
2) Compute the sample variance s² (use "1-Var Stats" in calculator).
3) Find critical values χ²_(α/2) and χ²_(1−α/2) using the table and df = n − 1.
4) Compute the bounds of the interval for the variance. The lower bound is (n−1)s²/χ²_(α/2) and the upper bound is (n−1)s²/χ²_(1−α/2).
5) Take square roots to write an inequality of the form √((n−1)s²/χ²_(α/2)) < σ < √((n−1)s²/χ²_(1−α/2)).
6) Write a sentence to interpret the result.
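The table lookups in Example 1 can be double-checked numerically. The sketch below is one possible approach using only the standard library, not how the printed table was produced: it assumes a series expansion for the chi-square cumulative area and a bisection search, and the helper names chi2_cdf and chi2_critical are made up for this illustration.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of
    freedom, from the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, h = df / 2, x / 2
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-16:
        term *= h / (a + k)
        total += term
        k += 1
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

def chi2_critical(area_right, df):
    """Critical value with the given area to its RIGHT (as in the table),
    located by bisection on the cumulative area."""
    target = 1 - area_right
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 90% confidence, n = 25, so df = 24 and alpha/2 = 0.05.
left_value = chi2_critical(0.95, 24)    # 95% of the area lies to the right
right_value = chi2_critical(0.05, 24)   # 5% of the area lies to the right

print(round(left_value, 3), round(right_value, 3))
```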

Example 3:

Find the critical value t_(α/2) corresponding to n = 40 and 95% confidence. Solution: Since the sample size is n = 40, we will use degrees of freedom n − 1 = 39. Also note that a 95% confidence level corresponds to α = 0.05. This means that there is an area of 0.05/2 = 0.025 in each tail of the distribution. Using the t-table with a total area of 0.05 in both tails (0.025 in a single tail) and df = 39, we can see that the critical value is t_(α/2) = t_0.025 = 2.023. Keep in mind there is symmetry, so the opposite tail will have critical value t = −2.023. □ Before I get into an example of setting up a confidence interval, here is a summary of the steps: The margin of error is given by E = t_(α/2)·s/√n, thus the confidence interval will be of the form: x̄ − t_(α/2)·s/√n < μ < x̄ + t_(α/2)·s/√n, where s is the sample standard deviation and can be obtained from 1-variable statistics in the calculator.

Example 2:

Find the critical values that correspond to an 80% confidence level. Solution: Since we are provided a confidence level of 80%, we know that α = 0.20. From this, we can conclude that α/2 = (1 − 0.80)/2 = 0.10. This means that if we draw a normal distribution and shade in the two tails, we know that there is a proportion/area of 0.10 in each tail, and 80% or 0.80 remains in the middle of the distribution. Now we want to find the z-score values that create the 10% in each tail. Remember there are two ways to do this-- (1) using the z-table or (2) using invnorm in the calculator. We are essentially finding the 10th and 90th percentiles, which we know will be the same values (only differing by a minus symbol). Using the calculator syntax: InvNorm(0.10, 0, 1) we see that the critical values are z_(α/2) = ±1.28. Here is a visual: □ There are two main ways we discuss confidence intervals for means. Section 8.1 deals with when the standard deviation of the population is known, and Section 8.2 deals with when it is unknown. In the real world, if you're trying to estimate the value of a population parameter, it is not likely that you'll have any knowledge of the population standard deviation. So as a result, the main focus for means will be in Section 8.2. When the standard deviation is known, we want to make sure the normality standard is still applicable; that is, we have a sample of size n > 30. Remember that for this course, the sample mean x̄ is still the best point estimate for μ. You should also pay attention to the part of the section that deals with choosing an appropriate sample size to properly construct a confidence interval. The formula is: n = [z_(α/2)·σ/E]², where E is the margin of error. You should always round this value UP to the nearest whole number. Again, the gist of confidence intervals is to be able to take a good guess at the value of a population mean based only on values associated with a sample.
We are concerned with computing the endpoint values a and b to create a compound inequality. We don't know exactly what the value of μ is, but we can say with (1−α)·100% confidence that it is between a and b. The lower endpoint a is (point estimate − margin of error), or x̄ − E, and the upper endpoint b is (point estimate + margin of error), or x̄ + E. We know the margin of error formula for standard deviation known is E = z_(α/2)·σ/√n. This means that our confidence interval for a population mean with standard deviation known is of the form: x̄ − z_(α/2)·σ/√n < μ < x̄ + z_(α/2)·σ/√n. Never forget to write your single sentence interpretation as stated in my preliminary notes on confidence intervals. The best way to do this is just to copy the sentence exactly as written, and change your values of a and b accordingly.
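Here is a small sketch of the critical value lookup and the σ-known interval formula in Python. The function names z_critical and z_interval are made up for this illustration, and the numbers passed to z_interval (x̄ = 50, σ = 10, n = 100) are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Positive critical value z_(alpha/2) for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Critical values for 80% confidence, as in Example 2, plus two common levels.
z_80 = z_critical(0.80)
z_95 = z_critical(0.95)
z_99 = z_critical(0.99)

# Hypothetical sample summary, not taken from the notes.
a, b = z_interval(50, 10, 100, 0.95)

print(round(z_80, 2), round(z_95, 2), round(z_99, 3), round(a, 2), round(b, 2))
```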

Example 1:

Let's say a news company interviews 2,000 CEOs of large companies and asks them if they find their jobs stressful. The survey found that 72% of the CEOs said "yes." Solution: The best point estimate here would be the sample proportion p̂ = 0.72. □ Although confidence intervals are extremely useful, we need to keep in mind that they are still just a guess. So how do we know how "good" our estimate is? We need to consider a few additional ideas. The confidence level 1 − α is the proportion of intervals that would actually contain the population parameter we are trying to estimate if the process of sampling were repeated a large number of times. (The symbol α is the lowercase Greek letter "alpha") Here are some common confidence levels that you will see come up in the examples in the textbook and in Knewton:
Confidence Level    Value of α
90%                 α = 0.10
95%                 α = 0.05
99%                 α = 0.01
Keep in mind that you may see other confidence levels and have to find the respective value of α. Before we get into computing confidence intervals for population means and proportions, we should discuss how to interpret them. It is traditional to write a confidence interval in the form of a compound inequality. Here is an example using proportions: a < p < b. Let's say we computed a 95% confidence interval for the population proportion. Here is the correct way to write an interpretation sentence: "We are 95% confident that the interval from a to b contains the true value of the population proportion." The wording of the interpretation sentence is crucial. Here are some incorrect statements: "There is a 95% chance ..." "95% of values ..." The sample has already been taken, so the interval either contains the parameter or it does not; be careful with how you phrase the statement.
Finally let's consider a sampling distribution (remember from Chapter 7) that takes the following shape: We know that since we are considering both of the shaded tails (left and right) and the normal curve has symmetry about the mean, the area in each tail can be described as α/2. So the obvious next question would be-- what z-values separate that α/2 area from the rest of the curve? These z-scores have a special name. We say that z_(α/2) is a critical value of the sampling distribution as it separates z-scores that are likely to occur from those that are unlikely to occur. By looking at the image above, we can try to eyeball the location of the boundary line for the critical value, but it's best to use the z-table or our calculators to get a more accurate value. Here is an example of how to find critical values when you are provided a specific confidence level.

Example 4:

The amount of mercury in sushi is normally distributed. Suppose you go to your favorite restaurants and measure the amount of mercury in seven pieces of sushi (the units are ppm - parts per million): 0.56, 0.75, 0.10, 0.95, 1.25, 0.54, 0.88. Construct a 90% confidence interval for the mean amount of mercury in ALL sushi. Solution: Note that we are given a small sample of data values. A good rule to practice is that when you are given raw data, you should immediately type that data into your calculator list. Using 1-variable statistics we obtain x̄ ≈ 0.719 ppm and s ≈ 0.366 ppm. Also note that even though we are given a very small sample (n = 7), it is from a normally distributed population. Now follow the steps in the table above to set up the confidence interval.
1. satisfied
2. satisfied
3. The level of confidence for our interval is 90%. So we will use the t-table to locate our critical values. We have α = 0.10 and α/2 = 0.10/2 = 0.05. We will use degrees of freedom 7 − 1 = 6 and the t-table. We see the critical values are t_(α/2) = t_0.05 ≈ ±1.943. You should verify that you're able to find this for yourself in the table before moving on to step (4).
4. The margin of error is E = 1.943·0.366/√7 ≈ 0.269.
5. Now we add and subtract the margin of error from the point estimate and we have 0.719 − 0.269 < μ < 0.719 + 0.269. This means our interval is 0.450 < μ < 0.988.
6. Interpretation sentence: "We are 90% confident that the true mean amount of mercury in sushi is between 0.450 ppm and 0.988 ppm."
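The sushi interval can be reproduced from the raw data with Python's statistics module, which stands in for the calculator's 1-variable statistics. This is only a sketch: the critical value 1.943 is copied from the t-table step above instead of being computed, and because nothing is rounded along the way the upper endpoint may differ from the hand result in the third decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# Mercury measurements in ppm, from the example.
data = [0.56, 0.75, 0.10, 0.95, 1.25, 0.54, 0.88]

# Critical value for 90% confidence with df = 6, taken from the t-table
# (an input copied from the worked example, not computed here).
t_crit = 1.943

n = len(data)
x_bar = mean(data)    # point estimate
s = stdev(data)       # sample standard deviation (divides by n - 1)
margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"{lower:.3f} < mu < {upper:.3f}")
```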

Example 2:

Scores on a standardized test have a mean score of 70. Some modifications are made to the test and educators believe the mean has changed. State the hypotheses for a test of mean score. Solution: We have to go with the mean score that is stated in the claim, thus our null hypothesis is H0: μ = 70. The statement doesn't say that the educators believe the score is higher or lower than 70, so we say that the alternative hypothesis would just be that the mean score is different from 70. That is, HA: μ ≠ 70. Written in vertical format:
H0: μ = 70
HA: μ ≠ 70
This is considered a two-tailed test. □ For our class, the null hypothesis should always be that a parameter is equal to a stated value. The three types of alternative hypotheses are "less than," "greater than," or "not equal to."

Section 9.6 - Full test examples exam 1

Suppose a large company measures its employee attitudes with a standardized test. A score of 100 indicates the most satisfied an employee can be. The mean score for Company A is 74 with a standard deviation of 8. The CEO of Company A would like to improve the attitudes of the employees. A policy is implemented that allows workers to telecommute (work from home) one time per week. A few weeks after the new policy is introduced, 80 workers are retested and turn out to have a sample mean of 76. Can we conclude that the mean level of satisfaction is different at significance level α = 0.05?
Part A: Critical Value Method
We will follow the steps in the strategy box from Section 9.5. Remember when a claim is made about a mean, we have to believe it is true until we are able to find some data or evidence that suggests it is not. For this example, we know that the mean attitude score for all employees in the company is μ = 74 with a standard deviation of σ = 8.
1. To write our hypotheses, we begin by making our null hypothesis the claim that we are believing to be true. This is always the case for the null hypothesis. For the alternative, we should pay attention to the wording of the problem. The last sentence of the question says "mean level of satisfaction is different..." This suggests that we should use a two-tailed test. You could also use a right-tailed test if you would like (why?). Our hypotheses are: H0: μ = 74, HA: μ ≠ 74.
2. The significance level is provided, α = 0.05. Remember that if one is not given to you in the problem, you can (1) make up your own level, or (2) use α = 0.05 by default.
3. Since this is a two-tailed test, the critical values are of the form ±z_(α/2). Since we have a large sample, we can use the z-table or the calculator to find the critical value. With α = 0.05, we have α/2 = 0.025. This means we are looking up the z-score for the 2.5th percentile or the 97.5th percentile. The calculator syntax is invNorm(0.025, 0, 1) ≈ −1.96, so the critical values are ±1.96. The portions shaded in purple are the critical region (or the "rejection" region).
4. The test statistic is z = (x̄ − μ)/(σ/√n) = (76 − 74)/(8/√80) ≈ 2.24.
5. Remember that the test statistic is another z-score. We are interested in whether or not that z-score falls into the region that we specify to be the threshold of unlikely values (which is the critical region). We now add to our sketch: The test statistic clearly falls within the critical region (2.24 > 1.96), thus we reject H0 at level α = 0.05.
6. Concluding statement: "There is sufficient evidence to suggest that the mean level of employee satisfaction has changed since the new policy."
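The two-tailed employee satisfaction test can be sketched in Python as follows. It checks the decision both ways (critical value and p-value) using only the standard library; because the z-score is not rounded to 2.24 first, the p-value comes out close to, but not exactly, 0.025.

```python
from math import sqrt
from statistics import NormalDist

# Values from the employee satisfaction example (two-tailed test).
mu0, sigma = 74, 8
x_bar, n = 76, 80
alpha = 0.05

z_stat = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # positive critical value

# Two-tailed p-value: area in both tails beyond |z_stat|.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

reject_by_critical_value = abs(z_stat) > z_crit
reject_by_p_value = p_value < alpha

print(round(z_stat, 2), round(z_crit, 2), round(p_value, 4))
```

Both rules lead to the same decision here, as they always should for the same α.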

Properties of the Chi-Square Distribution:

The population mean can be described as μ = df and the population standard deviation as σ = √(2·df). When random samples are drawn from a normal distribution, the values of the χ²-distribution are obtained from (n−1)s²/σ². As shown in the graph, the curve begins to "flatten" as df → ∞. The Chi-Square variable is always non-negative and the distribution is skewed right (sometimes described as positively skewed). As the number of degrees of freedom approaches 100, the curve becomes more nearly symmetric. Values in the body of the Chi-Square table represent critical values and values on the top row represent area to the right. (See Tables and Charts tab)

Section 9.3 - distribution needed for a hypothesis test

The purpose of performing a hypothesis test is to make some conclusion based on a statistical claim. So once we have our hypotheses, how do we begin the test? Step 1: Begin by assuming the null hypothesis is true. Step 2: We look for evidence to support the alternative (more on how to do this in an upcoming section). If the evidence against the null hypothesis is strong, we can reject the null in favor of the alternative. What are the types of conclusions? Conclusion Option 1: reject the null hypothesis H0. Conclusion Option 2: fail to reject the null hypothesis H0. Notice that we never accept the alternative hypothesis. The original claim is about the null, and the best we can do is find evidence in favor of the alternative, not enough to suggest it is true. Return to the court trial analogy: if there is enough evidence that student #1 did complete the homework, we say the student is not guilty. We do not accept the student is innocent.

Section 8.2 - estimating a population mean, standard deviation unknown

Here are the major things you need to know about Student's t-distribution:

The t-distribution is used for small samples. Specifically when n < 30 and population standard deviation is unknown. The t-distribution behaves similar to the normal distribution. The larger the sample is, the closer it resembles the bell curve of its population. When the sample is smaller, we need to discuss how the sample standard deviation adjusts the shape of the curve. This is partially done by considering the distribution with degrees of freedom n - 1. The critical values for Student's t-Distribution come from using the t-table. The t-table is included in the Tables and Charts tab to the left.

Section 11.1- facts about chi-square distribution

The two main ways we have used inferential statistics in the course are through confidence intervals and hypothesis testing. Within those, we have estimated the value of a population parameter and used techniques to test claims about those parameters. We have limited our discussion to inferences about means and proportions. In Chapter 11, we will describe a new distribution called the Chi-Square (from the Greek letter χ-- pronounced "ki-square"). The Chi-Square distribution is a family of curves that is skewed based on its number of degrees of freedom. For our purposes, we can use this distribution to perform inferential statistics related to a variance or standard deviation. Unfortunately, we only have time to cover the basics.

P-value method

This method is more common. Remember that the p-value is an area or probability. When we are provided a significance level, it is a declaration of an "unlikely" probability. This means that if we compute the p-value for our test statistic and it is smaller than the provided significance level, then we have sufficient evidence against the null. In summary, the significance level gives us our boundary, and the p-value is the probability as it relates to our sample. Let's return to our example. Our test statistic is z = (562 − 530)/(116/√100) ≈ 2.76. The area of the red shaded region, created by the test statistic, is the p-value. We use the calculator (or table) to find the area: p = normalcdf(2.76, 1×10^99, 0, 1) ≈ 0.00289. This is a very small area, and as a result, very strong evidence against our null hypothesis. We say, since p-value = 0.00289 < α = 0.05, we reject the null hypothesis. We will look at more examples of fully worked out hypothesis tests in week 12.
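For readers without a graphing calculator, the right-tail area for the SAT example can be found with Python's standard library. This is a sketch in which NormalDist().cdf stands in for normalcdf; since the z-score is left unrounded, the result is close to, but not exactly, the 0.00289 obtained with z = 2.76.

```python
from math import sqrt
from statistics import NormalDist

# Values from the SAT example (right-tailed test of H0: mu = 530).
mu0, sigma = 530, 116
x_bar, n = 562, 100
alpha = 0.05

z_stat = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z_stat)   # area to the right of z_stat
reject = p_value < alpha

print(round(z_stat, 2), round(p_value, 4), reject)
```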

Example 6:

We want to know how many (or the proportion) of robberies in the state of Texas result in arrest. How large of a sample is needed to construct an 80% confidence interval with a margin of error no greater than 4%? Solution: Recall from the reading that the formula to compute sample size for population proportions is n = [z_(α/2)]²·p̂·q̂/E². There are a couple things we need to note before we proceed subbing into the formula. The first is that we need to use the z-table to find the critical value z_(α/2) that corresponds to 80% confidence. We did this in Example 2. The other thing to note is that we are not provided the sample proportion. If you are not provided a sample proportion, we can assume p̂ = q̂ = 0.50. Here is a summary of what we know: z_(α/2) ≈ ±1.28, E = 0.04, p̂ = 0.50, q̂ = 0.50. Substituting into the formula gives n = (1.28)²(0.50)(0.50)/(0.04)² = 256. Thus we need a sample of at least 256 values (or 256 robbery cases) to compute this confidence interval with the specifications requested. □ Here are a couple comments about the last example: (1) if we get a decimal value for n, we always want to round up to the nearest whole number-- clearly it's always better to have a larger sample than a smaller one, and (2) there are not many instances in the real world where you only want to be 80% confident in an inferential measure. If we were to make the level of confidence larger, the required sample size would also increase.
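The sample size formula is easy to wrap in a small function. The sketch below uses a made-up helper name, sample_size_for_proportion, and computes the critical value directly instead of reading it from the table. One consequence worth noticing: with the unrounded critical value (about 1.2816 instead of 1.28) the formula gives roughly 256.6, which rounds up to 257, one more than the hand computation above.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(confidence, margin, p_hat=0.5):
    """Smallest n giving the requested margin of error for a proportion.

    p_hat defaults to 0.5, the assumption used when no sample
    proportion is available.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Always round UP, since a fractional subject is not possible.
    return ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

n_80 = sample_size_for_proportion(0.80, 0.04)   # the Texas robbery example
n_95 = sample_size_for_proportion(0.95, 0.04)   # same margin, higher confidence

print(n_80, n_95)
```

Comparing the two results shows the point made in comment (2): raising the confidence level raises the required sample size.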

Section 8.1 - Confidence Intervals Point Estimate

a single value estimate of a population parameter

