Stats Chapter 9
Testing the Mean with a Finite Population
- As in previous cases, if the sample size is greater than or equal to 5% of the population size, the finite population correction factor should be used - z = (x̄ − μ) / [(σ/√n) ∙ √((N − n)/(N − 1))]
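A minimal sketch of the corrected z statistic above. The function name and the numbers (a sample of 50 from a population of 500) are hypothetical and chosen only so that the sample is at least 5% of the population.

```python
import math

def z_finite_population(x_bar, mu, sigma, n, N):
    """z statistic for a sample mean with the finite population correction."""
    standard_error = sigma / math.sqrt(n)
    correction = math.sqrt((N - n) / (N - 1))
    return (x_bar - mu) / (standard_error * correction)

# Hypothetical numbers: n = 50 drawn from N = 500 (10% of the population).
z = z_finite_population(x_bar=102.0, mu=100.0, sigma=8.0, n=50, N=500)

# The same test without the correction, for comparison.
z_uncorrected = (102.0 - 100.0) / (8.0 / math.sqrt(50))

print(round(z, 3), round(z_uncorrected, 3))
```

Because the correction factor is less than 1, it shrinks the standard error and moves z farther from zero than the uncorrected version.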
Three Types of Hypotheses
1. Research Hypotheses 2. Statistical Hypotheses 3. Substantive Hypotheses
Using the critical values established at step 4, the possible statistical outcomes of a study can be divided into two groups
1. Those that cause the rejection of the null hypothesis 2. Those that do not cause the rejection of the null hypothesis
Four Major Tasks for Testing Hypotheses
HTAB Task 1. Establishing the hypotheses (H) Task 2. Conducting the test (T) Task 3. Taking statistical action (A) Task 4. Determining the business implications (B)
Testing Hypotheses About a Proportion
Hypotheses about population proportions can be tested in the same way as means • For the central limit theorem to hold, n ∙ p ≥ 5 and n ∙ q ≥ 5, where q = 1 − p • Then the test statistic is: z = (p̂ − p) / √(p ∙ q / n)
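The proportion test can be sketched as follows. The function name and the data (33 successes in 200 trials against a hypothesized proportion of 0.10) are hypothetical; the n ∙ p and n ∙ q check mirrors the condition stated above.

```python
import math

def proportion_z_test(successes, n, p0):
    """Return (p_hat, z) for a test of H0: p = p0."""
    q0 = 1 - p0
    # Normal approximation requires n*p >= 5 and n*q >= 5.
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
    return p_hat, z

# Hypothetical data: 33 successes in a sample of 200, H0: p = 0.10.
p_hat, z = proportion_z_test(successes=33, n=200, p0=0.10)
print(p_hat, round(z, 3))
```

Note that the standard error uses the hypothesized p and q, not the sample proportion, because the test assumes the null hypothesis is true.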
Effect of Increasing Sample Size on the Rejection Limits
Increasing the sample size decreases the value of the standard error, and the critical value changes • In the soft drink example, increase n from 60 to 100 • With n = 60, the critical value was 11.979; with n = 100 it is recomputed using the smaller standard error • The z score is now z = −0.60 • Probability of a Type II error falls to 0.7257 • Increasing the sample size decreases β • Also, the analyst can reduce α without increasing β if the sample size can be increased
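A rough sketch of how the critical value and β respond to sample size in a lower-tailed z test. The inputs are assumptions, not taken from these notes: hypothesized mean 12, alternative mean 11.99, σ = 0.10, and α = .05. They were chosen because they reproduce the critical value 11.979 quoted for n = 60. Exact normal probabilities are used, so the β values differ slightly from figures read off a rounded z table.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def lower_tail_beta(mu0, mu_alt, sigma, n, alpha):
    """Critical sample mean and Type II error probability, lower-tailed z test."""
    se = sigma / math.sqrt(n)
    # inv_cdf(alpha) is negative, so the critical mean lies below mu0.
    critical_mean = mu0 + STD_NORMAL.inv_cdf(alpha) * se
    # Type II error: the sample mean lands above the critical value
    # (nonrejection region) even though the true mean is mu_alt.
    beta = 1 - NormalDist(mu_alt, se).cdf(critical_mean)
    return critical_mean, beta

# Assumed inputs: mu0 = 12, mu_alt = 11.99, sigma = 0.10, alpha = .05.
for n in (60, 100):
    crit, beta = lower_tail_beta(12.0, 11.99, 0.10, n, 0.05)
    print(n, round(crit, 3), round(beta, 4))
```

Under these assumptions the critical value moves toward the hypothesized mean and β shrinks as n grows from 60 to 100.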
Testing Hypotheses About a Variance
Just as hypotheses about means and proportions can be tested, hypotheses about a population variance can be tested • Must be able to assume that the population is normally distributed • Extremely sensitive to violations of this assumption • If normality can be assumed, then the test statistic is χ² = (n − 1)s² / σ², with n − 1 degrees of freedom
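A small sketch of the variance test statistic. The sample values and the hypothesized variance of 2 are hypothetical. Only the statistic and its degrees of freedom are computed here; the critical value would still come from a chi-square table.

```python
import statistics

def chi_square_variance_statistic(sample_values, sigma_sq_0):
    """Return (chi-square statistic, degrees of freedom) for H0: variance = sigma_sq_0."""
    n = len(sample_values)
    s_sq = statistics.variance(sample_values)  # sample variance, n - 1 denominator
    return (n - 1) * s_sq / sigma_sq_0, n - 1

# Hypothetical sample with s^2 = 2.5, tested against a hypothesized variance of 2.
chi_sq, df = chi_square_variance_statistic([10, 12, 9, 11, 13], sigma_sq_0=2.0)
print(chi_sq, df)
```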
Eight-Step Process for Testing Hypotheses
Step 1. Establish a null and an alternative hypothesis Step 2. Determine the appropriate statistical test Step 3. Set the value of alpha, the Type I error rate Step 4. Establish the decision rule Step 5. Gather sample data Step 6. Analyze the data Step 7. Reach a statistical conclusion Step 8. Make a business decision
Two-tailed tests
always use = and ≠ in the statistical hypotheses and are directionless - has the population mean changed?
One-tailed tests
are always directional, and the alternative hypothesis uses either the greater than (>) or the less than (<) sign - Is the population mean less than x? - Is the population mean more than x?
Type II Error
o Committed by failing to reject a false null hypothesis o The null hypothesis is false, but a decision is made to not reject it o Example: Suppose in the case of the flour problem that the packaging process is actually producing a population mean of 81 ounces even though the null hypothesis is 80 ounces. A sample of 100 packages yields a sample mean of 80.2 ounces, which falls in the nonrejection region. The business decision maker decides not to reject the null hypothesis. o Non-statistical example: The court declared you not guilty of stealing money, but you actually had done it → The court committed a Type II error o The probability of committing a Type II error is called beta (β)
Type I Error
o Committed by rejecting a true null hypothesis o The null hypothesis is true, but a decision is made to reject it o Example: Suppose the flour-packaging process actually is "in control" and is averaging 80 ounces of flour per package; the decision is to reject the null hypothesis even though the population mean is actually 80 ounces o Non-statistical example: The court declared you guilty of stealing money, but you actually hadn't done it → The court committed a Type I error o The probability of committing a Type I error is called alpha (α), or the level of significance → equals the area under the curve that is in the rejection region beyond the critical value(s)
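One way to see that α is the Type I error rate is to simulate many samples from a process where the null hypothesis really is true and count how often a two-tailed z test rejects it. This is only an illustrative sketch: the population standard deviation of 2.0, the sample size of 100, and the seed are assumptions, with the mean of 80 borrowed from the flour example.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated samples that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The data are generated with mean mu0, so H0 is true by construction.
        sample_mean = fmean(rng.gauss(mu0, sigma) for _ in range(n))
        z = (sample_mean - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(mu0=80.0, sigma=2.0, n=100, alpha=0.05, trials=4000)
print(rate)
```

The printed rate should land near 0.05, since every rejection in this simulation is a Type I error.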
Statistical Hypotheses
• A more formal hypothesis structure set up to scientifically test research hypotheses • Suppose business researchers want to "prove" the research hypothesis that older workers are more loyal to a company • First, a "loyalty" survey instrument is either developed or obtained • If this instrument is administered to both older and younger workers, how much higher do older workers have to score on the "loyalty" instrument than younger workers to prove the research hypothesis? What is the "proof threshold"? • Instead of attempting to prove or disprove research hypotheses directly in this manner, business analysts convert their research hypotheses to statistical hypotheses and then test the statistical hypotheses using standard procedures
Research Hypotheses
• A statement of what the analyst believes will be the outcome of an experiment or a study • Before studies are undertaken, business analysts often have some idea or theory based on experience or previous work as to how the study will turn out • These ideas, theories, or notions established before an experiment or study is conducted are research hypotheses • Such hypotheses can lead decision makers to new and better ways to accomplish business goals • However, to formally test research hypotheses, it is generally best to state them as statistical hypotheses
Step 8. Make a business decision
• After a statistical decision is made, make a business decision, i.e., decide what business implications the study results contain • E.g., if the hypothesis-testing procedure results in a conclusion that train passengers are significantly older today than they were in the past, the manager may decide to cater to these older customers or to draw up a strategy to make ridership more appealing to younger people • It is at this step that the business decision maker must decide whether a statistically significant result is really a substantive result
Step 3. Set the value of alpha, the Type I error rate
• Alpha is the probability of committing a Type I error • Common values of alpha include .05, .01, .10, and .001
Type I and Type II Errors
• Because the hypothesis testing process uses sample statistics calculated from random data to reach conclusions about population parameters, it is possible to make an incorrect decision about the null hypothesis • In particular, two types of errors can be made in testing hypotheses: Type I error and Type II error
Rejection and Nonrejection Regions
• Conceptually and graphically, statistical outcomes that result in the rejection of the null hypothesis lie in what is termed the rejection region • Statistical outcomes that fail to result in the rejection of the null hypothesis lie in what is termed the nonrejection region
Using the p-Value to Test Hypotheses: Two-Tailed Test
• For a two-tailed test, the p-value is compared to α/2 o In the CPA income problem, the observed value was z = 2.71 o From the standard normal table, the probability of a value greater than 2.71 is 0.5000 - 0.4966 = 0.0034 o Compare with α/2 = 0.025 o Reject H0 since 0.0034 < 0.025 • Some statisticians and software double the p-value for a two-sided test instead and compare to α o 2(0.0034) = 0.0068 < 0.05, so reject H0
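The two equivalent decision rules for the CPA income problem can be checked with a short sketch. The observed z = 2.71 and α = .05 come from the notes; the variable names are illustrative. Exact normal probabilities are used, so the doubled p-value comes out near 0.0067 rather than the table-rounded 0.0068.

```python
from statistics import NormalDist

z_observed = 2.71
alpha = 0.05

# Area in the upper tail beyond the observed z.
upper_tail = 1 - NormalDist().cdf(abs(z_observed))

# Rule 1: compare the one-tail area with alpha / 2.
reject_by_half_alpha = upper_tail < alpha / 2

# Rule 2: double the tail area and compare with alpha.
two_sided_p = 2 * upper_tail
reject_by_doubling = two_sided_p < alpha

print(round(upper_tail, 4), round(two_sided_p, 4))
print(reject_by_half_alpha, reject_by_doubling)
```

Both rules always agree, because doubling one side of the inequality is the same as halving the other.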
Comparing Type I and Type II Errors
• Generally, α and β are inversely related, i.e., if alpha is reduced, then beta is increased, and vice versa (e.g., if the courts make it harder to send innocent people to jail, then they have made it easier to let guilty people go free) • Increasing the sample size makes it possible to reduce both α and β, i.e., a larger sample is more likely to represent the population, so Type I and Type II errors are both less likely • Power (= 1 − β) is the probability of a statistical test rejecting the null hypothesis when the null hypothesis is false
Some Observations About Type II Errors
• If the alternative mean is close to the hypothesized value, the probability of a Type II error is high • If the alternative mean is far from the hypothesized value, the probability of a Type II error is small • In other words, hypothesis testing cannot always distinguish between similar values of the hypothesized mean, but can more easily identify means that are far apart
Substantive Hypotheses
• If the null hypothesis is rejected and therefore the alternative hypothesis is accepted, it is common to say that the result is statistically significant o The word significant means that the outcome of the experiment is unlikely to occur by chance and a decision has been made to reject the null hypothesis • One possible problem is that a statistically significant outcome may not be a significant business outcome (i.e., does not imply a material, substantive difference) • A substantive result occurs when the outcome of a statistical study produces statistically significant results that are also important to the decision maker
Step 1. Establish a null and an alternative hypothesis
• In establishing the null and alternative hypotheses, it is important that the business analyst clearly identify what is being tested and whether the hypotheses are one tailed or two tailed • It is always assumed that the null hypothesis is true at the beginning of the study • E.g., it is assumed that • the process is in control (no problem) • the market share has not increased • older workers are not more loyal to a company than younger workers
Step 5. Gather sample data
• Might include the construction and implementation of a survey, conducting focus groups, randomly sampling items from an assembly line, or even sampling from secondary data sources (e.g., financial databases) • Care should be taken in random sampling, establishing a frame, determining the sampling technique, constructing the measurement device, and avoiding all non-sampling errors (Ch. 7)
Operating Characteristics and Power Curves
• Power is the probability of rejecting the null hypothesis when it is false and is equal to 1 − β • Plotting the β values against the various values of the alternative hypothesis gives the operating characteristic (OC) curve • As the alternative means move away from the hypothesized mean, the curve shows that the probability of a Type II error decreases • Plotting power values against the various values of the alternative hypothesis gives the power curve • Power increases as the alternative mean moves away from the value of μ in the null hypothesis
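The points of an OC curve and a power curve can be tabulated with a few lines of code. This sketch assumes a lower-tailed z test with hypothetical inputs (hypothesized mean 12, σ = 0.10, n = 60, α = .05) and a hypothetical list of alternative means.

```python
import math
from statistics import NormalDist

def power_lower_tail(mu0, mu_alt, sigma, n, alpha):
    """Power of a lower-tailed z test when the true mean is mu_alt."""
    se = sigma / math.sqrt(n)
    critical_mean = mu0 + NormalDist().inv_cdf(alpha) * se
    # Power: probability the sample mean falls below the critical value.
    return NormalDist(mu_alt, se).cdf(critical_mean)

# Hypothetical alternative means, moving steadily away from mu0 = 12.
alternatives = [11.99, 11.98, 11.97, 11.96, 11.95]
powers = [power_lower_tail(12.0, mu_alt, 0.10, 60, 0.05) for mu_alt in alternatives]

for mu_alt, power in zip(alternatives, powers):
    # beta (the OC curve value) is the complement of power.
    print(mu_alt, round(1 - power, 4), round(power, 4))
```

Each row gives one point on both curves: β falls and power rises as the alternative mean moves farther from 12.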
Step 2. Determine the appropriate statistical test
• Select the most appropriate statistical test to use for the analysis • The business analyst needs to consider the type, number, and level of data being used in the study along with the statistic used in the analysis (mean, proportion, variance, etc.) • She should also consider the assumptions underlying certain statistical tests and determine whether they can be met in the study before using such tests
Using the Critical Value Method to Test Hypotheses
• The critical value method determines the critical mean value required for z to be in the rejection region and uses it to test the hypotheses • Use the z formula to solve for the critical value(s) of the sample mean • Reject the null hypothesis if the observed mean is greater than the critical value (for a one-tailed, upper-tail test)
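Solving the z formula for the critical sample mean can be sketched as below for a one-tailed, upper-tail test. All numbers are hypothetical (hypothesized mean 50, σ = 10, n = 100, α = .05, observed mean 52).

```python
import math
from statistics import NormalDist

def upper_tail_critical_mean(mu0, sigma, n, alpha):
    """Sample mean that corresponds to the critical z in an upper-tail test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Rearranged z formula: x_bar_c = mu0 + z_crit * sigma / sqrt(n)
    return mu0 + z_crit * sigma / math.sqrt(n)

# Hypothetical inputs.
critical_mean = upper_tail_critical_mean(mu0=50.0, sigma=10.0, n=100, alpha=0.05)
observed_mean = 52.0

# Decision rule: reject H0 if the observed mean exceeds the critical mean.
reject = observed_mean > critical_mean
print(round(critical_mean, 3), reject)
```

The advantage of this form is that the decision rule is stated in the original units of the data, so the observed mean can be compared with the cutoff directly.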
Using the p-Value to Test Hypotheses
• The p-value (a.k.a. observed significance level) is another way to reach a statistical conclusion in hypothesis testing • The probability of getting a test statistic at least as extreme as the observed test statistic (computed from the data) is computed under the assumption that the null hypothesis is true • The p-value defines the smallest value of α for which the null hypothesis can be rejected • E.g., if the p-value of a test is .038, the null hypothesis cannot be rejected at α = .01 because the p-value is not less than α (however, the null hypothesis can be rejected for α = .05)
Step 6. Analyze the data
• The test statistic can be calculated
Two Parts of a Statistical Hypothesis
• These two parts are constructed to contain all possible outcomes of the experiment or study • Null Hypothesis (H0 ): states that the "null" condition exists; that is, there is nothing new happening, the old theory is still true, the old standard is correct, and the system is in control • Alternative Hypothesis (Ha ): states that the new theory is true, there are new standards, the system is out of control, and/or something is happening → what you want to prove • The null hypothesis is the "complement" of the alternative hypothesis (i.e., mutually exclusive and collectively exhaustive)
Step 4. Establish the decision rule
• Using alpha and the test statistic, critical values can be determined • These critical values are used at the decision step to determine whether the null hypothesis is rejected or not • If the p-value method is used, the value of alpha is used as a critical probability value • The process begins by assuming that the null hypothesis is true • Data are gathered and statistics computed • If the evidence is away from the null hypothesis, the analyst begins to doubt that the null hypothesis is really true • If the evidence is far enough away from the null hypothesis that the critical value is surpassed, she will reject the null hypothesis and declare that a statistically significant result has been attained
Step 7. Reach a statistical conclusion
• Using the previously established decision rule (in step 4) and the value of the test statistic, draw a statistical conclusion • In all hypothesis tests, conclude whether the null hypothesis is rejected or is not rejected