CFA Level 1 - Section 2: Quantitative Methods - Reading 12: Hypothesis Testing
z-test
A z-test is a hypothesis test that uses a z-statistic, which follows a z-distribution (the standard normal distribution).
Statistical Decision
Making a statistical decision involves using the stated decision rule to determine whether or not to reject the null hypothesis. A statistical decision is based solely on statistical analysis of the sample information.
Null Hypothesis Definition
Step 1: State the Null Hypothesis and the Alternate Hypothesis The first step is to state the null hypothesis (designated H0), which is the statement that is to be tested. The null hypothesis is a statement about the value of a population parameter. The null hypothesis will either be rejected or fail to be rejected. The alternate hypothesis (designated H1) is the statement that is accepted if the sample data provides sufficient statistical evidence that H0 is false.
Alternate Hypothesis Definition
The alternate hypothesis is the statement that is accepted if the sample data provides sufficient evidence that the null hypothesis is false. It is designated as H1 and is accepted if the sample data provides sufficient statistical evidence that H0 is false.
z-Test Formula for a Single Population Mean (μ)
The z-test should be used if the population is normally distributed with known variance. For a z-test concerning a single population mean (μ), the test statistic is the z-statistic, z = (X̄ - μ0) / (σ/√n), which follows the standard normal distribution.
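As a rough sketch, the z-statistic for a single population mean can be computed as follows. The figures used are hypothetical, not from the reading:

```python
import math

def z_stat(x_bar, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n)),
    valid when the population is normal with known variance sigma^2."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

# Hypothetical figures: sample mean 5.2 vs. hypothesized mean 5.0,
# known population standard deviation 0.8, sample size 64.
z = z_stat(5.2, 5.0, 0.8, 64)  # 0.2 / 0.1 = 2.0
```

With z = 2.0 against a two-tailed rejection point of 1.96, H0 would be rejected at the 5% significance level.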
t-Test Formula for a Single Population Mean (μ)
This test should be used if the population variance is unknown and either of the following conditions holds: 1.) The sample size is large (in general, n ≥ 30). 2.) The sample size is small (n < 30) but the population is normally distributed or approximately normally distributed. For a t-test concerning a single population mean (μ), the test statistic to be used is the t-statistic, t = (X̄ - μ0) / (s/√n), with n - 1 degrees of freedom.
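A minimal sketch of the t-statistic calculation, using a hypothetical small sample:

```python
import math

def t_stat(x_bar, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n)),
    with n - 1 degrees of freedom; used when the population variance
    is unknown and estimated by the sample standard deviation s."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# Hypothetical sample; test H0: mu = 5.0.
sample = [5.1, 4.8, 5.3, 5.0, 4.9, 5.4, 5.2, 4.7]
n = len(sample)
x_bar = sum(sample) / n
# Sample standard deviation (divisor n - 1).
s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
t = t_stat(x_bar, 5.0, s, n)
df = n - 1
```

The resulting t-value would then be compared with the rejection point(s) from a t-table with df degrees of freedom.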
Decision Rule
A decision rule is a statement of the conditions under which the null hypothesis will be rejected and under which it will not be rejected. In general, the decision rule is that: 1.) If the magnitude of the calculated test statistic exceeds the rejection point(s), the result is considered statistically significant and the null hypothesis (H0) should be rejected. 2.) Otherwise, the result is considered not statistically significant and the null hypothesis (H0) should not be rejected. The specific decision rule varies depending on two factors: 1.) The distribution of the test statistic. 2.) Whether the hypothesis test is one-tailed or two-tailed. For the following discussion, you assume that the test statistic follows a standard normal distribution. Therefore: 1.) The calculated value of the test statistic is a standard normal variable, denoted by z. 2.) Rejection points are determined from the standard normal distribution (z-distribution).
z-Test versus t-Test Summary
For a population with unknown variance, the t-test should be used. If the sample size is large, it is also acceptable to use the z-test because of the central limit theorem. Recall that according to the central limit theorem, if the sample size is sufficiently large, the sampling distribution of the sample mean will be approximately normally distributed; in that case the z-statistic is computed with the sample standard deviation in place of the unknown population standard deviation (Reading 12, Page 12). Summary: 1.) In practice, the population variance is typically unknown. 2.) When the population variance is unknown, use the t-test; for large samples, the z-test is an acceptable alternative.
t-test
A t-test is a hypothesis test that uses a t-statistic, which follows a t-distribution.
Testing a Hypothesis Steps
Step 1: State the Null Hypothesis and the Alternate Hypothesis
Step 2: Determine the Appropriate Test Statistic and its Probability Distribution
Step 3: Select the Level of Significance
Step 4: Formulate the Decision Rule
Step 5: Collect the Data, Perform Calculations
Step 6: Make the Statistical Decision
Step 7: Make the Economic or Investment Decision
Critical Value (or Rejection Point)
The critical value (or rejection point) is the dividing point between the region where the null hypothesis is rejected and the region where it is not rejected. The region of rejection defines the location of all those values that are so large (in absolute value) that the probability of their occurrence is low if the null hypothesis is true.
Economic Decision (Investment Decision)
The economic or investment decision takes into consideration not only the statistical decision, but also all economic issues pertinent to the decision. Slight differences from a hypothesized value may be statistically significant but not economically meaningful (taking into account transaction costs, taxes, and risk). For example, it may be that a particular strategy has been shown to be statistically significant in generating value-added returns. However, the costs of implementing that strategy may be such that the added value is not sufficient to justify the costs required. Thus, while statistical significance may suggest a particular course of action is optimal, the economic decision also takes into account the costs associated with the strategy, and whether the expected benefits justify implementing the strategy.
Nonparametric Tests
There are other types of hypothesis tests, which may not involve a population parameter or much in the way of assumptions about the population distribution underlying a parameter. Such tests are nonparametric tests. Nonparametric tests have different characteristics: 1.) They are concerned with quantities other than parameters of distributions. 2.) They can be used when the assumptions of parametric tests do not hold for the particular data under consideration. 3.) They make minimal assumptions about the population from which the sample comes. A common example is the situation in which an underlying population is not normally distributed. Other tests, such as a median test or the sign test, can be used in place of t-tests for means and paired comparisons, respectively. Nonparametric tests are normally used in three cases: 1.) When the distribution of the data to be analyzed indicates or suggests that a parametric test is not appropriate. 2.) When the data are ordinal or ranked, as parametric tests normally require the data to be interval or ratio. One might be ranking the performance of investment managers; such rankings do not lend themselves to parametric tests because of their scale. 3.) When a test does not involve a parameter. For instance, in evaluating whether or not an investment manager has had a statistically significant record of consecutive successes, a nonparametric runs test might be employed. Another example: if you want to test whether a sample is randomly selected, a nonparametric test should be used. In general, parametric tests are preferred where they are applicable. They have stricter assumptions that, when met, allow for stronger conclusions. However, nonparametric tests have broader applicability and, while not as precise, do add to your understanding of phenomena, particularly when no parametric tests can be effectively used.
Test Statistic
A test statistic is simply a number, calculated from a sample, whose value, relative to its probability distribution, provides a degree of statistical evidence against the null hypothesis. In many cases, the test statistic will not provide evidence sufficient to justify rejecting the null hypothesis. However, sometimes the evidence will be strong enough so that the null hypothesis is rejected and the alternative hypothesis is accepted instead. The value of the test statistic is the focal point of assessing the validity of a hypothesis. Typically, the test statistic will be of the general form: test statistic = (sample statistic - parameter value under H0) / standard error of sample statistic
Confidence Interval
A confidence interval is a range of values within which it is believed that a certain unknown population parameter (often the mean) will fall with a certain degree of confidence. The degree of confidence is denoted (1 - α)%. Note how the confidence interval is related to the test statistic: they are linked by the rejection point(s), so confidence intervals can be used to test hypotheses. If the confidence interval contains the value of the unknown population parameter as hypothesized under H0, then H0 would not be rejected in a two-sided hypothesis test with corresponding α; if the confidence interval does not contain the hypothesized value, then H0 would be rejected in a two-sided hypothesis test with corresponding α. The reason is as follows: 1.) If the hypothesized value does not fall in the confidence interval, then it is implausible as a value for the unknown parameter, so H0 is rejected. 2.) If the hypothesized value does fall in the confidence interval, then it remains a plausible value for the unknown parameter, so H0 is not rejected. This comparison can be used only for two-sided tests, not one-sided tests, because a standard confidence interval corresponds to a two-sided test.
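The link between a two-sided test and a confidence interval can be sketched as follows. The numbers are hypothetical, and the 1.96 rejection point assumes a 95% normal-approximation interval:

```python
import math

def two_sided_ci(x_bar, s, n, z_crit=1.96):
    """95% confidence interval for the mean (normal approximation)."""
    half = z_crit * s / math.sqrt(n)
    return x_bar - half, x_bar + half

def reject_h0(ci, mu0):
    """Reject H0: mu = mu0 exactly when mu0 lies outside the interval."""
    lo, hi = ci
    return not (lo <= mu0 <= hi)

# Hypothetical sample: mean 10.5, sample std dev 2.0, n = 100.
ci = two_sided_ci(x_bar=10.5, s=2.0, n=100)  # roughly (10.108, 10.892)
```

Here a hypothesized mean of 10.0 falls outside the interval (rejected), while 10.5 falls inside (not rejected), matching the two-sided test at α = 0.05.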
Hypothesis
A hypothesis is a statement about a population created for the purpose of statistical testing. All hypothesis tests involve making statements about population parameters, or population distributions, and testing those statements based on samples taken from the population to see whether the statements are true. Examples of hypotheses made about a population parameter are: 1.) The mean monthly income for a financial analyst is $5,000. 2.) 25% of R&D costs are ultimately written off. 3.) 19% of net income statements are later found to be materially incorrect.
The Decision Rule for a One-Tailed Test
A one-tailed hypothesis test for the population mean (μ) is structured as H0: μ ≤ μ0 vs. H1: μ > μ0. For a given level of significance (α), the probability of a Type I error equals α, all in the right tail. The decision rule for a one-tailed test is illustrated by the following example: for a test of H0: μ ≤ μ0 vs. H1: μ > μ0 at the 5% significance level, the total probability of a Type I error equals 5%, all in the right tail. As a result, the rejection point is z0.05 = 1.645. Therefore, the null hypothesis will be rejected if z > 1.645, where z is the calculated value of the test statistic. If z ≤ 1.645, the null hypothesis cannot be rejected. If a one-tailed hypothesis test for the population mean (μ) is structured as H0: μ ≥ μ0 vs. H1: μ < μ0, similar analysis can be performed (i.e., reject H0 if z < -zα).
Test of Significance
A test of significance is a test to ascertain whether or not the value of an unknown population parameter is as stated by an individual or institution. The test is carried out at a significance level of α, which determines the size of the rejection region. The calculated test statistic is compared with the rejection point(s) implied by α: if it falls in the rejection region, the stated value is rejected.
The Decision Rule for a Two-Tailed Test
A two-tailed hypothesis test for the population mean (μ) is structured as H0: μ = μ0 vs. H1: μ ≠ μ0. For a given level of significance (α), the total probability of a Type I error must sum to α, with α/2 in each tail. Therefore, the decision rule for a two-tailed test is: For a two-sided test at the 5% significance level, the total probability of a Type I error must sum to 5%, with 5%/2 = 2.5% in each tail. As a result, the two rejection points are -z0.025 = -1.96 and z0.025 = 1.96. These values are obtained from the z-table. Therefore, the null hypothesis can be rejected if z < -1.96 or z > 1.96, where z is the calculated value of the test statistic. If the calculated value of the test statistic falls within the range of -1.96 and 1.96, the null hypothesis cannot be rejected.
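The one- and two-tailed decision rules at the 5% significance level can be sketched as simple comparisons against the rejection points:

```python
def two_tailed_decision(z, z_crit=1.96):
    """Reject H0: mu = mu0 at alpha = 0.05 if |z| exceeds the rejection point."""
    return "reject H0" if abs(z) > z_crit else "fail to reject H0"

def one_tailed_decision_upper(z, z_crit=1.645):
    """Reject H0: mu <= mu0 at alpha = 0.05 if z exceeds the right-tail rejection point."""
    return "reject H0" if z > z_crit else "fail to reject H0"
```

For example, a calculated z of 2.3 is rejected under both rules, while z = 1.7 is rejected only under the one-tailed rule.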
Rejection Point(s)
Compare the calculated value of the t-statistic with the rejection point(s) to make the decision. For a two-tailed test (H0: μ = μ0 versus Ha: μ ≠ μ0), there are two rejection points. Reject the null if the t-statistic is greater than the upper rejection point or less than the lower rejection point. For a one-tailed test (H0: μ ≤ μ0 versus Ha: μ > μ0), there is only one rejection point: the upper rejection point. Reject the null if the t-statistic is greater than the upper rejection point. For a one-tailed test (H0: μ ≥ μ0 versus Ha: μ < μ0), there is only one rejection point: the lower rejection point. Reject the null if the t-statistic is less than the lower rejection point.
Chi-square Table - "Probability in Right Tail"
Critical values for this distribution are found in the chi-square table. The table is titled "Probability in Right Tail," so be aware of what this region of the table gives and modify your calculation accordingly. It is possible to use both tails in this distribution, despite the fact that it is non-symmetrical.
Type II error
Failing to reject the null hypothesis when it is false is known as a Type II error. The probability of a Type II error (the Type II error rate) is designated by the Greek letter beta (β). A Type II error is only an error in the sense that an opportunity to reject the null hypothesis correctly was lost. It is not an error in the sense that an incorrect conclusion was drawn, since no conclusion is drawn when the null hypothesis is not rejected. β is not determined by α, although the two move in opposite directions: all else equal, the smaller α is made, the larger β becomes.
Test Statistic For the Mean of a Distribution
For example, a test statistic for the mean of a distribution (such as the mean monthly return for a stock index) often follows a standard normal distribution. In such a case, the z-test is used, with z = (X̄ - μ0) / (s/√n), where: X̄ = sample mean; μ0 = hypothesized value; s = sample standard deviation; n = sample size. Note that this assumes the population variance (and, therefore, the population standard deviation) is unknown and can only be estimated from the sample data. After setting up H0 and H1, the next step is to state the level of significance, which is the probability of rejecting the null hypothesis when it is actually true. Alpha (α) is used to represent this probability. The idea behind setting the level of significance is to choose the probability that any decision will be subject to a Type I error. There is no one level of significance that is applied to all studies involving sampling. A decision by the researcher must be made to use the 0.01 level, 0.05 level, 0.10 level, or any other level between 0 and 1. A lower level of significance means that there is a lower probability that a Type I error will be made.
Parametric tests
In hypothesis tests, analysts are usually concerned with the values of parameters, such as means or variances. To undertake such tests, analysts must make assumptions about the distribution of the population underlying the sample from which test statistics are derived. Tests with either of these characteristics (concern with parameters, or reliance on assumptions about the underlying distribution) are described as parametric tests. All hypothesis tests that have been considered in this section are parametric tests. For example, an F-test relies on two assumptions: 1.) Populations 1 and 2 are normally distributed. 2.) The two random samples drawn from these populations are independent. The F-test is concerned with the difference between the variances of the two populations. Variance is a parameter of a normal distribution. Therefore, the F-test is a parametric test.
Hypothesis Tests Concerning Differences between Means (normally distributed populations, population variances unknown but assumed equal)
In practice, analysts often want to know whether the means of two populations are equal or whether one is larger than the other. If it is reasonable to believe that the samples are from populations at least approximately normally distributed and that the samples are also independent of each other, whether a mean value differs between the two populations can be tested. The test procedure is the same as before, with just a couple of modifications. As mentioned previously, the null hypothesis involves an equal sign; in this situation, the null hypothesis is that the two unknown population means are equal. The alternative hypothesis involves one of >, < or ≠. The rest of the testing procedure is the same, but the test statistic is different. The test statistic to be used in this section is a t-value, but it varies based on the assumptions. The assumption has been made throughout that the populations are normally distributed. 1. Test statistic for a test of the difference between two population means (normally distributed populations, population variances unknown but assumed equal): t = [(X̄1 - X̄2) - (μ1 - μ2)] / √(sp²/n1 + sp²/n2), with n1 + n2 - 2 degrees of freedom, where the pooled variance is: sp² = [(n1 - 1)(s1²) + (n2 - 1)(s2²)] / (n1 + n2 - 2) Reading 12, Page 15 to 17
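A sketch of the pooled-variance test statistic described above, under H0: μ1 = μ2 (the input lists are hypothetical):

```python
import math

def pooled_t(x1, x2):
    """Two-sample t-statistic with pooled variance, for normal populations
    with unknown but equal variances; H0: mu1 = mu2."""
    n1, n2 = len(x1), len(x2)
    m1 = sum(x1) / n1
    m2 = sum(x2) / n2
    # Sample variances (divisor n - 1).
    v1 = sum((x - m1) ** 2 for x in x1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in x2) / (n2 - 1)
    # Pooled variance: sp^2 = [(n1-1)s1^2 + (n2-1)s2^2] / (n1 + n2 - 2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 / n1 + sp2 / n2)
    df = n1 + n2 - 2
    return t, df
```

Note that the degrees of freedom combine both sample sizes, unlike the paired-comparisons case.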
Type I Error
Incorrectly rejecting the null hypothesis when it is true is known as a Type I error. The probability of a Type I error is designated by the Greek letter alpha (α) and is called the Type I error rate. A Type I error, on the other hand, is an error in every sense of the word: a conclusion is drawn that the null hypothesis is false when, in fact, it is true. Therefore, Type I errors are generally considered more serious than Type II errors. The probability of a Type I error (α) is called the significance level; it is set by the experimenter. For example, a 5% level of significance means that there is a 5% probability of rejecting the null hypothesis when it is true. There is a trade-off between Type I and Type II errors. The more an experimenter protects him or herself against Type I errors by choosing a low significance level, the greater the chance of a Type II error. Requiring very strong evidence to reject the null hypothesis makes it very unlikely that a true null hypothesis will be rejected, but it increases the chance that a false null hypothesis will not be rejected, thus lowering the power of the test. The Type I error rate is almost always set at 0.05 or at 0.01, the latter being more conservative (since it requires stronger evidence to reject the null hypothesis at the 0.01 level than at the 0.05 level).
P-Value Approach to Hypothesis Testing
Original hypothesis testing procedure:
Step 1: State the hypotheses.
Step 2: Identify the test statistic and its probability distribution.
Step 3: Specify the significance level.
Step 4: State the decision rule.
Step 5: Collect the data in the sample and calculate the necessary value(s) using the sample data.
Step 6: Make the statistical decision regarding the hypotheses.
Step 7: Make the economic or investment decision.
When a p-value approach is used, steps 3 and 4 become redundant. The new steps are as follows:
Step 1: State the hypotheses.
Step 2: Identify the test statistic and its probability distribution.
Step 3: Collect the data in the sample and calculate the necessary value(s) using the sample data.
Step 4: Calculate the p-value for the test.
Step 5: Make the statistical decision regarding the hypotheses.
Step 6: Make the economic or investment decision.
It can thus be seen that the p-value approach is simpler, and it also provides useful information about the strength of the evidence given by the test statistic.
P-Value
P-value is the area from the test statistic to the end of the tail (or tails, in the case of a two-sided test) of interest. The "p" stands for probability, and the p-value is a probability. Effectively, it represents the chance of obtaining a test statistic whose value is at least as extreme as the one just calculated in the test, when H0 is true. The term "tail of interest" means the right-hand tail in a > test, the left-hand tail in a < test, and both tails in a ≠ test. If the test statistic is very extreme (that is, close to the tail), then the area from that value to the end of the tail will be very small; by definition, the p-value is thus very small. In this case, the test statistic would certainly fall in the rejection region. Conversely, if the test statistic is not extreme (that is, it is far from the tail) and would thus fall in the acceptance region, the area from the test statistic to the end of the tail would be fairly large. In this case, the p-value is fairly large. A small p-value therefore indicates a rejection of the null hypothesis, whereas a large p-value indicates a non-rejection of the null hypothesis. But how large is large? The p-value is normally compared with the usual α value, so 0.05 is a common reference point. This means that, for p-values smaller than 0.05, H0 would be rejected, and for p-values larger than 0.05, H0 would not be rejected. However, the p-value provides not only a reference point but also an idea of just how strong or weak the rejection or non-rejection is. For example, a p-value of 0.0003 would indicate a very strong rejection of H0 at virtually any α-level, whereas a p-value of 0.048 would indicate a rejection of H0 when compared with an α value of 0.05 but a non-rejection of H0 when compared with an α value of 0.01. In general, the larger the p-value, the smaller the likelihood that H0 will be rejected, and vice versa.
In the case of a two-sided test, if the area from the test statistic to the end of the right tail is 0.04, for example, then the p-value will be 0.08; doubling the area is necessary because both tails are being used. It's as simple as that. Examples: Now try the following for yourself. In each case, determine whether or not you would reject H0 when compared with an α value of 0.05. You are given the following p-values. A.) 0.02 B.) 0.456 C.) 0.053 D.) 0.0001 The correct answers are: A.) Reject H0. B.) Do not reject H0: The p-value is very large (over 45%), and therefore you are very far away from a rejection of H0. C.) Do not reject H0: The p-value is only just more than 0.05, so you are close to a rejection. D.) Reject H0: The p-value is very small, so you would reject H0 in almost all cases. Now that you understand how p-values work, let us go through the testing procedure again: 1.) First write down your hypotheses, then determine which test statistic to use. There is no need to specify a significance level, and hence there is no specifically demarcated rejection region. 2.) Calculate the test statistic and determine the resultant p-value. 3.) The conclusion of the test is based on your p-value. Recall that a small p-value means a rejection of H0, whereas a large p-value means a non-rejection of H0. Note: p-values can be worked out for all continuous statistical distributions, no matter what the shape of the distribution is. So, this method can be used for z-, t-, χ²- and F-distributions.
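The p-value rules above can be sketched without statistical libraries, using the error function for the standard normal CDF (the function names are illustrative):

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function (no SciPy needed)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value(z, tail):
    """tail: 'right' for H1: mu > mu0, 'left' for H1: mu < mu0,
    'two' for H1: mu != mu0 (double the one-tail area)."""
    if tail == "right":
        return 1.0 - std_normal_cdf(z)
    if tail == "left":
        return std_normal_cdf(z)
    return 2.0 * (1.0 - std_normal_cdf(abs(z)))
```

For example, z = 1.96 gives a two-sided p-value of about 0.05, and z = 1.645 gives a right-tail p-value of about 0.05, matching the rejection points quoted earlier.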
Power (Hypothesis Testing)
Power is the probability of correctly rejecting a false null hypothesis. Power is therefore defined as 1 - β, where β is the Type II error probability. If the power of an experiment is low, then there is a good chance that the experiment will be inconclusive. That is why it is so important to consider power in the design of experiments. There are methods for estimating the power of an experiment before the experiment is conducted. If the power is too low, then the experiment can be redesigned by changing one of the factors that determine power. Sometimes more than one test statistic can be used to conduct a hypothesis test. In this case, the relative power of the competing statistics should be computed, and the test statistic that is most powerful should be selected. Consider a hypothetical experiment designed to test whether rats brought up in an enriched environment can learn mazes faster than rats brought up in the typical laboratory environment (the control condition). Two groups of 12 rats are tested. Although the experimenter does not know it, the population mean number of trials it takes to learn the maze is 20 for rats from the enriched environment and 32 for rats from the control condition. The null hypothesis that the enriched environment makes no difference is therefore false. The question is: what is the probability that the experimenter is going to be able to demonstrate that the null hypothesis is false by rejecting it at the 0.05 level? This is the same as asking: what is the power of the test? Before the power of the test can be determined, the standard deviation (σ) must be known. If σ = 10, then the power of the significance test is 0.80. This means that there is a 0.80 probability that the experimenter will be able to reject the null hypothesis. Since power = 0.80, β = 1 - 0.80 = 0.20. It is important to keep in mind that power is not about whether or not the null hypothesis is true (it is assumed to be false).
It is the probability the data gathered in an experiment will be sufficient to reject the null hypothesis. The experimenter does not know that the null hypothesis is false. The experimenter asks the question: If the null hypothesis is false with specified population means and standard deviation, what is the probability that the data from the experiment will be sufficient to reject the null hypothesis? If the experimenter discovers that the probability of rejecting the null hypothesis is low (power is low), even if the null hypothesis is false to the degree expected (or hoped for), then it is likely that the experiment should be redesigned. Otherwise, considerable time and expense will go into a project that has little chance of being conclusive even if the theoretical ideas behind it are correct.
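Under a normal approximation (an assumption; the exact calculation uses the noncentral t-distribution and yields roughly the 0.80 quoted above), the power of the maze experiment can be sketched as follows. The helper name is illustrative:

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def approx_power_two_sample(mu1, mu2, sigma, n_per_group, z_crit=1.96):
    """Normal-approximation power for a two-sided, two-sample test at
    alpha = 0.05. Exact t-based power is slightly lower for small n."""
    se = sigma * math.sqrt(2.0 / n_per_group)  # standard error of the mean difference
    delta = abs(mu1 - mu2) / se  # standardized difference under the alternative
    return 1.0 - std_normal_cdf(z_crit - delta) + std_normal_cdf(-z_crit - delta)

# The maze example: means 20 vs. 32, sigma = 10, n = 12 per group.
power = approx_power_two_sample(20, 32, 10, 12)
```

The approximation gives a power in the low 0.80s, consistent with the 0.80 figure in the example.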
Test Statistic
Step 2: Determine the Appropriate Test Statistic and its Probability Distribution A test statistic is simply a number, calculated from a sample, whose value, relative to its probability distribution, provides a degree of statistical evidence against the null hypothesis. In many cases, the test statistic will not provide evidence sufficient to justify rejecting the null hypothesis. However, sometimes the evidence will be strong enough so that the null hypothesis is rejected and the alternative hypothesis is accepted instead. Typically, the test statistic will be of the general form: test statistic = (sample statistic - parameter value under H0) / standard error of sample statistic
Level of Significance
Step 3: Select the Level of Significance After setting up H0 and H1, the next step is to state the level of significance, which is the probability of rejecting the null hypothesis when it is actually true. Alpha is used to represent this probability. The idea behind setting the level of significance is to choose the probability that a decision will be subject to a Type I error. There is no one level of significance that is applied to all studies involving sampling. A decision by the researcher must be made to use the 0.01 level, 0.05 level, 0.10 level, or any other level between 0 and 1. A lower level of significance means that there is a lower probability that a Type I error will be made.
Decision Rule
Step 4: Formulate the Decision Rule A decision rule is a statement of the conditions under which the null hypothesis will be rejected and under which it will not be rejected. The critical value (or rejection point) is the dividing point between the region where the null hypothesis is rejected and the region where it is not rejected. The region of rejection defines the location of all those values that are so large (in absolute value) that the probability of their occurrence is low if the null hypothesis is true.
Hypothesis Decision
Step 6: Making a Decision If the calculated test statistic falls in the rejection region (beyond the critical value in the direction specified by the alternative hypothesis, or beyond either critical value in a two-tailed test), then the null hypothesis is rejected in favor of the alternative hypothesis; otherwise, it is not rejected.
Chi-square Test for Variance or Standard Deviation
Suppose an analyst is interested in testing whether the variance from a single population is statistically equal to some hypothesized value. Let σ² represent the variance and let σ0² represent the hypothesized value. The null and alternative hypotheses would be expressed as: H0: σ² = σ0², versus H1: σ² ≠ σ0². Also note that directional hypotheses could be made instead: H0: σ² ≤ σ0², versus H1: σ² > σ0², or H0: σ² ≥ σ0², versus H1: σ² < σ0². The test statistic to be used is a chi-square (χ²) statistic, χ² = (n - 1)s² / σ0², with n - 1 degrees of freedom. In contrast to the t-test, the chi-square test is sensitive to violations of its assumptions. If the sample is not actually random or if it does not come from a normally distributed population, inferences based on a chi-square test are likely to be faulty.
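A sketch of the chi-square test statistic for a variance, with hypothetical sample numbers:

```python
def chi_square_stat(s2, sigma0_sq, n):
    """chi^2 = (n - 1) * s^2 / sigma0^2, with n - 1 degrees of freedom.
    Valid only for a random sample from a normal population."""
    return (n - 1) * s2 / sigma0_sq

# Hypothetical: sample variance 0.0036 from n = 25 observations,
# testing the hypothesized variance sigma0^2 = 0.0025.
chi2 = chi_square_stat(0.0036, 0.0025, 25)  # 24 * 1.44 = 34.56
```

The calculated value would then be compared with critical values from the chi-square table with 24 degrees of freedom, adjusting for the table's "Probability in Right Tail" convention.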
The difference between when a test for independence is used and when a paired comparisons test is used
Test for Independence: A test of the differences in means is used when there are two independent samples. Essentially, you have two separate groups, and you wish to compare their population means. Paired comparisons test: A test of the mean of the differences is used when the samples are dependent, either because you have a before-and-after situation or because there is an inherent relation between the pairs. In the first case, you keep the groups completely separate and combine their sample sizes for the purpose of calculating degrees of freedom. In the second case, you reduce the two samples to a single sample of differences and treat the entire process from then on as if you were dealing with a single sample. Another telltale sign is to look at sample sizes. If the sample sizes are different, the test has to be for independent samples, as paired comparisons tests require equal-sized samples. If the sample sizes are the same, either test could be used. What the two procedures have in common is that they both require normally distributed populations and they both make use of t-tests.
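A sketch of the paired-comparisons procedure: reduce the two dependent samples to a single sample of differences, then run a one-sample t-test on those differences (data hypothetical):

```python
import math

def paired_t(before, after):
    """Paired-comparisons t-statistic: one-sample t-test on the
    differences, with n - 1 degrees of freedom (not n1 + n2 - 2)."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = sum(d) / n
    # Sample standard deviation of the differences (divisor n - 1).
    s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    t = d_bar / (s_d / math.sqrt(n))
    return t, n - 1
```

The equal sample sizes are a structural requirement here: each "after" value must pair with a "before" value.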
F-Statistic or Fisher Statistic
The F-statistic or Fisher statistic is the test statistic for equality of variances of two normally distributed populations. Label the populations 1 and 2, as has always been done; the choice of labeling is arbitrary. F = (s1)^2 / (s2)^2, where (s1)^2 and (s2)^2 are the sample variances from two samples, which have n1 and n2 observations respectively. The samples are random, independent of each other, and generated by normally distributed populations. The F stands for Fisher, the name of the person who formulated this distribution. This test statistic has n1 - 1 degrees of freedom in the numerator and n2 - 1 degrees of freedom in the denominator. The values n1 - 1 and n2 - 1 are also the divisors used in calculating (s1)^2 and (s2)^2, respectively. Another point of interest is that, under H0, the ratio (σ1^2)/(σ2^2) is 1. An F-test checks whether or not the test statistic, (s1)^2/(s2)^2, is close to 1. If it is, you will obtain a non-rejection of H0; if not, you will obtain a rejection of H0. For this reason, it is common practice to use the larger of the two ratios, (s1)^2/(s2)^2 or (s2)^2/(s1)^2, as the actual test statistic. So, whichever of (s1)^2 and (s2)^2 is larger should go in the numerator, with the smaller number in the denominator. If these values are equal, it makes no difference which value goes on top, as the test statistic will be 1 in either case. Like the chi-square distribution: 1.) The F-distribution is bounded below by zero, and therefore values of the F-distribution test statistic cannot be negative. 2.) The F-distribution is asymmetrical. 3.) The F-distribution is a family of distributions. 4.) The F-test is not very robust when its assumptions are violated. Unlike the chi-square distribution, the F-distribution is determined by two parameters: the degrees of freedom in the numerator and the degrees of freedom in the denominator. Note that F(8,6) refers to an F-distribution with 8 numerator and 6 denominator degrees of freedom.
Another area in which F-tests differ from chi-square tests is that, for F-tests, the rejection region is always in the right tail of the graph, never the left tail. This presents few difficulties. It does mean that if you are conducting a one-sided test, the full significance level goes into the right tail, whereas if you are conducting a two-sided test, you divide the significance level by 2 to obtain the area in the right tail. The remaining half of the area effectively falls away because of the convention of putting the larger sample variance on top in the test statistic. Provided you do this, you will have no problems here. Critical values for this distribution are found in F-tables. In order to find the required value, look up the entry that corresponds to the correct numerator and denominator degrees of freedom.
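The mechanics above can be sketched in a few lines of Python. The two samples below are hypothetical data for illustration only; the code follows the convention of putting the larger sample variance in the numerator, so the statistic is at least 1 and the rejection region stays in the right tail.

```python
import statistics

# Hypothetical samples (illustrative data, not from the reading)
sample1 = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2]
sample2 = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8, 5.1, 4.9]

var1 = statistics.variance(sample1)  # uses divisor n1 - 1
var2 = statistics.variance(sample2)  # uses divisor n2 - 1

# Convention: larger sample variance in the numerator, so F >= 1
if var1 >= var2:
    f_stat = var1 / var2
    df_num, df_den = len(sample1) - 1, len(sample2) - 1
else:
    f_stat = var2 / var1
    df_num, df_den = len(sample2) - 1, len(sample1) - 1

print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) degrees of freedom")
# Reject H0 if F exceeds the right-tail critical value from an F-table
```

The final comparison against the F-table critical value is left to the reader, since the critical value depends on the chosen significance level and both degrees of freedom.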
Alternate Hypothesis Explanation
The alternate hypothesis is the statement that is accepted if the sample data provides sufficient evidence that the null hypothesis is false. It is designated as H1 and is accepted if the sample data provides sufficient statistical evidence that H0 is false. The following example clarifies the difference between the two hypotheses. Suppose the mean time to market for a new pharmaceutical drug is thought to be 3.9 years. The null hypothesis represents the current or reported condition and would therefore be H0: μ = 3.9. The alternate hypothesis is that this statement is not true, that is, H1: μ ≠ 3.9. The null and alternative hypotheses account for all possible values of the population parameter. There are three basic ways of formulating the null hypothesis. 1.) H0: μ = μ0 versus H1: μ ≠ μ0. This hypothesis is two-tailed, which means that you are testing evidence that the actual parameter may be statistically greater or less than the hypothesized value. 2.) H0: μ ≤ μ0 versus H1: μ > μ0. This hypothesis is one-tailed; it tests whether there is evidence that the actual parameter is significantly greater than the hypothesized value. If there is, the null hypothesis is rejected. If there is not, the null hypothesis is not rejected. 3.) H0: μ ≥ μ0 versus H1: μ < μ0. This hypothesis is one-tailed; it tests whether there is evidence that the actual parameter is significantly less than the hypothesized value. If there is, the null hypothesis is rejected. If there is not, the null hypothesis is not rejected. The question most likely to be raised at this point is how do you know whether a test is one-sided or two-sided? The general rule is as follows: 1.) If a question makes it clear that only one direction is to be examined, use a one-sided test. 2.) If there is no clue in the question as to which direction should be examined, use a two-sided test. Normally, there is little ambiguity; the question will make it clear which test should be used.
A question often asked involves testing whether a population mean is greater than or less than a specific number. In this case, use a one-tailed test. If the question asks you to test whether a population mean is different from a specific number, use a two-tailed test.
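The difference between the one-tailed and two-tailed decision rules can be sketched as follows. The z critical values 1.96 (two-tailed) and 1.645 (one-tailed) at the 5% significance level come from the standard normal table; the test statistic value of 1.80 is a hypothetical example.

```python
# Decision rules at the 5% significance level for a z-test
# (critical values 1.96 and 1.645 come from the standard normal table)

def reject_two_tailed(z, critical=1.96):
    # H0: mu = mu0 vs H1: mu != mu0 -- reject in either tail
    return abs(z) > critical

def reject_one_tailed_upper(z, critical=1.645):
    # H0: mu <= mu0 vs H1: mu > mu0 -- reject only in the right tail
    return z > critical

z = 1.80  # hypothetical computed test statistic
print(reject_two_tailed(z))        # False: |1.80| < 1.96
print(reject_one_tailed_upper(z))  # True:  1.80 > 1.645
```

Note that the same sample result can fail to reject H0 under a two-tailed test yet reject it under a one-tailed test, which is why reading the question's direction carefully matters.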
Null Hypothesis Explanation
The null hypothesis (designated H0) is the statement that is to be tested. The null hypothesis is a statement about the value of a population. The null hypothesis will either be rejected or fail to be rejected. 1.) For example, the null hypothesis could be: "The mean monthly return for stocks listed on the Vancouver Stock Exchange is not significantly different from 1%." Note that this is the same as saying the mean (μ) monthly return on stocks listed on the Vancouver Stock Exchange is equal to 1%. This null hypothesis, H0, would be written as: H0: μ = 1%. 2.) As another example, if a null hypothesis is stated as "There is no difference in the revenue growth rate for satellite TV dishes before and after the negative TV advertising campaign aired by the cable industry," then the null hypothesis could be written to show that the two rates are equal: H0: r1 = r2. It is important to point out that accepting the null hypothesis does not prove that it is true. It simply means that there is not sufficient evidence to reject it. Note that it makes no sense to hypothesize about known sample values, for the simple reason that they are known, just as it makes no sense to construct confidence intervals or obtain point estimates for known values. Hypothesis tests are carried out on unknown population parameters.
Chi-square graph
Unlike t-graphs and z-graphs, a chi-square graph is positively skewed. It is also truncated at zero, and thus is not defined for negative values. Like the family of t-graphs, the shape of the graph varies; the graph becomes more symmetrical as the degrees of freedom increase.
z-test and the Central Limit Theorem
When testing a hypothesis about a population mean, there are generally two options for the test statistic: the t-statistic and the z-statistic. The t-statistic may be used if either the sample is large (n ≥ 30) or, if n < 30, the population is at least approximately normally distributed. In most cases, this will be the statistic used, because in most practical problems the population variance is not known with certainty. The z-statistic is sometimes used with large sample sizes, because the central limit theorem implies that the distribution of a sample mean will be approximately normal as the sample size increases. As can be seen above, when the degrees of freedom are low, the t-graph is fairly flat in the center and has long tails and a bigger standard deviation. As the degrees of freedom increase, the tails become thinner and the graph peaks in the center. Recall also that the area under any graph is 1, as this area represents a probability. So, as the degrees of freedom increase, the area that is "lost" in the thinner tails is "found" in the center of the graph, which is why the graph becomes more peaked. However: 1.) There are differences between the t-test critical values and the z-test critical values (these can be significant but become smaller with large samples). 2.) The t-test is still the theoretically proper choice unless the population variance is known.
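The shrinking gap between t and z critical values can be illustrated with ordinary table entries. The two-tailed 5% t values below are standard t-table entries for the listed degrees of freedom, and 1.960 is the corresponding z value.

```python
# Two-tailed 5% critical values from a standard t-table,
# compared with the z critical value of 1.960
t_critical = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}
z_critical = 1.960

for df, t in sorted(t_critical.items()):
    print(f"df={df:>3}: t = {t:.3f}, excess over z = {t - z_critical:.3f}")
# The excess shrinks toward zero as degrees of freedom grow, but the
# t-test remains the proper choice when the population variance is unknown
```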
t-Statistic Formula for Paired Comparisons Test
With the test statistic for a test of the difference between two population means, t-tests were used to discern differences between the means of two separate populations that were independent of each other. The focus here is on conducting a test based on the means of samples that are related in some way. The data are arranged in paired observations, so the test is known as a paired comparisons test. This test is normally used in two cases: 1.) A before-and-after situation, where analysts compare data before and after a certain process/procedure/treatment has taken place. 2.) When there is a relationship between the values, for example, collecting data from twins. In both cases, the data in each pair of observations are dependent. This method involves forming differences by subtracting one value of the pair from the other. The sample is then reduced to a single sample of differences, and the test statistic is based on the values of those differences. In this situation, use the subscript d to indicate that differences are being dealt with. The hypotheses are therefore: H0: μd = μd0 versus H1: μd ≠ μd0, where μd0 is some fixed value, commonly zero, about which you are hypothesizing. The test statistic is then: t = (d̄ - μd0) / (sd / √n), where d̄ is the sample mean difference, sd is the sample standard deviation of the differences, and the statistic has n - 1 degrees of freedom.
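A minimal sketch of the paired comparisons computation, using hypothetical before/after data and hypothesizing μd0 = 0. The critical value 2.262 cited in the final comment is the two-tailed 5% t value for 9 degrees of freedom from a standard t-table.

```python
import math
import statistics

# Hypothetical before/after observations (paired, dependent data)
before = [10.2, 9.8, 11.1, 10.5, 9.9, 10.8, 10.0, 10.4, 9.7, 10.6]
after  = [10.9, 10.1, 11.5, 10.4, 10.6, 11.2, 10.3, 10.9, 10.2, 11.0]

# Reduce the pairs to a single sample of differences
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = statistics.mean(d)   # sample mean difference
s_d = statistics.stdev(d)    # sample std dev of differences (divisor n - 1)

mu_d0 = 0.0                  # hypothesized mean difference under H0
t_stat = (d_bar - mu_d0) / (s_d / math.sqrt(n))

print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
# Compare |t| with the two-tailed critical value for n - 1 df,
# e.g. 2.262 at the 5% level for 9 df
```

The key step is collapsing the two dependent samples into one sample of differences; after that, the test proceeds exactly like a single-mean t-test with n - 1 degrees of freedom.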