Chapter 12


The F Distribution Table

For ANOVA, we expect F near 1.00 if H0 is true.
- An F-ratio that is much larger than 1.00 is an indication that H0 is not true.
- In the F distribution, we need to separate those values that are reasonably near 1.00 from the values that are significantly greater than 1.00.
- These critical values are presented in an F distribution table.
- To use the table, you must know the df values for the F-ratio (numerator and denominator), and you must know the alpha level for the hypothesis test.
- It is customary for an F table to have the df values for the numerator of the F-ratio printed across the top of the table. The df values for the denominator of F are printed in a column on the left-hand side.
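The same critical values can be obtained without a printed table by querying the F distribution directly. A minimal sketch using SciPy (the library choice and the example df values are assumptions, not part of the original notes):

```python
# Critical F value for alpha = .05 with df = 2 (numerator) and df = 12 (denominator).
# scipy.stats.f is the F distribution; ppf is the inverse of the CDF.
from scipy import stats

alpha = 0.05
f_crit = stats.f.ppf(1 - alpha, dfn=2, dfd=12)
print(round(f_crit, 2))  # about 3.89, matching the printed F table entry
```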

Analysis of Degrees of Freedom (cont'd)

Notice that the formula for dfwithin simply adds up the number of scores in each treatment (the n values) and subtracts 1 for each treatment. If these two stages are done separately, you obtain dfwithin = N - k.
- The df associated with SSbetween can be found by considering how the SS value is obtained. This SS formula measures the variability for the set of treatments (totals or means).
- To find dfbetween, simply count the number of treatments and subtract 1. Because the number of treatments is specified by the letter k, the formula for df is dfbetween = k - 1.
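The df formulas can be checked in a few lines of Python. This is a sketch using made-up sample sizes (three treatments with five scores each):

```python
# Hypothetical study: k = 3 treatments with n = 5 scores in each.
n_per_treatment = [5, 5, 5]
k = len(n_per_treatment)   # number of treatments
N = sum(n_per_treatment)   # total number of scores

df_within = N - k          # same result as adding (n - 1) for each treatment
df_between = k - 1
df_total = N - 1

assert df_within == sum(n - 1 for n in n_per_treatment)
assert df_total == df_between + df_within
print(df_between, df_within, df_total)  # 2 12 14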

Post Hoc Tests

Post hoc tests (or posttests) are additional hypothesis tests that are done after an ANOVA to determine exactly which mean differences are significant and which are not.
- In general, a post hoc test enables you to go back through the data and compare the individual treatments two at a time.
- In statistical terms, this is called making pairwise comparisons.
- The process of conducting pairwise comparisons involves performing a series of separate hypothesis tests, and each of these tests includes the risk of a Type I error.
- As you do more and more separate tests, the risk of a Type I error accumulates and is called the experimentwise alpha level.

An Overview, Cont'd

Specifically, we must decide between two interpretations:
- There really are no differences between the populations (or treatments). The observed differences between the sample means are caused by random, unsystematic factors (sampling error).
- The populations (or treatments) really do have different means, and these population mean differences are responsible for causing systematic differences between the sample means.

Analysis of Sum of Squares

The ANOVA requires that we first compute a total sum of squares and then partition this value into two components: between treatments and within treatments.
- As the name implies, SStotal is the sum of squares for the entire set of N scores. It is usually easiest to calculate SStotal using the computational formula: SStotal = ΣX² - (ΣX)²/N
- The within-treatments sum of squares is simply the sum of the SS values computed inside each of the treatment conditions: SSwithin = ΣSS(inside each treatment)
- The between-treatments sum of squares is then found by subtraction: SSbetween = SStotal - SSwithin
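The partitioning can be sketched in Python with a small made-up data set (three treatments of three scores each; the numbers are purely illustrative):

```python
treatments = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical scores
scores = [x for t in treatments for x in t]
N = len(scores)

# Computational formula: SStotal = sum(X^2) - (sum(X))^2 / N
ss_total = sum(x**2 for x in scores) - sum(scores)**2 / N

def ss(values):
    # SS computed inside one treatment: sum of squared deviations from its mean
    mean = sum(values) / len(values)
    return sum((x - mean)**2 for x in values)

ss_within = sum(ss(t) for t in treatments)
ss_between = ss_total - ss_within
print(ss_total, ss_within, ss_between)  # 48.0 6.0 42.0
```

Note that SStotal = SSbetween + SSwithin holds exactly, which is the whole point of the partition.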

Between Treatments Variance

The between-treatments variance simply measures how much difference exists between the treatment conditions. There are two possible explanations for these between-treatment differences:
- The differences between treatments are not caused by any treatment effect but are simply the naturally occurring, random and unsystematic differences that exist between one sample and another. That is, the differences are the result of sampling error.
- The differences between treatments have been caused by the treatment effects.
Thus, when we compute the between-treatments variance, we are measuring differences that could be caused by a systematic treatment effect or could simply be random and unsystematic mean differences caused by sampling error.

ANOVA Formulas

The final calculation for ANOVA is the F-ratio, which is composed of two variances:
F-ratio = variance between treatments / variance within treatments
Each of the two variances in the F-ratio is calculated using the basic formula for sample variance:
sample variance = s² = SS/df

The Logic of Analysis of Variance

The formulas and calculations required in ANOVA are somewhat complicated, but the logic that underlies the whole procedure is fairly straightforward. The analysis process divides the total variability into two basic components:
- Between-treatments variance
- Within-treatments variance

The Relationship between ANOVA and t Tests

When you are evaluating the mean difference from an independent-measures study comparing only two treatments (two separate samples), you can use either an independent-measures t test or the ANOVA.
- The basic relationship between t statistics and F-ratios can be stated in an equation: F = t²
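The F = t² relationship can be verified numerically. A sketch using SciPy on two made-up samples (the data and the library choice are assumptions):

```python
# With exactly two independent samples, the ANOVA F-ratio equals the squared
# pooled-variance t statistic, and the two tests give identical p-values.
from scipy import stats

sample1 = [3, 5, 4, 6, 2]   # hypothetical scores, treatment 1
sample2 = [8, 7, 9, 6, 10]  # hypothetical scores, treatment 2

t, p_t = stats.ttest_ind(sample1, sample2)  # default: pooled variance
F, p_f = stats.f_oneway(sample1, sample2)

assert abs(F - t**2) < 1e-9   # F = t^2
assert abs(p_f - p_t) < 1e-9  # same p-value
```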

Measuring Effect Size for ANOVA

A significant mean difference simply indicates that the difference observed in the sample data is very unlikely to have occurred just by chance.
- Thus, the term significant does not necessarily mean large; it simply means larger than expected by chance.
- To provide an indication of how large the effect actually is, it is recommended that researchers report a measure of effect size in addition to the measure of significance.
- For ANOVA, the simplest and most direct way to measure effect size is to compute the percentage of variance accounted for by the treatment conditions.
- The calculation and the concept of the percentage of variance are extremely straightforward. Specifically, we determine how much of the total SS is accounted for by SSbetween:
percentage of variance accounted for (η²) = SSbetween / SStotal
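The calculation really is a single division. A sketch with made-up SS values from an illustrative analysis:

```python
# Percentage of variance accounted for (eta squared) from hypothetical SS values.
ss_between = 42.0  # illustrative SS between treatments
ss_total = 48.0    # illustrative SS total

eta_squared = ss_between / ss_total
print(eta_squared)  # 0.875, i.e. the treatments account for 87.5% of the variance
```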

An Overview of Analysis of Variance

Analysis of variance (ANOVA) is a hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations).
- As with all inferential procedures, ANOVA uses sample data as the basis for drawing general conclusions about populations.
- The major advantage of ANOVA is that it can be used to compare two or more treatments, whereas a t test is limited to exactly two.

ANOVA Notation and Formulas

Because ANOVA typically is used to examine data from more than two treatment conditions (and more than two samples), we need a notational system to keep track of all the individual scores and totals.
- The letter k is used to identify the number of treatment conditions (that is, the number of levels of the factor). For an independent-measures study, k also specifies the number of separate samples.
- The number of scores in each treatment is identified by a lowercase letter n.
- The total number of scores in the entire study is specified by a capital letter N.
- The sum of the scores (ΣX) for each treatment condition is identified by the capital letter T (for treatment total).
- The sum of all the scores in the research study (the grand total) is identified by G.
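The notation maps directly onto code. A sketch applying it to a hypothetical study with three small samples (the scores are made up):

```python
# ANOVA notation: k, n, N, T, and G for a hypothetical three-sample study.
treatments = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]

k = len(treatments)               # number of treatment conditions
n = [len(t) for t in treatments]  # number of scores in each treatment
N = sum(n)                        # total number of scores in the study
T = [sum(t) for t in treatments]  # treatment totals (sum of X per condition)
G = sum(T)                        # grand total of all scores

assert G == sum(x for t in treatments for x in t)
print(k, n, N, T, G)  # 3 [3, 3, 3] 9 [6, 9, 21] 36
```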

The Logic and Process of Analysis of Variance (Cont'd)

Because all the individuals in a sample receive exactly the same treatment, any differences (or variance) within a sample cannot be caused by different treatments. Thus, these differences are caused by only one source:
- Chance or Error: The unpredictable differences that exist between individual scores are not caused by any systematic factors and are simply considered to be random chance or error.
By computing the variance for the set of sample means, we can measure the size of the differences between treatments.
- Although it is possible to compute a variance for the set of sample means, it usually is easier to use the total, T, for each sample instead of the mean, and compute variance for the set of T values.

The Scheffé Test

Because it uses an extremely cautious method for reducing the risk of a Type I error, the Scheffé test has the distinction of being one of the safest of all possible post hoc tests (smallest risk of a Type I error).
- The Scheffé test uses an F-ratio to evaluate the significance of the difference between any two treatment conditions.
- The numerator of the F-ratio is an MSbetween that is calculated using only the two treatments you want to compare. The denominator is the same MSwithin that was used for the overall ANOVA.
- The "safety factor" for the Scheffé test comes from the following two considerations:
- Although you are comparing only two treatments, the Scheffé test uses the value of k from the original experiment to compute df between treatments. Thus, df for the numerator of the F-ratio is k - 1.
- The critical value for the Scheffé F-ratio is the same as was used to evaluate the F-ratio from the overall ANOVA. Thus, Scheffé requires that every posttest satisfy the same criterion that was used for the complete ANOVA.
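The procedure can be sketched for one pairwise comparison. Everything here is a made-up example: two hypothetical treatments from a three-treatment study, with MSwithin and df taken from an assumed overall ANOVA:

```python
# Scheffe test comparing two treatments from a hypothetical k = 3 study.
from scipy import stats

A, C = [1, 2, 3], [6, 7, 8]  # the two treatments being compared (made-up data)
k = 3                        # number of treatments in the ORIGINAL experiment
ms_within = 1.0              # error term from the assumed overall ANOVA
df_within = 6                # its degrees of freedom

n = len(A)
pair_means = [sum(A) / n, sum(C) / n]
grand = sum(pair_means) / 2
ss_between = n * sum((m - grand)**2 for m in pair_means)

# Safety factor 1: use k - 1 from the full experiment as the numerator df,
# even though only two treatments are being compared.
ms_between = ss_between / (k - 1)
F = ms_between / ms_within

# Safety factor 2: use the same critical value as the overall ANOVA.
F_crit = stats.f.ppf(0.95, k - 1, df_within)
print(F, F_crit)  # F is about 18.75 here, well beyond F_crit of roughly 5.14
```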

The F Ratio: The Test Statistic for ANOVA

For the independent-measures ANOVA, the F-ratio has the following structure:
F-ratio = variance between treatments / variance within treatments
- The value obtained for the F-ratio helps determine whether any treatment effects exist.
- When there are no systematic treatment effects, the differences between treatments (numerator) are entirely caused by random, unsystematic factors.
- When the treatment does have an effect, then the combination of systematic and random differences in the numerator should be larger than the random differences alone in the denominator.
- For ANOVA, the denominator of the F-ratio is called the error term. The error term provides a measure of the variance caused by random, unsystematic differences.
- When the treatment effect is zero (H0 is true), the error term measures the same sources of variance as the numerator of the F-ratio, so the value of the F-ratio is expected to be nearly equal to 1.00.

Calculation of Variance and the F ratio

In ANOVA, it is customary to use the term mean square, or simply MS, in place of the term variance. For the final F-ratio, we will need an MS (variance) between treatments for the numerator and an MS (variance) within treatments for the denominator. In each case:
MS (variance) = s² = SS/df
MSbetween = s²between = SSbetween / dfbetween
MSwithin = s²within = SSwithin / dfwithin
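Putting the SS, df, and MS formulas together yields the F-ratio. A sketch on a small made-up data set, cross-checked against SciPy's one-way ANOVA (the data and the SciPy check are assumptions, not part of the original notes):

```python
# Complete hand calculation of the F-ratio, verified against scipy.stats.f_oneway.
from scipy import stats

treatments = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical scores
k = len(treatments)
N = sum(len(t) for t in treatments)

def ss(values):
    mean = sum(values) / len(values)
    return sum((x - mean)**2 for x in values)

scores = [x for t in treatments for x in t]
ss_total = ss(scores)
ss_within = sum(ss(t) for t in treatments)
ss_between = ss_total - ss_within

ms_between = ss_between / (k - 1)  # MS = SS / df
ms_within = ss_within / (N - k)
F = ms_between / ms_within

F_scipy, p = stats.f_oneway(*treatments)
assert abs(F - F_scipy) < 1e-8     # hand calculation matches the library
print(ms_between, ms_within, F)    # 21.0 1.0 21.0
```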

Logic, Process of Analysis

Logically, the differences (or variance) between means can be caused by two sources:
- Treatment Effects: If the treatments have different effects, this could cause the mean for one treatment to be higher (or lower) than the mean for another treatment.
- Chance or Sampling Error: If there is no treatment effect at all, you would still expect some differences between samples. Mean differences from one sample to another are an example of random, unsystematic sampling error.

The Distribution of F Ratios

In analysis of variance, the F-ratio is constructed so that the numerator and denominator of the ratio are measuring exactly the same variance when the null hypothesis is true. In this situation, we expect the value of F to be around 1.00. If the null hypothesis is false, the F-ratio should be much greater than 1.00.
- The problem is to define precisely which values are "around 1.00" and which are "much greater." To answer this question, we need to look at all the possible F values that can be obtained when the null hypothesis is true, that is, the distribution of F-ratios.
- Before we examine this distribution in detail, you should note two obvious characteristics:
- Because F-ratios are computed from two variances (the numerator and denominator of the ratio), F values always are positive numbers. Variance is always positive.
- When H0 is true, the numerator and denominator of the F-ratio are measuring the same variance. In this case, the two sample variances should be about the same size, so the ratio should be near 1. In other words, the distribution of F-ratios should pile up around 1.00.
- With these two factors in mind, we can sketch the distribution of F-ratios. The distribution is cut off at zero (all positive values), piles up around 1.00, and then tapers off to the right. The exact shape of the F distribution depends on the degrees of freedom for the two variances in the F-ratio.

Terminology in Analysis of Variance

In analysis of variance, the variable (independent or quasi-independent) that designates the groups being compared is called a factor.
- The individual conditions or values that make up a factor are called the levels of the factor.
- A study that combines two factors is called a two-factor design or a factorial design.

Analysis of Degrees of Freedom

In computing the degrees of freedom, there are two important considerations to keep in mind:
- Each df value is associated with a specific SS value.
- Normally, the value of df is obtained by counting the number of items that were used to calculate SS and then subtracting 1. For example, if you compute SS for a set of n scores, then df = n - 1.
To find the df associated with SStotal, you must first recall that this SS value measures variability for the entire set of N scores. Therefore, the df value is dftotal = N - 1.
To find the df associated with SSwithin, we must look at how this SS value is computed. Remember, we first find SS inside each of the treatments and then add these values together. Each of the treatment SS values measures variability for the n scores in the treatment, so each SS has df = n - 1. When all these individual treatment values are added together, we obtain:
dfwithin = Σ(n - 1) = Σdf(in each treatment)

Statistical Hypotheses for ANOVA

In general, H0 states that there is no treatment effect. In an ANOVA with three groups, H0 could appear as:
H0: μ1 = μ2 = μ3
The alternative hypothesis states that the population means are not all the same:
H1: There is at least one mean difference

Within Treatments Variance

Inside each treatment condition, we have a set of individuals who receive the same treatment.
- The researcher does not do anything that would cause these individuals to have different scores, yet they usually do have different scores.
- The differences represent random and unsystematic differences that occur when there are no treatment effects.
- Thus, the within-treatments variance provides a measure of how big the differences are when H0 is true.

ANOVA Summary Tables

It is useful to organize the results of the analysis in one table called an ANOVA summary table. The table shows the source of variability (between treatments, within treatments, and total variability), SS, df, MS, and F. Although these tables are no longer used in published reports, they are a common part of computer printouts, and they do provide a concise method for presenting the results of an analysis.

Assumptions for the Independent Measures ANOVA

The independent-measures ANOVA requires the same three assumptions that were necessary for the independent-measures t hypothesis test:
- The observations within each sample must be independent.
- The populations from which the samples are selected must be normal.
- The populations from which the samples are selected must have equal variances (homogeneity of variance).
Ordinarily, researchers are not overly concerned with the assumption of normality, especially when large samples are used, unless there are strong reasons to suspect the assumption has not been satisfied. The assumption of homogeneity of variance is an important one; if a researcher suspects it has been violated, it can be tested with Hartley's F-max test for homogeneity of variance. If you suspect that one of the assumptions for the independent-measures ANOVA has been violated, you can still proceed by transforming the original scores to ranks and then using an alternative statistical analysis known as the Kruskal-Wallis test, which is designed specifically for ordinal data.
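Both checks can be sketched in a few lines: the Hartley F-max statistic is just the largest sample variance divided by the smallest, and SciPy implements the Kruskal-Wallis test directly. The data here are made up for illustration:

```python
# Hartley's F-max statistic and the Kruskal-Wallis fallback for hypothetical data.
from scipy import stats

treatments = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((x - mean)**2 for x in values) / (len(values) - 1)

variances = [sample_variance(t) for t in treatments]
f_max = max(variances) / min(variances)  # values near 1.00 suggest homogeneity
print(f_max)  # 1.0 for these equal-variance samples

H, p = stats.kruskal(*treatments)        # rank-based alternative to ANOVA
print(round(H, 2), round(p, 3))
```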

The Test Statistic for ANOVA

The test statistic for ANOVA is very similar to the t statistics used in earlier chapters.
- For the t statistic, we first computed the standard error, which measures the difference between two sample means that is reasonable to expect if there is no treatment effect (that is, if H0 is true).
- For ANOVA, however, we want to compare differences among two or more sample means. With more than two samples, the concept of "difference between sample means" becomes difficult to define or measure.
- The solution to this problem is to use variance to define and measure the size of the differences among the sample means. The test statistic for ANOVA uses this fact to compute an F-ratio with the following structure:
F = variance between sample means / variance expected with no treatment effect = MSbetween / MSwithin

Tukey's Honestly Significant Difference Test

Tukey's test allows you to compute a single value that determines the minimum difference between treatment means that is necessary for significance.
- This value, called the honestly significant difference, or HSD, is then used to compare any two treatment conditions.
- If the mean difference exceeds Tukey's HSD, you conclude that there is a significant difference between the treatments. Otherwise, you cannot conclude that the treatments are significantly different.
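The HSD is the critical value q from the studentized range distribution times the square root of MSwithin/n. A sketch with made-up means and an assumed MSwithin from an overall ANOVA (the studentized range distribution requires SciPy 1.7 or newer):

```python
# Tukey's HSD for a hypothetical three-treatment study with n = 3 per group.
import math
from scipy.stats import studentized_range

k, n = 3, 3          # number of treatments, scores per treatment
ms_within = 1.0      # error term from the assumed overall ANOVA
df_within = 6        # its degrees of freedom

q = studentized_range.ppf(0.95, k, df_within)  # critical q at alpha = .05
hsd = q * math.sqrt(ms_within / n)

means = [2.0, 3.0, 7.0]  # hypothetical treatment means
# Any pair of means differing by more than HSD is significantly different.
for i in range(k):
    for j in range(i + 1, k):
        diff = abs(means[i] - means[j])
        print(f"M{i+1} vs M{j+1}: diff = {diff:.1f}, significant = {diff > hsd}")
```

With these numbers only the first comparison (a difference of 1.0) falls short of the HSD; the other two pairs exceed it.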

