Chapter 13: ANOVA

ANOVA

Analysis of variance: a hypothesis-testing procedure used to evaluate mean differences between two or more treatments.

SS and M are also important to the formulas for ANOVA

...

Another way to compute the between-treatments sum of squares. It can be used only when all treatments have the same number of scores, but it produces the same result as equation

...

ANOVA advantage over t-test

ANOVA is preferred over multiple t-tests because it can compare more than two samples in a single test, which reduces the risk of a Type I error.

The advantage of ANOVA is that it performs all of the comparisons simultaneously in the same hypothesis test. Thus,

ANOVA uses one test with one alpha level to evaluate the mean differences and thereby avoids the problem of an inflated experimentwise alpha level

Step 3: To compute the F-ratio you need a series of calculations:

Analyze the SS to obtain SSbetween and SSwithin. Use the SS values and the df values from step 2 to calculate the two variances, MSbetween and MSwithin. Finally, use the MS values to compute the F-ratio.
SStotal is simply the SS for the total set of N scores: SStotal = ∑X² - G²/N
SSwithin combines the SS values from inside each treatment condition: SSwithin = ∑SS inside each treatment
SSbetween measures the differences between the four treatment means: SSbetween = ∑(T²/n) - G²/N
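The three SS formulas in this card can be checked numerically. The following is a minimal Python sketch, not part of the original study set: the function names and the three small samples are made up for illustration.

```python
def sum_of_squares(scores):
    """SS = sum(X^2) - (sum(X))^2 / n for one set of scores."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n


def partition_ss(treatments):
    """Return (SS_total, SS_within, SS_between) for a list of samples."""
    all_scores = [x for sample in treatments for x in sample]
    G = sum(all_scores)   # grand total
    N = len(all_scores)   # total number of scores
    ss_total = sum(x * x for x in all_scores) - G ** 2 / N
    ss_within = sum(sum_of_squares(sample) for sample in treatments)
    # sum(T^2 / n) - G^2 / N, where T is each treatment total
    ss_between = sum(sum(s) ** 2 / len(s) for s in treatments) - G ** 2 / N
    return ss_total, ss_within, ss_between


# Hypothetical data: three treatments, n = 5 scores each.
treatments = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]
ss_total, ss_within, ss_between = partition_ss(treatments)
print(ss_total, ss_within, ss_between)  # 46.0 16.0 30.0
```

For these scores the two components add up to the total (30 + 16 = 46), which is the partition that ANOVA relies on.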

Two characteristics

Because F-ratios are computed from two variances (the numerator and denominator of the ratio), F values always will be positive numbers. Variance is always positive. When H0 is true, the numerator and denominator of the F-ratio are measuring the same variance. So, the two sample variances should be about the same size, so the ratio should be near 1. The distribution of F-ratios should pile up around 1.0.

We need two separate analyses

Compute SS for the total study and analyze it into two components (between and within). Then compute df for the total study, and analyze it into two components (between and within).

First, we find df for the total set of N scores, and then partition this value into two components. There are two considerations to keep in mind:

Each df value is associated with a specific SS value. Normally, the value of df is obtained by counting the number of items that were used to calculate SS and then subtracting 1; i.e., compute SS for a set of n scores, then df = n - 1

This is the total probability of a Type I error accumulated from all of the separate tests in the experiment

Experimentwise alpha level

The numerator and denominator of the F-ratio measure variances, or mean squared differences. Therefore, we can express the F-ratio as follows:

F = (differences between samples)² / (differences expected by chance)²

The test statistic for ANOVA is called the

F-ratio with F = variance (differences) between sample means divided by variance (differences) expected by chance (error)

The sum of all the scores in the research study (the grand total) is identified by

G. G = ∑T

Forming the Hypotheses for ANOVA

H0: µ1 = µ2 = µ3
H1: µ1 ≠ µ2 ≠ µ3 (all three means are different), or H1: µ1 = µ3, but µ2 is different

Hints for predicting data outcomes:

Hint 1: Remember that SS and MS provide a measure of how much difference there is between treatment conditions.
Hint 2: Find the mean or total (T) for each treatment, and determine how much difference there is between the treatments. The F-ratio always measures how much difference exists between treatments, but you should be able to look at the data and see whether the difference is large or small.

There are two primary sources for chance differences

Individual differences and Experimental error

STEP 2:

Locate the critical region for the F-ratio. We must determine degrees of freedom for MSbetween treatments and MSwithin treatments (the numerator and denominator of F). dftotal = N - 1

In analysis of variance, we combine two or more samples.

MSwithin = SSwithin / dfwithin = ∑SS / ∑df = (SS1 + SS2 + SS3 + ...) / (df1 + df2 + df3 + ...)
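As a rough illustration of this pooling, the sketch below adds the SS values and divides by the sum of the df values. The function name and the three SS and df values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def ms_within(ss_values, df_values):
    """Pooled variance: (SS1 + SS2 + ...) / (df1 + df2 + ...)."""
    return sum(ss_values) / sum(df_values)


# Hypothetical values: three samples with n = 5, so each df = 4.
ss_values = [6, 6, 4]
df_values = [4, 4, 4]
pooled = ms_within(ss_values, df_values)
print(round(pooled, 4))  # 1.3333
```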

Between Treatments Variance

Measures how large the differences are between the treatment conditions. Whether those differences reflect a treatment effect or are simply due to chance, we are really measuring the differences between the sample means.

Total Sum of Squares SStotal, for the entire set of N scores.

SS = ∑X² - (∑X)²/N
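A quick way to see that this computational formula is the usual sum of squared deviations is to compute SS both ways. This is a small sketch with made-up scores; neither function name comes from the study set.

```python
def ss_definitional(scores):
    """Sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def ss_computational(scores):
    """sum(X^2) - (sum(X))^2 / N."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n


scores = [0, 1, 3, 1, 0]  # hypothetical scores
print(ss_definitional(scores), ss_computational(scores))  # 6.0 6.0
```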

We will need to compute ____ for the variance between treatments (numerator of F) and another ____ for the variance within treatments (denominator of F).

SS and a df; SS and df

The ANOVA summary table shows the sources of variability (between treatments, within treatments, and total variability),

SS, df, MS, and F.

The value of SSbetween treatments can be found by subtraction.

SSbetween = SStotal - SSwithin

SSbetween treatments

SSbetween = SStotal - SSwithin

To make the formula consistent:

SStotal = ∑X² - G²/N

Anova testing Step 1

Step 1: State the hypotheses. H0: µ1 = µ2 = µ3 = µ4 (no treatment effect). H1: At least one of the treatment means is different. Then select the alpha level, α = .05.

The total (∑X) for each treatment condition is identified by

T

The alpha level you select for each individual hypothesis.

Testwise

Or, you can use the formula dfwithin = N - k. Here, adding up all the n values gives N.

The number of treatments = k. df within = N - k

Post hoc tests are done when you reject H0 and

There are three or more treatments (k ≥ 3).

Two explanations for differences between treatments

Treatment Effect, The differences are simply due to chance

How is the alternative hypothesis determined in ANOVA?

Usually theory or the results of other studies

Next, calculate mean squares

We must compute the variance, or MS, for each of the two components:
MSbetween = SSbetween / dfbetween
MSwithin = SSwithin / dfwithin
Calculation of F: F = MSbetween / MSwithin
Decision: If the obtained F falls in the critical region, reject H0.
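The mean squares and the F-ratio follow directly from the SS and df values. Here is a minimal sketch under assumed inputs: the function name is invented, and the numbers (SSbetween = 30 with df = 2, SSwithin = 16 with df = 12) are hypothetical, not taken from the worked example in this set.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS_between, MS_within, F), where each MS = SS / df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical SS and df values.
ms_between, ms_within, f = f_ratio(30, 2, 16, 12)
print(ms_between, round(ms_within, 2), round(f, 2))  # 15.0 1.33 11.25
```

The decision step still requires comparing F with the critical value for df = 2, 12 from an F table, which this sketch does not do.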

There are two possibilities for the F-ratio:

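1. When the treatment has no effect (the treatment effect is zero), the differences between treatments (numerator) are entirely due to chance. With the numerator and denominator roughly equal, the F-ratio should have a value around 1.00.
2. When the treatment does have an effect, the numerator should be noticeably larger than the denominator, and the F-ratio should be noticeably larger than 1.00.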

Researchers use the variance within treatments, the error term, as

a benchmark or standard for evaluating the differences between treatments.

The total number of scores in the entire study is specified by ____. When all the samples are the same size,

a capital N; n is a constant, N = kn
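The notation used throughout this set (k, n, N, T, G) can be read straight off a data set. The sketch below uses made-up scores and assumes equal sample sizes, so that N = kn holds.

```python
# Hypothetical data: k = 3 treatments with n = 5 scores each.
treatments = {
    "A": [0, 1, 3, 1, 0],
    "B": [4, 3, 6, 3, 4],
    "C": [1, 2, 2, 0, 0],
}

k = len(treatments)                                  # number of treatments
n = len(next(iter(treatments.values())))             # scores per treatment (equal n assumed)
N = sum(len(s) for s in treatments.values())         # total number of scores
T = {name: sum(s) for name, s in treatments.items()} # treatment totals
G = sum(T.values())                                  # grand total, G = sum of T

print(k, n, N, T, G)  # 3 5 15 {'A': 5, 'B': 20, 'C': 5} 30
```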

when a researcher uses a nonmanipulated variable to designate groups for ANOVA the variable is called

a quasi-independent variable

If H0 is true, we expect ____. If H0 is not true, ____.

a small value for F (near 1.00); we expect a large value for F.

An F-ratio near 1.00 indicates that the differences between treatments (numerator) are

about the same as the differences that are expected by chance (denominator).

The concept of pooled variance is the same whether you have exactly two samples or more than two samples. You simply

add the SS values and divide by the sum of the df values

If each of the t values is squared, then

all of the negative values become positive

If you have two independent-measures samples, you can use either a t-test or ANOVA. These techniques will

always result in the same statistical decision. F = t²
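The relationship F = t² can be checked by hand for two samples. The sketch below computes an independent-measures t with pooled variance and an ANOVA F-ratio for the same data; the two samples and all function names are made up for illustration.

```python
import math


def ss(scores):
    """SS = sum(X^2) - (sum(X))^2 / n."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)


def independent_t(a, b):
    """Independent-measures t statistic using pooled variance."""
    pooled_variance = (ss(a) + ss(b)) / (len(a) + len(b) - 2)
    standard_error = math.sqrt(pooled_variance / len(a) + pooled_variance / len(b))
    return (sum(a) / len(a) - sum(b) / len(b)) / standard_error


def anova_f(samples):
    """F = MS_between / MS_within for a list of samples."""
    scores = [x for s in samples for x in s]
    G, N, k = sum(scores), len(scores), len(samples)
    ss_between = sum(sum(s) ** 2 / len(s) for s in samples) - G ** 2 / N
    ss_within = sum(ss(s) for s in samples)
    return (ss_between / (k - 1)) / (ss_within / (N - k))


a = [0, 1, 3, 1, 0]  # hypothetical sample 1
b = [4, 3, 6, 3, 4]  # hypothetical sample 2
t = independent_t(a, b)
f = anova_f([a, b])
print(round(t, 3), round(t ** 2, 3), round(f, 3))  # -3.873 15.0 15.0
```

Note that t is negative here while F is positive: squaring removes the sign, which is why F-ratios are never negative.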

Because we are going to analyze variability, the process is called

analysis of variance

Rejecting H0 indicates that

at least one difference exists among treatments.

The structure of the F-ratio also compares differences

between sample means vs. differences due to chance (error).

The term ____ refers to differences from one treatment to another. With three treatments we compare three different means, df = 3 - 1 = 2.

between treatments

The ANOVA requires that we first compute a total sum of squares and then partition the value into two components,

between treatments and within treatments

Because differences are unexplained and unpredictable, they are considered to be

chance occurrences

First, determine the total variability for the entire set of data by

combining all the scores from all the separate samples to obtain one general measure of variability for the complete experiment

How do we measure effect size?

Compute r², which measures how much of the differences between scores is accounted for by the treatments: r² = SSbetween / SStotal
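As a small worked example, the sketch below applies this ratio to the SS values that appear in the summary table in this set (SSbetween = 50, SStotal = 82). The function name is an assumption made for illustration.

```python
def r_squared(ss_between, ss_total):
    """Proportion of total variability accounted for by the treatments."""
    return ss_between / ss_total


effect_size = r_squared(50, 82)
print(round(effect_size, 3))  # 0.61
```

With these values, about 61% of the variability in the scores is accounted for by the differences between treatments.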

The shape of the F distribution depends on the ____ of the two variances of the F-ratio.

degrees of freedom

the variance in the _____ of the F-ratio and the standard error in the denominator of the t statistic both measure the differences that would be expected just by chance or sampling error

denominator

df Notation:

df = 2, 12

The SS formula measures the variability for the set of treatment totals. Count the number of T values and subtract 1. Because the number of treatments is k, the formula is

df between = k - 1

Analyze this total into two components.

df between = k - 1, and df within = ∑df inside each treatment

Degrees of Freedom, df total. Remember, SStotal measures variability for the entire set of N scores.

df total = N - 1

To find the df associated with SS, look at how the SSwithin value is computed. We first find the SS inside of each of the treatments and then add these values together. Each of the treatment values measures variability for the n scores in the treatment, so each SS will have df = n-1. When all of these treatments are added together, we obtain

df within = ∑(n - 1) = ∑df in each treatment
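The three df values can be computed from nothing more than the sample sizes. This is a hedged sketch with an invented function name and hypothetical sample sizes; it also shows that ∑(n - 1) gives the same result as N - k.

```python
def degrees_of_freedom(sample_sizes):
    """Return (df_total, df_between, df_within) from the n of each treatment."""
    k = len(sample_sizes)   # number of treatments
    N = sum(sample_sizes)   # total number of scores
    df_total = N - 1
    df_between = k - 1
    df_within = sum(n - 1 for n in sample_sizes)  # same as N - k
    return df_total, df_between, df_within


print(degrees_of_freedom([5, 5, 5]))  # (14, 2, 12)
print(degrees_of_freedom([4, 5, 6]))  # (14, 2, 12)
```

In both cases df between and df within add up to df total, mirroring the way SSbetween and SSwithin add up to SStotal.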

The word analysis means

dividing into smaller parts

In ANOVA, the MS value in the denominator of the F-ratio is called the

error term

This MS value is intended to measure the amount of ____, that is, variability in the data for which there is no systematic or predictable explanation

error variability

Post hoc tests are additional hypothesis tests that are done after any ANOVA to determine

exactly which mean differences are significant and which are not

In analysis of variance, the variable (independent or quasi-independent) that designates the groups being compared is called a

factor

Any value that is in the critical region for t will end up in the critical region

for F-ratios after it is squared.

ANOVA uses sample data as the basis

for drawing general conclusions about populations

A large F-ratio indicates that the differences between treatments are

greater than expected by chance and that the treatment has a significant effect.

In ANOVA we use variance to measure

how big the differences should be if there is no treatment effect

In the t statistic we computed an estimated standard error to measure

how much difference is expected by chance

two samples are not expected to be identical even

if there is no treatment effect whatsoever

A variable in ANOVA that the researcher manipulates to create the treatment conditions is an

independent variable

Like t tests, ANOVA can be used with either an

independent-measures or repeated measures design

With very ____ df values, nearly all the F-ratios will be clustered very near to 1.0. With ____ df values, the F distribution is more spread out.

large; smaller

individual conditions or values that make up a factor are called the

levels of the factor

In ANOVA, it is customary to use the term _____ instead of variance, which is defined as the mean of the squared deviations. MS = variance = s² = SS/df

mean square or MS

The goal of ANOVA is to

measure the amount of variability (the size of the difference) and to explain where it comes from

When the treatment effect is zero (H0 is true), the error term

measures the same sources of variance as the numerator of the F-ratio, so the value of the F-ratio is expected to be nearly equal to 1.00

The number of scores in each treatment is identified by

n

entire process of analysis of variance will require ___ calculations;

nine; three values for SS, three values for df, two variances (between and within), and a final F-ratio

If the treatment had an effect, the numerator of the F-ratio should be _____, and we should obtain an F-ratio noticeably larger than 1.00.

noticeably larger than the denominator

ANOVA corresponds to two hypotheses:

null and alternative as part of the general hypothesis testing procedure

The precision of the sample variance depends on the

number of scores or the degrees of freedom.

the variance in the ____ of the F - ratio provides a single number that describes the differences between all the sample means

numerator

With both the t statistic and ANOVA, the ____ of the ratio measures the actual difference obtained from the sample data, and the _____ measures the difference that would be expected if there is no treatment effect

numerator; denominator

The denominator of the F-ratio measures

only uncontrolled and unexplained variability and is called the error term

A post hoc test enables you to go back through the data and compare the individual treatments two at a time. This procedure is called

pairwise comparisons
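To see what "two at a time" means in practice, the sketch below simply lists every pair of treatments that a post hoc procedure would compare. It only enumerates the pairs and does not carry out any particular post hoc test; the treatment labels are hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(treatment_names):
    """List every pair of treatments, compared two at a time."""
    return list(combinations(treatment_names, 2))


pairs = pairwise_comparisons(["A", "B", "C"])
print(pairs)       # [('A', 'B'), ('A', 'C'), ('B', 'C')]
print(len(pairs))  # 3
```

With k treatments there are k(k - 1)/2 pairs, so the number of comparisons grows quickly as treatments are added.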

Within-Treatment Variance

provide a measure of the variability inside each treatment condition

The numerator of the F-ratio always includes the ______ as in the error term, but it also includes any systematic differences caused by the treatment effect

same unsystematic variability

A ____ mean difference indicates that the difference observed in the sample data is very unlikely to have occurred just by chance.

significant

The ability to combine different factors and to mix different designs within one study provides researchers with the flexibility to develop studies that address scientific questions that could not be answered by a single design using a single factor. By contrast, studies that use only one factor are called

single-factor designs.

As the number of separate tests increases,

so does the experiment-wise alpha level
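The growth of the experimentwise alpha level can be illustrated with a common approximation. The sketch below assumes the separate tests are independent, in which case the experimentwise alpha is 1 - (1 - α)^c for c tests; that independence assumption and the function name are not stated in this study set.

```python
def experimentwise_alpha(testwise_alpha, number_of_tests):
    """Approximate experimentwise alpha, ASSUMING independent tests."""
    return 1 - (1 - testwise_alpha) ** number_of_tests


# Testwise alpha of .05 with 1, 3, and 6 separate tests.
for c in (1, 3, 6):
    print(c, round(experimentwise_alpha(0.05, c), 3))
# 1 0.05
# 3 0.143
# 6 0.265
```

Three separate t-tests at α = .05 already push the overall risk of a Type I error to roughly .14, which is the problem a single ANOVA avoids.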

Present the findings:

Source                SS    df    MS
Between treatments    50     3    16.67    F = 8.33
Within treatments     32    16     2.00
Total                 82    19
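The entries of this summary table can be cross-checked against each other: SS and df should each add up to their totals, each MS is SS/df, and F is the ratio of the two MS values. The sketch below uses the SS and df values from the table; the dictionary layout is just one convenient, assumed representation.

```python
# SS and df values as reported in the summary table.
table = {
    "between": {"SS": 50, "df": 3},
    "within": {"SS": 32, "df": 16},
    "total": {"SS": 82, "df": 19},
}

ss_additive = table["between"]["SS"] + table["within"]["SS"] == table["total"]["SS"]
df_additive = table["between"]["df"] + table["within"]["df"] == table["total"]["df"]

ms_between = table["between"]["SS"] / table["between"]["df"]
ms_within = table["within"]["SS"] / table["within"]["df"]
f = ms_between / ms_within

print(ss_additive, df_additive, round(ms_between, 2), round(ms_within, 2), round(f, 2))
# True True 16.67 2.0 8.33
```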

The fact that the t statistic is based on differences and the F-ratio is based on

squared differences leads to the basic relationship F = t².
1. You will be testing the same hypotheses: H0 and H1.
2. The df for the t statistic and the df for the denominator of the F-ratio (dfwithin) are identical.
3. The distribution of t and the distribution of F-ratios match perfectly if you consider the relationship F = t².

ANOVA leaves one difficulty in hypothesis testing. The F-ratio tells you that a significant difference exists; it does not

tell exactly which means are significantly different and which are not.

differences between treatments are significantly greater than can be explained by chance alone;

that is, the differences have been caused by the treatment effects

The major advantage of ANOVA is

that it can be used to compare two or more treatments

With either the F-ratio or the t statistic, a large value provides evidence

that the sample mean difference is more than would be expected by chance alone

With an F-ratio near 1.00, we will conclude

that there is no evidence that the treatment had any effect.

The final calculation for ANOVA is

the F-ratio, which is composed of two variances: F = variance between treatments / variance within treatments. Variance for sample data: sample variance = s² = SS/df

The numerator of the F-ratio (MSbetween) simply measures how much difference exists between the treatment means. The bigger the mean differences,

the bigger is the F-ratio

k is used to identify_____. For an independent-measures study, k also specifies the number of separate samples

the number of treatment conditions.

If the between-treatments differences (MSbetween) are substantially greater than the error term, then

the researcher can confidently conclude that the differences between treatments are due to more than chance

The denominator of the F-ratio (MSwithin) measures the variance of the scores inside each treatment; that is, the variance for each of the separate samples. In general, the larger the sample variances,

the smaller is the F-ratio

The structure of the t statistic compares the actual differences between samples (numerator) with

the standard differences one would expect by chance.

With ANOVA we must decide between two interpretations:

either there really are no differences between the populations (or treatments), or the populations (or treatments) really do have different means

The goal of an ANOVA three-sample analysis is

to determine whether the mean differences observed among the samples provide enough evidence to conclude that there are mean differences among the three populations

The term ___ refers to the entire set of scores. We compute SS for the whole set of N scores, and the df is N-1.

total

Analyzing the total ____ into these two components is the heart of ANOVA

variability

F-ratio is based on ____ instead of sample mean difference

variance

We must compute the _____ between and within treatments in order to calculate the F-ratio

variance

Once we have analyzed the total variability into two basic components, between treatments and within treatments,

we compare them using the F-ratio

The error term is used as a standard for determining

whether or not the differences between treatments (measured by MSbetween) are greater than would be expected just by chance.

The term _____ refers to differences that exist inside the individual treatment conditions. We compute SS and df inside each of the separate treatments

within treatments

To measure chance differences, we compute the variance

within treatments

The _____ variance provides a measure of how much difference is reasonable to expect by chance, i.e., how big the differences are when H0 is true.

within-treatments

Within-Treatments Sum of Squares

SSwithin treatments = ∑SS inside each treatment

