Data Analysis: Chapter 11: Analysis of Variance
Data format
- if we are only interested in comparing the means of c groups (treatments or factor levels), we have a one-factor ANOVA - if subjects (or individuals) are assigned randomly to treatments, we call this the completely randomized model
Analysis of variance assumes that the:
1. observations on Y are independent 2. populations being sampled are normal 3. populations being sampled have equal variances
Analysis of Variance assumes that:
1. populations are normally distributed 2. population variances are equal
The Tukey test statistic relies on:
1. the sample sizes nj and nk 2. the MSE 3. the difference in group means, [ybarj-ybark]
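The three ingredients listed above can be combined as in the sketch below. It assumes one common textbook form of the statistic, T = |ybarj - ybark| / sqrt(MSE (1/nj + 1/nk)); other texts scale the denominator differently. The function name `tukey_statistic` and the numbers are hypothetical, chosen only for illustration, and the critical value (from a studentized range table) is not computed here.

```python
import math

def tukey_statistic(mean_j, mean_k, n_j, n_k, mse):
    """Pairwise Tukey-style statistic (assumed textbook form, see lead-in)."""
    # Standard error of the difference uses the pooled MSE from the ANOVA table.
    std_err = math.sqrt(mse * (1.0 / n_j + 1.0 / n_k))
    return abs(mean_j - mean_k) / std_err

# Hypothetical inputs: two group means, their sample sizes, and the ANOVA MSE.
t_calc = tukey_statistic(mean_j=12.0, mean_k=9.0, n_j=5, n_k=5, mse=2.5)
print(t_calc)
```

A larger gap between means, a smaller MSE, or larger samples all push the statistic up.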
Choose the correct expressions for variation in a response variable Y from the choices given.
1. variation in Y = variation due to factor(s) + variation from random error 2. variation in Y = explained variation + unexplained variation
One-way ANOVA hypotheses are:
H0: μ1 = μ2 = μ3 = μ4 H1: At least one μ is different
The test for comparing more than two population variances without assuming normal populations is called _______.
Levene's test
Match each variable description to the logical variable type: Quantitative response variable Categorical independent variable
Quantitative response variable: 1. recovery time after surgery 2. length of orthopedic surgery Categorical independent variable: 1. type of fracture 2. hospital location
The Sums of Squares formula for a non-replicated two-factor ANOVA is _______ = SSA + __________ + ________.
SST; SSB; SSE
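A small numeric check of SST = SSA + SSB + SSE for a two-factor design without replication may help. This is a sketch on made-up data; treating the rows as factor A and the columns as factor B is an assumption made for the example.

```python
# Hypothetical 3x3 table: one observation per (factor A level, factor B level).
table = [
    [10.0, 12.0, 14.0],
    [11.0, 15.0, 16.0],
    [15.0, 18.0, 24.0],
]
r, c = len(table), len(table[0])

grand = sum(sum(row) for row in table) / (r * c)
row_means = [sum(row) / c for row in table]                            # factor A
col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]  # factor B

ssa = c * sum((m - grand) ** 2 for m in row_means)
ssb = r * sum((m - grand) ** 2 for m in col_means)
# Error: what is left after removing both the row and column effects.
sse = sum(
    (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
    for i in range(r)
    for j in range(c)
)
sst = sum((y - grand) ** 2 for row in table for y in row)

print(sst, ssa, ssb, sse)
```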
The partitioned Sum of Squares can be expressed using the notation _______ = SSB + _____.
SST; SSE
partitioned sum of squares
SST= SSB + SSE
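The one-factor partition SST = SSB + SSE can be verified by hand on a tiny data set. The sketch below uses made-up observations for three hypothetical groups; SSB is the between-groups sum of squares and SSE the within-groups (error) sum of squares.

```python
# Hypothetical observations for three treatment groups.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [10.0, 11.0, 12.0],
}

all_obs = [y for obs in groups.values() for y in obs]
grand_mean = sum(all_obs) / len(all_obs)
group_means = {name: sum(obs) / len(obs) for name, obs in groups.items()}

# Between groups: each group mean vs. the grand mean, weighted by group size.
ssb = sum(len(obs) * (group_means[name] - grand_mean) ** 2
          for name, obs in groups.items())
# Within groups: each observation vs. its own group mean.
sse = sum((y - group_means[name]) ** 2
          for name, obs in groups.items() for y in obs)
# Total: each observation vs. the grand mean.
sst = sum((y - grand_mean) ** 2 for y in all_obs)

print(sst, ssb, sse)
```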
In Analysis of Variance, a factor is defined as ________.
a categorical independent variable that explains variation in a response, or dependent, variable
general linear model
a versatile tool for estimating large and complex ANOVA models.
The acronym ANOVA stands for ______ ______ _____.
analysis of variance
The reason why we perform an analysis of variance for comparing means rather than conducting multiple two-mean comparisons is _____.
because multiple two-mean comparisons increase the Type I error probability
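To see how quickly the Type I error grows, the sketch below counts the pairwise comparisons among c groups and computes the chance of at least one false rejection. It assumes the comparisons are independent, which is a simplification (pairwise tests on shared groups are not independent), so the result is only a rough guide. The function name is hypothetical.

```python
import math

def familywise_error(c, alpha):
    """Approximate chance of at least one Type I error across all pairs."""
    k = math.comb(c, 2)              # number of two-mean comparisons
    return k, 1.0 - (1.0 - alpha) ** k  # independence is an assumption

k, fw = familywise_error(c=4, alpha=0.05)
print(k, round(fw, 4))
```

With 4 groups there are 6 comparisons, and the approximate overall error rate is roughly five times the nominal 0.05.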
A randomized block design is an experiment where subjects within each ______ are randomly assigned to each of the _______.
block; treatments
balanced designs
characterized by an equal number of observations for each factor combination
Levene's test
does not assume a normal distribution - requires a computer package other than Excel - based on the distances of the observations from their sample medians rather than their sample means
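The median-based idea described above can be sketched as follows: replace each observation by its absolute distance from its group median, then compute a one-way ANOVA F ratio on those distances. This is a minimal illustration on made-up data, not a replacement for a statistics package; no p-value is computed, and the function name is hypothetical.

```python
from statistics import mean, median

def levene_median_statistic(groups):
    """F ratio computed on absolute deviations from each group's median."""
    devs = []
    for g in groups:
        med = median(g)
        devs.append([abs(y - med) for y in g])

    n = sum(len(d) for d in devs)   # total observations
    c = len(devs)                   # number of groups
    grand = sum(sum(d) for d in devs) / n

    ssb = sum(len(d) * (mean(d) - grand) ** 2 for d in devs)
    sse = sum((z - mean(d)) ** 2 for d in devs for z in d)
    return (ssb / (c - 1)) / (sse / (n - c))

# Hypothetical samples: the second group is visibly more spread out.
w = levene_median_statistic([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
])
print(w)
```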
The test statistic for a one-way ANOVA test follows the:
F distribution with df1 and df2 degrees of freedom
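As a rough illustration, the F statistic and its two degrees of freedom can be computed directly from grouped data. The sketch below assumes the usual one-factor setup (df1 = c - 1, df2 = n - c) and uses made-up observations; the function name is hypothetical.

```python
def one_way_f(groups):
    """Return (F, df1, df2) for a one-factor ANOVA on a list of groups."""
    n = sum(len(g) for g in groups)
    c = len(groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df1, df2 = c - 1, n - c
    msb, mse = ssb / df1, sse / df2   # mean squares
    return msb / mse, df1, df2

# Hypothetical data: three groups of three observations each.
f_stat, df1, df2 = one_way_f([
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
])
print(f_stat, df1, df2)
```

The statistic is then compared with the right-tail critical value of the F distribution with those degrees of freedom.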
A random variable's variation about its mean can be attributed to known ______, called explained variation, or is simply _____ error, called unexplained variation.
factors; random
True or False: A completely randomized one-factor ANOVA requires that the sample size for each factor level be the same.
false
Hartley's test for equal variances assumes the populations are _____ distributed.
normally
Using the Tukey method, there are no significant differences to find if the ANOVA test does not reject the _____ hypothesis of _____ means.
null; equal
experimental design
refers to the number of factors under investigation, the number of levels assigned to each factor, the way factor levels are defined, and the way observations are obtained.
The one-way ANOVA test is always a ______.
right-tailed test
For an ANOVA test, the critical value is always found in the _______ tail of the _____ distribution.
right; F
The ANOVA test is considered ______ to departures from the normality assumption and equal variance assumptions.
robust
The spreadsheet format will show one factor across the ______ and one factor down the _____.
rows; columns
analysis of variance
seeks to identify sources of variation in a numerical dependent variable Y (the response variable) - variation in the response variable about its mean is either explained by one or more categorical independent variables or is unexplained - comparison of means
the mean of each group is calculated by ________
summing the observations in the treatment and dividing by the sample size
If the test statistic for a one-way ANOVA is significantly greater than 1, this means that _________.
the between-treatments variability is significantly greater than the within-treatments variability
True or False: Analysis of variance assumes homogeneous variances
true
Tukey's studentized range test
two-tailed test for equality of paired means from c groups compared simultaneously; a natural follow-up when the results of the one-factor ANOVA test show a significant difference in at least one mean.
If a two-factor ANOVA study has only one factor of research interest the second factor is ______.
used to control for potential confounding influences
Hartley's test statistic for testing equal variances is the ratio of sample _______ with the larger value in the _____.
variances; numerator
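The ratio described above is simple to compute. The sketch below uses the standard library's sample variance on made-up groups; the function name is hypothetical, and the critical value (from a Hartley table) is not computed.

```python
from statistics import variance

def hartley_statistic(groups):
    """Largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]   # sample variances (n - 1)
    return max(variances) / min(variances)

# Hypothetical samples with sample variances 1, 4, and 4.
h = hartley_statistic([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [1.0, 3.0, 5.0],
])
print(h)
```

Because the larger variance is always in the numerator, the statistic is never below 1.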
Identify the correct two-factor ANOVA linear model
yjk = μ + Aj + Bk + εjk
Identify the correct one-factor ANOVA linear model
yij = μ + Aj + εij
A two-factor non-replicated ANOVA design is often used because ______.
1. the observations are expensive 2. repeated treatment observations can be impossible to collect
True or False: The alternative hypothesis in a one-factor ANOVA states that all the treatment means are different from each other.
false
A two-factor ANOVA without replication is called a _______.
fixed-effects model
Tukey's HSD method ensures that the probability of a Type I error equals α
for any number of pairwise comparisons
fractional factorial designs
for reasons of economy, limit data collection to a subset of possible factor combinations