Analysis of Variance
Suppose you are conducting a two-factor ANOVA with interaction and you have 5 levels of Factor A, 4 levels of Factor B, and a total of 60 observations. Given the following partial ANOVA table, what would the F statistic for interaction be?
9.167
What is the definition of a "Blocking Variable" in two-way ANOVA?
A second treatment variable that will reduce the error term (SSE).
How do you calculate the SSE in two-way ANOVA?
By subtracting SST and SSB from SS total.
Which one of the following is part of the underlying strategy behind ANOVA?
Comparing two estimates of the population variance found in different ways.
Two-way ANOVA allows us to consider other factors besides the difference of treatment means. What is the advantage of this?
Explaining more of the variation within treatments reduces the probability of a Type I error.
What is the test statistic used for a test of two equal variances? Assume s12 is the larger sample variance.
F = s21/s22
What is the relationship between SS total, SST, and SSE?
SS total = SST + SSE
ANOVA assumes that the population variances are the same for all treatments. What is the name for this common value?
The Mean Square Error, MSE
ANOVA is usually used to test the equality of three or more means instead of using a t-test for each pair. Why is this?
The over all significance level of ANOVA is better.
The formula for sum of squares of the blocking variable is: SSB = kΣ(XXb- XXG)2. Match the variables to their description.
XG →The grand overall mean XB → The sample mean of block b k → The number of treatments b → The number of blocks
To find the random variation, a value is subtracted from each observation and then the differences are squared and totaled. Which of the following is subtracted from each observation?
its treatment mean.
Analysis of variance is a statistical method of comparing the ________ of several populations.
means
Which of the following is the confidence interval for the difference of treatment means?
(x⎯⎯1−x⎯⎯2) ±tMSE(1n1+1n2)⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯√
Suppose a trucking company wants to compare the population average miles per gallon for three brands of gasoline. If each gasoline is tested on five trucks what would the numerator degrees of freedom be?
2
Suppose we are conducting a test with the following hypotheses: H0: σ12≤σ22 H1: σ12>σ22 If we gather sample 6 observations from population 1 and a sample of 10 observations from population 2. Suppose we calculated the following: s12= 15 and s22=7. What would the test statistic be?
2.143
Suppose you have an ANOVA with 5 treatments and 6 observations per treatment. Given the following partial ANOVA table, what would the sum of squares treatment be?
300
A company that provides training courses for professional exams wants to study mean exam scores by looking at type of course (on-line, in-class) and length of course (1 week, 2 weeks, 4 weeks). If a two-way ANOVA with interaction was performed, what would be the factors and what would be the response variable.
Factor → Type of course, Length of course Response Variable → Exam score
Suppose we wish to conduct a test to see if the population mean sales for a product with 4 different colored package designs (red, blue, green, yellow) is the same. Which of the following would be the correct null?
H0: μr=μb=μg=μy
Which of the following are correct general form of the null and alternative hypothesis for a test for interactions? Select two.
Ho: There is no interaction between Factor A and Factor B. H1: There is interaction between Factor A and Factor B.
How do we use the confidence interval for difference in difference in treatment means?
If the confidence interval contains 0, we cannot conclude there is a difference in the population means.
A trucking company wants to study the effect of brand of tire (A, B, C) and brand of gasoline (1, 2) on miles per gallon. If a two-way ANOVA with interaction was performed and the p-value for interaction was 0.425, what would this suggest? Assume α=0.05.
Interaction is not present; continue with interpretation of two-factor results for each factor.
Suppose the manager of a restaurant decided to run an advertisement in the local newspaper. The manager decided to run the ad in different sections of the paper (local, sports, business, leisure) and to run it on different days of the week (Tuesday, Thursday, Saturday). If this two-way ANOVA with interaction was performed and the p-value for interaction was 0.025, what would this suggest? Assume α=0.05.
Interaction is present; conduct one-way ANOVA tests for factor.
Which of the following are advantages of using two-way ANOVA? Select all that apply.
It explains more of the variation, which reduces the chance of a Type I error. It allows us to consider other factors besides the treatment.
If the purpose of introducing a another variable into ANOVA was only to reduce the error variation (SSE), then what tests should be performed?
Just the test regarding the treatment variable.
ANOVA assumes that the population variances are the same for all treatments. How do we calculate this value?
MSE = SSE/(n - k)
Identify the correct formula for the Sum of Squares Error in two-way ANOVA.
SSE = SS total - SST - SSB
Statistical software analysis provides the following output for a single factor ANOVA, α =.05. Identify the correct value for the following. Select all that apply.
The denominator degrees of freedom for F crit = 32 F test = 3.036 F crit = 3.2945
Three plots of corn are grown using different types of fertilizer. We want to compare their average height using ANOVA. What is the "treatment" in this case?
The different types of fertilizer used.
Suppose you have an ANOVA with 3 treatments and 6 blocks. Given the following partial ANOVA table, what would the sum of squares for blocks be?
220
Suppose you are conducting a two-factor ANOVA with interaction and you have 3 levels of Factor A, 5 levels of Factor B, and a total of 30 observations. Given the following partial ANOVA table, what would the sums of squares for Factor A be?
100
What is the name for a second treatment variable in ANOVA that reduces the error term by explaining more of the variation?
A Blocking Variable
What does the acronym "ANOVA" stand for?
Analysis of Variance
Which of the following are characteristics of the F-distribution? Select all that apply.
It cannot be negative. It is a family of distributions. It is a continuous distribution.
Here is a list of five characteristics of the F-distribution. Which ones are not characteristics of the t-distribution? Select all that apply.
It is positively skewed. It cannot be negative.
Refer to the Two-way ANOVA Table shown here. Match the variables to their description.
MSB/MSE →F statistic for blocks SST →Sum of squares for treatments (k-1)/(b-1) → df for error k →number of treatments
If the confidence interval contains 0, we cannot conclude there is a difference in the population means.
MSE = SSE/(n - k)
The confidence interval for the difference in treatment means is given by (X1X1- XX2) ± t √MSE(1n1+1n2)MSE(1n1+1n2). Match the variables to their description.
MSE → Mean square error X2 → Mean of treatment number 2 t → Cutoff value of t-distribution n1 → Observation of treatment 1
Under what circumstances is it appropriate to follow ANOVA analysis with analysis of individual treatment means?
Only if the null hypothesis of equal means is rejected.
Suppose you have an ANOVA with 5 treatments and 6 observations per treatment. At the α=0.05, what would the decision rule be?
Reject H0 if F > 2.76
uppose you have an ANOVA with 5 treatments and 6 observations per treatment. At the α=0.05, what would the decision rule be?
Reject H0 if F > 2.76
Suppose you have an ANOVA with 3 treatments and 5 observations per treatment. At the α=0.01, what would the decision rule be?
Reject H0 if F > 6.93
What feature of an interaction plot suggests an interaction between factors?
The line segments cross, or do not appear to be parallel.
What are the three assumptions that have to be made to use ANOVA?
The populations are normally distributed. The populations are independent. The populations have equal standard deviations.
What is the definition of "Treatment Variation" (SST) in the context of ANOVA?
The sum of the squared differences between each treatment mean and the overall mean.
Which of the following formulas describes the calculation of the variance of the blocking variable (the SSB) in ANOVA?
kΣ(XXb- XXG)^2
The F-statistic depends on two numbers for degrees of freedom. Choose the two correct descriptions of df from the list. Sample size is n and number of treatments is k.
n - k, the df of the denominator k - 1, the df of the numerator
The F-statistic depends on two numbers for degrees of freedom. Match the variable or expression to its description.
n → total number of observations k → number of treatments k-1 → df in the numerator of F n-k → df in the denominator of F
As part of the calculation of the treatment variation, a value is subtracted from the treatment means. Which of the following is subtracted from each treatment mean?
the overall mean.
To find the total variation, a value is subtracted from each individual observation and then the differences are squared and totaled. Which of the following is subtracted from each observation?
the overall mean.
What is meant by the term "treatment" in the context of ANOVA?
"Treatment" refers to the different populations being examined.
Suppose we are conducting a test with the following hypotheses: H0: σ12=σ22 H1: σ12≠σ22 If we gather sample 6 observations from population 1 and a sample of 10 observations from population 2. Suppose we calculated the following: s12=5 and s22=12. What would the tests statistic be?
2.40
Which one of the following is part of the underlying strategy behind ANOVA?
Comparison of mean through two sample estimates of the population variance.
Suppose a trucking company wants to compare the population average miles per gallon for four brands of gasoline. If each gasoline is tested on six trucks what would the numerator degrees of freedom be?
3
Sir Ronald Fisher's F-distribution is used in many statistical tests. Pick two from the following list that use the F-distribution for their test statistic.
Analysis of Variance (ANOVA). Test of two samples to see if they are from populations with the same variance.
On an Interaction Plot, which of these features indicate an interaction between factors?
One line crosses one or more of the other lines.
What pieces of information do you need to look up a critical value for the F distribution? Select all that apply.
The degrees of freedom (n2-1) of the denominator. The degrees of freedom (n1-1) of the numerator. The significance level (0.05 or 0.01)
Suppose the manager of a restaurant decided to run an advertisement in the local newspaper. The manager decided to run the ad in different sections of the paper (local, sports, business, leisure) and to run it on different days of the week (Tuesday, Thursday, Saturday). In this two-way ANOVA with interaction, what would be an Interaction for this analysis?
The relative customer response to the different sections of the paper changes for the different days of the week.
If a blocking variable seems likely to be of the same importance as the treatment variable, what term is used to describe an ANOVA analysis?
Two-factor experiment
Suppose we wish to conduct a test to see if the population mean sales for a product with 4 different colored package designs (red, blue, green, yellow) is the same. Which of the following would be the correct alternative?
H1: At least one mean is different.
Which of the following is the correct general form of the null hypothesis for a test for interactions?
Ho: There is no interaction between Factor A and Factor B.
If we have conducted an ANOVA test of equality of means, and failed to reject the null hypothesis of equality of means, what is the next step?
Nothing, the analysis is complete.
Which one of the following conditions would invalidate the use of ANOVA?
One of the populations has a highly skewed distribution.
A manufacturer of plastic chairs decides to study the effect of temperature and molding time on the strength of the chairs. The manufacturer decided to the strength of chairs made at three temperatures and three molding times. In this two-way ANOVA with interaction, what would be an Interaction for this analysis?
The relative strength for different molding times changes for the different temperatures.
What is the definition of "Random Variation" (SSE) in the context of ANOVA?
The sum of the squared differences between each observation and its treatment mean.
What is the definition of "Total Variation" (SS total) as used in ANOVA?
The sum of the squared differences between each observation and the overall mean.
What does it signify if the confidence interval for the difference in treatment means contains 0?
The treatment means could be the same.
In a two way ANOVA the F statistic for Treatments is 3.23 with p-value of 0.006, and the F statistic for Blocks is 1.46 with p-value of 0.187. What can you conclude from this? Assume α=0.05.
There is a not significant difference in block means, but there is a significant difference in the treatment means.
In a two way ANOVA the F statistic for Treatments is 2.14 with p-value of 0.095, and the F statistic for Blocks is 3.56 with p-value of 0.018. What can you conclude from this? Assume α=0.05.
There is a significant difference in block means, but not in the treatment means.
The F-distribution (named for Sir Ronald Fisher) is widely used in statistics. Which two of the following make use of this test statistic?
To test if two samples are from populations with equal variances. To compare several population means simultaneously.
What is the relationship between total variation (SS total), treatment variation (SST), and random variation (SSE)?
Total variation is treatment variation plus random variation.
Why should you use ANOVA instead of conducting t-test for each pair of means?
Type I error builds up to an unsatisfactory level when testing pairs of means.
Suppose a water bottler has two machines that fill 16 ounce bottles with spring water. Both machines fill bottles on average to 16 ounces, but the manager of the bottling plant believes the older machine (Machine 1) has a larger variance. Which of the following could be sets of test hypotheses that could be used to test to see if the manager is correct?
H0: σ12≤σ22 H1: σ12>σ22
Which of the following could be hypotheses for a test of two variances? Select all that apply.
H0: σ12≥σ22 H1: σ12<σ22 H0: σ12=σ22 H1: σ12≠σ22
When you look up a critical value in an F distribution table, what number do you use for the degrees of freedom of the numerator?
n1-1, where n1 is the size of the sample with larger s2.