Data Analysis for Managers: Chapters 11 & 15
The degrees of freedom for a chi-square test of independence on a contingency table with r rows and c columns are
(r-1)(c-1)
Identify which of the examples would be appropriate for a chi-square test for independence.
- A lawyer wants to show whether or not gender and the ability to be promoted are related. - A researcher wants to determine whether or not the race of a driver will influence whether a police officer will search a vehicle, once stopped.
One-factor ANOVA can be used to answer which of the following questions?
- Are the observed differences in overtime hours at four different manufacturing plants due to chance or too great to be attributed to chance? - Are the observed differences between four manufacturing plants' defect counts due to chance or too great to be attributed to chance?
Quantitative response variable
- Length of orthopedic surgery - Recovery time after surgery
Identify the reasons why one might analyze quantitative data using a contingency table.
- The researcher might have a mixture of quantitative and qualitative variables. - The assumption of a normal distribution, required for regression, might be invalid. - The researcher might not want to make an assumption about a mathematical form for the relationship between X and Y.
Categorical independent variable
- Type of fracture - Hospital Location
Choose the correct expressions for variation in a response variable Y from the choices given.
- Variation in Y = Variation due to factor(s) + Variation from random error - Variation in Y = Explained Variation + Unexplained Variation
Analysis of Variance assumes that
- populations are normally distributed. - population variances are equal.
The Tukey test statistic relies on
- the sample sizes nj and nk - the difference in group means, |ȳj - ȳk| - the MSE
Suppose an ANOVA experiment is comparing means across 4 different categories. If one were to perform the necessary paired t tests for all 4 categories with an alpha = .01, what would the overall Type I error probability be?
.0585
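The .0585 figure can be reproduced with a short calculation. This is a minimal sketch that assumes the pairwise tests are independent; the function name `familywise_error` is illustrative and not from the source.

```python
from math import comb


def familywise_error(num_groups: int, alpha: float) -> float:
    """Chance of at least one Type I error across all pairwise tests.

    Assumes the individual tests are independent (a simplifying assumption).
    """
    num_comparisons = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** num_comparisons


print(comb(4, 2))                           # 6
print(round(familywise_error(4, 0.01), 4))  # 0.0585
```

With four categories there are six pairs, so the overall error probability is 1 - (0.99)^6.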
A multinomial distribution has k probabilities that sum to
1
The test for independence for a contingency table is unreliable when the expected frequencies in each cell are less than or equal to
5
How many pairwise comparisons need to be performed when a factor has four treatments or levels?
6
You can find the critical value of the F statistic with df1 and df2 degrees of freedom in Excel using the function
=F.INV.RT(α, df1, df2)
True or false: The mean of each group is calculated by taking a weighted average of the c sample means.
False
True or false: There are no differences between the data formats of replicated and unreplicated two-factor ANOVA.
False
True or false: A statistically significant result is always of practical importance.
False
True or false: Only multinomial distributions can be tested using a GOF Test.
False
True or false: The alternative hypothesis in a one-factor ANOVA states that all the treatment means are different from each other.
False
True or false: When conducting the chi-square test for independence one can skip stating the decision rule because the decision rule is always the same.
False
The test statistic for a one-way ANOVA test follows the
F distribution with df1 and df2 degrees of freedom.
The Sums of Squares formula for a non-replicated two-factor ANOVA is SST =
SSA + SSB + SSE
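The decomposition can be checked numerically. The sketch below assumes a table with one observation per row/column combination; the data values and the function name `two_factor_sums_of_squares` are hypothetical.

```python
from math import isclose


def two_factor_sums_of_squares(table):
    """Return (SST, SSA, SSB, SSE) for an unreplicated two-factor layout.

    table[j][k] is the single observation for row level j and column level k.
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]
    ssa = c * sum((m - grand) ** 2 for m in row_means)   # variation due to factor A (rows)
    ssb = r * sum((m - grand) ** 2 for m in col_means)   # variation due to factor B (columns)
    sse = sum(
        (table[j][k] - row_means[j] - col_means[k] + grand) ** 2
        for j in range(r)
        for k in range(c)
    )                                                    # unexplained variation
    sst = sum((y - grand) ** 2 for row in table for y in row)
    return sst, ssa, ssb, sse


# Hypothetical 3 x 3 data, one observation per cell.
data = [[10, 12, 14], [11, 15, 13], [9, 14, 16]]
sst, ssa, ssb, sse = two_factor_sums_of_squares(data)
print(isclose(sst, ssa + ssb + sse))  # True
```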
The Sums of Squares formula for a replicated two-factor ANOVA is _____ = SSA + SSB + _____ + _____
SST; SSI (or SSAB); SSE
For a 2x2 contingency table, testing for independence with the chi-square test is the same as conducting a _____ test comparing two proportions.
Z
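The equivalence can be illustrated by computing both statistics on the same table: the chi-square statistic equals the square of the pooled two-proportion z statistic. The counts below are hypothetical, and the function names are illustrative.

```python
from math import sqrt


def chi_square_2x2(table):
    """Chi-square statistic for a 2 x 2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for j, row in enumerate(table):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n  # expected frequency
            chi2 += (f - e) ** 2 / e
    return chi2


def z_two_proportions(table):
    """Pooled z statistic; rows are groups, first column counts 'successes'."""
    (a, b), (c, d) = table
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    return (p1 - p2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))


# Hypothetical counts: two groups of 50, success/failure in the columns.
counts = [[30, 20], [20, 30]]
print(round(chi_square_2x2(counts), 6))          # 4.0
print(round(z_two_proportions(counts) ** 2, 6))  # 4.0
```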
In Analysis of Variance, a factor is defined as
a categorical independent variable that explains variation in a response, or dependent, variable.
The goodness-of-fit test statistic depends on the _____ and ____ frequencies.
actual; expected
The acronym ANOVA stands for
analysis of variance
To determine if there is a difference between the means of 3 or more populations, we use
analysis of variance.
The reason why we perform an analysis of variance for comparing means rather than conducting multiple two-mean comparisons is
because multiple two-mean comparisons increase the Type I error probability.
A randomized block design is an experiment where subjects within each ____ are randomly assigned to each of the ____
block; treatments
When formatting data for a one-factor ANOVA analysis the observations for each treatment are typically presented in _____ where each one represents a treatment or factor level.
columns
The chi-square test for independence analyzes frequency data from a _______ table
contingency
A ____ table is used to summarize _____ frequencies from customer surveys or designed experiments.
contingency; response
The expected frequency for each cell in a contingency table is calculated using the formula
ejk = (RjCk)/n
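As a rough illustration of the formula, the sketch below builds the table of expected frequencies from row totals Rj, column totals Ck, and the grand total n. The observed counts are hypothetical.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rj * ck / n for ck in col_totals] for rj in row_totals]


# Hypothetical 3 x 2 table of observed counts.
observed = [[20, 30], [30, 20], [10, 40]]
for row in expected_frequencies(observed):
    print(row)  # each row prints [20.0, 30.0]
```

Every row has the same total here (50), so every row gets the same expected counts; the observed counts differ from them, which is what the test statistic measures.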
The hypotheses tested in a one-factor ANOVA are: H0: All the treatment means are ____ vs. H1: At least _____ treatment mean is different.
equal; one
Variation of a dependent random variable about its mean is categorized as either
explained; unexplained
A random variable's variation about its mean can be attributed to known ______, called explained variation, or is simply ____ error, called unexplained variation.
factors; random
True or false: If an interaction effect is present between two factors the plot of means will show parallel lines.
false
A two-factor ANOVA without replication is called a
fixed-effects model
Tukey's HSD method ensures that the probability of a Type I error equals α
for any number of pairwise comparisons.
Because a chi-square test is always right-tailed, the decision rule is to reject H0 when χ²calc is ____ than χ²critical
greater
In an ANOVA if the SSE is relatively _____, then we would fail to reject the ____ hypothesis
high; null
Even if an ANOVA study shows significant factor effects, the magnitude of the effect may not be ______. The researcher or user of the study results will have to determine this.
important
When stating the null hypothesis for a Chi-square test of independence, the null hypothesis always assumes the two variables are
independent
In a data table for a two-factor replicated ANOVA, the ______ of each row/column is a treatment.
intersection
The chi-square test is used to test goodness-of-fit because
it is easy and versatile.
Each probability distribution has its own ____ about the underlying process.
logic
Levene's test does not assume a _____ distribution
normal
Hartley's test for equal variances assumes the populations are _____ distributed
normally
Using the Tukey method, there are no significant differences to find if the ANOVA test does not reject the _____ hypothesis of _____ means.
null; equal
In the chi-square test for independence, the null hypothesis assumes two categorical variables are independent. If the variables are independent then the contingency table ______ frequencies should be close in value to the ____ frequencies.
observed; expected
A contingency table shows frequencies for _______ or ______ variables.
ordinal; categorical
Replication in a two-factor ANOVA can add ____ to the test
power
For a 2x2 contingency table, testing for independence with the chi-square test is the same as conducting a z test comparing two
proportions
The dependent variable in an ANOVA should be
quantitative
In an ANOVA test, we ____ the null hypothesis when the numerator of the test statistic is significantly _____ than the denominator.
reject; higher
For an ANOVA test, the critical value is always found in the _____ tail of the F distribution.
right
The one-way ANOVA test is always a
right-tailed test
The ANOVA test is considered ______ to departures from the normality assumption and equal variance assumptions.
robust
The spreadsheet format will show one factor across the ____ and one factor down the _____
rows; columns
Quantitative variables can be analyzed using contingency tables by
summarizing frequencies into bins and using the bins as categories.
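A minimal sketch of the binning idea: a quantitative value is mapped to a class label, and the labels are then counted like any other category. The records, bin edges, and the helper name `bin_label` are hypothetical.

```python
from collections import Counter


def bin_label(value, edges):
    """Map a number to a class label, given ascending bin edges (non-empty)."""
    lower = None
    for upper in edges:
        if value < upper:
            return f"< {upper}" if lower is None else f"{lower} to < {upper}"
        lower = upper
    return f">= {lower}"  # open-ended top class


# Hypothetical records: (fracture type, recovery time in days).
records = [
    ("simple", 12), ("simple", 25), ("simple", 33),
    ("compound", 18), ("compound", 41), ("compound", 52),
]
edges = [20, 40]

# Contingency-table cell counts keyed by (category, bin).
counts = Counter((ftype, bin_label(days, edges)) for ftype, days in records)
print(counts[("simple", "20 to < 40")])  # 2
print(counts[("compound", ">= 40")])     # 2
```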
In an ANOVA, if the SSE is relatively high then this means:
that the factor effects do not differ significantly from zero.
A two-factor non-replicated ANOVA design is often used because
- the observations are expensive. - repeated treatment observations can be impossible to collect.
When analyzing difference in means between treatments in a single factor, a completely randomized model means that
the subjects or individuals are assigned randomly to each treatment.
Small expected frequencies can create a problem for a chi-square test because
they inflate the value of the chi-square statistic because ej is in the denominator.
A two-way ANOVA with interaction uses __________ F tests. One for each ____________ effect and one for the ____________ effect
three; main; interaction
The random error in the ANOVA linear model, εij, is assumed
- to be normally distributed. - to have a zero mean and constant variance.
A factor is a categorical independent variable, and its categories are called
treatments
The one-factor ANOVA linear model reduces to yij = μ + εij if the null hypothesis is
true
Two-way ANOVA tests simultaneously examine the effect of ____ factors on the population mean.
two
Replication in a two-factor ANOVA allows
us to test for interaction between factors as well as significant effects due to individual factors.
A goodness-of-fit test is
used to compare a set of sample observations against an assumption about a population distribution.
If a two-factor ANOVA study has only one factor of research interest the second factor is
used to control for potential confounding influences.
Identify the correct two-factor ANOVA linear model.
yjk = μ + Aj + Bk + εjk
Identify the correct two-factor, with replication, ANOVA linear model.
yijk = μ + Aj + Bk + ABjk + εijk
Identify the correct one-factor ANOVA linear model.
yij = μ + Tj + εij
If none of the factors have an effect on the mean nor are there significant interaction effects between the factors, the linear model reduces to
yijk = μ + εijk
Which of the following is the test statistic for the chi-square goodness-of-fit test?
χ² = Σ_{j=1}^{k} (f_j − e_j)² / e_j
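The goodness-of-fit statistic sums the squared gap between each observed and expected frequency, scaled by the expected frequency. The sketch below assumes a three-category multinomial with hypothesized proportions 0.3, 0.4, and 0.3; the observed counts are hypothetical.

```python
def chi_square_gof(observed, hypothesized_probs):
    """Sum over categories of (observed - expected)^2 / expected."""
    n = sum(observed)
    expected = [p * n for p in hypothesized_probs]  # e_j = n * pi_j
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


# Hypothetical sample of 100 items in three categories.
observed = [25, 45, 30]
stat = chi_square_gof(observed, [0.3, 0.4, 0.3])
print(round(stat, 4))  # 1.4583
```

The statistic would then be compared with a right-tail critical value from the chi-square distribution, which this sketch does not compute.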
The chi-square distribution has which of the following characteristics:
- χ² critical depends on the degrees of freedom, (r-1)(c-1). - χ² is never negative.
The test statistic for the chi-square test for independence is
χ²calc = Σ_{j=1}^{r} Σ_{k=1}^{c} (f_jk − e_jk)² / e_jk
The test statistic for a one-way ANOVA test =
MSB/MSE
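A worked sketch of the ratio: MSB is the between-group sum of squares divided by c - 1, and MSE is the within-group sum of squares divided by n - c. The three groups below are hypothetical data, and the function name is illustrative.

```python
def one_way_anova_f(groups):
    """Return (F, df1, df2) where F = MSB / MSE."""
    n = sum(len(g) for g in groups)   # total number of observations
    c = len(groups)                   # number of treatments
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group (explained) variation.
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Within-group (unexplained) variation.
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    msb = ssb / (c - 1)
    mse = sse / (n - c)
    return msb / mse, c - 1, n - c


# Hypothetical observations for three treatments.
f_stat, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 4), df1, df2)  # 13.0 2 6
```

Here SSB = 26 and SSE = 6, so MSB = 13 and MSE = 1.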
The test for comparing more than two population variances without assuming normal populations is called
Levene's test
The steps for conducting the chi-square test for independence are: State the Hypotheses, Specify the Decision Rule, Make the Decision, Take Action. One step is missing from this list. Identify the missing step.
Calculate the χ² test statistic.
A rule known as ______'s Rule requires that each cell have an expected frequency greater than five.
Cochran
Choose the correct null hypotheses for a two-factor, without replication, ANOVA. Assume factor A is the factor to be tested and factor B is the level variable.
H0: A1 = A2 = . . . = Aj = 0
The null hypothesis for a fixed-effects ANOVA model is
H0: A1 = A2 = A3 = . . . = Aj
Choose the correct null hypotheses for a two-factor, with replication, ANOVA.
- H0: All ABjk = 0 - H0: A1 = A2 = . . . = Aj - H0: B1 = B2 = . . . = Bk
The hypotheses for the chi-square test for independence are
H0: The two variables are independent. H1: The two variables are not independent.
One-way ANOVA hypotheses are:
H0: μ1 = μ2 = μ3 = μ4; H1: At least one μ is different.
A researcher would like to test the assumption that a particular brand of laundry detergent is twice as popular as the two next best selling brands which she believes have equal popularity. The researcher conducts an experiment with a focus group to determine the proportion of times each of the brands is selected. The null hypothesis would be:
H0: π1 = .5, π2 = .25, π3 = .25
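The hypothesized proportions follow from the stated popularity ratio 2 : 1 : 1, scaled so that they sum to 1. A small sketch of that arithmetic (the function name is illustrative):

```python
from fractions import Fraction


def proportions_from_ratios(ratios):
    """Scale relative popularity weights so the proportions sum to 1."""
    total = sum(ratios)
    return [Fraction(r, total) for r in ratios]


# First brand twice as popular as each of the other two.
probs = proportions_from_ratios([2, 1, 1])
print([float(p) for p in probs])  # [0.5, 0.25, 0.25]
print(sum(probs) == 1)            # True
```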
The alternative hypothesis for a fixed-effects ANOVA model is
H1: At least one Aj is different.
The hypotheses H0: σ₁² = σ₂² = σ₃² = . . . = σc² vs H1: The variances are not all equal can be tested using
Hartley's test
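Hartley's test statistic is the ratio of the largest sample variance to the smallest. The sketch below computes only that ratio on hypothetical data; the critical value comes from a Hartley table and is not computed here.

```python
from statistics import variance


def hartley_ratio(groups):
    """Largest sample variance divided by the smallest sample variance."""
    variances = [variance(g) for g in groups]  # sample variances (n - 1 divisor)
    return max(variances) / min(variances)


# Hypothetical samples from three populations.
samples = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]
print(hartley_ratio(samples))  # 4.0
```

A ratio near 1 is consistent with equal variances; a large ratio is evidence against H0.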
A candy company claims that the proportion of red, green, and yellow jelly beans in a bag is 0.3, 0.4, and 0.3. Which of the following is the correct set of hypotheses for testing the company's claim?
H0: πred = 0.3, πgreen = 0.4, πyellow = 0.3; H1: At least one of the π's is different from its hypothesized value.
Which of the following is NOT an assumption for performing a one-way ANOVA?
The population standard deviations are unknown but assumed unequal.
Which of the following is NOT an example of a multinomial distribution?
The probability that an exam can be completed in less than 30 minutes is 1 − e^(−30λ)
True or false: A quantitative variable might be included in a contingency table when the second variable is categorical.
True
True or false: Analysis of variance assumes homogeneous variances.
True
True or false: If no interaction exists between factors the plot of means should show parallel lines.
True
True or false: Open-ended classes are acceptable in contingency tables.
True
The ____ test was developed by the 20th century statistician John Tukey. It performs pairwise comparisons of ____ simultaneously.
Tukey's; means
