Psychology Statistics Final
SStot
= SSag + SSwg
more similar than would be expected by chance
F-statistics much less than 1.0 (close to 0) would suggest that the groups are
A posteriori tests require that the omnibus test be conducted, but they do not require it to reject its null hypothesis; they can control the familywise error rate directly.
How do post-hoc tests relate to omnibus tests?
both standard errors are small.
Suppose you were comparing two regression coefficients. The t-statistic tends to be greatest when:
linearity
Which assumption of the ANOVA is considered the most important?
cluster
With what type of sampling would we use hierarchical linear models?
denominator
all combinations yield positively skewed distributions, but the skewness seems to be reduced with larger sample sizes in the
nature of the differences among groups
completing multiple t-tests is also problematic, as it does not fully consider the
sum-of-squares total (SStot)
defined as the variability of the individual scores about the grand mean
df = n - 3
degrees of freedom in a dependent (overlapping) correlation
ANOVA
is an extension of the t-test, so our notational scheme is merely an extension of that presented earlier; however, it is a bit more general
mean square within groups (MSwg)
is not affected by differences among the groups, as it is calculated by looking at the spread within each of the groups separately
independence of observations
many researchers administer "treatments" to groups of students (cases) assembled specifically for purposes of the study and/or collect outcome data in group situations, thus creating a potential problem with regard to
how to calculate deviations from the mean in a two group situation
one would simply find the deviations for each of the groups from their respective means, take the absolute values, and then compare the means of the absolute values with a t-test.
SSwg
looks at the variability of the individual scores about their respective group means
the observed t-statistic is distributed as t with (n - 2) df.
Pearson's correlation null hypothesis
F-statistic
the ratio of two sample variances
means differ from one another
the variability of the group means can be interpreted as the degree to which the
skewed rather than symmetric
variances are different from means in that their sampling distributions are
Pearson's correlation
A Test to Compare Two Variances of Dependent Samples
Bonferroni technique
A popular adjustment to the inflated error rate problem associated with multiple t-tests is to divide the overall level of significance by the number of tests, known as the
4
A researcher computes SSag = 600, dfag = 3, SSwg = 5,000, and dfwg = 100. The F-statistic is:
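A quick check of the arithmetic in R, using the definitions MSag = SSag/dfag and MSwg = SSwg/dfwg:

```r
MSag <- 600 / 3      # SSag / dfag = 200
MSwg <- 5000 / 100   # SSwg / dfwg = 50
MSag / MSwg          # F = 200 / 50 = 4
```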
at least one pair is different
A researcher conducting an ANOVA obtains a p-value of .0015. What should the researcher conclude about the population means?
2, 0, -1, -1
A researcher is comparing four groups using a series of planned comparisons. If the researcher wishes to compare the first group to the average of the last two groups, the coefficients for this comparison would be:
yes
A researcher is comparing four groups with equal sample sizes. Do the coefficients 1, -1, 0, 0 define a comparison that is independent of the one defined by the coefficients 0, 0, 1, -1?
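A quick check in R: with equal sample sizes, two comparisons are independent (orthogonal) when the sum of the products of their corresponding coefficients is zero:

```r
c1 <- c(1, -1, 0, 0)
c2 <- c(0, 0, 1, -1)
sum(c1 * c2)   # = 0, so the comparisons are independent
```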
15
A researcher is making a set of planned comparisons. If the coefficients for the first comparison are 1, 0, -1 and the sample means are 70, 65, and 55, what is the estimated value of the contrast, d?
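The arithmetic in R: the contrast value d is the sum of each coefficient multiplied by its group mean:

```r
coefs <- c(1, 0, -1)
means <- c(70, 65, 55)
sum(coefs * means)   # d = (1)(70) + (0)(65) + (-1)(55) = 15
```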
Scheffé procedure
A researcher is making comparisons among five groups. She is interested in all of the possible pairwise comparisons, plus four complex comparisons. What analysis would you recommend?
Planned comparisons with the Bonferroni-Holm sequentially rejective procedure.
A researcher is making comparisons among five groups. She is interested in three specific comparisons (two complex comparisons and one pairwise comparison). Which of the following methods would you recommend?
no effect on the value of d
A researcher is making comparisons among groups. Consider the comparison of Group 1 to Group 2. An outlier in the third group would have what effect on the estimated contrast value, d?
decrease the value of F
A researcher is making comparisons among groups. Consider the comparison of Group 1 to Group 2. The presence of an outlier in the third group would have what effect on the F statistic from the distribution-based approach for testing the contrast?
Levene's test
A researcher wishes to compare the variances of two groups. If she suspects the distributions are not normal, which of the following tests would you recommend?
k groups
ANOVA involves several groups; in general, we say that there are
the difference between the first sample's mean and the second sample's mean
After calculating HSD, against what do we compare it to discover whether pairwise differences exist?
only the 5 of interest
Assume you're using resampling techniques for a comparison between groups. If you have ten groups but are interested in comparing only five of them, should you resample from all ten or just from the five groups of interest?
conservative
Bonferroni technique is considered rather
practical significance of the results
Both descriptive statistics and measures of effect size are critical pieces of information to have in order to determine the
less likely to find differences
Consider a researcher wishing to make all possible pairwise comparisons among five groups. Relative to the Tukey HSD test, the Scheffé procedure is:
nature of the relationship
Correlations address the issue of "strength of relationship," but they do not tell you anything about the
k-1
Degrees of freedom among groups (dfag)
k(n-1)
Degrees of freedom within groups (dfwg)
sufficiently large
Due to the Central Limit Theorem, the t-test is fairly robust with respect to the assumption of normality when the sample sizes are
normality
F-test, noting that it was tied to an assumption of
units on which the variables are scaled.
First, we acknowledge that transforming data changes the values, and that is precisely the purpose. From a pure measurement perspective in which the numerals have a clear relationship to the external world (i.e., ratio scales), transformations may become problematic as they change the
With regard to correlations, there are two different ways in which the coefficients may be dependent.
First, you may have data on several variables on one group. If you are interested in comparing how different variables (X and Y) correlate with a third variable, Z, you will compare rxz with ryz. Because the two coefficients share one variable, they are said to be dependent
two groups
Fisher extended the work of Gosset to situations with more than
variance among the group means
Fisher showed that the question of the equality of several groups can be expressed in terms of the
population variance, σ2.
Following Gosset's logic in pooling the variances from two samples, Fisher argued that forming a weighted average of the several sample variances, sj2, would be an estimate of the
0.4
For Cohen's f, what value is considered to show a large effect size?
0.25
For Cohen's f, what value is considered to show a moderate effect size?
0.1
For Cohen's f, what value is considered to show a small effect size?
when we want to correct violations of the assumptions
For what purpose might we use a transformation on our data before performing an ANOVA?
dependent
Given situations in which there are two samples of observations, it is possible that the samples might be
separately, as with the chi-square distribution
Given that F-distributions are skewed, as noted earlier, we will need to consider the lower and upper tails
They both deal with the average deviation, which is the mean of the absolute values of the differences between the observations and their mean.
How do Levene's statistic and the Fligner-Killeen test statistic deal with the problem of being sensitive to departures from normality?
We would divide the mean square among groups by the mean square within groups.
How do we calculate the Fobs for the ANOVA?
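A minimal R sketch of this calculation; the data values are invented for illustration:

```r
# Three groups of five scores each (hypothetical data)
scores <- c(4, 6, 5, 3, 7,  9, 8, 11, 10, 12,  15, 13, 16, 14, 17)
group  <- factor(rep(c("G1", "G2", "G3"), each = 5))

tab <- summary(aov(scores ~ group))[[1]]
tab                                     # the "Mean Sq" column holds MSag and MSwg
tab[1, "Mean Sq"] / tab[2, "Mean Sq"]   # reproduces the reported F value
```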
We would divide the sum of squares among groups by the degrees of freedom among groups.
How do we calculate the MSag?
We would divide the sum of squares within groups by the degrees of freedom within groups.
How do we calculate the MSwg?
Two degrees of freedom, where df1 equals (n1 - 1) and df2 equals (n2 - 1).
How do we calculate the degrees of freedom used to look up the critical values for the two tails of Fcrit?
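A sketch in R using qf(); the sample sizes n1 and n2 are assumptions for illustration:

```r
n1 <- 16; n2 <- 21
df1 <- n1 - 1   # numerator degrees of freedom
df2 <- n2 - 1   # denominator degrees of freedom
qf(.025, df1, df2)   # lower critical value (cuts off 2.5% in the lower tail)
qf(.975, df1, df2)   # upper critical value (cuts off 2.5% in the upper tail)
```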
k (total number of levels) - 1
How do we calculate the dfag?
N (total number of scores in the entire experiment) - k (total number of levels)
How do we calculate the dfwg?
MANOVA differs in the fact that there are multiple dependent variables.
How does MANOVA differ from both ANCOVA and ANOVA?
This is a subset of the generalized linear model where the only difference from the multiple regression model is that the dependent variable is categorical.
How does a logistic regression differ from a multiple regression?
In this analysis, the evidence comes from failing to reject the null hypothesis rather than from rejecting it.
How does testing the null hypothesis for this analysis differ from all of the other null hypothesis tests we've done?
If you want to bypass the test of the omnibus null hypothesis and instead focus on a small set of hypotheses of particular interest, you would use a priori (planned) tests; the comparisons are completed and an adjustment is made to the error rate using some type of Bonferroni correction.
How does that relationship differ from the one between a posteriori tests and a priori tests?
In ANCOVA, one of the independent variables is quantitative.
How does the ANCOVA differ from the ANOVA?
It uses the separate sample estimate of t and it adjusts the critical values by changing the degrees of freedom.
How does the Welch-Satterthwaite correction deal with populations that have different variances?
The generalized linear model is identical to the general linear model except some of the dependent variables are categorical instead of quantitative.
How does the generalized linear model differ from the general linear model?
means
Hypotheses about variances are hypotheses in their own right, although considerably less frequent than hypotheses about
0.0125
If a researcher states that he controlled the overall Type I error rate to .05 by using a Bonferroni adjustment when making four planned comparisons, what value of α was used for each comparison?
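The arithmetic in R:

```r
alpha_fw <- .05   # overall (familywise) level of significance
n_tests  <- 4     # number of planned comparisons
alpha_fw / n_tests   # per-comparison alpha = .0125
```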
type I error
If the error components of the observations are not independent, then we run an increased risk of a
μ1 = μ2 = μ3 = μ4
If the four samples are actually random samples from the same population, then it is reasonable to assume that
too large
If the large sample is paired with the large variance, then the weighted average (s2pooled) will be
1
If the null hypothesis for a variance test is true, what should the F-statistic between the two variances equal?
They should both equal the same variance (the population variance), and the F-ratio should be approximately 1.
If the null hypothesis for the ANOVA is true, how does the expected value of the MSwg compare to that of the MSag?
population variance
If the null hypothesis for the ANOVA is true, what is the expected value for the MSag?
same variance
If the null hypothesis is true in ANOVA, then these two mean square values should be estimates of the
pooled-samples t-test which assumes homogeneity of variance.
If the sample sizes are the same and sufficiently large to justify the assumption of normality (Central Limit Theorem), you are also probably on safe ground to use the
same quantity, σ2
If the samples are indeed random samples from the same population, then both the mean squares among groups, MSag and MSwg, estimate the
expected value
If there are no treatment effects (τj = 0 for all values of j), then the MSag and the MSwg have the same
"critical values" for the appropriate probability distribution
If we are going to use F-distributions to test hypotheses about variances, we will need a way to find the
they share a variable
If we say that two dependent correlations are overlapping, what do we mean?
pooled samples estimate of t
If you decide that the variances are sufficiently similar, given the differences in the sample sizes, proceed with the
not relevant; degrees of freedom are infinite
If you have the entire population, the notion of degrees of freedom is
n1 = 30, n2 = 60, s12 = 100, s22 = 200.
In comparing two means, under which of the following conditions would you be most motivated to use the separate samples t-test with Welch adjusted df?
fixed effect; we are interested in the differences among the levels of the factor
In most applications of one-way designs, the independent variable is considered a
interchangeable if the variances differ
In order to use a resampling approach, Good (2006) noted that there are some limitations on how to proceed, as the observations are not
dependent samples t test
In our testing for differences in location between two groups, we introduced the ___ which can be extended into an ANOVA context with three or more dependent samples
The null hypothesis claims that the predicted and observed variance-covariance matrices match.
In structural equation modeling, what does the null hypothesis claim?
If you have the same sample sizes, then the grand mean is equivalent to the mean of all the observations.
In the formulas for the ANOVA, what is the definition of the term "grand mean"?
1.0
In the situation where the four random samples are from the same population, we noted that both the MSag and the MSwg should be estimates of σ2. In that case, we would expect the F-statistic to be about
Very small sample sizes coupled with non-normal distributions.
In which of the following contexts would you recommend the researcher consider a non-parametric alternative to the ANOVA?
random or systematic
Inferential statistics are employed to assess whether the pattern of variation is
t scores
Into what type of statistic do we transform the regression coefficients?
the number of degrees of freedom (df) for both the numerator and the denominator
Like the t- and χ2-distributions, the F-distribution is actually a family of distributions defined by two parameters,
F-distribution
Like the χ2-distribution, it is asymmetric so that the lower and upper tails must be tabled separately.
SSag/dfag
Mean square among groups (MSag)
SSwg/dfwg
Mean square within groups (MSwg)
Eta-squared
Of the three measures of effect size we looked at for ANOVA, which is considered to be the most biased?
reduced level of significance
One approach used by some researchers to deal with the inflated error rate problem is to conduct each test at a
.05
Only when the samples were as large as 30 and 120 did the procedure recommended by Good produce an error rate close to
the same
Previous research has shown that the t-test is fairly insensitive (i.e., robust) to departures from homogeneity of variance when the two sample sizes are
time series analysis
Problems of sequence are addressed with
equal and the largest group variance is no more than four times the smallest group variance
Regarding the assumption of homogeneity of variance, you are on relatively safe ground when the sample sizes are
normality or of homogeneous variances
Regarding the assumption of the linear model, there is no direct test, but a violation of the assumption will usually show up as a violation of the assumption of
tests of normality
Shapiro-Wilk test, the Shapiro-Francia procedure, and a test based on skewness and kurtosis
cells
Sometimes the groups in ANOVA are called
z-test for two independent correlations.
Suppose you wish to compare two correlation coefficients: the correlation between attitude about school and mathematics achievement for a sample of fifth-grade boys and the correlation between attitude about school and mathematics achievement for a sample of fifth-grade girls. Which approach would you take?
Test for two dependent overlapping correlations.
Suppose you wish to compare two correlation coefficients: the correlation between attitude about school and mathematics achievement for a sample of fifth-grade boys and the correlation between attitude about school and reading achievement for the same sample of fifth-grade boys. Which approach would you take?
numeric dependent variable and a categorical independent variable.
The "group" statistics are merely simplified computational routines to deal with the relationship between a
positively skewed
The F-distribution is:
at least one difference somewhere
The F-statistic as we have presented it is sometimes called the "omnibus" F-test, as it suggests only that there is
t statistic
The F-statistic that is reported by leveneTest is actually an extension of the
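A minimal sketch of running the test in R, assuming the car package (which provides leveneTest) is installed; the data values are invented for illustration:

```r
library(car)   # provides leveneTest()

scores <- c(4, 6, 5, 9, 7,  3, 8, 12, 2, 10)   # hypothetical scores
group  <- factor(rep(c("A", "B"), each = 5))

# Internally, this compares the absolute deviations of each score from its
# group center, and reports the result as an F-statistic
leveneTest(scores ~ group)
```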
the level means and the grand mean
The SSag is based on the deviation between:
the raw scores and the grand mean
The SStotal is based on the deviation between:
the raw scores and the level means
The SSwg is based on the deviation between:
equal, or nearly so
The assumptions of normality and homogeneity of variance are less problematic, particularly when the sample sizes are
levels of the factor
The different groups in ANOVA represent different
t distribution
The first column of the F-table contains squared values from the
assumptions for the technique
The interpretation you can make following an ANOVA actually depends, in part, on whether the data appear to meet the
variables are scaled (measured) in different situations
The major difference between group statistics (t-test, ANOVA) and correlational statistics (Pearson's r, regression) is related to the difference in which the
increase the standardized effect size
The means from three groups are 65, 72, and 84. The last group is found to have an outlier score of 49. Removing this outlier from the data set would:
individual observations and group means.
The means square within groups is based on deviations between:
the F-distribution
The particular member of the family is defined by the two df, where df1 equals (n1 - 1) and df2 equals (n2 - 1).
comparing two dependent (non-overlapping) correlations
There is a second situation, less often encountered, in which one may be interested in comparing two correlations that are considered to be dependent. Similar to the first situation, you have one group for which you have several variables and their correlations. However, in this case, the correlations you want to compare do not share a common (overlapping) variable.
comparing two samples with regard to similarity of relationships
This topic will be treated in two parts: first with regard to the equivalence of correlation coefficients, and second with regard to the equivalence of regression coefficients.
regression coefficients (slopes)
To address the question of whether one variable varies as a function of another the same way in two different populations, we turn our attention to testing hypotheses about
z-scores
To compare two independent correlation coefficients, into what type of score do we have to transform the values of r?
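A sketch of the r-to-z approach in R; the correlations and sample sizes are assumptions for illustration:

```r
r1 <- .45; n1 <- 50   # e.g., the boys' sample
r2 <- .30; n2 <- 60   # e.g., the girls' sample

z1 <- atanh(r1)   # Fisher's r-to-z transformation
z2 <- atanh(r2)

# z-test for two independent correlations
z <- (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
2 * pnorm(-abs(z))   # two-tailed p-value
```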
1 to k
To denote a particular group, we talk about the jth group. Thus, j can take on the values from
i and j
To identify particular cases, or values of Y, we use the subscripts
honestly significant difference
To what does the "HSD" in Tukey's HSD refer?
linear model, normality, or homogeneity of variance.
Transformations may be employed to rectify any one of three types of assumption violations:
If the distributions are the same shape (even though they are not normal), and if you have sample sizes of at least 30 or all the sample sizes are equal
Under what conditions can we perform the test even if those two assumptions for ANOVA are violated?
(1) if the distributions are the same shape; (2) if the sample sizes are large; or (3) if the sample sizes are too small for the central limit theorem but are still equal. If the answer to all three of these conditions is no, then we would use the Welch-Satterthwaite correction.
Under what three conditions can we justifiably choose not to use the Welch-Satterthwaite correction even when the variances are unequal?
independence
Violation of which of the following assumptions is most problematic when conducting an ANOVA?
an independent variable and a dependent variable
We want to remind you that group designs, such as the two-group situation addressed with the t-test and the k-group situation addressed with ANOVA, are really special cases of the more general matter of looking at the relationship between
1. random selection 2. random assignment
What additional two assumptions need to be made to allow generalizing to populations and inferring cause and effect?
1. linearity, 2. fixed effects, 3. normality, 4. independence of errors, 5. homogeneity of variance
What are the five assumptions needed for the ANOVA?
Normality and homogenity of variance
What are the only two assumptions for ANOVAs that we can test directly?
The main effect of factor A, the main effect of factor B, and the interaction between the two effects.
What are the three F-ratios calculated in a factorial ANOVA?
We are comparing two categorical variables to see whether the proportions are independent of the groups or correlated with them.
What are we comparing in a proportion test for multiple groups?
The same people are involved in both, but there are different variables.
What do we mean if we say they're non-overlapping?
Mean square among groups measures how spread out the level means are from each other; when the null hypothesis is true, its expected value is equal to the population variance.
What does the MSag measure?
Mean square within groups focuses on differences among individual scores rather than differences among averages; its expected value is also equal to the population variance.
What does the MSwg measure?
The Scheffé approach pulls the critical values away from the mean to keep the familywise error rate under 5%.
What does the Scheffé approach do to control the familywise error rate?
It remains the same
What happens to the variance test's p-value if you switch the roles of the numerator and denominator?
This is a subset of the general linear model that is just a linear regression with multiple independent variables and one quantitative dependent variable.
What is a multiple regression?
This is the parent model for every parametric analysis in which you use quantitative data; you have a matrix of independent variables, and you are trying to see how they combine to affect the matrix of dependent variables.
What is the general linear model?
They are all sensitive to departures from normality.
What is the major weakness shared by the F-statistic, Hartley's Fmax, and Cochran's C?
The first column of the F-table contains the squared values from the t-distribution; the last row of the first column contains the squared z values; and the bottom row contains the values of a chi-square distribution divided by their respective degrees of freedom. Since you can find the critical values of all of these distributions within a correct F-table, it is the most general of them all.
What makes the F-distribution the most general of all of the distributions we've worked with so far?
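These relationships can be verified numerically in R; the df value below is arbitrary:

```r
df <- 10   # arbitrary degrees of freedom for the demonstration

qt(.975, df)^2         # squared t critical value...
qf(.95, 1, df)         # ...equals the F critical value with (1, df)

qnorm(.975)^2          # squared z critical value...
qf(.95, 1, Inf)        # ...equals the F critical value with (1, Inf)

qchisq(.95, df) / df   # chi-square critical value divided by its df...
qf(.95, df, Inf)       # ...equals the F critical value with (df, Inf)
```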
We must get rid of the negative numbers by adding a constant. For example, if the lowest score is -16, you would add 16 to all the scores.
What must we do first when transforming our data before performing an ANOVA if our dataset has negative numbers?
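A one-line sketch in R; the scores and the choice of a square-root transform are assumptions for illustration:

```r
y <- c(-16, -3, 0, 8, 12)   # hypothetical scores; the minimum is -16
y_shifted <- y + 16         # add the constant so no score is negative
sqrt(y_shifted)             # a transformation (e.g., square root) is now defined
```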
We must reflect the data over the y-axis to get positively skewed scores instead.
What must we do when transforming our data before performing an ANOVA if our dataset is negatively skewed?
The mean of group A is equal to the average of the means of the other groups.
What would the null hypothesis say for a complex comparison?
The mean of group A is equal to the mean of group B.
What would the null hypothesis say for a pairwise comparison?
alpha level divided by the total number of comparisons
What's the Bonferroni correction?
In bootstrapping you are sampling with replacement, while in a permutation approach you are not; you simply take random reassignments of the observations to groups.
What's the difference between how a bootstrap approach finds differences among multiple groups and how a permutation approach accomplishes that same task?
The Bonferroni correction is found to be overly conservative.
What's the problem with using the Bonferroni correction for multiple comparison procedures?
Because you are putting yourself at risk for a familywise error, meaning there is an increased probability of a Type I error across all of the comparisons.
When analyzing three-sample experiments, why would it be a mistake to simply conduct three t-tests?
the slope
When analyzing two linear regressions, what coefficient gets analyzed?
chi-square test
When performing a proportion test for multiple groups, what previously-studied statistical test are we using?
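A minimal R sketch; the counts are invented for illustration:

```r
# Hypothetical success/failure counts for three groups
counts <- matrix(c(30, 20,
                   25, 25,
                   15, 35),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste("Group", 1:3), c("Success", "Failure")))

chisq.test(counts)   # tests whether the proportions are the same across groups
```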
greater than 1
When the null hypothesis for an ANOVA is false, the F-statistic tends to be:
1
When the null hypothesis that the variances are equal is true, the F-statistic tends to be close to:
sample means
When the sample sizes are large (n > 30), we can invoke the central limit theorem regarding the distribution of the
"group" means as the unit of analysis.
When there may be potential problems of non-independence, whether these problems arise from using intact groups or from research procedures, they have suggested using
A pairwise comparison only involves two groups while a complex comparison involves more than two groups.
When using multiple comparison procedures, what's the difference between a pairwise comparison and a complex comparison?
it increases
When using the formula for the variance of two dependent samples, what happens to the overall value of the t-scores as the correlation between the variables gets larger?
when the 5 assumptions cannot be met
When would we use a Kruskal-Wallis test instead of an ANOVA?
F
Which of the following mathematical probability distributions is the most general?
The last one, the interaction between the two effects (Faxb)
Which of those F-ratios is considered the most important?
descriptive information about the data
While ANOVA is essentially an inferential procedure, we would like to again stress that an inferential procedure is relatively meaningless when reported in the absence of
You need to consider the lower and upper tails of an F-distribution separately because the distribution is skewed.
Why do we need to look up each two-tailed Fcrit separately?
Because the errors cannot be treated as independent: those sampled in each cluster will be correlated with each other, so in order to identify the individual differences you have to factor out the effects of each cluster.
Why do we need to use a special technique to analyze those samples?
Because we are estimating four parameters (a population mean for variable X and for variable Y, in each of the two samples).
Why do we subtract four when calculating the statistic's degrees of freedom for a linear regression?
It is important to test whether the variances in the two populations are, in fact, equal because, if they are not, the t-distribution and the resulting p-value can be affected. Assuming the two populations have equal variances when they do not will also increase the chance of a Type I error.
Why is it important to test our usual assumption about the variances in two populations being equal?
you can lose power with a lot of comparisons
Why is the Scheffé approach still a problem when we have a lot of comparisons?
With each significant result, the Bonferroni-Holm approach gains power, and it still controls the familywise error rate.
Why is using the Bonferroni-Holm approach better than using the original Bonferroni correction?
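A sketch comparing the two adjustments in R on a set of assumed raw p-values:

```r
p <- c(.010, .015, .030, .040)   # hypothetical raw p-values from four comparisons

p.adjust(p, method = "bonferroni")   # every p-value is multiplied by 4
p.adjust(p, method = "holm")         # sequentially rejective: adjusted p-values
                                     # are never larger, so power is never lower
```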
increases the chance of a type II error
Why would we not want to use the Welch-Satterthwaite correction all of the time?
ANOVA is a parametric test, while the Kruskal-Wallis test is a non-parametric test; parametric tests have more power, which is why we prefer to use the ANOVA.
Why would we prefer to use an ANOVA if both tests can be used for our data?
10, 15, 20
With all other things being equal, which of the following pattern of means would lead to the greatest power for an ANOVA?
60, 62, 69
With all other things being equal, which of the following sets of sample sizes would lead to the greatest power for an ANOVA?
Levene's test
With regard to homogeneity of variance, you can employ any one of several different tests, but the most recommended is
randomly sampled from one population
With the groups more spread out, the MSag should be larger than would be expected when the groups are
n cases per group
Within each of the k groups, there are multiple observations. If the group sizes are all equal to one another, then we say that there are
nj to denote the number of cases in the jth group
Within each of the k groups, there are multiple observations. If the group sizes are not equal, then we use
larger in study 1
You are reading about two different studies comparing two groups. The effect size for the first study is reported as f = .5. For the second study the effect size is reported as d = .5. The size of the effect is:
1
a larger sample size in the denominator produces means closer to the value of
ANOVA
a non-directional test on means translates to a directional test on variances, and we use the upper tail of the F-distribution to evaluate our findings.
variations of the Levene test
a recommended alternative to the F-test. Brown and Forsythe (1974) examined other possible measures of center: the median and trimmed mean. These variations are sometimes labeled as the Brown-Forsythe test, but more often than not, they are simply labeled as
Fligner-Killeen test
a recommended alternative to the F-test; it also looks at absolute values of the deviations from the median, but differs because these values are ranked and then compared to the expected distribution of ranks under the null hypothesis.
Behrens-Fisher problem
alternatives to employ with small samples. With respect to the pooling of the two samples' estimates of variance as in Equations 14.9 through 14.11, it hardly makes sense to proceed in this fashion if the two populations have different variances. This problem is known as the
different
an F-statistic defined as MSag/MSwg should be larger than 1.0 when the groups are
Cochran's C test
an alternative to the F-test; the test statistic C is the largest sample variance divided by the sum of all k sample variances, where k is the number of groups. As with the Fmax test, a special table was used to determine the statistical significance of C. Although the two latter procedures are much simpler to calculate, they too are sensitive to departures from normality.
Hartley's Fmax-test.
an alternative to the F-test. Like the Bartlett test, it was designed for any number of groups. The test simply involved finding s2 for each of the groups and then forming the test statistic Fmax, the ratio of the largest sample variance to the smallest. The statistical significance of the observed test statistic was determined by comparing the observed result to the critical values in a special set of tables.
variance
an average squared difference from the mean and, as such, also a measure of the degree to which the values of the variable differ from each other.
the observations
are not independent in matched-samples or repeated-measures designs. However, the linear model for this design contains another parameter, π, to allow for the case, or person,
Bartlett chi-square test statistic
an alternative to the F-test; this test statistic is approximately distributed as χ2 with df = 1 when both sample sizes are larger than 5. Unfortunately, this approach had two major drawbacks. First, without a calculator of some type, the formula is somewhat formidable. Second, the Bartlett test is also sensitive to departures from normality.
important
as the sample sizes become more and more different, the assumption of equivalent variances becomes more
population variance, σ2.
by looking at the variation of the sample means, we can derive an estimate of the
chi-square distribution
for the one-sample test on a variance, we needed to use the similarly skewed
Y
generic variable name
may still be independent
The dependence of the observations is assumed to come from the parameter π, and thus the error components, ∊ij,
too small
if the large sample is paired with the small variance, then s2pooled will tend to be
Levene test or the Fligner-Killeen test
if the larger variance is less than twice the smaller variance, you can use the pooled-estimate version. Others mention a 1:4 ratio. When in doubt, you can apply the
treatment effects
if the observed F-statistic exceeds the critical value from the F-distribution, it is reasonable to infer the presence of
separate-sample estimate of t and the Welch-Satterthwaite correction for df.
if the sample sizes are quite different and the sample variances also appear different, you should probably employ the
variance of the population of sample means
if there are k random samples drawn from a population, then we can calculate the sum-of-squares among the sample means, divide it by its degrees of freedom (k - 1), and we should have an estimate of the
2.5% in each tail
In an F-distribution, if we wanted to conduct a two-tailed hypothesis test at the .05 level, we would seek the critical values that cut off
p value greater than .05 or 5%
In a Fligner-Killeen test of homogeneity of variances, when can we assume that the variances for all three variables are equal?
p value less than .05 or 5%
In a Kruskal-Wallis rank sum test, when do the treatment groups differ significantly?
if the p value is less than .05 or 5%
In a one-way ANOVA, how do we tell whether the treatment groups differ significantly in the omnibus test?
p value greater than .05 or 5%
In a Shapiro-Wilk normality test, when can we assume the distribution is normal?
if there is a p value less than .05 or 5%
In a Tukey HSD, how do we tell whether there are significant differences within the pairwise comparisons?
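A sketch of the full workflow in R, tying these cards together; the data values are invented for illustration:

```r
scores <- c(12, 15, 14, 11, 13,  18, 17, 20, 19, 16,  22, 25, 21, 24, 23)
group  <- factor(rep(c("A", "B", "C"), each = 5))

fit <- aov(scores ~ group)

shapiro.test(residuals(fit))   # normality: assume normal if p > .05
fligner.test(scores ~ group)   # homogeneity: assume equal variances if p > .05
summary(fit)                   # omnibus F-test: groups differ if p < .05
TukeyHSD(fit)                  # pairwise comparisons: significant where p adj < .05
kruskal.test(scores ~ group)   # non-parametric alternative if assumptions fail
```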
cases or replications
In ANOVA, the observations within the groups are the
when the population means are different
In ANOVA, when will the sample means be more widely dispersed, with the variance estimated by the MSag larger than the variance estimated by the MSwg?
factor
In ANOVA, the grouping variable, or independent variable, is called a
1 to n (or nj)
In ANOVA, the subscript i can take on values from
Ȳ.j
In ANOVA, a column mean is represented in general with the symbol
causality
In ANOVA, the assumption of random assignment allows for stronger assertions of
population
In ANOVA, the assumption of random selection enables generalization back to the
normality, independence, and homogeneity of variance.
In ANOVA, we assume that the error components of individual scores are normally distributed within each group, that the error components are independent of one another, and that the variances of the errors are the same in all groups. Typically, these assumptions are labeled as the assumptions of
Ȳ..
In ANOVA, the mean of all the observations is denoted
estimate it from the sample data
In the context of testing hypotheses about means, one distinction between the t-distribution and the z-distribution is whether you know the variance of the population or must
sample sizes
In the Levene test or the Fligner-Killeen test, you will find that different experts recommend different levels of α to apply to the result, varying from .01 to .20, depending on the disparity between the
p value
In the Welch two-sample t-test, the degrees of freedom have been adjusted, leading to a slightly different
multiple t-tests
is predicated on the assumption that the differences are all pairwise in nature, or that those are the only differences in which the researcher is interested
F-distribution
is the most general of the four mathematical probability distributions (z, t, χ2, and F) that we cover in this book.
F-distribution
Although it is theoretically defined as another special case of the gamma (γ) distribution, it can be more easily defined as the ratio of the variances of two independent samples of sizes n1 and n2, respectively, drawn from a normally distributed population.
"omnibus" F-test,
it tests the null hypothesis against a "family" of alternative hypotheses. It does not give us any information about which populations are different from which other populations.
SSag
looks at the variability of the group means about the grand mean
dependent samples
may result from naturally occurring aggregations (twins), matched cases, or repeated measures
Welch correction
might be the method of choice under any conditions; it always seems to work.
arbitrary
most scales of measurement in the social sciences are somewhat
non-normality
newer approaches, specifically Levene's test, its variation known as the Brown-Forsythe test, and the Fligner-Killeen test. These newer procedures are currently recommended as they are less sensitive to
two tailed test
no direction was implied, so we would complete a
"analysis-of-variance" (ANOVA)
not only permitted a single test for group differences, it also avoided the use of matrix algebra, which was necessary to solve most applications of the general linear model but very difficult to employ without computers.
μ1 = μ2 = μ3 = ..., and so on
null hypothesis of ANOVA (can be rejected in a variety of different ways)
the two sample estimates of the respective population variances are equal to each other
null hypothesis of a t test of two random samples?
the two correlation coefficients do not differ by more than we might expect by chance.
null hypothesis when comparing two independent correlation coefficients
ANOVA
null hypothesis would be that μ1 = μ2 = μ3, and the alternate hypothesis would be that at least one of the population means was different than another
bootstrap
one can apply this to the confidence interval for the difference between the two population means, but it must be done by drawing the samples from the two original samples rather than drawing the samples from a pooled sample.
the average deviation (deviations from the mean)
one of the first recommended alternatives to the F-test. It is based on a measure of dispersion we described toward the beginning of this book:
normality
other alternatives to the f test (Bartlett's χ2, Hartley's Fmax, and Cochran's C), which also were sensitive to departures from
type I error rates in line with what we would expect
regardless of sample size, the first five approaches (traditional pooled-sample t, separate-sample t with Welch correction, Cochran-Cox approximation, pooled bootstrapping, and permutation test) yield
k-group situation
the analysis-of-variance (ANOVA), the Kruskal-Wallis test, and randomization may be applied to the
chi-square values divided by their respective degrees of freedom
the bottom row of the F-table contains the values of
mean deviations of the two groups
the conceptual basis of the Levene test as a t-test comparing the
fixed effects model
the different levels of the treatment are assumed to include all the levels in which you are interested. If you were to replicate the study, you would use the same levels each time.
additive model
the error component contains all of the variation not directly relevant to the research question. Irrelevant factors that occur within a classroom affect not just one student, but they may affect several students.
dfwg/(dfwg - 2)
the expected value of F is
hypotheses
the F-distribution may be used to test
one degree of freedom for the numerator
the first column of an F-table represents
z squared
the last row of the first column of the F-table contains values of
random effects model
the levels of the factor represent a sample of the possible levels that could have been selected. In these situations, replications of the study will employ different samples of levels.
F-distribution
the means and variances of the simulated distributions of the ratio of two sample variances are in very close agreement with the expected values if we assume that the simulations follow the
when using the t-test to conduct three different tests (multiple t-tests)
the most important of these problems is the increased risk of making a Type I error (rejecting a true null hypothesis) when running several tests of significance. Each time a test is completed, there is a probability of making such an error.
a lot larger than .05
the multiple t tests are not independent and it is very difficult to determine the probability of at least one Type I error, but suffice it to say, the probability is
grouping
the only category we address is the problem of non-independence due to
distribution of the population
the other simple idea employed by Fisher comes from what we know about the distribution of sample means relative to the
the same
the sample sizes are equal so that the two t-statistics (pooled sample and separate sample) are
smaller samples
the separate-sample bootstrap approach recommended by Good seems to have an inflated Type I error rate, especially with
empirical error rates
the separate-sample bootstrapping procedure recommended by Good continued to provide upwardly biased
sample size, conformity to assumptions, and independent/dependent samples.
there are several different ways to compare two groups with regard to central tendency, or location. The differences among the procedures related to issues of
transformations
there are situations where ANOVA is not so robust, and cautious researchers may want to modify their data so that they more closely conform to the assumptions. These modifications are called
independence of errors
there is no simple (or readily available) test for the assumption of
t test
there is no single difference among three groups that can be used in the numerator of the
groups, sequence, space
three categories of situations in which you are likely to encounter violations of the assumption of independence:
descriptive statistics
transformations of data pose problems for interpreting
values
transforming data changes the
SS by its appropriate df
we are interested in comparing two estimates of variance with an F-statistic. Thus, we must take our SSag and SSwg and transform them into their respective MSag and MSwg. We accomplish this transformation by dividing each of the
non-independence of error
we did not want to leave you with the misunderstanding that the use of dependent samples will necessarily lead you to problems of
alpha per test multiplied by the number of tests
when a small number of independent tests are run, this error rate is approximately the
means is most efficient
when confronted with the task of comparing two groups with regard to equivalent location, we think that, under the right conditions, a comparison of the two
p value is less than .05 or 5% (null is rejected)
When do correlation coefficients differ significantly in the comparison of two correlations based on independent groups?
p value is less than .05 or 5% (null is rejected)
When do the correlation coefficients differ significantly in a comparison of two overlapping correlations based on dependent groups?
p value is less than .05 or 5% (null is rejected)
When do the correlation coefficients differ significantly in a comparison of two non-overlapping correlations based on dependent groups?
p value less than .05 or 5%
When do the treatments differ significantly in a one-way ANOVA?
when the p value is greater than .05 or 5% (null hypothesis is not rejected)
When should we treat the variances as being equal in an F-test to compare two variances?
when the p value is greater than .05 or 5% (null hypothesis is not rejected)
When should we treat the variances as equal in a Fligner-Killeen test of homogeneity of variances?
MSag and the MSwg will be similar
when the three samples are drawn from the same population (normal populations having the same mean and variance), the
p value less than .05 or 5% (null hypothesis is rejected)
When would x and y differ significantly in the Welch two-sample t-test?
F-table
you could find the critical values of z, t, and χ2 within the appropriate
if the samples are relatively small
you may wish to employ the Mann-Whitney U-test or a permutation test.