PNB 3XE3 - midterm 2
How do we get N?
# of cells (# of levels of X multiplied by # of levels of Y) multiplied by # of participants within each cell
3 ways cell means may differ, other than sampling error:
(1) because they come from different levels of X; (2) because they come from different levels of Y; (3) because of an interaction between X and Y
eta squared
-a measure of the magnitude of effect -a biased measure (in the sense that it tends to overestimate the value we would obtain if we were able to measure whole populations of scores) -eta squared = SS group / SS total -this allows us to say what percentage of the variability among observations can be attributed to group effects
omega squared
-A less biased measure of the magnitude of effect than eta squared
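A minimal sketch of how eta squared and omega squared could be computed from a one-way ANOVA summary; the SS and df values are invented, and the omega-squared formula shown is the usual one-way version, which these notes do not spell out.

```python
# Sketch: eta squared and omega squared from a one-way ANOVA summary.
# The SS/df values below are made-up numbers for illustration only.
ss_group, ss_error = 120.0, 480.0   # hypothetical sums of squares
df_group, df_error = 2, 27          # k - 1 and k(n - 1) for k = 3, n = 10
ss_total = ss_group + ss_error
ms_error = ss_error / df_error

eta_sq = ss_group / ss_total        # biased upward as an estimate of the population value
omega_sq = (ss_group - df_group * ms_error) / (ss_total + ms_error)  # less biased estimate

print(f"eta^2 = {eta_sq:.3f}, omega^2 = {omega_sq:.3f}")
```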
Assumptions in ANOVA tests
-each of the populations from which we sample is normally distributed (the normality of the sampling distribution of the mean)
-each population of scores has the same variance (homogeneity of variance is expected to occur if the effect of a treatment is to add a constant to everyone's score)
-observations are all independent of one another (ie: the fact that you scored above the mean says nothing about whether I will score above or below the mean)
Assumptions in repeated-measures
-homogeneity of variance
-normality of populations
-correlations among pairs of levels of the repeated variable are constant
Solutions to violation of the equal-correlations assumption:
-limit the number of levels to the ones that have a chance of meeting the assumption (ie: take out levels that are not equally correlated with the others)
-instead of using df of (w-1) and (w-1) x (n-1), use 1 and (n-1) df (a very conservative test)
MS between groups (MS groups/ MS treatment)
-the variability among group means, multiplied by n (the number of observations per group) in order to estimate the population variance
How do we test the H0?
-to test H0 we calculate two estimates of the population variance: MS error and MS groups
-MS error is independent of the falsity of H0, therefore it is ALWAYS an estimate of the population variance
-MS groups relies on the truth of H0
-if the two are in approximate agreement we have no reason to reject H0, because MS error ALWAYS estimates the population variance whereas MS groups only estimates it when H0 is true
-if MS groups is much larger than MS error, we conclude that the treatment has made a real difference, causing MS groups to be larger than MS error, and we can reject H0
magnitude of effect
A measure of the degree to which variability among observations can be attributed to treatments.
Tukey procedure
A multiple comparison procedure designed to hold the familywise error rate at α for a set of comparisons. -compares every mean with every other mean
interaction
A situation in a factorial design in which the effects of one independent variable depend on the level of another independent variable.
Factorial Design
An experimental design in which every level of each variable is paired with every level of each other variable. -we include all combinations of the levels of the independent variables
two-way factorial design
An experimental design involving two independent variables, in which every level of one variable is paired with each level of the other variable.
interpreting a graph:
Describe a graph that has no interaction:
-the lines are parallel
-the difference between the levels of IV1 is the same at each level of IV2; IV1 has the same effect regardless of the level of the other independent variable
Describe a graph with an interaction:
-the lines are not parallel
-the difference between the levels of IV1 may differ depending on the level of IV2; the effect of IV1 differs depending on the level of IV2
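A small sketch of the "parallel lines" idea using invented cell means for a hypothetical age (young/old) by depth-of-processing (shallow/deep) design; all numbers are made up for illustration.

```python
# Sketch: checking cell means for an interaction in a hypothetical 2 x 2 design.
# Rows = levels of IV1 (age), columns = levels of IV2 (processing); numbers invented.
cell_means = {
    ("young", "shallow"): 6.0, ("young", "deep"): 14.0,
    ("old",   "shallow"): 5.5, ("old",   "deep"): 8.0,
}

# Effect of age at each level of processing (simple effects of IV1):
diff_shallow = cell_means[("young", "shallow")] - cell_means[("old", "shallow")]  # 0.5
diff_deep    = cell_means[("young", "deep")]    - cell_means[("old", "deep")]     # 6.0

# If the two differences are (roughly) equal, the lines are parallel -> no interaction.
# Here they differ, which is what non-parallel lines on a plot would show.
print(f"age effect (shallow) = {diff_shallow}, age effect (deep) = {diff_deep}")
```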
Advantages of Factorial Design over One-way design
First, they allow greater generalizability of the results: factorial designs allow a much broader interpretation of the results and at the same time give us the ability to say something meaningful about the results for each of the independent variables. Second, they allow us to look at the interaction of variables. Third, their economy (efficiency) is an advantage: fewer participants are required for the same amount of power (a better use of resources).
Methods of decreasing probability of Type I error in ANOVA
Fisher's least significant difference test (LSD) [protected t]: a technique in which we run t tests between pairs of means only if the analysis of variance was significant.
Bonferroni procedure: a multiple comparisons procedure in which the familywise error rate is divided by the number of comparisons.
F distribution
If H0 is true, the ratio MS groups / MS error follows the F distribution
If you were to plot the means obtained from a two way factorial experiment, how could you tell if there was an interaction occurring?
If the lines plotted are not parallel -whenever the effect of some factor (B) is not the same at the different levels of the other factor (A)
How was it proven that MSerror is independent of the falsity of H0?
If there are 3 groups whose means are hypothetically equal, ie: 2, 2, 2, then by a linear transformation (adding a constant to each individual observation within some of those groups) you can change the means, ie: adding 2 to all observations in groups 1 and 2. Their means increase by 2, and therefore H0 is no longer true. BUT the variance of each group stays the same, and still estimates the population variance.
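A quick numeric illustration of this point; the scores are invented.

```python
import numpy as np

# Sketch: adding a constant to every score in a group shifts the mean
# but leaves the within-group variance untouched (the data are invented).
group = np.array([4.0, 6.0, 5.0, 7.0, 3.0])
shifted = group + 2.0   # the "treatment" adds a constant to everyone's score

print(group.mean(), shifted.mean())            # means differ by exactly 2
print(group.var(ddof=1), shifted.var(ddof=1))  # sample variances are identical
```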
adjusting the error:
If we want to add the effect of Gender and its interaction with Therapy back into the error term, all we need to do is combine SSerror, SSGender, and SSInteraction into a new SSerror term, combine their degrees of freedom, and then divide one by the other. The square root of this adjusted variance (the adjusted standard deviation) will serve as the denominator for d̂. In the case of the cold bathtub, we don't want that variability added back in, so we just use the MSerror from our overall analysis as our denominator.
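A minimal sketch of the pooling described above; the SS and df values are invented, and the Gender/Therapy labels simply follow the wording of this card.

```python
import math

# Sketch of folding Gender and the Gender x Therapy interaction back into
# the error term, with invented SS and df values.
ss_error, df_error = 300.0, 36
ss_gender, df_gender = 40.0, 1
ss_interaction, df_interaction = 20.0, 1

ss_adj = ss_error + ss_gender + ss_interaction
df_adj = df_error + df_gender + df_interaction
s_adj = math.sqrt(ss_adj / df_adj)   # adjusted standard deviation for the denominator of d-hat
print(f"adjusted s = {s_adj:.3f}")
```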
How is Fisher's LSD a "protected" t-test?
It is a protected t-test because we only make comparisons when the obtained F statistic is significant. Suppose that you are comparing 3 groups.
1. Suppose your null hypothesis is true: μ1 = μ2 = μ3
-if you find a significant F (which has a 0.05 probability) then you've already made a Type I error and can't possibly make the situation worse
-if you don't find a significant F, then you STOP, because you failed to meet the requirement for Fisher's LSD
2. If H0 is false with one difference: μ1 = μ2, but μ3 is different
-F being significant is not a Type I error, because the difference is real
-when doing the pairwise comparisons, there is only one opportunity to make a Type I error (comparing μ1 and μ2), so the probability of a Type I error stays at 0.05
3. If H0 is false and all means are different: μ1, μ2, and μ3 are all unequal
-it is impossible to make a Type I error, because every difference is real and there is no true null hypothesis to erroneously reject
When the H0 is found to be false why is the MSgroup value so large?
It is so large because it is not just estimating the population variance, but also the variance of the population means themselves (that there are underlying differences between group means)
two ways of estimating population variance
-MS error = pooled average of the sample variances (does not depend on H0) -MS group = sample size multiplied by the variance among the sample means (relies on H0 being true)
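A small sketch of both estimates computed on invented data (k = 3 groups, n = 5 scores per group); the group sizes and scores are assumptions for illustration.

```python
import numpy as np

# Sketch: the two variance estimates behind F, on made-up data.
groups = [np.array([3., 5., 4., 6., 2.]),
          np.array([7., 6., 8., 9., 5.]),
          np.array([4., 5., 6., 5., 5.])]
n = len(groups[0])

# MS error: pooled (average) within-group variance; valid whether or not H0 is true.
ms_error = np.mean([g.var(ddof=1) for g in groups])

# MS group: n times the variance of the group means; estimates the population
# variance only if H0 (equal population means) is true.
means = np.array([g.mean() for g in groups])
ms_group = n * means.var(ddof=1)

print(f"MS error = {ms_error:.2f}, MS group = {ms_group:.2f}, F = {ms_group / ms_error:.2f}")
```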
What does MS error represent?
MS error represents the average variance of the observations within each population
Repeated measures minimizes
MS error, because it removes between subjects variability
What does MS groups represent?
MS groups represents the average variance among the group means, multiplied by n to estimate the population variance; an estimate of the population variance based on the variance of the group means
d-family measures
Measures of the size of an effect that depend directly on differences between means -eg: Cohen's d
r-family measures
Measures of the size of an effect that resemble the correlation between the dependent and the independent variable. -eta squared -omega squared
Fisher's LSD (protected t)
Requirements:
-the F for the overall analysis of variance must be significant
-if F is significant, you make pairwise comparisons between individual means with a modified t-test
-the pooled variance estimate in the t formula is replaced with MS error (only when the variances of the groups are not too different)
-if F is not significant, then no comparisons between pairs of means are allowed
-when making comparisons between groups, make sure there are no confounding variables
This guarantees that the probability of making at least one Type I error will not exceed .05. It does a good job of controlling the familywise error rate if you have a relatively small number of groups, while at the same time being a test that you can easily apply and that has a reasonable degree of power.
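A sketch of the protected-t logic on simulated data; the group sizes, means, and the use of scipy here are illustrative assumptions, not part of these notes.

```python
import numpy as np
from scipy import stats
from itertools import combinations

# Sketch: Fisher's LSD on invented data (3 groups, n = 8 each).
rng = np.random.default_rng(1)
groups = [rng.normal(10, 2, 8), rng.normal(10, 2, 8), rng.normal(13, 2, 8)]
n, k = 8, 3

f_stat, p_overall = stats.f_oneway(*groups)
if p_overall < .05:                      # requirement: overall F must be significant
    ms_error = np.mean([g.var(ddof=1) for g in groups])   # pooled within-group variance
    df_error = k * (n - 1)
    for i, j in combinations(range(k), 2):
        t = (groups[i].mean() - groups[j].mean()) / np.sqrt(ms_error * (2 / n))
        p = 2 * stats.t.sf(abs(t), df_error)
        print(f"groups {i} vs {j}: t = {t:.2f}, p = {p:.3f}")
else:
    print("Overall F not significant; no pairwise comparisons.")
```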
in repeated measures design how do you get SS error?
SS total - SS subjects - SS conditions
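A minimal sketch of this decomposition on an invented subjects x conditions score matrix (4 subjects, 3 conditions, all numbers made up).

```python
import numpy as np

# Sketch: SS decomposition for a one-way repeated-measures design.
scores = np.array([[5., 7., 9.],
                   [3., 4., 6.],
                   [6., 8., 9.],
                   [4., 6., 7.]])       # rows = subjects, columns = conditions
n_subj, n_cond = scores.shape
grand_mean = scores.mean()

ss_total = ((scores - grand_mean) ** 2).sum()
ss_subjects = n_cond * ((scores.mean(axis=1) - grand_mean) ** 2).sum()
ss_conditions = n_subj * ((scores.mean(axis=0) - grand_mean) ** 2).sum()
ss_error = ss_total - ss_subjects - ss_conditions   # the subject x condition interaction

print(ss_total, ss_subjects, ss_conditions, ss_error)
```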
What do these assumptions allow us to do?
Since the populations are equal on two fronts (they are all normal and they all have the same variance), the only other way that they can DIFFER is in their means. -since all populations are drawn from a mega-population, and all of these populations are ASSUMED to have the same variance, the variance of each population taken separately is ONE estimate of the MEGA-population variance; therefore, by pooling the variances of all the populations, we get our best estimate of the mega-population variance
Advantage of Repeated Measures
Since there's less within-group variability, there is less noise and less overlap, and therefore more power to reject your null hypothesis when it is actually false
How do you find degrees of freedom for: -total -each factor -for an interaction -for error -MSerror
TOTAL: N - 1
EACH FACTOR: number of levels - 1
INTERACTION: product of the degrees of freedom for the components of that interaction
ERROR (by subtraction): df error = df total - df A - df C - df AC
ERROR (computed directly, for MS error): ac(n - 1)
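A worked example of this bookkeeping, assuming a hypothetical A x C design with a = 2, c = 3, and n = 10 participants per cell (the design sizes are invented).

```python
# Sketch: degrees of freedom for a hypothetical A x C factorial.
a, c, n = 2, 3, 10
N = a * c * n                      # total observations = 60

df_total = N - 1                   # 59
df_A, df_C = a - 1, c - 1          # 1 and 2
df_AC = df_A * df_C                # 2
df_error = df_total - df_A - df_C - df_AC   # by subtraction: 54
assert df_error == a * c * (n - 1)          # same value computed directly
print(df_total, df_A, df_C, df_AC, df_error)
```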
what does F tell us?
That if there is a REAL difference, the variability between the group means (MS between groups) should be greater than the variability within groups (MS within groups)
simple effect
The effect of one independent variable at one level of another independent variable. -looking at the effect of one factor for the data at only one level of the other factor -how do the levels of IV1 differ at a given level of IV2? ie: what is the difference between the levels of IV1 at level 1, 2, or 3 of IV2?
main effect
The effect of one independent variable averaged across the levels of the other independent variable. -observing the differences in the levels of one independent variable while ignoring the other independent variables ie: observing the difference in performance (DV) between young and old (IV1), while ignoring the conditions they were in
SS cells
The sum of squares assessing differences among the cell means. -a measure of how cell means differ
Relate variance within each group and population variance
The variance of the observations within each group is an estimate of the variance of the population from which it was drawn. Because of the homogeneity of variance assumption, all of the other groups have the same variance, so each individual group variance is one estimate of the population variance. Therefore, by taking the average of all of the within-group variances in your study, you get an estimate of the population variance. This estimate does not depend on the truth or falsity of the null hypothesis.
Group means:
-if we assume that H0 is true, all of the individual groups are samples from that same population
-therefore the variance between the group means also gives an estimate of the population variance
-the Central Limit Theorem states that the variance of means drawn from the same population equals the variance of the population divided by the sample size (if the sample size is the same for all groups)
-reversing this: if H0 is true, the variance of the sample means multiplied by the sample size (assuming the sample size is the same throughout) equals the variance of the population
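A small simulation sketch of the "reversed" Central Limit Theorem argument; the population parameters, group count, and sample sizes are invented assumptions.

```python
import numpy as np

# Sketch: if H0 is true (all groups drawn from one population), then
# n times the variance of the group means estimates the population variance.
# The population here has sigma^2 = 4 (sd = 2).
rng = np.random.default_rng(0)
n, k, reps = 10, 5, 2000

estimates = []
for _ in range(reps):
    sample_means = rng.normal(loc=50, scale=2, size=(k, n)).mean(axis=1)
    estimates.append(n * sample_means.var(ddof=1))

print(np.mean(estimates))   # should be close to 4, the true population variance
```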
Estimating population variance
Within-group variance:
-the pooled average of the sample variances is the best estimate of the overall variance that represents all of the populations (because of homogeneity)
-MS within (MS error) is this estimate of the population variance
-it does not require you to rely on the truth or falsity of the null hypothesis, BECAUSE the sample variance is calculated on each sample separately
Multiple comparison techniques
after finding out whether H0 is false, you do these tests to determine which groups are different from which others
omnibus null hypothesis
all population means are equal
One-way Anova
an analysis of variance wherein the groups are defined on only one independent variable
Mean Squares
another term for variance
alternative hypothesis in ANOVA
at LEAST one population mean is different
explain why the error term is the interaction between the subject variance and the condition
because the error within the group varies depending on which condition each particular participant is in, hence an interaction (subject x condition)
with repeated measures design we are able to eliminate
between subjects variability and so the observed effect can be attributed to the actual variable of interest
What are the two different ways we can look at the magnitude of an effect?
calculate an r-family measure, or calculate Cohen's d (a d-family measure)
Disadvantage of repeated measures
carry-over effects and practice effects. Solution: counterbalance (reverse the order) so at least the carry-over effects affect subjects equally
df error
dferror: Degrees of freedom associated with SSerror; equal to k(n - 1).
df group
dfgroup: Degrees of freedom associated with SSgroup; equal to k - 1.
df total
dftotal: Degrees of freedom associated with SStotal; equal to N - 1
error variance
difference between individuals that receive the same treatment (random variation)
how do we calculate Mean Squares
dividing the sum of squares by the corresponding degrees of freedom
When looking at a difference between individual pairs of means, we look at
effect size estimate
Violations of assumption
if the populations can be assumed to be either symmetric or at least similar in shape (e.g., all negatively skewed) and if the largest variance is no more than four or five times the smallest, the analysis of variance is most likely to be valid
Bonferroni procedure
If you run several tests (say c tests) at a significance level represented by α′, the probability of at least one Type I error can never exceed cα′.
-same as Fisher's LSD, but you omit the requirement of a significant F and change the significance level for each individual test from α to α/c (c = # of comparisons)
-we simply decide how many tests we intend to run, divide the desired familywise error rate (usually α = .05) by that number, and reject whenever the probability of our test statistic falls below that adjusted significance level
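A minimal sketch of the adjustment, with invented p-values for three planned comparisons.

```python
# Sketch of the Bonferroni adjustment: with c planned comparisons, test each
# at alpha / c. The p-values below are invented for illustration.
alpha, p_values = .05, [.030, .004, .020]
c = len(p_values)
per_test_alpha = alpha / c          # about .0167 for three comparisons

for i, p in enumerate(p_values, start=1):
    verdict = "reject H0" if p < per_test_alpha else "do not reject"
    print(f"comparison {i}: p = {p:.3f} -> {verdict}")
```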
When an interaction is evident, what do you do next?
ignore the main effect, and explore the simple effects
-do not run all simple effects, only the ones that are relevant, because running them all will increase your Type I error
-EXAMPLE: with no interaction, the difference in performance between young and old individuals on a performance task is the same no matter what the level of processing is; if there is an interaction, then maybe younger individuals will perform better than old individuals only when deep processing is used
Repeated measures are not__________
independent
Adding a constant to all data points in one particular population does what? What is this proof for?
it changes the mean, but doesn't change the variance -this is proof that even though means may differ, the internal (error) variance stays the same and is still an estimate of the population variance regardless of the truth or falsity of H0.
What is a major problem in comparing groups?
the probability of making Type I errors increases as you make more independent t tests
SS subjects
measures differences among people in terms of their status on the variable of interest
when H0 is false, then n x variance of sample means (MSgroups) is estimating....
population variance PLUS the variance of the population means themselves, meaning that there is an observed effect: the group means deviate from the grand mean
When examining an omnibus F, we use...
r-family measure
cohen's d
s = sqrt(MS error), the estimate of the within-group standard deviation used in the denominator of d
Obtaining the F statistic from an ANOVA between two groups is the same as the t-statistic from a t-test between two groups... how are they equal?
t^2 = F
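A quick check of this equivalence on simulated two-group data; the data and the use of scipy for both tests are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Sketch: with two groups, the one-way ANOVA F equals the squared
# independent-samples t (invented data).
rng = np.random.default_rng(3)
g1, g2 = rng.normal(10, 2, 12), rng.normal(12, 2, 12)

t, _ = stats.ttest_ind(g1, g2)
f, _ = stats.f_oneway(g1, g2)
print(t ** 2, f)    # the two values agree (up to rounding)
```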
SS X (one of the variables)
tells us how much of the difference between cells can be attributed to differences in variable X
ANOVA
the analysis of variance -deals with differences between sample means -no restriction on the number of means -allows us to deal with two or more independent variables simultaneously -uses differences between sample means to draw inferences about the presence or absence of differences between population means
Cell
the combination of a particular row and column; the set of observations obtained under identical treatment conditions. -any specific combination of one level of one factor and one level of another factor -C_ij, where i is the row and j is the column
r-family measures
the magnitude of effect associated with each independent variable
-the easiest to calculate is eta squared, but it is a biased estimate of the value that we would get if we obtained observations on whole populations
-eta squared is calculated by dividing the sum of squares for that effect by SS total
-it tells us what percentage of the overall variability is accounted for, depending on which SS you used
-omega squared is less biased
For Cohen's d: we take the difference between two groups and divide that by our estimate of the standard deviation WITHIN groups
grand mean
the mean of all of the observations
how is the probability of a Type I error increased in these ANOVAs?
the probability of a Type I error increases as the number of pairwise comparisons increases
family-wise error rate
the probability that a family of comparisons contains at least one Type I error
familywise error rate
the probability that a family of comparisons contains at least one Type I error -making one Type 1 error is as bad as making ten
F statistic
the ratio of MS group to MS error; this is what you look up in an F table (there are separate tables for 0.05 and 0.01) to see if your ratio is large enough to reject the null
Sum of squares
the sum of squared deviations around some point (usually around a mean)
SStotal
the sum of squared deviations of all of the scores from the grand mean, regardless of group membership
SS group
the sum of squared deviations of the group means from the grand mean, multiplied by the number of observations in each group
SS error
the sum of the squared residuals or the sum of the squared deviations within each group
why can we do a seemingly "independent" t-test of means with repeated measures...
this is because MS error represents the correct estimate of the standard error of the differences
repeated measures effect size
it usually makes sense to use the standard deviation from the pre-test scores because it includes individual variability, which is more realistic and meaningful in the real world
MS within (MS error)
variability among subjects in the same treatment group
error variance
variance unrelated to group differences
If MS error and MS group are roughly in agreement...or in disagreement...
if they are roughly in agreement, we have support for the truth of H0; if they are in disagreement (MS group much larger than MS error), we have support for the falsity of H0
SS XY
whatever difference is not accounted for by each individual variable is accounted for by the interaction between variables X and Y
factors
a synonym for independent variable in the analysis of variance
how to calculate total number of observations:
a x c x n -a equals number of levels in A -c equals number of levels in C -n equals number of participants in each cell