Stats exam 2
how do you figure out planned contrasts?
* note: the within groups pop variance estimate for a planned contrast is always the same as the within groups estimate from the overall analysis
first, choose which 2 groups you want to compare (so the between groups df will usually be 1). then...
STEP 1: estimate the variance of the dist of means (s2m) -- subtract the grand mean from each group's mean, square each deviation, add these up, then divide by the between groups degrees of freedom
STEP 2: figure the estimated variance of the pop of individual scores (aka s2between) -- multiply (s2m)(n)
STEP 3: find F (s2between/s2within)
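The three steps can be sketched in code. All the numbers here are hypothetical: group means of 6 and 4, n = 5 scores per group, and an s2within of 2.5 carried over from the overall analysis.

```python
# Planned contrast between two groups; all input values are made up for illustration.

def planned_contrast_f(means, n, s2within):
    """F ratio for a planned contrast (df between = number of means - 1)."""
    grand_mean = sum(means) / len(means)          # GM of the compared groups
    df_between = len(means) - 1
    # STEP 1: estimate the variance of the distribution of means (s2m)
    s2m = sum((m - grand_mean) ** 2 for m in means) / df_between
    # STEP 2: between groups population variance estimate (s2m times n)
    s2between = s2m * n
    # STEP 3: the F ratio
    return s2between / s2within

f = planned_contrast_f([6, 4], n=5, s2within=2.5)
print(f)  # 4.0
```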
what are some advantages of factorial research designs?
- efficiency: instead of needing one group of participants for a study comparing one variable (with 2 levels) and another group of participants for a second study comparing another variable (with 2 levels), you can combine them all in one study and look at the different combinations of the 2 variables (4 cells) -- this uses a two-way ANOVA
- you can look at the effects of combining two or more variables and see if there are any interaction effects
what is a main effect
- this is the difference btwn groups on one grouping variable (aka "way") in a factorial design in ANOVA
- in other words, it is the changes or differences in the DV due to one of the factors considered alone... so when you ignore the other variable completely and look at what is left
- also, it is the mean differences among columns OR rows
what are cohens conventions for effect size in anova? (repeated question)
.01 = small, .06 = medium, .14 = large
what is r squared?
AKA proportion of variance accounted for AKA eta squared
- this is the proportion of the total variation of scores from the GM that is accounted for by the variation btwn the means of the groups
- in formula: (s2btwn)(dfbtwn) / [(s2btwn)(dfbtwn) + (s2within)(dfwithin)]
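The formula as a small function, with invented numbers just to show the arithmetic:

```python
# R squared (eta squared): proportion of variance accounted for.
# The example values (s2btwn=20, dfbtwn=2, s2within=2, dfwithin=27) are made up.

def r_squared(s2btwn, df_btwn, s2within, df_within):
    numerator = s2btwn * df_btwn
    return numerator / (numerator + s2within * df_within)

r2 = r_squared(s2btwn=20, df_btwn=2, s2within=2, df_within=27)
print(round(r2, 3))  # 0.426
```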
what is a repeated measures ANOVA
ANOVA for a repeated measures design where each person is tested more than once so that the levels of the grouping variable are different times or measures for the same people
how do you solve this problem when you carry out lots of planned contrasts?
BONFERRONI PROCEDURE = use a stricter significance level for each contrast (ie for 2 planned contrasts, test each at the .025 level)
- the general rule in the bonferroni procedure is to divide the desired overall significance level by the number of planned contrasts you plan on doing
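The rule is just one division:

```python
# Bonferroni correction: divide the desired overall significance level by the
# number of planned contrasts.

def bonferroni_alpha(overall_alpha, n_contrasts):
    return overall_alpha / n_contrasts

print(bonferroni_alpha(0.05, 2))  # 0.025
```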
if the variation of scores within each sample is not affected by the null hypothesis, what is causing the level of within group variation?
CHANCE FACTORS (unknown to researcher) - this includes the idea that different people respond differently to the same situation or treatment and there might be some experimental error associated with the measurement of the variable of interest SO, we can think of within groups estimates of the population variance as estimates based on CHANCE FACTORS that cause different people in a study to have different scores
what are the principles of the structural model (ie what are the most important calculations you need to do in this model)?
DIVIDE UP THE DEVIATIONS:
- the structural model is all about DEVIATIONS, so you have a deviation of each score from the GRAND MEAN; this comes in 2 parts: 1) the deviation of the score from its SAMPLE's MEAN and 2) the deviation of the sample's mean from the GM
SUM THE SQUARED DEVIATIONS
- square each of the deviation scores and add them up (NOTE: the sum of (X-GM) squared (SStotal) is the same as the sum of (X-M) squared (SSwithin) + the sum of (M-GM) squared (SSbtwn))
POP VARIANCE ESTIMATES
- figure s2between (btwn groups pop variance estimate) = the sum of (M-GM) squared / dfbtwn
- figure s2within (within groups pop variance estimate) = the sum of (X-M) squared / dfwithin
explain what SStotal, SSwithin, and SSbtwn are
SStotal = sum of squared deviations of each score from the grand mean, completely ignoring the group the score is in: sum of (X-GM) squared
SSwithin = sum of squared deviations of each score from its group's mean, added up for all Ps: sum of (X-M) squared
SSbtwn = sum of squared deviations of each score's group mean from the GM, added up for all Ps: sum of (M-GM) squared
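The identity SStotal = SSwithin + SSbtwn can be checked on a small made-up data set (two groups of three scores each):

```python
# Verify the sum-of-squares partition on invented data.
groups = [[4, 6, 8], [1, 2, 3]]
all_scores = [x for g in groups for x in g]
gm = sum(all_scores) / len(all_scores)            # grand mean

ss_total = sum((x - gm) ** 2 for x in all_scores)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
ss_btwn = sum((sum(g) / len(g) - gm) ** 2 for g in groups for x in g)

print(ss_total, ss_within, ss_btwn)  # 34.0 10.0 24.0
```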
how do you figure the two way ANOVA (structural model)?
STEP 1: figure the mean of each cell, row, and column; find the grand mean of all the scores
STEP 2: figure all the deviation scores
STEP 3: add up the squared deviation scores of each type
STEP 4: divide each sum of squared deviations by its appropriate degrees of freedom
STEP 5: divide the various between groups variance estimates by the within groups variance estimate to get the F ratios
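The five steps, worked end to end on a hypothetical balanced 2 x 2 design with 2 scores per cell (all data invented; a sketch of the structural-model arithmetic, not a general ANOVA routine):

```python
# Two-way ANOVA (structural model) on made-up data.
cells = [[[3, 5], [7, 9]],     # row 0: cell (0,0), cell (0,1)
         [[2, 4], [6, 8]]]     # row 1: cell (1,0), cell (1,1)
n_rows, n_cols, n = 2, 2, 2

# STEP 1: cell, row, column, and grand means
scores = [x for row in cells for cell in row for x in cell]
gm = sum(scores) / len(scores)
row_means = [sum(x for cell in row for x in cell) / (n_cols * n) for row in cells]
col_means = [sum(x for row in cells for x in row[c]) / (n_rows * n)
             for c in range(n_cols)]
cell_means = [[sum(cell) / n for cell in row] for row in cells]

# STEPS 2-3: square and sum the deviation scores of each type
# (each deviation is counted once per score, hence the multipliers)
ss_rows = sum(n_cols * n * (rm - gm) ** 2 for rm in row_means)
ss_cols = sum(n_rows * n * (cm - gm) ** 2 for cm in col_means)
ss_within = sum((x - cell_means[r][c]) ** 2
                for r in range(n_rows) for c in range(n_cols)
                for x in cells[r][c])
ss_int = sum(n * (cell_means[r][c] - row_means[r] - col_means[c] + gm) ** 2
             for r in range(n_rows) for c in range(n_cols))

# STEP 4: divide each sum of squares by its df
df_rows, df_cols = n_rows - 1, n_cols - 1
df_int = n_rows * n_cols - df_rows - df_cols - 1
df_within = n_rows * n_cols * (n - 1)
s2_within = ss_within / df_within

# STEP 5: divide each between groups estimate by the within groups estimate
f_rows = (ss_rows / df_rows) / s2_within
f_cols = (ss_cols / df_cols) / s2_within
f_int = (ss_int / df_int) / s2_within
print(f_rows, f_cols, f_int)  # 1.0 16.0 0.0
```

In this invented data set the column effect dominates (F = 16), the row effect is negligible, and the cell means happen to line up so the interaction is exactly zero.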
what would the steps of hypothesis testing look like for ANOVA?
STEP 1 - state the hypotheses: research hyp - the populations' means are not all the same; null hyp - the populations' means are the same
STEP 2 - characteristics of the comparison distribution: describe the F distribution; give the between and within degrees of freedom
STEP 3 - determine the cutoff sample score on the comparison distribution at which the null should be rejected: use the F table with your alpha level
STEP 4 - determine your sample's score on the comparison distribution: in ANOVA, the comparison distribution is the F distribution and your sample's score on it is the F ratio
STEP 5 - decide whether to reject the null
what is the hypothesis testing procedures for 2 way ANOVAs
STEP 1: restate the question as a research and null hyp about the pops for each main effect and the interaction effect
STEP 2: characteristics of the comparison distributions - the 3 comparison dists will be 3 F dists; state the dfs for each
STEP 3: determine cutoff scores on the comparison dists - you'll have 3 cutoffs
STEP 4: determine your sample's score on each comparison dist - figure the mean of each cell, row, and column, then find the GM of all scores; figure all deviation scores of each type; square each deviation score; add up the squared deviation scores of each type; divide each sum of squared deviations by its appropriate df; divide the various btwn groups variance estimates by the within groups variance estimate
STEP 5: decide whether to reject the null hyp
how do we answer these hypothesis about means in ANOVAs? What do we need to find in each sample? what do you assume?
THE VARIANCE (hence the name analysis of variance...)
- when you want to know how several means differ, you focus on the variance among those means
- you assume (like in t-tests) that all populations have the same variance, allowing you to average the estimates from each sample into a pooled estimate (called the within groups estimate of the population variance)
what are the different ways that we can see if there is an interaction effect?
WORDS: "test difficulty depends on sensitivity"
NUMERICALLY: look at the pattern of cell means - if there is an interaction effect, the pattern of differences in cell means across one row will not be the same as the pattern of differences in cell means across another row (or across columns, if you're comparing columns)
GRAPHICALLY: graph the pattern of cell means, usually with a bar graph (or sometimes a line graph); whenever there is an interaction, the pattern of bars in one section of the graph is different from the pattern in the other section; for a line graph, you can tell there is an interaction by the lines not being parallel
what is an interaction effect
a situation where the combination of variables has a special effect; basically, the effect of one variable DEPENDS on the level of another variable
what is the within groups estimate of the population variance in ANOVA?
an average of estimates figured entirely from the scores WITHIN each of the samples - REMEMBER - this estimate does NOT change based on whether or not the null hypothesis is true; this estimate comes out the same regardless because it focuses only on the variation INSIDE each population. (so it doesnt matter how far apart the means of the different populations are)
what are post hoc comparisons?
as mentioned before, ANOVAs tell you that there is a difference between groups, but not WHAT the difference is... these are multiple comparisons that are NOT specified in advance... this is a procedure conducted as part of an exploratory analysis AFTER an ANOVA
what are planned contrasts?
comparison where the particular means being compared were decided in advance (NOTE: ANOVA may tell us that means differ, but they dont tell us WHICH means differ the most... this is what planned contrasts do... you predict this BEFORE analyzing)
how do you figure degrees of freedom in 2 way ANOVAs?
df rows = # of rows - 1
df columns = # of columns - 1
df interaction = # of cells - df rows - df columns - 1
df within = sum of the dfs for all the cells (each cell's df = # of scores in that cell - 1)
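Worked out for a hypothetical 2 x 3 design with 5 scores per cell (all numbers assumed for illustration):

```python
# Degrees of freedom for a balanced two-way design.
n_rows, n_cols, n_per_cell = 2, 3, 5

df_rows = n_rows - 1
df_cols = n_cols - 1
n_cells = n_rows * n_cols
df_interaction = n_cells - df_rows - df_cols - 1
df_within = n_cells * (n_per_cell - 1)   # sum of each cell's df (n - 1)

print(df_rows, df_cols, df_interaction, df_within)  # 1 2 2 24
```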
what is the controversy in factorial ANOVAs?
dichotomizing variables aka median split, where you take a continuous variable and split it at the median, making it categorical (ie a high group and a low group)
- good bc it helps you look at interactions, BUT this is problematic bc it causes you to lose information
- it's also saying that someone who scores 15 is different from someone who scores 14, but that someone who scores 15 is the same as someone who scores 25
what is the between group estimate of the population variance in ANOVA?
estimate of the variance of the population of individuals based on the variation among the means of the groups studied
* if the null hyp is true, this estimate gives an accurate indication of the variation WITHIN the populations (aka variance due to chance factors)
* if the null hyp is false, this estimate is influenced by both variation WITHIN the populations (variance due to chance factors) and variation AMONG the populations (variation due to treatment effects); thus, it will NOT give an accurate estimate of the variation within the population bc it also will be affected by the variation among the populations
* when the research hyp is true, the between groups estimate should be LARGER than the within groups estimate
how do you figure the scheffe test?
figure the F for the contrast, then divide that F by the overall between groups df; compare this smaller F to the original F cutoff -- if the smaller F doesn't reach the cutoff, the contrast is not significant
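As a sketch, with an invented contrast F of 9.0, an overall between groups df of 3, and an assumed table cutoff of 3.35:

```python
# Scheffe test: shrink the contrast's F by the overall between groups df,
# then compare against the original cutoff. All numbers are hypothetical.

def scheffe(f_contrast, df_btwn_overall, f_cutoff):
    adjusted_f = f_contrast / df_btwn_overall
    return adjusted_f, adjusted_f >= f_cutoff

adj_f, significant = scheffe(9.0, 3, 3.35)
print(adj_f, significant)  # 3.0 False
```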
how do you get the F ratios for rows and columns in two-way ANOVAs?
for columns: a between groups variance estimate based on the variation between the column marginal means / the within groups (aka within cells) variance estimate based on averaging the variance estimates from each of the cells
for rows: the same, except the between groups variance estimate is based on the variation between the row marginal means
what is an ANOVA? how is it different from t-test for independent means?
for t-test for independent means, youre comparing 2 groups of scores from 2 completely different groups of ppl (ie experimental vs control; men vs women). for ANOVAs, youre looking at MORE than 2 groups of scores, each coming from entirely different groups of ppl
effect size in ANOVA vs t-test
for the t-test, you're looking at the difference btwn the 2 means and dividing by the SD. HOWEVER, in ANOVA it's a little more complex: bc you have more than 2 means, it is not obvious what the equivalent of the difference between the means (the numerator of the t-test effect size) would be... so we use a different approach to effect size, the proportion of variance accounted for (aka R squared)
what problem do you encounter when you carry out lots of planned contrasts? how do you solve this problem?
if you test each contrast at the .05 level, then the more planned contrasts you carry out, the higher the overall chance that at least one of them comes out significant just by chance -- the familywise error rate grows, making it easier to get a false positive. ie if you make 2 planned contrasts, you have about a .10 chance of finding at least one significant result even if the null is true
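The "about .10" figure can be checked: assuming independent contrasts, the chance of at least one false positive across k contrasts is 1 - (1 - alpha)^k.

```python
# Familywise error rate across several contrasts, each tested at alpha
# (assumes the contrasts are independent).

def familywise_alpha(alpha, n_contrasts):
    return 1 - (1 - alpha) ** n_contrasts

print(round(familywise_alpha(0.05, 2), 4))  # 0.0975
```

For 2 contrasts at .05 this comes out to .0975, which the card rounds to .10.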
repeated measures ANOVAs vs one-way and two-way ANOVAs
in one and two way ANOVAs, the diff cells or groupings are based on scores coming from diff individuals; in repeated measures ANOVAs, the researcher measures the same individual in several different situations
how do you figure the between groups estimate?
it requires 2 steps:
- estimate the variance of the distribution of means (aka s2m): subtract the grand mean from each sample's mean, square it, add these numbers up, then divide by the between groups degrees of freedom (aka the number of groups - 1)
- then figure the estimated variance of the population of individual scores by multiplying the variance of the distribution of means by the number of scores in each group (aka (s2m)(n))
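The two steps in code, using made-up group means (5, 7, 9) and n = 4 scores per group:

```python
# Between groups population variance estimate from the group means.

def s2_between(means, n):
    gm = sum(means) / len(means)
    df_btwn = len(means) - 1
    # step 1: estimate the variance of the distribution of means
    s2m = sum((m - gm) ** 2 for m in means) / df_btwn
    # step 2: multiply by the number of scores per group
    return s2m * n

print(s2_between([5, 7, 9], n=4))  # 16.0
```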
what is the structural model in the ANOVA?
it still maintains the same logic; however, this model provides a more flexible way of figuring the 2 pop variance estimates and more easily handles situations in which the # of individuals in each group is not equal (which we dont deal with in this stats class)
- the previous method of doing ANOVA emphasizes entire groups: you compare a variance based on differences among groups to a variance based on averaging the variances of the groups
- this model emphasizes individual scores: you compare a variance based on deviations of individual scores' groups' means from the grand mean to a variance based on deviations of individual scores from their group's mean
* in SUMMARY: the earlier method focuses directly on what contributes to the overall pop variance estimates; this model focuses directly on dividing up the deviations of scores from the GM - this model is all about DEVIATIONS
what is the logic of the two way ANOVA (ie in comparison to one way ANOVA)?
like one-way ANOVA, you're still using an F ratio based on between and within estimates of the population variance, but in this case you'll have 3 F ratios:
- one for the grouping variable spread ACROSS rows (the row main effect)
- one for the grouping variable spread ACROSS columns (the column main effect)
- one for the interaction effect
* the numerator of each of these 3 F ratios will be a between groups population variance estimate based on the groupings being compared for that particular main or interaction effect
* the within groups variance estimate (aka within cells variance estimate) is the same for all 3 F ratios
what is an omnibus difference?
means "overall" difference among several groups (most research situations do not involve this omnibus difference, but instead more specific comparisons)
null and research hypothesis for ANOVA
null hyp - the several populations being compared all have the SAME MEAN research hyp - the means of the several populations DIFFER in summary, hypothesis testing in ANOVAs is all about whether the means of the samples differ more than you would expect if the null were true
effect size in two way ANOVA
same as one way ANOVA, except you figure the effect size and power for each main effect and the interaction effect - in formula:
COLUMNS EFFECT SIZE: (s2columns)(dfcolumns) / [(s2columns)(dfcolumns) + (s2within)(dfwithin)]
ROWS EFFECT SIZE: (s2rows)(dfrows) / [(s2rows)(dfrows) + (s2within)(dfwithin)]
INTERACTION EFFECT SIZE: (s2interaction)(dfinteraction) / [(s2interaction)(dfinteraction) + (s2within)(dfwithin)]
what are cohens effect size conventions for ANOVA?
small R2 = .01 medium R2 = .06 large R2 = .14
what is the controversy btwn omnibus tests vs planned contrasts?
some argue that these omnibus (overall) tests are not very useful and that you should ONLY do planned contrasts, replacing entirely the overall F test (aka the omnibus F test) instead of having planned contrasts be a supplement to the omnibus F test - however, though revolutionary, this idea has not yet been implemented bc critics argue that we lose out on finding UNEXPECTED differences that were not initially planned, and we put too much control of what is found in the hands of the researcher
just how much bigger than 1 should the ratio of the between groups to the within groups population variance estimate be?
statisticians developed the calculations of the F distribution and created tables of F ratios - so you would look up in an F table how extreme an F ratio is needed to reject the null at various alpha levels (ie .05)
so what is this ratio called?
the F ratio!!
how are within and between group estimates of the population variance related?
the variation among means of samples is directly related to the variation in each of the populations from which the samples are taken. the more variation in each population, the more variation there is among the means of samples taken from those populations. SO, this idea brings up an important point - bc of this relationship, we should be able to estimate the variance in each population from the variation among the means in our samples (aka a between groups estimate of the population variance)
what is the F distribution?
there are many different F distributions, all having a slightly different shape due to how many samples you take each time and how many scores are in each sample. generally, this distribution is: - not symmetrical - has a positive skew (long tail on the right); the reason for this skew is that an f distribution is a distribution of variances and variances are always positive #s (bc theyre squared). bc of this, a variance can never be less than 0, but there isnt anything stopping it from being a really high number (hence the long tail)
what are the assumptions for ANOVA?
these assumptions are pretty much the same as the t-test for independent means: - the cutoff f ratio from the table = strictly accurate only when: a) the populations follow a normal curve b) populations have equal variances - (like in t tests) all scores are independent of each other
how do we describe factorial research designs (ie how do we write how many grouping variables and levels, etc in the results section)?
they are often described in terms of the # of grouping variables in the study and the # of levels that each grouping variable has
ie in a study where the 1st grouping variable has 2 levels (high and low) and the 2nd grouping variable has 3 levels (small, medium, large), this would be a 2 x 3 factorial design (where the "2" = # of levels for the first grouping variable and "3" = # of levels for the second grouping variable)
ie in a study where there's 3 grouping variables (the first has 2 levels, the last 2 have 3 levels each), this would be a 2 x 3 x 3 factorial design
how do you get the F ratios for interaction effects in two-way ANOVAs?
think of it as the combination left over after considering the row and column main effects
what is the F ratio?
this is the crucial ratio of the between groups to the within groups population variance estimate
what is the between groups degrees of freedom?
this is the df used in the between groups estimate of the population variance in an ANOVA (aka the NUMERATOR of the f ratio) - in other words, this is the # of scores free to vary in figuring the between groups estimate of the population variance - in formula, # of groups - 1
what is the within groups degrees of freedom?
this is the df used in the within groups estimate of the population variance in ANOVA (aka the denominator of the f ratio) - in other words, this is the # of scores free to vary in figuring the within groups estimate of the population variance - in formula, the sum of the dfs for all the groups
what is the central principle of the ANOVA?
when the null hyp is true, the ratio of the between groups population variance estimate to the within groups population variance estimate should be about 1 (bc the two estimates are estimating the same thing, eg 107.5/107.5 = 1, giving you a ratio of 1:1). when the research hyp is true, the ratio should be greater than 1.
* so if you figure this ratio and it comes out well above 1 (more extreme than the cutoff from the F table), you can reject the null hyp. in other words, it is unlikely that the null hyp is true when the between groups population variance estimate is a lot bigger than the within groups estimate
what is a factorial research design?
when you examine the effects of two or more variables at the same time by making groupings of every possible combination of variables... this is a widely used research design in psychology * you use factorial ANOVAs for this design
how do you solve the problem of increasing significance level when you carry out post hoc comparisons?
you could use the bonferroni procedure, but often there are too many comparisons to do the calculations for each one; so, what many use is the scheffe test or tukey test
how do you figure out the within groups estimate?
you use the usual method and it requires 2 steps:
- for each group, take the sum of its squared deviation scores and divide it by that group's degrees of freedom (# of scores in the group - 1) to get each group's variance estimate
- then average these variance estimates across the groups
** this is aka s2within or MSwithin (mean squares within)
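The two steps on a small invented data set (two groups of three scores):

```python
# Within groups population variance estimate: average each group's own
# variance estimate (its SS divided by its df).

def s2_within(groups):
    estimates = []
    for g in groups:
        m = sum(g) / len(g)
        ss = sum((x - m) ** 2 for x in g)
        estimates.append(ss / (len(g) - 1))   # each group's variance estimate
    return sum(estimates) / len(estimates)    # average the estimates

print(s2_within([[4, 6, 8], [1, 2, 3]]))  # 2.5
```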