Week 3: One-way ANOVA and Post Hoc tests

df error

(N - k) - (n - 1): df within treatments minus df between subjects (repeated measures); equivalent to (k - 1)(n - 1)

family wise error rate

1 - (1 - α)^n, where α is the alpha level for each comparison and n is the number of comparisons
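A minimal sketch of this inflation (the alpha level and numbers of comparisons below are illustrative assumptions, not from the cards):

```python
# Family-wise error rate = 1 - (1 - alpha)^n for n comparisons at per-test alpha.
alpha = 0.05

for n in (1, 4, 10):
    fwe = 1 - (1 - alpha) ** n
    print(f"{n:2d} comparisons -> family-wise error = {fwe:.3f}")
# 1 comparison   -> 0.050
# 4 comparisons  -> 0.185
# 10 comparisons -> 0.401
```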

homogeneity of variance

ANOVA is fairly robust if the sample sizes are equal, but not if the sample sizes are unequal. If there is heterogeneity of variance, you can use the Brown-Forsythe or Welch F-ratio instead of the regular F-ratio; these take into account that the variances are not equal.

what is the alternative to using multiple t-tests?

ANOVA: it can compare multiple groups in one test; alternatively, apply family-wise error (FWE) control to the multiple tests

f ratio

F = MS between/ MS within
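A quick illustration with made-up data for three independent groups (scipy assumed available):

```python
# One-way independent ANOVA: F = MS between / MS within.
from scipy import stats

group_a = [4, 5, 6, 5]   # illustrative scores, not real data
group_b = [7, 8, 6, 7]
group_c = [9, 8, 10, 9]

F, p = stats.f_oneway(group_a, group_b, group_c)
print(F, p)
```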

post hoc for unequal sample sizes

Gabriel's (small discrepancy in n) or Hochberg's GT2 (large discrepancy in n)

post hoc for unequal variances

Games-Howell

f ratio for repeated measures

MS between treatments/ MS error

df for total

N-1

df for within treatments

N-k

SS between subjects

SS between subjects = Σ(P^2)/k - G^2/N, where P = total of the scores for each participant, k = number of treatment levels, G = grand total of all scores, N = total number of scores

T-tests

Parametric, tests the difference between two sample means.

MS between

SSbetween/df between

MS error

SSerror/ df error

SS error

SSwithin - SSbetween subjects

MS within

SSwithin/dfwithin

type 2 error

failing to reject (retaining) the null hypothesis when we shouldn't; this is related to power

omega-squared

an alternative to eta-squared; a less biased estimator (an unbiased version of eta-squared). Benchmarks: 0.01 small, 0.06 medium, 0.14 large

how do we run a one way independent ANOVA in SPSS

Analyze > Compare Means > One-Way ANOVA

how do we run a One-way repeated measures ANOVA in SPSS

Analyze > General Linear Model > Repeated Measures

Post hoc recommendations

assumptions met and equal sample sizes: use REGWQ or Tukey HSD; safe option: Bonferroni (although quite conservative); unequal sample sizes: Gabriel's (small discrepancy in n) or Hochberg's GT2 (large discrepancy in n); unequal variances: Games-Howell

sphericity

the assumption that the variances of the difference scores between all possible pairs of treatment levels are equal; to check it, calculate the differences between conditions for all participants

between-subjects

independent measures: different participants take part in each condition, so we compare subjects with each other

the ANOVA model

compare the amount of variability explained by the model to the error in the model (in ANOVA, our model is the groups). If the model explains a lot more variability than it can't explain (the model variance is much greater than the residual variance), then the experimental manipulation had a significant effect on the outcome (DV). The larger F is, the less likely it is to be due to sampling error.

when group with larger sample sizes have larger variances than the groups with smaller sample sizes, the resulting F-ratio tends to be ________

conservative: more likely to produce a non-significant result when a genuine difference does exist

the f-ratio ______ tell us whether the F-ratio is large enough to not be a chance result

does not; to discover this we compare the obtained value of F against the maximum value we would expect to get by chance if the group means were equal, in an F-distribution with the same degrees of freedom. If the value we obtain exceeds this critical value, we can be confident that it reflects an effect of our independent variable.

post hoc tests

essentially t-tests that compare each mean against all others; in general terms they use a stricter criterion to accept an effect as significant. They control the family-wise error rate (αFW); the simplest example is the Bonferroni method.

pooled standard deviation

a combined estimate of the standard deviation based on all groups

The REGWQ has _____ and tight control of the ____ error rate

good power; type 1

sphericity violations

if Mauchly's test is significant (p < 0.05) we use an alternative, corrected ANOVA. Epsilon values: Greenhouse-Geisser, Huynh-Feldt, lower-bound. Each of them corrects for the violation of sphericity by correcting the degrees of freedom.
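A minimal sketch of how an epsilon value adjusts the degrees of freedom (the epsilon, k, and n values below are made-up examples, not from the cards):

```python
# Sphericity correction: multiply both df terms by the epsilon estimate.
epsilon = 0.62          # illustrative epsilon (e.g. a Greenhouse-Geisser estimate)
k, n = 4, 12            # illustrative: 4 conditions, 12 participants

df_effect = k - 1                  # uncorrected df between treatments
df_error = (k - 1) * (n - 1)       # uncorrected df error (repeated measures)

print(epsilon * df_effect, epsilon * df_error)   # corrected df: 1.86, 20.46
```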

when do we use greenhouse gessier correction?

if the Greenhouse-Geisser epsilon estimate is less than 0.75 we use this correction; it is a conservative method

what is the problem of doing multiple t tests?

If we do a single test, the decision-wise error rate (α) for that comparison assumes it is the only test being done. When we conduct multiple tests we have this 5% error rate each time, so the more tests we do, the more likely it is that one of those tests will be wrong. The family-wise error rate (across all of the tests we are conducting) becomes inflated, and we need to control it: family-wise error = 1 - (1 - 0.05)^n. For instance, with 4 tests the family-wise error = 1 - (0.95)^4 ≈ 0.19. This increases the type 1 error rate: a false positive, rejecting the null hypothesis when it is true.

underlying theory of repeated measures ANOVA

In a repeated-measures research study, individual differences are not random: because the same individuals are measured in every treatment condition, individual differences will have a systematic effect on the results. These individual differences can be measured and separated out from other sources of error. We partition the variance into: total variance, between-subjects variance, and within-subjects variance (previously the residual or error).

assumptions

independent observations: someone's score in one group can't affect someone's score in another group; interval/ratio level of measurement: increments must be equal; normality of the sampling distribution: remember the central limit theorem; homogeneity of variance: the group variances should be equal

eta-squared

it is a biased estimator: really good at telling us about our sample but doesn't give a very accurate effect size for the population. Benchmarks: 0.01 small, 0.09 medium, 0.25 large

what does the f-ratio measure?

it is the ratio of the variation explained by the model to the variation explained by unsystematic factors

is ANOVA robust?

it is described as a robust test, meaning that it doesn't really matter if we break the assumptions of the test; the F will still be accurate

df for between treatments

k-1

when normality is violated, what non-parametric test can be performed? (independent measures)

Kruskal-Wallis
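A minimal usage sketch (made-up scores; scipy assumed available):

```python
# Kruskal-Wallis H test: non-parametric alternative to one-way independent ANOVA.
from scipy import stats

group_a = [3, 5, 4, 6, 5]   # illustrative data only
group_b = [7, 8, 6, 9, 7]
group_c = [2, 3, 4, 3, 2]

H, p = stats.kruskal(group_a, group_b, group_c)
print(H, p)
```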

when the groups with larger sample sizes have smaller variances than the groups with smaller sample sizes, the resulting F-ratio tends to be_____

liberal- more likely to produce a significant result when there is no difference between groups in the population- type 1 error rate is not controlled

skewed distributions seem to have _____effect on the error rate and power for two-tailed tests

little

what test is used to assess sphericity

Mauchly's test

what are the methods of follow up for ANOVA?

Multiple t-tests? Not a good idea: it increases our type 1 error. We could do orthogonal contrasts/comparisons: this is what we do when we have a planned hypothesis. Post hoc tests: use when you do NOT have a planned hypothesis; they compare all pairs of means, and to use post hoc tests properly we have to have a SIGNIFICANT ANOVA. Trend analysis: if we believe the means follow a particular shape.

sum of squares

need to calculate: SST (total), SSM = SSB (between-subjects factor), SSR = SSW (within-subjects factor); work out the overall mean and the group means
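A worked sketch of this partition for a between-subjects design (the three groups and their scores are made-up example data):

```python
# Sum-of-squares partition: SST = SSM + SSR.
import numpy as np

groups = [np.array([4., 5., 6., 5.]),
          np.array([7., 8., 6., 7.]),
          np.array([9., 8., 10., 9.])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# SST: every score around the grand mean.
ss_total = ((all_scores - grand_mean) ** 2).sum()
# SSM (= SSB): each group mean around the grand mean, weighted by group n.
ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSR (= SSW): each score around its own group mean.
ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups)

print(ss_total, ss_model + ss_residual)   # both 38.0 for this example
```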

if the f-ratio is less than 1 it must represent a _________effect

a non-significant effect, because there must be more unsystematic than systematic variance; our experimental manipulation has been unsuccessful

non-orthogonal comparisons

Non-orthogonal comparisons are comparisons that are in some way related. Using a cake analogy: non-orthogonal comparisons are where you slice up your cake and then try to stick slices of cake together again. The comparisons are related, so the resulting test statistics and p-values will be correlated to some extent; for this reason you should use a more conservative probability level to accept that a given contrast is statistically meaningful.

labeling ANOVAs

one-way: one IV or factor; two-way: two IVs or factors; three-way: three IVs or factors

post hoc tests consist of ______comparisons that are designed to compare all___________

pairwise; different combinations of the treatment groups

the three examples of effects sizes

proportion of variance accounted for (r^2), eta-squared (η^2), omega-squared (ω^2)
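A sketch of eta-squared and omega-squared computed from the sums of squares (the SS, k, and N values reuse the made-up example above: SSM = 32, SSR = 6, k = 3, N = 12):

```python
# Effect sizes from the ANOVA table values.
k, N = 3, 12
ss_model, ss_residual = 32.0, 6.0
ss_total = ss_model + ss_residual
ms_residual = ss_residual / (N - k)

eta_sq = ss_model / ss_total                                              # biased, sample-based
omega_sq = (ss_model - (k - 1) * ms_residual) / (ss_total + ms_residual)  # less biased estimate

print(eta_sq, omega_sq)
```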

when is anova robust to normality

Providing that your sample sizes are equal and large (df error > 20), ANOVA is robust to the normality assumption being violated. If the sample sizes are unequal or small, ANOVA is not so robust to violation of the normality assumption; you can transform the data or use a non-parametric alternative to a one-way independent-measures ANOVA (the Kruskal-Wallis test).

type 1 error

reject the null hypothesis when we shouldn't: saying there's a difference when there isn't

________design is more powerful than_________design because they remove individual differences

repeated measures; independent measures

MSr

represents the average amount of variation explained by extraneous variables- the unsystematic variation

mean squares MSm

represents the average amount of variation explained by the model- systematic variation

In SPSS what does the row and column represent

row: data from one entity; column: a level of a variable

within-subjects design

the same participants take part in each condition; one sample across all levels of the IV; also known as repeated measures

the effects of kurtosis seem unaffected by whether _______

sample sizes are equal or not

f controls the type 1 error rate well under what conditions?

skew, kurtosis and non-normality

effect size

small effect sizes will be significant with a large n: the larger the sample size, the more likely you are to find an effect. Effect sizes indicate how big that effect is, and they allow comparisons across contexts.

Bonferroni has more power when the number of comparisons is_____, whereas Tukey is more powerful when testing______

small; large number of means

one-way ANOVA

a statistical technique for comparing several means at once; it tests the null hypothesis that the means of all populations of interest are equal. We call it an omnibus test: it tells us that somewhere there is an inequality, but it doesn't tell us where this difference actually is. This is why we need a follow-up to work out where the difference is; there are two approaches: post hoc tests or planned comparisons/contrasts.

omnibus test

tells us means aren't equal somewhere, but not where

cohen's d

tells us the degree of separation between two distributions: how far apart, in standard deviation units, the means are. Benchmarks: 0.2 small, 0.5 medium, 0.8 large
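A minimal sketch using a pooled standard deviation (the two samples are made-up data):

```python
# Cohen's d = (mean difference) / pooled standard deviation.
import numpy as np

a = np.array([5.1, 6.2, 5.8, 6.0, 5.5])   # illustrative group scores
b = np.array([6.9, 7.4, 7.1, 6.8, 7.5])

n_a, n_b = len(a), len(b)
pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
d = (b.mean() - a.mean()) / np.sqrt(pooled_var)
print(d)
```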

mauchly's test

a test of sphericity: it tests the null hypothesis that the variances of the differences between levels are equal. The assumption is violated when p is less than 0.05 (we then must look at the epsilon values); the assumption is not violated when p is greater than 0.05.

if the f-ratio is greater than 1 it is indicates______

that the experimental manipulation had some effect above the effect of individual differences in performance

interpretation of the F-ratio

the F-ratio tells us only that the direct or indirect manipulation was successful; it does not tell us specifically which group means differ from which. We need to do additional tests to find out where the group differences lie.

power

the probability of correctly rejecting the null hypothesis when it is false, i.e. the probability of detecting an effect if it is there; we want more highly powered studies

SS within (repeated measures)

the variability within an individual: the sum of squared differences between each score and the mean for that respective participant. It is broken down into two parts: how much variability there is between treatment conditions (model sum of squares, SSm), and how much variability cannot be explained by the model (residual sum of squares, SSr).
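A sketch of this repeated-measures partition with made-up data (rows = participants, columns = conditions):

```python
# Within-subject variability split into model (treatment) and residual parts.
import numpy as np

scores = np.array([[3., 5., 6.],    # illustrative scores only
                   [2., 4., 5.],
                   [4., 6., 8.],
                   [3., 5., 7.]])
n, k = scores.shape
grand_mean = scores.mean()

# SS within: each score around that participant's own mean.
ss_within = ((scores - scores.mean(axis=1, keepdims=True)) ** 2).sum()
# SSm: treatment means around the grand mean, weighted by n.
ss_model = n * ((scores.mean(axis=0) - grand_mean) ** 2).sum()
# SSr: what the model cannot explain.
ss_residual = ss_within - ss_model

F = (ss_model / (k - 1)) / (ss_residual / ((k - 1) * (n - 1)))
print(F)   # the repeated-measures F-ratio
```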

variance ratio method

there is overlap between the individual groups; this is where we get our error variance. We also ask how much variability there is between the treatment groups in the total sample (the treatment/model variance). In ANOVA we take this variance and divide it by the error variance; if the ratio is greater than 1, we have evidence that the means are different.

multivariate tests

these tests conduct a MANOVA rather than an ANOVA; their results can be used in place of the regular ANOVA results if the sphericity or normality assumptions are violated. Whilst these tests are more robust than ANOVA to the assumption violations, they are also less powerful.

how does pairwise comparisons control the FWE?

they control FWE by correcting the level of significance of each test such that the overall type 1 error rate across comparisons remains at 0.05
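The simplest such correction is the Bonferroni adjustment; a minimal sketch (the p-values below are made-up examples):

```python
# Bonferroni: divide the family-wise alpha across the comparisons.
p_values = [0.001, 0.020, 0.030, 0.250]    # one p-value per pairwise comparison
alpha_fw = 0.05
alpha_per_test = alpha_fw / len(p_values)  # 0.0125 for 4 comparisons

for p in p_values:
    print(p, "significant" if p < alpha_per_test else "not significant")
```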

polynomial contrasts: trend analysis

This contrast tests for trends in the data; in its most basic form it looks for a linear trend, but there are other trends: quadratic, cubic, and quartic. Linear trend: a proportionate change in the value of the dependent variable across ordered categories. Quadratic trend: where there is a curve in the line (to find a quadratic trend you need at least three groups). Cubic trend: where there are two changes in the direction of the trend (must have at least four categories of the IV). Quartic trend: has three changes of direction (needs at least five categories of the IV). Each of these trends has a set of codes for the dummy variables in the regression model; if you add the codes for a given trend the sum will equal zero, and if you multiply the codes the sum of products will also equal zero, so these contrasts are orthogonal.
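A quick check of those two properties using the standard polynomial contrast codes for a four-level IV:

```python
# Linear and quadratic polynomial contrast codes for 4 ordered groups.
import numpy as np

linear = np.array([-3, -1, 1, 3])
quadratic = np.array([1, -1, -1, 1])

print(linear.sum(), quadratic.sum())   # each set of codes sums to zero
print((linear * quadratic).sum())      # cross-products sum to zero -> orthogonal
```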

when do we use lower-bound correction?

this is too conservative: it assumes that sphericity is completely violated; avoid using this correction

platykurtic distributions make the type 1 error rate ______ and consequently the power is______

too high; too low

leptokurtic distributions make the type 1 error rate ______ and consequently the power is______

too low; too high

SStotal

the total variance in the data: the sum of squared differences between each score and the grand mean

if a test is conservative (the probability of a _____ error is small) then it is likely to lack statistical power (the probability of a _____error will be high)

type 1; type 2

why opt for repeated-measures?

uses a single sample, with the same set of individuals measured in all of the different treatment conditions. One of the characteristics of a repeated-measures design is that it eliminates variance caused by individual differences. Individual differences are those participant characteristics that vary from one person to another and may influence the measurement that you obtain for each person (age, gender, etc.).

problems with repeated measures

usually we assume independence, but scores will be correlated between conditions, which violates the assumption of independence. Accordingly, an additional assumption is made: sphericity, which assumes that the variances (and covariances) of the differences between treatment levels are the same; this is related to the idea that the correlations between the treatment levels should be the same.

SSmodel or SS between

the variance explained by the model (the groups): the sum of squared differences between each group mean and the grand mean, weighted by group n. The more the group means spread out, the greater SS between will be. SS between = SS total - SS within treatments. Between groups = our model = SS model = SS between.

SSresidual or SSwithin

the variance that cannot be explained by the model (the groups): the sum of squared differences between each score and its group mean. SS within treatments = SS1 + SS2 + SS3, etc. Within groups = our error = SS residual = SS within.

underlying theory of between-subjects ANOVA

we calculate how much total variability there is between scores (total sum of squares); we then calculate how much of this variability can be explained by the model we fit to the data (model sum of squares) and how much cannot be explained, i.e. how much variability is due to individual differences or error (residual sum of squares)

when do we use huynh-feld correction?

when the Greenhouse-Geisser epsilon estimate is greater than 0.75; it is a less conservative method than the Greenhouse-Geisser correction

when shouldn't we use REGWQ test?

when group sizes are different

when is ANOVA fairly robust to violations of the assumption of homogeneity of variance?

when sample sizes are equal

when should we use REGWQ test?

when we want to test all pairs of means

what else can you use when assumption of normality or sphericity are violated?

You can also use multivariate tests such as MANOVA if the assumptions of normality or sphericity are violated, as this provides a more robust test of the null hypothesis that all treatment means are equal. You could also conduct a Friedman's test as the non-parametric alternative to a one-way repeated-measures ANOVA.
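A minimal usage sketch of Friedman's test (made-up repeated-measures data; scipy assumed available):

```python
# Friedman's test: non-parametric alternative to one-way repeated-measures ANOVA.
from scipy import stats

cond_1 = [3, 2, 4, 3, 5]   # each list holds one condition's scores
cond_2 = [5, 4, 6, 5, 6]   # from the same participants (illustrative values)
cond_3 = [6, 5, 8, 7, 7]

chi_sq, p = stats.friedmanchisquare(cond_1, cond_2, cond_3)
print(chi_sq, p)
```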

