Statistical Power & Sample Size


Power can be given as

a percentage, a probability, or a proportion (e.g. 80% or 0.80)

Effect Size

A basic, commonly used between-groups effect size is Cohen's d:
- An effect size corresponding to the difference between 2 group means
- Can also be used in within-groups designs
- d = (M1 - M2) / SDpooled: the difference between the 2 means divided by the pooled SD for the two groups
- Percentile standing: the control-group percentile corresponding to the mean of the treatment group
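As a rough illustration of the d = (M1 - M2) / SDpooled formula above, here is a minimal Python sketch; the scores are made-up numbers and numpy is an assumed dependency, not something the module provides:

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d for two independent groups, using the pooled SD."""
    g1, g2 = np.asarray(group1, dtype=float), np.asarray(group2, dtype=float)
    n1, n2 = len(g1), len(g2)
    # Pooled SD, weighting each group's variance by its degrees of freedom
    pooled_sd = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1))
                        / (n1 + n2 - 2))
    return (g1.mean() - g2.mean()) / pooled_sd

# Hypothetical treatment vs control scores
treatment = [12, 15, 14, 16, 13, 17]
control = [10, 11, 12, 9, 13, 11]
print(round(cohens_d(treatment, control), 2))
```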

Calculating power

The correlation coefficient (i.e. Pearson's r) is a simple measure of effect size:
- Varies between -1 and +1
- Not dependent on the unit of the measurement scale
- Important point: r is just one measure of effect size; there are many different measures of effect size depending on the research design and statistical test being used
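If you want to see r computed in practice, here is a minimal sketch; the data are invented for illustration and scipy is an assumed dependency rather than part of the module:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.6, 4.8, 5.1, 6.3])

# r is bounded between -1 and +1 and is unit-free
r, p = stats.pearsonr(x, y)
print(round(r, 3), round(p, 4))
```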

Power in contemporary psychology

Psychology and related disciplines are in the midst of a statistical crisis:
- Many researchers are using poor statistical practices; many results have not been replicated and/or are not replicable
- This severely limits the knowledge base of our field

violations of assumptions underlying tests will

reduce power. Non-parametric tests (not covered in this module), which do not make as many assumptions, may be more powerful in such cases.

As sample size increases, power increases

with effect size (Cohen's d) = 0.5 and α = 0.05
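To see this relationship numerically, here is a hedged sketch using statsmodels' power calculator for an independent-samples t-test; statsmodels is an assumption of mine rather than part of the module, and the sample sizes are arbitrary:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Power of a two-tailed independent-samples t-test at d = 0.5, alpha = .05,
# for increasing per-group sample sizes
for n in (10, 20, 40, 80, 160):
    power = analysis.power(effect_size=0.5, nobs1=n, alpha=0.05,
                           alternative='two-sided')
    print(n, round(power, 2))
```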

As effect size increases, the total sample size required to achieve a given level of power goes down

with power = 0.80 and α = 0.05
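The same statsmodels class (again an assumed tool, not part of the module) can solve for the required per-group n at a fixed power of 0.80, across Cohen's conventional small, medium and large effect sizes:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Required per-group n for power = 0.80, alpha = .05 (two-tailed),
# across small (0.2), medium (0.5) and large (0.8) effects
for d in (0.2, 0.5, 0.8):
    n_per_group = analysis.solve_power(effect_size=d, power=0.80, alpha=0.05,
                                       alternative='two-sided')
    print(d, round(n_per_group))
```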

Type I error

- False positives: stating an effect is present when in reality there is no effect
- Occur with probability α, the conventional value used for significance testing (e.g. α = .05, .01)

Type II error

- Misses: stating there is no effect (failing to reject the null hypothesis) when there is actually an effect
- Occur with a probability of β

We can estimate effect size from:

- Previous research: use sample means and SDs obtained from previous studies (see the sketch below). Meta-analyses (analyses that incorporate a large number of studies to examine the evidence for, and size of, an effect) can be especially useful.
- The researcher's estimate of an important effect: the researcher decides on the minimum important difference between means; the SD still needs to be estimated.
- Conventional labels of effect size magnitude for d originally provided by Cohen (small = 0.2, medium = 0.5, large = 0.8).
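For the "previous research" route, here is a small sketch of how reported summary statistics could be turned into a d estimate; the means, SDs and sample sizes below are hypothetical placeholders:

```python
import numpy as np

# Summary statistics reported in a (hypothetical) earlier study
m1, sd1, n1 = 105.0, 14.0, 30   # treatment group
m2, sd2, n2 = 100.0, 15.0, 30   # control group

# Pooled SD, then Cohen's d, to feed into an a priori power analysis
pooled_sd = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (m1 - m2) / pooled_sd
print(round(d, 2))
```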

Power is a function of the following factors:

- The probability of a Type I error (α)
- The magnitude of the effect assuming H1 (i.e. the mean difference)
- Sample size
- The type of statistical test used
- Whether a one- or two-tailed test is used (illustrated in the sketch below)
- How well the data satisfy the test assumptions
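One of the factors above, one- vs two-tailed testing, is easy to demonstrate with the same (assumed) statsmodels calculator; the effect size, n and α are arbitrary choices of mine:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Same effect size, n and alpha; only the sidedness of the test changes
two_tailed = analysis.power(effect_size=0.5, nobs1=30, alpha=0.05,
                            alternative='two-sided')
one_tailed = analysis.power(effect_size=0.5, nobs1=30, alpha=0.05,
                            alternative='larger')
print(round(two_tailed, 2), round(one_tailed, 2))  # the one-tailed test has more power
```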

A conventional value of power = _______ is often desired

0.80. Note that levels of power beyond this value are often bought only at very large sample sizes.
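A quick way to see those diminishing returns, again using statsmodels as an assumed tool and a medium effect of d = 0.5 chosen purely for illustration:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Per-group n needed at d = 0.5, alpha = .05, for increasingly ambitious power targets
for target_power in (0.80, 0.90, 0.95, 0.99):
    n = analysis.solve_power(effect_size=0.5, power=target_power, alpha=0.05)
    print(target_power, round(n))
```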

An example using correlation:

- Calculate or estimate r
- We then decide on our value of α, and we can use statistical tables (e.g. see Cohen, 1992) or online calculators etc. to calculate power
- If power equals 0.60, then we can say the researcher has a 60% chance of correctly rejecting H0 if it is false (i.e., if there is a real linear relationship between the two variables)
- The typical H0 in psychology is that r = 0
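In place of tables, one common approximation for the power of a test of H0: ρ = 0 uses Fisher's z transformation; here is a sketch under that assumption, where the r and n values are made up and scipy is an assumed dependency:

```python
import numpy as np
from scipy import stats

def correlation_power(r, n, alpha=0.05):
    """Approximate two-tailed power for testing H0: rho = 0, via Fisher's z."""
    z_r = np.arctanh(r) * np.sqrt(n - 3)     # noncentrality under H1
    z_crit = stats.norm.ppf(1 - alpha / 2)   # two-tailed critical value
    # Probability that |Z| exceeds the critical value when the true correlation is r
    return stats.norm.cdf(z_r - z_crit) + stats.norm.cdf(-z_r - z_crit)

print(round(correlation_power(r=0.30, n=50), 2))
```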

Factors affecting power [1]

Decreasing the threshold for significance (i.e. α, for example moving from .05 to .01) will decrease power. Increasing the threshold will increase power, but also increase the probability of a Type I error.
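A numerical illustration of this trade-off, with statsmodels as an assumed dependency and arbitrary values of d and n:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Stricter alpha thresholds give lower power, holding d = 0.5 and n = 50 per group fixed
for alpha in (0.05, 0.01, 0.001):
    power = analysis.power(effect_size=0.5, nobs1=50, alpha=alpha)
    print(alpha, round(power, 2))
```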

Practical note: when estimating your effect size for a study you plan to run, it is always best to err on the side of caution and make a conservative estimate (i.e. to underestimate it). Why?

If you overestimate your effect size, your study may be underpowered: the sample size you planned will be too small to reliably detect the true, smaller effect.
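A hedged sketch of that scenario (all effect sizes here are invented, and statsmodels is assumed): plan the sample for an optimistic d, then check the power actually achieved if the true effect is smaller:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Per-group n planned for an optimistic effect of d = 0.6 at 80% power
planned_n = analysis.solve_power(effect_size=0.6, power=0.80, alpha=0.05)
# Power actually achieved with that n if the true effect is only d = 0.4
achieved = analysis.power(effect_size=0.4, nobs1=round(planned_n), alpha=0.05)
print(round(planned_n), round(achieved, 2))  # well below the intended 0.80
```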

Factors affecting power [2]

Power will increase with larger effect sizes (i.e. larger magnitudes of effect between distributions)

Factors affecting power [3]

Increasing sample size will decrease the variance of the sampling distributions, thereby increasing power

Why should we worry about power?

We are often interested in knowing the sample size needed to achieve an adequate level of power before we begin a study or research program. This often has important implications for planning time, resources, etc. when designing a study.

How do we calculate power?

We need to know (or estimate) the following ingredients:
- Effect size (e.g. Cohen's d, Pearson's r)
- Sample size
- Significance level (α)
- Whether the statistical test is 1-tailed or 2-tailed

Power tests can be used either ________________ or _________________

a priori or post hoc. Determining the sample size needed before a study takes place is an example of controlling statistical power a priori.

When assumptions hold, non-parametric tests usually have

less power

Power has a prominent position in current discussions

- Many studies are underpowered
- Increasing statistical power is regarded as a critical step toward moving past, and learning from, the current crisis
- A priori power analyses are now required for many journals and applications to funding agencies

Statistical power example

Null hypothesis: the 2 group means do not differ. Alternative hypothesis: they do differ. [see pt. 2]

statistical power

power = 1 - β
The probability of correctly rejecting a false H0 (the null hypothesis, e.g. that there is no difference between 2 groups). In other words, the probability of detecting an effect that is really there.

β =

probability of making a Type II error (i.e., incorrectly failing to reject H0 [stating 2 groups don't differ when they actually do])

The best way to approximate the value of a meaningful effect size is to look at

the immediate research area. What is considered to be a meaningful effect size will vary considerably across subfields of psychology and germane disciplines (e.g. psychiatry, neuroscience).

conventional labels (and obviously measures of effect size) will vary as a function of

the type of test, the study design, the measures used, etc.

Power is also a function of

the underlying assumptions of parametric tests

We can plot the relationships between factors like effect size, power and sample size to visualise ___________________________________________

their dependence on each other

Analysis outcomes can be parsed into a 2x2 table based on ______ _______ and __________________

true state; decision

Post hoc power analyses can be used to examine

whether a statistical test had a fair chance at rejecting an incorrect H0.
- Importantly, in this case the measure of effect size should be based on the population effect size.
- Sometimes post hoc power analyses are reported using sample estimates of the effect size (sometimes called 'observed power'). This is easily done in SPSS, but the practice is generally frowned upon because sample effect sizes are biased estimates of population effect sizes.
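A minimal sketch of a post hoc check in the recommended spirit, i.e. based on a hypothesised population effect size rather than the sample estimate; statsmodels and all the numbers here are my assumptions, not part of the card:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Did a completed study with 25 participants per group have a fair chance of
# detecting a population effect of the size assumed a priori (here d = 0.5)?
post_hoc_power = analysis.power(effect_size=0.5, nobs1=25, alpha=0.05)
print(round(post_hoc_power, 2))
```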

