Exam 2

2 tails in a row

(1/2 * 1/2) = 1/4 = 25%
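The same multiplication rule extends to any run of tails. A minimal sketch, assuming a fair coin and independent flips; the function name `prob_tails_in_a_row` is made up for illustration:

```python
from fractions import Fraction

def prob_tails_in_a_row(n: int) -> Fraction:
    """Probability of n tails on n independent flips of a fair coin."""
    # Each flip has probability 1/2; independent events multiply.
    return Fraction(1, 2) ** n

for n in (1, 2, 5, 7):
    p = prob_tails_in_a_row(n)
    print(f"{n} tails in a row: {p} (about {float(p):.4f})")
```

Using `Fraction` keeps the answer exact (1/4, 1/32, 1/128) before converting to a decimal.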

research hypothesis

(usually) that the means of your populations will differ

null hypothesis

(usually) that there is no (null) difference between the means of your populations

.05 vs .01

.05 is more lenient and therefore is a lower risk for a type II error







7 tails in a row

(1/2)^7 = 1/128, about 0.8%

what affects effect size?

(a) the difference between the population means: bigger numerator = bigger effect size = more power
(b) the comparison population's SD: smaller denominator = bigger effect size = more power

3 characteristics of a distribution of means

1.) The DoM has the same mean as the distribution of individual scores.
2.) (a) The DoM has less spread around the mean (smaller variance and standard deviation) than the distribution of individual scores. (b) The standard deviation of the distribution of means is the (approximate) average amount by which each sample mean differs from the overall population mean.
3.) The distribution of means is approximately normal if the parent population is normal or if each sample has 30 or more people.
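These three characteristics can be checked with a small simulation. This is only a sketch under assumed conditions: the parent population is a hypothetical uniform(0, 1) distribution (deliberately not normal), each sample has N = 30, and the names `simulate_sample_means`, `POP_MEAN`, and `POP_SD` are invented for the example.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical parent population: uniform scores between 0 and 1 (not normal).
POP_MEAN = 0.5
POP_SD = sqrt(1 / 12)  # SD of a uniform(0, 1) distribution

def simulate_sample_means(n, num_samples=5000):
    """Draw num_samples random samples of size n and return each sample's mean."""
    return [sum(rng.random() for _ in range(n)) / n for _ in range(num_samples)]

sample_means = simulate_sample_means(n=30)

# 1.) The mean of the sample means is close to the population mean.
print("mean of sample means:", mean(sample_means), "population mean:", POP_MEAN)
# 2.) The spread of the sample means is close to POP_SD / sqrt(N), well below POP_SD.
print("SD of sample means:", pstdev(sample_means), "expected:", POP_SD / sqrt(30))
```

A histogram of `sample_means` would show the roughly normal shape from point 3, even though the parent population is flat.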

Steps in Hypothesis Testing

1.) Restate the question as a research hypothesis and a null hypothesis about the populations
2.) Determine the characteristics of the comparison distribution
3.) Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected
4.) Determine your sample's score on the comparison distribution
5.) Decide whether to reject the null hypothesis
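The five steps can be sketched for a single-sample Z test. The numbers in the usage line (population mean 100, SD 15, sample of 25 with mean 106) are hypothetical, the function name `z_test` is made up, and the one-tailed branch assumes the research hypothesis predicts a higher mean:

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05, two_tailed=True):
    """Return (z, cutoff, reject) for a single-sample Z test.

    Step 1 happens before any arithmetic: the null hypothesis is that the
    sampled population has mean pop_mean; the research hypothesis is that it differs.
    """
    # Step 2: the comparison distribution is a distribution of means
    # with mean pop_mean and standard deviation (standard error) pop_sd / sqrt(n).
    standard_error = pop_sd / sqrt(n)
    # Step 3: cutoff Z score; a two-tailed test splits alpha between the tails.
    tail_area = alpha / 2 if two_tailed else alpha
    cutoff = NormalDist().inv_cdf(1 - tail_area)
    # Step 4: the sample's score on the comparison distribution.
    z = (sample_mean - pop_mean) / standard_error
    # Step 5: reject the null if the sample's Z reaches or exceeds the cutoff.
    reject = abs(z) >= cutoff if two_tailed else z >= cutoff
    return z, cutoff, reject

z, cutoff, reject = z_test(sample_mean=106, pop_mean=100, pop_sd=15, n=25)
print(f"Z = {z:.2f}, cutoff = ±{cutoff:.2f}, reject null: {reject}")
```

With these made-up numbers the standard error is 15 / 5 = 3, so Z = 6 / 3 = 2.00, just past the two-tailed .05 cutoff of about 1.96.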

distribution of means can be narrow

1.) population of individuals may have a small standard deviation 2.) sample size is large

Power Influenced by

1.) predicted effect size (d)
(a) the difference between the population means: bigger numerator = bigger effect size = more power
(b) the comparison population's SD: smaller denominator = bigger effect size = more power
2.) sample size (N)
3.) alpha level (α)
4.) whether the test is one-tailed or two-tailed

descriptive vs inferential stats

influences power

1.) predicted effect size (d) 2.) sample size (N) 3.) alpha level (α) 4.) whether the test is one-tailed or two-tailed
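All four influences show up in one power calculation. The following is a sketch for a single-sample Z test only, assuming a normal comparison distribution and a predicted effect in the upper tail; `z_test_power` is an illustrative name, not a function from any library.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(d, n, alpha=0.05, two_tailed=True):
    """Approximate power of a single-sample Z test for predicted effect size d."""
    std = NormalDist()
    # If the research hypothesis is true, the sample's Z score is centered
    # on d * sqrt(n) instead of 0.
    shift = d * sqrt(n)
    if two_tailed:
        cutoff = std.inv_cdf(1 - alpha / 2)
        # Chance of landing beyond either cutoff.
        return (1 - std.cdf(cutoff - shift)) + std.cdf(-cutoff - shift)
    cutoff = std.inv_cdf(1 - alpha)
    return 1 - std.cdf(cutoff - shift)

print("baseline (d=.5, N=25, alpha=.05, two-tailed):", round(z_test_power(0.5, 25), 3))
print("bigger effect size (d=.8):", round(z_test_power(0.8, 25), 3))
print("bigger sample (N=100):", round(z_test_power(0.5, 100), 3))
print("stricter alpha (.001):", round(z_test_power(0.5, 25, alpha=0.001), 3))
print("one-tailed:", round(z_test_power(0.5, 25, two_tailed=False), 3))
```

Changing one argument at a time shows the direction of each influence: larger d, larger N, larger alpha, and a one-tailed test each raise power.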

true for all distributions of means

1.) the mean of the distribution of means is about the same as the mean of the original population of individuals 2.) the spread of the distribution of means is less than the spread of the distribution of the population of individuals 3.) for most cases: the shape of the distribution of means is approximately normal

5 tails in a row

(1/2)^5 = 1/32, about 3.1%

Probability of tails on 1 flip

1/2 = 50%

True or false: If you find significant results with alpha set to .00000001, you can say that your results prove the research hypothesis

False: The term "prove" is too strong because conclusions from hypothesis testing are based on probabilities

are 5% of all studies type I errors (false positives)?

No. You cannot falsely conclude that a drug works when it actually works. The null is not always true, so it is not true that 5% of all studies are type I errors.
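The arithmetic behind this answer can be sketched as follows. The share of studies in which the null hypothesis is actually true is unknown in practice, so the values passed in below are purely hypothetical, as is the function name:

```python
def share_of_studies_with_type_i_error(alpha, share_null_true):
    """Expected share of ALL studies that end in a type I error.

    A type I error is only possible when the null hypothesis is true, so the
    overall share is alpha times the share of studies with a true null.
    """
    if not 0 <= share_null_true <= 1:
        raise ValueError("share_null_true must be between 0 and 1")
    return alpha * share_null_true

# Hypothetical scenarios: the null is true in all, half, or none of the studies.
for share in (1.0, 0.5, 0.0):
    result = share_of_studies_with_type_i_error(0.05, share)
    print(f"null true in {share:.0%} of studies -> {result:.1%} type I errors overall")
```

Only in the extreme case where the null is true in every study does the overall share reach the full 5%.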

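if a result is statistically significant at the .05 level, will it always be significant at the .01 level?

No. The .01 level is more stringent, so a result can pass the .05 cutoff and still miss the .01 cutoff.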

If the result is significant with a one-tailed testing procedure, will it always be significant with a two-tailed procedure?

No. The criterion for significance is more stringent with a two-tailed procedure.

If a result is significant with a two-tailed testing procedure will it always be significant with a one-tailed testing procedure?

No, if you pre-specify the low tail and the result is in the high tail, the result will no longer be significant
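Both answers can be checked against Z cutoffs. A minimal sketch, assuming a Z test on a standard normal comparison distribution; the function `is_significant` and its `predicted` argument are invented for this example, and the Z scores are hypothetical:

```python
from statistics import NormalDist

def is_significant(z, alpha=0.05, predicted="either"):
    """Check a Z score against the cutoff for a one- or two-tailed test.

    predicted: "either" (two-tailed), "high" or "low" (one-tailed, pre-specified).
    """
    std = NormalDist()
    if predicted == "either":
        return abs(z) >= std.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    if predicted == "high":
        return z >= std.inv_cdf(1 - alpha)  # all of alpha in the upper tail
    if predicted == "low":
        return z <= std.inv_cdf(alpha)  # all of alpha in the lower tail
    raise ValueError("predicted must be 'either', 'high', or 'low'")

# Significant one-tailed, but not two-tailed:
print(is_significant(1.8, predicted="high"), is_significant(1.8))
# Significant two-tailed, but not one-tailed when the low tail was pre-specified:
print(is_significant(2.5), is_significant(2.5, predicted="low"))
```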

standard error of the mean (SEM)

Same as standard deviation of a distribution of means; also called standard error (SE)

N (number in sample) goes up

Standard error goes down


Which of the below alpha levels makes it easiest to find significant results? .05 .01 .10 .03

.10 (the largest alpha sets the most lenient cutoff)

Which of the following alpha levels makes it toughest to find significant results? .05 .01 .10 .03

.01 (the smallest alpha sets the most stringent cutoff)

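if a result is statistically significant at the .01 level, will it always be significant at the .05 level?

Yes. For the same test, any result extreme enough to pass the more stringent .01 cutoff also passes the .05 cutoff.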

Distribution of means (DOM)

a hypothetical distribution of the means of lots and lots of samples of the same size, randomly taken from the same population of individuals; a collection of sample means for all possible random samples of a particular size (N); each score in this distribution is a sample mean, not an individual participant's score

bar too high (alpha too high)

a lot of people are able to crawl under the bar = a lot of people are counted as athletes who aren't; a lot of false alarms (type I errors)

effect size

a measure of the difference between population means; how much our sample mean differs from the null hypothesis population mean after our treatment

effect is there

a more powerful study will more likely find it

one-tailed test

a result in only one tail (not both) could reject the null

0.05 vs 0.001

a study with alpha of 0.05 is much more powerful than a study with alpha set to 0.001; big alphas set a high limbo bar, making it easier for your study to crawl under that bar and find a statistically significant result

hypothesis testing

based on probability; we figure out how likely (probable) it is that a result could have come about simply by chance (a fluke)

alpha is .001

chance of making a type I error is 0.1%

alpha is .01

chance of making a type I error is 1%

alpha is .10

chance of making a type I error is 10%

alpha is .05

chance of making a type I error is 5%
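Each alpha level corresponds to a cutoff on the comparison distribution. The sketch below assumes a Z test (standard normal comparison distribution); `z_cutoff` is an illustrative helper name:

```python
from statistics import NormalDist

def z_cutoff(alpha, two_tailed=True):
    """Z score a sample must reach (in absolute value) to be significant at alpha."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.01, 0.001):
    print(
        f"alpha = {alpha}: one-tailed cutoff {z_cutoff(alpha, two_tailed=False):.2f}, "
        f"two-tailed cutoff {z_cutoff(alpha):.2f}"
    )
```

Smaller alphas push the cutoff further out in the tail, which lowers the chance of a type I error but makes a significant result harder to reach.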

statistically significant

conclusion that the results of a study would be unlikely if in fact the sample studied represents a population that is no different from the population in general; an outcome of hypothesis testing in which the null hypothesis is rejected

95% confidence interval

confidence interval in which, roughly speaking, there is a 95% chance that the population mean falls within this interval

99% confidence interval

confidence interval in which, roughly speaking, there is a 99% chance that the population mean falls within this interval
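A minimal sketch of how the two intervals could be computed when the population SD is known (the Z-test situation). The sample values are hypothetical and `confidence_interval` is an invented helper name:

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, pop_sd, n, level=0.95):
    """Return (lower limit, upper limit) of a Z-based confidence interval."""
    standard_error = pop_sd / sqrt(n)
    # Z score that leaves (1 - level) / 2 in each tail, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 106, N = 25, known population SD of 15.
for level in (0.95, 0.99):
    lower, upper = confidence_interval(106, 15, 25, level)
    print(f"{level:.0%} confidence interval: {lower:.2f} to {upper:.2f}")
```

The 99% interval is wider than the 95% interval because it uses a larger Z multiplier on the same standard error.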


descriptive stats on a sample; usually written with Roman letters


descriptive stats on a population; usually unknown; usually written with Greek letters

distribution of means

distribution of means of samples of a given size from a population; comparison distribution when testing hypotheses involving a single sample of more than one individual

comparison distribution

distribution used in hypothesis testing. It represents the population situation if the null hypothesis is true. It is the distribution to which you compare the score based on your sample's results

standardized effect size

divide the raw score effect size for each study by its respective population standard deviation
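As a sketch of that division, the example below standardizes two raw effects measured on different scales. Both studies and their numbers are hypothetical, and `cohens_d` is an illustrative function name:

```python
def cohens_d(mean_treated, mean_comparison, pop_sd):
    """Standardized effect size: raw difference between means divided by the population SD."""
    return (mean_treated - mean_comparison) / pop_sd

# Hypothetical Study A: 6-point raw effect on a scale with SD 15.
d_a = cohens_d(106, 100, 15)
# Hypothetical Study B: 2-point raw effect on a scale with SD 4.
d_b = cohens_d(22, 20, 4)

print(f"Study A: raw effect 6, d = {d_a:.2f}")
print(f"Study B: raw effect 2, d = {d_b:.2f}")
```

Study B has the smaller raw effect but the larger standardized effect, which is why standardizing matters when studies use different measurement scales.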

two-tailed tests

divide the alpha between the two tails, requiring a more extreme score in each tail to find a significant result; this sets the limbo bar lower (further out in the tail), thereby reducing power

knowing result statistically significant

doesn't tell us how big the effect of our treatment is; we need to calculate the effect size

raw score effect size

effect size is given in terms of the raw score on the measure

type II error

failing to reject the null hypothesis when in fact it is false; failing to get a statistically significant result when in fact the research hypothesis is true

type II error

failing to reject the null hypothesis when it is in fact false; deciding our study does not support the research hypothesis when it does

make alpha larger

greater area for rejection

larger predicted effect size

greater power of your study

more stringent (.01)

greater risk for type II error

high power study

has a high probability of finding a significant result

low power study

has a low probability of finding a significant result

maximize your study's power

have a large sample; use a strong treatment; use a one-tailed test; set the highest alpha level you can (usually .05); if possible, decrease the comparison population's SD by measuring your variables more precisely

studies w treatments w large effect sizes

have higher power than studies of treatments with smaller effect sizes

one-tailed tests

have more power than two-tailed tests

upper confidence limit

highest possible population mean that would have 95% (or 99%) probability of including our sample mean

Z test

how to determine the likelihood that the mean of our sample could have occurred simply due to chance if the null hypothesis is true

one-tailed test

hypothesis testing procedure for a directional hypothesis; situation in which the region of the comparison distribution in which the null hypothesis would be rejected is all on the one side (tail) of the distribution

two-tailed test

hypothesis-testing procedure for a nondirectional hypothesis; the situation in which the region of the comparison distribution in which the null hypothesis would be rejected is divided between the two sides (tails) of the distribution

z test

hypothesis-testing procedure in which there is a single sample and the population variance is known

decision errors

incorrect conclusions in hypothesis testing in relation to the real situation, such as deciding the null hypothesis is false when it's really true

absence of evidence

is not evidence of absence

mean of distribution of means

is the same as the mean of the population of individuals

SE goes down

it becomes easier to get a significant Z score

make alpha smaller

less area for rejection

fewer people

less power

smaller your alpha

less power your study has

introduction to hypothesis testing

as the likelihood of a result occurring due to chance goes down, the result becomes more improbable and hence is significant

real effect, important

likely if there's a large effect size

real effect, unimportant

likely if there's a small effect size

lower confidence limit

lowest possible population mean that would have 95% (or 99%) probability of including our sample mean

large difference between means (power)

more likely a significant result

more stringent alpha

more likely you are to commit a type II error

more people

more power

bigger your alpha

more power your study has

bar too low (alpha too stringent)

no one can crawl under the bar = no one is counted as an athlete; a lot of misses (alpha is too stringent)


null hypothesis

fail to reject our null hypothesis

our results are inconclusive

reject null hypothesis

our results support the research hypothesis

chance diffs vs sig diffs

chance difference: our sample score may differ from the population simply due to chance (it's a fluke); real, true difference (significant difference): the difference between our sample score and our population mean must be relatively unlikely to be due to chance; only an inferential test can tell you this

cutoff sample score

point in hypothesis testing, on the comparison distribution, at which, if reached or exceeded by the sample score, you reject the null hypothesis; also called "critical value"


population 1


population 2

nondirectional hypothesis

predicted effect is in no specific direction; a difference between sample and population is predicted; the difference can be above or below the population mean; use a two-tailed test


prediction, often based on informal observation, previous research, or theory, that is tested in a research study

sig level

probability level set at the beginning of the study; only results that are more improbable than this level will be considered non-chance results


probability of making a Type II error

statistical power

probability that the study will give a significant result if the research hypothesis is true

hypothesis testing

procedure for deciding whether the outcome of a study (results of a sample) supports a particular theory or practical innovation (which is thought to apply to a population)

confidence interval

range of scores that is likely to include the true population mean; the range of possible population means from which it is not highly unlikely that you could have obtained your sample mean

type I error

rejecting the null hypothesis when in fact it is true; getting a statistically significant result when in fact the research hypothesis is not true


research hypothesis

directional hypothesis

research hypothesis predicting a particular direction of difference between populations

nondirectional hypothesis

research hypothesis that does not predict a particular direction of difference between the population like the sample studied and the population in general

standard error (SE)

same as standard deviation of a distribution of means; also called standard error of the mean (SEM)

two-tailed procedure

a sample result in either tail could reject the null

cutoff scores

score so extreme that, if you found it (or a score even more extreme) in your sample, you'd be pretty convinced that something is going on; a score that is sufficiently unlikely to occur if the null hypothesis were true


set of principles that attempt to explain one or more facts, relationships, or events; psychologists often derive specific predictions from theories that are then tested in research studies

treatment predicted to have large effect

should result in a large difference between your sample mean and the null hypothesis population mean


significance level; probability of making a type I error

decision errors

sometimes our decision to "reject the null" or "fail to reject the null" is wrong; this decision is based on probability

standard deviation of distribution of means

square root of the variance of a distribution of means; also called standard error (SE) or standard error of the mean (SEM)

effect size conventions

standard rules about what to consider a small, medium, and large effect size, based on what is typical in psychology research; also known as Cohen's conventions

Cohen's d

standardized effect size; a significant treatment is not necessarily a meaningful treatment; effect size tells you how meaningful your results may be; very useful when comparing the effectiveness of treatments from studies that use different measurement scales, because it tells you which treatment produces a bigger effect

effect size

standardized measure of difference (lack of overlap) between populations. Effect size increases w/ greater differences between means.

null hypothesis

statement about a relation between populations that is the opposite of the research hypothesis; statement that in the population there is no difference (or a difference opposite to that predicted) between populations; contrived statement set up to examine whether it can be rejected as part of hypothesis testing

research hypothesis

statement in hypothesis testing about the predicted relation between populations (often a prediction of a difference between population means) also called alternative hypothesis


statistical method for combining effect sizes from different studies

treatment predicted to produce weak effect

study will be less powerful: less able to detect that effect

treatment predicted to produce strong effect

study will be more powerful: better able to detect that effect

power tables

table for a hypothesis-testing procedure showing the statistical power of studies with various effect sizes & sample sizes

only calculate if

the effect is significant

more lenient (higher alpha is)

the less likely you are to commit a type II error

mean of a distribution of means

the mean of a distribution of means of samples of a given size from a population; it comes out to be the same as the mean of the population of individuals

complete opposites

the null and research hypothesis

directional hypothesis

the predicted effect is in one (pre-specified) direction; use a one-tailed test procedure -> a result in only one tail (not both) could reject the null


the probability that your study will find a significant result (when the research hypothesis is true); the likelihood that you will find a significant result when the effect is really there

lower the alpha

the smaller the chance of making a type I error

result not statistically significant

treatment may have no effect, or treatment may have an effect but you missed it (type II error)

don't know decision errors made

until a study is replicated (which does not happen a lot)

confidence limit

upper or lower value of a confidence interval

variance of distribution of means

variance of population of individuals divided by the number of people in the sample
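This rule, together with taking the square root to get the standard error, can be sketched as below. The population variance of 225 (SD 15) is a hypothetical value, and the function names are made up for illustration:

```python
from math import sqrt

def variance_of_means(pop_variance, n):
    """Variance of the distribution of means: population variance divided by sample size."""
    return pop_variance / n

def standard_error(pop_variance, n):
    """Standard deviation of the distribution of means (SE, also called SEM)."""
    return sqrt(variance_of_means(pop_variance, n))

# Hypothetical population variance of 225 (SD = 15), with growing sample sizes.
for n in (1, 4, 25, 100):
    print(f"N = {n}: variance of means = {variance_of_means(225, n)}, SE = {standard_error(225, n)}")
```

As N goes up the standard error goes down, which is why larger samples make it easier to reach a significant Z score.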

probability high

we assume it was likely just due to chance

probability low (less than 5% of the time)

we can conclude that the outcome was not due to chance; the outcome is significant

correct decision 2

we do not reject the null hypothesis when the null hypothesis is true

correct decision 1

we reject the null hypothesis (research hypothesis is supported) when the research hypothesis is true

Type I error (alpha)

we rejected the null hypothesis when it is in fact true

never say prove

we set up the null hypothesis to see if we can or cannot reject it

can never have type I error

when the drug actually works; it is impossible to make the error of saying a drug works if the drug actually does work

type I error

worse for the literature: pollutes the scientific database; findings get published that something works when it doesn't

type II error

worse for the researcher's career: delays the discovery of important findings; results inconclusive -> don't get grants or scholarships -> hampers their ability to increase their status

statistical significance & effect size

you need to know the likelihood that your result came about by chance (whether it's statistically significant) and the magnitude of your result (the effect size)

type II error

your treatment may have an effect but you missed it; possible reasons: your treatment was too small (leading to a small effect size), your sample was too small, or your alpha was too small

type I error

your treatment really didn't work but you think it did; your sample just behaved really weirdly

