Statistics 2
Cohen's d for independent measures t test
(M1-M2)/sqrt(Sp^2)
cohen's d
(Mtreatment - μno treatment)/σ
sampling distribution
a distribution of statistics obtained from samples of size n chosen from a population
what is the law of large numbers
as n of sample increases, M approaches μ
describe the shape of the distribution of sampling means
as n of sample increases, the sampling distribution of M approaches normal
as sample size increases, the standard error of M
decreases
explain z score as a ratio
it's the ratio of the actual difference between the sample mean and the hypothesized population mean to the difference expected on average between a sample mean and the population mean (the standard error of M)
contrast between a sampling distribution of means using a large versus small n
samples with larger n should have M closer to μ, and therefore the distribution of sample means should be taller and tighter. smaller samples have M farther from μ, and therefore will have a wider, shorter distribution of sample means
what does the standard error of M depend on
n and σ (the sample size and the population standard deviation)
what is the symbol for the standard deviation of the distribution of sampling means
σM
formula for standard deviation of the distribution of sampling means, 2 forms
σM = σ/sqrt(n) = sqrt(σ^2 / n)
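A minimal sketch of the two forms of the formula, written in Python. The function names are made up for illustration; the numbers (σ=100, n=25) are the ones used on a later card.

```python
import math

def standard_error(sigma, n):
    """Standard error of M from the population standard deviation: σ / sqrt(n)."""
    return sigma / math.sqrt(n)

def standard_error_from_variance(variance, n):
    """Same quantity from the population variance: sqrt(σ^2 / n)."""
    return math.sqrt(variance / n)

print(standard_error(100, 25))                 # 20.0
print(standard_error_from_variance(10000, 25)) # 20.0
print(standard_error(100, 1))                  # 100.0, i.e. σM = σ when n = 1
```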
what is a way to measure sampling error
we can calculate the standard error of M, which tells us how far, on average, a sample mean is expected to be from the population mean
when is σM=σ true
when n=1
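estimated standard error for independent measures t test measures what?
how much difference is reasonable to expect between the difference of the 2 sample means (M1-M2) and the difference of the 2 population means (μ1-μ2). if Ho is true (μ1-μ2 = 0), it is how much difference is reasonable to expect between the 2 sample means by chance alone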
the standard error of M provides a measure of
how much distance is expected on average between a sample mean and the population mean
explain the importance of variability of difference scores for the repeated measures t test
if the difference scores are highly variable then s will increase, which will make the estimated standard error sMD increase, which will make t decrease, which will reduce the likelihood of rejecting Ho
what is the standard error of M
it is the standard deviation of the sampling distribution of M
problem with z score hypothesis test. solution?
it requires σ, which may not be known - estimate the standard error of M using the sample standard deviation
null versus alternative hypothesis
the null hypothesis (Ho) says μ didn't change as a result of treatment; the alternative hypothesis says μ did change
power
probability that a statistical test will reject a false null hypothesis
type 1 error
rejecting a true null hypothesis, i.e. the treatment didn't have an effect, but you still reject Ho just b/c you happened to get an extreme sample by chance
disadvantage of repeated measures, and solution
time and order effects - counterbalance the order of treatments (i.e. some people get treatment 1 then 2, others get 2 then 1)
compare these 2 questions: if μ=500 and σ=100, find probability of obtaining a score of 540 or greater? now find the probability of obtaining a sample n=25 with M of 540 or greater?
- first one: 540-500=40, z=40/100=.4, therefore the score is .4 standard deviations above the mean. to find the probability, look up z=.4 in the chart (about .3446 in the tail) - second one: first find σM. σM=100/sqrt(25)=100/5=20, 540-500=40, z=40/20=2, therefore the sample mean is 2 standard deviations above the population mean (i.e. the mean of the sampling distribution of M). use the chart to find the probability (about .0228 in the tail)
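The same comparison can be checked without a chart. This is only a sketch, assuming the scores are normally distributed; it uses `statistics.NormalDist` from the Python standard library (3.8+) in place of the unit normal table.

```python
from statistics import NormalDist

mu, sigma, n = 500, 100, 25
unit_normal = NormalDist()  # mean 0, standard deviation 1

# Question 1: a single score of 540 or greater
z_score = (540 - mu) / sigma              # 0.4
p_score = 1 - unit_normal.cdf(z_score)

# Question 2: a sample mean of 540 or greater with n = 25
sigma_m = sigma / n ** 0.5                # 20.0
z_mean = (540 - mu) / sigma_m             # 2.0
p_mean = 1 - unit_normal.cdf(z_mean)

print(round(p_score, 4))  # 0.3446
print(round(p_mean, 4))   # 0.0228
```

The same 40-point difference is ordinary for a single score but rare for a mean of 25 scores, because σM is much smaller than σ.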
explain cohen's d
- it is the ratio of the difference between [the sample mean after treatment and the untreated population mean] to the standard deviation of the untreated population - therefore it allows us to evaluate how much the treatment changed the variable while taking into account the natural variability of that variable in the untreated population
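A small sketch of that ratio. The function name and the example numbers (treated M=540, μ=500, σ=100) are chosen only for illustration.

```python
def cohens_d(treated_mean, untreated_mu, sigma):
    """Cohen's d: mean difference measured in units of the untreated standard deviation."""
    return (treated_mean - untreated_mu) / sigma

d = cohens_d(540, 500, 100)
print(d)  # 0.4, i.e. the treatment moved the mean by 0.4 standard deviations
```

Note that n does not appear anywhere: unlike z, the effect size does not grow just because the sample is larger.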
what happens to the sampling distribution of M if n=1
- this means each sample is one score itself, which means that the sampling distribution of M will be identical to the distribution of scores. - this means that σM=σ
advantages of repeated measures over independent measures t test, 4
1) n can be smaller 2) can be used for time dependent studies like learning 3) the problem of individual differences accounting for differences between means is eliminated 4) by using difference scores, repeated measures also removes the individual differences that would otherwise lead to higher variance for each sample. higher variance leads to a smaller t statistic, which reduces the chance of rejecting Ho
factors that affect power, 4 things
1) power increases with increasing effect size. increasing the effect size shifts the TREATED distribution of sample means farther away from the UNTREATED distribution of sample means. therefore, as effect size increases, it is more and more likely to reject the null hypothesis. 2) power increases with increasing n. as n decreases, σM increases and the distribution of sample means gets more spread out. this means for a given treatment effect, as n decreases, the treated and untreated distributions overlap more and more, meaning that it is more likely to obtain a TREATED sample mean that is not within the critical region defined in the UNTREATED sampling distribution of M 3) power increases with larger alpha. remember that increasing the alpha value increases the size of the critical region and therefore makes it more likely that your sample will be in the critical region and therefore makes it more likely that the null hypothesis will be rejected 4) changing from a 2 tailed to a 1 tailed test increases power (when the effect is in the predicted direction) b/c this increases the size of the critical region on that side
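The four factors can be seen numerically with a rough power calculation for a z test. This is a sketch under assumptions: σ is known, the sampling distribution of M is normal, and for the 1 tailed case the true effect is in the predicted (positive) direction. The function name and example numbers are made up.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Probability that a z test rejects Ho when the true mean shift is `effect`."""
    unit_normal = NormalDist()
    sigma_m = sigma / math.sqrt(n)
    shift = effect / sigma_m  # how far the TREATED distribution sits from the UNTREATED one, in σM units
    if two_tailed:
        z_crit = unit_normal.inv_cdf(1 - alpha / 2)
        upper = 1 - unit_normal.cdf(z_crit - shift)   # treated means landing in the upper critical region
        lower = unit_normal.cdf(-z_crit - shift)      # treated means landing in the lower critical region
        return upper + lower
    z_crit = unit_normal.inv_cdf(1 - alpha)
    return 1 - unit_normal.cdf(z_crit - shift)

base = z_test_power(effect=20, sigma=100, n=25)
print(base < z_test_power(effect=40, sigma=100, n=25))                    # True: bigger effect
print(base < z_test_power(effect=20, sigma=100, n=100))                   # True: bigger n
print(z_test_power(effect=20, sigma=100, n=25, alpha=0.01) < base)        # True: smaller alpha, less power
print(base < z_test_power(effect=20, sigma=100, n=25, two_tailed=False))  # True: 1 tailed
```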
2 things that can influence outcome of hypothesis test
1) standard deviation 2) n
steps of hypothesis test
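1) state the hypotheses (Ho and the alternative) 2) select alpha level 3) find the critical z score that marks the critical region 4) calculate σM 5) calculate the actual z score for the sample mean 6) compare the calculated z score to the critical z score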
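The steps above can be sketched as a short function. This assumes a 2 tailed test with known σ; the function name and the example values (M=540, μ=500, σ=100, n=25) are illustrative only.

```python
import math
from statistics import NormalDist

def z_hypothesis_test(sample_mean, mu_null, sigma, n, alpha=0.05):
    """2 tailed z test. Step 1 (Ho: μ = mu_null) and step 2 (alpha) come in as arguments."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # step 3: boundary of the critical region
    sigma_m = sigma / math.sqrt(n)                    # step 4: standard error of M
    z = (sample_mean - mu_null) / sigma_m             # step 5: z score for the sample mean
    reject = abs(z) > z_critical                      # step 6: compare the two z scores
    return z, z_critical, reject

z, z_critical, reject = z_hypothesis_test(540, 500, 100, 25)
print(z, round(z_critical, 2), reject)  # 2.0 1.96 True
```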
2 general purposes of standard error of M
1) tell us about the variability of the distribution of sample means themselves 2) allow us to evaluate a single sample mean and see how likely it is to obtain that sample mean
2 general purposes of the standard deviation
1) tell us about the variability of the distribution of scores themselves 2) allows us to evaluate a single score and see how likely it is to obtain that score
pooled variance
Sp^2 = (SS1+SS2)/(df1+df2)
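A minimal sketch of the pooled variance formula. The helper names and the two tiny samples are hypothetical, picked so the arithmetic is easy to follow by hand.

```python
def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def pooled_variance(sample1, sample2):
    """Sp^2 = (SS1 + SS2) / (df1 + df2), with df = n - 1 for each sample."""
    ss1, ss2 = sum_of_squares(sample1), sum_of_squares(sample2)
    df1, df2 = len(sample1) - 1, len(sample2) - 1
    return (ss1 + ss2) / (df1 + df2)

# SS1 = 10 with df1 = 4, SS2 = 8 with df2 = 2
print(pooled_variance([1, 2, 3, 4, 5], [2, 4, 6]))  # 3.0
```

The two sample variances here are 2.5 and 4.0; the pooled value of 3.0 sits closer to 2.5 because the larger sample carries more weight.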
how do changes in standard deviation of population affect the outcome of a hypothesis test
as the variability of the population from which the sample is obtained increases, the sampling distribution of M becomes more spread out. this means the standard error increases as population variability increases, so as σ increases, σM increases. as σM increases, the calculated z score decreases, since z=(M-μ)/σM. this reduces the chance of rejecting Ho
homogeneity of variances
assumption that the 2 populations for an independent groups t test have equal variances
give simple explanation of independent measures t statistic
basically it's giving us the difference between the 2 sample means as a multiple of how much naturally occurring difference would be reasonable to expect between the 2 sample means if Ho was true
what 3 features are important for the sampling distribution of M
central tendency, variability, shape
equation for df of independent measures t test
df = n1 + n2 - 2
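Putting the independent measures pieces together, here is a sketch that computes the pooled variance, the estimated standard error, t, df, and Cohen's d. It assumes Ho: μ1-μ2 = 0 and uses the standard estimated standard error sqrt(Sp^2/n1 + Sp^2/n2), which is not written out on these cards. The function name and sample data are hypothetical.

```python
import math

def independent_t(sample1, sample2):
    """Independent measures t test of Ho: μ1 - μ2 = 0. Returns (t, df, cohens_d)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2                                       # same as df1 + df2
    sp2 = (ss1 + ss2) / df                                 # pooled variance
    std_error = math.sqrt(sp2 / n1 + sp2 / n2)             # estimated standard error of (M1 - M2)
    t = (m1 - m2) / std_error
    d = (m1 - m2) / math.sqrt(sp2)                         # Cohen's d for independent measures
    return t, df, d

t, df, d = independent_t([1, 2, 3, 4, 5], [2, 4, 6])
print(df)           # 6
print(round(t, 3))  # -0.791
print(round(d, 3))  # -0.577
```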
probability of type 1 error
equal to the alpha value because this is the probability of obtaining a sample with its mean in the critical region when Ho is true
difference between estimated standard error and real standard error
estimated uses the standard deviation from the sample in the numerator in place of the standard deviation of the population
why does t distribution vary more than z
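every sample has its own z and t statistic. however, for z all samples share the same standard error in the denominator (it comes from σ), so only M varies from sample to sample, whereas for t every sample has a unique estimated standard error that is calculated using the data from that sample, so both the numerator and the denominator vary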
type 2 error
failing to reject a false null hypothesis, e.g. b/c the treatment effect is too small to be detected
purpose of pooled variance
gives a weighted average of the 2 sample variances, with each weighted by its df so the larger sample counts more. this allows us to calculate an accurate estimated standard error
formula for estimated standard error of M, 2 ways
sM = s/sqrt(n) = sqrt(s^2 / n)
difference between t and z statistic
t statistic uses the estimated standard error of M in the denominator in place of the actual standard error
t statistic equation
t = (M-μ)/sM
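A sketch of the single sample t statistic, using the sample standard deviation (n-1 in the denominator) to estimate the standard error. The function name and the scores are made up; df = n-1 is the usual value for this test and is not stated on these cards.

```python
import math
import statistics

def one_sample_t(scores, mu):
    """t = (M - μ) / sM, where sM = s / sqrt(n). Returns (t, df)."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)      # sample standard deviation, divides SS by n - 1
    s_m = s / math.sqrt(n)            # estimated standard error of M
    return (m - mu) / s_m, n - 1

t, df = one_sample_t([2, 4, 6, 8], mu=3)
print(round(t, 3), df)  # 1.549 3
```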
explain F max test
test used to see if the homogeneity of variance assumption is reasonable. basically just divide the larger sample variance by the smaller sample variance to get the calculated F max value. then choose either alpha = .05 or .01 and look up the critical F max value in the chart. if the calculated value is smaller than the critical value, then we fail to reject the null hypothesis that the variances are equal, i.e. the homogeneity assumption holds
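A sketch of the calculation side of the F max test. The critical value still has to come from the F max chart, so it is passed in as an argument here rather than computed; the function names and sample data are hypothetical.

```python
import statistics

def f_max(*samples):
    """Largest sample variance divided by the smallest sample variance."""
    variances = [statistics.variance(s) for s in samples]
    return max(variances) / min(variances)

def homogeneity_is_reasonable(samples, critical_value):
    """True if the calculated F max is below the critical value looked up in the chart."""
    return f_max(*samples) < critical_value

group1 = [1, 2, 3, 4, 5]    # sample variance 2.5
group2 = [2, 4, 6, 8, 10]   # sample variance 10
print(f_max(group1, group2))  # 4.0
```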
what does p<.05 in z=3.00, p<.05 mean
that the probability of obtaining a sample mean with a z score at least that extreme IF the treatment had NO effect is less than 5%
distribution of sample means
the distribution of the means of all possible random samples of size n chosen from a population
explain the expected value of M
the mean of the distribution of sample means is exactly equal to the mean of the population
describe the central tendency of the sampling distribution of M
the mean of the sampling distribution of M is exactly equal to the mean of the population
in a repeated measures t test, what is the relationship between n and D
the sample size is n. each individual has 1 difference score, so there are n D scores
equation for z score for sampling means distribution
z= (M-μ)/σM
symbol for population mean of differences for repeated measures t test
μD
what is the null hypothesis for repeated measures t test
μD = 0
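A sketch of the repeated measures t test under that null hypothesis. It uses the usual formulas t = MD / sMD with sMD = s/sqrt(n) and df = n-1, computed on the difference scores. The function name and the before/after scores are hypothetical.

```python
import math
import statistics

def repeated_measures_t(before, after):
    """t test of Ho: μD = 0 using one difference score D per individual. Returns (t, df)."""
    differences = [a - b for a, b in zip(after, before)]   # n individuals give n D scores
    n = len(differences)
    m_d = statistics.mean(differences)
    s = statistics.stdev(differences)    # variability of the difference scores
    s_md = s / math.sqrt(n)              # estimated standard error of MD
    return (m_d - 0) / s_md, n - 1

t, df = repeated_measures_t(before=[10, 12, 9, 11], after=[12, 15, 10, 14])
print(round(t, 2), df)  # 4.7 3
```

Because only the D scores enter the calculation, how different the individuals are from each other at baseline does not inflate s.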
how do changes in n affect the outcome of the hypothesis test
σM decreases as n increases. as σM decreases, the calculated z score increases, so the chance of rejecting Ho increases