Psych Stats Exam 2 Hilmire at William and Mary
independent samples t-test
Tests whether it is reasonable to believe that two sample means from independent samples came from populations with the same mean. Data come from a between-subjects design. Use a between-subjects design if you are concerned about order effects or carryover effects.
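A minimal sketch of this test in Python; the scores and group names below are hypothetical, not from the course:

```python
import numpy as np
from scipy import stats

# Hypothetical scores from two independent groups (between-subjects design)
group1 = np.array([12, 15, 11, 14, 13, 16, 12, 15])
group2 = np.array([10, 9, 12, 8, 11, 10, 9, 13])

# Independent-samples t-test (pools the variances by default)
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```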
Related Samples T test
Tests whether it is reasonable to believe that two sample means from related samples came from populations with the same mean (equivalently, that the population mean of the difference scores, µD, is 0). Data come from a within-participants design; because the measures are not independent, the test uses difference scores.
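A minimal sketch in Python showing that the paired test is equivalent to a one-sample test on the difference scores; the before/after data are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical before/after scores for the same participants
before = np.array([20, 18, 24, 22, 19, 25])
after = np.array([23, 19, 27, 24, 22, 26])

# Because the measures are not independent, work with difference scores
diff = after - before

# Two equivalent ways to run the test:
t_stat, p_value = stats.ttest_rel(after, before)  # paired (related-samples) t-test
t_alt, p_alt = stats.ttest_1samp(diff, 0)         # one-sample test of D against mu_D = 0
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")      # both calls give the same result
```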
Power
The probability of correctly rejecting the null hypothesis when it is false; power = 1 - β. Not to be confused with the probability of a Type I error (α), the a priori level of significance and the criterion for rejecting H0.
A Priori Power Analysis
You are planning a study and want to determine the power you would achieve with a given sample size if the group differences and the variability in the dependent measure were similar to your predictions.
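One way to do this is by simulation: generate many fake datasets under your predicted effect and variability, and count how often the test rejects. A minimal sketch, with all numbers hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical planning assumptions: a 5-point group difference,
# SD of 10 in the dependent measure, n = 30 per group
mu1, mu2, sd, n, alpha, n_sims = 55, 50, 10, 30, .05, 5000

rejections = 0
for _ in range(n_sims):
    g1 = rng.normal(mu1, sd, n)
    g2 = rng.normal(mu2, sd, n)
    _, p = stats.ttest_ind(g1, g2)
    rejections += (p < alpha)

# Power is estimated as the proportion of simulated studies that reject H0
print(f"Estimated power: {rejections / n_sims:.2f}")
```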
Assumptions for z-test
• Random sample • Independent observations • Normality • Population SD known and unchanged by treatment
Assumptions of t-test
• Random sample • Independent observations • Normality • Population SD unknown and unchanged by treatment
Under what circumstances is a t statistic used instead of a z-statistic for a hypothesis test?
The t-statistic is used when the population variance is not known and instead must be estimated from the sample variance.
Mean of sampling distribution
Equal to the population mean, µ (for the sampling distribution of the mean)
Post-hoc power analysis
In this case, you have already analyzed your data. Typically, there is an effect of interest that was NOT statistically significant, and you ask whether the failure to reject the null was due not to a lack of population group differences, but rather to a lack of power.
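A sketch of a post-hoc power calculation, assuming statsmodels is available; the observed effect size and sample size below are hypothetical:

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical observed values from an already-run study:
# Cohen's d of 0.30 with 20 participants per group
d, n_per_group, alpha = 0.30, 20, .05

achieved = TTestIndPower().power(effect_size=d, nobs1=n_per_group,
                                 alpha=alpha, ratio=1.0)
print(f"Post-hoc power: {achieved:.2f}")  # likely well below the usual .80 target
```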
Type II error
Failing to reject the null hypothesis when the null hypothesis is false; its probability is β (beta)
Sample space (S)
set of all possible elementary events that may occur in a simple experiment
Event
set of elementary events that share a predefined characteristic
for z-statistics, if looking for a single score
use the standard deviation: z = (X - µ)/σ
for z-statistics, if looking for a sample mean
use the standard error (σ/√N): z = (M - µ)/(σ/√N)
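A small worked example of both formulas; the population values, score, and sample size are hypothetical:

```python
import math

mu, sigma = 100, 15     # hypothetical population mean and SD
x, m, n = 110, 105, 25  # a single score, a sample mean, and the sample size

z_score = (x - mu) / sigma                   # single score: use the SD
z_mean = (m - mu) / (sigma / math.sqrt(n))   # sample mean: use the standard error
print(z_score, z_mean)                       # ~0.67 vs. ~1.67
```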
assumptions of independent samples t-test
• Homogeneity of population variances (not robust) • Normality (robust) • Independent observations
Sampling distribution
relative frequency distribution of the values that the statistic may take if you could repeat the sampling process an infinite number of times with replacement
How Type I error rate affects power
As the Type I error rate is increased, the critical value decreases. Thus, it becomes easier to obtain a statistic more extreme than the critical value and reject the null. However, your colleagues typically won't let you increase the Type I error rate beyond .05.
Explain the difference between a matched-subjects design and a repeated-measures design.
Computationally, matched-subjects and repeated-measures designs are the same: both are analyzed as related samples. They differ methodologically. How? Repeated measures refers to an experiment where each participant has two scores (they experience both conditions of the independent variable). In contrast, matched subjects refers to an experiment where two different participants are matched with regard to their individual differences, and each member of the pair experiences one condition.
State the hypotheses for related samples t test
H0: µD = 0 HA: µD ≠ 0
What is measured by standard error
The estimated standard error gives an estimate of the standard deviation of the sampling distribution of the mean (s/√n). In other words, it estimates the variability of the means in the sampling distribution of the mean.
Describe the basic characteristics of an independent-measures, or a between-subjects, research study.
The most basic characteristic of a between-subjects research study is the comparison of two independent sample means to determine if they came from the same population. Each participant participates in only one level of the independent variable (in contrast to within-groups designs, where each participant participates in all levels of the independent variable). We use an independent-samples t-test to analyze this type of data.
Explain what is measured by the sample standard deviation.
The sample standard deviation gives the best estimate available of the population standard deviation, σ. In other words, it estimates the variability of the scores in the population.
Explain how the size of the sample standard deviation influences the likelihood of finding a significant mean difference.
You can see that as the standard deviation increases, so does the standard error of the sampling distribution. When this happens, it results in a larger denominator in the t-statistic formula, and thus a smaller t-statistic. This will make it less likely to find significant differences in your studies. In other words, when the variability of the scores increases, power decreases.
Cohen's d
d = .2 is small, .5 is medium, .8 is large. For related samples, use sqrt(SS/df) of the difference scores; for independent samples, use s(pooled).
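A sketch of both computations in Python; all scores are hypothetical:

```python
import numpy as np

# Related samples: d uses the SD of the difference scores, sqrt(SS/df)
diff = np.array([3, 1, 3, 2, 3, 1])          # hypothetical difference scores
ss = np.sum((diff - diff.mean()) ** 2)
s_diff = np.sqrt(ss / (len(diff) - 1))
d_related = diff.mean() / s_diff

# Independent samples: d uses the pooled standard deviation
g1 = np.array([12, 15, 11, 14, 13, 16])
g2 = np.array([10, 9, 12, 8, 11, 10])
ss1 = np.sum((g1 - g1.mean()) ** 2)
ss2 = np.sum((g2 - g2.mean()) ** 2)
s_pooled = np.sqrt((ss1 + ss2) / (len(g1) + len(g2) - 2))
d_independent = (g1.mean() - g2.mean()) / s_pooled

print(f"related d = {d_related:.2f}, independent d = {d_independent:.2f}")
```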
Three Characteristics of "an Effect"
1. Its statistical significance: Was the mean in Group 1 statistically different from the mean for Group 2 at the desired Type I error rate?
2. Its magnitude: What is the size of the mean difference in the context of the variability in the dependent measure? Effect size estimates like Cohen's d attempt to get at this. Note that most effects that are "large" in magnitude are also statistically significant. However, "small" effects may be statistically significant or not, depending on whether your sample size is large enough to give you the power necessary to declare them so.
3. Its meaningfulness: A new drug may cure only 1 additional person out of 1000 relative to the standard treatment, but what if you are that one person? Statistically significant effects that are small may still be meaningful to someone; maybe not the researcher, but to someone nonetheless.
"The Sampling Distribution Shuffle"
1. Take a sample of size n from the population.
2. Calculate the statistic in question and write it down.
3. Replace the elements of your sample into the population.
4. Go back to step 1 and repeat an infinite number of times.
5. The relative frequency distribution of the values that you wrote down constitutes the sampling distribution for that statistic with sample size n.
Note that the sampling distribution will change as n changes, so it is important to specify the value of n used to create the sampling distribution.
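A simulated version of the shuffle in Python: infinity is replaced with 10,000 repetitions, and the population itself is a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(100, 15, 100_000)  # hypothetical stand-in population
n = 25                                     # the shuffle is specific to this n

# Steps 1-4: sample with replacement many times and record the statistic
sample_means = [rng.choice(population, size=n, replace=True).mean()
                for _ in range(10_000)]

# Step 5: the distribution of these values approximates the sampling
# distribution of the mean for samples of size n
print(np.mean(sample_means))                   # close to the population mean
print(np.std(sample_means), 15 / np.sqrt(n))   # close to sigma / sqrt(n)
```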
Explain how increasing alpha increases statistical power.
Alpha (α) is the probability of a Type I error (i.e., the probability of rejecting the null when the null is true). When we increase alpha, we decrease our critical t-value. Thus, we have made it easier to find an obtained t-value more extreme than the critical value. In other words, we have increased our probability of rejecting the null. When the null is false and there truly is an effect, the probability of detecting that effect is called power. Thus, by increasing alpha, we increase the probability of detecting an effect that truly exists which means we have increased power.
How Sample size (degrees of freedom) affects power.
As sample size increases, the standard error of the mean decreases. Because standard error is in the denominator of the obtained statistic, as the standard error decreases, the obtained statistic increases making the obtained statistic more likely to be more extreme than the critical statistic. Another way to think about this is that as sample size increases, the sampling distribution of the mean becomes less variable and this decreases the overlap between the sampling distribution of the mean under the null and the sampling distribution of the mean under the alternative. Overlap can result in Type II error, so decreasing overlap decreases Type II error. Because power = 1 - Type II error, power increases as Type II error decreases. Another way to think about this is that as sample size increases, degrees of freedom increase. With more degrees of freedom, the t-distribution approaches the z-distribution. In other words, the critical statistic gets smaller as the degrees of freedom get larger, making it more likely that the obtained statistic is more extreme than the critical statistic. We have the most control over sample size relative to most other factors so this is where we can gain power most easily.
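The degrees-of-freedom part of this is easy to see numerically; in the sketch below, the two-tailed critical t-values shrink toward the z critical value as df grows:

```python
from scipy import stats

# Two-tailed critical values at alpha = .05: t approaches z as df grows
for df in (5, 10, 30, 100, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))
print("z:", round(stats.norm.ppf(0.975), 3))  # 1.96
```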
How the Magnitude of the mean difference affects power
As the magnitude of the mean differences increases, the overlap between the sampling distribution of the mean under the null and the sampling distribution of the mean under the alternative decreases. Overlap can result in Type II error, so decreasing overlap decreases Type II error. Because power = 1 - Type II error, power increases as Type II error decreases. Another way to think about this is that the mean difference is in the numerator of the obtained statistic, so as the mean difference gets larger so does the obtained statistic which makes it more likely to be more extreme than the critical statistic. We can increase the magnitude of the mean difference by maximizing our treatments. However, this is not always practical.
how variance affects power
As variance and standard deviation (i.e., variability of the scores) decrease, the standard error decreases. Because standard error is in the denominator of the obtained statistic, as the standard error decreases, the obtained statistic increases making the obtained statistic more likely to be more extreme than the critical statistic. Another way to think about this is that as the variability of the scores decreases (i.e., variance decreases), the sampling distribution of the mean becomes less variable and this decreases the overlap between the sampling distribution of the mean under the null and the sampling distribution of the mean under the alternative. Overlap can result in Type II error, so decreasing overlap decreases Type II error. Because power = 1 - Type II error, power increases as Type II error decreases. We can decrease the variability of scores by decreasing individual differences through a related-samples (related, matched, repeated) approach or by decreasing error in our measurements (improving our operational definition).
Explain how the size of the sample influences the likelihood of finding a significant mean difference.
By increasing our sample size, we increase our degrees of freedom. This gives us a less extreme critical value because the t-distribution becomes more similar to the z-distribution as degrees of freedom approach infinity. In addition, increasing sample size reduces the standard error. Both of these factors lead to more statistical power to detect significant differences between the means. Large sample sizes are always desirable for this reason.
how does Increasing the number of scores in each sample influence the value of the independent-measures t statistic and the likelihood of rejecting the null hypothesis:
By increasing the number of scores in each sample, we are increasing N in our samples. As N increases, our standard error of the mean differences becomes smaller and gives us more power to detect significant differences (meaning we have a larger t statistic). In addition, increasing the sample size increases the degrees of freedom. This gives a less extreme critical value because the t-distribution becomes more like the z-distribution as sample size increases. This also increases the power of the test.
Describe the homogeneity of variance assumption and explain why it is important for the independent-measures t test.
Homogeneity of variance is the assumption that the variances of the populations that the two samples were drawn from are equal. In other words, your treatment or experimental manipulation may influence the population mean, but it does not influence the variability (variance) of the scores. This assumption allows us to pool the variance estimates from the two samples when calculating the t-statistic. If this assumption is violated, it can artificially inflate your t-statistic depending on the circumstances, and this would increase the probability of making a Type I error beyond your alpha level.
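A sketch of how this can be handled in practice with scipy: Levene's test can check the assumption, and Welch's t-test (which does not pool the variances) is a common fallback. The data are hypothetical:

```python
import numpy as np
from scipy import stats

g1 = np.array([12, 15, 11, 14, 13, 16, 12, 15])
g2 = np.array([4, 22, 1, 19, 7, 25, 2, 20])  # hypothetical, much more variable group

# Levene's test: a significant result suggests unequal population variances
print(stats.levene(g1, g2))

# If homogeneity is doubtful, Welch's t-test avoids pooling the variances
print(stats.ttest_ind(g1, g2, equal_var=False))
```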
How does Increasing the variance for each sample influence the value of the independent-measures t statistic and the likelihood of rejecting the null hypothesis:
If we increase the variance but keep everything else constant, we will increase our standard error, and that will make our t-statistic smaller. In other words, increasing the variability of the scores reduces the power of the test. If you want to see this in practice, go back to question 5 and use a variance of 200 instead of 64 and see what happens to your t-statistic.
How one-tailed vs. two-tailed tests affect power
By putting all of our Type I error into one tail, we decrease our critical statistic and make it more likely to find a more extreme obtained statistic. Thus, we can increase our power using a one-tailed test compared to a two-tailed test. However, we must choose one- or two-tailed before we see our data, and a one-tailed test only makes sense if we don't at all care about results in the other direction, which is very rarely true. Thus, two-tailed tests are the default.
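A quick numerical illustration of the smaller one-tailed critical value; the df and alpha are chosen arbitrarily:

```python
from scipy import stats

df, alpha = 20, .05
one_tailed = stats.t.ppf(1 - alpha, df)      # all alpha in one tail
two_tailed = stats.t.ppf(1 - alpha / 2, df)  # alpha split across both tails
print(one_tailed, two_tailed)                # ~1.725 vs. ~2.086
```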
Type I error
Rejecting the null hypothesis when the null is true; its probability is α (alpha)
Define "sampling error" as it relates to sampling distributions.
Sampling error is the variability in a statistic from one sample to another. It can also be thought of as the difference between a sample statistic and the population parameter. Most sample statistics will be near the population parameter, but due to random variability, not all sample statistics will be identical. The frequency distribution of these statistics all drawn from the same population with the same sample size is called a sampling distribution.
How does statistical power relate to Type II errors?
The probability of committing a Type II error is β and power is 1-β. In both cases, the truth is that there is a difference in the population means. In other words, there truly is an effect in the population. Power is the probability of detecting the effect. Type II error is the probability of failing to detect the effect. As power (1- β) goes up, then Type II error (β) must go down. In other words, we can reduce our Type II error rate by increasing our power.
Describe what is measured by the estimated standard error in the bottom of the independent-measures t-statistic.
The standard error is an estimate of the standard deviation of the sampling distribution of the mean. In the case of an independent-samples t-test, it is the estimate of the standard deviation of the sampling distribution of the difference between means. This standard error allows us to test whether our obtained sample mean difference is significantly different from what we would expect under the null hypothesis, because it captures the variability of mean differences due to sampling error.
df (independent samples t test)
df = n1 + n2 - 2
Disadvantages of Related Samples
participant attrition, order effects (practice, fatigue)
Advantages of Related Samples
Removes individual variability from the data; typically needs fewer participants to have the same ability to reject the null hypothesis