T-Test, Chi Square
Null Hypotheses for T-tests
-The null hypothesis for a two-sample t-test is that the 2 group means are the same (the dependent and independent variables are not related) -Formally stated: H0: M1 = M2
Mann-Whitney U test
-Tests the null hypothesis that 2 population distributions are identical (non-parametric version of the independent groups t-test) -For sample sizes >20, the normal distribution (z-scores) can be used -When displaying outcomes in a table, median values of the dependent variable for the 2 groups are often shown
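As a rough illustration of how the U statistic is built from ranks, here is a minimal sketch. The function name and the sample scores are hypothetical, and ties are handled with average ranks, which is a common convention rather than something stated in these notes.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups, using average ranks for ties."""
    combined = sorted(list(group_a) + list(group_b))
    # Assign each distinct value the mean of the rank positions it occupies (ranks start at 1).
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[x] for x in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical ordinal scores for two unrelated groups.
u_a, u_b = mann_whitney_u([3, 5, 8, 9], [1, 2, 4, 6])
u_statistic = min(u_a, u_b)  # the smaller U is usually compared with the tabled value
```

With these scores the rank sum for the first group is 23, so U_a = 13 and U_b = 3.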
Selection bias
Are groups being compared really comparable? Ex: experimentals vs controls?
2 types of Two-Sampled T-tests
-1st type: 2 different groups of subjects have been tested (Independent Samples T-test) -2nd type: Same subjects have been tested twice (Dependent Samples T-test)
Power Analysis
-Cramér's V can be used to estimate the needed sample size for 2x3 tables or larger 1) Need an estimate of V, the desired power (usually 0.80), and alpha (usually 0.05) 2) Consult a table to get an estimate of the needed sample size for crosstab tables of specific dimensions *For a 2x2 design, the needed sample size is most often obtained using estimates of the proportions in the 2 groups
Focus on non-directional testing
-If we give the hypothesis a direction, we are restricting ourselves to a one-tailed test -Non-directional = 2 tails
2 non-parametric hypothesis tests using chi-square
-Test of goodness-of-fit -Test for independence
Separate Variance formula
-Used if Levene's F test indicates statistically significant differences in the variances -separate variance formula should be used to estimate SEd in computing the t-statistic
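A minimal sketch of the separate variance calculation, for illustration only. The function name and data are hypothetical. The adjusted degrees of freedom use the Welch-Satterthwaite approximation, which is the usual companion to this formula but is an assumption here, since the notes do not mention it.

```python
import math
from statistics import mean, variance


def separate_variance_t(x, y):
    """t-statistic using separate (unpooled) variance estimates for SEd."""
    v1 = variance(x) / len(x)  # squared standard error of mean 1
    v2 = variance(y) / len(y)  # squared standard error of mean 2
    se_d = math.sqrt(v1 + v2)  # standard error of the difference
    t = (mean(x) - mean(y)) / se_d
    # Welch-Satterthwaite degrees of freedom (assumption, not from the notes).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x) - 1) + v2 ** 2 / (len(y) - 1))
    return t, df


# Hypothetical scores with visibly unequal spread.
t, df = separate_variance_t([10, 12, 14, 16, 18], [11, 11, 12, 12, 14])
```

Here the sample variances are 10 and 1.5, so SEd is the square root of 2.3 and t is about 1.32 with roughly 5.2 degrees of freedom.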
Selection of non-parametric test
Depends on: -number of groups being compared (2 or 3+) -Type of comparison being made (independent vs dependent) -Level of variable (Ordinal must be ranked, Chi Square does not rank...)
Significance does not determine meaningfulness
Ex: If the means are the same for section C (class at midday) and section D (class at night), the difference is not statistically significant, BUT the result is still meaningful: it suggests that the time the class is offered has no effect on grades or learning capacities
Chi Square hypotheses conclusions
Null hypothesis: the 2 categorical variables are independent (unrelated) - proportions across groups are equal Alternative hypothesis: the 2 variables are not independent, they are related - proportions across groups are not equal
T-test
Test of whether 2 means are statistically the same (H0)
Confidence Intervals
Constructed around the difference between the 2 group means to indicate precision of the sample estimate
Non-parametric = descriptive
Refers to the fact that chi-square tests do not require assumptions about the population distribution, nor do they test hypotheses about population parameters
Chi Square
Statistical test used to examine differences with categorical (nominal) data -Data = frequencies rather than numerical scores
Non-Parametric Tests
Used when: 1) Outcomes are not measured on an interval/ratio scale (must be ordinal or nominal) 2) Assumptions for parametric tests are violated (especially with small samples) Ex: Rank correlations, chi squares
Robustness of the T-test
Violations of the last 2 assumptions (normality and homogeneity of variance) do not affect statistical decision making if: 1) Sample sizes are large (40+) 2) Sample sizes are similar (the larger group is less than 1.5x the size of the smaller) **Only relevant for the independent t-test
Alternative Hypotheses for T-tests
-The alternative hypothesis is that the 2 group means are different (the independent and dependent variables are related) -Formally stated: H1: M1 ≠ M2
Related issues for Chi Square
-If the expected value for multiple cells is under 5, use Fisher's exact test -Sometimes for 2x2 tables, a correction factor is applied (Yates' continuity correction): it reduces the value of chi square; avoid it if expected frequencies are large, as it can lead to type II errors -If you use corrections, lessen your rigor
Alternative Effect Size: Risk Indexes
-Magnitude of effects in a 2x2 design most often expressed through risk indexes such as: Odds ratio (OR) and relative risk (RR)
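To make the two risk indexes concrete, here is a small sketch using the standard definitions. The function name, the cell labels a/b/c/d, and the counts are hypothetical illustrations, not data from these notes.

```python
def risk_indexes(a, b, c, d):
    """Relative risk and odds ratio for a 2x2 table.

    Layout assumed here:
        a = exposed group, outcome present     b = exposed group, outcome absent
        c = unexposed group, outcome present   d = unexposed group, outcome absent
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rr, odds = risk_indexes(20, 80, 10, 90)
```

With these counts the risks are 0.20 and 0.10, so RR = 2.0, while the odds ratio is (20 x 90) / (80 x 10) = 2.25.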
Dependent Samples T-test
-Used to test the difference between means for 2 related groups, or for the same people tested twice Ex: pre-intervention vs post-intervention, twins... -People in the 2 groups are systematically connected, which lowers variability (SEd)
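A minimal sketch of the paired calculation: the test works on the difference score for each pair, so SEd comes from the spread of those differences. The function name and the pre/post scores are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """t-statistic and df for the same subjects measured twice."""
    diffs = [post - pre for pre, post in zip(before, after)]
    se_d = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    return mean(diffs) / se_d, len(diffs) - 1


# Hypothetical pre- and post-intervention scores for 5 people.
t, df = paired_t([70, 65, 80, 75, 60], [74, 68, 82, 80, 61])
```

The differences are 4, 3, 2, 5, 1 (mean 3), giving t of about 4.24 with df = 4, which exceeds the two-tailed tabled value of 2.776 at alpha = 0.05.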
One Sample T-test
Compares the mean score of a sample to a known value (usually, known mean is the population mean)
What would I use to compare my biology grade to my stats grade?
Dependent Samples T-test (systematically connected: same environment, same study habits, etc.)
Tests for Assumptions of T-test
With small or severely unequal group sizes, violations of assumptions should be tested 1) Test for normality: Kolmogorov-Smirnov test -a non-significant result is consistent with both samples coming from normal distributions 2) Test for equal variances: Levene's F test -a non-significant result is consistent with the variances of the samples being approximately equal
Other tests for Independent Groups
For ordinal level data (fundamental principle: rank), when parametric tests can't be used because of a violated assumption: -2 groups: Mann-Whitney U test -3+ groups: Kruskal Wallis test Both are rank tests that examine differences between groups in location (distribution of scores)
What T-test would be used to compare your midterm grade to another section's?
Independent Samples T-test (not systematically related)
Parametric Tests
-compute a statistic from a sample and relate it to a parameter in a reference population -Assumes the sample comes from a known distribution Ex: z-scores, T-tests, ANOVA
Levene's F test
tests the null hypothesis that the variances in the 2 populations are equal
General Logic for Chi Square
-If the null is true, there should be no differences in proportions (relative frequencies) for the groups being compared -The test contrasts observed frequencies in each cell of a crosstab table with expected frequencies -Expected frequencies: frequencies that would be expected if the null hypothesis were true
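The logic above can be sketched in a few lines: expected frequencies come from the row and column totals, and the statistic sums the squared observed-minus-expected gaps relative to the expected values. The function name and the crosstab counts are hypothetical.

```python
def chi_square_independence(observed):
    """Chi-square statistic, df, and expected frequencies for a crosstab (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under the null: (row total * column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2x2 table: rows = group, columns = outcome (yes, no).
chi2, df, expected = chi_square_independence([[30, 20], [20, 30]])
```

Every expected frequency here is 25, so chi-square = 4 x (5² / 25) = 4.0 with df = 1, just above the tabled value of 3.84 at alpha = 0.05.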
Variable types for two-sample t-test
-Independent variable at the nominal level, with 2 levels (2 groups being compared) Ex: Controls vs Experimentals, men vs women... -Dependent variable at the interval/ratio level (variable for which it is appropriate to calculate a mean) Ex: weight, scores on a stress test, levels of fatigue...
McNemar Test
-tests differences in proportions for the same people measured twice (or paired groups, ex: mother-daughter) -Yields a statistic distributed as a chi square, with df = 1
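For illustration, the usual form of the McNemar statistic depends only on the discordant pairs, meaning the people whose answer changed between the two measurements. This sketch omits any continuity correction, which is an assumption since the notes do not specify one; the function name and counts are hypothetical.

```python
def mcnemar_statistic(b, c):
    """McNemar chi-square (df = 1), without continuity correction.

    b = pairs that changed from yes to no
    c = pairs that changed from no to yes
    """
    return (b - c) ** 2 / (b + c)


# Hypothetical: 15 people changed one way, 5 changed the other way.
chi2 = mcnemar_statistic(15, 5)
```

Here chi-square = 100 / 20 = 5.0, which is above the tabled value of 3.84 for df = 1 at alpha = 0.05.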
Assumptions for Chi-Square test
1) Random sampling of the observations from the population (decreases sampling bias) 2) Each observation is independent (not appropriate for correlated groups/repeated measurements) 3) Each cell in the contingency table must have an expected frequency greater than 0
Two-Sample T-test
Compares the mean scores of 2 different samples to see if they are the same (technically, testing to see if the 2 samples were taken from the same population) Ex: Is the mean for section C the same as that of section D?
Assumptions for T-tests (3)
1) Cases have been independently and randomly sampled from 2 populations 2) The outcome variable is normally distributed in the 2 populations 3) The 2 population variances are equal: this is the assumption of homogeneity of variance -The larger the sample size, the more likely the variances are to be homogeneous
Steps in calculating goodness-of-fit test with χ²
1) Establish the hypotheses 2) Calculate the Chi Square statistic (we need: number of observations, observed and expected frequencies) 3) Assess the significance level (need to know the degrees of freedom) 4) Decide whether to reject or fail to reject the null
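The steps above can be sketched as follows. The function name, the observed counts, and the hypothesized proportions are hypothetical, and the final comparison against a tabled critical value is left to the reader.

```python
def goodness_of_fit(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic and degrees of freedom."""
    n = sum(observed)  # number of observations
    expected = [p * n for p in expected_proportions]  # expected frequencies under H0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return chi2, df


# Hypothetical H0: 100 people are spread evenly over 4 categories.
chi2, df = goodness_of_fit([18, 22, 20, 40], [0.25, 0.25, 0.25, 0.25])
```

Each expected frequency is 25, so chi-square = (49 + 9 + 25 + 225) / 25 = 12.32 with df = 3. The tabled value at alpha = 0.05 is 7.815, so the null would be rejected in this made-up case.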
Attrition bias
Are people who dropped out similar to those who stayed in the study? Ex: if the people who dropped out were fundamentally different, could skew the results
Standard error of the difference
If we compare the mean difference between 2 samples, then... -We can imagine an infinite number of paired samples taken from the same population and an infinite number of mean differences between paired samples -The standard deviation of this hypothetical sampling distribution of mean differences is called the SEd -Can never be known, must be estimated **Estimates the dispersion of these mean differences
Magnitude of effects
-Index summarizing strength of relationship in a 2x2 table: Phi (varies from 0-1 and is interpreted like Pearson's r) -Index summarizing strength of relationship in a larger table: Cramér's V (varies from 0-1; in a 2x2 table, V = phi) Large effect = smaller sample size needed
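A short sketch of how both indexes are derived from a chi-square value, using the standard formula V = sqrt(chi-square / (N x (k - 1))), where k is the smaller of the number of rows and columns. The function name and the input values are hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V for a crosstab; equals phi when the table is 2x2."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical: chi-square of 4.0 from a 2x2 table with 100 cases.
phi = cramers_v(4.0, 100, 2, 2)
# Hypothetical: chi-square of 9.0 from a 3x4 table with 200 cases.
v = cramers_v(9.0, 200, 3, 4)
```

For the 2x2 case this reduces to phi = sqrt(4.0 / 100) = 0.20; for the 3x4 case V = sqrt(9.0 / 400) = 0.15.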
Chi Square Summary
-Nominal Data (focused on sample, descriptive statistics) -Can't make inferences -Hypothesis testing around independence in variables *Dogmatic: almost impossible to have Chi Square = 0 (closer to 0, closer to equal... Never perfectly equal)
Chi Square test for Independence
-Other primary use of Chi Square: testing if variables are independent or not -Independent = 2 factors are not related *Typical for social science: looking for factors that are related
Other statistical Issues
-Existence of a relationship: t-tests -Nature of the relationship: direct comparison of the 2 means -Precision of estimates: CIs around the mean difference -Magnitude of effects (effect size): d-value (Cohen's d)
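As an illustration of the effect size index, here is a minimal sketch of d computed as the mean difference divided by the pooled standard deviation. This is one common definition for two independent groups, and the function name and scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Effect size d: difference in means in pooled standard deviation units."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


# Hypothetical scores for two unrelated groups.
d = cohens_d([10, 12, 14, 16, 18], [11, 11, 12, 12, 14])
```

The means differ by 2 and the pooled variance is 5.75, so d is about 0.83.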
Independent Samples T-test
-Independent groups t-test used to test the difference between means for 2 unrelated groups Ex: men vs women, lung cancer patients vs pancreatic cancer patients, etc. -The people in the 2 groups are not the same people and they are not related to each other in any systematic way
Kruskal-Wallis test
-Tests the null hypothesis that 3+ population distributions are identical (non-parametric version of one-way ANOVA) -Compares the ranks of the values for the groups -Test statistic is H, which approximately follows a Chi-Square distribution Nature of effects: a significant result only indicates that there is a difference among the groups (does not indicate which ones)
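A minimal sketch of how H is computed from the rank sums, using the standard formula H = 12 / (N(N + 1)) x sum(R² / n) - 3(N + 1). It applies average ranks to ties but no tie correction to H, which is a simplifying assumption; the function names and data are hypothetical.

```python
def average_ranks(values):
    """Map each distinct value to the mean of the rank positions it occupies."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return ranks


def kruskal_wallis_h(*groups):
    """H statistic (no tie correction) and df for 3+ independent groups."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    # Sum over groups of (rank sum squared / group size).
    term = sum(sum(ranks[x] for x in group) ** 2 / len(group) for group in groups)
    h = 12 / (n_total * (n_total + 1)) * term - 3 * (n_total + 1)
    return h, len(groups) - 1


# Hypothetical scores for 3 groups with no overlap at all.
h, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

The rank sums are 6, 15, and 24, giving H = 7.2 with df = 2, above the tabled chi-square value of 5.991 at alpha = 0.05.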
Pooled variance formula
-Used when population variances are equal (non-significant Levene's F test) -The basic formula for the independent groups t-test uses a pooled variance estimate for the standard error of the difference -If |t-obtained| > t-tabled, result = statistically significant (reject H0)
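A minimal sketch of the pooled variance calculation for two independent groups. The function name and the scores are hypothetical, and the tabled value still has to be looked up separately.

```python
import math
from statistics import mean, variance


def pooled_variance_t(x, y):
    """Independent groups t-statistic using a pooled variance estimate."""
    n1, n2 = len(x), len(y)
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se_d = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    t = (mean(x) - mean(y)) / se_d
    return t, n1 + n2 - 2


# Hypothetical scores for two unrelated groups of 5.
t, df = pooled_variance_t([10, 12, 14, 16, 18], [11, 11, 12, 12, 14])
```

Here the pooled variance is 5.75 and t is about 1.32 with df = 8. The two-tailed tabled value at alpha = 0.05 is 2.306, so this made-up difference would not be statistically significant.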
What if your t-value is negative?
-The sign (-/+) of the computed t-statistic tells us the direction of the difference in the means -Negative t-value indicates a reversal in the directionality of the effect -NO BEARING on the significance of the difference between groups - we look at the absolute value of the t-statistic when deciding what to do with H0: If |t-computed| > t-tabled, then reject the null If |t-computed| < t-tabled, then fail to reject the null
Chi Square test for goodness of fit
-uses frequency data from a sample to test hypotheses about the shape/proportions of a population -Each individual in the sample is classified into one category on the scale of measurement -The data (observed frequencies) simply count how many individuals are in each category