The Basics: Statistical Testing

When should Pearson's r be used?

2 continuous variables. Example question: How are latitude and temperature related?
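
Illustrative sketch (not from the original card), assuming Python with SciPy and invented latitude/temperature values:

```python
# Pearson's r for two continuous variables; the data values are hypothetical.
from scipy import stats

latitude = [5, 15, 25, 35, 45, 55, 65]       # degrees north (invented)
mean_temp = [27, 26, 22, 16, 10, 4, -2]      # degrees Celsius (invented)

r, p_value = stats.pearsonr(latitude, mean_temp)
print(f"r = {r:.2f}, p = {p_value:.4f}")     # expect a strong negative correlation
```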

What is a test statistic?

A number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

When to use one-way ANOVA vs two-way ANOVA?

A one-way ANOVA uses one independent variable, while a two-way ANOVA uses two independent variables. Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or categories).

What is a t-test?

A t-test is a statistical method that measures the difference between the means of two groups, or between a single group and a known value. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.

What to use when assumptions are NOT met and a MANOVA would normally be conducted?

ANOSIM (Analysis of Similarities)

How does an ANOVA test work?

ANOVA determines whether the groups created by the levels of the independent variable are statistically different by calculating whether the means of the treatment levels are different from the overall mean of the dependent variable. If any of the group means is significantly different from the overall mean, then the null hypothesis is rejected.
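
A minimal sketch of a one-way ANOVA in Python (SciPy assumed; the three treatment groups are invented for illustration):

```python
# scipy.stats.f_oneway returns the F statistic and p-value for a one-way ANOVA.
from scipy import stats

group_a = [4.1, 3.9, 4.3, 4.0, 4.2]
group_b = [4.8, 5.1, 4.9, 5.0, 5.2]
group_c = [4.0, 4.1, 3.8, 4.2, 4.1]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests at least one group mean differs from the overall mean.
```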

Which statistic does ANOVA employ?

ANOVA uses the F test for statistical significance. This allows for comparison of multiple means at once, because the error is calculated for the whole set of comparisons rather than for each individual two-way comparison (which would happen with a t test).

What is a post-hoc test and which is used for ANOVA?

ANOVA will tell you if there are differences among the levels of the independent variable, but not which differences are significant. To find how the treatment levels differ from one another, perform a TukeyHSD (Tukey's Honestly-Significant Difference) post-hoc test.
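
A minimal sketch of a Tukey HSD post-hoc test, assuming Python with statsmodels and invented group labels and values:

```python
# pairwise_tukeyhsd runs all pairwise comparisons while controlling the family-wise error rate.
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([4.1, 3.9, 4.3, 4.8, 5.1, 4.9, 4.0, 4.1, 3.8])
groups = np.array(["A", "A", "A", "B", "B", "B", "C", "C", "C"])

result = pairwise_tukeyhsd(endog=values, groups=groups, alpha=0.05)
print(result)  # table of pairwise mean differences, adjusted p-values, reject yes/no
```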

What is ANOVA?

Analysis of Variance: a statistical test used to analyze the difference between the means of more than two groups. The null hypothesis (H0) of ANOVA is that there is no difference among group means. The alternative hypothesis (Ha) is that at least one group mean differs significantly from the overall mean of the dependent variable.

What to use when assumptions are NOT met and both variables are categorical? (This would be used in place of Pearson's r)

Chi square test
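
A minimal sketch of a chi-square test of independence, assuming SciPy and a hypothetical 2x2 contingency table:

```python
# Rows are treatment groups, columns are outcomes; the counts are invented.
from scipy.stats import chi2_contingency

table = [[30, 10],   # treated:   improved, not improved
         [18, 22]]   # untreated: improved, not improved

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```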

When should a one-sample t-test be used?

Compares the mean of a single sample to a known or hypothesized population mean.

When should a paired t-test be used?

Compares the means of the same group measured at two different points in time or under two different conditions

When should independent t-test be used?

Compares the means of two independent groups (different populations).
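
A minimal sketch of the three t-test variants from the cards above (one-sample, paired, independent), assuming SciPy and invented data:

```python
from scipy import stats

sample = [5.1, 4.9, 5.3, 5.0, 5.2]             # one group vs a known value
before = [80, 82, 78, 85, 90]                  # same subjects, condition 1
after  = [78, 80, 76, 84, 88]                  # same subjects, condition 2
group1 = [5.1, 4.9, 5.3, 5.0, 5.2]             # independent group 1
group2 = [4.5, 4.7, 4.4, 4.8, 4.6]             # independent group 2

print(stats.ttest_1samp(sample, popmean=5.0))  # one-sample t-test
print(stats.ttest_rel(before, after))          # paired t-test
print(stats.ttest_ind(group1, group2))         # independent (two-sample) t-test
```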

Example of test statistic and p-value

Suppose you compare the longevity of mice fed two different diets using a t test. If the mice live equally long on either diet, then the test statistic from your t test will closely match the test statistic from the null hypothesis (that there is no difference between groups), and the resulting p value will be close to 1. It likely won't reach exactly 1, because in real life the groups will probably not be perfectly equal. If, however, there is an average difference in longevity between the two groups, then your test statistic will move further away from the values predicted by the null hypothesis, and the p value will get smaller. The p value will never reach zero, because there's always a possibility, even if extremely unlikely, that the patterns in your data occurred by chance.

What are the assumptions about data that need to meet to conduct parametric statistical tests?

- Independence of observations (a.k.a. no autocorrelation)
- Homogeneity of variance
- Normality of data
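
A minimal sketch of checking two of these assumptions, assuming SciPy and invented groups (Shapiro-Wilk for normality, Levene's test for homogeneity of variance):

```python
from scipy import stats

group_a = [4.1, 3.9, 4.3, 4.0, 4.2, 4.4, 3.8]
group_b = [4.8, 5.1, 4.9, 5.0, 5.2, 4.7, 5.3]

print(stats.shapiro(group_a))          # p > 0.05: no evidence against normality
print(stats.shapiro(group_b))
print(stats.levene(group_a, group_b))  # p > 0.05: variances look similar
```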

What to use when assumptions are NOT met and an ANOVA would normally be conducted?

Kruskal-Wallis H
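
A minimal sketch, assuming SciPy and three invented groups:

```python
# Kruskal-Wallis H: a rank-based alternative to one-way ANOVA.
from scipy import stats

group_a = [12, 15, 14, 10, 13]
group_b = [22, 25, 20, 28, 24]
group_c = [11, 14, 12, 13, 15]

h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```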

Which statistical tests are parametric tests?

- Linear regression
- Logistic regression
- t-tests
- ANOVA/MANOVA
- Pearson's r

MANOVA vs Regression

MANOVA examines differences between groups based on multiple criteria (e.g., academic performance and extracurricular involvement) whereas regression analysis models the relationship between a dependent variable and one or more independent variables (i.e., predicts a single outcome).

What are non-parametric tests?

Non-parametric tests don't make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren't as strong as with parametric tests.

What are parametric tests?

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

What is a p-value?

Probability value. The p-value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true. The p-value is a proportion: if your p-value is 0.05, that means that 5% of the time you would see a test statistic at least as extreme as the one you found if the null hypothesis were true.
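
A minimal sketch of how a two-sided p-value can be computed from a test statistic, assuming SciPy; the t value and degrees of freedom are invented:

```python
# p = probability of a t statistic at least as extreme as the observed one,
# under the null distribution (two-sided, so both tails are counted).
from scipy import stats

t_observed = 2.3   # hypothetical test statistic
df = 18            # hypothetical degrees of freedom

p_value = 2 * stats.t.sf(abs(t_observed), df)
print(f"p = {p_value:.4f}")
```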

What to use when assumptions are NOT met and predictor is categorical and outcome is continuous? (This would be used in place of one-sample t-test)

Sign test
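
One common way to carry out a sign test is as a binomial test on the signs of the differences from the hypothesized median; a minimal sketch assuming SciPy and invented data:

```python
from scipy.stats import binomtest

hypothesized_median = 5.0
data = [5.6, 4.1, 6.2, 5.9, 3.8, 6.5, 5.4, 6.1]   # invented observations

n_above = sum(x > hypothesized_median for x in data)
n_below = sum(x < hypothesized_median for x in data)

# Under the null hypothesis, values fall above or below the median with probability 0.5.
result = binomtest(n_above, n_above + n_below, p=0.5)  # ties are dropped
print(f"p = {result.pvalue:.4f}")
```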

What to use when assumptions are NOT met and both variables are continuous? (This would be used in place of Pearson's r)

Spearman's r
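
A minimal sketch of Spearman's rank correlation, assuming SciPy and invented data with a monotonic but non-linear relationship:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 4, 9, 20, 40, 90, 250]   # increases monotonically, but not linearly

rho, p_value = stats.spearmanr(x, y)
print(f"rho = {rho:.2f}, p = {p_value:.4f}")   # rho = 1.00 for a perfectly monotonic trend
```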

What are statistical tests used for?

Statistical tests are used in hypothesis testing. They can be used to:
- determine whether a predictor variable has a statistically significant relationship with an outcome variable.
- estimate the difference between two or more groups.

What is assumed in statistical testing?

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

What does the F test do?

The F test compares the variance in each group mean from the overall group variance (a measure of how far individual data points are from the mean of the dataset). If the variance within groups is smaller than the variance between groups, the F test will find a higher F value, and therefore a higher likelihood that the difference observed is real and not due to chance.
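
A minimal sketch of the idea behind the F statistic, computed by hand with NumPy on invented groups (F = mean square between groups / mean square within groups):

```python
import numpy as np

groups = [np.array([4.1, 3.9, 4.3, 4.0]),
          np.array([5.0, 5.2, 4.8, 5.1]),
          np.array([4.0, 4.2, 3.8, 4.1])]

grand_mean = np.mean(np.concatenate(groups))
n = sum(len(g) for g in groups)   # total number of observations
k = len(groups)                   # number of groups

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F = {f_stat:.2f}")   # large F: between-group variance dominates within-group variance
```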

How does a Tukey test work?

The Tukey test runs pairwise comparisons among each of the groups, and uses a conservative error estimate to find the groups which are statistically different from one another.

How is the degrees of freedom calculated?

For a one-way ANOVA, the degrees of freedom for the independent variable are the number of levels within the variable minus 1 (once all but one group mean is fixed relative to the grand mean, the last one is determined). The degrees of freedom for the residuals are the total number of observations minus 1, minus the degrees of freedom used by the independent variable(s); for a single factor with k levels and N observations, that works out to N - k. For example, with 3 levels and 30 observations, the independent variable has 2 degrees of freedom and the residuals have 27.

What is the degrees of freedom for a statistical test?

The number of independent pieces of information used to calculate the statistic is called the degrees of freedom. The degrees of freedom of a statistic depend on the sample size:
- When the sample size is small, there are only a few independent pieces of information, and therefore only a few degrees of freedom.
- When the sample size is large, there are many independent pieces of information, and therefore many degrees of freedom.
- Although degrees of freedom are closely related to sample size, they're not the same thing. There are always fewer degrees of freedom than the sample size.

What does "independence of observations" mean?

The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).

What is the relationship between a test statistic and the p-value?

The p value gets smaller as the test statistic calculated from your data gets further away from the range of test statistics predicted by the null hypothesis. If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables. If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.

Which post hoc tests are used for a statistically significant ANOVA/MANOVA?

- Tukey's Honestly-Significant-Difference (TukeyHSD): controls for the family-wise error rate.
- Bonferroni correction: particularly useful when the number of comparisons is relatively small.
- Holm-Bonferroni correction: a less conservative alternative to the Bonferroni correction that is still effective at controlling for Type I errors.
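
A minimal sketch of applying the Bonferroni and Holm-Bonferroni corrections to a set of hypothetical pairwise p-values, assuming statsmodels:

```python
from statsmodels.stats.multitest import multipletests

raw_pvalues = [0.003, 0.020, 0.045, 0.210]   # invented uncorrected p-values

for method in ("bonferroni", "holm"):
    reject, corrected, _, _ = multipletests(raw_pvalues, alpha=0.05, method=method)
    print(method, corrected.round(3), reject)
```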

When should one-way ANOVA be used?

Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or categories).

When should MANOVA be used?

Used to compare the means of multiple dependent (outcome) variables simultaneously across multiple groups. If your dependent variables are correlated, MANOVA can account for these correlations and provide a more accurate analysis.
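
A minimal sketch of a MANOVA, assuming Python with pandas and statsmodels; the grouping variable and the two outcome variables (gpa, hours) are invented for illustration:

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.DataFrame({
    "group": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "gpa":   [3.1, 3.0, 3.3, 2.9, 3.2, 3.6, 3.8, 3.7, 3.5, 3.9, 3.0, 3.1, 2.8, 3.2, 2.9],
    "hours": [4, 5, 3, 4, 5, 8, 9, 7, 8, 10, 2, 3, 2, 4, 3],
})

# Two dependent variables on the left of the formula, one grouping factor on the right.
fit = MANOVA.from_formula("gpa + hours ~ group", data=df)
print(fit.mv_test())   # Wilks' lambda, Pillai's trace, etc. for the group effect
```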

When should ANOVA be used?

Used to compare the means of multiple groups simultaneously. It's particularly useful when you want to determine if there are significant differences between the means of three or more groups. For example: comparing the mean outcome across the categories (levels) of a single categorical variable.

What are post hoc tests used for?

Used to identify which specific groups within a set of groups differ significantly from each other after a significant overall effect has been found. They are often used in conjunction with ANOVA (Analysis of Variance) or MANOVA (Multivariate Analysis of Variance).

What to use when assumptions are NOT met and an Independent t-test would normally be conducted?

Wilcoxon Rank-Sum test

What to use when assumptions are NOT met and a paired t-test would normally be conducted?

Wilcoxon Signed-rank test
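
A minimal sketch of the two rank-based alternatives above, assuming SciPy and invented data (rank-sum for independent groups, signed-rank for paired measurements):

```python
from scipy import stats

group1 = [12, 15, 14, 10, 13, 16]   # independent group 1
group2 = [22, 25, 20, 28, 24, 21]   # independent group 2
before = [80, 82, 78, 85, 90, 84]   # same subjects, before
after  = [78, 80, 76, 84, 88, 80]   # same subjects, after

print(stats.ranksums(group1, group2))   # Wilcoxon rank-sum test
print(stats.wilcoxon(before, after))    # Wilcoxon signed-rank test
```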

What does normality of data mean?

The data follows a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data.

What does homogeneity of variance mean?

The variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test's effectiveness.

