Biostat exam 2
Calculating df for chi-square independence test
(R-1) * (C-1) where R is number of rows and C is number of columns
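A minimal sketch of the formula on this card. The function name `chi_square_independence_df` is made up for illustration.

```python
def chi_square_independence_df(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for an R x C contingency table: (R - 1) * (C - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("need at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


print(chi_square_independence_df(2, 2))  # 1
print(chi_square_independence_df(3, 4))  # 6
```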
Question for Mann Whitney u test
Are the mean ranks of the two categories equal? If the distributions are similar: are the medians of the two categories similar?
What are the assumptions for Fisher's exact test?
Random sampling; directional hypothesis; independent data values
Yates correction
1 degree of freedom
T-test requires which variable?
1 measurement variable
Looking at if the sample is different from the population
1 sample t-test
A chi-squared test needs which variables?
2 nominal variables
Looking at if a sample mean is different from another sample mean
2 sample t-test
You want to compare 2 independent groups to see if their means are different. With very large sample sizes, it is robust to deviations from normality
2 sample t-test
researcher wants to know if the new drug is equally or more effective than Digibind
2 sample t-test, one-tailed
When data doesn't follow normal distribution, the Mann-Whitney U-Test takes the place of which parametric test?
2-sample T-test
The critical value will come from the standard normal table if the sample size exceeds...
30
Williams' correction
>1 degree of freedom
Types of variables with Mann Whitney u test
An independent variable with 2 categories; a dependent variable that is ordinal, interval, or ratio
A statistical test (let's say a t-test) performed at the alpha = 0.05 significance level, two-tailed with df = 24, has a critical value of 2.064. The sample data give a test statistic of t = 3.021. What conclusion can be drawn about the null hypothesis that the mean of the sample is not different from the expected value it is compared to?
As 3.021 > 2.064 it falls within the upper tail of the t-distribution for the alpha=0.05 significance level. The null hypothesis is rejected at that level.
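The comparison in this card can be sketched as a one-line decision rule. The function name is hypothetical, and the critical value is taken from the card rather than computed.

```python
def reject_null_two_tailed(test_statistic: float, critical_value: float) -> bool:
    """Reject H0 when the statistic lies in either tail beyond the critical value."""
    return abs(test_statistic) > critical_value


# Values from the card: alpha = 0.05, two-tailed, df = 24.
print(reject_null_two_tailed(3.021, 2.064))  # True
```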
A more powerful way to reduce type I error rate after multiple tests than the Bonferroni correction
Benjamini-Hochberg procedure
How to control false discovery rate
Benjamini-Hochberg procedure
How to control familywise error rate after multiple tests
Bonferroni correction
How to correct for increased type I error rate after multiple tests of a dataset?
Bonferroni correction
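The two corrections in the cards above can be compared side by side. This is a rough sketch, not a reference implementation: the function names and the p-values are invented for illustration, and the Benjamini-Hochberg step uses the usual rule of rejecting every test up to the largest rank k whose p-value is at most (k / m) * FDR.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each test whose p-value is at most alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values, fdr=0.05):
    """Reject all tests ranked at or below the largest k with p_(k) <= (k/m) * fdr."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= cutoff_rank
    return rejected


# Made-up p-values from 8 hypothetical tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(bonferroni_reject(p_values)))          # 1
print(sum(benjamini_hochberg_reject(p_values)))  # 2
```

On these numbers Bonferroni rejects one null hypothesis and Benjamini-Hochberg rejects two, which illustrates why the latter is described as more powerful.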
Odds ratios are most commonly used in which study?
Case-control
Parametric tests are based on what?
Central tendency
Which test is most appropriate for testing whether an observed population agrees with the expected proportion?
Chi-square
The chi-square test is nonparametric. Which distribution does it follow?
Chi-square distribution
Use _________ test when: -you have one nominal variable -you compare observed counts with expected counts -the expected number of observations in any category is not too small (at least 5)
Chi-squared test of goodness of fit
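A small sketch of the goodness-of-fit statistic, the sum of (observed - expected)^2 / expected over categories. The function name and the counts are hypothetical; the expected-count check uses the "at least 5" rule of thumb from the card, and df = k - 1 assumes no parameters were estimated from the data.

```python
def chi_square_gof(observed, expected_proportions):
    """Return (chi-square statistic, df) for observed counts vs expected proportions."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    if min(expected) < 5:
        raise ValueError("an expected count is below 5; the approximation may be poor")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1, assuming no estimated parameters
    return statistic, df


# Made-up example: 84 vs 16 observations against an expected 3:1 ratio.
stat, df = chi_square_gof([84, 16], [0.75, 0.25])
print(round(stat, 2), df)  # 4.32 1
print(stat > 3.841)        # True: 3.841 is the tabled critical value for df = 1, alpha = 0.05
```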
The test we use to measure the differences between what is observed and what is expected according to an assumed hypothesis
Chi-squared test
Test for independence used when you have data from 2x2 tables that you've REPEATED AT DIFFERENT TIMES or locations. It will tell you whether you have a consistent difference in proportions across the repeats
Cochran-Mantel-Haenszel test
Purpose of Yates' or Williams' correction
Correcting the chi-square or G-test statistic when the sample size is small, because the uncorrected tests then give P values that are too low
Null hypothesis of Mann-Whitney U-test
Data distributions are equal between groups
The extent of independence enjoyed by a given set of observed frequencies
Degree of freedom
What do you need to pay attention to for paired t-test?
Descriptive statistics; means; standard deviations; the t, df, and sig columns
What are you computing when looking at paired t-test, wilcoxon signed rank test, or the sign test?
Difference scores for each individual, then analyzing the difference scores
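For the paired t-test, the "difference scores" idea can be sketched as follows: compute one difference per individual, then test whether the mean difference departs from zero. The function name and the data are invented for illustration; the Wilcoxon signed rank and sign tests start from the same differences but analyze ranks or signs instead.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, df) for paired measurements, based on per-individual differences."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations")
    diffs = [b - a for a, b in zip(first, second)]  # one difference score per individual
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Made-up before/after scores for 4 individuals.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t, df = paired_t_statistic(before, after)
print(round(t, 3), df)  # 7.0 3
```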
Paired t-test looks at
Differences observed between two data collection events
A prediction made by a researcher regarding a positive or negative change, relationship, or differences between two variables of a population
Directional hypothesis
Degree of freedom alters the _________ of the statistic
Distribution
What is Fisher's exact test used for?
Examine the significance of the association between two kinds of classification
Bonferroni correction is used to reduce instance of what?
False positives/type-1 error
An alternative statistical test to chi squared used in the analysis of 2x2 contingency tables
Fisher's Exact Test
If the expected number of observations is less than 5, you should use which test for goodness of fit?
Fisher's exact test
When more than 20% of the cells (of a table) have expected frequencies < 5, we need to use which test for goodness of fit?
Fisher's exact test
As you decrease degrees of freedom (and therefore sample size), the distribution appears...
Flatter
The higher the value of the test-statistic of chi-square the _________ the chance of rejecting the null hypothesis
Greater
When can chi square test give an error result?
If the expected number of observations in any category is too small
What is the Mann-Whitney U-Test typically looking at?
If the mean ranks of the two categories are equal
Why must we adjust our p values when considering multiple tests of a data set?
Increase in type I error after multiple testing
How to decrease standard error?
Increase sample size
Chi-squared test and Fisher's exact test can assess for _________ between two variables when comparing groups that are independent and not correlated.
Independence
Chi-squared test and Fisher's exact test can assess for what?
Independence between two variables when the groups being compared are independent and not correlated
Assumptions of Mann-Whitney U-Test
Independent observations; distributions of the dependent variable are visually similar
The Bonferroni correction sets the significance threshold for each test equal to what?
The alpha value divided by the number of tests performed
Chi-squared test applies approximations assuming sample size is _________, while Fisher's exact test runs an exact procedure, especially for _______-sized samples
Large, small
For the rank-based nonparametric tests (Mann-Whitney U, Wilcoxon signed rank, sign test), the test statistic needs to be ________ the critical value to reject the null
Less than or equal to
Another name for Wilcoxon Rank Sum Test
Mann Whitney U Test
Steps: 1. Take observations from 2 groups. 2. Rank them in order of size (regardless of group membership). 3. Sum the ranks of the observations from the first group.
Mann-Whitney U-test
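The three steps on this card can be sketched in plain Python. Two things here are additions beyond the card: tied values receive the average of their ranks, and the rank sum R1 is converted to U with the common formula U1 = R1 - n1(n1 + 1)/2, reporting the smaller of U1 and U2. Function names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_1, group_2):
    """Return the smaller of U1 and U2 for two independent groups."""
    n1, n2 = len(group_1), len(group_2)
    ranks = average_ranks(list(group_1) + list(group_2))  # step 2: rank combined sample
    rank_sum_1 = sum(ranks[:n1])                          # step 3: sum ranks of group 1
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2)


# Made-up measurements from two independent groups.
print(mann_whitney_u([12, 15, 11], [18, 20, 14, 22]))  # 1.0
```

The resulting U would then be compared with a tabled critical value for the two group sizes.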
Compare differences between 2 independent groups when the dependent variable is either ORDINAL or CONTINUOUS, but NOT normally distributed
Mann-Whitney u-test
Probability distributions of two populations are identical, except for location
Mann-Whitney u-test
Parametric tests use the ________ value for central tendency
Mean
Nonparametric tests use the ______ value for central tendency
Median
Null hypothesis of wilcoxon signed rank test
Median difference between pairs of observations equals zero
Null hypothesis of Wilcoxon Signed Rank Test
Median difference is zero
Null hypothesis of sign test
Median difference is zero
Calculate df for t-tests
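N-1 for one-sample and paired t-tests (N = number of observations or pairs); n1+n2-2 for a 2 sample t-test with equal variances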
Chi-square tests are useful for testing for which form of measurement variables?
Nominal, Ordinal
Chi-squared test is .....
Non-parametric
The test in which no constant of a population is used. Data do not follow any specific distribution and no assumptions are made in these tests
Non-parametric test
Null hypothesis of Sign test
Number of positive signs is equal to the negative signs (half of the differences are positive and half are negative)
Calculate odds ratio
OR = (a*d)/(b*c)
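A short sketch of this formula. The card does not say which cell is which, so the layout below is an assumption: a = exposed with outcome, b = exposed without outcome, c = unexposed with outcome, d = unexposed without outcome. The counts are made up.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio (a*d)/(b*c) for a 2x2 table.

    Assumed layout: a = exposed/outcome, b = exposed/no outcome,
    c = unexposed/outcome, d = unexposed/no outcome.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
print(odds_ratio(20, 80, 10, 90))  # 2.25
```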
You want to use a Z-test, but the sample size is small. Which test should you use?
One sample t-test
Wilcoxon signed rank test is the nonparametric equivalent to which test
Paired t test
Scientists count the number of Galapagos tortoises on Isabela Island, near the volcanoes of Isabela, every year. They have compiled data for two previous years, 2013 and 2014. They want to determine if the tortoises are still close to becoming extinct.
Paired t-test
Test that looks at differences observed between two data collection events
Paired t-test
To compare the performance of a group at time T1 and then at T2, we would use
Paired t-test
Assumptions of The Wilcoxon signed-rank test
Pairs of samples are random and independent; the distribution of differences between groups is symmetrical; no assumption of normality
The test in which population constants (mean, SD, SE, correlation coefficient, proportion, etc.) are used and data tend to follow one assumed or established distribution, such as normal, binomial, or Poisson
Parametric test
Alternative name for repeated-measures t-test?
Paired-samples (dependent) t-test
For the 1 sample t-test, which mean is already known?
Population
Parametric tests are more ________ than nonparametric tests when they are appropriate to use
Powerful
For the Mann Whitney u test, if the observed u value is greater than the u critical value, _____ the null
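Fail to reject (the null is rejected only when the observed U is less than or equal to the critical value)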
Test statistic outside critical value, ______ null
Reject
Decision rule of Mann Whitney U Test
Reject H0 if U < critical value from table
Decision rule of Wilcoxon signed rank test
Reject H0 if W < critical value from table.
Decision rule of sign test
Reject H0 if the smaller of the number of positive or negative signs < critical value from table.
Measurement options for paired t-test
Same individual, 2 treatments; same individual, 2 time points; matched individuals, different treatments
Standard t-test assumptions
Sample is selected randomly; dependent variable is continuous; sample data are normally distributed; variances are equal
The mean of the _________ is equal to the population mean
Sample means (the sampling distribution of the mean)
What are matched cases?
Scores are obtained from a second group of participants who are matched on vital characteristics with the first group of participants
The Mayo clinic states that one should aim for about 40 minutes of exercise each day to achieve greater health benefits, such as increased weight loss. Paul, the scientist, believes that this claim is incorrect, and people should exercise for at least an hour (60 minutes) every day in order to reap the benefits of weight loss. He decided to perform an experiment with a sample of 36 people. He had 18 of them exercise for 40 minutes and the other 18 exercise for 60 minutes a day for two weeks. He weighed his subjects before and after the experiment.
Sign test
Nonparametric alternatives to one sample and paired t-tests
Sign test and Wilcoxon signed rank test
Characteristic of paired t-test
Single variable; samples are not independent; variable measured twice from the "same" individuals; meets normality assumptions
Characteristics of 2 sample t-test
Single variable; independent samples; meets normality assumptions
Characteristics of 1 sample t-test
Single variable; population mean is known, but not the sample mean; meets normality assumptions; sample size is small
General steps to carry out t-test
1. State the question 2. Define the hypotheses 3. Determine the rejection criteria 4. Do the test 5. Draw conclusions
Type of statistical test used to determine if there is a significant difference between the means of two groups
T-tests
When conducting a goodness of fit test, what do the expected frequencies represent?
The average counts for each category expected if the data truly follow the hypothesized distribution
The Mann-Whitney U-Test can look at if the medians of two categories are equal when?
The distributions are similar
When would you use a Mann-Whitney U-test?
The normality assumption is violated OR the sample size is small
What does an odds ratio help us describe?
The odds an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure
Which section of a paired samples t-test output can be ignored?
The paired-samples correlations
According to the Central Limit Theorem, tests are robust in the presence of violations to the normality assumption when?
The sample size is large
Why are exact tests called that?
The significance of the deviation from a null hypothesis (p-value) can be calculated exactly, rather than relying on an approximation that becomes exact as the sample size grows to infinity
For the Mann Whitney U Test/Wilcoxon Rank Sum Test, what are ranks assigned to?
The total sample or combined sample (rank against each other)
When should Fisher's exact test be used to test the significance of differences in proportions?
The total sample size is small (total N of 30 or fewer); there are cells with frequencies of 5 or fewer
Why can T-statistics be both above and below 0?
They are centered around the mean
Use of Mann Whitney U Test
To compare a continuous outcome in two independent samples.
Use of Sign test
To compare a continuous outcome in two matched or paired samples.
Use of Wilcoxon Signed Rank Test
To compare a continuous outcome in two matched or paired samples.
Null hypothesis of Mann Whitney U Test
Two populations are equal
In what case could you use a paired-samples t-test?
When comparing the same participants' performance before and after training. A paired t-test is used when we are interested in the difference between two variables for the same subject. Often the two variables are separated by time.
What kind of variables can be assessed with a chi square test?
When you have one nominal variable with two or more values
What does a test for independence address?
Whether the relative proportions of one variable are independent of the second variable
In a genetic study, blood samples were collected from multiple ethnic groups. Several variables were measured, and we want to compare two ethnic groups, "African American" and "Caucasian", to determine if the number of chromosome crossover events that occur in both ethnic groups are similar.
Wilcoxon rank sum test
Nonparametric alternative to the unpaired t test
Wilcoxon rank sum test (Mann-Whitney U test)
Non-parametric substitute for the paired t-test?
Wilcoxon signed rank test
allows us to compare two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ
Wilcoxon signed rank test
When would you use a 2-sample t-test vs. a Mann-Whitney U-test?
You have a large sample
Looking at if the sample is different from the population
Z-test
Only ________ tests require parametric assumptions
Z-test
You are performing a test of a mean on a single sample when the population variance is known. What would be the most appropriate test to use?
Z-test. Having the population variance allows you to compare to the standard normal distribution.
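A hedged sketch of the one-sample Z-test described on this card, assuming the standard formula z = (sample mean - population mean) / (sigma / sqrt(n)) and a two-tailed p-value from the standard normal distribution. The function name and the numbers are invented for illustration.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, population_mean, population_sd, n):
    """Return (z, two-tailed p-value) when the population SD is known."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    return z, p_value


# Made-up example: sample of 100 with mean 103; population mean 100, SD 15.
z, p = one_sample_z_test(103, 100, 15, 100)
print(round(z, 2), round(p, 4))
```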
What is a critical value?
a line on a graph that splits the graph into sections. One or two of the sections is the "rejection region"; if your test value falls into that region, then you reject the null hypothesis
Unlike chi-square test, g-values are
additive
What can a repeated measures t-test be used to assess?
differences between scores obtained on two separate occasions from the same participants
What can an independent t-test be used to assess?
differences between two groups of participants
Bonferroni correction method
divides the alpha level by the number of tests (equivalently, multiplies each raw p value by the number of tests)
2-sample test evaluates the __________________________ hence both the data distribution and inherent variance/standard deviation are of concern.
equivalence of 2 data sets
OR > 1
exposure associated with higher odds of outcome
OR < 1
exposure associated with lower odds of outcome
OR=1
exposure does not affect odds of outcome
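Difference between Cochran-Mantel-Haenszel test and repeated G-test of goodness of fit
Repeated G-test of goodness of fit has two nominal variables (the variable being tested and the repeat); CMH has three nominal variables (the two in each 2x2 table, plus the repeat)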
For chi square goodness of fit, the test statistic needs to be _________ than the critical value in order to reject the null
greater
Why does standard error matter?
helps you estimate how well your sample data represents the whole population
Critical value of the chi-square test depends...
how many categories we have
Assumptions for Goodness-of-Fit test
individual observations are independent
Why is the sign test called that?
it allocates a sign, either positive (+) or negative (-), to each observation according to whether it is greater or less than some hypothesized value, and considers whether this is substantially different from what we would expect by chance
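One way to sketch the sign test for paired data, where the hypothesized value for each difference is zero. Note the swap: instead of looking up a critical value in a table, this version computes an exact two-sided binomial p-value with probability 0.5 for each sign, and it drops zero differences. Function name and data are hypothetical.

```python
from math import comb


def sign_test(first, second):
    """Return (positives, negatives, two-sided exact p-value) for paired data."""
    diffs = [b - a for a, b in zip(first, second)]
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives  # zero differences carry no sign and are dropped
    if n == 0:
        raise ValueError("all differences are zero")
    smaller = min(positives, negatives)
    # P(X <= smaller) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)


# Made-up paired scores: 8 increases, 2 decreases, 1 unchanged.
first = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
second = [6, 7, 8, 6, 9, 7, 6, 8, 4, 3, 5]
print(sign_test(first, second))  # (8, 2, 0.109375)
```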
Calculate df for chi-square goodness of fit
k - p - 1 (or just k - 1), where k = number of categories and p = number of parameters estimated
The Paired t-Test evaluates the ________________ between serially collected samples and thus the differences must fall into the normal range.
magnitude of difference
For the Mann-Whitney U-Test, there are no assumptions of......
normality
the researcher is comparing the average blood pressure of the sample population to the known mean blood pressure of the Native American population in the United States. Which test should be used?
one sample t-test
when you have one measurement variable and two nominal variables and you need to discern whether or not there is a statistically significant difference between the individuals of each pair. This could take place either with the same individuals being exposed to both trial variables or if pair groups are used
paired t-test
What does a low standard error show?
sample means are closely distributed around the population mean—your sample is representative of your population
What does a high standard error show?
sample means are widely spread around the population mean—your sample may not closely represent your population
measure of the variability of a statistic across different samples: the variability of the sampling distribution
standard error of a statistic
how different the population mean is likely to be from a sample mean. It tells you how much the sample mean would vary if you were to repeat a study using new samples from within a single population
standard error of the mean, or simply standard error
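A minimal sketch of the standard error of the mean, assuming the usual formula SE = s / sqrt(n) with s the sample standard deviation. The function name and numbers are hypothetical; the example also shows the standard error shrinking as the sample size grows.

```python
import math


def standard_error_of_mean(sample_sd: float, n: int) -> float:
    """Standard error of the mean: sample SD divided by the square root of n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sample_sd / math.sqrt(n)


# Same spread (SD = 10), two sample sizes: quadrupling n halves the standard error.
print(standard_error_of_mean(10, 25))   # 2.0
print(standard_error_of_mean(10, 100))  # 1.0
```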
Null hypothesis for repeated g-tests of goodness of fit
the numbers within each experiment fit the expectations
Null hypothesis of independence tests (chi-sq, g-test, fisher's exact)
the relative proportions of one variable are independent of the second variable; in other words, the proportions at one variable are the same for different values of the second variable.
The biggest thing to remember when a binomial test is performed to test for individually significant categories is that one category might not ultimately be what causes an overall significant P-value. What could it be?
the results of small to moderate sized deviations from all the categories that add up to show significance.
What is the basic tenet of the Central Limit Theorem?
the sampling distribution of the sample means approaches a normal distribution as the sample size gets larger — no matter what the shape of the population distribution
The standard error of the statistic is...
the standard deviation of a sampling distribution
Characteristics of wilcoxon signed rank test
Two related groups (within-subject design); one dependent variable (continuous, ordinal, interval, or ratio); minimum of 5 pairs of observations
When is it appropriate to use an exact test?
when you have one nominal variable and you want to see whether the number of observations in each category fits a theoretical expectation
When should you use Fisher's Exact Test?
when you have one nominal variable and you want to see whether the number of observations in each category fits a theoretical expectation when sample size is small
When would you use the Wilcoxon signed rank test?
when you'd like to use the paired t-test, but the differences are severely non-normally distributed
Experimental set up for Wilcoxon signed rank test
Two nominal variables and one measurement variable. One of the nominal variables has only two values, such as "before" and "after," and the other nominal variable often represents individuals