Biostatistics Module 2
The test statistic is used with the chi-square distribution to get the p value. How many degrees of freedom should you use?
1 degree of freedom is used to obtain the significance value for this test.
What is the first thing you need to do when you perform Mann-Whitney U-Test after you've obtained your data?
1) Rank observations in increasing order of magnitude (lowest = 1). Give observations with the same magnitude an avg ranking.
What assumptions does Pearson Chi-Square make?
1) values in the cells are counts, not proportions 2) the value of expected is 5 or more in at least 80% of the cells
How do you calculate your z-score from Mann-Whitney U-Test statistic?
3) Calculate Z = [U - (n1n2/2)] / √[n1n2(n1+n2+1) / 12] 4) Compare the Z value to the critical value to decide whether to reject the null.
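The z approximation above can be sketched in Python (stdlib only; the function name is my own):

```python
from math import sqrt

def mann_whitney_z(u, n1, n2):
    """Z = [U - n1*n2/2] / sqrt(n1*n2*(n1 + n2 + 1) / 12)"""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u

# Example: U = 5 with 5 observations in each group
print(round(mann_whitney_z(5, 5, 5), 3))  # -1.567
```

Compare |Z| to 1.96 for a two-tailed test at alpha = 0.05.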
What kind of variables would you use a R&C Table for?
A data set presented with a certain number of Rows (R) and columns (C). Does not matter which nominal variable is in the rows and which is in the columns.
What is an intrinsic hypothesis?
A hypothesis that is based on the data that you collect.
What is an extrinsic hypothesis?
A hypothesis that you know before you collected the data.
Mann-Whitney U-Test test is also called?
AKA Wilcoxon rank sum test.
1) What is another name for paired t-test? 2) Think of this as a one-sample t-test where our hypothesized mean is equal to _. 3) We use this test when we have what kind and how many variables? 4) What is the null hypothesis of this test? 5) How would you report your results? 6) How would you calculate the t statistic? 7) How do you calculate variance of the sample? 8) How do you calculate SD of the sample?
AKA the dependent t-test. This is an application of the one-sample t-test when data is paired, so it's as if our hypothesized mean of the population of differences is 0. Used when we have two variables: one categorical, independent variable and one continuous, dependent variable. Null hypothesis is that the mean of the first measurement is the same as the mean of the second measurement. The p value, as usual, is the probability of obtaining the measured difference due to chance if the null is true. If you're comparing multiple values, you'd use the two-way ANOVA with replication where one nominal variable has just two values. This data looks nice plotted on a bar graph. The t statistic is calculated as follows: t = (sample mean of differences - μ) / (SE of sample of differences). The trick here is that μ = 0. S² = ∑(X - M)² / (N - 1): remember this is variance. S = √S²: remember S is the estimate of the population standard deviation. Standard error of the mean = S/√N.
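The paired-t arithmetic on this card can be sketched as follows (stdlib only; names and example data are mine):

```python
from math import sqrt

def paired_t(before, after):
    """t = (mean of differences - mu) / (SE of differences), with mu = 0."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    mean_d = sum(d) / n
    # S^2 = sum((X - M)^2) / (N - 1), the sample variance of the differences
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    se = sqrt(var_d) / sqrt(n)  # SE of mean = S / sqrt(N)
    return mean_d / se

before = [10, 12, 11, 14, 13]
after = [12, 15, 12, 18, 13]
print(round(paired_t(before, after), 3))  # 2.828
```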
Mann-Whitney U-Test is particularly effective when your alternative hypothesis states what?
According to prof, this test is especially appropriate when testing an alternate hypothesis that a particular population tends to have larger/smaller values than the other.
What is the test statistic for Fisher's Exact test of Independence?
Again, there is no test statistic in a Fisher's Exact test of independence; the probability is calculated directly.
1) The Two-Sample T-Test (unpaired t-test) is also called what? 2) What is this test most commonly used to compare? 3) We use this test when we have how many and what kinds of variables? 4) Our focus when looking at paired data is the mean of the difference. What is our focus here? 5) Like all other t-tests, this assumes, but is not sensitive to what? 6) Is it a problem if our two samples are skewed in different directions? 7) How would you calculate the SE in this kind of test, since you have two sets of data? 8) How would you calculate the t statistic? 9) What are the degrees of freedom for this kind of test?
Aka the independent t-test. This is mathematically identical to a one-way ANOVA with two categories. The most common comparison made in a t-test is between two means that arise from unpaired data. Used when we have two variables: one categorical, independent variable, the other continuous and dependent. With paired data, we focus on the mean of the difference. With unpaired, we're instead going to focus on the difference of the means of the two groups. This test assumes normality, though it's not sensitive to deviations from normality as long as the two sample populations have the same skewness (not such a big deal at n > 50). Since the sample sizes of unpaired data are usually different, calculating SE is going to be different as well. SDdifference = √[((n1 - 1) × SD1² + (n2 - 1) × SD2²) / (n1 + n2 - 2)], where SD1 = standard deviation in group 1, SD2 = standard deviation in group 2, n1 = sample size of group 1, n2 = sample size of group 2. SEdifference = SDdifference × √(1/n1 + 1/n2). t = difference in sample means / SE of difference in sample means. Degrees of freedom = n1 + n2 - 2. With a t statistic and the degrees of freedom, you can find P, the probability that the two sets of scores came from the same population.
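The pooled-SD calculation above can be sketched in Python (stdlib only; names and data are illustrative):

```python
from math import sqrt

def two_sample_t(g1, g2):
    """Unpaired t using the pooled SD formula from the card."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    var1 = sum((x - m1) ** 2 for x in g1) / (n1 - 1)  # SD1 squared
    var2 = sum((x - m2) ** 2 for x in g2) / (n2 - 1)  # SD2 squared
    sd_diff = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    se_diff = sd_diff * sqrt(1 / n1 + 1 / n2)
    t = (m1 - m2) / se_diff
    df = n1 + n2 - 2
    return t, df

t, df = two_sample_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 3), df)  # -1.897 8
```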
1)Critical Region is also known as ? 2) If your alpha value is 0.05 for a two tailed test, what is the critical region?
Aka the rejection region. This is the range of values of the test statistic that indicates there is a significant difference and that the null can be rejected. Remember for a two-tailed test, if your alpha value is 0.05, then there is a critical region of 0.025 in each tail of the curve. So you'd look up your Z critical values at the 0.025 and 0.975 quantiles, not 0.05 and 0.95.
What is another name for G-Test Statistic?
Also called the log likelihood ratio.
When it is appropriate to apply Yates' Correction?
Also known as the continuity correction. This is used to account for the inaccuracies of the chi-square and g-tests when there are low sample sizes.
What is another name for G-Test of Goodness of Fit?
Also known as the likelihood ratio test, the log-likelihood ratio test, or the G-squared test.
If there is no difference in your data sets, your OR should be equal to?
Also remember, "no difference" for an odds ratio is not 0. It is 1.
Benjamini-Hochberg Procedure is an alternative for what?
Alternative to the bonferroni correction.
Benjamini-Hochberg Procedure controls for what?
Alternative to the bonferroni correction. This controls the false discovery rate.
McNemar's Test is an alternative to what test(s)?
An alternative to the chi-square, g-test, and Fisher's exact test
Barnard's Test is an alternative to what test?
An alternative, more accurate form of Fisher's exact test, but will make people think you're evil.
Boschloo's Test is an alternative to what test?
An alternative, more accurate form of Fisher's exact test, but will make people think you're evil.
While this is another name for the Exact test of goodness-of-fit, what condition must apply for this test?
Another name for the Exact Test of Goodness-of-Fit, when the nominal variable has only two values.
Why do we need to apply the Bonferroni correction when we run multiple tests?
Anytime you analyze your data with more than one test, you increase your chances of a false positive. That's why it is stressed to determine what test you want to use prior to the experiment.
1) What is the main difference between a t test and a z test? 2) What assumption does your z-test make about the independent variable? The Dependent variable? The Standard deviation of the population? Sample sizes? 3) If the significance level is 0.05, what is the critical value for a one tailed test? two tailed? 4) How do you calculate this?
As opposed to the t-tests, which relate your observations back to a population mean, a z-test is going to compare the means of two samples. This makes a few assumptions: 1) That your independent variable is qualitative (2 values only) and dependent is quantitative 2) Random samples 3) Population standard deviation is known 4) Sample size > 30 for both samples. Also, worth memorizing that the critical value for a one-tailed test with a significance level of 0.05 is ±1.645, 2-tailed is ±1.96. z = (x̅1 - x̅2) / √(σ1²/n1 + σ2²/n2)
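The z formula on this card, as a quick sketch (stdlib only; names and numbers are mine):

```python
from math import sqrt

def two_sample_z(m1, m2, sd1, sd2, n1, n2):
    """z = (mean1 - mean2) / sqrt(sigma1^2/n1 + sigma2^2/n2).
    sd1 and sd2 are the known *population* standard deviations."""
    return (m1 - m2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

z = two_sample_z(100, 95, 10, 12, 50, 50)
print(round(z, 3))  # 2.263
# Compare |z| to 1.96 (two-tailed) or 1.645 (one-tailed) at alpha = 0.05
```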
Once these are ranked, how do you calculate your test statistic for Mann-Whitney U-Test?
Assign points. For Group A, award a point for each observation in Group B that is greater than that observation from Group A. Do this for each observation. The lower U value is your test statistic.
1) What is the main assumption of t-tests but one that they're really not all that sensitive to? 2) What is meant by the term robust? 3) What kind of tests would we use if we want to compare more than just two groups?
Assume that data is normally distributed (or in the context of the paired test, that the distribution of differences is normal). Luckily, the t-tests are not too sensitive to deviations from the assumptions of normality or homogeneity (equality of variances), so we say they are robust. Other assumptions: random samples, and a dependent variable that is continuous and measured at the interval or ratio level. If you want to make more comparisons, you'd use ANOVAs. In addition: the unpaired t-test assumes the SDs from the two samples are approximately equal. These tests are also restricted to comparisons between one or two groups.
What is your objective when conducting G-test of Independence?
Basically tells you the probability that your observed percentages occurred by chance.
How do you use Bonferroni Correction?
Basically, you need to think of using this anytime you're running multiple tests at the same time.
What should you be leery of when using Repeated G-test of goodness-of-fit?
Be leery of this hypothesis if you have uneven sample sizes where one group's results will have more influence on the overall results.
Expected Mean rank
Calculated as (n+1)/2. Used in the Kruskal-Wallis test. n = the total number of observations across all the groups. When you calculate the mean rank for a specific group, you'll use an n value that equals the number of observations in that group.
What are two different ways you can perform a post-hoc test?
Can do the normal way, which is very similar (the same) as the post-hoc testing described in the chi-square test of independence. Alternatively, if you have a huge number (say 12) of values for one variable, you may analyze each group vs "all others" instead of having to do the test (11+10+9+8+7+6+5+4+3+2+1 =) 66 times. Bonferroni correction would still be 12 in this example. Graphing this is just like graphing in chi-square: use bar charts.
What is one thing to be careful of once you obtain your "p-value" from the exact binomial test (remember it's not really a p-value, it's actually the exact probability of obtaining your, or more extreme, data, given the null is true).
Careful: instead of using a significance level of 0.05, you need to divide the significance level by the number of comparisons (the # of categories for the nominal variable), since that is the # of tests you are running. This will determine if your result is significant. (Bonferroni correction)
Which goodness of fit test should you use with sample size of 250? (G-test, chi-squared, exact). Why?
Chi-square and G-Tests give lower p-values than exact tests do, this difference is significant even in the first few hundred samples, hence why author proposes we don't use these tests until sample size is >1000.
What is the purpose of G-test of goodness of fit?
Compares frequencies of one nominal variable to theoretical expectations
Kruskal-Wallis Assumptions
Does not assume a normal distribution, but does assume observations come from populations with the same shape of distribution. So it's bad to use if your hypothesis is that medians are the same, good for means. Dichotomous nominal variable. Observations must be independent (data can't be paired). To interpret a difference in medians, you must assume the distributions of the dependent variable for both levels of the independent variable are similar (similar shaped distributions, determined by viewing histograms and boxplots).
Kruskal-Wallis Test
Everything said here about this test also applies to the Mann-Whitney U-test. Non-parametric test used when you have one nominal variable and one measurement variable (HBS recommends only using this test if the measurement variable must be ranked) but the measurement variable does not meet the normality assumption (that's why we're not using a one-way ANOVA here). HBS says it's overused, but many people still use it. When doing this, assign ranks to your data (smallest = 1, etc.). This makes the test less powerful because information is lost. Test assumes homoscedasticity; Welch's ANOVA would be used for heteroscedastic data. Null hypothesis for this test is that the mean ranks of the groups are the same. Test statistic for this test is H, which uses the same distribution as chi-square (but not if the sample sizes are too small, n < 5, so use an H statistic with caution when dealing with small sample sizes). H is then used to find P, the probability of getting a particular value of H if the null hypothesis is true.
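The H statistic can be sketched in Python. The card doesn't print the formula, so treat the standard rank-sum form H = 12/(N(N+1)) × ∑(R_i²/n_i) - 3(N+1) as an assumption here (no tie correction):

```python
def kruskal_wallis_h(*groups):
    """H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), ignoring tie corrections."""
    pooled = sorted(x for g in groups for x in g)

    def avg_rank(v):
        # rank each observation (lowest = 1); ties get an average rank
        idx = [i + 1 for i, x in enumerate(pooled) if x == v]
        return sum(idx) / len(idx)

    n_total = len(pooled)
    rank_sums = [sum(avg_rank(v) for v in g) for g in groups]
    return 12 / (n_total * (n_total + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)

h = kruskal_wallis_h([1.1, 2.0, 3.2], [10.5, 20.1, 30.7])
print(round(h, 3))  # 3.857
```

The resulting H would then be compared to a chi-square distribution, as the card describes.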
With what sample size should you use an exact test for goodness of fit?
Exact Test of Goodness of Fit: the current rule of thumb to use it for expected values less than 5 comes from the days calculations were done by hand. That's why the author recommends using this now as long as the sample size is <1000, as this is generally more accurate than the chi-square or G-test. With sample sizes between 50-1000 and expected values >5, use any of the three; it doesn't matter.
Sign Test is an application of what kind of test?
Exact binomial Test
1) If you're comparing the mean of two samples, what test statistic would you use? 2) If you're comparing your data to a population mean and you don't know the standard deviation? 3) If you're comparing your data to a population mean and you do know the standard deviation for a single sample? 4) For the entire sample (this one is tricky).
First, ask yourself what you're comparing your measurements to. If to the sample mean, use Z = (x - x̅)/s. If population mean, and you don't know the standard deviation, then use t = (x - μ) / s for a unique value or t = (x - μ) / (s/√n) for a sample mean. If you do have the standard deviation for the population, use Z = (x - μ)/σ for a unique value. If using the sample mean and n > 30, then use Z = (x - μ) / (σ/√n). If < 30, then t = (x - μ) / (s/√n).
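The "population mean, unknown SD, sample mean" case above — t = (x̄ - μ) / (s/√n) — can be sketched as (stdlib only; names and data are mine):

```python
from math import sqrt

def one_sample_t(data, mu):
    """t = (sample mean - mu) / (s / sqrt(n)); population SD unknown."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    return (mean - mu) / (sqrt(var) / sqrt(n))

print(round(one_sample_t([10, 12, 14, 16, 18], 12), 3))  # 1.414
```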
How is extrinsic/intrinsic calculated for a G-test GoF?
For a G-Test of goodness of fit, this is calculated the same way as Chi-Square (see 38)
What is the null hypothesis for Exact test if you use a one-tailed?
For a one tailed, null is that observed number for one category is less than or equal to the expected number.
How is this calculated for a test of independence?
For a test of independence, this is equal to (R - 1) × (C - 1), where R = rows and C = columns. Once you have this and the test statistic, you can calculate your p value.
What should the null hypothesis be for Exact test?
For a two tailed test (which should always be used), null hypothesis is that number of observations in each category is equal to the predicted value. For a one tailed, null is that observed number for one category is less than or equal to the expected number.
Give an example of an extrinsic hypothesis
For example, even before the kittens are born you know that a genetic cross between two heterozygotes will yield a 3:1 ratio of phenotypes.
Practice: Apply the bonferroni correction when you have 4 groups.
For example, if your goodness of fit test was looking at the proportion of women named Sara, Jenny, Ally, and Bob, you'd divide your significance level by 4 when comparing Sara & Jenny.
What are difference scores?
Found by simply subtracting one score from another score.
How do you calculate the G statistic?
G statistic = -2ln(Lnull/Lalt). Multiplying by -2 is what makes the log likelihood ratio approximately fit the chi-square distribution. More simply, this is also equal to: G = 2∑[O × ln(O/E)], where O is the observed number and E is the expected number.
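The simpler form of the G statistic, sketched in Python (stdlib only; the example counts are mine):

```python
from math import log

def g_statistic(observed, expected):
    """G = 2 * sum(O * ln(O/E)); approximately chi-square distributed."""
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected))

# 60/40 split observed against an expected 50:50 ratio
print(round(g_statistic([60, 40], [50, 50]), 3))  # 4.027
```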
DO we run the G-test of independence 1st or the Goodness of fit test first? why?
Goodness of fit first, because you need to do the G-test of independence after to get the heterogeneity value, so you can decide whether or not to pool the data.
What may we need to be on the lookout for (Homoscedasticity)?
However, if sample sizes are uneven, and the group with a small sample size has the larger standard deviation, can give you a lot of false positives.
When should False Discovery Rate be high?
If the cost of additional experiments is low and the cost of a false negative is high, then this should be high (0.1 or 0.2).
How would you calculate an OR?
It's calculated as follows: OR = [PG1 / (1 - PG1)] / [PG2 / (1 - PG2)], where PG1 is the probability of the event of interest for group 1 and PG2 is the probability of the event of interest for group 2.
What additional assumption must be met before you can compare the medians of the two groups (Mann-Whitney U-Test)?
If an additional assumption is met (that the shapes of the distributions are equal), the null could be that the medians of the two levels are equal.
Why should view barely significant results in G & chi-squared tests with skepticism?
If someone's p value from a G-test or chi-squared test with a small number of samples is barely below 0.05, look at their conclusions with some skepticism.
When could you divide by less? (Post-hoc Test for Chi-square test of independence)
If you decide before the study that you're only going to compare 5 pairs, you could divide the significance level by 5 when doing a Bonferroni correction, rather than by the total number of possible pairs among the 4 groups.
How would you determine this for a chi square test with an extrinsic hypothesis?
In a chi square test for goodness of fit, when dealing with an extrinsic hypothesis, this is equal to the number of categories of the nominal variable - 1. This value and the Chi-square statistic allows you to obtain the p value using the chi-square distribution.
Contrast Cochran-Mantel-Haenszel Test to the Repeated G Test of goodness of fit.
In contrast to the Repeated G-test of goodness-of-fit, which uses, as the name suggests, repeated G-tests of goodness of fit, this test uses repeated tests of independence.
The Critical value separates what?
In hypothesis testing, this value separates the critical region from the non-critical region. Whether you use a Z distribution or t distribution depends on what test you're using.
When you run a binomial test, when would you need to use Bonferroni Correction?
In the post-hoc test, where you run a binomial test to compare each category to each other, this refers to how you must compare the P value you obtain from each binomial test to the significance level divided by the # of categories for that nominal variable.
What does Common Odds Ratio summarize?
It's a way of summarizing how big the effect is when pooled across the different repeats of the experiments. Assumes that the odds ratio is the same in the different repeats.
How do you calculate Standard Error of Odds Ratio?
It's calculated as follows: √[1/a + 1/b + 1/c + 1/d]. (Strictly, this is the standard error of the natural log of the OR, which is why it gets added to/subtracted from ln(OR) to find the confidence limits.)
Nonparametric Disadvantages
Less power, particularly with small sample sizes. This approach does not estimate the magnitude of an effect and is instead geared towards hypothesis testing.
Give an example of an intrinsic hypothesis.
Like Hardy-Weinberg proportions, where you don't know the allele frequencies (p and q) until after you collect the data.
The G-Test Statistic is used with what distribution to find the P value?
Like most statistical tests, the G-Test of goodness-of-fit calculates an intermediate statistic and then uses the chi-square distribution to estimate the probability of obtaining that value of the test statistic if the null were true.
How do you calculate the test statistic for Chi-Square Test of Independence?
Math is the same in this as it is for the chi-square test of goodness-of-fit: χ² = ∑(O - E)²/E, where O is the observed number and E is the expected number; the difference is in how the expected numbers are obtained.
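A sketch of the independence version in Python. The card doesn't spell out how E is obtained, so the standard formula E = (row total × column total)/n is an assumption here:

```python
def chi_square_independence(table):
    """chi2 = sum((O - E)^2 / E) over an R x C table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1) x (C - 1)
    return chi2, df

chi2, df = chi_square_independence([[10, 20], [30, 40]])
print(round(chi2, 3), df)  # 0.794 1
```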
When would you use Chi-square Test as a test of goodness of fit?
May be used as a test of goodness-of-fit when one is comparing frequencies of one nominal variable to theoretical expectations.
To ensure GoF assumption applies, what may you need to do?
May need to conduct independence testing to ensure this criterion is met.
What is another name for Chi-Square Test of Independence?
More specifically, this is called the Pearson's chi-square test of independence.
What is the null hypothesis (most often) for the Exact Binomial Test?
Most common use of this test is when the null hypothesis is that the outcomes have an equal probability of occurring.
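A one-tailed version of this can be sketched with the binomial formula (stdlib only; the two-tailed version, which the cards say should generally be used, additionally counts equally extreme outcomes in the other direction):

```python
from math import comb

def binomial_upper_p(k, n, p=0.5):
    """One-tailed exact binomial: P(X >= k) given null probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

# 8 or more successes out of 10 under a 50:50 null
print(round(binomial_upper_p(8, 10), 4))  # 0.0547
```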
Why do we multiply this ratio by this constant? (G-Test Statistic)
Multiplying by -2 is what makes the log approximately fit the chi-square distribution.
What do you need to be wary of when conducting post-hoc testing?
Need to decide before the study what groups you want to compare
Wilcoxon signed-rank Test
Non-parametric alternative to the paired t-test. This is similar to the sign test, except it takes the magnitude of the observations into account, so less power is lost. This is used when there are two nominal variables and one measurement variable. One of the nominal variables has only two values (like before vs after). Requires a minimum of 5 pairs of observations. Null hypothesis of this is that the median difference between pairs of observations is zero (unlike the paired t-test, which looks at the mean difference). Steps to perform this test are as follows: 1) State the null hypothesis (specifically the hypothesized value) 2) Rank all observations in increasing order of magnitude (rank 1 is the lowest; ignore the sign). Remove observations that are equal to the null hypothesis. If multiple observations have the same magnitude, give them an avg ranking. 3) Allocate a sign to each observation 4) R+ = sum of all positive ranks, R- = sum of all negative ranks, R = smaller of R+ and R-. In other words, the smaller R is your test statistic. 5) Calculate a P value. Note: This test, like the sign test, only gives a P value and does not provide a straightforward estimate of the magnitude of any effect, but it gives you more insight into the size of the differences than the sign test does.
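Steps 2-4 above can be sketched as (stdlib only; names and data are mine):

```python
def wilcoxon_r(differences):
    """Rank |d| (average ranks for ties), sum ranks by sign,
    return R = min(R+, R-)."""
    d = [x for x in differences if x != 0]  # drop differences equal to the null
    abs_sorted = sorted(abs(x) for x in d)

    def avg_rank(v):
        idx = [i + 1 for i, a in enumerate(abs_sorted) if a == v]
        return sum(idx) / len(idx)

    r_plus = sum(avg_rank(abs(x)) for x in d if x > 0)
    r_minus = sum(avg_rank(abs(x)) for x in d if x < 0)
    return min(r_plus, r_minus)

# |d| sorted is 1, 2, 2, 3, 4 -> ranks 1, 2.5, 2.5, 4, 5
print(wilcoxon_r([2, -1, 3, -2, 4]))  # 3.5
```

The resulting R would then be looked up in a Wilcoxon signed-rank table to get P.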
Don't use Yates' Correction for what types of tests?
Not great for tests of independence, as P values become too high.
What is the null hypothesis of this test?
Null hypothesis is that the odds ratios within each repetition are equal to 1.
What should you do if you get an OR of .5?
OR < 1 means the first group is less likely. An OR value below 1.00 is not directly interpretable. If your OR is less than one, then you need to redo the calculation with the other group first.
If the value of Odds Ratio is 3, what does that mean?
OR > 1 means the first group is more likely to experience the outcome.
Yates' Correction only applies when what is true?
Only applies to tests with one degree of freedom.
Wilcoxon Signed-Rank Assumptions
Paired samples are random and independent. The distribution of differences between the groups must be symmetrical (use a boxplot to determine this). No assumption of normality.
Describe Simpson's paradox.
Paradox in which pooling different studies together can show a significant difference in proportions when there isn't one (or even the opposite of what is true).
Mann-Whitney U-Test is identical, to the point that it's basically the same as another test. What is this other test?
Pretty similar (identical) to the Kruskal-Wallis Test.
If our data is just a little not normal, should we use Mann-Whitney U-Test or the t-test? Why?
Recommended to use the more powerful t-test despite deviations from normality, since the t-test is not sensitive to this (this test asks "is there a difference between the two groups?"). If the samples have different distributions, this test is no better than a t-test.
Nonparametric Advantages
Require limited to no assumptions about the data. Useful for dealing with unexpected, outlying observations. Intuitive for small samples. Useful in the analysis of ordered categorical data in which assignation of scores to individual categories may be inappropriate (parametric methods would require us to assume that the effect of moving from one rank to the next is fixed).
How should you display results for a test of independence?
Results with independence tests should typically be displayed with a bar graph, with values of one variable on X and the other variable on Y. If the Y variable has more than 2 values, you need to graph all of them; you could use a pie chart. Pie charts make it more difficult to see small differences though, so a bar graph is still the best (you'd just have multiple columns coming out of each value on the X-axis).
Under what conditions would you use for G-Test of Goodness of Fit (what variables, sample size)?
Sample size larger than 1000; used when you have one nominal variable with two or more values.
Why is Lnull always < Lalt?
Since the alternative hypothesis will state that the actual proportion is exactly what you just observed, Lalt will always be at least as high as the likelihood of observing that data if the null were true.
How would you determine this for a chi square test with an intrinsic hypothesis?
So for intrinsic you're subtracting twice: think Hardy-Weinberg, which has 1 nominal variable with 3 categories (homozygous dominant, heterozygous, homozygous recessive), minus one estimated parameter (p), minus 1 more, to get 1 degree of freedom.
What if one of our 2x2 tables is skewed one way, and another in the opposite direction?
Some repeats may have a big difference in proportion in one direction, and another repeat may have it in the opposite direction, which would yield insignificant results. If this happens, we'd want to perform a test of independence on each 2x2 table separately.
1) These Case Control Studies start with a ____ and go back in time to look for _____. 2) This kind of study calculates what? 3) How do we present our data? 4) Which condition do we put on the top? 5) The side?
Studies that start with a disease and go back in time looking for exposures. These calculate the odds ratio. By convention, in 2x2 tables (in all studies, not just this one), you list the disease condition (lung cancer vs cancer free) on the top of the table and the exposure condition (smoking vs non-smoking) on the side.
How do you apply Benjamini-Hochberg Procedure?
To do this, you put your individual P values in order, starting from the smallest to the largest, and compare each value to the Benjamini-Hochberg critical value ((i/m)Q).
How do you calculate the Benjamini-Hochberg Critical value?
To do this, you put your individual P values in order, starting from the smallest to the largest, and compare each value to the Benjamini-Hochberg critical value ((i/m)Q). The largest P value that is < the critical value is significant, along with all the smaller P values (even if those P values are not less than their own critical values).
What different null hypothesis does Repeated G-test of goodness-of-fit technique allow you to test?
Technique actually tests four null hypotheses: 1) Numbers within each experiment fit the expectations (same hypothesis as the G-test of goodness-of-fit) 2) Relative proportions are the same across different experiments (same hypothesis as the G-test for independence) 3) Pooled data fits the expectations (all groups summed up and compared to the null) 4) Overall, the data from the individual experiments fit the expectations.
What does Repeated G-test of goodness-of-fit experiment tell you?
Tells you both whether there's an overall deviation from the expected proportions and significant variation among the repeated experiments.
What test estimates Common Odds Ratio?
The Cochran-Mantel-Haenszel test estimates this.
How to calculate the degrees of freedom in Chi-Square Distribution depends on what?
The degrees of freedom are calculated differently depending on whether your study uses an extrinsic or intrinsic null hypothesis.
Unlike the exact test for goodness of fit, Chi-Square Test of Goodness of Fit uses a test statistic. How do you calculate it?
The following equation is used to calculate the test statistic: χ² = ∑(O - E)²/E, where O is the observed number and E is the expected number.
What is the equation to calculate the test statistic for a chi-square test of goodness of fit?
The following equation is used to calculate the test statistic: χ² = ∑(O - E)²/E, where O is the observed number and E is the expected number. The higher the statistic is, the more the observed data differs from the expectation. Written as χ².
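The goodness-of-fit statistic, sketched in Python (stdlib only; example counts are mine):

```python
def chi_square_gof(observed, expected):
    """chi2 = sum((O - E)^2 / E)"""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 60/40 split observed against an expected 50:50 ratio
print(chi_square_gof([60, 40], [50, 50]))  # 4.0
```

With 2 categories that's 1 degree of freedom, so the χ² of 4.0 would then be looked up in the chi-square distribution.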
How do you calculate p for Fisher's Exact Probability Test?
The formula for this is as follows: p = [(a+b)!(c+d)!(a+c)!(b+d)!] / [n!a!b!c!d!], where p = Fisher's exact probability; a, b, c, d = counts in the cells; n = sum of a, b, c & d.
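The formula gives the probability of one particular 2x2 table; a full test would sum these probabilities over all tables as extreme or more extreme. A sketch of the single-table piece (stdlib only; counts are mine):

```python
from math import factorial as f

def fisher_table_p(a, b, c, d):
    """p = [(a+b)!(c+d)!(a+c)!(b+d)!] / [n! a! b! c! d!] for one 2x2 table."""
    n = a + b + c + d
    return (f(a + b) * f(c + d) * f(a + c) * f(b + d)) / (
        f(n) * f(a) * f(b) * f(c) * f(d))

print(round(fisher_table_p(2, 3, 4, 1), 4))  # 0.2381
```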
What is the main goal of statistical testing?
The main goal of a statistical test is to answer: "What is the probability of getting a result like my observed data if the null hypothesis were true?"
What is the null hypothesis for Chi-Square Test of Goodness of Fit, generally?
The null hypothesis for this kind of test is generally an extrinsic hypothesis, where you know the expected proportions before the experiment begins
What is the null hypothesis for Breslow-Day test?
The null hypothesis of this test is that odds ratios are equal across the different repeats.
How does Paired Sample T-Test reporting differ from the one-sample t-test?
The only difference from the one-sample t-test is that you don't need to include the number per group (n). Results are given in sentence form.
According to HBS, what is the only reason to use Mann-Whitney U-Test?
The only reason to use this test, according to HBS, is if you have a true ranked variable instead of a measurement variable.
The false discovery rate is a proportion. What does it tell you?
The proportion of significant results that are actually false positives.
The shape of this Chi-Square Distribution depends on what?
The shape of this distribution depends on the number of degrees of freedom.
How are the distributions of the t-test and z-test different?
The z-test uses a normal distribution. The t-test uses the t-distribution which is shorter, but has fatter tails. This is what makes the p-values different.
What type and how many variables are in Cochran-Mantel-Haenszel Test?
There are 3 nominal variables: the two in the 2x2 table and the third variable that identifies the repeats.
What does GoF test assume about the relationship between?
These tests assume that the individual observations are independent, meaning that the value of one observation does not influence the value of other observations.
How is Cochran-Mantel-Haenszel Test Statistic calculated?
Think of the four numbers in your 2x2 table and label them a, b, c, d, with a + b + c + d = n. Then χ²MH = {|∑[a - (a+b)(a+c)/n]| - 0.5}² / ∑[(a+b)(a+c)(b+d)(c+d) / (n³ - n²)], where each sum runs over the repeats.
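The formula above, sketched in Python over a list of (a, b, c, d) tables (stdlib only; names and tables are mine):

```python
def cmh_statistic(tables):
    """Cochran-Mantel-Haenszel chi-square over repeated 2x2 tables (a, b, c, d),
    with the 0.5 continuity correction from the card."""
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        num += a - (a + b) * (a + c) / n
        den += (a + b) * (a + c) * (b + d) * (c + d) / (n ** 3 - n ** 2)
    return (abs(num) - 0.5) ** 2 / den

tables = [(10, 10, 10, 10), (15, 5, 5, 15)]
print(round(cmh_statistic(tables), 3))  # 3.949
```

The result is compared to the chi-square distribution with 1 degree of freedom.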
When is William's Correction used?
This correction is used for goodness of fit tests
You could also present Odds Ratio data how?
This data could also be presented in a 2x2 table and the OR could be calculated as follows: (a/b) / (c/d), where a and b are the probabilities the event occurs for the two different groups, and c & d are the probabilities the event does not occur for the two groups.
Why is it unethical to use Sign Test after your paired t-test failed to reject the null?
This goes back to the principle that if you run multiple statistical analyses on a dataset, you are inflating your chances of a false positive. This test merely explores the role of chance in explaining the relationship and gives no direct estimate of the size of any effect. This test assumes measurements are independent. Remember, as a rule, non-parametric tests like this are less powerful than their parametric equivalents.
Chi-Square Test of Goodness of Fit is an alternative to what kind of test?
This is an alternative to the G-test of goodness-of-fit.
When is the two-sample t-test sensitive to deviations from Homoscedasticity?
This is an assumption of many tests, like the two-sample t-test. The two-sampled t-test is only sensitive to this assumption for small sample sizes (n<10).
Why does the test lose power when you have many categories within nominal variables and a small number of observations?
This is because your chi-square or G-value can't increase much, but the degrees of freedom increase with the # of categories.
How do you calculate Benjamini-Hochberg adjusted P Value?
This is either the raw P value multiplied by m/i or the adjusted P value for the next higher raw P value, whichever is smaller.
How do you calculate Benjamini-Hochberg Critical Value?
This is equal to (i/m)Q. i = rank, m = total number of tests, Q = chosen false discovery rate
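The step-down rule from the two cards above can be sketched like this (the function name `bh_adjusted` is mine):

```python
def bh_adjusted(pvalues):
    """Benjamini-Hochberg adjusted p values.

    adjusted = min(raw_p * m / rank, adjusted p of the next higher raw p),
    walking from the largest raw p value down, as described above.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])  # ranks 1..m by raw p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # start at the largest raw p value
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, raw P values [0.005, 0.009, 0.05, 0.1] adjust to [0.018, 0.018, 0.0667, 0.1]: the first value's raw adjustment (0.005 × 4/1 = 0.02) is capped by the next higher raw P value's adjustment (0.009 × 4/2 = 0.018).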
After a significant goodness of fit test, why would you want to perform a post-hoc test?
This is not a specific test, just a way to further analyze your data after running a goodness-of-fit or independence test. You can follow up a significant exact multinomial test, G-test, or chi-square test of goodness-of-fit by testing whether each category deviates significantly from the expected number: perform an exact binomial test on each category to see if any one category is significantly different from the expected outcome.
How do you use the cross-product?
This is the easy way to calculate the odds ratio. It is calculated as follows: (A×D) / (B×C). Unlike the way presented for calculating the odds ratio initially, you don't need to calculate the proportions. Just use the counts!
Standard Error of Odds Ratio can be used to find the what?
This is then used to add/subtract to the log of the odds ratio to find the confidence limits.
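The cross-product and the log-scale confidence limits from the two cards above can be sketched together; the SE formula sqrt(1/a + 1/b + 1/c + 1/d) is my assumption (the standard Woolf formula), since the cards don't spell it out:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio via the cross-product (a*d)/(b*c), with a confidence
    interval built by adding/subtracting z*SE on the log scale and
    exponentiating back.

    SE of ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d)  (assumed: Woolf formula)
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper
```

With counts a=10, b=20, c=5, d=40 the cross-product gives OR = (10×40)/(20×5) = 4.0, and the interval brackets it.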
Under what conditions should one use Chi-Square Test of Goodness of Fit (sample size, what kind of variables)?
This is used to test the goodness-of-fit when you have one nominal variable with two or more values and want to see whether the number of observations in each category fits a theoretical expectation, and the sample size is larger (>1000).
When would you use Fisher's Exact Probability Test?
This is used to test the significance of an odds ratio and should be used when the OR dataset takes the form of a 2x2 table.
What does Likelihood Ratio Chi-Square tell you?
This is used to test the significance of an odds ratio when the data is entered into a statistical analysis program. It uses a natural log, but we likely won't need to calculate it by hand.
When would you use Pearson Chi-Square?
This is used to test the significance of an odds ratio when there are more than just the 4 cells in a 2x2 table.
When would you use Cochran-Mantel-Haenszel Test?
This is used when you have data from 2x2 tables that you've repeated at different times/locations.
You use Repeated G-test of goodness-of-fit when you have what kind and how many variables?
This method is used when you have two nominal variables: one is something you'd analyze with a goodness-of-fit test, and the other variable represents repeating the experiment multiple times.
What does Odds Ratio evaluate?
This ratio evaluates whether the odds of a certain event or outcome are the same for two groups. So an odds ratio of 3 means the odds of the event occurring in one group are 3 times the odds in the other group.
What does Homoscedasticity refer to?
This term refers to the equal variances in the two groups (standard deviations are equal).
Why might you want to use the chi-square goodness of fit over a G-Test of goodness of fit?
This test is more familiar to more readers than the G-test, hence why it may be preferable to use.
1) What is a one-sample t-test comparing? 2) What assumption about normality does this make? 3) How sensitive to deviations in this assumption is this test? 4) What is this sensitive to? 5) How do you calculate the t-statistic? 6) The largert a t-statistic is, the (larger/smaller) p is. 7) How do you find the SE of the Sample Mean? 8) How do you calculate the SD? 9) T test tells us whether our data is statistically significant. How do we determine an effect size?
This test is used to explore how likely it is that the mean of a group differs from the hypothesized mean of the population. The P value, which estimates the probability that the group's estimated mean differs from the population mean due to chance, is derived using a t statistic. While this assumes normality, it is not sensitive at all to non-normality (even for samples less than 10). It is sensitive to skewness for small sample sizes (<50). The t-statistic is calculated as follows: t = (sample mean − hypothesized mean) / SE of sample mean. The larger the t statistic, the smaller the p value. So if the t statistic > critical value, we can reject the null. Now going back to module 1, how do we find the SE of the sample mean? First, estimate the population variance by finding the variance of the sample: S² = ∑(X − M)² / (N−1) = SS / (N−1), where SS = sum of squares. Find the standard deviation (SD) of the sample by taking the square root of the variance, √S². Standard Error of Sample Mean = SD/√N. So now we know how to determine whether our data is statistically significant. But what about effect size? d = (M − μ)/S, where S = standard deviation of the sample.
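The whole chain of formulas above (variance, SD, SE, t, and Cohen's d) can be sketched in one place; the helper name `one_sample_t` is mine:

```python
import math

def one_sample_t(sample, mu):
    """One-sample t statistic, SE, and Cohen's d, following
    S^2 = SS/(N-1), SE = S/sqrt(N), t = (M - mu)/SE, d = (M - mu)/S."""
    n = len(sample)
    m = sum(sample) / n                       # sample mean M
    ss = sum((x - m) ** 2 for x in sample)    # sum of squares
    s = math.sqrt(ss / (n - 1))               # sample SD (estimate of population SD)
    se = s / math.sqrt(n)                     # standard error of the mean
    t = (m - mu) / se
    d = (m - mu) / s                          # effect size
    return t, se, d
```

For the sample [5, 6, 7, 8, 9] against a hypothesized mean of 6: M = 7, SS = 10, S = √2.5, SE = √0.5 ≈ 0.707, t = √2 ≈ 1.414, d ≈ 0.632 (a medium-to-large effect by Cohen's convention).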
Under what conditions should one use Exact test (sample size, what kind of variables)?
This test is used when you have one nominal variable, a small sample size, and want to see whether the number of observations in each category fits a theoretical expectation.
What exactly is the purpose of Exact test?
This test is used when you have one nominal variable, a small sample size, and want to see whether the number of observations in each category fits a theoretical expectation.
Welch's T-test is an alternative to what test?
This test should be used in place of the two-sample t-test
A larger Cochran-Mantel-Haenszel Test Statistic means what?
This test statistic increases as the differences between observed & expected values increase or as the variance gets smaller.
What does Cochran-Mantel-Haenszel Test tell you?
This test tells you whether you have a consistent difference in proportions across the repeats.
Pooling allows us to avoid loss of power with many categories. When should we decide how we're going to pool our data?
To avoid this, we may pool some of the smaller categories together. Decisions about pooling should be made before analyzing data.
How does calculating the G statistic in this differ from calculating the G statistic in the goodness of fit test?
To calculate the G-statistic, proceed just like with the G-test of goodness of fit, but instead of using theoretical values for your comparison, you use expected values calculated from the data.
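A minimal sketch of the G statistic itself — the formula is the same for both G-tests; only where the expected counts come from differs (the function name is mine):

```python
import math

def g_statistic(observed, expected):
    """G = 2 * sum(O * ln(O / E)) over all cells."""
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
```

For observed counts [30, 70] against expected [50, 50], G ≈ 16.46.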
How does William's Correction differ from the Yates correction (other than how it's applied)?
Unlike Yates' correction, this can be applied to tests with more than one degree of freedom.
Unlike most tests, Exact test has no ____.
Unlike most statistical tests, there is no test statistic (like a chi-square or G value). Instead, one directly calculates the probability of obtaining the observed data under the null hypothesis, because the predictions of the null are so simple.
What characteristic of a G-test make it more versatile?
Unlike the chi-square test, these values are additive, meaning G-values of different parts of an experiment add up to an overall G-value for the whole experiment. The ability to do more elaborate statistical analyses is one reason some people prefer the G-test.
If you don't have a lot of samples, but have a ton of groups, what test are you going to use to test for goodness of fit?
Use G-test GoF cautiously if you don't have a lot of samples, but have a lot of groups. In this situation, you want to stick with the Exact test if able.
Under what conditions should one use Sign Test (sample size, what kind of variables)?
Use when a one-sample or paired t-test might traditionally apply. Also can be used in place of the Cochran-Mantel-Haenszel test if you're only interested in the direction of the differences in proportions and not the size. Used when there are two nominal variables and one measurement variable. One of the nominal variables must have only two values, and the other nominal variable identifies pairs of observations.
What test uses difference scores?
Used in paired sample t-tests.
While Exact Multinomial Test of Goodness-of-Fit is another name for the Exact test of goodness-of-fit, what condition must apply for this test?
Used in place of the exact binomial test when the nominal variable has more than two values.
When would we use Breslow-Day test?
Used to test whether the assumption used in the Cochran-Mantel-Haenszel test (that the odds ratio is the same in the different repeats) is true.
Is G-test of Independence used for small or large sample sizes?
Used when the sample size is large. (n>1000)
We use Fisher's Exact test of Independence for (small/large) sample sizes.
Used when the sample size is small. (n<1000)
When do we use a post-hoc test for the G-test of independence?
Used when you have a table larger than 2x2 and the data is significant, or close to significant. Never forget your Bonferroni correction.
We use Fisher's Exact test of Independence when we have what kind and how many variables?
Used when you have two nominal variables
You can use G-test of Independence when you have what type and how many variables?
Used when you have two nominal variables (each with 2+ possible values) and you want to see whether the proportions of one variable are different for different values of the other variable.
What is your objective when you use Chi-Square Test of Independence?
Used when you have two nominal variables (each with 2+ possible values) and you want to see whether the proportions of one variable are different for different values of the other variable. Used when the sample is large
1) What is Cohen's convention used to determine? 2) What is the equation for this? 3) What is considered a small, medium, and large effect?
Using the equation d = (M - μ)/S, this puts a label on effect size for a one-sample t-test, and paired t-test. ±0.2 = small effect, ±0.5 = medium effect, ±0.8 = large effect S = standard deviation of sample, μ = mean of population, M = sample mean
What should we report after a one-sample t-test?
We should report the following items after a one-sample t-test: Assumption testing, Descriptive statistics (M, SD), Number (N), number per group (n), Degrees of freedom (df), t value (t), significance level (p), effect size.
What should we report after a dependent t-test (paired)?
We should report the following items after a paired-sample t-test: Assumption testing, Descriptive statistics (M, SD), Number (N), Degrees of freedom (df), t value (t), significance level (p), effect size.
Welch's Anova
We use the t-test for normally distributed data with two variables, a nominal and a measurement variable. We use Kruskal-Wallis if the data is not normal but is homoscedastic. We use Welch's t-test if the data is not normal and is heteroscedastic. And then we use this if the conditions of Welch's t-test apply, but we have more than two groups.
When dealing with an exact test, what happens when our table is not conditioned?
When an exact test is not dealing with conditioned rows and columns, it's actually a conservative test (meaning you'll get a significant value less than 5% of the time).
What happens to your test if you have many categories with a small number of observations?
When dealing with nominal variables with many categories and a small number of observations in the categories, the test loses power.
What does the term conditioned mean?
When referring to a R&C table, this means the number of rows and columns are fixed. It's rare that this is the case.
When applying the bonferroni comparison for four groups what would you divide the p value by?
When using the Bonferroni correction here, divide by the number of comparisons (4 groups would yield 6 comparisons: 1&2, 1&3, 1&4, 2&3, 2&4, 3&4).
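The comparison count above is just "k choose 2", so the adjusted alpha can be sketched like this (the helper name `bonferroni_alpha` is mine):

```python
from math import comb

def bonferroni_alpha(k_groups, alpha=0.05):
    """Number of pairwise comparisons among k groups is C(k, 2);
    the Bonferroni-adjusted alpha divides by that count."""
    n_comparisons = comb(k_groups, 2)
    return n_comparisons, alpha / n_comparisons
```

For 4 groups this returns 6 comparisons and an adjusted alpha of 0.05/6 ≈ 0.0083.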
1) What is sample bias? 2) How do we correct for it?
When we're calculating the variance of a sample, it is consistently smaller than the actual variance of the population. This is why the degrees of freedom for a sample (n−1) is smaller than for a population (n): we're correcting for this bias.
When would you use a one-tailed variety of Exact test?
When you want to see if your observed value is GREATER (right-tailed) or LESS (left-tailed) than the expected value.
How do you categorize your data in a way that's relevant to the exact binomial test when using Sign Test?
You count the number of differences in one direction, compare it to the # in the other direction, then use the exact binomial test to see whether the numbers are different from a 1:1 ratio (if the measured observation is exactly equal to the null hypothesis (0) then you will just remove the observation from the data and sample size).
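The counting-and-binomial procedure above can be sketched as follows (the function name `sign_test_p` is mine; the two-sided p is taken as twice the more extreme tail, which is exact for the symmetric 1:1 null):

```python
from math import comb

def sign_test_p(diffs):
    """Two-sided sign test: drop zero differences, count the differences
    in one direction, and compute an exact binomial tail probability
    under the 1:1 null (p = 0.5)."""
    nonzero = [d for d in diffs if d != 0]   # ties are removed from the data
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)     # differences in one direction
    k = max(k, n - k)                        # count in the more extreme tail
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)                # two-sided p value
```

With 8 positive and 2 negative differences, the two-sided p is 2 × (45 + 10 + 1)/1024 ≈ 0.109, so we'd fail to reject the 1:1 null.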
How do you apply Yates' Correction?
You do so by subtracting 0.5 from each observed value that is > expected and adding 0.5 to those that are < expected.
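Since the correction always moves each observed count 0.5 toward its expected value, it can be folded into the chi-square sum as |O − E| − 0.5 (the function name is mine):

```python
def yates_chi_square(observed, expected):
    """Chi-square with Yates' continuity correction: each observed count
    is moved 0.5 toward its expected value, i.e. (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e
               for o, e in zip(observed, expected))
```

For observed [30, 70] against expected [50, 50] this gives 2 × 19.5²/50 = 15.21, slightly less than the uncorrected 16.0.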
We use Mann-Whitney U-Test when we have what kind and how many variables?
You have one independent nominal variable (2 possible groups) and one dependent variable (ordinal: strict ranks; interval: you know the difference between ranks but no true 0; ratio: interval but with a true 0).
When would you conduct a post-hoc test on a chi-square test for independence?
You want to conduct a post-hoc test for this with tables larger than 2x2 that yield significant results (sometimes insignificant results are worth further analysis as well).
How does the Sign Test differ from a paired t-test, as far as your objective is concerned?
You're not interested in the size of the difference (like in a paired t-test) but are just interested in the direction.
A test of independence? (Chi-square Test)
a test of independence when one is comparing frequencies of one nominal variable for different values of a second nominal variable.
Repeated G-test of goodness-of-fit allows you to do what?
allowing you to test several hypotheses at once
What kind of correction is applied?
bonferroni
Purpose of G test of independence?
compares frequencies of one nominal variable for different values of a second nominal variable.
Repeated G-test of goodness-of-fit allows us to conduct multiple tests. THus, we should be applying what correction?
do your typical G-test of goodness-of-fit with the Bonferroni correction.
How is William's Correction applied?
found by dividing the chi-square or G value by the following: q = 1 + (a² − 1)/(6nv), where a = number of categories, n = sample size, v = degrees of freedom.
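A one-liner sketch of that division (the function name `williams_corrected` is mine):

```python
def williams_corrected(stat, a, n, v):
    """Apply Williams' correction: divide a chi-square or G value by
    q = 1 + (a^2 - 1) / (6 * n * v), where a = number of categories,
    n = sample size, v = degrees of freedom."""
    q = 1 + (a ** 2 - 1) / (6 * n * v)
    return stat / q
```

For example, with a = 4, n = 100, v = 3, q = 1 + 15/1800 ≈ 1.0083, so the correction shrinks the statistic slightly (and thus makes the test a bit more conservative).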
How do you assign these ranks for Mann-Whitney U-Test?
increasing order of magnitude (lowest = 1). Give observations with the same magnitude an avg ranking.
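The tie-averaging rule above can be sketched like this (the function name `average_ranks` is mine):

```python
def average_ranks(values):
    """Rank observations in increasing order of magnitude (lowest = 1),
    giving tied observations the average of the ranks they span."""
    ranked = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(ranked):
        j = i
        # extend j over the run of tied values starting at position i
        while j + 1 < len(ranked) and values[ranked[j + 1]] == values[ranked[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2            # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    return ranks
```

For the observations [3, 1, 4, 1, 5], the two tied 1s share rank (1+2)/2 = 1.5, so the ranks come out [3.0, 1.5, 4.0, 1.5, 5.0].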
What assumption is made about the independent variable for Mann-Whitney U-Test?
nominal variable (2 possible groups)
If the dependent variable is ordinal, what does that mean in common man speak? Interval? Ratio? (Mann-Whitney U-Test)
ordinal - strict ranks, interval - you know the difference between ranks but no true 0, ratio - interval but with true 0
What is unique about one of the nominal variables in Repeated G-test of goodness-of-fit, when compared to the other goodness of fit tests?
other variable represents repeating the experiment multiple times.
We use McNemar's Test when what assumption does not hold true for Chi-Square, G-Test, and Exact?
samples are not independent, such as before-after observations on the same individuals.
What does Fisher's Exact Probability Test tell you?
significance of an odds ratio
What does Pearson Chi-Square tell you?
test the significance of an odds ratio
If the test statistic for Chi-Square GoF and Chi-Square Ind is calculated the same way for both tests, what's different between these?
the difference between these two tests comes from how you calculate the degrees of freedom
What is the null hypothesis (most often) for Exact Multinomial Test of Goodness-of-Fit?
the number of observations in each category is equal to that predicted by a biological theory. The most common example is genetic crosses. This test gets pretty math-complex, so a G-test or chi-square test of goodness-of-fit should be used.
When would you use Likelihood Ratio Chi-Square?
to test the significance of an odds ratio when the data is entered into a statistical analysis program
We use Welch's T-test when one of what two assumptions of this test are not met?
when the groups have substantially different (one is 2x greater than the other) standard deviations and sample sizes are small (<10) or unequal.
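A sketch of Welch's statistic for the unequal-variance case described above; the Welch-Satterthwaite degrees-of-freedom formula is my addition (the card doesn't state it), and the function name is mine:

```python
import math

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two groups with unequal variances and/or sample sizes."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    v1 = sum((xi - m1) ** 2 for xi in x) / (n1 - 1)   # sample variances
    v2 = sum((yi - m2) ** 2 for yi in y) / (n2 - 1)
    se2 = v1 / n1 + v2 / n2                           # squared SE of the difference
    t = (m1 - m2) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```

Note the denominator uses each group's own variance rather than a pooled variance, which is exactly what makes the test robust to heteroscedasticity.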
What kind of and how many variables is Chi-Square Test of Independence used for?
when you have two nominal variables (each with 2+ possible values)
What is an extrinsic hypothesis (Chi-Square Test of Goodness of Fit)?
where you know the expected proportions before the experiment begins
What is the heterogeneity G value?
which tells you if the proportion differs significantly among your groups; an insignificant heterogeneity G-value means you can pool your results without worry and perform a G-test of goodness of fit on the pooled data.
We use Fisher's Exact test of Independence when we want to determine what?
you want to see whether the proportions of one variable are different depending on the value of the other variable
How would you calculate the test statistic for Pearson Chi-Square?
χ² = ∑ (o − e)² / e
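That sum translates directly into code (the function name is mine):

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For observed counts [30, 70] against expected [50, 50], χ² = 400/50 + 400/50 = 16.0.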