L7: Statistical Methods for Non-Continuous Data (Chi-Square Tests)

A cohort study examines the association between pesticide exposure and prostate cancer among males in the United States. Investigators estimate a risk ratio of 1.3 with a 95% CI (1.1, 1.5). Therefore, we can conclude that there is a statistically significant association between pesticide exposure and prostate cancer among males in the United States.

True: The 95% confidence interval does not include the null value of 1, so we reject the null hypothesis that there is no association. There is a statistically significant association.

Step 5: Calculate p-value

Use RStudio.

CI for Odds Ratio (OR)

• Similar to the CI for RR; a, b, c, d are the cell frequencies from the 2x2 table.
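
As a hedged illustration of the card above, the sketch below uses the standard large-sample formula SE[ln(OR)] = sqrt(1/a + 1/b + 1/c + 1/d) and then exponentiates the limits. The function name `odds_ratio_ci` and the cell layout (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome) are assumptions made for this example, and the counts are borrowed from the smoking cessation practice card, not from an odds-ratio example in the set.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table.

    Assumed layout (hypothetical naming for this sketch):
      a = exposed with outcome,   b = exposed without outcome
      c = unexposed with outcome, d = unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR); the interval is built on the log scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from the smoking cessation trial: 40 of 120 quit vs. 10 of 80 quit.
or_hat, or_lo, or_hi = odds_ratio_ci(40, 80, 10, 70)
print(round(or_hat, 2), round(or_lo, 2), round(or_hi, 2))
```

Because the limits are computed on the log scale, the resulting interval is not symmetric around the OR.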

Practice 1: Smoking Cessation Trial.
• Smoking cessation intervention trial
  - 40/120 intervention participants quit at 6 months
  - 10/80 comparison participants quit at 6 months
• Was the intervention effective (was the proportion of quitters higher in the intervention vs. control group)?
  - Risk of quitting smoking at 6 months

Risk of quitting smoking at 6 months: intervention 40/120 = 33.3%; comparison 10/80 = 12.5%. The proportion of quitters was higher in the intervention group.

For which of the following variable types is a chi-squared test NOT appropriate?

Continuous - Chi-square tests are used to test the difference between observed frequencies and expected frequencies of outcomes under the null hypothesis. Appropriate types of variables include dichotomous, ordinal and categorical because these types of outcome variables are expressed as counts and relative frequencies (proportions). Continuous outcome variables are compared using t-tests.

Steps 3 and 4: chi-squared and df Calculations

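df = (r-1)*(c-1) - The chi-squared test statistic takes the difference between the observed and expected frequency in each cell, squares it so that only the magnitude counts, divides by the expected frequency, and sums the result over all cells.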
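
A minimal Python sketch of Steps 3 and 4, assuming the usual definition of the statistic (sum over cells of (observed - expected)^2 / expected, with no continuity correction). The function name `chi_square_statistic` is hypothetical; the counts are the smoking cessation trial data from the practice card (40 of 120 quit vs. 10 of 80 quit).

```python
def chi_square_statistic(observed):
    """Return (chi-squared statistic, df) for an RxC table of counts.

    `observed` is a list of rows. No continuity correction is applied
    (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Rows: intervention, comparison. Columns: quit, did not quit.
smoking = [[40, 80], [10, 70]]
chi2, df = chi_square_statistic(smoking)
print(round(chi2, 2), df)  # 11.11 1
```

The result agrees with the chi-square (1 df) = 11.11 reported in the Step 6 conclusion for this trial.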

A chi-square analysis is to be performed for a study where there are two possible outcomes and 5 exposure categories. What is the number of degrees of freedom (df)?

- 4 - The number of degrees of freedom = (r-1) x (c-1) where r= the number of rows for exposure categories and c= the number of columns for outcomes. In this analysis, there are 5 exposure categories and 2 outcome columns, so df=(5-1) x (2-1) = 4.

Consider the five statements about measures of association shown below. Which one is a correct statement of the null hypothesis?

- The risk difference = 0 - An absolute measure of association, or difference measure, is computed by subtracting the measure of disease frequency (outcome) in the unexposed from the measure of disease frequency in the exposed. If these two risks (or rates, prevalences or odds) are equivalent, then RD will be equal to 0. - The null hypothesis for an absolute measure of association is H0: RD=0, or PD=0, or IRD=0 - The null hypothesis for a relative measure of association is H0: RR=1, or PR=1, or IRR=1, or OR=1

Chi-Square Test of Independence

- outcome is dichotomous, categorical or ordinal - two or more groups compared - data can be organized into a RxC table of any size •Research question can be phrased as either: Is there an association between two variables? or Is there a difference in outcome between two (or more) groups?

Review: Measures of Association

-Risk ratio: RR=pE/pU; E=index (exposed), U=referent (unexposed) -Prevalence ratio: PR=pE/pU -Odds ratio: OR=odds of disease in exposed/odds of disease in unexposed
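
The definitions above can be sketched in a few lines of Python. This is an illustrative example only: the function name `measures_of_association` and the cell labels are assumptions, the risk difference is included because it is the absolute counterpart of the ratio measures, and the counts come from the smoking cessation practice card.

```python
def measures_of_association(a, b, c, d):
    """Point estimates from a 2x2 table (hypothetical helper).

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    p_exposed = a / (a + b)      # pE
    p_unexposed = c / (c + d)    # pU
    return {
        "risk_exposed": p_exposed,
        "risk_unexposed": p_unexposed,
        "risk_ratio": p_exposed / p_unexposed,       # null value 1
        "risk_difference": p_exposed - p_unexposed,  # null value 0
        "odds_ratio": (a / b) / (c / d),             # null value 1
    }

# Smoking cessation trial: 40 of 120 quit (intervention), 10 of 80 quit (comparison).
result = measures_of_association(40, 80, 10, 70)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The same pE/pU calculation gives a prevalence ratio when the proportions are prevalences rather than risks.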

Step 6. What is your conclusion?

-We reject the null hypothesis that the risk of quitting smoking was the same in the intervention and non-intervention groups (p<0.05). People in the smoking intervention group had a higher risk of quitting smoking than people who were not in the intervention group (33.3% vs. 12.5%, chi-square (1 df) = 11.11, p=0.0009).

Steps to apply chi-squared test

1) Formulate null and research hypotheses
2) Calculate expected RxC values from the observed RxC table
3) Calculate the chi-squared test statistic
4) Calculate degrees of freedom: df = (R-1)*(C-1)
5) Compute the p-value using RStudio
6) Make a conclusion regarding statistical significance

Investigators in the Framingham Heart study designed a cohort study to examine the association between BMI and the development of type 2 diabetes over a 20-year period. BMI was an ordinal variable defined as: 1) underweight, 2) normal weight, 3) overweight and 4) obese. Type 2 diabetes was an ordinal variable defined as: 1) did not develop diabetes, 2) pre-type 2 diabetic and 3) developed type 2 diabetes. If the investigators use a chi-square test to determine if there is an association, how many degrees of freedom are associated with the chi-square statistic?

6 - The exposure has 4 levels and the outcome has 3 levels, so the degrees of freedom is (r-1)*(c-1) = 3*2 = 6
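
The degrees-of-freedom rule is simple enough to check by hand, but a small helper makes the pattern explicit. The function name `degrees_of_freedom` is an assumption made for this sketch.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df for a chi-square test on an RxC table: (R - 1) * (C - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)

print(degrees_of_freedom(4, 3))  # BMI (4 levels) by diabetes status (3 levels): 6
print(degrees_of_freedom(5, 2))  # 5 exposure categories by 2 outcomes: 4
print(degrees_of_freedom(2, 2))  # any 2x2 table: 1
```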

Practice 3: Calculate 95% CI for RR

We are 95% confident that the true risk ratio for side effects, comparing treatment to placebo, lies between 0.27 and 2.08.

Step 2: Calculate expected frequencies

In a 2x2 table with fixed row and column totals, only the first cell (A) has freedom to vary. Once A is set, the B, C and D cells are determined by the totals and do not have freedom to vary, so a 2x2 table has 1 degree of freedom.
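
To see why a 2x2 table has only 1 degree of freedom, the sketch below fills in the whole table from a single cell plus the fixed totals. The helper name `fill_2x2` and its argument order are assumptions for illustration; the totals are those of the smoking cessation trial (120 intervention, 80 comparison, 50 quitters overall).

```python
def fill_2x2(a, row1_total, row2_total, col1_total):
    """Recover cells B, C and D from cell A and the fixed margins."""
    b = row1_total - a   # rest of row 1
    c = col1_total - a   # rest of column 1
    d = row2_total - c   # rest of row 2
    return [[a, b], [c, d]]

# Only A = 40 is chosen freely; the margins dictate everything else.
print(fill_2x2(40, 120, 80, 50))  # [[40, 80], [10, 70]]
```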

Practice 4: Hypotheses. Step 1. Write the null and research hypotheses in words and using statistical notation.

H0: There is no association between mother's BMI and child's obesity status; p1 = p2 = p3 H1: There is an association between mother's BMI and child's obesity status; at least one p is different OR H0: The risk of child obesity status does not differ by mother's BMI group; p1 = p2 = p3 H1: The risk of child obesity status differs by mother's BMI group; at least one p is different

Step 1: Formulate Null and Research Hypotheses

H0: There is no association between smoking intervention and risk of quitting smoking; p1 = p2 H1: There is an association between smoking intervention and risk of quitting smoking; p1 ≠ p2 or H0: The proportion of participants who quit smoking does not differ between smoking intervention groups; p1 = p2 H1: The proportion of participants who quit smoking differs between smoking intervention groups; p1 ≠ p2 p = proportion; 1 = intervention group, 2 = non-intervention group

Step 5: Calculating p-values for Chi-squared Tests Using R

With the test statistic and df, you can use R to compute the p-value. When chi-squared is 0, the observed and expected frequencies are the same in every cell. Chi-squared cannot be negative.
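
The cards compute this step in R. As a language swap for illustration only, the sketch below gets the same upper-tail probability with the Python standard library. It relies on the closed forms for 1 and 2 degrees of freedom and a standard recurrence that adds 2 df at a time. The function name `chi_square_p_value` is hypothetical.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail p-value P(X >= statistic) for a chi-square distribution.

    Base cases: df = 1 -> erfc(sqrt(x/2)), df = 2 -> exp(-x/2).
    Each step of the loop moves from k to k + 2 degrees of freedom.
    """
    if statistic < 0 or df < 1:
        raise ValueError("statistic must be >= 0 and df must be a positive integer")
    half = statistic / 2.0
    if df % 2 == 1:
        p, k = math.erfc(math.sqrt(half)), 1
    else:
        p, k = math.exp(-half), 2
    while k < df:
        p += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p

# Values quoted in the cards: statistic, df, and the p-value each card reports.
print(round(chi_square_p_value(11.11, 1), 4))   # card reports p = 0.0009
print(round(chi_square_p_value(1.9908, 2), 3))  # card reports p = 0.370
print(round(chi_square_p_value(4.91, 2), 2))    # card reports p = 0.09
print(chi_square_p_value(0.0, 1))               # statistic of 0 gives p = 1.0
```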

Step 6: Conclusion

Chi-square statistic = 1.9908; df = 2; p=0.370 We fail to reject the null hypothesis of no association (p>0.05). There is no statistically significant association between smoking status and risk of stroke recurrence (39.13% for smokers, 27.27% for past smokers, 30.21% for never smokers, p=0.37). OR... The proportion (risk) of participants with stroke recurrence does not statistically significantly differ across smoking categories (39.13% for smokers, 27.27% for past smokers, 30.21% for never smokers, p=0.37).

Which of the following statements regarding risk ratios and odds ratios is true?

A log transformation of odds ratios and risk ratios tends to produce a normal distribution and this is used to compute confidence intervals for odds ratios and risk ratios. - Odds ratios and risk ratios are not normally distributed; they are positively (right) skewed. However, the natural log of odds ratios and risk ratios is normally distributed, which allows us to compute confidence intervals around relative measures of association. - We cannot directly calculate risks or rates using case-control data. The controls are a sample of the source population. Therefore, the sizes of the exposed group and unexposed group are unknown, so we don't have the denominators needed to calculate risks and rates. - The null value for odds ratios and risk ratios (and prevalence ratios and incidence rate ratios) is 1.

A researcher designs a randomized clinical trial in which half of the participants are assigned to a placebo and the other half are assigned to a diuretic to examine the effect on their hypertension status (participants are classified as hypertensive or not hypertensive - a dichotomous outcome). Which hypothesis test should be used to determine if the risk for hypertension is the same for those assigned to the diuretic versus those assigned to the placebo?

Chi-square test of independence - We have a dichotomous outcome and we are comparing two independent groups, so we can use a chi-square test of independence to determine if there is an association.

Step 2: Calculate Expected Frequencies

E=(row total)*(column total)/N - How much do the observed and expected frequencies differ?
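
A short sketch of the expected-frequency formula applied to every cell of a table. The function name `expected_frequencies` is an assumption; the observed counts are from the smoking cessation practice card (rows: intervention, comparison; columns: quit, did not quit).

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total) * (column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[40, 80], [10, 70]]
expected = expected_frequencies(observed)
print(expected)  # [[30.0, 90.0], [20.0, 60.0]]
```

Here each observed count differs from its expected count by 10, which is what the chi-squared statistic then summarizes.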

Example: Smoking and Recurrent Stroke. Step 1: Formulate null and research hypotheses

H0: There is no association between smoking status and recurrent stroke risk; p1 = p2 = p3 H1: There is an association between smoking status and recurrent stroke risk; at least one p is different or H0: The proportion of participants with recurrent stroke does not differ across smoking status groups; p1 = p2 = p3 H1: The proportion of participants with recurrent stroke differs across smoking status groups; at least one p is different - Null hypothesis: the proportion with recurrent stroke is the same in every smoking category. Research hypothesis: there is an association between smoking and recurrent stroke risk. - The research hypothesis is always two-tailed. - The test statistic cannot be negative. - In a cohort study the proportions are risks; in a cross-sectional or ecologic study they are prevalences.

Example: Hypertension and Diabetes Mellitus RR

Note: CI is not symmetric about RR

RRs and ORs from R

OR is the appropriate relative measure of association for a case-control study. RR or IRR is most appropriate for an experimental or cohort study, depending on whether person-time data was collected.

Practice 3: Interpretation of RR

Participants in the intervention group had 0.75 times the risk (a 25% reduced risk) of having a side effect in 6 weeks compared to participants in the placebo group. - The interpretation states in words what the value is and what it means; in the conclusion, use the p-value or confidence interval to describe statistical significance.
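
The "0.75 times" and "25% reduced" wordings are the same number expressed two ways. A tiny sketch, assuming the 6% and 8% side-effect risks quoted in the Practice 3 conclusion; the function name `percent_change_in_risk` is hypothetical.

```python
def percent_change_in_risk(rr):
    """Percent change in risk relative to the referent group.

    Negative values are reductions, positive values are increases.
    """
    return (rr - 1.0) * 100.0

rr = 0.06 / 0.08  # risk in the treatment group / risk in the placebo group
print(round(rr, 2), round(percent_change_in_risk(rr), 1))  # 0.75 -25.0
```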

Confidence Intervals for RR* and OR

RR and OR are ratios that are not normally distributed. The natural log (ln) of RR or OR is approximately normally distributed and is used to compute the confidence interval. - Step 1: Convert from RR to ln(RR) - Step 2: Find the CI for ln(RR) - Step 3: Convert the CI from ln(RR) back to RR - Because ln(RR) and ln(OR) are approximately normally distributed, we can calculate the 95% CI using z critical values, but we must first log-transform the risk ratio or odds ratio. - The null value for the odds ratio is 1.
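
The three steps above can be sketched as follows. This is a hedged example that assumes the standard large-sample standard error SE[ln(RR)] = sqrt(1/a - 1/n1 + 1/c - 1/n2), where a and c are the numbers with the outcome and n1 and n2 are the group sizes. The function name `risk_ratio_ci` is hypothetical, and the counts are the smoking cessation trial data rather than the example the card originally accompanied.

```python
import math

def risk_ratio_ci(a, n_exposed, c, n_unexposed, z=1.96):
    """Return (RR, lower, upper) using the log-transform method."""
    rr = (a / n_exposed) / (c / n_unexposed)
    # Step 1: move to the log scale.
    log_rr = math.log(rr)
    se_log_rr = math.sqrt(1 / a - 1 / n_exposed + 1 / c - 1 / n_unexposed)
    # Step 2: confidence limits for ln(RR).
    log_lower = log_rr - z * se_log_rr
    log_upper = log_rr + z * se_log_rr
    # Step 3: convert the limits back to the RR scale.
    return rr, math.exp(log_lower), math.exp(log_upper)

# 40 of 120 intervention participants quit vs. 10 of 80 comparison participants.
rr, rr_lo, rr_hi = risk_ratio_ci(40, 120, 10, 80)
print(round(rr, 2), round(rr_lo, 2), round(rr_hi, 2))
```

The upper limit sits farther from the RR than the lower limit does, which is why the CI is not symmetric about the RR.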

p-value and Conclusion

Step 5. Compute the p-value using RStudio. Step 6. What is your conclusion? We fail to reject the null hypothesis (p>0.05). There is no statistically significant association between mother's BMI and child's obesity status (chi-square (2 df) = 4.91, p=0.09).

Which of the following is the correct statement of the null hypothesis for a chi-square test comparing the frequency of an outcome among two or more exposure groups?

The distribution of the outcome is the same among the exposure groups. - The null hypothesis for a chi-square test for two or more independent groups is that the distribution of the outcome is independent of the comparison groups. Stated another way, the null hypothesis is that there is no difference in the frequency of the outcome across exposure groups (no association between exposure and outcome). - The alternative hypothesis is that the distribution of the outcome is dependent on the comparison groups. As such, the alternative hypothesis contradicts the null hypothesis by stating that exposure status influences the frequency of the outcome.

A randomized clinical trial examines the effect of Ivermectin (an FDA-approved drug) on mortality among COVID-19 patients. The study finds that Ivermectin, compared to a placebo, reduces the risk of mortality by 15% (95% CI 0.65, 1.05). Which of the following statements about the p-value for this association is true?

The p-value is greater than 0.05. - Since the 95% CI includes the null value of 1, we know the association is not statistically significant and thus the p-value is greater than 0.05.
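
The rule used in this answer (a 95% CI that contains the null value corresponds to p > 0.05) can be written as a one-line check. The function name `ci_excludes_null` is an assumption for this sketch; the default null value of 1 applies to ratio measures, and 0 would be passed for difference measures.

```python
def ci_excludes_null(lower, upper, null_value=1.0):
    """True if the interval excludes the null value (significant at the CI's level)."""
    return not (lower <= null_value <= upper)

print(ci_excludes_null(1.1, 1.5))    # pesticide and prostate cancer RR: True
print(ci_excludes_null(0.65, 1.05))  # Ivermectin and mortality: False
print(ci_excludes_null(0.27, 2.08))  # Practice 3 side effects: False
```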

A researcher designs a randomized clinical trial in which half of the participants are assigned to a placebo and the other half are assigned to a diuretic to examine the effect on their systolic blood pressure (measured in mm Hg - a continuous outcome). Which hypothesis test should be used to determine if the mean systolic blood pressure is equal for those assigned to the diuretic versus those assigned to the placebo?

Two independent sample t-test - Our outcome is continuous and we are comparing two independent groups, so the two independent sample t-test should be used.

Practice 3: Conclusion

We don't reject the null hypothesis (p>0.05). There is no statistically significant difference in the risk of side effects comparing the new medication to the placebo (6% versus 8%, 95% CI: 0.27-2.08, p=0.58).

