Epidemiology and Biostatistics (EASY POINTS)
Define each term: - Incidence - Prevalence - Mortality - Case fatality
- Incidence: # of new cases / # at risk (if you already have the disease, you are not at risk) - Prevalence: # of affected persons / total population - Incidence x duration of disease = Prevalence - Mortality: # of deaths / total population - Case fatality: # of deaths / # of cases (see IMAGE)
Confidence Intervals and Statistical Significance: What 3 things do you check with confidence intervals to see if the results are statistically significant?
-CI for mean difference cannot cross 0 (can't have negative and positive values in the range) -CI for ratios cannot cross 1 (values must either be only above one or only below one). -CI between two groups cannot cross each other / overlap -- if they do, the two groups are not statistically different.
An outbreak of Ebola occurred in Liberia between 2014 and 2016. The population in Liberia at the time was 3.75 million. 10,675 people were diagnosed with Ebola and there were 4809 deaths. What was the case fatality rate?
0.45 4,809 / 10,675 = 0.45
A cohort study was conducted to study the association of coffee drinking and anxiety. Among 10,000 coffee drinkers, 500 developed anxiety. Among the 20,000 non-coffee drinkers, 200 cases of anxiety were observed. 1. What is the relative risk of anxiety associated with coffee use? A. 3.50 B. 4.25 C. 5.00 D. 7.25 E. 8.75 2. How many people must drink coffee to cause one case of anxiety? A. 4 B. 20 C. 25 D. 50 E. 60
1. C. 5.00 Relative Risk = [A / (A+B)] / [C / (C+D)] = (500 / 10,000) / (200 / 20,000) = 0.05 / 0.01 = 5 2. C. 25 Attributable Risk = [A / (A+B)] - [C / (C+D)] = 0.05 - 0.01 = 0.04 Number needed to harm = 1/AR = 1 / 0.04 = 25 Absolute Risk Reduction = [C / (C+D)] - [A / (A+B)] Number needed to treat = 1/ARR
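The coffee/anxiety arithmetic can be checked with a short Python sketch. The function name and the a/b/c/d cell layout (exposed with/without the outcome, unexposed with/without the outcome) are illustrative choices, not part of the original card.

```python
def cohort_measures(a, b, c, d):
    """a, b = exposed with / without outcome; c, d = unexposed with / without outcome."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    relative_risk = risk_exposed / risk_unexposed
    attributable_risk = risk_exposed - risk_unexposed
    number_needed_to_harm = 1 / attributable_risk
    return relative_risk, attributable_risk, number_needed_to_harm


# Coffee drinkers: 500 of 10,000 anxious; non-drinkers: 200 of 20,000 anxious.
rr, ar, nnh = cohort_measures(500, 9_500, 200, 19_800)
print(round(rr, 2), round(ar, 2), round(nnh))
```

Rounded, this reproduces RR = 5 and NNH = 25.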
Name the term for each type of study: 1. A single patient's story 2. Tracks multiple subjects with a common known exposure, such as patients who have received a similar treatment, or examines their medical records for exposure and outcome. 3. Snapshot in time of a bunch of cases all at once. 4. Begins with the effect and looks for the cause (retrospective). You begin with the disease state and look for a common exposure or risk factor. 5. Begins with the exposure or risk factor and looks at the long-term effects. It's a particular form of longitudinal study that samples a group of people who share a defining characteristic, typically those who experienced a common event in a selected period, such as birth or graduation, and you perform cross-sectional studies at intervals through time.
1. CASE REPORT - one patient's story 2. CASE SERIES - a few stories 3. CROSS SECTIONAL - snapshot in time 4. CASE CONTROL - begin with disease state - By definition, must be RETROSPECTIVE 5. COHORT - begin with exposure/risk factor - Can be RETROSPECTIVE or PROSPECTIVE - Cohort = group of people who share a defining characteristic
Name each type of bias that can occur while interpreting the results: EASY POINTS 1. When a factor is related to both the exposure and outcome, but not on the causal pathway, it distorts or confuses effect of exposure on outcome. E.g., Pulmonary disease is more common in coal workers than the general population; however, people who work in coal mines also smoke more frequently than the general population 2. Early detection is confused with survival. E.g., Early detection makes it seem like survival rate has increased, but the disease's natural history has not changed. 3. Screening test detects diseases with long latency period, while those with shorter latency period become symptomatic earlier. E.g., A slowly progressive cancer is more likely detected by a screening test than a rapidly progressive cancer.
1. Confounding bias - confounding variable 2. Lead-time bias 3. Length-time bias
A researcher was investigating adverse effects due to smoking. She found that the death rate per 100,000 for lung cancer is 7 among nonsmokers and 71 among smokers. The death rate per 100,000 for coronary thrombosis is 422 among nonsmokers and 599 among smokers. The prevalence of smoking in the population is 55%. 1. What is the relative risk of dying of coronary thrombosis for smokers versus nonsmokers? A. 0.7 B. 10.1 C. 5.9 D. 0.1 E. 1.4 2. What is the relative risk of dying of lung cancer for smokers versus nonsmokers? A. 0.7 B. 5.9 C. 1.4 D. 10.1 E. 0.1
1. E. 1.4 (599 / 422 = 1.4) 2. D. 10.1 (71 / 7 = 10.1) Relative Risk = [A / (A+B)] / [C / (C+D)]
Name each type of bias that can occur while performing the study: EASY POINTS 1. Awareness of disorder alters recall by subjects; common in retrospective studies. E.g., Patients with disease recall exposure after learning of similar cases. 2. Information is gathered in a systematically distorted manner. E.g., Association between HTN and MI not observed when using faulty automatic sphygmomanometer. 3. Subjects in different groups are not treated the same. E.g., Patients in treatment group spending more time in highly specialized hospital units. 4. Researcher's belief in the efficacy of a treatment changes the outcome of that treatment. E.g., An observer expecting the treatment group to show signs of recovery is more likely to document positive outcome.
1. Recall bias 2. Measurement bias 3. Procedure bias 4. Observer-Expectancy Bias
A study is designed to test a new "artificial pancreas" in children with Type 1 diabetes. It is calculated that without the device, the risk of hospitalization in a calendar year is 8%. With the device, the risk of hospitalization is 2%. What is the relative risk reduction for users of this artificial pancreas? A. 22% B. 40% C. 75% D. 80% E. 94%
C. 75% Relative Risk = [A / (A+B)] / [C / (C+D)] = 2% / 8% = 25% Relative Risk Reduction = 1 - RR = 100% - 25% = 75% reduction in risk
A randomized controlled trial is initiated that examines a new inhaled treatment for bronchiolitis in infants. 200 infants are recruited; 100 receive the new therapy and 100 receive placebo. 18 of the patients in the treatment group develop a pruritic rash compared to 2 in the placebo group (p<0.05). What is the number needed to harm for this new treatment?
6.25
The total population in Arkansas in 2012 was 6 million and 383,982 people died. What is the mortality rate per 1000 people?
63.997
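Both rate calculations (case fatality from the Ebola card and mortality per 1,000 here) are simple ratios. A minimal Python sketch, with hypothetical function names and the figures taken from the two cards:

```python
def case_fatality(deaths, cases):
    """Proportion of diagnosed cases that died."""
    return deaths / cases


def mortality_rate(deaths, population, per=1000):
    """Deaths per `per` people in the total population."""
    return deaths / population * per


print(round(case_fatality(4_809, 10_675), 2))        # Ebola card
print(round(mortality_rate(383_982, 6_000_000), 3))  # Arkansas card
```

The only difference between the two is the denominator: diagnosed cases versus the whole population.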
An investigator wanted to determine whether vitamin deficiency was associated with birth defects. By reviewing the birth certificates during a single year in a large county, the researcher located 189 infants born with NTDs. A total of 600 other births were selected at random. Mothers were given a dietary questionnaire. Among mothers who gave birth to an infant with an NTD, 84 reported no use of supplementary vitamins; a total of 137 control mothers did not use a vitamin. What is the Odds Ratio between vitamin use and NTDs? A. 0.37 B. 0.14 C. 4.22 D. 0.47 E. 1.83
A. 0.37
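To see where 0.37 comes from, fill in the 2x2 table with vitamin use as the exposure: 189 - 84 = 105 case mothers and 600 - 137 = 463 control mothers used vitamins. The Python sketch below uses a hypothetical helper and cell layout to apply OR = AD/BC.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)


cases_total, controls_total = 189, 600
cases_no_vitamin, controls_no_vitamin = 84, 137

a = cases_total - cases_no_vitamin        # case mothers who used vitamins
b = controls_total - controls_no_vitamin  # control mothers who used vitamins
print(round(odds_ratio(a, b, cases_no_vitamin, controls_no_vitamin), 2))
```

An OR below 1 means vitamin use was less common among mothers of infants with NTDs than among controls.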
A 43 year-old woman comes to her physician's office because she had a positive serum test for HIV. Of the 100 patients tested; 20 had HIV and the test was positive for 18 of them. The remaining 80 patients did not have HIV; however, the test was positive for 4 of them. Given this patient's positive test, which of the following is the probability that she does not have HIV? A. 18% B. 20% C. 24% D. 82% E. 90%
A. 18% Of the 22 positive tests (18 TP + 4 FP), 4 are false positives: FP / (TP + FP) = 4/22 = 18% (this is 1 - PPV)
Two journal articles are assigned to medical students during their OB-GYN clerkship. The first article was a study evaluating the relationship between breast cancer and a woman's history of breastfeeding. The investigator selects women with breast cancer and an age-matched sample of women who live in the same neighborhood as the women with breast cancer. Study subjects are interviewed to determine if they breastfed any of their children. The second article was a study to investigate the relationship between exposure to chest irradiation and subsequent risk of breast cancer. Women who received radiation for postpartum mastitis in the 1950s were compared to women who received alternative treatments. The women were followed for 60 years to determine the incidence of breast cancer in each group. What type of study design were the first and second study, respectively? A. Case-Control; Prospective Cohort B. Cross-sectional; Case-Control C. Prospective Cohort; Cross-sectional D. Cross-sectional; Retrospective Cohort E. Prospective Cohort; Case-Control F. Case-Control; Cross-sectional G. Retrospective Cohort; Case-Control H. Cross-sectional; Prospective Cohort I. Case-Control; Retrospective Cohort
A. Case-Control; Prospective Cohort
Before the 2010 Winter Olympics, the International Olympic Committee wanted a screening test that would correctly identify a high proportion of athletes who used illegal performance-enhancing drugs. The athletes were concerned about a screening test that would incorrectly identify persons when, in fact, they were not using them. Which of the following screening test characteristics were important to each group? A. High sensitivity for Olympic officials and high specificity for athletes B. High specificity for Olympic officials and high sensitivity for athletes C. High positive predictive value for Olympic officials and high sensitivity for athletes D. High specificity for Olympic athletes and high negative predictive value for athletes. E. High positive predictive value for Olympic officials and high negative predictive value for athletes.
A. High sensitivity for Olympic officials and high specificity for athletes Olympic officials want it to detect if it's there. Olympic athletes don't want a false positive.
A placebo-controlled clinical trial is conducted to assess whether a new antihypertensive drug is more effective than standard therapy. A total of 5000 patients with essential hypertension are enrolled and randomly assigned to one of two groups: 2500 patients receive the new drug and 2500 patients receive placebo. If the alpha is set at 0.01 instead of 0.05, which of the following is the most likely result? A. Significant findings can be reported with greater confidence B. The study will have more power C. There is a decreased likelihood of a Type II error D. There is an increased likelihood of statistically significant findings E. There is an increased likelihood of a Type I error
A. Significant findings can be reported with greater confidence
Once you have a 2x2 table, how do you determine: - Attributable Risk - Absolute Risk Reduction - Number needed to treat - Number needed to harm
Attributable Risk = [A / (A+B)] - [C / (C+D)] Number needed to harm = 1/AR Absolute Risk Reduction = [C / (C+D)] - [A / (A+B)] Number needed to treat = 1/ARR
A new scholarship is created for UAMS medical students. Students are eligible if their USMLE Step 1 score is at least 2 standard deviations above the mean for their class. There are 160 students in this year's class. Approximately how many students are eligible? A. 2 B. 4 C. 8 D. 9 E. 12
B. 4
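The answer follows from the rule of thumb that about 95% of a normal distribution lies within 2 standard deviations of the mean, leaving about 2.5% in each tail: 0.025 x 160 = 4. The Python sketch below compares that with the exact normal tail. It assumes the scores are normally distributed, which the card implies but does not state, and the function name is illustrative.

```python
from statistics import NormalDist


def expected_above(n_students, z):
    """Expected number of students scoring more than z SDs above the mean."""
    upper_tail = 1 - NormalDist().cdf(z)
    return n_students * upper_tail


rule_of_thumb = 160 * 0.025    # 95% within 2 SD, so 2.5% in the upper tail
exact = expected_above(160, 2)
print(rule_of_thumb, round(exact, 2))
```

The exact tail is slightly under 2.5%, so both versions round to 4 students.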
The prevalence of undetected diabetes in a population to be screened is approximately 1.5%, and it is assumed that 10,000 persons will be screened. The screening test will measure blood serum glucose content. A value of 180 mg% or higher is considered positive. The sensitivity and specificity associated with this screening test are 22.9% and 99.8% respectively. What is the predictive value of a positive test? A. 22.9% B. 63% C. 98.8% D. 36.3% E. 44%
B. 63% PPV = probability of disease for person w/ + test LR+: change in probability after a positive test LR+ = Sensitivity/(1-Specificity) LR-: change in probability after a negative test LR- = (1-Sensitivity)/Specificity
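The 63% comes from building the 2x2 table out of the prevalence, sensitivity, and specificity given in the question. A Python sketch of that bookkeeping, with a hypothetical function name:

```python
def screening_table(n, prevalence, sensitivity, specificity):
    """Return expected (TP, FP, FN, TN) counts for n people screened."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sensitivity
    fn = diseased - tp
    tn = healthy * specificity
    fp = healthy - tn
    return tp, fp, fn, tn


tp, fp, fn, tn = screening_table(10_000, 0.015, 0.229, 0.998)
ppv = tp / (tp + fp)
print(round(tp, 2), round(fp, 2), round(ppv, 3))
```

Of the 150 people with diabetes, about 34 test positive. About 20 of the 9,850 people without diabetes also test positive, so roughly 34 / 54 of all positives are true positives.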
You are reading an article in JAMA that describes 3 patients who presented to an emergency department with hyperthermia and agitation after ingestion of bath salts. What type of study is this? A. Case Report B. Case Series C. Case-Control D. Cohort E. Randomized Control Trial
B. Case Series Multiple people's individual experiences.
A study is conducted to examine the relationship between alcohol consumption and medical school performance. Participants in this study were classified as abstainers, light drinkers, or heavy drinkers using established Research Diagnostic Criteria. The same participants were also classified as being in the top, middle, or bottom of their class. Results showed that students in the top or the bottom of the class were more likely to be heavy drinkers with a p-value of ≤ 0.01. Which of the following statistical tests was most likely used to generate this result? A. Analysis Of Variance B. Chi-square C. Matched Pairs T-test D. Meta-analysis E. Pearson Correlation Coefficient F. Pooled T-test
B. Chi-square Chi-square for Categorical data
A randomized controlled trial is developed to test a new medication for atrial fibrillation. The investigator wants to enroll 200 patients. α is set to 0.05. A power calculation is performed and shows that β = 0.4. How can they improve their power? A. Lower the α from 0.05 to 0.01 B. Increase the number of patients enrolled C. Decrease the number of patients enrolled D. Perform intention-to-treat analysis E. Change the trial to become open-label
B. Increase the number of patients enrolled Larger sample size => Increased power
A study is developed to test whether or not a relationship exists between a medical student's step 1 score and how many residency interviews they attend. What type of statistical method would be used to test the results? A. Chi-squared B. Pearson Correlation C. ANOVA D. t-test E. Relative risk
B. Pearson Correlation Wanting to see a correlation. Step scores are continuous data, so chi squared, t-test, and ANOVA would not work.
A retrospective cohort study is examining birth complications in women with diabetes. The study determines that babies are more likely to be born large for gestational age (LGA) if the mother has diabetes. The relative risk for the study is calculated to be 4. Which of the following describes this relative risk? A. The incidence of diabetes among mothers with LGA babies is 4X that of non-LGA mothers B. The incidence of LGA infants among women with diabetes is 4X that of women without diabetes C. The incidence of LGA infants among women without diabetes is 4 times that of women with diabetes D. The odds of diabetes among mothers with LGA babies is 4X that of non-LGA mothers E. The odds of LGA infants among women with diabetes is 4X that of women without diabetes.
B. The incidence of LGA infants among women with diabetes is 4X that of women without diabetes
A new screening test for HIV is developed and tested in 500 subjects. Has HIV and tested positive: 100 Has HIV, but tested negative: 40 Does not have HIV, but tested positive: 10 Does not have HIV and tested negative: 350 What is the probability that a positive test indicates the presence of HIV? A. 0.71 B. 0.97 C. 0.91 D. 0.89 E. 0.28
C. 0.91 PPV = A / (A + B) --or-- TP / (TP + FP) 100 / (100 + 10) = 100 / 110 = 0.91
In a city with a population of 1 million, 10,000 individuals have AIDS. During the course of a year there are 1,000 new cases of AIDS and 200 deaths from the disease. There are 2,500 deaths from all causes during the year. Assuming no net emigration from or immigration to the city, the incidence of AIDS in the city during this year is given by which of the following expressions? A. 800/990,000 B. 800/1,000,000 C. 1,000/990,000 D. 1,000/1,000,000 E. 2,500/1,000,000 F. 10,000/1,000,000
C. 1,000/990,000 With incidence, you take the existing cases out of the denominator, because people who already have the disease are not at risk: 1,000,000 - 10,000 = 990,000. The numerator is the number of new cases (1,000).
A study of the relationship between recent alcohol consumption (yes or no) and boating accidents found a risk ratio of 2.0 and a P value of 0.15. The best interpretation of this finding is: A. Recent drinkers were two times more likely to have a boating accident than nondrinkers. B. There was no association between alcohol consumption and boating accidents because the P value indicates that the results were not statistically significant. C. People who drank recently were twice as likely to have a boating accident as compared with people who did not drink recently, although the results were not statistically significant. D. There was a 30% risk of having a boating accident among people who drank recently compared with a 15% risk of having a boating accident among people who did not drink recently. E. None of the above
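The key step is removing people who already have the disease from the denominator, since they are no longer at risk. A small Python sketch, with an illustrative function name and the inputs from this card and from the pediatric-clinic card:

```python
def incidence(new_cases, population, existing_cases):
    """New cases divided by the population still at risk."""
    at_risk = population - existing_cases
    return new_cases / at_risk


city = incidence(1_000, 1_000_000, 10_000)  # AIDS card: 1,000 / 990,000
clinic = incidence(44, 300, 45)             # autism card: 44 / 255
print(city, round(clinic * 100, 1))
```

The same function gives about 17 per 100 for the clinic example, because 45 of the 300 children were already diagnosed.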
C. People who drank recently were twice as likely to have a boating accident as compared with people who did not drink recently, although the results were not statistically significant. Risk Ratio of 2.0 = twice as likely p is >0.05, so not significant.
What statistical test would you want to do for: - Categorical data (sex; race) - If you're comparing two groups - If you're comparing 3 or more groups - Interval or Ratio data (continuous data; like numbers)
Chi-square for categorical data t-test for comparing the means of two groups ANOVA for comparing the means of 3+ groups Pearson correlation coefficient for the relationship between two continuous (interval or ratio) variables.
The association between job-related exposure to welding fumes and COPD was explored in a case-control study. The following data was reported for 399 COPD patients: 37 currently employed as welders; the remainder had no occupational exposure. Among 800 controls, 48 were employed as welders. Calculate the Odds Ratio for welding fumes and COPD. A. 0.37 B. 0.72 C. 1.00 D. 1.60 E. 2.46
D. 1.60 OR = AD/BC Fill in the table, calculate, get 1.60.
A researcher is investigating if there is a relationship between tobacco use and skin cancer. 2,500 patients are enrolled and placed into groups according to smoking status (never, former, current). After being followed for 15 years, the data shows no statistical difference between groups. What type of study is this? A. Case Control B. Clinical Trial C. Cross-sectional D. Prospective cohort E. Retrospective cohort
D. Prospective cohort You know it's a cohort study. It's prospective, not retrospective, because you're starting with similar patients and your data is the effects (rather than starting with effect and determining the cause).
In a cohort study of elderly women, the relative risk ratio for hip fractures among those who exercise regularly is 1.2 (95% confidence interval of 1.1 to 1.8). Which of the following is the most appropriate conclusion about the effect of regular exercise on the risk for hip fracture? A. Statistically non-significant increase in risk B. Statistically non-significant overall decrease in risk C. Statistically significant overall decrease in risk D. Statistically significant overall increase in risk
D. Statistically significant overall increase in risk CI does not cross 1, so it's significant.
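The confidence-interval rule used here (a ratio CI must not include 1, a difference CI must not include 0) can be written as a one-line check. A hedged Python sketch with a hypothetical function name:

```python
def ci_is_significant(lower, upper, null_value):
    """True when the confidence interval excludes the null value.

    Use null_value=1 for ratios (RR, OR) and null_value=0 for mean differences.
    """
    return not (lower <= null_value <= upper)


print(ci_is_significant(1.1, 1.8, null_value=1))  # hip-fracture RR from the card
print(ci_is_significant(0.8, 1.8, null_value=1))  # made-up interval that crosses 1
```

The second interval is invented for contrast only; it is not from any card.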
A student researcher conducts a study regarding the occurrence of lung cancer in non-smoking subjects who reside with smokers. She determines that a non-smoking individual subjected to secondhand smoke at home for 10 years or more has a 50% chance of developing lung cancer. Which of the following is the probability that five randomly selected non-smoking individuals exposed to secondhand smoke for more than 10 years will all develop lung cancer? A. 0.3% B. 2.2% C. 20% D. 22% E. 3% F. 30%
E. 3% Probability is the extent to which an outcome is likely to occur, given by the ratio of wanted outcomes to the number of possible cases. In this study, there is a 50% chance a non-smoking individual will develop lung cancer after being exposed to smoke in the household for 10 or more years. Assuming the five individuals are independent (e.g., living in separate smoking households), the probability that all five develop cancer is (0.5)^5 = 0.03 (3%).
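The multiplication rule for independent events is easy to verify. A tiny Python sketch, with an illustrative function name, assuming the five individuals are independent:

```python
def prob_all(p, n):
    """Probability that n independent events, each with probability p, all occur."""
    return p ** n


print(prob_all(0.5, 5))  # 0.03125, i.e. about 3%
```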
How do you use your LR+ and LR- on the Fagan Nomogram to convert pretest probability (aka prevalence) to determine post-test probability?
Find prevalence on left, match it with LR+ and LR- in middle column to find post-test probability. Overall, this determines whether it's worth doing the test (will it actually help?)
Implication for a test with high: - Positive Predictive Value - Negative Predictive Value
High positive predictive value helps rule a condition in -- if positive, they likely have the disease - good for diagnostic tests - compare SpPin (a highly Specific test, when Positive, rules in) High negative predictive value helps rule a condition out -- if negative, they likely don't have the disease - good for screening tests - compare SnNout (a highly Sensitive test, when Negative, rules out)
How do you calculate: - Positive Likelihood Ratio (LR+) - Negative Likelihood Ratio (LR-) E.g., Sensitivity = 0.9 Specificity = 0.825 Also, define both terms.
LR+: change in probability after a positive test LR+ = Sensitivity/(1-Specificity) LR-: change in probability after a negative test LR- = (1-Sensitivity)/Specificity If Sensitivity = 0.9 and Specificity = 0.825, LR+ = 0.9 / (1 - 0.825) = 5.14 LR- = (1 - 0.9) / 0.825 = 0.121
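What the Fagan nomogram does graphically can also be done by hand: convert pretest probability to odds, multiply by the likelihood ratio, and convert back. The Python sketch below uses the card's sensitivity and specificity. The 10% pretest probability is an assumption chosen only for illustration, and the function names are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest_probability, lr):
    """Probability -> odds, multiply by LR, odds -> probability."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(0.9, 0.825)
pretest = 0.10  # assumed pretest probability (prevalence), not from the card
after_positive = post_test_probability(pretest, lr_pos)
after_negative = post_test_probability(pretest, lr_neg)
print(round(lr_pos, 2), round(lr_neg, 3))
print(round(after_positive, 3), round(after_negative, 3))
```

With these inputs a positive test raises the probability from 10% to about 36%, and a negative test lowers it to about 1%.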
How do raising vs. lowering the cutoff points for data affect the sensitivity vs. specificity and the PPV vs. NPV of a test?
Lowering the cutoff point: - (shift left) - Increases False Positives - Decreases False Negatives - Increases seNsitivity and NPV - Decreases sPecificity and PPV Raising the cutoff point: - (shift right) - Decreases False Positives - Increases False Negatives - Decreases seNsitivity and NPV - Increases sPecificity and PPV For example, in diabetes screening, RAISING the blood glucose cutoff needed for diagnosis will decrease the sensitivity of the test (harder to be diagnosed) and raise the specificity of the test. It will increase the PPV, but decrease NPV.
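The trade-off can be seen on a toy data set. The glucose values in the Python sketch below are invented for illustration only, and a value at or above the cutoff counts as a positive test.

```python
# Hypothetical glucose values (mg/dL); not taken from any real study.
diseased = [150, 170, 190, 210, 230]
healthy = [90, 110, 130, 160, 185]


def sens_spec(cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(v >= cutoff for v in diseased)
    true_neg = sum(v < cutoff for v in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


for cutoff in (140, 180, 200):
    sensitivity, specificity = sens_spec(cutoff)
    print(cutoff, sensitivity, specificity)
```

As the cutoff rises from 140 to 200, sensitivity falls (1.0, 0.6, 0.4) while specificity rises (0.6, 0.8, 1.0).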
A randomized controlled trial is initiated that examines a new inhaled treatment for bronchiolitis in infants. 200 infants are recruited; 100 receive the new therapy and 100 receive placebo. 24 of the patients in the treatment group need admission compared to 64 in the placebo group (p<0.05). What is the number needed to treat for this new treatment?
NNT = 2.5
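The steps behind NNT = 2.5 are ARR = control risk - treatment risk, then NNT = 1 / ARR. A Python sketch with a hypothetical function name that also returns the relative risk reduction:

```python
def treatment_effect(events_treated, n_treated, events_control, n_control):
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated   # absolute risk reduction
    rrr = arr / risk_control            # relative risk reduction, same as 1 - RR
    nnt = 1 / arr                       # number needed to treat
    return arr, rrr, nnt


arr, rrr, nnt = treatment_effect(24, 100, 64, 100)
print(round(arr, 2), round(rrr, 3), round(nnt, 2))
```

Admission risk falls from 64% to 24%, so ARR = 0.40 and NNT = 2.5.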
Define each type of data: - Nominal - Ordinal - Interval - Ratio - °K is which type, while °C and °F are which type? - What would class rank be? - Height? - Blood Pressure? - Sex; Race?
NOIR - in order of increasing usefulness • Nominal: Names or Categories (Sex, Race) • Ordinal: Ranked scales (Likert, Class Rank) • Interval: Ordered & exact differences, but no true zero (°C, °F) • Ratio: Ordered, exact differences, and an absolute zero (°K, height, BP) Interval and Ratio are both continuous data, but only ratio data has an absolute zero (a value of zero means none of the quantity), so ratios such as "twice as tall" are meaningful
A new rapid HIV test enters the market. The sensitivity is 95% and the specificity is 80%. You're working in a primary care clinic and have two patients this afternoon who need to be tested for HIV. Patient A: From Little Rock, where the prevalence of HIV is 4%. Patient B: Just arrived from Africa and would like to establish care. The prevalence in her home country is 40%. Using your new rapid HIV test, what is the predictive value of a negative test in each patient?
NPV Little Rock: 99.7% NPV Africa: 96% NPV = TN / (TN + FN) - probability of no disease for person w/ - test
Once you have a 2x2 table, how do you determine: - Odds ratio - Relative risk - Relative risk reduction
Odds Ratio = AD/BC -- Diagonal over Diagonal Relative Risk = [A / (A+B)] / [C / (C+D)] Relative Risk Reduction = 1 - RR.
An increase in prevalence would have what effect on: - Positive Predictive Value - Negative Predictive Value
PPV & NPV change w/ prevalence: ↑Prev = ↑PPV & ↓NPV
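The rapid HIV test card (sensitivity 95%, specificity 80%) makes a convenient check of this rule. The Python sketch below computes both predictive values directly from prevalence; the function name is illustrative.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)
    return tp / (tp + fp), tn / (tn + fn)


for prevalence in (0.04, 0.40):
    ppv, npv = predictive_values(prevalence, 0.95, 0.80)
    print(prevalence, round(ppv, 3), round(npv, 3))
```

Going from 4% to 40% prevalence, PPV rises from about 17% to 76% while NPV falls from about 99.7% to 96%.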
How to set up a table and determine (formulas): - Positive Predictive Value - Negative Predictive Value - Sensitivity - Specificity
PPV = TP / (TP + FP) - probability of disease for person w/ + test NPV = TN / (TN + FN) - probability of no disease for person w/ - test Sensitivity = TP / (TP + FN) Specificity = TN / (FP + TN)
Define: - Positive skew - Negative skew In terms of mean, median, and mode And shapes of curve in each type of skew
Positive Skew: Mean > Median > Mode - Average is higher than the other two - Average is higher than reality - Caused by a few really high values throwing the mean off / pulling the mean to the right. - MNEM: Positive skew is in alphabetical order -- mean > median > mode Negative Skew: Mean < Median < Mode - Average is lower than the other two - Average is lower than reality - Caused by a few really low values throwing the mean off / pulling the mean to the left.
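A quick way to see the ordering is to compute all three statistics on a small made-up sample with one extreme value. The data in this Python sketch are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical sample: one very high value pulls the mean to the right.
right_skewed = [1, 2, 2, 3, 4, 20]
# Mirror image: one very low value pulls the mean to the left.
left_skewed = [-v for v in right_skewed]


def summarize(values):
    return mean(values), median(values), mode(values)


print(summarize(right_skewed))  # mean > median > mode
print(summarize(left_skewed))   # mean < median < mode
```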
Precision vs. Accuracy What does each provide in a study? I.e., ... 1. If a study is very precise, it has high ___. 2. If a study is very accurate, it has high ___. - Which one is decreased by random errors in a test? - Which one is decreased by systematic error or bias in a test? - Which one, if high, will reduce standard deviation and increase statistical power.
Precision (Increases Reliability): - The consistency and reproducibility of a test. - The absence of random variation in a test. - Decreased by random errors in a test. - High precision reduces standard deviation. - High precision increases statistical power (1 −β). Accuracy (Increases Validity): - The trueness of test measurements. - The absence of systematic error or bias in a test. - Decreased by systematic errors in a test.
You are running a pediatric clinic with 300 patients and 45 of your patients have autism. What is your prevalence? You measure in the next year that 44 more patients are diagnosed with autism. What is the incidence per 100 children for that year?
Prevalence: 0.15 Incidence: 17
5% of women have uterine cancer. You diagnose 10 new cases in a year. - What is the prevalence? - What is incidence per 100 people?
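Prevalence: 5% Incidence: 10.5 per 100 people (10.5%), assuming a population of 100 women (not stated in the question): 5 already have the disease, so 95 are at risk and 10 / 95 = 10.5%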
A study is performed on a new test for strep throat. 1000 patients are enrolled. 20% of the patients have strep throat. 150 patients test positive including 30 of the healthy volunteers. Calculate the sensitivity and specificity of this new test.
Sensitivity = TP / (TP + FN) Specificity = TN / (FP + TN) 200 patients have strep throat. 150 test positive. 30 of the positive tests were false positives. Thus 120 of the positive tests were true positives. 850 tested negative. So fill in the chart. Sensitivity = 120/200 = 0.6 Specificity = 770/800 = 0.96 So this would be a good diagnostic test because it has really high specificity.
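Filling in the chart can be sketched in Python as follows. The variable names are illustrative; every number is derived from the figures in the question.

```python
n = 1000
with_strep = int(n * 0.20)         # 200 patients have strep throat
without_strep = n - with_strep     # 800 do not
positives = 150
false_pos = 30                     # healthy volunteers who tested positive

true_pos = positives - false_pos   # 120
false_neg = with_strep - true_pos  # 80
true_neg = without_strep - false_pos  # 770

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)
print(sensitivity, specificity, ppv, round(npv, 3))
```

The four cells sum back to 1,000, which is a useful check before computing any ratio.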
A new screening test for breast cancer is tested against the gold standard. 1050 patients are tested and their data is: 240 patients with breast cancer had a positive test 10 patients with breast cancer had a negative test 20 patients without breast cancer had a positive test 780 patients without breast cancer had a negative test What is the specificity of the new test?
Specificity = TN / (FP + TN) Specificity = 780/800 = 0.975 Very high specificity -- good diagnostic test
Types of Errors: - Type I error is a false ___. Greek letter? - Type II error is a false ___. Greek letter? What is the actual definition of Power? Formula?
Type I error: α Chance of showing a difference when one does NOT exist False positive Type II error: β Chance of showing no difference when one DOES exist False negative POWER: Probability of rejecting the null hypothesis when it actually is false (1-β) -- i.e., probability of detecting a significant difference. H0: Hypothesis of NO difference - Null H1: Hypothesis of some difference - What you want.
For each phase of a drug trial, name who participates and what they are trying to figure out: - Phase I - Phase II - Phase III - Phase IV
• Phase I: Small # Healthy Volunteers -- "Is it Safe?" • Phase II: Small # w/ Disease -- "Does it Work?" • Phase III: Large # w/ Disease -- "Is it an Improvement? -- Better than previous?" • Phase IV: Surveillance after approved and on the market.
COMMON, EASY POINTS Define each type of prevention strategy: - Primary - Secondary - Tertiary
• Primary prevention -Prevention of disease by reducing susceptibility or exposure (e.g., quarantine) • Secondary prevention -Earlier detection and/or treatment of the disease • Tertiary prevention -Reducing morbidity/disability from the disease USMLE-Rx: Primary prevention involves interventions administered to prevent onset of disease or injury before any evidence of disease has occurred. Secondary prevention involves identifying asymptomatic disease in patients and preventing its progression, for example treating hypertension to prevent stroke. Tertiary prevention involves treating symptomatic disease and reducing its morbidity, for example, using antiplatelet drugs after myocardial infarction.