study design pt 1
To show causation, the suspected causal factor must occur or be present ________ the effect (disease) developed
before *in epidemiology especially with chronic diseases, this is more difficult than it sounds
are cohort studies big or small
can be either
methods often used in epidemiology (and medical research) include
case descriptions, cross-sectional surveys, cohort studies, and case-control studies
characteristics of systematic review and meta-analysis
combines multiple prior studies; incorporates rigorous appraisal of those studies' potential biases; produces new measure of effect size
BRFSS state comparisons- chronic disease indicators
compare things like diabetes in different states
even if your purported causal factor of interest is statistically associated with the disease, and occurs prior to disease onset, you need to
eliminate other possible explanations (causes) that also could fit the data
characteristics of an ecological study
examines relationship between exposure and disease with population-level rather than individual-level data
public health falls to __________ government
local and state
types of analytic studies
observational and experimental
five PCP (and CMV and candidiasis cases)
see slide 22
causality is inferred from
statistical association, temporal sequence, plausibility of mechanism, controlling for biases and confounders, and ruling out alternative explanations
characteristics of lab experiments, clinical / field trials, pragmatic trials
studies preventions and treatments for diseases; investigator actively manipulates which groups receive the agent under study; randomization may be of patients, physicians, facilities, etc.
characteristics of a cohort study
typically examines multiple health effects of an exposure; subjects are defined according to their exposure levels and followed for disease occurrence; may be prospective or retrospective
characteristics of a cross-sectional study
typically examines relationship between exposure and disease prevalence in a defined population at a single point in time
indoor wood burning raises women's lung cancer risk by 43% study hypothesis
burning wood in a fireplace or stove creates smoke, which increases lung cancer risk
one-time (cross-sectional) ecological study
would relate the frequency of a characteristic (e.g., smoking rate) with outcome of interest (e.g., lung cancer incidence)
do all research designs have strengths and weaknesses
yes
positive aspects of cross-sectional surveys
•Are fairly quick and easy to perform •Are useful for hypothesis generation
positive aspects of ecological studies
•Are fairly quick and easy to perform •Are useful for hypothesis generation
positive aspects of experiments
•Are the "gold standard" for evaluating treatment (clinical trials) or preventive interventions (field trials) •Allow investigator to have extensive control over research process •Broad category including lab experiments, randomized controlled trials, some clinical trials *, cluster-randomized trials, pragmatic trials
negative aspects of cohort studies
•Are time-consuming and costly (especially prospective studies) •Can study only the risk factors measured at the beginning •Can be used only for common diseases
negative aspects of experiments
•Are time-consuming and usually costly •Can study only interventions or exposures that are controlled by investigator •May be limited in generalizability •May not be ethically feasible
positive aspects of cohort studies
•Can be used to obtain a true (absolute) measure of risk •Can study many disease outcomes •Are good for studying rare risk factors
negative aspects of case-control studies
•Can obtain only a relative measure of risk •Selection of controls may be difficult •Temporal relationships may be unclear •Can study only one disease outcome at a time
negative aspects of case studies/series
•Cannot test hypotheses •Can explore only what is seen (or looked for)
negative aspects of qualitative studies
•Cannot test hypotheses •Can explore only what is stated or heard
positive aspects of systematic reviews and meta-analysis
•Decrease subjective element of literature review •Increase statistical power •Allow exploration of subgroups •Provide quantitative estimates of effect size
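A minimal sketch of how a meta-analysis can produce a single quantitative effect size with more precision than any one study. It uses inverse-variance (fixed-effect) pooling, which is one common approach and not necessarily the one a given review uses. The three relative risks and standard errors are hypothetical, not taken from any real studies.

```python
import math

def pool_fixed_effect(log_effects, standard_errors):
    """Inverse-variance (fixed-effect) pooling of log-scale effect estimates."""
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_log = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_log, pooled_se

# Hypothetical relative risks and standard errors (of the log RR) from three studies.
relative_risks = [1.5, 1.2, 1.8]
standard_errors = [0.30, 0.20, 0.40]

pooled_log, pooled_se = pool_fixed_effect(
    [math.log(rr) for rr in relative_risks], standard_errors
)
pooled_rr = math.exp(pooled_log)
ci_low = math.exp(pooled_log - 1.96 * pooled_se)
ci_high = math.exp(pooled_log + 1.96 * pooled_se)
print(f"pooled RR = {pooled_rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The pooled standard error is smaller than that of any single study, which is the sense in which pooling increases statistical power. Pooling does not fix poor-quality inputs.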
positive aspects of case studies/series
•Describes something novel •Generates hypotheses for further research
positive aspects of qualitative studies
•Descriptive •Real persons' own words
negative aspects of ecological studies
•Do not allow for causal conclusions to be drawn •Are subject to ecological fallacy •Are not good for hypothesis testing
negative aspects of cross-sectional surveys
•Do not offer evidence of a temporal relationship between risk factors & disease •Are subject to late-look bias •Are not good for hypothesis testing
positive aspects of case-control studies
•May be fairly quick and easy to perform •Can study many risk factors •Are good for studying rare diseases
negative aspects of systematic reviews and meta-analysis
•Mixing poor quality studies together in a review or meta-analysis does not improve the underlying quality of studies •Accounting for publication bias may be difficult
examples of TriNetX papers
-COVID-19: Protective effect of vaccination on post-acute sequelae -pre-existing auto-immune disease and adverse events with immune checkpoint inhibitors among cancer patients -fibrosis-4 predicts need for MV in COVID-19
where do we start- cross sectional, cohort (prospective), case control
-CS: starts at one time snapshot where have mixture of data on exposure and outcome -C: starts with exposure and proceeds to looking at outcome -CC: start with outcome and look backwards at probability of exposure
famous case reports
-first description of Parkinson's -identification of AIDS -E. coli outbreak in Germany and then another in US -China CDC Weekly: original 4 pneumonia cases for identification of COVID-19
nurses' health study
-3 generations, > 280,000 participants -established by Dr. Frank Speizer in 1976 with continuous funding from the NIH since that time. -the original focus of the study was on contraceptive methods, smoking, cancer, and heart disease, but has expanded over time to include research on many other lifestyle factors, behaviors, personal characteristics, and more than 30 diseases
famous prospective cohort studies
-Framingham Heart Study (1948 - current) -Nurses' Health Study (1976 - current) -National Child Development Study (post-WWII - current, in United Kingdom)
retrospective cohort studies you can do with a mentor
-VCU & VCUHS participate in the TriNetX clinical data repository -10 years of de-identified clinical EMR data from VCUHS and ~80 other healthcare systems ->1 million VCUHS patients, >100 million across systems -easy to use query interface for creating cohorts -built-in analytics to describe and compare cohorts, produce incidence and prevalence of events, and evaluate outcomes
prospective studies
-begin by enrolling participants before outcomes of interest occur, then measure them longitudinally; this ensures temporality of the association
NIEHS sister study smoker model
-compared never and ever smokers -effect doesn't go away when stratified by smoking: both non-smokers and smokers show the association between wood smoke and incident lung cancer
various research designs may reveal this
-comparing the rate of disease before and after an intervention -comparing groups with and without exposure to risk factor(s) -comparing groups with and without treatment -comparing groups with treatment A versus treatment B
framingham output
-conventional and novel vascular risk factors -inflammatory, hormonal and other biomarker data -subclinical disease -subclinical structural and functional changes that accompany brain aging -clinical end-points of incident stroke -diet, physical activity, depression and social networks -alternative causes of morbidity and mortality -neuropathological data on over 100 deceased persons -genetic database
retrospective cohorts in palliative care research
-cost savings for home-based palliative care -VCU part of prospective study looking at overall effect of palliative care on costs
alternative explanations: ulcers
-data: anecdotes -leading explanation: stress, diet, hypersecretion of acids; everyone "knew" that bacteria couldn't survive in the harsh environment of the acid-washed stomach -eventual explanation (1980s): helicobacter pylori infection is cause of most cases
alternative explanations: cholera in london, 1849
-data: strong correlation between elevation and cholera rates -leading explanation: miasma theory (noxious vapors); met criteria for association, temporality, and elimination of alternatives -eventual explanation: germ theory (Pasteur 1860s, Koch 1870s, etc.) also fits the 1849 London data - greater elevation means less contamination of well water by river Thames - and other data as well
major measures in public health
-disease incidence and prevalence -life expectancy -causes of death *there are inequities and disparities in these measures
are all clinical trials experimental
-if you define an experiment as a comparison of 2+ groups randomly assigned to conditions, then no -Phase I trials: One-armed trial of a novel agent to determine safety in humans (and perhaps pharmacokinetics) vis-à-vis dosage and adverse events -Phase II trials: Often but not always one-armed trials to determine preliminary effects and adverse events at selected dose(s)
framingham heart study
-longitudinal, prospective study - wide variety of risk factors, and cardio-vascular, neurological and other health outcomes -3 generations of participants 1. 1948/50 Original Cohort- 5,209 participants (2/3 of adults in Framingham), evaluated biennially (29 times) 2. 1971- Offspring Cohort- children of Originals and their spouses (5,124) 3. 2002- Gen3- 4,095 participants -evaluated every other yr so original cohort evaluated 29 times since late 1940s and 50s
NIEHS sister study second model
-looked at confounders and adjusted for race and ethnicity, education, urban status, smoking status, number of pack years for smokers, total years exposed to environmental tobacco smoke, marital status and income status -multivariate regression, adjusted hazard ratio -still above 1.0 for presence of wood-burning fireplace/stove and used wood as source of fuel -1.68 times likelihood of lung cancer for those who used it at least 30 times in year
indoor wood burning raises women's lung cancer risk by 43% study confounders
-smoking: people who smoke cigarettes may be more or less likely to have fires and more or less likely to get lung cancer -SES: depending on SES, some women are more likely to have indoor fireplaces or to cook with open fires, and those same women might be more or less likely to develop lung cancer -other sources of smoke: women who use fires indoors may also be more likely to spend time at outdoor bonfires or to burn leaves or trash -urban air pollution: women who live in a city are exposed to urban air pollution and may be more or less likely to have wood-burning stoves and fireplaces
NIEHS sister study results
-people who have a wood-burning fireplace or stove had a hazard ratio of 1.05, but the CI crosses 1.0 -uses wood as source of fuel: 1.25, CI stays above 1.0, meaning 1.25 times the risk of lung cancer -using a fireplace at least one month a year is associated with 1.4 times the likelihood of getting lung cancer compared to those who don't have one at all (CI does not cross 1.0)
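A small sketch of the two readings used in these results: turning a ratio into a percent change in risk (a ratio of 1.43 corresponds to the "43%" in the headline), and checking whether a confidence interval crosses 1.0. The function names and the CI bounds in the usage lines are hypothetical; the cards do not list the study's actual intervals.

```python
def percent_change_in_risk(ratio):
    """Convert a hazard ratio or relative risk into a percent change versus the reference group."""
    return (ratio - 1.0) * 100.0

def ci_excludes_null(lower, upper, null_value=1.0):
    """True when the whole confidence interval lies on one side of the null value."""
    return lower > null_value or upper < null_value

print(round(percent_change_in_risk(1.43)))   # percent increase implied by a ratio of 1.43

# Hypothetical CI bounds, for illustration only.
print(ci_excludes_null(0.90, 1.20))   # interval crosses 1.0
print(ci_excludes_null(1.05, 1.50))   # interval stays above 1.0
```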
"'I Am Not The Doctor For You': Physicians' Attitudes About Caring For People With Disabilities"
-physicians reported feeling overwhelmed by the demands of practicing medicine in general and the requirements of the Americans with Disabilities Act of 1990 specifically; in particular, they felt that they were inadequately reimbursed for accommodations -some physicians reported that because of these concerns, they attempted to discharge people with disabilities from their practices -our findings also suggest that physicians' bias and general reluctance to care for people with disabilities play a role in perpetuating the health care disparities they experience
BRFSS surveys of the general population ask questions about specific conditions such as childhood asthma (for example, "Has a doctor, nurse or other health professional EVER said that the child has asthma?"). The surveys are anonymous, cross-sectional, annual, and statewide (meant to be representative of a state's population). Using just the data from BRFSS, which of the following statistics could you likely produce?
-prevalence -trends in prevalence over time
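A rough sketch of why repeated cross-sectional snapshots support prevalence and prevalence trends: each year's survey gives a count of "yes" answers over a count of respondents. The yearly counts below are hypothetical, and the sketch ignores survey weighting, which a real BRFSS analysis would need.

```python
def prevalence(cases, respondents):
    """Proportion of respondents who report the condition at the time of the survey."""
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return cases / respondents

# Hypothetical (cases, respondents) per survey year; not real BRFSS data.
survey_years = {2019: (450, 5000), 2020: (480, 5000), 2021: (525, 5000)}

prevalence_by_year = {
    year: prevalence(cases, n) for year, (cases, n) in survey_years.items()
}
for year in sorted(prevalence_by_year):
    print(year, f"{prevalence_by_year[year]:.1%}")
```

Because different people answer each year, the series shows a trend in the population, not change within individuals.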
NIEHS sister study
-prospective cohort study of environmental and genetic risk factors for breast cancer and other diseases among 50,884 sisters of women who have had breast cancer -such sisters have about twice the risk of developing breast cancer as other women, with about 300 new cases of breast cancer expected to be diagnosed each year -having a risk-enriched cohort improves statistical power to identify potential modifiable risk factors and for assessing the interplay of genes and environment -additionally, sisters are often highly motivated to participate in long-term breast cancer research because their family member has experienced the disease -thus, the response rates and compliance have been high -the prospective design allows the assessment of exposures before the onset of disease, thereby avoiding biases common to retrospective studies
randomization does not help per se to
-rule out other possible explanations that fit the data, when bias and confounding are already eliminated -a new experiment testing two hypotheses may be required
clinical interventions have less impact on health than
-socioeconomic factors: poverty, education, housing... -regulating risks: tobacco, fluoridation... -protective / preventive health: disease screening, immunizations, cessation treatments
BRFSS module topics
-states implement different topics -ex. ACEs, cancer survivorship
indoor wood burning raises women's lung cancer risk by 43% study design
-study involving 50,000 women -examined indoor wood burning from stoves and fireplaces and incident lung cancer among Sister Study participants -the Sister Study is a prospective longitudinal study of women across the US who have a sister who was diagnosed with breast cancer, where the enrolled woman herself does not have cancer -women were enrolled 16 to 20 years ago -21% reported using an indoor wood-burning fireplace or stove at least 30 days per year
ecological studies
-the unit of analysis is populations, not individuals -provide no information as to whether the people who were exposed to the characteristic were the same people who developed the disease, whether the exposure or the onset of disease came first, or whether there are other explanations for the observed association -*looking at population level rates*
twin studies
-unique epidemiological tool for understanding roles of genetics, epigenetics, environment, and interactions among those in subsequent health, cognitive abilities, personality, behavior -huge studies & datasets in centers world-wide -VCU Virginia Institute for Psychiatric and Behavioral Genetics (VIPBG), ex. Spit for Science: prospective study that gathers genetic info from saliva and follows students' behavioral and emotional health over time
features of retrospective cohort studies
-use existing data that were generated for other purposes (clinical EMRs, billing / administrative data, health plan claims data, disease registries, combinations of sources)- aka secondary data -not gathering data prospectively on a targeted set of participants -efficient -real-world, potentially diverse samples
retrospective studies
-use secondary data including both risks and outcomes that have already occurred -cheaper and easier, but the quality and comprehensiveness of data may be less -no new data are collected after participants are enrolled in the study
cross-sectional surveys
-useful for determining knowledge, attitudes, and behaviors in populations -can be one-time or repeated -ex. BRFSS (behavioral risk factor surveillance system)- recurring meaning a series of snapshots (not longitudinal)
longitudinal ecological study
-would use ongoing surveillance or frequent repeated cross-sectional survey data to measure trends in disease rates over many years in a defined population -by comparing the trends in disease rates with other changes in the society (e.g., wars, immigration, introduction of a vaccine or antibiotics), epidemiologists attempt to determine the impact of these changes on disease rates
ecological study examples
1. -per capita cigarette consumption over years- rises then falls -same trend for rates of lung cancer but there is a 30 year delay between this and above -less smoking by females so curve is smaller than that of males (attenuated relationship) 2. recent work with covid (cases and vaccination %) 3. life expectancy by neighborhood
case study/series
1. descriptive 2. not really trying to determine relationship between cause and effect 3. novel / unusual / striking -new condition, permutation or combination of conditions -unusual severity -unusual history -striking response to treatments including new adverse event -ethical issues
difficulties of retrospective cohort studies
1. difficult to overcome biases - describe findings as "associated with" -statistical methods such as propensity score matching are used to try to balance two cohorts on potentially confounding variables -multivariate regressions look at predictor of interest while controlling for other variables' associations
types of observational studies
1. ecological 2. cross-sectional 3. cohort 4. case-control
What is a good study design choice in epidemiology research?
1. enable a comparison of a variable (e.g., disease frequency) between two or more groups at one point in time or, in some cases, within one group before and after receiving an intervention or being exposed to a risk factor 2. allow the comparison to be quantified in absolute terms (as with a risk difference or rate difference) or in relative terms (as with a relative risk or odds ratio) 3. permit the investigators to determine when the risk factor and the disease occurred, so that they can determine the temporal sequence 4. minimize biases, confounding, and other problems that would complicate interpretation of the data
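Point 2 above can be sketched with a 2x2 table: the same counts give an absolute comparison (risk difference) and a relative one (relative risk). The function name and the counts are hypothetical.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Absolute and relative comparison of risk between an exposed and an unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,   # absolute terms
        "relative_risk": risk_exposed / risk_unexposed,      # relative terms
    }

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
measures = risk_measures(30, 100, 10, 100)
print(f"risk difference = {measures['risk_difference']:.2f}")
print(f"relative risk   = {measures['relative_risk']:.2f}")
```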
The scientific method for determining causation can be summarized as having these three steps in the following order:
1. investigation of the statistical association 2. investigation of the temporal relationship 3. elimination of all known alternative explanations
types of experimental studies
1. lab experiments, clinical / field trials, pragmatic trials 2. systematic review and meta-analysis -lab trials -field trials
NHS methods
1. married registered nurses, aged 30 to 55 in 1976, who lived in the 11 most populous states, and whose nursing boards agreed to supply NHS with their members' names and addresses, were eligible to be enrolled in the cohort if they responded to the NHS baseline questionnaire -the original states were California, Connecticut, Florida, Maryland, Massachusetts, Michigan, New Jersey, New York, Ohio, Pennsylvania, and Texas 2. the names and addresses of 238,026 nurses who fulfilled the eligibility criteria were obtained in 1972 from the American Nurses' Association, with approval from the state boards of nursing -unique identification numbers were immediately assigned to each nurse to ensure strict confidentiality. 3. the NHS cohort was then established by a series of three mailings of the baseline questionnaire -the first mailing to all 238,026 nurses occurred in June 1976, with the final mailing to non-respondents in December 1976 -overall, 121,700 women returned a completed questionnaire. After excluding 65,613 questionnaires that could not be delivered, the response rate was approximately 71% (121,700 of 172,413)
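The response-rate arithmetic in step 3 can be checked directly from the figures given there. The variable names are just for illustration.

```python
mailed = 238_026          # nurses sent the baseline questionnaire
undeliverable = 65_613    # questionnaires that could not be delivered
returned = 121_700        # completed questionnaires

delivered = mailed - undeliverable
response_rate = returned / delivered

print(f"delivered = {delivered:,}")
print(f"response rate = {response_rate:.1%}")
```

The denominator is delivered questionnaires, not all mailings, which is why the rate is about 71% and not about 51%.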
types of analytical studies
1. observational 2. experimental
types of descriptive studies
1. qualitative 2. case study/series -case reports -case series -descriptive surveys
qualitative designs
1. qualitative research: Listen to participants for insights into public health problems -ethnographic observations -open-ended semi-structured interviews -focus groups -key informant interviews 2. examples -"Vaccine hesitancy as an opportunity for engagement: A rapid qualitative study of patients and employees in the U.S. Veterans Affairs healthcare system" -"'I Am Not The Doctor For You': Physicians' Attitudes About Caring For People With Disabilities"
cohort studies
correlational study of a clearly identified group of people 1. prospective studies 2. retrospective studies
The Virginia Cancer Registry gathers information from hospitals and oncology practices on new cases of cancer each year. It does not include information on how many people are not diagnosed with cancer or the size of the general population. Using just the information from the VCR, which statistics could you likely produce?
cumulative incidence
characteristics of a case study/series
describes a phenomenon, emphasizing novel condition, history, presentation, degree, or response to treatment
characteristics of a qualitative study
elicits and describes experiences of people of interest
statistical tests are then used to show whether this association is
greater than would be expected by chance alone
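One common way to ask whether an association exceeds chance is a chi-square test on a 2x2 table. This is a sketch using the shortcut formula for a 2x2 table without continuity correction; the counts are hypothetical, and 3.841 is the standard critical value for 1 degree of freedom at alpha = 0.05.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table laid out as:
    a = exposed with disease,   b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

CRITICAL_VALUE_05 = 3.841  # chi-square, 1 degree of freedom, alpha = 0.05

statistic = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square = {statistic:.2f}")
print("greater than expected by chance alone:", statistic > CRITICAL_VALUE_05)
```

A statistic above the critical value addresses only chance; bias and confounding still have to be ruled out separately.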
researchers have created __________ from which new hypotheses can be tested, and many of these are available to other researchers (but bring your own funding for analyses)
huge, rich datasets
survey data connected with Medicare data
if you do cancer registry work, the government has linked registry data to Medicare and Medicaid data
A researcher is interested in determining what outcomes are associated with a rare exposure (e.g., first responders at WTC on 9/11). What research design would be best for that?
prospective cohort study
NHS outputs
related cigarette smoking and other exposures (oral contraceptives, hormone therapy, obesity, alcohol) to different outcomes of interest
retrospective cohort study example: risk of cancer from occupational exposure to ionizing radiation
relative rate of mortality due to all cancer other than leukaemia by categories of cumulative colon dose, lagged 10 years in INWORKS. Vertical lines=90% confidence intervals; blue line=fitted linear model for the change in the excess relative rate of mortality due to all cancer other than leukaemia with dose; numbers above vertical lines=number of deaths due to cancer other than leukaemia in that dose category
if you do a study using TriNetX it would be a
retrospective cohort study with all of its strengths and weaknesses
statistical significance in association between
risk (or protective) factor and outcome
one possible alternative is that the apparent association is spurious- what does that mean?
the association may be an artifact and not real; checking this means taking an objective look at potential biases, random error, statistical "power", adequacy of measurements, etc.
careful selection of research design is necessary to determine
temporal sequence of cause and effect *a well-designed experiment can help
randomization helps
to rule out factors such as self-selection, investigator bias, confounding factors
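A minimal sketch of simple randomization, assuming a hypothetical list of participant IDs and two equal-sized arms. Because a shuffle decides group membership, neither participants nor investigators choose who receives the intervention. Real trials often use blocked or stratified schemes instead.

```python
import random

def randomize(participant_ids, seed=None):
    """Randomly split participants into two arms of (nearly) equal size."""
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs 1 through 20.
treatment, control = randomize(range(1, 21), seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```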
characteristics of a case-control study
typically examines multiple exposures in relation to a disease; subjects are defined by the outcome -- as cases or controls - and exposure histories are compared to determine causal agents
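Because subjects are chosen by outcome, a case-control study compares the odds of exposure among cases with the odds among controls, which gives an odds ratio (a relative measure only). A small sketch with hypothetical counts:

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
example_or = odds_ratio(40, 60, 20, 80)
print(f"odds ratio = {example_or:.2f}")
```

The number of cases and controls is fixed by the investigator, so these counts cannot be turned into an absolute risk of disease.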