Concepts in epidemiology
Confounder/effect modifier?
A given factor is a confounder if it is a RF for the disease in question AND is not distributed evenly between exposed/unexposed. It is an EM if it changes the exposure-disease relationship through some kind of epidemiological/biological mechanism
Matching
selection of controls for cases on the basis of specific criteria, creation of pairs => comparison of pairs, not only of two general populations. Benefits: control for confounders, can increase power of study by making cases/controls more similar. Disadvantages: avoid matching on irrelevant variables. Risk of over-matching (so similar that exposure status becomes identical, not always obvious), danger of matching on variables associated with exposure but not with disease (=> biased results). Requires specific, matched analysis
Selection of unexposed/control group?
should be as similar as possible to exposed for comparability (except relevant exposure), problem to find them if exposed were chosen from specific high-exposure group. Healthy worker effect: people in occupation are healthier than general population, don't just go to the street and interview people
clinical vs PH setting
sick/sick and well people, results => individuals/populations, therapeutic/therapeutic and preventive use, personal history/talk with community, exam/records, tests/surveys, patient dg/community dg
Sample size/power of study?
size should at least give 80% power to detect clinically relevant difference. Too small: waste of resources, subjects' willingness wasted, potentially wrong impression that intervention does not work may make it harder to carry out other studies later on.
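A rough sketch of how the 80% power target translates into a sample size. It assumes a comparison of two proportions with equal group sizes and uses the usual normal-approximation formula; the function name `n_per_group` and the example proportions are illustrative, not from the card.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of subjects needed in EACH group to detect p1 vs p2.

    Assumption: two-sided test, equal group sizes, normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical example: outcome in 20% of controls vs 10% with intervention
n_80 = n_per_group(0.20, 0.10, power=0.80)
n_90 = n_per_group(0.20, 0.10, power=0.90)
print(n_80, n_90)
```

Asking for more power, or for a smaller detectable difference, increases the required size quickly because the difference enters the formula squared.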
definition epidemiology
the study of the distribution and determinants of health-related states or events in specified populations and the application of this study to the control of health-related problems
Quantifying disease?
tools => health statistics, surveys => to estimate burden on health services. Standardization: relate numbers to population at risk => calculate rate (no. of cases/population at risk) to compare between areas/in time
Confounding?
two groups should be comparable with regard to all factors except for the RF in question (problem: no disease has one single RF). If other RFs not balanced between groups => confounding of exposure-disease relationship. Confounder must be a RF for the outcome. RF only becomes a confounder when distributed unevenly between the two groups. Is both associated with exposure and with outcome. => restriction of study population (age, gender etc), matching, stratification, multivariate analysis.
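Stratification can be illustrated with made-up numbers. The sketch below compares a crude relative risk with a Mantel-Haenszel pooled estimate; the Mantel-Haenszel estimator is a standard choice that the card does not name, and the two strata are hypothetical.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Relative risk = incidence in exposed / incidence in unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def mantel_haenszel_rr(strata):
    """Pooled RR across strata (Mantel-Haenszel weights)."""
    num = 0.0
    den = 0.0
    for cases_exp, n_exp, cases_unexp, n_unexp in strata:
        total = n_exp + n_unexp
        num += cases_exp * n_unexp / total
        den += cases_unexp * n_exp / total
    return num / den


# Hypothetical strata of a third factor (e.g. age group):
# (cases in exposed, exposed total, cases in unexposed, unexposed total)
strata = [
    (2, 100, 8, 400),    # low-risk stratum: few exposed, risk 2% in both groups
    (80, 400, 20, 100),  # high-risk stratum: mostly exposed, risk 20% in both groups
]

crude_rr = risk_ratio(
    sum(s[0] for s in strata), sum(s[1] for s in strata),
    sum(s[2] for s in strata), sum(s[3] for s in strata),
)
adjusted_rr = mantel_haenszel_rr(strata)
print(round(crude_rr, 2), round(adjusted_rr, 2))
```

Here the third factor is a RF for the outcome and is unevenly distributed between exposed and unexposed, so the crude RR suggests an association that disappears within each stratum.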
Definition goal?
ultimate (high-level) aim of a project or a programme, usually in terms of national development goal, time frame mostly undefined
Most important of Hill's criteria?
Most important: plausibility, internal and external consistency, strength of association, dose-response, temporality, reversibility (removal of exposure leads to decline in outcome). But: except for temporality, none of Hill's criteria is absolute for establishing a causal relationship, he said that himself
When high specificity?
No treatment available; or not too serious disease but treatment unpleasant and/or unsafe. Or serious social consequence of being labelled a case.
Definition programme?
PH undertaking with no foreseeable ending, usually essential health care intervention with strong government ownership and multiple sources of support
Definition project?
PH undertaking with specific goal/objective, usually funded by specific source and usually time-bound
Why Hill's criteria?
We should use these 9 viewpoints to look at association before opting for causation, they help us to decide if there is another way of explaining the facts
Definition activity?
action needed to achieve an output
Clinical case series?
coherent and consecutive (but ad hoc) set of cases, essential idea: count cases, define key features including signs and symptoms, describe treatment. Useful for clinicians, teaching, audit, research. Limitations: probably not complete set of cases (numerator incomplete), population at risk not accurately defined (incorrect denominator) => usually impossible to calculate disease rates
Analytical studies (test hypothesis on etiology or risk factors)?
cohort studies, case-control studies, ecological studies.
Nested case-control study (within cohort study)?
combination of case-control assessment with cohort approach
Misclassification?
comes from imperfect measurements, bias etc => erroneous classification of individual/value/attribute into the wrong category, eg misclassification of disease, exposure status or both
Effectiveness?
measure of impact of an intervention under "real world" conditions, usually measured in large-scale programmes => how well does it work in practice? how much of the original efficacy can be retained under "real life" circumstances? Long term measure and effects (> 2y): enthusiasm and adherence may decline, long-term impact on immunity etc. Rarely done although "cost-effectiveness" often quoted (in reality, it is "cost-efficacy"), should be part of phase 4 assessment.
Efficacy?
measure of impact under ideal (maximal) conditions, usually measured in phase-3 RCTs => bias-free. Best possible measure of impact ("goal" to reach) => how well could it work?
differential misclassification?
misclassification of disease is unequal according to exposure status, or misclassification of exposure status is unequal according to disease status
non-differential misclassification?
misclassification of disease is unrelated to exposure status, or misclassification of exposure status is independent of disease status
Definition monitoring?
routine, continuous tracking of key performance elements to assess whether programme is implemented as planned. Data sources are important (record-keeping, reporting, surveillance systems, observations etc)
3 sources for random error?
sampling error (=> increase sample size), individual biological variation (=> standardize all test procedures including timing), random measurement error (=> standardize testing and keep internal and external quality controls). Statistical tests allow us to estimate how likely a result at least as extreme as the observed one would be if only chance were at work (this probability is the p-value).
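A small sketch of why a larger sample reduces sampling error, using the standard error of a proportion. The formula is the standard binomial one, not something stated on the card, and the numbers are hypothetical.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of an estimated proportion p from a simple random sample of size n."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical prevalence of 20% estimated from two sample sizes
se_100 = standard_error_of_proportion(0.20, 100)
se_400 = standard_error_of_proportion(0.20, 400)
print(se_100, se_400)
```

Quadrupling the sample size halves the standard error, so precision improves only with the square root of n.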
Cluster/group randomization?
second-best to individual allocation. Responses within clusters likely to be correlated: within cluster, individuals are likely to be more similar than between clusters => CI and p-value become larger. Increase sample size or accept lower power. Always have at least 10 groups, with 10-20 groups (small number) use paired design to make groups more comparable
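The loss of precision from within-cluster correlation is often expressed as a design effect. The sketch below uses the common formula 1 + (m - 1) x ICC for equal cluster sizes; the formula, the intracluster correlation (ICC) value and the function names are assumptions added for illustration.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def inflated_sample_size(n_individual, cluster_size, icc):
    """Subjects per arm needed under cluster randomization."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))


# Hypothetical: 197 per arm needed with individual allocation,
# clusters of 20 people, ICC of 0.05
deff = design_effect(20, 0.05)
n_cluster_trial = inflated_sample_size(197, 20, 0.05)
clusters_per_arm = math.ceil(n_cluster_trial / 20)
print(deff, n_cluster_trial, clusters_per_arm)
```

Even a small ICC nearly doubles the required sample here, which is the "increase sample size or accept lower power" trade-off.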
problem with case-control studies?
work backwards (O => E), this is not intuitive for clinicians or researchers => widely misunderstood
Hill's criteria?
9 criteria for causality => strength of association, consistency (in different situations and with different techniques), specificity (but many diseases have more than one cause), temporality (which comes first?), biological gradient (dose-response), plausibility (biological, depends on actual knowledge), coherence (in line with general facts of disease, very close to plausibility), experimental evidence (do preventive measures change something?), analogy (examples you can derive from eg thalidomide).
Advantages/characteristics cohort study?
Often for etiological research. Confirm temporal sequence (E precedes O). Used a lot for chronic diseases (RF not obvious), less for infectious diseases (causal agents easier to see). Exceptions eg BSE, HIV/AIDS in early days. Intervention studies based on cohort design (difference: exposure is not simply observed but determined by investigators). Only cohort design allows measurement of incidence rates and related measures, and of the effect of rare exposures (special groups can be selected). We can study multiple outcomes for a single exposure. Always min. 2 groups (controls!). Less susceptible to bias/confounding than case-control studies
Probability assessment?
=> formal testing of significance, implies having an adequate control group with randomization. Most sophisticated but difficult to implement. If effectiveness is proven => control group is not acceptable anymore
Descriptive epidemiology
distribution of disease: identify and classify disease entities (who, where, when, how much), describe transmission, distribution, evolution (natural history) of disease
Exposure in epidemiology?
Being exposed in epidemiology means having a risk factor, not being at risk for the disease: both exposed and non-exposed can get disease of interest.
Selection of cases and controls?
Case group selection: precise criteria inclusion/exclusion, incident cases (new cases appearing in study period => requires prospective design), prevalent cases (cases with outcome observed at one point in time). Sources for cases: hospital based (convenient, but representative for general population?) vs population-based (identification requires large infrastructure especially for rare outcomes, not always feasible e.g. stigma, but easier to interpret/less biased than hospital based). Control group selection: same source as for cases => eligible as "cases" if they developed the disease, up to 4 controls can be chosen for each case to improve the power of the study. Choice of controls according to who are cases
ways to calculate in 2x2 table?
Cohort studies: results calculated on the rows (disease incidence rate in exposed/non-exposed), case-control studies: results calculated on the columns (odds => we cannot calculate incidence rates because we chose artificially, big disadvantage but we have OR).
Descriptive studies (frequency/distribution of a disease => generate hypothesis about etiology or risk factors)?
Cross-sectional studies (at one point in time, a cross-section of the population at risk is sampled and examined => allows measurement of prevalence rate), longitudinal studies (a group of individuals is followed over time and new disease episodes are registered, phased entry: not all individuals need to be under observation all the time; closed cohort => risk, open cohort (person-time at risk) => rates; => allows measurement of incidence rate).
Dose-response relationship?
if effect increases with increasing exposure level: strong indication of causality, this should be investigated wherever possible
Why lower effectiveness drugs/vaccines/PH?
Drugs: lower compliance, less rigorous indications, different age groups than in RCT, co-morbidity might affect treatment effect... Vaccines: problems with storage/cold chain, low compliance for multi-dose vaccines. PH programmes: sub-optimal coverage, patient adherence, provider compliance, irregular delivery, herd effect not working, overall changes in society etc.
Definition RCT?
Epidemiologic experiment in which subjects are randomly allocated into groups (study and control group), to receive or not to receive an intervention. Results are assessed by rigorous comparison of outcomes between the groups
Test efficiency?
How many are correctly identified? (a+d)/N
Individual/community effectiveness?
Individual => impact on individual level (often assessed through case-control design), community => impact at community level, relates to cohort design and intention-to-treat analysis. Relation: Community effectiveness = individual effectiveness x coverage effectiveness. Both measures should ideally be close, but often this is not the case.
Negative predictive value?
Probability that person with negative test result is healthy: PV- = d/(c+d)
Positive predictive value?
Probability that person with positive test is diseased: PV+ = a/(a+b)
Sensitivity?
Proportion of diseased persons in screened population who are correctly identified as diseased (they have positive test result): a/(a+c)
Specificity?
Proportion of healthy (non-diseased) persons in screened population who are correctly identified as such (they have negative test result): d/(b+d)
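The screening-test cards (efficiency, predictive values, sensitivity, specificity) all read off the same 2x2 table. A minimal sketch, assuming the cell layout implied by the formulas: a = test positive and diseased, b = test positive and healthy, c = test negative and diseased, d = test negative and healthy. The counts are hypothetical.

```python
def screening_measures(a, b, c, d):
    """Test performance from a 2x2 table.

    a: positive test, diseased     b: positive test, healthy
    c: negative test, diseased     d: negative test, healthy
    """
    n = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "pv_positive": a / (a + b),
        "pv_negative": d / (c + d),
        "efficiency": (a + d) / n,
    }


# Hypothetical screening of 1000 people, 100 of whom are diseased
m = screening_measures(a=90, b=50, c=10, d=850)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the diseased and healthy columns, whereas the predictive values are calculated within the test-positive and test-negative rows and therefore change with the prevalence of disease in the screened population.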
Prospective - retrospective study?
Prospective: study starts today and is conducted forward in time, cases of disease have not yet occurred. Retrospective: study starts today and looks back in time, cases of disease have already occurred. Prosp/retrosp is independent of the study design (especially cohort vs case-control), every study design can be done in both ways.
measures from RCT?
RCT measures RR and hence PE, AR (RD), continuous variables. Size of difference matters as much as statistical significance.
Influence of random and systematic error?
RE => precision/reliability, SE => validity/accuracy
When high sensitivity favoured?
Serious disease, high penalty associated with missing a case. Consequences of over-diagnosis not too severe
Single - double - triple-blind?
Single: patient does not know. Double: patient and investigator do not know. Triple: even analysis is carried out blind, randomization code only broken at the very end (usual standard for RCTs). Even if not triple-blind, analytic plan should be prepared before inception
cohort vs case control study?
Start from exposure and wait for outcome in exposed and non-exposed (E => O): cohort study. Start from outcome and look back for exposure in people with and without disease (E <= O): case control study. Both have strengths and weaknesses, complex issue, many compromises and considerations.
Study types in analytical epidemiology?
Study types in analytical epidemiology compare two or more groups (so we have control group) => investigate impact of RF or suspected cause by removing factors that affect both groups equally ("background noise"). RF either increase or decrease the risk of getting the disease => are associated. Sometimes several risk factors, difficult to separate out just one => bias and confounders possible.
Selection of exposed group (cohort)?
if exposure is common => choose from general population. If exposure is rare => choose groups with specially high exposure. Selection also determined by practicality of data collection (school for children's diseases, civil servants => Whitehall)
Ecological fallacy?
association observed between variables on aggregate level does not necessarily represent association existing at individual level; severe limitation for generalizability
2 types of systematic error?
bias and confounding
Cohort study?
both groups disease-free at the start (but both are "exposed" to the risk of getting the outcome/disease!), occurrence of disease is measured over time.
Measurement of exposure information?
can be difficult (information bias: inaccurate remembering, recall bias: cases remember exposure/circumstances better than controls => be aware of these biases)
Plausibility assessment?
can we exclude other explanations for the observed change, is there causal relationship? Tries to measure effect of external factors and excluding/confirming them as confounders. Some kind of control group (if not randomized => problematic), measures many other parameters thought to be important
Population case series?
cases recruited comprehensively from general population instead of health facilities, foundation for description of disease. Longitudinal assessment through extended time frame for case collection. If cases well defined and population at risk known => calculate disease rates. Key requirements: clear/uniform diagnosis, date, place, socio-economic characteristics, size/characteristics of population at risk. Useful for disease rate calculation, to provide national/international perspective on disease. Limitations: often difficult to compile comprehensively, variable quality of case definitions, possibly reverse causality, limited amount of complementary information
Systematic error?
if tendency exists to produce results that differ in a systematic way from "true" values. Rarely controlled by increased sample size, much bigger hazard than random error: many more sources of SE, more difficult to be aware of them, often little control over SE
essential elements for RCT?
contemporary controls (to account for natural fluctuations in absence of intervention, "baseline". Controls can receive standard treatment, not always "nothing". Contemporaneous to avoid problems with time trends, eg wet and dry seasons) and randomization (essential to avoid personal judgment => selection bias and confounding, can be simple, stratified, paired) with concealment (allocation must not be open to manipulation by trial participants, otherwise validity of the trial is undermined). Desirable: blinding (to avoid manipulation, not always possible => open label trial). The less "hard" the outcome, the more important blinding is. Placebo use only if there is no existing standard treatment, this allows control of the placebo effect
Counterfactual model?
counterfactual effect is unobserved, so we estimate a proxy/surrogate amount of effect from a population who are not exposed but otherwise comparable to study population, then => calculate
Disadvantages case-control studies?
crucial to find appropriate control group, only one type of measure (OR)
STROBE statement?
defines what must be included in accurate and complete report of an observational study (similar to CONSORT statement for RCTs), 22 items, adopted by most leading journals.
Analytical epidemiology
determinants of disease: determinants and causes of disease, risk factors and/or high-risk groups (how, what, why)
5 important measures cohort studies?
disease incidence rate in exposed and unexposed - relative risk = incidence in exposed/incidence in unexposed - protective efficacy usually gives risk reduction that individual user can expect - risk difference/AR for assessing the absolute PH significance of a RF - population attributable risk: size of difference matters more than statistical significance
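A sketch of the five cohort measures computed from one set of counts. The card names the measures but gives a formula only for relative risk; the expressions for protective efficacy (1 - RR), risk difference and population attributable risk (incidence in the total population minus incidence in the unexposed) are the usual textbook ones and are assumptions here, as are the counts.

```python
def cohort_measures(cases_exp, n_exp, cases_unexp, n_unexp):
    """Standard measures from a cohort study with cumulative incidence."""
    i_exp = cases_exp / n_exp
    i_unexp = cases_unexp / n_unexp
    i_total = (cases_exp + cases_unexp) / (n_exp + n_unexp)
    rr = i_exp / i_unexp
    return {
        "incidence_exposed": i_exp,
        "incidence_unexposed": i_unexp,
        "relative_risk": rr,
        "protective_efficacy": 1 - rr,  # meaningful only when RR < 1
        "risk_difference": i_exp - i_unexp,
        "population_attributable_risk": i_total - i_unexp,
    }


# Hypothetical risk factor: 30/1000 cases in exposed, 10/1000 in unexposed
rf = cohort_measures(30, 1000, 10, 1000)
# Hypothetical vaccine: 10/1000 cases in vaccinated, 40/1000 in unvaccinated
vaccine = cohort_measures(10, 1000, 40, 1000)
print(rf["relative_risk"], rf["risk_difference"], rf["population_attributable_risk"])
print(vaccine["relative_risk"], vaccine["protective_efficacy"])
```

The relative measures (RR, PE) describe strength of effect for an individual, while the absolute ones (RD, PAR) depend on how common the disease and the exposure are and so speak to public health importance.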
Bias?
doing something wrong to one group and not to the other: measurement (testing not comparable), selection (selection of the study population), interviewer (behave differently with cases/controls), recall (some people may remember past events better). Losses to follow-up: those who withdraw from study may be different from those who complete whole study. Hawthorne effect: subject's behaviour/reaction changes as the result of observation (subtle). => Careful study design, randomization, careful pre-testing, regular quality control, blinding, control factors responsible for selection bias. Can't be improved during/after analysis
Interim analysis of RCT?
done before end of trial to check if detrimental effect or massive benefit in intervention group, performed by data and safety monitoring committee. Formal power to stop the trial.
Random error?
due to the fact that we only study a sample and not whole population at certain point in time (we make an inference from the results). Simple random variation depends on the sample and the time of testing. We can quantify RE but not get rid of it. A measure with little random error is precise/reliable
Definition evaluation?
episodic assessment to systematically and objectively judge implementation of a programme, usually in comparison with original plans. Tries to link output/outcome to intervention (attribution) and to document impact as result of these outputs/outcomes. More difficult than monitoring because routine data collection is not enough. Often done by independent entities. Should be scientifically rigorous but often has serious design flaws
Quality of evidence
expert opinions < multiple time series < well-designed cohort/case-control studies < well-designed controlled trials without randomization < at least one properly designed RCT
Intention-to-treat analysis?
for individual randomization => person allocated to intervention must be analyzed as having received it even if this is not the case, group randomization => every member of cluster is considered as having received the cluster intervention. If not => we compare compliers in intervention group with controls => risk of bias, especially if non-compliance is high => results not applicable to population. Alternative: per protocol analysis.
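A toy comparison of intention-to-treat and per-protocol analysis, assuming a two-arm trial in which non-compliers in the intervention arm are at higher risk than compliers. All counts are invented for illustration.

```python
def risk(events, n):
    """Cumulative risk of the outcome in a group."""
    return events / n


# Hypothetical intervention arm: 100 allocated, 20 of them never took the intervention
compliers = (8, 80)        # (events, people)
non_compliers = (8, 20)
control = (20, 100)

# Intention-to-treat: everyone analysed in the arm they were allocated to
itt_risk = risk(compliers[0] + non_compliers[0], compliers[1] + non_compliers[1])
itt_rr = itt_risk / risk(*control)

# Per-protocol: only compliers are compared with controls
pp_rr = risk(*compliers) / risk(*control)

print(round(itt_rr, 2), round(pp_rr, 2))
```

In this example the per-protocol RR looks more favourable than the ITT estimate because the comparison with controls is no longer protected by randomization.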
Eligibility RCT?
general (e.g. non-resident, known allergy etc) and specific (eg poor general health, had received respective treatment during last month) criteria for non-eligibility
Rates for "acute" diseases
high monthly incidence, short duration of episodes => prevalence << monthly incidence => incidence rate more informative (also births, deaths, hospital admissions without "duration")
Measurement errors?
how capable is study/test to measure what it intends to measure? Is valid if result corresponds to the "truth" (how well test describes the real situation in a population) => then the result is accurate. Best: no systematic error, random error as small as possible. Is precise if repeated measures do not give much different results (= reliability). CI/spread of repeated measures is narrow.
Intervention epidemiology
identify problems, design solutions, measure their impact => changing the situation (so what ? does it work?)
Fallacy of homogeneity?
individuals in a defined group do not always have similar characteristics, but this is assumed anyway
Causal inference?
infectious => Koch's postulate; chronic => Hill's criteria
CONSORT statement?
intended to improve reporting of RCT => standardized way for complete transparency, e.g. using checklist and flow diagram (37 items)
Monitoring & evaluation? Difference?
investor/government/donor giving money has right to know what was done, how, what it has achieved. Monitoring for inputs - activities - outputs - outcomes, evaluation for outputs and outcomes
Rates for "chronic" diseases
low monthly incidence, long duration of episodes => prevalence >> monthly incidence => prevalence rate more informative (also stable characteristics with "infinite" dimension like sex, blood group etc)
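The contrast between "acute" and "chronic" diseases follows from the approximate relation prevalence = incidence x average duration. This relation is an added assumption (steady state, low prevalence), and the figures are hypothetical.

```python
def approximate_prevalence(incidence_per_month, duration_months):
    """Steady-state approximation: prevalence = incidence x mean duration."""
    return incidence_per_month * duration_months


# Hypothetical acute disease: 50 new cases per 1000 per month, episodes last 0.1 month
acute = approximate_prevalence(50 / 1000, 0.1)
# Hypothetical chronic disease: 1 new case per 1000 per month, lasting 60 months
chronic = approximate_prevalence(1 / 1000, 60)

print(f"acute: incidence 0.050/month, prevalence {acute:.3f}")
print(f"chronic: incidence 0.001/month, prevalence {chronic:.3f}")
```

For the acute disease the prevalence is a tenth of the monthly incidence, while for the chronic disease it is sixty times larger.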
Odds ratio
no absolute measures because proportion case/control is arbitrary, disease rates not representative. Rely instead on the odds of being exposed (odds are essentially what is used when gambling). Odds of being exposed among cases: a/c, odds of being exposed among controls: b/d. OR: (a/c)/(b/d) = ad/bc. For rare diseases very similar to RR (approximation)
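A short sketch of the odds ratio as the ratio of the two exposure odds, which simplifies to the cross-product ad/bc. It assumes the layout used on this card (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls); the counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR = (a/c) / (b/d) = ad / bc.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    odds_cases = a / c
    odds_controls = b / d
    return odds_cases / odds_controls


# Hypothetical case-control study: 100 cases, 100 controls
or_cc = odds_ratio(a=30, b=10, c=70, d=90)

# Rare-disease check on a hypothetical full cohort:
# 10/1000 diseased among exposed, 5/1000 among unexposed
rr_cohort = (10 / 1000) / (5 / 1000)
or_cohort = odds_ratio(a=10, b=990, c=5, d=995)

print(round(or_cc, 2), rr_cohort, round(or_cohort, 2))
```

With a disease this rare the OR in the cohort example is almost identical to the RR, which is the approximation the card refers to.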
Definition purpose?
objective that the project is expected to achieve and through which it contributes to the goal
Analytical + descriptive epidemiology= ?
observational epidemiology
Disadvantages cohort study?
only one RF can be studied in a given cohort study. Often large (rare disease) and last for long periods (latency) => losses to follow-up. Cannot be conducted if disease too rare. Improve feasibility by creating "permanent" control groups (Whitehall, Framingham, Swiss HIV cohort).
Koch's postulate?
parasite must be present in all with disease, can never occur in healthy persons, can be isolated/cultured and passes disease to healthy experimental subjects, organism must be re-isolated from experimentally-infected individual. Limitations: Co-factors might be required, viruses can't be cultured like bacteria (need living cells), pathogenic viruses can be present without clinical disease (asymptomatic infection)
Four basic elements of descriptive epidemiology?
person, time, place, how much?
Phases from bench to field?
phase 0 (pre-clinical) => basic developments in lab/animals. 1 (safety)=> immunogenicity and safety in adult volunteers. 2 (proof of concept)=> small groups, tight follow-up. 3 (pivotal trials) => full-scale clinical RCT in potential target population. 4 (post-marketing) => long-term impact in real world
Definition impact?
positive health results achieved by the programme
Intervention studies (test whether an intervention works, confirm etiology or suspected risk factors in a powerful way)?
randomized controlled trials
Definition causality?
relationship between an event (cause) and a second event (effect) where second event is understood as consequence of the first => cause precedes outcome
Interaction/effect modification?
relationship between exposure and outcome variable is changed by addition of a third factor (effect modifier). EM must be RF for outcome of interest (can also be confounder but doesn't have to be). Detected by investigating effect measures across different levels => stratified analysis. EM must be described and reported (stratum-specific estimates with CIs), can't be controlled. Exploring this interaction can provide important insights in mechanisms of action of exposure and outcome/disease
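A sketch of stratum-specific estimates with confidence intervals for a stratified analysis. The interval uses the common log-based approximation for a relative risk, which is an assumption not taken from the card, and the two strata are invented.

```python
import math
from statistics import NormalDist


def rr_with_ci(cases_exp, n_exp, cases_unexp, n_unexp, level=0.95):
    """Relative risk with an approximate CI computed on the log scale."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


# Hypothetical strata of a suspected effect modifier
men = rr_with_ci(40, 200, 10, 200)
women = rr_with_ci(12, 200, 10, 200)

for label, (rr, low, high) in (("men", men), ("women", women)):
    print(f"{label}: RR {rr:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Clearly different stratum-specific RRs, as in this example, point to effect modification, so the strata are reported separately rather than pooled into a single adjusted estimate.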
Cross-sectional studies at individual level?
representative cross-section of population at risk sampled at one defined point in time (snapshot) => point prevalence rate, prevalence study. Repeated over time => "repeated cross-sectional". Useful: rapid and simple to do, comprehensive data can be collected. Limitations: over-representation of diseases with long duration, under-representation of diseases with short duration, danger of reverse causality, bias: many sources
Definition output?
result from activities of the programme
Definition outcome?
results from programme outputs
Prospective/retrospective: advantages/disadvantages?
retrospective => faster, cheaper and easier, especially with diseases with long latency period (no need to wait for disease to occur). But: limited by availability of exposure data of sufficient quality as this had not been planned. Prospective => full and planned data collection on exposure status and relevant confounders possible, also easier to assess and control biases => prospective studies are done more frequently despite added efforts/expenses
Cross-sectional studies at population level (ecological studies)?
unit of analysis => populations or groups of people, can be combined with individual analysis. Outcome: more often incidence measures, but prevalence also possible. Are more an analytic extension of population case series than separate study design. Same group can be analyzed repeatedly over time. Usefulness: measure effect of aggregate measures of exposure, often indirect, certain exposure factors can only be measured at group level (environmental or global measures). Rapid analysis, good insight into differences between populations. Limitations: ecological fallacy, fallacy of homogeneity
Advantages case-control studies?
useful for rare outcomes, require less time/resources, easier to analyze, exposures complex and multiple
Causal inference: 5 criteria needed to estimate causal effects?
well-defined interventions, exchangeability (hardest and most important), positivity/experimental treatment assumption ETA, no measurement error, no model misspecification
