Bias

DAG theory on 3 ways an association between an exposure and outcome can be produced

*1. Cause and effect* If E causes D or vice versa, they will be associated.
*2. Common causes* If E and D share a common cause, they will be associated, even if neither is a cause of the other.
*3. Common effects* If E and D have a common effect C, they will be conditionally associated if the MOA is computed within levels of C (the stratum-specific MOAs will differ from 1).
Chance can also produce associations in the absence of the above structures, but it is not a structural source of association - DAGs focus on structural sources.

Types of bias analysis

*1. Deterministic bias analysis:* does not incorporate the effect of random error; assumes the biasing relationships are known with certainty
- simple bias analysis
- multidimensional bias analysis
*2. Stochastic bias analysis:* incorporates random error/uncertainty in bias parameters
- probabilistic bias analysis
- multiple bias analysis

Direction and magnitude of bias

*Direction* refers to how the bias changes the effect estimate.
- Bias towards the null (conservative bias): the estimate is closer to 1 than the true measure for relative measures, and closer to 0 than the true measure for absolute measures.
- Bias away from the null (exaggerates the association): the estimate is farther from 1 in either direction for relative measures, and farther from 0 in either direction for absolute measures.
*Magnitude* refers to the strength of the bias: the extent to which it distorts the association.
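A minimal sketch of the direction rule, assuming the true value is known (in a real study it is not). The function name `bias_direction` and the extra "crosses the null" label are illustrative and not part of the card.

```python
def bias_direction(observed, true, null=1.0):
    """Classify bias for a measure whose null value is `null`
    (1.0 for relative measures, 0.0 for absolute measures)."""
    if observed == true:
        return "no bias"
    # Estimate and truth on opposite sides of the null.
    if (observed - null) * (true - null) < 0:
        return "crosses the null"
    # Same side of the null: compare distances from it.
    if abs(observed - null) < abs(true - null):
        return "towards the null"
    return "away from the null"


print(bias_direction(1.5, 2.0))             # relative measure
print(bias_direction(0.02, 0.05, null=0.0)) # absolute measure
```

The magnitude could then be summarized as the ratio (relative measures) or difference (absolute measures) between the observed and true values.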

Selection vs information bias

*Selection bias* relates to how people get into a study:
- How participants are recruited
- What affects their choice to participate or not
- What affects their probability of dropping out vs staying in
When exposure and outcome both affect a subject's probability of being present in the study, the exposure-outcome association in the sample differs from that association in the source population, and we have selection bias.
*Information bias* relates to how participants' information is collected/measured once they are in the study:
- How exposure and outcome are collected (self-report, by study personnel - masking?, from records, from registry)
When outcome misclassification is not the same in exposed/unexposed, or exposure misclassification is not the same for diseased/non-diseased, the misclassification is differential.

Specific selection biases

1. Berkson's bias 2. Healthy worker effect 3. Self-selection bias

What are possible explanations for an observed association between exposure and disease?

1. Causal effect
2. Random error
3. Systematic error
Typically, epidemiologic associations represent a combination of all 3.

DAG theory on biases that could be present in these 3 structures

1. Cause and effect - reverse causation could explain the association (ex: recall bias)
2. Common causes - confounding
3. Common effects - selection bias

Types of systematic error

1. Information bias 2. Selection bias 3. Confounding bias

Types of validity

1. Internal validity 2. External validity

Information biases: differential outcome misclassification

1. Observer bias 2. Respondent bias

Information biases: differential exposure misclassification

1. Recall bias 2. Interviewer bias

Four levels of populations (hierarchy of populations)

1. Study population (sample) 2. Actual population (eligible pop) 3. Target population (source pop) 4. External population

What is one primary goal of epidemiology and how does it relate to precision and validity?

Accuracy in estimation - how well does the observed association reflect the true relationship between exposure and outcome?
Accuracy encompasses both precision and validity.
Validity includes both internal and external validity.
Threats to internal validity include confounding, selection bias, and information bias.

Effect of selection bias differential with respect to exposure and disease

All measures of disease and association will be biased, except in the rare situation where they cancel

How to prevent information biases: recall bias

Assess or verify exposure status by collecting more objective exposure data that were recorded before disease onset (medical records, employment records) - these data sources may have limitations depending on the exposure of interest.
Select a control group that will have similar recall.

Systematic error

Also called bias.
Occurs when the observed association in your sample differs from the true association in the source population due to a systematic difference between the two.
Affects the validity of the estimate.
If there is systematic error, then over many study repetitions your estimates of association will NOT center around the truth in the source population.
Ex: a tape measure that has only 11 inches to a foot would yield biased estimates of height - any height over 11 inches would be incorrect in a systematic way - it has a direction and magnitude.

Confounding bias

Bias arising when factor(s) that cause the outcome are distributed differently between the exposed and unexposed groups; it can explain part or all of the difference between an observed association and the true causal effect.
Sometimes understood as a separate phenomenon from the broader category of bias because it arises from true causal associations in the source population (just not the one of interest), whereas other biases do not reflect true causal associations in the source population.

Selection bias

Bias arising from procedures used to select study subjects and from factors influencing participation.
Bias arising from systematic error in ascertainment or participation of study subjects.
People have different probabilities of being included in the study sample based on exposure and outcome.
Generally, when both exposure and outcome affect selection/participation, the exposure-outcome association in the sample is no longer representative of that in the source population.

Information bias

Bias resulting from flawed definition or measurement of study variables.
Results in erroneous classification of subjects with regard to exposure and/or outcome - misclassification.

Effect of non-differential misclassification

Biases towards the null, on average, when the misclassified exposure or disease variable is binary.
Conservative bias - at least you know you're not presenting an artificially large association.
With more than two categories, it does not necessarily bias towards the null.
Categorization of a variable with non-differential misclassification can generate differential misclassification.
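The attenuation for a binary exposure can be illustrated with expected counts. This is a sketch under assumed numbers: the 2x2 table and the sensitivity/specificity values are hypothetical, and the same values are applied to cases and controls (which is what makes the error non-differential).

```python
def misclassify(exposed, unexposed, sens, spec):
    """Expected observed (exposed, unexposed) counts after imperfect
    exposure classification with the given sensitivity and specificity."""
    obs_exposed = sens * exposed + (1 - spec) * unexposed
    obs_unexposed = (1 - sens) * exposed + spec * unexposed
    return obs_exposed, obs_unexposed


def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical true table: cases 200/100, controls 100/200 (true OR = 4).
true_or = odds_ratio(200, 100, 100, 200)

# Same sensitivity and specificity in cases and controls.
a, b = misclassify(200, 100, sens=0.8, spec=0.9)
c, d = misclassify(100, 200, sens=0.8, spec=0.9)
observed_or = odds_ratio(a, b, c, d)

print(true_or, round(observed_or, 2))
```

With these assumed inputs the observed odds ratio falls between 1 and the true value, i.e. it is pulled towards the null.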

Target population

Broader population from which the actual population was defined (with specific eligibility criteria).
Population about which we aim to make inferences - the target of our inferences.
We aim to infer that our study results apply to this population, even though it is typically broader than the actual population and information reporting may not have been perfect in the actual population.
The choice of target population is made by the investigator and is subjective.
Possible examples include:
- All non-institutionalized English- or Spanish-speaking adults (18-65 years of age) living in the 9 Bay Area counties in July-December 2005 who otherwise met inclusion criteria - this is the actual population expanded to include those who declined as well as participants/would-be participants
- All adults in the Bay Area
- All adults in Northern California

External population

Population to which we want to generalize our findings; broader than the target population, and may or may not overlap with it.
There may be more than one.
Possible examples include:
- People of all ages in Northern California
- All adults in California
- All adults in Washington State
- All adults in the US

Selection bias in case-control studies

Cases are generally more likely to participate in studies than controls - having the disease increases motivation to participate; cases in a cohort being more likely to be selected than non-cases is a defining feature of case-control studies.
Bias arises when exposure status also affects participation probability.
If, among either cases or controls, exposure makes subjects more or less likely to participate, then selection bias is introduced.
Equivalently, selection bias is present if the sampling fractions for cases and controls differ by exposure status, so that the exposure distribution among selected cases or controls is not the same as among the diseased or non-diseased in the study base.

Non-differential misclassification of exposure

Degree of exposure misclassification is not related to outcome status.
Sensitivity and specificity of exposure assessment are the same for diseased and non-diseased (but at least one of them is below 1).

Differential misclassification of exposure

Degree of exposure misclassification is related to outcome status.
Sensitivity and specificity of exposure assessment are NOT the same for diseased and non-diseased.
Includes recall and interviewer bias.

Non-differential misclassification of the outcome

Degree of outcome misclassification is not related to exposure status.
Sensitivity and specificity of outcome assessment are the same for exposed and unexposed (but at least one of them is below 1).

Differential misclassification of the outcome

Degree of outcome misclassification is related to exposure status.
Sensitivity and specificity of outcome assessment are NOT the same for exposed and unexposed.
Includes observer and respondent bias.

How to prevent selection bias

Design studies so that exposure and outcome do not both affect inclusion in the study - be especially careful with hospital- or clinic-based case-control studies.
Attempt to get a high response rate.
Attempt to collect some data on non-respondents and on those who decline to participate.
Compare participants with those declining to participate to see if they differ systematically.
Strive for high completion/follow-up.
Compare subjects lost to follow-up (LTFU) with those remaining in the study to see if they differ systematically.

Conditioning on a collider

Dividing the data into strata based on that variable.
AKA "controlling for" or "adjusting for".
Represented in a DAG by putting a box around the variable.

Statistical inference

Extent to which findings from the study pop reflect the true causal effect in the actual pop

Internal validity

Extent to which findings from the study pop reflect the true causal effect in the target pop.
Prerequisite for external validity and the primary concern: if a study is internally valid, its results are meaningful even if not externally valid; if it is not internally valid, it is not externally valid either.

External validity

Extent to which findings from the study pop reflect true causal effect in an external pop beyond the target pop

Selection factor

Factor that influences whether or not an individual's data ends up in the study analysis.
For example:
- decides to participate in study
- loss to follow-up
- competing causes (death)

Precision

Lack of random error

Validity

Lack of systematic error.
A study design and analysis plan are valid if, on average (aside from random error), results generated from the study design reflect the true causal effect in the source population.

Multidimensional bias analysis

Like a simple bias analysis but examines multiple biasing scenarios. Yields a range of bias-corrected estimates. Simultaneous examination of multiple scenarios
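A sketch of what "multiple scenarios" could look like for exposure misclassification, assuming non-differential error and the standard back-calculation of true counts from sensitivity and specificity. The observed table and the grid of sensitivity/specificity values are hypothetical.

```python
from itertools import product


def correct_counts(obs_exposed, obs_unexposed, sens, spec):
    """Back-calculate expected true (exposed, unexposed) counts."""
    total = obs_exposed + obs_unexposed
    exposed = (obs_exposed - (1 - spec) * total) / (sens + spec - 1)
    return exposed, total - exposed


def corrected_or(cases, controls, sens, spec):
    a, b = correct_counts(*cases, sens, spec)
    c, d = correct_counts(*controls, sens, spec)
    return (a * d) / (b * c)


# Hypothetical observed (exposed, unexposed) counts.
cases, controls = (170, 130), (100, 200)

# One simple bias analysis per (sensitivity, specificity) scenario.
scenarios = list(product([0.8, 0.9], [0.9, 0.95]))
results = {(se, sp): corrected_or(cases, controls, se, sp) for se, sp in scenarios}
low, high = min(results.values()), max(results.values())

for (se, sp), value in results.items():
    print(f"sens={se}, spec={sp}: corrected OR = {value:.2f}")
```

The output is a range of bias-corrected estimates (`low` to `high`) rather than a single value or a distribution.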

Multiple bias analysis

Like probabilistic bias analysis, but can examine the influence of more than one type of bias simultaneously (like confounding and exposure misclassification).
Yields a distribution of bias-corrected estimates that covers multiple dimensions of bias.
Most people focus on one type of bias at a time, as in probabilistic bias analysis.

Simple bias analysis

MOA adjusted for one type of bias at a time, with one proposed biasing relationship.
Yields a single bias-corrected estimate.
Can be done over multiple scenarios, one at a time.
For an unmeasured confounder U, multiply the observed MOA by a bias-correction factor that accounts for the prevalence of the confounder in the exposed and unexposed groups and the MOA for the U-Y relationship.
For misclassification, use sensitivity and specificity as bias parameters.
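A sketch of the unmeasured-confounder case for a risk ratio, using the commonly cited external-adjustment formula and assuming a binary confounder U with no effect modification. Here the observed estimate is divided by a bias factor, which is the same as multiplying by its reciprocal. Function names and input values are illustrative.

```python
def confounding_bias_factor(rr_uy, p_exposed, p_unexposed):
    """Bias factor for a binary unmeasured confounder U.

    rr_uy:       assumed risk ratio for the U-Y relationship
    p_exposed:   assumed prevalence of U among the exposed
    p_unexposed: assumed prevalence of U among the unexposed
    """
    return (rr_uy * p_exposed + 1 - p_exposed) / (rr_uy * p_unexposed + 1 - p_unexposed)


def corrected_rr(rr_observed, rr_uy, p_exposed, p_unexposed):
    """Single bias-corrected estimate for one proposed scenario."""
    return rr_observed / confounding_bias_factor(rr_uy, p_exposed, p_unexposed)


# Hypothetical scenario: observed RR of 2.0, U triples risk,
# and U is present in 50% of exposed vs 20% of unexposed.
print(round(corrected_rr(2.0, 3.0, 0.5, 0.2), 2))
```

If the confounder is equally common in both groups, the bias factor is 1 and the estimate is unchanged.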

How to prevent information biases: observer bias/outcome identification bias

Mask data collectors/observers to exposure status if possible.
Use multiple observers to independently classify events, and check inter-rater reliability.
Confirm participant reports of outcomes with more objective metrics.
Rigorously define the outcome and ask participants about multiple symptoms in the outcome's diagnostic constellation.

How to prevent information biases: interviewer bias

Mask interviewers to participant disease status if possible.
Carefully design interview protocols, provide training, and periodically monitor interviewer adherence to protocol.
Conduct reliability studies (can be difficult with interview data).

Effect of selection bias differential with respect only to disease

Measures of disease frequency and the AR, CIR, and IDR will be biased.
The odds ratio (OR) will be unbiased.
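A quick check with expected counts, assuming a hypothetical source population and selection probabilities that depend only on disease status. The table layout and numbers are illustrative.

```python
# Table layout: (exposed diseased, exposed non-diseased,
#                unexposed diseased, unexposed non-diseased)
source = (200, 800, 100, 900)


def select(table, p_diseased, p_nondiseased):
    """Expected sample when selection depends only on disease status."""
    a, b, c, d = table
    return (a * p_diseased, b * p_nondiseased, c * p_diseased, d * p_nondiseased)


def risk_ratio(table):
    a, b, c, d = table
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(table):
    a, b, c, d = table
    return (a * d) / (b * c)


# Assumed selection probabilities: 90% of diseased, 30% of non-diseased.
sample = select(source, p_diseased=0.9, p_nondiseased=0.3)

print("RR source vs sample:", risk_ratio(source), round(risk_ratio(sample), 3))
print("OR source vs sample:", odds_ratio(source), odds_ratio(sample))
```

The selection probabilities cancel in the cross-product, so the odds ratio is preserved while the risk ratio is not.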

Example of recall bias: case-control study of gestational pesticide exposure and offspring developmental delay

Mothers of developmentally delayed children may recall their exposures during pregnancy more comprehensively, or may over-report them, having spent time thinking about what might have caused their child's disability.
Control mothers with typically developing children have not spent time pondering prenatal exposures, and thus may be less likely to recall exposure.

Types of misclassification

Non-differential
Differential
The definition of each depends on the variable being measured - exposure or outcome.

Self-selection bias

Occurs in all studies in which individuals decide whether to participate.
When people who choose to enroll differ from those who do not, selection bias may be introduced.
Becomes more problematic as the response rate/participation rate decreases.

Berkson's bias

Occurs in case-control studies of hospitalized patients.
If both exposure and outcome affect hospitalization, and thus selection into the study, a statistical association will be induced in the study pop.
Two diseases that are unassociated in the population become associated among hospitalized patients when both diseases affect the probability of hospital admission.
Particular concern when exposure and outcome are both health conditions that cause hospitalization. Ex: a study of the effect of hypertension on skin cancer - there is no association in the source population, but everyone with the exposure (hypertension) is hospitalized, everyone with the outcome (skin cancer) is hospitalized, and fewer of those with neither are hospitalized, so the sample has too few non-diseased/non-exposed subjects and the association is underestimated.
In the figure: in a case-control study in which cases were hospitalized patients with D and controls were hospitalized patients with E, an exposure R that causes disease E would appear to be a risk factor for disease D, even if R does not cause D.
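The hypertension/skin cancer example can be worked through with expected counts. This is a sketch: the population size, the prevalences, and the 10% admission probability for people with neither condition are assumptions chosen for illustration.

```python
def odds_ratio(both, e_only, d_only, neither):
    """Cross-product odds ratio for two binary conditions E and D."""
    return (both * neither) / (e_only * d_only)


# Hypothetical population of 10,000 in which E (20%) and D (10%)
# are independent, so the population OR is 1.
population = {"both": 200, "e_only": 1800, "d_only": 800, "neither": 7200}

# Assumed admission probabilities: everyone with E or D is hospitalized,
# but only 10% of those with neither condition.
p_admit = {"both": 1.0, "e_only": 1.0, "d_only": 1.0, "neither": 0.1}

hospital = {cell: population[cell] * p_admit[cell] for cell in population}

population_or = odds_ratio(**population)
hospital_or = odds_ratio(**hospital)
print(population_or, round(hospital_or, 2))
```

Under these assumptions the hospitalized sample shows a strong inverse association even though none exists in the population.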

Healthy worker effect

Occurs in occupational cohort studies and other studies comparing working to non-working or general population groups.
Workers have lower rates of disease than comparison cohorts from the general population (self-selection of hardier workers into more taxing jobs, attrition of sick workers from the workforce).
Comparison of workers to non-workers, general pop groups, or workers in less taxing jobs may cause the health effects of occupational exposures to be underestimated.

Interviewer bias

Occurs when interviewers are not masked to participant disease status.
Interviewers may question diseased and non-diseased participants differently, for example by emphasizing some words or questions, or by asking more clarifying questions of those with disease in an attempt to elicit information on the exposure.

Observer bias

Occurs when observers/raters collecting outcome data are not masked to exposure status.
Analogous to interviewer bias, except that it affects outcome classification.
Observers may be more likely to count cases among participants with high risk/exposure profiles.
Ex: A sample of nephrologists were sent patient case histories with a simulated race randomly assigned to each case. When the case history identified the patient as black, the nephrologists were twice as likely to diagnose hypertensive end-stage renal disease as when the patient was labeled white.

Recall bias

Occurs when participants are asked about past exposure after the outcome in question has occurred (or not) - as in cross-sectional and case-control studies.
Respondents' memories vary according to whether or not they experienced the outcome, especially if the exposure is a commonly known risk factor for the disease they have experienced.
Those with the disease and exposure are more likely to recall exposure - increased sensitivity.
Those with the disease and not exposed are more likely to report exposure - reduced specificity.
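A sketch of how differential recall could manufacture an association, assuming a hypothetical case-control table with no true association. Cases are given higher sensitivity and lower specificity of exposure report than controls; all values are illustrative.

```python
def observed_counts(exposed, unexposed, sens, spec):
    """Expected reported (exposed, unexposed) counts."""
    return (sens * exposed + (1 - spec) * unexposed,
            (1 - sens) * exposed + spec * unexposed)


def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical truth: same exposure distribution in cases and controls.
true_cases = (100, 200)
true_controls = (100, 200)

# Cases recall more completely and over-report; controls under-report.
a, b = observed_counts(*true_cases, sens=0.95, spec=0.90)
c, d = observed_counts(*true_controls, sens=0.70, spec=0.98)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(a, b, c, d)
print(true_or, round(observed_or, 2))
```

With these assumed values the observed odds ratio moves away from the null even though the true odds ratio is 1.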

Random error

Occurs when the observed association in your sample differs from the true association in the source population due to chance.
Affects the precision of the estimate.
Based on the concept of hypothetical study repetitions - your sample is only one of innumerable samples you could have selected by chance, and random error captures the distribution of the possible samples.
If there is only random error, then over many study repetitions your estimates of association will center around the truth in the source population.
Ex: a tape measure that only has marks for feet (not inches or smaller) would give imprecise measures of height, so you wouldn't be able to measure height with more precision than feet. This would not introduce systematic error, just random error around the true value.

Respondent bias

Participants with high risk/exposure profiles may be more likely to report outcome of interest

Actual population

Population that met eligibility requirements for the study and would have participated and reported information similarly had they been contacted about participation.
Ex: All non-institutionalized English- or Spanish-speaking adults (18-65 years of age) living in the 9 Bay Area counties in July-December 2005 who otherwise met inclusion criteria and would have participated in the study if they'd been approached - this represents millions of people.

Study population (sample)

Population that participated in the study.
Data are collected on the study population.
On the basis of statistical analysis, we aim to make inferences about associations in other populations.
Is a subset of the actual population.
People who were eligible may not have participated because they decided not to or because they weren't contacted.
Ex: 2421 Bay Area adult subjects recruited into a study of the relation between exercise and obesity.

Specificity

Probability of being classified as negative among true negatives

Sensitivity

Probability of being classified as positive among true positives
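Sensitivity and specificity can be estimated from a validation sample in which the true status is known. A minimal sketch with hypothetical counts:

```python
def sensitivity(true_pos, false_neg):
    """P(classified positive | truly positive)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """P(classified negative | truly negative)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation data: 100 truly positive, 100 truly negative.
print(sensitivity(true_pos=80, false_neg=20))
print(specificity(true_neg=90, false_pos=10))
```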

Bias analysis

Quantitatively addressing bias due to systematic errors (residual confounding, selection bias, and information bias).
AKA sensitivity analysis, uncertainty analysis, risk analysis.
Requires information that is unavailable by the nature of the problem, so it relies on a "best guess" for the missing information.
Conclusions depend on the assumptions made and must be interpreted as such - you don't get a "correct" MOA from this approach; instead, if the bias-corrected MOA is similar to the original, you could say that systematic error due to that source (for example, that confounder) was not important in measuring the original MOA.

Selection bias and colliders

Selection bias is equivalent to conditioning on a collider.
Occurs when both the exposure and outcome affect whether or not an individual is included in the study data.
Effectively, you analyze your study data within one stratum of a selection factor.
You only have data on those who participated, and/or only on those retained in the study (not LTFU).
If exposure and outcome both cause a selection factor, then you are conditioning on a collider.

Selection bias in cohort studies

Selection bias related to *differential participation* by exposure and outcome status is less likely to occur because subjects are selected and enrolled prior to disease onset.
If exposure status does affect selection into the study, it is still unlikely that future outcome experience will affect selection.
*Differential loss to follow-up* is a substantial concern in cohort studies.
- Sometimes considered information bias
- If outcome and exposure both affect the probability of being lost, then selection bias is present
- Has the same practical effect as differential selection into the study: it makes the exposure-outcome association in the sample systematically different from the association in the source population - biased

Correcting for selection bias

Sometimes accomplished via inverse probability weighting (IPW) estimators for longitudinal studies.
IPW assigns a weight to each selected subject so that she accounts not only for herself but also for others with similar characteristics (similar values of E and L in the example) who were not selected.
IPW creates a pseudopopulation in which censored subjects from the original population are replaced by copies of similar uncensored subjects in the final study population.
The effect measure in the pseudopopulation is unaffected by selection bias, assuming that the outcomes in the uncensored subjects truly represent the unobserved outcomes in the censored subjects - this is met if the probability of selection is calculated conditional on E and all factors that independently predict both selection and the outcome (untestable).
IPW can also be used to adjust for the confounding present in these examples, using inverse probability of treatment weighting (IPTW), as opposed to the inverse probability of censoring weighting (IPCW) described above. This involves modeling the probability of exposure given past exposure and L, so that the denominator of the weight is the conditional probability of receiving one's own treatment history.
If using both IPCW and IPTW, multiply the weights together to create a final weight for each subject.
Relies on the positivity assumption - if it is not met, g-estimation could be used instead.
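A sketch of inverse probability of censoring weighting with stratum-level counts. The cohort is hypothetical, and it is assumed that within each (E, L) stratum the uncensored subjects have the same risk as the censored ones. Weights are 1 / P(uncensored | E, L), estimated here as enrolled / uncensored.

```python
# Each row: (E, L, enrolled, uncensored, events among uncensored).
# Hypothetical counts; true risks are 0.26 (E=1) and 0.13 (E=0), RR = 2.
strata = [
    (1, 1, 400, 200, 100),
    (1, 0, 600, 540, 54),
    (0, 1, 400, 320, 80),
    (0, 0, 600, 600, 30),
]


def risk(exposure, weighted):
    """Risk among uncensored subjects, optionally IP-weighted."""
    events = people = 0.0
    for e, _l, enrolled, uncensored, cases in strata:
        if e != exposure:
            continue
        # Each uncensored subject stands in for enrolled/uncensored people.
        weight = enrolled / uncensored if weighted else 1.0
        events += weight * cases
        people += weight * uncensored
    return events / people


complete_case_rr = risk(1, weighted=False) / risk(0, weighted=False)
ipw_rr = risk(1, weighted=True) / risk(0, weighted=True)
print(round(complete_case_rr, 3), round(ipw_rr, 3))
```

In this constructed example the unweighted analysis of uncensored subjects understates the risk ratio, while the weighted pseudopopulation recovers it.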

Probabilistic bias analysis

Specifies probability densities for the biasing relationships.
Yields a distribution of bias-corrected estimates.
Akin to simple bias analysis, but with probability densities specified for the bias parameters to account for random error/uncertainty in them.
Steps:
1) ID sources of important bias
2) ID bias parameters needed
3) Assign a probability distribution to each bias parameter (uniform, normal, beta, gamma, trapezoidal/triangular)
4) Randomly sample from each bias parameter distribution
5) Use simple bias analysis to correct for the bias
6) Save the corrected estimate, then repeat 4 and 5 many times
7) Summarize corrected estimates with central tendency and simulation interval
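Steps 3 to 7 can be sketched for non-differential exposure misclassification. This is a minimal illustration: the observed table, the uniform distributions, the iteration count, and the seed are assumptions, and a fuller analysis would typically also incorporate conventional random error.

```python
import random
import statistics


def correct_counts(obs_exposed, obs_unexposed, sens, spec):
    """Simple bias analysis: back-calculate expected true counts."""
    total = obs_exposed + obs_unexposed
    exposed = (obs_exposed - (1 - spec) * total) / (sens + spec - 1)
    return exposed, total - exposed


def probabilistic_bias_analysis(cases, controls, n_iter=5000, seed=1):
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_iter):
        # Steps 3-4: draw bias parameters from assumed uniform distributions.
        sens = rng.uniform(0.75, 0.95)
        spec = rng.uniform(0.85, 0.99)
        # Step 5: correct the observed table for this draw.
        a, b = correct_counts(*cases, sens, spec)
        c, d = correct_counts(*controls, sens, spec)
        if min(a, b, c, d) <= 0:
            continue  # skip draws that imply impossible counts
        # Step 6: save the corrected estimate.
        estimates.append((a * d) / (b * c))
    # Step 7: median and 95% simulation interval.
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return statistics.median(estimates), lower, upper


# Hypothetical observed (exposed, unexposed) counts.
median_or, lower, upper = probabilistic_bias_analysis((170, 130), (100, 200))
print(round(median_or, 2), round(lower, 2), round(upper, 2))
```

Fixing the seed makes the run reproducible; other distributions from step 3 (for example `rng.triangular` or `rng.betavariate`) could be substituted for the uniform draws.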

Selection biases and cancelling

Theoretically, different selection biases can cancel each other out if the magnitude of the bias in selection of cases is the same as in selection of controls - in practice this is impossible to know.
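One way to see the cancelling is through the selection bias factor for the odds ratio, where the observed OR equals the true OR times the factor. The selection probabilities below are hypothetical; in a real study they are unknown.

```python
def selection_bias_factor(s_case_exp, s_case_unexp, s_ctrl_exp, s_ctrl_unexp):
    """Ratio of selection probabilities: observed OR = true OR * factor."""
    return (s_case_exp * s_ctrl_unexp) / (s_case_unexp * s_ctrl_exp)


# Exposure raises participation by the same ratio (1.5) in cases and
# controls, so the two distortions cancel.
print(selection_bias_factor(0.9, 0.6, 0.3, 0.2))

# Exposure raises participation only among cases: no cancelling.
print(selection_bias_factor(0.9, 0.6, 0.3, 0.3))
```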

DAGs for selection bias in cohort studies and RCTs

These could depict LTFU or missing data/nonresponse for the outcome - they depict missing data on the outcome for any reason in a standard analysis restricted to subjects with complete data (C=0).
Can also represent studies in which C is agreement to participate (1=yes), and as such depict volunteer or self-selection bias.
When considering LTFU in RCTs, figures a and b are relevant, but not c and d, because unmeasured common causes of exposure and other variables shouldn't exist.
Volunteer/self-selection bias can't occur in RCTs in which subjects are randomized after agreeing to participate - figures a and b are eliminated because exposure can't cause C, and figures c and d are eliminated because, with random exposure assignment, there can't be common causes of exposure and other variables.
Can also represent healthy worker bias.

Effect of misclassification of a confounder

Unpredictable.
Can be considered residual confounding.
The degree of residual confounding will usually differ across strata of the variable in question, distorting the degree of apparent heterogeneity (EMM) - it can induce the appearance of EMM when none is present or mask EMM when it is present.

Effect of differential misclassification

Unpredictable - can bias towards or away from the null.
Possible to evaluate on a case-by-case basis and speculate on direction.
Typically considered a more serious problem because 1) the direction is not known in advance, and 2) bias away from the null risks presenting an artificially inflated estimate.

Collider

Variable in a DAG with 2 arrows pointing into it

Selection bias and time-varying exposures

Variables affected by previous exposures can induce selection bias if conditioned on, but they are also confounders of the subsequent exposure-outcome relationship and so need to be adjusted for - stratification will eliminate confounding for the subsequent exposure while inducing selection bias for the prior exposure.
*Example: effect of ART (E=E0+E1) on viral load (D)*
The greater the unmeasured true immunosuppression (U), the greater the viral load (D) and the lower the CD4 count (L).
Treatment increases CD4 count, and the presence of low CD4 increases the probability of receiving treatment.
To estimate the effect of E without bias, we need to estimate the effects of E0 and E1 without bias simultaneously, but this isn't possible with standard methods: lack of adjustment for L results in confounding bias, while adjustment for L induces selection bias.

Effect of conditioning on a collider

Will induce a spurious association between the exposure and outcome, or alter the true association between the exposure and outcome

