Epidemiology - Quick Review 3


screening

"The presumptive identification of an unrecognized disease or defect by the application of tests, examinations or other procedures that can be applied rapidly." The clinician must target screening in the preclinical phase of disease.

sensitivity

% of people with the disease who test positive for the disease = TP/(TP+FN) = TP/all people with the disease. Sensitivity (%) + FN (%) = 100%

specificity

% of people without the disease who test negative for the disease = TN/(FP+TN) = TN/all people without the disease. Specificity (%) + FP (%) = 100%
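
The two definitions above can be checked with a quick calculation (Python sketch; the 2x2 counts are made up for illustration):

```python
# Hypothetical 2x2 table of test results vs. true disease status
TP, FN = 90, 10    # people WITH the disease: test positive / test negative
FP, TN = 40, 160   # people WITHOUT the disease: test positive / test negative

sensitivity = TP / (TP + FN)   # proportion of diseased who test positive
specificity = TN / (TN + FP)   # proportion of non-diseased who test negative

print(sensitivity)  # 0.9
print(specificity)  # 0.8
```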

shape of distribution

- If symmetric then mean = median - If left skewed then mean < median (long tail to left) - If right skewed then mean > median (long tail to right)
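
A minimal numeric illustration of the skew rule (Python; the sample values are invented):

```python
# Right-skewed sample: one large value pulls the mean above the median
data = [1, 2, 2, 3, 3, 3, 4, 20]

mean = sum(data) / len(data)    # 4.75
s = sorted(data)
n = len(s)
median = (s[n // 2 - 1] + s[n // 2]) / 2 if n % 2 == 0 else s[n // 2]  # 3.0

print(mean > median)  # True: mean > median, consistent with right skew
```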

Indicators of valid results:

- Independent blind comparison with reference standard - Appropriate sample selection - Clear method description to allow for replication - Evaluation of whether results influence decision to perform reference standard

Causation

- Association does not prove causation - non-causal explanations (including study biases like measurement error, selection bias, or confounding) may cause a spurious association and threaten internal validity - if the study has internal validity, we can then consider whether a causal inference can be made. Bradford Hill criteria: aspects of the risk factor-disease association that must be examined to determine causation: 1. Temporality (the only NECESSARY criterion) 2. Strength of association 3. Dose-response relationship (biologic gradient) 4. Consistency 5. Biologic plausibility

cross sectional studies

- cannot easily establish causality - can estimate the prevalence of a disease - cannot estimate the incidence of a disease - they are subject to confounding - they measure exposure and disease at one point in time

cohort: crude vs. adjusted effects

- cohort studies are observational, so exposures are not randomized - because exposures are not randomized, there are baseline differences in the two groups being compared that may influence the outcome (confounding) - the crude estimate may be "confounded" (ie risk of disease may seem higher due to confounding) - must adjust for confounding in regression model

How to prevent selection bias

- design study so that participation is NOT impacted by the exposure or the outcome - attempt to recruit all cases within the source population - ensure controls and cases are selected from the same target population (population-based controls are ideal) - minimize non-response and refusals - minimize loss to f/u (in cohort studies and RCTs)

Retrospective cohort study

- exposure status is determined from information previously recorded - incidence of disease (or outcome) is determined from the time of exposure to the start of the study - the outcome of interest has already occurred by the time the study starts and is ascertained when the study begins

normality assumption of t test

- if the underlying distribution is normal, it is okay to use a t-test when you estimate the sample standard deviation - if the underlying distribution is not normal, it is only okay to use a t-test if the sample is large enough and not too skewed - the t test is very robust against violations of the normality assumption

Prospective cohort study

- information on the exposure status of the cohort members is determined at the start of the study - identify new cases of disease (or outcome of interest) from the start of the study moving forward - the outcome of interest has not occurred when the study starts

descriptive statistics for numerical variables

- mean and std (if normally distributed) - median and range (if skewed) - mode. Express with box plots or histograms; must take into account the shape of the distribution

prerequisites for successful screening

cannot screen all diseases, so we choose those that are important: -high prevalence -high death or disability. Early identification of the disease must reduce death and/or disability: -allow prompt and effective treatment -prolong survival

case-control study

- observational study in which subjects are selected on the basis of outcome status - people with the outcome of interest are cases - people without the outcome of interest are controls - prior exposures are determined in cases and controls - retrospective direction of inquiry regarding risk factors - must identify cases and select controls (ideally, controls should be a sample of the study base from which the cases emerged. controls must be selected from people who could be cases)

Measures of association and study design for prospective studies (cohort or RCT)

- risk ratio - rate ratio (need person-year info) - HR (from Cox proportional hazards model, need person-year information) - OR (unlikely to use because RR is better)

non-parametric test

- t tests require the outcome variable to be approximately normally distributed - non-parametric tests are based on RANKS instead of means and standard deviations - non-parametric tests are less powerful than parametric tests if the population is close to normal, but more powerful if the distribution is skewed. Examples (non-parametric test vs. parametric counterpart): - Wilcoxon signed-rank test vs. one-sample or matched-pair t test - Wilcoxon rank-sum (Mann-Whitney) test vs. two-sample t test - Kruskal-Wallis test vs. ANOVA

properties of t distribution

- the mean of the distribution is equal to 0 - the variance is equal to v/(v-2), where v is the degrees of freedom (v > 2) - the variance is always greater than 1 but is close to 1 when there are many degrees of freedom; with infinite df, the t distribution is the same as the standard normal distribution

proportions that are commonly called "rates"

-case-fatality rate -mortality rate -attack rate

purposes of surveillance

-characterize disease patterns and trends -detect disease outbreaks -develop clues about possible risk factors -identify cases for further investigation -identify high-risk population groups to target intervention -monitor impact of prevention and control programs -project health needs

why quantify disease?

-characterize disease patterns and trends (rising level of obesity in US) -detect epidemics (e.g., the SARS epidemic) -identify cases for research (especially useful for rare diseases/cancers) -evaluate prevention and control programs -project health needs

purpose of diagnostic tests

-establish a diagnosis in symptomatic patients -screen for disease in asymptomatic patients -provide prognostic information in patients with established disease -confirm that a person is free from a disease. The purpose of a diagnostic test is really to move the estimated probability of the presence of a disease toward either end of the probability scale. All approaches to gathering clinical information can be considered diagnostic tests (ie history, physical exam)

ambispective cohort study

-evaluate past records and collect data from groups that are followed into the future

how to determine efficacy of screening program?

-mortality is preferred endpoint -best determined by randomized trial -mortality must be reduced in order to recommend screening

Fisher's exact

-non-parametric test to use in place of chi-square with a 2x2 table when there are expected counts <5 -gives an exact p-value rather than the chi-square approximation, but is harder to compute -requires two categorical variables with two categories each

case counts

-useful for investigating disease outbreaks (an epidemic that occurs suddenly and within a confined geographic area) -epidemic curve: number of cases (y-axis) against time of onset of disease (x-axis)

key sources of data for surveillance

1. National vital statistics from NCHS (mortality data) 2. CDC's congenital malformations registry 3. NCI's Surveillance, Epidemiology, and End Results (SEER) program 4. International Agency for Research on Cancer (IARC) 5. Centers for Medicare & Medicaid Services (CMS) hospital discharge data 6. National population health surveys from the NCHS: NHANES, BRFSS

3 ways to quantify disease occurrence

1. case counts (numbers) 2. proportions (ratio: # of cases/some variant of ppl at risk) 3. rates (ratio: # of cases in a given time/people at risk and time) ~ velocity of the disease

3 ways to quantify disease

1. counts 2. proportions - point and period prevalence - cumulative incidence 3. rates - incidence rates

3 main sources of epidemic

1. point transmission 2. person-to-person transmission 3. continuous transmission

Selection Bias

A form of sampling bias due to systematic differences between those who are selected for a study (or agree to participate) and those who are not selected (or refuse to participate). Occurs when different people will have different probabilities of being in the study depending on their exposure or outcome. Selection bias can arise from: - procedures used to select subjects - factors that influence study participation - factors that influence participant attrition

In a retirement community of 2000 men and women, 600 are found to have speech-frequency hearing loss at initial screening with audiometry, and 154 new cases of hearing loss are found at subsequent screening one year later. A. What is the estimated prevalence of hearing loss at initial screening? B. What is the approximate 1-year risk of developing hearing loss? C. What is the incidence rate? D. What is the estimated annual prevalence at the end of 1 year-f/u?

A. Prevalence = 600/2000 B. Approximate 1-year risk (cumulative incidence) = 154/(2000-600) C. Incidence rate = 154 cases / 1400 person-years (approximating one year of follow-up for each of the 1400 people at risk) D. Period prevalence = (600+154)/2000
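
The answers above can be computed directly (Python, using the numbers given in the problem):

```python
N = 2000           # community size
prevalent = 600    # cases found at initial screening
new_cases = 154    # new cases at the 1-year follow-up screening

prevalence = prevalent / N                        # A: 0.30
at_risk = N - prevalent                           # 1400 disease-free at baseline
risk_1yr = new_cases / at_risk                    # B: 1-year cumulative incidence
rate = new_cases / at_risk                        # C: cases per person-year (approx.)
period_prevalence = (prevalent + new_cases) / N   # D: 0.377
```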

Cohort studies

Advantages: - can determine incidence in exposed and unexposed groups - can assess temporal associations - minimizes bias in exposure and subject selection - good to evaluate rare exposures Disadvantages: - can take years and be costly - loss to follow-up may introduce bias in the outcome - not good for rare diseases

case-control studies: advantages vs. disadvantages

Advantages: - relatively inexpensive - can use small sample size - great for rare diseases - can evaluate many exposures Disadvantages: - very susceptible to bias (accurate exposure info is hard to obtain; outcome and exposure are known when the study starts) - can't determine risks in exposed and unexposed - not good for temporal sequence - not good for rare exposures

t test assuming equal variance

df = (n1-1) + (n2-1)

Type I and II errors

Alpha - generally ranges from 0.01 to 0.1 - use low alpha if important to avoid type I (FP) error (ie testing efficacy of a potentially dangerous medication) - if alpha=0.05, then there is a 5% chance of incorrectly rejecting the null Beta - generally ranges between 0.05 and 0.2 - use low beta if important to avoid type II (FN) error (ie providing evidence to reassure the public that living near a toxic dump is safe) - if beta=0.10, there is a 10% chance of missing an association of a given magnitude, or a 90% chance of finding an association of that size - reduce errors by increasing sample size

Outcomes of hypothesis testing

Alpha (type I error) - FP: incorrectly reject null Beta (type II error) - FN: incorrectly fail to reject null Power = 1-beta

Selection bias in cross-sectional studies

Can arise due to sampling: - survivor bias: occurs when survivors of a disease are sampled instead of all cases of a disease. The bias stems from the issue that survivors may have less aggressive forms of disease and different exposures than the people who died - volunteer bias / membership bias: people who join groups tend to be systematically different from people who do not join groups - non-random sampling schemes (e.g., snowball sampling, convenience sampling) Can arise due to non-participation: - non-response bias: those who choose not to respond may be systematically different from those who do (this is especially problematic in survey research)

Selection bias in case-control studies

Cases are already more motivated to participate than controls because they have the disease of interest - if EXPOSURE status ALSO affects the likelihood of being in the study (or vice versa), then selection bias occurs. Berkson's bias arises when hospital controls are used: if the exposure probability observed in hospital controls is lower than it is in the target population (P' < P), then the observed OR is an overestimate of the true association

Clinical and statistical significance

Clinical significance - determination is subjective. Depends on the magnitude of effect and unit of increase. Statistical significance - p-value (HT) - confidence interval (significant if it doesn't include the null)

Chi-square

Dichotomous categorical independent and outcome variables. Perform to determine if there is a relationship between two categorical variables measured on the same subjects, ie null: low birth weight (y/n) and smoking status of mother are independent (not associated). Compute the expected value for all cells. Example: Expected count for (smoker and LBW) = P(smoker) * P(LBW) * Total. Idea: how likely is it that we'll observe a chi-square value this large or larger if there is no association between smoking and LBW? df = (# columns - 1)*(# rows - 1). For a 2x2 table, df = 1

Non-differential misclassification

Equal misclassification in the groups being compared: exposure (or outcome) is misclassified to the same degree among exposed and unexposed (or cases and controls). Both cases and controls under-report, so the OR is biased toward the null (underestimation of the true effect; a conservative bias). Example: In a case-control study of alcohol consumption and risk of hepatocellular carcinoma, both cases and controls underreport exposure to alcohol

Analyzing results of RCTs (ITT)

Everyone who is randomized is analyzed (ITT). ITT is a more conservative approach to analyzing results, but it better reflects how people actually adhere to drug usage in the real world. ITT preserves the balance of measured and unmeasured confounders. ITT analysis results in bias toward the null

Cutpoints

For a diagnostic test that measures a continuous variable we must choose a cutoff point above which we consider the disease to be present, and below which we consider the disease to be absent. When you choose the cutoff point there is a trade-off: if you improve sensitivity, the specificity will suffer, and if you improve specificity, the sensitivity will suffer. Where you choose the cutpoint depends on the clinical context. If it is imperative you identify everyone with the disease (no FN), choose a highly sensitive test. If it is imperative you don't misdiagnose, choose a highly specific test. Cutpoints A, B, C: -A = high sensitivity, low specificity (no FN but many FP) -B = best balance of sensitivity and specificity (some FP and some FN) -C = high specificity, low sensitivity (no FP but many FN)

Hazard Ratio

HR is similar to rate ratio (relative risk). HR equals a weighted relative risk over the duration of a study. Analysis by ITT will pull the HR closer to the null.

Phases of RCTs

I. Unblinded, uncontrolled studies in a few volunteers to test safety, find dose II. Relatively small randomized, controlled, blinded trials: test tolerability, surrogate outcomes III. Relatively large, randomized, controlled, blinded trials to test effect of therapy on clinical outcomes IV. Large trials or observational studies after drug is approved by FDA to assess rate of SAEs and other uses

estimating cumulative incidence from incidence rate

If incidence rate is low (IR*time<10%) then cumulative incidence ~ IR*time if incidence rate is high then cumulative incidence = 1-e^(-IR*time)
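
A sketch of both versions of the formula (Python; the incidence rate and follow-up time are arbitrary illustrative values):

```python
import math

def cumulative_incidence(ir, t):
    """Exact exponential formula: CI = 1 - e^(-IR*t)."""
    return 1 - math.exp(-ir * t)

ir, t = 0.002, 5            # e.g., 2 cases per 1,000 person-years over 5 years
exact = cumulative_incidence(ir, t)
approx = ir * t             # linear approximation, valid when IR*t < 10%

print(round(exact, 6), approx)   # the two agree closely for small IR*t
```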

SnNout

If test is highly sensitive and NEGATIVE, rule out disease

SpPin

If test is highly specific and POSITIVE, rule in disease.

P value

If the null hypothesis is true, what is the probability that the difference between two groups will be at least as large as that actually observed? The P value is the probability of obtaining an effect as large as or larger than the observed effect, assuming the null hypothesis is true. - provides a measure of the strength of evidence against the null - does not provide info on the magnitude of effect. P value ranges from 0 to 1 - P ~ 0 means the association observed is unlikely due to chance alone - P ~ 1 suggests there is no difference between the groups other than that due to chance variation - the p value threshold is an arbitrary cut-point, therefore it's important to report the exact p-value. If comparing two studies, take note of the sample size in each. - p value decreases with increasing sample size, so a much larger N in one study will make it appear as if the observed association is more significant - if the sample sizes are approximately the same for two studies being compared, the p-value is related to the magnitude of the observed associations: an association farther from the null (e.g., a larger RR) will generate a smaller P value

Lead-time bias

Lead-time bias is an increase in survival as measured from detection of disease to death, without lengthening life. Patients identified by screening are diagnosed earlier therefore their "survival time" starts before patients diagnosed once the disease progresses enough to show clinical symptoms

measures of central tendency

Mean - the average/balancing point - affected by outliers (use if data is normally distributed) - should not be used with ordinal data - describes numerical data that is symmetrically distributed Median - exact middle value - not affected by outliers (use if data is skewed) - describes ordinal OR numerical data if skewed Mode - value that occurs most frequently - describes bimodal distributions

numerical variables

Measurements that can be quantified as numbers. Continuous: uninterrupted numbers for which any value is possible - weight, BP, cholesterol levels, age, salary Discrete: integers; only some numbers are possible - number of children, number of cavities, MCAT test scores

measures of variation / spread

Measures of variation give information on the spread or variability of the data values. - range (Xmax - Xmin) - percentiles/quartiles - IQR (Q3 - Q1) - std/variance (variance = sum of (Xi - mean)^2 / (n-1))
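
These spread measures can be computed with the standard library (Python; the sample values are invented):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

rng = max(data) - min(data)              # range = Xmax - Xmin
sd = statistics.stdev(data)              # sample SD (divides by n-1)
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartiles
iqr = q3 - q1                            # IQR = Q3 - Q1
```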

Hypothesis testing

Null hypothesis: there is no association between the independent and dependent/outcome variables (rate of outcome in exposed group = rate of outcome in unexposed group) RR = 1 Alternative hypothesis: RR does not equal one. At alpha of 0.05, if we get a p value < 0.05, we can be 95% confident that the observed value is not due to chance alone. We reject the null. If p > 0.05 then we fail to reject the null.

Measures of association and study design for retrospective studies (case-control)

OR (approximates relative risk under rare disease assumption. if rare disease assumption is not met OR > RR)

Odds Ratio (case-control study)

OR is a good approximation of the RR as long as the probability of the outcome in the unexposed group is less than 10%. Odds = P(A) / (1 - P(A)). OR = odds of exposure among cases / odds of exposure among controls. As with RR... OR = 1 indicates no association between exposure and outcome OR < 1 indicates exposure is a protective factor OR > 1 indicates exposure is a risk factor. When the rare disease assumption is NOT met, bias is introduced into the OR - OR > RR
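
A sketch of the computation (Python; the 2x2 counts are hypothetical):

```python
# Case-control 2x2 table:   exposed / unexposed
a, b = 40, 60    # cases
c, d = 20, 80    # controls

odds_cases = a / b                 # odds of exposure among cases
odds_controls = c / d              # odds of exposure among controls
OR = odds_cases / odds_controls    # equivalently the cross-product (a*d)/(b*c)
```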

Epidemiological study designs

Observational study: the population is observed without any interference by the investigator Experimental study: the investigator tries to control the environment in which the hypothesis is tested

Misclassification bias

Occurs when you classify people as having the wrong outcome, or the wrong exposure because there are systematic problems with the way you are measuring/getting your information 2 types: differential (worse) and non-differential

Effect of overestimation of a RR or OR

Overestimation of a RR or OR for a risk factor biases away from the null (RR' > RR) Overestimation of a RR or OR for a protective factor biases away from the null (RR' < RR)

Recall bias

Participants are asked to report on past exposures after the disease outcome has already occurred. Problem occurs in case-control and cross-sectional studies. Example: Case control study to test the association between maternal second-hand smoke exposure in pregnancy and infant birth defects. - mothers of infants with birth defects are more likely to recall second-hand smoke exposure Bias OR away from null! Less conservative

Precision vs Accuracy

Precision: - The absence of random error in a conclusion or measurement. - Reproducibility of results Accuracy: - The correctness of a study's conclusions. - A measure of how accurate or close to the truth the results are - Results reflect the TRUE CAUSAL effect in the source population

How likely is it a disease is present or absent? Predictive values.

Predictive values are measures of clinical utility. Sometimes referred to as posterior probability because it is determined AFTER knowing a test result. PPV is the proportion of people who tested positive who actually have the disease. PPV tells you the likelihood a person has the disease if their test was positive. PPV = TP/(TP+FP) = TP/all ppl with positive test result NPV is the proportion of people who tested negative who do not have the disease NPV = TN/(TN+FN) = TN/all ppl with negative test result
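
The definitions above, as a quick calculation (Python; the counts are made up for illustration):

```python
# Hypothetical 2x2 counts
TP, FP = 90, 40    # positive tests: true positives / false positives
TN, FN = 160, 10   # negative tests: true negatives / false negatives

PPV = TP / (TP + FP)   # P(disease | positive test)
NPV = TN / (TN + FN)   # P(no disease | negative test)
```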

measures of disease frequency

Prevalence - # cases / total pop - % with disease at one point in time - no units Cumulative incidence - # new cases / pop at risk - % who develop disease over given period of time - no units Incidence rate - # new cases / (# persons at risk * time observed) - number / person-years or number of persons at risk per year - includes a measure of time

ROC curve

ROC (receiver operating characteristic) curve summarizes the relationship between sensitivity and specificity. Signal (TP% ~ sensitivity) is plotted on the y-axis. Noise (FP% = 1 - specificity) is plotted on the x-axis. An excellent diagnostic test has an area under the ROC curve that approaches 1. Signal-to-noise ratio (sensitivity / (1 - specificity)) = likelihood ratio positive (LR+)

Randomized Clinical Trials

Randomized, controlled clinical trial is the gold standard for evaluating usefulness of a treatment Advantages: - Experimental design eliminates many sources of bias (randomization reduces confounding; blinding reduces misclassification of exposure and outcome) - can determine risks - good for temporal sequence - can be used for rare or common exposures Disadvantages: - expensive - loss to f/u - not good for rare outcomes - can't randomize harmful exposures

Relative Risk (RCTs and Cohort Studies)

Relative risk = incidence of disease in the treated group / incidence of disease in the control group RR = 1 indicates no association between exposure and outcome (null hypothesis) RR < 1 indicates exposure is a protective factor RR > 1 indicates exposure is a risk factor

Which measures of variability to use

STD - use when mean is used (and numeric data is symmetrically distributed) Percentiles and IQR - use when median is used (ordinal or skewed numeric data) - can use when mean is used if objective is to compare individual observations with a set of norms IQR - use to describe central 50% of a distribution regardless of shape Range - use with numerical data to emphasize extreme values

Bias

SYSTEMATIC errors by the investigators in sampling, collecting or interpreting data that threaten the internal validity of the study 3 main types: selection, misclassification and confounding

Accuracy of diagnostic tests

Sensitivity and specificity describe the validity and accuracy of the diagnostic test relative to the gold standard.

LR+

Signal-to-noise ratio (sensitivity / (1 - specificity)) = likelihood ratio positive (LR+) - LR+ > 10.0 indicates a great diagnostic test (rule in disease) - LR < 0.1 rules out disease - LR > 10 or < 0.1 generates large and conclusive changes from pre- to post-test probability - LR 5-10 or 0.1-0.2 generates moderate shifts - LR 2-5 or 0.2-0.5 generates small changes - LR 0.5-1 or 1-2 does not alter probability in any important way. Can use the LR+ to compute the predictive value: compute the pre-test odds, then pre-test odds x LR+ = post-test odds, then convert post-test odds to PPV (posterior probability)
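
The pretest-odds-to-PPV path at the end of the card can be sketched as follows (Python; the sensitivity, specificity, and prevalence are assumed values):

```python
sens, spec = 0.90, 0.95
lr_pos = sens / (1 - spec)          # LR+ = sensitivity / (1 - specificity) = 18

pretest_p = 0.10                    # assumed prevalence (pre-test probability)
pretest_odds = pretest_p / (1 - pretest_p)
posttest_odds = pretest_odds * lr_pos
posttest_p = posttest_odds / (1 + posttest_odds)   # PPV (posterior probability)

print(round(posttest_p, 3))   # a positive result raises 10% to about 67%
```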

Surveillance

Surveillance detects the occurrence of health-related events or exposures in a target population. The goal is to identify changes in disease distribution in order to prevent or control these diseases within the population

A study was conducted in children to determine the accuracy of a rapid antigen-detection test (RADT) for diagnosing group A streptococcus (GAS) pharyngitis compared to the throat culture with a blood agar plate (considered the reference standard). Both tests (throat culture and RADT) were administered to 1843 children, 3-18 years of age, in community pediatric offices. Thirty percent of the children had a positive throat culture for GAS and among these 385 had a positive RADT. Among the children who had a negative throat culture for GAS, 28 had a positive RADT. A. What is the sensitivity of RADT? B. What is the specificity of RADT?

TP = 385 TP+FN = all with disease = 0.3*1843 TN+FP = all without disease = 0.7*1843 TN = (0.7*1843) - 28 A. sensitivity = TP/(TP+FN) B. specificity = TN/(TN+FP)
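
Plugging in the numbers (Python):

```python
total = 1843
with_gas = 0.30 * total          # children with a positive throat culture
without_gas = 0.70 * total

TP = 385                         # culture-positive AND RADT-positive
FP = 28                          # culture-negative but RADT-positive
TN = without_gas - FP

sensitivity = TP / with_gas      # roughly 0.70
specificity = TN / without_gas   # roughly 0.98
```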

Target population vs external population

Target population: the population we want our results to directly affect. - within the target population, we find an actual population from which we can reasonably sample in order to get our study population External population: a larger sea of people we may or may not want our study to apply to

number needed to treat

The NNT is the number of patients who need to be treated in order to prevent one additional bad outcome. NNT = 1/ARR (the reciprocal of the absolute risk reduction). NNT is the gold standard of reporting. For ARR, if the CI includes 0 then the result is NOT statistically significant.
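
A sketch of the NNT = 1/ARR calculation (Python; both arm risks are hypothetical):

```python
risk_control = 0.20    # event risk in the control arm (assumed)
risk_treated = 0.15    # event risk in the treated arm (assumed)

ARR = risk_control - risk_treated   # absolute risk reduction = 0.05
NNT = 1 / ARR                       # treat ~20 patients to prevent 1 event
```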

Key features of Phase III RCTs

The following features add strength (internal validity) to an RCT: - prospective design (ensures temporality, which is required to determine causality) - intervention / treatment - randomization (controls for confounding) - placebo-controlled - double-blind (controls bias in outcome ascertainment)

Gold standard vs. other diagnostic test

The reference or gold standard definitively informs the presence or absence of disease. Other diagnostic tests have benefits but they're not as accurate. Throat culture (gold standard) vs. rapid strep test. Gold-standard tests are typically: -expensive -invasive -not readily available -time-consuming We typically use other diagnostic tests instead of the gold standard because they tend to be: -inexpensive -safe and painless -reliable -quick and simple

categorical variables

Two or more groups/categories being measured. Nominal: descriptive names (no natural order) - Examples: marital status, presence of disease, blood type - binary data is a categorical variable with 2 groups (yes/no) Ordinal: "ordered" data; values with an order; often numeric values but intervals between consecutive values are not equally spaced - degrees of pain (0-10) - Rankin or Likert scales - TNM stages

Effect of underestimation of a RR or OR

Underestimation of a RR or OR for a risk factor biases toward the null (RR' < RR) Underestimation of a RR or OR for a protective factor biases toward the null (RR' > RR)

Differential misclassification bias

Unequal misclassification in the groups being compared. Worse than non-differential because one group is favored over the other. The observed effect could be an overestimate or an underestimate of the true effect, but the direction can't be predicted. The amount of misclassification depends on whether one is exposed or unexposed to the risk factor, or whether one has/doesn't have the disease outcome. Differential misclassification of exposure: - recall bias, interviewer bias - occurs mainly in case-control and cross-sectional studies Differential misclassification of outcome: - observer bias, respondent bias - occurs mainly in cohort studies and RCTs Example: In a case-control study of SSRIs and congenital birth defects, cases are more likely to report SSRI exposure (recall bias)

T test

Use to assess the association between a continuous variable (outcome) and a binary variable (independent) - to use must follow normal distribution - use to evaluate whether the mean of two groups are statistically different from each other - for the t test statistic the numerator is always the signal (difference you hope to detect) and the denominator is a measure of the variability ***when looking at the differences between scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores.

Interpreting confidence intervals

Width of CI - narrow CI implies high precision - wide CI implies poor precision (usually due to small sample size and therefore high variability) Notice whether the interval contains a value that implies no change/no effect/no association - CI for a ratio ( OR, RR, HR): not statistically significant if CI includes 1 - CI for a difference between two means: not statistically significant if CI includes 0

prevalence

amount of disease already present in a population. best used to measure chronic diseases (ie diabetes) -point prevalence is proportion of disease in a population at a point in time -period prevalence is proportion of disease in a population during a period of time

point source transmission

an epidemic in which all cases are infected at the same time, usually from a single source or exposure. ie all people infected at a picnic ate the same food

continuous source transmission

an epidemic in which the causal agent (ie polluted drinking water, spoiled food) infects people as they come into contact with it over an extended period of time. (as in the case of cholera discovered by John Snow) ie. multi-state outbreak of Listeriosis linked to Cantaloupes from a farm in CO

person-to-person transmission (propagated epidemic)

an epidemic in which the causal agent is transmitted from person to person, allowing the epidemic to propagate or spread. (ie influenza)

incidence rate

how fast new occurrences of disease arise. best used to measure acute, short duration diseases and/or chronic diseases in large populations over longer times

estimating prevalence from incidence rate

if a disease is rare (very low prevalence) then prevalence is approximately equal to the incidence rate times disease duration this is because for rare diseases, the rate of incidence will approximately equal the rate at which people either die or are cured. Example: Coronary Heart Disease is decreasing in prevalence over time. Why? Mortality rate of ppl with CHD is going down due to improved treatments so disease duration has increased. Incidence rate is decreasing because we are effectively preventing CHD by modifying health behaviors (known risk factors). Prevalence is approx equal to incidence rate * disease duration (for rare diseases)

validity

internal - how accurately study results reflect target population external - "external generalizability" - how generalizable study results are to an external population *differences between study sample and actual population impact statistical inference **differences between study sample and target population introduce bias, which impacts the internal validity of the study (and undermine the study findings) ***differences between study sample and the external population impact generalizability, which hurts the external validity of the study

descriptive statistics for categorical variables

number (N), frequency (%) express by contingency tables or bar charts

Predictive value and prevalence

prevalence is considered prior probability. -as prevalence increases: PPV goes up and NPV goes down -as prevalence decreases: PPV goes down and NPV goes up -prevalence is proportional to PPV -prevalence is inversely proportional to NPV -sensitivity and specificity do not change with prevalence Example: consider breast cancer. prevalence without palpable mass is lower than prevalence with palpable mass therefore with palpable mass (greater prevalence) PPV increases and NPV decreases. Note sensitivity and specificity for a given diagnostic test remain constant regardless of changes in prevalence.
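
The prevalence effect can be demonstrated by computing PPV from sensitivity, specificity, and prevalence via Bayes' theorem (Python; all values are assumed for illustration):

```python
def ppv(sens, spec, prev):
    """PPV = true-positive fraction / (true-positive + false-positive fractions)."""
    tp = sens * prev
    fp = (1 - spec) * (1 - prev)
    return tp / (tp + fp)

# Same test (sens = spec = 0.90), different prevalence:
low = ppv(0.90, 0.90, 0.01)    # rare disease: PPV is low
high = ppv(0.90, 0.90, 0.30)   # common disease: PPV is much higher
```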

Selection bias in cohort studies

self-selection bias - occurs if people who choose to participate in the study are systematically different from those who decline to participate **disease status rarely affects participation in a cohort because the disease is not yet known. Healthy worker effect. Differential loss to f/u: if the exposure or an outcome makes the study subject less likely to continue participating in the study, then results may get distorted - consider a cohort study on depression: if people become depressed and stop participating in the study, then the number of individuals observed to be depressed will be far less than the number of individuals truly depressed. If a' < a then RR' > RR

crude mortality rates

special type of incidence rate

components and types of surveillance

surveillance consists of: -continuous data collection -data analysis -timely dissemination of info -use of data for purposes of investigation or disease control types of surveillance include: -laboratory-based -death certificates -physician notification (reporting system) -hospital discharge summaries -pharmacy records -active surveillance

cumulative incidence (risk)

the likelihood (risk) that an individual will develop a disease. commonly used to measure acute diseases and chronic diseases

age-adjusted mortality rates

to calculate age-adjusted mortality rates: 1. calculate the age-specific rates of death from people in the study population > age-specific rate of study population = # incident cases in age stratum / (# of study population in age stratum * time) 2. calculate the expected number of cases in each age stratum using the number of people from the standard population > expected # of cases = age-specific rate of study population * # of people from standard population in age stratum 3. sum the total expected # of cases. Direct age-adjusted rate = total expected number of cases / total size of the standard population
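
The three steps above, sketched with two hypothetical age strata (Python; all rates and population sizes are invented):

```python
# (age-specific rate in the STUDY population, stratum size in the STANDARD population)
strata = [
    (0.002, 70_000),   # younger stratum: rate per person-year, standard-pop size
    (0.010, 30_000),   # older stratum
]

# Step 2: expected cases per stratum = study rate * standard-population size
expected = sum(rate * std_n for rate, std_n in strata)

# Step 3: direct age-adjusted rate = total expected cases / standard total
standard_total = sum(std_n for _, std_n in strata)
adjusted_rate = expected / standard_total
```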

paired t test

to find a difference between pre and post measurements on the same individual.

t test assuming unequal variance

use the Satterthwaite approximation to find the df. The df will be much smaller than the df from the equation that assumes the variances are equal

