HSC4501-Exam 2
When is FP better than FN?
When a condition has a high risk of transmission or death ex. Blood borne pathogens in blood donations, mammography
For simple analysis
Use a 2 x 2 table
Methods of randomization: Timing of randomization
Usually best for randomization to occur as late as possible for each subject, until just before he/she is treated differently under the alternative interventions
How to study rare diseases/ long onset/ cardiovascular disease or cancer?
Via case-control study design
ecologic fallacy
We assume characteristics from the group are characteristics of the individuals
Relative risk
(incidence among exposed) / (incidence among non-exposed) ex. (28.0 per 1000) / (17.4 per 1000) = 1.6 The risk of developing CHF among smokers is 1.6 times the risk among non-smokers
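The arithmetic in this example can be checked with a short Python sketch. The function name `relative_risk` is an illustrative choice, not part of the course material; the two incidences are the CHF figures quoted on the card.

```python
def relative_risk(incidence_exposed: float, incidence_unexposed: float) -> float:
    """Incidence among the exposed divided by incidence among the non-exposed."""
    if incidence_unexposed <= 0:
        raise ValueError("incidence among the non-exposed must be positive")
    return incidence_exposed / incidence_unexposed


# CHF incidence per 1000: smokers vs. non-smokers (figures from the card)
rr = relative_risk(28.0, 17.4)
print(round(rr, 1))  # 1.6
```

Both incidences must be in the same units (here, per 1000) so that the units cancel in the ratio.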
Attributable risk percent
[(incidence in exposed group) - (incidence in non-exposed group)] / (incidence in exposed group) x 100
Attributable risk formula
(incidence in exposed) - (incidence in non-exposed) ex. (28.0 per 1000) - (17.4 per 1000) = 10.6 per 1000 10.6 of the 28 per 1000 incident cases in smokers are attributable to the fact that these people smoke *this means* if we had an effective smoking cessation campaign we could prevent 10.6 of the 28 per 1000 incident cases of CHF that smokers experience
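A minimal Python sketch of the attributable risk and attributable risk percent calculations, using the smoking and CHF incidences from this card. The function names are illustrative assumptions, and the percentage is derived here from those two incidences rather than quoted from the course.

```python
def attributable_risk(incidence_exposed: float, incidence_unexposed: float) -> float:
    """Excess incidence among the exposed, in the same units as the inputs."""
    return incidence_exposed - incidence_unexposed


def attributable_risk_percent(incidence_exposed: float, incidence_unexposed: float) -> float:
    """Share of the exposed group's incidence attributable to the exposure, in percent."""
    if incidence_exposed <= 0:
        raise ValueError("incidence among the exposed must be positive")
    return (incidence_exposed - incidence_unexposed) / incidence_exposed * 100


ar = attributable_risk(28.0, 17.4)
ar_pct = attributable_risk_percent(28.0, 17.4)
print(f"AR  = {ar:.1f} per 1000")  # AR  = 10.6 per 1000
print(f"AR% = {ar_pct:.1f}%")      # AR% = 37.9%
```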
Kappa Statistic formula
[(percent agreement observed) - (percent agreement expected by chance alone)] / [100% - (percent agreement expected by chance alone)]
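The kappa formula can be sketched in Python as follows. The counts are hypothetical (two observers each rating 100 subjects as positive or negative). Computing chance-expected agreement from the observers' marginal totals is the usual approach, but it is an assumption here because the card does not spell it out.

```python
def kappa_from_percents(observed_pct: float, expected_pct: float) -> float:
    """Kappa = (observed - expected) / (100 - expected), all in percent."""
    if expected_pct >= 100:
        raise ValueError("expected agreement must be below 100%")
    return (observed_pct - expected_pct) / (100.0 - expected_pct)


def kappa_from_counts(both_pos: int, a_pos_b_neg: int, a_neg_b_pos: int, both_neg: int) -> float:
    """Kappa for two observers (A and B) rating the same subjects as positive/negative."""
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed_pct = 100.0 * (both_pos + both_neg) / n
    # Marginal totals for each observer
    a_pos = both_pos + a_pos_b_neg
    b_pos = both_pos + a_neg_b_pos
    a_neg, b_neg = n - a_pos, n - b_pos
    # Agreement expected by chance alone, from the product of the marginals
    expected_pct = 100.0 * (a_pos * b_pos + a_neg * b_neg) / (n * n)
    return kappa_from_percents(observed_pct, expected_pct)


# Hypothetical ratings: 80% observed agreement, 50% expected by chance
print(round(kappa_from_counts(40, 10, 10, 40), 2))  # 0.6
```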
Factors that affect reliability in test results
- Intrasubject variation - Intraobserver variation - Interobserver variation
What are two ways to measure variation?
- Kappa statistic - Percent agreement
Disadvantages of RCTs
- Often controlled settings - Not always representative of true human behavior - Ethical considerations for exposure - External validity (generalizability) - Cost
Advantages of RCT
- Prospective design/ temporality - *Randomization* reduces *confounding* even unknown factors - *Blinding* eliminates *bias* assignments of exposure by participants or investigators
Predictive value is affected by which two factors?
- The prevalence of the disease in the pop. tested - The specificity of the test being used
Alternative treatments
- Used to measure the difference between two treatments - Most often used to test a null hypothesis of equal efficacy comparing a new intervention to the best existing alternative - Study must be stopped if the new intervention does not show an effect similar to the existing therapy
Diagnostic test
*Symptomatic individuals*, tested at time that illness is suspected
Screening tests
*Asymptomatic individuals*, regularly screened at recommended time intervals
Where do you control for confounders
- Analysis phase using statistical techniques or stratification - Design phase, including matching and stratification
Randomized Controlled Trial
- Assigned exposure - Use of control group - Ability to randomize subjects to each group
Sample size in RCTs
- Determined prior to conducting the study - Larger sample sizes are desired, allows for better precision and accuracy, reducing bias and error - Based on the level of desired power to determine group/ treatment differences
case control advantages
- Faster and cheaper than cohort - studying rare diseases - studying diseases with long latency periods - study many exposures with respect to one outcome of interest
Subject selection in RCTs
- Who would benefit from new intervention? - Need to preserve internal validity of trial - Ability of subjects to provide valid and reliable data - Low probability of dropping out - Expected compliance with regimen
A test with high specificity
- results in more false negatives - Lack of follow-up techniques - Person may be "lost" from the system
A test with high sensitivity
- results in more false positives - Will need more sophisticated and more expensive tests
For complicated analysis
-Stratified analysis -Regression analysis
RR interpretation
- RR = 1: no difference between groups - RR > 1: positive association (*risk factor*) between treatment and outcome - RR < 1: negative association (*protective factor*) between treatment and outcome
The OR can be used to estimate the RR only when?
1)Cases in the study are representative of all people with the disease in the population in terms of exposure status 2) Controls in the study are representative of all people without the disease in the population in terms of exposure status 3) The disease is rare
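The rare-disease condition can be illustrated with a small Python sketch comparing the OR and the RR on two hypothetical 2 x 2 tables. The cell layout is an assumption: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. All counts are invented for illustration.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """RR = [a/(a+b)] / [c/(c+d)]."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = ad / bc."""
    return (a * d) / (b * c)


# Hypothetical tables with the same RR: one rare disease, one common disease
rare = (20, 9980, 10, 9990)    # about 0.1-0.2% develop the disease
common = (500, 500, 250, 750)  # 25-50% develop the disease

for label, cells in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(*cells), 2), round(odds_ratio(*cells), 2))
# rare 2.0 2.0
# common 2.0 3.0
```

When the disease is rare, b and d are almost equal to the group totals, so ad/bc stays close to the RR. When the disease is common, the OR overstates it.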
Sources of selection bias
1. Differential diagnosis (Surveillance or referral into the study group or sample) 2. Differential refusal (Non-response rates among subjects by either disease or exposure status)
descriptive/observational studies
1. Ecologic studies 2. Case series 3. Case reports
Disadvantages of cohort design
1. Expensive in time and money 2. Difficult organizationally 3. *Not applicable for rare diseases*
Advantages of cohort studies
1. Gives best estimate of disease *incidence* 2. Gives best estimate of risk
Cohort disadvantages
1. Not good for diseases of low incidence (rare diseases) 2. Not good for diseases with a long latency period
cohort advantages
1. Rare *exposures* can be examined 2. Temporal relationship is clear 3. Allows for direct determination of absolute risk of disease
Forms of Observation Bias
1. Recall bias 2. Interviewer bias 3. Follow-up bias
Crossover study design
A method of comparing two or more treatments or interventions in which the subjects, upon completion of the course of one treatment, are switched to another. - usually used for studies of short-term response - requires a "wash-out" period between regimens to be sure that effects of the first regimen have dissipated before the second regimen is begun
Selection Bias
A systematic error in the process of identifying and recruiting members of the study sample to either outcome or exposure • Does not relate to sample representativeness or generalizability
The "gold standard" in screening
A test that is considered to be the most accurate among all the known tests *they are usually expensive or invasive and used as a follow up test to confirm results of a cheaper/less invasive screening test*
Prevalence ratio formula
PR = (prevalence of disease in exposed) / (prevalence of disease in unexposed) PR = [A / (A + B)] / [C / (C + D)]
Population Attributable Risk (PAR)
AR x Prev (exposure)
Specificity
Ability of the test to identify correctly those who *do not have the disease*
Sensitivity
Ability of the test to identify correctly those who *have* the disease
Attributable risk
Amount of disease incidence that can be attributed to a specific exposure Ex. How much of the lung cancer risk experienced by smokers can be attributed to smoking? *indicates the potential for prevention if the exposure could be eliminated*
THE DIFFERENCE BETWEEN BIAS AND CONFOUNDING
Bias creates an association that is not true, but confounding describes an association that is true, but potentially misleading.
How can the success of participant blinding be evaluated?
By asking each subject for a best guess about the group in which he/she participated.
Which of the following study designs allows us to establish temporality between the exposure and the outcome? A. Cross-sectional B. Cohort C. All of the above D. Case control
Cohort
Observational Analytic Studies
Cohort, Case-control
Reliability
Consistent results
Validity
Correct results
In a case-control study, which of the following is true? a) The proportion of cases with the exposure is compared with the proportion of controls with the exposure b) Disease rates are compared for people with the factor of interest and for people without the factor of interest c) The investigator may choose to have multiple comparison groups d) Recall bias is a potential problem e) a, c, and d
E
Methods of randomization: Simple random allocation
E.g., even number means: subject goes to intervention group, odd number means: subject goes to control group (problems occur if physicians know the method)
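A minimal Python sketch of the even/odd rule described on this card. The function name, the 0-99 range of the random numbers, and the `seed` parameter (included only so a run can be reproduced) are assumptions for illustration.

```python
import random


def simple_random_allocation(subject_ids, seed=None):
    """Assign each subject by the parity of a random number: even -> intervention, odd -> control."""
    rng = random.Random(seed)
    allocation = {}
    for subject_id in subject_ids:
        number = rng.randint(0, 99)  # both endpoints included
        allocation[subject_id] = "intervention" if number % 2 == 0 else "control"
    return allocation


# Hypothetical subject IDs
groups = simple_random_allocation(["S01", "S02", "S03", "S04", "S05", "S06"], seed=42)
for subject_id, group in groups.items():
    print(subject_id, group)
```

Because each subject is assigned independently, the two groups are not guaranteed to be the same size, especially in small trials.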
When is FN better than FP?
For conditions that may not change treatment recommendations or may become apparent at a later time ex. Flu tests, home pregnancy test
When is group randomization useful?
When the intervention strategy affects an entire group, or is expected to work better at the group level through social context or environment ex. fluoridation of water supplies
When is a screening program most productive and efficient?
If it is directed to a high-risk target population
Case reports
Information on one person, including symptoms and test results for the suspected condition
OR interpretation example
OR: 5.0 The odds of having been exposed are 5.0 times as high among cases of the disease as among those without the disease
Measurement of association for case-control?
Odds ratio OR= ad/bc
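The cross-product ad/bc is the odds of exposure among cases divided by the odds of exposure among controls, as the sketch below shows. The cell layout is an assumption (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) and the counts are hypothetical.

```python
def exposure_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a/c) / (b/d), which simplifies to ad/bc."""
    odds_exposure_cases = a / c     # exposed cases : unexposed cases
    odds_exposure_controls = b / d  # exposed controls : unexposed controls
    return odds_exposure_cases / odds_exposure_controls


# Hypothetical case-control data: 100 cases, 100 controls
a, b, c, d = 50, 20, 50, 80
print(exposure_odds_ratio(a, b, c, d))  # 4.0
print((a * d) / (b * c))                # 4.0, the same value via the cross-product
```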
PAR %
PAR/ incidence (total)
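A hedged Python sketch of PAR and PAR%, reusing the CHF incidences from the attributable risk card (28.0 and 17.4 per 1000). The 25% prevalence of smoking is a made-up value for illustration. Total incidence is taken as the exposure-weighted average of the two group incidences, which is an assumption the cards do not state.

```python
def population_attributable_risk(ie: float, iu: float, prevalence_exposure: float) -> float:
    """PAR = AR x prevalence of exposure, where AR = ie - iu."""
    return (ie - iu) * prevalence_exposure


def total_incidence(ie: float, iu: float, prevalence_exposure: float) -> float:
    """Incidence in the whole population, weighting each group by its share."""
    return prevalence_exposure * ie + (1 - prevalence_exposure) * iu


def par_percent(ie: float, iu: float, prevalence_exposure: float) -> float:
    """PAR% = PAR / total incidence, expressed in percent."""
    par = population_attributable_risk(ie, iu, prevalence_exposure)
    return par / total_incidence(ie, iu, prevalence_exposure) * 100


# Incidences per 1000 from the notes; 0.25 exposure prevalence is hypothetical
print(f"{population_attributable_risk(28.0, 17.4, 0.25):.2f} per 1000")  # 2.65 per 1000
print(f"{par_percent(28.0, 17.4, 0.25):.1f}%")                           # 13.2%
```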
PR interpretation example
PR: 5.0 The prevalence of the disease among the exposed is 5.0 times the prevalence of the disease among the unexposed
Which of the following is used as a measure of reliability? A. Sensitivity B. Percent Agreement C. Specificity D. Predictive Value
Percent agreement
Types of RCTs: Clinical trials
Population: Hospital, clinic Sample size: 10-1000 Outcome prevalence: Low Intervention: Drugs, surgery, radiation Length of study: days-years Cost: expensive - Test new medication, surgical approaches, or therapies such as radiation for cancer (ex. Does the combination of metformin and insulin control Type 2 DM better than insulin alone? )
Types of RCTs: Community trials
Population: Region, city, neighborhood Sample size: 1,000 - 100,000 Outcome prevalence: Often high Length of study: years Cost: Very expensive - Test behavior modifications or efficacy of health education (ex. Does nutrition education plus exercise program lead to greater weight loss than exercise program alone?)
RR interpretation example
RR = 5.0 Those with the *exposure* are *5.0* times as likely to develop the *disease* as those *without the exposure*
Single blinding
Study subject
Double Blinding
Subject & experimenter
Triple blinding
Subject, experimenter & statistician
Formula for Specificity
TN/ (TN + FP)
Formula for sensitivity
TP/(TP + FN)
Negative predictive value
The probability that a person with a negative test result is truly disease free refers to what value? NPV = TN / (TN + FN)
Positive predictive value
The probability that a person with a positive test result is truly positive refers to what value? PPV = TP/(TP + FP)
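The four formulas (sensitivity, specificity, PPV, NPV) can be computed together from the cells of a 2 x 2 table. This is a small sketch with hypothetical counts; the function name `screening_metrics` is an illustrative choice.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return sensitivity, specificity, PPV and NPV from 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease free
    }


# Hypothetical screening results: 100 diseased, 900 non-diseased
for name, value in screening_metrics(tp=80, fp=100, fn=20, tn=800).items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.800
# specificity: 0.889
# ppv: 0.444
# npv: 0.976
```

Sensitivity and specificity divide by the true disease status, while PPV and NPV divide by the test result.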
Which of the following is not a type of randomization used for RCTs? A. Timed Randomization B. Randomization within Strata C. Group Randomization D. Simple Random Allocation
Timed randomization
Attributable Risk (AR)
[a/(a+b)] - [c/(c+d)] -difference between exposed and unexposed groups
attributable risk percent
{[a/(a+b)] - [c/(c+d)]} / [a/(a+b)] x 100
Relative risk formula
[a/(a+b)]/[c/(c+d)]
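The a, b, c, d formulas for RR, AR and AR% can be gathered into one small Python class. This is a sketch under an assumed cell layout (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without). The counts are hypothetical, chosen so the incidences match the 28.0 and 17.4 per 1000 used elsewhere in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # exposed, disease
    b: int  # exposed, no disease
    c: int  # unexposed, disease
    d: int  # unexposed, no disease

    @property
    def incidence_exposed(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def incidence_unexposed(self) -> float:
        return self.c / (self.c + self.d)

    def relative_risk(self) -> float:
        return self.incidence_exposed / self.incidence_unexposed

    def attributable_risk(self) -> float:
        return self.incidence_exposed - self.incidence_unexposed

    def attributable_risk_percent(self) -> float:
        return self.attributable_risk() / self.incidence_exposed * 100


# Hypothetical cohort: 3000 exposed (84 cases), 5000 unexposed (87 cases)
table = TwoByTwo(a=84, b=2916, c=87, d=4913)
print(round(table.relative_risk(), 2))             # 1.61
print(round(table.attributable_risk() * 1000, 1))  # 10.6 (per 1000)
print(round(table.attributable_risk_percent(), 1)) # 37.9
```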
The higher prevalence in the screened pop. has led to ______
a marked increase in the positive predictive value using the same test
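The effect of prevalence on positive predictive value can be seen by holding the test fixed and varying only the prevalence. In this sketch the 99% sensitivity and 95% specificity are hypothetical values chosen for illustration.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV = TP / (TP + FP), with TP and FP written as population fractions."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test, three different screened populations
for prevalence in (0.01, 0.05, 0.20):
    ppv = positive_predictive_value(0.99, 0.95, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}")
# prevalence 1%: PPV 17%
# prevalence 5%: PPV 51%
# prevalence 20%: PPV 83%
```

This is why directing a screening program at a high-risk population makes it more productive: a positive result is far more likely to be a true positive.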
Cohort Study
an exposure is assessed and then participants are followed prospectively to observe whether they develop the outcome. *participants are grouped by exposure status*
Group randomization generally has ___ statistical power than a trial involving an equal # of persons w/ individual randomization
less
Relative risk is a ____________
measure of association
Attributable risk is a ________
measure of impact
Case Reports and Series Disadvantages
•Potentially very biased since often generated solely from a clinical setting •Non-representative of the population as a whole •May reflect associations that exist solely by chance
Measurement of association for cross-sectional studies
prevalence ratio (PR)
Types of Bias
selection bias and information bias (•How data is gathered)
Case-control study
starts with the disease and attempts to determine the chain of events leading to it *Incidence or prevalence NOT determined* Selection of cases : convenience sample Selection of controls: must be chosen from source population
Differential Referral
•Recruitment of subjects from sources that are not accessed equally by those at risk or with the disease •Recruiting cases solely from private practices, while controls are from private and public facilities (E.g. Recruiting post partum women from hospital maternity vs. recruiting women from a private practice)
cross-sectional design
•Single point/period of examination •Collects data from respondents at the individual level •Exposure and Outcome asked at the same time *prevalence*
95% Confidence Intervals
•Statistically produced in analysis phase of study •We are 95% confident that the true value lies within the calculated range (X, X) •Based on α = 0.05
Interobserver variation
variation between observers
intraobserver variation
variation in the reading of test results by the same reader. Due to: - Good day/bad day - Distractions - Subjective interpretation
Intrasubject variation
variation within individual subjects ex. blood pressure of a patient might vary throughout the day
Differential refusal or non-response
•Subject willingness to participate may be influenced by either their disease or exposure status -Persons asked to be controls in a study of behavioral risks for HIV may be unwilling to participate because they do not wish to reveal risk factors
Ecologic Advantages
•Analysis is based on data from secondary sources E.g., National or international agencies •Generally fast and easy to do •Inexpensive •Can use a variety of different data types/sources including publicly available and free data
Methods of randomization: Randomization within strata
•Assures that an important prognostic factor is balanced between treatment and control groups without relying solely on simple randomization (within each stratum, half of the members go to the treatment group and the other half to control group)
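Randomization within strata can be sketched in Python as below. The function name, the `(subject_id, stratum)` input format and the `seed` parameter are assumptions for illustration. The rule of shuffling each stratum and splitting it in half follows the description on the card.

```python
import random
from collections import defaultdict


def randomize_within_strata(subjects, seed=None):
    """subjects: iterable of (subject_id, stratum) pairs. Returns {subject_id: group}."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for subject_id, stratum in subjects:
        by_stratum[stratum].append(subject_id)

    allocation = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)      # random order within the stratum
        half = len(ids) // 2  # with an odd count, the extra subject goes to control
        for subject_id in ids[:half]:
            allocation[subject_id] = "treatment"
        for subject_id in ids[half:]:
            allocation[subject_id] = "control"
    return allocation


# Hypothetical subjects stratified by age group
subjects = [("S1", "under 50"), ("S2", "under 50"), ("S3", "under 50"), ("S4", "under 50"),
            ("S5", "50 and over"), ("S6", "50 and over"), ("S7", "50 and over"), ("S8", "50 and over")]
print(randomize_within_strata(subjects, seed=7))
```

Each age group contributes two subjects to treatment and two to control, so the prognostic factor is balanced by construction rather than by chance.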
Threats to reliability
•Chance •Bias •Confounding
Ecologic study
•Examines information at the aggregated population level only •Characteristics of the country, state, county, etc. •The difference between populations can be described but the etiology of disease cannot be established
Case Series
•Information on a group of people •Case series information can be used for preliminary case definitions of a new disease and development of testing guideline
Case Reports/Series Advantages
•New conditions identified •Change in old disease definitions •New strain or presentation of an outcome
OR interpretation
•OR = 1 -> the exposure is not related to the disease/outcome •OR > 1 -> the exposure is positively (risk factor) related to the disease/outcome •OR < 1 -> the exposure is negatively/inversely (protective factor) related to the disease/outcome
PR interpretation
•PR = 1 -> the exposure is not related to the prevalence of the disease/outcome •PR > 1 -> the exposure is positively (risk factor) related to the prevalence of the disease/outcome •PR < 1 -> the exposure is negatively/inversely (protective factor) related to the prevalence of the disease/outcome
Differential Diagnosis
•Particularly a problem for facility-based research (persons who have a risk factor may be tested and diagnosed at a faster rate, and be in care at a different rate, than those without the risk) •E.g. BrCA in the family, lung cancer testing in current smokers
Differential Surveillance
•Particularly a problem when using passively reported data (passive data = automatically reported; active data = researchers go out and collect) •Persons who are treated by different care providers may have differential rates of disease reporting (E.g. Chlamydia is reported much more often in women because it can be asymptomatic and men do not get annual exams)
Bias
•Systematic error in a study which results in an incorrect estimate of association between the exposure and the outcome •Bias is generally introduced by the researchers as a result of the study design, instrument, sampling method *No statistical fix*
Observation/Information Bias
•Systematic error in the collection or measurement of data related to the exposure or the outcome Can be related to both
case control disadvantages
•Temporal relationship difficult to define •Causal inference is less clear •Rate of outcome cannot be estimated directly •Insufficient for studying rare exposures •Particularly susceptible to both selection and information bias
Ecologic Disadvantages
•Unable to link disease and exposure factors at the individual level •Ecologic Fallacy •No way to control for confounding •Measurement of exposure and disease may be imprecise or inaccurate