EBM Exam 1

What is prevalence rate?

# of cases/total population

What is/are (a) degree(s) of freedom?

# of values in a statistical calculation that are free to vary. For chi-squared: degrees of freedom = (# of rows - 1)*(# of columns - 1); this is usually 1 (a 2x2 table)

25. Be able to define and interpret the six most common measures of risk NNH

100/(absolute risk increase %) number of pts needing tx before 1 is harmed (round up)

25. Be able to define and interpret the six most common measures of risk NNT

100/(absolute risk reduction as a percent) number of pts that need to be treated before 1 pt benefits (round up)
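
A minimal sketch of the NNT/NNH arithmetic, assuming risks are entered as proportions (0 to 1) rather than percents, so 1/(absolute difference) replaces 100/(percent difference). The function name `number_needed` is made up for illustration.

```python
import math

def number_needed(risk_control: float, risk_treated: float) -> int:
    """Return NNT (if treatment lowers risk) or NNH (if it raises risk)."""
    absolute_difference = abs(risk_control - risk_treated)
    if absolute_difference == 0:
        raise ValueError("risks are equal, so NNT/NNH is undefined")
    # Round to 9 decimals first so floating-point noise does not push
    # an exact answer (e.g. 5.000000000000001) up to the next integer.
    return math.ceil(round(1 / absolute_difference, 9))

# Control risk 25%, treated risk 12.5%: ARR = 12.5%, NNT = 100/12.5 = 8
print(number_needed(0.25, 0.125))
```

A result that is not a whole number is rounded up, as the cards state: an ARR of 37.5% gives 100/37.5 = 2.67, reported as 3.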

Explain ways to deal with these threats to validity 2 general approaches + threat-specific approaches

2 ways = address in design phase (prevention/experimental control) or address in analysis phase (evaluate/statistical control) *random error (chance) - design phase = increase sample size; standardize/refine/automate data collection instruments; ensure good inter-rater reliability for multiple measurements analysis phase = statistical analysis to find probability that observed result occurred by chance alone (p-values, CIs) *systematic error (bias) design = best possible study design, minimize preventable biases analysis = collect more info, check for similar results in other studies *confounding design = apply RCT when possible, specify inclusion/exclusion criteria, measure confounders accurately analysis = stratify, match (age/gender/other variables), multivariate statistical analysis

Describe the 2-sample t test and the Wilcoxon rank sum test

2-sample t test (use with normal dist. or non-normal dist. with v large sample size): independent 2-sample test compares the mean of one variable between 2 mutually-exclusive groups with the same variance - calculate t-value and df (df = n1 + n2 - 2); if the 95% CI for the difference in means contains 0, fail to reject null. Paired (dependent) test compares means of related groups (eg same individual measured twice) - simplifies to a one-sample test on the differences, df = n-1. Wilcoxon rank sum: non-parametric alternative; rank all values from both groups together and compare the sum of ranks between groups
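
The independent 2-sample calculation can be sketched as below, assuming equal variances (the pooled form) as the card describes. The function name and the sample data are illustrative only; the p-value would still be looked up from a t table using the returned df.

```python
import math
import statistics

def pooled_two_sample_t(group1, group2):
    """Return (t-value, df) for an independent 2-sample t test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2  # sample sizes of groups 1 and 2, minus 2
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t_value = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t_value, df

t_value, df = pooled_two_sample_t([1, 2, 3], [4, 5, 6])
print(round(t_value, 3), df)
```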

Interpret a p-value and 95% Confidence Interval

95% confidence interval = expectation of the frequency with which repeated samples would yield an interval containing the true value (true population mean GPA = 3.45; we are 95% confident that the interval 3.36-3.53 from one sample contains the true population mean) p-value = probability of having observed results (or more extreme) if null hypothesis is true (ie how likely our observed results would be from chance alone, without actual differences between groups) GOOD: p-value of less than 0.05 (sometimes less than 0.01) = less than 5% chance/1 in 20 probability of seeing our observations from chance alone *we reject null hypothesis BAD: p-value 0.05 or greater = observations are reasonably compatible with chance alone *we fail to reject null hypothesis (but do not say we accept it/it has been proven because maybe a larger sample size would have picked up on a true difference)

Describe the ANOVA and Kruskal-Wallis test

ANOVA = mean comparison for 3+ groups; df = # of groups - 1 for "between-group"; df = n - # of groups for "within-group" KW (Kruskal-Wallis) = Wilcoxon rank sum test extended to 3+ groups

Explain 5 As of EBM information cycle

Assess (by): Ask - Map question (can be about prevention, therapy, harm/causation, diagnosis, prognosis, outcomes, economics, qualitative, guidelines), use PICO (patient, intervention, comparison/control, outcome) Acquire (Sources: what types/levels of evidence exist, where can it be found?) (Select: systems, syntheses, summaries, synopses, studies + pre/un filtered) Appraise - Interpret (results, validity, importance, applications) Apply - to specific situations

31. Know how to interpret a correlation coefficient, and when one should be used Pearson? (discuss types + strengths of association, and p value considerations; what is r^2) Spearman?

CC measures degree of association between 2 numerical variables - plot all data pairs, make line of best fit, measure distance of observations from LOBF Pearson: r = sample estimate ranging from -1 to +1, assumes that variables have a linear relationship - r greater than 0 = positive association - r less than 0 = negative association - if abs. value of r: *greater than 0.8 THEN association = very high *0.5 to 0.8 THEN association = moderate *less than 0.5 THEN association = weak/none - P-value + CI can be determined from r; MAKE SURE to interpret p and r together to avoid getting a false significance from p (a large sample can give a small p even when r is weak) - r^2 = proportion of variability of var1 attributable to linear relationship to var2 Spearman: - non-parametric equivalent to Pearson, calculated on ranks (use for non-normal distribution, when 1 or both variables are ordinal, or with a small sample size) - does not assume a linear relationship (only that the variables move together consistently) - symbol = r(sub)s
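
A short sketch of Pearson's r computed from its definition, with r^2 derived from it. The helper name `pearson_r` and the data pairs are illustrative assumptions, not from the course material.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired numerical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear, positive
print(r, r ** 2)  # r^2 = share of variability explained by the linear relationship
```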

Differentiate between descriptive vs. explanatory studies, and between observational vs. experimental studies

Descriptive: describe population/disease by recording events/observations/activities; eg: case reports/series. cross-sectional (surveys) *do NOT provide info about causality or clinical efficacy* Explanatory: help elucidate aspects of disease etiology/prognosis (causality) or intervention efficacy by comparing/explaining differences between populations based on interventions; eg. case-control, cohort, RCT Observational studies: make INFERENCES about cause and effect by observing and comparing subjects in their settings rather than controlling an intervention; eg. cross-sectional, case-control, cohort experimental: draw cause/effect CONCLUSIONS by measuring exposures and outcomes in a CONTROLLED setting + intervention; eg. RCT

T/F: P-value is the probability that null hypothesis is true Small P-value means that the alternative hypothesis is absolutely true P value does not indicate whether results are valid P-value does not tell us about the magnitude of the difference/effect

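F: it is the probability of getting the observed results (or more extreme) by chance alone if the null hypothesis is true F: small p-value still leaves room for alt hypothesis to be wrong T: does not speak to result validity T: does not speak to the magnitude of the difference/effect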

Describe the goals, pros, cons, and uses for case reports. Classify as descriptive/explanatory and observational/experimental

Goals: descriptive, describes experiences/observations and establishes a baseline for more complex research, crudely quantifies incidence rates complexity: varies with # of cases uses: precursor to tx/harm/prognosis analysis; documenting exceptional patients/effects Pros: cheap, quick, GENERATES TESTABLE HYPOTHESES (statements of "potential") cons: anecdotal information, no control, high bias potential descriptive + observational

Describe the goals, pros, cons, and uses for cross-sectional studies. Classify as descriptive/explanatory and observational/experimental

Goals: estimate prevalence/correlations between exposures and outcomes by collecting survey-type evidence; measure multiple patient-reported factors at once complexity: varies uses: precursor to tx/harm/prognosis analysis, patient-reported data collection to estimate correlation pros: good starting point for research, quick and cheap cons: biased, low participation, difficulty sequencing reported events descriptive, observational

Describe the goals, pros, cons, and uses for case-control studies. Classify as descriptive/explanatory and observational/experimental.

Goals: study potential CAUSES of outcomes to estimate association - START WITH OUTCOME and work back to find cause uses: harm analysis, searching for common factors among sample cohort to narrow hypothesis for research pros: quick/cheap, ID future research, rare outcomes cons: high bias (esp recall), historical data, tough to select control group explanatory/observational

24. Understand the difference between incidence and prevalence, and between cumulative incidence and incidence rate (IR)

Incidence = new cases (water exiting tap) prevalence = existing cases (water in bathtub) cumulative incidence (CI) = (# of new events during a specific time period)/(# in population at risk during that time period) (assumes risk applies for entire time period) incidence rate (IR) = incidence density; (# of new events)/(total person-time at risk) (allows time at risk to differ between individuals) - add individual risk times together to get total person-years at risk, then divide the # of new events by this number to get # of incidents per specified number of person-years
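
The two incidence measures can be sketched side by side. This assumes the standard definitions (new events over population at risk, and new events over summed person-time); the function names and numbers are invented for illustration.

```python
def cumulative_incidence(new_events, population_at_risk):
    """Proportion of the at-risk population with a new event in the period."""
    return new_events / population_at_risk

def incidence_rate(new_events, person_years_at_risk):
    """New events per person-year; person_years_at_risk holds one entry per subject."""
    total_person_years = sum(person_years_at_risk)
    return new_events / total_person_years

# 5 new cases among 100 people at risk over the study period
print(cumulative_incidence(5, 100))
# 2 events among 4 subjects followed for 1, 2, 0.5 and 0.5 years (4 person-years)
print(incidence_rate(2, [1.0, 2.0, 0.5, 0.5]))
```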

Three functions of EBM

Know, Do, Understand know what to do (intuition/conjecture vs practice based on evidence) do what is known (memorizing vs. reasoning; 5 As) understand what is done (implicit evidence from observation v explicit from appraisal)

Describe major study designs in clinical research - rank from strongest to weakest

My Rude Cousin Casey Creates ChAos Meta-analysis RCT Cohort Case-control Cross-sectional Case studies

2 principles of EBM

Not all evidence is equal (there is a hierarchy) evidence alone is not enough (it needs to be contextualized)

29. Identify the p-value corresponding to a calculated test value using a distribution look-up table

OK

Name the components of a well-built question and infer these from a case study, scenario or research report.

PICO patient/population intervention control outcome (should have a timeline)

35. Understand when Poisson regression would be used rather than the other types of regression already learned about, and be able to interpret results from a Poisson regression

Poisson = count outcome (# of events) *analyzes counts/rate of occurrence of outcomes (rate = [# of events]/[person-time for f/u]) *results in rate ratio (interpreted like a relative risk) helps answer whether ind var are associated with rate of event and whether event rates are different across groups

How would you interpret R^2 values in a multivariate regression model fit?

R2 tells us the % of total variation in y that can be explained by all of the ind var together (xk) R2 of 0 - 0.2 = low predictive ability R2 of 0.2 - 0.4 = moderate/strong predictive ability R2 of 0.4 - 0.6 = very strong predictive ability R2 of > 0.6 are rare (biologic variability always remains)

21. Understand the difference between Type I and Type II Error

T1 = convicting an innocent man - rejecting innocence when it is true (rejecting null when it is true) * alpha = max probability of T1 error that we can accept (ie the statistical significance cutoff for the p-value; usually a = 0.05) T2 = letting a guilty man walk (failing to reject null when it is false) * beta = max acceptable probability of T2 error (usually b= 0.2 or 0.1) (1-b = power of test to detect a difference that exists)

25. Be able to define and interpret the six most common measures of risk odds ratio

a way to estimate RR without having size of treated and control populations (we start with outcome + no-outcome populations) only good in rare disease (otherwise, OR will over-estimate association) OR for cases: [(exposure cases)*(no exposure controls)]/[(no exposure cases)*(exposure controls)] alternatively OR = [(exposure cases)/(no exposure cases)]/[(exposure controls)/(no exposure controls)] OR greater than 1 means cases are likelier (by a factor of OR) to have been exposed to factor OR=2.01, 95% CI=(0.89-4.35) "vitamin B12 deficiency was not associated with past use of H2RA/PPI" What if 95% CI=(1.03-4.35)? People with vitamin B12 deficiency were 2.01 times more likely to have used H2RA/PPI in the past
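
As a rough sketch, the OR formula above can be computed from the four cell counts. The 95% CI here uses the common log-based (Woolf) approximation, which is an assumption on my part since the card does not say how its interval was obtained; the counts are invented.

```python
import math

def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) using a log-based 95% CI (assumed method)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of ln(OR): square root of the summed reciprocal cell counts
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

odds_ratio, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
significant = not (lower <= 1 <= upper)  # a CI containing 1 means no significant association
print(odds_ratio, round(lower, 2), round(upper, 2), significant)
```

As in the vitamin B12 example, an OR above 1 is only read as an association when the interval excludes 1.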

25. Be able to define and interpret the six most common measures of risk risk difference/absolute risk reduction or increase

absolute value of (control risk - tx risk) tx group risk = a/(a+b) control risk = c/(c+d) RD = |c/(c+d) - a/(a+b)|

26. Know which measures of risk are most commonly calculated for each study design (RCT, cohort, and case-control)

all measures of risk are calculated for RCT and cohort only odds ratio is calculated for case-control

Distinguish between background and foreground questions.

background = about conditions/problems; suited to general knowledge and typical of new learners 2 parts: -Question Root (who/what/when/where/why) + Verb -Problem (disease/syndrome/pattern etc) foreground = about choices (pts/populations); PICO questions: suited to specialized knowledge/decision-making, typical of experts

Identify a normal distribution and its characteristics

bell-shaped, mean = median, no significant skew 68% of subjects are within 1 SD 95% of subjects are within 2 SDs
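
The 68%/95% figures can be checked with Python's standard library. This is only a sketch on a standard normal curve (mean 0, SD 1), and `share_within` is a made-up helper name.

```python
from statistics import NormalDist

def share_within(k_sd):
    """Fraction of a normal distribution lying within k_sd standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)

print(round(share_within(1), 3))  # about 0.683
print(round(share_within(2), 3))  # about 0.954
```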

Differentiate between reliability and validity

both pertain to data/variables and studies as a whole reliability = reproducibility (all data points fall within a narrow range/all studies show similar data) validity = accuracy of representation of a "true value" (all data points fall close to the expected/true value) -internal study validity: quality of study + analysis' support of conclusions (assessing research design and statistical issues) -external study validity = GENERALIZABILITY: how well do study results apply to unstudied "real" subjects

What do you use to test for association between 2 categorical variables? Between 2 numerical variables?

categorical = chi-square test numerical = correlation

Identify different types of variables based on how they are measured. Why is this kind of classification important? 2 types and 4 subtypes?

categorical or qualitative/numerical or quantitative categorical: nominal (cannot be ranked i.e. purple/yellow/green or male/female - if exactly 2 options for a nominal variable, then it is considered DICHOTOMOUS) v ordinal (can be ranked or ordered i.e. age groups, stage of disease) numerical: discrete (integer values only) v continuous (can have fractional/decimal data points) important because type of variable (nominal/discrete or any numerical) requires a different/specific statistical test or model Association between 2 categorical variables = chi square association between 2 numerical = correlation

Describe measures of central tendency and variability, and use the measures to quickly assess available evidence

central tendency = mean (avg), median (middle value of set/50th percentile), mode (most frequent value) if mean = median, distribution is symmetrical if mean and median are not equal, distribution is asymmetrical variability = range (min-max), percentile (90th percentile = 90% below and 10% above), interquartile range (central 50%, between 25th and 75th percentile), variance (measures data spread), and standard deviation (average of deviations from the mean aka square root of variance)
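
A quick sketch of these summary measures using Python's `statistics` module. The data set is invented, and the variance and SD are the sample versions (n - 1 denominator).

```python
import statistics

def describe(values):
    """Return common measures of central tendency and variability."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "sd": statistics.stdev(values),           # square root of the variance
    }

summary = describe([2, 4, 4, 5, 10])
print(summary)
# mean (5) is above median (4): a hint that the data are skewed to the right
```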

Describe the goals, pros, cons, and uses for RCTs. Classify as descriptive/explanatory and observational/experimental.

goal: test drug/tx efficacy - 3 components = true randomization + blinding, control of intervention, control group uses: therapy/harm analysis Pros: best cause-effect design, FDA gold standard, least confounding Cons: costly and long, artificial (tough to generalize), biases exist, unethical/unfeasible esp for AEs explanatory + experimental

Describe the goals, pros, cons, and uses for cohort studies. Classify as descriptive/explanatory and observational/experimental.

goals: one cohort with drug of interest and one cohort without - start with CAUSE and assess outcome uses: assessing relative risk (risk of exposed group relative to risk of non-exposed group) (non-exposed can be without any drug tx or with different drug tx than exposed group), harm/prognosis analysis Pros: can be prospective (high degree of control) or retrospective (rapid); more realistic than RCTs, easy to time-sequence; bias control; ideal for rare exposures Cons: may be costly and time-consuming, LOSS OF FOLLOW UP, tough to show causation explanatory/observational

Describe the goals, pros, cons, and uses for meta-analyses. Classify as descriptive/explanatory and observational/experimental.

goals: review/analyze multiple RCTs to reach more comprehensive/generalizable conclusions uses: therapy/harm analysis Pros: produces single result, cheaper than RCT Cons: publication/lit retrieval bias, variations between individual studies explanatory + experimental

What is ITT?

intention to treat study should account for all subjects, even those who dropped out

33. Distinguish between linear, logistic and survival regression and know when to use each

linear = dependent variable is numerical and normally distributed (continuous); result is slope estimate; can be both simple or MV logistic = dependent variable is dichotomous (yes/no); need this instead of linear because outcome doesn't follow normal distribution; results in OR; can be both S or MV survival = analyze time to event; concerned with how long it takes for the event to happen and may account for unequal follow-up (use with RCTs and cohort studies) (eg. time to death, time to tumor recurrence, etc) Kaplan-Meier survival curve: *simple, crude, # events, # at risk, timing of events - results in median survival time + rate estimate Cox-proportional hazards regression: *accounts for censoring (unlike t tests or linear regression) and for time (unlike risk/odds ratios or logistic regressions) *results in hazard ratio *can be S or MV Poisson: *rate (# events/person-time) *results in rate ratio *can be both S or MV

34. Be able to interpret results from each type of regression model (linear, logistic, survival), for different types of independent variables (numerical, dichotomous)

linear: - estimate intercept by setting all ind var to 0 - slope: *if ind var is numerical: Y changes by b for each 1-unit increase in x *if ind var is categorical: b represents mean difference in Y between a non-ref and the referent category (can treat b-values as coefficients for plug and chug - just make sure to enter "0" for the xn value when computing the referent category) logistic: - can be simple or multivariate (numerical or categorical ind var) - can be: *conditional if studies are matched *multinomial if more than 2 nominal outcomes (non-dichotomous) *ordinal if outcomes are ranked/ordinal - OR for each ind var (always greater than 0) *if ind var = numerical, each 1-unit increase in X multiplies the odds of Y by OR *if ind var = categorical, then odds of Y are OR times lower/higher for non ref than ref group for X survival: - outcome has 2 parts: did event happen (dichotomous y/n) + time to event (numerical) - define study start (can be rolling), end (can account for f/u), and event - subjects with no evidence of event are "censored": we don't have all the info about subject's event time *L-censoring = event occurred before study start; mitigated in study design *R-censoring = finished study but no event during this time OR didn't finish study and didn't have event while enrolled OR subject lost in f/u before event - Kaplan-Meier: *cumulative probability = chance that subject will be event free (will survive) at a given time after study start (as time from start increases and more conditional probabilities are calculated, multiply them together to get a cumulative prob) *cumulative incidence = 1-survival probability - median survival time = time when 50% of subjects have had event - log-rank = crude/simple version of survival analysis *tests for differences in survival between groups, unadjusted for covariates *to compare survival in different groups at specific time points, can use a chi-square test - Cox proportional hazards regression: *hazard = instant risk of having event at time "t" for individual "i" *this regression tests the effect of one variable on the hazard of the outcome, can adjust for covariates (estimates hazard ratio)
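
The Kaplan-Meier step of multiplying conditional probabilities can be sketched as follows. The input format (a list of `(time, event_observed)` pairs, with `False` meaning censored) and the function name are assumptions chosen for illustration.

```python
def kaplan_meier(observations):
    """Return [(time, cumulative survival probability)] at each event time.

    observations: list of (time, event_observed); event_observed=False means censored.
    """
    event_times = sorted({time for time, event in observations if event})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t (censored ones drop out later)
        at_risk = sum(1 for time, _ in observations if time >= t)
        events = sum(1 for time, event in observations if time == t and event)
        # Conditional probability of surviving past t, multiplied into the running product
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Events at t=1, 3, 4; one subject censored at t=2
curve = kaplan_meier([(1, True), (2, False), (3, True), (4, True)])
print(curve)
# Cumulative incidence at each time is 1 minus the survival probability
print([(t, 1 - s) for t, s in curve])
```

The censored subject counts in the at-risk group at t=1 but not at t=3, which is how censoring is accounted for without treating it as an event.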

Be able to state the null and alternative hypotheses for a PICO question

null hypothesis - no effect/difference exists between compared groups (no difference in recovery time between zinc lozenge and placebo group) alternative hypothesis - a true difference exists between groups

Identify different ways to display different types of data (descriptive biostats) one variable vs 2

one variable = frequencies categorical = bar or pie chart continuous = histogram (GPA bar chart with a non-integer scale) 2 variables = relationships categorical = stacked or clustered bar numerical = scatter plot

Describe the one-sample t test and the sign test

one-sample t test (use with normal dist. or non-normal dist. with v large sample size): only 1 sample compared to known value (calculate t-value and df; df = sample size - 1 - then use the df row of the t table to find the closest p-value and judge significance) sign test: non-parametric, based on the median - is there a difference between the sample median and the known (population) median?

one v two-tailed test

one-tailed makes an assumption about the direction of difference, two-tailed does not

30. Describe the difference between parametric and non-parametric tests

parametric = use for normally distributed numerical data (1/2/3+ groups); tests include one-sample t test (one group), two-sample t test, paired t-test, ANOVA non-parametric = use for non-normally distributed numerical data (1/2/3+ groups); tests include sign test, Wilcoxon rank sum test, and Kruskal-Wallis

25. Be able to define and interpret the six most common measures of risk relative risk reduction

percent of baseline risk that is removed as result of tx (this is the same as a percent error calc) RRR = RD/(control risk) = |control risk - tx risk|/(control risk)

point v interval estimate

point estimate = one estimate that is a "best estimate" of a true value using available data (a single value, like a mean) interval estimate = range of true value used when making estimates (eg. confidence interval)

22. Define power and identify factors influencing study power

power of test = ability to detect a difference that exists; 1-b affected by: sample size - larger = more power observation variability - less variability = more power effect of interest - larger effect size = more power significance level (alpha) - increased T1 error = increased power (power increases if we are willing to accept greater T1 error probability) for clinical trials, power should be 80% or higher
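
To see how each factor moves power, here is a rough sketch using a normal approximation for a two-sided comparison of two group means. The formula choice, the function name, and all numbers are my own assumptions for illustration, not a method given in the course.

```python
import math
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-group mean comparison (normal approximation)."""
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)
    # Ignores the negligible chance of rejecting in the wrong direction
    return standard_normal.cdf(abs(effect) / standard_error - z_critical)

print(round(approx_power(0.5, 1.0, 64), 2))              # roughly 0.8
print(round(approx_power(0.5, 1.0, 128), 2))             # larger sample, more power
print(round(approx_power(0.5, 1.0, 64, alpha=0.10), 2))  # larger alpha, more power
```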

Recognize the three main threats to study validity

random error (chance) - hopefully influence is small due to "Law of Averages" (drug x used by 20% of population but study sample may under or over-report this percentage for the pt subset due to chance error in random sampling) systematic error (bias) - (asking parents of RH pts and control pts about ClearPhed use - RH parents may have better recall and skew data) confounding - (following users and non-users of ClearPhed for 1 year to observe rate of RH incidence - result = higher incidence in ClearPhed users; this result does not account for other contributing factors to RH which may have been higher in the user group and confounded the data)

32. Understand the difference between "simple" versus multivariate regression for each, give: dependent variable type # of independent var equation intercept estimate slope estimate when do you use each?

regression = similar to correlation but assuming one variable is dependent on other simple = first step of analysis, unadjusted regression dependent variable = numerical (y), assume normal distribution # of independent var = 1 (x) equation: y = bx + a intercept estimate = a slope estimate = b multivariate = addition of covariates to regression; quantifies effect of one ind var on dependent var while controlling/adjusting for covariates dependent variable = numerical # of independent = varies (can be numerical, categorical, or ordinal) equation: y = a+ b1x1 + b2x2 + b3x3 + .... bkxk intercept estimate = a slope estimate = b1 - bk use simple for crude/first-step association use MV when other variables are expected to impact relationship between dependent and independent variables (include confounders as an "other variable")
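
For the simple case y = bx + a, the slope and intercept estimates can be sketched with ordinary least squares. The helper name and data points are illustrative only.

```python
def simple_linear_regression(xs, ys):
    """Return (intercept a, slope b) for y = b*x + a by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # value of y when x = 0
    return intercept, slope

a, b = simple_linear_regression([0, 1, 2], [1, 3, 5])
print(a, b)  # y rises by b for each 1-unit increase in x
```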

25. Be able to define and interpret the six most common measures of risk relative risk - understand calculation - interpret RR value - interpret CI for RR - how to interpret RR in therapy v harm studies

risk of tx group/risk of control group tx group risk = a/(a+b) control risk = c/(c+d) RR = [a/(a+b)]/[c/(c+d)] -if RR less than 1, then risk is greater in control than tx -if RR greater than 1, then risk is greater in tx than in control -if RR is 1, then control and tx risk are the same in confidence interval for RR value, if "RR=1" falls into the 95% CI range then there is no significant difference between the tx and control groups in the outcome if "RR=1" is not in this interval, then our RR value is significantly different from 1 and there is significant difference between the groups therapy (outcome = benefit, eg recovery): RR should be greater than 1 if tx is effective (if the outcome counted is the bad event the tx aims to prevent, an effective tx gives RR less than 1) harm (of tx): RR should be greater than 1 if tx increases AEs
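
A small sketch tying RR to the other risk measures from the same 2x2 table, where a/b are tx-group subjects with/without the outcome and c/d are control-group subjects with/without it. The function name and counts are invented for illustration.

```python
def risk_measures(a, b, c, d):
    """Return tx risk, control risk, RR, RD and RRR from 2x2 cell counts."""
    tx_risk = a / (a + b)
    control_risk = c / (c + d)
    risk_difference = abs(control_risk - tx_risk)
    return {
        "tx_risk": tx_risk,
        "control_risk": control_risk,
        "RR": tx_risk / control_risk,
        "RD": risk_difference,
        "RRR": risk_difference / control_risk,
    }

# 10 of 100 treated and 20 of 100 control patients have the outcome
print(risk_measures(10, 90, 20, 80))
```

Here RR is below 1, so the risk is greater in the control group than in the tx group.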

23. Identify factors influencing sample size

sample size depends on needed power (1-b) and desired alpha value - greater power = greater sample size needed - smaller alpha/significance level (0.01 vs 0.05) requires larger sample (less conservative cutoffs can manage with smaller sample sizes) -time, cost, ethics are LIMITING factors - need to minimize T2 errors, sufficient power to detect meaningful differences are EXPANDING factors

What is blinding? (single double triple)

single = subject doesn't know which group double = subject and clinician don't know triple = no one knows

Surrogate (substitute) v final (true) outcome?

surrogate = something used to substitute for the final outcome when calculating true outcome is not feasible (true = L ventricle function, surrogate = blood pressure)

define bias

systematic deviation from the truth

27. Be able to identify the most appropriate statistical test based on the distribution of the data, the type of data, and the number of groups in the data 28. Interpret the results (p-value, 95% confidence interval) of each statistical test in terms of the null or alternative hypothesis McNemar's test

use for 2 dependent categorical values (ie one individual measured at 2 different times)

27. Be able to identify the most appropriate statistical test based on the distribution of the data, the type of data, and the number of groups in the data 28. Interpret the results (p-value, 95% confidence interval) of each statistical test in terms of the null or alternative hypothesis chi-squared test

use for 2 independent categorical variables (tests association or independence) chi-squared value must meet/exceed the critical value set by the chosen alpha and the degrees of freedom in order for the null hypothesis (no association) to be rejected; equivalently, if the 95% CI for the difference in proportions does not contain 0%, then reject null hypothesis
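
A minimal sketch of the chi-squared calculation for a table of observed counts, compared against the tabled critical value of 3.841 (df = 1, alpha = 0.05). The function name and the counts are illustrative assumptions.

```python
def chi_squared(table):
    """Return (chi-squared statistic, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

statistic, df = chi_squared([[10, 20], [20, 10]])
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # from a chi-squared look-up table
print(round(statistic, 3), df, statistic >= CRITICAL_VALUE_DF1_ALPHA_05)
```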

27. Be able to identify the most appropriate statistical test based on the distribution of the data, the type of data, and the number of groups in the data 28. Interpret the results (p-value, 95% confidence interval) of each statistical test in terms of the null or alternative hypothesis Z test

use for one proportion (from a nominal variable) being compared to a known value if the absolute value of calculated Z is less than the critical value (equivalently, the associated p-value is greater than alpha), we fail to reject the null hypothesis if known value is contained in the 95% CI, then there is no statistically significant difference (fail to reject null)
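
The decision rule can be sketched for a one-sample test of a proportion. This assumes the usual large-sample Z statistic with the standard error computed under the null; the function name and counts are invented.

```python
import math
from statistics import NormalDist

def one_sample_proportion_z(successes, n, known_proportion):
    """Return (Z, two-sided p-value) comparing a sample proportion to a known value."""
    sample_proportion = successes / n
    standard_error = math.sqrt(known_proportion * (1 - known_proportion) / n)
    z_value = (sample_proportion - known_proportion) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_value)))
    return z_value, p_value

z_value, p_value = one_sample_proportion_z(60, 100, 0.5)
alpha = 0.05
# Reject the null only when p < alpha (equivalently |Z| exceeds the critical value, 1.96 here)
print(round(z_value, 2), round(p_value, 4), "reject null" if p_value < alpha else "fail to reject null")
```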

27. Be able to identify the most appropriate statistical test based on the distribution of the data, the type of data, and the number of groups in the data 28. Interpret the results (p-value, 95% confidence interval) of each statistical test in terms of the null or alternative hypothesis Fisher's exact test

used for independent categorical values with a very small sample size (less than 5 in any one cell of the chi-squared table)

when you see a median presented for a data point instead of the mean value, what might this mean?

usually this means that the researchers are treating the data as skewed

What do central tendency/variance measures tell us?

what the typical values are our confidence level in point estimates indications for which statistical test to use

when do you use correlation?

when you have 2 numerical variables and neither is considered the main effect examine whether one variable can be a substitute for the other

