Exam I - Buchanan

what is the Ho and Ha for a case-control study?

Ho : OR = 1 (odds are equal in both groups) Ha: OR < or > 1 (odds are not equal in both groups)

what is the Ho and Ha for a kaplan-meier model

Ho : S1(t) = S2(t) survival properties are equal in the 2 groups Ha: S1(t) /= S2(t) survival properties are NOT equal in the 2 groups

describe the null and alternate hypothesis of a r test (pearson's test)

Ho : r = 0 (no linear association, scatterplot looks like snow) Ha : r > 0 OR r < 0 (positive or negative linear association present)

Interpret the following regression results: (select all) B: 2.8 R^2: 0.05 p value: 0.07 a. B suggests that x is associated w/ y b. B is not statistically significantly different from zero c. model does not fit well d. all of the above

b. B is not statistically significantly different from zero --> p value > 0.05, fail to reject null of B = 0 c. model does not fit well --> small R^2, only 5% of variance in y can be explained by x

Which type of variable can have a range of values? a. discrete b. continuous c. categorical

b. continuous (73.4 kg)

What is the dependent variable in a logistic model? select all a. odds b. log (odds) c. logit

b. log (odds) c. logit

interpret this: age is measured as a continuous variable and OR of 1.05 (exposure is activity level) a. those aged >42 are 1.05 x as likely as those <42 to be inactive b. odds of not being active increase by 5% for every year increase in age c. odds of not being active increase by 8.1 times as people get 1 year older

b. odds of not being active increase by 5% for every year increase in age
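
A minimal sketch of the arithmetic behind this answer, assuming the logistic model's log(odds) is linear in age. The function names are made up for illustration. It shows that a per-year OR of 1.05 means about 5% higher odds per year, and that over several years the ORs multiply rather than add.

```python
# Hypothetical illustration: OR of 1.05 per 1-year increase in age.
or_per_year = 1.05

def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio."""
    return (odds_ratio - 1) * 100

def odds_ratio_over(units, or_per_unit):
    """ORs multiply across units of x, so k units give OR ** k."""
    return or_per_unit ** units

print(percent_change_in_odds(or_per_year))   # about 5 (% higher odds per year)
print(odds_ratio_over(10, or_per_year))      # about 1.63 for a 10-year age difference
```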

What type of test would be appropriate? is BP different in patients w/ and w/o DM? a. chi square b. t test c. spearman rho d. wilcoxon

b. t test (continuous)

what is the benefit of randomization

balances differences between patients that could affect risk or outcome of study

in a cox proportional hazard, what is the dependent variable?

hazard rate

good fit of the regression line is indicated by _____

high R^2

what does the cox proportional hazard investigate?

relationship between hazard rate and independent variables

match the statistical measure with the type of study a. relative risk b. odds ratio randomized controlled trials cohort studies case-control studies

relative risk is for randomized controlled trials & cohort studies odds ratio is for case-control studies

how is cumulative incidence (risk) represented

1-S(t) 1-survival

how to interpret an aOR

(aOR - 1) x 100 = XX% higher (aOR > 1) or lower (aOR < 1) odds of experiencing an outcome in response to an exposure, adjusted for the other variables in the model

T/F correlation is able to measure strength of a NONlinear relationship

F, this is why we need to check linear relationship w/ a scatterplot

T/F want a small p value for hosmer lemeshow test to show no evidence of lack of fit

F, want a large p value (want to fail to reject the Ho of good fit)

what is the equation for relative risk?

RR = (a/(a+b)) / (c/(c+d)) = incidence of outcome w/ exposure / incidence of outcome w/o exposure; a: exposed w/ outcome, b: exposed w/o outcome, c: unexposed w/ outcome, d: unexposed w/o outcome
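
The formula above can be sketched as a small function. The 2x2 counts in the example are invented for illustration only.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table.

    a: exposed w/ outcome     b: exposed w/o outcome
    c: unexposed w/ outcome   d: unexposed w/o outcome
    """
    risk_exposed = a / (a + b)      # incidence of outcome w/ exposure
    risk_unexposed = c / (c + d)    # incidence of outcome w/o exposure
    return risk_exposed / risk_unexposed

# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
print(relative_risk(20, 80, 10, 90))  # risk is about twice as high in the exposed
```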

what r value is for a perfect + correlation? what r value is for a perfect - correlation?

+1.0 -1.0

what is the range for log(odds)

-infinity to +infinity (beneficial: the range is unbounded)

T/F a survival analysis could be appropriate for a randomized clinical trial or a case-control study design

F, randomized clinical trial or cohort study design

what is the equation for the line of regression

y = a + Bx B: slope (change y / change x)

an R^2 of 0.003 is what percent

0.3%

what are the 5 assumptions of a logistic regression

1. binary outcome (y/n data) 2. independent observations 3. independent variables not correlated (can check this) 4. independent variables and log(odds) are linear 5. large enough sample size

name the 5 assumptions for a pearson's test

1. continuous variables only 2. both variables approx normal 3. no outliers 4. linear relationship 5. independent observations

What 3 points should be checked in a regression?

1. p value 2. sign & units of B 3. R^2 value --> tells if model fits (if model isn't a good fit, then a large B means nothing)

to get the R^2 into a percent, multiply by ____

100

a multiple regression analysis predicts one dependent variable (y) from how many independent variables (x, predictors)?

2 or more each can be interval/ratio OR qualitative

in general, what % variance found by R^2 is considered a good fit?

>/= 50%

give examples of scenarios for censorship

ANYONE who doesn't make it to the end of the study, regardless of reason, is censored withdrawals loss to follow up not experiencing event during time of study

T/F correlation = causation

F!!!!!

T/F censoring does not reduce the # of patients who contribute to the curve

F, censoring DOES reduce the # of patients who contribute to the curve

T/F survival function and hazard function are not related

F, they are related

T/F cox proportional hazard model for survival analysis has no limitations

F, has limitations need to include adjusted survival curve

T/F linear functions fit the data for dichotomous outcomes

F, hence why we need to use logistic regression

T/F pearson's correlation coefficient (r) works well w/ outliers

F, instead use spearman's correlation coefficient

T/F logistic regression is a linear function

F, it's non-linear

T/F pearson's correlation can be used to detect a quadratic association between two continuous variables

F

what happens to R^2 when more variables keep getting added? why is this bad? how is this fixed?

R^2 keeps increasing (NEVER will decrease); a model with more variables will always seem to fit better, even when the better fit is not actually real; fixed w/ adjusted R^2
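
A rough sketch of how adjusted R^2 penalizes extra predictors. The cards do not state the formula, so this assumes the standard definition, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the sample size and p is the number of predictors. The numbers are made up.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 (standard formula, assumed here).

    r2: ordinary R^2, n: number of observations, p: number of predictors.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same raw R^2 of 0.50 with n = 50, but a different number of predictors:
print(adjusted_r_squared(0.50, 50, 2))    # slightly below 0.50
print(adjusted_r_squared(0.50, 50, 10))   # noticeably lower
```

With the raw R^2 held fixed, the model with more predictors gets the lower adjusted value.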

how is survival function represented

S(t) survival

T/F a % range is an example of interval/ratio scale

T

T/F controlling for confounders in a logistic regression will require a multiple logistic regression

T

T/F hazard ratio has built in selection bias

T

T/F in a multiple regression, each additional variable is tested while the others are held constant

T

T/F log(odds) = B

T (more precisely, B is the change in log(odds) for a 1 unit increase in x, so B = log(OR))

T/F probability is a finite number, while odds is an infinite number

T

T/F the value of r is an absolute value

T

T/F you can only perform a linear regression for continuous data

T

T/F B can either equal increase in probability of outcome or disease

T

T/F the logit creates properties of a linear regression w/o actually being for a linear regression model

T beneficial

T/F probability of an outcome for a logistic regression should be performed

T --> makes the outcomes a percent to be able to run the logistic regression dependent variable of logistic regression is a probability

T/F categorical data is a subtype of discrete data. why?

T, categorical data is observations w/ limited values like counts (4 legs)

T/F for each additional variable in the multiple regression, there is another B

T, each variable tested gets its own B

T/F shape of a scatter plot is important in checking assumptions for spearman's correlation

T, if association is NOT monotonic (ex: "U" shape), then r will be near 0 and a true association could be missed

T/F the logit allows for prediction of odds of disease or odds ratio for exposure and disease

T, use slope beneficial

at what point of the study can survival estimates be unreliable & why

at the end of the study when a lg # of subjects have been censored

What are the 3 general methods to making experimental conditions equal between study groups?

randomization placebo (could be double dummy) blinding

what does a tick in a kaplan-meier curve indicate

a censored subject

what type of measure (outcome) is studied to determine a difference in each of the following? a. difference in means b. difference in event rates c. difference in survival function

a. continuous b. dichotomous c. continuous w/ dichotomous (risk difference)

match each predictor w/ type of test: a. r b. R^2 c. adjusted R^2

a. r = pearson or spearman simple correlation b. R^2 = simple determination (explainable variance) c. adjusted R^2 = R^2 accounting for number of predictors (x) in the model

assign a level of correlation to each of the value ranges for r: a. 0-0.25 b. 0.25-0.75 c. 0.75-0.99 d. 1.0

a. weak b. moderate c. high d. perfect

what type of OR controls for confounding in a logistic regression?

adjusted OR (aOR)

Which do I look at for a multiple linear regression, R^2 or adjusted R^2?

adjusted R^2

What is the type of data for each of these variables? age smoking status weight medication adherence

age --> continuous smoking status --> binary or ordinal weight --> continuous medication adherence --> binary

how to interpret R^2

amount of variance in y that can be explained by x

if y is a measure of risk for heart disease & x is # cigarettes smoked/day, interpret y-intercept (alpha) a. predicted value of y when x = 0 b. risk of heart disease for non-smokers when x = 0 c. a & b d. none of the above

c. x = 0 represents non-smokers

Can I use pearson's or spearman's if the scatterplot shows a "U" or "A" shape?

can't use either

the p value for a linear regression checks to see if alpha or beta is statistically significantly different from zero?

checking if B is statistically significantly different

describe the B in a logistic regression model

compares a change in log(odds) for every 1 unit increase in x

in interval data, the distance between two values is _______ & _________

constant & meaningful ex: Celsius scale

homoscedasticity means?

constant variance

the two subtypes of numeric measurements are ______ & _______

continuous & discrete

before calculating r for spearman's (rank order) correlation, what do you have to do w/ the data?

convert (continuous) data to rankings
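
As a sketch of the ranking step, Spearman's coefficient can be computed as Pearson's r on the ranks. This plain-Python version assumes tied values get the average of their ranks (a common convention, not stated in the cards). The helper names are hypothetical and the data are made up.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Monotonic but not linear, with one extreme value (made-up data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(pearson_r(x, y))     # below 1: pulled down by the outlier and curvature
print(spearman_rho(x, y))  # perfect monotonic relationship on the ranks
```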

r = rho = pearson's correlation coefficient = population _______

correlation

hazard ratios are for what model

cox proportional hazards similar to odds ratio of multivariable logistic regression

does a curve step down when a patient is censored or when a patient dies (or reaches the outcome)?

curve only steps down from patient death (or outcome) NOT censor
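
A minimal sketch of the Kaplan-Meier calculation, using the usual product-limit form: at each event time, multiply S(t) by (1 - events / number at risk). It illustrates that censored subjects shrink the number at risk without producing a step. The follow-up data are made up, and subjects censored at a given time are assumed to still be at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each time where an event occurs.

    times: follow-up time for each subject
    events: 1 = event (death/outcome), 0 = censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # The curve steps down only when an event occurs.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        n_at_risk -= n_leaving
    return curve

# Made-up data: events at t = 1, 3, 5; censoring at t = 2 and 4.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3), "cumulative incidence:", round(1 - s, 3))
```

In this example the event at t = 3 removes 1 of the 3 subjects still at risk, a larger fraction than the 1 of 5 at t = 1, because censoring has reduced the number contributing to the curve.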

prediction of an outcome can be achieved w/ correlation or regression?

regression

are weight in 1971 & age in 1971 related? This question can be answered w/: a. t test b. pearson's c. spearman's d. linear regression

d. linear regression independent = age dependent = weight

give examples of events that could be studied in a survival analysis

death injury onset of illness time to recovery change in LDL change in SBP

the outcome is the independent or dependent variable?

dependent

binary is synonymous to _____

dichotomous

in a logistic regression, the independent (x) variable can be (3)

dichotomous categorical continuous

in a logistic regression, the dependent (y, outcome) variable must be

dichotomous (binary)

how do you calculate OR from a B value?

e^B
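
A short sketch of the conversion, with a made-up coefficient. The same exponentiation applies to a Cox model coefficient to get a hazard ratio.

```python
import math

def odds_ratio_from_beta(beta):
    """OR = e^B, where B is the logistic regression coefficient (change in log odds)."""
    return math.exp(beta)

beta = 0.7  # hypothetical coefficient, for illustration only
print(odds_ratio_from_beta(beta))   # a little over 2
print(odds_ratio_from_beta(0.0))    # B = 0 corresponds to OR = 1 (no association)
```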

how is the adjusted hazard ratio calculated in a cox proportional hazard model?

e^coefficient

explain the proportional hazards assumption

effect of a risk factor is constant over time stratify if not proportional or interaction with time

what are the three possible goals of a survival analysis

estimate time to event for a group of individuals compare time to event between 2+ groups assess relationship of variables or covariates to time to event

binary data is mutually _________

exclusive

interpret OR >1 for determining if exposure is a risk factor for the disease

exposure increases disease risk (exposure is a risk factor)

interpret OR = 1 for determining if exposure is a risk factor for the disease

exposure is not a risk factor

interpret OR <1 for determining if exposure is a risk factor for the disease

exposure reduces disease risk (exposure is protective)

diagnosis & treatment can be guided by correlation or regression?

regression (estimating dose-response relationship)

what test checks the overall fit of a logistic regression?

hosmer lemeshow test

what does the slope of the survival curve tell

how fast survival changes

how to interpret B?

how much y changes when x changes by 1

what is an indicator in the shape of two survival curves that we would fail to reject Ho (aside from the p value)?

if the two survival curves overlap each other

how to interpret B in a multiple logistic regression

increase in log odds for a one unit increase in exposure of interest with all other exposures held constant

as censorship increases, the fraction of people experiencing death or an event ______ (increases or decreases)

increases / becomes larger

the predictor/exposure is the independent or dependent variable?

independent

how is an odds ratio (logistic regression) interpreted?

individuals w/ the exposure (x) have XX times OR XX% higher/lower the odds or chances of getting the disease compared to individuals without the exposure

what is the hazard rate

instantaneous incidence rate

what type of data is used for assessing a relationship between two continuous variables?

interval/ratio

how does regression build on correlation?

regression tells how to draw a straight line of best fit in a scatterplot in order to describe a relationship between X & Y

"stair-step" curve is kaplan meier or cox proportional hazard?

kaplan meier accounts for censoring

what are the 2 methods for survival analysis

kaplan meier cox proportional hazards

If hazard groups in a kaplan-meier cross, does the test gain or lack power?

lack power

What benefit does converting a probability to odds result in?

larger range of possible values for dependent variable (expands the range of outcomes)

what are the 4 assumptions for a linear regression?

linearity (of regression line) homoscedasticity (constant std.dev by looking @ resid plot) normal distribution independence

would linear or logistic regression be used to evaluate mortality?

logistic

what regression model estimates propensity scores

logistic regression

what regression model is used for a dichotomous outcome

logistic regression

describe 3 ways subjects become censored in survival analysis

loss to follow up; drop out; study ends before person achieves event (death or outcome); censored subjects are counted as alive or outcome-free up to the time they are censored

the best line to fit the data __________ distance between observed and predicted data

minimizes ie. small residuals (actual y - predicted y)
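
A minimal sketch of a least-squares fit of y = a + Bx for one predictor, written in plain Python with made-up data. It uses the standard closed-form slope and intercept, and computes R^2 as 1 - (residual sum of squares / total sum of squares).

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residuals(x, y, a, b):
    """Actual y minus predicted y for each observation."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def r_squared(x, y, a, b):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum(r ** 2 for r in residuals(x, y, a, b))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up data lying close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(x, y)
print(a, b)                    # intercept and slope (y changes by about b per 1 unit of x)
print(residuals(x, y, a, b))   # small residuals
print(r_squared(x, y, a, b))   # close to 1, so multiply by 100 for the percent
```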

describe monotonic vs non-monotonic relationship

monotonic: data moves in one direction (EITHER up or down) non-monotonic: data moves in multiple directions ( up AND down)

sign of r denotes _______; value of r denotes ________

nature; strength

can a cox proportional hazard be used if hazards cross each other?

no

can a multiple linear regression infer cause?

no

can a t test be used to assess association between two continuous variables?

no

can a t test be performed for nominal data?

no can do a z-test or chi square

can the distance between responses for ordinal data be quantitatively measured?

no ex: scale of strongly disagree to strongly agree

interpret an r of 0 (or very close to)

no correlation between the two variables

what would a scatter plot look like for an r of 0?

no pattern in the scatter points

what does an r value of 1 mean?

no scatter around the trend line

does hazard have an upper bound and is it a probability?

no to both h(t)>/=0

does an odds ratio demonstrate a difference in rate?

no, demonstrates a difference in odds of an outcome

can race and sex variables directly be used in a pearsons correlation test

no, they are nominal measurements

race and ethnicity are examples of what kind of data variables?

nominal (categorical)

what is the difference between nominal and ordinal data?

nominal data is unordered descriptions, while ordinal observations can be ordered (least happy to most happy)

what does the wald test assess

to see if B (the parameter) is significant

describe a disease odds ratio

odds of being a case of disease among exposed individuals divided by odds of being a case of disease among non-exposed individuals (a/b)/(c/d)

describe an exposure odds ratio

odds of being exposed among the cases divided by odds of being exposed among the controls (a/c)/(b/d)
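
A quick sketch showing that the disease odds ratio and the exposure odds ratio reduce to the same cross-product, ad/bc. It uses the same a, b, c, d cell labels as the relative risk card, with made-up counts.

```python
def disease_odds_ratio(a, b, c, d):
    """Odds of disease among exposed / odds of disease among unexposed."""
    return (a / b) / (c / d)

def exposure_odds_ratio(a, b, c, d):
    """Odds of exposure among cases / odds of exposure among controls."""
    return (a / c) / (b / d)

# a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls
a, b, c, d = 30, 70, 10, 90  # made-up counts
print(disease_odds_ratio(a, b, c, d))
print(exposure_odds_ratio(a, b, c, d))
print((a * d) / (b * c))  # cross-product ratio: same value as the two above
```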

interpret OR = 1 for comparison between cases and controls

odds of exposure are equal among cases and controls

interpret OR >1 for comparison between cases and controls

odds of exposure for cases are greater than odds of exposure for controls

interpret OR <1 for comparison between cases and controls

odds of exposure for cases are less than odds of exposure for controls

explain an R^2 value of 0.008

only 0.8% (less than 1%) of variability in y can be explained by x; poor model fit

What type of data is this? no HS degree HS degree some college, no degree associates degree bachelors degree higher than bachelors degree

ordinal

the two subtypes of categorical measurements are _______ & _________

ordinal & nominal

what are the 3 assumptions for a spearman's correlation?

ordinal, interval, or ratio scale variables two variables measured on all study participants monotonic relationship between the two variables

what is the coefficient of a cox proportional hazard model?

parameter estimate

how are influential observations checked for a logistic regression?

pearson's residuals

What two tests can be performed to assess association between numerical variables?

pearson's spearman's

how is unreliable survival estimates d/t censorship fixed?

peto test (as opposed to log-rank) --> weights survival time earlier in the curve more heavily ex: high mortality --> use peto

what does a scatter plot of the variables show for assumption checks?

possible outliers correlation ( + or -) linear relationship (or lack of)

how to calculate odds from a probability?

probability / (1 - probability)
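
A small sketch of the conversions between probability, odds, and log(odds), i.e. the logit. The function names are illustrative. Note the parentheses: odds = p / (1 - p).

```python
import math

def odds_from_probability(p):
    """odds = p / (1 - p); probability is 0 to 1, odds run from 0 upward."""
    return p / (1 - p)

def logit(p):
    """log(odds); ranges from -infinity to +infinity."""
    return math.log(odds_from_probability(p))

def probability_from_odds(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1 + odds)

print(odds_from_probability(0.75))   # 3.0, i.e. odds of 3 to 1
print(logit(0.5))                    # 0.0, even odds
print(probability_from_odds(3.0))    # 0.75
```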

what is the hazard function

probability that if you survive to a specified time, you will succumb to the event in the next instant

propensity scores are matched in patients to balance baseline characteristics between study groups, similar result to what?

randomization

what type of bias does loss to follow up introduce?

selection bias

what is the benefit to using a multiple regression?

simultaneously considers influence of multiple explanatory variables on a response variable y AND adjusts out influence of confounders/other variables

spearman's correlation is used when data is strongly _______ or not normally ________ (ordinal)

skewed distributed

What 2 parts of a survival curve need to be evaluated?

slope; shape of curve

pain intensity and narcotic dose correlation would be performed w/ pearson's or spearman's?

spearman's

what does the p value of <0.05 mean based on an alpha of 0.05

statistically significant difference between the two survival curves

how do we estimate curves for specific groups in the kaplan meier method

stratification

what type of longitudinal data does survival analysis refer to?

survival or time to event

what is compared in the kaplan meier survival analysis

survival probabilities of 2 groups ex: treated vs untreated patients

how would a high p value for a cox proportional hazard be interpreted?

test parameter does not significantly impact survival

what is r

the correlation coefficient

in the kaplan meier curve, there is only a downward trend if what happens?

the event occurs (death or other target outcome)

what does the shape of a survival curve tell

the pattern of survival varying over time

define time to event

time from entry into a study until subject has a specific outcome

in an SPSS table, does the B come from the unstandardized B column or the standardized coefficients beta column?

unstandardized B of the constant row

how do we adjust for covariates in the kaplan meier method

use inverse probability weights

interpret this result: wt71 & wt82 have pearson's r of 0.876 & p<0.00001

weight in 1971 and 1982 are highly correlated, and this correlation is statistically significant

what should be the only difference in groups achieved through randomization

what treatment (exposure) each group gets

what does controlling for variables help determine?

which independent variables are truly related to the dependent variable

if B is 0, then?

y does not change as x changes (no linear association between x & y)

Give some examples of dichotomous outcomes

y/n pancreatic cancer y/n worse sx y/n appropriate tx y/n survival after surgery

can adjustments for covariates & estimation curves be made in the kaplan meier method?

yes

can outcomes between groups be directly compared if there is randomization?

yes

can hazard ratio change over time?

yes, but not necessarily reported over time

ratio data has an absolute _____ point

zero ex: Kelvin scale, weight.

