Survival Analysis
Direct Survival Analysis
- the outcome variable is discrete (the number of intervals until the event occurs)
Why use method 2 over method 1
-allows statistical inferences to be made about competing risk models that cannot be made with method 1
Cox prop model and other model assumptions
1. Non-informative censoring: censoring is random and has nothing to do with the likelihood of the event occurring
2. Survival times are independent (the first two are assumptions of all the models)
3. Hazards are proportional: the hazard ratio is constant over time. Check with a complementary log-log plot or Schoenfeld's test; if violated, stratify on the variable or use time-dependent coefficients/parameters
4. log(hazard) is a linear function of the X's. Check with residual plots; if violated, use transformations, binning, or categories (3 and 4 are also assumed for the exponential and Weibull models)
5. Values of X do not change over time; if violated, use a time-dependent covariates model
6. The baseline hazard is unspecified
why does censoring occur?
1. a person doesn't experience the event before the study ends 2. a person is lost to follow-up 3. a person withdraws from the study, e.g., because of death (when death is not the event of interest)
Exponential Distribution
A probability distribution associated with the time between arrivals
H0 testing
H0: β = 0. Tested with 1. the Wald test or 2. the likelihood ratio test
independent censoring
Within any subgroup of interest, subjects who are censored at time t should be representative of all subjects in that subgroup who remained at risk at time t with respect to their survival experience. -if not true this can lead to biased results
t bar
average of survival times
risk set
the collection of individuals who have survived at least to a given time t, i.e., those still at risk of the event at time t
counting process
-does not deal with the order of events -based on the proportional hazards model -each event is treated as independent -assumes a common baseline hazard for all events -use if you just want an overall conclusion about the measure of effect (it doesn't matter which event)
how to calculate the hazard ratio of an exposure variable
e^β1, if there are no other exposure variables
what is the survival exponential model based on?
the hazard model from the exponential distribution; it yields the same parameter estimates as the Poisson model
stratification
Pros: quick and easy; can look at
greenwood's formula
used to calculate the variance for the CI of the Kaplan-Meier estimate
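A minimal numpy sketch of Greenwood's formula, using made-up event counts d and risk-set sizes n at the ordered event times (not from any real study):

```python
import numpy as np

# Hypothetical grouped data at ordered event times t_1 < t_2 < ...:
# d = number of events at each time, n = number at risk just before it.
d = np.array([2, 1, 2, 1])
n = np.array([21, 19, 17, 14])

s = np.cumprod(1 - d / n)  # Kaplan-Meier estimate S(t_j)
# Greenwood: Var[S(t)] = S(t)^2 * sum_{t_j <= t} d_j / (n_j * (n_j - d_j))
var = s**2 * np.cumsum(d / (n * (n - d)))
print(np.sqrt(var))        # standard errors for the CI at each event time
```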
Parameter
value of a measure for a population
time
years, months, weeks, days
-2log likelihood chi square goodness of fit testing
-if the chi-square value is significant, go with the second model (in most cases the model with added variables) -if the value is non-significant, stay with the first model (the original model with no added variables)
Log-logistic
-p ≤ 1: hazard decreases over time; p > 1: hazard increases then decreases over time (unimodal)
-the AFT model is a proportional odds (PO) model
-a proportional odds survival model is a model in which the survival odds ratio remains constant over time; the survival odds is the odds of surviving beyond time t, and the failure odds is the odds of getting the event by time t. FOR = 1/SOR
-the log odds of failure is a linear function of the log of time
-the log-logistic assumption can be evaluated graphically by plotting log[(1 − S(t))/S(t)] against ln(t), where the S(t) are the Kaplan-Meier survival estimates
-graphical notes: a. straight lines support the log-logistic assumption, b. parallel curves support the PO assumption, c. if the log-logistic and PO assumptions hold, then the AFT assumption also holds
-hazard can change direction (nonmonotonic)
-Ex. cancer survival
-only AFT
Frailty (unshared Frailty) models
-Frailty is a random component designed to account for variability due to unobserved individual-level factors (e.g., heredity, family support, environment) that is otherwise unaccounted for by the other predictors in the model
-an unobserved multiplicative effect α on the hazard function, assumed to follow some distribution g(α) with α > 0 and the mean of α equal to 1
-hazard: α × h(t); survival: S(t)^α
-with frailty included, two people can have completely different survival functions
-people with α > 1 have an increased hazard and decreased survival probability compared to those with average frailty (α = 1); the opposite holds for α < 1
-individual level = conditional survival function; population level = unconditional survival function (the corresponding unconditional hazard may increase initially but then decrease toward 0)
-two distributions, both with a fixed mean of 1 and parameterized in terms of the variance, are commonly used: gamma and inverse Gaussian
-the exponentiated coefficient of a predictor is the ratio of conditional hazards for the same individual
-the more frail individuals (α > 1) have greater hazard, and vice versa for α < 1
-useful in PH models where the two lines of the K-M log-log survival plot converge, because the convergence can be due to unobserved heterogeneity in the population
-used for clustered data or recurrent event data
Frailty in the Weibull model
-If the AFT assumption holds at the individual level, then it also holds at the population level, unlike the PH assumption
Cumulative incidence curve
-KM survival curves are not the most useful for competing risks because they are based on the independence assumption, which cannot be verified -this method instead uses marginal probabilities -if there is only one risk, 1 − KM = CIC -with competing risks it is derived from the cause-specific survival function -does not require the independence assumption -assumes events are mutually exclusive and nonrecurrent -when using a Cox PH model to calculate the CIC, independence of competing risks is needed (Gray model)
General form of the AFT model
-when fitting the linear model, censored observations are included -additive on the log scale but a multiplicative model with respect to T -similar to the Cox model
cause specific hazard function
-a hazard function for each failure type
Strat 3 use a sensitivity analysis
-allows estimation of worst-case violations of the independence assumption by determining extreme ranges for the parameters of one model -if the worst-case results do not meaningfully differ from the obtained results, there will be little bias if the analysis is done under the independence assumption -if the results do differ, the results can be biased when the analysis relies on the independence assumption
Method 2 Lunn-McNeil approach
-allows one PH model to be fit for all of the event types -allows flexibility to perform statistical inferences about various features of the competing risk models that cannot be conveniently assessed using method 1 -uses an interaction version of a stratified Cox model
binary regression
-an approach for handling interval censoring, since the outcome can be coded as 0 (no event) or 1 (event) during an interval; useful when there are many events in each interval and the researcher does not want to specify a distribution for continuous survival time -keep in mind that each parameter's interpretation is specific to that parameter -dummy variables take the place of an intercept because the baseline could be different at the start of each interval -proportional odds assumption: odds are constant over time, or at least at the end of each interval; this can be tested by including an interaction term made up of the dummy variables and the predictor variable -uses the logit link function (expresses the log odds of failure as a linear function)
shared frailty model
-subjects in the same cluster have the same frailty -Ex. subjects from the same family may share the same unobserved factors -accounts for within-cluster correlation -the conditional hazard function is α_k × h(t): the cluster-specific frailty times the individual's hazard within the cluster -this is the same form as unshared frailty, but the frailty is applied to the data differently, affecting interpretation and methods of estimation -accounts for dependence among subjects who share a frailty -the likelihood is formed for each cluster with the frailty integrated out; the product over all clusters is the full likelihood -frailty component does not work in
Strat 1 decide assumption satisfied on clinical/biological/other grounds
-deciding, without data analysis, that the assumption holds based on clinical/biological/other grounds -Ex. for cancer and CVD: people who were censored for CVD were no more or less likely to die from cancer -the assumption cannot be verified with the observed data
Complementary log-log link
-expresses the log(-log) of survival as a linear function of the regression parameters -is a proportional hazards model (as long as the additive effects are still present when comparing the two options for the outcome variable)
Exponential
-hazard is constant, i.e., not a function of time
-a stronger assumption than the proportional hazards assumption that the HR is constant: just because the hazard ratio is constant does not mean each hazard is constant
-regression coefficients are estimated by maximum likelihood estimation (MLE) and are asymptotically normally distributed
-can accommodate PH and AFT assumptions; the interpretation differs depending on which you use
-AFT applies to a comparison of survival times; PH applies to a comparison of hazards
-AFT and PH are the same for the exponential model; the survival function, hazard function, and median survival times do not differ between these models
-Poisson similarities: both assume a constant rate; different data structure (Poisson: aggregate counts; exponential: individual level); different outcomes (Poisson: number of cases; exponential: time to event); yield equivalent parameter estimates with the same data and the same covariates in the model
-Ex. death among the chronically ill (or another homogeneous group)
-Both AFT and PH
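A small numpy sketch of the constant-rate idea: the exponential MLE for the hazard is total events divided by total person-time, the same rate a Poisson model would estimate from aggregate counts. The times and event indicators are invented:

```python
import numpy as np

# Hypothetical individual-level data: follow-up times and event indicators.
time = np.array([5., 8., 12., 3., 9., 11.])
event = np.array([1, 0, 1, 1, 0, 1])  # 1 = failure, 0 = censored

lam = event.sum() / time.sum()        # constant hazard estimate
print("hazard rate:", lam)
print("S(10):", np.exp(-lam * 10))    # S(t) = exp(-lambda * t)
print("median survival:", np.log(2) / lam)  # t where S(t) = 0.5
```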
Strat 2 include common risk factors for competing risks in survival model
-include common risk factors for both failure types in both models for each failure type -assumption cannot be verified with observed data
cross-over adjusted
-methods to deal with people leaving the study, or leaving a specific treatment group for the other (cross-over) -intention-to-treat principle: subjects are analyzed in the treatment group to which they were originally assigned at the time of randomization
cause specific cox PH model
-model for each failure type -the effects of the predictors on the event can vary per model -PH assumption still needs to be met for any of the results to be meaningful -other failure types are treated as censored
Weibull
-the most widely used parametric model
-p is the shape parameter: p > 1, hazard increases over time; p < 1, hazard decreases over time; p = 1, hazard is constant
-if the AFT assumption holds, then the PH assumption also holds (unique to Weibull), so hazard ratios and the acceleration factor can both be compared; this holds if p does not vary over different levels of covariates
-the log(-log) of S(t) is linear in the log of time, which allows a graphical evaluation by plotting the log(-log) of the Kaplan-Meier survival estimates against the log of time; straight lines support the Weibull assumption, parallel curves support the PH assumption
-all possible outcomes of the log-log plot: 1. parallel straight lines: Weibull, PH, and AFT assumptions hold 2. parallel straight lines with slope of 1: exponential; PH and AFT hold 3. parallel but not straight lines: PH but not Weibull, not AFT (can use the Cox model) 4. not parallel and not straight: not Weibull, PH violated 5. not parallel but straight lines: Weibull holds with different p per group, but PH and AFT violated
-the location parameter can move the curve around the graph depending on its value; it is usually 0
-the scale parameter, lambda, is the hazard scale and is usually 1
-Ex. initial survival after organ transplantation is high, but then there is a steep increase in deaths over time
-Both AFT and PH
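A sketch of the graphical check for one group; km_t and km_s stand in for Kaplan-Meier times and survival estimates (the values here are hypothetical, and in practice you would overlay one line per covariate group):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative Kaplan-Meier estimates for one group.
km_t = np.array([2., 4., 6., 9., 13.])
km_s = np.array([0.95, 0.85, 0.70, 0.52, 0.35])

# Weibull check: plot ln(-ln S(t)) against ln(t).
# Straight line -> Weibull; parallel lines across groups -> PH.
plt.plot(np.log(km_t), np.log(-np.log(km_s)), marker="o")
plt.xlabel("ln(t)")
plt.ylabel("ln(-ln S(t))")
plt.title("Weibull / PH graphical check")
plt.show()
```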
Gompertz model
-not AFT -like the Cox PH model, but the baseline hazard is specified as the hazard of the Gompertz distribution, which contains a shape parameter γ -if γ > 0 the hazard increases exponentially over time; if γ < 0 the hazard decreases exponentially over time; if γ = 0 the hazard is constant and the model reduces to the exponential model -has 2 parameters -Ex. the distribution of adult lifespans
Marginal model
-only uses the stop times, not the start times, from the observational record -must stratify on the sequence number; otherwise someone could show up twice -you cannot do the fixed-effects partial likelihood -treats each of an individual's events as if they came from entirely separate people -each event is a separate process -adjusts for within-subject correlation -useful when the events are assumed to be of different types, or when events in different orders are different types
Akaike information criterion (AIC)
-provides an approach for comparing the fit of models with different underlying distributions, making use of the -2 log likelihood statistic -AIC = -2 log(L) + 2p, where p is the number of parameters; smaller is better
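A toy computation of the formula; the log-likelihood values and parameter counts are invented for illustration:

```python
def aic(log_likelihood, n_params):
    # AIC = -2 log(L) + 2p; the model with the smaller AIC fits better.
    return -2 * log_likelihood + 2 * n_params

# Hypothetical fitted log-likelihoods from two parametric models.
print(aic(-310.4, n_params=2))  # e.g., a Weibull fit
print(aic(-308.9, n_params=3))  # e.g., a generalized gamma fit
```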
followup period
-the period in a study during which participants are reassessed for the outcome of interest -a longer follow-up period has a larger impact on sample size than a longer accrual period
generalized gamma model
-survival and hazard are expressed as integrals -has 3 parameters, allowing flexibility in shape -Weibull and lognormal are special cases of the generalized gamma distribution
lognormal model
-survival and hazard are expressed as integrals; shape is similar to the log-logistic -accommodates the AFT but not the PO model -Ex. infection rates through an epidemic -only AFT
Parametric Likelihood
-found by taking the product of each individual's contribution to the likelihood -assumption: no competing risks (no competing event will stop someone from getting the event of interest) -assumption: the subjects' contributions are independent -assumption: follow-up time is continuous with no gaps (if there are gaps, the likelihood can be modified) -can handle left-, right-, and interval-censored data, unlike the Cox likelihood, which only handles right censoring
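A minimal sketch of these contributions for an exponential model with right censoring; the data values and the rate are hypothetical:

```python
import numpy as np

def exp_loglik(lam, time, event):
    # Each failure contributes the density f(t) = lam * exp(-lam * t);
    # each censored subject contributes the survival S(t) = exp(-lam * t).
    # Independent subjects -> the log-likelihood is a sum of log contributions.
    return np.sum(event * np.log(lam) - lam * time)

time = np.array([5., 8., 12., 3., 9.])
event = np.array([1, 0, 1, 1, 0])
print(exp_loglik(0.1, time, event))
```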
Gap time model
-treats each interval as a distinct observation by pooling all the intervals together and estimating a single model -problem: dependence among multiple observations because they come from the same unit; the dependence needs to be taken into account before pooling -a variation of the stratified CP in which the starting time at risk is reset to 0 at the start of each event -conditional model: assumes you can't have event 2 without event 1 -useful for modeling time between events -the baseline hazard changes by event -uses gap time instead of counting-process time, so the time until the 1st event does not affect the risk for later events -use if the time from the previous event to the next event is what matters
Frailty effect
-the unconditional hazard eventually decreases over time because the at-risk group has an increasing proportion of less frail individuals -this creates an issue in distinguishing between the individual-level and population-level hazard ratios -the population-level hazard violates the PH assumption when gamma- or inverse-Gaussian-distributed frailty is added to the PH model
For analyzing competing risks: Method 1 separate models for diff event types
-uses a Cox PH model to separately estimate hazards and corresponding hazard ratios for each failure type; each model treats the other competing risks as censored -if only one failure type is of interest, the hazard is estimated for that type only, but the others are still treated as censored
Method 2 Alternate Lunn-McNeil approach
-uses the same general model, but the dummy variables and the predictors serve as basic variables that are transformed into product terms -the regression coefficients and computational formulas are different -but the HRs, test statistics, and interval estimates are identical -inferences about the HR only require the standard error
independent censoring with competing risks
-you have to deal with the fact that other failure types are being included in the censoring -so independent censoring is assumed to hold among the competing risks -but how do we determine this, and what do we do if the risks are not independent? -this assumption cannot be verified in a data set -there are different strategies to deal with this issue
Censor Method for CIC
1 − KM works for a single event type; treats the other events as censored
Steps to construct CIC
1. estimate the hazard at the ordered failure times for the event type of interest: the number of type-c events at time t divided by the number at risk at time t 2. estimate the overall probability of surviving the previous time from the overall survival curve S(t) 3. compute the estimated incidence of failing from event type c at time t as S(previous t) × h_c(t), and sum these over time
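The three steps as a numpy sketch, using invented counts for the cause of interest and its competitors:

```python
import numpy as np

# Hypothetical data at the ordered failure times.
n_risk = np.array([50, 45, 40, 33])  # number at risk
d_c = np.array([2, 3, 1, 4])         # failures from the cause of interest
d_other = np.array([3, 2, 6, 1])     # failures from competing causes

h_c = d_c / n_risk                                 # step 1: cause-specific hazard
s_all = np.cumprod(1 - (d_c + d_other) / n_risk)   # step 2: overall survival
s_prev = np.concatenate(([1.0], s_all[:-1]))       # survival through previous time
cic = np.cumsum(s_prev * h_c)                      # step 3: cumulative incidence
print(cic)
```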
two approaches for repeated events
1. separate analysis for each successive event 2. Use a gap time model
when do we use extended cox model
1. when there are important variables that can change over time 2. cross-over studies 3. testing the PH assumption
Conditional Probability Curves CPC
A summary measure in addition to the CIC. -CPC_c is the probability of experiencing event c by time t, given that an individual has not experienced any of the other competing risks by time t -CPC_c = CIC_c / (1 − CIC_c′), where CIC_c′ is the cumulative incidence of failure from risks other than risk c
Accelerated failure time AFT model
-AFT models describe the stretching out or contraction of survival time as a function of predictor variables
-underlying assumptions: AFT, the effect of covariates is multiplicative with respect to survival time; PH, the effect of covariates is multiplicative with respect to the hazard
-the acceleration factor (γ) is the measure of association in an AFT model; it can be parameterized as exp(α), where α is a parameter to be estimated from the data
-γ is a ratio of survival times corresponding to any fixed value of S(t); this ratio is assumed constant for all fixed values, i.e., for S(t) = q for any probability q
-Ex. a dog ages 7 times faster than a human: dog, (t); human, (γt) = (7t); S2(t) = S1(γt)
-an acceleration factor greater than one for the effect of an exposure implies that being exposed (i.e., TRT = 1) is beneficial to survival, whereas a hazard ratio greater than one implies being exposed is harmful to survival (and vice versa)
-assumption: survival times are proportional
-predicts time, not hazard: ln(T) is a linear model
-can think of it as a linear model, but with time
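A quick numeric illustration of S2(t) = S1(γt) using exponential survival curves and the dog-years example; the hazard rate is arbitrary:

```python
import numpy as np

gamma = 7.0               # acceleration factor: dogs "age" 7x faster
lam = 0.01                # hypothetical human hazard rate
t = np.array([1., 5., 10.])

s_human = np.exp(-lam * t)        # S1(t)
s_dog = np.exp(-lam * gamma * t)  # S2(t) = S1(gamma * t)
print(s_human, s_dog)             # dog survival at t matches human at 7t
```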
Frailty in log-logistic/lognormal model
the log-logistic/lognormal model accommodates a unimodal hazard even without a frailty component
event
the occurrence of the outcome of interest. Ex. death, disease incidence
survival analysis
An event history method; it and related methods are used when the dependent variable of interest is a time interval (e.g., time from onset of disease to death)
interval censoring
Ex. looking for an asymptomatic disease and finding it between two year-long follow-up visits -left censoring is a special case in which 0 is the lower boundary
Parametric survival models
-the exact distribution is unknown if the parameters are unknown; data are used to estimate the parameters -model examples: linear, logistic, and Poisson -survival time (the outcome) is assumed to follow a known distribution -distribution examples: Weibull, exponential (a special case of Weibull), log-logistic, lognormal, and generalized gamma -appeals of this approach: more consistent with a theoretical S(t) than non-distributional approaches; simplicity; completeness, since h(t) and S(t) are fully specified -doesn't have to be a PH model; many are AFT models
Relationship of S(t) and h(t)
If you know one, you can determine the other. Ex. if h(t) is constant at λ, then S(t) = e^(-λt)
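A short numpy sketch of this relationship: a constant hazard first, then a general hazard integrated numerically (both hazards are arbitrary examples):

```python
import numpy as np

# Constant hazard: h(t) = lam  =>  S(t) = exp(-lam * t).
lam = 0.2
t = np.linspace(0, 10, 6)
print(np.exp(-lam * t))

# General case: S(t) = exp(-integral_0^t h(u) du), approximated numerically.
h = lambda u: 0.1 + 0.02 * u                  # an arbitrary increasing hazard
grid = np.linspace(0, 10, 1001)
H = np.cumsum(h(grid)) * (grid[1] - grid[0])  # cumulative hazard (Riemann sum)
print(np.exp(-H)[-1])                         # S(10)
```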
Goodness of fit test
Large-sample z or chi-square test. A significant value = the PH assumption is not met; an insignificant value = PH is met. It may not detect certain departures from the PH assumption because it is too global (not specific enough). 2 tests can be used in this situation: 1. Harrel and Lee 2. Schoenfeld residuals
Kaplan-Meier method
The most common type of survival analysis. Utilizes the log-rank test, the most common statistical test for survival time with 2 independent groups. Survival is built from 1 − (deaths / number of people at risk during the time period). If censoring is present, n′ = people at risk − (people censored)/2; then 1 − deaths/n′, multiplied cumulatively, gives the cumulative survival probability. Nonparametric
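A from-scratch sketch of the product-limit calculation; the times and event indicators below are invented:

```python
import numpy as np

def kaplan_meier(time, event):
    # At each distinct failure time: S <- S * (1 - d/n),
    # where d = failures at that time, n = number still at risk.
    s, out = 1.0, []
    for t in np.unique(time[event == 1]):
        n_at_risk = np.sum(time >= t)
        d = np.sum((time == t) & (event == 1))
        s *= 1 - d / n_at_risk
        out.append((t, s))
    return out

time = np.array([6., 6., 6., 6., 7., 10., 13., 16., 22., 23.])
event = np.array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1])
print(kaplan_meier(time, event))
```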
Identifying Likely model
Plot ln(-ln(survival)) vs ln(t) -parallel straight lines > Weibull; PH and AFT assumptions hold -parallel straight lines with slope of 1 > exponential; PH and AFT hold -parallel but not straight lines > PH holds but not Weibull or AFT (can use the Cox model)
Poisson Distribution
Probability distribution for the number of arrivals during each time period
confounding effect
The distortion of the relationship between an independent variable and a dependent variable due to another variable called a confounder.
independence assumption for competing risks
The failure rate of one risk is unrelated to the failure rate of another risk. has to be true for all models and methods
Heaviside Function
Works as g(t): equals 1 when t is greater than or equal to a certain value t0, and 0 when t is less than t0
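A one-function sketch; the cut point t0 and the times are arbitrary:

```python
import numpy as np

def heaviside(t, t0):
    # g(t) = 1 when t >= t0, else 0; in an extended Cox model this lets
    # the hazard ratio differ before and after the cut point t0.
    return (t >= t0).astype(float)

print(heaviside(np.array([1., 3., 5., 8.]), t0=4.0))  # [0. 0. 1. 1.]
```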
time independent variables
X's that do not involve t or time
time dependent variables
X's that involve time
random-effects models (subject specific)
a model that introduces a disturbance term representing unobserved heterogeneity -this model handles both issues: standard errors/test statistics and unobserved heterogeneity -the disturbance term is constant across each individual's events but differs between individuals -estimated with maximum or partial likelihood
small letter t
any small specific value of interest for T
Tarone ware H0 test
applies weight to the early failures; not as good as Gehan
confounder
associated with both the independent and dependent variables; interferes with the main relationship of interest, causing bias. Solutions: adjust in analysis, stratify, restrict to one group, matching, randomization
exponential curve model
assumption: hazard is constant over time. Parametric
h bar
average hazard rate (total number of failures / the sum of the observed survival times)
survival time
the length of time from a starting point (e.g., diagnosis) to the event (e.g., death)
Schoenfeld residuals
Defined for each predictor in the model and for every subject who has the event. If the PH assumption holds for a particular covariate, then the residuals will not be related to survival time. 3-step process: 1. obtain the Schoenfeld residuals by running a Cox PH model 2. rank the failure times: create a variable that ranks failure times (earliest = 1, then 2, and so on) 3. test the correlation of the residuals with the ranked failure times, H0: ρ = 0. If H0 is rejected, conclude that the PH assumption is violated; the null is never proven, it just cannot be rejected. The p-value is driven by sample size: if small, a gross violation of the null may not be significant; if large, a slight violation of the null may be significant
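A sketch of the three-step check, assuming the lifelines package; the tiny DataFrame and its columns (time, status, trt) are invented for illustration:

```python
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.statistics import proportional_hazard_test

df = pd.DataFrame({"time": [5, 8, 12, 3, 9, 11, 7, 4],
                   "status": [1, 0, 1, 1, 0, 1, 1, 1],
                   "trt": [0, 1, 0, 1, 0, 1, 1, 0]})

cph = CoxPHFitter().fit(df, duration_col="time", event_col="status")  # step 1
# Steps 2-3: correlate the Schoenfeld residuals with ranked failure times;
# a significant p-value suggests the PH assumption is violated.
result = proportional_hazard_test(cph, df, time_transform="rank")
result.print_summary()
```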
Unobserved heterogeneity
differences between study participants that are not observed or accounted for. In a linear model they are absorbed by the error term, while in a Cox model there is no term to account for them. They cause estimated hazard functions to decline with time even when that isn't the case
competing risks
different causes of death 'compete' with each other for a person's life. Ex. a person can either die of lung cancer or a stroke, not both. Used to assess any one risk's failure rate or survival in relation to other predictors, and to compare the hazard rates of two different events. We may also want to know how death rates differ when controlling for different predictors. The survival probability can be hard to interpret depending on the situation, especially if one event never occurs, in which case it ends up being 1
Transforming a variable with log
taking the log of a variable allows the variable to be used even if it violates the PH assumption: log(variable). It works for any variable, but it changes the interpretation of the variable's hazard ratio, which becomes the increase or decrease per log unit of the variable. Ex. if you log-transform blood pressure, a hazard ratio of 1.05 (it would most likely not land on 1.05 again after the log transform, but we'll stick with 1.05 for simplicity) means a 5% increase in hazard for each log-unit increase in blood pressure
how to get the hazard rate from the parameter estimate
exponentiate the parameter estimate to get the HR; take the ln of the HR to recover the parameter estimate
calculating hazard ratio
HR = e^(β × difference in predictor values). Categorical ex.: treatment group 1 − treatment group 0 = 1, so HR = e^β
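A two-line sketch of the calculation with a hypothetical coefficient:

```python
import numpy as np

beta = 0.47                  # hypothetical Cox coefficient for a binary exposure
hr = np.exp(beta * (1 - 0))  # HR = e^(beta * (x* - x)); here 1 - 0 = 1
print(hr)                    # ~1.60: exposed hazard is about 60% higher
print(np.log(hr))            # ln(HR) recovers the coefficient
```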
recurrent events
events that happen more than once -recurrences are related -nearly everyone becomes censored -how do you define time to event
Poisson process
events that occur independently over time
exact vs discrete method
Exact method: calculates the exact probability of all possible orderings of tied events; assumes true time is continuous. Breslow approximation: assumes event times are continuous and the hazard is constant within the time interval. Efron approximation: uses an average weight. Discrete method: assumes time is truly discrete; models proportional odds
type 2 error
failing to reject the null hypothesis when it is false
separate analysis
for example, using birth intervals: you do an analysis for the time between every single birth. Problems: it is inefficient and tedious; multiple interpretations are needed for the numbers calculated in each analysis; later analyses can become biased the more analyses you have to do. Works well with small sample sizes but not large ones
Fleming-Harrington H0 test
good for when hazards cross; can weight either early or late events
hazard function (conditional failure rate)
h(t): the instantaneous potential per unit time for the event to occur, given that the individual has survived up to time t. It is always nonnegative (equal to or greater than zero) and has no upper bound. Model types: exponential (constant), Weibull (increasing or decreasing), lognormal. Other info: it is a measure of instantaneous potential, whereas a survival curve is a cumulative measure over time; it may be used to identify a specific model form (exponential, Weibull, or lognormal) that fits one's data; it is the vehicle by which mathematical modeling of survival data is carried out, i.e., the survival model is usually written in terms of the hazard function
interval censored
if a subject's true survival time is within a certain specified time interval
probability density function f(t)
if this, the hazard h(t), or the survival function S(t) is known, you can figure out the other two. The hazard is reparameterized in terms of predictor variables and regression parameters; the parameter p (sometimes called the shape parameter) is held fixed
fixed effects model
in this case the disturbance term represents a set of fixed constants, and the disturbance term is allowed to be freely correlated with the Xi -estimated with partial likelihood using the method of stratification -can only estimate coefficients that vary across or within successive spells of an individual -does not work well for estimating the effect of previous events -when the disturbance term is not correlated with the covariates, the random-effects model is better -this model excludes those with no events, and those with one uncensored spell and one censored spell if the censored spell is shorter than the uncensored one -works best when: the data do not come from a randomized experiment; interest is centered on covariates that vary across intervals for each individual; most individuals have at least two events; there is a reasonable presumption that the process generating events is invariant over time
parametric
inferential statistical tests involving interval- or ratio-level data to make inferences about the population; uses parameters and distributional assumptions (e.g., exponential and Weibull)
exponential survival model
the intercept is constant and does not change over time; works for survival or hazard models. Parametric
additive failure time model
is like the AFT model but additive instead of multiplicative
accrual period
-the period in which people are enrolled -there must be a balance between this and the follow-up period
effect modifier/interaction
the magnitude of the independent variable's effect differs by a 3rd variable, so the overall estimate is misleading. Solutions: report stratified estimates; include an interaction term in the regression model
hazard ratio
measure of effect for survival analysis; compares the hazards of two groups (e.g., exposed vs. unexposed), interpreted like an odds ratio for a dichotomous comparison
Peto-Peto H0 test
more weight to early events; best used when hazard rates are not constant; good for lognormal distributions
failure
negative individual experience (death) or positive individual experience (returning to work)
observed with predicted survivor curves
the observed curves do not have the confounder in the model; the predicted curves have the confounder in the model. If they are close, the PH assumption can be considered met. 2 strategies: 1. one at a time: uses KM curves to obtain the observed plots; for continuous variables there are two options for the expected plots, a. use a PH model with k − 1 dummy variables or b. use a PH model with the continuous variable being assessed 2. adjusting for other variables: uses a stratified Cox PH model to obtain the observed plots
noninformative censoring
occurs if the distribution of survival times (T) provides no information about the distribution of censorship times (C), and vice versa
estimated -ln(-ln) survivor curves
parallel survival curves, obtained by taking the natural log of a survival curve twice; the value can be negative or positive. It can be written in 2 ways: 1. a linear sum of the BiXi 2. the log of the negative log of the baseline survival function. For continuous variables, make 2 or 3 meaningful categories that are balanced; this does thin out the data, and you still have to identify the specific variable causing nonparallelism. Another strategy is to assess one predictor while adjusting for the other predictors
wilcoxon (gehan) H0 test
places emphasis on the beginning of the survival curve: early failures have more weight than late ones. Better when censoring is low. Good for lognormal distributions and for hazards that diverge. Assumes the censoring distributions are the same
Capital T
random variable for someone's survival time; T ≥ 0
small letter d
random variable indicating either failure or censorship: d = 1 is failure, d = 0 is censorship
type 1 error
rejecting the null when it is true
random censoring
subjects who are censored at time t should be representative of all the study subjects who remained at risk at time t with respect to their survival experience
confidence interval for kaplan meier curve
the survival function for a given time plus or minus 1.96 times the square root of the variance of the survival function: S(t) ± 1.96 √Var[S(t)]
Log-rank test
test used to compare survival curves between 2 or more groups; approximately chi-square in large samples, with g − 1 degrees of freedom. Compute (observed total − expected total)² / expected total for each group, then add the results together. Good for exponential
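The same arithmetic as a numpy sketch, with invented observed and expected totals for two groups:

```python
import numpy as np

observed = np.array([10, 16])       # observed event totals per group
expected = np.array([13.8, 12.2])   # hypothetical expected totals
chi_sq = np.sum((observed - expected) ** 2 / expected)
print(chi_sq)  # compare to a chi-square with (2 groups - 1) = 1 df
```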
Tests vs graphs for finding the PH assumption
tests are more objective; graphs are more subjective but can detect specific violations
cox proportional hazard model
the intercept can fluctuate with time, and so can the hazard. It can also estimate all of the coefficients without having to specify the baseline hazard. Cannot estimate the survival function; good for hazard models. Does not consider probabilities for all subjects, only those who fail (partial likelihood). Can make adjusted survival curves. Semi-parametric
the Weibull survival analysis
the intercept increases/decreases with time; the hazard slowly increases/decreases over time. Works for survival or hazard models. Parametric
effect size
the magnitude of a relationship between two or more variables -Ex. differences or ratios of hazard rates, survival probabilities, median survival times
Marginal Probability
the probability of a single event without consideration of any other event -useful to assess treatment utility in cost-effectiveness analyses
survivor function
the probability that a person survives longer than some specified time. Survivor functions are nonincreasing, that is, they head downward as t increases. At time t = 0, S(t) = S(0) = 1: at the start of the study, since no one has gotten the event yet, the probability of surviving past time 0 is one. At time t = infinity, S(t) = S(∞) = 0: theoretically, if the study period increased without limit, eventually nobody would survive, so the survivor curve must eventually fall to zero
MLE maximum likelihood estimation
trying to find the most likely parameters given the data. Steps: 1. define the likelihood function based on the distribution of interest 2. calculate the log-likelihood 3. maximize the log-likelihood to estimate the parameters. Partial likelihood: takes right censoring into account (uses those who fail). Full likelihood: takes all censoring into account. When to use certain fit statistics: -2 log likelihood for nested models; AIC for comparing models across model types; BIC-
Stratified counting process
used to distinguish the order in which recurrent events occur -has to have a starting and a stopping time -conditional model: you can't have event 2 without event 1 -use if the time of event occurrence (the specific order) is important
CI for median survival time
used to find the confidence interval for the median survival time; an inequality is used to determine where the interval lies on the table
Extended cox model
used when time-dependent variables are present and the PH assumption is not satisfied. The Cox model is extended to include interaction terms with time, i.e., time-dependent variables. The significance of the interaction terms is tested with H0: term = 0; a Wald or likelihood ratio statistic can be used, and if significant, the PH assumption is violated. Can also be used for multiple predictors as well as adjusted ones; for multiple predictors the test has to be the likelihood ratio, since the hazard in the likelihood for each individual varies over time. If the multiple-predictor model is found significant, you then have to go back and test one by one. For adjusted tests, the other predictors have to meet the PH assumption; H0: predictor = 0, and Wald or likelihood ratio works. Weakness: has to use a Heaviside function. g(t) = the function of time for the ith predictor
stratified cox model
used when a variable that does not satisfy the PH assumption needs to be included in the model. A stratum is made for each category of the variable. The likelihood values for each stratum are multiplied together to get the final likelihood. There can also be an interaction model that includes the variables being stratified on. The best model can be assessed by computing the chi-square and seeing whether the interaction model is significant
Population averaged method
using the COVSANDWICH option with PROC PHREG -no need to make assumptions about the nature or structure of the dependence with this method -but there is no correction for biases that arise from unobserved heterogeneity
statistic
value of a measure for a sample from the population
left censoring
when survival time is incomplete on the left side of the follow-up; occurs when a person's true survival time is less than or equal to that person's observed survival time
right censoring
when survival time is incomplete on the right side of the follow-up
censoring
when survival time is not exactly known; it is crucial to make sure that internal validity is maintained, so censoring can't be related to the likelihood of the event
recurrent risk
when more than one event is being considered for the outcome
interaction effect
whether the effect of the original independent variable depends on the level of another independent variable