Title: Marginal Structural Models and Causal Inference in Epidemiology: Modeling Treatment, Censoring and Missing Values
1. Marginal Structural Models and Causal Inference
in Epidemiology:
Modeling Treatment, Censoring and Missing Values
- Kooperberg et al., 1997, JASA 92(437):117.
- Robins et al., 2000, Epidemiology 11(5):550.
- Hernan et al., 2002, Stat Med 21:1689.
2. Review of Set-up of Data
- Measurements are made at regular times 1, 2, ..., K.
- Ak is the AZT dose on the kth day.
- Yk is a dichotomous outcome measured (maybe repeatedly) on the kth day.
- $\bar{A}_k = (A_1, \ldots, A_k)$ is the history of treatment as measured at time k.
- $\bar{L}_k = (L_1, \ldots, L_k)$ is the history of the potential confounders measured at time k.
3. Equivalence (algorithmically) of repeated binary outcome data and survival data
- In survival data, the outcome of interest is T, the time at which an event occurs relative to an index time.
- Example: time of recurrence of cancer, measured from removal of the original tumor.
- Or: time of death, measured from diagnosis of AIDS.
- To implement the MSM estimator with survival data in practice, we must discretize the time scale into (equal) blocks of time (e.g., months).
4.
- If we break up the time scale into blocks, then the single variable T is converted into a set of binary variables Yk that are 0 (not dead yet) or 1 (died), and blank after either the event or censoring (o = censored, x = failure).
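The conversion from one survival time T to a set of binary Yk can be sketched in R as follows. This is a minimal illustration on made-up data, not the course's own code: the data frame `surv`, its columns, and the helper `expand_one` are all hypothetical, and a censored subject is assumed to contribute Yk = 0 for every interval up to and including the one in which censoring occurs.

```r
# Hypothetical toy data: follow-up time in whole intervals (e.g., months)
# and an event flag (1 = failure observed, 0 = censored).
surv <- data.frame(id = 1:3, time = c(3, 2, 4), event = c(1, 0, 1))

# Build one row per subject per interval at risk.
expand_one <- function(id, time, event) {
  k <- seq_len(time)
  # Y is 0 in every interval except the last, where it equals the event flag
  data.frame(id = id, k = k, Y = as.integer(k == time & event == 1))
}

long <- do.call(rbind, Map(expand_one, surv$id, surv$time, surv$event))
long
```

Intervals after the event or after censoring simply have no row, which corresponds to the "blank" entries described above.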
5. Treatment Weights are now the same as for longitudinal data with repeated measures
- Now, k indexes the time interval.
- Estimation works by treating each interval as a unique observation and just using the same old weighting.
- If not fitting a stratified MSM, then ignore the V in the numerator.
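For reference, the stabilized treatment weight in its standard form (as in Robins et al., 2000) can be written as below. This is a sketch under assumptions: V denotes the baseline covariates the MSM is stratified on, and the bar notation for treatment and confounder histories follows the data set-up slide; the exact notation used in the original lecture may differ.

```latex
sw_i(k) \;=\; \prod_{j=1}^{k}
  \frac{P\left(A_j = a_{ij} \mid \bar{A}_{j-1} = \bar{a}_{i,j-1},\; V = v_i\right)}
       {P\left(A_j = a_{ij} \mid \bar{A}_{j-1} = \bar{a}_{i,j-1},\; \bar{L}_j = \bar{l}_{ij}\right)}
```

Dropping V from the numerator gives the weight for an unstratified MSM.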
6. Censoring Weights
- Same as the missingness weights.
- In this case, missingness has a more predictable pattern: once missing (censored), always missing.
- Use this to make a slightly less general censoring weight.
7. Censoring Weights
- Let C(k) = 0 if the subject is censored in that interval and C(k) = 1 if still in the risk set (the opposite of the notation in the Hernan et al., 1999 paper).
- Informally, the denominator is the probability of being observed in interval k (not censored), given the past (not censored yet, covariates and treatment).
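Under the coding above (C(k) = 1 means still observed), a stabilized censoring weight of the usual form could be written as follows. This is a hedged sketch: the conditioning sets shown (treatment history, baseline covariates V in the numerator, and confounder history in the denominator, all up to the previous interval) are assumptions based on the informal description, not a formula recovered from the slide.

```latex
sw^{C}_i(k) \;=\; \prod_{j=1}^{k}
  \frac{P\left(C(j) = 1 \mid C(j-1) = 1,\; \bar{A}_{j-1},\; V\right)}
       {P\left(C(j) = 1 \mid C(j-1) = 1,\; \bar{A}_{j-1},\; \bar{L}_{j-1}\right)}
```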
8. Estimating Stabilized Missingness Weights
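One common way to estimate such weights is with pooled logistic regressions on person-interval data. The sketch below uses simulated data and base R only; every variable name (`d`, `V`, `L`, `A`, `C`, `sw_c`) is hypothetical, and the two `glm` models are simple stand-ins for whatever censoring models one actually selects. A and L in row k are assumed to be the most recent values available before censoring status for interval k is determined.

```r
set.seed(1)
n <- 200; K <- 4
# One row per subject per interval (k varies fastest within id)
d <- expand.grid(k = 1:K, id = 1:n)
d$V <- rep(rbinom(n, 1, 0.5), each = K)        # baseline covariate
d$L <- rnorm(nrow(d))                           # time-varying confounder
d$A <- rbinom(nrow(d), 1, plogis(0.5 * d$L))    # treatment
# C = 1 if still observed in interval k, 0 if censored
d$C <- rbinom(nrow(d), 1, plogis(2 - 0.5 * d$L))

# Once censored, always censored: keep rows up to and including the first C == 0
d$seen <- ave(d$C, d$id, FUN = function(x) cumprod(c(1, head(x, -1))))
d <- d[d$seen == 1, ]

# Denominator model uses the confounder; numerator model does not
den <- glm(C ~ A + L + V + factor(k), family = binomial, data = d)
num <- glm(C ~ A + V + factor(k), family = binomial, data = d)

# Per-interval ratio of P(observed), then cumulative product within subject
ratio <- fitted(num) / fitted(den)
d$sw_c <- ave(ratio, d$id, FUN = cumprod)

# Only uncensored intervals carry outcome data and hence a weight
obs <- d[d$C == 1, ]
summary(obs$sw_c)
```

Because the numerator and denominator models differ only in L, the weights stay close to 1 when L is weakly related to censoring.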
9. Total Weights
- Then the weight for every non-censored interval is (just like repeated outcome data) the product of the treatment weight and the censoring weight.
- It would be nice, lovely, marvelous, wonderful ... to have an automatic procedure to select the models.
- Good news: they exist!
10. What a good method would look like
- We want a procedure that searches a large list of possible models and chooses the best model based on a sensible criterion.
- The procedure should try, automatically, to balance the competing goals of minimizing bias and minimizing variance.
- It should, thus, try to fit the data as closely as possible without over-fitting.
- The model should accommodate, in a flexible manner, both non-linear dose responses and possible statistical interactions.
11. What would the optimal procedure try to optimize?
- Suppose the following were our MSM of interest:
- Then the optimal procedure would choose the model for the weights that minimized, given the data:
12. What would the optimal procedure try to optimize?
- However, procedures to do so are not currently available (perhaps soon, using recently developed cross-validation theory).
- Thus, we concentrate on a routine that still optimizes the fit of the censoring and treatment models and has the qualities we desire: POLYCLASS.
13. POLYCLASS
- Procedure used to model categorical outcomes.
- Thus, it can be used for categorical treatment (the Ak) and binary missingness or censoring (the Ck).
- Developed by Kooperberg et al., 1997.
- Works by building progressively more complicated models based on interactions and linear splines.
14. Linear Splines
- A linear spline allows the function to bend at a point called a knot.
- Linear splines are based on basis functions of the following form: x - c if x > c, and 0 otherwise, where c is the knot.
- Usually written as $(x - c)_+$.
15. Example of Linear Spline
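A small R illustration of a linear spline with one knot, on simulated data. The helper name `pos_part`, the knot location (4), and the simulated slopes are arbitrary choices made for this sketch.

```r
# Linear spline basis function (x - c)_+ : zero below the knot, linear above it
pos_part <- function(x, knot) pmax(x - knot, 0)

set.seed(2)
x <- runif(200, 0, 10)
# Simulated truth: slope 1 below the knot at 4, slope 1 - 1.5 = -0.5 above it
y <- 1 + x - 1.5 * pos_part(x, 4) + rnorm(200, sd = 0.3)

fit <- lm(y ~ x + pos_part(x, 4))
coef(fit)   # intercept, slope below the knot, change in slope at the knot
```

The coefficient on the spline term is the change in slope at the knot, so the fitted function is continuous but bends at x = 4.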
16. Description of the Polyclass Model (assume binary outcome)
- The $g_i$ are (basis) functions of the relevant covariates (linear terms, spline terms, interactions, etc.).
- The $\beta_i$ are the coefficients in front of these basis functions.
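For a binary outcome, a model of this kind is presumably a logistic regression on the basis functions. A sketch, writing the coefficients as $\beta_i$ (an assumed symbol) and the covariates as X:

```latex
\log \frac{P(Y = 1 \mid X)}{P(Y = 0 \mid X)}
  \;=\; \beta_0 + \sum_{i=1}^{p} \beta_i \, g_i(X)
```

Here p is the number of non-constant basis functions, so the constant model corresponds to p = 0.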
17. How POLYCLASS builds models up
- Starts with the constant model (just $\beta_0$), and at each step records a fit statistic; the default is $\mathrm{BIC}_p = -2\log(\text{like}) + \log(n)(p+1)$, where n is the sample size and p is the number of non-constant basis functions in the model.
- Among all potential covariates, finds the one that gives the smallest p-value (Rao statistic).
- At the next step, considers adding either 1) a new covariate, 2) a new knot to an existing covariate, or 3) an interaction between basis functions already in the model.
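The fit statistic can be illustrated with ordinary logistic regressions in base R. This sketch does not reproduce the POLYCLASS search itself; it only computes -2 log(likelihood) + log(n)(p + 1) for a few hand-picked models of increasing complexity on simulated data. The helper `bic_manual` and all variable names are hypothetical.

```r
set.seed(3)
n <- 300
x1 <- rnorm(n); x2 <- rnorm(n)
a <- rbinom(n, 1, plogis(-0.5 + x1))   # simulated binary treatment

bic_manual <- function(fit) {
  p <- length(coef(fit)) - 1            # number of non-constant basis functions
  as.numeric(-2 * logLik(fit)) + log(nobs(fit)) * (p + 1)
}

m0 <- glm(a ~ 1, family = binomial)                    # constant model
m1 <- glm(a ~ x1, family = binomial)                   # add a covariate
m2 <- glm(a ~ x1 + pmax(x1, 0), family = binomial)     # add a knot at 0
m3 <- glm(a ~ x1 + x2 + x1:x2, family = binomial)      # add an interaction

sapply(list(m0 = m0, m1 = m1, m2 = m2, m3 = m3), bic_manual)
```

Smaller values are better; terms that do not improve the likelihood enough to offset the log(n) penalty increase the statistic.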
18. How POLYCLASS pares models down
- Removes basis functions one at a time, again recording the fit statistic at each step.
- Removes terms hierarchically, so it only removes a main effect if the variable is not involved in a spline term or an interaction term.
- Among all candidate terms for removal, picks the one with the highest p-value (Wald statistic).
19. The End Result
- Among all models (from both the building-up and paring-down procedures), chooses the model with the best (smallest) fit statistic.
- One frequently gets a complicated and somewhat hard-to-interpret model involving splines, interactions and main effects.
- The good news is you don't really care: you just need the probability of the observed treatment, of not being missing, or of not being censored, and follow-up prediction procedures are available.
- Implemented in both R and Splus.
20. POLYCLASS function in R
- polyclass(data, cov, weight, penalty, maxdim, exclude, include, additive = FALSE, linear, delete = 2, fit, silent = TRUE, normweight = TRUE, tdata, tcov, tweight, cv, select, loss, seed)
- data: vector of outcomes
- cov: matrix of covariates
21. Example with Diarrhea Data
- Diarrhea
- Read in the data from a comma-delimited file with a header:
- diarrhea1 <- read.csv("c:/hubbard/causal2003/datasets/diarboil.csv", header = TRUE)
- Omit rows with missing observations:
- diarrhea1 <- na.omit(diarrhea1)
22. Diarrhea/Boiling Example
- Point treatment study with boiling water (categorical) as the explanatory variable (0 = none, 1 = sometimes, 2 = always)
- Many potential confounders
- Diarrhea (outcome) is 0 (no) or 1 (yes)
- MSM
23. Treatment Model: multinomial logistic regression
24. Fitting with Polyclass
- boil <- diarrhea1[, "boilcat"]
- covar <- as.matrix(diarrhea1[, 3:21])
- Fit the model (use cross-validation to choose the penalty that minimizes misclassification):
- poly.boil <- polyclass(boil, covar, cv = 3, select = 0)
- Results:
       dim1 knot1 dim2 knot2  Class 0  Class 1  Class 2
   1     NA    NA   NA    NA  -10.878  -15.019        0
   2      1    NA   NA    NA   -5.515   -2.434        0
   3      1     1   NA    NA   -5.183   -7.281        0
   4     19    NA   NA    NA   21.701   23.556        0
   5     15    NA   NA    NA   10.685   11.580        0
   6      1    NA   15    NA   12.203   11.348        0
   7      5    NA   NA    NA  -11.551  -11.345        0
   8     18    NA   NA    NA    4.290    3.840        0
   9     18    NA   19    NA   -1.989   -2.338        0
  10     18     1   NA    NA   -3.166   -2.300        0
  11      1    NA   19    NA   14.985   14.899        0
25. Fitting with Polyclass, cont.
- The importance-anova decomposition is:
   Cov-1  Cov-2  Percentage
      NA     NA       53.60
       1     NA       12.72
       5     NA        5.07
      15     NA       10.41
      18     NA        1.24
      19     NA        2.88
       1     15        2.67
       1     19        4.70
      18     19        6.71
- Getting probabilities using ppolyclass
- Description: Classify new cases ('cpolyclass'), compute class probabilities for new cases ('ppolyclass'), and generate random multinomials for new cases ('rpolyclass') for a 'polyclass' model.
26. Probabilities and Weights of Observed Treatment
- ppolyc <- ppolyclass(boil, covar, poly.boil)
- wghts <- 1/ppolyc
27. Stabilized Weights
- boilw <- table(boil)/sum(table(boil))
- sw <- wghts
- sw[boil == 0] <- boilw[1] * sw[boil == 0]
- sw[boil == 1] <- boilw[2] * sw[boil == 1]
- sw[boil == 2] <- boilw[3] * sw[boil == 2]
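The same stabilized-weight calculation can be tried end to end without the diarrhea data. The sketch below is an illustration only: the treatment `boil` and confounder `L` are simulated, and the matrix `prob` is a hypothetical stand-in for the fitted class probabilities that a treatment model would supply. It also shows a vectorized alternative to the class-by-class assignment.

```r
set.seed(4)
n <- 500
L <- rnorm(n)                                   # simulated confounder
# Stand-in for fitted probabilities P(A = a | L), a = 0, 1, 2 (one column each)
lin <- cbind(0, 0.5 * L, -0.5 * L)
prob <- exp(lin) / rowSums(exp(lin))
boil <- apply(prob, 1, function(p) sample(0:2, 1, prob = p))

# Probability of the treatment each subject actually received
p_obs <- prob[cbind(seq_len(n), boil + 1)]
wghts <- 1 / p_obs

# Numerator: marginal frequency of each treatment level
boilw <- table(boil) / sum(table(boil))

# Vectorized: look up each subject's marginal frequency by treatment label
sw <- as.numeric(boilw[as.character(boil)]) * wghts

summary(sw)
mean(sw)   # stabilized weights should average close to 1
```

Stabilizing does not change which subjects are up- or down-weighted within a treatment level; it rescales each level so that the weights have mean near 1 and are less variable than `wghts`.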