Provided by: alanhu

1
Marginal Structural Models and Causal Inference
in Epidemiology
Modeling Treatment, Censoring and Missing Values

Kooperberg, et al., 1997, JASA 92(437): 117.
Robins, et al., 2000, Epidemiology 11(5): 550.
Hernan, et al., 2002, Stat in Med. 21: 1689.

2
Review of Set-up of Data

Measurements are made at regular times, 1, 2, ..., K.
A_k is the AZT dose on the kth day.
Y_k is a dichotomous outcome measured (maybe
repeatedly) on the kth day.
Ā_k = (A_1, ..., A_k) is the history
of treatment as measured at time k.
L̄_k = (L_1, ..., L_k) is the history
of the potential confounders measured at time k.

3
Equivalence (algorithmically) of repeated binary
outcome data and survival data.

In survival data, the outcome of interest is T,
the time at which an event occurs relative to an
index time.
Example: time of recurrence of cancer measured
from removal of the original tumor.
Or: time of death measured from diagnosis of AIDS.
To implement the MSM estimator with survival data
in practice, we must discretize the time scale
into (equal) blocks of time (e.g., months).

4

If we break up the time scale into blocks, then
one variable, T, is converted to a set of binary
variables, Y_k, that are 0 (not dead yet) or 1
(died), and blank after either the event or
censoring (o = censored, x = failure).
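
The conversion from a single time T to a set of binary Y_k can be sketched in base R as follows. This is only an illustration: the data frame `surv`, its columns `time` and `event`, and the helper `to_person_period` are hypothetical names, and `time` is assumed to already be the number of the last interval in which the subject was observed.

```r
# Hypothetical survival data: last observed interval and event indicator
surv <- data.frame(id    = 1:3,
                   time  = c(3, 2, 4),   # last interval observed (assumed discretized)
                   event = c(1, 0, 1))   # 1 = failure, 0 = censored

# Expand one row per subject into one row per interval at risk
to_person_period <- function(d) {
  rows <- lapply(seq_len(nrow(d)), function(i) {
    k <- seq_len(d$time[i])
    data.frame(id = d$id[i],
               k  = k,
               # Y_k is 1 only in the final interval of a subject who failed
               y  = as.integer(k == d$time[i] & d$event[i] == 1))
  })
  do.call(rbind, rows)
}

pp <- to_person_period(surv)
print(pp)
```

Intervals after the event or after censoring do not appear at all, which corresponds to the "blank" entries described above.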

5
Treatment weights are now the same as for
longitudinal data with repeated measures

Now, k represents the time interval.
Estimation works by treating each interval as a
unique observation and using the same weighting
as before.
If not fitting a stratified MSM, then ignore the
V in the numerator.

6
Censoring Weights

Same as the missingness weights.
In this case, missingness has a more predictable
pattern: once missing (censored), always
missing.
Use this to construct a slightly less general
censoring weight.

7
Censoring Weights

Let C(k) = 0 if the subject is censored in that
interval, C(k) = 1 if still in the risk set (the
opposite of the syntax in the Hernan et al., 1999
paper).
Informally, the denominator is the probability of
being observed in interval k (not censored),
given the past (not censored yet, covariates and
treatment).
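
For reference, one common way to write a stabilized censoring weight under this coding (C(k) = 1 means still in the risk set) is sketched below. The notation is an assumption, not taken from the slides: \bar{A}(j-1) is treatment history, \bar{L}(j-1) is confounder history, and V is the set of baseline covariates kept in the numerator.

```latex
sw^{C}(k) \;=\; \prod_{j=1}^{k}
  \frac{P\bigl(C(j)=1 \mid C(j-1)=1,\ \bar{A}(j-1),\ V\bigr)}
       {P\bigl(C(j)=1 \mid C(j-1)=1,\ \bar{A}(j-1),\ \bar{L}(j-1)\bigr)}
```

The denominator is the probability described informally above; the numerator is the same probability without conditioning on the time-varying confounders.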

8
Estimating Stabilized Missingness Weights

Need a model for the probability of remaining
uncensored in each interval, given the past.

9
Total Weights

Then the weight for every non-censored interval
is the product of the treatment weight and the
censoring weight (just like repeated outcome data).
It would be nice, lovely, marvelous, wonderful
... to have an automatic procedure to select the
models.
Good news: such procedures exist!
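
As a minimal sketch of how the total weight on this slide could be assembled once the treatment and censoring models have been fitted: the data frame `d` and its columns are hypothetical, and the probabilities are made-up numbers standing in for fitted values. `p_a_*` is assumed to be the probability of the treatment actually received and `p_c_*` the probability of remaining uncensored, from the numerator (`num`) and denominator (`den`) models.

```r
# Hypothetical fitted probabilities, one row per subject-interval
d <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  k       = c(1, 2, 3, 1, 2),
  p_a_num = c(0.50, 0.60, 0.60, 0.50, 0.40),
  p_a_den = c(0.40, 0.75, 0.50, 0.50, 0.80),
  p_c_num = c(0.90, 0.90, 0.90, 0.90, 0.90),
  p_c_den = c(0.90, 0.75, 0.90, 0.60, 0.90)
)

# Cumulative product of the interval-specific ratios within each subject
d$sw_a <- ave(d$p_a_num / d$p_a_den, d$id, FUN = cumprod)  # treatment weight
d$sw_c <- ave(d$p_c_num / d$p_c_den, d$id, FUN = cumprod)  # censoring weight

# Total weight for each non-censored interval
d$sw <- d$sw_a * d$sw_c
print(d[, c("id", "k", "sw_a", "sw_c", "sw")])
```

The column `sw` would then be supplied as the observation weight when fitting the MSM to the interval-level data.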

10
What a good method would look like

Want a procedure that searches a large list of
possible models and chooses the best model based
on a sensible criterion.
The procedure should try, automatically, to
balance the competing goals of minimizing bias
and variance.
It should, thus, try to fit the data as closely
as possible without over-fitting the data.
The model should accommodate, in a flexible
manner, both non-linear dose responses and
possible statistical interactions.

11
What would the optimal procedure try to optimize?

Suppose the following were our MSM of interest:
Then, the optimal procedure would choose the
model for the weights that minimized, given the
data

12
What would the optimal procedure try to optimize?

However, procedures to do so are not currently
available (perhaps soon, using recently developed
cross-validation theory).
Thus, we concentrate on a routine that still
optimizes the fit of the censoring and treatment
models and has the qualities we desire:
POLYCLASS.

13
POLYCLASS

Procedure used to model categorical outcomes.
Thus, it can be used for categorical treatment (the
A_k) and binary missingness or censoring (the C_k).
Developed by Kooperberg, et al., 1997.
Works by building progressively more complicated
models based on interactions and linear splines.

14
Linear Splines

A linear spline allows the function to bend at a
point called a knot.
Linear splines are built from basis functions
that equal x - c when x > c and 0 otherwise,
where c is the knot.
Usually written as (x - c)+.

15
Example of Linear Spline

E(Y | X = x) = 3x - 1.5(x - 5)+
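
A minimal R sketch of this example, assuming the garbled formula on this slide reads E(Y | X = x) = 3x - 1.5(x - 5)+, that is, a knot at 5 with slope 3 below it and slope 1.5 above it. The function names `lin_spline` and `ey` are illustrative only.

```r
# Truncated linear basis function: (x - knot)+ = max(x - knot, 0)
lin_spline <- function(x, knot) pmax(x - knot, 0)

# The slide's example curve: bends at the knot x = 5
ey <- function(x) 3 * x - 1.5 * lin_spline(x, 5)

x <- c(0, 4, 5, 7, 10)
print(data.frame(x = x, ey = ey(x)))
```

Below the knot the basis function is zero, so the curve is just 3x; above it, each unit of x adds 3 - 1.5 = 1.5.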

16
Description of the Polyclass Model (assume binary
outcome)

Where g_i are (basis) functions of the relevant
covariates (linear terms, spline terms,
interactions, etc.).
β_i are the coefficients in front of these basis
functions.
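
The model equation itself did not survive in this transcript. Given the description above (binary outcome, basis functions g_i, one coefficient per basis function, and a constant term), it presumably has the usual logistic form sketched here; the symbols β, X and p are assumptions.

```latex
\operatorname{logit} P(Y = 1 \mid X)
  \;=\; \log\frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)}
  \;=\; \beta_0 + \sum_{i=1}^{p} \beta_i \, g_i(X)
```

Here p is the number of non-constant basis functions in the model.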

17
How POLYCLASS builds models up

Starts with the constant model (just β_0), and at
each step records a fit statistic. The default is
BIC = -2 log(like) + log(n)(p + 1), where n is the
sample size and p is the number of non-constant
basis functions in the model.
Among all potential covariates, finds the one
that gives the smallest p-value (Rao statistic).
At the next step, considers adding either 1) a new
covariate, 2) a new knot to existing covariates,
or 3) an interaction between basis functions
already in the model.
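
To make the fit statistic concrete, here is a small R sketch that computes -2 log(like) + log(n)(p + 1) for a sequence of nested models. It uses ordinary logistic regression (`glm`) on the built-in `mtcars` data purely as a stand-in; it is not POLYCLASS and not the presentation's data, and `bic_slide` is a hypothetical helper name.

```r
# Fit statistic from the slide: -2 log(likelihood) + log(n) * (p + 1)
bic_slide <- function(fit) {
  n <- nobs(fit)
  p <- length(coef(fit)) - 1            # non-constant terms only
  as.numeric(-2 * logLik(fit)) + log(n) * (p + 1)
}

# Stand-in sequence of models, from the constant model upward
fit0 <- glm(am ~ 1,       data = mtcars, family = binomial)
fit1 <- glm(am ~ wt,      data = mtcars, family = binomial)
fit2 <- glm(am ~ wt + hp, data = mtcars, family = binomial)

stats <- sapply(list(constant = fit0, wt = fit1, wt_hp = fit2), bic_slide)
print(stats)
print(names(which.min(stats)))          # model with the smallest statistic
```

For these models the helper agrees with R's built-in `BIC()`, since the number of estimated parameters is p + 1.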

18
How POLYCLASS pares models down.

Removes basis functions one at a time, again at
each step recording the fit statistic.
Removes terms hierarchically: a main effect is
removed only if the variable is not involved in a
spline term or an interaction term.
Among all candidate variables for removal, picks
the one with the highest p-value (Wald statistic).

19
The End Result

Among all models (both in the building up and
paring down procedures), chooses the model with
the best (smallest) fit statistic.
One frequently gets a complicated and somewhat
hard to interpret model involving splines,
interactions and main effects.
Good news is you don't really care: you just need
the probability of observed treatment, of not being
missing or of not being censored, and follow-up
prediction procedures are available.
Implemented in both R and S-PLUS.

20
POLYCLASS function in R

polyclass(data, cov, weight, penalty, maxdim,
          exclude, include, additive = FALSE, linear,
          delete = 2, fit, silent = TRUE, normweight = TRUE,
          tdata, tcov, tweight, cv, select, loss, seed)
data: vector of outcomes
cov: matrix of covariates

21
Example with Diarrhea Data

# Diarrhea
# Read in the data from a comma delimited file
# with header
diarrhea1 <- read.csv("c:/hubbard/causal2003/datasets/diarboil.csv", header = T)
# Omit rows with missing observations
diarrhea1 <- na.omit(diarrhea1)

22
Diarrhea/Boiling Example

Point treatment study with boiling water
(categorical) as the explanatory variable (0 = none,
1 = sometimes, 2 = always).
Many potential confounders.
Diarrhea (outcome) is 0 (no) or 1 (yes).
MSM:

23
Treatment Model - multinomial logistic regression

j = 0, 1 or 2.
?i00
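
A minimal sketch of how class probabilities follow from a multinomial logistic model with one reference class whose linear predictor is fixed at 0 (the fitted output later in the presentation shows Class 2 with coefficients of 0). The function `class_probs` and the numbers in `eta` are hypothetical; they only illustrate the arithmetic.

```r
# Class probabilities from linear predictors, one column per class.
# The reference class column is all zeros.
class_probs <- function(eta) {
  # Subtract each row's maximum before exponentiating (numerical stability);
  # the vector is recycled down columns, so row i loses its own maximum.
  e <- exp(eta - apply(eta, 1, max))
  e / rowSums(e)
}

# Hypothetical linear predictors for two subjects, classes 0, 1, 2
eta <- rbind(c(log(2), 0, 0),
             c(0,      0, 0))
colnames(eta) <- c("class0", "class1", "class2")

p <- class_probs(eta)
print(p)
```

The inverse-probability-of-treatment weight for a subject uses the entry of `p` in the column of the treatment that subject actually received.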

24
Fitting with Polyclass

boil <- diarrhea1[, "boilcat"]
covar <- as.matrix(diarrhea1[, 3:21])
# Fit model (use cross-validation to choose the
# penalty to minimize misclassification)
poly.boil <- polyclass(boil, covar, cv = 3, select = 0)
Results
    dim1 knot1 dim2 knot2  Class 0  Class 1  Class 2
 1    NA    NA   NA    NA  -10.878  -15.019        0
 2     1    NA   NA    NA   -5.515   -2.434        0
 3     1     1   NA    NA   -5.183   -7.281        0
 4    19    NA   NA    NA   21.701   23.556        0
 5    15    NA   NA    NA   10.685   11.580        0
 6     1    NA   15    NA   12.203   11.348        0
 7     5    NA   NA    NA  -11.551  -11.345        0
 8    18    NA   NA    NA    4.290    3.840        0
 9    18    NA   19    NA   -1.989   -2.338        0
10    18     1   NA    NA   -3.166   -2.300        0
11     1    NA   19    NA   14.985   14.899        0