Introduction to causal inference and the analysis of treatment effects in the presence of departures from random allocation

1
Methods of explanatory analysis for psychological
treatment trials workshop

Session 1
Introduction to causal inference and the analysis
of treatment effects in the presence of
departures from random allocation
Ian White

Funded by MRC Methodology Grant G0600555 MHRN
Methodology Research Group
2
Plan of session 1

Describe departures from random allocation
Intention-to-treat analysis, per-protocol analysis and their limitations
What do we want to estimate?
Estimation methods:
  Principal stratification
  Instrumental variables
  Structural mean model
Extensions: complex departures, missing data, covariates
Small group discussion
Illustrated with data from the ODIN and SoCRATES trials

3
Parallel-group trial
[Flow diagram: Recruit, then Randomise to Standard treatment (S) or Experimental treatment (E); participants get S or get E; outcome is measured in both arms]
4
Aim of session 1

Infer causal effect of treatment in the presence of departures from randomised intervention
  Better term than "non-compliance": includes both non-adherence and changes in prescribed treatment
Types of departure:
  Switches to other trial treatment, or changes to non-trial (or no) treatment
  Yes / no, or quantitative (e.g. attend some sessions)
  Constant or time-dependent
We'll start by considering the simplest case: all-or-nothing switches to the other trial treatment
The methods introduced here will be used in later sessions

6
Intention-To-Treat (ITT) Principle

http://www.consort-statement.org/ glossary:
"A strategy for analyzing data in which all participants are included in the group to which they were assigned, whether or not they completed the intervention given to the group. Intention-to-treat analysis prevents bias caused by the loss of participants, which may disrupt the baseline equivalence established by random assignment and which may reflect non-adherence to the protocol."
Now the standard analysis, and rightly so

7
Intention-to-treat analysis

Compare groups as randomised, ignoring any departures
Answers an important pragmatic question
  e.g. the public health impact of prescribing E
Disadvantage: this may be the wrong question!
  may want to explore public health impact of prescribing E outside the trial, when compliance might be less (alternative pragmatic question)
  may want to know the effect of receiving E (explanatory question)

8
Disadvantage of ITT

"Doctor doctor, will psychotherapy cure my depression?"
"I don't know, but I expect prescribing psychotherapy to reduce your BDI score by 5 units on average"
  that's on average over whether you attend or not
Clearly, judgements about whether a patient is likely to attend, take a drug, etc., should be a part of prescribing
But we often need to know effects of attendance, the drug, etc. in themselves

9
Per-protocol (PP) analysis

Alternative to ITT
Exclude any data collected after a departure from randomised treatment
  requires careful pre-definition: what will be counted as departures?
Idea is to exclude data that don't allow for the full effect of treatment
However, PP implicitly assumes that individuals with different treatment experience are comparable
  rarely true
  in practice there can be substantial selection bias

10
Alternative to ITT and PP

We adopt a causal modelling approach that
carefully considers what we want to estimate and
what assumptions are needed to do so
Estimation will avoid assumptions of
comparability between groups as treated
will instead be based on comparisons of
randomised groups

12
What do we want to estimate?

The effect of the intervention, if everyone had
received their randomised intervention?
average causal effect, ACE
average treatment effect, ATE
conceptual difficulties
how could we make them receive their randomised
intervention?
would this be ethical?
would it have other consequences?
technical difficulties
turns out to be unidentified (unestimable)
without further strong assumptions

13
What do we want to estimate? (2)

Alternatives to the average causal effect
Average treatment effect in the treated, ATT
Complier-average causal effect, CACE
to be defined below
Note how we separate what we want to estimate
from analysis methods

14
Counterfactuals

Consider a trial of intervention E vs. control S
Define counterfactual or potential outcomes:
  $Y_i(1)$ = outcome for individual i if they received intervention
  $Y_i(0)$ = outcome for individual i if they received control
We can only observe one of these!
Intervention effect for individual i is $\Delta_i = Y_i(1) - Y_i(0)$
Then average causal effect of intervention is $E[\Delta_i]$
  the average difference between outcome with intervention and outcome with control

15
Estimation with perfect compliance

With perfect compliance, we observe
  $Y_i(1)$ in everyone in the intervention arm
  $Y_i(0)$ in everyone in the control arm
Randomisation means that mean outcome with intervention can be estimated by mean outcome of those who got intervention
  $E[Y_i \mid R_i = E] - E[Y_i \mid R_i = S] = E[Y_i(1) \mid R_i = E] - E[Y_i(0) \mid R_i = S] = E[Y_i(1)] - E[Y_i(0)] = E[\Delta_i]$
So ITT estimates the average causal effect of intervention
Not true with imperfect compliance!
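
The argument on this slide can be checked with a small simulation. The sketch below is illustrative only: the potential outcomes are generated from invented values (control mean 15, individual effects scattered around -3) and have nothing to do with the ODIN data. Because both potential outcomes are simulated, the true average causal effect is known and can be compared with the difference in arm means.

```python
import random
from statistics import mean

random.seed(1)
n = 50_000

# Hypothetical potential outcomes: y0 under control, y1 under intervention.
y0 = [random.gauss(15, 5) for _ in range(n)]
y1 = [a - 3 + random.gauss(0, 2) for a in y0]    # individual effects vary around -3

# True average causal effect: needs both potential outcomes, so it is
# available only in a simulation, never in a real trial.
ace = mean([b - a for a, b in zip(y0, y1)])

# Randomise; with perfect compliance each person reveals the potential
# outcome that matches their randomised arm.
in_arm_e = [random.random() < 0.5 for _ in range(n)]
observed_e = [b for b, r in zip(y1, in_arm_e) if r]
observed_s = [a for a, r in zip(y0, in_arm_e) if not r]

itt = mean(observed_e) - mean(observed_s)
print("true ACE:", round(ace, 2), " ITT estimate:", round(itt, 2))
```

With perfect compliance the two printed numbers agree up to sampling error, which is the sense in which ITT estimates the average causal effect here.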

16
Estimation with imperfect compliance

Assume all-or-nothing compliance
  everyone gets either intervention or control
In the intervention arm, we observe
  $Y_i(1)$ in compliers
  $Y_i(0)$ in non-compliers
In the control arm, we observe
  $Y_i(0)$ in compliers
  $Y_i(1)$ in contaminators
Need assumptions to estimate the average causal effect of intervention
A very simple assumption is $Y_i(1) - Y_i(0) = b$
  b is the (average) causal effect of intervention

17
Estimation with imperfect compliance (2)

Continuing with causal model $Y_i(1) - Y_i(0) = b$
  can be written as $Y_i = Y_i(0) + b D_i$
  $D_i = 1$ if intervention was received, else 0
Implies that expected difference in outcome (between randomised groups) = causal effect of intervention x expected difference in intervention receipt
  $E[Y_i \mid R_i = E] - E[Y_i \mid R_i = S] = b \, \{ E[D_i \mid R_i = E] - E[D_i \mid R_i = S] \}$
This gives the simplest causal estimator:
  causal effect of intervention = expected difference in outcome / expected difference in intervention receipt
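
As a minimal sketch, the ratio on this slide can be computed directly from three per-person lists. The function name `ratio_estimate` and the toy data are assumptions made for illustration; they are not from the workshop materials.

```python
from statistics import mean

def ratio_estimate(y, d, r):
    """Difference in mean outcome between randomised arms, divided by the
    difference in the proportion who received the intervention.

    y: outcomes
    d: 1 if the intervention was received, else 0
    r: 1 if randomised to intervention (E), 0 if randomised to control (S)
    """
    y_e = [yi for yi, ri in zip(y, r) if ri == 1]
    y_s = [yi for yi, ri in zip(y, r) if ri == 0]
    d_e = [di for di, ri in zip(d, r) if ri == 1]
    d_s = [di for di, ri in zip(d, r) if ri == 0]
    return (mean(y_e) - mean(y_s)) / (mean(d_e) - mean(d_s))

# Made-up toy data: 4 people per arm, half of arm E actually attend.
r = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]
y = [10, 12, 16, 14, 15, 17, 16, 16]

estimate = ratio_estimate(y, d, r)
print(estimate)
```

In the toy data the arm means differ by 13 - 16 = -3 and the receipt proportions differ by 0.5 - 0 = 0.5, so the ratio is -3 / 0.5. The function fails with a division by zero if receipt is identical in both arms, which is the case where randomisation carries no information about the effect of receipt.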

18
But

Angrist, Imbens and Rubin (1996) took a different perspective and showed that this estimator isn't what it seems
To see this, consider "counterfactual treatments":
  $D_i^E$ = treatment if randomised to intervention
  $D_i^S$ = treatment if randomised to control
  both are 0/1 (received standard / intervention)
Implies 4 types of person (compliance-types):
  $D_i^E = 1, D_i^S = 1$: always-takers
  $D_i^E = 1, D_i^S = 0$: compliers
  $D_i^E = 0, D_i^S = 0$: never-takers
  $D_i^E = 0, D_i^S = 1$: defiers (assumed absent)

19
Introducing the complier-average causal effect

The observed data tell us nothing about the causal effects of treatment in always-takers and never-takers
In fact, our simple estimator estimates the complier-average causal effect (CACE): $E[Y_i(1) - Y_i(0) \mid D_i^E = 1, D_i^S = 0]$
This is all we can hope to estimate in RCTs!

20
Problems with the CACE

We don't know who is a complier
In practice, we may want to know what will be observed
  if compliance is worse than in the trial (e.g. if rolled out in clinical practice)
  if compliance is better than in the trial (e.g. because intervention is well publicised / marketed)
This means we want to know the average causal effect in a different subgroup. We might assume this is the CACE, but it is an assumption

21
Summary of things we can estimate

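ITT: $E[Y \mid R = E] - E[Y \mid R = S]$
PP: $E[Y \mid R = E, D^E = 1] - E[Y \mid R = S, D^S = 0]$
ACE/ATE: $E[Y(1) - Y(0)]$
ATT: $E[Y(1) - Y(0) \mid D^E = 1]$
CACE: $E[Y(1) - Y(0) \mid D^E = 1, D^S = 0]$
We are going to explore ways to estimate the CACE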
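
To see how these five quantities can differ, the sketch below simulates a population in which every person's compliance-type and both potential outcomes are known. All numbers (the 60/30/10 mix of types, the effects of -4, -1 and -6, the worse untreated outcomes in never-takers) are invented for illustration. The last line computes the simple ratio estimator from observable data only.

```python
import random
from statistics import mean

random.seed(2)
n = 60_000
effect_by_type = {"complier": -4.0, "never": -1.0, "always": -6.0}  # invented

people = []
for _ in range(n):
    u = random.random()
    ctype = "complier" if u < 0.6 else ("never" if u < 0.9 else "always")
    d_e = 0 if ctype == "never" else 1      # treatment if randomised to E
    d_s = 1 if ctype == "always" else 0     # treatment if randomised to S
    y0 = random.gauss(16, 5) + (3 if ctype == "never" else 0)
    y1 = y0 + effect_by_type[ctype]
    r = "E" if random.random() < 0.5 else "S"
    d = d_e if r == "E" else d_s            # treatment actually received
    y = y1 if d == 1 else y0                # outcome actually observed
    people.append(dict(r=r, d=d, y=y, d_e=d_e, d_s=d_s, delta=y1 - y0))

def avg(key, keep):
    """Mean of field `key` over the people for whom keep(person) is true."""
    return mean([p[key] for p in people if keep(p)])

# Quantities defined on observed data
itt = avg("y", lambda p: p["r"] == "E") - avg("y", lambda p: p["r"] == "S")
pp = (avg("y", lambda p: p["r"] == "E" and p["d"] == 1)
      - avg("y", lambda p: p["r"] == "S" and p["d"] == 0))

# Quantities defined on potential outcomes (known only because we simulated)
ace = avg("delta", lambda p: True)
att = avg("delta", lambda p: p["d_e"] == 1)
cace = avg("delta", lambda p: p["d_e"] == 1 and p["d_s"] == 0)

# Simple ratio estimator, using observed data only
uptake = avg("d", lambda p: p["r"] == "E") - avg("d", lambda p: p["r"] == "S")
ratio = itt / uptake

for name, value in [("ITT", itt), ("PP", pp), ("ACE", ace),
                    ("ATT", att), ("CACE", cace), ("ratio", ratio)]:
    print(f"{name:6s}{value:7.2f}")
```

Under these assumptions the ratio estimator lands close to the CACE rather than the ACE or ATT, while the per-protocol contrast is pulled away from all of them because the never-takers, who do worse untreated, appear only in its control-arm comparison group.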

23
Principal stratification

An idea of Frangakis and Rubin (1999), generalising the simple compliance-types above
Again, let
  $D_i^E$ = treatment if randomised to intervention
  $D_i^S$ = treatment if randomised to control
  where both could be complex (e.g. numbers of sessions of psychotherapy)
Principal strata are the levels of the pair $(D_i^E, D_i^S)$

24
Using principal stratification

We should model outcomes conditional on principal strata
  typically allow a different mean for each principal stratum: avoids assuming they are comparable
  allow differences between randomised groups within principal strata
  these parameters have a causal meaning
Of course this may not be easy, since for every individual we only know one of $(D_i^E, D_i^S)$, so we don't know their principal stratum

25
Example: ODIN trial

Trial of 2 psychological interventions to reduce depression (Dowrick et al, 2000)
Randomised individuals:
  236 to the psychological interventions (E)
  128 to treatment as usual (S)
Outcome: Beck Depression Inventory (BDI) at 6 months
  recorded on 317 randomised individuals

26
ODIN trial: compliance

Of 236 individuals randomised to psychological interventions, 128 (54%) attended in full
  others refused, did not attend or discontinued
Psychological interventions weren't available to the control arm (no contaminators), so $D^S = 0$ for all
Only 2 principal strata:
  would attend if randomised to intervention: $D^E = 1$, compliers
  would not attend if randomised to intervention: $D^E = 0$, never-takers

27
Exclusion restriction

Key assumption used to identify the CACE
In individuals for whom randomisation has no effect on treatment (e.g. in never-takers and always-takers), randomisation has no effect on outcome
Often reasonable: e.g. in a double-blind drug trial, not taking active drug is the same as not taking placebo
But not always reasonable: e.g. not attending counselling despite being invited could be different from not attending because uninvited
  "I wouldn't have gone, but I'd like to have been invited"
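
The way the exclusion restriction identifies the CACE can be written out as a short derivation. This is a sketch in the notation of the earlier slides; the symbol $\pi_t$ for the proportion of compliance-type $t$ (always-taker, complier, never-taker) is introduced here for convenience, and defiers are taken to be absent.

```latex
\begin{align*}
E[Y_i \mid R_i = E] - E[Y_i \mid R_i = S]
  &= \sum_{t} \pi_t \,\bigl\{ E[Y_i \mid R_i = E, t] - E[Y_i \mid R_i = S, t] \bigr\} \\
  &= \pi_{\text{complier}} \, E[Y_i(1) - Y_i(0) \mid D_i^E = 1, D_i^S = 0]
     + \pi_{\text{always}} \cdot 0 + \pi_{\text{never}} \cdot 0 , \\[4pt]
E[D_i \mid R_i = E] - E[D_i \mid R_i = S]
  &= \pi_{\text{complier}} , \\[4pt]
\text{so} \quad \text{CACE}
  &= \frac{E[Y_i \mid R_i = E] - E[Y_i \mid R_i = S]}
          {E[D_i \mid R_i = E] - E[D_i \mid R_i = S]} .
\end{align*}
```

The first line uses randomisation (the mix of compliance-types is the same in both arms). The two zeros in the second line are exactly the exclusion restriction: if being invited affected never-takers' outcomes, that term would not vanish and the ratio would no longer equal the CACE.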

28
Exclusion restriction in ODIN

In ODIN, the exclusion restriction means that
randomisation has no effect on outcomes in those
who would not attend if randomised to
psychological intervention
But recall that we included those who
discontinued as non-attenders
their partial attendance is very likely to have
had some effect on them
the exclusion restriction would be more plausible
if we defined compliance as any attendance
we'll return to this later

29
CACE analysis (complete cases)
30
CACE analysis (2)
Note: 66.7% compliance (118/177), so ITT / 0.667 = CACE
CACE = 13.32 - 16.13 = -2.81 (cf. ITT = 13.29 - 15.16 = -1.87)
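
The arithmetic behind these numbers can be retraced in two ways. The sketch below uses only the rounded figures quoted on this slide, and reads 13.32 as the observed mean in intervention-arm compliers (an interpretation of the slide, since the table itself is not in the transcript). The variable names are illustrative.

```python
# Figures quoted on the slide (complete cases)
compliers, arm_n = 118, 177        # intervention arm: compliers / total
mean_e, mean_s = 13.29, 15.16      # mean BDI at 6 months by randomised arm
mean_e_compliers = 13.32           # observed mean in intervention-arm compliers

p = compliers / arm_n              # proportion of compliers
itt = mean_e - mean_s

# Route 1: the ratio estimator, ITT divided by the compliance proportion
cace_ratio = itt / p

# Route 2: stratum by stratum
# never-takers' mean is observable in arm E (they are the non-attenders)
mean_e_never = (mean_e - p * mean_e_compliers) / (1 - p)
# exclusion restriction: never-takers have the same mean in arm S, so the
# control-arm mean can be unmixed to give the compliers' mean under control
mean_s_compliers = (mean_s - (1 - p) * mean_e_never) / p
cace_strata = mean_e_compliers - mean_s_compliers

print(p, itt, cace_ratio, mean_s_compliers, cace_strata)
```

Both routes give the same value, about -2.8, and the implied control-arm complier mean is about 16.1; small differences from the slide's -2.81 and 16.13 come from working with rounded inputs.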
31
CACE vs. PP
33
Instrumental variables (IV)

Popular in econometrics
Model
  Model of interest: $Y_i = a + b D_i + e_i$
  Error $e_i$ may be correlated with $D_i$ (endogenous)
  Example in econometrics: D is years of education, Y is adult wage, e includes unobserved confounders
We can't estimate b by ordinary linear regression
Instead, we assume error $e_i$ is independent of a third variable, the instrumental variable $R_i$
  i.e. $R_i$ only affects outcome through its effect on $D_i$
  or: randomisation only affects outcome through its effect on treatment actually received

34
IV estimation

Estimation by two-stage least squares: model implies
  $E[Y_i \mid R_i] = a + b \, E[D_i \mid R_i]$
  so first regress $D_i$ on $R_i$ to get $E[D_i \mid R_i]$
  then regress $Y_i$ on $E[D_i \mid R_i]$
NB standard errors not quite correct by this method: general IV uses different standard errors
More generally, we use an estimating equation based on $\sum_i R_i (Y_i - a - b D_i) = 0$
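
The two stages can be carried out by hand with nothing more than simple least squares. The sketch below is an illustration on simulated data, not the ODIN analysis: the helper `ols`, the unobserved confounder `u` and all parameter values (true b = -3) are assumptions. It gives point estimates only; as the slide notes, standard errors from a hand-rolled second stage are not correct.

```python
import random
from statistics import mean

def ols(x, y):
    """Intercept and slope of the least-squares line of y on one regressor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(3)
n = 40_000
r, d, y = [], [], []
for _ in range(n):
    ri = 1 if random.random() < 0.5 else 0          # randomised arm (instrument)
    u = random.gauss(0, 1)                          # unobserved confounder
    # attendance is possible only in arm E; people with high u attend less
    di = 1 if (ri == 1 and u + random.gauss(0, 1) < 0.5) else 0
    yi = 15 - 3 * di + 4 * u + random.gauss(0, 3)   # true causal effect b = -3
    r.append(ri)
    d.append(di)
    y.append(yi)

_, b_naive = ols(d, y)                  # ordinary regression of Y on D: confounded
a1, b1 = ols(r, d)                      # stage 1: regress D on R
d_hat = [a1 + b1 * ri for ri in r]      # fitted E[D | R]
_, b_2sls = ols(d_hat, y)               # stage 2: regress Y on fitted D
_, itt = ols(r, y)                      # difference in arm means
b_ratio = itt / b1                      # simple ratio estimator

print(round(b_naive, 2), round(b_2sls, 2), round(b_ratio, 2))
```

With a single binary instrument the two-stage estimate and the ratio estimator coincide algebraically, which is why the Stata output on the next slide reproduces the earlier CACE estimate. The naive regression is noticeably too negative here because attenders differ from non-attenders in `u`.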

35
Instrumental variables for ODIN

. ivreg bdi6 (treata = z)

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =     317
-------------+------------------------------           F(  1,   315) =    2.64
       Model | -58.5115086     1 -58.5115086           Prob > F      =  0.1049
    Residual |  32532.4232   315  103.277534           R-squared     =       .
-------------+------------------------------           Adj R-squared =       .
       Total |  32473.9117   316  102.765543           Root MSE      =  10.163

------------------------------------------------------------------------------
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -2.803511   1.724143    -1.63   0.105    -6.195802    .5887801
       _cons |   15.15714   .8588927    17.65   0.000     13.46725    16.84703
------------------------------------------------------------------------------
Instrumented:  treata
Instruments:   z

Same estimate as before!
36
Easy to extend to include covariates

. ivreg bdi6 (treata = z) bdi0

Instrumental variables (2SLS) regression

      Source |       SS       df       MS              Number of obs =     317
-------------+------------------------------           F(  2,   314) =   43.26
       Model |  6808.64828     2  3404.32414           Prob > F      =  0.0000
    Residual |  25665.2634   314  81.7365076           R-squared     =  0.2097
-------------+------------------------------           Adj R-squared =  0.2046
       Total |  32473.9117   316  102.765543           Root MSE      =  9.0408

------------------------------------------------------------------------------
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -3.428509   1.539881    -2.23   0.027    -6.458298   -.3987196
        bdi0 |   .5813933   .0630405     9.22   0.000     .4573581    .7054285
       _cons |   2.395561   1.546673     1.55   0.122    -.6475924    5.438714
------------------------------------------------------------------------------
Instrumented:  treata

Usual gain in precision
38
Structural mean model (SMM)

Extends our simple model $Y_i(1) - Y_i(0) = b$
SMM is $E[Y_i^E - Y_i^C \mid D_i^E, D_i^C, X] = b D_i$
where $D_i$ is a summary of treatment thought to have a causal effect, e.g.
  $D_i = D_i^E - D_i^C$: causal effect of treatment is proportional to amount of treatment
  $D_i = (D_i^E - D_i^C, X_i (D_i^E - D_i^C))$ and X is an effect modifier
Goetghebeur and Lapp, 1997 (assumed $D_i^C = 0$)
Estimation is equivalent to instrumental variables with R and RX as instruments
  in other words, we also assume that X does not modify the causal effect of treatment

39
Summary for binary compliance

The principal stratification approach divides
individuals into always-takers, compliers and
never-takers
We can then identify the complier-average causal
effect, provided we make the exclusion
restriction assumption
This works for binary or continuous outcomes
Instrumental variables and structural mean models
approaches lead to the same estimates for
continuous outcomes
For binary outcomes, instrumental variables are
problematic, and generalised structural mean
models are needed (Vansteelandt and Goetghebeur,
2003)

41
Example with missing outcome data

Our IV analyses of ODIN used complete cases only
This is a bad idea
  Follow-up rates were worse in non-attenders (55%) than in attenders (92%)
So we modify the previous analysis
We will now assume the data are missing at random given randomised group and attendance
  e.g. among non-attenders, there is no difference on average between non-responders and responders

42
CACE analysis under MAR
CACE (MAR) = 13.32 - 16.80 = -3.48 (cf. CACE (CC) = 13.32 - 16.13 = -2.81)
43
A more general approach

We can allow for missing data by using inverse probability weights
Suppose a certain group of individuals has only 50% chance of responding
  give each responder in that group a weight of 2
  accounts for their non-responding fellows
In ODIN, we will consider the baseline-adjusted analysis
We will construct weights depending on baseline BDI, randomised group and attendance
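
The "weight of 2" idea can be sketched in a few lines. Note the simplification: the ODIN analysis on the next slide estimates response probabilities with a logistic regression, whereas this toy example simply uses the observed response proportion within each group. The records are invented.

```python
from collections import defaultdict

# (group, responded?, outcome if responded) : invented toy records
records = [
    ("attender", True, 12.0),
    ("attender", True, 14.0),
    ("non-attender", True, 18.0),
    ("non-attender", False, None),
    ("non-attender", True, 16.0),
    ("non-attender", False, None),
]

# Count responders and totals per group
counts = defaultdict(lambda: [0, 0])            # group -> [responders, total]
for group, responded, _ in records:
    counts[group][0] += int(responded)
    counts[group][1] += 1

# Weight = 1 / (estimated probability of responding in that group)
weight = {g: total / resp for g, (resp, total) in counts.items()}

# Only responders contribute outcomes; each stands in for `weight` people
responders = [(weight[g], y) for g, responded, y in records if responded]
weighted_mean = sum(w * y for w, y in responders) / sum(w for w, _ in responders)
unweighted_mean = sum(y for _, y in responders) / len(responders)

print(weight, unweighted_mean, weighted_mean)
```

Here non-attenders respond half the time, so each responding non-attender gets weight 2 and the weights sum to the full sample size of 6. The weighted mean moves towards the non-attenders' outcomes compared with the complete-case mean, which is the correction the slide describes.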

44
Constructing the weights

. logistic resp6 z treata bdi0

Logistic regression                               Number of obs   =        427
                                                  LR chi2(3)      =      49.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -218.70364                       Pseudo R2       =     0.1023

------------------------------------------------------------------------------
       resp6 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |   .4327186   .1102412    -3.29   0.001     .2626333    .7129535
      treata |    10.1753   3.909568     6.04   0.000     4.791789    21.60713
        bdi0 |   .9750455   .0136551    -1.80   0.071     .9486461     1.00218
------------------------------------------------------------------------------

. predict presp
(option pr assumed; Pr(resp6))

. gen wt = 1/presp

45
Examining the weights
[Figure: the weights shown for three groups: therapy non-compliers, control, therapy compliers]
46
Weighted IV analysis

. ivreg bdi6 (treata = z) bdi0 [pw=wt]
(sum of wgt is   4.2710e+02)

Instrumental variables (2SLS) regression               Number of obs =     317
                                                       F(  2,   314) =   37.28
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2183
                                                       Root MSE      =  9.0521

------------------------------------------------------------------------------
             |               Robust
        bdi6 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      treata |  -3.953868   1.944846    -2.03   0.043    -7.780444   -.1272916
        bdi0 |   .5810663   .0680343     8.54   0.000     .4472056     .714927
       _cons |    2.37602   1.554941     1.53   0.128    -.6834003    5.435441
------------------------------------------------------------------------------
Instrumented:  treata
Instruments:   bdi0 z

47
Back to the exclusion restriction

Recall that partial attenders were included as
non-compliers
If instead we include them as compliers, the
exclusion restriction is much more plausible
The estimated causal effect is smaller because it
is an average over a wider group that includes
partial compliers

48
Summary of ODIN results
49
Example with continuous compliance: the SoCRATES trial

SoCRATES was a multi-centre RCT designed to evaluate the effects of cognitive behaviour therapy (CBT) and supportive counselling (SC) on the outcomes of an early episode of schizophrenia.
201 participants were allocated to one of three groups:
  Control: Treatment as Usual (TAU)
  Treatment: TAU plus psychological intervention, either CBT + TAU or SC + TAU
The two treatment groups are combined in our analyses
Outcome: psychotic symptoms score (PANSS) at 18 months

50
SoCRATES ITT results
51
SoCRATES compliance

We have a record of the number of sessions attended
  ranges from 2 to 29 in the intervention group
  0 for all in the control group
We could dichotomise
  e.g. split at the median (17)
  attending <17 sessions is non-compliance
  BUT the exclusion restriction is implausible
Instead, we keep number of sessions as continuous

52
Model for continuous compliance

Structural mean model: $Y_i(1) - Y_i(0) = b D_i(1)$
The causal effect of d sessions is proportional to the number of sessions
  20 sessions are twice as good as 10 sessions
This is an assumption that you have to believe
  Q: is this assumption wrong if individuals continue with sessions until they feel they have achieved an adequate benefit?
Estimation can be done by instrumental variables just as before
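
For a continuous "dose" with no sessions in the control arm, the instrumental-variable estimate of b is again a ratio: the difference in mean outcome between arms divided by the mean number of sessions in the intervention arm. The sketch below illustrates this on simulated data. Every number in it (effect of -0.4 per session, the link between severity and attendance, the outcome scale) is invented and is not taken from SoCRATES.

```python
import random
from statistics import mean

random.seed(4)
n = 40_000
b_true = -0.4                       # invented effect of each session

y_e, y_s, sessions_e = [], [], []
for _ in range(n):
    severity = random.gauss(0, 1)                     # unobserved
    y0 = 60 + 8 * severity + random.gauss(0, 6)       # outcome with no sessions
    if random.random() < 0.5:                         # randomised to intervention
        # hypothetical: more severe patients attend fewer sessions (0 to 29)
        s = min(29, max(0, round(15 - 4 * severity + random.gauss(0, 3))))
        sessions_e.append(s)
        y_e.append(y0 + b_true * s)                   # effect proportional to dose
    else:                                             # control: no sessions
        y_s.append(y0)

# IV (ratio) estimate: ITT effect divided by mean difference in sessions
b_iv = (mean(y_e) - mean(y_s)) / (mean(sessions_e) - 0)

# Naive dose-response slope within the intervention arm, for contrast
ms, my = mean(sessions_e), mean(y_e)
b_naive = (sum((s - ms) * (v - my) for s, v in zip(sessions_e, y_e))
           / sum((s - ms) ** 2 for s in sessions_e))

print(round(b_iv, 3), round(b_naive, 3))
```

Under these assumptions the IV estimate recovers a value close to -0.4, while the within-arm regression of outcome on sessions is badly confounded by severity. The IV estimate is only as good as the proportionality assumption built into the simulation, which is exactly the assumption the slide asks you to question.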

53
IV model in SoCRATES

. ivregress 2sls pant18 (sessions = rg) i.centre pantot [pw=1/presp], small
(sum of wgt is   2.0101e+02)

------------------------------------------------------------------------------
             |               Robust
      pant18 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    sessions |  -.4243381   .1632735    -2.60   0.010    -.7469866   -.1016897
             |
      centre |
          2  |   5.927803   4.013788     1.48   0.142    -2.003934    13.85954
          3  |  -11.32247   2.523946    -4.49   0.000     -16.3101   -6.334842
             |
      pantot |   .4236632    .091294     4.64   0.000      .243255    .6040714
       _cons |   30.27006    7.72171     3.92   0.000     15.01101     45.5291
------------------------------------------------------------------------------
Instrumented:  sessions
Instruments:   2.centre 3.centre pantot rgroup

NB I've used Stata 11 here
Each extra session reduces PANSS by 0.4 points
54
Summary for continuous compliance

There are too many principal strata for the
principal stratification approach to work
Instrumental variables and structural mean models
approaches work for continuous outcomes

56
Practical session

Please work in small groups.
We'll consider the "Down your drink" (DYD) trial:
  internet users seeking help with their drinking were randomised to a new interactive website or control.
  the intervention group's use of the new website is measured by the number of page hits. The mean was 60 hits over a 3-month period.
  outcome: weekly alcohol consumption at 3 months
I will list some possible analyses of this trial, all aiming to estimate the causal effect of treatment. In each case, please
  identify the underlying assumption
  decide how plausible you think that assumption is.

57
Analyses to consider (1)

Regarding those who hit fewer than 60 pages as non-compliers:
  A per-protocol analysis: intervention group compliers compared with the control group
  A CACE analysis: intervention group compliers compared with those members of the control group who would have complied if they had been randomised to intervention
The same, but regarding those who hit fewer than 10 pages as non-compliers
A structural mean model analysis, modelling the causal effect of the intervention as proportional to the number of pages hit

58
Analyses to consider (2)

The control group had access to a different web site, and averaged 30 page hits.
  A per-protocol analysis: intervention group with >60 page hits compared with the control group with >30 page hits
  A SMM analysis: modelling the causal effect of each intervention as proportional to the number of pages hit (with different parameters)
Do you have any other suggestions for the analysis?

59
References

Dowrick C, Dunn G, et al. Problem solving treatment and group psychoeducation for depression: multicentre randomised controlled trial. BMJ 2000; 321: 1450-4.
Goetghebeur E, Lapp K. The effect of treatment compliance in a placebo-controlled trial: regression with unpaired data. JRSS(C) 1997; 46: 351-364.
Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. JASA 1996; 91: 444-455.

60
Suggested further reading

Dunn G et al. Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up. British Journal of Psychiatry 2003; 183: 323-331.
  simple CACE methods
Maracy M, Dunn G. Estimating dose-response effects in psychological treatment trials: the role of instrumental variables. SMiMR 2008.
  IV methods
White IR. Uses and limitations of randomization-based efficacy estimators. SMiMR 2005; 14: 327-347.
  overview of ideas
Fischer-Lapp K, Goetghebeur E. Practical properties of some structural mean analyses of the effect of compliance in randomized trials. Controlled Clinical Trials 1999; 20: 531-546.
  structural mean models