Causal Diagrams: Directed Acyclic Graphs to Understand, Identify, and Control for Confounding


1
Causal Diagrams: Directed Acyclic Graphs to
Understand, Identify, and Control for Confounding

Maya Petersen
PH 250B 11/03/04

2
What is causation?

Ex: We observe a high degree of association
between carrying matches and lung cancer.
Can we infer that carrying matches causes lung
cancer?
The counterfactual definition of causation:
Carrying matches is a cause of lung cancer if
the risk of lung cancer is higher in people who
carry matches than it would be if these exact
same people did not carry matches.

3
Causal diagrams

Intuitive approach to representing our
assumptions about causal relationships
Provide relatively straightforward tool for
relating observed statistical associations and
causal effects
What do we need to know (or assume) before we can
infer that an exposure causes a disease, and get
an unbiased estimate of this effect?

4
Causal diagrams

Today will focus on:
How to draw a causal diagram
Use of causal diagrams to decide:
Is confounding present?
What should we adjust for to get an unbiased
estimate of effect?
Use of causal diagrams to illustrate a situation
where the traditional approach to controlling
confounding (i.e. multivariable adjustment) fails

5
Ex: Constructing a Causal Diagram

We are interested in the effect of maternal
multivitamin use on birth defects, and make the
following causal assumptions:
Prenatal care (PNC) leads to an increase in
vitamin use (as a result of intervention and
education).
Prenatal care protects against birth defects in
ways other than by increasing vitamin use.
Difficulty conceiving may cause a woman to seek
out PNC once she becomes pregnant.
Maternal genetics that lead to difficulty
conceiving can also lead to birth defects.
Socio-economic characteristics (SES) directly
affect both access to PNC and use of vitamins.

6
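Ex: Constructing a Causal Diagram
[Diagram: DAG encoding the assumptions on the
previous slide, plus the effect of interest:
Maternal genetics -> Difficulty conceiving,
Maternal genetics -> Birth Defects,
Difficulty conceiving -> Pre-Natal Care,
SES -> Pre-Natal Care, SES -> Vitamins,
Pre-Natal Care -> Vitamins,
Pre-Natal Care -> Birth Defects,
Vitamins -> Birth Defects]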
7
Directed Acyclic Graph (DAG) construction Basics

Direct causal relationships between variables are
represented by arrows
All causal relationships have a direction,
because a variable cannot be simultaneously a
cause and an effect of the same variable
(Directed)
There are no feedback loops (Acyclic)
There can be no feedback loops because causes
always precede their effects
To avoid feedback loops, extend graph over time

[Diagram: the feedback loop between Malnutrition
and Infection, redrawn over time with the nodes
Malnut. (t0), Malnut. (t1), Infect. (t0),
Infect. (t1)]
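
The acyclicity requirement can be checked mechanically. The sketch below is only an illustration: the function name is_acyclic is made up here, and the arrows in the time-indexed graph are an assumption, since the slide's arrows did not survive in this transcript. It shows that the Malnutrition/Infection feedback loop is not a valid DAG, while a version extended over time is.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph of (cause, effect) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
    # Kahn's algorithm: repeatedly remove nodes with no remaining causes.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes on a cycle never reach indegree 0, so they are never removed.
    return removed == len(nodes)

# Feedback loop drawn without time: not a DAG.
feedback = [("Malnutrition", "Infection"), ("Infection", "Malnutrition")]

# Same relationship extended over time (arrows assumed for illustration).
unrolled = [
    ("Malnut(t0)", "Malnut(t1)"),
    ("Infect(t0)", "Infect(t1)"),
    ("Malnut(t0)", "Infect(t1)"),
    ("Infect(t0)", "Malnut(t1)"),
]

print(is_acyclic(feedback))  # False
print(is_acyclic(unrolled))  # True
```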
8
Directed Acyclic Graph (DAG) construction
Terminology

Parent / Child:
Directly connected by an arrow (no intermediates)
Pre-Natal Care is a parent of Birth Defects
Birth Defects is a child of Pre-Natal Care
Ancestor / Descendant:
Connected by a directed path made up of a series
of arrows
SES is an ancestor of Birth Defects
Birth Defects is a descendant of SES

[Diagram: the vitamin use and birth defects DAG,
with nodes Difficulty conceiving, SES, Maternal
genetics, Pre-Natal Care, Vitamins, Birth Defects]
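
The terminology can be made concrete with a small sketch. The edge list below is one way to write down the assumptions from the "Constructing a Causal Diagram" slide (plus the effect of interest, Vitamins -> Birth Defects); the helper names parents, children, ancestors and descendants are chosen here for illustration.

```python
# (cause, effect) pairs for the vitamin use / birth defects example.
VITAMIN_DAG = [
    ("Maternal genetics", "Difficulty conceiving"),
    ("Maternal genetics", "Birth Defects"),
    ("Difficulty conceiving", "Pre-Natal Care"),
    ("SES", "Pre-Natal Care"),
    ("SES", "Vitamins"),
    ("Pre-Natal Care", "Vitamins"),
    ("Pre-Natal Care", "Birth Defects"),
    ("Vitamins", "Birth Defects"),
]

def parents(edges, node):
    """Variables with an arrow pointing directly into `node`."""
    return {cause for cause, effect in edges if effect == node}

def children(edges, node):
    """Variables that `node` points to directly."""
    return {effect for cause, effect in edges if cause == node}

def _reachable(edges, node, step):
    # Repeatedly apply `step` (parents or children) to collect all
    # variables connected to `node` by a directed path.
    found, stack = set(), [node]
    while stack:
        for nxt in step(edges, stack.pop()):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found

def ancestors(edges, node):
    return _reachable(edges, node, parents)

def descendants(edges, node):
    return _reachable(edges, node, children)

print(sorted(parents(VITAMIN_DAG, "Birth Defects")))
print(sorted(ancestors(VITAMIN_DAG, "Birth Defects")))
print(sorted(descendants(VITAMIN_DAG, "SES")))
```

SES appears among the ancestors of Birth Defects but not among its parents, matching the slide.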
9
Directed Acyclic Graph (DAG) construction
Assumptions

Not all intermediate steps between two variables
need to be represented (depends on level of
detail of the model)
Ex: can represent the effect of smoking on lung
cancer as
Smoking -> Cancer, or
Smoking -> tar -> mutations -> Cancer
Absence of a directed path from X to Y implies
that X has no effect on Y

10
Directed Acyclic Graph (DAG) construction
Assumptions

DAGs assume that all common causes of exposure
and disease of interest are included in causal
diagram
If common causes are unknown, or cannot be
observed, they must still be included
Ex:

[Diagram: DAG with nodes Unmeasured
characteristics (religious beliefs, culture,
lifestyle, etc.), Alcohol Use, Smoking, and
Heart Disease]
11
Ex: What assumptions does the DAG we constructed
make?

SES has no effect on difficulty conceiving
Difficulty conceiving has no effect on maternal
vitamin use, other than through its effect on
seeking prenatal care
SES has no effect on birth defects other than via
its effects on access to prenatal care and on
vitamin use
There are no additional common causes of vitamin
use and birth defects
Etc

[Diagram: the vitamin use and birth defects DAG]
12

Back to our basic problem
What can we say about causal effects, based on
the associations we observe in our data?
Associations between exposure and disease in our
crude data can arise in several ways

13
Crude (unadjusted) associations in our
observational data: 1) Exposure causes disease

A crude association between smoking and cancer
could be due to
Smoking -> Cancer
Smoking -> tar -> mutations -> Cancer
Adjusting for an intermediate in the causal
pathway between exposure and disease removes any
association that results from that pathway
In the second DAG above, if we control for tar
levels, we will block the association between
smoking and cancer
Smoking -> [tar] -> mutations -> Cancer
(tar is adjusted for)
By adjusting for the effects of the exposure, we
will no longer be able to study them

14
Crude (unadjusted) associations in our
observational data: 2) Exposure and disease
share a common cause

A crude association between matches and cancer
could arise even though matches have no causal
effect on cancer: the two are associated because
they have a common cause (smoking)
This is a classic example of confounding
By adjusting for the common cause, the
association is eliminated
Matches are no longer associated with cancer
after we stratify on smoking
This is what we do when we adjust for a
confounder

[Diagram: Smoking -> Matches, Smoking -> Cancer]
15
Yet again: What is confounding?

If the crude association between exposure and
disease is unconfounded, then
All of the association we see between exposure
and disease is due to the effect of exposure on
disease
None of the association between exposure and
disease is due to common causes that they share
(confounding)
In other words: If exposure has no effect on
disease, would we still expect to observe an
association in our data?
If yes -> confounding is present

16
How can we use a DAG to check for presence of
confounding?

Remove all direct effects of the exposure
These are the effects we are interested in. We
want to see if, in their absence, an association
is still present.
Check whether disease and exposure share a common
cause (ancestor)
Does any variable connect E and D by following
only forward pointing arrows?
If E and D have a common cause -> confounding is
present
Any common cause they share will lead to an
association between E and D that is not due to
the effect of E on D

17
Vitamins and Birth Defects: Is confounding
present?

Remove all direct effects of vitamin use
Do exposure and disease share a common cause
(ancestor)?

[Diagram: the vitamin use and birth defects DAG
with the arrows out of Vitamins removed]
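
A minimal sketch of this check, assuming the edge list below is a faithful encoding of the example DAG (the function names are illustrative): remove every arrow leaving the exposure, then list the variables that are ancestors of both exposure and disease.

```python
# (cause, effect) pairs for the vitamin use / birth defects example.
VITAMIN_DAG = [
    ("Maternal genetics", "Difficulty conceiving"),
    ("Maternal genetics", "Birth Defects"),
    ("Difficulty conceiving", "Pre-Natal Care"),
    ("SES", "Pre-Natal Care"),
    ("SES", "Vitamins"),
    ("Pre-Natal Care", "Vitamins"),
    ("Pre-Natal Care", "Birth Defects"),
    ("Vitamins", "Birth Defects"),
]

def ancestors(edges, node):
    """All variables with a directed path into `node`."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for cause, effect in edges:
            if effect == current and cause not in found:
                found.add(cause)
                stack.append(cause)
    return found

def common_causes(edges, exposure, disease):
    """Shared ancestors of exposure and disease once the exposure's
    own effects (arrows leaving it) have been removed."""
    pruned = [(c, e) for c, e in edges if c != exposure]
    return ancestors(pruned, exposure) & ancestors(pruned, disease)

shared = common_causes(VITAMIN_DAG, "Vitamins", "Birth Defects")
print(sorted(shared))
print("confounding present:", bool(shared))
```

Under these assumptions all four other variables are common causes, so the crude association between vitamin use and birth defects is confounded.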
18
How can we use a DAG to decide what variables to
control for in our analysis?

We want to choose a set of variables that, when
adjusted for, will give us an unconfounded
estimate of the effect of exposure on disease
In other words, if the exposure had no effect on
disease, after adjusting for these variables,
exposure and disease will no longer be associated

19
How can two variables become associated?

Review: A crude (unadjusted) association between
exposure (E) and disease (D) can be due to
Causal pathway from E to D (or vice versa)
E -> D or E -> x -> y -> D
Common cause of E and D
By adjusting for (or stratifying on) a third
variable, it is possible to introduce a new
source of non-causal association between E and D
As we begin to adjust for variables in an attempt
to control for confounding, we must take this
potential source of association into account

[Diagram: nodes E, D, and a third variable C]
20
Adjusting for a common effect of two variables
will induce a new association between them (Even
if they were unassociated before adjusting)

Ex:
Being on a diet does not cause cancer (or vice
versa), and dieting and cancer share no common
causes. In our crude data, diet and cancer will
not be associated
Whether or not an individual was on a diet does
not tell us anything about whether or not he/she
has cancer.
If we stratify on weight loss, we can create a
new association between dieting and cancer
Within the stratum of people who lost weight, if
we know an individual was on a diet, it tells us
that he/she is less likely to have cancer
(dieting provides an alternate explanation for
weight loss).

[Diagram: Weight-loss diet -> Weight Loss,
Cancer -> Weight Loss]
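
A small simulation can make this concrete. It is only a sketch: the probabilities below (30% dieting, 10% cancer, and the chance of weight loss rising with either) are invented for illustration and are not from the lecture. Dieting and cancer are generated independently, yet within the stratum of people who lost weight, dieters show a clearly lower cancer risk.

```python
import random

def simulate(n=200_000, seed=0):
    """Generate (diet, cancer, weight_loss) triples.
    Diet and cancer are independent; both raise the chance of weight loss."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        diet = rng.random() < 0.3      # assumed prevalence of dieting
        cancer = rng.random() < 0.1    # assumed prevalence of cancer
        p_loss = 0.05 + 0.5 * diet + 0.4 * cancer  # assumed effect sizes
        weight_loss = rng.random() < p_loss
        rows.append((diet, cancer, weight_loss))
    return rows

def risk_difference(rows):
    """P(cancer | diet) - P(cancer | no diet) within the given rows."""
    def risk(group):
        return sum(cancer for _, cancer, _ in group) / len(group)
    dieters = [r for r in rows if r[0]]
    others = [r for r in rows if not r[0]]
    return risk(dieters) - risk(others)

rows = simulate()
crude = risk_difference(rows)
among_weight_loss = risk_difference([r for r in rows if r[2]])

print(f"crude risk difference:            {crude:+.3f}")
print(f"within the weight-loss stratum:   {among_weight_loss:+.3f}")
```

The crude difference should sit near zero, while the stratified difference is strongly negative: an association created purely by conditioning on the common effect.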
21
Using a DAG to decide what variable to adjust for
in analysis

Ex 1: Is adjusting for prenatal care sufficient
to control for confounding of the effect of
vitamin use on birth defects?

22
Using a DAG to decide what to adjust for in
analysis

Step 1: Is prenatal care caused by vitamin use?
If yes, we should not adjust for it.
Do not adjust for an effect of the exposure of
interest

[Diagram: the vitamin use and birth defects DAG]
23
Using a DAG to decide what to adjust for in
analysis

Step 2: Delete all non-ancestors of vitamin use,
birth defects, and prenatal care
If a variable is not an ancestor of vitamin use
or birth defects, it cannot be a common cause,
and so cannot be a source of crude association
between them
If a variable is not an ancestor of prenatal
care, new associations with that variable cannot
be created by adjusting for prenatal care

[Diagram: the DAG after Step 2; all six variables
remain]
24
Using a DAG to decide what to adjust for in
analysis

Step 3: Delete all direct effects of Vitamins
These are the effects we are interested in. We
want to see if, in their absence, an association
is still present. If it is, we still have
confounding.

[Diagram: the DAG with the arrow from Vitamins to
Birth Defects removed]
25
Using a DAG to decide what to adjust for in
analysis

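Step 4: Connect any two causes sharing a common
effect
Adjustment for the effect will result in
association of its common causes

[Diagram: the graph from Step 3 with added links
between Difficulty conceiving and SES (common
effect: Pre-Natal Care) and between Maternal
genetics and Pre-Natal Care (common effect: Birth
Defects)]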
26
Using a DAG to decide what to adjust for in
analysis

Step 5: Strip arrowheads from all edges
We are moving from a graph that represents causal
effects to a graph that represents the
associations we expect to observe (as a result of
both causal effects and the adjustment process)

[Diagram: the graph from Step 4 with all
arrowheads removed]
27
Using a DAG to decide what to adjust for in
analysis

Step 6: Delete prenatal care
This is equivalent to adjusting for prenatal
care, now that we have added to the graph the new
associations that will be created by adjusting

[Diagram: the graph from Step 5 with Pre-Natal
Care deleted, leaving Difficulty conceiving, SES,
Maternal genetics, Vitamins, Birth Defects]
28
Using a DAG to decide what to adjust for in
analysis

Test: Are Vitamins and Birth Defects still
connected?
Yes: Adjusting for prenatal care is not
sufficient for control of confounding
After adjusting for prenatal care, vitamin use
and birth defects will still be associated in our
data, even if vitamin use has no causal effect on
birth defects

[Diagram: the remaining graph, in which Vitamins
and Birth Defects are still connected through
SES, Difficulty conceiving, and Maternal genetics]
29
Using a DAG to decide what to adjust for in
analysis

Adjustment for which variables would result in
control of confounding?
Our DAG shows that adjusting for any one or more
of the three remaining variables, in addition to
prenatal care, would be sufficient for control of
confounding (e.g. SES and prenatal care)

[Diagram: the graph with SES also deleted, leaving
Difficulty conceiving, Maternal genetics,
Vitamins, Birth Defects]
30
Vitamins and Birth Defects Lessons learned

It may not be immediately intuitive what
variables we need to control for in our analysis
The process of adjustment/stratification can
introduce new sources of association in our data
that must be accounted for in any attempt to
control confounding
Step by step analysis of a DAG provides a
rigorous check of whether we have adequately
controlled for confounding
Adjustment for several different sets of
confounders may each be sufficient to control
confounding of the same exposure-disease
relationship.
Can inform study design

31
DAGs for control of confounding: Summary of Steps

Problem: Is adjustment for/stratification on a
set of confounders C sufficient to control for
confounding of the relationship between E and D?
Step 1: No variables in C should be descendants
of E
Step 2: Delete all non-ancestors of E, D, C
Step 3: Delete all arrows emanating from E
Step 4: Connect any two parents with a common
child
Step 5: Strip arrowheads from all edges
Step 6: Delete C
Test: If E is disconnected from D in the
remaining graph, then adjustment for C is
sufficient to remove confounding

Pearl, J. Causality. Cambridge University Press,
Cambridge UK. 2001. pp. 355-57.
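
The steps above can be sketched in code. This is an illustrative implementation, not code from the lecture or from Pearl; the function names and the edge-list encoding of the vitamin example are assumptions made here.

```python
from itertools import combinations

VITAMIN_DAG = [
    ("Maternal genetics", "Difficulty conceiving"),
    ("Maternal genetics", "Birth Defects"),
    ("Difficulty conceiving", "Pre-Natal Care"),
    ("SES", "Pre-Natal Care"),
    ("SES", "Vitamins"),
    ("Pre-Natal Care", "Vitamins"),
    ("Pre-Natal Care", "Birth Defects"),
    ("Vitamins", "Birth Defects"),
]

def _closure(edges, start, downstream):
    """Start nodes plus everything reachable along (or against) the arrows."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for cause, effect in edges:
            src, dst = (cause, effect) if downstream else (effect, cause)
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def adjustment_is_sufficient(edges, exposure, disease, adjust):
    adjust = set(adjust)
    # Step 1: nothing in C may be a descendant of E.
    if adjust & (_closure(edges, {exposure}, True) - {exposure}):
        return False
    # Step 2: keep only E, D, C and their ancestors.
    keep = _closure(edges, {exposure, disease} | adjust, False)
    edges = [(c, e) for c, e in edges if c in keep and e in keep]
    # Step 3: delete all arrows emanating from E.
    edges = [(c, e) for c, e in edges if c != exposure]
    # Steps 4 and 5: link parents of a common child, then drop directions.
    links = {frozenset(edge) for edge in edges}
    for child in keep:
        pars = [c for c, e in edges if e == child]
        links |= {frozenset(pair) for pair in combinations(pars, 2)}
    # Step 6: delete C (and every link touching it).
    links = {link for link in links if not (link & adjust)}
    # Test: is E still connected to D?
    seen, stack = {exposure}, [exposure]
    while stack:
        node = stack.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return disease not in seen

for c in [{"Pre-Natal Care"}, {"Pre-Natal Care", "SES"}]:
    ok = adjustment_is_sufficient(VITAMIN_DAG, "Vitamins", "Birth Defects", c)
    print(sorted(c), "->", "sufficient" if ok else "not sufficient")
```

On this encoding the function reproduces the worked example: prenatal care alone is not sufficient, but prenatal care together with any one of the other three variables is.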
32
Stratification has its limits

Up till now, you have heard about one way to
remove confounding: adjustment or stratification
on certain variables
But in some situations, there are no variables
you can stratify on to successfully remove
confounding
We will illustrate this using a DAG
In a future lecture, you will hear about a method
you can use in these circumstances (Marginal
Structural Models)

33
A DAG-based illustration of time-dependent
confounding: a situation in which traditional
methods to control for confounding (i.e.
adjustment/stratification) break down

Ex: What variables should we control for to
estimate the effect of antiretroviral therapy on
CD4 count?

34
Ex. Antiretroviral therapy and CD4 count

Question of interest: What is the effect of
antiretroviral therapy on CD4 count?
Study population: A cohort of HIV-infected
individuals
Outcome: CD4 count at the end of the study
Exposure: Antiretroviral therapy (ART) (treated
or not for the entire study period)

35
Ex. Antiretroviral therapy and CD4 count

Sicker individuals (those with lower baseline
CD4 counts at the beginning of the study) are
more likely to be treated with ART
Low baseline CD4 count causes physicians to treat
their patients
CD4 count at baseline also affects CD4 count at
the end of the study

36
Representing these relations in a DAG
[Diagram: CD4 count at beginning of study ->
Exposure (antiretroviral treatment); CD4 count at
beginning of study -> Outcome (CD4 count at the
end of the study); Exposure -> Outcome, labeled
"Causal effect of interest"]
37
Simple confounding

CD4 count at baseline is a confounder
If we don't adjust for baseline CD4 count, we
will underestimate the effect of ART on
preserving final CD4 count
Sicker people (those with lower initial counts)
will be overrepresented among those who get
treated
We can see this in the DAG: we must adjust for
baseline CD4, or ART and final CD4 will still be
connected once we delete our causal effect of
interest
Final CD4 and ART share a common cause (baseline
CD4)
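
A simulated cohort illustrates the direction of this bias. All numbers are invented for illustration (a true ART benefit of 100 cells, treatment probabilities of 0.8 and 0.2 for low and high baseline CD4); none come from the lecture.

```python
import random
from statistics import mean

TRUE_EFFECT = 100  # assumed benefit of ART on final CD4 count

def simulate(n=100_000, seed=1):
    """Generate (low_baseline, treated, final_cd4) for a made-up cohort."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        low_baseline = rng.random() < 0.5
        # Sicker patients (low baseline CD4) are more likely to be treated.
        treated = rng.random() < (0.8 if low_baseline else 0.2)
        # Baseline CD4 also drives final CD4, independently of treatment.
        final_cd4 = ((300 if low_baseline else 600)
                     + TRUE_EFFECT * treated
                     + rng.gauss(0, 50))
        rows.append((low_baseline, treated, final_cd4))
    return rows

def mean_difference(rows):
    """Mean final CD4 among treated minus mean among untreated."""
    treated = [y for _, a, y in rows if a]
    untreated = [y for _, a, y in rows if not a]
    return mean(treated) - mean(untreated)

rows = simulate()
crude = mean_difference(rows)
# Stratify on baseline CD4, then average the stratum-specific estimates
# (the two strata are equally common in this simulated cohort).
adjusted = mean(
    mean_difference([r for r in rows if r[0] == stratum])
    for stratum in (True, False)
)

print(f"crude estimate:    {crude:8.1f}")
print(f"adjusted estimate: {adjusted:8.1f}")
```

With these made-up parameters the crude estimate is far below the true effect (it even turns negative), while the estimate stratified on baseline CD4 is close to 100.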

38
Representing these relations in a DAG
[Diagram: the same DAG, with CD4 count at
beginning of study labeled "Confounder" of the
relationship between Exposure (antiretroviral
treatment) and Outcome (CD4 count at the end of
the study)]
39
Antiretroviral therapy and CD4 count: A more
realistic example

Same study population and outcome
Cohort of HIV-infected individuals
Outcome is final CD4 count
Now, an individual can change treatment status
during the course of follow-up
E.g. an individual who is not treated at the
beginning of the study (t0) may go on treatment
partway through the study (e.g. t1)
CD4 is also measured during the course of
follow-up

40
DAG: Expanded to incorporate changing treatment
over time
[Diagram: DAG with nodes CD4 count at beginning of
study (t0) (labeled "Baseline confounder"), CD4
count partway through study (t1), Antiretroviral
treatment at t0, Antiretroviral treatment at t1,
and Y (final CD4 count); the arrows from treatment
to Y are labeled "Causal effect of interest"]
41
Something is missing.

Our effect of interest is how antiretroviral
treatment throughout the study (e.g. t0 and t1)
affects final CD4 count
We have left out an important causal relationship
in the previous DAG!
Antiretroviral treatment at baseline affects
intermediate CD4 counts (e.g. CD4 measured at
t1), which in turn affect final CD4 counts
This is part of our causal effect of interest!

42
Filling in the DAG
[Diagram: the previous DAG with an added arrow
from Antiretroviral treatment at t0 to CD4 count
partway through study (t1)]
43
Something is still missing

CD4 count at t1 will also affect subsequent
treatment (ART at t1)
Note: we take the convention that CD4(t) is
measured before ART(t)
Patients with lower CD4 counts at t1 are more
likely to start ART partway through the study
A patient getting sicker causes his/her physician
to start them on treatment

44
Filling in the DAG
[Diagram: the previous DAG with an added arrow
from CD4 count partway through study (t1) to
Antiretroviral treatment at t1]
45
What does this DAG tell us about what we need to
adjust for to control confounding?
46
Using the DAG to decide what we need to control
for

We can't adjust for anything that is a descendant
of (caused by) ART
Rules out CD4 at t1
Delete all non-ancestors of exposure, disease,
and things we are considering adjusting for
N/A: Everything in the current graph is an
ancestor of outcome or exposure

[Diagram: the full DAG with nodes CD4 count at t0,
CD4 count at t1, Antiretroviral treatment at t0,
Antiretroviral treatment at t1, and Y (final CD4
count)]
47
Using the DAG to decide what we need to control
for

Delete any arrows from ART
Connect parents sharing a common child
N/A: Already connected

[Diagram: the DAG with all arrows leaving
Antiretroviral treatment at t0 and at t1 removed]
48
Using the DAG to decide what we need to control
for

Strip arrowheads
What can we delete that will leave ART and final
CD4 unconnected?
Remember: CD4 at t1 is not an option, since ART
at t0 affects it

[Diagram: the previous graph with all arrowheads
removed]
49
A Dilemma

From our analysis of the DAG it is clear that if
we don't adjust for CD4 at t1, we fail to
control for confounding
But we know we cannot adjust for a variable
affected by our exposure of interest
Adjusting for CD4 at t1 would be equivalent to
adjusting for part of our causal effect of
interest
We would again fail to correctly estimate the
total effect of ART on final CD4, because we
would lose the component of the effect mediated
by early changes in CD4

50
Adjusting for a variable on the causal pathway of
interest
[Diagram: the full DAG, with CD4 count at
beginning of study (t0) labeled "Baseline
confounder: could include it in a traditional
multivariable model" and CD4 count partway through
study (t1) labeled "Time-dependent confounder";
the arrows from Antiretroviral treatment at t0 and
t1 to Y (final CD4 count) are labeled "Causal
effect of interest"]
51
Time-dependent confounding

Time-dependent confounder: A covariate that is
predictive of subsequent exposure, is an
independent risk factor for the outcome, and is
itself affected by prior exposure
If we don't adjust for the covariate, we get bias
due to confounding
If we do adjust, we fail to estimate the causal
effect we are interested in, because we are
adjusting for part of our effect of interest
You will see more of this problem, and hear about
some ways to address it (e.g. Marginal Structural
Models)
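
The dilemma can be checked by applying the same graphical steps to every candidate adjustment set. This sketch mirrors the slides' walkthrough, treating ART at t0 and t1 jointly as the exposure. The edge list is an assumption reconstructed from the text (the original arrows are not preserved in this transcript), and the function names are illustrative.

```python
from itertools import combinations

# Assumed arrows for the time-dependent ART example.
EDGES = [
    ("CD4_t0", "ART_t0"), ("CD4_t0", "CD4_t1"),
    ("ART_t0", "CD4_t1"), ("CD4_t1", "ART_t1"),
    ("ART_t0", "ART_t1"), ("CD4_t1", "Y"),
    ("ART_t0", "Y"), ("ART_t1", "Y"),
]
EXPOSURES = {"ART_t0", "ART_t1"}

def closure(edges, start, downstream):
    """Start nodes plus everything reachable along (or against) the arrows."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for cause, effect in edges:
            src, dst = (cause, effect) if downstream else (effect, cause)
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def sufficient(edges, exposures, outcome, adjust):
    exposures, adjust = set(exposures), set(adjust)
    # Never adjust for a variable affected by the exposure.
    if adjust & (closure(edges, exposures, True) - exposures):
        return False
    # Keep ancestors only, and delete arrows leaving the exposure.
    keep = closure(edges, exposures | adjust | {outcome}, False)
    kept = [(c, e) for c, e in edges if e in keep and c not in exposures]
    # Connect parents of a common child, then strip arrowheads.
    links = {frozenset(edge) for edge in kept}
    for child in keep:
        pars = [c for c, e in kept if e == child]
        links |= {frozenset(pair) for pair in combinations(pars, 2)}
    # Delete the adjustment set, then test connectivity.
    links = {link for link in links if not (link & adjust)}
    seen, stack = set(exposures), list(exposures)
    while stack:
        node = stack.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return outcome not in seen

candidates = [set(), {"CD4_t0"}, {"CD4_t1"}, {"CD4_t0", "CD4_t1"}]
for c in candidates:
    print(sorted(c), "->", sufficient(EDGES, EXPOSURES, "Y", c))
```

Under these assumptions every candidate set fails: leaving out CD4 at t1 leaves ART at t1 connected to the outcome, and including it means adjusting for a descendant of ART at t0.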

52
Conclusions

Today we have outlined the steps to
Construct a DAG, based on knowledge/assumptions
Use a DAG to decide if confounding is present
Use a DAG to decide what variables to control for
in analysis
We have also used a DAG to illustrate a situation
where traditional methods for controlling
confounding are not adequate (time-dependent
confounding)

53
References

Pearl J. Causality: Models, Reasoning, and
Inference. Cambridge University Press, Cambridge
UK. 2001.
Jewell NP. Statistics for Epidemiology. Chapman &
Hall/CRC, USA. 2004: 102-112.
Greenland S. Causal Diagrams for Epidemiologic
Research. Epidemiology, 1999 Jan; 10(3): 37-48.
Robins JM. Data, design, and background knowledge
in etiologic inference. Epidemiology,
2002; 11: 313-320.
Hernan M, et al. Causal knowledge as a
prerequisite for confounding evaluation: an
application to birth defects epidemiology. Am J
Epidemiol, 2002; 155(2): 176-184.

54
Example DAG from Maya's research
55
Example from Maya's research

Effect of interest: Effect of observed viral
mutation profile (presence of specific mutations)
on viral load (i.e. response to treatment)
DAG reveals that adjustment for treatment history
is sufficient
