Causal Diagrams: Directed Acyclic Graphs to Understand, Identify, and Control for Confounding

1
Causal Diagrams: Directed Acyclic Graphs to
Understand, Identify, and Control for Confounding
  • Maya Petersen
  • PH 250B 11/03/04

2
What is causation?
  • Ex: We observe a high degree of association
    between carrying matches and lung cancer
  • Can we infer that carrying matches causes lung
    cancer?
  • The counterfactual definition of causation:
  • Carrying matches is a cause of lung cancer if
    the risk of lung cancer is higher in people who
    carry matches than it would be if these exact
    same people did not carry matches
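
One way to write this definition down, using counterfactual notation that is not on the original slide and is introduced here only as an assumption: let A = 1 for carrying matches and let Y^{a} be the lung cancer outcome a person would have if match-carrying were set to a. The first line is the causal claim (same people, two scenarios); the second is the association we can actually observe (two different groups of people).

```latex
\begin{align*}
\text{causation:}   \quad & P(Y^{a=1} = 1 \mid A = 1) > P(Y^{a=0} = 1 \mid A = 1) \\
\text{association:} \quad & P(Y = 1 \mid A = 1) > P(Y = 1 \mid A = 0)
\end{align*}
```

The two statements coincide only if the non-carriers' risk is a valid stand-in for what the carriers' risk would have been without matches.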

3
Causal diagrams
  • Intuitive approach to representing our
    assumptions about causal relationships
  • Provide relatively straightforward tool for
    relating observed statistical associations and
    causal effects
  • What do we need to know (or assume) before we can
    infer that an exposure causes a disease, and get
    an unbiased estimate of this effect?

4
Causal diagrams
  • Today we will focus on:
  • How to draw a causal diagram
  • Use of causal diagrams to decide:
  • Is confounding present?
  • What should we adjust for to get an unbiased
    estimate of effect?
  • Causal diagrams to illustrate a situation where
    the traditional approach to controlling
    confounding (i.e. multivariable adjustment) fails

5
Ex: Constructing a Causal Diagram
  • We are interested in the effect of maternal
    multivitamin use on birth defects, and make the
    following causal assumptions:
  • Prenatal care (PNC) leads to an increase in
    vitamin use (as a result of intervention and
    education).
  • Prenatal care protects against birth defects in
    ways other than by increasing vitamin use.
  • Difficulty conceiving may cause a woman to seek
    out PNC once she becomes pregnant
  • Maternal genetics that lead to difficulty
    conceiving can also lead to birth defects.
  • Socio-economic characteristics directly affect
    both access to PNC and use of vitamins
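
A minimal sketch of how these assumptions could be written down as a graph, assuming a plain Python dictionary as the data structure (the variable names and the `edges` helper are illustrative, not part of the lecture). Each stated assumption becomes one or two arrows, and the effect under study is added as the last arrow.

```python
# Each key is a cause; its list holds the variables it directly affects.
# Arrows follow the five stated assumptions, plus the effect under study.
dag = {
    "Maternal genetics": ["Difficulty conceiving", "Birth defects"],
    "Difficulty conceiving": ["Prenatal care"],
    "SES": ["Prenatal care", "Vitamins"],
    "Prenatal care": ["Vitamins", "Birth defects"],
    "Vitamins": ["Birth defects"],  # effect of interest
    "Birth defects": [],
}


def edges(graph):
    """Return the arrows of the diagram as (cause, effect) pairs."""
    return [(cause, effect) for cause, effects in graph.items() for effect in effects]


for cause, effect in edges(dag):
    print(f"{cause} -> {effect}")
```

Writing the diagram this way makes the assumptions explicit: any arrow that is not listed is a claim that no direct effect exists.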

6
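Ex: Constructing a Causal Diagram
[Figure: DAG with arrows Maternal genetics -> Difficulty conceiving, Maternal genetics -> Birth Defects, Difficulty conceiving -> Pre-Natal Care, SES -> Pre-Natal Care, SES -> Vitamins, Pre-Natal Care -> Vitamins, Pre-Natal Care -> Birth Defects, and Vitamins -> Birth Defects (the effect of interest)]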
7
Directed Acyclic Graph (DAG) construction: Basics
  • Direct causal relationships between variables are
    represented by arrows
  • All causal relationships have a direction,
    because a variable cannot be simultaneously a
    cause and an effect of the same variable (Directed)
  • There are no feedback loops (Acyclic)
  • There can be no feedback loops because causes
    always precede their effects
  • To avoid feedback loops, extend graph over time
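
The acyclicity requirement can be checked mechanically. The sketch below uses a standard topological-sort check (Kahn's algorithm), which is a choice made here and not something the lecture specifies. The arrows in the time-extended graph are an assumption, since the slide's figure is not preserved in this transcript.

```python
def is_acyclic(graph):
    """Return True if the directed graph has no feedback loop (Kahn's algorithm)."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    indegree = {n: 0 for n in nodes}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in graph.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # Every node can be removed only if no cycle is present.
    return removed == len(nodes)


# A feedback loop: malnutrition and infection cause each other.
loop = {"Malnutrition": ["Infection"], "Infection": ["Malnutrition"]}

# The same story extended over time (arrow choices are an assumption):
# each variable at t0 affects both variables at t1.
unrolled = {
    "Malnut. (t0)": ["Malnut. (t1)", "Infect. (t1)"],
    "Infect. (t0)": ["Infect. (t1)", "Malnut. (t1)"],
}

print(is_acyclic(loop), is_acyclic(unrolled))
```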

[Figure: the feedback loop between Malnutrition and Infection is redrawn over time using the nodes Malnut. (t0), Malnut. (t1), Infect. (t0), and Infect. (t1)]
8
Directed Acyclic Graph (DAG) construction:
Terminology
  • Parent / Child
  • Directly connected by an arrow (no intermediates)
  • Pre-Natal Care is a parent of Birth Defects
  • Birth Defects is a child of Pre-Natal Care
  • Ancestor / Descendant
  • Connected by a directed path made up of a series
    of arrows
  • SES is an ancestor of Birth Defects
  • Birth Defects is a descendant of SES
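
These four terms translate directly into small graph queries. A sketch, assuming the vitamins DAG is stored as a dictionary of cause -> list of direct effects (the function names are illustrative):

```python
dag = {
    "Maternal genetics": ["Difficulty conceiving", "Birth defects"],
    "Difficulty conceiving": ["Prenatal care"],
    "SES": ["Prenatal care", "Vitamins"],
    "Prenatal care": ["Vitamins", "Birth defects"],
    "Vitamins": ["Birth defects"],
    "Birth defects": [],
}


def parents(graph, node):
    """Variables with an arrow pointing directly into `node`."""
    return {p for p, children in graph.items() if node in children}


def descendants(graph, node):
    """Variables reachable from `node` along a directed path."""
    found, stack = set(), list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


def ancestors(graph, node):
    """Variables from which `node` can be reached along a directed path."""
    return {n for n in graph if node in descendants(graph, n)}


print(parents(dag, "Birth defects"))
print(ancestors(dag, "Birth defects"))
```

SES shows up among the ancestors of birth defects but not among its parents, which matches the terminology above.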

[Figure: DAG with nodes Difficulty conceiving, SES, Maternal genetics, Pre-Natal Care, Vitamins, and Birth Defects]
9
Directed Acyclic Graph (DAG) construction:
Assumptions
  • Not all intermediate steps between two variables
    need to be represented (depends on level of
    detail of the model)
  • Ex: We can represent the effect of smoking on lung
    cancer as
  • Smoking -> Cancer or
  • Smoking -> tar -> mutations -> Cancer
  • Absence of a directed path from X to Y implies
    that X has no effect on Y

10
Directed Acyclic Graph (DAG) construction:
Assumptions
  • DAGs assume that all common causes of exposure
    and disease of interest are included in causal
    diagram
  • If common causes are unknown, or cannot be
    observed, they must still be included
  • Ex:

[Figure: DAG with nodes Unmeasured characteristics (religious beliefs, culture, lifestyle, etc.), Alcohol Use, Heart Disease, and Smoking]
11
Ex: What assumptions does the DAG we constructed
make?
  • SES has no effect on difficulty conceiving
  • Difficulty conceiving has no effect on maternal
    vitamin use, other than through its effect on
    seeking prenatal care
  • SES has no effect on birth defects other than via
    its effects on access to prenatal care and on
    vitamin use
  • There are no additional common causes of vitamin
    use and birth defects
  • Etc

[Figure: DAG with nodes Difficulty conceiving, SES, Maternal genetics, Pre-Natal Care, Vitamins, and Birth Defects]
12
  • Back to our basic problem:
  • What can we say about causal effects, based on
    the associations we observe in our data?
  • Associations between exposure and disease in our
    crude data can arise in several ways

13
Crude (unadjusted) associations in our
observational data: 1) Exposure causes disease
  • A crude association between smoking and cancer
    could be due to:
  • Smoking -> Cancer
  • Smoking -> tar -> mutations -> Cancer
  • Adjusting for an intermediate in the causal
    pathway between exposure and disease removes any
    association that results from that pathway
  • In the DAG above, if we control for tar levels,
    we will block the association between smoking and
    cancer
  • Smoking -> [tar] -> mutations -> Cancer (tar
    adjusted for)
  • By adjusting for the effects of the exposure, we
    will no longer be able to study them
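
A small exact calculation can illustrate this. The sketch assumes the chain Smoking -> tar -> Cancer with made-up probabilities (every number below is hypothetical) and compares the crude risk difference with the risk difference within strata of tar.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
p_smoking = F(3, 10)
p_tar = {1: F(9, 10), 0: F(1, 10)}      # P(high tar | smoking status)
p_cancer = {1: F(1, 5), 0: F(1, 50)}    # P(cancer | tar level); smoking acts only via tar


def bern(p, value):
    """Probability that a 0/1 variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


# Joint distribution over (smoking, tar, cancer) implied by the chain.
joint = {
    (s, t, c): bern(p_smoking, s) * bern(p_tar[s], t) * bern(p_cancer[t], c)
    for s, t, c in product((0, 1), repeat=3)
}


def cancer_risk(smoking, tar=None):
    """P(cancer = 1 | smoking [, tar]) computed from the joint distribution."""
    cells = [k for k in joint if k[0] == smoking and (tar is None or k[1] == tar)]
    return sum(joint[k] for k in cells if k[2] == 1) / sum(joint[k] for k in cells)


crude_rd = cancer_risk(1) - cancer_risk(0)
rd_within_tar = {t: cancer_risk(1, t) - cancer_risk(0, t) for t in (0, 1)}
print(crude_rd, rd_within_tar)
```

Under these assumptions the crude risk difference is positive, while within each tar stratum it is exactly zero: the whole effect of smoking has been adjusted away.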

14
Crude (unadjusted) associations in our
observational data: 2) Exposure and disease
share a common cause
  • A crude association between matches and cancer
    could be due to:
  • Matches have no causal effect on cancer, but the
    two are associated because they have a common
    cause (smoking)
  • This is a classic example of confounding
  • By adjusting for the common cause, the association
    is eliminated
  • Matches are no longer associated with cancer
    after we stratify on smoking
  • This is what we do when we adjust for a
    confounder
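
The same kind of exact calculation shows confounding appearing and disappearing. This is a sketch with hypothetical probabilities in which smoking affects both match-carrying and cancer, and matches have no effect on cancer at all.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
p_smoking = F(3, 10)
p_matches = {1: F(4, 5), 0: F(1, 10)}     # P(carries matches | smoking status)
p_cancer = {1: F(3, 20), 0: F(1, 100)}    # P(cancer | smoking status); matches play no role


def bern(p, value):
    """Probability that a 0/1 variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


# Joint distribution over (smoking, matches, cancer) implied by the DAG.
joint = {
    (s, m, c): bern(p_smoking, s) * bern(p_matches[s], m) * bern(p_cancer[s], c)
    for s, m, c in product((0, 1), repeat=3)
}


def cancer_risk(matches, smoking=None):
    """P(cancer = 1 | matches [, smoking]) computed from the joint distribution."""
    cells = [k for k in joint if k[1] == matches and (smoking is None or k[0] == smoking)]
    return sum(joint[k] for k in cells if k[2] == 1) / sum(joint[k] for k in cells)


crude_rd = cancer_risk(1) - cancer_risk(0)
stratified_rd = {s: cancer_risk(1, s) - cancer_risk(0, s) for s in (0, 1)}
print(float(crude_rd), stratified_rd)
```

With these numbers the crude risk difference is positive even though matches do nothing, and it is exactly zero among smokers and among non-smokers.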

[Figure: DAG in which Smoking -> Matches and Smoking -> Cancer, with no arrow between Matches and Cancer]
15
Yet again: What is confounding?
  • If the crude association between exposure and
    disease is unconfounded, then:
  • All of the association we see between exposure
    and disease is due to the effect of exposure on
    disease
  • None of the association between exposure and
    disease is due to common causes that they share
    (confounding)
  • In other words: If exposure has no effect on
    disease, would we still expect to observe an
    association in our data?
  • If yes -> confounding is present

16
How can we use a DAG to check for presence of
confounding?
  • Remove all direct effects of the exposure
  • These are the effects we are interested in. We
    want to see if, in their absence, an association
    is still present.
  • Check whether disease and exposure share a common
    cause (ancestor)
  • Is there any variable from which both E and D
    can be reached by following only forward-pointing
    arrows?
  • If E and D have a common cause -> confounding is
    present
  • Any common cause they share will lead to an
    association between E and D that is not due to
    the effect of E on D
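
The two-step check above can be sketched in a few lines. The function name `common_causes` and the dictionary representation (cause -> list of direct effects) are assumptions made for this illustration.

```python
def descendants(graph, node):
    """All variables reachable from `node` by following arrows forward."""
    found, stack = set(), list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


def common_causes(graph, exposure, disease):
    """List variables with forward paths to both E and D once E's effects are removed."""
    # Step 1: remove all direct effects of the exposure.
    pruned = {n: ([] if n == exposure else list(children)) for n, children in graph.items()}
    # Step 2: keep every other variable that is an ancestor of both E and D.
    return sorted(
        n for n in pruned
        if n not in (exposure, disease)
        and {exposure, disease} <= descendants(pruned, n)
    )


matches_dag = {"Smoking": ["Matches", "Cancer"], "Matches": [], "Cancer": []}
print(common_causes(matches_dag, "Matches", "Cancer"))
```

A non-empty result means confounding is present; for the matches example the only common cause found is smoking.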

17
Vitamins and Birth Defects: Is confounding
present?
  • Remove all direct effects of vitamin use
  • Do exposure and disease share a common cause
    (ancestor)?

[Figure: DAG with nodes Difficulty conceiving, SES, Maternal genetics, Pre-Natal Care, Vitamins, and Birth Defects]
18
How can we use a DAG to decide what variables to
control for in our analysis?
  • We want to choose a set of variables that, when
    adjusted for, will give us an unconfounded
    estimate of the effect of exposure on disease
  • In other words, if the exposure had no effect on
    disease, after adjusting for these variables,
    exposure and disease will no longer be associated

19
How can two variables become associated?
  • Review: A crude (unadjusted) association between
    exposure (E) and disease (D) can be due to:
  • Causal pathway from E to D (or vice versa)
  • E -> D or E -> x -> y -> D
  • Common cause of E and D
  • By adjusting (or stratifying) on a third
    variable, it is possible to introduce a new
    source of non-causal association (confounding)
    between E and D
  • As we begin to adjust for variables in attempt to
    control for confounding, we must take this
    potential source of association into account

[Figure: DAG with nodes C, D, and E]
20
Adjusting for a common effect of two variables
will induce a new association between them (Even
if they were unassociated before adjusting)
  • Ex:
  • Being on a diet does not cause cancer (or vice
    versa), and dieting and cancer share no common
    causes. In our crude data, diet and cancer will
    not be associated
  • Whether or not an individual was on a diet does
    not tell us anything about whether or not he/she
    has cancer.
  • If we stratify on weight loss, we can create a
    new association between dieting and cancer
  • Within the stratum of people who lost weight, if
    we know an individual was on a diet, it tells us
    that he/she is less likely to have cancer
    (dieting provides an alternate explanation for
    weight loss).
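
An exact calculation with hypothetical numbers can make this concrete. The sketch assumes dieting and cancer are independent and that both raise the chance of weight loss; all probabilities are invented for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
p_diet = F(1, 5)
p_cancer = F(1, 20)                       # independent of dieting
p_weight_loss = {                         # P(weight loss | diet, cancer)
    (0, 0): F(1, 20),
    (1, 0): F(3, 5),
    (0, 1): F(1, 2),
    (1, 1): F(4, 5),
}


def bern(p, value):
    """Probability that a 0/1 variable with P(1) = p takes `value`."""
    return p if value == 1 else 1 - p


# Joint distribution over (diet, cancer, weight loss).
joint = {
    (d, c, w): bern(p_diet, d) * bern(p_cancer, c) * bern(p_weight_loss[(d, c)], w)
    for d, c, w in product((0, 1), repeat=3)
}


def cancer_risk(diet, weight_loss=None):
    """P(cancer = 1 | diet [, weight loss]) computed from the joint distribution."""
    cells = [k for k in joint if k[0] == diet and (weight_loss is None or k[2] == weight_loss)]
    return sum(joint[k] for k in cells if k[1] == 1) / sum(joint[k] for k in cells)


crude_rd = cancer_risk(1) - cancer_risk(0)
rd_among_weight_losers = cancer_risk(1, 1) - cancer_risk(0, 1)
print(crude_rd, float(rd_among_weight_losers))
```

The crude risk difference is exactly zero, but among people who lost weight the dieters have a lower cancer risk than the non-dieters, purely as a result of stratifying on the common effect.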

[Figure: DAG in which Weight-loss diet -> Weight Loss <- Cancer]
21
Using a DAG to decide what variable to adjust for
in analysis
  • Ex 1: Is adjusting for prenatal care sufficient
    to control for confounding of the effect of
    vitamin use on birth defects?

22
Using a DAG to decide what to adjust for in
analysis
  • Step 1: Is prenatal care caused by vitamin use?
    If yes, we should not adjust for it.
  • Do not adjust for an effect of the exposure of
    interest

[Figure: DAG with nodes Difficulty conceiving, SES, Pre-Natal Care, Maternal genetics, Vitamins, and Birth Defects]
23
Using a DAG to decide what to adjust for in
analysis
  • Step 2: Delete all variables that are not
    ancestors of vitamin use, birth defects, or
    prenatal care
  • If a variable is not an ancestor of vitamin use
    or birth defects, it cannot be a common cause,
    and so cannot be a source of crude association
    between them
  • If a variable is not an ancestor of prenatal
    care, new associations with that variable cannot
    be created by adjusting for prenatal care

[Figure: DAG with nodes Difficulty conceiving, SES, Maternal genetics, Pre-Natal Care, Vitamins, and Birth Defects]
24
Using a DAG to decide what to adjust for in
analysis
  • Step 3: Delete all direct effects of Vitamins
  • These are the effects we are interested in. We
    want to see if, in their absence, an association
    is still present. If it is, we still have
    confounding.

[Figure: DAG with nodes Difficulty conceiving, SES, Pre-Natal Care, Maternal genetics, Vitamins, and Birth Defects]
25
Using a DAG to decide what to adjust for in
analysis
  • Step 4: Connect any two causes sharing a common
    effect
  • Adjustment for the effect will result in
    association of its common causes

[Figure: graph with nodes Difficulty conceiving, SES, Pre-Natal Care, Maternal genetics, Vitamins, and Birth Defects]
26
Using a DAG to decide what to adjust for in
analysis
  • Step 5: Strip arrowheads from all edges
  • We are moving from a graph that represents causal
    effects, to a graph that represents the
    associations we expect to observe (as a result of
    both causal effects and the adjustment process)

[Figure: undirected graph with nodes Difficulty conceiving, SES, Pre-Natal Care, Maternal genetics, Vitamins, and Birth Defects]
27
Using a DAG to decide what to adjust for in
analysis
  • Step 6: Delete prenatal care
  • This is equivalent to adjusting for prenatal
    care, now that we have added to the graph the new
    associations that will be created by adjusting

[Figure: undirected graph with nodes Difficulty conceiving, SES, Maternal genetics, Vitamins, and Birth Defects]
28
Using a DAG to decide what to adjust for in
analysis
  • Test: Are Vitamins and Birth Defects still
    connected?
  • Yes: Adjusting for Prenatal Care is not
    sufficient for control of confounding
  • After adjusting for prenatal care, vitamin use
    and birth defects will still be associated in our
    data, even if vitamin use has no causal effect on
    birth defects

[Figure: undirected graph with nodes Difficulty conceiving, SES, Maternal genetics, Vitamins, and Birth Defects]
29
Using a DAG to decide what to adjust for in
analysis
  • Adjustment for which variables would result in
    control of confounding?
  • Our DAG shows that adjusting for any one or more
    of the three remaining variables, in addition to
    prenatal care, would be sufficient for control of
    confounding (e.g. SES and prenatal care)

[Figure: undirected graph with nodes Difficulty conceiving, Maternal genetics, Vitamins, and Birth Defects]
30
Vitamins and Birth Defects: Lessons learned
  • It may not be immediately intuitive what
    variables we need to control for in our analysis
  • The process of adjustment/stratification can
    introduce new sources of association in our data
    that must be accounted for in any attempt to
    control confounding
  • Step-by-step analysis of a DAG provides a
    rigorous check of whether we have adequately
    controlled for confounding
  • Adjustment for several different sets of
    confounders may each be sufficient to control
    confounding of the same exposure-disease
    relationship.
  • Can inform study design

31
DAGs for control of confounding: Summary of Steps
  • Problem: Is adjustment for/stratification on a
    set of confounders C sufficient to control for
    confounding of the relationship between E and D?
  • No variables in C should be descendants of E
  • Delete all non-ancestors of E, D, and C
  • Delete all arrows emanating from E
  • Connect any two parents with a common child
  • Strip arrowheads from all edges
  • Delete C
  • Test: If E is disconnected from D in the
    remaining graph, then adjustment for C is
    sufficient to remove confounding

Pearl, J. Causality. Cambridge University Press,
Cambridge UK. 2001. pp. 355-57.
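
The summary steps can be sketched as a short function. This is an illustrative implementation, not code from the lecture or from Pearl: the name `is_sufficient` and the dictionary representation (cause -> list of direct effects) are assumptions, and the DAG is the vitamins example as described in the earlier slides.

```python
from itertools import combinations


def descendants(graph, node):
    """All variables reachable from `node` by following arrows forward."""
    found, stack = set(), list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


def is_sufficient(graph, exposure, disease, adjust):
    """Apply the summary steps; `adjust` is the candidate adjustment set C."""
    adjust = set(adjust)
    # No variable in C may be a descendant of E.
    if adjust & descendants(graph, exposure):
        return False
    # Delete all non-ancestors of E, D, and C.
    targets = {exposure, disease} | adjust
    keep = targets | {n for n in graph if descendants(graph, n) & targets}
    # Delete all arrows emanating from E.
    arrows = [(a, b) for a in keep if a != exposure
              for b in graph.get(a, []) if b in keep]
    # Connect parents of a common child, then strip arrowheads.
    links = {frozenset(arrow) for arrow in arrows}
    for child in keep:
        child_parents = [a for a, b in arrows if b == child]
        links |= {frozenset(pair) for pair in combinations(child_parents, 2)}
    # Delete C.
    links = {link for link in links if not link & adjust}
    # Test: is E still connected to D in what remains?
    reached, frontier = {exposure}, [exposure]
    while frontier:
        node = frontier.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return disease not in reached


vitamins_dag = {
    "Maternal genetics": ["Difficulty conceiving", "Birth defects"],
    "Difficulty conceiving": ["Prenatal care"],
    "SES": ["Prenatal care", "Vitamins"],
    "Prenatal care": ["Vitamins", "Birth defects"],
    "Vitamins": ["Birth defects"],
    "Birth defects": [],
}

print(is_sufficient(vitamins_dag, "Vitamins", "Birth defects", {"Prenatal care"}))
print(is_sufficient(vitamins_dag, "Vitamins", "Birth defects", {"Prenatal care", "SES"}))
```

On the vitamins example this reproduces the slide-by-slide result: prenatal care alone is not sufficient, while prenatal care together with SES (or with difficulty conceiving, or with maternal genetics) is.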
32
Stratification has its limits
  • Up till now, you have heard about one way to
    remove confounding: adjustment or stratification
    on certain variables
  • But in some situations, there are no variables
    you can stratify on and successfully remove
    confounding
  • We will illustrate this using a DAG
  • In a future lecture, you will hear about a method
    you can use in these circumstances (Marginal
    Structural Models)

33
A DAG-based illustration of time-dependent
confounding: A situation in which traditional
methods to control for confounding (i.e.
adjustment/stratification) break down
  • Ex: What variables should we control for to
    estimate the effect of antiretroviral therapy on
    CD4 count?

34
Ex. Antiretroviral therapy and CD4 count
  • Question of interest: What is the effect of
    antiretroviral therapy on CD4 count?
  • Study Population: A cohort of HIV-infected
    individuals
  • Outcome: CD4 count at the end of the study
  • Exposure: Antiretroviral therapy (ART) (treated
    or not for the entire study period)

35
Ex. Antiretroviral therapy and CD4 count
  • Sicker individuals (those with lower baseline
    CD4 counts at the beginning of the study) are
    more likely to be treated with ART
  • Low baseline CD4 count causes physicians to treat
    their patients
  • CD4 count at baseline also affects CD4 count at
    the end of the study

36
Representing these relations in a DAG
[Figure: DAG with nodes "CD4 count at beginning of study", "Exposure: Antiretroviral Treatment", and "Outcome: CD4 count at the end of the study"; annotation: "Causal effect of interest"]
37
Simple confounding
  • CD4 count at baseline is a confounder
  • If we don't adjust for baseline CD4 count, we
    will underestimate the effect of ART on
    preserving final CD4 count
  • Sicker people (those with lower initial counts)
    will be overrepresented among those who get
    treated
  • We can see this in the DAG: we must adjust for
    baseline CD4, or ART and final CD4 will still be
    connected once we delete our causal effect of
    interest
  • Final CD4 and ART share a common cause (baseline
    CD4)
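
A worked example with invented numbers shows how large the underestimate can be. Everything below is hypothetical: two baseline strata, a higher treatment probability for the sicker stratum, and a true ART effect of +100 cells on final CD4.

```python
from fractions import Fraction as F

# Hypothetical population, chosen only for illustration.
p_baseline = {"low": F(1, 2), "high": F(1, 2)}   # share of patients in each stratum
baseline_cd4 = {"low": 200, "high": 500}         # baseline CD4 count per stratum
p_art = {"low": F(4, 5), "high": F(1, 5)}        # sicker patients are treated more often
ART_EFFECT = 100                                 # assumed true gain in final CD4 from ART


def mean_final_cd4(art, stratum=None):
    """Expected final CD4 given ART status, optionally within one baseline stratum."""
    strata = [stratum] if stratum else list(p_baseline)
    weight = {s: p_baseline[s] * (p_art[s] if art else 1 - p_art[s]) for s in strata}
    total = sum(weight.values())
    return sum(w * (baseline_cd4[s] + ART_EFFECT * art) for s, w in weight.items()) / total


crude = mean_final_cd4(1) - mean_final_cd4(0)
adjusted = {s: mean_final_cd4(1, s) - mean_final_cd4(0, s) for s in p_baseline}
print(crude, adjusted)
```

With these numbers the crude comparison gives a difference of -80 (treatment looks harmful), while within each baseline stratum the difference equals the assumed effect of +100.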

38
Representing these relations in a DAG
[Figure: DAG with nodes "CD4 count at beginning of study" (labeled Confounder), "Exposure: Antiretroviral Treatment", and "Outcome: CD4 count at the end of the study"]
39
Antiretroviral therapy and CD4 count: A more
realistic example
  • Same study population and outcome
  • Cohort of HIV-infected individuals
  • Outcome is final CD4 count
  • Now, an individual can change treatment status
    during the course of follow-up
  • E.g. an individual who is not treated at the
    beginning of the study (t0) may go on treatment
    partway through the study (e.g. t1)
  • CD4 also measured during course of follow-up

40
DAG: Expanded to incorporate changing treatment
over time
[Figure: DAG with nodes CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, Antiretroviral Treatment at t1, and Y: Final CD4 count; annotations: Baseline confounder, Causal effect of interest]
41
Something is missing.
  • Our effect of interest is how antiretroviral
    treatment throughout the study (e.g. t0 and t1)
    affects final CD4 count
  • We have left out an important causal relationship
    in the previous DAG!
  • Antiretroviral treatment at baseline affects
    intermediate CD4 counts (e.g. CD4 measured at
    t1), which in turn affect final CD4 counts
  • This is part of our causal effect of interest!

42
Filling in the DAG
[Figure: DAG with nodes CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, Antiretroviral Treatment at t1, and Y: Final CD4 count; annotations: Baseline confounder, Causal effect of interest]
43
Something is still missing
  • CD4 count at t1 will also affect subsequent
    treatment (ART at t1)
  • Note: we take the convention that CD4(t) is
    measured before ART(t)
  • Patients with lower CD4 counts at t1 are more
    likely to start ART partway through the study
  • A patient getting sicker causes his/her physician
    to start them on treatment

44
Filling in the DAG
[Figure: DAG with nodes CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, Antiretroviral Treatment at t1, and Y: Final CD4 count; annotations: Baseline confounder, Causal effect of interest]
45
What does this DAG tell us about what we need to
adjust for to control confounding?
46
Using the DAG to decide what we need to control
for
  • We can't adjust for anything that is a descendant
    of (caused by) ART
  • Rules out CD4 at t1
  • Delete all non-ancestors of exposure, disease,
    and things we are considering adjusting for
  • N/A: Everything in current graph is an ancestor of
    outcome or exposure

[Figure: DAG with nodes Y: Final CD4 count, CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, and Antiretroviral Treatment at t1; annotation: Causal effect of interest]
47
Using the DAG to decide what we need to control
for
  • Delete any arrows from ART
  • Connect parents sharing a common child
  • N/A: Already connected

[Figure: graph with nodes Y: Final CD4 count, CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, and Antiretroviral Treatment at t1]
48
Using the DAG to decide what we need to control
for
  • Strip arrowheads
  • What can we delete that will leave ART and final
    CD4 unconnected?
  • Remember: CD4 at t1 is not an option since ART
    at t0 affects it

[Figure: undirected graph with nodes Y: Final CD4 count, CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, and Antiretroviral Treatment at t1]
49
A Dilemma
  • From our analysis of the DAG it is clear that if
    we don't adjust for CD4 at t1, we fail to
    control for confounding
  • But we know we cannot adjust for a variable
    affected by our exposure of interest
  • Adjusting for CD4 at t1 would be equivalent to
    adjusting for part of our causal effect of
    interest
  • We would again fail to correctly estimate the
    total effect of ART on final CD4 because we would
    lose that component of the effect mediated by
    early changes in CD4

50
Adjusting for a variable on the causal pathway of
interest
[Figure: DAG with nodes CD4 count at beginning of study (t0), CD4 count partway through study (t1), Antiretroviral Treatment at t0, Antiretroviral Treatment at t1, and Y: Final CD4 count; annotations: "Baseline confounder: could include it in traditional multivariable model", "Time-dependent confounder", "Causal effect of interest"]
51
Time-dependent confounding
  • Time-dependent confounder: A covariate that is
    predictive of subsequent exposure, is an
    independent risk factor for the outcome, and is
    itself affected by prior exposure
  • If we don't adjust for the covariate, we get bias
    due to confounding
  • If we do adjust, we fail to estimate the causal
    effect we are interested in because we are
    adjusting for part of our effect of interest
  • You will see more of this problem, and hear about
    some ways to address it (e.g. Marginal Structural
    Models)
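
The dilemma can be checked by running the graphical test on every candidate adjustment set. This is a sketch under stated assumptions: the arrows below are a reconstruction from the slide text (the figures are not preserved, and the arrow from ART at t0 to ART at t1 in particular is assumed), and, following the slides, ART at t0 and t1 are treated together as the exposure.

```python
from itertools import combinations

# Assumed reconstruction of the full DAG described in the slides.
dag = {
    "CD4 t0": ["ART t0", "CD4 t1", "Final CD4"],
    "ART t0": ["CD4 t1", "ART t1", "Final CD4"],
    "CD4 t1": ["ART t1", "Final CD4"],
    "ART t1": ["Final CD4"],
    "Final CD4": [],
}
exposures, outcome = {"ART t0", "ART t1"}, "Final CD4"


def descendants(graph, node):
    """All variables reachable from `node` by following arrows forward."""
    found, stack = set(), list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


def is_sufficient(graph, exposures, outcome, adjust):
    """Graphical test for adjustment set `adjust`, with a set of exposure nodes."""
    adjust = set(adjust)
    # Nothing we adjust for may be a descendant of any exposure node.
    if any(adjust & descendants(graph, e) for e in exposures):
        return False
    targets = exposures | {outcome} | adjust
    keep = targets | {n for n in graph if descendants(graph, n) & targets}
    # Drop arrows leaving the exposure nodes, then connect co-parents.
    arrows = [(a, b) for a in keep if a not in exposures
              for b in graph[a] if b in keep]
    links = {frozenset(arrow) for arrow in arrows}
    for child in keep:
        child_parents = [a for a, b in arrows if b == child]
        links |= {frozenset(pair) for pair in combinations(child_parents, 2)}
    # Delete the adjusted variables and test connectivity.
    links = {link for link in links if not link & adjust}
    reached, frontier = set(exposures), list(exposures)
    while frontier:
        node = frontier.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return outcome not in reached


covariates = ["CD4 t0", "CD4 t1"]
results = {
    subset: is_sufficient(dag, exposures, outcome, subset)
    for size in range(len(covariates) + 1)
    for subset in combinations(covariates, size)
}
for subset, ok in results.items():
    print(subset, ok)
```

Under these assumptions every one of the four candidate sets fails: sets without CD4 at t1 leave ART connected to final CD4, and sets containing CD4 at t1 adjust for a descendant of ART at t0.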

52
Conclusions
  • Today we have outlined the steps to:
  • Construct a DAG, based on knowledge/assumptions
  • Use a DAG to decide if confounding is present
  • Use a DAG to decide what variables to control for
    in analysis
  • We have also used a DAG to illustrate a situation
    where traditional methods for controlling
    confounding are not adequate (time-dependent
    confounding)

53
References
  1. Pearl J. Causality: Models, Reasoning, and
    Inference. Cambridge University Press, Cambridge
    UK. 2001.
  2. Jewell NP. Statistics for Epidemiology. Chapman &
    Hall/CRC, USA. 2004: 102-112.
  3. Greenland S. Causal Diagrams for Epidemiologic
    Research. Epidemiology, 1999 Jan; 10(3): 37-48.
  4. Robins JM. Data, design, and background knowledge
    in etiologic inference. Epidemiology,
    2002; 11: 313-320.
  5. Hernan M, et al. Causal knowledge as a
    prerequisite for confounding evaluation: an
    application to birth defects epidemiology. Am J
    Epidemiol, 2002; 155(2): 176-184.

54
Example DAG from Maya's research
55
Example from Maya's research
  • Effect of interest: Effect of observed viral
    mutation profile (presence of specific mutations)
    on viral load (i.e. response to treatment)
  • DAG reveals that adjustment for treatment history
    is sufficient