Methodological Workshop 2: Ignorability, Selection Bias, and Causal Inference

1
Methodological Workshop 2: Ignorability,
Selection Bias, and Causal Inference
  • Yu Xie, University of Michigan

2
Observed Data
  • A population with N individuals, from which we
    draw a sample of size n.
  • There is an outcome of interest, say Y, that is
    measured on the real line.
  • There is an independent variable of interest, say
    D. For simplicity, let us assume that D is a
    binary treatment, $D = 1$ (T) or $D = 0$ (C). This is the
    simplest case.
  • Let us call this setup the canonical case.

3
Canonical Case Examined
  • What is the causal effect of treatment D?
  • It is the counterfactual effect for the ith
    individual:
  • $Y_i^T - Y_i^C$
  • However, we observe only one of the two outcomes:
  • $Y_i^T$ when $D_i = 1$, or
  • $Y_i^C$ when $D_i = 0$.
  • Conclusion: it is not possible to identify the
    individual-level causal effect without
    assumptions.

4
At Another Extreme
  • We can impose a strong, unrealistic assumption
    that all individuals are identical (a homogeneity
    assumption often made in physical science). Then
    we have
  • $Y_i^T = Y^T$, $Y_i^C = Y^C$
  • We only need two observations to identify the
    causal effect: $Y^T$ when $D = 1$ and $Y^C$ when $D = 0$.
  • Implication: it is population variability that
    makes scientific sampling necessary.

5
Yu Xie's Fundamental Paradox in Social Science
  • There is always variability at the individual
    level.
  • Causal inference is impossible at the individual
    level and thus always requires statistical
    analysis at the group level on the basis of some
    homogeneity assumption.
  • Different methods boil down to different
    comparison groups.

6
Consider the Usual Case
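  • Population is divided into two subpopulations: $P_1$
    if $D_i = 1$, $P_0$ if $D_i = 0$.
  • Use the following notation:
  • $q$ = proportion of $P_0$ in $P$
  • $E(Y_1^T) = E(Y^T \mid D = 1)$, $E(Y_1^C) = E(Y^C \mid D = 1)$
  • $E(Y_0^T) = E(Y^T \mid D = 0)$, $E(Y_0^C) = E(Y^C \mid D = 0)$
  • By the total expectation rule:
  • $\mathrm{ATE} = E(Y^T - Y^C) = E(Y_1^T - Y_1^C)(1 - q) + E(Y_0^T - Y_0^C)\,q = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$,
  • where $\delta_1 = E(Y_1^T - Y_1^C) = \mathrm{TT}$,
  • $\delta_0 = E(Y_0^T - Y_0^C) = \mathrm{TUT}$.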
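
The decomposition on this slide can be checked numerically. The sketch below is only an illustration: the four subpopulation means and the value of q are hypothetical numbers, and the function name `decompose_ate` is not from the presentation.

```python
def decompose_ate(y1t, y1c, y0t, y0c, q):
    """Split the naive treated-vs-control contrast into ATE plus two bias terms.

    y1t, y1c: mean potential outcomes (treated, control) in subpopulation P1.
    y0t, y0c: mean potential outcomes (treated, control) in subpopulation P0.
    q: proportion of P0 in the whole population.
    """
    tt = y1t - y1c                    # effect of treatment on the treated
    tut = y0t - y0c                   # effect of treatment on the untreated
    ate = tt * (1 - q) + tut * q      # total expectation rule
    naive = y1t - y0c                 # what a raw group comparison estimates
    type1_bias = y1c - y0c            # pre-treatment heterogeneity
    type2_bias = (tt - tut) * q       # treatment-effect heterogeneity
    return {"ate": ate, "naive": naive, "tt": tt, "tut": tut,
            "type1_bias": type1_bias, "type2_bias": type2_bias}


# Hypothetical means, chosen only for illustration.
result = decompose_ate(y1t=10.0, y1c=6.0, y0t=7.0, y0c=5.0, q=0.6)
print(result)
```

With these made-up inputs the naive contrast is 5, of which 1 is Type I bias and 1.2 is Type II bias, leaving an ATE of 2.8.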

7
In Other Words
  • The standard estimator $E(Y_1^T - Y_0^C)$ contains two
    sources of bias:
  • (1) The average difference between $P_1$ and $P_0$ in
    the absence of treatment (pre-treatment
    heterogeneity bias, or Type I selection
    bias):
  • $E(Y_1^C - Y_0^C)$
  • (2) The difference in the average treatment
    effect between $P_1$ and $P_0$ (treatment-effect
    heterogeneity bias, or Type II selection
    bias):
  • $\delta_1 - \delta_0$
  • Both sources of bias average to zero under
    randomized assignment.

8
In Regression Language
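  • $Y_i = \alpha + \delta_i D_i + \varepsilon_i$
  • There are two types of variability that may cause
    bias:
  • (1) Type I selection bias (focusing on $\varepsilon_i$): if
    $\mathrm{corr}(\varepsilon, D) \neq 0$.
  • (2) Type II selection bias (focusing on $\delta_i$): if
    $\mathrm{corr}(\delta, D) \neq 0$.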

9
Selection Bias and Estimands
  • $\mathrm{ATE}\ (\delta) = E(Y^T - Y^C) = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$
  • When Type I selection bias is present but Type
    II selection bias is absent (say, a homogeneous
    treatment effect):
  • $E(Y_1^T - Y_0^C) \neq \delta$
  • When Type I selection bias is absent but Type II
    selection bias is present:
  • $E(Y_1^T - Y_0^C) \neq \delta$
  • $\mathrm{ATE}\ (\delta) \neq \delta_1 \neq \delta_0$
  • Type II selection bias is important.
  • Type II selection bias cannot be eliminated by a
    fixed-effects approach.

10
Ignorability and Selection Bias
Ignorability assumed?        Type of selection bias
(Invoking unobservables?)    Type I                                        Type II
Yes (No)                     Propensity Score (Rubin et al.)               ?
No (Yes)                     Structural Selection Model (Heckman et al.)   Non-parametric IV Models (Heckman et al.)
11
IV versus LATE
  • Exactly the same formula, but different
    interpretations:
  • IV interpretation: constant treatment effect.
  • LATE interpretation: heterogeneous treatment
    effects, averaged into different groups (strata).
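
To make the "same formula, different interpretation" point concrete, here is a small worked sketch. It assumes a binary instrument Z that is independent of the latent groups and that there are no defiers; the three groups, their shares, and their outcomes are hypothetical numbers, and `wald_estimate` is an illustrative name rather than anything from the slides.

```python
def wald_estimate(groups):
    """Wald/IV ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]).

    Each group is a dict with its population share, untreated outcome y_c,
    treatment effect, and treatment status under Z = 0 (d_z0) and Z = 1 (d_z1).
    """
    ey_z1 = sum(g["share"] * (g["y_c"] + g["effect"] * g["d_z1"]) for g in groups)
    ey_z0 = sum(g["share"] * (g["y_c"] + g["effect"] * g["d_z0"]) for g in groups)
    ed_z1 = sum(g["share"] * g["d_z1"] for g in groups)
    ed_z0 = sum(g["share"] * g["d_z0"] for g in groups)
    return (ey_z1 - ey_z0) / (ed_z1 - ed_z0)


def make_groups(effects):
    """Hypothetical always-takers, compliers, and never-takers."""
    always, complier, never = effects
    return [
        {"share": 0.2, "y_c": 1.0, "effect": always, "d_z0": 1, "d_z1": 1},
        {"share": 0.5, "y_c": 2.0, "effect": complier, "d_z0": 0, "d_z1": 1},
        {"share": 0.3, "y_c": 3.0, "effect": never, "d_z0": 0, "d_z1": 0},
    ]


# Constant effect: the ratio equals the common effect (the IV reading).
iv_constant = wald_estimate(make_groups((3.0, 3.0, 3.0)))

# Heterogeneous effects: the same ratio recovers only the complier effect.
hetero = make_groups((4.0, 2.0, 6.0))
late = wald_estimate(hetero)
ate = sum(g["share"] * g["effect"] for g in hetero)
print(iv_constant, late, ate)
```

In the heterogeneous case the ratio equals the compliers' effect (2) rather than the population average (3.6), which is the sense in which the formula is unchanged but its meaning is not.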

12
Heckman Selection Model
Latent Rule
13
Important Role of Unobservables
  • The treatment of selection bias in economics
    requires specification of unobserved variables.
  • Such specifications are subject to dispute.
  • The issue of unobservables also splits economists
    and statisticians into two camps.
  • As a result, not enough attention has been paid
    to the (1,2) cell, marked by "?".

14
Missing Knowledge
  • We do not know much about the cell marked by "?".
  • Most work in economics on selection bias assumes
    that ignorability does not hold true.
  • Since we can easily handle Type I selection bias
    under ignorability, it seems that Type II
    selection bias under ignorability is a trivial
    matter.
  • I will show that this is not true.

15
Making Sense
  • In this presentation, I discuss a simple scenario
    where Type II selection bias (which I call
    composition bias) arises from a common
    situation in which we assume ignorability.

16
Ignorability Assumption
  • Also called selection on observables.
  • Let X denote a vector of observed covariates.
    The ignorability assumption states:
  • $D \perp (Y^C, Y^T) \mid X$.
  • We start with the assumption, although we do not
    necessarily believe that this is true.
  • We want to learn as much as the data can tell us.

17
Under the Ignorability Assumption
  • The important work by Rosenbaum and Rubin (1984)
    shows that, when the ignorability assumption
    holds true, it is sufficient to condition on the
    propensity score as a function of X. The
    condition becomes
  • $D \perp (Y^C, Y^T) \mid p(D = 1 \mid X)$.

18
In Other Words
  • There is no bias, conditional on the propensity
    score:
  • $E[Y^T - Y^C \mid p(X)] = E[Y_1^T - Y_0^C \mid p(X)]$

19
Recall Earlier Result
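  • $\delta = E(Y^T - Y^C) = \mathrm{ATE} = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (\delta_1 - \delta_0)\,q$.
  • The ignorability assumption thus means:
  • No Type I selection bias, conditional on $p(X)$:
  • $E[Y_1^C - Y_0^C \mid p(X)] = 0$
  • $E[Y_0^C \mid p(X)] = E[Y_1^C \mid p(X)] = E[Y^C \mid p(X)]$
  • No Type II selection bias, conditional on $p(X)$:
  • $E[(Y_1^T - Y_1^C) - (Y_0^T - Y_0^C) \mid p(X)] = 0$
  • $E[Y_1^T - Y_0^C \mid p(X)] = E[Y_1^T - Y_1^C \mid p(X)]$
  • $= E[Y^T - Y^C \mid p(X)]$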

20
Implications
  • Implication 1: we should conduct propensity-score-
    specific analysis under ignorability.
  • Implication 2: the only interaction effects
    that can lead to selection bias (Type II) are
    those between the treatment status and the
    propensity score.

21
Setup
  • Two requirements:
  • There are heterogeneous treatment effects.
  • The heterogeneity in treatment effects is
    correlated with the propensity of treatment.
  • Both requirements are accepted in the standard
    (statistical) approach assuming ignorability.
  • We wish to show:
  • (1) Treatment-effect heterogeneity $\Rightarrow$
    Type II selection bias.
  • (2) Type II selection bias = composition bias.
  • (3) This happens without unobservables.

22
Example I: Market Premium in Contemporary China
  • We found that the social mechanisms and social
    consequences of transitioning from the state
    sector to the market significantly changed over
    time (Wu and Xie 2003, ASR).

23
Jann (2005) and Xie and Wu (2005)
  • Jann argued that there is no statistical
    difference in returns to education between early
    entrants and late entrants. Thus, Wu and Xie's
    conclusion is incorrect.
  • Social processes generating the three groups are
    cumulative so that the three groups are not
    symmetric.

24
(No Transcript)
25
Xie and Wu's (2005) Key Results: Market Premium
of Late Entry
26
Example II: College Returns (Brand and Xie)
  • Research question:
  • What is the earnings return to college education?
  • Data set: WLS. Earnings are measured at
    different points in the life course.

27
Preliminary Findings: College Graduation
Treatment Effect on Earnings by Propensity Score
Strata, WLS Men
28
Example III: NSW Data on Job Training
  • Research question:
  • Does participation in the National Supported Work
    Demonstration (NSW) improve workers' wages?
  • NSW:
  • A temporary employment program designed to help
    low skilled workers move into the labor market.
  • Original NSW data were experimental (random
    assignment into treatment and control groups).

29
Re-Analysis in Xie, Perez, and Raudenbush (in
progress)
30
Main Insights
  • Selection into treatment is a dynamic process
    (akin to survival analysis), so that net
    composition changes with the proportion of the
    subpopulation being treated (P1).
  • Heterogeneous treatment propensities associated with
    heterogeneous treatment effects $\Rightarrow$ composition
    bias, i.e., Type II selection bias.
  • In this setup, we use the marginal proportion of
    treatment as an instrument for the definition
    of the marginal treatment effect.

31
Simulation One, Setup
Baseline
Strata  Delta  N of UT  Drawn  DrawnTot    TUT_c    MTE_c     TT_c
0.05       50      100      0         0     5.00     0.00     0.00
0.15      150      100      0         0    15.00     0.00     0.00
0.25      250      100      0         0    25.00     0.00     0.00
0.35      350      100      0         0    35.00     0.00     0.00
0.45      450      100      0         0    45.00     0.00     0.00
0.55      550      100      0         0    55.00     0.00     0.00
0.65      650      100      0         0    65.00     0.00     0.00
0.75      750      100      0         0    75.00     0.00     0.00
0.85      850      100      0         0    85.00     0.00     0.00
0.95      950      100      0         0    95.00     0.00     0.00
SUM               1000      0         0      TUT      MTE       TT
                                           500.00     0.00     0.00
32
Simulation One
Draw1
Strata  Delta  N of UT  Drawn  DrawnTot    TUT_c    MTE_c     TT_c
0.05       50       99      1         1     5.50     0.50     0.50
0.15      150       97      3         3    16.17     4.50     4.50
0.25      250       95      5         5    26.39    12.50    12.50
0.35      350       93      7         7    36.17    24.50    24.50
0.45      450       91      9         9    45.50    40.50    40.50
0.55      550       89     11        11    54.39    60.50    60.50
0.65      650       87     13        13    62.83    84.50    84.50
0.75      750       85     15        15    70.83   112.50   112.50
0.85      850       83     17        17    78.39   144.50   144.50
0.95      950       81     19        19    85.50   180.50   180.50
SUM                900    100       100      TUT      MTE       TT
                                          481.667  665.000   665.00
33
Simulation One
Draw2
Strata  Delta  N of UT  Drawn  DrawnTot    TUT_c    MTE_c     TT_c
0.05       50       98      1         2     6.12     0.57     0.54
0.15      150       94      3         6    17.56     5.03     4.77
0.25      250       90      5        10    27.98    13.70    13.10
0.35      350       85      8        15    37.40    26.28    25.39
0.45      450       82      9        18    45.87    42.51    41.50
0.55      550       78     11        22    53.42    62.10    61.30
0.65      650       74     13        26    60.09    84.79    84.65
0.75      750       70     15        30    65.90   110.29   111.40
0.85      850       67     16        33    70.90   138.33   141.42
0.95      950       63     18        37    75.11   168.63   174.57
SUM                800    100       200      TUT      MTE       TT
                                          460.344  652.249   658.62
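
The Simulation One tables can be reproduced with a short script. This is a sketch under an assumption that is inferred from the numbers rather than stated on the slides: in each round, 100 draws are allocated across strata in proportion to (propensity × number still untreated), using expected (fractional) counts instead of random integers. The function names are illustrative.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def simulate_rounds(propensities, n_per_stratum=100.0, n_per_round=100.0,
                    n_rounds=2, effect_scale=1000.0):
    """Expected-value version of the stratified draws in Simulation One."""
    delta = [effect_scale * p for p in propensities]   # stratum treatment effects
    untreated = [n_per_stratum] * len(propensities)
    treated = [0.0] * len(propensities)
    summary = []
    for _ in range(n_rounds):
        # Assumed allocation rule: proportional to propensity times the
        # number of untreated individuals remaining in the stratum.
        weights = [p * n for p, n in zip(propensities, untreated)]
        total = sum(weights)
        drawn = [n_per_round * w / total for w in weights]
        untreated = [n - d for n, d in zip(untreated, drawn)]
        treated = [t + d for t, d in zip(treated, drawn)]
        summary.append({
            "TUT": weighted_mean(delta, untreated),  # effect among the still untreated
            "MTE": weighted_mean(delta, drawn),      # effect among this round's draws
            "TT": weighted_mean(delta, treated),     # effect among all treated so far
        })
    return summary


strata = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
rounds = simulate_rounds(strata)
for number, row in enumerate(rounds, start=1):
    print(f"Draw{number}: TUT={row['TUT']:.3f} MTE={row['MTE']:.3f} TT={row['TT']:.3f}")
```

Under this allocation rule the script matches the SUM rows of Draw1 and Draw2 (for example, MTE falls from 665.000 to 652.249), which shows the marginal effect drifting downward as high-propensity, high-effect individuals leave the untreated pool.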
34
Simulation One, Summary
35
Simulation Two (Micro)
  • A population of 100,000 with 1000 trained per
    round.
  • Propensity score (P): uniform (0.001 to 0.999)
  • Heterogeneous treatment effects:
  • $\delta = 1000P$
  • Simple random sampling without stratification.
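
A micro-level version of this setup can be sketched as follows. The selection rule is an assumption, since the slide does not spell it out: within each round, untreated individuals are drawn one at a time with probability proportional to their propensity. The sketch uses an "exponential race" (each person gets an arrival time drawn from an exponential distribution with rate P, and people are treated in order of arrival), which is equivalent to sequential weighted sampling without replacement. The name `simulate_micro` and the seed are illustrative.

```python
import random


def simulate_micro(n_population=100_000, n_per_round=1_000, seed=0):
    """Return the mean treatment effect of those trained in each round."""
    rng = random.Random(seed)
    propensity = [rng.uniform(0.001, 0.999) for _ in range(n_population)]
    # Exponential race: a higher propensity means an earlier expected arrival,
    # so sorting by arrival time orders people as if they were drawn one by
    # one with probability proportional to P among those still untreated.
    arrivals = sorted((rng.expovariate(p), p) for p in propensity)
    round_means = []
    for start in range(0, n_population, n_per_round):
        batch = arrivals[start:start + n_per_round]
        effects = [1000.0 * p for _, p in batch]   # delta = 1000 * P
        round_means.append(sum(effects) / len(effects))
    return round_means


means = simulate_micro()
print(f"first round: {means[0]:.1f}")
print(f"middle round: {means[50]:.1f}")
print(f"last round: {means[-1]:.1f}")
```

The mean effect among the newly trained starts near two thirds of the maximum effect and declines in later rounds, because each round removes disproportionately many high-propensity people from the untreated pool.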

36
Summary: Average Treatment Effects Decrease with
Marginal P
37
A Small Sample Case of Micro Simulation
38
Discussion
  • It is not possible to discuss causal inference at
    the individual level.
  • Causal inference is possible only at the group
    level, which requires some sort of homogeneity
    assumption.
  • Ignorability is unlikely to be true, but it is needed
    for causal inference with observational data
    without strong and unverifiable assumptions.

39
Solution
  • Even in this ideal situation (with ignorability
    assumption being true), causal effects can be
    heterogeneous.
  • This can be handled with hierarchical models
    (Bayesian or not) assuming homogeneous effects
    (or structure) within subgroups.
  • However --

40
Conclusion 1
  • (1) Any estimand (something that is to be
    estimated) in causal inference is essentially a
    weighted mean by composition.
  • (2) There is a composition bias, which is a
    form of selection bias (Type II), as we change
    the marginal proportion of the population
    treated. (Bad news for those looking for
    external validity.) We do not need
    controversial unobservables for this to happen.
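
The "weighted mean by composition" claim can be illustrated with the ten strata from Simulation One. The sketch assumes a one-shot assignment in which a share P of each equally sized stratum is treated, so the weights below (stratum size, size × P, size × (1 − P)) are assumptions made for illustration, not figures from the slides.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


propensity = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
size = [100.0] * len(propensity)              # equal stratum sizes (assumed)
delta = [1000.0 * p for p in propensity]      # stratum-specific effects

# The same stratum effects, averaged under three different compositions.
ate = weighted_mean(delta, size)
tt = weighted_mean(delta, [n * p for n, p in zip(size, propensity)])
tut = weighted_mean(delta, [n * (1 - p) for n, p in zip(size, propensity)])
print(f"ATE={ate:.1f} TT={tt:.1f} TUT={tut:.1f}")
```

The stratum effects never change; only the weights do, and that alone moves the estimand from 335 (TUT) through 500 (ATE) to 665 (TT).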

41
Conclusion 2
  • Discovering patterns of heterogeneous treatment
    effects (under ignorability) is informative to
    our understanding of social processes.
  • Examples: Xie and Wu (2005), Tsai and Xie (2008),
    Brand and Xie (2007), Xie, Perez, and Raudenbush
    (in progress).

42
Conclusion 3
  • Observed patterns of heterogeneous treatment
    effects (under ignorability) can help us question
    the ignorability assumption and understand
    potential unobserved selection processes.
  • Examples: Xie and Wu (2005), Tsai and Xie (2008),
    Brand and Xie (2007), Bruch and Xie (in
    progress).

43
Modeling Heterogeneous Treatment Effects AND
Selection
  • Heckman's Marginal Treatment Effects (MTE)
    approach.
  • It is very general, but highly demanding in terms
    of richness of data.
  • Not only do we need an exclusion restriction, we
    also need full support of the exclusion restriction
    over the whole range of the latent tendency of
    being treated.

44
Marginal Treatment Effects
  • Focus on the treatment effects for those who are
    at the margin of being treated.
  • The term $U_D$ can be interpreted as latent
    resistance to participate.
  • Originally attributable to Bjorklund and Moffitt
    (1987).
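
For reference, the marginal treatment effect is usually formalized along the following lines. This is the standard latent-index notation from the Heckman and Vytlacil literature, not a transcription of the slides; the symbols $Z$ (covariates including the instrument) and $P(Z)$ (the propensity score) are introduced here for illustration.

```latex
% Treatment is taken when the propensity exceeds the latent resistance U_D
D = \mathbf{1}\{P(Z) \ge U_D\}, \qquad U_D \sim \mathrm{Uniform}(0, 1)

% Effect for individuals whose latent resistance is exactly u
\mathrm{MTE}(x, u) = E\left[\,Y^T - Y^C \mid X = x,\; U_D = u\,\right]

% Averaging the MTE over all resistance levels gives the conditional ATE
\mathrm{ATE}(x) = \int_0^1 \mathrm{MTE}(x, u)\, du
```

People with $U_D$ close to $P(Z)$ are the ones "at the margin": a small change in the propensity would switch their treatment status.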

45
Usefulness of MTE
  • Cornerstone of Heckman's recent work on
    heterogeneous treatment effects
  • It provides a linkage to LIV and unifies all
    other estimands (e.g., Heckman, Urzua, and
    Vytlacil 2006).
  • Treatment heterogeneity is specified at the level
    of the latent tendency/resistance to participate.
  • Some homogeneity is still assumed.

46
More Detailed Empirical Example
  • Social Selection and Returns to College
    Education.
  • Collaboration with Jennie Brand, with her as the
    first author (Brand and Xie, in progress).

47
Economic or Positive Selection
  • Individuals who are most likely to benefit from
    college are the most likely to attend college
  • Theory of comparative advantage
  • Empirical support
  • Willis and Rosen (1979)
  • Recent series of papers by Heckman and colleagues

48
Social or Negative Selection
  • Individuals who are most likely to benefit from
    college are the least likely to attend college
  • Theory rooted in social stratification research

49
Social Stratification Research
  • Education is the main factor in the reproduction
    of SES and in upward mobility (Blau & Duncan
    1967; Featherman & Hauser 1978)
  • Social reproduction theory (Bourdieu 1977; Bowles
    and Gintis 1976; Collins 1971; MacLeod 1989)
  • Differences in educational attainment by origins
    (Mare 1980, 1981, 1995; Hout, Raftery, & Bell
    1993; Lucas 2001)
  • The higher the level of educational attainment,
    the less dependence between origins and
    destinations (Yamaguchi 1983; Hout 1988; DiPrete
    & Grusky 1990)

50
Hypothetical Model: Origins, Education, and
Destinations
[Figure. Axes: socioeconomic origins, socioeconomic destinations. Labels: college-educated workers, less-educated workers, benefit of a college degree (marked twice).]
51
Main Argument
  • There is no blanket answer to the question of
    whether the selection is positive or negative
  • The answer depends on what is being controlled.
  • With adequate controls for relevant factors, we
    may observe positive, rather than negative,
    selection.

52
Research Description
  • Three panel data sources:
  • National Longitudinal Survey of Youth 1979
    (NLSY79)
  • National Longitudinal Study of the High School
    Class of 1972 (NLS72)
  • Wisconsin Longitudinal Study (WLS57)
  • Analyses are separate for men and women.

53
Treatment Effects as a Function of Propensity
Scores: A Hierarchical Linear Model
  • Propensity score: $P(X) = P(d = 1 \mid X)$
  • Group individuals into propensity score strata
  • Level 1: Estimate the treatment effect specific
    to balanced propensity score strata
  • Level 2: Pool the results and examine the trend
    in the variation of effects
  • A trend in the heterogeneous effects provides a
    clear depiction of the direction of selection
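
The two-level procedure can be sketched on synthetic data. Everything numeric here is hypothetical: the data-generating process builds in negative selection (effects shrink as propensity rises), the true propensity is treated as known rather than estimated from X, covariate balance within strata is not checked, and Level 2 is a plain unweighted least-squares slope rather than the variance-weighted hierarchical model named on the slide.

```python
import random


def stratum_effects(records, n_strata=5):
    """Level 1: treated-minus-control mean outcome within each propensity stratum.

    records: iterable of (propensity, treated, outcome) tuples.
    """
    buckets = [{"t": [], "c": []} for _ in range(n_strata)]
    for p, treated, y in records:
        k = min(int(p * n_strata), n_strata - 1)   # equal-width strata on [0, 1]
        buckets[k]["t" if treated else "c"].append(y)
    return [sum(b["t"]) / len(b["t"]) - sum(b["c"]) / len(b["c"]) for b in buckets]


def ols_slope(xs, ys):
    """Level 2 (simplified): least-squares slope of stratum effects on stratum rank."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


rng = random.Random(0)
records = []
for _ in range(20_000):
    p = rng.uniform(0.05, 0.95)            # propensity, assumed known here
    treated = rng.random() < p             # ignorable assignment given p
    effect = 2.0 - 2.0 * p                 # hypothetical negative selection
    y = rng.gauss(0.0, 1.0) + effect * treated
    records.append((p, treated, y))

effects = stratum_effects(records)
slope = ols_slope(list(range(1, len(effects) + 1)), effects)
print([round(e, 2) for e in effects], round(slope, 2))
```

A negative slope indicates that effects decline with the propensity to be treated (negative selection); a positive slope would indicate positive selection.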

54
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLSY79 Men
55
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLS72 Men
56
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
WLS57 Men
57
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLSY79 Women
58
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLS72 Women
59
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
WLS57 Women
60
Why do some studies find evidence for economic
selection?
  • Omitted variable bias

61
Propensity Score Covariates
  • Family Background:
  • Parents' income
  • Mother's education
  • Father's education
  • Intact family
  • Number of siblings
  • Rural residence
  • Proximity to college/univ.
  • Race
  • Religion
  • Ability and Academics:
  • Ability / IQ
  • Class rank / HS grades
  • College-prep. program
  • Social-Psychological:
  • Teachers' encouragement
  • Parents' encouragement
  • Friends' plans

62
Descriptive Statistics: Mental Ability by
Propensity Score Strata, Full Set of Covariates,
WLS57 Men
63
Descriptive Statistics: Mental Ability by
Propensity Score Strata, Small Set of Covariates,
WLS57 Men
29 point difference
24 point difference
64
Results, Full Set of Covariates: College
Graduation Treatment Effect on Log Earnings by
Propensity Score Strata, WLS57 Men
65
Results, Small Set of Covariates: College
Graduation Treatment Effect on Earnings by
Propensity Score Strata, WLS57 Men
66
Results, Full Set of Covariates: College
Graduation Treatment Effect on Log Earnings by
Propensity Score Strata, WLS57 Women
67
Results, Small Set of Covariates: College
Graduation Treatment Effect on Earnings by
Propensity Score Strata, WLS57 Women
68
Why is there social selection?
  • Forms of heterogeneity:
  • Pre-treatment heterogeneity
  • Treatment effect heterogeneity
  • Heterogeneous treatments
  • $(d = 1 \mid p(d = 1 \mid X) = j) \neq (d = 1 \mid p(d = 1 \mid X) = k)$,
    where $j \neq k$
  • Low propensity students utilize college as a
    means for economic mobility

69
Heterogeneous Treatments: College Majors for WLS57
Men
  • Low propensity men:
  • High proportion of majors: Business, Education
  • High propensity men:
  • High proportion of majors: Sciences, Humanities

70
Ratio of Monetary to Non-monetary Importance in
Selecting a Career by Propensity Score Strata,
NLS72 Men
71
Ratio of Monetary to Non-monetary Importance in
Selecting a Career by Propensity Score Strata,
NLS72 Women
72
Value of College by Propensity Score Strata,
WLS57 Men
73
Value of College by Propensity Score Strata,
WLS57 Women
74
Summary
  • Main results:
  • Robust evidence for social selection
  • NLSY79, NLS72, and WLS57
  • Men and women
  • Early, mid-, and late career returns
  • Exploratory results:
  • Why do some prior studies find evidence for
    economic selection?
  • Omitted variable bias
  • Why is there social selection?
  • Low propensity students view college as a means
    for economic mobility

75
References
  • Brand, Jennie, and Yu Xie. 2007. "Who Benefits
    Most From College? Evidence for Negative
    Selection in Heterogeneous Economic Returns to
    Higher Education."
  • Rosenbaum, Paul R., and Donald B. Rubin. 1984.
    "Reducing Bias in Observational Studies Using
    Subclassification on the Propensity Score."
    Journal of the American Statistical Association
    79:516-524.
  • Tsai, Shu-Ling, and Yu Xie. 2008. "Changes in
    Earnings Returns to Higher Education in Taiwan
    since the 1990s." Population Review.
  • Xie, Yu, Steve Raudenbush, and Tony Perez. In
    Progress. "Weighting in Causal Inference."
  • Xie, Yu, and Xiaogang Wu. 2005. "Market Premium,
    Social Process, and Statisticism." American
    Sociological Review 70:865-870.