# Methodological Workshop 2: Ignorability, Selection Bias, and Causal Inference


1
Methodological Workshop 2: Ignorability,
Selection Bias, and Causal Inference
• Yu Xie, University of Michigan

2
Observed Data
• A population with N individuals, from which we
draw a sample of size n.
• There is an outcome of interest, say Y, that is
measured on the real line.
• There is an independent variable of interest, say
D. For simplicity, let us assume that D is a
binary treatment: D = 1 (T), D = 0 (C). This is the
simplest case.
• Let us call this setup the canonical case.

3
Canonical Case Examined
• What is the causal effect of treatment D?
• It is the counterfactual effect for the ith
individual:
• $Y_i^T - Y_i^C$
• However, we observe either
• $Y_i^T$ when $D_i = 1$, or
• $Y_i^C$ when $D_i = 0$.
• Conclusion: it is not possible to identify the
individual-level causal effect without
assumptions.

4
At Another Extreme
• We can impose a strong, unrealistic assumption
that all individuals are identical (a homogeneity
assumption often made in physical science). Then
we have
• $Y_i^T = Y^T$, $Y_i^C = Y^C$
• We only need two observations to identify the
causal effect: $Y^T$ when $D = 1$ and $Y^C$ when $D = 0$.
• Implication: it is population variability that
makes scientific sampling necessary.

5
Yu Xie's Fundamental Paradox in Social Science
• There is always variability at the individual
level.
• Causal inference is impossible at the individual
level and thus always requires statistical
analysis at the group level on the basis of some
homogeneity assumption.
• Different methods boil down to different
comparison groups.

6
Consider the Usual Case
• The population is divided into two subpopulations: $P_1$
if $D_i = 1$, $P_0$ if $D_i = 0$.
• Use the following notation:
• $q$ = proportion of $P_0$ in $P$
• $E(Y_1^T) = E(Y^T \mid D = 1)$, $E(Y_1^C) = E(Y^C \mid D = 1)$
• $E(Y_0^T) = E(Y^T \mid D = 0)$, $E(Y_0^C) = E(Y^C \mid D = 0)$
• By the total expectation rule:
• $ATE = E(Y^T - Y^C) = E(Y_1^T - Y_1^C)(1 - q) + E(Y_0^T - Y_0^C)\,q = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (d_1 - d_0)\,q$,
• where $d_1 = E(Y_1^T - Y_1^C) = TT$,
• $d_0 = E(Y_0^T - Y_0^C) = TUT$.
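
The last equality in the total-expectation line can be checked step by step. The derivation below is a sketch that uses only the definitions on this slide, reading $E(Y_1^T - Y_0^C)$ as $E(Y_1^T) - E(Y_0^C)$; the final step adds and subtracts $E(Y_0^C)$.

```latex
\begin{align*}
ATE &= d_1 (1 - q) + d_0\, q \\
    &= d_1 - (d_1 - d_0)\, q \\
    &= \bigl[E(Y_1^T) - E(Y_1^C)\bigr] - (d_1 - d_0)\, q \\
    &= \bigl[E(Y_1^T) - E(Y_0^C)\bigr] - \bigl[E(Y_1^C) - E(Y_0^C)\bigr] - (d_1 - d_0)\, q
\end{align*}
```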

7
In Other Words
• The standard estimator $E(Y_1^T - Y_0^C)$ contains two
sources of bias:
• (1) The average difference between $P_1$ and $P_0$ in
the absence of treatment (pre-treatment
heterogeneity bias, or Type I selection
bias):
• $E(Y_1^C - Y_0^C)$
• (2) The difference in the average treatment
effect between $P_1$ and $P_0$ (treatment-effect
heterogeneity bias, or Type II selection
bias):
• $d_1 - d_0$
• Both sources of bias average to zero under
randomized assignment.
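
The decomposition can be checked numerically. The sketch below builds a hypothetical population in which both potential outcomes are known (never the case with real data), so each term can be computed directly. The variable `trait`, the effect sizes, and the selection rule are illustrative assumptions, not part of the workshop material.

```python
import numpy as np

def decompose(y_t, y_c, d):
    """Split the naive contrast into ATE plus the two selection-bias terms."""
    treated, control = d == 1, d == 0
    q = control.mean()                                   # proportion of P0 in P
    naive = y_t[treated].mean() - y_c[control].mean()    # E(Y1T) - E(Y0C)
    type1 = y_c[treated].mean() - y_c[control].mean()    # E(Y1C) - E(Y0C)
    d1 = (y_t - y_c)[treated].mean()                     # TT
    d0 = (y_t - y_c)[control].mean()                     # TUT
    ate = (y_t - y_c).mean()
    return {"naive": naive, "ate": ate, "type1": type1, "type2": d1 - d0, "q": q}

rng = np.random.default_rng(0)
n = 10_000
trait = rng.normal(size=n)                   # hypothetical individual trait
y_c = 10 + 2 * trait + rng.normal(size=n)    # outcome without treatment
y_t = y_c + 1.0 + 0.5 * trait                # heterogeneous treatment effect

d_selected = (trait + rng.normal(size=n) > 0).astype(int)   # selection on the trait
d_random = rng.permutation(d_selected)                      # randomized assignment

selected = decompose(y_t, y_c, d_selected)
randomized = decompose(y_t, y_c, d_random)
for label, r in [("selected", selected), ("randomized", randomized)]:
    print(label, {k: round(float(v), 3) for k, v in r.items()})
```

In both runs, `naive` equals `ate + type1 + type2 * q` up to floating-point error. Under the randomized assignment the two bias terms are close to zero, while under selection on the trait both are clearly positive.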

8
In Regression Language
• $Y_i = a + d_i D_i + e_i$
• There are two types of variability that may cause
bias:
• (1) Type I selection bias (focusing on $e_i$): if
$\mathrm{corr}(e, D) \neq 0$.
• (2) Type II selection bias (focusing on $d_i$): if
$\mathrm{corr}(d, D) \neq 0$.

9
Selection Bias and Estimands
• ATE $(d) = E(Y^T - Y^C) = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (d_1 - d_0)\,q$
• When Type I selection bias is present but Type
II selection bias is absent (say, a homogeneous
treatment effect):
• $E(Y_1^T - Y_0^C) \neq d$
• When Type I selection bias is absent but Type II
selection bias is present:
• $E(Y_1^T - Y_0^C) \neq d$
• ATE $(d) \neq d_1 \neq d_0$
• Type II selection bias is important.
• Type II selection bias cannot be eliminated by
a fixed-effects approach.

10
Ignorability and Selection Bias
| Ignorability assumed? (Invoking unobservables?) | Type I selection bias | Type II selection bias |
|---|---|---|
| Yes (No) | Propensity Score (Rubin et al.) | ? |
| No (Yes) | Structural Selection Model (Heckman et al.) | Non-parametric IV Models (Heckman et al.) |

11
IV versus LATE
• Exactly the same formula, but different
interpretations:
• IV interpretation: constant treatment effect.
• LATE interpretation: heterogeneous treatment
effects, averaged into different groups (strata).
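
To make the contrast concrete, the sketch below applies the usual Wald (IV) ratio to a hypothetical population with a randomized binary instrument. The group shares and effect sizes are invented for illustration. When effects differ across groups, the same ratio recovers the average effect among those whose treatment status responds to the instrument, not the population mean effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = rng.integers(0, 2, size=n)                       # randomized binary instrument
kind = rng.choice(["complier", "always", "never"], size=n, p=[0.5, 0.25, 0.25])
d = np.where(kind == "always", 1, np.where(kind == "complier", z, 0))
effect = np.where(kind == "complier", 2.0, 5.0)      # hypothetical effects by group
y = 1.0 + effect * d + rng.normal(size=n)            # observed outcome

def wald(y, d, z):
    """IV (Wald) ratio: reduced-form contrast over first-stage contrast."""
    return (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

late = wald(y, d, z)
ate = effect.mean()
print(f"Wald estimate: {late:.2f}, population mean effect: {ate:.2f}")
```

With these made-up numbers the Wald estimate sits near 2 (the effect for the responsive group), whereas the population mean effect is near 3.5. Under a constant effect the two would coincide.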

12
Heckman Selection Model
Latent Rule
13
Important Role of Unobservables
• The treatment of selection bias in economics
requires specification of unobserved variables.
• Such specifications are subject to dispute.
• The issue of unobservables also splits economists
and statisticians into two camps.
• As a result, not enough attention has been paid
to the (1,2) cell, marked by ?.

14
Missing Knowledge
• We do not know much about the cell marked by ?.
• Most work in economics on selection bias assumes
that ignorability does not hold true.
• Since we can easily handle Type I selection bias
under ignorability, it seems that Type II
selection bias under ignorability is a trivial
matter.
• I will show that this is not true.

15
Making Sense
• In this presentation, I discuss a simple scenario
where Type II selection bias (which I call
composition bias) arises from a common
situation in which we assume ignorability.

16
Ignorability Assumption
• Also called selection on observables.
• Let X denote a vector of observed covariates.
The ignorability assumption states:
• $D \perp (Y^C, Y^T) \mid X$.
• We do not necessarily believe that this is true.
• We want to learn as much as the data can tell us.

17
Under the Ignorability Assumption
• The important work by Rosenbaum and Rubin (1984)
shows that, when the ignorability assumption
holds true, it is sufficient to condition on the
propensity score as a function of X. The
condition is changed to
• $D \perp (Y^C, Y^T) \mid p(D = 1 \mid X)$.

18
In Other Words
• There is no bias, conditional on the propensity
score:
• $E[Y^T - Y^C \mid p(X)] = E[Y_1^T - Y_0^C \mid p(X)]$

19
Recall Earlier Result
• $d = E(Y^T - Y^C) = ATE = E(Y_1^T - Y_0^C) - E(Y_1^C - Y_0^C) - (d_1 - d_0)\,q$.
• The ignorability assumption thus means:
• No Type I selection bias, conditional on $p(X)$:
• $E[Y_1^C - Y_0^C \mid p(X)] = 0$
• $E[Y_0^C \mid p(X)] = E[Y_1^C \mid p(X)] = E[Y^C \mid p(X)]$
• No Type II selection bias, conditional on $p(X)$:
• $E[(Y_1^T - Y_1^C) - (Y_0^T - Y_0^C) \mid p(X)] = 0$
• $E[Y_1^T - Y_0^C \mid p(X)] = E[Y_1^T - Y_1^C \mid p(X)] = E[Y^T - Y^C \mid p(X)]$

20
Implications
• Implication 1: we should conduct propensity-score
specific analysis under ignorability.
• Implication 2: the only interaction effects
that can lead to selection bias (Type II) are
those between the treatment status and the
propensity score.
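
A small simulation can illustrate both implications. The sketch below assumes the propensity score is known and takes ten discrete values, so assignment is effectively random within each stratum. The outcome model and the noise level are hypothetical choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
levels = np.linspace(0.05, 0.95, 10)           # discrete propensity strata
p = rng.choice(levels, size=n)                 # propensity score, assumed known
d = (rng.random(n) < p).astype(int)            # ignorable assignment given p
y_c = 500 * p + rng.normal(scale=50, size=n)   # baseline outcome depends on p
delta = 1000 * p                               # treatment effect depends on p
y = y_c + delta * d                            # observed outcome

naive = y[d == 1].mean() - y[d == 0].mean()    # pooled treated-control contrast
ate = delta.mean()

# Stratum-specific contrasts: compare treated and untreated with the same p
stratum_effects = {}
for level in levels:
    s = p == level
    stratum_effects[float(round(level, 2))] = (
        y[s & (d == 1)].mean() - y[s & (d == 0)].mean()
    )

print(f"naive contrast: {naive:.1f}, ATE: {ate:.1f}")
for level, est in stratum_effects.items():
    print(f"p = {level:.2f}: estimate {est:7.1f}, true effect {1000 * level:7.1f}")
```

The pooled contrast is far from the ATE because both the baseline outcome and the effect vary with the propensity score, while each within-stratum contrast tracks the true stratum effect.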

21
Setup
• Two requirements:
• There are heterogeneous treatment effects.
• The heterogeneity in treatment effects is
correlated with the propensity of treatment.
• Both requirements are accepted in the standard
(statistical) approach assuming ignorability.
• We wish to show:
• (1) treatment-effect heterogeneity →
Type II selection bias.
• (2) Type II selection bias = composition bias.
• (3) This happens without unobservables.

22
Example I: Market Premium in Contemporary China
• We found that the social mechanisms and social
consequences of transitioning from the state
sector to the market significantly changed over
time (Wu and Xie 2003, ASR).

23
Jann (2005) and Xie and Wu (2005)
• Jann argued that there is no statistical
difference in returns to education between early
entrants and late entrants. Thus, Wu and Xie's
conclusion is incorrect.
• Social processes generating the three groups are
cumulative so that the three groups are not
symmetric.

24
(No Transcript)
25
Xie and Wu's (2005) Key Results: Market Premium
of Late Entry
26
Example II: College Returns (Brand and Xie)
• Research question
• Data set: WLS. Earnings are measured at
different points in life course.

27
Treatment Effect on Earnings by Propensity Score
Strata: WLS Men
28
Example III: NSW Data on Job Training
• Research question:
• Does participation in the National Supported Work
Demonstration (NSW) improve workers' wages?
• NSW
• A temporary employment program designed to help
low skilled workers move into the labor market.
• Original NSW data were experimental (random
assignment into treatment and control groups).

29
Re-Analysis in Xie, Perez, and Raudenbush (in
progress)
30
Main Insights
• Selection into treatment is a dynamic process
(akin to survival analysis), so that net
composition changes with the proportion of the
subpopulation being treated (P1).
• Heterogeneous treatment propensities with associated
heterogeneous treatment effects → composition
bias (Type II selection bias).
• In this setup, we use the marginal proportion of
treatment as an instrument for the definition
of the marginal treatment effect.

31
Simulation One, Setup
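Baseline

| Strata | Delta | N of UT | Drawn | DrawnTot | TUT_c | MTE_c | TT_c |
|---|---|---|---|---|---|---|---|
| 0.05 | 50 | 100 | 0 | 0 | 5.00 | 0.00 | 0.00 |
| 0.15 | 150 | 100 | 0 | 0 | 15.00 | 0.00 | 0.00 |
| 0.25 | 250 | 100 | 0 | 0 | 25.00 | 0.00 | 0.00 |
| 0.35 | 350 | 100 | 0 | 0 | 35.00 | 0.00 | 0.00 |
| 0.45 | 450 | 100 | 0 | 0 | 45.00 | 0.00 | 0.00 |
| 0.55 | 550 | 100 | 0 | 0 | 55.00 | 0.00 | 0.00 |
| 0.65 | 650 | 100 | 0 | 0 | 65.00 | 0.00 | 0.00 |
| 0.75 | 750 | 100 | 0 | 0 | 75.00 | 0.00 | 0.00 |
| 0.85 | 850 | 100 | 0 | 0 | 85.00 | 0.00 | 0.00 |
| 0.95 | 950 | 100 | 0 | 0 | 95.00 | 0.00 | 0.00 |
| SUM | | 1000 | 0 | 0 | TUT = 500.00 | MTE = 0.00 | TT = 0.00 |

32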
Simulation One
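Draw 1

| Strata | Delta | N of UT | Drawn | DrawnTot | TUT_c | MTE_c | TT_c |
|---|---|---|---|---|---|---|---|
| 0.05 | 50 | 99 | 1 | 1 | 5.50 | 0.50 | 0.50 |
| 0.15 | 150 | 97 | 3 | 3 | 16.17 | 4.50 | 4.50 |
| 0.25 | 250 | 95 | 5 | 5 | 26.39 | 12.50 | 12.50 |
| 0.35 | 350 | 93 | 7 | 7 | 36.17 | 24.50 | 24.50 |
| 0.45 | 450 | 91 | 9 | 9 | 45.50 | 40.50 | 40.50 |
| 0.55 | 550 | 89 | 11 | 11 | 54.39 | 60.50 | 60.50 |
| 0.65 | 650 | 87 | 13 | 13 | 62.83 | 84.50 | 84.50 |
| 0.75 | 750 | 85 | 15 | 15 | 70.83 | 112.50 | 112.50 |
| 0.85 | 850 | 83 | 17 | 17 | 78.39 | 144.50 | 144.50 |
| 0.95 | 950 | 81 | 19 | 19 | 85.50 | 180.50 | 180.50 |
| SUM | | 900 | 100 | 100 | TUT = 481.667 | MTE = 665.000 | TT = 665.00 |

33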
Simulation One
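Draw 2

| Strata | Delta | N of UT | Drawn | DrawnTot | TUT_c | MTE_c | TT_c |
|---|---|---|---|---|---|---|---|
| 0.05 | 50 | 98 | 1 | 2 | 6.12 | 0.57 | 0.54 |
| 0.15 | 150 | 94 | 3 | 6 | 17.56 | 5.03 | 4.77 |
| 0.25 | 250 | 90 | 5 | 10 | 27.98 | 13.70 | 13.10 |
| 0.35 | 350 | 85 | 8 | 15 | 37.40 | 26.28 | 25.39 |
| 0.45 | 450 | 82 | 9 | 18 | 45.87 | 42.51 | 41.50 |
| 0.55 | 550 | 78 | 11 | 22 | 53.42 | 62.10 | 61.30 |
| 0.65 | 650 | 74 | 13 | 26 | 60.09 | 84.79 | 84.65 |
| 0.75 | 750 | 70 | 15 | 30 | 65.90 | 110.29 | 111.40 |
| 0.85 | 850 | 67 | 16 | 33 | 70.90 | 138.33 | 141.42 |
| 0.95 | 950 | 63 | 18 | 37 | 75.11 | 168.63 | 174.57 |
| SUM | | 800 | 100 | 200 | TUT = 460.344 | MTE = 652.249 | TT = 658.62 |

34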
Simulation One, Summary
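
The draws in Simulation One can be reproduced with a short script. The slides do not state the allocation rule, so this sketch assumes that each round's 100 draws are spread across strata in expectation, in proportion to the stratum propensity times the number still untreated. The function name and its parameters are hypothetical.

```python
import numpy as np

def simulate_strata(n_rounds, per_round=100.0, n_per_stratum=100.0):
    """Deterministic (expected-value) version of the stratified draws."""
    p = np.linspace(0.05, 0.95, 10)        # stratum propensities
    delta = 1000.0 * p                     # stratum treatment effects
    untreated = np.full(p.shape, n_per_stratum)
    treated = np.zeros_like(p)
    summary = []
    for round_no in range(1, n_rounds + 1):
        weights = p * untreated            # assumed selection rule
        drawn = per_round * weights / weights.sum()
        untreated = untreated - drawn
        treated = treated + drawn
        summary.append({
            "round": round_no,
            "TUT": float(delta @ untreated / untreated.sum()),  # effect among untreated
            "MTE": float(delta @ drawn / drawn.sum()),          # effect among newly drawn
            "TT": float(delta @ treated / treated.sum()),       # effect among all treated
        })
    return summary

summary = simulate_strata(n_rounds=5)
for row in summary:
    print(row["round"], round(row["TUT"], 3), round(row["MTE"], 3), round(row["TT"], 3))
```

Under this assumed rule the first two rounds agree with the SUM rows of the Draw 1 and Draw 2 tables, and MTE falls with every further round as the high-propensity strata are depleted.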
35
Simulation Two (Micro)
• A population of 100,000 with 1000 trained per
round.
• Propensity score (P): uniform (0.001 to 0.999)
• Heterogeneous treatment effects:
• $d = 1000P$
• Simple random sampling without stratification.
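
A micro-level version of this setup might look like the sketch below. The selection rule (each round samples 1,000 untreated individuals without replacement, with probability proportional to P) and the number of rounds are assumptions, since the slide does not spell them out. `micro_simulation` is a hypothetical name.

```python
import numpy as np

def micro_simulation(pop_size=100_000, per_round=1_000, n_rounds=50, seed=0):
    """Return the mean effect among the newly treated in each round."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.001, 0.999, size=pop_size)   # propensity scores
    delta = 1000.0 * p                             # individual treatment effects
    untreated = np.ones(pop_size, dtype=bool)
    mte = []
    for _ in range(n_rounds):
        candidates = np.flatnonzero(untreated)
        weights = p[candidates] / p[candidates].sum()   # assumed selection rule
        chosen = rng.choice(candidates, size=per_round, replace=False, p=weights)
        untreated[chosen] = False
        mte.append(float(delta[chosen].mean()))         # effect among the newly treated
    return np.array(mte)

mte_by_round = micro_simulation()
print(mte_by_round[[0, 9, 24, 49]].round(1))
```

Because high-propensity individuals are drawn first and also have the largest effects, the average effect among the newly treated drifts downward as the treated share grows.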

36
Summary: Average Treatment Effects Decrease with
Marginal P.
37
A Small Sample Case of Micro Simulation
38
Discussion
• It is not possible to discuss causal inference at
the individual level.
• Causal inference is possible only at the group
level which requires some sort of homogeneity
assumption.
• Ignorability is unlikely to be true, but needed
for causal inference with observational data
without strong and unverifiable assumptions.

39
Solution
• Even in this ideal situation (with ignorability
assumption being true), causal effects can be
heterogeneous.
• This can be handled with hierarchical models
(Bayesian or not) assuming homogeneous effects
(or structure) within subgroups.
• However --

40
Conclusion 1
• (1) Any estimand (something that is to be
estimated) in causal inference is essentially a
weighted mean by composition.
• (2) There is a composition bias, which is a
form of selection bias (Type II), as we change
the marginal proportion of the population
treated. (Bad news for those looking for
external validity.) We do not need
controversial unobservables for this to happen.

41
Conclusion 2
• Discovering patterns of heterogeneous treatment
effects (under ignorability) is informative to
our understanding of social processes.
• Examples: Xie and Wu (2005), Tsai and Xie (2008),
Brand and Xie (2007), Xie, Perez, and Raudenbush
(in progress).

42
Conclusion 3
• Observed patterns of heterogeneous treatment
effects (under ignorability) can help us question
the ignorability assumption and understand
potential unobserved selection process
• Examples: Xie and Wu (2005), Tsai and Xie (2008),
Brand and Xie (2007), Bruch and Xie (in
progress).

43
Modeling Heterogeneous Treatment Effects AND
Selection
• Heckman's Marginal Treatment Effects (MTE)
approach.
• It is very general, but highly demanding in terms
of richness of data.
• Not only do we need an exclusion restriction, we
also need full support of the exclusion restriction
over the whole range of the latent tendency of
being treated.

44
Marginal Treatment Effects
• Focus on the treatment effects for those who are
at the margin of being treated.
• The term $U_D$ can be interpreted as latent
resistance to participate.
• Originally attributable to Bjorklund and Moffitt
(1987).

45
Usefulness of MTE
• Cornerstone of Heckman's recent work on
heterogeneous treatment effects
• It provides a linkage to LIV and unifies all
other estimands (e.g., Heckman, Urzua, and
Vytlacil 2006).
• Treatment heterogeneity is specified at the level
of the latent tendency/resistance to participate.
• Some homogeneity is still assumed.

46
More Detailed Empirical Example
• Social Selection and Returns to College
Education.
• Collaboration with Jennie Brand, with her as the
first author (Brand and Xie, in progress).

47
Economic or Positive Selection
• Individuals who are most likely to benefit from
college are the most likely to attend college
• Empirical support
• Willis and Rosen (1979)
• Recent series of papers by Heckman and colleagues

48
Social or Negative Selection
• Individuals who are most likely to benefit from
college are the least likely to attend college
• Theory rooted in social stratification research

49
Social Stratification Research
• Education is the main factor in the reproduction
of SES and in upward mobility (Blau & Duncan
1967; Featherman & Hauser 1978)
• Social reproduction theory (Bourdieu 1977; Bowles
and Gintis 1976; Collins 1971; MacLeod 1989)
• Differences in educational attainment by origins
(Mare 1980, 1981, 1995; Hout, Raftery, & Bell
1993; Lucas 2001)
• The higher the level of educational attainment,
the less dependence between origins and
destinations (Yamaguchi 1983; Hout 1988; DiPrete
& Grusky 1990)

50
Hypothetical Model: Origins, Education, and
Destinations
[Figure. Axis labels: socioeconomic origins, socioeconomic destinations. Groups: college-educated workers, less-educated workers. Annotation (shown twice): benefit of a college degree.]
51
Main Argument
• There is no blanket answer to the question of
whether the selection is positive or negative
• The answer depends on what is being controlled.
• With adequate controls for relevant factors, we
may observe positive, rather than negative,
selection.

52
Research Description
• Three panel data sources:
• National Longitudinal Survey of Youth 1979
(NLSY79)
• National Longitudinal Study of the High School Class of 1972
(NLS72)
• Wisconsin Longitudinal Study (WLS57)
• Analyses are separate for men and women.

53
Treatment Effects as a Function of Propensity
Scores: A Hierarchical Linear Model
• Propensity score: $P(X) = P(d = 1 \mid X)$
• Group individuals into propensity score strata
• Level 1: Estimate the treatment effect specific
to balanced propensity score strata
• Level 2: Pool the results and examine the trend
in the variation of effects
• A trend in the heterogeneous effects provides a
clear depiction of the direction of selection
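
The Level 2 step can be sketched as a variance-weighted straight line through the stratum-specific estimates. This is a simplified stand-in for a full hierarchical linear model, and the effect estimates and standard errors below are invented placeholders, not results from the study.

```python
import numpy as np

def trend_in_effects(effects, std_errors):
    """Level 2: variance-weighted linear trend of stratum effects on stratum rank."""
    effects = np.asarray(effects, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    rank = np.arange(1, len(effects) + 1)
    # np.polyfit expects weights of 1/sigma for inverse-variance weighting
    slope, intercept = np.polyfit(rank, effects, deg=1, w=1.0 / std_errors)
    return slope, intercept

# Hypothetical Level 1 output: one effect estimate and standard error per stratum
level1_effects = [0.40, 0.35, 0.30, 0.25, 0.20]
level1_se = [0.10, 0.06, 0.05, 0.06, 0.10]

slope, intercept = trend_in_effects(level1_effects, level1_se)
print(f"slope per stratum: {slope:.3f}, intercept: {intercept:.3f}")
```

A negative slope means estimated effects fall as the propensity to be treated rises, which is the pattern the slides call negative (social) selection. A positive slope would indicate positive (economic) selection.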

54
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLSY79 Men
55
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLS72 Men
56
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
WLS57 Men
57
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLSY79 Women
58
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
NLS72 Women
59
Main Results: College Graduation Treatment Effect
on Log Earnings by Propensity Score Strata,
WLS57 Women
60
Why do some studies find evidence for economic
selection?
• Omitted variable bias

61
Propensity Score Covariates
• Family Background
• Parents' income
• Mother's education
• Father's education
• Intact family
• Number of siblings
• Rural residence
• Proximity to college/univ.
• Race
• Religion
• Ability / IQ
• Class rank / HS grades
• College-prep. program
• Social-Psychological
• Teachers' encouragement
• Parents' encouragement
• Friends' plans

62
Descriptive Statistics: Mental Ability by
Propensity Score Strata, Full Set of Covariates,
WLS57 Men
63
Descriptive Statistics: Mental Ability by
Propensity Score Strata, Small Set of Covariates,
WLS57 Men
[Figure annotations: 29 point difference; 24 point difference]
64
Results, Full Set of Covariates: College
Graduation Treatment Effect on Log Earnings by
Propensity Score Strata, WLS57 Men
65
Results, Small Set of Covariates: College
Graduation Treatment Effect on Earnings by
Propensity Score Strata, WLS57 Men
66
Results, Full Set of Covariates: College
Graduation Treatment Effect on Log Earnings by
Propensity Score Strata, WLS57 Women
67
Results, Small Set of Covariates: College
Graduation Treatment Effect on Earnings by
Propensity Score Strata, WLS57 Women
68
Why is there social selection?
• Forms of Heterogeneity
• Pre-treatment heterogeneity
• Treatment effect heterogeneity
• Heterogeneous treatments
• $(d = 1 \mid p(d = 1 \mid X) = j) \neq (d = 1 \mid p(d = 1 \mid X) = k)$,
where $j \neq k$
• Low propensity students utilize college as a
means for economic mobility

69
Heterogeneous Treatments: College Majors for WLS57
Men
• Low propensity men
• High proportion of majors: Business, Education
• High propensity men
• High proportion of majors: Sciences, Humanities

70
Ratio of Monetary to Non-monetary Importance in
Selecting a Career by Propensity Score Strata,
NLS72 Men
71
Ratio of Monetary to Non-monetary Importance in
Selecting a Career by Propensity Score Strata,
NLS72 Women
72
Value of College by Propensity Score Strata,
WLS57 Men
73
Value of College by Propensity Score Strata,
WLS57 Women
74
Summary
• Main results
• Robust evidence for social selection
• NLSY79, NLS72, and WLS57
• Men and women
• Early, mid-, and late career returns
• Exploratory results
• Why do some prior studies find evidence for
economic selection?
• Omitted variable bias
• Why is there social selection?
• Low propensity students view college as a means
for economic mobility

75
References
• Brand, Jennie, and Yu Xie. 2007. "Who Benefits
Most From College? Evidence for Negative
Selection in Heterogeneous Economic Returns to
Higher Education."
• Rosenbaum, Paul R., and Donald B. Rubin. 1984.
"Reducing Bias in Observational Studies Using
Subclassification on the Propensity Score."
Journal of the American Statistical Association
79: 516-524.
• Tsai, Shu-Ling, and Yu Xie. 2008. "Changes in
Earnings Returns to Higher Education in Taiwan
since the 1990s." Population Review.
• Xie, Yu, Steve Raudenbush, and Tony Perez. In
progress. "Weighting in Causal Inference."
• Xie, Yu, and Xiaogang Wu. 2005. "Market Premium,
Social Process, and Statisticism." American
Sociological Review 70: 865-870.