Instrumental Variables Estimation (with Examples from Criminology)

Slides: 46

Provided by: apel7

Learn more at: http://cega.berkeley.edu

1
Instrumental Variables Estimation (with Examples
from Criminology)

Robert Apel, Ph.D.
School of Criminal Justice
University at Albany

Center for Social and Demographic Analysis
University at Albany
May 5-7, 2009
2
Vital Statistics

Ph.D., Criminology and Criminal Justice, 2004
University of Maryland
Coursework in Department of Economics
Dissertation used instrumental variables
State child labor laws as instrumental variables
for the causal effect of youth employment on
antisocial behavior

3
Topics That Will Be Covered in this Workshop

Why use IV?
Discussion of endogeneity bias
Statistical motivation for IV
What is an IV?
Identification issues
Statistical properties of IV estimators
How is an IV model estimated?
Software and data examples
Diagnostics: IV relevance, IV exogeneity, Hausman

4
Review of the Linear Model

Population model: $Y = \alpha + \beta X + \varepsilon$
Assume that the true slope is positive, so $\beta > 0$
Sample model: $Y = a + bX + e$
Least squares (LS) estimator of $\beta$:
$b_{LS} = (X'X)^{-1}X'Y = \mathrm{Cov}(X,Y) / \mathrm{Var}(X)$
Under what conditions can we speak of $b_{LS}$ as a
causal estimate of the effect of X on Y?

5
Review of the Linear Model

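Key assumption of the linear model:
$E(X'\varepsilon) = \mathrm{Cov}(X,\varepsilon) = E(\varepsilon \mid X) = 0$
Exogeneity assumption: X is uncorrelated with
the unobserved determinants of Y
Important statistical property of the LS
estimator under exogeneity:
$E(b_{LS}) = \beta + \mathrm{Cov}(X,\varepsilon) / \mathrm{Var}(X)$
$\mathrm{plim}(b_{LS}) = \beta + \mathrm{Cov}(X,\varepsilon) / \mathrm{Var}(X)$

Second terms = 0, so $b_{LS}$ is unbiased and consistent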
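
The role of the second term can be checked numerically. The sketch below is an illustration, not part of the original slides: the data-generating values (alpha = 1, beta = 0.5, standard normal draws) and all variable names are assumptions. It uses the fact that, in any sample, the LS slope equals beta plus the sample analogue of Cov(X, e) / Var(X).

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(1)
n, alpha, beta = 20000, 1.0, 0.5   # hypothetical population values

# Exogenous case: the error is drawn independently of X.
x_exog = [rng.gauss(0, 1) for _ in range(n)]
e_exog = [rng.gauss(0, 1) for _ in range(n)]
y_exog = [alpha + beta * x + e for x, e in zip(x_exog, e_exog)]

# Endogenous case: X and the error share an unobserved component w.
w = [rng.gauss(0, 1) for _ in range(n)]
x_endo = [wi + rng.gauss(0, 1) for wi in w]
e_endo = [wi + rng.gauss(0, 1) for wi in w]
y_endo = [alpha + beta * x + e for x, e in zip(x_endo, e_endo)]

b_exog = cov(x_exog, y_exog) / cov(x_exog, x_exog)
b_endo = cov(x_endo, y_endo) / cov(x_endo, x_endo)

# Sample analogue of the second term, Cov(X, e) / Var(X).
term_exog = cov(x_exog, e_exog) / cov(x_exog, x_exog)
term_endo = cov(x_endo, e_endo) / cov(x_endo, x_endo)

print(f"exogenous:  b_LS = {b_exog:.3f}, beta + term = {beta + term_exog:.3f}")
print(f"endogenous: b_LS = {b_endo:.3f}, beta + term = {beta + term_endo:.3f}")
```

In the exogenous case the second term is close to zero, so the slope lands near beta. In the endogenous case it does not vanish, however large the sample.
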
6
Endogeneity and the Evaluation Problem

When is the exogeneity assumption violated?
Measurement error → Attenuation bias
Instantaneous causation → Simultaneity bias
Omitted variables → Selection bias
Selection bias is the problem in observational
research that undermines causal inference
Measurement error and instantaneous causation can
be posed as problems of omitted variables

7
When Is the Exogeneity Assumption Violated?

(1) Measurement error in X (u) that is correlated
with M.E. in Y (v) or with the model error ($\varepsilon$)
Classical M.E. leads to attenuation, $0 < E(b_{LS}) < \beta$,
but non-random M.E. (or correlation between
M.E. and X, Y, V, and/or $\varepsilon$) introduces unknown
biases

And, if there are multiple Xs, bias
contaminates the whole model, not just the
coefficient on the X measured with error (a.k.a.
"smearing")
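
Attenuation under classical measurement error is easy to see in a simulation. The following is a minimal sketch under assumed values (beta = 2, unit-variance true X, unit-variance noise u). None of these numbers come from the slides. With these choices the standard reliability ratio Var(X*) / [Var(X*) + Var(u)] is one half, so the slope on the noisy X should land near half the true slope.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(2)
n, beta = 20000, 2.0   # hypothetical sample size and true slope

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + beta * x + rng.gauss(0, 1) for x in x_true]

# Classical M.E.: the noise u is independent of the true X and of the model error.
x_obs = [x + rng.gauss(0, 1) for x in x_true]

b_true = cov(x_true, y) / cov(x_true, x_true)   # slope using the error-free X
b_obs = cov(x_obs, y) / cov(x_obs, x_obs)       # slope using the noisy X
reliability = cov(x_true, x_true) / cov(x_obs, x_obs)

print(f"slope with true X:  {b_true:.3f}")
print(f"slope with noisy X: {b_obs:.3f} (reliability {reliability:.3f})")
```
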
8
When Is the Exogeneity Assumption Violated?

(2) Instantaneous causation of Y on X
Direction of the bias depends on the sign of
the feedback effect, Y → X
If positive, $E(b_{LS}) > \beta$, so LS overestimates the true
effect
If negative, $E(b_{LS}) < \beta$, so LS underestimates the true
effect and in severe cases can even flip the sign
so that $E(b_{LS}) < 0$ even though $\beta > 0$

This non-recursivity complicates the
relationship between price and quantity in
economics
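
The sign argument can be illustrated with a two-equation system, Y = beta*X + e and X = gamma*Y + u, solved for X and Y before running LS. This is a sketch under assumptions: the feedback parameter gamma, the helper name simulate_ls, and the parameter values are all hypothetical.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate_ls(beta, gamma, n=20000, seed=3):
    """LS slope of Y on X when Y = beta*X + e and X = gamma*Y + u hold jointly."""
    rng = random.Random(seed)
    denom = 1.0 - beta * gamma
    xs, ys = [], []
    for _ in range(n):
        e, u = rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced-form solution of the simultaneous system.
        xs.append((gamma * e + u) / denom)
        ys.append((beta * u + e) / denom)
    return cov(xs, ys) / cov(xs, xs)


beta = 0.5                                      # true effect of X on Y (positive)
b_pos = simulate_ls(beta=beta, gamma=0.5)       # positive feedback Y -> X
b_neg = simulate_ls(beta=beta, gamma=-1.0)      # strong negative feedback Y -> X

print(f"true beta = {beta}")
print(f"LS slope with positive feedback: {b_pos:.3f}")
print(f"LS slope with negative feedback: {b_neg:.3f}")
```

With these values, positive feedback pushes the LS slope above beta, and the strong negative feedback flips its sign.
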
9
When Is the Exogeneity Assumption Violated?

(3) Omitted variable (W) that is correlated with
both X and Y
Classic problem of omitted variables bias
Coefficient on X will absorb the indirect path
through W, whose sign depends on Cov(X,W) and
Cov(W,Y)

Things are more complicated in applied settings
because there are bound to be many Ws, not to
mention that the smearing problem applies in
this context also
10
Example 1: Police Hiring

Measurement error
Mobilization of sworn officers (M.E. in X) as
well as differential victim reporting or crime
recording (M.E. in Y) may be correlated with
police size
Instantaneous causation
More police might be hired during a crime wave
Omitted variables
Large departments may differ in fundamental ways
difficult to measure (e.g., urban, heterogeneous)

11
Example 2: Sanction Perceptions

Measurement error
Measures of perceived sanction risk are probably
noisy (M.E. in X), resulting in attenuation at
best
Instantaneous causation
Perceptions are sensitive to the success/failure
of criminal behavior, so feedback is negative
Omitted variables
Perceived risk probably correlated with
unobserved determinants of crime (e.g.,
intelligence)

12
Example 3: Delinquent Peers

Measurement error
Highly delinquent youth probably overestimate the
delinquency of their peers (M.E. in X), and
likely underestimate their own delinquency (M.E.
in Y)
Instantaneous causation
If there is influence/imitation, then it is
bidirectional
Omitted variables
High-risk youth probably select themselves into
delinquent peer groups (birds of a feather)

13
Regression Estimation: Ignoring Omitted Variables

Suppose we estimate the treatment effect model
$Y = \alpha + \beta X + \varepsilon$
Let's assume without loss of generality that X is
a binary treatment (= 1 if treated; = 0 if
untreated)
Least squares estimator:
$b_{LS} = \mathrm{Cov}(X,Y) / \mathrm{Var}(X) = E(Y \mid X = 1) - E(Y \mid X = 0)$
Simply the difference in means between treated
units (X = 1) and untreated units (X = 0)

14
Regression Estimation: Ignoring Omitted Variables

But suppose the population treatment effect model
is instead
$Y = \alpha + \beta X + (\delta W + \nu)$
Now the residual conveys information about W
Consider a plausible example:
Y = crime, X = marriage, W = marriageability
Marriageability can be broadly construed to
encompass earnings potential, desire for
children, willingness to compromise,
faithfulness, verbal communication skills,...
Including signals that individuals emit about
these qualities

15
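Regression Estimation: Ignoring Omitted Variables

What does LS estimate when W is omitted?
$b_{LS} = C(X,Y)/V(X) + [C(W,Y)/V(W)] \times [C(X,W)/V(X)]$
$= \beta + \delta \, [E(W \mid X = 1) - E(W \mid X = 0)]$
(reading the first two ratios on the right as the partial
effects of X and W on Y, i.e., $\beta$ and $\delta$)
Marriage effect on crime will be overestimated
IMPORTANT: Even if $\beta = 0$, $b_{LS} < 0$, because
$\delta < 0$ and marriageability is higher among the married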

16
Regression Estimation: Ignoring Omitted Variables

So...
$b_{LS} = \beta + \delta \, [E(W \mid X = 1) - E(W \mid X = 0)]$
Estimate of $\beta$ is unbiased if and only if
1. Marriageability is uncorrelated with crime
$\delta = 0$
or...
2. Marriageability is balanced (i.e.,
equivalent) between married and unmarried
subjects
$E(W \mid X = 1) = E(W \mid X = 0)$
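
The decomposition above holds exactly in any sample when the "long" regression includes W, which makes it easy to verify. The sketch below uses made-up data in the spirit of the marriage example: beta = 0, delta = -1, and a selection rule in which high-W individuals are more likely to marry. All of these values and names are assumptions for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(4)
n, alpha, beta, delta = 20000, 1.0, 0.0, -1.0   # hypothetical: marriage has NO effect

W = [rng.gauss(0, 1) for _ in range(n)]                   # marriageability
X = [1 if w + rng.gauss(0, 1) > 0 else 0 for w in W]      # marriage, likelier when W is high
Y = [alpha + beta * x + delta * w + rng.gauss(0, 1) for x, w in zip(X, W)]

# Short regression (W omitted): difference in mean Y, married vs. unmarried.
b_short = cov(X, Y) / cov(X, X)

# Long regression of Y on X and W (closed form for two regressors).
sxx, sww, sxw = cov(X, X), cov(W, W), cov(X, W)
sxy, swy = cov(X, Y), cov(W, Y)
det = sxx * sww - sxw ** 2
b_long = (sww * sxy - sxw * swy) / det   # estimate of beta
d_long = (sxx * swy - sxw * sxy) / det   # estimate of delta

# Imbalance in W between the married and unmarried groups.
imbalance = (mean([w for w, x in zip(W, X) if x == 1])
             - mean([w for w, x in zip(W, X) if x == 0]))

print(f"short-regression slope: {b_short:.3f}")
print(f"beta-hat + delta-hat * imbalance: {b_long + d_long * imbalance:.3f}")
```

Here the short regression reports a sizable negative "marriage effect" even though the true beta is zero, which is the IMPORTANT point on the earlier slide.
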

17
Omitted Variables in Criminological Research

What variables of interest to criminologists are
surely endogenous?
Micro: Employment, education, marriage, military
service, fertility, conviction, family
structure, ...
Macro: Poverty, unemployment rate, collective
efficacy, immigrant concentration, ...
Basically, EVERYTHING!
(I'm sorry to be the one to break it to you)

18
Traditional Strategies to Deal with Omitted
Variables

Randomization (physical control)
Achieves balance (in expectation) on any and all
potential Ws
Control variables are technically unnecessary
Covariate adjustment (statistical control)
Control for potential Ws in a regression model
But...we have no idea how many Ws there are, so
model misspecification is still a real problem
here

19
Quasi-Experimental Strategies to Deal with
Omitted Variables

Difference in differences (fixed-effects model)
Requires panel data
Propensity score matching
Requires a lot of measured background variables
Similar to covariate adjustment, but only the
treated and untreated cases which are "on
support" are utilized
Instrumental variables estimation
Requires an exclusion restriction

20
Instrumental Variables Estimation Is a Viable
Approach

An instrumental variable for X is one solution
to the problem of omitted variables bias

Requirements for Z to be a valid instrument for X:
Relevant: Correlated with X
Exogenous: Not correlated with Y except through its
correlation with X

21
Important Point about Instrumental Variables
Models

I often hear... "A good instrument should not be
correlated with the dependent variable"
WRONG!!!
Z has to be correlated with Y, otherwise it is
useless as an instrument
It can only be correlated with Y through X
A good instrument must not be correlated with the
unobserved determinants of Y

22
Important Point about Instrumental Variables
Models

Not all of the available variation in X is used
Only that portion of X which is explained by Z
is used to explain Y

[Diagram legend] X = Endogenous variable; Y = Response
variable; Z = Instrumental variable
23
Important Point about Instrumental Variables
Models
Best-case scenario: A lot of X is explained by
Z, and most of the overlap between X and Y is
accounted for
Realistic scenario: Very little of X is
explained by Z, or what is explained does not
overlap much with Y
24
Important Point about Instrumental Variables
Models

The IV estimator is BIASED
In other words, $E(b_{IV}) \neq \beta$ (finite-sample bias)
The appeal of IV derives from its consistency
Consistency is a way of saying that $b_{IV} \to \beta$
(in probability) as $N \to \infty$
So... IV studies often have very large samples
But with endogeneity, $E(b_{LS}) \neq \beta$ and
$\mathrm{plim}(b_{LS}) \neq \beta$ anyway
Asymptotic behavior of IV:
$\mathrm{plim}(b_{IV}) = \beta + \mathrm{Cov}(Z,\varepsilon) / \mathrm{Cov}(Z,X)$
If Z is truly exogenous, then $\mathrm{Cov}(Z,\varepsilon) = 0$
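
The asymptotic expression has an exact sample counterpart: the IV estimate minus beta equals the sample Cov(Z, e) / Cov(Z, X). The sketch below uses that identity to show how a small violation of exogeneity is magnified when the instrument is weak. The helper name iv_gap, the "leak" parameter that makes Z endogenous, and all numeric values are assumptions for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def iv_gap(first_stage, leak, n=40000, seed=5, beta=0.5):
    """Return (b_IV - beta, sample Cov(Z, err) / Cov(Z, X)) for simulated data."""
    rng = random.Random(seed)
    Z, X, Y, ERR = [], [], [], []
    for _ in range(n):
        w = rng.gauss(0, 1)                    # unobserved determinant of Y
        z = rng.gauss(0, 1) + leak * w         # leak != 0 makes Z endogenous
        x = first_stage * z + w + rng.gauss(0, 1)
        err = w + rng.gauss(0, 1)              # structural error, contains w
        Z.append(z)
        X.append(x)
        ERR.append(err)
        Y.append(1.0 + beta * x + err)
    b_iv = cov(Z, Y) / cov(Z, X)
    return b_iv - beta, cov(Z, ERR) / cov(Z, X)


valid = iv_gap(first_stage=1.0, leak=0.0)    # exogenous, strong instrument
strong = iv_gap(first_stage=1.0, leak=0.1)   # slightly endogenous, strong
weak = iv_gap(first_stage=0.1, leak=0.1)     # slightly endogenous, weak

for label, (gap, term) in (("valid", valid), ("strong", strong), ("weak", weak)):
    print(f"{label:>6}: b_IV - beta = {gap:.3f}, Cov(Z,e)/Cov(Z,X) = {term:.3f}")
```
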

25
Instrumental Variables Terminology

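Three different models to be familiar with
First stage: $X = \alpha_0 + \alpha_1 Z + \eta$
Structural model: $Y = \beta_0 + \beta_1 X + \varepsilon$
Reduced form: $Y = \delta_0 + \delta_1 Z + \xi$
An interesting equality:
$\delta_1 = \alpha_1 \times \beta_1$
so
$\beta_1 = \delta_1 / \alpha_1$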
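
The ratio of the reduced-form slope to the first-stage slope can be computed directly. In this sketch, a1 and d1 stand for the estimated first-stage and reduced-form slopes. The data-generating values (first-stage slope 0.8, structural slope 0.5, an unobserved confounder W) are assumptions chosen for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(6)
n, beta1 = 20000, 0.5   # hypothetical sample size and structural slope

Z = [rng.gauss(0, 1) for _ in range(n)]                      # instrument
W = [rng.gauss(0, 1) for _ in range(n)]                      # unobserved confounder
X = [0.8 * z + w + rng.gauss(0, 1) for z, w in zip(Z, W)]    # first stage
Y = [1.0 + beta1 * x + w + rng.gauss(0, 1) for x, w in zip(X, W)]

a1 = cov(Z, X) / cov(Z, Z)   # first-stage slope (X on Z)
d1 = cov(Z, Y) / cov(Z, Z)   # reduced-form slope (Y on Z)
b_ratio = d1 / a1            # structural slope recovered as d1 / a1

b_iv = cov(Z, Y) / cov(Z, X)   # same number, written as Cov(Z,Y) / Cov(Z,X)
b_ls = cov(X, Y) / cov(X, X)   # LS slope, contaminated by W

print(f"first stage a1 = {a1:.3f}, reduced form d1 = {d1:.3f}")
print(f"d1 / a1 = {b_ratio:.3f}, LS slope = {b_ls:.3f}, true beta1 = {beta1}")
```
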

26
Different Types of Instrumental Variables
Estimators

Wald estimator for binary instrument:
$b_{Wald} = [E(Y \mid Z = 1) - E(Y \mid Z = 0)] / [E(X \mid Z = 1) - E(X \mid Z = 0)]$
Difference in response / Difference in treatment
Instrumental variables (IV) estimator:
$b_{IV} = (Z'X)^{-1}Z'Y = \mathrm{Cov}(Z,Y) / \mathrm{Cov}(Z,X)$
Shows that $b_{IV}$ can be recovered from two samples
Two-stage least squares (2SLS) estimator:
$b_{2SLS} = (\hat{X}'\hat{X})^{-1}\hat{X}'Y = \mathrm{Cov}(\hat{X},Y) / \mathrm{Var}(\hat{X})$
$\hat{X}$ represents the fitted value from the first-stage
model

27
Different Types of Instrumental Variables
Estimators

Single binary instrument and no control
variables...
$b_{Wald} = b_{IV} = b_{2SLS}$
Single instrument (binary or continuous) with or
without control variables...
$b_{IV} = b_{2SLS}$
Multiple instruments (binary or continuous) with
or without control variables...
$b_{2SLS}$
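
The first equivalence (single binary instrument, no controls) is an exact algebraic identity and can be confirmed on any data set. The sketch below computes the three estimators separately on simulated data. The simulated design and the helper group_mean are assumptions for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def group_mean(values, z, level):
    """Mean of `values` among observations whose instrument equals `level`."""
    return mean([v for v, zi in zip(values, z) if zi == level])


rng = random.Random(7)
n = 5000
Z = [rng.randint(0, 1) for _ in range(n)]                    # binary instrument
W = [rng.gauss(0, 1) for _ in range(n)]                      # unobserved confounder
X = [0.6 * z + w + rng.gauss(0, 1) for z, w in zip(Z, W)]
Y = [1.0 + 0.5 * x + w + rng.gauss(0, 1) for x, w in zip(X, W)]

# Wald: difference in response divided by difference in treatment.
b_wald = ((group_mean(Y, Z, 1) - group_mean(Y, Z, 0))
          / (group_mean(X, Z, 1) - group_mean(X, Z, 0)))

# IV: ratio of covariances.
b_iv = cov(Z, Y) / cov(Z, X)

# 2SLS: regress X on Z, then regress Y on the fitted values.
a1 = cov(Z, X) / cov(Z, Z)
a0 = mean(X) - a1 * mean(Z)
X_hat = [a0 + a1 * z for z in Z]
b_2sls = cov(X_hat, Y) / cov(X_hat, X_hat)

print(f"Wald = {b_wald:.6f}, IV = {b_iv:.6f}, 2SLS = {b_2sls:.6f}")
```
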

28
More on the Method of Two-Stage Least Squares
(2SLS)

Step 1: $X = a_0 + a_1 Z_1 + a_2 Z_2 + \cdots + a_k Z_k + u$
Obtain fitted values ($\hat{X}$) from the first-stage
model
Step 2: $Y = b_0 + b_1 \hat{X} + e$
Substitute the fitted $\hat{X}$ in place of the original
X
Note: If done manually in two stages, the
standard errors are based on the wrong residual,
$e = Y - b_0 - b_1 \hat{X}$, when it should be
$e = Y - b_0 - b_1 X$
Best to just let the software do it for you
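
To see what the note about residuals means in practice, the sketch below runs the two steps by hand with a single instrument and then computes the slope's standard error twice: once from the second-stage residual (using the fitted X) and once from the residual built with the original X. The simulated design, the helper slope_se, and the simple n - 2 degrees-of-freedom choice are assumptions. Packaged 2SLS routines may apply other small-sample adjustments.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(8)
n, beta = 20000, 0.5   # hypothetical sample size and structural slope
Z = [rng.gauss(0, 1) for _ in range(n)]
W = [rng.gauss(0, 1) for _ in range(n)]                      # unobserved confounder
X = [z + w + rng.gauss(0, 1) for z, w in zip(Z, W)]
Y = [1.0 + beta * x + w + rng.gauss(0, 1) for x, w in zip(X, W)]

# Step 1: first-stage regression and fitted values.
a1 = cov(Z, X) / cov(Z, Z)
a0 = mean(X) - a1 * mean(Z)
X_hat = [a0 + a1 * z for z in Z]

# Step 2: regress Y on the fitted values.
b1 = cov(X_hat, Y) / cov(X_hat, X_hat)
b0 = mean(Y) - b1 * mean(X_hat)

sst_xhat = cov(X_hat, X_hat) * (n - 1)   # total sum of squares of fitted X


def slope_se(resid):
    """Standard error of the slope given a residual vector."""
    sigma2 = sum(r * r for r in resid) / (n - 2)
    return math.sqrt(sigma2 / sst_xhat)


se_naive = slope_se([y - b0 - b1 * xh for y, xh in zip(Y, X_hat)])   # wrong residual
se_2sls = slope_se([y - b0 - b1 * x for y, x in zip(Y, X)])          # correct residual

print(f"b1 = {b1:.3f}, naive s.e. = {se_naive:.4f}, corrected s.e. = {se_2sls:.4f}")
```

In this particular design the naive standard error comes out too large. With a different sign pattern between the structural and first-stage errors it could just as well be too small, which is why the manual shortcut cannot be trusted in either direction.
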

29
Including Control Variables in an IV/2SLS Model

Control variables (Ws) should be entered into
the model at both stages
First stage: $X = a_0 + a_1 Z + a_2 W + u$
Second stage: $Y = b_0 + b_1 \hat{X} + b_2 W + e$
Control variables are considered instruments,
they are just not excluded instruments
They serve as their own instrument
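
A small simulation can show why the control belongs in both stages. In the sketch below the instrument Z is correlated with an observed control W that also affects Y, so an IV estimate that ignores W is off target, while 2SLS with W in both stages is not. The helper ols2 (a closed-form two-regressor LS fit) and every numeric value are assumptions for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def ols2(y, x1, x2):
    """Intercept and slopes from an LS regression of y on x1 and x2."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


rng = random.Random(9)
n, beta = 20000, 0.5   # hypothetical sample size and structural slope
W = [rng.gauss(0, 1) for _ in range(n)]            # observed control
U = [rng.gauss(0, 1) for _ in range(n)]            # unobserved confounder
Z = [0.5 * w + rng.gauss(0, 1) for w in W]         # instrument, correlated with W
X = [0.8 * z + 0.5 * w + u + rng.gauss(0, 1) for z, w, u in zip(Z, W, U)]
Y = [1.0 + beta * x + 1.0 * w + u + rng.gauss(0, 1) for x, w, u in zip(X, W, U)]

# First stage: X on Z and W; keep the fitted values.
a0, a1, a2 = ols2(X, Z, W)
X_hat = [a0 + a1 * z + a2 * w for z, w in zip(Z, W)]

# Second stage: Y on fitted X and W.
_, b1, b2 = ols2(Y, X_hat, W)

# For comparison: simple IV that leaves W out of both stages.
b_no_control = cov(Z, Y) / cov(Z, X)

print(f"2SLS with W in both stages: {b1:.3f}")
print(f"IV ignoring W:              {b_no_control:.3f} (true beta = {beta})")
```
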

30
Functional Form Considerations with IV/2SLS

Binary endogenous regressor (X)
Consistency of second-stage estimates does not
hinge on getting the first-stage functional form
correct
Binary response variable (Y)
IV probit (or logit) is feasible but is
technically unnecessary
In both cases, linear model is tractable, easily
interpreted, and consistent
Although variance adjustment is well advised

31
Functional Form Considerations with IV/2SLS

Quadratic second stage with a continuous
endogenous regressor
Entering first-stage fitted values and their
square into second-stage model leads to
inconsistency
The square of a linear projection is not
equivalent to a linear projection on a quadratic
Squares and cross-products of IVs should be
treated as additional instruments
Kelejian (1971)
Linear and squared Xs are treated as two
different endogenous regressors

32
Technical Conditions Required for Model
Identification

Order condition: At least the same # of IVs as
endogenous Xs
Just-identified model: # IVs = # Xs
Overidentified model: # IVs > # Xs
Rank condition: At least one IV must be
significant in the first-stage model
Number of linearly independent columns in a
matrix
$E(X \mid Z,W)$ cannot be perfectly correlated with
$E(X \mid W)$

33
Statistical Inference with IV

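Variance estimation:
$\sigma^2_{\beta_{LS}} = \sigma^2_{\varepsilon} / SST_X$
$\sigma^2_{\beta_{IV}} = \sigma^2_{\varepsilon} / (SST_X \times R^2_{X,Z})$
where
$\varepsilon = Y - \beta_0 - \beta_1 X$
NOTICE: Because $R^2_{X,Z} < 1 \Rightarrow \sigma_{\beta_{IV}} > \sigma_{\beta_{LS}}$
IV standard errors tend to be large, especially
when $R^2_{X,Z}$ is very small, which can lead to Type
II errors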
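
Holding the error variance and SST of X fixed, the two variance formulas imply that the IV standard error is the LS standard error divided by the square root of the first-stage R-squared. A short sketch with made-up inputs (error variance 4, SST 100) shows how quickly the penalty grows. The function names and numbers are hypothetical.

```python
import math


def se_ls(sigma2_e, sst_x):
    """LS standard error of the slope: sqrt(sigma^2 / SST_X)."""
    return math.sqrt(sigma2_e / sst_x)


def se_iv(sigma2_e, sst_x, r2_xz):
    """IV standard error of the slope: sqrt(sigma^2 / (SST_X * R^2_{X,Z}))."""
    return math.sqrt(sigma2_e / (sst_x * r2_xz))


sigma2_e, sst_x = 4.0, 100.0   # hypothetical error variance and SST of X

for r2 in (0.50, 0.10, 0.01):
    ratio = se_iv(sigma2_e, sst_x, r2) / se_ls(sigma2_e, sst_x)
    print(f"R2 of X on Z = {r2:.2f}: IV s.e. is {ratio:.1f} times the LS s.e.")
```
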

34
Instrumental Variables and Randomized Experiments

Imperfect compliance in randomized trials
Some individuals assigned to treatment group will
not receive Tx, and some assigned to control
group will receive Tx
Assignment error; subject refusal; investigator
discretion
Some individuals who receive Tx will not change
their behavior, and some who do not receive Tx
will change their behavior
A problem in randomized job training studies and
other social experiments (e.g., housing vouchers)

35
Instrumental Variables and Randomized Experiments

Two different measures of treatment (X)
Treatment assigned: Exogenous
Intention-to-treat (ITT) analysis
Reduced-form model: $Y = \delta_0 + \delta_1 Z + \xi$
Often leads to underestimation of treatment
effect
Treatment delivered: Endogenous
Individuals who do not comply probably differ in
ways that can undermine the study
Self-selection → bias and inconsistency
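
The contrast between the two treatment measures can be simulated. The sketch below assumes a hypothetical trial with one-sided noncompliance: high-risk subjects assigned to treatment refuse it, and nobody in the control group receives it. The constant treatment effect of -1 is also an assumption. It compares the ITT estimate, the as-delivered LS estimate, and the IV estimate that uses assignment as the instrument.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(11)
n, effect = 40000, -1.0   # hypothetical sample size and true treatment effect

Z, X, Y = [], [], []
for _ in range(n):
    risk = rng.gauss(0, 1)                          # unobserved risk, raises Y
    z = rng.randint(0, 1)                           # random assignment
    x = 1 if (z == 1 and risk <= 0.5) else 0        # high-risk subjects refuse treatment
    Z.append(z)
    X.append(x)
    Y.append(1.0 + effect * x + risk + rng.gauss(0, 1))

itt = cov(Z, Y) / cov(Z, Z)            # reduced form: Y on treatment assigned
first_stage = cov(Z, X) / cov(Z, Z)    # share of assigned subjects actually treated
as_delivered = cov(X, Y) / cov(X, X)   # LS of Y on treatment delivered
b_iv = itt / first_stage               # IV: assignment instruments for delivery

print(f"ITT = {itt:.3f}, as delivered = {as_delivered:.3f}, IV = {b_iv:.3f}")
```

In this setup the ITT estimate understates the effect, the as-delivered comparison overstates it because the treated are a low-risk subset, and the IV estimate lands near the assumed effect.
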

36
Angrist (2006), J.E.C.

Minneapolis D.V. experiment
Sherman and Berk (1984)
Cases of male-on-female misdemeanor assault in
two high-density precincts, in which both parties
present at scene
Random assignment of arrest-mediation-separation
But...treatment assigned was not treatment
delivered
Fidelity vis-à-vis arrest, but many subjects
(25%) assigned to mediation/separation were
arrested
Upgrading was more likely when suspect was
rude, suspect assaulted officer, weapons were
involved, victim persistently demanded arrest,
and incident violated restraining order

37
Angrist (2006), J.E.C.
38
Angrist (2006), J.E.C.

Estimates of effect of arrest (vs. mediate or
separate) on D.V. recidivism (Tables 2, 3)
OLS: b = -.070 (s.e. = .038)
ITT: b = -.108 (s.e. = .041)
2SLS: b = -.140 (s.e. = .053)
Deterrent effect of arrest is twice as large in
2SLS as opposed to OLS
In this context, the 2SLS estimate is known as a local
average treatment effect (I'll come back to this)

39
Sexton and Hebel (1984), J.A.M.A.

Maternal smoking and birth weight
Sexton and Hebel (1984)
Sample of pregnant women who were confirmed
smokers, recruited from prenatal care registrants
At least 10 cigarettes per day and not past 18th
week
Random assignment of staff assistance in a
smoking cessation program
Personal visits; telephone and mail contacts
But...some smokers in treatment group did not
quit and some smokers in control group did quit

40
Sexton and Hebel (1984), J.A.M.A.
41
Sexton and Hebel (1984), J.A.M.A.
(1) First-stage model: Mean cigarettes smoked
Treatment = 6.4; Control = 12.8
First-stage effect: $b_{FS} = -6.4$
(2) Reduced-form model: Mean birth weight
Treatment = 3,278 g; Control = 3,186 g
Reduced-form effect: $b_{RF} = 92$ g
(3) Structural model: Effect of smoking frequency
on mean birth weight
$b_{IV} = 92 / (-6.4) = -14.4$ g
Each cigarette reduces birth weight by
14.4 grams
42
Sexton and Hebel (1984), J.A.M.A.

As an interesting aside, it's also possible to
estimate the effect of continuing smoking (vs.
quitting) from the data
First stage: $b_{FS} = -0.23$ (57% vs. 80% smokers)
Reduced form: $b_{RF} = 92$ g
Structural: $b_{IV} = 92 / (-0.23) = -400$ g
Women who were still smoking in the 8th month of
pregnancy bore children who were 400 grams
lighter, on average
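
Both structural estimates from the smoking trial are Wald ratios of group-mean differences, so they can be reproduced from the reported means alone. The sketch below uses only the figures quoted on the slides. The function and variable names are illustrative.

```python
def wald(reduced_form, first_stage):
    """Wald estimate: difference in response / difference in treatment."""
    return reduced_form / first_stage


# Treatment-minus-control differences, from the reported group means.
rf_birth_weight = 3278 - 3186    # grams
fs_cigarettes = 6.4 - 12.8       # mean cigarettes smoked per day
fs_still_smoking = 0.57 - 0.80   # share still smoking

per_cigarette = wald(rf_birth_weight, fs_cigarettes)
continued_smoking = wald(rf_birth_weight, fs_still_smoking)

print(f"effect per cigarette: {per_cigarette:.1f} g")
print(f"effect of continuing to smoke: {continued_smoking:.0f} g")
```

The same 92-gram reduced-form difference supports both estimates. Only the denominator changes, depending on how the treatment is measured.
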

43
Permutt and Hebel (1989), Biometrics

Estimates of the effect of smoking frequency (in
8th month) on birth weight
OLS: b = -2 g (s.e. not reported)
2SLS: b = -14 g (s.e. = 7 g)
Here as well, 2SLS yields the local average
treatment effect of smoking on birth weight

44
Instrumental Variables and Local Average
Treatment Effects

Definition of a L.A.T.E.
The average treatment effect for individuals who
can be induced to change treatment status by a
change in the instrument
Imbens and Angrist (1994, p. 470)
The average causal effect of X on Y for
"compliers," as opposed to "always takers" or
"never takers"
Not a particularly well-defined (sub)population
L.A.T.E. is instrument-dependent, in contrast to
the population A.T.E.
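
When treatment effects differ across these groups, the Wald/IV estimate tracks the compliers only. The sketch below builds a hypothetical population with no defiers: 50% compliers with an effect of -1, 20% always takers and 30% never takers, each with an effect of -3. The shares and effects are assumptions chosen so that the complier effect and the population average effect are clearly different.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(13)
n = 40000
# Hypothetical treatment effects by compliance type (no defiers).
effect_by_type = {"complier": -1.0, "always-taker": -3.0, "never-taker": -3.0}
types = list(effect_by_type)
shares = [0.5, 0.2, 0.3]

Z, X, Y, effects = [], [], [], []
for _ in range(n):
    kind = rng.choices(types, weights=shares)[0]
    z = rng.randint(0, 1)                      # randomized binary instrument
    if kind == "always-taker":
        x = 1
    elif kind == "never-taker":
        x = 0
    else:
        x = z                                  # compliers follow the instrument
    Z.append(z)
    X.append(x)
    Y.append(effect_by_type[kind] * x + rng.gauss(0, 1))
    effects.append(effect_by_type[kind])

b_wald = cov(Z, Y) / cov(Z, X)   # IV / Wald estimate
ate = mean(effects)              # population average treatment effect

print(f"IV estimate = {b_wald:.3f} (complier effect = -1.0), ATE = {ate:.3f}")
```

A different instrument would move a different set of people into treatment and could therefore return a different L.A.T.E. from the same population.
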

45
L.A.T.E. in the Previous Two Examples

In the D.V. study...
For men who were arrested as per the experimental
protocol, arrest resulted in a mean 14-point
decline in the probability of recidivism compared
to non-arrest interventions
In the maternal smoking study...
For women who reduced their smoking frequency
because they were assigned to the intervention,
each one-cigarette reduction resulted in a
14-gram increase in birth weight (from mean 11
cigarettes)