Module 3: Impact Evaluation for TTLs

1
Module 3: Impact Evaluation for TTLs
  • Paul J. Gertler
  • Chief Economist, HDN
  • Sebastian Martinez
  • Impact Evaluation Cluster, AFTRL
  • HD Learning Week
  • Washington DC
  • November 2006

Slides by Paul Gertler and Sebastian Martinez
2
Measuring Impact
  • What makes a good impact evaluation?

3
Motivation
  • Traditional M&E
  • Is the program being implemented as designed?
  • Could the operations be more efficient?
  • Are the benefits getting to those intended?
  • Monitoring trends
  • Are indicators moving in the right direction?
  • → NO inherent causality
  • Impact Evaluation
  • What was the effect of the program on outcomes?
  • Because of the program, are people better off?
  • What would happen if we changed the program?
  • → Causality

4
Motivation
  • Objective in evaluation is to estimate the CAUSAL
    effect of intervention X on outcome Y
  • What is the effect of a cash transfer on
    household consumption?
  • For causal inference we must understand the data
    generation process
  • For impact evaluation, this means understanding
    the behavioral process that generates the data
  • how benefits are assigned

5
Causation versus Correlation
  • Recall: correlation is NOT causation
  • Necessary but not sufficient condition
  • Correlation: X and Y are related
  • Change in X is related to a change in Y
  • And...
  • A change in Y is related to a change in X
  • Causation: if we change X, how much does Y change?
  • A change in X is related to a change in Y
  • Not necessarily the other way around

6
Causation versus Correlation
  • Three criteria for causation
  • Independent variable precedes the dependent
    variable.
  • Independent variable is related to the dependent
    variable.
  • There are no third variables that could explain
    why the independent variable is related to the
    dependent variable
  • External validity
  • Generalizability: whether the causal inference
    generalizes outside the sample population or setting

7
Motivation
  • The word cause is not in the vocabulary of
    standard probability theory.
  • Probability theory: two events are mutually
    correlated, or dependent → if we find one, we can
    expect to encounter the other.
  • Example: age and income
  • For impact evaluation, we supplement the language
    of probability with a vocabulary for causality.

8
Statistical Analysis & Impact Evaluation
  • Statistical analysis: typically involves
    inferring the causal relationship between X and Y
    from observational data
  • Many challenges & complex statistics
  • Impact Evaluation
  • Retrospectively
  • same challenges as statistical analysis
  • Prospectively
  • we generate the data ourselves through the
    program's design → evaluation design
  • makes things much easier!

9
How to assess impact
  • What is the effect of a cash transfer on
    household consumption?
  • Formally, program impact is
  • $\alpha = (Y \mid P=1) - (Y \mid P=0)$
  • Compare the same individual with & without the program
    at the same point in time
  • So what's the problem?

10
Solving the evaluation problem
  • Problem: we never observe the same individual
    with and without program at same point in time
  • Need to estimate what would have happened to the
    beneficiary if he or she had not received
    benefits
  • Counterfactual: what would have
    happened without the program
  • Difference between treated observation and
    counterfactual is the estimated impact
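
As a minimal sketch of the evaluation problem, the snippet below writes down both potential outcomes for a few households and then shows what an evaluator actually observes. All numbers and field names (`y1`, `y0`, `treated`) are hypothetical and chosen only for illustration.

```python
# Hypothetical potential outcomes for four households (illustrative numbers only).
# y1: consumption with the program (P=1); y0: consumption without it (P=0).
households = [
    {"id": 1, "y1": 120.0, "y0": 100.0, "treated": True},
    {"id": 2, "y1": 95.0, "y0": 80.0, "treated": True},
    {"id": 3, "y1": 150.0, "y0": 140.0, "treated": False},
    {"id": 4, "y1": 110.0, "y0": 90.0, "treated": False},
]


def true_impact(h):
    """Individual impact: (Y | P=1) - (Y | P=0) for the same household."""
    return h["y1"] - h["y0"]


def observed_outcome(h):
    """In real data only one of the two potential outcomes is ever seen."""
    return h["y1"] if h["treated"] else h["y0"]


true_impacts = [true_impact(h) for h in households]
observed = [observed_outcome(h) for h in households]

print("true impacts (never observable):", true_impacts)
print("observed outcomes:", observed)
```

The true impacts can be computed here only because the data are made up; with real data the unobserved potential outcome has to be estimated from a counterfactual group.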

11
Finding a good counterfactual
  • The treated observation and the counterfactual
  • have identical factors/characteristics, except
    for benefiting from the intervention
  • No other explanations for differences in outcomes
    between the treated observation and
    counterfactual
  • The only reason for the difference in
    outcomes is due to the intervention

12
Measuring Impact
  • Tool belt of Impact Evaluation Design Options
  • Randomized Experiments
  • Quasi-experiments
  • Regression Discontinuity
  • Difference in difference (panel data)
  • Other (using Instrumental Variables, matching,
    etc.)
  • In all cases, these will involve knowing the rule
    for assigning treatment

13
Choosing your design
  • For impact evaluation, we will identify the
    best possible design given the operational
    context
  • Best possible design is the one that has the
    fewest risks for contamination
  • Omitted Variables (biased estimates)
  • Selection (results not generalizable)

14
Case Study
  • Effect of cash transfers on consumption
  • Estimate impact of cash transfer on consumption
    per capita
  • Make sure
  • Cash transfer comes before change in consumption
  • Cash transfer is correlated with consumption
  • Cash transfer is the only thing changing
    consumption
  • Example based on Oportunidades

15
Oportunidades
  • National anti-poverty program in Mexico (1997)
  • Cash transfers and in-kind benefits conditional
    on school attendance and health care visits.
  • Transfer given preferably to mother of
    beneficiary children.
  • Large program with large transfers
  • 5 million beneficiary households in 2004
  • Large transfers, capped at
  • 95 USD for HH with children through junior high
  • 159 USD for HH with children in high school

16
Oportunidades Evaluation
  • Phasing in of intervention
  • 50,000 eligible rural communities
  • Random sample of 506 eligible communities in 7
    states - evaluation sample
  • Random assignment of benefits by community
  • 320 treatment communities (14,446 households)
  • First transfers distributed April 1998
  • 186 control communities (9,630 households)
  • First transfers November 1999

17
Oportunidades Example
18
Counterfeit Counterfactual Number 1
  • Before and after
  • Assume we have data on
  • Treatment households before the cash transfer
  • Treatment households after the cash transfer
  • Estimate impact of cash transfer on household
    consumption
  • Compare consumption per capita before the
    intervention to consumption per capita after the
    intervention
  • Difference in consumption per capita between the
    two periods is taken as the treatment effect

19
Case 1: Before and After
  • Compare Y before and after intervention
  • $\alpha_i = (CPC_{i,t} \mid T=1) - (CPC_{i,t-1} \mid T=0)$
  • Estimate of counterfactual
  • $(CPC_{i,t} \mid T=0) = (CPC_{i,t-1} \mid T=0)$
  • Impact = A - B

(Figure: CPC plotted against time; B = before, at t-1; A = after, at t)
20
Case 1: Before and After
21
Case 1: Before and After
  • Compare Y before and after intervention
  • $\alpha_i = (CPC_{i,t} \mid T=1) - (CPC_{i,t-1} \mid T=0)$
  • Estimate of counterfactual
  • $(CPC_{i,t} \mid T=0) = (CPC_{i,t-1} \mid T=0)$
  • Impact = A - B
  • Does not control for time varying factors
  • Recession: Impact = A - C
  • Boom: Impact = A - D

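(Figure: CPC plotted against time; observed points B (before, at t-1) and A (after, at t), with hypothetical counterfactual points C? and D? at t)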
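
A small worked example of why before-and-after is a counterfeit counterfactual, under assumed numbers: consumption would have fallen during a recession even without the transfer, so comparing A with B understates the impact. The values of `cpc_before`, `secular_change` and `true_effect` are hypothetical.

```python
def before_after_estimate(cpc_before, cpc_after):
    """Counterfeit estimate: assumes nothing else changed between t-1 and t."""
    return cpc_after - cpc_before


# Hypothetical values, chosen only to illustrate the recession case.
cpc_before = 200.0      # B: consumption per capita at t-1
true_effect = 30.0      # assumed true impact of the transfer
secular_change = -20.0  # recession: change that would have happened anyway

counterfactual = cpc_before + secular_change  # C: consumption at t without transfer
cpc_after = counterfactual + true_effect      # A: observed consumption at t

naive_impact = before_after_estimate(cpc_before, cpc_after)  # A - B
correct_impact = cpc_after - counterfactual                  # A - C

print("before-after estimate (A - B):", naive_impact)
print("impact against true counterfactual (A - C):", correct_impact)
```

In a boom the sign of the error flips: the before-after comparison would overstate the impact.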
22
Counterfeit Counterfactual Number 2
  • Enrolled/Not Enrolled
  • Voluntary enrollment in the program
  • Assume we have a cross-section of
    post-intervention data on
  • Households that did not enroll
  • Households that enrolled
  • Estimate impact of cash transfer on household
    consumption
  • Compare consumption per capita of those who did
    not enroll to consumption per capita of those who
    enrolled
  • Difference in consumption per capita between the
    two groups is taken as the treatment effect

23
Case 2: Enrolled/Not Enrolled
24
Those who did not enroll.
  • Impact estimate: $\alpha_i = (Y_{i,t} \mid P=1) - (Y_{j,t} \mid P=0)$
  • Counterfactual: $(Y_{j,t} \mid P=0) \neq (Y_{i,t} \mid P=0)$
  • Examples
  • Those who choose not to enroll in program
  • Those who were not offered the program
  • Conditional Cash Transfer
  • Job Training program
  • Cannot control for all reasons why some choose to
    sign up and others didn't
  • Reasons could be correlated with outcomes
  • We can control for observables...
  • But are still left with the unobservables
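
The role of unobservables can be illustrated with a small simulation. This is a sketch under assumed values: an unobserved trait (called `motivation` here, a hypothetical name) raises both the chance of enrolling and the outcome itself, so the enrolled/not-enrolled comparison overstates the assumed true effect.

```python
import random

TRUE_EFFECT = 5.0         # assumed true impact of the program
rng = random.Random(0)    # fixed seed so the sketch is reproducible

enrolled, not_enrolled = [], []
for _ in range(10000):
    motivation = rng.gauss(0, 1)                 # unobserved by the evaluator
    enrolls = motivation + rng.gauss(0, 1) > 0   # self-selection into the program
    outcome = 100 + 10 * motivation + rng.gauss(0, 1)
    if enrolls:
        outcome += TRUE_EFFECT
        enrolled.append(outcome)
    else:
        not_enrolled.append(outcome)


def mean(values):
    return sum(values) / len(values)


# Counterfeit counterfactual: non-enrolled households stand in for the enrolled.
naive_estimate = mean(enrolled) - mean(not_enrolled)

print("assumed true effect:", TRUE_EFFECT)
print("enrolled minus not enrolled:", round(naive_estimate, 2))
```

The gap between the two printed numbers is selection bias: it comes from differences in motivation, not from the program.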

25
Impact Evaluation Example: Two counterfeit
counterfactuals
  • What is going on??
  • Which of these do we believe?
  • Problem with Before-After
  • Cannot control for other time-varying factors
  • Problem with Enrolled-Not Enrolled
  • Do not know why the treated are treated and the
    others not

26
Possible Solutions
  • We need to understand the data generation process
  • How beneficiaries are selected and how benefits
    are assigned
  • Guarantee comparability of treatment and control
    groups, so ONLY difference is the intervention

27
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (diff in diff)
  • Other options

28
Choosing the methodology...
  • Choose the most robust strategy that fits the
    operational context
  • Use program budget and capacity constraints to
    choose a design, e.g. pipeline
  • Universe of eligible individuals typically larger
    than available resources at a single point in
    time
  • Fairest and most transparent way to assign
    benefit may be to give all an equal chance of
    participating → randomization

29
Randomization
  • The gold standard in impact evaluation
  • Give each eligible unit the same chance of
    receiving treatment
  • Lottery for who receives benefit
  • Lottery for who receives benefit first

30
(Figure: Population → randomization → Sample → randomization → Treatment Group and Control Group)
31
External & Internal Validity
  • The purpose of the first stage is to ensure that
    the results in the sample will represent the
    results in the population within a defined level
    of sampling error (external validity).
  • The purpose of the second stage is to ensure that
    the observed effect on the dependent variable is
    due to some aspect of the treatment rather than
    other confounding factors (internal validity).
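
The two stages can be sketched as two lotteries. The community counts below are the ones quoted for the Oportunidades evaluation sample earlier in the deck; the integer community IDs and the seed are assumptions made for illustration, and this is not a description of how the actual draw was carried out.

```python
import random

rng = random.Random(2006)  # fixed seed so the lottery can be replayed and audited

# Stage 1 (external validity): random evaluation sample from the eligible population.
population = list(range(50000))            # hypothetical IDs for eligible communities
evaluation_sample = rng.sample(population, 506)

# Stage 2 (internal validity): random assignment within the evaluation sample.
shuffled = evaluation_sample[:]
rng.shuffle(shuffled)
treatment = shuffled[:320]                 # receive benefits first
control = shuffled[320:]                   # receive benefits later

print(len(evaluation_sample), len(treatment), len(control))
```

Because both steps are random, every eligible community has the same chance of ending up in either group, which is what makes the control group a valid counterfactual.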

32
Case 3: Randomization
  • Randomized treatment/controls
  • Community level randomization
  • 320 treatment communities
  • 186 control communities
  • Pre-intervention characteristics well balanced

33
Baseline characteristics
34
Case 3: Randomization
35
Impact Evaluation Example: No Design vs.
Randomization
36
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (diff in diff)
  • Other options

37
Case 4: Regression Discontinuity
  • Assignment to treatment is based on a clearly
    defined index or parameter with a known cutoff
    for eligibility
  • RD is possible when units can be ordered along a
    quantifiable dimension which is systematically
    related to the assignment of treatment
  • The effect is measured at the discontinuity:
    estimated impact around the cutoff may not
    generalize to entire population

38
Indexes are common in targeting of social programs
  • Anti-poverty programs → targeted to households
    below a given poverty index
  • Pension programs → targeted to population above a
    certain age
  • Scholarships → targeted to students with high
    scores on standardized test
  • CDD Programs → awarded to NGOs that achieve
    highest scores

39
Example: effect of cash transfer on consumption
  • Target transfer to poorest households
  • Construct poverty index from 1 to 100 with
    pre-intervention characteristics
  • Households with a score < 50 are poor
  • Households with a score > 50 are non-poor
  • Cash transfer to poor households
  • Measure outcomes (e.g. consumption) before and
    after transfer
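
The example above can be sketched on simulated data. Everything below is an assumption made for illustration: the linear relationship between score and consumption, the noise level, the bandwidth of 10 points, and the choice to fit a separate straight line on each side of the cutoff. It is one simple way to estimate the jump at the discontinuity, not the only one.

```python
import random


def ols_line(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump at the cutoff: treated side (below) minus untreated side (above)."""
    pairs = list(zip(scores, outcomes))
    below = [(s, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    above = [(s, y) for s, y in pairs if cutoff <= s <= cutoff + bandwidth]
    a_below, b_below = ols_line(*zip(*below))
    a_above, b_above = ols_line(*zip(*above))
    return (a_below + b_below * cutoff) - (a_above + b_above * cutoff)


TRUE_EFFECT = 8.0   # assumed impact of the transfer on consumption
CUTOFF = 50.0       # households scoring below the cutoff are poor and get the transfer
rng = random.Random(0)

scores = [rng.uniform(1, 100) for _ in range(5000)]
outcomes = [
    100 + 1.0 * s + (TRUE_EFFECT if s < CUTOFF else 0.0) + rng.gauss(0, 2)
    for s in scores
]

estimate = rd_estimate(scores, outcomes, CUTOFF, bandwidth=10.0)
print("estimated effect at the cutoff:", round(estimate, 2))
```

Only households within the bandwidth contribute to the estimate, which is why the result speaks to households near the cutoff rather than to the whole population.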

40
(No Transcript)
41
Non-Poor
Poor
42
(No Transcript)
43

Treatment Effect
44
Case 4: Regression Discontinuity
  • Oportunidades assigned benefits based on a
    poverty index
  • Where
  • Treatment = 1 if score < 750
  • Treatment = 0 if score > 750

45
Case 4: Regression Discontinuity
Baseline: No treatment
46
Case 4: Regression Discontinuity
Treatment Period
47
Potential Disadvantages of RD
  • Local average treatment effects not always
    generalizable
  • Power: effect is estimated at the discontinuity,
    so we generally have fewer observations than in a
    randomized experiment with the same sample size
  • Specification can be sensitive to functional
    form: make sure the relationship between the
    assignment variable and the outcome variable is
    correctly modeled, including
  • Nonlinear Relationships
  • Interactions

48
Advantages of RD for Evaluation
  • RD yields an unbiased estimate of treatment
    effect at the discontinuity
  • Can often take advantage of known rules for
    assigning the benefit, which are common in the
    designs of social policy
  • No need to exclude a group of eligible
    households/individuals from treatment

49
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Other options

50
Case 5: Diff in diff
  • Compare change in outcomes between treatment and
    non-treatment groups
  • Impact is the difference in the change in
    outcomes
  • Impact = $(Y_{t1} - Y_{t0}) - (Y_{c1} - Y_{c0})$
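
A worked example of the formula with hypothetical group means, set next to the two counterfeit estimates from earlier in the deck so the difference is visible. The four numbers are invented for illustration.

```python
def diff_in_diff(y_t0, y_t1, y_c0, y_c1):
    """Impact = (Yt1 - Yt0) - (Yc1 - Yc0), using group mean outcomes."""
    return (y_t1 - y_t0) - (y_c1 - y_c0)


# Hypothetical mean consumption per capita.
y_t0, y_t1 = 200.0, 260.0   # treatment group: before, after
y_c0, y_c1 = 210.0, 240.0   # control group: before, after

before_after = y_t1 - y_t0                       # ignores the common time trend
cross_section = y_t1 - y_c1                      # ignores the baseline gap
impact = diff_in_diff(y_t0, y_t1, y_c0, y_c1)    # removes both

print("before-after:", before_after)
print("treated vs control after:", cross_section)
print("diff in diff:", impact)
```

The control group's change (30) stands in for what would have happened to the treatment group without the program, which is only valid if the two groups share the same trend.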

51
(Figure: Treatment Group and Control Group)
52
(Figure: outcome plotted against time for the Treatment Group and Control Group, marking the start of treatment, the Average Treatment Effect, and the Estimated Average Treatment Effect)
53
Diff in diff
  • Fundamental assumption that trends (slopes) are
    the same in treatments and controls
  • Need a minimum of three points in time to verify
    this and estimate treatment (two
    pre-intervention)
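
With two pre-intervention observations, the equal-trends assumption can at least be checked before the program starts. The sketch below uses hypothetical group means at three points in time; the function name `pre_trend_gap` is made up for this example.

```python
def pre_trend_gap(y_treat, y_control):
    """Difference between groups in the change across the two pre-periods.

    Each argument is [pre-period 1, pre-period 2, post-period] mean outcomes.
    A gap near zero is consistent with equal trends before the intervention.
    """
    return (y_treat[1] - y_treat[0]) - (y_control[1] - y_control[0])


# Hypothetical mean outcomes at three points in time.
y_treat = [180.0, 200.0, 260.0]
y_control = [190.0, 210.0, 240.0]

gap = pre_trend_gap(y_treat, y_control)
print("pre-intervention trend gap:", gap)
```

A zero gap does not prove that trends would have stayed equal after the intervention; it only fails to contradict the assumption.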

54
Case 5: Diff in Diff
55
Impact Evaluation Example: Summary of Results
56
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Other options
  • Instrumental Variables
  • Matching

57
Other options for Impact Evaluation
  • There are a few others out there
  • Common scenario
  • Voluntary enrollment in program
  • Can't control who enrolls and who does not
  • Possible solution: random promotion or incentives
    into the program
  • Information
  • Money
  • Other help/incentives

58
Random Promotion
  • Those who get promotion are more likely to enroll
  • But who got promotion was determined randomly, so
    not correlated with other observables/non-observables
  • Compare average outcomes of two groups:
    promoted/not promoted
  • Effect of offering the program (ITT)
  • Effect of the intervention (TOT)
  • TOT = effect of offering program / proportion of
    those who took up

59
Example: Community Based School Management
  • Chaudhury, Gertler, Vermeersch (work in progress)
  • Estimate effect of decentralization of school
    management on learning outcomes
  • Grant for funding of community based management
  • Community management of hiring, budgeting,
    oversight
  • 1500 schools in the evaluation
  • Each community chooses whether to participate in
    program
  • Community submits proposal for program
    participation

60
Evaluation Design
  • Community based school management
  • Provision of technical assistance and training by
    NGOs for submission of grant application
  • Random selection of communities with NGO support
  • Random promotion is an Instrumental Variable

61
Technique called Instrumental Variables
  • Some fancy statistics
  • Find a variable Z which satisfies two conditions
  • Correlated with T: $\mathrm{corr}(Z, T) \neq 0$
  • Uncorrelated with e: $\mathrm{corr}(Z, e) = 0$
  • Z is the random promotion in our example

62
Indirect least squares: Case 1
                  Promotion   No-Promotion   Change
Takeup (T)        0.5         0              0.5
Test Score (S)    100         80             20

63
Indirect least squares: Case 2
                  Promotion   No-Promotion   Change
Takeup (T)        0.8         0.3            0.5
Test Score (S)    100         90             10
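
The two tables can be turned into effect estimates with the ratio given earlier for random promotion: the change in the outcome divided by the change in takeup. The function name below is hypothetical; the inputs are the table values.

```python
def indirect_least_squares(score_promo, score_no_promo, takeup_promo, takeup_no_promo):
    """Effect on those who took up = change in outcome / change in takeup."""
    change_in_score = score_promo - score_no_promo      # effect of offering (ITT)
    change_in_takeup = takeup_promo - takeup_no_promo   # share moved by promotion
    return change_in_score / change_in_takeup


# Case 1: nobody enrolls without promotion.
case1 = indirect_least_squares(100, 80, 0.5, 0.0)

# Case 2: some enroll even without promotion.
case2 = indirect_least_squares(100, 90, 0.8, 0.3)

print("Case 1 estimate:", round(case1, 2))
print("Case 2 estimate:", round(case2, 2))
```

In both cases promotion raises takeup by half, so the estimate is twice the change in test scores: roughly 40 points in Case 1 and 20 points in Case 2.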

64
Two Stage Least Squares (2SLS)
  • Model with endogenous Treatment (T)
  • Stage 1: Regress endogenous variable on the IV
    (Z) and other exogenous regressors
  • Calculate predicted value for each observation: T
    hat

65
Two Stage Least Squares (2SLS)
  • Stage 2: Regress outcome y on predicted variable
    (and other exogenous variables)
  • Need to correct Standard Errors (they are based
    on T hat rather than T)
  • In practice just use STATA - ivreg
  • Intuition: T has been cleaned of its
    correlation with e.
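
The two stages can be sketched by hand on simulated data. This is an illustration under assumed values, not a replacement for a proper IV routine: it uses one instrument (random promotion Z), no other regressors, and it does not compute the corrected standard errors mentioned above. The variable `motivation` is a hypothetical unobservable that is part of e.

```python
import random


def ols(xs, ys):
    """Simple least squares of y on x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


TRUE_EFFECT = 40.0       # assumed effect of takeup (T) on the test score (S)
rng = random.Random(1)

Z, T, S = [], [], []
for _ in range(20000):
    z = 1 if rng.random() < 0.5 else 0            # random promotion
    motivation = rng.gauss(0, 1)                  # unobserved, part of e
    t = 1 if (z == 1 and motivation > 0) else 0   # takeup depends on Z and on e
    s = 80 + TRUE_EFFECT * t + 10 * motivation + rng.gauss(0, 5)
    Z.append(z)
    T.append(t)
    S.append(s)

# Naive regression of S on T: biased because T is correlated with e.
_, naive_slope = ols(T, S)

# Stage 1: regress T on Z and compute T hat.
a1, b1 = ols(Z, T)
T_hat = [a1 + b1 * z for z in Z]

# Stage 2: regress S on T hat.
_, tsls_slope = ols(T_hat, S)

print("naive OLS estimate:", round(naive_slope, 1))
print("2SLS estimate:", round(tsls_slope, 1))
```

T hat varies only because of the random promotion, so the second-stage slope is not contaminated by motivation; the naive slope is.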

66
Instrumental Variables
  • A variable correlated with treatment but nothing
    else (e.g. random promotion)
  • Again, we really just need to understand how the
    data are generated
  • Don't have to exclude anyone

67
Case 6: IV
  • Estimate TOT effect of Oportunidades on
    consumption
  • Run 2SLS regression

68
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Other options
  • Instrumental Variables
  • Matching

69
Matching
  • Pick the ideal comparison group that matches the
    treatment group from a larger survey.
  • The matches are selected on the basis of
    similarities in observed characteristics
  • This assumes no selection bias based on
    unobservable characteristics.
  • Source: Martin Ravallion

70
Propensity-Score Matching (PSM)
  • Controls: non-participants with the same
    characteristics as participants
  • In practice, it is very hard. The entire vector
    of X observed characteristics could be huge.
  • Rosenbaum and Rubin: match on the basis of the
    propensity score
  • $P(X_i) = \Pr(D_i = 1 \mid X_i)$
  • Instead of aiming to ensure that the matched
    control for each participant has exactly the same
    value of X, same result can be achieved by
    matching on the probability of participation.
  • This assumes that participation is independent of
    outcomes given X.

71
Steps in Score Matching
  1. Representative, highly comparable surveys of
    non-participants and participants.
  2. Pool the two samples and estimate a logit (or
    probit) model of program participation.
  3. Restrict samples to assure common support
    (important source of bias in observational
    studies)
  4. For each participant find a sample of
    non-participants that have similar propensity
    scores
  5. Compare the outcome indicators. The difference is
    the estimate of the gain due to the program for
    that observation.
  6. Calculate the mean of these individual gains to
    obtain the average overall gain.
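
Steps 3 to 6 can be sketched on a toy dataset. The sketch assumes the propensity scores from step 2 have already been estimated and stored in a `pscore` field (a hypothetical name), and it simplifies step 4 to a single nearest neighbor rather than a sample of similar non-participants. All data values are invented.

```python
def common_support(participants, non_participants):
    """Step 3: keep only units whose scores fall in the overlapping range."""
    low = max(min(p["pscore"] for p in participants),
              min(c["pscore"] for c in non_participants))
    high = min(max(p["pscore"] for p in participants),
               max(c["pscore"] for c in non_participants))

    def inside(rows):
        return [r for r in rows if low <= r["pscore"] <= high]

    return inside(participants), inside(non_participants)


def nearest_neighbor_gain(participant, controls):
    """Steps 4-5: match on the closest score and difference the outcomes."""
    match = min(controls, key=lambda c: abs(c["pscore"] - participant["pscore"]))
    return participant["outcome"] - match["outcome"]


def matched_average_gain(participants, non_participants):
    """Step 6: mean of the individual gains over matched participants."""
    treated, controls = common_support(participants, non_participants)
    gains = [nearest_neighbor_gain(p, controls) for p in treated]
    return sum(gains) / len(gains)


# Hypothetical data: scores assumed to come from a previously fitted logit.
participants = [
    {"pscore": 0.9, "outcome": 130.0},
    {"pscore": 0.7, "outcome": 120.0},
    {"pscore": 0.5, "outcome": 110.0},
    {"pscore": 0.3, "outcome": 105.0},
]
non_participants = [
    {"pscore": 0.1, "outcome": 80.0},
    {"pscore": 0.35, "outcome": 95.0},
    {"pscore": 0.55, "outcome": 100.0},
    {"pscore": 0.65, "outcome": 108.0},
]

average_gain = matched_average_gain(participants, non_participants)
print("average gain among matched participants:", average_gain)
```

Participants with scores of 0.7 and 0.9 have no comparable non-participants and are dropped, so the estimate applies only to the region of common support.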

72
(Figure: density of propensity scores for participants, on a propensity score axis from 0 to 1, with the region of common support marked)
73
PSM vs an experiment
  • Pure experiment does not require the untestable
    assumption of independence conditional on
    observables
  • PSM requires large samples and good data

74
Lessons on Matching Methods
  • Typically used when neither randomization, RD, nor
    other quasi-experimental options are possible
    (e.g. no baseline)
  • Be cautious of ex-post matching
  • Matching on endogenous variables
  • Matching helps control for OBSERVABLE
    heterogeneity
  • Matching at baseline can be very useful
  • Estimation
  • combine with other techniques (e.g. diff in diff)
  • Know the assignment rule (match on this rule)
  • Sampling
  • selecting non-randomized evaluation samples
  • Need good quality data
  • Common support can be a problem

75
Case 7: Matching
76
Case 7: Matching
77
Impact Evaluation Example: Summary of Results
78
Measuring Impact
  • Experimental design/randomization
  • Quasi-experiments
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Other options
  • Instrumental Variables
  • Matching
  • Combinations of the above

79
Remember...
  • Objective of impact evaluation is to estimate the
    CAUSAL effect of a program on outcomes of
    interest
  • In designing the program we must understand the
    data generation process
  • behavioral process that generates the data
  • how benefits are assigned
  • Fit the best evaluation design to the operational
    context