Evaluating Anti-Poverty Programs: Concepts and Methods

1
  • Evaluating Anti-Poverty Programs: Concepts and
    Methods
  • Norbert Schady
  • Development Research Group

2
Outline of presentation
  • Introduction The evaluation problem
  • Possible solutions
  • 1. Experimental evaluations
  • Randomization
  • 2. Quasi-experimental evaluations
  • Instrumental variables
  • Regression discontinuity
  • 3. Non-experimental evaluations
  • OLS
  • Matching methods
  • Differences-in-differences
  • Learning more from evaluations

3
Outline of presentation
  • Big disclaimer! I will frequently be drawing on
    my own work in this presentation for examples

4
The evaluation problem
  • Assigned programs
  • Some units (individuals, households, villages)
    get the program
  • Some do not
  • Examples
  • Social fund selects from applicants
  • School construction: some villages get a new
    school, others get nothing
  • Cash transfers to eligible households only
  • Ex-post evaluation

5
The evaluation problem
  • Impact is the difference between the relevant
    outcome indicator with the program and that
    without it
  • However, we can never observe someone in two
    different states of nature at the same time
  • While a post-intervention indicator is observed,
    its value in the absence of the program is not;
    it is a counterfactual
  • So all evaluation is essentially a problem of
    missing data
  • Calls for counterfactual analysis

6
Naïve comparisons can be deceptive
  • Common practices
  • Compare outcomes after the intervention to those
    before, or
  • Compare units (people, households, villages) with
    and without the anti-poverty program
  • Potential biases from failure to control for
  • Other changes over time under the counterfactual,
    or
  • Unit characteristics that influence program
    placement

7
We observe an outcome indicator,

8
and its value rises after the program

9
However, we need to identify the counterfactual

10
since only then can we determine the impact of
the intervention

11
The evaluation problem
  • However, we never observe the counterfactual, and
    so have to estimate it
  • Making comparisons between "treated" and
    "control" (or "comparison") groups

12
Alternative solutions
  • Experimental evaluations (Social experiments)
  • Program is randomly assigned
  • If properly carried out, corrects for observable
    and unobservable differences between treated and
    controls
  • Estimates ATE
  • Quasi-experimental evaluations
  • Instrumental variables
  • Regression discontinuity
  • Can correct for observable and unobservable
    differences, but estimated treatment effect is
    local
  • Non-experimental evaluations (observational
    studies)
  • OLS
  • Matching techniques
  • Exogenous placement conditional on observables
  • Differences in differences or higher-order
    differencing
  • Can correct for time-invariant, additive
    differences (including in unobservables) between
    treated and controls

13
Randomization
  • Lottery used to assign households to treatment
    and control groups
  • If sample is large enough, this equates all
    characteristics, observable and unobservable, of
    both groups
  • Differences in outcomes can then be credibly
    interpreted as program impacts
  • No need for complicated econometrics or
    conditioning variables
  • A simple difference of means suffices
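The difference-of-means point can be checked in a short simulation (not from the original slides; the sample size and true effect of 0.5 are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical randomized program with a true impact of 0.5 on the outcome.
treat = rng.integers(0, 2, size=n)              # lottery: 1 = treated, 0 = control
outcome = 2.0 + 0.5 * treat + rng.normal(0, 1, size=n)

# Under randomization, the simple difference of means estimates the ATE.
ate_hat = outcome[treat == 1].mean() - outcome[treat == 0].mean()
```

With a large sample, `ate_hat` lands close to the true 0.5 with no conditioning variables needed.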

14
Randomization
  • Randomization checks
  • Check random assignment
  • Check whether conditioning on X variables makes a
    difference
  • Check whether cross-sectional and
    first-differenced analysis yields similar results

15
Conclusion: Randomization
  • Randomization is the benchmark for
    quasi-experimental and non-experimental
    evaluation methods
  • Has become much more popular in developing
    countries in recent decades, and with good reason
  • Groundbreaking example of PROGRESA
  • However, randomization is no panacea
  • Often infeasible: political and moral
    difficulties of denying treatment to eligible
    beneficiaries who have lost a lottery
  • Be thoughtful about extrapolating from estimated
    parameters

16
What is the estimated parameter? Is it
policy-relevant?
  • Randomization estimates Average Treatment Effect
    (ATE) if all households in treatment group
    receive the treatment and all those in control
    group do not
  • If compliance in treatment group is imperfect,
    then can estimate Intent-to-Treat (ITT), the
    impact of being offered the program
  • Or can inflate ITT by program take-up to estimate
    Treatment-on-the-Treated (TT)
  • Program take-up: R
  • ITT estimate of program effect: β1
  • TT estimate of program effect: β2 = β1/R
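A minimal simulation of the ITT and TT calculations above (the 60% take-up rate and true effect of 0.8 are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical program: assignment is random, but only 60% of the
# treatment group actually takes up the program (true effect = 0.8).
offered = rng.integers(0, 2, size=n)
received = offered * (rng.random(n) < 0.6)
outcome = 1.0 + 0.8 * received + rng.normal(0, 1, size=n)

# ITT: the impact of being offered the program.
itt = outcome[offered == 1].mean() - outcome[offered == 0].mean()

# TT: inflate ITT by program take-up R among the offered.
R = received[offered == 1].mean()
tt = itt / R
```

Here `itt` is close to 0.8 × 0.6 = 0.48, and dividing by the take-up rate recovers the effect on the treated, about 0.8.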

17
What is the estimated parameter? Is it
policy-relevant?
  • Deeper problem: randomization often implemented
    in small-scale pilots, with highly motivated
    staff
  • Impact of large-scale, perhaps nationwide program
    may be very different
  • US literature: the impact of attending preschool
    on school outcomes
  • Perry Preschool program compared to Head Start
  • Difficult problem to overcome
  • In some cases, randomization takes place in the
    context of a large-scale program
  • PROGRESA in Mexico
  • However, this tends to be politically difficult
    to sustain
  • Oportunidades evaluations in Mexico

18
Quasi-experimental analysis: RDD
  • Threshold M below which individuals are eligible
    for treatment, above which they are ineligible
  • Intuition behind approach is you compare
    individuals just above and just below this
    threshold value
  • Proxy means: determines eligibility for programs
  • Scholarships in Cambodia (Filmer and Schady 2008)
  • School fee reduction program in Bogota (Barrera,
    Linden and Urquiola 2007)
  • Geographic jurisdiction: program implemented in
    some areas but not others
  • Piso Firme in Mexico: comparisons of households
    just across the border between Coahuila and
    Durango states (Cattaneo et al. 2007)
  • Class size on test scores in Bolivia (Urquiola
    2007)

19
Quasi-experimental analysis: RDD
  • Sharp RDD: the threshold M perfectly predicts who
    receives a given treatment and who does not
  • Regress outcome on flexible formulation of
    control function, and dummy for treatment
  • Estimate Yi = α + δf(Ci) + Φ·1(Ci < M) + ei
  • Note that, by definition, 1(Ci < M) = Ti
  • Can also estimate control function
    nonparametrically, above and below threshold
  • Fuzzy RDD: the threshold is a significant but
    imperfect predictor of treatment
  • Estimate Yi = α + δf(Ci) + ΦTi + ei, where Ti is
    instrumented with 1(Ci < M)
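A sketch of the sharp-RDD regression above on simulated data (the linear control function and the true jump of 0.4 at the threshold are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
M = 0.0                                     # eligibility threshold

c = rng.uniform(-1, 1, size=n)              # assignment score (e.g. a proxy-means index)
T = (c < M).astype(float)                   # sharp RDD: treated iff c < M
# Smooth control function in c, plus a true discontinuity of 0.4 at M.
y = 1.0 + 0.4 * T + 0.8 * c + rng.normal(0, 0.3, size=n)

# Regress the outcome on the treatment dummy and a control function in c,
# letting the slope differ on either side of the threshold.
X = np.column_stack([np.ones(n), T, c, c * T])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
d_hat = beta[1]                             # estimated jump at the threshold
```

The interaction `c * T` lets the control function have different slopes above and below M, as in the flexible specification on the slide.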

20
Quasi-experimental analysis: RDD
  • Identifying assumption: no discontinuity in
    counterfactual values at the threshold
  • Essentially, the threshold is given exogenously
    and individuals respond mechanically to it
  • Can be violated if there is sorting
  • Urquiola and Verhoogen (2008): sorting in the
    Chilean education system
  • Schools don't want to add another class because
    it is expensive
  • Increase fees to limit enrollment
  • Parents understand school behavior, and more
    educated parents sort themselves into schools
    with smaller class sizes
  • Discontinuity in observable (and perhaps
    unobservable) characteristics at threshold
    violates identifying assumption
  • RDD check: present evidence of no observable
    differences at the threshold

21
Quasi-experimental analysis: RDD
Intent-to-treat effects of a US$45 scholarship versus no
scholarship (LHS) and US$60 versus US$45
(RHS). Source: Filmer and Schady (2008)
22
What is the estimated parameter? Is it
policy-relevant?
  • RDD estimates treatment effects at the threshold
  • If there is heterogeneity of treatment effects,
    this may not correspond to the ATE
  • However, it may be a policy-relevant parameter
    for a small expansion of the program near the
    threshold
  • For example, for targeted programs, it will
    estimate effect of expanding coverage of program
    to incorporate marginal individuals

23
Quasi-experimental analysis: IV
  • Intuition: identify exogenous variation using
    a third variable
  • Outcome regression
  • Yi = βTi + ΦXi + ei
  • Concern is that there are differences between
    treated (T = 1) and control (T = 0) individuals
    that are not captured by vector Xi
  • Induces correlation between Ti and ei
  • Biased estimates of program effects
  • Solution: identify a variable Zi that is
    correlated with Ti (first stage) but is
    uncorrelated with ei (exclusion restriction)

24
Quasi-experimental analysis: IV
  • Steps
  • 1. Regress Ti = β1Zi + Φ1Xi + ei
  • Predict T-hati
  • This gives you the exogenous variation in Ti
  • 2. Regress Yi = β2T-hati + Φ2Xi + ηi
  • In practice, this is done in one step to get the
    correct standard errors
  • Practical difficulty: finding convincing
    instruments (the exclusion restriction cannot be
    tested)
  • If exclusion restriction does not hold, biases
    can be severe
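The two steps above can be sketched on simulated data (the unobserved confounder u, the instrument, and the true effect of 0.6 are all invented for illustration; as the slide notes, the two-step version shown here does not yield correct standard errors):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

z = rng.integers(0, 2, size=n).astype(float)     # instrument (e.g. a lottery)
u = rng.normal(0, 1, size=n)                     # unobserved confounder
T = (0.8 * z + 0.5 * u + rng.normal(0, 1, size=n) > 0.4).astype(float)
y = 1.0 + 0.6 * T + u + rng.normal(0, 1, size=n) # true program effect = 0.6

# Step 1: first stage -- regress T on Z and predict T-hat.
X1 = np.column_stack([np.ones(n), z])
g, *_ = np.linalg.lstsq(X1, T, rcond=None)
T_hat = X1 @ g

# Step 2: regress y on T-hat.
X2 = np.column_stack([np.ones(n), T_hat])
b, *_ = np.linalg.lstsq(X2, y, rcond=None)
iv_hat = b[1]

# Naive OLS is biased upward because T is correlated with u.
b_ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), T]), y, rcond=None)
ols_hat = b_ols[1]
```

Because the instrument is randomized and excluded from the outcome equation, `iv_hat` recovers the true effect while `ols_hat` is pushed up by the confounder.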

25
Quasi-experimental analysis: IV
  • Some examples
  • Partially randomized design
  • Angrist et al. (2002) on impact of vouchers on
    test scores in Colombia
  • Schady and Araujo (2008) on impact of cash
    transfers on enrollment in Ecuador
  • Lottery to determine access to Bono de Desarrollo
    Humano cash transfer program
  • But substantial contamination of control group,
    which appears to be non-random
  • Want to determine impact of program on enrollment
  • Solution: regress enrollment on treatment, with
    treatment instrumented by the lottery
  • Since the lottery was randomized, it is not
    correlated with regression error term

26
Quasi-experimental analysis: IV
  • Some examples
  • Political variables as instruments
  • Want to assess the impact of new school
    infrastructure on enrollment in Peru
  • But placement of school infrastructure may be
    endogenous
  • Maybe communities with tastes for education
    clamor more for a new school, and tastes are
    unobserved
  • Maybe program administrators want to place
    schools in very disadvantaged areas or in areas
    where they expect the returns to be highest
  • In any of these cases, a simple regression of
    school outcomes (enrollment, test scores) on new
    school infrastructure could be biased

27
Quasi-experimental analysis: IV
  • Schady (2000) shows that the distribution of
    expenditures on school infrastructure in the
    Fujimori administration was partially determined
    by political considerations
  • Districts that had voted for Fujimori in 1990 but
    against Fujimori in 1993 were more likely to
    receive school investments than other, comparable
    districts (a buy-back strategy)
  • Paxson and Schady (2002) use this to construct an
    instrument for school infrastructure
  • Regress enrollment on school infrastructure, with
    school infrastructure instrumented with the
    change in the share voting for Fujimori
  • Exclusion restriction: changes in the vote share
    are uncorrelated with the regression error term

28
Quasi-experimental analysis: IV
  • Some examples
  • Program glitches
  • Impact of Bolsa Alimentação CCT program
  • Software used by program could not read special
    characters
  • As a result, people whose names had special
    characters (for example, Ângela, João, José,
    Gonçalves) were rejected by the system, and did
    not receive program payments in a first phase
  • Interested in estimating the effect of Bolsa on
    an outcome, but participation in program may be
    endogenous
  • Regress outcome (say, height-for-age z-score) on
    Bolsa, with Bolsa instrumented with whether or
    not applicant had special character in name

29
What is the estimated parameter? Is it
policy-relevant?
  • If identifying assumptions hold, IV estimates are
    LATE: they estimate the impact of treatment on the
    outcome for complier households (Imbens and
    Angrist 1994; Angrist, Imbens and Rubin 1996)
  • These are households whose probability of
    receiving the treatment was affected by the
    instrument
  • So, in partial randomization example, these
    exclude individuals who would have received
    transfers no matter what, as well as those who
    would not have received transfers no matter what
  • Note that this is a counterfactual comparison; we
    cannot identify these individuals in practice
  • So, if there is heterogeneity of treatment
    effects, so that some households respond
    differently to an intervention than others, it is
    hard to extrapolate to another population, even
    if the IV estimator is unbiased

30
What is the estimated parameter? Is it
policy-relevant?
  • Also, if there is selection on expected returns
    (Card 1999; Heckman and Vytlacil 2005), so that
    those who stand to benefit the most are most
    likely to select into the program, this selection
    effect is incorporated into the estimated
    treatment effects
  • Imagine creating a program that randomly assigns
    fee waivers to some districts in a country but
    not others
  • Since program is randomized, you can estimate
    impact of fee waiver on school attainment without
    additional complications

31
What is the estimated parameter? Is it
policy-relevant?
  • But say you also want to use this design to
    estimate the impact of school attainment on wages
  • In theory, you could run a regression of wages on
    schooling, with schooling instrumented with
    whether a district was selected into the fee
    reduction program
  • However, if those who stood to gain the most from
    schooling were also more likely to respond to the
    fee waiver program, as seems plausible (so-called
    Roy selection), then the IV estimates of schooling
    on wages include (i) the effect of schooling on
    wages, and (ii) a selection effect
  • Heckman calls this "essential heterogeneity"
  • Card (1999, 2001) argues that this is the reason
    why, contrary to expectations, instrumenting
    schooling generally results in higher estimates
    of the returns to schooling than those obtained
    by OLS

32
Detour 1: What is the estimated parameter? Is
it policy-relevant?
  • So, is the estimated parameter policy relevant?
  • Not if you are interested in estimating the
    effect of schooling on wages for the population
    at large
  • However, it may be the right parameter if you are
    considering expanding the fee waiver program and
    you want to assess how this will affect wages

33
Conclusion: Quasi-experimental methods
  • Quasi-experimental methods can be appealing
    because, in the best of circumstances, they
    approach the design of a randomized study
  • Can control for observable and unobservable
    differences between treated and control
    households
  • However, estimates are generally "local" in one
    way or another
  • Makes it difficult to extrapolate to other
    population groups if there is heterogeneity of
    effects
  • Also (especially with instrumental variables)
    they are opportunistic, and the exclusion
    restriction is untestable
  • Cannot count on finding a good instrument after
    a program has been rolled out and using this to
    assess impact

34
Observational methods: OLS
  • The intuition behind OLS and matching estimators
    of impact is that you can correct for differences
    between treated and control groups by
    including a vector of characteristics Xi
  • Equivalently, that there is selection on
    observables only
  • Basic set-up
  • Yi = βTi + ΦXi + ei
  • The coefficient ß is then an estimate of the
    average treatment effect
  • Concerns
  • Selection on unobservables
  • Using observations outside the region of common
    support
  • Parametric assumption

35
Observational methods: Matching
  • Match on the probability of participation
  • Ideally we would match on the entire vector X of
    observed characteristics
  • However, this is practically impossible, since X
    could be huge
  • PSM: match on the basis of the propensity score
    (Rosenbaum and Rubin 1983)
  • Basic steps
  • Step 1: Regress participation on observable
    characteristics
  • Ti = β1Xi + ei
  • Predict T-hati, the propensity score
  • Step 2: Restrict sample to ensure common
    support
  • Failure of common support is an important source
    of bias in observational studies (Heckman et al.
    1997)

36
Density of scores for participants
37
Density of scores for non-participants
38
Density of scores for non-participants
39
Observational methods: Matching
  • Basic steps (continued)
  • Step 3: For each participant, find a sample of
    non-participants with similar propensity scores
  • Various weighting schemes
  • Step 4: Compare the outcome indicators
  • The difference is the estimate of the gain due to
    the program for that observation
  • Step 5: Calculate the mean of these individual
    gains to obtain the average overall gain

40
Observational methods: Matching
  • Many recent developments in the matching
    literature
  • For example, Hirano, Imbens, and Ridder (2003)
    show that a reweighting of the data by the
    propensity score performs well
  • Step 1: Predict propensity score, T-hati, as
    before
  • Step 2: Run OLS for the outcome equation,
    weighting treated households by 1/T-hati and
    comparison households by 1/(1 − T-hati)
  • This produces a fully efficient estimator of the
    Average Treatment Effect with conservative
    standard errors
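A sketch of the reweighting step on simulated data (all numbers invented; for simplicity the true propensity score is used where the predicted score from step 1 would go):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000

x = rng.normal(0, 1, size=n)
p = 1 / (1 + np.exp(-x))                 # propensity score (taken as known here;
T = (rng.random(n) < p).astype(float)    #  in practice, predicted as in step 1)
y = 2.0 + 0.5 * T + 0.6 * x + rng.normal(0, 0.5, size=n)

# Step 2: weight treated by 1/p and comparisons by 1/(1 - p), then run
# weighted least squares of the outcome on treatment.
w = np.where(T == 1, 1 / p, 1 / (1 - p))
sw = np.sqrt(w)
X = np.column_stack([np.ones(n), T])
b, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
ate_hat = b[1]
```

The weights rebalance the treated and comparison groups so that the weighted regression recovers the Average Treatment Effect of 0.5.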

41
Conclusion: OLS and matching
  • Low cost, and can use existing data sets
    (censuses, surveys)
  • However, need high-quality data with information
    on many X variables for treated and comparison
    observations
  • Matching is more flexible than OLS and does not
    make use of data outside the region of common
    support
  • This can be an important advantage
  • However, both methods are based on the assumption
    of selection on observables only (no selection on
    unobservables)
  • This is untestable and has to be argued on a
    case-by-case basis
  • In practice, single-difference OLS and matching
    can often be badly biased by unobserved
    heterogeneity correlated with treatment

42
Observational methods: DD and higher-order
differences
  • Observed changes over time for non-participants
    provide the counterfactual for participants
  • Steps
  • Collect baseline data on non-participants and
    (probable) participants before the program
  • Compare with data after the program
  • Subtract the two differences, or use a regression
    with a dummy variable for participant
  • This allows for selection bias but it must be
    time-invariant and additive

43
Diff-in-diff requires that the bias be additive
and time-invariant

44
Observational methods: DD and higher-order
differences
  • In practice, estimate a regression of the
    following form
  • Eit = α + βTi + δPt + Φ(Pt × Ti) + eit,
    where Pt is a post-program dummy
  • Φ is the difference-in-difference estimate of
    program impact
  • Note that this is equivalent to a regression in
    first differences
  • Eit − Eit−1 = δ + ΦTi + (eit − eit−1)
  • Both approaches can also be supplemented with a
    vector of characteristics Xi
  • Can also combine with matching
  • Step 1: match observations on the basis of their
    baseline observable characteristics
  • Step 2: Test whether outcome grew by more in
    treated than in comparison units (individuals,
    schools, districts)
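The double-difference logic can be checked on simulated data (invented numbers: a unit effect correlated with participation, a common time trend, and a true impact of 0.4):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3_000

T = rng.integers(0, 2, size=n).astype(float)   # participant indicator
a = 0.5 * T + rng.normal(0, 1, size=n)         # time-invariant unit effect,
                                               #  correlated with participation
y_pre = 1.0 + a + rng.normal(0, 0.3, size=n)             # baseline survey
y_post = 1.2 + a + 0.4 * T + rng.normal(0, 0.3, size=n)  # follow-up: impact = 0.4

# Difference-in-differences: subtract the change among non-participants
# from the change among participants.
dd = ((y_post[T == 1].mean() - y_pre[T == 1].mean())
      - (y_post[T == 0].mean() - y_pre[T == 0].mean()))

# A naive post-only comparison is biased by the selection term a.
naive = y_post[T == 1].mean() - y_post[T == 0].mean()
```

Because the selection term is time-invariant and additive, it drops out of `dd` but contaminates the naive single-difference comparison.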

45
Observational methods: DD and higher-order
differences
  • Example 1: Galiani, Gertler and Schargrodsky
    (2005) on impact of privatization of water
    services on child mortality in Argentina
  • Did child mortality decrease by more in districts
    that privatized water than in those that did not?
  • More convincing when you can show that
    pre-existing trends were the same in both groups
    (as they do)
  • Example 2: Berlinski, Galiani and Gertler (2005)
    on impact of preschool attendance on test scores
    in primary school
  • Preschool construction program: did test scores
    increase by more in provinces and among cohorts
    exposed to the construction program when they
    were of preschool age?
  • More convincing with placebo experiment: only
    the affected cohorts in provinces that received
    the preschool intervention saw gains in test
    scores

46
Observational methods: DD and higher-order
differences
  • Example 3: Filmer and Schady (2008): did female
    school enrollment grow by more in schools that
    offered female scholarships than in other schools
    in Cambodia?
  • Yes, but
  • These same schools appear to have higher
    pre-intervention growth rates in female
    enrollment
  • So, triple-differencing
  • Did the school enrollment of girls, relative to
    that of boys, grow by more in schools that
    offered female scholarships?
  • Yes, and there were no pre-existing differences
    between treated and control schools in the growth
    rate of the boy-girl enrollment ratio

47
Conclusion: DD and higher-order differencing
  • More convincing than OLS or matching with a
    single, post-intervention survey
  • Requires careful planning for baseline
  • Particularly convincing when there are placebo
    experiments
  • Things that you would not expect to change don't
    change
  • Scholarship program for 7th graders should have
    no effects (or very small effects) on enrollment
    in (say) 1st grade
  • Cohorts not exposed to program should not behave
    differently from those who are
  • No apparent differences in pre-existing trends in
    outcomes

48
Detour: spillover effects
  • What if the effects of the treatment spill over
    to the control group, or if there are general
    equilibrium effects?
  • Intervention 1: provide deworming drugs in Kenya
    (Miguel and Kremer 2004)
  • Program benefits extend not just to those who
    receive the drugs, but also to other children in
    the study areas
  • Intervention 2: scholarships to low-SES girls in
    Cambodia (Filmer and Schady 2008)
  • Concern that increased enrollment among
    scholarship recipients affects enrollment
    decisions of other children in same grade
  • Can be serious threat to identification
  • Possible solution: move to a higher unit of
    aggregation; compare "treated" and "control"
    villages or schools, rather than individuals

49
Detour: anticipation effects
  • What if people in the control group expect that
    they will be incorporated into the treatment in
    the future and change their behavior accordingly?
  • Consumption smoothing
  • Simple version of the permanent income
    hypothesis: all short-term transfer income
    should be invested
  • Or maybe control households change their behavior
    (schooling, health-seeking, asset ownership)
    because they think that this makes it more likely
    that they will receive benefits?
  • Very hard to rule out
  • Collect qualitative data
  • Collect data from before baseline, and test for
    unexpected changes in behavior among controls

50
Conclusions and future challenges
  • Moving beyond averages: assessing the impact of
    a program on different population groups
  • Great deal of accumulating evidence of
    heterogeneity of treatment effects
  • A positive overall effect may hide a great deal
    of variability, possibly including zero or
    negative effects for some groups

51
Conclusions and future challenges
  • Open up the "black box" provided by impact
    evaluations
  • What features of the program matter?
  • For example, in explaining the impact of a CCT on
    outcomes: is it the cash that matters? The
    condition? The fact that transfers are made to
    women?
  • Various options for trying to untangle possible
    explanations
  • Structural models (Todd and Wolpin 2007) or
    ex-ante simulation (Bourguignon, Ferreira and
    Leite 2003)
  • Randomize alternative program features, perhaps
    on a small-scale pilot basis (forthcoming
    evaluation of a CCT in Morocco)
  • Collect information on other intermediate
    outcomes, and see whether these help shed light
    on underlying mechanisms

52
Conclusions and future challenges
  • Example: Macours, Schady and Vakis (2008) on the
    impact of the Atención a Crisis CCT program in
    Nicaragua on cognitive development among
    children of preschool age
  • Program resulted in an improvement in language
    ability of 0.17 to 0.22 standard deviations
  • Was it the cash, the social marketing of the
    program, or the gender of the beneficiaries?
  • Literature identifies two key risk factors for
    inadequate cognitive development in poor
    countries
  • Inadequate nutrition (calories, proteins,
    micronutrients)
  • Inadequate early stimulation
  • Program resulted in increase in food expenditures
    and diversification of food consumption (out of
    staples, and into fruits, vegetables, animal
    proteins), and increase in stimulation inputs

53
54
Conclusions and future challenges
  • Example: Macours, Schady and Vakis (2008),
    continued
  • But can the changes in inputs be fully explained
    by the increase in income?
  • Engel curve analysis

55
Conclusions and future challenges

56
Conclusions and future challenges

57
Conclusions and future challenges