Why Stay in the Dark About Real Program Results? Shedding New Light on Methods for Revitalizing Evaluation in Health


1
Why Stay in the Dark About Real Program Results?
Shedding New Light on Methods for Revitalizing
Evaluation in Health

Charles Teller, USAID
Philip Setel, MEASURE Evaluation

To stay in the dark or not on evidence-based
results of PRH programs?
Why do we need rigorous program evaluation and
evaluation research?

What is the current situation on M&E of GH
programs in general, and in USAID in particular?

What is the new USAID directive on "revitalizing"
evaluation?
Objective is to clearly demonstrate the results that
USAID is achieving with taxpayer dollars. Mission
actions:
Appointing an M&E officer
Setting aside funds for evaluation during design
phase
Preparing Mission Order on M&E
Preparing an annual Mission Evaluation Plan
Providing evaluation training for CTOs, TAs, SO
Leaders
Offering incentives to those who promote the use of
evaluations

What is PRH doing on revitalization?
New strategic framework with CAs
MEASURE Evaluation Policy Program Coordination
updating training
New M&E Working Group
Individual training and mentoring for new CAs
Rigorous evaluation studies (examples)
Updating indicator manuals in M&E
Supporting PMP development under new Fragile
States strategy.

What methodological innovations is the
MEASURE Evaluation project developing to address these
issues?
DDU, capacity building, community-based information systems
(PRISM, SAVVY, PLACE), GIS, population-environment,
poverty-equity, etc.

Can we do M&E business as usual in the so-called
Fragile States?
Innovations in M&E/Strategic Information/surveillance
on gender-based violence
USAID, through MEASURE Evaluation, is providing ideas
on design for such a system, including a
decision-support approach and indicators, with
WHO, UNICEF, UNFPA and others

SUMMARY: to light a candle or curse the darkness?

9
MEASURE Evaluation: no innovation without
evaluation

Charles mentioned several of the methods MEASURE
Evaluation has produced or is developing and
applying.
Priorities for Local AIDS Control Efforts (PLACE)
Sample Vital Registration with Verbal Autopsy
(SAVVY)
PRISM
Poverty Measures
Innovation and revitalization are all very nice,
but hang on a second: how do we know that these
innovations are up to the task?

10
Some examples

Limited time, so only a few activities can be discussed,
along with how we do our best to ensure that the M&E
methods we develop answer the questions that need
to be asked.
I'll discuss two:
SAVVY
Verbal Autopsy validation
Poverty Measures
Consumption Expenditure Proxy validation

SAVVY
Demographic Surveillance & Mortality Surveillance
D is for Denominators!
Mortality surveillance based on verbal autopsy
(VA)
How do we validate VA?

12
Background and objectives

Compare the VA to a gold standard (i.e. medical
records)
Validation of VA procedures for three age groups
Perinatal/neonatal
Post-neonatal, under 5 years
Age 5 and over
Cause of death list/coding is important!
International comparability
International Classification of Diseases (ICD)

13
Methods

A time sample of deaths, or a quota sample of a
certain number of deaths by various causes.
Must have:
Death occurred in health facility, or
Death occurred at home, but with contact with a health
facility before death (so some record exists)
AND
A VA for the same individuals to use as the basis
of comparison.
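
To make the comparison concrete, here is a minimal sketch of how paired cause assignments could be scored once each death has both a VA cause and a medical-record cause. The function name, the cause labels, and the eight example deaths are all hypothetical, and sensitivity, specificity, and positive predictive value are assumed here as the comparison measures; the slides do not state which criteria were actually used.

```python
def validation_metrics(va_causes, record_causes, cause):
    """Score VA against medical records (the reference) for one cause of death."""
    pairs = list(zip(va_causes, record_causes))
    tp = sum(1 for va, mr in pairs if va == cause and mr == cause)
    fp = sum(1 for va, mr in pairs if va == cause and mr != cause)
    fn = sum(1 for va, mr in pairs if va != cause and mr == cause)
    tn = len(pairs) - tp - fp - fn
    return {
        # Share of record-confirmed deaths from this cause that VA also found.
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        # Share of deaths from other causes that VA correctly did not assign here.
        "specificity": tn / (tn + fp) if tn + fp else None,
        # Share of VA assignments to this cause that the record confirms.
        "ppv": tp / (tp + fp) if tp + fp else None,
    }


# Hypothetical paired assignments for eight deaths (illustration only).
va = ["malaria", "malaria", "pneumonia", "tb", "malaria", "injury", "tb", "pneumonia"]
mr = ["malaria", "pneumonia", "pneumonia", "tb", "malaria", "injury", "hiv", "malaria"]

malaria = validation_metrics(va, mr, "malaria")
print(malaria)
```

Running the same function for each cause on the list gives a per-cause picture of where VA agrees with the records and where it does not.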

14
Coding & Cause of Death Assignment

Coding
ICD training provided to coding physicians
coded to ICD-10 core four-digit levels
3-line death certificates produced for all VAs
and medical records
No physician codes both MR and VA for same
individual
Validity of ICD coding verified using tools from
US National Center for Health Statistics.

15
But how many carats is the gold standard?

After verifying validity of underlying COD
Appropriate diagnostic tests
Appropriate treatment
Documented signs
Reported presenting symptoms
Consistent past medical history

16
Summing up good performance (Tanzania example)

Perinatal & neonatal causes
Birth asphyxia/respiratory disorders
Intrauterine complications
Pneumonia
Post-neonatal child causes
Pneumonia
Injuries

17
Summing up good performance

Population age 5 and over
HIV/AIDS (ICD codes B20-B24)
Malaria
Tuberculosis (ICD codes A15-A19)
Cerebrovascular diseases
Injuries
Direct maternal causes
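
As an illustration of how coded deaths could be grouped into the cause categories above, the sketch below maps an ICD-10 code to a cause group using only the two code ranges given on this slide (B20-B24 and A15-A19). The function name and the "not mapped" label are hypothetical; a real tabulation list would cover many more ranges.

```python
# Only the two ranges stated on the slide are included; everything else is unmapped.
CAUSE_RANGES = {
    "HIV/AIDS": ("B20", "B24"),
    "Tuberculosis": ("A15", "A19"),
}


def cause_group(icd_code):
    """Return the cause group for an ICD-10 code such as 'B22.0' or 'A16.2'."""
    # The first three characters (letter plus two digits) identify the category.
    category = icd_code.strip().upper()[:3]
    for name, (low, high) in CAUSE_RANGES.items():
        if low <= category <= high:
            return name
    return "not mapped"


print(cause_group("B22.0"), cause_group("a16.2"), cause_group("I64"))
```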

18
Summing up VA validation issues

Generalizability of hospital-based validation
results to community-based data (no practical
validation method!).
VA performed reasonably well (according to
specified criteria) for at least 9 causes across
all age groups
Cause-specific mortality rates possible
For causes that did not perform well:
Trends & priority setting generally still OK
Is this poorer performance due to sample sizes?
Or inherent limitations of VA?
We don't know yet
How many carats is the gold standard and how do
we factor this into validation studies?

Poverty Measurement
Wealth in people versus wealth in things
Wealth in things: Permanent Income (PI)
Consumption Expenditure as best guess of PI
Proxies for Consumption Expenditure
These give you absolute and relative measures,
e.g. how many are below the poverty line?

How to develop and validate a rapid consumption
expenditure proxy
NOT EASY!

21
Which Construct?

From a theory standpoint, options were many:
a huge literature and menu of poverty measures
Quickly narrowed to 2, taking the constraints and
criteria of estimating PI into account:
asset index approach
validated consumption expenditure proxy (CEP)
(cf. Morris 2000)

22
Development of a CEP

First, available data from a Household Budget
Survey (HBS) or Living Standards Measurement
Survey (LSMS) are used to develop preliminary models,
separately for rural and urban households.
Models identified a limited set of potential
variables from a sub-set of variables.
Full HBS or LSMS data then used to evaluate and
thereby adjust the most appropriate model.
Final models used to predict estimates of monthly
household consumption expenditure per adult
equivalent in an evaluation study.

23
Household Budget Survey (Tanzania)

2000/01 National Bureau of Statistics HBS
provided source data.
22,000 households
Consumption expenditure per adult equivalent
calculated on the basis of
Detailed expenditure data collected over a 28-day
period, combined with
a 12-month recall on major items of expenditure.
Billions and billions of variables! (well, not
that many, but too many to include for an
evaluation study!)
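
A rough sketch of how the two data sources might be combined into one monthly figure per adult equivalent is given below. Everything specific in it is an assumption for illustration: the 28-day diary is treated as one month, the 12-month recall is divided by 12, and the adult-equivalence scale (children under 15 count as 0.5) is hypothetical, not the scale used in the HBS.

```python
def adult_equivalents(ages, child_weight=0.5, adult_age=15):
    """Household size in adult equivalents (hypothetical scale, for illustration)."""
    return sum(1.0 if age >= adult_age else child_weight for age in ages)


def monthly_consumption_per_ae(diary_28_day, recall_12_month, ages):
    """Monthly household consumption expenditure per adult equivalent."""
    # Assumption: the 28-day diary total stands in for one month of spending,
    # and major items from the 12-month recall are spread evenly over 12 months.
    monthly_total = diary_28_day + recall_12_month / 12
    return monthly_total / adult_equivalents(ages)


# Hypothetical household: two adults and two children.
per_ae = monthly_consumption_per_ae(84000, 120000, [35, 32, 10, 4])
print(round(per_ae, 2))
```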

24
Model Development & Data

Regression modeling used with
Household level variables, e.g. type of toilet
facilities, access to water, ownership of a
number of assets, etc., as poverty proxies.
If source data set allows, separate models can be
developed and validated for sub-national areas where
evaluation is desired (regional level probably
lowest level in most cases).

25
Minimization & Validation

Analytical Methods
Variables selected using a backward elimination
procedure, but considering the possible
conceptual/local importance of variables
previously removed from the model (e.g. spending
money on fertilizer or seasonal labor).
Model developed using part of the data, and
validated on remaining observations.
Basic validation question: How well does the
minimal model predict the true consumption of the
household?
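
The workflow on this slide (reduce the variable set, fit on part of the data, validate on the rest) can be sketched as below. This is a simplified stand-in, not the project's procedure: the data are synthetic, the variable names are hypothetical, the split is a fixed first-30/last-10 split, and the elimination rule (drop the variable whose removal costs the least R-squared, stopping when the loss exceeds a threshold) is an assumption chosen for brevity. It also does not model the judgment step of restoring conceptually important variables.

```python
from math import sqrt


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y, cols):
    """Least-squares coefficients (intercept first) using only the columns in cols."""
    x = [[1.0] + [r[c] for c in cols] for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, rows, cols):
    return [beta[0] + sum(b * r[c] for b, c in zip(beta[1:], cols)) for r in rows]


def r_squared(rows, y, cols):
    pred = predict(fit_ols(rows, y, cols), rows, cols)
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    return 1 - ss_res / sum((t - mean_y) ** 2 for t in y)


def backward_eliminate(rows, y, n_vars, max_loss=0.05):
    """Repeatedly drop the variable whose removal costs the least R-squared."""
    keep = list(range(n_vars))
    while len(keep) > 1:
        full = r_squared(rows, y, keep)
        loss, weakest = min(
            (full - r_squared(rows, y, [c for c in keep if c != j]), j) for j in keep
        )
        if loss > max_loss:
            break
        keep.remove(weakest)
    return keep


def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))


# Synthetic households (illustration only). The last column is unrelated to y.
NAMES = ["household_size", "meat_days", "owns_bicycle", "filler"]
rows = [[1 + i % 7, (3 * i) % 8, i % 2, (5 * i) % 3] for i in range(40)]
y = [50 - 4 * r[0] + 3 * r[1] + 10 * r[2] + ((7 * i) % 5 - 2) for i, r in enumerate(rows)]

fit_rows, fit_y = rows[:30], y[:30]    # set A: used to select variables and fit
val_rows, val_y = rows[30:], y[30:]    # set B: held out for validation

kept = backward_eliminate(fit_rows, fit_y, len(NAMES))
beta = fit_ols(fit_rows, fit_y, kept)
r_val = pearson_r(predict(beta, val_rows, kept), val_y)
print([NAMES[c] for c in kept], round(r_val, 2))
```

The correlation between predicted and observed values on the held-out set is one simple answer to the basic validation question; on real survey data it will be far lower than on this clean synthetic example.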

26
Validation: Model applied to an external data set

Data set A used for fitting the model
Remaining data (set B) used for validating model

r = 0.72

27
Model Results

Best predictors of consumption expenditure
explained 60-65% of variation in consumption
expenditure
Compression toward the mean (misclassifies some
of the poorest).
Common variables to consider
Household size
Education level of head of household
Number of days meat eaten in past week.
Urban variables
Status of walls
Whether household owned an iron, an electric/gas
stove, an automobile
In past month whether household paid money to
purchase certain food items
Rural variables
Area of land used for farming/pastoralism
Whether household spent money to purchase
agricultural inputs.
Number of persons employed in household (inc.
self employed)
Main source of drinking water
In past 12 months whether household spent money
to purchase fertiliser/manure
Whether household owned a bicycle, a bed
net
Toilet facility available
Main fuel used for lighting.

Best predictors of consumption expenditure
(rural)
Kilimanjaro (rural): R² = 65%
Age of household head
Area of land used for farming/pastoralism
In past 12 months whether household spent money
to purchase seeds.
In past 12 months whether household spent money
to purchase fertiliser/manure.
Whether household owned a bicycle, sofa, lamp
Main source of cash income.
Morogoro (rural): R² = 56%
Sex and age of household head
Number of persons employed in household (inc.
self employed)
Dependency ratio
Number of persons per sleeping room
Main source of drinking water
In past 12 months whether household spent money
to purchase fertiliser/manure
Whether household owned a bicycle, a bed
net
Status of walls
Toilet facility available
Main fuel used for lighting.

29
Model Performance
30
AMMP CEP Compared to PCA Index

Some concerns relating to PCA derived asset
index
Variable selection?
Connection to wealth
Binary variables
Would one set of PCA coefficients, nationally
derived, be likely to give meaningful results?
Is it valid to regard a categorical variable
(e.g. main source of drinking water) as a set of
independent binary variables in PCA?
To what extent is the asset index suitable for
determining wealth quintiles?
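
The concern about treating a categorical variable as independent binary variables can be seen in a few lines. In the sketch below, the water-source categories and the five households are hypothetical. Because each household falls in exactly one category, the binary columns always sum to 1, so they are linearly dependent and negatively correlated with each other by construction.

```python
from math import sqrt


def one_hot(values):
    """Expand a categorical variable into one binary column per category."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return cov / sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))


# Hypothetical main source of drinking water for five households.
water = ["piped", "well", "river", "well", "piped"]
categories, dummies = one_hot(water)

# Every row sums to 1: knowing all but one column fixes the last one.
row_sums = [sum(row) for row in dummies]

piped = [row[categories.index("piped")] for row in dummies]
well = [row[categories.index("well")] for row in dummies]
r_piped_well = pearson_r(piped, well)
print(row_sums, round(r_piped_well, 3))
```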

31
Graphical Comparison
[Figure: three comparison panels. 1: PCA Asset Index (r = 0.46); 2: Additive Index (r = 0.44); 3: CEP (r = 0.76)]

32
Conclusions 1

Proxies generally perform poorly, but CEP may be
best of worst so far
Method requires HBS or similar, and separate
modeling for each region of a country
CEP approach stood up well to model assessment
criteria and cross-validation against an external
data set.

33
Conclusions 2

Performance was reasonable for predicting means
So able to use these in relating to health
outcome variables (measured at community level).
Predictions at an individual household level are
much less reliable.
Less than 50% of population classified into
correct quintile, but results much better
than similar results from alternatives.
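
A minimal sketch of the quintile check referred to above: rank households by actual and by predicted consumption, assign quintiles, and count how many land in the same quintile under both. The ten values are made up, with the predictions deliberately compressed toward the mean, and the rank-based quintile rule is an assumption rather than the study's method.

```python
def quintiles(values):
    """Assign each value a quintile from 1 (lowest) to 5 (highest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank * 5 // len(values) + 1
    return result


def share_correct(actual, predicted):
    """Fraction of households placed in the same quintile by both measures."""
    pairs = zip(quintiles(actual), quintiles(predicted))
    return sum(a == p for a, p in pairs) / len(actual)


# Hypothetical consumption per adult equivalent: observed versus proxy-predicted.
actual = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
predicted = [35, 30, 40, 45, 50, 55, 52, 70, 65, 80]

agreement = share_correct(actual, predicted)
print(agreement)
```

Even when group means are predicted well, a few rank swaps near quintile boundaries are enough to pull this share down, which is why household-level classification is the harder test.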

34
Recommendations

No innovation without validation!
In 2005 this should be non-negotiable.
How good are these new tools?
How good are the old ones?
Keep your eyes on the evaluation prize!
Given the non-negotiable need to know something
concrete about method performance
How much power does your evaluation need?
Plausibility? Explanatory? Validating right
priorities for health decision-making
Do you get what you pay for?
Depends on the decisions results