EVIDENCE AND
PROSPECTS FOR A
HEALTHY NIGERIA
March 31,
2016
IMPACT EVALUATION
David Evans
World Bank
March 31,
2016
WHAT IS IMPACT
EVALUATION?
“An impact evaluation assesses
changes in the well-being of
individuals, households,
communities or firms that can
be attributed to a particular
project, program or policy.”
-World Bank
OBJECTIVE
Evaluate the causal impact of a program or an
intervention on some outcome
Examples
• How much does exposure to a television soap opera affect HIV/AIDS
awareness and testing?
• How much do monetary incentives reduce turnover among midwives?
What about non-monetary incentives?
• How much do a Quality Improvement Plan and coaching increase the
quality of care at primary health care facilities?
• How much does providing improved housing for midwives reduce
turnover in rural areas?
WHY EVALUATE?
1. Evaluation helps to learn whether programs
are actually achieving their objectives.
2. Evaluation helps to improve program
effectiveness.
3. Evaluation helps to garner resources for
scale-up.
WHAT DO WE NEED?
A COUNTERFACTUAL
What would have happened in the absence
of the program?
COUNTERFACTUAL CRITERIA
Treated & comparison groups
1. Have identical average characteristics (observed &
unobserved)
2. The only difference is the treatment
3. Therefore the only reason for any difference in
outcomes is the treatment
Key question: What would the participant look like if she
hadn’t received the program?
PERFECT EXPERIMENT
Identify target beneficiaries
Clone them!
Identical on the outside (observable)
Identical on the inside (unobservable)
Kami Tami
We’re both five-
year-old puppets
We both love to
take up new
health
interventions!
Images © Sesame Workshop
PERFECT EXPERIMENT
Give the intervention to one set of
clones
Kami
Tami
PERFECT EXPERIMENT
Observe some time later
Because the groups are identical (inside &
out), the difference is due to the bednets!
Kami
Tami
FINDING A GREAT CONTROL GROUP
What would the participant look like if she
weren’t in the program?
Room For Improvement Control Groups
Before – After
Participants – Non Participants
RFI: BEFORE-AFTER
Before bednets
6 malaria episodes in 6
months
After bednets
2 malaria episodes in 6
months
What else might be going on besides the bednets?
• Seasonal differences
• Rising incomes: Households invest in other
measures
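A rough sketch of why the before-after number can mislead, using the bednet figures from this slide. The comparison-group figures (6 episodes falling to 4 without bednets) are hypothetical, invented only for illustration, and subtracting the comparison group's change is one common adjustment that itself assumes both groups would otherwise have followed the same trend.

```python
def before_after(before, after):
    """Naive estimate: change in the outcome for participants only."""
    return after - before


def change_relative_to_comparison(t_before, t_after, c_before, c_after):
    """Participants' change minus the change in a comparison group.

    Assumes (hypothetically) that both groups would have followed the
    same trend in the absence of the program.
    """
    return (t_after - t_before) - (c_after - c_before)


# Figures from the slide: 6 malaria episodes before bednets, 2 after.
naive = before_after(6, 2)

# Hypothetical comparison households without bednets: 6 episodes, then 4,
# for example because the second period fell in the dry season.
adjusted = change_relative_to_comparison(6, 2, 6, 4)

print(naive)     # -4
print(adjusted)  # -2
```

Under these made-up numbers, half of the apparent drop would have happened anyway, which is the kind of factor the before-after comparison cannot separate out.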
RFI: BEFORE-AFTER
Important to monitor before-after
• Monitoring systems tell us if things are moving in
the right direction
Insufficient to show impact of program
Too many factors changing over time
• Example of cash transfers in Nicaragua!
Counterfactual: What would have happened
in the absence of the project, with everything
RFI: PARTICIPANTS VS NON-PARTICIPANTS
Compare recipients of a program
to
• People who were not eligible for the program
• People who chose not to enroll in the program
Example: Complications in childbirth
Home births vs. clinic births
Impact of clinic births?
What else might explain
the difference?
RFI: PARTICIPANTS VS NON-PARTICIPANTS
Observable differences
• Income
• Education
Unobservable differences
• Heard rumor about hospitals
• Neighbor available to care for other children
Kami
Grover
© Muppet photos copyright Sesame
RFI: PARTICIPANTS VS NON-PARTICIPANTS
Example: Complications in childbirth
Home births vs. clinic births
Difference: impact of clinic births + other factors!
No way to know how much of the difference is because of the clinic
SELECTION BIAS
People who choose to join the program are
different!
If we cannot account completely for those
differences in our data…
We usually cannot
• How do you capture attitudes toward health systems? Initiative?
…then our comparison will not show the true
impact of the program
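The selection problem can be illustrated with a small simulation. This is only a sketch under invented assumptions: an unobserved trait called `initiative`, a true program effect of -1.0 on a made-up outcome scale, and an enrollment rule in which people with more initiative are more likely to join. None of these numbers come from the slides.

```python
import random
import statistics


def naive_comparison_with_self_selection(n=20000, true_effect=-1.0, seed=0):
    """Compare participants with non-participants when people choose to enroll."""
    rng = random.Random(seed)
    participants, non_participants = [], []
    for _ in range(n):
        # Unobserved trait (hypothetical): not recorded in any dataset.
        initiative = rng.gauss(0, 1)
        # People with more initiative are more likely to enroll.
        enrolls = initiative + rng.gauss(0, 1) > 0
        # Initiative also improves (lowers) the outcome on its own.
        outcome = 5.0 - 1.0 * initiative + rng.gauss(0, 1)
        if enrolls:
            participants.append(outcome + true_effect)
        else:
            non_participants.append(outcome)
    return statistics.mean(participants) - statistics.mean(non_participants)


naive = naive_comparison_with_self_selection()
print(f"true effect: -1.0, naive comparison: {naive:.2f}")
```

In this setup the naive comparison overstates the benefit, because participants would have had better outcomes even without the program.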
WHAT SHOULD WE
DO?
Gold standard:
Randomized experimental design
RANDOMIZED EXPERIMENTAL
DESIGN
Randomly assign potential beneficiaries to be
in the treatment or comparison group
Treatment and comparison have the same
characteristics (observed and unobserved), on
average
Any difference in outcomes is due to
treatment
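A minimal simulation of this claim, under invented assumptions: a hypothetical unobserved trait called `initiative`, a true effect of -1.0 on a made-up outcome scale, and a coin flip deciding who is treated. It checks that the unobserved trait is balanced on average and that the simple difference in mean outcomes lands close to the true effect.

```python
import random
import statistics


def simulate_random_assignment(n=20000, true_effect=-1.0, seed=0):
    """Assign treatment by coin flip and compare mean outcomes."""
    rng = random.Random(seed)
    treated_y, comparison_y = [], []
    treated_trait, comparison_trait = [], []
    for _ in range(n):
        # Unobserved trait (hypothetical) that also affects the outcome.
        initiative = rng.gauss(0, 1)
        outcome = 5.0 - 1.0 * initiative + rng.gauss(0, 1)
        # Random assignment: the trait plays no role in who is treated.
        if rng.random() < 0.5:
            treated_y.append(outcome + true_effect)
            treated_trait.append(initiative)
        else:
            comparison_y.append(outcome)
            comparison_trait.append(initiative)
    estimate = statistics.mean(treated_y) - statistics.mean(comparison_y)
    imbalance = statistics.mean(treated_trait) - statistics.mean(comparison_trait)
    return estimate, imbalance


estimate, imbalance = simulate_random_assignment()
print(f"estimated effect: {estimate:.2f} (true effect: -1.0)")
print(f"difference in unobserved trait between groups: {imbalance:.3f}")
```

With a large sample the imbalance is close to zero; with only a handful of units the same code can produce noticeably unbalanced groups.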
RANDOMIZED EXPERIMENTAL DESIGN
Randomization with two doesn’t work!
We don’t even look similar!
Comparison Treatment Comparison Treatment
But differences average out in a big sample
On average, same number of Kamis and Grovers
• Observable AND unobservable
Result: Measure true impact of program
RANDOM ASSIGNMENT
Random sample
Gather data from random
sample of population
No guarantee of unbiased
impact measure
Random
assignment
Randomly assign program
Unbiased impact measure!
Treatment Control Treatment Control
CAN WE RANDOMIZE?
Randomization does not mean denying people the
benefits of the project
Usually there are existing constraints within project
implementation that allow randomization
Randomization is the fairest way to allocate treatment
• Tanzania CCT: Randomized across needy villages
• Nigeria Quality Improvement: Lottery among eligible facilities
RANDOMIZATION OPPORTUNITIES
STAGGERED ROLL-OUT OF
PROGRAM
Roll-out to 200
clinics
Roll-out to 200
more clinics
Roll-out to 400
more clinics
Jan
2013
July
2013
Jan
2014
• Randomize the order in which clinics receive
program
• Compare Jan 2013 group to Jan 2014 group at
end of first year
• Example: Mexico parent health training –
staggered roll-out among vulnerable
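Randomizing the roll-out order could be sketched as follows. The phase sizes (200, 200, 400) and dates come from the slide; the clinic identifiers, the function name `assign_rollout_phases`, and the fixed seed are hypothetical choices made for illustration.

```python
import random


def assign_rollout_phases(clinic_ids, phase_sizes, seed=2013):
    """Shuffle clinics and fill each roll-out phase in order.

    phase_sizes maps a phase label to the number of clinics it receives.
    """
    if sum(phase_sizes.values()) != len(clinic_ids):
        raise ValueError("phase sizes must add up to the number of clinics")
    rng = random.Random(seed)  # fixed seed so the lottery can be reproduced
    shuffled = list(clinic_ids)
    rng.shuffle(shuffled)
    assignment = {}
    start = 0
    for phase, size in phase_sizes.items():
        for clinic in shuffled[start:start + size]:
            assignment[clinic] = phase
        start += size
    return assignment


# Hypothetical clinic identifiers; phase sizes as on the slide.
clinics = [f"clinic_{i:03d}" for i in range(800)]
phases = {"Jan 2013": 200, "July 2013": 200, "Jan 2014": 400}
assignment = assign_rollout_phases(clinics, phases)
print(assignment["clinic_000"])
```

At the end of the first year, the clinics drawn for Jan 2014 have not yet received the program, so they can serve as the comparison group for the Jan 2013 clinics.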
RANDOMIZATION OPPORTUNITIES
SOME GROUPS MUST GET THE
PROGRAM!
Highly vulnerable
Moderately vulnerable
Not vulnerable
Example: Program for children in Kenya
• Orphans – Must have program now!
• Randomized among less vulnerable children
RANDOMIZATION OPPORTUNITIES
VARY TREATMENT
Vary intensity
• Malaria information campaign: 100 villages
• Malaria information campaign + SMS reminders: 100 villages
Randomize across communities
Additional impact of SMS reminders?
Vary nature
• Radio campaign: 100 villages
• Newspaper campaign: 100 villages
Randomize across communities
Which approach has greater impact?
UNIT OF RANDOMIZATION
At what level should I randomize?
• Individual
• Household
• Clinic
• Community
Considerations
• Political feasibility of randomization at individual level
• Spillovers within groups
• Implementation capacity: One clinic administering
different treatments
UNIT OF RANDOMIZATION
Bigger unit = Bigger study
(Because of intra-community correlation)
Individual randomization:
630 participants
(315 treatment, 315 control)
Clinic-level randomization:
150 clinics
(75 treatment, 75 control)
3,000 participants!
Number of units you randomize matters more than total
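The jump from 630 to about 3,000 participants is roughly what the standard design-effect formula, DEFF = 1 + (m - 1) × rho, would give. The sketch below uses rho = 0.2 from the editor's notes; the cluster size of m = 20 participants per clinic is an assumption inferred from 3,000 / 150, not a figure stated on the slide.

```python
import math


def design_effect(cluster_size, icc):
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(n_individual, cluster_size, icc):
    """Inflate an individually randomized sample size for cluster randomization."""
    n_total = n_individual * design_effect(cluster_size, icc)
    n_clusters = math.ceil(n_total / cluster_size)
    return n_clusters, n_clusters * cluster_size


# rho = 0.2 is from the editor's notes; 20 per clinic is an assumption.
deff = design_effect(cluster_size=20, icc=0.2)
clusters, total = clustered_sample_size(630, cluster_size=20, icc=0.2)

print(round(deff, 2))   # 4.8
print(clusters, total)  # 152 3040
```

Under these assumptions the result (about 152 clinics and 3,040 participants) is close to the 150 clinics and 3,000 participants on the slide, which presumably reflects rounding or a slightly different calculation.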
WHAT IF RANDOMIZATION IS
IMPOSSIBLE?
Think again: It often is possible on some
level, and it’s the best way to get a clear
measure of impact
Some situations, not possible
• Evaluate the effect of a national health policy
• Interventions in the past
• Life saving vaccination (volunteers for control
group?)
Alternative methods available, compelling
in some circumstances
I volunteer!
A COUPLE OF LAST BIG
POINTS
WE SHOULD DO AN
EVALUATION IF A PROGRAM
IS…
1. Innovative: This approach hasn’t been used
before
2. Replicable: The program may be scaled up
3. Strategically relevant: The program could
involve significant resources or affect many people
4. Untested: We don’t know how well it works
5. Influential: The results will be used to make a
policy decision
Adapted from Impact Evaluation in
WHAT MAKES A GREAT
IMPACT EVALUATION
QUESTION?
1. Cause-effect
• YES: “What is the impact of ______ on ______?”
• NO: “Who is taking up our antenatal care
program?”
2. Prospective (future-looking)
• YES: “What is the impact of this program we are
about to roll out?”
• NO: “What was the impact of a program we rolled
out 5 years ago?”
KEY CONCLUSIONS
Impact evaluation tells us if our programs are
working
Randomization of treatment leads to unbiased
estimate of impact
Other methods rely on more assumptions
Lots of opportunities for randomization
• No withholding of benefits
• Staggered roll-out
• Varied treatment
Thank you!

Editor's Notes

  • #14 RFI stands for “Room for Improvement”
  • #15 In Nicaragua, a cash transfer program was followed by a significant reduction in income. But there was also a massive drop in coffee prices. It turns out that the cash transfer recipients had less of a reduction.
  • #29 Delta = 0.2, rho = 0.2, alpha = 0.05.