Presented by Pascale Schnitzer and Carlo Azzarri, IFPRI at the Africa RISING–CSISA Joint Monitoring and Evaluation Meeting, Addis Ababa, Ethiopia, 11-13 November 2013
1. Evaluation in Africa RISING
Pascale Schnitzer and Carlo Azzarri, IFPRI
Africa RISING–CSISA Joint Monitoring and Evaluation Meeting,
Addis Ababa, Ethiopia, 11-13 November 2013
3. Quantitative
Experimental
• RCT
• Choice experiments, auctions, games
Quasi-experimental
• Double-Difference (Diff-in-Diff)
• Matching
• RD (Regression Discontinuity)
• IV (Instrumental Variables) and encouragement design
4. Example: providing fertilizers to farmers
Intervention: provide fertilizer to farmers in district A
Program targets all farmers living in district A
Farmers have to enroll at the local extension office to
receive the fertilizer
District B does not receive any intervention
Program starts in 2012 and ends in 2016; we have yield data for farmers in district A and district B for both years
What is the effect of providing fertilizer on agricultural yields?
5. Case I: Before & After
(1) Observe only beneficiaries
(2) Two observations in time: yields at T=0 and yields at T=1
[Figure: yields over time for district A farmers. Yield is B = 2,000 at T = 2012 and A = 2,200 at T = 2016; the before-after change is α = 200.]
IMPACT = A - B = 200?
6. Case I: What’s the problem?
Unusually good weather/rain:
o Real impact = A - C
o A - B is an overestimate
Drought:
o Real impact = A - D
o A - B is an underestimate
[Figure: the same before-after chart, now with counterfactual yields at T = 2016: C (good weather, above 2,000) and D (drought, below 2,000). The true impact is the distance from A to the counterfactual, not A - B.]
7. Case II: Those Enrolled & Not Enrolled
If we have post-treatment data on
o Enrolled: treatment group
o Not enrolled: "comparison" group (counterfactual)
(a) Those that choose NOT to participate
(b) Those ineligible to participate (e.g. neighbor community)
[Figure: yields in 2016 by participant type - choose to participate: 2,200; choose not to participate: 2,500; ineligible to participate: 2,800]
Did AR have a negative impact?
8. Case I and II
In the end, with these naïve comparisons, we cannot tell whether the program had an impact.
We need a comparison group that is as similar as possible, in both observable and unobservable dimensions, to those receiving the program, and that will not receive spillover benefits.
9. We need to keep in mind…
B&A
Compare: Same individuals Before and After they receive P.
Problem: Other things may have happened over time.
E&NE
Compare: Group of individuals Enrolled in a program with a group that chooses not to enroll.
Problem: Selection bias. We don't know why they are not enrolled.
Both counterfactuals may lead to biased estimates of the impact.
13. Choice experiments, auctions, games
• An experiment is a set of observations generated in a controlled
environment to answer a particular question or solve a particular
problem.
• Subjects make decisions that are not part of their day-to-day
decision making (typically in a game environment), they know they
are part of an experiment, or both.
• Purposes:
1. Test theories
2. Measure what are considered “unobservables” (e.g.
preferences, beliefs)
3. Test sensitivity of experimental results to different forms of
heterogeneity
14. Choice experiments, auctions, games
• Examples:
-behavioral game theory
-ultimatum games
-dictator games
-trust games
-public good games
-coordination games
-market experiments (auctions)
-risk- and time-preference experiments
16. Probability of adoption
Impact = (A - B) - (C - D) = (A - C) - (B - D)
[Figure: probability of adoption over time. Non-participants: D = 0.78 before (T=0), C = 0.81 after (T=1). Participants: B = 0.60 before, A = 0.74 after. Impact = (0.74 - 0.60) - (0.81 - 0.78) = 0.11]
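The double-difference above can be computed directly from the slide's adoption probabilities; a minimal sketch (variable names are illustrative):

```python
# Difference-in-differences with the adoption probabilities from the slide.
A, B = 0.74, 0.60  # participants: after (T=1), before (T=0)
C, D = 0.81, 0.78  # non-participants: after (T=1), before (T=0)

# Change among participants minus change among non-participants.
impact = (A - B) - (C - D)

# Algebraically equivalent: cross-group gap after minus gap before.
assert abs(impact - ((A - C) - (B - D))) < 1e-12

print(round(impact, 2))  # 0.11
```

The comparison group's trend (C - D) nets out what would have happened anyway, which is why DiD improves on the naïve before-after comparison.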
18. Example from Malawi:
Total land used (acres)

                     Treatment group    Counterfactual    Impact
                     (randomized to     (randomized to    (Y | P=1) - (Y | P=0)
                     treatment)         comparison)
Baseline (T=0), Y         3.04               2.13               0.91
Follow-up (T=1), Y         ??                 ??                 ??
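Under randomization, the follow-up impact estimate reduces to a difference in group means. A minimal sketch; the follow-up values below are hypothetical, since the slide's follow-up cells are still "??":

```python
# With randomized assignment, the impact estimate at follow-up is simply
# the difference in mean outcomes: (Y | P=1) - (Y | P=0).
def mean(values):
    return sum(values) / len(values)

# Hypothetical follow-up values for total land used (acres); the real
# follow-up data in the slide had not yet been collected.
treatment_group = [3.5, 4.0, 3.8, 3.6]
comparison_group = [2.4, 2.6, 2.5, 2.3]

impact = mean(treatment_group) - mean(comparison_group)
print(round(impact, 3))
```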
20. Propensity-Score Matching (PSM)
Comparison group: non-participants with the same observable characteristics as participants.
In practice, this is very hard: there may be many important characteristics!
Match on the basis of the "propensity score":
o Compute everyone's probability of participating, based on their observable characteristics.
o Choose matches that have the same probability of participation as the treatments.
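The two-step procedure above can be sketched on synthetic data. This assumes NumPy and scikit-learn are available; the variables (farm size, schooling) and the simulated effect of 200 are illustrative, not from the deck:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic farm data: participation depends on observable farm size.
rng = np.random.default_rng(0)
n = 400
land = rng.normal(2.0, 0.5, n)               # observable: farm size (ha)
educ = rng.integers(0, 12, n).astype(float)  # observable: schooling (years)
p_participate = 1 / (1 + np.exp(-(land - 2)))
treated = (rng.random(n) < p_participate).astype(int)
yields = 2000 + 100 * land + 5 * educ + 200 * treated + rng.normal(0, 50, n)

# Step 1: estimate each unit's probability of participating (the score).
X = np.column_stack([land, educ])
pscore = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: match each participant to the non-participant with the
# closest propensity score (nearest neighbour, with replacement).
t_idx = np.flatnonzero(treated == 1)
c_idx = np.flatnonzero(treated == 0)
nearest = c_idx[np.abs(pscore[t_idx][:, None] - pscore[c_idx][None, :]).argmin(axis=1)]

att = (yields[t_idx] - yields[nearest]).mean()  # effect on the treated
print(round(att, 1))  # should land near the simulated effect of 200
```

Note the caveat from the slides still applies: if unobservables drive participation, matching on observables alone leaves selection bias.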
23. RD: Effect of fertilizer program on adoption
Goal
Improve fertilizer adoption among small farmers
Method
o Farms with a score (ha of land) ≤ 2 are small
o Farms with a score (ha of land) > 2 are not small
Intervention
Small farmers receive subsidies to purchase fertilizer
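A minimal regression-discontinuity sketch on synthetic data using the slide's eligibility rule (land ≤ 2 ha ⇒ subsidized); the simulated jump of 0.3 and the bandwidth are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
land = rng.uniform(0.5, 4.0, n)            # running variable: farm size (ha)
subsidized = (land <= 2.0).astype(int)     # eligibility rule from the slide

# Adoption rises smoothly with land but jumps by 0.3 where subsidies apply.
adopt_prob = np.clip(0.2 + 0.1 * land + 0.3 * subsidized, 0.0, 1.0)
adopted = (rng.random(n) < adopt_prob).astype(int)

# Farms just below and just above the cut-off are very similar, so the
# gap in adoption across the threshold estimates the subsidy's local effect.
h = 0.25                                   # bandwidth around the cut-off
just_below = adopted[(land > 2.0 - h) & (land <= 2.0)].mean()
just_above = adopted[(land > 2.0) & (land <= 2.0 + h)].mean()
print(round(just_below - just_above, 2))   # local estimate of the effect
```

As the notes point out, the estimate is local to the cut-off: it says little about farms far from the 2 ha threshold.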
27. Babati (WP2): Timeline and design of an evaluation
Feb 13: Survey of 800 farmers in 11 villages
July 13: Initial planting at demonstration plots
Aug-Oct 13: Follow-up field day: farmers rank preferred seeds
Nov 13 - Mar 14: Fertilizer and seed distribution
o 200 receive improved seeds
o 200 receive improved seeds and fertilizer
o 200 receive seeds, fertilizer and contracts
o 200 receive no additional intervention
Mar 2016: End-line survey: measure impacts
29. Qualitative
• Semi-structured or open-ended in-depth interviews
• Focus groups
• Outcome Mapping
• Participatory Impact Pathways Analysis
(PIPA)
30. Outcome Mapping (OM)
• Contribution of AR to changes in the
actions, behaviors, relationships, activities of the ‘boundary
partners’ (individuals, groups, and organizations with whom AR
interacts directly and with whom it anticipates opportunities for
influence)
• It is based largely on systematized self-assessment
• OM is based on three stages:
1. Intentional design (Why? Who? What? How?)
2. Outcome and performance monitoring
3. Evaluation design
31. Outcome Mapping (OM)
By using OM, AR would not claim the achievement of development impacts; rather, the focus is on its contributions to outcomes. These outcomes, in turn, enhance the possibility of development impacts, but the relationship is not necessarily a direct one of cause and effect.
33. Participatory Impact Pathways Analysis (PIPA)
• PIPA begins with a participatory workshop where stakeholders make explicit their assumptions about how their project will achieve impact. Participants construct problem trees, a visioning exercise, and network maps to help them clarify their 'impact pathways' (IPs).
• IPs are then articulated in two logic models:
1. The outcomes logic model -> the project's medium-term objectives in the form of hypotheses: which actors need to change, what the changes are, and which strategies are needed to attain them.
2. The impact logic model -> how, by helping to achieve the expected outcomes, the project will impact people's livelihoods. Participants derive outcome targets and milestones, regularly revisited and revised as part of M&E.
36. Conclusions
• We cannot do everything in every megasite…
• Quantitative surveys are being conducted/planned in
every country
• IFPRI has a comparative advantage in quantitative approaches; we shall split the tasks with the research teams on qualitative methods -> mixed methods
• Is IFPRI M&E on the right track? What shall we be focusing on more? What shall we not be doing?
37. Africa Research in Sustainable Intensification for the Next Generation
africa-rising.net
Editor's Notes
Talk about evaluation methods used or that can potentially be used in AR. What are the objectives of evaluation? Not only to learn whether something worked or not, but also…
How do we estimate impact? Let's think about comparing before and after
What is the impact
The rest of the world moves on and you are not sure what was caused by the program and what by the rest of the world
(a) Farmers that chose not to participate did so because they have big plots of land and are already producing the amount they need/want; poor farmers with small plots, on the other hand, went and got the fertilizer. (b) Farmers in region B have better-quality soil and better equipment, and are wealthier; farmers in region B have more irrigation, which is key in this drought year (observable).
Considered to be the gold-standard method of impact evaluation.
Experiments enable us to construct a proper counterfactual scenario for establishing causality, and thus for identifying effects. E.g. a framed field experiment to test the seminal hypothesis that insurance induces farmers to take greater, yet profitable risks
Quasi-experimental methods
It compares the changes in outcomes over time between a population that is enrolled in a program (the treatment group) and a population that is not (the comparison group).
'Parallel trends' in the absence of the program. Required assumption: without treatment, outcomes would need to increase or decrease at the same rate in both groups; in other words, outcomes must display equal trends in the absence of treatment. Intuition: what if the drought only happened in the neighboring community? What if the drought happened in both the treatment and control communities, but the control community had an irrigation system in place so that this shock did not affect them in the same way?
DID and potentially PSM
Idea: for each treated unit, pick the best comparison unit (match) from another data source. How: matches are selected on the basis of similarities in observed characteristics. Issue: if there are unobservable characteristics and those unobservables influence participation: selection bias!
We have a (1) continuous eligibility index with a (2) defined cut-off
X: amount of land. Y: yields. The line represents the average. We have a (1) continuous eligibility index with a (2) defined cut-off. Units just above the cut-off point are very similar to units just below it, making a good comparison. Compare outcomes Y for units just above and below the cut-off point.
Culled from discussions with geneticist/breeder Macdonald Jumbo Bright (CIMMYT), on-the-ground research implementers and extensionists, and researcher Joseph Rusike (IITA). For the 'mini-IE', xxx.
Qualitative research can then probe and explain contextual differences and shed light on what quantitative results cannot explain.