Evaluation in Africa RISING


Presented by Pascale Schnitzer and Carlo Azzarri, IFPRI at the Africa RISING–CSISA Joint Monitoring and Evaluation Meeting, Addis Ababa, Ethiopia, 11-13 November 2013

Published in: Technology, Economy & Finance

  • Talk about evaluation methods used, or that could potentially be used, in AR. What are the objectives of evaluation? Not only to learn whether something worked or not, but also
  • How do we estimate impact? Let's think about comparing before and after.
  • What is the impact?
  • The rest of the world moves on and you are not sure what was caused by the program and what by the rest of the world
  • (b) Farmers in region B have better-quality soil and better equipment, and are wealthier. (b) Farmers in region B have more irrigation, which is key in this drought year (observable). (a) Farmers who chose not to participate did so because they have big plots of land and are already producing the amount they need or want. On the other hand, poor farmers with small plots went and got the fertilizer.
  • Considered to be the gold standard method of impact evaluation.
  • Experiments enable us to construct a proper counterfactual scenario for establishing causality, and thus for identifying effects. E.g. a framed field experiment to test the seminal hypothesis that insurance induces farmers to take greater, yet profitable risks
  • Quasi experimental methods
  • It compares the changes in outcomes over time between a population that is enrolled in a program (the treatment group) and a population that is not (the comparison group).
  • ‘Parallel trends’ in the absence of the program. Requires assumption: without treatment, outcomes would need to increase or decrease at the same rate in both groups. In other words, we require that outcomes display equal trends in the absence of treatment. Intuition example: What if the drought only happened in the neighbor community? What if the drought happened in both the treatment and control communities, but the control community had an irrigation system in place so that this shock did not affect them in the same way?
  • DID and potentially PSM
  • Idea: For each treated unit, pick the best comparison unit (match) from another data source. How: Matches are selected on the basis of similarities in observed characteristics. Issue: If there are unobservable characteristics and those unobservables influence participation: selection bias!
  • We have a (1) continuous eligibility index with a (2) defined cut-off
  • X: amount of land. Y: yields. Line represents average. We have a (1) continuous eligibility index with a (2) defined cut-off. Units just above the cut-off point are very similar to units just below it: a good comparison. Compare outcomes Y for units just above and below the cut-off point.
  • Culled from discussions with geneticist/breeder Macdonald Jumbo Bright (CIMMYT), on-the-ground research implementers and extensionists, and researcher Joseph Rusike (IITA). For the ‘mini-IE’, xxx.??
  • Qualitative research can then probe and explain contextual differences and shed light on what the quantitative results leave unexplained.

    1. 1. Evaluation in Africa RISING Pascale Schnitzer and Carlo Azzarri, IFPRI Africa RISING–CSISA Joint Monitoring and Evaluation Meeting, Addis Ababa, Ethiopia, 11-13 November 2013
    2. 2. Outline • Quantitative (experimental and quasi-experimental) • Qualitative • Mixed methods
    3. 3. Quantitative Experimental • RCT • Choice experiments, auctions, games Quasi-experimental • Double-Difference (Diff-in-Diff) • Matching • RD • IV and encouragement design
    4. 4. Example: providing fertilizers to farmers • Intervention: provide fertilizer to farmers in district A • Program targets all farmers living in district A • Farmers have to enroll at the local extension office to receive the fertilizer • District B does not receive any intervention • Starts in 2012, ends in 2016; we have data on yields for farmers in district A and district B for both years • What is the effect of giving fertilizer on agricultural yields?
    5. 5. Case I: Before & After (1) Observe only beneficiaries (2) Two observations in time: yields at T=0 and yields at T=1. [Figure: yields (Y) over time; B = 2000 at T=2012, A = 2200 at T=2016, α = 200.] IMPACT = A-B = 200?
    6. 6. Case I: What’s the problem? Unusually good weather/rain: real impact = A-C, so A-B is an overestimate. Drought: real impact = A-D, so A-B is an underestimate. [Figure: yields over time; B = 2000 at T=2012, A = 2200 at T=2016, α = 200; C? and D? mark the unobserved counterfactual yields at T=2016.]
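The before-and-after arithmetic on this slide can be sketched in a few lines of Python. A and B are the slide's numbers; the counterfactual yields C and D are not given on the slide, so the values below are purely hypothetical, chosen only to show the direction of the bias.

```python
# Before-and-after estimate from Case I, using the slide's numbers.
yield_2012 = 2000  # B: beneficiaries' yield before the program
yield_2016 = 2200  # A: beneficiaries' yield after the program

naive_impact = yield_2016 - yield_2012  # A - B


def true_impact(observed_after, counterfactual_after):
    """Impact = observed outcome minus the outcome without the program."""
    return observed_after - counterfactual_after


# Hypothetical counterfactual yields (the slide leaves C and D unspecified).
good_weather_counterfactual = 2100  # C: yields would have risen anyway
drought_counterfactual = 1900       # D: yields would have fallen

impact_good_weather = true_impact(yield_2016, good_weather_counterfactual)
impact_drought = true_impact(yield_2016, drought_counterfactual)

print(naive_impact, impact_good_weather, impact_drought)
```

With these assumed values, A-B overstates the impact in the good-weather case and understates it in the drought case, which is the point the slide makes.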
    7. 7. Case II: Those Enrolled & Not Enrolled. If we have post-treatment data on o Enrolled: treatment group o Not enrolled: “comparison” group (counterfactual): (a) those that choose NOT to participate, (b) those ineligible to participate (e.g. neighbor community). [Figure: bar chart of yields in 2016 by participant type (choose to participate, choose not to participate, ineligible to participate); bar labels 2800, 2500 and 2200 on a 0 to 3000 axis.] Did AR have a negative impact?
    8. 8. Case I and II: In the end, with these naïve comparisons, we cannot tell whether the program had an impact. We need a comparison group that is as similar as possible, in both observable and unobservable dimensions, to those receiving the program, and that will not receive spillover benefits.
    9. 9. We need to keep in mind… Before & After (B&A): Compare: the same individuals before and after they receive the program (P). Problem: other things may have happened over time. Enrolled & Not Enrolled (E&NE): Compare: a group of individuals enrolled in a program with a group that chooses not to enroll. Problem: selection bias. We don’t know why they are not enrolled. Both counterfactuals may lead to biased estimates of the impact.
    10. 10. Quantitative Experimental • RCT
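    11. 11. RCT [Figure: 1. Population (eligible and ineligible units) 2. Evaluation sample drawn from the eligible population (external validity) 3. Randomize treatment: the sample is split at random into treatment and comparison groups (internal validity)]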
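The three steps in the RCT diagram can be sketched as follows. This is a minimal illustration, not the procedure used in Africa RISING: the function name, the farmer identifiers, the sample size and the equal split are all assumptions made for the example.

```python
import random


def draw_sample_and_randomize(eligible_ids, sample_size, seed=0):
    """Draw an evaluation sample, then randomly split it into two arms."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    # Step 2: a random evaluation sample from the eligible population
    # (this is what supports external validity).
    sample = rng.sample(eligible_ids, sample_size)
    # Step 3: random assignment within the sample (internal validity).
    rng.shuffle(sample)
    half = sample_size // 2
    return sample[:half], sample[half:]


# Step 1: a hypothetical eligible population of 1000 farmers.
eligible = [f"farmer_{i}" for i in range(1000)]
treatment, comparison = draw_sample_and_randomize(eligible, 200)
print(len(treatment), len(comparison))
```

Because assignment is random, the two groups should be similar on average in both observable and unobservable characteristics, so the comparison group can stand in for the counterfactual.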
    12. 12. Quantitative Experimental • RCT • Choice experiments, auctions, games
    13. 13. Choice experiments, auctions, games • An experiment is a set of observations generated in a controlled environment to answer a particular question or solve a particular problem. • Subjects make decisions that are not part of their day-to-day decision making (typically in a game environment), they know they are part of an experiment, or both. • Purposes: 1. Test theories 2. Measure what are considered “unobservables” (e.g. preferences, beliefs) 3. Test sensitivity of experimental results to different forms of heterogeneity
    14. 14. Choice experiments, auctions, games • Examples: -behavioral game theory -ultimatum games -dictator games -trust games -public good games -coordination games -market experiments (auctions) -risk- and time-preference experiments
    15. 15. Quantitative Quasi-experimental designs • Double-Difference (Diff-in-Diff)
    16. 16. [Figure: probability of adoption over time, T=0 (before) and T=1 (after). Participants: B=0.60 before, A=0.74 after. Non-participants: D=0.78 before, C=0.81 after.] Impact = (A-B)-(C-D) = (A-C)-(B-D) = 0.11
    17. 17. [Figure: probability of adoption over time, T=0 (before) and T=1 (after). Enrolled: B=0.60 before, A=0.74 after. Not enrolled: D=0.78 before, C=0.81 after.] Impact = (A-B)-(C-D) = (A-C)-(B-D). Impact<0.11
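The double-difference arithmetic on these two slides can be checked with a short Python sketch. A, B, C and D are the slides' values; the function name and the extra trend of 0.05 used to illustrate a violation of parallel trends are hypothetical.

```python
def diff_in_diff(treat_after, treat_before, comp_after, comp_before):
    """Change in the treated group minus change in the comparison group."""
    return (treat_after - treat_before) - (comp_after - comp_before)


A, B = 0.74, 0.60  # enrolled: after, before
C, D = 0.81, 0.78  # not enrolled: after, before

impact = diff_in_diff(A, B, C, D)   # (A-B)-(C-D)
impact_alt = (A - C) - (B - D)      # the equivalent form on the slide

# Hypothetical: suppose the enrolled group would have grown 0.05 faster
# than the not-enrolled group even without the program. Parallel trends
# then fail, and the double difference overstates the true impact.
extra_trend = 0.05
impact_if_trends_differ = impact - extra_trend

print(round(impact, 2), round(impact_alt, 2), round(impact_if_trends_differ, 2))
```

The two forms of the formula give the same 0.11, and under the assumed extra trend the true impact falls below 0.11, which is one way to read the "Impact<0.11" label on the second figure.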
    18. 18. Example from Malawi: Total land used (acres), Y, from MARBES. Baseline (T=0): treatment group (randomized to treatment) 3.04; counterfactual (randomized to comparison) 2.13; impact (Y | P=1) - (Y | P=0) = 0.91. Follow-up (T=1): treatment group ??; counterfactual ??; impact ??.
    19. 19. Quantitative Non-experimental • Double-Difference (Diff-in-Diff) • Matching
    20. 20. Propensity-Score Matching (PSM) Comparison group: non-participants with the same observable characteristics as participants. • In practice, it is very hard. There may be many important characteristics! Match on the basis of the “propensity score”: • Compute everyone’s probability of participating, based on their observable characteristics. • Choose matches that have the same probability of participation as the treated units.
    21. 21. [Figure: densities of the propensity score (0 to 1) for non-participants and participants; the range where the two distributions overlap is the common support.]
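The matching step can be sketched in plain Python, taking the propensity scores as given. In practice the scores would first be estimated, for example with a logit or probit of participation on observable characteristics. The unit names, the scores, the caliper of 0.05 and the choice of nearest-neighbor matching with replacement are all assumptions made for illustration.

```python
def common_support(treated_scores, comparison_scores):
    """Range of propensity scores covered by both groups."""
    low = max(min(treated_scores), min(comparison_scores))
    high = min(max(treated_scores), max(comparison_scores))
    return low, high


def match_nearest(treated, comparison, caliper=0.05):
    """Nearest-neighbor matching (with replacement) on the propensity score.

    treated, comparison: dicts mapping unit id -> propensity score.
    Treated units outside the common support, or with no comparison unit
    within the caliper, are left unmatched.
    """
    low, high = common_support(list(treated.values()), list(comparison.values()))
    matches = {}
    for unit, score in treated.items():
        if not (low <= score <= high):
            continue  # off the common support
        best = min(comparison, key=lambda c: abs(comparison[c] - score))
        if abs(comparison[best] - score) <= caliper:
            matches[unit] = best
    return matches


# Hypothetical propensity scores.
treated = {"t1": 0.58, "t2": 0.45, "t3": 0.95}
comparison = {"c1": 0.60, "c2": 0.44, "c3": 0.10}
matches = match_nearest(treated, comparison)
print(matches)
```

Here the treated unit with a score of 0.95 has no comparable non-participant and is dropped, which is what restricting the analysis to the common support means.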
    22. 22. Quantitative Non-experimental • Double-Difference (Diff-in-Diff) • Matching • RD
    23. 23. RD: Effect of fertilizer program on adoption. Goal: improve fertilizer adoption among small farmers. Method: o Farms with a land score (ha) ≤2 are small o Farms with a land score (ha) >2 are not small. Intervention: small farmers receive subsidies to purchase fertilizer.
    24. 24. Regression Discontinuity: Design at baseline Not eligible Eligible
    25. 25. Regression Discontinuity: post intervention IMPACT
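A simplified version of the regression discontinuity comparison is sketched below, using the 2 ha cut-off from the example. The data, the bandwidth of 0.5 ha and the function name are hypothetical. Comparing local means is a simplification: applied work usually fits a regression on each side of the cut-off rather than taking raw averages.

```python
def rd_estimate(farms, cutoff=2.0, bandwidth=0.5):
    """Difference in mean outcomes just below vs. just above the cut-off.

    farms: list of (land_ha, outcome) pairs. Farms with land_ha <= cutoff
    are eligible for the subsidy.
    """
    below = [y for land, y in farms if cutoff - bandwidth <= land <= cutoff]
    above = [y for land, y in farms if cutoff < land <= cutoff + bandwidth]
    if not below or not above:
        raise ValueError("no observations on one side of the cut-off")
    return sum(below) / len(below) - sum(above) / len(above)


# Hypothetical post-intervention data: (land in ha, adoption rate).
farms = [
    (0.5, 0.90), (1.6, 0.70), (1.8, 0.72), (1.9, 0.74),  # eligible
    (2.1, 0.60), (2.3, 0.62), (2.4, 0.64), (4.0, 0.30),  # not eligible
]
estimate = rd_estimate(farms)
print(round(estimate, 2))
```

Only farms within the bandwidth enter the comparison, so the 0.5 ha and 4.0 ha farms are ignored. The estimate is therefore local: it describes the effect for farms near the cut-off, not for all farms.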
    26. 26. Quantitative Non-experimental • Double-Difference (Diff-in-Diff) • Matching • RD • IV and encouragement design
    27. 27. Babati (WP2): Timeline and design of an evaluation. Timeline: Feb 13; July 13; Aug-Oct 13; Nov 13–Mar. 14; Mar. 2016. Activities: initial planting at demonstration plots; follow-up field day: farmers rank preferred seeds; fertilizer and seed distribution; survey of 800 farmers in 11 villages; end-line survey: measure impacts. Groups: 200 receive improved seeds; 200 receive improved seeds and fertilizer; 200 receive seeds, fertilizer and contracts; 200 receive no additional intervention.
    28. 28. Outline • Qualitative
    29. 29. Qualitative • Semi-structured or open-ended in-depth interviews • Focus groups • Outcome Mapping • Participatory Impact Pathways Analysis (PIPA)
    30. 30. Outcome Mapping (OM) • Contribution of AR to changes in the actions, behaviors, relationships, activities of the ‘boundary partners’ (individuals, groups, and organizations with whom AR interacts directly and with whom it anticipates opportunities for influence) • It is based largely on systematized self-assessment • OM is based on three stages: 1. Intentional design (Why? Who? What? How?) 2. Outcome and performance monitoring 3. Evaluation design
    31. 31. Outcome Mapping (OM) By using OM, AR would not claim the achievement of development impacts; rather, the focus is on its contributions to outcomes. These outcomes, in turn, enhance the possibility of development impacts, but the relationship is not necessarily a direct one of cause and effect.
    32. 32. Qualitative • Participatory Impact Pathways Analysis (PIPA)
    33. 33. Participatory Impact Pathways Analysis (PIPA) • PIPA begins with a participatory workshop where stakeholders make explicit their assumptions about how their project will achieve an impact. Participants construct problem trees, carry out a visioning exercise and draw network maps to help them clarify their 'impact pathways' (IPs). • IPs are then articulated in two logic models: 1. The outcomes logic model -> the project's medium-term objectives in the form of hypotheses: which actors need to change, what the changes are, and which strategies are needed to attain them. 2. The impact logic model -> how, by helping to achieve the expected outcomes, the project will impact on people's livelihoods. Participants derive outcome targets and milestones, which are regularly revisited and revised as part of M&E.
    34. 34. Outline • Mixed methods
    35. 35. Mixed methods • Combination of quantitative and qualitative research methods to evaluate programs
    36. 36. Conclusions • We cannot do everything in every megasite… • Quantitative surveys are being conducted/planned in every country • IFPRI has a comparative advantage in quantitative approaches; we shall split the tasks with the research teams on qualitative methods -> mixed methods • Is IFPRI M&E on the right track? What shall we be focusing on more? What shall we not be doing?
    37. 37. Africa Research in Sustainable Intensification for the Next Generation africa-rising.net