Impact Evaluation


Published in: Education

  1. Impact Evaluation: An Overview. Wali Memon
  2. What is Impact Evaluation? Impact evaluation (IE) assesses how a program affects the well-being or welfare of individuals, households, communities, or businesses. At the individual level, well-being can be captured by income and consumption, by health outcomes, or ideally by both. At the community level, poverty levels or growth rates may be appropriate, depending on the question.
  3. Outline: (1) advantages of impact evaluation; (2) challenges for IE: the need for comparison groups; (3) methods for constructing comparison groups.
  4. IE Versus Other M&E Tools: The key distinction between impact evaluation and other monitoring and evaluation (M&E) tools is the focus on separating the impact of the program from all other confounding effects. IE seeks to provide evidence of the causal link between an intervention and outcomes.
  5. Monitoring and IE (results chain, from top to bottom). IMPACT: effect on living standards and welfare (e.g. infant and child mortality, improved household income). OUTCOMES: access, usage, and satisfaction of users (e.g. number of children vaccinated, percentage within 5 km of a health center). OUTPUTS: goods and services generated (e.g. number of nurses, availability of medicine). INPUTS: financial and physical resources (e.g. spending on primary health care).
  6. Monitoring and IE. IMPACTS: program impacts are confounded by local, national, and global effects, hence the difficulty of showing causality. OUTCOMES: users meet service delivery. OUTPUTS and INPUTS: government/program production function.
  7. Logic Model: An Example. Consider a program that provides insecticide-treated nets (ITNs) to poor households. What are the inputs, outputs, outcomes, and impacts?
  8. Logic Model: An Example. Inputs: number of ITNs; number of health or NGO employees to help dissemination. Outputs: number of ITNs received by households (HHs). Outcomes: number of households using ITNs. Impact: reduction in illness from malaria; increase in income; improvements in children’s school attendance and performance.
  9. Advantages of IE: Determining which projects are successful requires a carefully designed impact evaluation strategy. This is useful for: understanding whether projects worked (justification for funding, scaling up); meta-analysis (learning from others); and weighing cost-benefit tradeoffs across projects. IE can also test different approaches within the same program, or different projects aimed at the same national indicator.
  10. Essential Methodology: The difficulty is determining what would have happened to the individuals or communities of interest in the absence of the project. The key component of an impact evaluation is constructing a suitable comparison group to proxy for the “counterfactual”. Problem: we can only observe people in one state of the world at one time.
  11. Before/After Comparisons: Why not collect data on individuals before and after the intervention (the reflexive comparison)? The difference in income, etc., would then be attributed to the project. Problem: many things change over time, including the project. The country is growing and ITN usage is increasing generally (from 2000 to 2003 in NetMark data), so how do we know whether an increase in ITN use is due to the program or would have occurred in its absence? Many factors affect the malaria rate in a given year.
  12. Example: Providing Insecticide-Treated Nets (ITNs) to Poor Households. The intervention: provide free ITNs to households in Zamfara. The program targets poor areas. Women have to enroll at the local NGO office in order to receive bednets. The program starts in 2002 and ends in 2003; we have data on malaria rates from 2001 to 2004. Scenario 1: we observe that the Zamfara households we provided bednets to have an increase in malaria from 2002 to 2003.
  13. Basic Problem of Impact Evaluation: Scenario 1. Underestimated impact when using before/after comparisons: high rainfall year. [Figure: malaria rate (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; Zamfara households with bednets move from A to C.] Impact = C – A? An increase in the malaria rate!
  14. Basic Problem of Impact Evaluation: Scenario 1. Underestimated impact when using before/after comparisons: high rainfall year. [Figure: malaria rate (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; the “counterfactual” series (Zamfara households if no bednets provided) ends at B, and Zamfara households with bednets move from A to C.] Impact = C – B: a decline in the malaria rate! Impact ≠ C – A.
  15. Basic Problem of Impact Evaluation: Scenario 2. Overestimated impact: bad rainfall. [Figure: malaria rate (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; the “counterfactual” series (Zamfara households if no bednets provided) ends at B, and Zamfara households move from A to C.] TRUE impact = C – B; Impact ≠ C – A.
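
The gap between the before/after estimate (C – A) and the true impact (C – B) in Scenario 1 can be made concrete with a small sketch. The rates below are hypothetical numbers chosen only for illustration; they do not come from the slides, and in practice B is never observed directly.

```python
# Hypothetical malaria rates (cases per 100 households); illustrative only.
rate_before = 20.0          # A: treated households before the program
rate_after = 26.0           # C: treated households after the program
rate_counterfactual = 32.0  # B: the same households had no bednets been provided
                            #    (assumed here; never observed in real data)

def before_after_estimate(before, after):
    """Reflexive comparison: change over time for the treated group (C - A)."""
    return after - before

def true_impact(after, counterfactual):
    """Impact relative to the counterfactual (C - B)."""
    return after - counterfactual

naive = before_after_estimate(rate_before, rate_after)
impact = true_impact(rate_after, rate_counterfactual)
print(f"before/after estimate: {naive:+.1f}")
print(f"true impact:           {impact:+.1f}")
```

With these made-up numbers the reflexive comparison suggests malaria rose, while the comparison against the counterfactual shows a decline.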
  16. Comparison Groups: Instead of using before/after comparisons, we need to use comparison groups to proxy for the counterfactual. Two core problems in finding suitable groups: (1) programs are targeted, so recipients receive the intervention for a particular reason; (2) participation is voluntary, so individuals who participate differ from non-participants in observable and unobservable ways (selection bias). Hence, a comparison of participants with an arbitrary group of non-participants can lead to misleading or incorrect results.
  17. Comparison 1: Treatment and Region B. Scenario 1: the reflexive comparison fails because of higher rainfall; everyone experienced an increase in malaria rates. We compare the households in the program region to those in another region. We find that our “treatment” households in Zamfara have a larger increase in malaria rates than those in region B, Oyo. Did the program have a negative impact? Not necessarily! Program placement is important: region B has better sanitation and is therefore less affected by rainfall (unobservable).
  18. Basic Problem of Impact Evaluation: Program Placement. High rainfall. [Figure: malaria rate (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; points A, D, and E are marked for “Treatment” (Zamfara).] TRUE IMPACT: E – D.
  19. Basic Problem of Impact Evaluation: Program Placement. Underestimated impact when using region B as the comparison group: high rainfall. E – A > C – B: region B is less affected by rainfall. [Figure: malaria rate (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; region B (Oyo) moves from B to C, “Treatment” (Zamfara) moves from A to E, and point D is also marked.] TRUE IMPACT: E – D.
  20. Comparison 2: Treatment vs. Neighbors. We compare “treatment” households with their neighbors. We think the sanitation and rainfall patterns are about the same. Scenario 2: say we observe that treatment households’ malaria rates decrease more than comparison households’. Did the program work? Not necessarily. There may be two types of households, A and B, where type A households know how malaria is transmitted and also burn mosquito coils. Type A households were more likely to register with the program. However, their other characteristics mean they would have had lower malaria rates even in the absence of the ITNs (individual unobservables).
  21. Basic Problem of Impact Evaluation: Selection Bias. Comparing project beneficiaries (Type A) to neighbors (Type B). [Figure: malaria rates (y-axis) by year, Y1 to Y4 (x-axis), with the treatment period marked; series for Type B HHs and for Type A HHs with project; the gap between them is labeled “Observed difference”.]
  22. Basic Problem of Impact Evaluation: Selection Bias. Participants are often different from non-participants. [Figure: malaria rates (y-axis) by year, Y1 to Y4 (x-axis), with the treatment period marked; series for Type B HHs, Type A households, and Type A HHs with project; the observed difference is split into selection bias and true impact.]
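
The split of the observed difference into true impact and selection bias can be written out in potential-outcomes notation. This notation is an assumption added here for illustration and is not used in the slides: $Y_1$ and $Y_0$ denote a household's outcome with and without the program, and $D = 1$ marks participants.

```latex
\[
\underbrace{E[Y_1 \mid D = 1] - E[Y_0 \mid D = 0]}_{\text{observed difference}}
=
\underbrace{E[Y_1 \mid D = 1] - E[Y_0 \mid D = 1]}_{\text{true impact on participants}}
+
\underbrace{E[Y_0 \mid D = 1] - E[Y_0 \mid D = 0]}_{\text{selection bias}}
\]
```

The identity follows from adding and subtracting $E[Y_0 \mid D = 1]$, the outcome participants would have had without the program. The selection-bias term is zero only if participants and non-participants would have had the same average outcome in the absence of the program.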
  23. Basic Problem of Impact Evaluation: Spillover Effects. Another difficulty in finding a true counterfactual has to do with spillover or contagion effects. Example: ITNs not only reduce malaria rates for those sleeping under nets, but may also lower overall rates because ITNs kill mosquitoes. Problem: children who did not receive “treatment” may also have lower malaria rates, and therefore higher school attendance rates. This generally leads to an underestimate of the treatment effect.
  24. Basic Problem of Impact Evaluation: Spillover Effects. [Figure: school attendance (y-axis) by year, 2001 to 2004 (x-axis), with the treatment period marked; “treatment” children end at B, the “control” group of children in the neighborhood school ends at C, and point A lies below C.] Impact = B – A; Impact ≠ B – C. C > A due to spillover from treatment children.
  25. Counterfactual: Methodology. We need a comparison group that is as similar as possible, in observable and unobservable dimensions, to those receiving the program, and that will not receive spillover benefits. A number of techniques exist: randomization as the gold standard, and various matching techniques.
  26. How to construct a comparison group (building the counterfactual): 1. Randomization. 2. Difference-in-difference. 3. Regression discontinuity. 4. Matching (pipeline comparisons, propensity score).
  27. Method 1: Randomization. Individuals, communities, or firms are randomly assigned to participation. Counterfactual: the randomized-out group. Advantages: often referred to as the “gold standard”, because by design selection bias is zero on average and the mean impact is revealed; perceived as a fair process of allocation when resources are limited.
  28. Randomization: Disadvantages. Ethical issues and political constraints. Internal validity (exogeneity): people might not comply with the assignment (selective non-compliance). External validity (generalizability): controlled experiments are usually run as small-scale pilots, so it is difficult to extrapolate the results to a larger population. Does not always solve the problem of spillovers.
  29. When to Randomize: If funds are insufficient to treat all eligible recipients, randomization can be the fairest and most transparent approach. The program is administered at the individual, household, or community level; randomizing at a higher level of implementation is difficult (example: trunk roads). The program will be scaled up, so learning what works is very valuable.
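
The claim that randomization removes selection bias on average can be checked with a small simulation. This is a minimal sketch under stated assumptions: the household types, enrollment probabilities, malaria rates, and the true effect of -5 are all hypothetical values invented for illustration.

```python
import random
import statistics

def simulate(n=20000, true_effect=-5.0, seed=0):
    """Compare a randomized estimate with a self-selected one on simulated households."""
    rng = random.Random(seed)
    randomized = {True: [], False: []}
    self_selected = {True: [], False: []}
    for _ in range(n):
        # Unobserved household type: "informed" households have lower malaria anyway.
        informed = rng.random() < 0.5
        baseline = 30.0 - (8.0 if informed else 0.0) + rng.gauss(0.0, 2.0)

        # Random assignment: a coin flip, independent of household type.
        treated = rng.random() < 0.5
        randomized[treated].append(baseline + (true_effect if treated else 0.0))

        # Voluntary enrollment: informed households sign up far more often.
        enrolled = rng.random() < (0.8 if informed else 0.2)
        self_selected[enrolled].append(baseline + (true_effect if enrolled else 0.0))

    def gap(groups):
        return statistics.mean(groups[True]) - statistics.mean(groups[False])

    return gap(randomized), gap(self_selected)

randomized_estimate, self_selected_estimate = simulate()
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {self_selected_estimate:.2f}")
```

Under these assumptions the randomized comparison lands close to the true effect, while the self-selected comparison overstates the reduction because enrolled households would have had lower malaria rates anyway.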
  30. Method 2: Difference-in-difference. Observations over time: compare observed changes in the outcomes for a sample of participants and non-participants. Identification assumption: the selection bias or unobservable characteristics are time-invariant (‘parallel trends’ in the absence of the program). Counterfactual: changes over time for the non-participants.
  31. Diff-in-Diff: Continued. Constraint: requires at least two cross-sections of data, pre-program and post-program, on participants and non-participants. We need to think about the evaluation ex ante, before the program. More credible if there are two pre-periods, so we can observe whether the trends are the same. Can in principle be combined with matching to adjust for pre-treatment differences that affect the growth rate.
  32. Implementing Differences in Differences: Different Strategies. Some arbitrary comparison group; matched diff-in-diff; randomized diff-in-diff. These are listed from most problematic to least problematic; keep this in mind as we look at the approach graphically.
  33. Essential Assumptions of Diff-in-Diff: The initial difference must be time-invariant. In the absence of the program, the change over time would be identical. [Figure: outcomes Y1 and Y0 over time from t=0 to t=1, with the counterfactual Y1* and the impact marked.]
  34. Difference-in-Difference in ITN Example: Instead of comparing Zamfara to Oyo, compare Zamfara to Niger if, although Zamfara and Niger have different malaria rates and different ITN usage, we expect them to change in parallel. Use NetMark data to compare 2000 to 2003 in Zamfara and Niger states. Use additional data (GHS, NLSS) to compare incomes and sanitation infrastructure levels, and their changes, prior to program implementation.
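
The difference-in-difference calculation itself is simple arithmetic: the change for participants minus the change for the comparison group. The sketch below uses hypothetical malaria rates (not NetMark, GHS, or NLSS figures), and the function name `diff_in_diff` is chosen here for illustration.

```python
def diff_in_diff(treat_pre, treat_post, comp_pre, comp_post):
    """Change in the treated group minus change in the comparison group.

    Valid as an impact estimate only under the parallel-trends assumption:
    without the program, both groups would have changed by the same amount.
    """
    change_treated = treat_post - treat_pre
    change_comparison = comp_post - comp_pre
    return change_treated - change_comparison

# Hypothetical malaria rates per 100 households; illustrative only.
estimate = diff_in_diff(
    treat_pre=30.0, treat_post=33.0,   # program state: rate rises by 3
    comp_pre=22.0, comp_post=31.0,     # comparison state: rate rises by 9
)
print(estimate)
```

In this made-up case both states start at different levels and both see malaria rise in a high-rainfall year, yet the program state rises by less, so the estimate is a reduction of 6 points.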
  35. Method 3: Regression discontinuity design. Exploit the rule generating assignment into a program that is given to individuals only above a given threshold. Assume that there is a discontinuity in participation but not in counterfactual outcomes. Counterfactual: individuals just below the cut-off who did not participate. Advantages: “identification” is built into the program design; delivers marginal gains from the program around the eligibility cut-off point, which is important for program expansion. Disadvantages: the threshold has to be applied in practice, and individuals should not be able to manipulate the score used in the program to become eligible.
  36. RDD in ITN Example: The program is available to poor households. Eligibility criteria: must be below the national poverty line or have < 1 ha of land. Treatment group: those below the cut-off, i.e. those with income below the poverty line who therefore qualified for ITNs. Comparison group: those right above the cut-off, i.e. those with income just above the poverty line who are therefore not eligible.
  37. RDD in ITN Example. Problems: How well enforced was the rule? Can the rule be manipulated? Local effect: may not be generalizable if the program expands to households well above the poverty line. This is particularly relevant since NetMark data indicate low ITN usage across all socio-economic status groups.
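
A rough sketch of the RDD comparison in the ITN example: households just below the poverty line (eligible) are compared with those just above it. Everything here is an assumption made for illustration: the simulated incomes, the cutoff of 100, the bandwidth of 10, the true effect of -4, and perfect enforcement of the rule. A plain difference in means within the bandwidth is used for brevity; applied work typically fits a regression on each side of the cutoff instead.

```python
import random
import statistics

def rdd_estimate(records, cutoff, bandwidth):
    """Mean outcome just below the cutoff (eligible) minus mean just above it.

    records is a list of (running_variable, outcome) pairs.
    """
    below = [y for x, y in records if cutoff - bandwidth <= x < cutoff]
    above = [y for x, y in records if cutoff <= x <= cutoff + bandwidth]
    return statistics.mean(below) - statistics.mean(above)

def simulate_households(n=20000, cutoff=100.0, effect=-4.0, seed=1):
    """Hypothetical data: malaria rate falls slowly with income; ITNs shift it by `effect`."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        income = rng.uniform(0.0, 200.0)
        rate = 40.0 - 0.02 * income + rng.gauss(0.0, 1.0)
        if income < cutoff:  # rule assumed perfectly enforced: eligible households get ITNs
            rate += effect
        records.append((income, rate))
    return records

households = simulate_households()
estimate = rdd_estimate(households, cutoff=100.0, bandwidth=10.0)
print(f"estimated impact at the cutoff: {estimate:.2f}")
```

The estimate is local: it describes households near the poverty line and says little about households far from it, which is the generalizability problem noted above.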
  38. Method 4: Matching. Match participants with non-participants from a larger survey. Counterfactual: the matched comparison group. Each program participant is paired with one or more non-participants who are similar based on observable characteristics. Assumes that, conditional on the set of observables, there is no selection bias based on unobserved heterogeneity. When the set of variables to match on is large, we often match on a summary statistic: the probability of participation as a function of the observables (the propensity score).
  39. Method 4: Matching (continued). Advantages: does not require randomization or baseline (pre-intervention) data. Disadvantages: strong identification assumptions, which in many cases may make interpretation of results very difficult; requires very good quality data, since we need to control for all factors that influence program placement; requires a sufficiently large sample to generate the comparison group.
  40. Matching in Practice: Using statistical techniques, we match a group of non-participants with participants using variables like gender, household size, education, experience, land size, rainfall (to control for drought), and irrigation (as many observable characteristics not affected by the program intervention as possible). One common method: propensity score matching.
  41. Matching in Practice: 2 Approaches. Approach 1: after program implementation, we match (within region) those who received ITNs with those who did not. Problem: the likelihood of usage by different households is unobservable, so it is not included in the propensity score; this creates selection bias. Approach 2: the program is allocated based on land size. After implementation, we match those eligible in region A with those in region B. Problems: the same issues of individual unobservables arise, but they are lessened because we compare eligible with potentially eligible households; there is now a problem of unobservable factors across regions.
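
The pairing step of propensity score matching can be sketched as nearest-neighbor matching with replacement. This sketch assumes the propensity scores have already been estimated (for example by a logistic regression on the observables, not shown here); the scores and outcomes are hypothetical, and `match_nearest` is a name chosen for illustration.

```python
def match_nearest(participants, non_participants):
    """Average outcome gap between each participant and its closest non-participant.

    Each record is a (propensity_score, outcome) pair. Matching is done with
    replacement, so one non-participant may serve as the match for several participants.
    """
    gaps = []
    for score, outcome in participants:
        nearest = min(non_participants, key=lambda rec: abs(rec[0] - score))
        gaps.append(outcome - nearest[1])
    return sum(gaps) / len(gaps)

# Hypothetical (score, malaria rate) pairs; illustrative only.
participants = [(0.80, 20.0), (0.60, 24.0), (0.30, 29.0)]
non_participants = [(0.75, 27.0), (0.55, 30.0), (0.35, 33.0), (0.10, 38.0)]

att = match_nearest(participants, non_participants)
print(f"matched estimate of impact on participants: {att:.2f}")
```

The estimate is only as good as the score: if an unobservable such as a household's likelihood of actually using the net drives both enrollment and outcomes, matched pairs still differ in ways the score cannot capture.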
  42. An extension of matching: pipeline comparisons. Idea: compare those just about to get an intervention with those getting it now. Assumption: the stopping point of the intervention does not separate two fundamentally different populations. Example: extending irrigation networks. In the ITN example: if only some communities within Zamfara receive ITNs in round 1, compare them to nearby communities that will receive ITNs in round 2. Difficulty with infrastructure: spillover effects may be strong, or there may be anticipatory effects.