Session 2: Evaluation Design

1. Chris Nicoletti - Activity #267: Analysing the socio-economic impact of the Water Hibah on beneficiary households and communities (Stage 1). Impact Evaluation Training Curriculum, Day 2, April 17, 2013.
2. Outline: topics being covered
   Tuesday - Session 1: INTRODUCTION AND OVERVIEW
   1) Introduction
   2) Why is evaluation valuable?
   3) What makes a good evaluation?
   4) How to implement an evaluation?
   Wednesday - Session 2: EVALUATION DESIGN
   5) Causal inference
   6) Choosing your IE method/design
   7) Impact evaluation toolbox
   Thursday - Session 3: SAMPLE DESIGN AND DATA COLLECTION
   9) Sample designs
   10) Types of error and biases
   11) Data collection plans
   12) Data collection management
   Friday - Session 4: INDICATORS & QUESTIONNAIRE DESIGN
   1) Results chain / logic models
   2) SMART indicators
   3) Questionnaire design
3. MEASURING IMPACT: Impact Evaluation Methods for Policy Makers
   This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
4. Causal Inference: Counterfactuals; false counterfactuals: Before & After (Pre & Post), Enrolled & Not Enrolled (Apples & Oranges).
5. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
6. Causal Inference: Counterfactuals; false counterfactuals: Before & After (Pre & Post), Enrolled & Not Enrolled (Apples & Oranges).
7. Our Objective
   Estimate the causal effect (impact) of intervention (P) on outcome (Y).
   (P) = program or treatment; (Y) = indicator, measure of success.
   Example: What is the effect of a household freshwater connection (P) on household water expenditures (Y)?
8. Causal Inference
   What is the impact of (P) on (Y)?  α = (Y | P=1) - (Y | P=0)
   Can we all go home?
9. Problem of Missing Data
   For a program beneficiary: α = (Y | P=1) - (Y | P=0).
   We observe (Y | P=1): household consumption (Y) with a cash transfer program (P=1),
   but we do not observe (Y | P=0): household consumption (Y) without a cash transfer program (P=0).
10. Solution
   Estimate what would have happened to Y in the absence of P. We call this the counterfactual.
11. Estimating the impact of P on Y
   IMPACT: α = (Y | P=1) - (Y | P=0) = outcome with treatment - counterfactual.
   OBSERVE (Y | P=1): the outcome with treatment.
   ESTIMATE (Y | P=0): the counterfactual, using a comparison or control group.
   o Intention to Treat (ITT): those to whom we wanted to give treatment.
   o Treatment on the Treated (TOT): those actually receiving treatment.
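For reference, these quantities can be written out compactly. This is a summary in my own (potential-outcomes) notation, not taken from the slides:

```latex
% Impact for a given unit: outcome with the program minus outcome without it
\alpha = (Y \mid P=1) - (Y \mid P=0)

% Intention to Treat (ITT): average effect of being offered the program
\mathrm{ITT} = E[\,Y \mid \text{offered}\,] - E[\,Y \mid \text{not offered}\,]

% Treatment on the Treated (TOT): average effect on those who actually receive it
\mathrm{TOT} = E[\,Y(1) - Y(0) \mid P = 1\,]
```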
12. Example: What is the impact of giving Jim additional pocket money (P) on Jim's consumption of candies (Y)?
13. The Perfect Clone
   Jim: 6 candies. Jim's clone: 4 candies. IMPACT = 6 - 4 = 2 candies.
14. In reality, use statistics
   Treatment group: average Y = 6 candies. Comparison group: average Y = 4 candies. IMPACT = 6 - 4 = 2 candies.
15. Finding good comparison groups
   We want to find clones for the Jims in our programs. The treatment and comparison groups should:
   • have identical characteristics,
   • except for benefiting from the intervention.
   In practice, use program eligibility and assignment rules to construct valid counterfactuals.
16. Case Study: Progresa
   National anti-poverty program in Mexico:
   o Started in 1997.
   o 5 million beneficiaries by 2004.
   o Eligibility based on a poverty index.
   Cash transfers, conditional on school and health care attendance.
17. Case Study: Progresa
   Rigorous impact evaluation with rich data:
   o 506 communities, 24,000 households.
   o Baseline 1997, follow-up 1998.
   Many outcomes of interest; here: consumption per capita.
   What is the effect of Progresa (P) on consumption per capita (Y)?
   If the impact is an increase of $20 or more, then scale up nationally.
18. Eligibility and Enrollment
   (Diagram: Ineligibles (non-poor) vs. Eligibles (poor); among the eligibles, Enrolled vs. Not Enrolled.)
19. Causal Inference: Counterfactuals; false counterfactuals: Before & After (Pre & Post), Enrolled & Not Enrolled (Apples & Oranges).
20. False Counterfactual #1: Before & After
   (Diagram: Y over time, from T=0 (baseline) to T=1 (endline). Observed change A - B = 4; against the true counterfactual C, the impact is A - C = 2.)
21. Case 1: Before & After
   What is the effect of Progresa (P) on consumption (Y)?
   (1) Observe only beneficiaries (P=1).
   (2) Two observations in time: consumption at T=0 (1997: 233) and at T=1 (1998: 268).
   IMPACT = A - B = α = $35.
22. Case 1: Before & After
   Consumption (Y): outcome with treatment (after) 268.7; counterfactual (before) 233.4; impact (Y | P=1) - (Y | P=0) = 35.3**.
   Estimated impact on consumption (Y): linear regression 35.27**; multivariate linear regression 34.28**.
   Note: ** = statistically significant at the 1% level.
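A minimal sketch of how a before-and-after estimate like this can be computed as a regression on a "post" dummy. The data here are simulated and the variable names are my own; only the headline magnitudes follow the slide:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                                      # hypothetical number of beneficiary households
y_1997 = rng.normal(233, 50, n)               # consumption at baseline
y_1998 = y_1997 + 35 + rng.normal(0, 20, n)   # consumption at follow-up

# Stack the two periods and regress Y on a constant and a "post" dummy.
y = np.concatenate([y_1997, y_1998])
post = np.concatenate([np.zeros(n), np.ones(n)])
X = np.column_stack([np.ones(2 * n), post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The coefficient on `post` equals mean(1998) - mean(1997): it attributes the
# whole change over time to the program, which is exactly the Case 1 weakness.
print(f"Before-and-after estimate: {beta[1]:.1f}")
```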
23. Case 1: What's the problem?
   Before & After attributes the whole change over time to the program.
   o Economic boom: real impact = A - C; A - B is an overestimate.
   o Economic recession: real impact = A - D; A - B is an underestimate.
24. Causal Inference: Counterfactuals; false counterfactuals: Before & After (Pre & Post), Enrolled & Not Enrolled (Apples & Oranges).
25. False Counterfactual #2: Enrolled & Not Enrolled
   If we have post-treatment data on:
   o Enrolled: the treatment group.
   o Not enrolled: the "comparison" group (counterfactual): those ineligible to participate, or those who choose NOT to participate.
   Selection bias:
   o The reason for not enrolling may be correlated with the outcome (Y). We can control for observables, but not for unobservables!
   o The estimated impact is confounded with other things.
26. Case 2: Enrolled & Not Enrolled
   Among the eligibles (poor): Enrolled Y = 268; Not Enrolled Y = 290. (Ineligibles: non-poor.)
   In what ways might the enrolled and the not enrolled differ, other than their enrollment in the program?
27. Case 2: Enrolled & Not Enrolled
   Consumption (Y): outcome with treatment (enrolled) 268; counterfactual (not enrolled) 290; impact (Y | P=1) - (Y | P=0) = -22**.
   Estimated impact on consumption (Y): linear regression -22**; multivariate linear regression -4.15.
   Note: ** = statistically significant at the 1% level.
28. Progresa Policy Recommendation?
   Will you recommend scaling up Progresa?
   o B&A: Are there other time-varying factors that also influence consumption?
   o E&NE: Are the reasons for enrolling correlated with consumption? Selection bias.
   Impact on consumption (Y):
   Case 1 (Before & After): linear regression 35.27**; multivariate linear regression 34.28**.
   Case 2 (Enrolled & Not Enrolled): linear regression -22**; multivariate linear regression -4.15.
   Note: ** = statistically significant at the 1% level.
29. Keep in Mind
   B&A - Compare the same individuals before and after they receive P. Problem: other things may have happened over time.
   E&NE - Compare a group of individuals enrolled in a program with a group that chooses not to enroll. Problem: selection bias; we don't know why they are not enrolled.
   Both counterfactuals may lead to biased estimates of the impact.
30. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
31. Choosing your IE method(s)
   Key information you will need for identifying the right method for your program:
   o Prospective or retrospective evaluation?
   o Eligibility rules and criteria? (Poverty targeting? Geographic targeting?)
   o Roll-out plan (pipeline)?
   o Is the number of eligible units larger than the available resources at a given point in time? (Budget and capacity constraints? Excess demand for the program? Etc.)
32. Choosing your IE method(s)
   Choose the best possible design given the operational context:
   o Best design: the best comparison group you can find + the least operational risk.
   o Have we controlled for everything? Internal validity; a good comparison group.
   o Is the result valid for everyone? External validity; local versus global treatment effect; evaluation results apply to the population we're interested in.
33. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
34. Randomized Treatments & Comparison
   o Randomize! A lottery for who is offered benefits.
   o A fair, transparent and ethical way to assign benefits to equally deserving populations.
   Oversubscription (eligibles > number of benefits):
   o Give each eligible unit the same chance of receiving treatment.
   o Compare those offered treatment with those not offered (comparisons).
   Randomized phase-in:
   o Give each eligible unit the same chance of receiving treatment first, second, third…
   o Compare those offered treatment first with those offered later (comparisons).
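A minimal sketch of an oversubscription lottery. The IDs and counts are illustrative (loosely following the Progresa numbers used later), not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(42)

eligible_ids = np.arange(1, 507)   # e.g. 506 eligible communities (hypothetical IDs)
n_benefits = 320                   # the program can only serve 320 units in the first phase

# Lottery: every eligible unit gets the same chance of being offered treatment now.
shuffled = rng.permutation(eligible_ids)
treatment = np.sort(shuffled[:n_benefits])
comparison = np.sort(shuffled[n_benefits:])   # offered later (phase-in) or not at all

print(len(treatment), "units offered treatment;", len(comparison), "comparison units")
```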
35. Randomized treatments and comparisons
   (Diagram: 1. Population (external validity) → 2. Evaluation sample → 3. Randomize treatment (internal validity) into treatment and comparison groups; eligible and ineligible units marked.)
36. Unit of Randomization
   Choose according to the type of program:
   o Individual / household
   o School / health clinic / catchment area
   o Block / village / community
   o Ward / district / region
   Keep in mind:
   o You need a "sufficiently large" number of units to detect the minimum desired impact: power.
   o Spillovers / contamination.
   o Operational and survey costs.
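As a rough guide to the power point above, the textbook approximation for the sample size needed per arm to detect a minimum impact δ when comparing two group means is (this formula is standard, not from the slides):

```latex
n_{\text{per arm}} \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}\,\sigma^{2}}{\delta^{2}}
```

When randomization is by cluster (villages, schools), this number is inflated by the design effect 1 + (m - 1)ρ, where m is the cluster size and ρ the intra-cluster correlation, which is one reason the choice of unit of randomization matters.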
37. Case 3: Randomized Assignment
   Progresa CCT program. Unit of randomization: community. Randomized phase-in.
   o 320 treatment communities (14,446 households): first transfers in April 1998.
   o 186 comparison communities (9,630 households): first transfers in November 1999.
   506 communities in the evaluation sample.
38. Case 3: Randomized Assignment
   (Diagram: 320 treatment communities and 186 comparison communities followed from T=0 to T=1; the comparison period runs until the comparison communities receive their first transfers.)
39. Case 3: Randomized Assignment
   How do we know we have good clones?
   In the absence of Progresa, the treatment and comparison groups should be identical. Let's compare their characteristics at baseline (T=0).
40. Case 3: Balance at Baseline
                                        Treatment   Comparison   T-stat
   Consumption ($ monthly per capita)      233.4       233.47     -0.39
   Head's age (years)                       41.6         42.3     -1.2
   Spouse's age (years)                     36.8         36.8     -0.38
   Head's education (years)                  2.9          2.8      2.16**
   Spouse's education (years)                2.7          2.6      0.006
   Note: ** = statistically significant at the 1% level.
41. Case 3: Balance at Baseline (continued)
                                        Treatment   Comparison   T-stat
   Head is female = 1                       0.07         0.07     -0.66
   Indigenous = 1                           0.42         0.42     -0.21
   Number of household members              5.7          5.7       1.21
   Bathroom = 1                             0.57         0.56      1.04
   Hectares of land                         1.67         1.71     -1.35
   Distance to hospital (km)                 109          106      1.02
   Note: ** = statistically significant at the 1% level.
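A minimal sketch of a baseline balance check with a t-test, using simulated data. In a real evaluation you would loop over the actual baseline variables in the tables above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated baseline consumption for the randomized treatment and comparison groups.
treat = rng.normal(233.4, 60, 14446)
comp = rng.normal(233.5, 60, 9630)

t_stat, p_value = stats.ttest_ind(treat, comp, equal_var=False)
print(f"Treatment mean {treat.mean():.1f}, comparison mean {comp.mean():.1f}, "
      f"t = {t_stat:.2f}, p = {p_value:.2f}")
# Under successful randomization most baseline t-stats should be small; a few
# "significant" differences (like head's education above) can still arise by chance.
```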
42. Case 3: Randomized Assignment
                                     Treatment group            Counterfactual              Impact
                                     (randomized to treatment)  (randomized to comparison)  (Y | P=1) - (Y | P=0)
   Baseline (T=0) consumption (Y)        233.47                     233.40                     0.07
   Follow-up (T=1) consumption (Y)       268.75                     239.5                     29.25**
   Estimated impact on consumption (Y): linear regression 29.25**; multivariate linear regression 29.75**.
   Note: ** = statistically significant at the 1% level.
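A minimal sketch of the randomized-assignment estimate itself: regress follow-up consumption on a treatment dummy. The data are simulated to mimic the magnitudes above; the "multivariate" row in the slide additionally controls for baseline covariates, and in the real Progresa design standard errors should also account for randomization at the community level:

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_c = 14446, 9630
y_treat = rng.normal(268.7, 70, n_t)   # follow-up consumption, treatment communities
y_comp = rng.normal(239.5, 70, n_c)    # follow-up consumption, comparison communities

y = np.concatenate([y_treat, y_comp])
d = np.concatenate([np.ones(n_t), np.zeros(n_c)])   # treatment dummy
X = np.column_stack([np.ones(len(y)), d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"Estimated impact: {beta[1]:.1f}")   # the treatment-comparison difference, ~29
```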
43. Progresa Policy Recommendation?
   Impact of Progresa on consumption (Y):
   Case 1 (Before & After): multivariate linear regression 34.28**.
   Case 2 (Enrolled & Not Enrolled): linear regression -22**; multivariate linear regression -4.15.
   Case 3 (Randomized Assignment): multivariate linear regression 29.75**.
   Note: ** = statistically significant at the 1% level.
44. Keep in Mind: Randomized Assignment
   Randomized assignment, with large enough samples, produces two statistically equivalent groups: we have identified the perfect clone. Randomized beneficiary vs. randomized comparison.
   Feasible for prospective evaluations with oversubscription / excess demand. Most pilots and new programs fall into this category.
45. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
46. What if we can't choose?
   It's not always possible to choose a control group. What about:
   o National programs where everyone is eligible?
   o Programs where participation is voluntary?
   o Programs where you can't exclude anyone?
   Can we compare enrolled & not enrolled? Selection bias!
47. Randomly offering or promoting the program
   If you can exclude some units, but can't force anyone (randomized offering):
   o Offer the program to a random sub-sample.
   o Many will accept; some will not accept.
   If you can't exclude anyone, and can't force anyone (randomized promotion):
   o Make the program available to everyone.
   o But provide additional promotion, encouragement or incentives to a random sub-sample: additional information, encouragement, incentives (a small gift or prize), transport (bus fare).
48. Randomly offering or promoting the program
   Necessary conditions:
   1. The offered/promoted and not-offered/not-promoted groups are comparable: whether or not you offer or promote is not correlated with population characteristics; this is guaranteed by randomization.
   2. The offered/promoted group has higher enrollment in the program.
   3. The offering/promotion of the program does not affect outcomes directly.
49. Randomly offering or promoting the program
   Three groups of units/individuals: those who never enroll; those who only enroll if offered/promoted; those who always enroll. (Diagram: enrollment with and without the offering/promotion.)
50. Randomly offering or promoting the program
   (Diagram: among eligible units, randomize the promotion/offering of the program. The "always" group enrolls either way, the "never" group does not enroll in either case, and the "only if offered/promoted" group enrolls only under the offering/promotion.)
51. Randomly offering or promoting the program
                                    Offered/promoted group   Not offered/not promoted group   Impact
   % enrolled                                80%                         30%                  ΔEnrolled = 50%
   Average Y for the entire group            100                          80                  ΔY = 20
   Impact = ΔY / ΔEnrolled = 20 / 0.5 = 40
   (The difference comes entirely from those who only enroll if offered/promoted; the "never enroll" and "always enroll" groups are present in both columns.)
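The arithmetic behind this table is the standard Wald (instrumental-variables) estimator; written out in my notation, with the numbers from the slide:

```latex
\text{Impact on those who enroll because of the offer/promotion} \;=\;
\frac{E[Y \mid \text{promoted}] - E[Y \mid \text{not promoted}]}
     {E[\text{Enrolled} \mid \text{promoted}] - E[\text{Enrolled} \mid \text{not promoted}]}
\;=\; \frac{100 - 80}{0.80 - 0.30} \;=\; \frac{20}{0.5} \;=\; 40
```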
52. Examples: Randomized Promotion
   o Maternal and Child Health Insurance in Argentina: intensive information campaigns.
   o Community-Based School Management in Nepal: an NGO helps with enrollment paperwork.
53. Community-Based School Management in Nepal
   Context:
   o A centralized school system.
   o 2003: decision to allow local administration of schools.
   The program:
   o Communities express interest to participate.
   o They receive a monetary incentive ($1,500).
   What is the impact of local school administration on school enrollment, teacher absenteeism, learning quality and financial management?
   Randomized promotion:
   o An NGO helps communities with the enrollment paperwork.
   o 40 communities with randomized promotion (15 participate).
   o 40 communities without randomized promotion (5 participate).
54. Maternal and Child Health Insurance in Argentina
   Context:
   o 2001 financial crisis.
   o Health insurance coverage diminishes.
   Pay-for-Performance (P4P) program:
   o Change in the payment system for providers.
   o 40% of payment conditional on meeting quality standards.
   What is the impact of the new provider payment system on the health of pregnant women and children?
   Randomized promotion:
   o Universal program throughout the country.
   o Randomized intensive information campaigns to inform women of the new payment system and increase the use of health services.
55. Case 4: Randomized Offering/Promotion
   Randomized offering/promotion is an "instrumental variable" (IV):
   o A variable correlated with treatment but with nothing else (i.e. the randomized promotion).
   o Use two-stage least squares (see the annex).
   Using this method, we estimate the effect of "treatment on the treated":
   o It is a "local" treatment effect, valid only for those who enroll because they were offered/promoted.
   o In randomized offering: treated = those offered the treatment who enrolled.
   o In randomized promotion: treated = those to whom the program was promoted and who enrolled.
56. Case 4: Progresa Randomized Offering
                                    Offered group   Not-offered group   Impact
   % enrolled                            92%               0%           ΔEnrolled = 0.92
   Average Y for the entire group        268              239           ΔY = 29
   Impact = ΔY / ΔEnrolled = 29 / 0.92 ≈ 31
57. Case 4: Randomized Offering
   Estimated impact on consumption (Y): instrumental variables regression 29.8**; instrumental variables with controls 30.4**.
   Note: ** = statistically significant at the 1% level.
58. Keep in Mind: Randomized Offering/Promotion
   Don't exclude anyone, but…
   o Randomized promotion needs to be an effective promotion strategy (pilot test it in advance!).
   o The promotion strategy will help you understand how to increase enrollment, in addition to the impact of the program.
   o The strategy depends on the success and validity of the offering/promotion.
   o It estimates a local average treatment effect: the impact estimate is valid only for those who enroll because of the offering/promotion.
59. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
60. Discontinuity Design
   Many social programs select beneficiaries using an index or score:
   o Anti-poverty programs: targeted to households below a given poverty index or income level.
   o Pensions: targeted to the population above a certain age.
   o Education: scholarships targeted to students with high scores on a standardized test.
   o Agriculture: a fertilizer program targeted to small farms (less than a given number of hectares).
61. Example: Effect of a fertilizer program on agricultural production
   Goal: improve agricultural production (rice yields) for small farmers.
   Method: farms with a score (hectares of land) ≤ 50 are small; farms with a score > 50 are not small.
   Intervention: small farmers receive subsidies to purchase fertilizer.
62. Regression Discontinuity Design - Baseline
   (Figure: the outcome plotted against the score, with eligible units below the cut-off and not-eligible units above it; no jump at baseline.)
63. Regression Discontinuity Design - Post-Intervention
   (Figure: after the intervention, the outcome jumps at the cut-off; the jump is the IMPACT.)
64. Case 5: Discontinuity Design
   We have a continuous eligibility index with a defined cut-off:
   o Households with a score ≤ cut-off are eligible.
   o Households with a score > cut-off are not eligible (or vice-versa).
   Intuitive explanation of the method:
   o Units just above the cut-off point are very similar to units just below it: a good comparison.
   o Compare outcomes Y for units just above and just below the cut-off point.
65. Case 5: Discontinuity Design
   Eligibility for Progresa is based on a national poverty index. A household is poor if its score ≤ 750.
   Eligibility for Progresa: Eligible = 1 if score ≤ 750; Eligible = 0 if score > 750.
66. Case 5: Discontinuity Design - score vs. consumption at baseline (no treatment)
   (Figure: fitted values of consumption plotted against the estimated targeting score (poverty index), over scores of roughly 276 to 1,294; no jump at the cut-off before the program.)
67. Case 5: Discontinuity Design - score vs. consumption in the post-intervention period (treatment)
   (Figure: fitted values of consumption against the poverty index after the program, showing a jump at the cut-off.)
   Estimated impact on consumption (Y), multivariate linear regression: 30.58**.
   Note: ** = statistically significant at the 1% level.
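A minimal sketch of a regression discontinuity estimate: fit separate lines on each side of the cut-off within a bandwidth and take the gap at the cut-off. The 750 cut-off follows the Progresa rule quoted above; the data, bandwidth and true effect size are simulated for illustration only:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
score = rng.uniform(276, 1294, n)       # poverty index
eligible = score <= 750                 # Progresa eligibility rule
# Simulated consumption: rises with the score, with a jump of 30 for eligible households.
y = 150 + 0.15 * score + 30 * eligible + rng.normal(0, 25, n)

cutoff, bandwidth = 750.0, 150.0
window = np.abs(score - cutoff) <= bandwidth

def fit_line(x, y):
    """OLS of y on a constant and x; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

below = window & eligible
above = window & ~eligible
b_below = fit_line(score[below] - cutoff, y[below])
b_above = fit_line(score[above] - cutoff, y[above])

impact = b_below[0] - b_above[0]   # difference between the two fits at the cut-off
print(f"RD estimate of the impact at the cut-off: {impact:.1f}")
```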
68. Keep in Mind: Discontinuity Design
   Discontinuity design requires a continuous eligibility criterion with a clear cut-off.
   It gives an unbiased estimate of the treatment effect at the cut-off: observations just across the cut-off are good comparisons.
   There is no need to exclude a group of eligible households or individuals from treatment, and it can sometimes be used for programs that are already ongoing.
69. Keep in Mind: Discontinuity Design
   Discontinuity design produces a local estimate:
   o The effect of the program around the cut-off point / discontinuity.
   o This is not always generalizable.
   Power: you need many observations around the cut-off point.
   Avoid mistakes in the statistical model: sometimes what looks like a discontinuity in the graph is something else.
70. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
71. Matching
   Idea: for each treated unit, pick the best comparison unit (match) from another data source.
   How? Matches are selected on the basis of similarities in observed characteristics.
   Issue? If there are unobservable characteristics and those unobservables influence participation: selection bias!
72. Propensity-Score Matching (PSM)
   Comparison group: non-participants with the same observable characteristics as participants.
   o In practice, this is very hard: there may be many important characteristics!
   Solution proposed by Rosenbaum and Rubin: match on the basis of the "propensity score":
   o Compute everyone's probability of participating, based on their observable characteristics.
   o Choose matches that have the same probability of participation as the treated units.
   o See appendix 2.
73. Density of propensity scores
   (Figure: densities of the propensity score, from 0 to 1, for participants and non-participants; the overlapping range is the common support.)
74. Case 7: Progresa Matching (P-Score)
   Probit regression, Prob(Enrolled = 1) - estimated coefficients on baseline characteristics:
   Head's age (years)              -0.022**
   Spouse's age (years)            -0.017**
   Head's education (years)        -0.059**
   Spouse's education (years)      -0.03**
   Head is female = 1              -0.067
   Indigenous = 1                   0.345**
   Number of household members      0.216**
   Dirt floor = 1                   0.676**
   Bathroom = 1                    -0.197**
   Hectares of land                -0.042**
   Distance to hospital (km)        0.001*
   Constant                         0.664**
   Note: ** = statistically significant at the 1% level.
75. Case 7: Progresa Common Support
   (Figure: densities of Pr(Enrolled) for the enrolled and not-enrolled groups; matching is performed only over the range of common support.)
76. Case 7: Progresa Matching (P-Score)
   Estimated impact on consumption (Y): multivariate linear regression 7.06+.
   Note: ** = significant at the 1% level; + = significant at the 10% level.
77. Keep in Mind: Matching
   Matching requires large samples and good quality data.
   Matching at baseline can be very useful:
   o Know the assignment rule and match based on it.
   o Combine with other techniques (e.g. diff-in-diff).
   Ex-post matching is risky:
   o If there is no baseline, be careful!
   o Matching on endogenous ex-post variables gives bad results.
78. Progresa Policy Recommendation?
   Impact of Progresa on consumption (Y):
   Case 1 (Before & After)             34.28**
   Case 2 (Enrolled & Not Enrolled)    -4.15
   Case 3 (Randomized Assignment)      29.75**
   Case 4 (Randomized Offering)        30.4**
   Case 5 (Discontinuity Design)       30.58**
   Case 6 (Difference-in-Differences)  25.53**
   Case 7 (Matching)                    7.06+
   Note: ** = significant at the 1% level; + = significant at the 10% level.
79. Appendix 2: Steps in Propensity Score Matching
   1. Representative and highly comparable surveys of non-participants and participants.
   2. Pool the two samples and estimate a logit (or probit) model of program participation.
   3. Restrict the samples to ensure common support (an important source of bias in observational studies).
   4. For each participant, find a sample of non-participants with similar propensity scores.
   5. Compare the outcome indicators; the difference is the estimate of the gain due to the program for that observation.
   6. Calculate the mean of these individual gains to obtain the average overall gain.
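A minimal sketch of steps 2-6 on simulated data: estimate a participation model (a logit here, for convenience; the appendix allows either logit or probit), keep observations on common support, match each participant to the nearest non-participant on the propensity score, and average the outcome differences. The covariates, coefficients and true effect are hypothetical; a real analysis would use the baseline characteristics from the probit table above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 4000
x = rng.normal(size=(n, 3))                          # observable characteristics
p_true = 1 / (1 + np.exp(-(x @ [0.8, -0.5, 0.3])))   # true participation probability
d = rng.binomial(1, p_true)                          # 1 = participant, 0 = non-participant
y = 200 + 10 * d + x @ [5.0, 3.0, -2.0] + rng.normal(0, 15, n)   # outcome, true effect = 10

# Step 2: pooled participation model -> propensity scores.
pscore = LogisticRegression(max_iter=1000).fit(x, d).predict_proba(x)[:, 1]

# Step 3: restrict to common support.
lo = max(pscore[d == 1].min(), pscore[d == 0].min())
hi = min(pscore[d == 1].max(), pscore[d == 0].max())
support = (pscore >= lo) & (pscore <= hi)

# Steps 4-6: nearest-neighbour match on the propensity score, then average the gaps.
treat_idx = np.where(support & (d == 1))[0]
ctrl_idx = np.where(support & (d == 0))[0]
gains = []
for i in treat_idx:
    j = ctrl_idx[np.argmin(np.abs(pscore[ctrl_idx] - pscore[i]))]
    gains.append(y[i] - y[j])

print(f"Matching estimate of the average effect on the treated: {np.mean(gains):.1f}")
```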
80. IE Methods Toolbox: Randomized Assignment; Randomized Offering/Promotion; Discontinuity Design; Difference-in-Differences (Diff-in-Diff); Matching (P-Score matching).
81. Keep in Mind: Randomized Assignment
   Randomized assignment, with large enough samples, produces two statistically equivalent groups: we have identified the perfect clone. Randomized beneficiary vs. randomized comparison.
   Feasible for prospective evaluations with oversubscription / excess demand. Most pilots and new programs fall into this category.
82. Randomized assignment with different benefit levels
   Traditional impact evaluation question: what is the impact of a program on an outcome?
   Other policy questions of interest:
   o What is the optimal level for program benefits?
   o What is the impact of a "higher-intensity" treatment compared to a "lower-intensity" treatment?
   Randomized assignment with two levels of benefits: comparison group, low-benefit group, high-benefit group.
83. Randomized assignment with different benefit levels
   (Diagram: 1. Eligible population → 2. Evaluation sample → 3. Randomize treatment into two benefit levels plus a comparison group; eligible and ineligible units marked.)
84. Randomized assignment with multiple interventions
   Other key policy questions for a program with various benefits:
   o What is the impact of one intervention compared to another?
   o Are there complementarities between the various interventions?
   Randomized assignment with two benefit packages (a 2x2 design):
                                 Intervention 1: treatment   Intervention 1: comparison
   Intervention 2: treatment             Group A                     Group C
   Intervention 2: comparison            Group B                     Group D
85. Randomized assignment with multiple interventions
   (Diagram: 1. Eligible population → 2. Evaluation sample → 3. Randomize intervention 1 → 4. Randomize intervention 2.)
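A minimal sketch of the cross-randomization that produces the four groups in the 2x2 table above; the sample size is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400   # hypothetical evaluation sample size (even, so each arm gets n // 2 units)

treat1 = rng.permutation(np.repeat([1, 0], n // 2))   # randomize intervention 1
treat2 = rng.permutation(np.repeat([1, 0], n // 2))   # independently randomize intervention 2

# Crossing the two randomizations gives the four groups in the 2x2 table.
groups = {
    "A (both interventions)":   (treat1 == 1) & (treat2 == 1),
    "B (intervention 1 only)":  (treat1 == 1) & (treat2 == 0),
    "C (intervention 2 only)":  (treat1 == 0) & (treat2 == 1),
    "D (comparison, neither)":  (treat1 == 0) & (treat2 == 0),
}
for label, mask in groups.items():
    print(f"Group {label}: {mask.sum()} units")
```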
86. Appendix 1: Two-Stage Least Squares (2SLS)
   Model with an endogenous treatment (T):
      y = β₀ + β₁T + β₂x + ε
   Stage 1: regress the endogenous variable on the instrument (Z) and the other exogenous regressors:
      T = γ₀ + γ₁x + γ₂Z + u
   Calculate the predicted value T̂ ("T hat") for each observation.
87. Appendix 1: Two-Stage Least Squares (2SLS)
   Stage 2: regress the outcome y on the predicted variable (and the other exogenous variables):
      y = β₀ + β₁T̂ + β₂x + ε
   The standard errors need to be corrected (they are based on T̂ rather than T). In practice, just use Stata's ivreg.
   Intuition: T has been "cleaned" of its correlation with ε.
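A minimal sketch of the two stages done by hand on simulated data. The manual second stage reproduces the point estimate; as the slide notes, its default standard errors are wrong, which is why a canned IV routine such as Stata's ivreg is used in practice. The data-generating process and coefficients here are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
z = rng.binomial(1, 0.5, n)            # randomized offer: the instrument Z
x = rng.normal(size=n)                 # exogenous covariate
u = rng.normal(size=n)                 # unobservable that drives both take-up and the outcome
t = (0.3 * x + 1.5 * z + u + rng.normal(size=n) > 0.8).astype(float)   # endogenous take-up T
y = 2.0 * t + 1.0 * x + 3.0 * u + rng.normal(size=n)                   # true treatment effect = 2

def ols(regressors, y):
    """OLS of y on a constant plus the given regressors; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Stage 1: regress T on the instrument Z and the exogenous x, keep the fitted values.
g = ols([x, z], t)
t_hat = g[0] + g[1] * x + g[2] * z

# Stage 2: regress y on the fitted T-hat and x; the coefficient on T-hat is the 2SLS estimate.
b = ols([x, t_hat], y)
print(f"2SLS estimate of the treatment effect: {b[2]:.2f}")   # close to 2 in this simulation

# Naive OLS of y on the actual T is biased here because T is correlated with u.
b_ols = ols([x, t], y)
print(f"Naive OLS estimate: {b_ols[2]:.2f}")
```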
88. Outline: topics being covered
   Tuesday - Session 1: INTRODUCTION AND OVERVIEW
   1) Introduction
   2) Why is evaluation valuable?
   3) What makes a good evaluation?
   4) How to implement an evaluation?
   Wednesday - Session 2: EVALUATION DESIGN
   5) Causal inference
   6) Choosing your IE method/design
   7) Impact evaluation toolbox
   Thursday - Session 3: SAMPLE DESIGN AND DATA COLLECTION
   9) Sample designs
   10) Types of error and biases
   11) Data collection plans
   12) Data collection management
   Friday - Session 4: INDICATORS & QUESTIONNAIRE DESIGN
   1) Results chain / logic models
   2) SMART indicators
   3) Questionnaire design
89. Thank You!
90. MEASURING IMPACT: Impact Evaluation Methods for Policy Makers
   This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
