Impact evaluation in-depth: More on methods and example of impact evaluation research using R

Presented by Colas Chervier (CIRAD) at "Workshop on impact evaluation methods and research collaboration kick-off", Samarinda, Indonesia, on 10 October 2022


  1. Counterfactual impact evaluation: Implementation • Colas Chervier, CIRAD • colas.chervier@cirad.fr
  2. Content • Session 1: What questions can these methods help address? What are the main features of these methods and why use them? • Session 2: What is a counterfactual? What are the specific features of these methods? How to implement them in practice (steps)? Presentation of an example of an impact assessment study
  3. Goal of these methods (reminder) • Identify a good COUNTERFACTUAL: a control group representing what outcomes would have been in the absence of the intervention, selected after the program target area has been chosen, and free of selection bias • Estimate IMPACT by comparing outcomes between control and treated groups, while controlling for additional biases (e.g. change in price) • [Figure: before/after comparison of control and intervention (PES) groups]
  4. What is a counterfactual?
  5. Rubin's Theoretical Causal Model: individual level • Impact at the individual level: $t_i = y_i^1 - y_i^0$ • $y_i^1$: the value of the parameter of interest if agent $i$ is subject to the policy • $y_i^0$: the value of the parameter of interest if agent $i$ is not subject to the policy • Counterfactual: if the agent is actually subject to the policy, $y_i^0$ is the counterfactual value (and vice versa) • Problem: it is impossible to observe both outcomes for the same individual in the real world • https://www.youtube.com/watch?v=LrmrH26EhSo
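  6. Rubin's Theoretical Causal Model: population level • Average causal effect for a population: $T = E[Y^1 - Y^0] = E[Y^1] - E[Y^0]$ • $E[\cdot]$: mean outcome value for the population • $Y^1$: the vector of potential outcome values if all agents were subject to the policy • $Y^0$: the vector of potential outcome values if no agent were subject to the policy • In practice, some agents receive the program ($W = 1$) and others do not ($W = 0$), so $T$ is the average of the effects in the two groups, weighted by their population shares: $T = P(W = 1)\,(E[Y^1 \mid W = 1] - E[Y^0 \mid W = 1]) + P(W = 0)\,(E[Y^1 \mid W = 0] - E[Y^0 \mid W = 0])$ • We only observe two of the four conditional means: $E[Y^1 \mid W = 1]$ and $E[Y^0 \mid W = 0]$ • We define the impact of the program as the average treatment effect on the treated (ATT): $E[Y^1 \mid W = 1] - E[Y^0 \mid W = 1]$

     | Potential outcome | Program ($W = 1$) | No program ($W = 0$) |
     |---|---|---|
     | $Y^1$ | $E[Y^1 \mid W = 1]$ (observed) | $E[Y^1 \mid W = 0]$ (not observed) |
     | $Y^0$ | $E[Y^0 \mid W = 1]$ (not observed) | $E[Y^0 \mid W = 0]$ (observed) |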
  7. The real world and the selection bias • Difference in observed means, ATT and selection bias: $\Delta\mu = E[Y^1 \mid W = 1] - E[Y^0 \mid W = 0] = (E[Y^1 \mid W = 1] - E[Y^0 \mid W = 1]) + (E[Y^0 \mid W = 1] - E[Y^0 \mid W = 0])$ • The first term is the impact (ATT) • The second term is the selection bias → if treatment assignment is independent of the potential outcomes (no selection bias), the difference in means equals the impact • https://www.youtube.com/watch?v=RKGw2Lp6Y8I&t=1s
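The decomposition on this slide can be checked numerically. The following is a minimal simulation sketch in Python, not something taken from the presentation: the confounder `ability`, the true effect of 3 and every other number are illustrative assumptions. Because both potential outcomes are simulated, the ATT and the selection bias can be computed directly, which is impossible with real data.

```python
import random
from statistics import mean

def simulate(n=5000, seed=0):
    """Simulate units with both potential outcomes and self-selection.

    All quantities are hypothetical: 'ability' raises the untreated
    outcome AND the chance of joining the program (selection bias).
    """
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        y0 = 10 + 2 * ability + rng.gauss(0, 1)  # outcome without the program
        y1 = y0 + 3                              # outcome with the program
        w = 1 if ability + rng.gauss(0, 1) > 0 else 0
        units.append((w, y0, y1))
    return units

def decompose(units):
    """Return (observed difference in means, ATT, selection bias)."""
    treated = [u for u in units if u[0] == 1]
    control = [u for u in units if u[0] == 0]
    naive = mean([u[2] for u in treated]) - mean([u[1] for u in control])
    att = mean([u[2] - u[1] for u in treated])
    bias = mean([u[1] for u in treated]) - mean([u[1] for u in control])
    return naive, att, bias

naive, att, bias = decompose(simulate())
print(f"difference in means = {naive:.2f}")
print(f"ATT = {att:.2f}, selection bias = {bias:.2f}")
```

In this sketch the observed difference in means overstates the impact, because units with high `ability` both select into the program and would have had better outcomes without it.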
  8. One way to get rid of the selection bias • RCT, or randomized controlled trial • The ideal case where there is no selection bias, i.e. where the treatment assignment is independent of $Y^1$ and $Y^0$ • Rare and difficult to implement → quasi-experimental methods
  9. Methods description
  10. Matching (1/2) • Each treated unit of observation is assigned a twin with similar observable pre-treatment characteristics (covariates) so that, on average, there is no difference between the two groups • Covariates are the variables that influence both the probability of being selected into the program and changes in the outcome variable of interest • Covariates are measured before the program to avoid contamination by the treatment • Selection bias is thereby controlled for (no initial differences) • The comparison of the outcome means of the two groups theoretically gives the impact free of selection bias
  11. Matching (2/2) • Data/knowledge needs: data available for a large number of agents who did not receive the program; a good understanding of what influenced the placement of the program and what influences the parameter of interest; pre-intervention data • Hypothesis: there is no unobserved difference between the treatment group and the comparison group that is correlated with the outcomes (e.g. governance)
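As an illustration of the twin idea, here is a minimal sketch of one-to-one nearest-neighbour matching with replacement on standardized covariates, written in plain Python. It is only one simple variant among many and is not the procedure used in the presentation; the function name and the data are hypothetical.

```python
from statistics import mean, pstdev

def nearest_neighbor_att(treated, controls):
    """Estimate the ATT by matching each treated unit to its closest control.

    treated, controls: lists of (covariates, outcome), where covariates is
    a tuple of pre-treatment values. Matching is done with replacement.
    """
    pool = treated + controls
    k = len(pool[0][0])
    # Standardize each covariate so that no single scale dominates the distance.
    sds = [pstdev([u[0][j] for u in pool]) or 1.0 for j in range(k)]

    def dist(a, b):
        return sum(((a[j] - b[j]) / sds[j]) ** 2 for j in range(k))

    diffs = []
    for x_t, y_t in treated:
        _, y_c = min(controls, key=lambda c: dist(x_t, c[0]))
        diffs.append(y_t - y_c)
    return mean(diffs)

# Hypothetical data: (pre-treatment covariates, post-treatment outcome)
treated = [((810, 25), 12.0), ((1050, 60), 20.0)]
controls = [((800, 24), 10.0), ((1400, 2), 3.0), ((1040, 62), 15.0)]
att_hat = nearest_neighbor_att(treated, controls)
print(att_hat)
```

The estimate is only as good as the covariates: any unobserved difference between matched pairs that also affects the outcome remains as bias, which is the hypothesis stated on the slide.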
  12. DD (1/2) • Difference-in-differences (DD): a before/after and with/without comparison of outcomes • Controls for time-invariant factors that affect each group (observed: year of birth, etc.; unobserved: intelligence) • Controls for external factors that vary over time and affect both groups (price changes, etc.)
  13. DD (2/2) • Data/knowledge needs: before and after data for the outcome variable • Hypotheses: no unobserved time-varying factors that affect the two groups differently; parallel trends in the absence of the intervention
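The DD estimate is the change over time in the treated group minus the change over time in the control group. A minimal sketch in Python, with hypothetical numbers (for instance, forest cover in percent) and an illustrative function name:

```python
from statistics import mean

def diff_in_diff(records):
    """Simple difference-in-differences from group means.

    records: list of (treated, post, outcome), with treated and post
    coded 0/1. Returns (treated change) - (control change).
    """
    def cell(t, p):
        return mean([y for tr, po, y in records if tr == t and po == p])

    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

# Hypothetical outcomes
records = [
    (1, 0, 52.0), (1, 0, 48.0),  # treated, before
    (1, 1, 41.0), (1, 1, 39.0),  # treated, after
    (0, 0, 50.0), (0, 0, 46.0),  # control, before
    (0, 1, 31.0), (0, 1, 29.0),  # control, after
]
dd = diff_in_diff(records)
print(dd)
```

Here the treated group loses 10 points and the control group loses 18, so the estimated impact is +8 points: under the parallel-trends hypothesis, the treated group would also have lost 18 points without the intervention.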
  14. Synthetic control method (1/2) • Principle: SCM creates a counterfactual as a weighted combination of control units • Weights are assigned to control units so that their combination minimizes differences in pre-treatment outcomes and predictors of the outcome • Impact is obtained by comparing a weighted average of control units' outcomes with treated units' outcomes over time • https://medium.com/analytics-vidhya/synthetic-control-method-5c01f72da4e
  15. Synthetic control method (2/2) • Data/knowledge needs: pre-treatment outcome data over a long period; good predictors of the outcome, with access to pre-treatment data for them; a large control pool • Hypothesis: the best-fitting weighted combination of control units in terms of pre-treatment outcomes would follow the same time trend as the treated unit would have followed without the intervention
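To illustrate the weighting principle, the sketch below searches a coarse grid of non-negative weights summing to 1 and keeps the combination that best reproduces the treated unit's pre-treatment outcomes. This is a simplified stand-in: real SCM implementations solve a constrained optimization and also match on predictors of the outcome. All names and numbers are hypothetical.

```python
from itertools import product

def synthetic_weights(treated_pre, controls_pre, steps=20):
    """Grid-search weights (>= 0, summing to 1) over control units that
    minimise the squared pre-treatment gap with the treated unit."""
    n = len(controls_pre)
    best_w, best_loss = None, float("inf")
    for combo in product(range(steps + 1), repeat=n):
        if sum(combo) != steps:  # keep only weights that sum to 1
            continue
        w = [c / steps for c in combo]
        loss = sum(
            (y - sum(wi * c[t] for wi, c in zip(w, controls_pre))) ** 2
            for t, y in enumerate(treated_pre)
        )
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w

def synthetic_gap(treated_post, controls_post, w):
    """Treated outcome minus synthetic-control outcome, period by period."""
    return [
        y - sum(wi * c[t] for wi, c in zip(w, controls_post))
        for t, y in enumerate(treated_post)
    ]

# Hypothetical outcome series: three control units, three pre-treatment periods
controls_pre = [[10, 12, 14], [20, 22, 24], [5, 5, 5]]
treated_pre = [15, 17, 19]
weights = synthetic_weights(treated_pre, controls_pre)

# Two post-treatment periods
controls_post = [[16, 18], [26, 28], [5, 5]]
treated_post = [18, 17]
gap = synthetic_gap(treated_post, controls_post, weights)
print(weights, gap)
```

The grid search is only practical for a handful of control units, since the number of combinations grows quickly with the size of the control pool.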
  16. Implementing quasi-experimental methods step by step • Example: the impact of logging concessions in DRC
  17. Step 1: Define policy-relevant question(s) to be addressed • Key questions to ask ourselves: What intervention or group of interventions do we want to evaluate? What outcome(s): socio-economic impact and/or environmental impact, and which one in particular? What kind of heterogeneity analysis? • Decision criteria: policy relevance / stakeholders' interest (co-development of the empirical question); scientific relevance; access to data, particularly the availability of pre-intervention ("before") data and socio-economic data
  18. Step 2: Create a theory of change • Build a theory of change (ToC) detailing: the impact pathways through which the intervention influences outcomes; the other explanations/factors that can influence the outcome of interest; the possible reasons why impact might be heterogeneous; the reasons why the intervention is placed where it is placed • This will help us define and justify our working hypotheses and our control variables • [Diagram relating the intervention, outcomes, other causes of deforestation, placement criteria and heterogeneous impact]
  19. Step 3: Define the key building blocks of the impact evaluation (IE) (1/3) • Unit of analysis: depends a lot on the structure of the data that will be used and on the outcome we target • E.g. household (survey/income analysis); pixel or group of pixels (remote sensing data/deforestation analysis); jurisdiction or administrative unit (health data)
  20. Step 3: Define the key building blocks of the IE (2/3) • The intervention • Treated area: usually quite easy to define; it is the pool of units where we expect to observe the impact of the project (REDD+ target villages, etc.) • The cut-off date: when did the intervention start?
  21. Step 3: Define the key building blocks of the IE (3/3) • The control area: based on an understanding of where the policy could have been implemented, or could be implemented in the future (ToC), and where outcomes would have evolved as in the treated areas had these not been treated • Work with experts and stakeholders
  22. Step 4: Choose the estimation strategy (1/2) • Which method to build the counterfactual? • Depends a lot on the size of the treated pool of units • Building a counterfactual is possible only if pre-intervention data are available for both control and treated units • If the number of treated units is small, SCM; otherwise matching • [Diagram: pre-matching, none, or SCM]
  23. Step 4: Choose the estimation strategy (2/2) • Which impact estimation method? Depends a lot on what data is available • Overall strategy: if pre-intervention data are available for the outcome, use a BACI design (difference-in-differences), as it allows controlling for more sources of bias; otherwise use CI (with/without intervention comparison) • Why estimate a more complex regression? To take into account the hierarchical structure of the data (observations not independent); to further control for covariates (e.g. time- and group-varying covariates); to take into account time trends (e.g. panel data) • [Diagram: difference-in-differences (BACI design) or with/without intervention, each estimated by a simple means difference or a more complex regression]
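For reference, the "more complex regression" for a BACI design is commonly written as an interaction model. The notation below is an assumption added for illustration (it does not appear in the slides): $W_i$ is the treatment indicator, $\mathrm{Post}_t$ equals 1 after the cut-off date, and $X_{it}$ are optional covariates.

```latex
Y_{it} = \alpha + \beta\, W_i + \gamma\, \mathrm{Post}_t
       + \delta\, (W_i \times \mathrm{Post}_t)
       + \theta^{\top} X_{it} + \varepsilon_{it}
```

Here $\delta$ is the difference-in-differences estimate of the impact. Without covariates, it equals the simple difference of mean changes between the treated and control groups; the hierarchical structure mentioned on the slide is typically handled through the treatment of $\varepsilon_{it}$, for example with standard errors clustered by unit.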
  24. Step 4: Identify specific variables and data sources for covariate and impact estimation • Identify the variables needed: 3 main types (control*, outcome and treatment); based on the ToC; depends on available data sources (trade-off) • * Reminder: all variables influencing outcome and placement of the intervention

     | Variable name | Variable description | Data source | File name | Attribute and category name | Year |
     |---|---|---|---|---|---|
     | Settlements | Euclidean distance to closest settlement | Forest Atlas 2009 | shp file: RDC_localité_2009 | All points | 2009 (only year available) |
     | Roads including logging roads - OLD OPEN | Euclidean distance to an "old" open road | https://www.research-collection.ethz.ch/handle/20.500.11850/342221 | shp: Kleinshcroth et al | old_open | 2003 and before |
  25. Step 5: Plan additional analyses using the same estimation strategy (1/2) • Show that the impact is not found by chance (convince reviewers and readers) • Robustness/sensitivity: are our conclusions sensitive to changes in model specification? E.g. adding a variable or changing the matching procedure • Placebo: do we find significant results in cases where we should not? E.g. using a fake treated group that looks like the real treated group but was not treated
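One simple way to run the placebo check described above is to treat each control unit in turn as a fake treated unit and re-estimate the impact against the remaining controls. The sketch below assumes the outcome has already been converted to a per-unit change (after minus before); the data and function names are hypothetical.

```python
from statistics import mean

def did(treated_changes, control_changes):
    """Difference in mean before/after changes between two groups."""
    return mean(treated_changes) - mean(control_changes)

def placebo_estimates(control_changes):
    """Treat each control unit in turn as a fake treated unit and
    estimate its 'impact' against the remaining controls."""
    estimates = []
    for i, fake in enumerate(control_changes):
        rest = control_changes[:i] + control_changes[i + 1:]
        estimates.append(did([fake], rest))
    return estimates

# Hypothetical per-unit changes in the outcome (after minus before)
treated_changes = [-2.0, -4.0]
control_changes = [-10.0, -12.0, -11.0, -9.0, -13.0]

real = did(treated_changes, control_changes)
placebos = placebo_estimates(control_changes)
# Share of placebo estimates at least as large as the real one
share = sum(abs(p) >= abs(real) for p in placebos) / len(placebos)
print(real, placebos, share)
```

If many placebo estimates were as large as the real one, the estimated impact could plausibly be due to chance or to a flaw in the estimation strategy.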
  26. Step 5: Plan additional analyses using the same estimation strategy (2/2) • Provide information about the conditions of effectiveness • Heterogeneity: does impact magnitude vary according to initial conditions? Is impact magnitude different depending on specific intervention characteristics? • Note: to be planned according to the ToC
  27. Step 6: Collect data and create a database • Data collection or compilation: create a GIS "project" compiling all spatial maps, shapefiles, etc.; design and implement a socio-economic survey in intervention and control areas • Create datasets for counterfactual creation and for estimation:

     Panel dataset

     | id | year | Concession or village | Treatment | year_contract | weights | X (precipitation) | outcome |
     |---|---|---|---|---|---|---|---|
     | 1 | 1 | A | 0 | 0 | 0.462 | 810 | 25 |
     | 1 | 2 | A | 0 | 1 | 0.462 | 937 | 45 |
     | 1 | 3 | A | 0 | 1 | 0.462 | 1077 | 36 |
     | 2 | 1 | A | 0 | 0 | 0.850 | 1448 | 2 |
     | 2 | 2 | A | 0 | 1 | 0.850 | 890 | 8 |
     | 2 | 3 | A | 0 | 1 | 0.850 | 1035 | 12 |
     | 3 | 1 | B | 1 | 0 | 0.000 | 1053 | 23 |
     | 3 | 2 | B | 1 | 0 | 0.000 | 1030 | 65 |
     | 3 | 3 | B | 1 | 0 | 0.000 | 840 | 85 |

     For matching

     | id | Treatment | X1 pretreat | X2 pretreat |
     |---|---|---|---|
     | 1 | 0 | 810 | 25 |
     | 2 | 0 | 937 | 45 |
     | 3 | 0 | 1077 | 36 |
     | 4 | 0 | 1448 | 2 |
     | 5 | 0 | 890 | 8 |
     | 6 | 0 | 1035 | 12 |
     | 7 | 1 | 1053 | 23 |
     | 8 | 1 | 1030 | 65 |
     | 9 | 1 | 840 | 85 |
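A matching dataset has one row per unit with pre-treatment values only, so it can be derived from a panel by keeping the baseline year. The sketch below does this in plain Python using the panel values shown on the slide. Treating year 1 as the pre-treatment baseline is an assumption made for illustration, and the output column names are hypothetical.

```python
# Panel rows from the slide:
# (id, year, site, treatment, year_contract, weight, precipitation, outcome)
PANEL = [
    (1, 1, "A", 0, 0, 0.462, 810, 25),
    (1, 2, "A", 0, 1, 0.462, 937, 45),
    (1, 3, "A", 0, 1, 0.462, 1077, 36),
    (2, 1, "A", 0, 0, 0.850, 1448, 2),
    (2, 2, "A", 0, 1, 0.850, 890, 8),
    (2, 3, "A", 0, 1, 0.850, 1035, 12),
    (3, 1, "B", 1, 0, 0.000, 1053, 23),
    (3, 2, "B", 1, 0, 0.000, 1030, 65),
    (3, 3, "B", 1, 0, 0.000, 840, 85),
]

def pretreatment_table(panel, baseline_year=1):
    """Keep one row per unit, with covariates measured at the baseline year.

    ASSUMPTION: baseline_year is before the intervention's cut-off date.
    """
    rows = []
    for uid, year, _site, treat, _contract, _weight, precip, outcome in panel:
        if year == baseline_year:
            rows.append({
                "id": uid,
                "treatment": treat,
                "x_pretreat": precip,
                "outcome_pretreat": outcome,
            })
    return rows

matching_rows = pretreatment_table(PANEL)
for row in matching_rows:
    print(row)
```

Restricting covariates to the baseline year follows the earlier point that covariates must be measured before the program to avoid contamination by the treatment.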
  28. Step 7: Final steps (1/2) • Data analysis: requires skills in econometrics; R has better packages for matching and is open source • Interpretation of results: work with experts and stakeholders
  29. Step 7: Final steps (2/2) • Publication of results
  30. Conclusion day 2 • Relatively easy to implement for environmental impacts but a bit more difficult for social impacts (budget, lack of data) • The success of these methods rests on (1) a good understanding of the context and (2) good access to data • It requires a mix of expertise (econometrics, GIS, context knowledge)
