This document outlines a training curriculum for evaluating the socio-economic impact of a water program. It covers four sessions over four days: introduction and overview, evaluation design, sample design and data collection, and indicators and questionnaire design. Key topics include causal inference, impact evaluation methods like randomized assignment and difference-in-differences, sample designs, and designing indicators and questionnaires. The document uses a case study of Mexico's Progresa anti-poverty program to illustrate concepts like randomized assignment, pre-post comparisons, and enrolled vs non-enrolled comparisons.
1. Chris Nicoletti
Activity #267: Analysing the socio-economic impact of the Water Hibah on beneficiary households and communities (Stage 1)
Impact Evaluation Training Curriculum
Day 2
April 17, 2013
2. 2
Outline: topics being covered
Tuesday - Session 1: INTRODUCTION AND OVERVIEW
1) Introduction
2) Why is evaluation valuable?
3) What makes a good evaluation?
4) How to implement an evaluation?
Wednesday - Session 2: EVALUATION DESIGN
5) Causal Inference
6) Choosing your IE method/design
7) Impact Evaluation Toolbox
Thursday - Session 3: SAMPLE DESIGN AND DATA COLLECTION
9) Sample Designs
10) Types of Error and Biases
11) Data Collection Plans
12) Data Collection Management
Friday - Session 4: INDICATORS & QUESTIONNAIRE DESIGN
1) Results chain/logic models
2) SMART indicators
3) Questionnaire Design
3. This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B. and Christel M. J. Vermeersch, 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
MEASURING IMPACT
Impact Evaluation Methods for Policy Makers
7. 7
Our Objective
Estimate the causal effect (impact) of intervention (P) on outcome (Y).
(P) = Program or Treatment
(Y) = Indicator, Measure of Success
Example: What is the effect of a household freshwater connection (P) on household water expenditures (Y)?
9. 9
Problem of Missing Data
For a program beneficiary:
α = (Y | P=1) - (Y | P=0)
We observe
(Y | P=1): Household Consumption (Y) with a cash transfer program (P=1)
but we do not observe
(Y | P=0): Household Consumption (Y) without a cash transfer program (P=0)
11. 11
Estimating impact of P on Y
α = (Y | P=1) - (Y | P=0)
IMPACT = Outcome with treatment - counterfactual
OBSERVE (Y | P=1): Outcome with treatment
ESTIMATE (Y | P=0): The Counterfactual
o Use a comparison or control group
o Intention to Treat (ITT) - those to whom we wanted to give treatment
o Treatment on the Treated (TOT) - those actually receiving treatment
12. 12
Example: What is the impact of giving Jim additional pocket money (P) on Jim's consumption of candies (Y)?
14. 14
In reality, use statistics
Treatment: Average Y = 6 candies
Comparison: Average Y = 4 candies
IMPACT = 6 - 4 = 2 candies
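A minimal sketch of this difference-in-means calculation in Python. The candy counts and group sizes below are made up for illustration; the point is simply to compare average outcomes between treatment and comparison units and attach a t-test.

```python
import numpy as np
from scipy import stats

# Hypothetical outcomes (candies consumed) for a treatment and a comparison group.
treatment = np.array([6, 7, 5, 6, 6, 7, 5, 6])
comparison = np.array([4, 5, 3, 4, 4, 5, 4, 3])

impact = treatment.mean() - comparison.mean()          # difference in means
t_stat, p_value = stats.ttest_ind(treatment, comparison, equal_var=False)

print(f"Impact (difference in means): {impact:.2f} candies")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```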
15. 16
Finding good comparison groups
We want to find clones for the Jims in our programs.
The treatment and comparison groups should
• have identical characteristics,
• except for benefiting from the intervention.
In practice, use program eligibility & assignment rules to construct valid counterfactuals.
16. 17
Case Study: Progresa
National anti-poverty program in Mexico
o Started 1997
o 5 million beneficiaries by 2004
o Eligibility based on a poverty index
Cash Transfers
o Conditional on school and health care attendance.
17. 18
Case Study: Progresa
Rigorous impact evaluation with rich data
o 506 communities, 24,000 households
o Baseline 1997, follow-up 1998
Many outcomes of interest. Here: Consumption per capita
What is the effect of Progresa (P) on Consumption Per Capita (Y)?
If the impact is an increase of $20 or more, then scale up nationally.
21. 22
Case 1: Before & After
What is the effect of Progresa (P) on consumption (Y)?
(1) Observe only beneficiaries (P=1)
(2) Two observations in time: consumption at T=0 and consumption at T=1.
[Figure: consumption (Y) over time, T=1997 to T=1998; consumption rises from B = 233 to A = 268, so IMPACT = A - B = α = $35]
22. 23
Case 1: Before & After
Consumption (Y):
  Outcome with Treatment (After): 268.7
  Counterfactual (Before): 233.4
  Impact (Y | P=1) - (Y | P=0): 35.3***
Estimated Impact on Consumption (Y):
  Linear Regression: 35.27**
  Multivariate Linear Regression: 34.28**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
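A hedged sketch of how the before/after ("Linear Regression") estimate can be computed: stack the two observations per beneficiary household and regress consumption on a post-period dummy. The DataFrame, values, and column names (consumption, post) are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format panel of beneficiary households only (P = 1),
# one row per household per year (1997 baseline, 1998 follow-up).
df = pd.DataFrame({
    "household":   [1, 1, 2, 2, 3, 3],
    "year":        [1997, 1998] * 3,
    "consumption": [230, 266, 235, 270, 236, 269],  # made-up values
})
df["post"] = (df["year"] == 1998).astype(int)

# The coefficient on 'post' is the change in mean consumption between
# baseline and follow-up (the $35 figure in the slides).
model = smf.ols("consumption ~ post", data=df).fit()
print(model.params["post"])
```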
23. 24
Case 1: What's the problem?
[Figure: consumption over time, 1997 to 1998; observed change from B = 233 to A = 268 (α = $35), with hypothetical counterfactual end points C (boom) and D (recession)]
Economic Boom:
o Real Impact = A - C
o A - B is an overestimate
Economic Recession:
o Real Impact = A - D
o A - B is an underestimate
25. 26
False Counterfactual #2: Enrolled & Not Enrolled
If we have post-treatment data on
o Enrolled: treatment group
o Not enrolled: "comparison" group (counterfactual)
  Those ineligible to participate.
  Those that choose NOT to participate.
Selection Bias
o The reason for not enrolling may be correlated with the outcome (Y)
  We can control for observables, but not for un-observables!
o The estimated impact is confounded with other things.
26. 27
Case 2: Enrolled & Not Enrolled
[Figure: population split into Ineligibles (Non-Poor) and Eligibles (Poor); Enrolled: Y = 268, Not Enrolled: Y = 290]
In what ways might the Enrolled and Not Enrolled be different, other than their enrollment in the program?
27. 28
Case 2: Enrolled & Not Enrolled
Consumption (Y):
  Outcome with Treatment (Enrolled): 268
  Counterfactual (Not Enrolled): 290
  Impact (Y | P=1) - (Y | P=0): -22**
Estimated Impact on Consumption (Y):
  Linear Regression: -22**
  Multivariate Linear Regression: -4.15
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
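A minimal sketch of the "Linear Regression" vs "Multivariate Linear Regression" comparison for the enrolled/not-enrolled case: regress consumption on an enrollment dummy, first alone and then with observable baseline controls. The data file and column names (consumption, enrolled, head_age, head_educ, hh_size) are hypothetical; controls can absorb observable differences between groups but not unobserved selection.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical post-treatment household survey with an enrollment indicator
# and a few observable baseline characteristics.
df = pd.read_csv("progresa_followup.csv")  # hypothetical file

# Naive comparison of enrolled vs not enrolled (biased by selection).
naive = smf.ols("consumption ~ enrolled", data=df).fit()

# "Multivariate linear regression": adds observable controls, but any
# unobservable reasons for (not) enrolling can still bias the estimate.
adjusted = smf.ols(
    "consumption ~ enrolled + head_age + head_educ + hh_size", data=df
).fit()

print(naive.params["enrolled"], adjusted.params["enrolled"])
```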
28. 29
Progresa Policy Recommendation?
Will you recommend scaling up Progresa?
B&A: Are there other time-varying factors that also influence consumption?
E&NE:
o Are reasons for enrolling correlated with consumption?
o Selection Bias.
Impact on Consumption (Y):
  Case 1: Before & After
    Linear Regression: 35.27**
    Multivariate Linear Regression: 34.28**
  Case 2: Enrolled & Not Enrolled
    Linear Regression: -22**
    Multivariate Linear Regression: -4.15
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
29. 30
Keep in Mind
B&A
Compare: the same individuals Before and After they receive P.
Problem: other things may have happened over time.
E&NE
Compare: a group of individuals Enrolled in a program with a group that chooses not to enroll.
Problem: Selection Bias. We don't know why they are not enrolled.
Both counterfactuals may lead to biased estimates of the impact.
31. 32
Choosing your IE method(s)
Key information you will need for identifying the right method for your program:
o Prospective/Retrospective evaluation?
o Eligibility rules and criteria?
o Roll-out plan (pipeline)?
o Is the number of eligible units larger than available resources at a given point in time?
o Poverty targeting? Geographic targeting?
o Budget and capacity constraints? Excess demand for the program? Etc.
32. 33
Choosing your IE method(s)
Choose the best possible design given the operational context:
Best Design
o Best comparison group you can find + least operational risk
Have we controlled for everything?
o Internal validity
o Good comparison group
Is the result valid for everyone?
o External validity
o Local versus global treatment effect
o Evaluation results apply to the population we're interested in
34. 35
Randomized Treatments & Comparison
Eligibles > Number of Benefits
o Randomize! Lottery for who is offered benefits
o Fair, transparent and ethical way to assign benefits to equally deserving populations.
Oversubscription
o Give each eligible unit the same chance of receiving treatment
o Compare those offered treatment with those not offered treatment (comparisons).
Randomized Phase-In
o Give each eligible unit the same chance of receiving treatment first, second, third…
o Compare those offered treatment first with those offered later (comparisons).
35. 36
Randomized treatments and comparisons
[Figure: 1. Population (eligible and ineligible units) – external validity; 2. Evaluation sample; 3. Randomize treatment into Treatment and Comparison groups – internal validity.]
36. 37
Unit of Randomization
Choose according to type of program:
o Individual/Household
o School/Health Clinic/Catchment area
o Block/Village/Community
o Ward/District/Region
Keep in mind:
o Need a "sufficiently large" number of units to detect the minimum desired impact: Power.
o Spillovers/contamination
o Operational and survey costs
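A hedged sketch of the kind of power calculation this implies, using statsmodels: given an assumed effect size, significance level, and desired power, how many units per arm are needed? The effect size, power, and alpha below are illustrative assumptions, not numbers from the deck, and the simple two-sample formula ignores the design effect that community-level (clustered) randomization would require.

```python
from statsmodels.stats.power import TTestIndPower

# Illustrative assumptions: 80% power to detect a standardized effect size
# of 0.2 (impact / std. dev. of the outcome) at alpha = 0.05.
analysis = TTestIndPower()
n_per_arm = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(f"Units needed per arm: {n_per_arm:.0f}")
# Note: randomizing communities rather than households requires inflating
# this number by a design effect, which this simple formula does not include.
```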
37. 38
Case 3: Randomized Assignment
Progresa CCT program
Unit of randomization: Community
Randomized phase-in:
o 320 treatment communities (14,446 households): first transfers in April 1998.
o 186 comparison communities (9,630 households): first transfers in November 1999.
o 506 communities in the evaluation sample
39. 40
Case 3: Randomized Assignment
How do we know we have good clones?
In the absence of Progresa, treatment and comparisons should be identical.
Let's compare their characteristics at baseline (T=0).
40. 41
Case 3: Balance at Baseline
Case 3: Randomized Assignment (Treatment / Comparison / T-stat)
Consumption ($ monthly per capita): 233.4 / 233.47 / -0.39
Head's age (years): 41.6 / 42.3 / -1.2
Spouse's age (years): 36.8 / 36.8 / -0.38
Head's education (years): 2.9 / 2.8 / 2.16**
Spouse's education (years): 2.7 / 2.6 / 0.006
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
41. 42
Case 3: Balance at Baseline
Case 3: Randomized Assignment (Treatment / Comparison / T-stat)
Head is female=1: 0.07 / 0.07 / -0.66
Indigenous=1: 0.42 / 0.42 / -0.21
Number of household members: 5.7 / 5.7 / 1.21
Bathroom=1: 0.57 / 0.56 / 1.04
Hectares of Land: 1.67 / 1.71 / -1.35
Distance to Hospital (km): 109 / 106 / 1.02
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
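A small sketch of how a balance table like this can be produced: loop over baseline covariates and compare treatment and comparison means with a t-test. The file name and column names are hypothetical placeholders for the Progresa baseline variables.

```python
import pandas as pd
from scipy import stats

# Hypothetical baseline survey with a 0/1 'treatment' assignment indicator.
df = pd.read_csv("progresa_baseline.csv")  # hypothetical file

covariates = ["consumption", "head_age", "spouse_age",
              "head_educ", "spouse_educ", "hh_size"]

for var in covariates:
    treat = df.loc[df["treatment"] == 1, var].dropna()
    comp = df.loc[df["treatment"] == 0, var].dropna()
    t_stat, p_val = stats.ttest_ind(treat, comp, equal_var=False)
    print(f"{var:12s}  T={treat.mean():7.2f}  C={comp.mean():7.2f}  "
          f"t={t_stat:5.2f}  p={p_val:.3f}")
```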
42. 43
Case 3: Randomized Assignment
(Treatment Group, randomized to treatment / Counterfactual, randomized to comparison / Impact (Y | P=1) - (Y | P=0))
Baseline (T=0) Consumption (Y): 233.47 / 233.40 / 0.07
Follow-up (T=1) Consumption (Y): 268.75 / 239.5 / 29.25**
Estimated Impact on Consumption (Y):
  Linear Regression: 29.25**
  Multivariate Linear Regression: 29.75**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
43. 44
Progresa Policy Recommendation?
Impact of Progresa on Consumption (Y):
  Case 1: Before & After - Multivariate Linear Regression: 34.28**
  Case 2: Enrolled & Not Enrolled - Linear Regression: -22**; Multivariate Linear Regression: -4.15
  Case 3: Randomized Assignment - Multivariate Linear Regression: 29.75**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
44. 45
Keep in Mind
Randomized Assignment
With large enough samples, randomized assignment produces two statistically equivalent groups: a randomized beneficiary group and a randomized comparison group. We have identified the perfect clone.
Feasible for prospective evaluations with over-subscription/excess demand. Most pilots and new programs fall into this category.
46. 47
What if we can't choose?
It's not always possible to choose a control group. What about:
o National programs where everyone is eligible?
o Programs where participation is voluntary?
o Programs where you can't exclude anyone?
Can we compare Enrolled & Not Enrolled? Selection Bias!
47. 48
Randomly offering or promoting program
Randomized offering - if you can exclude some units, but can't force anyone:
o Offer the program to a random sub-sample
o Many will accept
o Some will not accept
Randomized promotion - if you can't exclude anyone, and can't force anyone:
o Make the program available to everyone
o But provide additional promotion, encouragement or incentives to a random sub-sample:
  Additional information.
  Encouragement.
  Incentives (small gift or prize).
  Transport (bus fare).
48. 49
Randomly offering or promoting program
Necessary conditions:
1. The offered/promoted and not-offered/not-promoted groups are comparable:
  • Whether or not you offer or promote is not correlated with population characteristics
  • Guaranteed by randomization.
2. The offered/promoted group has higher enrollment in the program.
3. Offering/promotion of the program does not affect outcomes directly.
49. 50
Randomly offering or promoting program
3 groups of units/individuals:
o Never Enroll
o Only Enroll if offered/promoted
o Always Enroll
[Figure: enrollment of each group WITH and WITHOUT the offering/promotion]
50. 51
Randomly offering or promoting program
[Figure: eligible units are randomized into an Offering/Promotion group and a No Offering/No Promotion group; the "Always" types enroll in both, the "Never" types enroll in neither, and the "Only if offered/promoted" types enroll only under offering/promotion.]
51. 52
Randomly offering or promoting program
Offered/Promoted Group: %Enrolled = 80%, Average Y for entire group = 100
Not Offered/Not Promoted Group: %Enrolled = 30%, Average Y for entire group = 80
Impact: ΔEnrolled = 50%, ΔY = 20, Impact = 20 / 50% = 40
(Groups: Never Enroll, Only Enroll if Offered/Promoted, Always Enroll)
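A minimal sketch of this ratio (the Wald estimator): scale the difference in average outcomes by the difference in enrollment rates between the promoted and non-promoted groups. The numbers reproduce the illustrative 80%/30% and 100/80 example from the slide.

```python
# Wald estimator for randomized offering/promotion, using the slide's
# illustrative enrollment rates and mean outcomes by assignment group.
enroll_promoted, enroll_not_promoted = 0.80, 0.30
y_promoted, y_not_promoted = 100.0, 80.0

itt_effect = y_promoted - y_not_promoted               # effect of being promoted: 20
take_up_diff = enroll_promoted - enroll_not_promoted   # extra enrollment induced: 0.50

late = itt_effect / take_up_diff                       # effect on those induced to enroll
print(late)  # 40.0
```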
53. 54
Community Based School Management in Nepal
Context:
o A centralized school system
o 2003: decision to allow local administration of schools
The program:
o Communities express interest to participate.
o Receive a monetary incentive ($1,500)
What is the impact of local school administration on school enrollment, teacher absenteeism, learning quality, and financial management?
Randomized promotion:
o An NGO helps communities with enrollment paperwork.
o 40 communities with randomized promotion (15 participate)
o 40 communities without randomized promotion (5 participate)
54. 55
Maternal Child Health Insurance in Argentina
Context:
o 2001 financial crisis
o Health insurance coverage diminishes
Pay for Performance (P4P) program:
o Change in payment system for providers.
o 40% of payment upon meeting quality standards
What is the impact of the new provider payment system on the health of pregnant women and children?
Randomized promotion:
o Universal program throughout the country.
o Randomized intensive information campaigns to inform women of the new payment system and increase the use of health services.
55. 56
Case 4: Randomized Offering/Promotion
Randomized offering/promotion is an "Instrumental Variable" (IV)
o A variable correlated with treatment but nothing else (i.e. the randomized promotion)
o Use 2-stage least squares (see annex)
Using this method, we estimate the effect of "treatment on the treated"
o It's a "local" treatment effect (valid only for those who enroll because of the offering/promotion)
o In randomized offering: treated = those offered the treatment who enrolled
o In randomized promotion: treated = those to whom the program was promoted and who enrolled
56. 57
Case 4: Progresa Randomized Offering
Offered group: %Enrolled = 92%, Average Y for entire group = 268
Not offered group: %Enrolled = 0%, Average Y for entire group = 239
Impact: ΔEnrolled = 0.92, ΔY = 29, Impact = 29 / 0.92 = 31
(Groups: Never Enroll, Enroll if Offered, Always Enroll)
57. 58
Case 4: Randomized Offering
Estimated Impact on Consumption (Y):
  Instrumental Variables Regression: 29.8**
  Instrumental Variables with Controls: 30.4**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
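A hedged sketch of the two-stage least squares idea behind these IV estimates, written out as two explicit OLS stages with statsmodels. The data file and column names (consumption, enrolled, offered) are hypothetical, and the naive second-stage standard errors from this manual approach are not valid, so a dedicated IV routine should be used for inference.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical household data: 'offered' is the randomized offer (instrument),
# 'enrolled' is actual take-up (treatment), 'consumption' is the outcome.
df = pd.read_csv("progresa_offering.csv")  # hypothetical file

# Stage 1: predict enrollment from the randomized offer.
stage1 = smf.ols("enrolled ~ offered", data=df).fit()
df["enrolled_hat"] = stage1.fittedvalues

# Stage 2: regress the outcome on predicted enrollment.
# The coefficient on enrolled_hat is the 2SLS / local average treatment effect.
stage2 = smf.ols("consumption ~ enrolled_hat", data=df).fit()
print(stage2.params["enrolled_hat"])

# Caveat: standard errors from this manual second stage are not correct;
# in practice use an IV estimator (e.g. linearmodels' IV2SLS) for inference.
```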
58. 59
Keep in Mind
Randomized Offering/Promotion
Don't exclude anyone but…
o Randomized promotion needs to be an effective promotion strategy (pilot test in advance!)
o The strategy depends on the success and validity of the offering/promotion.
o The promotion strategy will also help you understand how to increase enrollment, in addition to the impact of the program.
o The strategy estimates a local average treatment effect: the impact estimate is valid only for the "enroll only if offered/promoted" type of beneficiaries.
60. 61
Discontinuity Design
Many social programs select beneficiaries using an index or score:
o Anti-poverty programs: targeted to households below a given poverty index/income
o Pensions: targeted to the population above a certain age
o Education: scholarships targeted to students with high scores on a standardized test
o Agriculture: fertilizer programs targeted to small farms (less than a given number of hectares)
61. 62
Example: Effect of a fertilizer program on agricultural production
Goal: Improve agricultural production (rice yields) for small farmers
Intervention: Small farmers receive subsidies to purchase fertilizer
Method: Farms with a score (hectares of land) ≤ 50 are small; farms with a score > 50 are not small
64. 65
Case 5: Discontinuity Design
We have a continuous eligibility index with a defined cut-off
o Households with a score ≤ cutoff are eligible
o Households with a score > cutoff are not eligible
o Or vice-versa
Intuitive explanation of the method:
o Units just above the cut-off point are very similar to units just below it – good comparison.
o Compare outcomes Y for units just above and below the cut-off point.
65. 66
Case 5: Discontinuity Design
Eligibility for Progresa is based on a national poverty index
A household is poor if its score ≤ 750
Eligibility for Progresa:
o Eligible = 1 if score ≤ 750
o Eligible = 0 if score > 750
66. 67
Case 5: Discontinuity Design
Score vs. consumption at baseline – no treatment
[Figure: fitted values of consumption (≈154 to 379) against the poverty index (estimated targeting score, 276 to 1294) at baseline.]
67. 68
Case 5: Discontinuity Design
Score vs. consumption, post-intervention period – treatment
[Figure: fitted values of consumption (≈184 to 400) against the poverty index (estimated targeting score, 276 to 1294), showing a jump at the eligibility cutoff.]
Estimated impact on consumption (Y), Multivariate Linear Regression: 30.58**
(**) Significant at 1%
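A hedged sketch of a simple regression discontinuity estimate of this kind: restrict the sample to a bandwidth around the cutoff, center the score, and regress consumption on an eligibility dummy with separate slopes on each side. The cutoff of 750 comes from the slides; the data file, column names, and bandwidth are illustrative assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

CUTOFF = 750        # Progresa poverty-index cutoff (from the slides)
BANDWIDTH = 100     # illustrative bandwidth around the cutoff

df = pd.read_csv("progresa_rd.csv")  # hypothetical file with 'score', 'consumption'
df["eligible"] = (df["score"] <= CUTOFF).astype(int)
df["score_c"] = df["score"] - CUTOFF

# Local linear regression: the coefficient on 'eligible' is the estimated
# jump in consumption at the discontinuity.
local = df[df["score_c"].abs() <= BANDWIDTH]
rd = smf.ols("consumption ~ eligible + score_c + eligible:score_c",
             data=local).fit()
print(rd.params["eligible"])
```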
68. 69
Keep in Mind
Discontinuity Design
Discontinuity design requires a continuous eligibility criterion with a clear cut-off.
It gives an unbiased estimate of the treatment effect: observations just across the cut-off are good comparisons.
No need to exclude a group of eligible households/individuals from treatment.
Can sometimes be used for programs that are already ongoing.
69. 70
Keep in Mind
Discontinuity Design
Discontinuity design produces a local estimate:
o The effect of the program around the cut-off point/discontinuity.
o This is not always generalizable.
Power:
o Need many observations around the cut-off point.
Avoid mistakes in the statistical model: sometimes what looks like a discontinuity in the graph is something else.
71. 72
Matching
Idea: For each treated unit, pick the best comparison unit (match) from another data source.
How? Matches are selected on the basis of similarities in observed characteristics.
Issue? If there are unobservable characteristics and those unobservables influence participation: selection bias!
72. 73
Propensity-Score Matching (PSM)
Comparison group: non-participants with the same observable characteristics as participants.
In practice, this is very hard: there may be many important characteristics!
Solution proposed by Rosenbaum and Rubin: match on the basis of the "propensity score".
o Compute everyone's probability of participating, based on their observable characteristics.
o Choose matches that have the same probability of participation as the treatments.
See Appendix 2.
73. 74
Density of propensity scores
[Figure: densities of the propensity score (0 to 1) for participants and non-participants; the overlapping region is the common support.]
74. 75
Case 7: Progresa Matching (P-Score)
Probit Regression, Prob(Enrolled=1) – baseline characteristics and estimated coefficients:
Head's age (years): -0.022**
Spouse's age (years): -0.017**
Head's education (years): -0.059**
Spouse's education (years): -0.03**
Head is female=1: -0.067
Indigenous=1: 0.345**
Number of household members: 0.216**
Dirt floor=1: 0.676**
Bathroom=1: -0.197**
Hectares of Land: -0.042**
Distance to Hospital (km): 0.001*
Constant: 0.664**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
75. 76
Case 7: Progresa Common Support
[Figure: densities of Pr(Enrolled) for enrolled and non-enrolled households, showing the region of common support.]
76. 77
Case 7: Progresa Matching (P-Score)
Estimated Impact on Consumption (Y):
  Multivariate Linear Regression: 7.06+
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**). If significant at the 10% level, we label the impact with +.
77. 78
Keep in Mind: Matching
Matching requires large samples and good-quality data.
Matching at baseline can be very useful:
o Know the assignment rule and match based on it.
o Combine with other techniques (e.g. difference-in-differences).
Ex-post matching is risky:
o If there is no baseline, be careful!
o Matching on endogenous ex-post variables gives bad results.
78. 79
Progresa Policy Recommendation?
Impact of Progresa on Consumption (Y)
Case 1: Before & After                  34.28**
Case 2: Enrolled & Not Enrolled         -4.15
Case 3: Randomized Assignment           29.75**
Case 4: Randomized Offering             30.4**
Case 5: Discontinuity Design            30.58**
Case 6: Difference-in-Differences       25.53**
Case 7: Matching                        7.06+
Note: ** denotes statistical significance at the 1% level; + denotes significance at the 10% level.
79. 80
Appendix 2: Steps in Propensity Score Matching
1. Obtain a representative and highly comparable survey of non-participants and participants.
2. Pool the two samples and estimate a logit (or probit) model of program participation.
3. Restrict the samples to ensure common support (an important source of bias in observational studies).
4. For each participant, find a sample of non-participants with similar propensity scores.
5. Compare the outcome indicators. The difference is the estimated gain due to the program for that observation.
6. Calculate the mean of these individual gains to obtain the average overall gain.
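The steps above can be sketched in a few lines of Python. The example below is a minimal illustration of steps 2–6, assuming a pooled household file with an enrollment dummy, a handful of baseline covariates, and the consumption outcome; all file and column names are illustrative rather than the actual Progresa variables.

```python
# A minimal sketch of Appendix 2, steps 2-6, assuming a pandas DataFrame with
# an enrollment dummy (`enrolled`), baseline covariates, and the outcome
# (`consumption`). Names are illustrative, not the Progresa variable names.
import numpy as np
import pandas as pd
import statsmodels.api as sm

covariates = ["head_age", "head_education", "hh_members", "dirt_floor"]  # hypothetical

df = pd.read_csv("progresa_pooled.csv")  # hypothetical pooled sample (step 1)

# Step 2: estimate a probit model of participation and predict the propensity score.
X = sm.add_constant(df[covariates])
df["pscore"] = sm.Probit(df["enrolled"], X).fit(disp=False).predict(X)

# Step 3: restrict both samples to the region of common support.
treated = df[df["enrolled"] == 1]
controls = df[df["enrolled"] == 0]
lo = max(treated["pscore"].min(), controls["pscore"].min())
hi = min(treated["pscore"].max(), controls["pscore"].max())
treated = treated[treated["pscore"].between(lo, hi)]
controls = controls[controls["pscore"].between(lo, hi)]

# Steps 4-5: nearest-neighbour match on the propensity score, then take the
# outcome difference for each treated household.
gains = []
for _, row in treated.iterrows():
    match = controls.iloc[(controls["pscore"] - row["pscore"]).abs().argmin()]
    gains.append(row["consumption"] - match["consumption"])

# Step 6: the average of the individual gains is the estimated average impact.
print("Estimated average gain:", np.mean(gains))
```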
91. 92
Keep in Mind: Randomized Assignment
Randomized assignment with large enough samples produces 2 statistically equivalent groups: we have identified the “perfect clone”.
o Randomized beneficiary group
o Randomized comparison group
It is feasible for prospective evaluations with over-subscription/excess demand. Most pilots and new programs fall into this category.
92. 93
Randomized assignment with different benefit levels
Traditional impact evaluation question:
o What is the impact of a program on an outcome?
Other policy questions of interest:
o What is the optimal level for program benefits?
o What is the impact of a “higher-intensity” treatment compared to a “lower-intensity” treatment?
Randomized assignment with 2 levels of benefits: eligible units are randomly assigned to a Comparison group, a Low Benefit group, or a High Benefit group.
94. 95
Randomized assignment with multiple interventions
Other key policy questions for a program with various benefits:
o What is the impact of one intervention compared to another?
o Are there complementarities between the various interventions?
Randomized assignment with 2 benefit packages:

                         Intervention 1
                         Treatment    Comparison
Intervention 2
  Treatment              Group A      Group C
  Comparison             Group B      Group D
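A minimal sketch of how such a cross-cutting assignment could be generated in practice is shown below, assuming a hypothetical evaluation sample; the group labels follow the table above and the 50/50 split is an illustrative choice.

```python
# A minimal sketch of cross-cutting random assignment into the four groups
# (A-D) above, assuming a DataFrame of eligible evaluation-sample units.
# Names, sample size and the 50/50 split are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)   # fix the seed so the assignment is reproducible

df = pd.DataFrame({"household_id": range(1000)})   # hypothetical evaluation sample

# Randomize intervention 1 and intervention 2 independently of each other.
df["intervention_1"] = rng.integers(0, 2, size=len(df))   # 1 = treatment, 0 = comparison
df["intervention_2"] = rng.integers(0, 2, size=len(df))

# Label the four resulting cells to match the table above.
labels = {(1, 1): "A", (1, 0): "B", (0, 1): "C", (0, 0): "D"}
df["group"] = [labels[(i1, i2)] for i1, i2 in zip(df["intervention_1"], df["intervention_2"])]

print(df["group"].value_counts())
```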
95. 96
Randomized assignment with multiple interventions
[Figure: 1. Eligible population (eligible vs. ineligible units); 2. Evaluation sample; 3. Randomize intervention 1; 4. Randomize intervention 2.]
96. 97
Appendix 1: Two Stage Least Squares (2SLS)
Model with endogenous treatment (T):
$y = \beta_0 + \beta_1 T + \beta_2 x + \varepsilon$
Stage 1: Regress the endogenous variable on the instrument (Z) and the other exogenous regressors:
$T = \gamma_0 + \gamma_1 x + \gamma_2 Z + u$
Calculate the predicted value $\hat{T}$ (T hat) for each observation.
97. 98
Appendix 1: Two Stage Least Squares (2SLS)
Stage 2: Regress the outcome y on the predicted variable (and the other exogenous variables):
$y = \beta_0 + \beta_1 \hat{T} + \beta_2 x + \varepsilon$
The standard errors need to be corrected (they are based on $\hat{T}$ rather than T).
In practice, just use Stata’s ivreg command.
Intuition: T has been “cleaned” of its correlation with ε.
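The two stages can also be sketched directly. The example below is a minimal illustration in Python, assuming a hypothetical dataset with columns y, T, x and Z; as the slide notes, the second-stage standard errors from this manual approach are not corrected, so a dedicated IV routine (such as Stata’s ivreg) is preferable in practice.

```python
# A minimal sketch of the two 2SLS stages, assuming a hypothetical dataset
# with outcome y, endogenous treatment T, exogenous covariate x and
# instrument Z. Column and file names are illustrative.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("iv_data.csv")   # hypothetical dataset with columns y, T, x, Z

# Stage 1: regress the endogenous treatment T on the instrument Z and exogenous x.
X1 = sm.add_constant(df[["x", "Z"]])
stage1 = sm.OLS(df["T"], X1).fit()
df["T_hat"] = stage1.fittedvalues

# Stage 2: regress the outcome y on the predicted treatment T_hat and exogenous x.
# Note: the standard errors reported here are NOT corrected for the generated regressor.
X2 = sm.add_constant(df[["T_hat", "x"]])
stage2 = sm.OLS(df["y"], X2).fit()
print(stage2.params["T_hat"])     # 2SLS point estimate of the treatment effect
```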
98. 99
Outline: topics being covered
Tuesday - Session 1
INTRODUCTION AND OVERVIEW
1) Introduction
2) Why is evaluation valuable?
3) What makes a good evaluation?
4) How to implement an evaluation?
Wednesday - Session 2
EVALUATION DESIGN
5) Causal Inference
6) Choosing your IE method/design
7) Impact Evaluation Toolbox
Thursday - Session 3
SAMPLE DESIGN AND DATA COLLECTION
9) Sample Designs
10) Types of Error and Biases
11) Data Collection Plans
12) Data Collection Management
Friday - Session 4
INDICATORS & QUESTIONNAIRE DESIGN
1) Results chain/logic models
2) SMART indicators
3) Questionnaire Design
100. 101
This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., S. Martinez, P. Premand, L. B. Rawlings, and C. M. J. Vermeersch. 2010. Impact Evaluation in Practice: Ancillary Material. The World Bank, Washington, DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
MEASURING IMPACT
Impact Evaluation Methods for Policy Makers