Better evidence on what works and why, under what circumstances and at what cost.
Improve awareness about the kind of policies that are effective, and accountability.
Increase the likelihood that humanitarian interventions are able to contribute to reduced vulnerability and increased resilience.
Effective allocation of funds.
The working paper had a definition for natural disasters, not humanitarian crises. Do you like this def?
Also keeps with the GHA definition, which is: aid and action designed to save lives, alleviate suffering and protect human dignity in the aftermath of emergencies. It is usually short term, but in practice it is hard to say where ‘during and in the immediate aftermath of the emergency’ ends and when other assistance begins, especially in situations of prolonged vulnerability.
Definition from the School for a Culture of Peace (Escola de Cultura de Pau) in Spain: A humanitarian crisis is a situation in which there is an exceptional and generalized threat to human life, health or subsistence. These crises usually appear within the context of an existing situation of a lack of protection where a series of pre-existent factors (poverty, inequality, lack of access to basic services), exacerbated by a natural disaster or armed conflict, multiply the destructive effects.
Source: http://escolapau.uab.cat/img/programas/alerta/alerta/10/cap04i.pdf
By July 2012, 61 million people had already been affected.
Global Humanitarian Assistance. GHA Report 2012. Rep., 2012. <http://www.globalhumanitarianassistance.org/reports>
Ibid.
The Year That Shook the Rich: A review of natural disasters in 2011. The Brookings Institution – London School of Economics Project on Internal Displacement, March 2012.
1. OCHA report 2011
Cumulative economic cost was US$380 billion, making 2011 the most expensive year in history for natural disasters.
The Humanitarian Emergency Response Review (2011) noted the lack of impact evaluation of humanitarian initiatives, and recommended rigorous IEs.
Cite Alison Buttenheim’s working paper
This might not be relevant for this presentation
This might not be relevant for this presentation.
Proportionate disaster losses
Proportionate recovery of losses
Proportionate restoration to baseline
For example, two households both lose 3 water buffalo in a disaster.
- Relief efforts started immediately with the establishment of the Federal Relief Commission.
- A needs assessment was completed by the Asian Development Bank and the World Bank at the request of the Government of Pakistan. Data were collected and compiled from several sources, including sector-specific field assessments, desk reviews, aerial reconnaissance, site visits, and interviews.
- The needs assessment generated both a very detailed pre-earthquake profile of the affected areas and damage estimates.

Conversion from PKR to USD on 31 October 2005 (1 USD = 59.7 PKR):
- Direct damage: Rs. 135,146 million ($2,263.69 million), i.e. 2 Bn
- Indirect losses: Rs. 34,187 million ($572.63 million), i.e. 0.5 Bn
- Reconstruction costs: Rs. 208,091 million ($3,485.52 million), i.e. 3.5 Bn

Conversion from PKR to USD at today's rate (1 USD = 96.65 PKR):
- Direct damage: Rs. 135,146 million ($1,398.3 million)
- Indirect losses: Rs. 34,187 million ($353.7 million)
- Reconstruction costs: Rs. 208,091 million ($2,153.03 million)
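The currency conversions in these notes can be re-checked with a few lines of Python. This is only a sketch: the dictionary and function names are illustrative, and the amounts and exchange rates are the ones quoted in the notes. Results can differ from the quoted dollar figures by a few hundredths of a million, which would be expected if the notes used an unrounded rate.

```python
# Damage estimates from the needs assessment, in millions of PKR (as quoted in the notes).
ESTIMATES_PKR_MILLION = {
    "direct damage": 135_146,
    "indirect losses": 34_187,
    "reconstruction costs": 208_091,
}

# PKR per 1 USD, as quoted in the notes; the labels are illustrative.
RATES = {
    "31 Oct 2005": 59.7,
    "later rate": 96.65,
}


def pkr_to_usd(amount_pkr, pkr_per_usd):
    """Convert an amount in PKR to USD at the given rate (same units in and out)."""
    return amount_pkr / pkr_per_usd


converted = {
    label: {item: round(pkr_to_usd(amount, rate), 2)
            for item, amount in ESTIMATES_PKR_MILLION.items()}
    for label, rate in RATES.items()
}

for label, items in converted.items():
    for item, usd in items.items():
        print(f"{label}: {item} = ${usd:,.2f} million")
```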
Earthquake Reconstruction and Rehabilitation Authority (ERRA)
The official mission of ERRA was to coordinate reconstruction and recovery efforts in the affected areas and across the multitude of local, national and international government agencies and nongovernment organisations (NGOs) that were operating in the areas. ‘Build Back Better’

Evaluation Design
The unit of analysis is the household, and an initial sample size of 1,350 was chosen in order to detect changes at the 90 per cent confidence level. A two-stage cluster sample was drawn by first sampling 30 rural villages with probability proportionate to size from each of nine heavily affected districts in NWFP and AJK. Within each sampled village, five households were chosen at random from one randomly selected sub-division or neighbourhood.
The M&E wing of ERRA, with UK Department for International Development support, devised a survey instrument covering all the ERRA sector priorities. ‘Baseline’ or Round 1 data were collected between April and September 2008, almost three years after the earthquake.
The second round was collected in August-September 2009, with the sample expanded to 16 households per village and to urban blocks. However, it used a different survey: the Pakistan Social and Living Standards Measurement Survey (PSLM), which has been conducted annually since 2005.

Problems
1. Selection bias: This sampling strategy cannot account for households that were in the affected area prior to and during the earthquake, but then were not available for sampling during the post-intervention periods. These households may not have been available either because the entire household died or because they had left the region. In either case, this missing group is likely to be different from households that were sampled. In addition, the design is not explicit about how households were selected for interventions. For example, housing programmes that target the worst-hit households might have less impressive results compared to programmes that implicitly targeted less-vulnerable elites.
2. Information bias: Respondents may not accurately remember details about their housing, livelihoods, or schooling prior to the earthquake. If interviewed by officials associated with the recovery effort, respondents may report more or less favourable conditions for either time period depending on perceived interviewer expectations or future benefits. The extent of misreporting may also be correlated with the severity of earthquake exposure or damages incurred.
3. Contamination bias: The ERRA Social Impact Assessment study assumes that changes experienced by earthquake-affected households from baseline to post-intervention follow-up are attributable to the ERRA interventions. Without a comparison group, the assumption is a very strong one. What was the role of remittances, and of other unaffected households, for example?
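The two-stage cluster sample described in the ERRA evaluation design (30 villages per district selected with probability proportionate to size, then five households per village, across nine districts) can be sketched as follows. This is a minimal illustration, not ERRA's procedure: the village frame is made up, the function names are hypothetical, systematic selection is one of several ways to implement PPS, and the intermediate step of picking one sub-division per village is omitted.

```python
import random


def pps_systematic(sizes, n_select, rng):
    """Systematic probability-proportionate-to-size selection.

    sizes: dict mapping village name -> number of households (integers).
    A village larger than the sampling interval could be picked more than once.
    """
    total = sum(sizes.values())
    step = total // n_select            # sampling interval
    start = rng.randrange(step)         # random start within the first interval
    targets = [start + i * step for i in range(n_select)]
    selected, cumulative, t = [], 0, 0
    for name, size in sizes.items():
        cumulative += size
        while t < n_select and targets[t] < cumulative:
            selected.append(name)
            t += 1
    return selected


def draw_sample(districts, villages_per_district=30, households_per_village=5, seed=0):
    """Stage 1: PPS villages within each district. Stage 2: random households."""
    rng = random.Random(seed)
    sample = []
    for district, sizes in districts.items():
        for village in pps_systematic(sizes, villages_per_district, rng):
            for household in rng.sample(range(sizes[village]), households_per_village):
                sample.append((district, village, household))
    return sample


# Hypothetical sampling frame: nine districts, 200 villages each, 50 to 500 households.
frame_rng = random.Random(1)
districts = {
    f"district_{d}": {f"d{d}_village_{v}": frame_rng.randint(50, 500) for v in range(200)}
    for d in range(1, 10)
}

sample = draw_sample(districts)
print(len(sample))  # 9 districts x 30 villages x 5 households
```

With nine districts, 30 villages and five households per village, the design yields the 1,350 households mentioned in the notes.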
2009 study conducted by the World Bank
The evaluation sample consists of 126 villages randomly drawn from the 1998 population census list of villages in four earthquake-affected districts. Outcomes of interest at the household level include employment, consumption, nutrition, education status of children, mental health, and asset recovery. The study also hopes to link household data to administrative and bank records of cash transfers. At the school level, post-earthquake staffing, infrastructure, enrolment and test scores will be evaluated in both private and public schools.
The World Bank undertook a detailed census of 28,000 households in sampled villages in spring 2009, with a more extensive questionnaire administered to 25 per cent of households. A second round of data, including a detailed household survey of 2,500 randomly selected households, was fielded in fall 2009. School-based survey modules collect data on school facilities, enrolment, and child outcomes including cognitive and achievement testing.
Preliminary results on education indicate that school interruption was significant: almost four months for young children and more than five months for older children. The proportion of schools destroyed by the earthquake increased sharply as distance from the fault line decreased, but when distance to the fault line is controlled for, private schools appeared to sustain less damage than public schools. Consequently, public schools witnessed a decrease in school enrolment from pre-earthquake to post-earthquake periods, while private schools increased their enrolments.

Problems
The study is not without limitations: it will offer little insight into the design of optimal reconstruction programmes. It is not clear how relevant or generalisable the estimation strategy will be beyond the Pakistan situation, and therefore how applicable the findings will be to other post-disaster settings.
Facing these problems, Alison Buttenheim sets up a hypothetical design for the evaluation of the Pakistan earthquake.
Because it’s hypothetical, she assumes that the immediate post-disaster period (2-3) was devoted solely to
The design picks up after rescue efforts are completed; relief provisions are well underway and recovery programs are being planned.
1. Identify the long-term household-level outcomes of interest. It is important to clarify exactly what questions an impact evaluation is designed to answer. It was evident immediately after the earthquake that housing, health facilities, school facilities, government buildings, and infrastructure for WATSAN, power, telecom, and transit were all seriously compromised. ERRA’s sectoral approach to the design, delivery and evaluation of recovery programs reflects this, as does the list of household and community outcomes developed by ERRA and reproduced above. While ERRA’s list served process outcomes and monitoring well, a focused list of household-level outcomes and indicators will guide the impact evaluation process.
Source: Based on Pakistan Social and Living Standards Measurement Survey (2004-05), 2005 and ERRA Social Impact Assessment, 2009.
2. Obtain a pre-earthquake area-representative household sample. An obvious shortcoming of both the ERRA Social Impact Assessment and the World Bank evaluation is the lack of a pre-disaster, population-representative sample. An evaluation design that compares post-intervention welfare to a pre-disaster point in time requires a pre-disaster observation, preferably collected pre-disaster rather than retroactively. In the case of Pakistan, the 2004-05 Pakistan Social and Living Standards Measurement Survey (PSLM) is a good candidate for such a sample. Interviews took place between September 2004 and March 2005, or 7-13 months prior to the earthquake. The sample includes 1,080 households in Mansehra and Abbottabad districts, some of which were affected by the earthquake and some of which were not. (The sample also includes 1,322 households in AJ&K, but it is not clear how many, if any, of these households were located in earthquake-affected areas.) In our hypothetical study, this sample becomes the baseline observation for affected and unaffected areas.
3. Collect data on the pre-earthquake sample immediately post-earthquake. Needs assessments of affected populations are often undertaken in the immediate aftermath of a disaster. In Pakistan, several needs assessments were done, including a wide-ranging village-level needs assessment by the GoP and World Bank. Of course, the focus of these needs assessments was the correct targeting and provision of relief. For long-term impact evaluation purposes, observing the sample immediately after the disaster is also very helpful. In our hypothetical design, the PSLM sample identified in Step 2 is located and briefly re-interviewed in January-March 2006. If necessary, the sample is expanded to provide sufficient power for evaluation analyses. Because the sample includes households in affected and unaffected areas, this post-disaster surveying will also include unaffected areas. This round of data collection can serve multiple purposes in addition to serving as a second “baseline” of sorts for impact evaluation of future recovery efforts: it can provide accurate estimates of disaster-related mortality and morbidity and post-disaster outmigration; assess the adequacy of relief efforts; identify priorities for recovery programs; and reveal intentional and natural variations in recovery interventions that can be exploited for evaluation purposes. For example, the World Bank study above leverages the household size eligibility requirement for livelihood grants, and the variation in the agency providing housing reconstruction grants.
4. Design interventions for staged roll out or other variations. Experimental designs are a controversial aspect of humanitarian aid provision, and may not be appropriate in the emergency or relief phase of post-disaster aid provision. However, recovery programs that unfold over many months or years are better suited to an experimental design. They are particularly useful when there is a lack of consensus about the best way to deliver an intervention (e.g., how large should livelihood cash grants be and when should they be distributed? How much sweat equity should be required of homeowners during housing reconstruction? Should school reconstruction prioritize the rebuilding of primary or secondary schools?). Testing competing interventions in an experimental design can provide strong evidence about best practices in humanitarian aid that can guide future post-disaster interventions in other settings. In the examples above, beneficiaries are not deprived of life-saving resources, but instead may receive a different form of a benefit than beneficiaries in a neighboring village or district. In our hypothetical study design, ERRA identifies a set of outstanding debates about intervention theory, design or delivery, and plans sectoral programming to include experimental conditions.
TEC; Rwanda messaging: Different groups listened to different things (Paluck 2009); reconciliation messaging in post-conflict 2004. NGO used it. 12 communities, matched pairs. New Dawn. But preceded by survey-based measures, focus groups, and observed behavioral measures.
Sierra Leone: paired matching from 236 communities (Go Bifo). CDD. The study found that the intervention had mixed impacts. While the intervention succeeded in delivering material benefits to the beneficiary communities, impacts on collective action and inclusion of marginalized groups were not significant. That is, from the surveys, focus groups, and activities, the treated communities did not exhibit significantly better collective action capacity or tendency to include marginalized groups in decision making or sharing of benefits.
Aceh: paired matches BUT with a clear lack of balance, since selected towns had to have administrative capacity. Barron et al.
Burundi: ex-combatants, phased roll out. WB study. Reintegration program. 23,000 ex-combatants. Gilligan et al.
What else should I add on this page?
Evaluating impact of humanitarian action: a science or an art (Jo Puri, 3iE)
Evaluating impact of Humanitarian Action: A science and an art?
Jo (Jyotsna) Puri
Head of Evaluation, Deputy Executive Director, 3ie
www.3ieimpact.org
Group Exercise I
Let's write down a definition.
WHAT IS AN IMPACT EVALUATION?
What is impact evaluation?
Impact evaluations answer the question about the extent to which the intervention being evaluated altered the state of the world
= the (outcome) indicator with the intervention (we can see this) compared to what it would have been in the absence of the intervention (but we can’t see this, so we use a comparison group)
= Yt(1) – Yt(0)
Starting with a theory of change
[Figure: causal chain for a cash transfer programme. Boxes: cash transfers designed; money is sent; people record high levels of satisfaction; households are targeted; increased purchasing power; improved livelihood indicators. Assumptions noted on the chain: behavioral attributes ensure correct spending; access to markets; households correctly id-ed; no leakage.]
Group exercise question II
• What is the theory of change/causal chain for the project that you are interested in?
• Write one outcome that is important.
• What were the assumptions and risks in various stages?
The counterfactual
Outcome monitoring

               Before   After
Intervention     40       92
Control                   84

Before vs. after (single difference) = 92 - 40 = 52 (outcome monitoring)
Post-treatment comparison = 92 - 84 = 8
The counterfactual
Outcome monitoring

               Before   After
Intervention     40       92
Control          26       84

Before vs. after (single difference) = 92 - 40 = 52 (outcome monitoring)
Post-treatment comparison = 92 - 84 = 8
Double difference = (92 - 40) - (84 - 26) = -6
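The three comparisons on this slide can be reproduced in a few lines of Python. This is a small sketch using the slide's own numbers; the function and variable names are illustrative.

```python
def single_difference(before, after):
    """Before vs. after for the intervention group only (outcome monitoring)."""
    return after - before


def ex_post_difference(treated_after, control_after):
    """Post-treatment comparison between intervention and control groups."""
    return treated_after - control_after


def double_difference(treated_before, treated_after, control_before, control_after):
    """Change in the intervention group minus change in the control group."""
    return (treated_after - treated_before) - (control_after - control_before)


# Numbers taken from the slide.
outcomes = {
    "intervention": {"before": 40, "after": 92},
    "control": {"before": 26, "after": 84},
}
t, c = outcomes["intervention"], outcomes["control"]

before_after = single_difference(t["before"], t["after"])
ex_post = ex_post_difference(t["after"], c["after"])
did = double_difference(t["before"], t["after"], c["before"], c["after"])

print(before_after, ex_post, did)  # 52 8 -6
```

The sign flips once the control group's own change (84 - 26 = 58) is taken into account, which is why the before-after figure alone overstates the effect in this example.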
Group Exercise III• Write a matrix for the outcome you are interested in examining.• Write (hypothetical) numbers in the matrix.• Calculate the following: – Single difference – Single ex post difference – Double difference
Overall Aim: Improve lives• Evidence on what works and why, how;• Improve awareness and accountability on impact and process;• Effective allocation of funds;• Increase likelihood that humanitarian interventions are effective and efficient;
The counterfactual
[Figure: outcome plotted against time, showing the factual and the counterfactual paths]
The essence of large n design
[Table: rows Project and Comparison; columns Before and After]
Large n• n is the number of units of assignment, e.g. schools, villages, sub-districts (the unit of assignment can be different from the unit of analysis)• If n is large then we create treatment (project) and comparison groups which are identical prior to the intervention… – And use statistical analysis to assess post-intervention differences between treatment and comparison: we say these differences are caused by the intervention
Impact evaluations and policies for development assistance
• Efficacy: Does it work in laboratory conditions?
• Would it have happened anyway?
• Did the program cause the change?
• If the program caused the effect, how much was the effect?
• Are there other ways, that are cheaper, to get the same impact?
Theory of change; Counterfactual; Mixed methods; Outcome variables
Internal validity: power, sample size, spillovers, John Henry effects
External validity: heterogeneity, representativeness, context
Working definition: Humanitarian action
Response to an emergency, to protect human life, health and subsistence. The emergency can be the consequence of a natural disaster or a conflict.
- Slow onset disasters
- Short term or longer term
Impact evaluations and policies for humanitarian assistance
• Who lost most? Who recovered best?
• Unintended consequences.
• Was it timely? How/was adequate coverage ensured?
• Did the program increase the resilience of populations?
• Are there other ways, that are cheaper, to get the same impact? (cost-effectiveness)
• Did the affected population recover?
Real-time Evaluations vs Impact Evaluations

Real-time evaluations
• Evaluates processes
• Focuses on development and implementation of the program
• Examines whether targets were met
• Is cheap (?)

Impact evaluations
• Includes real time evaluations
• Measures net change in welfare levels; measurement biases.
• Expensive but low cost evaluations too.
• Robust evidence (relief, recovery, resilience) unintended
• Controversial
• Vulnerable populations
• Long term policy
Humanitarian Crisis: Heterogeneity of impacts
[Figure labels: NATURAL DISASTER; ARMED CONFLICT; POVERTY; SOCIAL INEQUALITY; POOR GOVERNANCE; STATE FRAGILITY; FOOD INSECURITY; WELL BEING OUTCOMES]
CLEAR AND PRESENT NEED FOR IMPACT EVALUATIONS IN HUMANITARIAN ASSISTANCE
Some facts• In 2011, 62 million people were affected by crises across the world• Natural disasters, alone, killed almost 26,000 people.
Humanitarian Window – a need. • “Understanding the impact of humanitarian assistance is another area where much work is needed….Linking impact measurement and accountability better to the funds agencies receive is a key recommendation of this review.”
Need for impact evaluations• There is a big gap between the requirement and availability of funds. – In 2011, shortage of funds amounted to $3.4 billion
Critical that we know the efficiency and effectiveness of interventions
Humanitarian vs. Development IEs
Humanitarian interventions are more complex to evaluate than development interventions.

Development Evaluations
• Selection bias
• Fragile states and vulnerable populations
• Multiple concurrent interventions
• Inadequate data
• Difference in resources and need

Humanitarian Evaluations (all development evaluations; plus)
• Rapid onset
• High covariance
• Disrupted communities
• Absence of baseline data
• Difficulty in counterfactual selection
Single difference
[Figure: timeline running through Baseline (t-1), Emergency (t0), Relief (t1) and Recovery (t2), marking disaster related losses, recovery from disaster, persistence of recovery, restoration to baseline, sustained recovery, and sustained restoration to baseline]
Proportionate changes
Proportionate disaster loss
Also called “vulnerability of exposure to uninsured risk”
Proportionate changes
Why is it important? Heterogeneity!!

Case 1                            Case 2
Baseline = 3 buffalos             Baseline = 6 buffalos
100% loss of large livestock      50% loss of large livestock
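The point of the two cases can be made concrete with a short calculation. The sketch below assumes, as in the speaker notes, that both households lose 3 water buffalo; the function name is illustrative.

```python
def proportionate_loss(baseline, lost):
    """Share of the baseline asset stock lost in the disaster."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return lost / baseline


# Baseline herd sizes from the slide; both households lose 3 buffalo.
baseline_buffalo = {"Case 1": 3, "Case 2": 6}
buffalo_lost = 3

losses = {case: proportionate_loss(baseline, buffalo_lost)
          for case, baseline in baseline_buffalo.items()}

for case, share in losses.items():
    print(f"{case}: {share:.0%} loss of large livestock")
```

The absolute loss is identical in both cases, but the proportionate loss is twice as large for the household that started with less.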
Pakistan Earthquake 2005: Background • Struck on 8th October, 2005 • 7.6 on the Richter scale • Immediate toll – • 73,000 deaths • 128,000 injured • 600,000 houses destroyed • Estimated damages were US $5.8 billion
Pakistan Earthquake 2005: ERRA evaluation• The ERRA was set up: primary responsibility for allocating reconstruction funds;• ERRA undertook a “social impact assessment”• Conducted a pre-/post assessment (no counterfactual)• PROBLEMS? • Selection Bias • Information Bias • Contamination Bias
Pakistan Earthquake 2005: World Bank Evaluation • They compare recovery in villages that were more vs. less affected (use as counterfactual) • The evaluation focused on • Recovery for households and educational facilities • Access/quality of schooling • Effects of grants • Has limitations, should complement ERRA study
Suggested steps – Looking back Immediately after rescue efforts -1. Identify the long term household-level outcomes of interest • Clarify what questions an IE is designed to answer • Create a focused list of outcomes to guide the evaluation. For example (next slide)
Impact Evaluation: Outcome indicators for hypothetical evaluation design
Education:
· Net and gross enrollment rates at primary, middle and matric levels
Health:
· Infant/child immunization coverage
· Diarrheal prevalence, last 30 days, children under 5
· Provider consultation and treatment rates for recent illness/injury
· % of women with recent birth receiving tetanus toxoid injection
· Skilled attendance and location of childbirth
Housing, water supply and sanitation:
· Roof and wall materials
· Number of rooms
· Source of drinking water
· Type of toilet
Household perception of economic situation and satisfaction with facilities and service use:
· Perception of economic situation of household compared to one year ago
· Perception of economic situation of community compared to one year ago
· Satisfaction with local services: basic health unit, family planning services, school, veterinary hospital, agricultural extension, and police
Suggested steps – Looking back Immediately after rescue efforts -2. Obtain a pre-earthquake area-representative household sample • In Pak, a good measure would have been the 2004-05 PSLM • Importance of baseline data
Suggested steps – Looking back Immediately after rescue efforts -3. Collect data on the pre-earthquake sample immediately post-earthquake • Use the PSLM and re-interview households • Expand sample if necessary • Post-disaster surveying includes unaffected households as well.
Suggested steps – Looking back Immediately after rescue efforts -4. Design interventions for staged roll out or other variations • Provides a counterfactual • Allows comparing two or more interventions, and understanding best practices • Ethical
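A staged roll out of the kind suggested in step 4 can be set up by randomly ordering the units and assigning them to phases, so that units in later phases serve as the comparison group for earlier ones until they are treated. The sketch below is only illustrative: the village names, the number of phases and the function name are all assumptions.

```python
import random
from collections import Counter


def assign_rollout_phases(units, n_phases, seed=0):
    """Randomly assign each unit to a roll-out phase (1..n_phases), as evenly as possible."""
    rng = random.Random(seed)       # fixed seed so the assignment can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: (i % n_phases) + 1 for i, unit in enumerate(shuffled)}


# Hypothetical list of 30 villages receiving the same recovery programme in 3 phases.
villages = [f"village_{i:02d}" for i in range(1, 31)]
phases = assign_rollout_phases(villages, n_phases=3)

print(Counter(phases.values()))
```

Every village receives the programme eventually; only the timing is randomized.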
Some variations• Theory of change (without contamination bias)• Basic care package – Factorial design = A vs. A+B vs. A+C• Cluster randomized designs can help determine the effectiveness of packages• Examples – Rwanda – messaging – Sierra Leone – Paired matching – Aceh – Documentation documentation! – Burundi – Phased roll out amongst ex-combatants.
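Paired matching, mentioned in the examples above, can be sketched as: rank the communities on a baseline characteristic, pair neighbours in the ranking, and randomize treatment within each pair. The communities, the baseline scores and the function name below are hypothetical, and matching on a single score is a simplification of what a real study would do.

```python
import random


def matched_pair_assignment(baseline, seed=0):
    """Pair units with similar baseline scores and randomize treatment within each pair."""
    if len(baseline) % 2:
        raise ValueError("need an even number of units to form pairs")
    rng = random.Random(seed)
    ranked = sorted(baseline, key=baseline.get)    # order units by baseline score
    assignment, pairs = {}, []
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        pairs.append(tuple(pair))
        treated = rng.choice(pair)                 # coin flip within the pair
        for unit in pair:
            assignment[unit] = "treatment" if unit == treated else "comparison"
    return assignment, pairs


# Hypothetical baseline scores for twelve communities.
baseline_scores = {
    "community_01": 0.42, "community_02": 0.77, "community_03": 0.15,
    "community_04": 0.58, "community_05": 0.61, "community_06": 0.19,
    "community_07": 0.83, "community_08": 0.40, "community_09": 0.26,
    "community_10": 0.69, "community_11": 0.31, "community_12": 0.90,
}

assignment, pairs = matched_pair_assignment(baseline_scores)
for a, b in pairs:
    print(a, assignment[a], "|", b, assignment[b])
```

Because each pair contributes one treated and one comparison unit, the two groups are balanced on the matching score by construction; balance on anything else still has to be checked.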
Conclusions
IEs• Provide insight regarding the losses resulting from an emergency and compare them with a baseline;• Test innovative programmes in real-life situations;• See what difference assistance has made;• Examine whether recovery is sustained;• Examine the cost-effectiveness of interventions.