Criteria
• Comprehensive description or guidance for the rationale, theory and practice of real-time evaluation.
• Discernable influence upon the subsequent development of real-time evaluation.
• Consolidation of practice wisdom or empirical research.
Of the 56 reports identified, 47 described operations in single countries. Nine of the reports spanned two or more countries. The countries with the most studied programmes were Pakistan (n=8) and Haiti (n=6). Together these countries represented 25% of the sample. All six of the Haiti evaluations (from five different agencies) related to the 2010 earthquake. The Pakistan evaluations, from four different agencies, examined the earthquake, floods and civil conflict. Conflict (n=16) was the single most common setting among the evaluations studied, with a further three reports on situations of violence. Fifty-nine per cent (n=33) related to natural disasters: cyclone, drought, earthquake, epidemic, flood or tsunami. Among these, floods (n=8) were the most studied response; five of these reports related to floods in Pakistan.
Overall, yes: the proxy indicators show a modest upward trend in the similarity between theory and practice, but there is still a wide distribution and a lot of variance between actors. We would expect this, as theory should inform practice and vice versa. Indeed, many of those writing the guiding texts are also producing the case examples.
Measurable proxy indicators of similarity between theory and practice:
• Does the report describe the methods used to carry out the evaluation?
• Does the report have 5-10 recommendations?
• Is there an inception report included in the report?
• Does the report include a matrix of evidence?
• Does the report include a matrix of recommendations?
• Does the report refer to methods used to triangulate and validate data?
• Is there a timeline of events related to the humanitarian emergency?
• Does the report refer to consultation or data collection with beneficiaries?
• Does the report refer to group interviews with the affected population?
• Does the evaluation team include 1-4 evaluators?
• Is the final report (excluding annexes) 15-40 pages?
• Does the evaluation team spend 7-21 days in the field?
• Does the report mention a results workshop in the field?
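The proxy indicators above amount to a yes/no checklist, so a report's similarity to the espoused theory can be expressed as the number of indicators it meets. The following is a minimal sketch in Python, assuming each indicator is recorded as a simple boolean; the field names and the example report are hypothetical and are not taken from the thesis data.

```python
from dataclasses import dataclass, fields


@dataclass
class ReportChecklist:
    """Yes/no answers to the proxy indicators (field names are hypothetical)."""
    describes_methods: bool
    has_5_to_10_recommendations: bool
    includes_inception_report: bool
    includes_matrix_of_evidence: bool
    includes_matrix_of_recommendations: bool
    describes_triangulation: bool
    includes_timeline: bool
    consulted_beneficiaries: bool
    held_group_interviews: bool
    team_of_1_to_4_evaluators: bool
    report_15_to_40_pages: bool
    field_time_7_to_21_days: bool
    held_field_workshop: bool


def fidelity_score(report: ReportChecklist) -> tuple[int, int]:
    """Return (indicators met, indicators assessed) for one report."""
    answers = [getattr(report, f.name) for f in fields(report)]
    return sum(answers), len(answers)


# Hypothetical report, invented purely for illustration.
example = ReportChecklist(
    describes_methods=True,
    has_5_to_10_recommendations=False,
    includes_inception_report=False,
    includes_matrix_of_evidence=False,
    includes_matrix_of_recommendations=True,
    describes_triangulation=False,
    includes_timeline=True,
    consulted_beneficiaries=True,
    held_group_interviews=True,
    team_of_1_to_4_evaluators=True,
    report_15_to_40_pages=False,
    field_time_7_to_21_days=True,
    held_field_workshop=False,
)

met, assessed = fidelity_score(example)
print(f"{met} of {assessed} indicators met ({met / assessed:.0%})")
```

Keeping the indicators as named fields, rather than a bare list of booleans, makes it harder to misalign an answer with its question when coding many reports by hand.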
Is RTE new?
This is an important question, as we have an approach called “real-time evaluation” that has emerged seemingly out of thin air. Many writers claim that RTE is innovative. Is this a new approach, or is it an existing approach with a new name? Writers on RTE point to formative and developmental evaluation. In some ways RTE is no different to other rushed field reviews. The only difference is the timing.
Is RTE evaluation?
It is important to ask this question: RTE doesn’t conform to the strict old-fashioned notion of evaluation as a “systematic investigation of the worth or merit of an object”.* Here, different schools emerge. There are those who see evaluation as satisfying a wide range of uses, and others who see this as pseudoevaluation: a guiding hand from a technical expert, an organizational development consultant’s role and not much of an evaluation at all.
* Joint Committee on Standards for Educational Evaluation. (1994). The program evaluation standards: How to assess evaluations of educational programs (2nd ed.). Thousand Oaks, CA: Sage.
1st Stage: methods and evidence
2nd Stage: relationships and utilization
3rd Stage: long-range learning plus utilization
Methods: truth and evidence
Valuing: establishing the value of a program approach
Use: instrumental or conceptual use by stakeholders
But lacks some elements of Quinn Patton’s approach:
• Establishing a leadership group among users to determine outcomes.
• Training staff on evaluation approaches and obtaining their buy-in.
• Sharing design decisions with a working team.
• Piloting data collection tools, running mock scenarios of findings.
• Involving staff in data interpretation and analysis.
To be distinguished from formative evaluation:
• The process of improving and preparing a program for post-hoc, summative evaluation (Patton, 2008).
• Can avert the difficulties that often plague evaluations (Scriven, 1991).
In developmental evaluation, the program never becomes fixed.
Real time evaluation in theory and practice
Real-time evaluation in theory and practice
Jessica Letch, University of Melbourne
Thesis: Master of Assessment and Evaluation
Overview
• Real-time evaluation
• Questions and rationale
• Methodology
• Logic and fidelity
• RTE in theoretical context
• Suggestions
Origins of RTE
• Pivotal conflicts: Persian Gulf Crisis, Rwanda, Kosovo
• Early agencies: UNHCR, World Bank, Danida
• Humanitarian reform: OCHA, Reliefweb, IASC, ALNAP
Rationale
Humanitarian evaluation “tends to mirror humanitarian practice – it is often rushed, heavily dependent on the skills of its key protagonists, ignores local capacity, is top-down, and keeps an eye on the media and public relations implications of findings” - Feinstein & Beck (2006)
Rationale
“atheoretical and method-driven”: a less thoughtful and rigorous cousin of mainstream evaluation. - Feinstein & Beck (2006)
The ‘wild west’ of evaluation - AES conference Canberra, 2009
The centrality of theory
Without a strong theoretical base, “we are no different from the legions of others who also market themselves as evaluators today” - Shadish (1998)
Aid evaluation emergent
At the time of the Rwanda evaluation, “there were no manuals, guidelines or good practice notes to follow on evaluating humanitarian action” - Beck (2006)
The value of research on evaluation
Rigorous and systematic study “can provide essential information in the development of an evidence base for a theoretically rooted evaluation practice, as well as provide the evidentiary base for the development of practice-based theory” - Miller (2010)
Research questions
1. What is the conceptual logic behind real-time evaluation?
2. How is real-time evaluation applied in practice?
3. How can the theory and practice of real-time evaluation be strengthened?
Methodology
Drawn from:
• Miller and Campbell (2006): multistage sampling approach; examination of fidelity between theory and practice
• Hansen et al. (in press): logic modeling from coding framework
Espoused theory
Six items of literature:
• Broughton, WFP (2001)
• Jamal and Crisp, UNHCR (2002)
• Sandison, UNICEF (2003)
• Cosgrave, Ramalingham & Beck (2009)
• Walden, Scott & Lakeman (2010)
• Brusset, Cosgrave, MacDonald, ALNAP (2010)
Logic of theory
Context
• 4-12 weeks into large, rapid onset single or multi-agency program.
• May be concerns about country team performance.
• Internal and external stakeholders, under great pressure.
• HQ and donors need information.
• 1-4 highly skilled evaluators with a supportive manner, diverse backgrounds.
Activities
• Realistic design and flexible plans. Timeliness is vital. May utilize a series of visits.
• Planning with stakeholders essential.
• Evaluator's role is impartial outsider, learning advisor and facilitator. Must collate credible information.
• Data collection via semi-structured interviews, observation with purposive sampling. Reflection workshops, focus groups, limited document analysis.
• Analysis concurrent with data collection, with input from country team via workshop.
• Stakeholder participation: country team involvement crucial, beneficiary engagement strongly encouraged. Management responds to findings.
• Multiple methods of report generation. Written reports secondary to briefings and workshops. Rapid dissemination. Use of linking tool.
Consequences / effects
• Evaluation is timely, credible and responsive to information needs.
• Evaluator credibility from effective planning.
• Stronger organizational capacity and decision-making in situ, M&E systems and institutional learning.
• Learning, reflection and improved morale.
• Immediate instrumental use and short-term improvements.
• Stronger understanding of context. Better guidelines and policy. Greater transparency.
• Improved outcomes for survivors of humanitarian emergencies.
Assumptions
• Emergency response is difficult to monitor and evaluate.
• RTE is more interactive than standard humanitarian evaluation.
• Utilization is increased with staff engagement.
• Organizational change is slow.
External factors
• Complex and difficult programming environment, subject to rapid changes.
• Time, logistics and security constraints.
Logic of practice
Figure 1: Logic model for real-time evaluation from RTE reports
Context
• Large-scale single- or multi-agency response to sudden-onset or rapidly deteriorating crises. Soon after establishment phase.
• Learning opportunities present.
• Information needed for field personnel, external stakeholders and future response.
• Primary stakeholders are agency staff and their partners. Will be under pressure.
• 3-4 internal or external evaluators with sectoral and management expertise and balanced profiles.
Activities
• Planning via a reference group and pre-RTE field mission. Terms of reference set, then evaluators develop design after initial data collection and consultation.
• Single or multi-phase design with light footprint. 11-21 field days. 36-111 informants, 40-133 beneficiaries.
• Evaluator provides support, guidance, outside perspective and accountability.
• Personnel, beneficiaries and external stakeholders provide data. Formal management response.
• Data collection: semi-structured interviews, primarily with field personnel. Documents, field visits and focus groups.
• Findings and recommendations are reviewed via reflection workshops with field teams.
• Multiple reporting methods including oral presentations at field and headquarters. 19-43 page reports.
Consequences / effects
• Findings depend upon recollections of informants, triangulated.
• Evaluator credibility established through transparency and meta-evaluation.
• Organizational capacity enhanced via reflection, communication and chronicling events.
• RTE promotes staff reflection, communication, coordination, learning and accountability.
• Detailed action plans established with the organizational ownership of teams. Broader policy development.
• Improved outcomes for survivors of humanitarian emergencies.
Assumptions
• Early deployment leads to influence.
• Ownership and participation facilitate utilization.
• RTEs measure opinions but not impact.
• RTEs will result in important lessons learned.
External factors
• Frequent changes, rapid turnover of staff and competing field missions.
• Sensitive political environment, security threats.
• Compromised infrastructure, living and working conditions.
• Remote travel required.
• Visas and recruitment cause delays.
Contrasts in logic models
Impetus
• Theory: Concerns about programme performance
• Practice: Silent on these concerns
Evaluator
• Theory: Agency knowledge
• Practice: External evaluators with sectoral expertise and diverse backgrounds
Planning
• Theory: Field based planning
• Practice: Reference groups
Contrasts in logic models
Stakeholders
• Theory: Field and management response.
• Practice: More optimistic picture of beneficiary consultation.
Credibility
• Theory: Effective planning.
• Practice: Relationships, transparency, meta-evaluation.
Organizational capacity
• Theory: Establishing M&E systems.
Contrasts in logic models
Process use
• Theory: Learning
• Practice: Communication and coordination.
Utilization
• Theory: Understanding at headquarters.
• Practice: Field team ownership and action plans.
Constraints
• Practice: Political environment.
Contrasts in logic models
Assumptions
• Theory: Modest expectations of organizational change.
• Practice: The importance of lessons learned.
Overall
• Practice has a stronger emphasis on bottom-up influence and approaches.
Change in scores pre- and post-ALNAP guide
Element                        Pre-March 2009      Post-March 2009     % change
Median no. of beneficiaries    40 beneficiaries    136 beneficiaries   240%
Matrix of recommendations      9%                  29%                 222%
5 to 10 recommendations        16% (average 24)    33% (average 15)    106%
Inception report included      9%                  17%                 89%
List of informants             41%                 63%                 54%
Group interviews               50%                 67%                 34%
Change in scores pre- and post-ALNAP guide
Element                        Pre-March 2009          Post-March 2009         % change
Workshop in field              59%                     75%                     27%
Average fidelity score         45% (6.25 out of 14)    53% (7.38 out of 14)    18%
Beneficiary consultation       78%                     88%                     13%
1 to 4 evaluators              81%                     88%                     9%
7 to 21 days in field          50% (median 13 days)    54% (median 16 days)    8%
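The fidelity figures in this table can be reproduced from the raw scores. The sketch below is in Python and assumes that the percentage score is the raw score divided by 14, and that the "% change" column is the relative change from the pre-March 2009 value to the post-March 2009 value. Both readings are inferred from the numbers shown rather than stated on the slide, and the function names are illustrative.

```python
def as_share(score: float, maximum: float) -> float:
    """Express a raw score as a percentage of the maximum possible score."""
    return 100 * score / maximum


def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`, relative to `before`."""
    if before == 0:
        # A zero starting value has no defined relative change.
        raise ValueError("relative change is undefined when the starting value is zero")
    return 100 * (after - before) / before


pre_share = as_share(6.25, 14)        # about 44.6, reported as 45%
post_share = as_share(7.38, 14)       # about 52.7, reported as 53%
change = relative_change(6.25, 7.38)  # about 18.1, reported as 18%

print(round(pre_share), round(post_share), round(change))
```

The same `relative_change` function applied to the other rows gives the reported values after rounding, for example 59% to 75% for the field workshop (about 27%).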
Change in scores pre- and post-ALNAP guide
Element                                  Pre-March 2009            Post-March 2009           % change
Describes methods                        97%                       92%                       -5%
Describes triangulation and validity     38%                       33%                       -13%
Timeline                                 38%                       33%                       -13%
Report 15 to 40 pages                    59% (average 30 pages)    46% (average 38 pages)    -22%
Matrix of evidence                       0%                        21%                       N/A
Median change (all scores): 18%
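The median change can be checked against the "% change" figures on the three "Change in scores" slides. The Python sketch below assumes the median is taken over every row with a defined change, which excludes the matrix of evidence (N/A because its starting value is 0%). That reading is an assumption, and the dictionary keys are shorthand labels rather than the slide wording.

```python
from statistics import median

# "% change" values as shown on the three slides; None marks the N/A row.
changes = {
    "median beneficiaries": 240,
    "matrix of recommendations": 222,
    "5 to 10 recommendations": 106,
    "inception report included": 89,
    "list of informants": 54,
    "group": 34,
    "workshop in field": 27,
    "average fidelity score": 18,
    "beneficiary consultation": 13,
    "1 to 4 evaluators": 9,
    "7 to 21 days in field": 8,
    "describes methods": -5,
    "triangulation and validity": -13,
    "timeline": -13,
    "report 15 to 40 pages": -22,
    "matrix of evidence": None,
}

# Drop rows without a defined change before taking the median.
defined = [value for value in changes.values() if value is not None]
median_change = median(defined)

print(f"{len(defined)} rows, median change {median_change}%")
```

Under that assumption the middle value of the 15 defined changes matches the 18% reported on the slide.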
Highest fidelity scores
Found in Humanitarian Accountability Project, IASC and among external evaluators.
Many of these evaluators also contributed to the literature on RTE.
Lowest scores
Found in mixed and internal teams, multi-country evaluations.
Some reports seem to be labeled RTE simply for the cachet of the term.
In theory
Is RTE new? Though described as innovative, it has many antecedents outside the humanitarian field.
Is RTE evaluation at all? ‘Purists’ would argue that it’s pseudoevaluation. RTE is part of an increasingly vague distinction between evaluators and organizational development consultants.
Utilization-focused evaluation
Type of use      Potential for real-time evaluation
Instrumental     Influence actions and decisions. Develop action plans. Change policies and programs.
Conceptual       Lessons learned for country teams, headquarters and donors.
Symbolic         Information for donors. Demonstrate transparency, accountability.
Process          Communication, coordination and morale.
Developmental evaluation
Program in a continuous state of change; operations will never become fixed or stable. Patton (2008)
Not to prove … but to improve. Krueger & Sagmeister (2012), Stufflebeam (2004)
Connoisseurship
“There are no algorithms, rules, recipes or the like to use” Eisner (2004)
Expert-led, lightweight and agile design.
Credibility (and supply) of experts a key limitation. Stufflebeam & Shinkfield (2007), Miller (2010)
Summary
• There is a strengthening relationship between theory and practice.
• A strong logic is emerging from RTE.
• RTE has roots in mainstream evaluation, especially developmental and utilization-focused approaches.
• Must be wary of the risks of connoisseurship.
Suggestions
• Humanitarian evaluators: stronger engagement with theory and better training in guidance.
• Mainstream theorists: attention to the specificities of emergencies, to adapt traditional models.
• Further research on evaluation in humanitarian programs.
Thank you
Jess Letch
Masters candidate, University of Melbourne, Australia
email@example.com
Special thanks to supervisor Brad Astbury
Special acknowledgement to Ros Hurworth
References
Alkin, M. C., & Christie, C. A. (2004). An evaluation theory tree. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists views and influences (pp. 12-65). Thousand Oaks, Calif.: Sage Publications.
Beck, T. (2006). Evaluating humanitarian action using the OECD-DAC criteria: An ALNAP guide for humanitarian agencies. ALNAP.
Broughton, B. (2001). Proposal Outlining a Conceptual Framework and Terms of Reference for a Pilot Real-Time Evaluation (O. O. o. Evaluation), Trans.). Canberra: World Food Program.
Brusset, E., Cosgrave, J., & MacDonald, W. (2010). Real-time evaluation in humanitarian emergencies. New Directions for Evaluation(126), 9-20.
Cosgrave, J., Ramalingham, B., & Beck, T. (2009). Real-Time Evaluations of Humanitarian Action: An ALNAP Guide (Pilot Version). ALNAP.
Eisner, E. (2004). The roots of connoisseurship and criticism: A personal journey. In M. C. Alkin (Ed.), Evaluation Roots: Tracing Theorists Views and Influences (pp. 8p). Thousand Oaks, Calif.: Sage Publications.
Feinstein, O., & Beck, T. (2006). Evaluation of Development Interventions and Humanitarian Action. In I. F. Shaw, J. C. Greene & M. M. Mark (Eds.), The Sage Handbook of Evaluation. London: Sage.
Hansen, M., Alkin, M. C., & LeBaron Wallace, T. (in press). Depicting the logic of three evaluation theories. Evaluation and Program Planning. doi: 10.1016/j.evalprogplan.2012.03.012
Jamal, A., & Crisp, J. (2002). Real-Time Humanitarian Evaluations: Some Frequently Asked Questions. Evaluation and Policy Analysis Unit, UNHCR.
Krueger, S., & Sagmeister, E. (2012). Real-Time Evaluation of Humanitarian Assistance Revisited: Lessons Learned and the Way Forward. Paper presented at the European Evaluation Society, Helsinki.
Miller, R. L. (2010). Developing standards for empirical examinations of evaluation theory. American Journal of Evaluation, 31(390). doi: 10.1177/1098214010371819
Miller, R. L., & Campbell, R. (2006). Taking stock of empowerment evaluation: An empirical review. American Journal of Evaluation, 27(3), 296-319.
Owen, J. M., & Rogers, P. J. (1999). Program Evaluation: Forms and approaches. Retrieved from SAGE Research Methods database, http://srmo.sagepub.com/view/program-evaluation/SAGE.xml doi:10.4135/9781849209601
Patton, M. Q. (2008). Utilization-Focused Evaluation. Thousand Oaks, California: Sage Publications.
Sandison, P. (2003). Desk Review of Real-Time Evaluation Experience. New York: UNICEF.
Shadish, W. R. (1998). Evaluation theory is who we are. American Journal of Evaluation, 19(1), 18.
Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage Publications.
Stake, R. E. (2004). Stake and Responsive Evaluation. In M. C. Alkin (Ed.), Evaluation Roots: Tracing Theorists Views and Influences (pp. 204-216). Thousand Oaks, Calif.: Sage Publications. Retrieved from http://www.loc.gov/catdir/toc/ecip048/2003019866.html
Stufflebeam, D. L. (2004). The 21st Century CIPP Model. In M. Alkin (Ed.), Evaluation Roots: Tracing Theorists Views and Influences (pp. 245-266). Thousand Oaks, Calif.: Sage Publications.
Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation theory, models, and applications. San Francisco: Jossey-Bass.
Walden, V. M., Scott, I., & Lakeman, J. (2010). Snapshots in time: Using real-time evaluations in humanitarian emergencies. Disaster Prevention and Management, 19(3), 8.