Can systematic reviews help identify what works and why?


Presentation to the South African Monitoring and Evaluation Association Conference 2011

  • Once impact is measured, the further debate is how we explain those impacts to inform policy
  • Number of international initiatives for impact evaluation: NONIE (Network of Networks for Impact Evaluation), comprised of OECD’s Development Assistance Committee Evaluation Network, the United Nations Evaluation Group, the Evaluation Cooperation Group, and the International Organisation for Cooperation in Evaluation (network drawn from the regional evaluation associations). The Paris Declaration of 2005 laid out a practical, action-oriented roadmap to improve the quality of aid and its impact on development. One of the five key principles of the Declaration is that ‘Developing countries and donors shift their focus to development results and these results get measured’. The Accra Agenda for Action in 2008 reviewed progress on the Paris Declaration and set an agenda for further accelerating advancement. It highlighted three areas of improvement, including “Aid is focused on real and measurable impact on development”. International Initiative for Impact Evaluation (3ie). International Development Coordinating Group, set up in the Campbell Collaboration in March 2011. Abdul Latif Jameel Poverty Action Lab. Innovations for Poverty Action. ‘Data dash’ - rigorous
  • About hitting the target, but sometimes we have to question the target
  • ‘Traditional’ M&E: Is the program being implemented as designed? Could the operations be more efficient? Are the benefits getting to those intended? NOT about showing causality.
  • (1) What would have happened without the intervention/program/project. (2) Estimated impact is the difference between the treated observation and the counterfactual. About attribution
  • For some, randomised controlled trials are the solution to the ‘what works’ challenge: (1) RCTs attempt to limit the potential for bias within evaluation and provide high quality, generalisable evidence of the impact of X on Y. (2) RCTs are the accepted norm in clinical research but are new to policy and especially the development field; examples include RCTs in conditional cash transfers, micro-savings, microcredit and pro-poor targeting. (3) Chris Blattman (2008) refers to this as ‘return-on-investment’ impact evaluation, or Impact evaluation 1.0
  • Vigorous debate has arisen about the value of experimental methods for informing policy: Angrist & Pischke (2009, 2010), Banerjee & Duflo (2009), Deaton (2010), Heckman (2010) & Imbens (2010). Angus Deaton of Princeton cautions that RCTs have all the characteristics of a fad, and are bound to disappoint.
  • See debate of Randomistas (behavioural economists) versus Relativistas: Bendavid (2011), Blattman (2011), Buckley (2010), Devarajan (2011), Glennerster & Kremer (2011), Goldacre (2011), Haddad (2011), Kristof (2011), Lindley (2011) & Subramanian (2011) (Algoso 2011, Barrett et al 2010, Bhargava 2008, Chambers et al 2009, Deaton 2009, Jones 2009)
  • Cochrane Collaboration is an international network of health researchers committed to systematically combining RCT evidence to inform policy, with regional offices, including a centre in Cape Town.
  • “seek to identify, review, and synthesise all high quality studies on a particular research question, e.g. the effectiveness of a particular intervention” (Hughes & Hutchings 2011:10)
  • 3ie’s quality standards for inclusion in its impact evaluation database: “Quality evaluations employ either experimental or quasi-experimental approaches.” The EPPI-Centre is one of the specialist methodology centres around the world exploring and extending SR methods to consider other types of evidence to address complex questions, interventions and outcomes.
  • Different version of ‘What gets measured, gets done’.
  • Focused on sub-Saharan Africa (SSA), e.g. African microfinance differs from Asian microfinance. This enabled us to produce a report for Africa that is more contextual; SSA ‘disappears’ in the wealth of evaluations in Asia. With microfinance proposed as a tool for development, a focus on the poorest region in the world made sense. Focusing on only one region made it possible for us to deliver our work within the short timeframe required by the funder
  • We searched 18 different databases, as well as the websites of 24 organisations, and an online directory of books. We also contacted 23 key microfinance networks, organisations and individuals requesting relevant evidence, conducted citation searches for two key publications, and searched the reference lists of initially included papers. Whilst our searching was all conducted in English, we did not exclude studies based on language, but worked with native speakers to assess foreign language papers for relevance and obtain translations when appropriate. Lastly, we identified a number of relevant research papers through our participation in informal microfinance networks via Twitter.
  • Similar to realist synthesis (Pawson 2006) (includes any evidence where conclusions are warranted by data). Want vigour (Cummings 2010).
  • Details of the included studies: 35 studies compared the impact of having a loan or a savings account with not having either. 20 were excluded due to poor reporting, poor methodology or both. 11 studies were medium quality and 4 high quality. These 15 studies were considered ‘good enough’ quality and included in the in-depth review.
  • Overall direction of effect does not change.
  • Overall direction of effect does not change.
  • Overall direction of effect does not change.
  • Duvendack et al appear to be saying that 'no good evidence' means that microfinance is bad, but we of course know that absence of evidence is not the same as evidence of absence
  • Learn from contribution analysis: identify the theory of change early on & revise it based on evidence (Mayne J 2008 Contribution analysis: An approach to exploring cause and effect. ILAC briefing paper 16).
  • In microfinance we have theory failure rather than implementation failure (Rogers’s blog).
  • “rigorous, diamond standard for evaluation which takes into account complexity, values, quality standards & different methodological options” (Cummings 2010). “Understanding change, the route towards impact, and impact itself requires not just a one-off evaluation, or results-oriented monitoring, or adaptive innovation, or impact evaluation.” (Guijt et al 2011:5). GEM (general elimination methodology) and MLLE (multiple lines and levels of evidence) and MSC (most significant change theory)
  • “Morell’s new book on ‘Evaluating the unexpected’ (2010) is a plea for flexibility.”
  • Chris Blattman (2008): “Version 2 evaluations try to understand why a program works, and what it reveals about the process of development. That is, they try to understand the causal mechanism.”
  • Can systematic reviews help identify what works and why?
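
The counterfactual note above (estimated impact is the difference between the treated observation and the counterfactual) can be illustrated with a minimal sketch. It assumes the comparison group's mean outcome is an acceptable stand-in for the counterfactual, which is only reasonable when the groups are otherwise comparable (for example through randomisation). The function name and the income figures are hypothetical and are not taken from the review.

```python
from statistics import mean


def estimated_impact(treated_outcomes, comparison_outcomes):
    """Mean outcome with the intervention minus mean outcome without it.

    The comparison group's mean stands in for the counterfactual
    (an assumption of this sketch, not a claim about any real study).
    """
    return mean(treated_outcomes) - mean(comparison_outcomes)


# Hypothetical monthly incomes, for illustration only.
treated = [120.0, 135.0, 110.0, 140.0]      # households with a loan
comparison = [100.0, 115.0, 105.0, 120.0]   # households without a loan

impact = estimated_impact(treated, comparison)
print(f"Estimated impact: {impact:.2f}")
```

A simple with-and-without comparison like this says nothing about attribution unless the two groups would have had the same outcomes in the absence of the intervention.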

    1. 1. Can systematic reviews help identify what works and why? The case of microfinance in sub-Saharan Africa<br />Presentation to 3rd Biennial SAMEA conference<br />8 September 2011<br />By<br />Carina van Rooyen, Dr Ruth Stewart & Prof Thea de Wet<br />
    2. 2. How do you want it – the crystal mumbo-jumbo or statistical probability?<br />
    3. 3. Need to demonstrate impact<br />Large development funders want to know ‘what works’ in development<br />Looking for evidence of effectiveness – evidence-informed development policy <br />
    4. 4.
    5. 5. “UK government support for aid organisations will be targeted at those agencies which demonstrate they can deliver best value for money while they improve the health, education and welfare of millions of people in the poorest countries…. We expect these charities to work hard to prove to UK taxpayers that they will and can make a real difference to the lives of the poorest and deliver real value for money.” <br />~ DFID 2010<br />
    6. 6.
    7. 7. Impact evaluations (IE)<br />
    8. 8. IE about showing causality <br />Causation:<br />A change in X is related to a change in Y<br />Not the same as correlation<br />
    9. 9. Counterfactual crucial<br />
    10. 10. Randomistas provide the answers?<br />‘Gold standard’ study design advocated by ‘randomistas’ – led by influential academics at the Abdul Latif Jameel Poverty Action Lab (JPAL)<br />Population<br />
    11. 11. “Creating a culture in which randomised evaluations are promoted, encouraged and financed has the potential to revolutionise social policy during the 21st century, just as randomised trials revolutionised medicine during the 20th.” <br />~ Esther Duflo quoted in Lancet Editorial, “The World Bank is finally embracing science” (2004)<br />
    12. 12. RCTs in development<br />
    13. 13. But are RCTs sufficient?<br />Methodological debates about RCTs raise a number of concerns within the development community, including<br />Dismissal of other evaluation techniques: hierarchies and ‘gold standard’<br />Lack of consideration of contextual information: over-simplification, with generalisable information stripping out contextual details<br />Narrow focus on linear causal relationships: experimental designs over-simplify complex issues<br />
    14. 14. Hierarchy of evidence<br />Randomised control double-blind trials<br />Randomised control trials<br />
    15. 15.
    16. 16. RCTs questioned in development <br />Narrow approach to evidence<br />Trials are costly, have ethical dilemmas & are often lacking<br />Solutions are urgently required<br />Heterogeneity raises serious concerns about external validity of such trials<br />
    17. 17. Systematic reviews (SRs) to the rescue?<br />Can these concerns about RCTs be overcome through the use of SRs?<br />Led by the Cochrane Collaboration, SRs routinely used in health care to combine results of RCTs<br />Integrated into health policy internationally<br />In development promoted by funders<br />
    18. 18. SRs in the development field<br />About 100 SRs in international development commissioned so far ~ Howard White (chair of IDCG)<br />First SRs in development published: water and environmental sanitation (Waddington & Snilstveit 2009), HIV behaviour change (Noar et al 2009), microfinance (Stewart et al 2010)<br />
    19. 19. SRs in the development field (cont.)<br />Four registered SRs with IDCG (Campbell Collaboration)<br />cash transfers for health & nutritional outcomes in poor families<br />deworming for improving school attendance in school-aged children<br />impact of farmer field schools<br />effectiveness & sustainability of water, sanitation & hygiene interventions in combating child diarrhoea<br />IDCG expects to register more titles later in 2011 in CCTs in education, governance and anti-corruption, urban development, social protection & microfinance<br />
    20. 20. What is a systematic review?<br />Is about the evidence of effectiveness<br />Thorough & systematic collection of all relevant evidence & its quality appraisal and synthesis<br />Typically combine evidence from RCTs<br />Designed to minimise biases & errors inherent to traditional, narrative reviews<br />
    21. 21. Elements of a SR<br />Formulate the review question & write a protocol which is peer reviewed<br />Search for and include primary studies<br />Assess study quality<br />Extract data<br />Analyse data<br />Interpret results & write a report, which is peer reviewed<br />
    22. 22. Comprehensive strategy to search for relevant studies (unpublished & published)<br />Explicit & justified criteria for inclusion or exclusion of any study<br />Statistical synthesis of data (meta-analysis) if appropriate and possible, or qualitative synthesis<br />
    23. 23. Asking the right question?<br />Paper or plastic?<br />
    24. 24. Rigidity of SRs: Hierarchy of evidence?<br />
    25. 25. Do you ever think sometimes, you might be overdoing the whole moisturiser thing?<br />
    26. 26. Risks with methodologically rigid SRs?<br />Narrow menu of methodological options could mean reduction of development to simple interventions, in order to facilitate its measurement (Guijt et al 2011:4)<br />“Those development programs that are most precisely and easily measured are the least transformational, and those programs that are most transformational are the least measurable.” (Natsios, ex USAID quoted in Guijt et al 2011:3)<br />
    27. 27. Critiques of methodologically rigid SRs in development are similar to those of RCTs in development <br />
    28. 28. Our SR on the impact of microfinance on the poor in SSA<br />
    29. 29. Our pragmatic approach<br />Followed pragmatic approach for our SR in five important ways:<br />Focused on REGIONAL rather than worldwide evidence<br />
    30. 30. Our pragmatic approach (cont.)<br />Multi-disciplinary nature of our team<br />Using range of sources: not only electronic databases (publication bias)<br />
    31. 31. Our pragmatic approach (cont.)<br />Methodological: <br />Drew on well-conducted evaluations with comparative research design, including RCTs, but also non-randomised trials, quasi-experimental designs, and simple with-and-without studies<br />For purists this ‘weakened’ confidence in evidence of impact<br />‘rigour’ narrowly defined in terms of statistically significant indication of difference with and without an intervention – internal validity (Guijt et al 2011:7)<br />
    32. 32. Our pragmatic approach (cont.)<br />We argue for ‘good enough quality’: rigour includes aspects such as utility, external validity, method mastery, sense-making & substantiated methodological choice (Guijt et al 2011:7) <br />In practice we also broadened the scope of our study <br />Able to look at additional types of interventions & outcomes which haven’t yet been evaluated by RCTs <br />Draw on evidence from additional countries<br />
    33. 33. Details of 15 included studies <br />4 RCTs<br /> 2 quasi-experimental studies<br /> 9 with/without studies<br />11 = microcredit, 2 = savings, 2 = combined credit & savings<br />Ethiopia, Ghana, Kenya, Malawi, Madagascar, Rwanda, South Africa, Tanzania (Zanzibar), Uganda & Zimbabwe<br />Rural & urban initiatives<br />
    34. 34. Is ‘good enough quality’ good enough?<br />
    35. 35. Is ‘good enough quality’ good enough? (cont.)<br />
    36. 36. Is ‘good enough quality’ good enough? (cont.)<br />
    37. 37. “The quality of evidence about effectiveness should be judged not by whether it has used a particular methodology, but whether it has systematically checked internal and external validity, including paying attention to differential effects.” (Rogers 2010:195)<br />
    38. 38. A methodological purist (excluding any study with an indication of bias) may conclude that the evidence is not good, e.g. Duvendack et al’s SR on impact of microcredit worldwide<br />
    39. 39. Clemens & Demombynes (2010:1) refer to luxury versus necessity<br />White (2011a) refers to choice between technical quality & policy influence<br />Risk of the purist approach is having nothing to say to policy makers, because it demands a definitive, free-from-bias answer<br />Risk of the pragmatist approach is that, while providing policy makers with ‘better’ information than they would otherwise have, it might carry bias<br />
    40. 40. Our pragmatic approach (cont.)<br />Development of a causal pathway in which we explored how microfinance works, to be able to draw conclusions about why microfinance does or does not work & for whom<br />What was achieved (outcome) & how (process)<br />Conventional SRs are limited to evidence of effectiveness, but this enhanced approach allowed informed conclusions to be drawn<br />Evaluative ‘proving’ & improving<br />
    41. 41.
    42. 42. Causal pathway analysis<br />
    43. 43. What we now think is happening<br />Use other MFI<br />Social cohesion<br />Use same MFI<br />Micro-credit<br />Micro-savings<br />Women’s empowerment<br />Able to repay loan and avoid increase in debt<br />Able to save<br />Given to individuals or groups<br />Default on loan, lose collateral and/or forced to borrow more<br />Long-term benefits<br />Spend money differently<br />1. Invest in immediate future: a. Business; b. Productive assets; c. Adult education; d. Workers’ health & nutrition<br />2. Consumptive spending with scope for productivity: a. Add on housing; b. Assets which retain value<br />3. Invest in long-term future: a. Children’s education; b. Children’s health and nutrition<br />4. Consumptive spending (non-productive): Assets which do not retain value<br />Actual increased income<br />Actual decreased income<br />Determined by external factors: Entrepreneurial ability; Appropriateness of business in context; Competition from other MFI clients; Gender and power relations<br />Scope for increased income via business or employment<br />Improved capabilities<br />Better able to deal with shocks<br />FOR CREDIT CLIENTS ONLY<br />Inability to repay loan<br />
    44. 44. Some of our recommendations<br />More and better impact evaluations of microfinance (especially savings)<br />On-going discussion of how to deliver pragmatic systematic reviews for international development<br />
    45. 45. Next steps<br />SR methodology to be further enhanced to serve the needs of development<br />Incorporating studies of poor people’s experiences, priorities & views (constructivist view): something similar has been done in health promotion, e.g. EPPI healthy eating review<br />Combining reviews of published evidence with primary research, e.g. Thuthuka project<br />Systemic approach to M&E and impact evaluations<br />
    46. 46. Three challenges for M&E<br />Consider findings of SRs to enhance individual programme evaluations, establishing what best available evidence shows & placing evaluation of individual projects within context of this broader evidence base<br />Consider RCT designs as one part of solution to impact evaluation, and explore where evaluations which you are able to conduct can fit within broader evidence base to shed light on key issues in development<br />Conduct pragmatic SRs to inform decision-making in development – flexibility <br />
    47. 47. The latest research shows that we really should do something with all this research<br />
    48. 48. Conclusion<br />About what works for whom under what circumstances and how<br />SRs help to think about strategic issues, rather than specific project intervention<br />There are limitations with SRs & they are very reliant on existence & clear reporting of individual evaluations<br />SR is only as good as the included studies (garbage in, garbage out)<br />
    49. 49. Conclusion (cont.)<br />But <br />They are bigger than individual studies<br />They take into account relevance, rigour & vigour<br />With causal pathway analysis (theory of change), they go some way to translating research evidence into meaningful policy & practice insights<br />
    50. 50. So, can SRs help identify what works and why? <br />Based on our SR on the impact of microfinance on the poor in sub-Saharan Africa, yes<br />But have to be pragmatic / flexible in approach to SR in the field of development<br />
    51. 51. Source of cartoon: Guijt et al 2011:i<br />
    52. 52. Source:<br />Not everything that can be counted counts, and not everything that counts can be counted ~ Albert Einstein<br />
    53. 53. Thank you for listening<br /><br />Presentation online at<br />
    54. 54. References / Acknowledgements<br />Blattman C 2008 Impact evaluation 2.0. Presentation to the Department for International Development (DFID) London on 14 February 2008. Available at<br />Blattman C 2011 Impact evaluation 3.0? 5 lessons and reflections after a couple of more years of failure and success. Presentation to DFID on 1 September 2011. Available at<br />Cummings S 2010 Evaluation revisited 1: Rigorous versus vigorous. Blog posting at on 17 June 2010<br />Deaton A 2010 Instruments, randomisation and learning about development. Journal of Economic Literature 48: 424–455<br />Gertler PJ, Martinez S, Premand P, Rawlings LB & Vermeersch CMJ 2010 Impact evaluation in practice: Ancillary material. World Bank: Washington DC (<br />Guijt I, Brouwers J, Kusters C, Prins E & Zeynalova B 2011 Evaluation revisited: Improving the quality of evaluative practice by embracing complexity (conference report)<br />Hughes K & Hutchings C 2011 Can we obtain the required rigour without randomisation? Oxfam GB’s non-experimental Global Performance Framework (3ie Working Paper 13). New Delhi: 3ie <br />Rogers P 2010 Learning from the evidence about evidence-based policy. In Banks G (ed) Strengthening evidence-based policy in the Australian Federation. Melbourne VIC: Productivity Commission: 195–214 <br />Photos and cartoons not acknowledged on slides were found via Google Images <br />
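
Slide 22 mentions statistical synthesis of data (meta-analysis) where appropriate and possible. As a rough illustration of what such pooling can look like, the sketch below uses fixed-effect inverse-variance weighting, a common textbook method. This is an assumption made for illustration; the slides do not say which synthesis method, if any, the review used. The function name and the effect sizes are hypothetical.

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Pool study effect sizes with inverse-variance weights.

    Each study is weighted by 1 / SE^2, so more precise studies count
    for more. Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors, for illustration only.
effects = [0.25, 0.75]
std_errors = [0.5, 0.5]

pooled, pooled_se = fixed_effect_pool(effects, std_errors)
print(f"Pooled effect: {pooled:.3f} (SE {pooled_se:.3f})")
```

A fixed-effect model assumes every study estimates the same underlying effect. Given the heterogeneity concerns raised on slide 16, a random-effects model or a qualitative synthesis may be the more defensible choice in development settings.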