6. Brazil
Progress in Math (PISA 15 year olds)
[Chart: x-axis years, 2000 to 2100]
WDR 2018
7. Brazil
Time to reach OECD average in Math (PISA 15 year olds)
[Chart: x-axis years, 2000 to 2100; the trend line reaches the OECD average after 75 years]
WDR 2018
8. Brazil
Time to reach OECD average in Math and Reading (PISA 15 year olds)
[Chart: x-axis years, 2000 to 2100; the trend line reaches the OECD average after >260 years]
WDR 2018
9. Years of Schooling are not the same as Learning
Average years of schooling of 25-29 year olds, unadjusted and adjusted for learning
[Chart: y-axis years of schooling, 0 to 16]
WDR 2018
10. There is a learning crisis.
Solving it requires learning to
improve learning.
12. We have more evidence on
improving learning than ever before.
Evans & Popova 2016; WDR 2018
[Chart: impact evaluations with learning outcomes, by year, 1980 to 2016; y-axis 0 to 300]
299 total studies
19 total studies
13. But there is still so much to learn!
Adapted from 3ie 2015
Cash transfers and student enrollment: 19 studies
Materials and learning outcomes: 4 studies
Remedial education and student learning: 4 studies
14. No one will buy it without the price tag!
No data on costs
56%
Minimal data on
costs
Good cost data
McEwan 2015
17. Training teachers in active learning…
…reduced learning in Costa Rica
Berlinski & Busso 2017
18. Small, cheap interventions can
have a substantive, positive impact
Parent-teacher conferences in Bangladesh
Information in the Dominican Republic
Islam 2016; Jensen 2010
23. What does a null result mean?
Maybe the evaluation was
poorly designed!
24. Evaluation design problems
• Sample size too small
• Contamination (a la Lesson Study)
• Measured outcomes at the wrong time
• Evaluated while program was still
resolving fundamental problems
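The first item, sample size, can be checked before fieldwork starts. The sketch below is one common back-of-the-envelope approach and is not taken from the talk: it assumes two equal-sized arms, individual-level randomization, a two-sided test, and a normal approximation. The function name `sample_size_per_arm` and its defaults are illustrative assumptions. School-level (clustered) randomization, which is common in education, needs larger samples than this suggests.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(effect_size_sd: float,
                        alpha: float = 0.05,
                        power: float = 0.80) -> int:
    """Approximate students needed per arm to detect a given effect.

    effect_size_sd is the minimum detectable effect in standard
    deviations of the test score. Assumes equal-sized arms, individual
    randomization, and a two-sided test (normal approximation).
    """
    if effect_size_sd <= 0:
        raise ValueError("effect_size_sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    n = 2 * (z_alpha + z_power) ** 2 / effect_size_sd ** 2
    return ceil(n)


if __name__ == "__main__":
    # Halving the detectable effect roughly quadruples the sample.
    print(sample_size_per_arm(0.2))  # 393
    print(sample_size_per_arm(0.1))  # 1570
```

If the affordable sample is far below the figure this kind of calculation returns, a null result says little about whether the program works.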
25. Evaluation design problems
“The trend to measure impact has brought
with it a proliferation of poor methods of
doing so, resulting in organizations wasting
huge amounts of money on bad ‘impact
evaluations.’”
- Gugerty & Karlan 2018
26. What does a null result mean?
Maybe the intervention was
implemented poorly!
Adapted from Glewwe & Muralidharan 2015
27. Textbooks in Sierra Leone
Mobile phone monitoring in Haiti
Sabarwal, Evans, & Marshak 2014; Adelman et al. 2017
28. What does a null result mean?
Maybe the intervention led to
compensating behavior!
Adapted from Glewwe & Muralidharan 2015
30. What does a null result mean?
Maybe it only worked for some
beneficiaries!
Adapted from Glewwe & Muralidharan 2015
31. Textbooks in western Kenya
Only the highest performers benefitted
Literacy instruction in eastern Kenya
For word identification, only benefits for girls.
Glewwe, Kremer, & Moulin 2009; Jukes et al. 2017
32. What does a null result mean?
Maybe it only worked with
complementary programs!
Adapted from Glewwe & Muralidharan 2015
33. Teacher pay-for-performance in Tanzania
Mbiti et al. 2017
Grants alone? No impact.
Incentives alone? No impact.
Together? Student learning gains.
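A design with a control arm, each program alone, and both together lets the evaluator separate each program's own effect from the extra gain that appears only when they are combined. The sketch below shows that arithmetic. The function name `factorial_effects` and the arm means are hypothetical illustrations, not results from Mbiti et al. 2017.

```python
def factorial_effects(control: float,
                      grants_only: float,
                      incentives_only: float,
                      both: float) -> dict:
    """Decompose mean outcomes from a 2x2 design into effects.

    Inputs are mean test scores (for example, in standard deviations)
    for each of the four arms.
    """
    grants_effect = grants_only - control
    incentives_effect = incentives_only - control
    combined_effect = both - control
    # Complementarity: the gain from combining the programs beyond the
    # sum of what each achieves alone.
    interaction = combined_effect - grants_effect - incentives_effect
    return {
        "grants_effect": grants_effect,
        "incentives_effect": incentives_effect,
        "combined_effect": combined_effect,
        "interaction": interaction,
    }


if __name__ == "__main__":
    # Hypothetical arm means, chosen only to illustrate the pattern
    # "alone: no impact; together: learning gains".
    effects = factorial_effects(control=0.00, grants_only=0.01,
                                incentives_only=0.02, both=0.20)
    for name, value in effects.items():
        print(f"{name}: {value:.2f}")
```

An evaluation with only one of the two single-program arms would have reported a null result and missed the complementarity entirely.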
34. What does a null result mean?
Maybe the intervention doesn’t
work!
Adapted from Glewwe & Muralidharan 2015
35. Unconditional increases in teacher salaries
don’t increase student learning.
[Chart: treatment vs. control, second stage (test scores); x-axis percentile of Y0 test score, 0 to 100; y-axis -1 to 1]
Indonesia
Uruguay
Zambia
De Ree et al. 2017;
36. Good evaluations interrogate their null results
1. Poor implementation: Is good implementation
possible in this context?
2. Only benefits some: Narrower targeting?
3. Compensating behavior: Do beneficiaries value the
service?
4. Complementary programs: Explore variations.
5. It doesn’t work: Move on. Try something else.
38. Ask…
“What is the biggest change required in
behavior?”
“What changes are likely to make the
biggest difference to outcomes?”
Then measure those.
Jukes 2018
43. Their challenge, not your research
Relationships: Multiple interactions
Use clear, non-technical language
The right time
44. Get the Most Out of Education
Impact Evaluations
1. Learn from null results
2. Learn about mechanisms
3. Synthesize effectively
4. Inform policy
Editor's Notes
Most of my experience draws on education evaluations from low- and middle-income countries, but I strongly believe that many of the same lessons will apply.
National curriculum assessments at Key Stage 1 (grade 2) in England, 2017
If Brazilian 15-year-olds continue to improve at their current rate they will not reach the rich-country average score in math for 75 years. In reading, it will take 263 years.
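A projection like this is a straight-line extrapolation: divide the remaining score gap by the annual gain. The sketch below shows that arithmetic under two assumptions not stated in the notes: the target average stays fixed and the rate of improvement stays constant. The function name `years_to_reach` and the scores in the usage example are hypothetical, not Brazil's actual PISA results.

```python
def years_to_reach(current_score: float,
                   target_score: float,
                   annual_gain: float) -> float:
    """Years until a score reaches a fixed target at a constant annual gain."""
    if current_score >= target_score:
        return 0.0           # already at or above the target
    if annual_gain <= 0:
        return float("inf")  # no improvement means the gap never closes
    return (target_score - current_score) / annual_gain


if __name__ == "__main__":
    # Hypothetical numbers: a 100-point gap closing at 2 points per year.
    print(years_to_reach(current_score=400, target_score=500, annual_gain=2))
```

Because the gap is divided by the annual gain, a small change in the estimated rate of improvement moves the projected date by decades, which is why slow progress in one subject can translate into centuries.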
This adjusts years of schooling by the ratio of a country's TIMSS score to Singapore's (Singapore is the best performer).
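The adjustment described here amounts to scaling years of schooling by a score ratio. The sketch below is a minimal illustration of that ratio. The function name `learning_adjusted_years` and the example values are hypothetical, not figures from WDR 2018.

```python
def learning_adjusted_years(years_of_schooling: float,
                            country_score: float,
                            benchmark_score: float) -> float:
    """Scale years of schooling by test score relative to a benchmark.

    benchmark_score is the score of the best performer (Singapore in
    the note above).
    """
    if benchmark_score <= 0:
        raise ValueError("benchmark_score must be positive")
    return years_of_schooling * (country_score / benchmark_score)


if __name__ == "__main__":
    # Hypothetical country: 10 years of schooling, scoring 450 against
    # a benchmark of 600, is credited with 7.5 learning-adjusted years.
    print(learning_adjusted_years(10, 450, 600))  # 7.5
```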
These numbers have probably grown, but there is huge variation. And in some areas, like pedagogical interventions, the interventions themselves are almost as varied as the evaluations, so even 10 evaluations may just mean you've tested 5 totally different interventions, twice apiece.
Highlights from paper:
• Experiment designed to affect the ability to reason and argue using mathematics.
• Used structured pedagogical intervention aimed at secondary school students.
• The intervention was implemented with high fidelity and was internally valid.
• The control group learned more than the treatment group.
https://www.sciencedirect.com/science/article/pii/S0165176517301854
USA: High achiever classes. “Minorities gain 0.5 standard deviation units in fourth-grade reading and math scores, with persistent gains through sixth grade.” https://www.aeaweb.org/articles?id=10.1257/aer.20150484
Kenya: Tracking improved outcomes for both groups. https://www.aeaweb.org/articles?id=10.1257/aer.101.5.1739
India: One dedicated hour led to gains. http://www.nber.org/papers/w22746
If any of these is the case, then we can't learn much from the result. It is incumbent on us, the evaluators, to make sure these elements of the evaluation are right. And if we can't do an evaluation right, then we shouldn't do it.
Switch On: Maybe teachers didn’t have a great awareness of the program.
Indonesia: Story.
Uruguay: RD – disadvantaged schools https://mpra.ub.uni-muenchen.de/86972/1/MPRA_paper_86972.pdf
Zambia: Compared schools that just qualified for hardship allowance to those that didn’t (40 pp increase in 20% salary increase) https://editorialexpress.com/cgi-bin/conference/download.cgi?db_name=CSAE2018&paper_id=864
If they don’t work, then we understand why things went wrong.
Multi-arm trials are a great way to do this, but where those aren’t possible, we can still learn.
Jukes: https://www.poverty-action.org/blog/learning-more-impact-evaluations-contexts-mechanisms-and-theories-literacy-instruction