
Bettinger Keynote: The Difficulty of Knowing and The "E" Word


Published in: Education, Economy & Finance


  1. The Difficulty of Knowing and The “E” Word. Dr. Eric P. Bettinger, Stanford University and NBER. April 19, 2012. I’m not allowed to reveal my yellow number up front; for a meeting, see me after my talk.
  2. What is the “E” Word?
     • The “E” word today is “Evaluation.”
     • Why is “evaluation” a frightening word?
        • Typically external and out of your control.
        • Judges your work.
        • Expensive and time-consuming.
        • A small look at a larger, more intensive service.
        • “Evaluation” courses in graduate school were not your favorite.
     • My goal today is to suggest that evaluation is integral to the life cycle of knowledge and to the continued success of access and success programs.
  3. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).
  4. Economist’s View of Higher Education: the Human Capital Model weighs the net benefit of attendance against the net benefit of the alternative. Includes:
     • Monetary benefits
     • Monetary costs
     • Non-monetary costs/benefits
  5. Why Does the Process Fail? In the Human Capital Model, students weigh the net benefit of attending against the net benefit of not attending. The comparison includes:
     • Cost of completing applications
     • Procrastination
     • Bad information
     • Students’ guesses on financial aid
  6. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).
  7. Innovate: How to Improve Information and Application. Partnership with H&R Block:
     • Families reveal their finances to tax professionals each year; two-thirds of FAFSA information comes from tax returns.
     • 60 percent of the clientele are Pell eligible.
     • H&R Block tax professionals have expertise in complicated income information.
     • They can process highly accurate forms and meet deadlines in a timely fashion.
     • Scalable: the model could be replicated in any tax-preparation setting. About two-thirds of welfare recipients use tax preparers, like H&R Block, to complete their taxes.
  8. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).
  9. Then the Experiment…
     • H&R Block completes regular tax services.
     • Software screens to see if the client is likely eligible.
     • Clients complete consent and basic background questions.
     • RANDOMIZATION into three groups:
        • Treatment #1: Financial Aid Application Help & Information
        • Treatment #2: Information Only
        • Control Group
     • Over 40,000 individuals participated in Ohio and North Carolina between 2007 and 2009.
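The assignment step in the slide above can be sketched in a few lines of code. This is an illustrative mock-up, not H&R Block's actual screening software: the arm labels come from the deck, while `assign()`, the seed, and the client IDs are invented here.

```python
import random
from collections import Counter

# Three study arms, named as on the slide.
ARMS = ["FAFSA Help & Information", "Information Only", "Control"]

def assign(client_id, rng):
    """Randomly assign one likely-eligible tax client to a study arm."""
    return client_id, rng.choice(ARMS)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
assignments = [assign(i, rng) for i in range(40000)]

# With ~40,000 participants, each arm holds roughly a third.
counts = Counter(arm for _, arm in assignments)
print(counts)
```

Because assignment depends only on the coin flip, any later difference in enrollment between arms can be attributed to the treatment rather than to who walked in the door.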
  10. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).
  11. Summary: Impact on College Enrollment & Aid Receipt
      • The FAFSA treatment significantly increased enrollment among graduating high school seniors: a substantial increase of 7 percentage points in college going (34% vs. 27% for the control group). The effect continues into students’ third year of college.
      • Among older, independent students who had not previously attended college, there was also an effect: enrollment rose 21% (nearly significant), concentrated among those with incomes below $22,000.
      • For other independent students, there was an effect on aid receipt, addressing the problem of eligible college students not receiving aid.
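As a rough illustration of why a 7-point gap on this scale is statistically convincing, here is a standard two-proportion z-test. The 34% and 27% rates come from the slide; the per-arm sample sizes below are hypothetical placeholders, since the deck reports only the ~40,000 total across all study populations, not this exact split.

```python
import math

n_t, n_c = 1000, 1000   # assumed arm sizes (illustrative only)
p_t, p_c = 0.34, 0.27   # enrollment rates from the slide

diff = p_t - p_c        # the 7-percentage-point effect
p_pool = (p_t * n_t + p_c * n_c) / (n_t + n_c)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
z = diff / se           # well above the 1.96 threshold even at these modest sizes

print(f"difference = {diff:.2f}, z = {z:.2f}")
```

With the study's much larger samples the standard error only shrinks, so the true z-statistic would be larger still.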
  12. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).
  13. Do Such Cycles Happen?
      • Perhaps, but if so, the knowledge is not being passed on.
      • What Works Clearinghouse Guide to College Access (2009):
         • Academic preparation → evidence = low
         • Communicating with students about their academic preparation → evidence = low
         • Surrounding students with peers and adults who build aspirations → evidence = low
         • Assisting with critical steps for college entry → evidence = moderate (6 studies: 3 positive, 3 with no effect)
         • Increasing awareness of financial aid, with help completing the FAFSA → evidence = moderate (2 studies: both positive)
      • Why so little?
  14. Why Would We Want Evaluation?
      • Assess overall effectiveness: provides information for stakeholders. Without evaluation, there is only “conjecture and criticism” (Phipps 1998).
      • Policy preservation: the Social Security Student Benefit Program (US); the example of Colombia’s PACES program.
      • Alignment and modification of policies: the Georgia HOPE example; unexpected benefits and consequences; identifying the specific programmatic elements that could lead to the impacts.
  15. Social Security Student Benefit Program: Aid by Year [chart].
  16. Percentage of Students Attending College

                                          Father Not Deceased   Father Deceased
      Finished secondary school 1979-81           54%                 63%
      Finished secondary school 1982-83           49%                 32%

      Evidence came too late; the program was cancelled.
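One way to read the table above is as a difference-in-differences calculation; that framing is mine, but the four percentages are exactly the slide's. The student-benefit program ended for the 1982-83 cohort, so students with a deceased father (the eligible group) lost access to the aid.

```python
before = {"not_deceased": 0.54, "deceased": 0.63}   # finished HS 1979-81
after  = {"not_deceased": 0.49, "deceased": 0.32}   # finished HS 1982-83

change_eligible   = after["deceased"] - before["deceased"]           # -0.31
change_ineligible = after["not_deceased"] - before["not_deceased"]   # -0.05
did = change_eligible - change_ineligible

# Relative to the trend for unaffected students, attendance among eligible
# students fell about 26 percentage points when the aid ended.
print(f"difference-in-differences: {did:+.2f}")
```

The unaffected group absorbs the common time trend, so the remaining 26-point drop is the part attributable to losing the benefit.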
  17. Example of the Georgia HOPE Scholarship
      • Provided full-tuition scholarships to Georgia students who stay in Georgia.
      • Students had to have a 3.0 GPA in secondary school.
      • Stated goal of the program: increase access to higher education among low-income families.
      • Evaluation results:
         • Student enrollment increased in general (Cornwell, Mustard & Sridhar 2002).
         • Low-income, and especially minority, enrollments did not increase (Dynarski 2000).
  18. Georgia HOPE Scholarship (cont.)
      • Gap between goal and impact. Goal: increase access for low-income families. Impact: increased access for middle- and upper-income families, but not lower-income families.
      • Why the failure? HOPE rewarded academic performance and required complex forms, and higher-income families have better secondary school performance and greater access to college information.
      • How did evaluation impact policy? The academic performance requirement was reduced, and the application process was simplified.
  19. What Makes a Good Evaluation?
      1. Comparison strategy (“identification strategy”): research is about comparing what happened to what might have happened.
      2. Data: detailed data on program implementation and use, and data on student outcomes.
  20. Comparison Strategy
      • The core of evaluation is comparison: the program effect is the difference between the observed outcome and the outcome that would have happened without the program.
      • The counterfactual outcome is never observed; we cannot observe the same student with and without assistance.
      • The comparison group represents the counterfactual, and not all comparison groups are created equal.
  21. Who is the Comparison?
      • Suppose your strategy is to compare Student X, who receives help, to similar students. There may be unobservable differences.
      • I’ll use myself as an example. My high school career was extremely good: valedictorian, near-perfect ACT, student body president, all-state football. My counselors were energized to work with me.
      • To the counselors, I’m a success story: they worked with me, and I succeeded.
      • Perhaps there is some truth to it, but I kept a little list of goals I set at the start of high school, made at home by myself. At the top of that list: a full ride to college.
      • Was my success the result of advisers, or of some underlying drive?
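The worry in this slide, that unobserved drive contaminates a comparison of "similar" students, can be made concrete with a toy simulation. Every number below (the 5-point true effect, the drive distribution, the selection rule) is invented purely for illustration.

```python
import random

rng = random.Random(0)
TRUE_EFFECT = 0.05  # advising truly adds 5 points of enrollment probability

def enrolls(drive, advised):
    """Enrollment depends mostly on unobserved motivation ('drive')."""
    p = 0.3 + 0.5 * drive + (TRUE_EFFECT if advised else 0.0)
    return rng.random() < p

def rate(xs):
    return sum(xs) / len(xs)

# Naive comparison: high-drive students seek out advisers themselves.
naive_t, naive_c = [], []
for _ in range(50000):
    drive = rng.random()
    advised = rng.random() < drive      # drive predicts getting help
    (naive_t if advised else naive_c).append(enrolls(drive, advised))
naive_gap = rate(naive_t) - rate(naive_c)

# Randomized comparison: a coin flip breaks the drive-advising link.
rand_t, rand_c = [], []
for _ in range(50000):
    drive = rng.random()
    advised = rng.random() < 0.5
    (rand_t if advised else rand_c).append(enrolls(drive, advised))
rand_gap = rate(rand_t) - rate(rand_c)

print(f"naive estimate:      {naive_gap:.3f}")  # inflated well above 0.05
print(f"randomized estimate: {rand_gap:.3f}")   # close to the true 0.05
```

The naive gap bundles the advising effect together with the drive gap between the groups; randomization removes the drive gap and recovers something near the true effect.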
  22. Criticisms of Research on Access/Success
      • Our anecdotes are compelling, but our numbers are often doubted.
      • We often base our numbers on simple comparisons of “similar” students, but those comparisons can be debated. The WWC rarely recognizes such comparisons as meeting evidence standards: only seven studies meet its definition of rigor, and only four of those find positive results.
      • In an era of increased accountability, more rigorous evidence is required, including increased demands for “return on investment” data.
  23. So How Do We Make It Better?
      • We have to think more carefully about evaluation as we expand. This will require talking to evaluators earlier in the process; evaluation rarely meets rigorous standards when planned after the fact.
      • We may have to modify some expansion plans to accommodate evaluation.
      • We need evaluation to be part of the culture rather than the “E” word that we avoid.
  24. Evaluators Have to Improve, Too
      • Results are time-sensitive and important for continued and future funding.
      • Complex research designs are difficult to enact, so evaluators have to be creative. There are often tensions between programmatic goals and good evaluation design, and both sides have to compromise.
      • We need greater communication on results and research design.
  25. Why Do Some Evaluations Show No Impact?
      • How can we often find no impacts when the stories and anecdotes emerging from programs have so much salience?
      • The key is the counterfactual: to have an impact, we need to change what would have happened in the absence of the program.
      • Consider my case: what was the counterfactual? Would I not have gone to college without my counselors’ aid?
      • This presents an enormous Catch-22 for advising. You have to judge which situations would succeed without your help, trust that some students would make it through the potholes, and find the ones who cannot make it without help.
  26. Example of Rigorous Evaluation: Angrist, Lang & Oreopoulos (2006)
      • A large Canadian university offered multiple services: a program providing support services to new college students (e.g., tutoring) and a financial incentive for maintaining a certain grade point average in college.
      • 700 students applied, but there was only funding for half of them.
      • Program managers used a random lottery to assign students to treatment; they considered it a “fair” way to determine who received services.
  27. Pre-Lottery Similarities in High School Grades [figure: overlapping distributions of the high school grade average used for university admission, Control vs. SFP/SFSP].
  28. Post-Lottery Differences in Grade Point Average (Women) [figure: distributions of first-term grade averages, Control vs. SFP/SFSP].
  29. Other Examples with Mixed Evidence? Hanushek’s (1996) Summary of Evidence on the Effects of Inputs on Student Outcomes

      Type of Study           Number of   Statistically Significant   Statistically Insignificant
                              Studies     Positive     Negative       Positive   Negative   Unknown
      Teacher-pupil ratio       277          15            13            27         25         20
      Teacher education         171           9             5            33         27         26
      Teacher experience        207          29             5            30         24         12
      Expenditure per pupil     163          27             7            34         19         13
  30. Matching Strategies: Example
      • The classic debate on class size in secondary school.
      • Hanushek (1986, 1989, 1996, 1997, 1998) surveyed research based on matching students to “comparable” students and finds no consistent effect of class size on student achievement.
      • Krueger (2003) uses randomization in Tennessee and finds large positive effects of smaller class sizes.
  31. Why the Difference?
      • Krueger: “not all estimates are created equal.”
      • Krueger, quoting Galileo: “I say that the testimony of many has little more value than that of few, since the number of people who reason well in complicated matters is much smaller than that of those who reason badly. If reasoning were like hauling I should agree that several reasoners would be worth more than one, just as several horses can haul more sacks of grain than one can. But reasoning is like racing and not like hauling, and a single Barbary steed can outrun a hundred dray horses.”
      • “Tennessee’s Project STAR is the single Barbary steed in the class size literature.”
  32. “Bottom Line”
      • Research depends on the quality of the comparisons. Not all comparisons are equal: some provide information, but they may hide confounding factors.
      • If we want better results and more secure funding streams, we need better evidence, built on strategies that are not susceptible to confounding factors.
  33. Additional Considerations in Evaluation
      • Timing: the gap between the start of an evaluation and the production of evidence.
      • Cost: the cost of the program, the evaluation, and data collection.
      • Ethical considerations: provision of the service and the right to privacy.
      • Political feasibility. Lant Pritchett: “No advocate would want to engage in research that potentially undermines support for his/her program. Endless, but less than compelling, controversy is preferred to knowing for sure.”
  34. So Where Do We Start?
      1. Plan ahead: it is impossible to use randomization after the fact, and creating and developing data collection instruments takes time.
      2. Consult people who know research, e.g., NCAC’s integration of research into its expansion.
      3. Take a risk: evaluation is risky. It may show that the program does not work, but knowing a policy’s strengths can lead to even better policies.
  35. Experimentation and the Learning Organization: A Virtuous Cycle (Experiment → Evaluate → Innovate → …).