Topics/Purposes of Presentation
1. Give an overview and policy history
2. Explain what went wrong and why it went wrong
3. Present results of re-analyses that mitigate the issues and correct the impact estimates
4. Discuss next steps and invite more analyses
Clarification of What Presentation Is Not
- Not a critique of random assignment: I recognize the power of the method and hope this critique will improve its application
- Not a general critique of Mathematica Policy Research's work: I believe the conclusions and the "no impact" estimates in their Upward Bound (UB) reports are seriously flawed, and I am very critical of Mathematica's refusal to acknowledge more robust positive impact estimates and of the misleading masking of key issues with the study in the reports, but I respect the hard work and determination that went into completing this study
- Not an act of advocacy for the program: I am acting as a researcher concerned with meeting research standards
Personal Involvement Disclosure
- Employed as a contractor for over 25 years:
  - Westat for 16 years; served as Project Director (PD) for the National Evaluation of Student Support Services (SSS)
  - Mathematica for 6 years; served as PD for the National Evaluation of Talent Search, and while employed there also served as Survey Director for the UB Third and the start of the Fourth follow-up data collection
  - RTI for 3 years; served as NSOPF PD
- The UB study began in 1992 and has been controversial over its entire history; random assignment combined with a national probability sample is very rare. Mathematica published 4 reports (the two most recent in 2004 and 2009)
- I joined the US Department of Education (ED), Policy and Program Studies Service (PPSS), in late 2004 as Team Leader for the Secondary Postsecondary Cross-Cutting (SPCC) Team; the UB study was under my team
- Developed concerns and was involved in a long, painful internal debate, 2006-2011; retired from ED in 2011
- Currently Co-Principal Investigator for the ED i3 grant Using Data to Inform College Access Programming, at the Pell Institute for Study of Higher Education at the Council for Opportunity in Education (COE)
Basic Problem
- As the final ED COR/Technical Monitor, I found that the impact estimates published in 2004 and again in 2009 were seriously flawed, such that the conclusions of "no detectable impact" for the UB program were erroneous
- Re-analyses correcting for these errors using standard statistical procedures found strong positive results for the UB program on major outcomes
- The report is not transparent in revealing these issues or the positive results found when these issues are addressed
Upward Bound (UB) Program Overview
- UB began in 1965 as part of the civil rights movement and the Great Society; the Upward Bound Math Science (UBMS) initiative began in 1991
- Goal: increase college access and preparation for eligible high school students, meaning low-income (150 percent of poverty) and first-generation college (no parent has a BA degree)
- Academic focus: 6- to 8-week program on a college campus in summer, plus academic-year follow-up sessions
- Most intensive of the TRIO programs: $4,900 per year per student served; the average program serves 50 students per year
- Grants are made to postsecondary institutions to run programs (students often enroll in those institutions); currently over 1,000 programs across the nation
[Figure: Percentage of high school students who had at least one parent with a four-year college degree, by race/ethnicity: 1972, 1980, 1990 and 2002. Source: NCES High School Longitudinal Studies. Series: White; Hispanic or Latino; Black or African American; Asian; American Indian or Alaska Native; All. X axis: year (1970 to 2005); Y axis: percent (0 to 60). Slide annotation: note the large increase since the program began in the percent of parents having a BA degree.]
The Pell Institute
UB Evaluation: Study History
- Second national evaluation and first random assignment study of UB
- Begun in 1992; last follow-up in 2003-04
- Under 3 contracts, Mathematica has authored 4 reports published by ED (1996, 1999, 2004, 2009); the Fourth follow-up report is unpublished
UB Study Basic Design
- Unique combination:
  - Multi-stage, complex, nationally representative probability sampling procedures, weighted by the inverse probability of selection to produce national estimates
  - Experimental random assignment design
- Multi-stage sample design:
  - 67 projects from 46 strata designed to represent different types of projects (4-year/2-year, public/private, small/medium/large, rural/non-rural, race/ethnicity of participants)
  - 339 end-stage strata for 1,500 treatment and 1,380 control applicants
- Projects were required to recruit at least twice the number of openings so that random assignment could be done
- The study sought to change as little as possible about the program except recruitment
- Accommodations: projects were allowed "must serves", who were removed from the analyses
- The study did not control the actual offering of treatment or the participation of those assigned
- Multi-grade, multi-year cohort: grades 7 to 10 at baseline
Flawed reports authored by Mathematica Policy Research have driven ED policy with regard to the UB program for more than a decade
- Third follow-up: reported no average overall effects, but large effects for students at risk academically and for those with lower educational expectations (defined as expecting less than a BA at baseline)
- The Program Assessment Rating Tool (PART) was developed to assess and improve program performance so that the Federal government can achieve better results; UB was given an OMB PART rating of "ineffective"
- Based on the study findings, ED began a new UB initiative to serve more academically at-risk students
- Budget: the Bush budget zero-funded all federal pre-college programs (UB, UBMS, Talent Search and GEAR UP) in FY05 and FY06, justified by the UB study results; this was dropped in FY07 and FY08
Policy History (cont)
- UB 2006 Absolute Priority to serve 1/3 at-risk and 9th grade students; a new random assignment study to evaluate it began in 2006
- Congress blocked the study in 2007, and ED cancelled it in 2008
- HEOA 2008:
  - Mandates rigorous evaluations
  - Prohibits over-recruitment to a program solely for the purposes of random assignment evaluation; it does not prohibit all random assignment studies, only those involving deliberate denial of services
  - Absolute Priority cancelled
Impact Estimates Reported by Mathematica and on ED Website Have:
- Inadequately controlled for bias in favor of the control group
- Serious representational issues for the largest 4-year public stratum
- Severe unequal weighting, with one project given 26 percent of the weight
- Lack of standardization of outcome measures to expected high school graduation year, for a sample that spanned 5 years of expected high school graduation
- Inappropriate use of National Student Clearinghouse (NSC) data when coverage was too low to meet standards or non-existent, and when there is evidence of bias
Other Researchers Have Confirmed Issues
- The initial concern came in 2005 from Mathematica itself, when a then-new staff member (no longer employed there), who was lead analyst for the Fourth follow-up, sent ED tables showing that the results were sensitive to a single project. This revealed for the first time that one project had 26 percent of the weight and seemingly large negative impacts: overall impacts were positive when it was excluded and not significant when it was included
- PPSS consulted statistical experts at RTI: James Chromy, Fellow of the American Statistical Association, was sent the file in 2007, advised on how to handle project 69 (treat as ineligible), and replicated the statistical tabulations using SUDAAN. He asked for the sample frame, which Mathematica delayed in sending
- David Goodwin, the Division Director who was the original COR for the UB study and who originally defended the impact estimates, eventually came to see the problems and to believe that analyses without project 69 were more credible
- IES external reviews confirmed the basic issues and stated that results with project 69 were not robust
- When this information is presented, academic discussants and audiences are incredulous and do not understand why ED would continue to publish these impacts
Guidance from three intersecting traditions
- Experimental design work examining the threats to validity (for example, Shadish, Cook, and Campbell; Heckman)
- Survey methods research on sampling and non-sampling error (for example, Groves et al. 2004)
- Statistical and program evaluation standards (for example, the Program Evaluation Standards, NCES Standards, AERA Standards)
What is Sampling and Non-Sampling Error?
- Sampling error is the error caused by observing a sample instead of the whole population. Sample-to-sample variation is estimated by observing variation among the sample members or by sub-dividing the sample
- Non-sampling error is a catch-all term for deviations of estimates from the true value, or study error, that are not caused by sampling (examples: non-response bias, lack of understanding of questions, lack of recall); it is harder to measure statistically
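The idea of estimating sample-to-sample variation by sub-dividing the sample can be sketched as a random-groups calculation. This is only an illustration under assumptions that are not in the slides: a simple random sample, a simulated 0/1 outcome with a hypothetical rate of 0.7, and the helper names `random_groups_se` and `analytic_se`. It is not the design-based variance estimation used for the UB study.

```python
import random
import statistics


def random_groups_se(values, n_groups=10, seed=0):
    """Standard error of the mean estimated by sub-dividing the sample.

    The sample is shuffled and dealt into n_groups equal-sized groups; the
    spread of the group means indicates how much the estimate would move
    from sample to sample.
    """
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    groups = [shuffled[i::n_groups] for i in range(n_groups)]
    group_means = [statistics.mean(g) for g in groups]
    # Variance of the group means divided by the number of groups
    # estimates the variance of the overall mean.
    return (statistics.variance(group_means) / n_groups) ** 0.5


def analytic_se(values):
    """Textbook standard error of the mean for a simple random sample."""
    return statistics.stdev(values) / len(values) ** 0.5


# Hypothetical data: 2,000 simulated 0/1 outcomes (not UB study data).
_rng = random.Random(42)
sample = [1 if _rng.random() < 0.7 else 0 for _ in range(2000)]

print("random-groups SE:", round(random_groups_se(sample), 4))
print("analytic SE:     ", round(analytic_se(sample), 4))
```

The two numbers should be of similar size; with only ten groups the random-groups figure is itself noisy, which is one reason complex surveys rely on dedicated variance software.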
Basic Assumptions of Random Assignment Studies
1. Sample is representative of the population to which we wish to generalize
2. Treatment and control groups are equivalent
3. Treatment and control groups are treated equally except for the treatment
4. Treatment and control groups are mutually exclusive with regard to the treatment
Request for Correction Covers
- Major focus on the technical standards violations in the report
- Also covers transparency issues in the report (it does not provide the information needed to judge the findings and also masks some of the issues)
- Review process issues: in a politically directed process, the report was published over the objections of the unit responsible for the study (the PPSS Team Leader and Technical Reviewers) and over the formal disapproval of the Office of Postsecondary Education (OPE), in the last week of the Bush Administration
- Note: it was published with the reported acquiescence of IES, even though an IES external reviewer had specifically stated that the "impact estimates were not robust"
REPORTS HAVE 6 MAJOR STANDARDS VIOLATIONS
1. Seriously flawed sample design: one project of 67 carried 26 percent of the weight; only a single project was selected from the largest study-defined stratum (some cases were weighted up to 200 times the weights of other students)
2. Serious representational issues for the project with 26 percent of the weight: it was atypical for its 4-year stratum in that it had mostly 2-year and less-than-2-year certificate programs
3. Treatment and control groups that were seriously non-equivalent, with bias in favor of the control group
4. Outcome variables were not standardized to expected high school graduation year (EHSGY) for a sample that spanned 5 years of graduation dates
5. Improper use of National Student Clearinghouse data for non-responders to surveys when coverage was too low or non-existent and there was evidence of bias
6. Lack of transparency in acknowledging issues and masking of some issues; biased reporting of findings; lack of acknowledgement of alternative, credible positive findings for Upward Bound
1. Sample Design Issues
- Sample highly stratified: 46 strata for 67 projects
- Unequal weighting: one project carries 26 percent, 3 projects carry 35 percent, and 8 projects carry 50 percent of the weight
- Project-level stratification: 339 strata, unequal within projects
- Basic design flaw: only one project was selected for the largest stratum
- Treatment-control non-equivalency introduced by the outlier 26 percent project
Project that should have been declared ineligible to represent its 4-year stratum carried 26 percent of the weight
- Extreme unequal weighting and serious representation issues
- One project of 67 in the sample (known as 69) carried 26 percent of the weight and was the sole representative of the largest 4-year public stratum, but was a former 2-year school with largely less-than-2-year programs
- The project partnered with a job training program
- Inadequate representation of the 4-year stratum
[Figure 5. Percent of the sum of the weights by project for the 67 projects making up the Upward Bound national evaluation sample: study conducted 1992-93 to 2003-04. Y axis: percent of weight (0 to 30); X axis: project; the largest bar is 26.38.]
NOTE: Of the 67 projects making up the UB sample, just over half (54 percent) have less than 1 percent of the weights each, and one project (69) accounts for 26.4 percent of the weights.
SOURCE: Data tabulated (December 2007) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04.
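One common way to gauge how much precision is lost to unequal weighting is Kish's approximate design effect, n * sum(w^2) / (sum w)^2. The sketch below applies it at the project level purely as an illustration. The formula is a standard survey-sampling approximation and is not claimed to be what the re-analysis used; only the 26.4 percent share comes from the slide, the equal split of the remaining weight across the other 66 projects is an assumption, and the function names are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weighting."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Number of equally weighted units giving about the same precision."""
    return len(weights) / kish_design_effect(weights)


# Weight shares for 67 projects. Only the 26.4 percent share comes from the
# slide; splitting the remainder equally over 66 projects is an assumption.
outlier_share = 0.264
project_shares = [outlier_share] + [(1 - outlier_share) / 66] * 66

deff = kish_design_effect(project_shares)
n_eff = effective_sample_size(project_shares)
print(f"design effect from unequal weighting: {deff:.2f}")
print(f"effective number of projects: {n_eff:.1f} of {len(project_shares)}")
```

Because the slides report that the remaining weight was itself concentrated (3 projects with 35 percent, 8 with 50 percent), the equal-split assumption probably understates the real loss of precision.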
2. Treatment-Control Non-Equivalency
- The sample is well matched without project 69
- Project 69 introduces bias into the overall sample in favor of the controls
- Project 69 has large differences (examples):
  - Educational expectations: 56 percent of controls expect an advanced degree, compared with 15 percent of the treatment group
  - 9th grade academics: 8 percent of controls are at risk, compared with 33 percent of the treatment group
  - Expected HS graduation in 1997 (younger group): 60 percent of treatment and 42 percent of controls
Project 69 had seriously non-equivalent treatment and control groups
[Figure: bar chart comparing four groups (No69Treatment, No69Control, 69Treatment, 69Control) on a 0 to 100 percent scale for: Male; Expect MA or higher; Base grade 8 or below; Algebra in 9th; High academic risk; GPA below 2.5; White.]
Bias in 69 and balance in rest of sample taken together
[Figure: stacked bars showing the treatment/control split (percent) within each category, for two panels.]
- Project 69: high academic risk, treatment 80 / control 20; in 9th (younger) grade in 1993-94, treatment 77 / control 23; expect advanced degree, treatment 21 / control 79
- Other 66 projects in sample: high academic risk, treatment 51 / control 49; in 9th (younger) grade in 1993-94, treatment 51 / control 49; expect advanced degree, treatment 49 / control 51
[Figure (untitled; same categories and layout as the preceding chart): high academic risk, treatment 58 / control 42; in 9th (younger) grade in 1993-94, treatment 56 / control 44; expect advanced degree, treatment 42 / control 58.]
3. Lack of Outcome Standardization to Expected High School Graduation Year (EHSGY)
- The multi-grade study cohort spanned 5 years of expected high school graduation
- At the time of the last (5th) follow-up, 10 percent had 6 years, 30 percent had 7 years, 34 percent had 8 years, 19 percent had 9 years, and 5 percent had 10 years since high school graduation
- Imbalance between treatment and control: the control group has a larger percentage of older, 10th grade students at the time of randomization
- Mathematica never standardized outcome measures based on EHSGY; ED staff derived these variables for the re-analysis
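A minimal sketch of what standardizing an outcome to EHSGY means in practice, assuming on-time progression of one grade per year. The one anchor taken from the slides is that a student in 9th grade in 1993-94 has an expected graduation year of 1997; the student records, field names, and function names are hypothetical.

```python
def expected_hs_grad_year(grade, spring_year):
    """EHSGY for a student in `grade` during the school year ending in `spring_year`."""
    return spring_year + (12 - grade)


def outcome_within(event_year, ehsgy, window_years):
    """True if the event (e.g. first enrollment) happened by EHSGY + window_years."""
    return event_year is not None and event_year <= ehsgy + window_years


# Hypothetical records: grade in 1993-94 and year of first postsecondary entry.
students = [
    {"id": "A", "grade_1993_94": 9, "first_entry_year": 1998},
    {"id": "B", "grade_1993_94": 11, "first_entry_year": 1998},
    {"id": "C", "grade_1993_94": 8, "first_entry_year": None},
]

for s in students:
    ehsgy = expected_hs_grad_year(s["grade_1993_94"], 1994)
    s["ehsgy"] = ehsgy
    # Standardized outcome: the same length of follow-up for every cohort.
    s["entered_within_1"] = outcome_within(s["first_entry_year"], ehsgy, 1)
    # Unstandardized outcome: a single calendar cutoff for everyone.
    s["entered_by_1998"] = (
        s["first_entry_year"] is not None and s["first_entry_year"] <= 1998
    )
    print(s["id"], ehsgy, s["entered_within_1"], s["entered_by_1998"])
```

Student B illustrates the problem: under a single calendar cutoff an older student gets more years to reach the outcome than a younger one, so a group with more older students looks better for reasons unrelated to the program.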
4. Survey Attrition, Non-Response and Non-Coverage Bias
- A concern in longitudinal studies
- UB response rates were very high for the follow-ups but were at 74 percent by the end; the control group had a 4-5 percent lower response rate on the Third and Fourth follow-ups
- Sample members with positive outcomes were more likely to respond
- Use federal aid files to observe and impute
- Improper use of the National Student Clearinghouse for non-respondents when enrollment coverage was too low and biased due to clustering, and when coverage of 2-year and less-than-2-year institutions was non-existent in the most applicable period
Figure 4. Percent of total UB study participants found on the federal financial aid files as applicants and as Pell recipients, classified by fourth follow-up survey response status: study conducted 1992-93 to 2003-04
[Figure: horizontal bars on a 0 to 90 percent scale; legend: Responder, Non-responder. Applied for aid: 62 and 79. Pell recipient: 47 and 63.]
NOTE: Unweighted data based on 2,845 Upward Bound sample members from both treatment and control groups.
SOURCE: Data tabulated (October 2006) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files and Federal Applicant and Award Files 1994-95 to 2003-04.
5. Service Participation and Non-Participation Issues
- Waiting list drop-outs: 26 percent of the treatment group were coded as waiting list file drop-outs and were kept in the treatment sample
- First follow-up survey: 18 percent of the treatment group participated in neither UB nor UBMS
- Survey data: 12-14 percent of controls show evidence of UB or UBMS participation
- 60 percent of controls and 92 percent of the treatment group reported some pre-college supplemental service participation
6. Masking of Issues in Final Report
- Failure to report on project 69's representational issues
- Failure to acknowledge large impacts without project 69, and stating that exclusion of project 69 does not make a difference in conclusions
- Failure to acknowledge NSC coverage and bias issues
- Failure to acknowledge the results of standardizing outcomes, and misleading statements concerning results
- Failure to acknowledge the extent of academic risk bias in favor of the control group in estimates
Alternative Re-Analyses
Experimental analyses
- Intent to treat (ITT): UB opportunity, original random assignment groups; logistic regression
- Treatment on treated (TOT): UB/UBMS participation; instrumental variables regression
Quasi-experimental (observational)
- UB/UBMS compared with non-UB/non-UBMS service
- Any service compared with no service
Selected subgroups (academic risk and educational expectations)
Instrumental Variables Regression used in TOT/CACE and Observational analyses
- Two-stage regression to mitigate selection bias
- First stage models factors related to participation
- Second stage uses the first-stage results as an additional control in the model estimating outcomes
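The logic of using random assignment as an instrument can be illustrated with the simplest case, the Wald estimator, which divides the ITT difference by the difference in participation rates between the assigned groups. This is a sketch on simulated data, not the STATA instrumental variables models with baseline controls and survey design described in the slides; the simulation parameters, the constant true effect of 0.10, and all function names are assumptions made for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def simulate(n=100_000, true_effect=0.10, seed=1):
    """Simulated applicants: z = assigned to treatment, d = participated, y = outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random()                      # unobserved motivation
        z = 1 if rng.random() < 0.5 else 0    # random assignment
        # Motivated applicants participate more; some controls participate too.
        p_participate = 0.5 + 0.5 * u if z else 0.1 * u
        d = 1 if rng.random() < p_participate else 0
        p_outcome = 0.4 + true_effect * d + 0.4 * (u - 0.5)
        y = 1 if rng.random() < p_outcome else 0
        rows.append((z, d, y))
    return rows


def itt(rows):
    """Intent to treat: outcome difference by assigned group."""
    return mean([y for z, d, y in rows if z == 1]) - mean([y for z, d, y in rows if z == 0])


def first_stage(rows):
    """Participation difference by assigned group."""
    return mean([d for z, d, y in rows if z == 1]) - mean([d for z, d, y in rows if z == 0])


def naive(rows):
    """Participants versus non-participants, ignoring selection."""
    return mean([y for z, d, y in rows if d == 1]) - mean([y for z, d, y in rows if d == 0])


def wald_tot(rows):
    """IV estimate: ITT scaled by the participation difference."""
    return itt(rows) / first_stage(rows)


rows = simulate()
print("ITT:", round(itt(rows), 3))
print("participation difference:", round(first_stage(rows), 3))
print("naive participant comparison:", round(naive(rows), 3))
print("Wald TOT estimate:", round(wald_tot(rows), 3))
```

In this simulation the naive comparison mixes the program effect with motivation, while the Wald estimate relies only on the variation in participation created by random assignment.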
What is the same as Mathematica's Analyses?
- Use the same statistical methods (logistic and instrumental variables regression)
- Use a statistical program (STATA) that takes the complex multi-stage sample design into account in estimating standard errors
- Same ITT opportunity grouping; the TOT participation grouping recognizes UBMS as a form of UB
- Similar model baseline controls: both omit 9th grade academic risk indicators; include an additional control for grade at baseline
- Same weights (Mathematica's)
What is Different from Mathematica's analyses?
- Standardize outcomes by expected high school graduation year
- Avoid using early NSC data when coverage was too low; use NSC only for BA degree, as a supplement for non-responders to surveys
- Use all applicable follow-up surveys (3 to 5), not just one round at a time; use federal aid files
- Present data with and without project 69, and weighted and unweighted
- View impact estimates without project 69 as reasonably robust for 74 percent of applicants; view estimates with project 69 as non-robust, so their use should be avoided, especially for estimates of BA impact
Re-analyses Findings for Enrollment and Financial Aid
- Standardizing for Expected High School Graduation Year (and not using NSC data for enrollment) found significant and substantial positive ITT and TOT findings, weighted and unweighted, and with and without project 69
Overall Results
Significant and substantial positive ITT and TOT findings, weighted and unweighted and with and without project 69, for:
- Evidence of postsecondary entrance within +18 months and within +4 years
- Application for financial aid within +18 months and within +4 years
- Evidence of award of any postsecondary degree or credential by the fourth follow-up (4 to 6 years after EHSGY)
Figure 1. Estimated rates of postsecondary entrance within +1 (about 18 months) of expected high school graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted 1992-93 to 2003-04
[Figure: paired horizontal bars (Control, Treatment) on a 40 to 80 percent scale.]
- ITT, evidence of postsecondary entrance within +1 of EHSGY (includes outlier): control 66, treatment 72.9, difference 6.9****
- TOT/CACE, evidence of postsecondary entrance within +1 of EHSGY (includes outlier): control 62.5, treatment 73.5, difference 10.9****
- ITT, evidence of postsecondary entrance within +1 of EHSGY (excludes outlier): control 64.3, treatment 73.3, difference 9.1***
- TOT/CACE, evidence of postsecondary entrance within +1 of EHSGY (excludes outlier): control 60.4, treatment 74.6, difference 14.2****
*/**/***/**** Significant at 0.10/0.05/.01/.00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.
NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design. Weighted estimates use poststratified weights. See table 4 in body of the report for detailed note.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
Figure 2. Estimated rates of application for federal financial aid within +4 of expected high school graduation year (EHSGY) for Upward Bound Opportunity (ITT) and Upward Bound/Upward Bound Math Science Participation (TOT/CACE): study conducted 1992-93 to 2003-04
[Figure: paired horizontal bars (Control, Treatment) on a 40 to 80 percent scale.]
- ITT, applied for federal financial aid within +4 of EHSGY (includes outlier): control 58.7, treatment 65.4, difference 6.7****
- TOT/CACE, applied for federal financial aid within +4 of EHSGY (includes outlier): control 56.1, treatment 66.7, difference 10.6****
- ITT, applied for federal financial aid within +4 of EHSGY (excludes outlier): control 60.4, treatment 67.7, difference 7.3***
- TOT/CACE, applied for federal financial aid within +4 of EHSGY (excludes outlier): control 57.1, treatment 69.1, difference 11.9****
*/**/***/**** Significant at 0.10/0.05/.01/.00 level; UB = regular Upward Bound; UBMS = Upward Bound Math Science; ITT = Intent to Treat; TOT = Treatment on Treated; CACE = Complier Average Causal Effect.
NOTE: Estimated rates from STATA logistic and instrumental variables regression taking into account the complex sample design. Weighted data use poststratified weights. See table 6 and table 4 in body of the report for detailed notes.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), US Department of Education (ED), using national evaluation of Upward Bound data files: study conducted 1992-93 to 2003-04; and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
Re-Analyses: Awarded a BA within +6 years of EHSGY
- Weighted with project 69: not significant. Unweighted: significant
- For the 74 percent of the sample not represented by project 69:
  - 28 percent increase in BA award for ITT UB opportunity (13.3 increased to 17.0)
  - 50 percent increase in BA award for TOT UB participation analyses (14.1 increased to 21.1)
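The percent increases quoted above are relative changes in the BA award rate, which are easy to confuse with percentage-point differences. A short check using the rates on the slide (the function name is only illustrative):

```python
def relative_increase(comparison_pct, treatment_pct):
    """Percent increase of the treatment rate over the comparison rate."""
    return (treatment_pct - comparison_pct) / comparison_pct * 100


# BA award rates quoted on the slide (project 69 excluded).
for label, comparison, treatment in [
    ("ITT, UB opportunity", 13.3, 17.0),
    ("TOT, UB/UBMS participation", 14.1, 21.1),
]:
    points = treatment - comparison
    pct = relative_increase(comparison, treatment)
    print(f"{label}: +{points:.1f} percentage points, {pct:.0f} percent increase")
```

So the 28 and 50 percent figures correspond to gains of 3.7 and 7.0 percentage points respectively.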
Impact of Upward Bound (UB) on Bachelor's (BA) degree attainment
NOTE: Instrumental variables regression models for Treatment on the Treated (TOT) estimates are based on 66 of 67 projects in the UB sample, representing 74 percent of UB at the time of the study: National Evaluation of Upward Bound, study conducted 1992-93 to 2003-04. EHSGY = Expected High School Graduation Year; NSC = National Student Clearinghouse; SFA = Student Financial Aid. All estimates significant at the .01 level or higher. One project was removed because it introduced bias into the estimates and had representational issues. We use a 2-stage instrumental variables regression procedure to control for selection effects for the TOT impact estimates.
SOURCE: Data tabulated January 2010 using National Evaluation of Upward Bound data files; study sponsored by the Policy and Program Studies Service (PPSS) of the Office of Planning, Evaluation and Policy Development (OPEPD), U.S. Department of Education; study conducted 1992-93 to 2003-04.
UB/UBMS Participation Compared with Other Non-UB/UBMS Services Participation
- Quasi-experimental: uses 2-stage instrumental variables regression, which controls for selection bias but does not eliminate it
- Found statistically significant and substantive positive results for UB/UBMS participation for:
  - Evidence of postsecondary entrance within +1 and +4
  - Application for financial aid within +1 and +4
  - Award of BA within +6, unweighted overall, and both unweighted and weighted without project 69
Table 5. Evidence of postsecondary entrance within +1 (18 months) and within +4 of expected high school graduation year (EHSGY) for observational models comparing types of service receipt: National Evaluation of Upward Bound, study conducted 1992-93 to 2003-04
Columns (all observational, instrumental variables regression):
A. All sampling strata: participated in UB/UBMS compared with participated in other non-UB/non-UBMS pre-college support or supplemental services only
B. All sampling strata: any pre-college support or supplemental services reported compared with no services reported
C. One outlier project removed (remainder represents 74 percent of Horizons waiting list): same comparison as A
D. One outlier project removed (remainder represents 74 percent of Horizons waiting list): same comparison as B
Evidence of postsecondary entrance within +1 of EHSGY
- A: xb T = 74.4; xb C = 65.3; difference = 9.1*** (unweighted: xb T = 75.8; xb C = 66.8; difference = 9.3****)
- B: xb T = 73.5; xb C = 48.6; difference = 25.0**** (unweighted: xb T = 75.9; xb C = 51.7; difference = 24.1****)
- C: xb T = 75.0; xb C = 61.7; difference = 13.3**** (unweighted: xb T = 76.2; xb C = 66.3; difference = 10.1****)
- D: xb T = 74.3; xb C = 44.6; difference = 29.8**** (unweighted: xb T = 76.3; xb C = 51.1; difference = 24.7****)
Evidence of postsecondary entrance within +4 of EHSGY
- A: xb T = 75.6; xb C = 67.5; difference = 8.2*** (unweighted: xb T = 78.2; xb C = 68.7; difference = 9.5****)
- B: xb T = 74.8; xb C = 51.4; difference = 23.5*** (unweighted: xb T = 77.7; xb C = 54.1; difference = 23.6****)
- C: xb T = 76.5; xb C = 64.4; difference = 12.1**** (unweighted: xb T = 78.4; xb C = 68.2; difference = 10.2****)
- D: xb T = 75.9; xb C = 47.8; difference = 28.1**** (unweighted: xb T = 77.8; xb C = 53.7; difference = 24.1****)
*/**/***/**** Significant at 0.10/0.05/.01/.00 level. UB = regular Upward Bound; UBMS = Upward Bound Math Science; T = Treatment; C = Control or comparison; xb = linear prediction from STATA ivreg instrumental variables regression. Odds ratio = prT(1-prC)/prC(1-prT).
NOTE: Unweighted data given in parentheses. Please see table 4 for detailed notes.
SOURCE: Data tabulated (January 2008) by the Policy and Program Studies Service (PPSS) using data from the National Evaluation of Upward Bound study files, baseline through 4th follow-up, and Federal Aid Application and Pell Award Files 1994-95 to 2003-04.
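The odds ratio formula in the table note, prT(1-prC)/prC(1-prT), can be written out as a small function. The sketch below assumes the rates are converted from percents to proportions before use; the function name is illustrative, and the example values are the weighted +1 entrance rates from the first column of Table 5.

```python
def odds_ratio(pr_t, pr_c):
    """Odds ratio = prT(1 - prC) / (prC(1 - prT)), with rates as proportions."""
    return (pr_t * (1 - pr_c)) / (pr_c * (1 - pr_t))


# Weighted +1 entrance rates from the first column of Table 5, as proportions.
print(round(odds_ratio(0.744, 0.653), 2))
```

An odds ratio above 1 means the odds of the outcome are higher in the treatment group than in the comparison group; equal rates give exactly 1.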
Sub-Group Analyses
Bottom 20 percent on academic indicators
- Large positive significant effects for: postsecondary entrance; application for financial aid; award of any postsecondary degree
- Not for BA degree: too few achieved it to compare treatment and control
Top 80 percent on academic indicators
- Moderate positive significant effects for: postsecondary entrance; application for financial aid; award of any postsecondary degree; BA degree within +6
Impact Estimates from Two Stage Instrumental Variables Regression for Percent Obtaining a BA in +6 years based on UB Random Assignment Evaluation
[Figure: paired horizontal bars (Comparison, Treatment) on a 0.0% to 25.0% scale.]
- UB/UBMS participation, Treatment on the Treated (TOT/CACE) (outlier removed): comparison 14.1%, treatment 21.1%, difference 7.0****, 50% increase
- UB/UBMS compared with other non-UB/UBMS service only (outlier removed): comparison 15.2%, treatment 21.0%, difference 5.8***, 39% increase
- Any pre-college with academic component compared with no pre-college service reported (outlier removed): comparison 6.5%, treatment 20.9%, difference 14.4***, 223% increase
Note: All estimates significant at the .01 level or higher. Estimates based on 66 of 67 projects in sample, representing 74 percent of UB at the time of the study. One project removed due to introducing bias into estimates and representational issues.
Random Assignment National Evaluation of Upward Bound (UB): Data on Estimated Increase in Lifetime Taxes Paid Compared to Program Cost per Participant
- Taxes are 4.9 to 5.9 times the cost of participation
Sources and Assumptions:
- *UB Evaluation Data. Estimates are based on estimated differences in educational attainment between the treatment and control groups from the random assignment study, which followed the sample for 6 to 10 years after expected high school graduation
- The $41,495 figure is based on impact estimates from the final Fifth Follow-up Survey, using outcome variables derived by Mathematica Policy Research with weights adjusted for survey non-response
- The $36,493 estimate is based on outcome variables from the longitudinal file standardized by expected high school graduation date
- Treatment on the Treated (TOT) estimates are based on instrumental variables regression modeling for 66 of the 67 projects, representing 74 percent of the sample. One project of the 67 was excluded because it was found to be ineligible to represent its stratum and also had large imbalances between the treatment and control groups that, because of its extreme weight, introduced bias into previously published overall estimates
- *Lifetime earnings and taxes data: US Census Bureau, The Big Payoff: Educational Attainment and Synthetic Estimates of Work-Life Earnings, Current Population Reports, July 2002, Jennifer Day and Eric Newburger; College Board, Education Pays: The Benefits of Higher Education for Individuals and Society, 2007
- **Cost of UB program per participant: US Department of Education data on the average cost of UB for one year ($4,900); assumes the average participant uses about 1.5 times this level of resources
Support for Timely Review of the Correction Request will be needed
Ways to Support the Request for Correction:
- Public statement of the fact of submission and the reasons for it
- Statement requesting timely review by ED, signed by stakeholders and evaluators
- Holding panels discussing the issues at major education and evaluation associations (wider issues of evaluation methods, use and transparency)
- Accountability of the evaluator contractors and ed. issues
How could problems have been avoided in the first place?
- Follow existing standards!
- Caution about trying to do too much: the study chose a difficult and atypical design combining probability sampling with experimental design, which led to serious issues, made worse by mistakes and by a general lack of awareness of sampling and non-sampling study errors and their role in impact estimation
- The sample design was flawed from the start, with serious unequal weighting; follow established standards for sample design
- Representation issues: the contractor did not adequately check representation of the stratum and did not fully reveal the issues when they were discovered
- Lack of care in analysis: outcome measures were not standardized to expected high school graduation year, which spanned 5 years
- Lack of checking of treatment and control group balance (equivalency on key attributes); faith was placed in random assignment to ensure it
- Failure to respect stakeholder concerns about control group contamination and other issues, and the technical monitor's legitimate concerns about representation and treatment-control imbalance; those raising them were repeatedly dismissed as non-objective advocates
Serious Problems with Doing Nothing about Report
1. ED continues to officially misrepresent the impact of UB
2. The UB program's reputation continues to be hurt by the evaluation, and stakeholders have officially objected; this could have serious consequences in Congress
3. Missed opportunity to build on the program's successes and find ways to strengthen and adapt the program to achieve the nation's goals of increased postsecondary access and completion
4. Evaluation research as a whole suffers from not correcting mistakes made and learning from them
How to Correct Report?
- It is correctable and can provide useful information
- Do not try to represent the entire population of interest with the study (remove project 69 and represent 74 percent); the IES reviewer stated that estimates are robust for the other 66 projects taken together
- Standardize outcomes to expected high school graduation year
- Use NSC data only for BA degree, not for less than BA and not for postsecondary entrance
Next Steps in Evaluation
- Partnership model among stakeholders
- Use more innovative evaluation methods (collaborative, participatory, empowerment, utilization, systems analysis, culturally responsive evaluation)
- Utilize resources and leverage the academic institutional research offices of grantees
- Focus on program improvement rather than up or down
- Open and transparent sharing
- Build capacity for self-evaluation and accountability
- Utilization of standards for statistical research and program evaluation
Invitation to Research & Further Additional Information
- The full text of the COE Request for Correction can be found at http://www.coenet.us/files/spotlight-COE_Request_for_Correction_of_Mathematica_Report_011812.pdf
- Statement of concern by leading researchers in the field: http://www.coenet.us/files/spotlight-Statement_of_Concern_011812.pdf
- Results of the re-analysis detailing study error issues can be found at http://www.coenet.us/files/files-Do_the_Conclusions_Change_2009.pdf
- Information on obtaining the restricted-use UB data files for additional research can be obtained by contacting Sandra.Furey@ed.gov
Contact Information
Margaret.Cahalan@pellinstitute.org
202-347-7430 ext. 212
301-642-4851