Pece - Putting the cart before a lame horse

Parallel session 5 - Monday 19 September 2016

1. Putting the Cart Before a Lame Horse: A case study for future initiatives to automate the use of administrative records for reporting government R&D
   Christopher Pece
   OECD Blue Sky Forum on Science and Innovation Indicators
   19 September 2016
   National Science Foundation, National Center for Science and Engineering Statistics
   www.nsf.gov/statistics/

2. Outline
   • Background
   • State of Federal R&D Reporting
   • Recommendations to Improve R&D Reporting
   • Obtaining Agency Administrative Data
   • Data Tagging
   • Clone File Approach and Findings
   • Developing Government-wide Standards and Next Steps

3. Background
   • The Federal government is the second largest funder of R&D in the U.S. ($127 bil. FY 2013); the largest is business and industry ($332 bil. 2013)
   • Two primary data sources on Federal spending on R&D by funder
     – National Center for Science and Engineering Statistics (NCSES) Survey of Federal Funds for Research and Development
     – Office of Management and Budget (OMB) Circular A-11, Section C (budget data)
     – These are in addition to the Federal R&D total summed from performer surveys
   • "Building blocks for virtually every analysis of publicly sponsored U.S. scientific activity and technical activity...used by government, academia, industry, and a host of nonprofit analytical and advocacy groups as the primary source of information about federal spending on research and development." (CNSTAT, 2010)

4. State of Federal R&D Reporting
   • The Committee on National Statistics (CNSTAT) noted several challenges agencies face when completing R&D data calls:
     – Difficulty translating data from the categories in which they are maintained in agency records into the categories required for reporting
     – Inability to identify the portions of industrial contracts that are for R&D, R&D plant, and non-R&D activities
     – Individuals interpret the definitions of basic research, applied research, and experimental development differently
     – Uncertainty in identifying and classifying mission-based activities into Fields of R&D

5. Recommendations to Improve R&D Reporting
   • Three medium/long-term recommendations
     – Incorporate R&D descriptors (tags) into administrative databases to better enable identification of the R&D components of agency or program budgets (CNSTAT, 2010, Rec. 4-1)
     – Use administrative data to test new classification schemata through direct access to intramural spending information from agency databases (CNSTAT, 2010, Rec. 4-2)
     – Develop several demonstration projects to test for the best method to move to a system based at least partly on administrative records (CNSTAT, 2010, Rec. 4-3)

6. Obtaining Agency Administrative Data (1)
   • Met with over 40 representatives from 12 agencies with substantial R&D funding
     – Conducted a series of follow-up interviews with agencies that have more sophisticated accounting and data management systems and processes for identifying R&D
   • The National Science Foundation (NSF) provided files without issue, as did the National Institutes of Health (NIH) following extensive negotiations on the Memorandum of Agreement
   • National Aeronautics and Space Administration (NASA)
     – Not originally considered because of its data management systems, but provided the data files used to complete the data call based on the Budget Officer's interest in improving the agency's R&D data quality

7. Obtaining Agency Administrative Data (2)
   • Department of Defense agencies (DARPA, ONR)
     – Their more sophisticated data management systems and processes for reporting and tracking R&D projects would have made them good test cases for tagging and clone file compilation
     – Due to the presence of classified project data and other national security concerns, these agencies refused to provide any data files
   • The National Institute of Standards and Technology (NIST) had one of the more sophisticated processes
     – An "NSF code" specifies projects as basic or applied research, or experimental development
     – However, NIST requested compensation to provide detailed records

8. Data Tagging
   • Proposed using eXtensible Business Reporting Language (XBRL) for R&D, a derivative of XML designed specifically for financial accounting records
   • An XBRL software module would take R&D transaction records extracted from agency systems and map the existing internal reporting taxonomies to the Federal Funds Survey taxonomy
   • Nearly all agencies assemble their R&D data manually from multiple incompatible data systems
   • Little if any consistency in the reporting classification schemes used by different agencies

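The tagging step described on this slide is essentially a crosswalk from each agency's internal reporting categories to the Federal Funds Survey taxonomy. The sketch below illustrates that mapping in Python rather than XBRL; the program codes, field names, and mappings are hypothetical placeholders, not drawn from any actual agency system or from the survey taxonomy itself.

```python
# Minimal sketch of taxonomy tagging: crosswalk hypothetical internal agency
# program codes to (Type of R&D, Field of R&D) categories used for reporting.
# All codes and mappings below are invented for illustration.
CROSSWALK = {
    "BIO-CORE": ("basic research", "Biological sciences"),
    "CS-APPL": ("applied research", "Computer and information sciences"),
    "ENG-DEV": ("experimental development", "Engineering"),
}

def tag_transaction(record: dict) -> dict:
    """Attach survey-taxonomy tags to one extracted transaction record."""
    type_of_rd, field_of_rd = CROSSWALK.get(
        record["program_code"], ("unclassified", "unclassified")
    )
    return {**record, "type_of_rd": type_of_rd, "field_of_rd": field_of_rd}

# Example: a single transaction pulled from a (hypothetical) accounting system.
print(tag_transaction({"program_code": "BIO-CORE", "obligation_usd": 1_250_000}))
```
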
9. Clone File Approach and Findings
   • Under this approach, information relevant to R&D funding activities was extracted from agencies' central accounting systems and mapped to survey variables to compile R&D statistics
   • This approach provides consistency in the classification of R&D activities across agencies, since NCSES would be responsible for the crosswalk of all data; it also reduces burden on agencies and improves the timeliness of R&D statistics disseminated by NCSES
   • Based on transaction records provided by NSF, NIH, and NASA, NCSES re-compiled the data to re-create each agency's survey response

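Once transaction records carry survey-compatible tags, compiling a clone file largely reduces to aggregating obligations by the tagged categories. A minimal Python sketch, assuming the hypothetical record layout from the tagging example above:

```python
from collections import defaultdict

def compile_clone_file(transactions):
    """Sum obligations by (Type of R&D, Field of R&D) to produce
    survey-style totals from tagged transaction records."""
    totals = defaultdict(float)
    for t in transactions:
        totals[(t["type_of_rd"], t["field_of_rd"])] += t["obligation_usd"]
    return dict(totals)

records = [
    {"type_of_rd": "basic research", "field_of_rd": "Biological sciences", "obligation_usd": 1_250_000},
    {"type_of_rd": "basic research", "field_of_rd": "Biological sciences", "obligation_usd": 400_000},
    {"type_of_rd": "applied research", "field_of_rd": "Engineering", "obligation_usd": 900_000},
]
print(compile_clone_file(records))
# {('basic research', 'Biological sciences'): 1650000.0,
#  ('applied research', 'Engineering'): 900000.0}
```

In practice the hard part is the crosswalk itself rather than the aggregation, as the agency-specific findings on the following slides show.
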
10. Clone File Approach and Findings: NSF
   • NSF provided two main files from the budget office
   • NSF uses a heuristic mechanism for determining the Field of R&D for given sets of projects based on the funding Directorate and Program
   • NCSES was able to re-create all statistics generated from the Federal Funds Survey for NSF using the administrative records data and the heuristics for classifying Field of R&D (FORD)
     – Some small, random errors, most likely from miscoding
     – NSF is not a very complex organization: all of its R&D is extramural and is mostly classified as basic research

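A directorate/program heuristic of the kind described on this slide could be expressed as a small rule table. The sketch below is illustrative only; the directorate codes, program names, and FORD assignments are invented and are not NSF's actual rules.

```python
# Hypothetical Directorate/Program -> Field of R&D (FORD) heuristic.
FORD_BY_DIRECTORATE = {
    "BIO": "Biological sciences",
    "ENG": "Engineering",
    "GEO": "Geosciences",
}
# Program-level overrides take precedence over the directorate-level default.
FORD_BY_PROGRAM = {
    ("GEO", "Ocean Instrumentation"): "Ocean sciences",
}

def classify_ford(directorate: str, program: str) -> str:
    """Assign a FORD category from the funding directorate and program."""
    return FORD_BY_PROGRAM.get(
        (directorate, program),
        FORD_BY_DIRECTORATE.get(directorate, "Unclassified"),
    )

print(classify_ford("GEO", "Ocean Instrumentation"))  # Ocean sciences
print(classify_ford("BIO", "Molecular Biosciences"))  # Biological sciences
```
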
11. Clone File Approach and Findings: NIH
   • NIH is more complex in terms of R&D activities, organizational structure, and information systems
   • Information on Type of R&D is not included in the records; NIH has a proprietary system for disease classification – not FORD
   • Many variables reported on the survey are generated through manual manipulation or compilation by NIH staff and cannot be generated solely from administrative records
   • NCSES could only produce values for NIH-specific data in 65 of the 102 statistical tables produced for the FFS report
   • Administrative data for tracking extramural R&D spending appear to be complete, accurate, and comprehensive, but a comparable level of data is not available for intramural R&D

12. Clone File Approach and Findings: NASA
   • NASA's data systems do not contain the data elements necessary for classifying R&D, such as the identity of R&D performers or whether a given transaction is R&D in the first place
   • Most records are associated with programs and organizations, not a functional area such as research
   • NASA uses internal heuristics to determine whether a project is R&D, the Type of R&D, and the Field of R&D
   • NCSES was able to calculate values for only nine of the 65 Detailed Statistical Tables that are compiled from the survey response; of those nine tables, only six had data that matched the data reported to the survey

13. Developing Government-wide Standards and Next Steps (1)
   • For administrative data to produce comparable statistics on R&D funding, common reporting standards are needed
     – Per CNSTAT: "For purposes of collecting data on research and development statistics in a consistent manner across federal government agencies, it is necessary to establish a common taxonomy that will be useful to the largest number of data providers and users."
     – Creating standards that can be applied across all federal agencies is itself a monumental undertaking; expanding these standards to other sectors critical to measuring R&D is even more so
   • Standardization needs to ensure that human interpretations of the definitions of R&D can be internalized as part of an agency's own internal controls and conveyed to its administrative records data
     – Development of agency-specific heuristic models

14. Developing Government-wide Standards and Next Steps (2)
   • A collaborative effort currently underway by NCSES and OMB to establish a Federal R&D Community-of-Practice (R&D COP) is the best opportunity available at this point to develop:
     – Additional agency-specific heuristic models
     – Standard schemata for all federal R&D data calls
     – Automated reporting options
   • How should we seek to ensure that data tags for R&D are a required component of all federal reporting?
   • How do we ensure the records themselves can capture the specific, detailed information about R&D that policy makers and others require?
   • How do we broaden the standards beyond federal R&D funders?
