RDAP14: An analysis and characterization of DMPs in NSF proposals from the University of Illinois
 


Research Data Access and Preservation Summit, 2014

San Diego, CA
March 26-28, 2014
Lightning Talks

William Mischo, University of Illinois at Urbana-Champaign

Usage Rights

© All Rights Reserved

    Presentation Transcript

    • An Analysis and Characterization of DMPs in NSF Proposals from the University of Illinois RDAP14 Research Data Access & Preservation Summit March 26, 2014 William H. Mischo, Mary C. Schlembach, Megan A. O’Donnell University of Illinois at Urbana-Champaign Iowa State University
    • NSF Data Management Plans • Data Management Plans (DMPs): a required element of NSF proposals since January 2011 • July 2011: the Library, working with the campus Office of Sponsored Programs and Research Administration (OSPRA), began an analysis of DMPs in submitted NSF grant proposals • To date, 1,600 grants have been examined, with 1,260 included in the analysis
    • Reasons for DMPs • Make key research data available and sharable • Allow the use of data for verification of results and reproducibility of research work • Agency can show significant return on investment to justify funding • We want to know storage venues and mechanisms for sharing and reuse • We also want to know about the use of local templates and local campus resources such as IDEALS
    • Follow-on • Develop campus-wide infrastructure (Research Data Service - RDS) to support UIUC researchers in managing their data • Assist in compliance with federal agencies • Develop important partnerships with campus units (CITES, NCSA, Colleges) and national entities • Develop best practices and standard approaches
    • Analysis • The analysis attempts to characterize and classify DMPs into categories • A DMP may be assigned multiple categories • 1,260 DMPs from July 2011 to November 2013
    • Categories • PI Server – Servers and workstations that the PIs (and their students/staff) use to store project data. Examples: laboratory server, external hard drive, and group computer. • PI Website – Websites edited or administered by the PI or a group they belong to. If a departmental URL was given, it was also given the term “department.” Examples: lab website, project website, wiki, PI’s website
    • Categories • Campus – Services located at, operated by, or endorsed by UIUC. This includes IDEALS, netfiles and Box.net, NCSA, and Beckman. • Department – Used when a department was specifically mentioned as providing a storage or hosting resource. Examples: departmental website, departmental server, departmental backup service, or a web address traced back to an academic department. Also given the “campus” label.
    • Categories • Remote – Services and sites not located on the UIUC campus. Examples: NASA, other campuses, collaborative projects, non-UIUC institutes • Disciplinary – Disciplinary repositories. Many are open access but not all. Examples: GenBank, arXiv, ICPSR, SEAD, Nanohub, and Dryad • Cloud – Storage services using cloud technology. Examples: Google Documents, Google Code, Box.net, Amazon, Microsoft, Dropbox
    • Categories • Publication – Scholarly outputs including journal articles, workshops, and conference presentations or posters. Very few DMPs were explicit as to how their “publications” and data were related or separated. • Analog – Physical records including lab notebooks, photographs, and files. Does not include specimens or artifacts. • Specimens – Physical specimens; usually biological or artifacts
    • Categories • Optical Disc – DVD, CD, and Blu-ray discs. Often used as a backup mechanism. • Not Specified – The DMP was not specific enough for us to record details. • No Data – Indicated the proposal will produce no data products. Many were theoretical studies (math), travel grants, or workshop planning sessions. • Local Template Used
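The category definitions above imply two labeling rules: a PI website at a departmental URL also gets the "department" term, and anything labeled "department" also gets the "campus" label. A minimal sketch of how such multi-label tagging could be expressed follows. The function name `expand_categories` and the `departmental_url` flag are hypothetical, introduced only for illustration; the slides do not say how the coding was actually recorded.

```python
def expand_categories(assigned, departmental_url=False):
    """Return the assigned category labels plus any labels they imply.

    `assigned` is an iterable of category names taken from the slides.
    `departmental_url` is a hypothetical flag meaning "the website given
    in the DMP sits at a departmental URL".
    """
    labels = set(assigned)

    # Rule from the "PI Website" definition: a departmental URL also
    # earns the "Department" term.
    if departmental_url and "PI Website" in labels:
        labels.add("Department")

    # Rule from the "Department" definition: department resources are
    # also given the "Campus" label.
    if "Department" in labels:
        labels.add("Campus")

    return labels


# A DMP naming a lab website hosted under a departmental URL, plus GenBank.
example = expand_categories({"PI Website", "Disciplinary"}, departmental_url=True)
print(sorted(example))
```

Because one DMP can carry several labels, the per-category counts in the results overlap and their percentages are not expected to sum to 100%.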
    • All DMPs (including “no data”), n = 1260
      Category        Number   Percent
      PI Server       503      39.9%
      PI Website      529      41.9%
      Campus          667      52.9%
      Department      142      11.2%
      Remote          353      28.0%
      Disciplinary    275      21.8%
      Publication     556      44.1%
      Cloud           63       5.0%
      Optical Disc    56       4.0%
      Analog          131      10.4%
      Specimens       111      8.8%
      Not Specified   66       5.2%
      Collaborative   164      13.0%
      No Data         103      8.2%
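The percentages in the table above appear to be each category count divided by the 1,260 DMPs analyzed. The sketch below recomputes them under that assumption and lists any row whose reported percentage differs from the recomputed one by more than a tolerance. The names `percent_of_total` and `rows_off_by_more_than` and the 0.1-point tolerance are choices made for this example, not part of the original analysis.

```python
TOTAL_DMPS = 1260  # n from the slide

# (count, reported percent) copied from the slide
reported = {
    "PI Server": (503, 39.9),
    "PI Website": (529, 41.9),
    "Campus": (667, 52.9),
    "Department": (142, 11.2),
    "Remote": (353, 28.0),
    "Disciplinary": (275, 21.8),
    "Publication": (556, 44.1),
    "Cloud": (63, 5.0),
    "Optical Disc": (56, 4.0),
    "Analog": (131, 10.4),
    "Specimens": (111, 8.8),
    "Not Specified": (66, 5.2),
    "Collaborative": (164, 13.0),
    "No Data": (103, 8.2),
}


def percent_of_total(count, total=TOTAL_DMPS):
    """Share of all analyzed DMPs, in percent."""
    return 100.0 * count / total


def rows_off_by_more_than(tolerance=0.1):
    """Names of rows whose reported percent differs from count/total."""
    return [
        name
        for name, (count, pct) in reported.items()
        if abs(percent_of_total(count) - pct) > tolerance
    ]


print(rows_off_by_more_than())
```

Run as written, this flags only the Optical Disc row: 56 of 1,260 is about 4.4%, against the 4.0% shown. The slide does not indicate whether the count or the percentage is the one in error.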
    • Data Venue and Risk
      Submitted proposals since July 2011: n = 1260. Funded proposals: n = 298. Risk = risk of loss, corruption, or breach.
      Data Location                     Submitted   Risk             Funded   Risk
      PI Server/Website                 64%         High             61%      High
      Departmental Server/Website       11.2%       Medium to High   7%       Medium to High
      Campus-Wide Resource              52.9%       Low              45%      Low
      IDEALS Institutional Repository   21.9%                        19.8%
      NCSA                              4.3%                         16.4%
      Disciplinary Repository/Cloud     25.8%       Medium to Low    21.4%    Medium to Low
      Remote Repository                 28%         Medium to High   22.8%    Medium to High
      Optical Disc, Specimens, Analog   19.4%       Out of Scope     11%      Out of Scope
    • Notables • Funded: 298 • Used locally developed template: 254 • IDEALS: 275 • NCSA/XSEDE: 55 • Dryad: 22 • ICPSR: 17 • GenBank/Genetics Repository: 55 • arXiv: 61 • Only 87 DMPs contained information about file types
    • Analysis • Are there any differences in storage venue or technologies between the unfunded and funded proposals? • Are there any differences between proposals from the first year and more recent proposals? • Differences between funded and unfunded proposals can be examined for any of the proposal categories • 734 active NSF awards, $861.8 million
    • Analysis • Use of IDEALS institutional repository: 62 funded, 197 not funded; chi-square = 0.17 • Storing data on PI server or website: 183 funded, 569 not funded; chi-square = 0.7 • Disciplinary or Cloud: 67 funded, 241 not funded; chi-square = 0.85 • Remote storage: 68 funded, 267 not funded; chi-square = 3.01
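The funded versus not-funded comparisons above are the kind of question a chi-square test of independence on a 2x2 table answers. The sketch below computes the Pearson statistic without continuity correction. The slides do not give the full contingency tables, so the example fills in the missing cells by assuming 298 funded and 962 unfunded proposals (1,260 minus 298); that assumption, and the function name `chi_square_2x2`, are mine. With those assumed totals the statistic will generally not equal the value on the slide, which may have used different denominators (for example, excluding “no data” proposals) or a continuity correction.

```python
CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = .05


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b],
     [c, d]].
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# ASSUMED totals, not stated on the slide: 298 funded, 962 unfunded.
FUNDED, UNFUNDED = 298, 1260 - 298

# IDEALS: 62 funded and 197 unfunded DMPs named it (from the slide);
# the "did not name it" cells are derived from the assumed totals.
stat = chi_square_2x2(62, FUNDED - 62, 197, UNFUNDED - 197)
print(f"chi-square = {stat:.3f}; significant at .05: {stat > CRITICAL_DF1_P05}")
```

All four statistics reported on the slide fall below 3.841, which is consistent with the stated conclusion of no significant difference between funded and unfunded proposals.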
    • Analysis • Use of IDEALS: before August 2012 = 108, after (through November 2013) = 166; chi-square = 4.59, p < .05 • Use of disciplinary or Cloud: before August 2012 = 121, after = 182; chi-square = 4.33, p < .05
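The "p < .05" statements above can be checked directly from the reported statistics if each test has one degree of freedom, which is an assumption here since the slides do not state it. For one degree of freedom the upper-tail probability of a chi-square value x is erfc(sqrt(x / 2)), so only the standard library is needed. The function name `p_value_df1` is hypothetical.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


# Statistics as reported on the slides.
reported_statistics = {
    "IDEALS, before vs. after August 2012": 4.59,
    "Disciplinary or Cloud, before vs. after August 2012": 4.33,
    "Remote storage, funded vs. not funded": 3.01,
}

for label, stat in reported_statistics.items():
    p = p_value_df1(stat)
    print(f"{label}: chi-square = {stat}, p = {p:.3f}, p < .05: {p < 0.05}")
```

Under that assumption the two before/after comparisons fall below .05 and the remote-storage comparison does not, matching how the slides report them.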
    • Implications • Conclusions: (1) no significant differences in storage venues between funded and unfunded proposals, so no apparent advantage from naming IDEALS or a disciplinary repository; (2) more recent proposals include IDEALS and disciplinary repositories at a significantly higher rate • What is the role of the library? The campus? The subject discipline? • Connecting data to the literature is important