The document analyzes 1,260 Data Management Plans (DMPs) from NSF grant proposals submitted at the University of Illinois between July 2011 and November 2013. It finds that most proposals planned to store data on PI servers, PI websites, or campus resources such as IDEALS. There were no significant differences in storage venue between funded and unfunded proposals, but more recent plans were significantly more likely to name IDEALS and disciplinary repositories for data storage and sharing. This suggests an increasing role for libraries, universities, and disciplines in research data management.
Analysis of NSF Data Management Plans from University of Illinois
1. An Analysis and Characterization of
DMPs in NSF Proposals from the
University of Illinois
RDAP14 Research Data Access & Preservation
Summit
March 26, 2014
William H. Mischo, Mary C. Schlembach, &
Megan N. O’Donnell
University of Illinois at Urbana-Champaign
Iowa State University
2. NSF Data Management Plans
• Data Management Plans (DMPs): a required
element in NSF proposals since January 2011
• July 2011: the Library, working with the campus
Office of Sponsored Programs and Research
Administration (OSPRA), began an analysis of
DMPs in submitted NSF grant proposals
• To date, 1,600 grants have been reviewed, with
1,260 included in the analysis.
3. Reasons for Analysis
• What storage venues and mechanisms for
sharing and reuse are being used?
• Are PIs using local templates and local
campus resources such as IDEALS?
4. Follow-on
• Develop campus-wide infrastructure (Research
Data Service - RDS)
• Assist in compliance with federal agencies
• Develop important partnerships with campus
units (CITES, NCSA, Colleges) and national
entities
• Develop best practices and standard approaches
5. Analysis
• Analysis attempts to characterize and classify
DMPs into categories
• DMPs assigned multiple categories
• 1,260 DMPs from July 2011 to November 2013
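Because each DMP can carry several categories, category percentages are shares of all DMPs and can add up to more than 100%. A minimal sketch of that kind of tally follows. The sample records and the names `coded_dmps` and `tally_categories` are hypothetical illustrations, not the authors' data or tooling.

```python
from collections import Counter

# Hypothetical coded DMPs: each plan carries every category that applies to it.
coded_dmps = [
    {"PI Server", "Publication"},
    {"Campus", "Disciplinary", "Publication"},
    {"PI Website", "Campus"},
    {"No Data"},
]


def tally_categories(dmps):
    """Return {category: (count, percent of all DMPs)} for multi-label records."""
    counts = Counter(label for labels in dmps for label in labels)
    total = len(dmps)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


summary = tally_categories(coded_dmps)

# Print categories from most to least frequent.
for label, (n, pct) in sorted(summary.items(), key=lambda item: -item[1][0]):
    print(f"{label:<14}{n:>4}{pct:>7.1f}%")
```

The denominator is the number of DMPs, not the number of labels, which is why the percent column is not expected to sum to 100.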
6. Categories
• PI Server – Servers and workstations that the PIs
(and their students/staff) use to store project
data.
laboratory server/workstations, external hard drives, group
computer
• PI Website – Websites edited or administered
by the PI or a group they belong to.
Examples: lab website, project website, wiki, PI’s website
7. Categories
• Campus – Services located at, operated by,
or endorsed by Illinois.
IDEALS, Netfiles and Box.net, NCSA, and Beckman
Institute.
• Department – Used when a department was
specifically mentioned as providing a storage or
hosting resource.
Departmental website, departmental server, departmental
backup service or a web address traced back to an
academic department (also given the “campus” label)
8. Categories
• Remote – Services and sites not located on the
Illinois campus.
NASA, other campuses, collaborative projects, non-Illinois
institutes
• Disciplinary – Disciplinary repositories.
GenBank, arXiv, ICPSR, SEAD, Nanohub, and Dryad
• Cloud – Storage services using cloud technology.
Google Drive, Google Code, Box.net, Amazon, Microsoft,
Dropbox
9. Categories
• Publication - Scholarly outputs.
Journal articles, workshops, and conference
presentations/posters.
• Analog - Physical records/data.
Lab notebooks, photographs, files
• Specimens - Physical specimens.
Usually biological or artifacts
10. Categories
• Optical Disc - DVD, CD, and Blu-ray discs.
• Not specified – the DMP was not specific
enough for us to categorize further.
• No Data – Indicated the proposal will produce
no data products.
• Local Template Used – used a library-authored
template.
11. All DMPs by Category (n=1,260)
Category        Number  Percent
PI Server       503     39.9%
PI Website      529     41.9%
Campus          667     52.9%
Department      142     11.2%
Remote          353     28%
Disciplinary    275     21.8%
Publication     556     44.1%
Cloud           63      5%
Optical Disc    56      4%
Analog          131     10.4%
Specimens       111     8.8%
Not Specified   66      5.2%
Collaborative   164     13%
No Data         103     8.2%
12. Data Venue and Risk
Risk = risk of loss, corruption, or breach
Data Location                       Submitted (n=1260)   Risk             Funded (n=298)   Risk
PI Server/Website                   64%                  High             61%              High
Departmental Server/Website         11.2%                Medium to High   7%               Medium to High
Campus-Wide Resource                52.9%                Low              45%              Low
  IDEALS (Institutional Repos.)     21.9%                Low              19.8%            Low
  NCSA                              4.3%                 Low              16.4%            Low
Disciplinary Repository/Cloud       25.8%                Medium to Low    21.4%            Medium to Low
Remote Repository                   28%                  Medium to High   22.8%            Medium to High
Optical Disc, Specimens, Analog     19.4%                Out of Scope     11%              Out of Scope
13. Notables
• Funded: 298
• Used local
template: 254
• Only 87 DMPs
contained
information about
file types
• IDEALS: 275
• NCSA/XSEDE: 55
• Dryad: 22
• ICPSR: 17
• GenBank: 55
• arXiv: 61
14. Analysis
• Any differences in storage venue or technologies
between the unfunded proposals and the funded
proposals?
• Any differences between the proposals from the
first year and the more current proposals?
• Other differences in proposal categories
between funded and unfunded
• 734 active NSF awards, $861.8 million
15. Analysis: Funded vs. Not-funded
• IDEALS institutional repository:
62 funded, 197 not funded: chi-square: 0.17
• Storing data on PI server or website:
183 funded, 569 not funded: chi-square: 0.7
• Disciplinary or Cloud:
67 funded, 241 not funded: chi-square: 0.85
• Remote storage:
68 funded, 267 not funded: chi-square: 3.01
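For readers who want to rerun this kind of comparison, here is a minimal sketch of a Pearson chi-square statistic for a 2x2 table (venue named or not, by funded or not funded). It assumes group sizes of 298 funded and 962 not funded (1,260 minus 298) and no continuity correction. The slide states neither the denominators nor the exact test variant, so the output is not expected to match the reported statistics exactly. The function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Assumed group sizes (not stated on this slide): 298 funded, 1,260 - 298 not funded.
FUNDED_TOTAL = 298
NOT_FUNDED_TOTAL = 1260 - FUNDED_TOTAL


def venue_statistic(funded_using, not_funded_using):
    """Chi-square for one venue: users vs. non-users, split by funding outcome."""
    return chi_square_2x2(
        funded_using, FUNDED_TOTAL - funded_using,
        not_funded_using, NOT_FUNDED_TOTAL - not_funded_using,
    )


# Counts of proposals naming each venue, as listed on the slide.
venues = [
    ("IDEALS", 62, 197),
    ("PI server or website", 183, 569),
    ("Disciplinary or Cloud", 67, 241),
    ("Remote storage", 68, 267),
]

for name, funded, not_funded in venues:
    print(f"{name}: chi-square = {venue_statistic(funded, not_funded):.2f}")
```

With one degree of freedom, a statistic needs to exceed roughly 3.84 to be significant at the 0.05 level.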
16. Analysis
• Use of IDEALS
before August 2012 = 108
after (thru November 2013) = 166
chi-square: 4.59, p < .05
• Use of Disciplinary or Cloud
before August 2012 = 121
after = 182
chi-square: 4.33, p < .05
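The reported statistics can be turned into p-values with a short calculation. The sketch below assumes one degree of freedom, as for a 2x2 comparison. The slide does not state the degrees of freedom, so that is an assumption, and the function name is hypothetical.

```python
import math


def chi_square_p_value_1df(statistic):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # With 1 df, chi-square is the square of a standard normal variable,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistics reported on the slide for the before/after August 2012 comparison.
reported = [
    ("Use of IDEALS", 4.59),
    ("Use of Disciplinary or Cloud", 4.33),
]

for label, stat in reported:
    p = chi_square_p_value_1df(stat)
    print(f"{label}: chi-square = {stat}, p = {p:.3f}")
```

Under that assumption both statistics fall below the 0.05 threshold, in line with the "p < .05" reported on the slide.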
17. Implications and Conclusions
1. No significant differences between
funded/unfunded proposals in storage venues:
no advantage for IDEALS or disciplinary repositories.
2. More recent proposals include IDEALS and
disciplinary repositories at a
significantly higher rate
• What is the role of the library? The campus?
The subject discipline?
• Connecting data to the literature is important
Editor's Notes
Took out (covered in keynote)
- Make key research data available and sharable
- Allow the use of data for verification of results and reproducibility of research work
- Agency can show significant return on investment to justify funding
to support Illinois researchers in managing their data
Very few DMPs were explicit as to how their “publications” and data were related or separated.
No data: Many were theoretical studies (math), travel grants, or workshop planning sessions.