The document analyzes 1,260 Data Management Plans (DMPs) from NSF grant proposals submitted at the University of Illinois between July 2011 and November 2013. It finds that most proposals planned to store data on PI servers, PI websites, or campus resources such as IDEALS. There were no significant differences in storage venue between funded and unfunded proposals, but more recent plans were more likely to name IDEALS and disciplinary repositories for data storage and sharing. This suggests an increasing role for libraries, universities, and disciplines in research data management.
RDAP14: An analysis and characterization of DMPs in NSF proposals from the Un...ASIS&T
Research Data Access and Preservation Summit, 2014
San Diego, CA
March 26-28, 2014
Lightning Talks
William Mischo, University of Illinois at Urbana-Champaign
An analysis and characterization of DMPs in NSF proposals from the University of Illinois (#RDAP14)
1. An Analysis and Characterization of
DMPs in NSF Proposals from the
University of Illinois
RDAP14 Research Data Access & Preservation
Summit
March 26, 2014
William H. Mischo, Mary C. Schlembach, &
Megan N. O’Donnell
University of Illinois at Urbana-Champaign
Iowa State University
2. NSF Data Management Plans
• Data Management Plans (DMPs): a required
element of NSF proposals since January 2011
• July 2011: the Library, working with the campus
Office of Sponsored Programs and Research
Administration (OSPRA), began an analysis of
DMPs in submitted NSF grant proposals
• To date, 1,600 grants have been looked at, with
1,260 included in the analysis
3. Reasons for Analysis
• What storage venues and mechanisms for
sharing and reuse are being used?
• Are the PIs using local templates and local
campus resources such as IDEALS?
4. Follow-on
• Develop campus-wide infrastructure (Research
Data Service - RDS)
• Assist in compliance with federal agencies
• Develop important partnerships with campus
units (CITES, NCSA, Colleges) and national
entities
• Develop best practices and standard approaches
5. Analysis
• Analysis attempts to characterize and classify
DMPs into categories
• DMPs assigned multiple categories
• 1,260 DMPs from July 2011 to November 2013
6. Categories
• PI Server – Servers and workstations that the PIs
(and their students/staff) use to store project
data.
laboratory server/workstations, external hard drives, group
computer
• PI Website – Websites edited or administered
by the PI or a group they belong to.
Examples: lab website, project website, wiki, PI’s website
7. Categories
• Campus – Services located at, operated by,
or endorsed by Illinois.
IDEALS, Netfiles and Box.net, NCSA, and Beckman
Institute.
• Department – Used when a department was
specifically mentioned as providing a storage or
hosting resource.
Departmental website, departmental server, departmental
backup service or a web address traced back to an
academic department (also given the “campus” label)
8. Categories
• Remote – Services and sites not located on the
Illinois campus.
NASA, other campuses, collaborative projects, non-Illinois
institutes
• Disciplinary – Disciplinary repositories.
GenBank, arXiv, ICPSR, SEAD, Nanohub, and Dryad
• Cloud – Storage services using cloud technology.
Google Drive, Google Code, Box.net, Amazon, Microsoft,
Dropbox
9. Categories
• Publication - Scholarly outputs.
Journal articles, workshops, and conference
presentations/posters.
• Analog - Physical records/data.
Lab notebooks, photographs, files
• Specimens - Physical specimens.
Usually biological or artifacts
10. Categories
• Optical Disc - DVD, CD, and Blu-ray discs.
• Not specified – the DMP was not specific
enough for us to categorize further.
• No Data – Indicated the proposal will produce
no data products.
• Local Template Used – the DMP used a
library-authored template.
11. ALL DMPs (n=1,260)
Category        Number   Percent
PI Server          503     39.9%
PI Website         529     41.9%
Campus             667     52.9%
Department         142     11.2%
Remote             353     28%
Disciplinary       275     21.8%
Publication        556     44.1%
Cloud               63     5%
Optical Disc        56     4%
Analog             131     10.4%
Specimens          111     8.8%
Not Specified       66     5.2%
Collaborative      164     13%
No Data            103     8.2%
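Because each DMP could be assigned several categories, the counts in the table above add up to more than the number of DMPs. The following is a minimal Python sketch of that arithmetic, using only the counts reported on the slide; the variable names are illustrative and not part of the original analysis.

```python
# Category counts as reported on the slide (n = 1,260 DMPs).
N_DMPS = 1260
counts = {
    "PI Server": 503, "PI Website": 529, "Campus": 667, "Department": 142,
    "Remote": 353, "Disciplinary": 275, "Publication": 556, "Cloud": 63,
    "Optical Disc": 56, "Analog": 131, "Specimens": 111, "Not Specified": 66,
    "Collaborative": 164, "No Data": 103,
}

# Total number of category assignments across all DMPs.
total_assignments = sum(counts.values())

# Average number of categories assigned per DMP.
mean_per_dmp = total_assignments / N_DMPS

print(f"{total_assignments} assignments, {mean_per_dmp:.2f} per DMP on average")
```

The total exceeds 1,260 because the categories are not mutually exclusive, which is also why the percentages in the table do not sum to 100%.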
12. Data Venue and Risk
Risk = Risk of Loss/Corruption/Breach
Data Location                    Submitted Proposals  Risk            Funded Proposals  Risk
                                 (n=1260)                             (n=298)
PI Server/Website                64%                  High            61%               High
Departmental Server/Website      11.2%                Medium to High  7%                Medium to High
Campus-Wide Resource             52.9%                Low             45%               Low
  IDEALS (Institutional Repos.)  21.9%                                19.8%
  NCSA                           4.3%                                 16.4%
Disciplinary Repository/Cloud    25.8%                Medium to Low   21.4%             Medium to Low
Remote Repository                28%                  Medium to High  22.8%             Medium to High
Optical Disk, Specimens, Analog  19.4%                Out of Scope    11%               Out of Scope
13. Notables
• Funded: 298
• Used local template: 254
• Only 87 DMPs contained information about
file types
• IDEALS: 275
• NCSA/XSEDE: 55
• Dryad: 22
• ICPSR: 17
• GenBank: 55
• arXiv: 61
14. Analysis
• Any differences in storage venue or technologies
between the unfunded proposals and the funded
proposals?
• Any differences between the proposals from the
first year and the more current proposals?
• Other differences in proposal categories
between funded and unfunded proposals?
• 734 active NSF awards, $861.8 million
15. Analysis: Funded vs. Not-funded
• IDEALS institutional repository:
62 funded, 197 not funded: chi-square: 0.17
• Storing data on PI server or website:
183 funded, 569 not funded: chi-square: 0.7
• Disciplinary or Cloud:
67 funded, 241 not funded: chi-square: 0.85
• Remote storage:
68 funded, 267 not funded: chi-square: 3.01
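The slide does not show how the contingency tables behind these statistics were built. As a hedged illustration of a chi-square test of independence, the Python sketch below assumes a 2x2 table for each venue (venue named or not, funded or not), with 298 funded DMPs out of 1,260 as reported on the earlier slides and every other DMP treated as "not funded". The function name and the table layout are assumptions, so the resulting statistics are not expected to match the slide values exactly.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

FUNDED, TOTAL = 298, 1260        # totals reported on the slides
UNFUNDED = TOTAL - FUNDED        # assumption: all remaining DMPs count as "not funded"
CRITICAL_05_DF1 = 3.841          # chi-square critical value for df = 1, alpha = 0.05

# (funded, not funded) counts of DMPs naming each venue, as reported on the slide.
venues = {
    "IDEALS": (62, 197),
    "PI server or website": (183, 569),
    "Disciplinary or Cloud": (67, 241),
    "Remote storage": (68, 267),
}

results = {}
for venue, (used_funded, used_unfunded) in venues.items():
    stat = chi_square_2x2(
        used_funded, FUNDED - used_funded,        # funded: named venue / did not
        used_unfunded, UNFUNDED - used_unfunded,  # not funded: named venue / did not
    )
    results[venue] = stat
    print(f"{venue}: chi-square = {stat:.2f}, significant = {stat > CRITICAL_05_DF1}")
```

Under these assumptions all four statistics fall below the 3.841 threshold, as do the four values reported on the slide, which is consistent with the conclusion that storage venue did not differ significantly between funded and unfunded proposals.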
16. Analysis
• Use of IDEALS
before August 2012 = 108
after (thru November 2013) = 166
chi-square: 4.59, p < .05
• Use of Disciplinary or Cloud
before August 2012 = 121
after = 182
chi-square: 4.33, p < .05
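To relate the reported statistics to the stated significance level, a chi-square value can be converted to a p-value. The sketch below assumes one degree of freedom (a 2x2 table: venue named or not, before or after August 2012), which the slides do not state explicitly; for one degree of freedom the upper-tail probability has the closed form erfc(sqrt(x/2)). The function name is illustrative.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

# Chi-square statistics as reported on the slide.
reported = {"IDEALS": 4.59, "Disciplinary or Cloud": 4.33}

p_values = {name: p_value_df1(stat) for name, stat in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}, below 0.05 = {p < 0.05}")
```

Both reported statistics exceed the 3.841 critical value for one degree of freedom, so under this assumption both p-values fall below 0.05, in line with the "p < .05" stated on the slide.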
17. Implications and Conclusions
1. No significant differences between
funded/unfunded proposals in storage venues:
no advantage for IDEALS or disciplinary repositories.
2. More recent proposals include IDEALS and
disciplinary repositories at a
significantly higher rate
• What is the role of the library? The campus?
The subject discipline?
• Connecting data to the literature is important
Editor's Notes
Took out (covered in keynote)
- Make key research data available and sharable
- Allow the use of data for verification of results and reproducibility of research work
- Agency can show significant return on investment to justify funding
to support Illinois researchers in managing their data
Very few DMPs were explicit as to how their “publications” and data were related or separated.
No data: Many were theoretical studies (math), travel grants, or workshop planning sessions.