CREATING A DATA MANAGEMENT PLAN
November 5, 2012
Lizzy Rolando, Research Data Librarian
Why Data Management?
• Good for You
• Good for Science
• Required by Funding Agencies
Funding Agency Requirements
NSF*
• Must include DMP in proposal
• Materials collected during research should be shared
NIH
• Papers must be submitted to PubMed
• Projects with over $500,000 funding must share data and include Data Sharing Plan in proposal
USDA
• National Institute of Food and Agriculture requires all data to be submitted to public domain without restriction
NOAA
• Some programs require a data management plan, which must also be shared
• Soon requiring that all grants include a data sharing plan
• All environmental data should be made visible, accessible and independently understandable to users, within 2 years of end of grant
NASA
• Data should be made freely and widely available
• A data sharing plan and evidence of any past sharing activities should be included as part of the technical proposal
CDC
• All data should be released and/or shared as soon as feasible
Exciting News!
Beginning January 14, 2013, the Biographical Sketch(es) for an NSF grant proposal will include a section on “Products” instead of “Publications.” This way, applicants can include not just publications, but also datasets, software, patents and copyrights.
Basic DMP Components
• Data Description
• Data and metadata standards
• Data access and sharing policies
• Data re-use and re-distribution
• Data preservation and archiving
*Depending on the funding source and the directorate/division/program, data management plan requirements may differ.
Data Description
• What kinds of data will you produce?
  • Numerical data, simulations, text sequences, etc.
  • Experimental, observational, simulation
  • Raw, derived
• How will you acquire the data?
• How will you process the data?
• How much data will you collect?
• Are you using any existing data?
• What QA/QC procedures will you use?
Recommendations
• A short description of your project helps to give context to why you are collecting the data.
• Two people should record and enter data separately.
• Notes about the data (metadata) should be recorded alongside the data by the data collectors.
• Make sure you record units and have headers for rows and columns in your tables.
• Keep all raw data separate from analyzed data, and maintain versions of data during analysis.
• Survey existing data sources.
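One way to act on the double-entry recommendation is to compare the two people's tables automatically. The following is a minimal Python sketch, not a prescribed procedure. The column names (`sample_id`, `temperature_c`, `mass_g`) and the sample values are hypothetical, and the sketch assumes both entries are well-formed CSV text sharing a header row that carries the units.

```python
import csv
import io


def _cell(row, column):
    """Return a cell as stripped text, treating a missing cell as empty."""
    return (row.get(column) or "").strip()


def compare_entries(entry_a, entry_b, key="sample_id"):
    """List every (key, column, value_a, value_b) where two entries disagree."""
    rows_a = {row[key]: row for row in csv.DictReader(io.StringIO(entry_a))}
    rows_b = {row[key]: row for row in csv.DictReader(io.StringIO(entry_b))}
    mismatches = []
    for row_key in sorted(set(rows_a) | set(rows_b)):
        a, b = rows_a.get(row_key), rows_b.get(row_key)
        if a is None or b is None:
            # One person recorded a row that the other did not.
            mismatches.append((row_key, "<row>",
                               "present" if a else "missing",
                               "present" if b else "missing"))
            continue
        for column in a:
            if _cell(a, column) != _cell(b, column):
                mismatches.append(
                    (row_key, column, _cell(a, column), _cell(b, column)))
    return mismatches


# Hypothetical entries typed independently by two people.
ENTRY_A = "sample_id,temperature_c,mass_g\nS1,21.5,10.2\nS2,22.0,9.8\n"
ENTRY_B = "sample_id,temperature_c,mass_g\nS1,21.5,10.2\nS2,22.0,8.9\n"

if __name__ == "__main__":
    for mismatch in compare_entries(ENTRY_A, ENTRY_B):
        print(mismatch)
```

Any mismatch the script reports should be resolved against the original lab notes or instrument output, not by picking one person's value.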
Data and Metadata Formats
• What metadata will you create/include with data?
  • i.e., what does someone else need to know about your data in order to reuse them?
• Where will this be recorded? How? What format?
• Will you use a community metadata standard?
• Will you conform to community terminology?
Recommendations
• Use metadata standards common in your discipline, e.g., Ecological Metadata Language for ecology.
• Always include a “readme.txt” file that describes, at a bare minimum, the who, what, where, when and why of the data.
• Make sure you have recorded the information that you would need if you were trying to use someone else’s data.
• Check with the data repository where you hope to store your data; sometimes they require a particular metadata standard.
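The bare-minimum readme can be generated from a small template so that none of the five questions gets skipped. This is only an illustrative Python sketch: the function name `build_readme`, the field labels, and the example project are all hypothetical, and a repository may require a richer metadata standard than this.

```python
def build_readme(who, what, where, when, why):
    """Return readme text covering the who, what, where, when and why of a dataset."""
    fields = [("Who", who), ("What", what), ("Where", where),
              ("When", when), ("Why", why)]
    # Refuse to produce a readme with a blank answer.
    missing = [label for label, value in fields if not value.strip()]
    if missing:
        raise ValueError("readme is missing: " + ", ".join(missing))
    return "\n".join(f"{label}: {value.strip()}" for label, value in fields) + "\n"


if __name__ == "__main__":
    # Hypothetical project details, for illustration only.
    print(build_readme(
        who="Jane Doe, Example Lab",
        what="Hourly stream temperature readings (degrees C), CSV",
        where="Example Creek, site 3",
        when="2012-06-01 to 2012-08-31",
        why="Collected to study summer thermal variation",
    ))
```

Saving the returned text as readme.txt next to the data files keeps the description and the data together.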
Data Access and Sharing Policies
• Are your data sensitive, so access by others needs to be restricted?
• What license or publishing model will you use for your data?
• How will you make your data accessible to others?
• What data will you make available and at what stage of your research?
• Do you have protocols, such as IRB, that you need to comply with? If so, how will you do so?
Recommendations
• Apply an open license to data that you will share.
• Explain why you cannot share data, if that is the case (for example, the data are proprietary).
• Anonymize or de-identify any sensitive data.
• Use a repository that can mediate data sharing if data cannot be sufficiently anonymized.
• Comply with IRB restrictions. That should be obvious, but we’ll say it anyway.
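As a rough illustration of de-identification, the Python sketch below drops direct identifiers and replaces a participant ID with a keyed hash. This is an assumption-laden example, not a method from the slides: the field names (`participant_id`, `name`, `email`) are hypothetical, and keyed hashing is pseudonymization, which may not satisfy your IRB on its own because indirect identifiers can remain in the data.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Replace an identifier with a stable keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(records, secret_key, id_field="participant_id",
               drop_fields=("name", "email")):
    """Return copies of the records without direct identifiers.

    The original records are left untouched; the secret key must be stored
    separately from the shared data.
    """
    cleaned = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in drop_fields}
        kept[id_field] = pseudonymize(str(record[id_field]), secret_key)
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    # Hypothetical survey rows, for illustration only.
    rows = [
        {"participant_id": "P001", "name": "Ada", "email": "ada@example.org", "score": 7},
        {"participant_id": "P002", "name": "Ben", "email": "ben@example.org", "score": 9},
    ]
    for row in deidentify(rows, secret_key=b"keep-this-key-private"):
        print(row)
```

The same participant always maps to the same pseudonym under one key, so repeated measurements can still be linked after the names are gone.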
Data Re-use and Re-Distribution
• Who do you expect will want to, or be able to, reuse your data?
• Should there be restrictions on who can reuse your data, or how?
• How should others indicate that they have used your data?
• How long will your data be available to others for reuse?
• Does your institution have rules about data?
Recommendations
• Imagine the broadest possible audience for your data.
• Place as few restrictions on your data as you can.
• Check with your chosen repository to make sure they provide a data citation. You want credit when someone else uses your data!
• Link your published articles to the data underlying those articles.
• Use a repository that can make your data available far into the future.
Data Preservation and Archiving
• What formats for your data will you use? Are they preservation friendly?
• What repository or data archive can take your data when you are finished?
• How do they preserve/share your data? What are their access policies?
• Is any extra work needed to prepare data for the repository?
• Who will be responsible for final preservation?
Recommendations
• Appraise your data, selecting those with long-term value, and document your choices.
• Use preservation friendly digital formats.
  • Non-proprietary, commonly used
  • You may need to transform data into a new format.
• Find a repository that will take your data, and plan to comply with their policies early on.
• Look into using SMARTech!
• PIs should ultimately be responsible for the final disposition of the data.
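Transforming data into a non-proprietary, commonly used format can be as simple as writing records out as CSV. The Python sketch below assumes the data have already been read into a list of dictionaries by whatever tool understands the original format. The column names (`site`, `depth_m`, `temp_c`) are hypothetical, and CSV is only one example of a preservation friendly format, so check what your repository prefers.

```python
import csv
import io


def records_to_csv(records, columns):
    """Serialize records as CSV text with a header row in the given column order.

    Raises ValueError if a record contains a column not listed in `columns`,
    so that no data are dropped silently during the conversion.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical measurements with units recorded in the column names.
    rows = [
        {"site": "A", "depth_m": 1.5, "temp_c": 18.2},
        {"site": "B", "depth_m": 3.0, "temp_c": 16.9},
    ]
    print(records_to_csv(rows, ["site", "depth_m", "temp_c"]))
```

Keep the original files as well, since the raw data should stay separate from anything derived from them.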
Step 2: Create a Plan
• Select a Funding Agency
• Email is sent to Georgia Tech Library
Let’s Talk About Names
We strongly recommend naming the plan “[Insert Proposal Title Here] Data Management Plan”
Downloadable Templates
Clicking on “Funder Requirements” will lead to a page with a list of all funding agency requirements.
Step 3: One Section at a Time
• Sections are different depending on funding source.
• Georgia Tech and DataONE have resources available for every section.
• Enter your answers here.
Some Sections Have Extra Advice
Georgia Tech specific help text
Almost There
• You should save after every section, but definitely save at the very end.
• You’re so close to the end!
Step 4: Export
Now that you have the content, you can export your plan.
Step 5: Share plan
• Send your plan to the Research Data Librarian (me!) to look over.
• Have your colleagues look at your plan.
• Do you know your grant officer? Maybe they will look at it.
Step 6: Finish and Start Research!
• Add plan to proposal or distribute among research team
• Start your newly funded research!
Other Data Management Plan Resources
• Digital Curation Centre: http://www.dcc.ac.uk/resources/data-management-plans
• ICPSR (while made for social science data, it has great resources for anyone): http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/plan.html
• UK Data Archive: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
Questions?
Lizzy Rolando, Research Data Librarian
email@example.com
404.385.3706
http://libguides.gatech.edu/research-data