Creating a Data Management Plan

This talk walks through the parts of a data management plan and how to build a management plan for a research project.

Published in: Education, Technology

Transcript

  • 1. Creating a Data Management Plan Kristin Briney, PhD Data Services Librarian
  • 2. This Session Will Answer • Why am I being asked to create a DMP? • What are the key parts of a DMP? • How do I translate my research to each of these parts?
  • 3. You Will Leave With • An understanding of the main parts of a data management plan • Knowledge of where to find resources and assistance • A rough outline of your data management plan
  • 4. WHY AM I BEING ASKED TO CREATE A DATA MANAGEMENT PLAN?
  • 5. Why Data? Why Now? • Data are DIGITAL – Easy to copy and share – Difficult to preserve • Data are COMPUTABLE – New avenues of research like data mining • Data represent a FINANCIAL INVESTMENT – Poor research funding climate – Can no longer ignore data as a scholarly product
  • 6. Many Funders Require DMPs • NSF • NEH • NIH • NOAA • NASA • …even more funders will require DMPs soon! – White House OSTP Public Access memo
  • 7. The Funder Perspective • Data are a scholarly resource – Data sharing is akin to scholarly publishing • Barriers to sharing are – Organization – Documentation – Long-term management and preservation • Hence data management plans
  • 8. DMPs Help You Too! • Don’t lose data • Find data more easily • Easier to analyze organized, documented data • Avoid accusations of fraud & misconduct • Get credit for your data • Don’t drown in irrelevant data!
  • 9. For each minute of planning at the beginning of a project, you will save 10 minutes of headache later
  • 10. DMPs Help You Too! A data management plan will make conducting research easier for you… …So if you are required to create a DMP, why not use it to improve your practices?
  • 11. WHAT ARE THE KEY PARTS OF A DATA MANAGEMENT PLAN?
  • 12. Actual NSF DMP Requirements • The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project • The standards to be used for data and metadata format and content • Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements • Policies and provisions for re-use, re-distribution, and the production of derivatives • Plans for archiving data, samples, and other research products, and for preservation of access to them http://www.nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_2.jsp#dmp
  • 13. Key Questions 1. What data will I create? 2. What standards will I use to document the data? 3. How will I protect private/secure/confidential data? 4. How will I archive and preserve the data? 5. How will I provide access to and allow reuse of the data?
  • 14. Be Aware • Actual requirements vary by funder and division • Look up your requirements before you write your DMP
  • 15. HOW DO I TRANSLATE MY RESEARCH TO EACH OF THESE PARTS?
  • 16. 1. WHAT TYPES OF DATA WILL I CREATE?
  • 17. What Are Data? • “Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings” – OMB Circular A-110 http://www.whitehouse.gov/omb/circulars_a110
  • 18. What Are Data? • Observational – Sensor data, telemetry, survey data, sample data, images • Experimental – Gene sequences, chromatograms, toroid magnetic field data • Simulation – Climate models, economic models • Derived or compiled – Text and data mining, compiled database, 3D models, data gathered from public documents
  • 19. What Not To Share • Laboratory notebooks • Preliminary analyses • Drafts of scientific papers • Plans for future research • Peer reviews or communications with colleagues • Physical Samples
  • 20. No Data? • Still need a data management plan • Plans with no data and no sharing will likely be examined more closely – Carefully explain situation if you are in this position
  • 21. Exercise • Conduct a quick inventory of the data you will acquire – What data will you collect? – Is your data unique? – How big will the data be? – How fast will the data grow?
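The inventory questions on slide 21 (what data, how big, how fast it grows) can be answered roughly by counting what is already on disk. The following is a minimal sketch, not part of the original talk; the function name `inventory` is hypothetical, and the sketch assumes your data already lives under a single folder.

```python
import os
from collections import defaultdict


def inventory(root):
    """Return {extension: (file_count, total_bytes)} for every file under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken links and other non-files
            # Group by lower-cased extension; files without one go under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext][0] += 1
            totals[ext][1] += os.path.getsize(path)
    return {ext: (count, size) for ext, (count, size) in totals.items()}


def report(root):
    """Print one line per file type, largest total size first."""
    summary = inventory(root)
    for ext, (count, size) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{ext:10} {count:6d} files {size / 1e6:10.2f} MB")
```

Calling `report("path/to/your/data")` (a placeholder path) at regular intervals and saving the output gives a rough estimate of how fast the data are growing.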
  • 22. 2. WHAT STANDARDS WILL I USE TO DOCUMENT THE DATA?
  • 23. What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
  • 24. Documentation • Consider the difference in documenting for – someone inside your lab – someone outside your lab but in your field – someone outside your field • Audience matters!
  • 25. Documentation • Methods – How the data were gathered – How the data should be interpreted – What you did – Limitations on what you did – …build trust in your data • Metadata – What you’re looking at – Who made it and when – How it got there – What it means – What you can do with it – …before you even look at the file
  • 26. Methods • Examples of methods to document – Code – Survey – Codebook – Data dictionary – Anything that lets someone reproduce your results • Don’t forget the units!
  • 27. Metadata • Look for a metadata scheme before you collect the data! – Lots of metadata schemas available – Easier to record metadata when collecting data than to convert later • Consult – Disciplinary repository • Repositories usually have required metadata schemas – Your peers – Subject librarian
  • 28. Metadata Example: Dublin Core • Contributor – Jane Collaborator • Creator – Kristin Briney • Date – 2013 Apr 15 • Description – A microscopy image of cancerous breast tissues under 20x zoom. This image is my control, so it has only the standard staining described on 2013 Feb 2 in my notebook. • Format – JPEG • Identifier – IMG00057.jpg • Relation – Same sample as images IMG00056.jpg and IMG00055.jpg • Subject – Breast cancer • Title – Cancerous breast tissue control
  • 29. Exercise • What methods information do you need to preserve? • What metadata standard will you use for your data? -OR- Who will you contact to find a relevant standard?
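To make the Dublin Core example from slide 28 concrete, here is one way such a record might be written to a machine-readable file that travels with the image. This is a sketch under assumptions: the wrapper element name `metadata`, the function name, and the choice to rewrite the date as `2013-04-15` are illustrative choices, not something the talk prescribes, and a repository may require a different serialization.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Field values taken from the slide 28 example (date rewritten as YYYY-MM-DD).
RECORD = {
    "title": "Cancerous breast tissue control",
    "creator": "Kristin Briney",
    "contributor": "Jane Collaborator",
    "date": "2013-04-15",
    "description": (
        "A microscopy image of cancerous breast tissues under 20x zoom. "
        "This image is my control, so it has only the standard staining "
        "described on 2013 Feb 2 in my notebook."
    ),
    "format": "JPEG",
    "identifier": "IMG00057.jpg",
    "relation": "Same sample as images IMG00056.jpg and IMG00055.jpg",
    "subject": "Breast cancer",
}


def to_dublin_core_xml(record):
    """Serialize a {field: value} dict as namespaced Dublin Core elements."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for field, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core_xml(RECORD)
print(xml_text)
```

Saving `xml_text` next to the image (for example as `IMG00057.xml`, a hypothetical name) keeps the description available before anyone opens the file itself.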
  • 30. 3. HOW WILL I PROTECT PRIVATE/SECURE/CONFIDENTIAL DATA?
  • 31. Security Issues • Does your data fall under the following? – HIPAA: health information – FERPA: student information – FISMA: government subcontractor – Human subject research, etc. • Ask for help!
  • 32. Security Issues • Secure storage • Controlled access • De-identification of personal information • Security training
  • 33. Security Questions • Access permissions – Who is allowed to access the data? • Sharing – Am I required to share? Can I actually share? – Despite requirements, some data can’t be shared • Responsibility – Who will make sure the data stays secure?
  • 34. UWM Security Resources • UWM Information Security Office – Visit: https://www4.uwm.edu/itsecurity/ – Email: infosec@uwm.edu • Certificate in Information Security • HIPAA – https://www4.uwm.edu/legal/hipaa/index.cfm • FERPA – http://www4.uwm.edu/academics/ferpa.cfm
  • 35. Exercise • Do any regulations apply to your data? • If so, who is allowed to access your secure data? Who will be responsible for data security?
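Slide 32 lists de-identification of personal information as one security measure. The sketch below shows one small piece of that idea: replacing a direct identifier with a keyed hash and dropping a name column. The records, field names, and key are hypothetical, and this step alone should not be assumed to satisfy HIPAA, FERPA, or any other regulation; as the slides say, ask your security office for help.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Replace an identifier with a keyed hash so records can still be linked."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability


def deidentify(rows, id_field, drop_fields, secret_key):
    """Return copies of rows with id_field pseudonymized and drop_fields removed."""
    cleaned = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in drop_fields}
        new_row[id_field] = pseudonymize(row[id_field], secret_key)
        cleaned.append(new_row)
    return cleaned


# Hypothetical records and key, for illustration only.
rows = [
    {"student_id": "S001", "name": "Alex Example", "score": 88},
    {"student_id": "S002", "name": "Sam Sample", "score": 93},
]
key = b"replace-with-a-secret-kept-off-the-shared-drive"
shared = deidentify(rows, "student_id", {"name"}, key)
print(shared)
```

Using a keyed hash rather than a plain one means someone without the key cannot rebuild the mapping by hashing a list of known IDs. Whoever holds the key is the person answering the "Responsibility" question on slide 33.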
  • 36. 4. HOW WILL I ARCHIVE AND PRESERVE THE DATA?
  • 37. Archiving Is Not Storage • Storage is keeping files to access • Archiving is about preservation – Data should be readable and usable – Data should be uncorrupted • We can’t read some digital files from 10 years ago – This is what good digital preservation solves
  • 38. Side Note • If federally funded, you are required to retain your data “for a period of three years from the date of submission of the final expenditure report.” AT LEAST. • Better to keep on hand for at least 6 years – Recent retraction of a 6-year-old paper for failure to provide original data • Preservation is not an abstract issue http://www.whitehouse.gov/omb/circulars_a110#53 http://retractionwatch.wordpress.com/2013/07/19/jci-paper-retracted-for-duplicated-panels-after-authors-cant-provide-original-data/
  • 39. File Formats • Easy way to ensure long-term usability • Use open file formats – Open and standardized – Well documented – In wide use – Examples: .txt, .tiff, .csv, .dbf • Transform your data now, not later – Keep both file types
  • 40. Other Preservation Concerns • Obsolescence – Preserve software along with data • Deterioration – Keep more than 1 copy to avoid corruption • Media – e.g., can you still read a floppy disk? – Periodically move data off outdated media
  • 41. Find a Trustworthy Partner • Find outside help – Servers come and go, so do labs • Off campus – Disciplinary data repository – Journal that accepts data • Let someone else worry about this
  • 42. Exercise • What open file formats will you use to help preserve your data? • If there isn’t an adequate open format, what software and hardware will you preserve?
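Slide 40 recommends keeping more than one copy to guard against corruption. Extra copies only help if you can tell when one has silently changed. A common way to check is to compare checksums; the sketch below is one possible approach, with hypothetical function names, and is not a method described in the talk.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Compute a file's SHA-256 checksum, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def compare_copies(original_root, copy_root):
    """Return sorted relative paths that are missing or differ in the copy."""
    original = build_manifest(original_root)
    copy = build_manifest(copy_root)
    return sorted(p for p, checksum in original.items() if copy.get(p) != checksum)
```

Running `compare_copies` on a schedule against each backup location returns an empty list when the copy still matches, and otherwise names the files to restore from a good copy.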
  • 43. 5. HOW WILL I PROVIDE ACCESS TO AND ALLOW REUSE OF THE DATA?
  • 44. Why Share? • Get more credit for your work – In “studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% … more citations than similar studies for which the data was not made available” – “The citation boost varied with date of dataset deposition: a citation boost was most clear for papers published in 2004 and 2005, at about 30%” • Get credit for unpublishable results https://peerj.com/preprints/1/ (2013 study)
  • 45. Why Share? • Make your funder happy • Helps you find and use your data later • Disprove misconduct or fraud accusations • Stimulate new research
  • 46. Audience • Who is the audience for this data? – Coworkers? – Disciplinary/institutional colleagues? – Researchers in allied fields? – Anyone? • Audience will determine how to share the data
  • 47. Ways To Provide Access • Hands-off options preferable – Journal – Disciplinary repository • Embargoes may be possible here – UWM Digital Commons • Small, discrete datasets • Other options – By request – On your lab website
  • 48. Exercise • Who is the audience for your data? • Which way will you provide access?
  • 49. RESOURCES
  • 50. Resources • Data Services Librarian – briney@uwm.edu • Data management information – dataplan.uwm.edu • UWM Information Security Office – infosec@uwm.edu
  • 51. Thank You • This presentation is available on Slideshare – http://www.slideshare.net/kbriney • The content of this presentation is licensed under a Creative Commons Attribution 3.0 Unported License (CC BY) • Some content used with permission from Brad Houston and Dorothea Salo
  • 52. Questions?