
Responsible Conduct of Research: Data Management


This presentation was given by myself and Brad Houston (http://www.slideshare.net/herodotusjr), for UWM's Responsible Conduct of Research (RCR) series in Fall of 2013. It covers data management plans and practical data management tips. The corresponding handout is also available on Slideshare: http://www.slideshare.net/kbriney/rcr-data-management-handout


Responsible Conduct of Research: Data Management

  1. 1. • What if your hard drive crashes?
      • What if you are accused of fraud?
      • What if your collaborator abruptly quits?
      • What if the building burns down?
      • What if you need to use your old data?
      • What if your backup fails?
      • What if your computer gets stolen?
      • What if…
      Do You Still Have Your Data?
  2. 2. Data Management & Data Management Plans Responsible Conduct of Research 22 November 2013 Kristin Briney & Brad Houston
  3. 3. Why Data Management?
      • Don’t lose data
      • Find data more easily
        – Especially if you need older data
      • Easier to analyze organized, documented data
      • Avoid accusations of fraud & misconduct
      • Get credit for your data
      • Don’t drown in irrelevant data
  4. 4. For each minute of planning at the beginning of a project, you will save 10 minutes of headache later
  5. 5. What Are Data? http://www.flickr.com/photos/dia-a-dia/7046151669/ (CC BY-NC-SA)
  6. 6. What Are Data? • “Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings” – OMB Circular A-110 http://www.whitehouse.gov/omb/circulars_a110
  7. 7. What Are Data?
      • Observational
        – Sensor data, telemetry, survey data, sample data, images
      • Experimental
        – Gene sequences, chromatograms, toroid magnetic field data
      • Simulation
        – Climate models, economic models
      • Derived or compiled
        – Text and data mining, compiled database, 3D models, data gathered from public documents
  8. 8. Brad Houston, University Records Officer Responsible Conduct of Research November 22, 2013
  9. 9. Source: Jim Linwood
  10. 10. • Your Data Management Plan should come *last*.
      • First consider:
        ◦ Information about your data
        ◦ Information about your audience
        ◦ Obligations to funders and others
      Source: Sam Howzit
  11. 11. • What kind of data is it?
        ◦ (See Kristin’s slide on the 4 categories)
      • What are the key characteristics of the data?
        ◦ (File format? Size? Programs needed to access it?)
      • Can I recreate the data, if needed?
      • What infrastructure is available to manage it?
        ◦ On-campus and off-campus: don’t limit yourself
      • Is the data intelligible to people other than me?
        ◦ If the answer to this one is “no”, that’s something you should probably fix
  12. 12. • In order of amount of documentation you’ll need:
        ◦ Future You (reference use only)
        ◦ Colleagues within your discipline, in your lab or elsewhere
        ◦ Colleagues in related disciplines
        ◦ General Public/The World!
      • The question to ask: is my data described well enough to be usable by my audience?
  13. 13. • Rights shared with collaborators
        ◦ Decide who’s responsible for the official copy of data
      • Information Security
      • Access Provisions
        ◦ NIH: Public Access policy
        ◦ NSF: Directorate access policy
        ◦ Others? (OMB A-110)
      Often attached to funding.
  14. 14. • Your data management plan (DMP) should contain 5 key components:
        ◦ Expected Data
        ◦ Standards for format and content
        ◦ Policies for Access and sharing
        ◦ Policies for Reuse and distribution
        ◦ Plans for archiving data and preserving access
      • Note: These are minimum requirements.
        ◦ Specific agencies or directorates may ask for more; check their application sites!
  15. 15. • In short: What kind of data will be produced by your research processes?
      • Keep in mind:
        ◦ File formats of complete data sets
        ◦ Any software or code that will be needed/produced
        ◦ Physical samples or other individual data points
      • Some divisions require retention of physical samples; consult your Program Officer
  16. 16. • In short: how will you organize your data within datasets to make it widely accessible, and how will you make data sets identifiable?
      • Keep in mind:
        ◦ Any data formatting standards for your particular discipline
        ◦ Any metadata (author, date, subject, etc.) that your program attaches automatically, and what you will need to attach manually
        ◦ How will you find your data for later consultation? How will others find it?
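
One way to handle the metadata that has to be attached manually is a small "sidecar" file stored next to each data file. The following is a minimal sketch, not a disciplinary standard: the function name `write_sidecar`, the `.meta.json` suffix, and the JSON layout are all assumptions made for illustration. Only the author, date, and subject fields come from the slide.

```python
import json
from datetime import date
from pathlib import Path


def write_sidecar(data_file: Path, author: str, subject: str, **extra: str) -> Path:
    """Write <data_file>.meta.json next to the data file and return its path.

    File name, size, and modification date are read automatically;
    author, subject, and any extra fields must be supplied by hand.
    """
    stat = data_file.stat()
    metadata = {
        "file": data_file.name,
        "size_bytes": stat.st_size,
        "modified": date.fromtimestamp(stat.st_mtime).isoformat(),
        "author": author,
        "subject": subject,
        **extra,  # hypothetical extra fields, e.g. instrument or units
    }
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

Calling `write_sidecar(Path("scan_03.csv"), author="A. Researcher", subject="transient spectra")` would produce `scan_03.csv.meta.json` in the same folder. If your discipline has its own metadata standard, that standard should take precedence over this ad hoc layout.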
  17. 17. • In short: How will you allow other researchers to find and use your data?
      • Keep in mind:
        ◦ How will other researchers find your data?
        ◦ How will you provide access to your data?
        ◦ How will you prepare your data for sharing?
  18. 18. • In short: How will researchers obtain permission to use your data?
      • Keep in mind:
        ◦ Will you grant blanket permission or case-by-case?
        ◦ What responsibilities will users of your data have re: privacy, intellectual property, etc.?
        ◦ What if a provision is violated?
  19. 19. • In short: How will you make sure your data stays available?
      • Keep in mind:
        ◦ What are your retention requirements? Is this a permanent data set?
        ◦ What storage media will you use? Are you prepared to migrate as needed?
        ◦ Do you have a data backup plan?
      Above: Not A Good Way to archive your data.
  20. 20. • You also need to keep track of supplementary research records:
        ◦ Documentation on funding/expenditures
        ◦ Copies of IRB/Animal Care research protocols
        ◦ Hazardous Materials documentation
        ◦ Invention Disclosure/Tech Transfer documentation
        ◦ Conflict of Interest reports
      • Every institution has a different retention requirement; ask your records officer!
        ◦ For UWM: almost all of this is “End of Grant + 3 years”
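
The “End of Grant + 3 years” rule is simple date arithmetic. The sketch below shows one way to compute the earliest disposal date. The function name `retention_end` and the 28 February fallback for leap-day grant ends are assumptions for illustration. The three-year default only reflects the UWM figure quoted above, which covers "almost all" records, so confirm the period for your own records.

```python
from datetime import date


def retention_end(grant_end: date, years: int = 3) -> date:
    """Return the date `years` after the grant ends."""
    try:
        return grant_end.replace(year=grant_end.year + years)
    except ValueError:
        # A 29 February grant end has no counterpart in a non-leap year;
        # falling back to 28 February is an assumption of this sketch.
        return grant_end.replace(year=grant_end.year + years, day=28)
```

For a hypothetical grant ending on 22 November 2013, `retention_end(date(2013, 11, 22))` returns `date(2016, 11, 22)`.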
  21. 21. • Document Everything!
        ◦ Information about the data and your methods
        ◦ Information about where/how you’re keeping the data (short-term and long-term)
        ◦ What is needed to access the data
        ◦ What security/privacy policies apply
        ◦ Any collaborators outside the institution and their rights
        ◦ Any supplementary files or forms needed to document use of funding
  22. 22. A Crash Course in PRACTICAL DATA MANAGEMENT
  23. 23. Storage and Backups http://www.flickr.com/photos/9246159@N06/599820538/ (CC BY-ND)
  24. 24. Storage and Backups
      • Library motto: Lots of Copies Keeps Stuff Safe!
      • Rule of 3: 2 onsite, 1 offsite
      • Any backup is better than none
      • Automatic backup is better than manual
      • Your research is only as safe as your backup plan
        – Periodically test restore from backup!
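
A test restore is only useful if the restored files are actually compared with the originals. One possible check, sketched here with the Python standard library, hashes every file in the working folder and in a restored copy and reports anything missing or different. The function names and the choice of SHA-256 are assumptions of this sketch, not part of the presentation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_backup(original_dir: Path, restored_dir: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the restored copy."""
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(source) != sha256_of(restored):
            problems.append(relative.as_posix())
    return problems
```

An empty list means every file in the original folder was found, byte for byte, in the restored copy. The check runs in one direction only: extra files in the backup are not reported.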
  26. 26. Example
      • I keep my data
        – On my computer
        – Backed up manually on shared drive
          • I set a weekly reminder to do this
        – Backed up automatically via SpiderOak cloud storage
      • A note on cloud storage…
  27. 27. Consistency http://www.flickr.com/photos/mactucket/361798299/ (CC-BY-ND)
  28. 28. Consistency
      • Consistent file naming
        – Make it easier to find files
        – Avoid many duplicates
        – Make it easier to wrap up a project
      • Names descriptive but short (<25 characters)
      • Avoid “ / : * ? ‘ < > [ ] & $ and spaces
      • Date convention: YYYY-MM-DD
  29. 29. Examples
      • DataManagement_v6.pptx
      • 20090923_spctrm_trans_03.csv
      • SLAposter_FINAL.ai
      • BlogPost-2011-11-12.docx
      • Find a system that works for you
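
The naming rules above are mechanical enough to check automatically. The sketch below is one possible checker, with two assumptions stated up front: the "<25 characters" limit is applied to the name without its extension (which is the reading under which all four example names pass), and the quotation marks in the "avoid" list are treated as the plain `"` and `'` characters. The helper names `name_problems` and `dated_name` are hypothetical.

```python
from datetime import date

# Characters from the "avoid" list; straight quotes are an assumption.
FORBIDDEN = set('"/:*?\'<>[]&$ ')
MAX_STEM_LENGTH = 24  # "<25 characters", measured without the extension (assumption)


def name_problems(filename: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    stem = filename.rpartition(".")[0] or filename
    if len(stem) > MAX_STEM_LENGTH:
        problems.append(f"name is {len(stem)} characters, limit is {MAX_STEM_LENGTH}")
    bad = sorted(FORBIDDEN.intersection(filename))
    if bad:
        problems.append("contains characters to avoid: " + " ".join(repr(c) for c in bad))
    return problems


def dated_name(description: str, day: date, extension: str) -> str:
    """Build a name that starts with a YYYY-MM-DD date, so files sort chronologically."""
    return f"{day.isoformat()}_{description}.{extension}"
```

For instance, `name_problems("20090923_spctrm_trans_03.csv")` returns an empty list, while a name such as `my data: final?.csv` is flagged for the space, colon, and question mark. As the slide says, the specific system matters less than applying it consistently.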
  30. 30. Consistency
      • Consistent documentation
        – Record all necessary information
        – Keep information in one place
        – Easier to search and use later
      • Take 5 minutes before starting a project
      • Create a list of information to record
        – Don’t forget to record the units!
  31. 31. Example
      • For my experiment, I need to collect:
        – Date
        – Experiment
        – Scan number
        – Powers
        – Wavelengths
        – Concentration (or sample weight)
        – Calibration factors, like timing and beam size
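
A list like this can be turned into a fixed set of columns so that every scan is recorded the same way and the units travel with the data. The sketch below writes such a log as CSV text. The column names follow the list above, but the units in parentheses (mW, nm, mM, um) and the function name `scan_log_csv` are assumptions for illustration, since the slide does not state them.

```python
import csv
import io

# Column names follow the list above; the units in parentheses are assumptions.
FIELDS = [
    "date",
    "experiment",
    "scan_number",
    "power (mW)",
    "wavelength (nm)",
    "concentration (mM)",
    "beam_size (um)",
]


def scan_log_csv(rows: list[dict]) -> str:
    """Return the rows as CSV text, refusing any row that omits a field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        missing = [name for name in FIELDS if name not in row]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        writer.writerow(row)
    return buffer.getvalue()
```

Rejecting incomplete rows is a deliberate choice here: a gap is much easier to fill in on the day of the experiment than months later.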
  32. 32. Recording Your Conventions http://www.flickr.com/photos/jjpacres/3293117576/ (CC BY-NC-ND)
  33. 33. Recording Your Conventions
      • What if someone needs to find your data?
      • Eventually will hand off data to your PI
      • Record your naming conventions
      • Record your documentation schemes
      • Record overall project information
        – Contact info, grant #, project summary, etc.
  34. 34. Examples
      • Print out near computer/experiment area
        – Document conventions
      • In front of research/lab notebook
        – Page 1: Project information
        – Page 2: Conventions and abbreviations
        – Page 3-X: Index of experiments
      • README.txt in data folder
        – Top-level folder: project information
        – Lower-level folder: what’s in this folder?
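
A top-level README.txt can be generated from a template so that no project forgets the basics. The following is a minimal sketch: the template wording, the function name `write_readme`, and its parameters are assumptions. The fields themselves (contact info, grant number, project summary, naming conventions, folder contents) are the ones listed above.

```python
from pathlib import Path

README_TEMPLATE = """\
Project: {project}
Contact: {contact}
Grant number: {grant}
Summary: {summary}

File naming convention:
  {naming}

Folders:
{folders}
"""


def write_readme(folder: Path, *, project: str, contact: str, grant: str,
                 summary: str, naming: str, subfolders: dict[str, str]) -> Path:
    """Write README.txt into `folder` and return its path."""
    folder_lines = "\n".join(
        f"  {name}/ - {purpose}" for name, purpose in sorted(subfolders.items())
    )
    text = README_TEMPLATE.format(
        project=project, contact=contact, grant=grant, summary=summary,
        naming=naming, folders=folder_lines or "  (none yet)",
    )
    path = folder / "README.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

The same function could be called in each lower-level folder with a one-line summary answering "what's in this folder?".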
  35. 35. Planning for the Future http://www.flickr.com/photos/bonedaddy/2791636546/ (CC BY-SA)
  36. 36. Planning for the Future
      • Get help for sensitive data!
        – HIPAA, FERPA, FISMA, IRB, etc.
      • UWM Information Security Office
        – Visit: www.uwm.edu/itsecurity/
        – Email: infosec@uwm.edu
      • Policy pages
        – www.uwm.edu/legal/hipaa/index.cfm
        – www.uwm.edu/academics/ferpa.cfm
  37. 37. Planning for the Future
      • We can’t open files from 10 years ago
      • Proprietary file types
        – Convert to open file format
          • .doc → .txt
          • .xls → .csv
          • .jpg → .tif
        – Preserve software if no open file format
      • Periodically move data to new media
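
Before converting anything, it helps to know which files are affected. The sketch below walks a data folder and lists files whose extensions appear in the conversion list above, together with the suggested target format. It only reports candidates and does not convert them. The function name `conversion_candidates` is hypothetical, and the mapping contains just the three pairs from the slide.

```python
from pathlib import Path

# Source format -> suggested open format, taken from the list above.
OPEN_ALTERNATIVE = {".doc": ".txt", ".xls": ".csv", ".jpg": ".tif"}


def conversion_candidates(root: Path) -> list[tuple[str, str]]:
    """Return (relative path, suggested extension) for each file to convert."""
    candidates = []
    for path in sorted(root.rglob("*")):
        target = OPEN_ALTERNATIVE.get(path.suffix.lower())
        if path.is_file() and target:
            candidates.append((path.relative_to(root).as_posix(), target))
    return candidates
```

Running this once a year, alongside the move to new media, gives a short to-do list rather than a surprise ten years later.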
  38. 38. Don’t Stress Over Data http://www.flickr.com/photos/72775875@N06/7729764370/ (CC BY-NC-SA)
  39. 39. More Data Management
      • Data Services
        – www.uwm.edu/libraries/dataservices/
      • Data Management Plans
        – dataplan.uwm.edu
      • Kristin Briney, Data Services Librarian
      • Brad Houston, University Records Officer
  40. 40. Thank You
      • The content of this presentation is licensed under a Creative Commons Attribution 3.0 Unported License (CC BY)
        – Image licenses as marked
