Data Management 101 (2015)

This presentation is an updated version of my Data Management 101 talk, which covers the basics of research data management in the categories of: storage and backup, documentation, organization, and making files usable for the future.

Published in: Education


  1. 1. Do You Still Have Your Data? • What if your hard drive crashes? • What if you are accused of fraud? • What if your collaborator abruptly quits? • What if the building burns down? • What if you need to use your old data? • What if your backup fails? • What if your computer gets stolen? • What if…
  2. 2. Data Management 101 25 September 2015 Kristin Briney, PhD
  3. 3. Why Data Management? • Don’t lose data
  4. 4. Why Data Management? • Don’t lose data • Find data more easily – Especially if you need older data
  5. 5. Why Data Management? • Don’t lose data • Find data more easily – Especially if you need older data • Easier to analyze organized, documented data
  6. 6. Why Data Management? • Don’t lose data • Find data more easily – Especially if you need older data • Easier to analyze organized, documented data • Avoid accusations of fraud & misconduct
  7. 7. Why Data Management? • Don’t lose data • Find data more easily – Especially if you need older data • Easier to analyze organized, documented data • Avoid accusations of fraud & misconduct • Get credit for your data
  8. 8. Why Data Management? • Don’t lose data • Find data more easily – Especially if you need older data • Easier to analyze organized, documented data • Avoid accusations of fraud & misconduct • Get credit for your data • Don’t drown in irrelevant data
  9. 9. Data Management Basics • Introduction to a few topics in data management – Storage and backups – Documentation – File organization and naming – Future file usability
  10. 10. For each minute of planning at the beginning of a project, you will save 10 minutes of headache later
  11. 11. grover_net, http://www.flickr.com/photos/9246159@N06/599820538/ (CC BY-ND) STORAGE AND BACKUPS
  12. 12. http://www.theonion.com/article/heroic-computer-dies-to-save-world-from-masters-th-1963
  13. 13. Follow the 3-2-1 Rule • 3 copies of your data • In 2 different locations • On more than 1 type of storage hardware
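The 3-2-1 rule above can be checked mechanically once each copy of a dataset is written down. The following is a minimal sketch, not part of the original talk: the `Copy` class, the `satisfies_3_2_1` function, and the location and media labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a dataset. Field values are free-text labels (assumed)."""
    location: str  # where the copy lives, e.g. "office" or "off-site"
    media: str     # type of storage hardware, e.g. "laptop SSD" or "cloud"


def satisfies_3_2_1(copies):
    """True if there are at least 3 copies, in at least 2 locations,
    on more than 1 type of storage hardware."""
    locations = {c.location for c in copies}
    media_types = {c.media for c in copies}
    return len(copies) >= 3 and len(locations) >= 2 and len(media_types) > 1


# Hypothetical plan: computer, shared drive, and cloud storage.
plan = [
    Copy("office", "laptop SSD"),
    Copy("campus server", "shared drive"),
    Copy("off-site", "cloud"),
]
print(satisfies_3_2_1(plan))      # True
print(satisfies_3_2_1(plan[:2]))  # False: only 2 copies
```

Writing the plan down as data makes it easy to re-check whenever a drive is retired or a service is dropped.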
  14. 14. Storage • How? – Computer – External hard drive – Shared drives/servers – Tape backup – Cloud storage* – CDs/DVDs – USB flash drive Erica Wheelan, https://www.flickr.com/photos/reinventedwheel/5985479866 (CC BY)
  15. 15. *Cloud Storage • Read the Terms of Service! • E.g., Google Drive – “When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones”
  16. 16. Backups • How? – Any backup is better than none – Automatic backup is better than manual – Your work is only as safe as your backup plan
  17. 17. Backups • How? – Check your backups • Backups are only as good as your ability to recover data • Test your backups periodically – Preferably on a fixed schedule – 1 or 2 times a year may be enough – Bigger/more complex backups should be checked more often • Test your backup whenever you change things
  18. 18. Example • I keep my data – On my computer – Backed up manually on shared drive • I set a weekly reminder to do this – Backed up automatically via SpiderOak cloud storage
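One way to act on the advice to test backups is to compare checksums between the working folder and the backup copy. The sketch below is an illustration under stated assumptions, not the speaker's method: `sha256_of` and `verify_backup` are hypothetical helper names, and the demo folders are temporary directories created only for the example. It confirms that copied files match; it does not replace a full restore test with your actual backup tool.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths of files that are missing from, or differ in,
    the backup. An empty list means every source file has a matching copy."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems


# Demo with throwaway folders: one file is backed up, one is missing.
with tempfile.TemporaryDirectory() as tmp:
    original, backup = Path(tmp, "original"), Path(tmp, "backup")
    original.mkdir()
    backup.mkdir()
    (original / "run1.csv").write_text("a,b\n1,2\n")
    (original / "notes.txt").write_text("calibrated on day 1\n")
    (backup / "run1.csv").write_text("a,b\n1,2\n")  # intact copy
    demo_problems = verify_backup(original, backup)

print(demo_problems)  # ['notes.txt']
```

Running a check like this on a fixed schedule, and again after any change to the backup setup, follows the testing advice in the slides above.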
  19. 19. DOCUMENTATION Brady, https://www.flickr.com/photos/freddyfromutah/4424199420 (CC BY)
  20. 20. http://retractionwatch.com/2015/07/23/data-mismatch-and-authors-illness-pluck-finch-study-from-literature/
  21. 21. What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
  22. 22. Documentation • Why? – Data without notes are unusable – Because you won’t remember everything – For others who may need to use your files
  23. 23. Documentation • How? – Take good notes • Capture as much detail as possible • Your coworkers should be able to understand
  24. 24. Documentation • How? – Keep methods • Protocols • Code • Survey • Codebook • Data dictionary • Anything that lets someone reproduce your results
  25. 25. Documentation • How? – README.txt • For digital information, address the questions – “What the heck am I looking at?” – “Where do I find X?” • Use for project description in main folder • Use to document conventions • Use wherever you need extra clarity
  26. 26. Example • Project-wide README.txt – Basic project information • Title • Contributors • Grant info • etc. – Contact information for at least one person – All locations where data live, including backups
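The project-wide README.txt described above can be generated from a small template so that no field is forgotten. This is a minimal sketch: `build_readme` is a hypothetical helper, and every value in the demo call (title, names, grant number, email address, storage locations) is a placeholder, not information from the talk.

```python
def build_readme(title, contributors, grant, contact, data_locations):
    """Return README.txt text covering basic project information, a contact,
    and every location where the data live, including backups."""
    lines = [
        f"Project title: {title}",
        "Contributors: " + ", ".join(contributors),
        f"Grant info: {grant}",
        f"Contact: {contact}",
        "",
        "Where the data live (including backups):",
    ]
    lines += [f"  - {place}" for place in data_locations]
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration.
readme_text = build_readme(
    title="Example project",
    contributors=["A. Researcher", "B. Collaborator"],
    grant="Example Foundation, award 000000",
    contact="researcher@example.edu",
    data_locations=[
        "Lab computer, folder example_project",
        "Shared drive (manual weekly backup)",
        "Cloud storage (automatic backup)",
    ],
)
print(readme_text)
```

Saving the result as README.txt in the main project folder answers "What the heck am I looking at?" for anyone who opens the folder later.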
  27. 27. FILE ORGANIZATION & NAMING Dan Zen, http://www.flickr.com/photos/danzen/5551831155/ (CC BY)
  28. 28. https://twitter.com/CMBuddle/status/638800933598679040
  29. 29. https://twitter.com/CMBuddle/status/638802547365556224
  30. 30. https://twitter.com/CMBuddle/status/638808820874133504
  31. 31. File Organization • Why? – Easier to find and use data – Tell, at a glance, what is done and what you have yet to do – Can still find and use files in the future
  32. 32. File Organization • How? – Pick a system • Maybe work out a system with your coworkers – Get in the habit
  33. 33. File Organization • How? – Any system is better than none – Make your system logical for your data • 80/20 Rule – Possibilities • By project • By analysis type • By date • …
  34. 34. Example • Thesis – By chapter • By file type (draft, figure, table, etc.) • Data – By researcher • By analysis type – By date
  35. 35. http://retractionwatch.com/2014/01/07/doing-the-right-thing-authors-retract-brain-paper-with-systematic-human-error-in-coding/
  36. 36. File Naming Conventions • Why? – Make it easier to find files – Avoid duplicates – Make it easier to wrap up a project because you know which files belong to it
  37. 37. File Naming Conventions • How? – Pick what is most important for your name • Date • Site • Analysis • Sample • Short description
  38. 38. File Naming Conventions • How? – Files should be named consistently – File names should be descriptive but short (<25 characters) – Use underscores instead of spaces – Avoid these characters: “ / : * ? ‘ < > [ ] & $ – Use the dating convention: YYYY-MM-DD – Document your system!
  39. 39. Example • YYYYMMDD_site_sampleNum – 20140422_PikeLake_03 – 20140424_EastLake_12 • Analysis-sample-concentration – UVVis-stilbene-10mM – IR-benzene-pure
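A naming convention is easier to keep when names are built and checked by code. The sketch below follows the `YYYYMMDD_site_sampleNum` example above (note that it uses the compact date form from that example rather than the hyphenated YYYY-MM-DD form). The names `make_name`, `is_valid_name`, and `NAME_PATTERN` are hypothetical, and the two-digit sample number and letters-and-digits site name are assumptions read off the example names.

```python
import re
from datetime import date, datetime

# Assumed shape: 8-digit date, site name (letters/digits), 2-digit sample number.
NAME_PATTERN = re.compile(r"^(\d{8})_([A-Za-z0-9]+)_(\d{2})$")


def make_name(day, site, sample_num):
    """Build a file name stem such as 20140422_PikeLake_03."""
    return f"{day:%Y%m%d}_{site}_{sample_num:02d}"


def is_valid_name(stem):
    """True if the stem matches the convention, contains a real calendar
    date, and is shorter than 25 characters."""
    match = NAME_PATTERN.match(stem)
    if not match:
        return False
    try:
        datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        return False
    return len(stem) < 25


print(make_name(date(2014, 4, 22), "PikeLake", 3))  # 20140422_PikeLake_03
print(is_valid_name("20140424_EastLake_12"))        # True
print(is_valid_name("2014-04-22 Pike Lake 3"))      # False
```

Because the date leads the name in year-month-day order, sorting files alphabetically also sorts them chronologically.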
  40. 40. DATA SECURITY https://www.flickr.com/photos/bilal-kamoon/6958578902/ (CC BY)
  41. 41. https://chronicle.com/article/UNC-Chapel-Hill-Researcher/124821/
  42. 42. Know Your Data Security Plan • HIPAA, FERPA, FISMA, IRB, etc. • If you have sensitive data, know the plan – Who has access? – What are the procedures? – Who’s responsible? • Ask for help!
  43. 43. FUTURE FILE USABILITY Ian, http://www.flickr.com/photos/ian-s/2152798588/ (CC BY-NC-ND)
  44. 44. http://retractionwatch.com/2013/07/19/jci-paper-retracted-for-duplicated-panels-after-authors-cant-provide-original-data/
  45. 45. Data Retention • 3 years required by government • Better to do 5-10 years
  46. 46. lukasbenc, https://www.flickr.com/photos/lukasbenc/3493808772 (CC BY-NC-SA)
  47. 47. Future File Usability • What? – Can you read your files from 10 years ago? – Data needs to be • Accessible • Interpretable • Readable
  48. 48. Future File Usability: Interpretable • How? – Back up written notes • People always forget this one • Difficult to interpret data without notes • Options – Digitally scan (recommended with digital data) – Photocopies
  49. 49. Future File Usability: Readable • How? – Convert file formats • Can you open digital files from 10 years ago? • Use open, non-proprietary formats that are in wide use – .docx → .txt – .xlsx → .csv – .jpg → .tif • Save a copy in the old format, just in case • Preserve software if no open file format
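The format pairs on the slide above can be turned into a quick inventory of which files still need an open-format copy. This sketch only suggests target names; it does not perform any conversion, which depends on the application or library that can read each format. `OPEN_EQUIVALENT` and `suggest_open_format` are hypothetical names, and the file names in the demo are made up.

```python
from pathlib import PurePath

# Format pairs taken from the slide: proprietary or lossy -> open equivalent.
OPEN_EQUIVALENT = {".docx": ".txt", ".xlsx": ".csv", ".jpg": ".tif"}


def suggest_open_format(filename):
    """Return the suggested open-format file name, or None if the file's
    extension has no entry in OPEN_EQUIVALENT."""
    path = PurePath(filename)
    target = OPEN_EQUIVALENT.get(path.suffix.lower())
    return path.with_suffix(target).name if target else None


# Made-up file names for illustration.
for name in ["results.XLSX", "thesis_ch1.docx", "notes.txt"]:
    print(name, "->", suggest_open_format(name))
```

Keeping the original file alongside the converted copy matches the advice to save a copy in the old format, just in case.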
  50. 50. Future File Usability: Accessible • How? – Move to new media • Hardware dies and becomes obsolete – Floppy disks! • Expect average lifetime to be 3-5 years • Keep up with technology
  51. 51. WHERE TO GO FROM HERE
  52. 52. easylocum, https://www.flickr.com/photos/easylocum/2921542814 (CC BY)
  53. 53. Chris Hoving, https://www.flickr.com/photos/pcrucifer/2433274595 (CC BY-ND)
  54. 54. http://www.flickr.com/photos/72775875@N06/7729764370/ (CC BY-NC-SA)
  55. 55. Resources • Data Services – http://uwm.edu/libraries/dataservices/ • http://uwm.edu/libraries/dataservices/#videos • Data Management Guide – http://guides.library.uwm.edu/data • Data Services Librarian – briney@uwm.edu
  56. 56. Thank You! • This presentation is available under a Creative Commons Attribution (CC-BY) license • Some content courtesy of Dorothea Salo – http://www.graduateschool.uwm.edu/research/researcher-central/proposal-development/data-plan/boot-camp/ (CC BY)
