
Data Management for Undergraduate Research


This is the PowerPoint for my "Data Management for Undergraduate Researchers" workshop for the Office of Undergraduate Research Seminar and Workshop Series. Major topics include motivations behind good data management, file naming, version control, metadata, storage, and archiving.

Published in: Data & Analytics, Education


  1. Data Management for Undergraduate Researchers. Office of Undergraduate Research Seminar and Workshop Series. Rebekah Cummings, Research Data Management Librarian, J. Willard Marriott Library, University of Utah. June 18, 2015
  2. • Introductions • What are data? • Why manage data? • Data Management Plans • File Naming • Metadata • Storage and Archiving • Questions
  3. Name, Major, Research Project
  4. What are data? “The recorded factual material commonly accepted in the research community as necessary to validate research findings.” - U.S. OMB Circular A-110
  5. Data are diverse
  6. Data are messy
  7. Why manage data? Your best collaborator is yourself six months from now, and your past self doesn’t answer emails.
  8. Why else manage data? • Save time and increase efficiency • Meet grant requirements • Promote reproducible research • Enable new discoveries from your data • Make the results of publicly funded research publicly available
  9. We are trying to avoid this scenario…
  10. Two bears’ data management problems 1. Didn’t know where he stored the data 2. Saved one copy of the data on a USB drive 3. Data was in a format that could only be read by outdated, proprietary software 4. No codebook to explain the variable names 5. Variable names were not descriptive 6. No contact information for the co-author Sam Lee
  11. Data Management Plan. PLANNING. Courtesy of the UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
  12. Scenario. You develop a research project during your undergraduate experience. You write up the results, which are accepted by a reputable journal. People start citing your work! Three years later someone accuses you of falsifying your work. (Scenario adapted from the MANTRA training module.)
  13. • Would you be able to prove you did the work as you described in the article? • What would you need to prove you hadn’t falsified the data? • What should you have done throughout your research study to be able to prove you did the work as described?
  14. Elements of a DMP • Types of data, including file formats • Data description • Data storage • Data sharing, including confidentiality or security restrictions • Data archiving and responsibility • Data management costs
  15. File naming
  16. File naming best practices • Be descriptive • Don’t be generic • Appropriate length • Be consistent
  17. Who filed better? • PLPP_EvaluationData_Workshop2_2014.xlsx • MyData.xlsx • publiclibrarypartnershipsprojectevaluationdataworkshop22014CummingsHelenaMontana.xlsx
  18. File naming best practices • File names should include only letters, numbers, and underscores. • No special characters (%@#*?!) • No spaces • Lowercase or camel case (LikeThis) • Not all systems are case sensitive. Assume this, THIS, and tHiS are the same.
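
The character rules on slide 18 can be checked mechanically. The following is a minimal sketch, not part of the original workshop: the function names `is_clean_name` and `find_case_collisions` are hypothetical, and it assumes a single dot before the extension is acceptable, which the slides do not address.

```python
import re

# Assumption: the part before the extension may contain only ASCII letters,
# digits, and underscores; one dot is allowed to introduce the extension.
ALLOWED_STEM = re.compile(r"[A-Za-z0-9_]+")
ALLOWED_EXTENSION = re.compile(r"[A-Za-z0-9]+")


def is_clean_name(filename: str) -> bool:
    """Return True if the name avoids spaces and special characters."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No extension present: the whole name must pass the stem rule.
        return bool(ALLOWED_STEM.fullmatch(filename))
    return bool(ALLOWED_STEM.fullmatch(stem)) and bool(
        ALLOWED_EXTENSION.fullmatch(extension)
    )


def find_case_collisions(filenames):
    """Group names that differ only by letter case (this, THIS, tHiS)."""
    groups = {}
    for name in filenames:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    print(is_clean_name("PLPP_EvaluationData_Workshop2_2014.xlsx"))  # True
    print(is_clean_name("my data%.xlsx"))                            # False
    print(find_case_collisions(["this.txt", "THIS.txt", "other.txt"]))
```

A check like this only covers characters and case. Whether a name is descriptive, consistent, and of appropriate length still needs human judgment.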
  19. Dates and numbering… 1. Use leading zeros for scalability: 001, 002, 009, 019, 999 2. If using dates, use YYYYMMDD • June2015 = BAD! • 06-18-2015 = BAD! • 20150618 = GREAT! • 2015-06-18 = This is fine too
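
Leading zeros and YYYYMMDD dates matter because file browsers usually sort names as text, not as numbers or calendar dates. The sketch below illustrates this; the helper names `numbered_name` and `dated_name` and the `sample` prefix are made up for the example.

```python
from datetime import date


def numbered_name(prefix: str, index: int, width: int = 3) -> str:
    """Pad the sequence number with leading zeros, e.g. sample_009."""
    return f"{prefix}_{index:0{width}d}"


def dated_name(day: date, description: str) -> str:
    """Put the date first in YYYYMMDD form so names sort chronologically."""
    return f"{day:%Y%m%d}_{description}"


indices = (1, 2, 9, 19, 999)
padded = [numbered_name("sample", i) for i in indices]
unpadded = [f"sample_{i}" for i in indices]

# Text sorting keeps the padded names in numeric order...
print(sorted(padded))
# ...but places sample_19 before sample_2 when the zeros are missing.
print(sorted(unpadded))

print(dated_name(date(2015, 6, 18), "SoilSamples"))  # 20150618_SoilSamples
```

The width of three digits is a choice that covers up to 999 files. A project expecting more would pick a larger width at the start.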
  20. Who filed better? • July 24 2014_SoilSamples%_v6 • 20140724_NSF_SoilSamples_Cummings • SoilSamples_FINAL
  21. File organization best practices • Top level folder should include project title and date. • Sub-structure should have a clear and consistent naming convention. • Document your structure in a README text file.
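
One way to follow the three points on slide 21 is to create the folder skeleton and its README in a single step, so the documentation cannot drift from the structure. This is a sketch under assumptions: the function `create_project`, the subfolder names, and the README layout are illustrative and do not come from the workshop.

```python
import tempfile
from pathlib import Path


def create_project(root: Path, title: str, start: str, subfolders) -> Path:
    """Create <title>_<start>/ with subfolders and a README describing them."""
    project = root / f"{title}_{start}"
    project.mkdir(parents=True, exist_ok=True)

    readme_lines = [f"Project: {title}", f"Started: {start}", "", "Folder structure:"]
    for name in subfolders:
        (project / name).mkdir(exist_ok=True)
        readme_lines.append(f"  {name}/")

    (project / "README.txt").write_text("\n".join(readme_lines) + "\n", encoding="utf-8")
    return project


if __name__ == "__main__":
    # A temporary directory keeps the demonstration from touching real files.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project(
            Path(tmp), "SoilSamples", "20140724",
            ["data_raw", "data_processed", "scripts", "docs"],  # hypothetical names
        )
        print((project / "README.txt").read_text(encoding="utf-8"))
```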
  22. File organization exercise
  23. Metadata • Unstructured data: There was a study put out by Dr. Gary Bradshaw from the University of Nebraska Medical Center in 1982 called “Growth of Rodent Kidney Cells in Serum Media and the Effect of Viral Transformation On Growth”. It concerns the cytology of kidney cells. • Structured data: Title: Growth of rodent kidney cells in serum media and the effect of viral transformations on growth. Author: Gary Bradshaw. Date: 1982. Publisher: University of Nebraska Medical Center. Subject: Kidney -- Cytology
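
The practical difference between the two forms on slide 23 is that structured metadata can be searched and exchanged by software. The sketch below stores the slide's example as labeled fields and saves it as JSON. The lowercase key names, the choice of JSON, and the helper `records_matching` are assumptions for illustration; a real project would normally follow a metadata standard used in its discipline.

```python
import json

# The same facts as the unstructured sentence, as labeled fields.
record = {
    "title": "Growth of rodent kidney cells in serum media and the effect "
             "of viral transformations on growth",
    "author": "Gary Bradshaw",
    "date": "1982",
    "publisher": "University of Nebraska Medical Center",
    "subject": "Kidney -- Cytology",
}


def records_matching(records, field, value):
    """Return every record whose given field equals the given value."""
    return [item for item in records if item.get(field) == value]


# Labeled fields can be filtered directly, with no need to parse a sentence.
print(records_matching([record], "date", "1982")[0]["author"])  # Gary Bradshaw

# A plain-text format such as JSON stays readable without special software.
print(json.dumps(record, indent=2))
```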
  24. Why create metadata?
  25. IJ? XVAR? FNAME?
  26. Data documentation includes… • Questionnaires • Interview protocols • Lab notebooks • Code or scripts • Consent forms • Samples, weights, methods • README files
  27. Data Storage
  28. LOCKSS (Lots of Copies Keeps Stuff Safe)
  29. Options for data storage • Personal computers or laptops • Networked drives • External storage devices
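
Keeping several copies only helps if the copies are intact. A minimal sketch of copying a file to more than one location and confirming each copy with a SHA-256 checksum follows. The function names and the folder names `laptop`, `network_drive`, and `external_disk` are hypothetical stand-ins for the storage options on slide 29.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destinations) -> list:
    """Copy source into each destination folder and check every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = Path(shutil.copy2(source, folder))
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "laptop" / "20150618_SoilSamples.csv"
        data.parent.mkdir()
        data.write_text("site,ph\nA,6.8\n", encoding="utf-8")
        for copy in copy_and_verify(data, [root / "network_drive", root / "external_disk"]):
            print("verified", copy.name, "in", copy.parent.name)
```

Recording the checksum alongside the data also makes it possible to check later that an archived copy has not been silently corrupted.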
  30. Storing sensitive data • If possible, collect the necessary data without using direct identifiers • Otherwise, de-identify your data upon collection or immediately afterwards • Do not store or share sensitive data on unencrypted devices • Talk to IRB
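
As a rough illustration of the second bullet, the sketch below replaces direct identifiers with study codes and returns the link between codes and identities as a separate key. Everything here is an assumption made for the example: the function `deidentify`, the column names, and the `P001` code format. Removing names and emails does not by itself guarantee anonymity, so the approach for a real study should be agreed with the IRB.

```python
def deidentify(rows, direct_identifiers):
    """Split rows into de-identified records and a separate re-identification key.

    The key maps each study code to the removed identifiers. It should be
    stored apart from the data, on an encrypted device.
    """
    cleaned = []
    key = {}
    for number, row in enumerate(rows, start=1):
        code = f"P{number:03d}"
        key[code] = {field: row[field] for field in direct_identifiers if field in row}
        kept = {field: value for field, value in row.items()
                if field not in direct_identifiers}
        cleaned.append({"participant_code": code, **kept})
    return cleaned, key


if __name__ == "__main__":
    responses = [
        {"name": "Example Person", "email": "person@example.org", "score": 4},
        {"name": "Another Person", "email": "another@example.org", "score": 5},
    ]
    data, key = deidentify(responses, {"name", "email"})
    print(data)  # analysis data without names or emails
```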
  31. Thinking long-term
  32. Archiving options • Public repository – FigShare • Domain-specific repository • Institutional repository
  33. Major takeaways • Data management starts at the beginning of a project • Document your data so that someone else could understand it • Have more than one copy of your data • Consider archiving options when you are done with your project
  34. Questions? rebekah.cummings@utah.edu (801) 581-7701 Marriott Library, 1705Y …or ask now!
