
Research Data Management in the Humanities and Social Sciences

This two-part presentation for librarians reviews basic concepts and concerns with research data management, and is targeted to those working with humanists and social scientists. You are free to re-use and modify with attribution.

Published in: Data & Analytics

  1. Research Data Management: Humanities and Social Sciences Edition
     CC BY-NC
     Celia Emmelhainz and Suzi Cole
     August 11, 2015
     Modified from a presentation by Leslie Barnes, Dylanne Dearborn, and Andrew Nicholson at http://guides.library.utoronto.ca/RDM-intro
  2. Our assumptions
     • All liaison librarians need a basic knowledge of research data management (RDM).
     • RDM is part of the librarian’s toolkit for serving faculty research needs.
     • We don’t all need to be data experts, just as we aren’t experts in many areas that we cover.
     • RDM is one of many topics we discuss with faculty over time, like collections, instruction, course guides, and student research.
     • Our faculty may not know RDM terms or may not understand what our institutional repository or other archives can do with data.
     • Humanists may react negatively to the term “data.”
     • (Optional): we can help faculty by reading drafts of their data management plans: if we don’t understand them, reviewers won’t either.
     • Knowing data concepts enhances our role and expands our visibility.
     • Data collection and the data lifecycle are part of where we help with curation in the library.
     • This is a new knowledge area for all academic librarians.
  3. Why do academic libraries help with data management?
     • Library culture is to acquire, organize, and preserve information
     • Logical extension of services we’ve traditionally been involved with
     • Libraries bring people together across disciplinary differences & campuses
     Reading: Coates (2014) Ensuring research integrity: the role of data management in current crises. C&RL News 75(11): 598-601.
  4. After these sessions, you should…
     ● Know the concepts in data management
     ● Feel less anxious when talking about data
     ● Begin listening to faculty talk about their research process and outputs
     ● Know where to get more help with research data for faculty in your disciplines
  5. But why liaisons?
     ● A logical extension of our role as connections between the library and teaching faculty
     ● A great way to show faculty that we care about their research as well as teaching
     ● Liaisons as natural point of “triage”
     Info: eScience Team presentation on liaison roles. Image: CC0 from pixabay.com
  6. Liaisons – Learning Over Time
     First Steps: Get comfortable with the idea of research data management.
     Next Steps: Start a conversation with faculty about their needs, share resources, and direct them to data librarians for complex questions.
     Moving Ahead: Take self-paced courses for librarians on the web. And try it out! Try managing data for one of your own projects.
     Source: eScience Team presentation on liaison roles for data management
  7. Our path…
     Today
     …introduction to data management
     …types of research data you’ll encounter
     …data formats and organization
     Thursday
     …intro to data storage
     …intro to data sharing
     …advising on data management plans
  8. Q1: What is [research data]?
     Prompt: what materials do your faculty use to make sense of their research?
  9. “Research data is collected, observed, or created for purposes of analysis to produce original research results.” - U Edinburgh
  10. Q2: What are [data] in the humanities?
  11. Textual data in the humanities could include:
      - Scholarly editions
      - Text corpora
      - Text with markup
      - Thematic collections
      - Annotations
      - Accompanying analysis
      - Finding aids
      Cf: guides.library.ucla.edu/c.php?g=180580&p=1187629, guide.dhcuration.org/intro/
      Image source: slideshare.net/ULCCEvents/the-humanities-and-data-management
  12. Data in the qualitative social sciences could include:
      • microfilms
      • copies of old documents
      • oral interviews
      • video tapes
      • hand-written records
      From: www.nsf.gov/sbe/ses/common/archive.jsp
  13. Humanities and arts data:
      ● Texts used for research
      ● Annotations
      ● Images and illustrations
      ● Citations
      ● Bibliographic information
      ● Contextual information
      ● Audio or video files
      Health and Life Sciences data:
      ● Health indicators, vital signs
      ● Protein or genetic sequences
      ● Spectra and images
      ● Artifacts and samples
      ● Slides and specimens
      Social Sciences data:
      ● Survey responses
      ● Focus groups and interviews
      ● Administrative records
      ● Demographic information
      ● Opinion polling
      ● Maps and geospatial data
      ● Websites, primary sources
      Physical Sciences data:
      ● Sensor or lab measurements
      ● Computer modeling and simulations
      ● Observations and/or field notes
      ● Numerical measurements
      Cf: Best Practices for Arts/Humanities Data Management Plans, CU-Boulder http://bit.ly/1MkKCIa
  14. DigitalThoreau.org: On the left, the Princeton edition of Walden; right, original 1847 draft with changes marked up.
  15. Text Encoding Initiative (TEI) is a markup language that records the structure of text (author, chapters, pages, quotes) for digital humanities/curation purposes.
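To make the idea of structural markup concrete, here is a minimal sketch in Python that reads a small TEI-style fragment and pulls out the author, chapter headings, and quotes. The fragment is hand-written for illustration and is not taken from the Digital Thoreau project; it uses the TEI namespace and common TEI element names, but it omits header elements that a complete, valid TEI file would need, and the paragraph text is a placeholder.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the "tei" prefix is just a local label for lookups.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative fragment only: not a complete, valid TEI document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Walden</title>
        <author>Henry David Thoreau</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter" n="1">
        <head>Economy</head>
        <p>Placeholder paragraph with a <quote>placeholder quotation</quote>.</p>
      </div>
    </body>
  </text>
</TEI>"""


def summarize(tei_xml):
    """Return the title, author, chapter headings, and quotes in a TEI string."""
    root = ET.fromstring(tei_xml)
    return {
        "title": root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS),
        "author": root.findtext(".//tei:titleStmt/tei:author", namespaces=TEI_NS),
        "chapters": [
            head.text
            for head in root.findall(".//tei:div[@type='chapter']/tei:head", TEI_NS)
        ],
        "quotes": [q.text for q in root.iter("{http://www.tei-c.org/ns/1.0}quote")],
    }


print(summarize(SAMPLE))
```

Because the structure is recorded in the markup itself, a script can find every chapter or quotation without guessing from layout, which is much of what makes encoded texts usable as research data.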
  16. Ask Yourself (#1): Using a project summary, ask yourself:
      - What is this research project about?
      - What types of data are being collected?
      - What types of data are being created?
  17. Data (the stuff we do research with) are vital at every point in the research lifecycle.
      Image: www.lib.uci.edu/dss/images/lifecycle.jpg
  18. Example: data across the lifecycle (temperature data from a lake)
      Raw → Processed → Analyzed → Finalized/Published
  19. WHY manage data?
      ① for the researchers’ own current/future benefit
      ② for transparency and integrity
      ③ for sharing knowledge and how it was constructed
      ④ to meet grant requirements (NEH, NSF)
      ⑤ to comply with ethics requirements
      ⑥ to increase exposure of faculty research
  20. 2: Data Formats and Organization
      CC image from pixabay.com/en/filing-cabinet-office-furniture-146160/
  21. File Naming video
      ● Use meaningful names
      ● Avoid special characters
      ● Use caps or underscores, not spaces
      ● Choose a standard date format: YYYYMMDD or YYYY-MM-DD
      ● Label versions (v2, v15)
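The file naming tips above can be sketched as a small Python helper. This is one possible convention, not a prescribed one: the function name `make_filename`, the order of the parts, the lower-casing, and the zero-padded version number (so that v02 sorts before v15) are all assumptions made for illustration.

```python
import re
from datetime import date


def clean(part):
    """Replace spaces and special characters with underscores."""
    part = re.sub(r"[^A-Za-z0-9]+", "_", part.strip())
    return part.strip("_").lower()


def make_filename(project, description, day, version, ext):
    """Build a name of the form project_description_YYYYMMDD_vNN.ext."""
    stamp = day.strftime("%Y%m%d")  # standard date format from the slide
    extension = ext.lstrip(".").lower()
    return f"{clean(project)}_{clean(description)}_{stamp}_v{version:02d}.{extension}"


# Hypothetical file: project, description, and date are made up.
print(make_filename("Walden", "draft notes (final!)", date(2014, 12, 11), 2, ".TXT"))
# prints walden_draft_notes_final_20141211_v02.txt
```

Generating names from a function rather than typing them by hand is one way a research team can keep a convention consistent across many files.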
  22. Data Structures video
      Could organize by:
      ● Type of information
      ● Date and time
      ● Research project
      ● Theme or subject
      Example paths: frontispieces/20141211/images or images/frontispieces/20141211
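The two example paths on this slide hold the same three pieces of information in a different order. A minimal Python sketch, assuming the labels "theme", "date", and "type" for those pieces (the labels and the `build_path` helper are illustrative, not from the slides), shows how a folder scheme is just a chosen ordering:

```python
from pathlib import PurePosixPath


def build_path(order, values):
    """Join folder names in the chosen order, e.g. theme/date/type."""
    return PurePosixPath(*(values[key] for key in order))


# The same three facts about a set of files, taken from the slide's example.
values = {"theme": "frontispieces", "date": "20141211", "type": "images"}

by_theme = build_path(["theme", "date", "type"], values)
by_type = build_path(["type", "theme", "date"], values)

print(by_theme)  # frontispieces/20141211/images
print(by_type)   # images/frontispieces/20141211
```

Neither ordering is better in general; what matters is that a project picks one and applies it consistently, with the most important grouping at the top level.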
  23. Data Dictionaries and Codebooks
      These explain what a dataset contains:
      ● Contents or organization of a file
      ● Glossary of key concepts or terms
      ● Definitions for each variable name
      ● Relationships between tables/files
      ● Codes that have been used to sort data
      ● Sampling or other methods used
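As a rough illustration of what a simple data dictionary can look like, the Python sketch below stores one row per variable and writes it out as CSV. The variables, definitions, and codes describe a hypothetical interview study and are invented for this example; the `undocumented` helper is likewise an assumption, showing one way to check that every column in a dataset has an entry.

```python
import csv
import io

# Hypothetical variables for an interview study; all values are illustrative.
codebook = [
    {"variable": "interview_id", "definition": "Unique ID for each interview",
     "type": "text", "codes": ""},
    {"variable": "interview_date", "definition": "Date of interview",
     "type": "date (YYYY-MM-DD)", "codes": ""},
    {"variable": "language", "definition": "Language of interview",
     "type": "coded", "codes": "1=English; 2=French"},
]


def codebook_to_csv(rows):
    """Write the data dictionary as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["variable", "definition", "type", "codes"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def undocumented(dataset_columns, rows):
    """List dataset columns that have no entry in the data dictionary."""
    documented = {row["variable"] for row in rows}
    return [column for column in dataset_columns if column not in documented]


print(codebook_to_csv(codebook))
print(undocumented(["interview_id", "language", "notes"], codebook))
```

Saving the dictionary as a plain CSV file next to the data keeps the documentation in an open format that travels with the dataset.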
  24. Use open formats when possible:
      “Open source” formats keep files accessible over time; proprietary formats may be lost if a company goes out of business. Open formats let future researchers access your data!
      Video: .mov, .mpeg
      Audio: .wav, .mp3
      Data: .csv, .sas
      Images: .tiff, JPEG 2000
      Text: PDF/A, ASCII
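A quick way to act on this slide is to scan a project's file names and flag anything outside the listed formats for review. The Python sketch below copies the slide's list as given; mapping "JPEG 2000" to `.jp2`, "PDF/A" to `.pdf`, and "ASCII" to `.txt` is an assumption, and a file extension alone cannot confirm that a PDF is actually PDF/A. The function names are hypothetical.

```python
from pathlib import PurePath

# Formats as listed on the slide; the .jp2, .pdf, and .txt mappings are assumptions.
SLIDE_FORMATS = {
    "video": {".mov", ".mpeg"},
    "audio": {".wav", ".mp3"},
    "data": {".csv", ".sas"},
    "images": {".tiff", ".jp2"},
    "text": {".pdf", ".txt"},
}


def format_category(filename):
    """Return the slide's category for a file, or None if it is not listed."""
    suffix = PurePath(filename).suffix.lower()
    for category, suffixes in SLIDE_FORMATS.items():
        if suffix in suffixes:
            return category
    return None


def needs_review(filenames):
    """List files whose format is not on the slide and may need conversion."""
    return [name for name in filenames if format_category(name) is None]


# Hypothetical file names for illustration.
print(needs_review(["interview01.wav", "notes.docx", "scan.TIFF"]))
```

A list like this is only a starting point for a conversation with the researcher; whether a flagged file should be converted depends on the discipline and the repository.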
  25. Ask Yourself (#2): Using the project summary, ask yourself:
      - what file formats are the data now in?
      - do they need conversion to open formats?
      - are they well documented with metadata?
  26. Intersession exercise:
      ● Read the NEH guidelines for data management.
      ● View any two data management libguides: Who is the audience? What services are offered? How does it connect to users?
      ● Briefly review your chosen project summary, in preparation for the final class.
  27. Research Data Management: Session Two!
      CC BY-NC
      Celia Emmelhainz and Suzi Cole
      August 13, 2015
      Modified from a presentation by Leslie Barnes, Dylanne Dearborn, and Andrew Nicholson at http://guides.library.utoronto.ca/RDM-intro
  28. 3: Data Security and Sensitive Data
      CC image: pixabay.com/en/computer-security-business-767784/
  29. Don’t let this be you! (or your faculty, or your students…)
      Image: www.neatorama.com/2013/04/24/Backup-Your-Data/
  30. Common options for data storage:
      ● Local hard drives (weak)
        Ex: personal or office desktop, laptop computer
      ● External storage devices (weak)
        Ex: USB drives, external hard drives
      ● Networked storage (okay)
        Ex: university servers, but see Colby**
      ● Cloud storage services (okay)
        Ex: Microsoft, RackSpace, Amazon, Google
  31. Data Storage: Best Practices
      ● Back up all data frequently, especially after major changes
      ● Automate the backup process
      ● Use ‘versioning software’ (see ITS) or file names to track changes in team projects
      The “Rule of 3”: Keep three copies of key data
      … in at least two different locations (original file, local backup, remote backup)
      … in at least one offline/offsite location
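The "Rule of 3" and the advice to automate backups can be sketched with a short Python script that copies a file to several folders and checks each copy against the original. This is a simplified illustration, not a replacement for institutional backup tools: the function names are hypothetical, and it is up to the user to point one of the folders at a genuinely remote or offline location.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, backup_dirs):
    """Copy a file into each backup folder and verify every copy."""
    original = Path(original)
    expected = sha256(original)
    copies = []
    for folder in backup_dirs:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / original.name
        shutil.copy2(original, target)  # copy2 also keeps file timestamps
        if sha256(target) != expected:
            raise IOError(f"Backup at {target} does not match the original")
        copies.append(target)
    return copies
```

Calling `back_up` with one local and one remote folder yields the three copies the slide describes (original, local backup, remote backup); comparing checksums catches a copy that was silently corrupted.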
  32. Sensitive Data:
      …is any data that, if released, could harm the people who participated in the research:
      ● Address, birth date, name, location
      ● Sensitive political opinions
      ● Sexual practices
      ● GPS data locating endangered species
      ● Coordinates for burial sites or sacred places
      Sensitive data is treated with caution; few archiving options exist now.
  33. Concepts in Sensitive Data
      ● Research ethics: protect identities of people interviewed; minimize risk of any leaks
      ● Confidentiality: how participants’ identifiable private information will be managed and disseminated
      ● Disclosure risk: increased with online accessibility of data or storage of documents
  34. Sensitive Data: Best Practices
      ● Collect data without identifying information, if possible
      ● Strip sensitive or identifying information before archiving or sharing research data
      ● Encrypt your computer, and use secure connections and secure servers
      ● Place sensitive data in a restricted archive with an embargo (time delay) or ethics approval required for access
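Stripping identifying information before sharing can be illustrated with a small Python sketch that drops named columns from a CSV file. The column names follow the earlier slide's examples (address, birth date, name, location) but are assumptions about how a dataset might be labeled, and the sample row is invented. Removing columns is only a first step: free-text answers or combinations of remaining fields can still identify a participant.

```python
import csv
import io

# Hypothetical column names based on the slide's examples of identifying data.
IDENTIFYING_FIELDS = {"name", "address", "birth_date", "location"}


def strip_identifiers(csv_text, drop=IDENTIFYING_FIELDS):
    """Return CSV text with the identifying columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [field for field in reader.fieldnames if field.lower() not in drop]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # dropped columns are ignored on write
    return out.getvalue()


# Invented sample data for illustration only.
sample = "participant_id,name,birth_date,response\nP01,Jane Doe,1980-01-01,Agree\n"
print(strip_identifiers(sample))
```

Keeping the original file in secure storage and sharing only the stripped copy lets the researcher return to the full data if an ethics board later approves restricted access.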
  35. Ask Yourself (#3): Using the project summary, ask yourself:
      - Where will data be stored?
      - Who is responsible for storage and backup?
      - How will you manage access to sensitive data?
  36. 4: Data Retention & Preservation
  37. Image from datasupport.researchdata.nl/
  38. “What data do I keep?” It all depends on:
      …whether data is irreplaceable
        e.g. are there other copies of this book, document, version, image, interview?
      …how much data is needed to verify or reanalyze a research project
      …policies of funders, IRB, discipline
  39. Best Practices: Data Preservation
      ● Use open-source, non-proprietary files
      ● Include all software needed, if possible
      ● Note all files and their relationship/structure
      ● Identify who is responsible for preservation
      ● Determine how long data should be held
      ● Budget time and money before starting a project to properly preserve and archive data at the end!
  40. Ask Yourself (#4): Using the project summary, ask yourself:
      - Which data should be kept? Why?
      - How long should data be kept?
      - Who is responsible for preserving the data?
  41. 5: Data Sharing and Publication
  42. Fears in sharing data…
      Often, researchers want to hide their data:
      ● Fear criticism of their methods/results
      ● Fear exposure of confidential data
      ● Fear political/legal ramifications
      ● Fear getting “scooped” on analysis
      ● Believe benefits are low, and the cost is high
      CC image: pixabay.com/en/hands-holding-embracing-loving-718562/
  43. But, sharing data…
      ● Is often required by journals and funders
      ● Reduces the costs of research by reducing project duplication
      ● Is a valuable check on methods and ethics
      ● Helps promote faculty discoveries
      ● Increases the impact of faculty work
      ● May support faculty tenure or salary increases!
  44. Relevant data repositories: and of course…
  45. Data Papers:
      Dataset Description
      Reuse Potential
      Methods
      Overview/Context
  46. Data as a Publication
      ● Data which has been shared can be cited: Data citations involve: author, title, year, publisher / archive, version, URL or DOI for access.
      ● Data citations are a metric that can support tenure and promotion for our faculty!
      ● ORCiDs can help people find and cite data by a given researcher.
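The citation elements listed on this slide can be assembled with a few lines of Python. The ordering and punctuation below follow one common pattern and are an assumption, not a style mandated by the slides; repositories and style guides differ. The author, title, archive, and DOI in the usage line are placeholders, not a real dataset.

```python
def format_data_citation(author, year, title, publisher, version, identifier):
    """Combine the slide's citation elements into a single citation string."""
    return (
        f"{author} ({year}). {title} (Version {version}) [Data set]. "
        f"{publisher}. {identifier}"
    )


# Placeholder values only; this is not a real dataset or DOI.
citation = format_data_citation(
    author="Researcher, A.",
    year=2015,
    title="Example interview transcripts",
    publisher="Example Data Archive",
    version="1.0",
    identifier="https://doi.org/10.0000/example",
)
print(citation)
```

Including the version and a persistent identifier such as a DOI lets a reader retrieve exactly the data that were analyzed, even if the dataset is later updated.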
  47. Best Practices in Data Sharing
      ● Find out who owns the data (researcher? university? funding organization?)
      ● Review legal issues such as copyright or publishers’ embargoes
      ● Consider ethical issues related to sensitive data or communities
      ● See publisher/funder requirements for sharing
  48. Data Management Plans
      CC image: pixabay.com/en/whiteboard-man-presentation-write-849812/
  49. What’s in a Data Management Plan? All the things we’ve discussed!
  50. What’s in a Data Management Plan?
      ● What types of data will be created?
      ● Who will own, have access to, and be responsible for managing these data?
      ● What equipment or methods will capture, process and document the data?
      ● Where will data be stored during and after active research?
      ● How will the data be shared with current or future researchers?
  51. Data Management Plans (DMPs) are a great way to…
      ● plan how you’ll handle research materials
      ● describe how you’ll document, store, and share data so that others can use it
      ● remain accountable for how you use and share research materials
      ● get funded on major research projects!
  52. All research proposals sent to the National Science Foundation (NSF) must include a 2-page data management plan, showing how the data will be cared for and shared. The NSF is a common source of research money in: anthropology, geography, psychology, economics, government, STS, and many interdisciplinary projects.
  53. The NSF expects that all researchers: “should be prepared to place their data in fully cleaned and documented form in a data archive or library within one year after the expiration of an award. Before an award is made, investigators will be asked to specify in writing where they plan to deposit their data set”
      - National Science Foundation guide for social and economic sciences at nsf.gov/sbe/ses/common/archive.jsp
  54. For the NEH, data are “materials generated or collected during the course of conducting research.” Humanities data such as “citations, software code, algorithms, digital tools, documentation... geospatial coordinates… reports, and articles” should be archived. Sensitive information can be excluded. So, humanities faculty should also have a plan for how they’ll archive and share their research data!
      Source: neh.gov/files/grants/data_management_plans_2015.pdf
  55. How do we actually make DMPs?
      ● Templates are a starting point:
      ● However, researchers still need to carefully think through data issues with grants officers, peers, or librarians
      ● http://libguides.colby.edu/data_mgmt
  56. Sample DMPs
      Image: asphalttexas.com/wp-content/uploads/2014/06/Screen-Shot-2014-06-18-at-4.33.29-PM.png
  57. Data management at Colby:
      • Liaisons are first point of contact
      • Suzi and Celia advise on further issues
      • We are an ICPSR member; quantitative researchers can deposit data there.
      • Images and data may be archived in Digital Commons/Shared Shelf; check with Marty.
      Cf. libguides.colby.edu/data_mgmt
  58. Question: What 3 things can you do this year with data management?
      Image: http://www.dailymail.co.uk/news/article-2728736/Otter-aerobics-Large-group-spotted-going-paces-synchronised-exercise.html
  59. More questions? Contact us!
      Celia Emmelhainz, celia.emmelhainz@colby.edu
      Suzi Cole, swcole@colby.edu
  60. Thanks to the New England Collaborative Data Management Curriculum for sharing their slides. Many thanks to Leslie Barnes, Dylanne Dearborn, and Andrew Nicholson at the University of Toronto for sharing their abbreviated slides (http://guides.library.utoronto.ca/RDM-intro), from which this presentation was adapted for the humanities.
