Acrl march2015 final

Roles for Libraries in Providing Research Data Management Services
Nicole Vasilevsky, Oregon Health & Science University
Victoria Mitchell, University of Oregon
Jeremy Kenyon, University of Idaho

Nicole Vasilevsky: Project Manager, Biocurator and Ontologist, Ontology Development Group, OHSU
Victoria Mitchell: Social Science Data & Government Documents Librarian, University of Oregon
Jeremy Kenyon: Research Librarian, University of Idaho Library

1 | Data services at UO Library
2 | UI support for documentation
3 | OHSU data management trainings

Do you have
experience in data
management training?

Why do our patrons
need to know about
data management?

Why?
Researcher Perspective
• Version control
• Track processes for reproducibility
• Quality control
• Stay organized
• Save time and stress
• Avoid data loss
• Format data for reuse (by self, team, or others)
• Document for own recollection, accountability, reuse

Why?
• Reproducibility
http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble
• Funding mandates

At the UO Libraries
Data Services

The UO Environment
• No campus-wide research data policy
• Library leading on research data
management and preservation
• Collaborating with campus IT, Research
Services

The UO Environment
• Digital Scholarship Center
• Open Access Publishing
• Digital Collections
• Institutional Repository
• Interactive Media Development
• Data Services
• Science Data Services Librarian
• Social Science Data Librarian

Services
• Data Management Plans
– Consultation and review

Services
• Consultations with faculty
• Special projects
– Southern Voting Project

Education
• Workshops
• Presentations in classes and new faculty
orientations
• 1-credit course in research data
management for grad students

Graduate Seminar in Data
Management
• 2 iterations so far
• 1st: Spring 2013 – 1-credit course, LIB 407/507
– Made it available to upper-division undergrads; none signed up
• 2nd: Spring 2014 – 1-credit course, LIB 607

Graduate Seminar in Data
Management
Based course around creation of a DMP for a
funding agency
• Students registering for the course were
strongly encouraged to have a research
project already in mind or underway
• Also used, in part and with modification, the
education modules created by DataONE

Data Loss
• Natural disaster
• Facilities infrastructure failure
• Storage failure
• Server hardware/software failure
• Application software failure
• External dependencies (e.g. PKI failure)
• Format obsolescence
• Legal encumbrance
• Human error
• Malicious attack by human or automated agents
• Loss of staffing competencies
• Loss of institutional commitment
• Loss of financial stability
• Changes in user expectations and requirements
CC image by Sharyn Morrow on Flickr
CC image by momboleum on Flickr
Slide adapted from DataONE Education Module: Why Data Management. DataONE. Retrieved March 21, 2013

Spreadsheet for Help with Organizing
Research Project: [Name of research project]
Name: [Your name]
Dates: [when you'll be conducting your research, e.g. 7/14-1/15]
Project Data Folder: [e.g. dissertation_coldfusion_data]
Column headings:
• Research Process/Method / Data Source
• Collection Dates
• Storage Format
– Original Format
– Working Format
– Access Format
– Preservation Format(s)
• File Naming Convention
• Folder / Convention
• Versioning Strategy
• Storage Location
• Who can help?
• Access restrictions?
• Who needs access?
• Software / Tools Required
• Metadata Schema
• Notes
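The organizing spreadsheet on this slide can also be generated as a blank CSV file for students to fill in. The following is a minimal sketch, not the course's actual handout: the function name `build_template` is hypothetical, and it assumes "Storage Format" is a group heading over the four format columns rather than a column of its own.

```python
import csv
import io

# Project-level fields from the slide; bracketed values are the slide's own placeholders.
PROJECT_FIELDS = {
    "Research Project": "[Name of research project]",
    "Name": "[Your name]",
    "Dates": "[when you'll be conducting your research, e.g. 7/14-1/15]",
    "Project Data Folder": "[e.g. dissertation_coldfusion_data]",
}

# Column headings from the slide. Assumption: "Storage Format" is a group
# heading for the four "... Format" columns, so it is not listed separately.
COLUMNS = [
    "Research Process/Method / Data Source",
    "Collection Dates",
    "Original Format",
    "Working Format",
    "Access Format",
    "Preservation Format(s)",
    "File Naming Convention",
    "Folder / Convention",
    "Versioning Strategy",
    "Storage Location",
    "Who can help?",
    "Access restrictions?",
    "Who needs access?",
    "Software / Tools Required",
    "Metadata Schema",
    "Notes",
]


def build_template() -> str:
    """Return the blank organizing spreadsheet as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, placeholder in PROJECT_FIELDS.items():
        writer.writerow([label, placeholder])
    writer.writerow([])  # blank row between project details and the table
    writer.writerow(COLUMNS)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_template())
```

Each data source or method in the project would then get one row under the column headings.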

LIB 607 v.3
• Changed to Data Management for the
Social Sciences (and Digital Humanities)
• Less emphasis on DMP per funder
requirements
• More time to address issues specific to the
social sciences and humanities

@ the University of Idaho Library
Research Data Services
Credit: University of Idaho Creative Services

University of Idaho Characteristics:
• Public, comprehensive, land-grant university
• Strong emphasis on agriculture, environmental science, engineering
• Recent emphasis on developing research data and research
cyberinfrastructure, including library research data services, INSIDE
Idaho, the geospatial data repository, and NKN, a multi-disciplinary
institutional data repository

Research Data Services at the U-Idaho Library
Many modes of service
• Appointments & Consultations
• Northwest Knowledge Network (institutional data repository)
• Embedded Services (buy-outs of librarian time)
• Tool & Technology Support: IQ-Station, ESRI Products, DMPTool, metadata editors
• Website: Data Management Best Practices Guide
• Instruction & Workshops
Raise awareness of research data management & our services
Create a culture of documentation
Transform thinking across disciplines about data distribution & publishing

Focus: creating a culture of documentation
FISH502 “One-shot” Instruction Session
- Class participants: fisheries biology and statistics graduate students
- Exercise:
1) review the following spreadsheet
2) identify the information needed to re-use this dataset

Research consultation: environmental modelling
Post-doc from a multi-institutional project was
primary contact for several teams
Consultation on metadata was made towards the
end of project
Producing 6 discrete collections of data as netCDF
(format required by funder)
Repository required ISO 19115 XML metadata for
describing whole collections

Challenges:
• Understanding the standard
– Attribute Conventions for Dataset Discovery
– ISO 19115-2
– Codelists and controlled vocabularies
• Rules for free-text fields
– What does a good title look like?
• Placement of content
– Should variables be listed in keywords, title, or description?
• Responsibilities
– Who should create XML files: the researcher or us?
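One way to make the "understanding the standard" challenge concrete in a consultation is a quick check of a collection's global attributes against the Attribute Conventions for Dataset Discovery (ACDD). The sketch below is illustrative only. It assumes the four attributes that ACDD 1.3 lists as highly recommended (`title`, `summary`, `keywords`, `Conventions`); the slides do not say which ACDD version the project used. The function name and the example values are hypothetical, and the attributes are passed in as a plain dictionary rather than read from a netCDF file.

```python
# Global attributes that ACDD 1.3 lists as "highly recommended".
# Assumption: the slides do not state which ACDD version applied.
HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def missing_attributes(global_attrs: dict) -> list:
    """Return the highly recommended attributes that are absent or blank."""
    missing = []
    for name in HIGHLY_RECOMMENDED:
        value = global_attrs.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical attributes for one collection; the values are illustrative only.
example_attrs = {
    "title": "Example modelled streamflow collection",
    "summary": "",
    "Conventions": "CF-1.6, ACDD-1.3",
}

if __name__ == "__main__":
    print(missing_attributes(example_attrs))
```

A check like this only reports presence. Whether a title is a good title remains a judgment call for the researcher and the librarian.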

• Re-use and comprehension of data require good documentation
• Researchers often have idiosyncratic and localized, i.e. customized, documentation practices
• Content standards are often not well-known among researchers
• Disciplinary content standards are necessary for enabling advanced modes of data access
Library services must emphasize documentation
Future Directions
Fienberg, S.E. et al. (1985). Sharing Research Data. Washington, D.C.: National Academies Press.
http://www.nap.edu/catalog/2033/sharing-research-data

at Oregon Health & Science University
Research Data Management Efforts

What would you do with
$1k today to make
research communication
better that doesn’t involve
building another tool?

1 | Workshops with the library
2 | Individual consultations

Gummy Bear: the Groundbreaking Paper

Your Data: Gummy Bear Raw Data
Bounces  Amplitude  Color
15       4          blue
43       3          red
58       9          green
75       82         purple
Materials:
• Haribo Gummi Bears Sugar Free, 5 lb bag, Amazon.com (UPC: 422384500110)
• SpringOMatic 3000 (ICanPickleThat, Portland, OR)
http://laughingsquid.com/the-anatomy-of-a-gummy-bear-by-jason-freeny/

Gummy Bear Final Figure
[Figure 1: panel A, gummy skeleton image; panel B, chart of Springiness (bounces/length) by Sample Color (blue, red, green, purple), y-axis 0 to 16]
Figure 1. A) Gummy skeleton with belly button annotated with red arrow. B) Springiness by sample color.
Methods Section: Haribo Gummi Bears (Sugar Free) were purchased from Amazon.com (UPC: 422384500110). Gummy bears were placed in the SpringOMatic 3000 (ICanPickleThat, Portland OR) according to the manufacturer's instructions. The Gummy Anatomy (Jason Freeny) image was cropped in PPT (Microsoft) and annotated to highlight the belly button.
• Figure legends/metadata
• Manipulating images
• Attribution
• Metadata about research resources

Group 1: Gummy Bear Final Data
[Figure: chart of Springiness (Bounces/Amplitude), y-axis 0 to 16; category labels show amplitude (4, 3, 9, 82) and bounces (15, 43, 58, 75)]
15 4 blue
43 3 red
58 9 green
75 82 purple
Methods:
A schematic of a Gummi Bear was cropped to indicate where the belly button is located (Fig. 1). At this point, raw experimental data showing the bounce, amplitude, and color were analyzed and the springiness calculated for each color of bear. This was accomplished by dividing the bounce by the amplitude and plotting this against bear color.
Fig. 1 Belly button of Haribo Sugar Free Gummi Bear
What is missing?
A. Image manipulation
B. Attribution
C. Figure Legends
D. Metadata about resources
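The calculation Group 1 describes (bounces divided by amplitude, per bear color) can be written out so that the analysis step is itself documented. This is a minimal sketch using the raw data from the "Gummy Bear Raw Data" slide; the function names are hypothetical and the zero-amplitude guard is an added assumption, not part of the exercise.

```python
# Raw data from the "Gummy Bear Raw Data" slide: (bounces, amplitude, color).
RAW_DATA = [
    (15, 4, "blue"),
    (43, 3, "red"),
    (58, 9, "green"),
    (75, 82, "purple"),
]


def springiness(bounces: float, amplitude: float) -> float:
    """Springiness as defined in the exercise: bounces / amplitude."""
    if amplitude == 0:
        # Assumption: a zero amplitude is treated as a data-entry error.
        raise ValueError("amplitude must be non-zero")
    return bounces / amplitude


def springiness_by_color(rows):
    """Map each bear color to its springiness value."""
    return {color: springiness(bounces, amplitude) for bounces, amplitude, color in rows}


if __name__ == "__main__":
    for color, value in springiness_by_color(RAW_DATA).items():
        print(f"{color}: {value:.2f}")
```

Keeping the formula in a script, rather than only in a chart, records exactly which two columns were divided.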

[Figure 1: panel A, gummy skeleton image; panel B, chart of Springiness (bounces/length) by Sample Color, y-axis 0 to 16]
Figure 1. A) Gummy skeleton with belly button annotated with red arrow. B) Springiness by sample color.
Methods Section: Haribo Gummi Bears (Sugar Free) were purchased from Amazon.com (UPC: 422384500110). Gummy bears were placed in the SpringOMatic 3000 (ICanPickleThat, Portland OR) according to the manufacturer's instructions.
What is missing?
B. Attribution
C. Figure Legends
D. Metadata about resources

Figure 2: Schematic depiction of Haribo Gummi Bear umbilical skeletal anatomy.
Methods & Materials
Gummi Bears were obtained through Amazon in 3 kg bags. Lot and temperature during transport data were not made available. Bears were housed in a plastic bowl in accordance with IACUC policy and national standards for gummi bear care. They were housed at room temperature on a natural light cycle.
Food and water were provided ad libitum (consumption was not monitored).
Each bear was sampled only once to reduce costs.
What is missing?
B. Attribution
C. Figure Legends
D. Metadata about resources

[Figure 1: panel (a), gummy bear schematic labeled "Belly Button"; panel (b), chart of Springiness (bounces/amplitude) by Gummy Bear Color, y-axis 0.00 to 16.00]
Fig. 1. (a) schematic of the anatomy of a gummy bear (adapted from 1). (b) springiness of bear by color using spring-o-matic.
Methods: Insert the sample of interest, specifically a colored gummy bear (Haribo, Japan). Position the probe above the sample. Press "Tickle" and the SpringOMatic (ICanPickleThat, Portland) will poke the belly button a standard depth of 1 cm. Record the number of bounces and the amplitude of the largest bounce in cm. From these values, the springiness can be calculated (bounce/amplitude).
What is missing?
B. Attribution
C. Figure Legends
D. Metadata about resources

GUMMY BEARS TAUGHT US…
• People see the same data very
differently
• “Detailed” means different things…
• Metadata?!?
• File management is difficult
• Workflow
Vasilevsky N, Wirz J, Champieux R, Hannon T, Laraway B, Banerjee K, Shaffer C, and Haendel M. Lions, Tigers, and Gummi Bears: Springing Towards Effective Engagement with Research Data Management (2014). Scholar Archive. Paper 3571.

CONSULTATIONS
Researcher + 2-3 from
Data Stewardship Team

Conclusions
• Researchers DO need assistance:
– Finding and choosing data standards
– File versioning
– Applying metadata to facilitate data sharing
• “Gummi Bear” themed data management exercise resonated well with students
• Lack of awareness of services and expertise offered by the Library

OHSU New Directions
• OHSU Library is developing data services for researchers
• BD2K educational grants in collaboration with DMICE
www.ohsu.edu/xd/education/library/data

Acknowledgements
OHSU
Melissa Haendel
Robin Champieux
Jackie Wirz
Kyle Banerjee
Bryan Laraway
Chris Shaffer
Kaiser
Todd Hannon
UO
Brian Westra
Karen Estlund
Cathy Flynn-Purvis
John Russell
Idaho
Bruce Godfrey
Nancy Sprague
Lynn Baird
Greg Gollberg
Luke Sheneman
Steven Daley-Laursen

Contact us
Nicole Vasilevsky, vasilevs@ohsu.edu, @N_Vasilevsky
Victoria Mitchell, vmitch@uoregon.edu, @VictoriaStap
Jeremy Kenyon, jkenyon@uidaho.edu, @jr_kenyon
Thank you
