Virginia Tech University Libraries’ Data Service Pilot with the College of Natural Resources and Environment (CNRE)
Natsuko Nicholls, Andi Ogier and Kyrille Goldbeck DeBose
Research Data Access & Preservation (RDAP) Summit
Minneapolis, MN
April 22, 2015
Our Mission:
To deliver a suite of research data services to the Virginia Tech Community.
Our Vision:
Build Infrastructure
Advocate for Data Management
Support Collaboration
Value Partnerships
Project Highlights
Data Profiling
Data Needs Assessment
Data Interviews
5 areas of interest
5 questions
15 faculty recruited
Q1: Data Profiles
ASCII is recommended, but it triples or quadruples the storage size.
Companies are taking open formats and creating their own proprietary standards.
Data Profiles: Summary
● Diversity of data
○ types
○ formats
○ environments
● Data States
● Raw vs. Summarized
● Media Issues
Common Elements: Points of Interest:
Q2. Data Workflows
Fragmented workflows are VERY problematic.
Data Management costs 30% of project time.
Need: High-quality workflows that allow for creativity and spontaneity.
Data Workflows: Summary
● Lifecycles are complicated
● Data Management is time consuming
● Establishing and documenting a workflow is time consuming
Common Elements: Points of Interest:
Q3. Data Challenges
Need: Systems that allow algorithms and processes to be brought to the data.
Original data isn’t always available.
Data Challenges: Summary
● Data challenges:
○ format
○ storage
○ versioning
○ ownership
○ copyright
○ privacy
● Workflows and tracking data
Common Elements: Points of Interest:
Q4: Data Value-Add
Harmonization of data users and producers.
Concern: easy to misinterpret some data; analysis is dependent on specialized knowledge.
Data Value-Add: Summary
● Value-add
○ beyond research community
○ historical data
○ public interest
● Market impacts
● Cost of data collection is measurable
Common Elements: Points of Interest:
Q5: Data Management Planning
DMPs seem rigid and limited.
Standardization hinders innovation; different people measure different things.
Preserve data and prevent loss by summarization.
Data Management: Summary
● Standards vs. specifics
● Use DMP as a tool to find data
● Metadata is vitally important
Common Elements: Points of Interest:
Final Thoughts
Interview process promotes partnerships, starts conversations.
Research Management tools are needed at a project level, not institutional/department level. Provide information and let project teams make the choice.
Need additional training as to why DMPs are important to research. Many faculty still see them as a hindrance.
Questions?
Natsuko Nicholls
Research Data Consultant
nnatsuko@vt.edu
Andi Ogier
Assistant Director, Data Curation
alop@vt.edu
Kyrille Goldbeck DeBose
College Librarian for Natural Resources & Environment and Animal Sciences
kgoldbec@vt.edu
