This slideshow was used in an Introduction to Research Data Management course taught for the Mathematical, Physical and Life Sciences Division, University of Oxford, on 2017-02-15. It provides an overview of some key issues, covering both day-to-day data management and longer-term concerns, including sharing and curation.
2. What is data?
“Representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship”
Digital Curation Centre
Slide adapted from the PrePARe Project
3. What is data?
Any information you use in your research
Slide adapted from the PrePARe Project
4. What is research data management?
Storage
Organizing
Preservation
Documenting
Sharing
Choosing technology
Versioning
Structuring
Backing up
Curation
Security
5. Carrots and sticks
• Work efficiently and with minimum hassle over the lifetime of the project
• Save time and avoid problems in the future
• Make it easy to share your data
• Requirements from funders, University of Oxford, and others
6. Data requirements
• Did you discover any?
• University of Oxford Policy on the Management of Research Data and Records
• Funding body requirements
• Data statement in publications
• Data made available for reuse
8. University of Oxford policy
• The full policy can be viewed on the Research Data Oxford website
• Covers the information needed “to support or validate a research project’s observations, findings or outputs”
• Research data should be:
  • Accurate, complete, identifiable, retrievable, and securely stored
  • Able to be made available to others
9. University of Oxford policy
• Research data should be retained
  • “For as long as they are of continuing value to the researcher and the wider research community”
  • But a minimum of three years
  • Specific requirements from funders take precedence
• Researchers are responsible for:
  • Developing and documenting clear data management procedures
  • Planning for the ongoing custodianship of their data
  • Ensuring legal, ethical, and funder requirements are met
10. Funders’ requirements
• Funding bodies are taking an increasing interest in what happens to research data
• Many require a data management plan as part of grant applications
• RDO website provides a summary of requirements
11. RCUK Common Principles on Data Policy
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner”
http://www.rcuk.ac.uk/research/datapolicy/
12. RCUK Common Principles on Data Policy
• Data with long-term value should be preserved for reuse
• Sufficient metadata should be recorded
• Published results should include information on how to access the supporting data
• Legal, ethical and commercial constraints recognised
• A period of privileged use is permitted to enable researchers to publish results
• Appropriate to use public funds for data management and sharing
13. EPSRC requirements
• EPSRC Policy Framework on Research Data
• Papers must state how underlying data can be accessed
• Data must be appropriately preserved for at least ten years
• Further details on the RDO site
15. “What a mess” by .pst, via Flickr: http://www.flickr.com/photos/psteichen/3915657914/.
Can you find what you need, when you need it?
Once you’ve found it, will it be clear what it is?
16. A gift to your future self - standard working practices
• Set these up as early as possible in a project
• Clear structure for storing files
• File naming conventions
• Version information
• Document practices for future reference
• Particularly important for teams
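One way to make a file naming convention concrete is to generate names with a small function instead of typing them by hand. The sketch below is only an illustration: the pattern (ISO date, project, description, zero-padded version number) and the function name `build_filename` are assumptions made for this example, not a convention prescribed by the course.

```python
import re
from datetime import date


def build_filename(project: str, description: str, version: int,
                   extension: str, when: date) -> str:
    """Return a name such as 2017-02-15_survey_raw-counts_v03.csv."""
    def slug(text: str) -> str:
        # Lower-case the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO dates (YYYY-MM-DD) sort chronologically when sorted alphabetically,
    # and a zero-padded version keeps v02 ahead of v10.
    return (f"{when.isoformat()}_{slug(project)}_{slug(description)}"
            f"_v{version:02d}.{extension.lstrip('.')}")


if __name__ == "__main__":
    print(build_filename("Survey", "Raw counts", 3, "csv", date(2017, 2, 15)))
```

Whatever pattern a team settles on, writing it down (or encoding it like this) is what makes it usable by everyone on the project.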
17. Managing files
• Add tags to files to aid searchability
• Search can be faster than hunting through folders
• Use hyperlinks to link files to each other
• Use shortcuts to avoid duplicating files
• Use file names to order files in a folder, or to record version information
• Reassess your structure periodically
• Move unused items to an archive folder
20. Are you using the right tools for the job?
• Take time to assess whether your current software and methods are meeting your needs
• Sticking with old familiars can be a false economy
• Ask friends and colleagues for recommendations
21. Research Skills Toolkit
• Website and hands-on workshops
• A guide to software, University services, and other tools and resources for research
http://www.skillstoolkit.ox.ac.uk/
22. IT Learning Centre
• Over 200 different IT courses
• Covering software, skills, and new technologies
• ITLC Portfolio offers course materials and other resources
http://portfolio.it.ox.ac.uk/
http://courses.it.ox.ac.uk/
25. Make multiple copies…
…and keep them in different places
Automate the process if you can
Slide adapted from the PrePARe Project
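As a rough illustration of automating this, the sketch below copies one file into several folders and checks each copy against a SHA-256 checksum of the original. The function names `sha256_of` and `back_up` and any destination folders passed in are hypothetical; only standard-library calls are used. It does not replace a managed back-up service.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy `source` into each destination folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

For the copies to count as being "in different places", the destination folders would need to sit on different devices or services; a script like this could then be run on a schedule by the operating system's task scheduler.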
26. Think about your storage media…
… and about file formats
Slide adapted from the PrePARe Project
27. IT Services: Data back-up on the HFS
• HFS is Oxford’s central back-up and archiving service
• Free of charge to University staff and postgraduates
• Automated back-ups of machines connected to University network
• Copies kept in multiple places
• http://www.it.ox.ac.uk/hfs
28. IT Services: Nexus SharePoint
• Document repository and collaboration service
• Store, manage, and share files
• Available free of charge to any member of the University
• http://www.it.ox.ac.uk/services/connect-and-communicate/sharepoint-nexus
29. Data security
• If you’re working with sensitive data, it’s essential to ensure that every copy kept has appropriate security
• Consider encrypting individual files, or your whole hard drive
• InfoSec can provide advice
• https://www.infosec.ox.ac.uk/
31. Make material understandable
What’s obvious now might not be in a few months, years, decades…
MAKE SURE YOU CAN UNDERSTAND IT LATER
Adapted from “Clay Tablets with Linear B Script” by Dennis, via Flickr: http://www.flickr.com/photos/archer10/5692813531/
Slide adapted from the PrePARe Project
32. Documentation and metadata
• The contextual information required to make data intelligible and aid interpretation
• A users’ guide to your data
• For whole datasets, or specific aspects
• Metadata sometimes refers to more structured information
• Designed to be machine readable
33. Make material verifiable and reusable
• Detailing methods helps people understand what you did
• And helps make your work reproducible
• Provide context to minimize risk of misunderstanding or misuse
Image by woodleywonderworks, via Flickr: http://www.flickr.com/photos/wwworks/4588700881/
Slide adapted from the PrePARe Project
34.
35. Exercise
• Imagine you have just downloaded this dataset from an archive
• What contextual or explanatory information is missing?
  • Anything odd about the data that needs clarifying?
• What additional documentation would you like to see supplied
  • For the dataset as a whole?
  • For specific aspects of it?
36. Documentation - what to include
• Who created it, when and why
• Description of the item
• Methodology and methods
• Units of measurement
• Definitions of jargon, acronyms and code
• References to related data
Slide adapted from the PrePARe Project
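The checklist above could be turned into a starter README so that no heading is forgotten. This is a minimal sketch: the headings are taken from the slide, while the function name `readme_skeleton`, the plain-text layout, and the "TODO" placeholder are assumptions made for illustration.

```python
SECTIONS = [
    "Who created it, when and why",
    "Description of the item",
    "Methodology and methods",
    "Units of measurement",
    "Definitions of jargon, acronyms and code",
    "References to related data",
]


def readme_skeleton(dataset_title: str) -> str:
    """Return a plain-text README with one empty section per heading."""
    lines = [f"README for: {dataset_title}", ""]
    for heading in SECTIONS:
        # Underline each heading and leave a TODO marker to be filled in.
        lines += [heading, "-" * len(heading), "TODO", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(readme_skeleton("Example dataset"))
```

Saving the output alongside the data files gives each dataset a consistent place for its documentation.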
37. Metadata - data about data
• A formal, structured description of a dataset
• Used by archives to create catalogue records
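To show what "formal, structured" can mean in practice, here is a small sketch of a metadata record held as key-value pairs, with a check for required fields that are missing or empty. The field names, the `REQUIRED_FIELDS` list, and the sample values are placeholders invented for this example; real archives define their own schemas.

```python
import json

# Hypothetical set of fields an archive might insist on.
REQUIRED_FIELDS = ("title", "creator", "date_created", "description")


def missing_fields(record: dict) -> list[str]:
    """Return required fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder values throughout; the description is left empty on purpose.
record = {
    "title": "Example dataset",
    "creator": "A. Researcher",
    "date_created": "2017-02-15",
    "description": "",
}

if __name__ == "__main__":
    print(json.dumps(record, indent=2))   # machine-readable form
    print("Missing:", missing_fields(record))
```

Because the record is structured, software can both validate it and turn it into a catalogue entry without a person re-reading free text.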
38. ISA tools software suite
Open source metadata tracking tools for the life sciences
http://isa-tools.org/
39. Missing metadata - or the riddle of the sixth toe
• This painting shows Georgiana, Duchess of Devonshire as Diana
• … or maybe Cynthia
• She has six toes - but no one knows why
Public domain image from Wikimedia Commons:
http://commons.wikimedia.org/wiki/File:Georgiana_Cavendish,_Duchess_of_Devonshire_as_Diana.jpg
40. For discussion
• What data management challenges have you encountered?
• What strategies have you personally found useful?
• Be ready to feed back to the group
42. Video by NYU Health Sciences Libraries: http://www.youtube.com/watch?v=N2zK3sAtr-4
43. Long-term data management
• Key issues are preservation and sharing
• What needs to be preserved to validate your research outputs?
• What does your funder require?
• Is there anything you’re obliged to destroy?
• What might have reuse value?
• Can you make any or all of your data available for use by other researchers?
44. Why share data? Reputation
• Get credit for high quality research
• Recognition for contribution to research community
• Open data leads to increased citations
  • Of the data itself
  • Of associated papers
Slide adapted from the PrePARe Project
45. Why share data? Reuse
• Reduces duplication of effort
• Allows public research funding to be used more effectively
• Use in contexts not currently envisaged
• Extend research beyond your discipline
Slide adapted from the PrePARe Project
46. Why share data? Be a trailblazer!
• A paradigm shift in how research outputs are viewed is occurring
• Data outputs are of increasing importance - and are likely to become even more so
  • E.g. journals looking to publish datasets alongside articles
• Be at the forefront of an important shift in the academic world
47. Data sharing - concerns
• Ethical concerns
  • Confidential or sensitive data
• Legal concerns
  • Third party data
• Professional concerns
  • Intended publication
  • Commercial issues (e.g. patent protection)
48. Data sharing - concerns
• Redact or embargo if there is good reason
• Planning ahead can reduce difficulties
Slide adapted from the PrePARe Project
49. Repositories and archives
• Data repositories or archives offer a secure long-term home for research data
• Re3Data.org offers a searchable catalogue of repositories
50. ORA-Data
• University of Oxford’s institutional data archive
• Currently in pilot phase
• Long term preservation for Oxford research datasets without another natural home
• Datasets assigned DOIs
• Datasets can be publicly available, embargoed for a fixed period, or hidden
51. ORA-Data
• Also a catalogue of Oxford-created data held in other archives
• Researchers depositing data elsewhere strongly encouraged to add a record to ORA-Data
http://ox.libguides.com/about-ora-data
52. Figshare
• Figshare is a free online data sharing platform
• Shared research is allocated a DataCite DOI
• A possible alternative to conventional repositories
  • Where no suitable repository is available
  • If you need a data sharing solution in a hurry
53. Data licensing
• A licence clarifies the conditions for accessing and making use of a dataset
• Lets users know
  • What’s allowed without asking further permission
  • How to cite the work
• Specific requests to go beyond the terms of the licence can still be made
54. Data licences - examples
• Creative Commons licences
  • Widely used and recognized
  • Six different flavours, plus CC0 public domain dedication
• Open Data Commons
  • Specifically designed for datasets
  • Recognizes the structure/content distinction for databases
55. Data licensing - guidance
• “How to License Research Data”
• A guide from the Digital Curation Centre
http://www.dcc.ac.uk/resources/how-guides/license-research-data
57. Data management plans
• Ideally created in the early stages of a project
  • While planning, applying for funding, or setting up
  • Initial plan may be expanded later
• Details plans and expectations for data
  • Nature of data and its creation or acquisition
  • Storage and security
  • Preservation and sharing
58. Exercise
• Have a go at drafting a data management plan for your own research
• If there are questions you can’t answer at this stage, make a note of
  • What you need to find out
  • Decisions you need to make
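A draft plan with gaps can be kept as a simple structure that also records what is still unknown. The sketch below is one possible way to do that: the section headings come from the preceding slide, while the `PlanSection` class, the `open_items` function, and the example entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlanSection:
    heading: str
    answer: str = ""  # left empty until the question can be answered
    to_find_out: list[str] = field(default_factory=list)
    to_decide: list[str] = field(default_factory=list)


def open_items(plan: list[PlanSection]) -> list[str]:
    """List unanswered sections, things to find out, and pending decisions."""
    items = []
    for section in plan:
        if not section.answer:
            items.append(f"{section.heading}: no answer yet")
        items += [f"{section.heading}: find out {q}" for q in section.to_find_out]
        items += [f"{section.heading}: decide {d}" for d in section.to_decide]
    return items


# Illustrative entries only.
plan = [
    PlanSection("Nature of data and its creation or acquisition",
                answer="Placeholder description of the data"),
    PlanSection("Storage and security",
                to_find_out=["how much storage is needed"]),
    PlanSection("Preservation and sharing",
                to_decide=["which repository to use"]),
]

if __name__ == "__main__":
    for item in open_items(plan):
        print(item)
```

Reviewing the list of open items from time to time is one way to expand an initial plan as the project develops.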
59. DMP Online
• Create a data management plan using the DMP Online tool
• Developed by the DCC - a national service providing advice and resources
https://dmponline.dcc.ac.uk/
http://www.dcc.ac.uk/
60. “In preparing for battle, I have always found that plans are useless but planning is indispensable.”
Dwight D. Eisenhower
62. Research Data Oxford website
• Oxford’s central advisory website
• Questions? Email researchdata@ox.ac.uk
http://researchdata.ox.ac.uk/
63. IT Services: Research Support Team
• Can assist with technical aspects of research projects at all stages of the project lifecycle
• Help with DMPs, selecting software or storage, building a database, etc.
• Meet with someone for a research data health check
• For more information, see: http://research.it.ox.ac.uk/
64. Research Data MANTRA
• Free online interactive training modules
• Aimed at postgraduates and early career researchers
http://datalib.edina.ac.uk/mantra/
65. Any questions?
Ask now, or email us on
researchdata@ox.ac.uk
Slides and handouts available from
http://research.it.ox.ac.uk/rdmcourses
66. Rights and re-use
• This presentation is part of a series of research data management training resources prepared by the IT Services Research Support Team at the University of Oxford
• The slideshow is based on one developed during the Oxford-based DaMaRO Project. Parts of it also draw on teaching materials produced by the PrePARe Project, DATUM for Health, and DataTrain Archaeology
• With the exception of clip art used with permission from Microsoft, commercial logos and trademarks, and images specifically credited to other sources, the slideshow is made available under a Creative Commons Attribution Non-Commercial Share-Alike License
• Within the terms of this licence, we actively encourage sharing, adaptation, and re-use of this material