3. Data management is a general term covering how you organise, structure,
store and care for the data used or generated during the lifetime of a
research project.
What is research data management?
4. Write a Data Management Plan
Agree data organisation and file formats
Document, and create metadata, from the start
Have robust back-up and quality assurance processes
Plan ahead for preservation and sharing
Include all RDM costs in grant applications
How to achieve good RDM
5. Nine organisations: seven Research Councils (AHRC, BBSRC, EPSRC, ESRC,
MRC, NERC and STFC), plus Innovate UK and Research England
All funding recipients are expected to comply with the Common Principles on
Data Policy
Each council has its own RDM policy and DMP template(s), similar
but different
All required funds for RDM MUST be included in grant applications to
these funders
https://www.ukri.org/
UK Research & Innovation (UKRI) funders
6. NERC Data Policy
The NERC Data Policy sets the RDM ground
rules that NERC-funded researchers must
follow.
The Data Policy details a commitment to support
the long-term availability of environmental data,
and also outlines roles and responsibilities of
those involved in the collection and management
of environmental data.
Central to the policy is that NERC-funded
scientists must make their data openly available
within two years of collection, and deposit it in a
NERC Data Centre for long-term preservation.
http://www.nerc.ac.uk/research/sites/data/
7. In the revised version of the 2017 work programme, the “Open Research Data” (ORD) pilot has been
extended to cover all the thematic areas of Horizon 2020.
Open access to research data thereby becomes applicable by default in Horizon
2020. However, the Commission also recognises that there are good reasons to keep
some or even all research data generated in a project closed.
A DMP is required for all projects participating in the extended ORD pilot, unless
they opt out of the ORD pilot. However, projects that opt out are still encouraged to
submit a DMP on a voluntary basis.
In general terms, your research data should be 'FAIR', that is Findable, Accessible,
Interoperable and Re-usable.
http://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/data-management_en.htm
European Commission – H2020
9. Whether compulsory or not, developing a DMP can help to:
Make informed decisions to anticipate & avoid problems.
Avoid duplication, data loss and security breaches.
Develop procedures early on for consistency.
Ensure data are accurate, complete, reliable and secure.
Save time and effort to make everyone’s lives easier.
Why is it important to have a DMP?
10. A good DMP will answer all the following questions:
What data will be collected or created?
How will the data be documented and described?
Where will the data be stored?
Who will be responsible for data security and backup?
Which data will be shared and/or preserved?
How will the data be shared, and with whom?
How much will all of the above cost?
What should a DMP include?
11. Free and open web-based tool to help researchers write plans:
https://dmponline.dcc.ac.uk/
It features:
Templates based on different requirements
Tailored guidance (disciplinary, funder etc.)
Customised exports to a variety of formats
Ability to share DMPs with others
DMPonline
12. Keep DMPs simple, short and specific, avoid jargon.
Seek advice - consult and collaborate.
Start early – don’t wait until the last minute!
The plan will, and should, change over the life of the project. It is a living
document, so it needs updating regularly.
Always contact the funder when you need clarification or further
information.
Include all expected costs in the data management costing, esp. extra
storage space for active data, data deposit / long-term storage etc.
Tips to share
14. Data storage - basic principles
Use managed, network services
whenever possible to ensure:
Regular back-up
Data Security
Accessibility
Avoid using portable HDs, USB memory
sticks, CDs, or DVDs to mitigate:
Data loss due to damage, failure, or
theft.
Quality control issues due to version
confusion.
Unnecessary security risks.
Image: Digital Preservation Coalition’s new promotional USB stick (dpconline.org):
https://twitter.com/digitalfay/status/411444578122600450/photo/1
15. Off-network storage & back-up
Make at least 3 copies of the data if not
using managed services:
store them on at least 2 different media,
keep storage devices in separate locations
with at least 1 offsite,
check regularly that they work,
ensure everyone knows the process and
follows it.
Ensure you can keep track of different
versions of data, especially when
backing up to multiple devices.
One copy = risk of data loss
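The advice to check regularly that back-up copies work can be partly automated by comparing checksums. The following is a minimal Python sketch, not a prescribed procedure: the function names and the example paths are hypothetical, and it assumes every copy is reachable as an ordinary file path (for example a mounted external drive).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Report, for each copy, whether it exists and matches the original."""
    expected = sha256_of(original)
    return {
        str(copy): copy.is_file() and sha256_of(copy) == expected
        for copy in copies
    }


# Hypothetical usage: one working file and two back-up locations.
# report = verify_copies(
#     Path("data/survey_v3.csv"),
#     [Path("/mnt/external_hd/survey_v3.csv"),
#      Path("/mnt/offsite/survey_v3.csv")],
# )
# Any entry that is False needs attention before the original is relied on.
```

A matching checksum only shows that a copy is identical to the current original; it does not replace keeping track of which version each device holds.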
16. DataStore provides active storage for all research staff and postgrad students.
RDM DataStore provides an allocation that is free at the point of use (currently 0.5TB).
Additional capacity can be purchased for £175 per TB per annum.
Hosting support for very large datasets (>1PB) is available.
This facility also provides a data services cloud for hosting specific data access
mechanisms, or for integrating additional computational infrastructure.
Accessing DataStore:
http://www.ed.ac.uk/information-services/computing/desktop-personal/network-shares/
DataStore
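When costing a grant application, the DataStore figures quoted above (0.5TB free, £175 per TB per annum for extra capacity) can be turned into a rough estimate. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes extra capacity is charged pro rata, which may not match how the service actually bills (for example, it may charge in whole-TB units), so confirm with the service before using a figure.

```python
FREE_ALLOCATION_TB = 0.5        # free allocation quoted in the slide
PRICE_PER_TB_PER_YEAR = 175.0   # GBP per TB per annum, as quoted in the slide


def datastore_cost(required_tb: float, years: float) -> float:
    """Estimate the GBP cost of capacity above the free allocation.

    Assumption: charging is pro rata on the extra capacity.
    """
    if required_tb < 0 or years < 0:
        raise ValueError("required_tb and years must be non-negative")
    extra_tb = max(0.0, required_tb - FREE_ALLOCATION_TB)
    return extra_tb * PRICE_PER_TB_PER_YEAR * years


# Example: 3TB of active data for a 4-year project.
# Extra capacity is 2.5TB, so the estimate is 2.5 * 175 * 4 = 1750.0 GBP.
print(datastore_cost(3, 4))
```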
17. ‘Dropbox-like’ file-hosting service for
non-sensitive data
Allows sharing and synchronisation of data.
Sync using local clients.
Share using local clients or web URL with colleagues anywhere.
20GB free storage or map to personal / group data on DataStore as required.
Built on the ownCloud open source application.
www.ed.ac.uk/is/datasync
DataSync
18. OneDrive For Business is cloud-based file storage for all staff and students
as part of the Office 365 suite, allowing you to store and access personal
and work files from anywhere.
Upload or sync any document from your local computer to OneDrive. It will then be available to you
from any computer, tablet or phone.
Users receive 1 Terabyte (TB) of storage on OneDrive to store all personal and University documents.
Any files created in Office Online are automatically backed up in your OneDrive space.
Share and work on documents with friends and colleagues both inside and outside the University (the latter
may require a Microsoft login).
Easily recover deleted documents or roll back to previous versions.
https://www.ed.ac.uk/information-services/computing/comms-and-collab/office365/onedrive-for-business
OneDrive for Business
19. In the event of a data breach, Unit staff must comply with UoE guidance
on breach management:
https://www.ed.ac.uk/records-management/guidance/data-protection/breach-management
Breaches must be reported immediately to the University Data Protection
Officer.
The Unit IG Lead and Unit Director must also be informed, and there may be
additional requirements in place for specific projects if third party data is
involved.
A form must be completed with as much detail as possible for each breach.
Reporting data breaches
20. A “Walled Garden” service to store and analyse sensitive data.
Open to all UoE researchers at the moment (and partners in future).
Cutting-edge processes and technology.
Standard documentation sets for researchers.
Consultancy service for ongoing guidance, training and audit.
Built for ISO 27001 compliance, with certification to follow.
Costs £8500 per annum. This cost should be built into the grant application and
DMP.
Currently working with 5 pilot projects from CAHSS and CMVM.
Data Safe Haven
21. Encryption is the process of converting data into an unreadable code. You must
have access to a password or a secret encryption key to be able to read an
encrypted file.
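Encryption strength depends on key size, measured in bits. Each extra bit doubles the
number of possible keys, so a larger key takes exponentially longer to crack. A key size
of 8 bits takes effectively 0 milliseconds to crack; a key size of 128 bits is quoted as
taking 150 trillion years.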
For sensitive data, it is advisable to follow NHS Information Governance Guidelines
to protect person-identifiable and sensitive information, using an encryption
algorithm with a minimum key length of 256 bits, such as AES 256 or Blowfish.
(3DES is often listed alongside these, but its keys are at most 168 bits.)
Encryption
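The link between key size and cracking effort comes from the size of the key space: a key of n bits has 2^n possible values. The Python sketch below illustrates that growth for a brute-force search. The guess rate is a purely hypothetical assumption, so the times it prints are not meant to reproduce the figures quoted above, which depend on their own unstated assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of possible keys for a key of the given length in bits."""
    return 2 ** bits


def years_to_exhaust(bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key at a fixed guess rate (worst case)."""
    return keyspace(bits) / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker speed: one trillion guesses per second.
ASSUMED_RATE = 1e12

for bits in (8, 56, 128, 256):
    print(f"{bits:>3} bits: {keyspace(bits):.3e} keys, "
          f"{years_to_exhaust(bits, ASSUMED_RATE):.3e} years to try them all")
```

Adding one bit doubles the search time, which is why moving from 128 to 256 bits is a far larger jump than doubling the effort.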
22. Several options are available to protect personal and sensitive data using encryption:
Encrypt a disk in its entirety: Full Disk Encryption may be applied in order to protect all data held
on the drive. Use for portable HDs, USB sticks, etc.
Encrypt one or more partitions on the disk: Personal and sensitive data can be held on the
encrypted partition, while the anonymised material can be held on the un-encrypted partition.
Create an encrypted container (archive): A file that, when accessed using appropriate software,
can be opened and used in the same way as a physical drive.
Encryption advice from IS:
https://www.ed.ac.uk/infosec/how-to-protect/encrypting
Options for encrypting data
24. Data Preservation is the key to the long-term existence and future
accessibility of research data
Needs thinking about at the planning stage
Requires a trusted repository
Data Sharing is making research data available for others to reuse and
build upon.
It is not giving data away!
What do we mean by preserve and share?
25. Document data clearly and comprehensively
Apply consistent quality assurance processes from the outset
Choose file formats that are open and accessible
Select an appropriate repository or archive
Deposit data in the chosen repository
How to preserve data
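As one way to document data from the outset, a small machine-readable description can be saved alongside each data file. The Python sketch below is an illustration under stated assumptions: the function name, the field names and the ".metadata.json" naming are hypothetical and are not a schema required by any funder or repository, so check what the chosen repository expects before deposit.

```python
import json
from datetime import date
from pathlib import Path


def write_metadata(data_file: Path, title: str, creator: str,
                   description: str, licence: str) -> Path:
    """Write a JSON metadata record next to data_file and return its path."""
    record = {
        # Hypothetical field names chosen for illustration only.
        "title": title,
        "creator": creator,
        "description": description,
        "licence": licence,
        "file_name": data_file.name,
        "file_format": data_file.suffix.lstrip(".") or "unknown",
        "size_bytes": data_file.stat().st_size,
        "documented_on": date.today().isoformat(),
    }
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


# Hypothetical usage:
# write_metadata(Path("data/survey_2019.csv"),
#                title="Household survey 2019",
#                creator="A. Researcher",
#                description="Cleaned responses, one row per household",
#                licence="CC BY 4.0")
```

JSON and CSV are both open, plain-text formats, which fits the advice above on choosing open and accessible file formats.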
26. When choosing a repository consider:
Does the funder require data to be offered to a specific repository?
Is the repository sustainable?
What will be done with the data if the repository closes down?
How much will it cost? Are costs upfront or annual?
Will data be easily accessible to you and to third parties?
How does the repository promote discoverability?
Does the repository record when data is accessed, downloaded, or cited so you get
recognition for your work?
Choosing a data repository
27. Which NERC Data Centre?
British Oceanographic Data Centre (Marine) http://www.bodc.ac.uk/
Centre for Environmental Data Analysis http://www.ceda.ac.uk/
British Atmospheric Data Centre (Atmospheric)
NERC Earth Observation Data Centre (Earth observation)
UK Solar System Data Centre (Solar and space physics)
Environmental Information Data Centre (Terrestrial and freshwater)
http://www.ceh.ac.uk/data
National Geoscience Data Centre (Geoscience) http://www.bgs.ac.uk/services/ngdc/
Polar Data Centre (Polar and cryosphere) https://www.bas.ac.uk/data/uk-pdc/
Archaeology Data Service (NERC-funded research in Science-Based Archaeology)
http://archaeologydataservice.ac.uk/
NERC Data Catalogue Service (integrated, searchable catalogue of NERC's data centres):
https://csw-nerc.ceda.ac.uk/geonetwork/srv/eng/catalog.search#/home
28. Edinburgh DataShare
Edinburgh DataShare is the University’s
OA multi-disciplinary data repository
hosted by the Data Library:
http://datashare.is.ed.ac.uk
Assists researchers who want to share
their data, get credit for data
publication, and preserve their data for
the long-term (DOI, licence, citation)
It can help researchers comply with
funder requirements to preserve and
share their data, and it complies with
Edinburgh’s RDM Policy
29. A service to ensure integrity and long-term retention of golden copy
research data.
The DataVault will allow data creators at the University of Edinburgh to:
Store their data safely in the University’s archival storage platform
Link this data to a record in Pure without having to re-enter any of the data
Receive a DOI for the data
Comply with funder and University requirements
Be confident that their data will be there for them, or their nominated
delegate, to reuse in the future as and when required.
Launching soon.
See https://www.ed.ac.uk/is/research-support/datavault for charges etc
DataVault
30. An external repository may be more suitable for data if:
The funder requires or suggests using a particular repository;
There is an established national or international repository for the discipline;
It is necessary to implement access controls to protect the data.
An international register of research data repositories is maintained at
www.re3data.org as part of DataCite.
External repositories
31. Metadata: Describing data in PURE
Describe all datasets (creating
metadata) in PURE (datasets field):
http://edin.ac/1OF8Auq
Doing this will help datasets to be
discovered, accessed, and reused as
appropriate.
Feeds directly into REF!
Edinburgh Research Explorer:
http://www.research.ed.ac.uk
33. General RDM queries can be sent to data-support@ed.ac.uk
RDM website: http://www.ed.ac.uk/is/research-data-service
RDM blog: http://datablog.is.ed.ac.uk
Research data support
34. RDM Training
RDM training at the University of Edinburgh:
Managing your research data: why is it important and what should you do?
Good Practice in Research Data Management
Creating a Data Management Plan (DMP) for your grant application
Working with personal and sensitive research data
Handling data with SPSS
www.ed.ac.uk/information-services/research-support/research-data-service/training
35. MANTRA
MANTRA is an internationally recognized
self-paced online training course developed
by the Data Library Team for PGRs and
early career researchers in data
management issues.
Anyone doing a research project will
benefit from at least some part of the
training (and you can pick and choose).
Data handling exercises with open datasets
in 4 analytical packages: R, SPSS, NVivo,
ArcGIS. http://datalib.edina.ac.uk/mantra
36. Information Security: https://www.ed.ac.uk/infosec
Records Management: https://www.ed.ac.uk/records-management
Institute of Academic Development: https://www.ed.ac.uk/institute-academic-development/research-roles
Edinburgh Innovations: https://www.ed.ac.uk/edinburgh-innovations
Other places to get support