Webinar delivered by the OU Library Research Support team on 21st March 2020. Covers essential tips for working with research data, including file storage, information security, file naming, metadata and working with participants.
WHAT WE’LL COVER
• What is Research Data Management?
• The data lifecycle
• Policies
• Storing and organising data
• Working with personal and
sensitive data
• Questions and further information
WORKING WITH RESEARCH DATA
“Research data management concerns the
organisation of data, from its entry to the research
cycle through to the dissemination and archiving of
valuable results. It aims to ensure reliable
verification of results, and permits new and
innovative research built on existing information."
Digital Curation Centre (2011)
Making the Case for Research Data Management
http://www.dcc.ac.uk/sites/default/files/documents/publications/Making%20the%20case.pdf
UK Data Archive Data Lifecycle model
http://www.data-archive.ac.uk/create-manage/life-cycle
Why spend time and effort on this?
• So you can work efficiently and
effectively - save time and reduce
frustration
• Because your data is precious
• To enable data re-use and sharing
• To meet funders’ and institutional
requirements
What does the OU expect?
“Research data must be managed to the highest standards
throughout their lifecycle in order to support excellence in
research practice.”
“In keeping with OU principles of openness, it is expected
that research data will be open and accessible to other
researchers, as soon as appropriate and verifiable, subject
to the application of appropriate safeguards relating to the
sensitivity of the data and legal and commercial
requirements.”
OU Research Data Management Policy, November 2016
http://www.open.ac.uk/library-research-support/sites/www.open.ac.uk.library-research-support/files/files/Open-University-Research-Data-Management-Policy.pdf
What do funders expect?
Concordat on Open Research Data
https://www.ukri.org/files/legacy/documents/concordatonopenresearchdata-pdf/
“Good data management is
fundamental to all stages of the
research process and should be
established at the outset.”
“Open access to research data is an
enabler of high quality research, a
facilitator of innovation and safeguards
good research practice.”
What does the OU provide?
• Support from the library research support
team and website
library-research-support@open.ac.uk
http://www.open.ac.uk/library-research-support/
What does the OU provide?
• A research data repository (ORDO) for
secure, long-term storage of data, meeting
funder requirements
https://ordo.open.ac.uk
“Start as you mean to go on”
Thinking about the
requirements at the beginning
of the project will limit the
work needed during and at
the end.
Data storage and security
There are several storage options
available:
• OU networked file storage
• OneDrive
• SharePoint
• STEM Specialist Support Unit
• ORDO
• Cloud-based services (Dropbox,
Google Drive etc.)
Tip: See the comparison guide
Information Security: Why?
Adopting appropriate security measures will help
protect your data from:
• Breaches of confidentiality
• Failures of integrity (i.e. the accuracy and consistency
of data)
• Interruptions to the availability of data
• Loss through accidental or malicious damage,
modification or theft
Information Security: How?
Think about:
• What type of data do you have?
• Who needs access to what?
• Secure storage
• Transfer of data
• Encryption of portable storage devices
• Physical security
• Remote working
• Use cloud storage services (Dropbox) responsibly
External collaborators: IT Options
• OneDrive
• SharePoint
• ORDO
• Zendto (for one-off transfers)
• Be wary of Dropbox & similar
Organising your data
Filing is more than saving files; it’s making
sure you can find them later in your project.
• Naming
• Directory Structure
• File Types
• Versioning
All these help to keep your data safe and
accessible.
File naming
Decide on a file naming convention at the start of your project. Useful file
names are:
• consistent.
• meaningful to you and your colleagues.
• structured so that you can find the file easily.
Agree on the following elements of a file name:
• Vocabulary
• Punctuation
• Dates (YYYY-MM-DD)
• Order
• Numbers
• Version information
Ideally you should be able to tell what’s in a file before opening it.
Tip: create a readme file detailing the naming scheme.
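One way to keep everyone to the agreed convention is to generate file names rather than type them. The sketch below is only an illustration: the function name `build_filename`, the element order (type, area, title, date, optional version) and the hyphen separator are assumptions made for this example, not an OU standard.

```python
import re
from datetime import date
from typing import Optional


def build_filename(doc_type: str, area: str, title: str, created: date,
                   extension: str, version: Optional[int] = None) -> str:
    """Join the agreed elements in a fixed order, separated by hyphens."""
    def clean(part: str) -> str:
        # Drop spaces and punctuation so each element is a single token.
        return re.sub(r"[^A-Za-z0-9]", "", part)

    # date.isoformat() gives YYYY-MM-DD, which sorts chronologically.
    parts = [clean(doc_type), clean(area), clean(title), created.isoformat()]
    if version is not None:
        # Zero-pad the version so v02 is listed before v10.
        parts.append(f"v{version:02d}")
    return "-".join(parts) + "." + extension.lstrip(".")


print(build_filename("Slides", "RDM", "Working With Research Data",
                     date(2020, 3, 21), "pptx", version=2))
```

Whatever elements and order you settle on, record them in the readme file so that colleagues can follow the same scheme.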
File naming: an example
Slides-RDM-WorkingWithResearchData-2019-01.ppt
• Slides: type of document
• RDM: general area of work / topic
• WorkingWithResearchData: specific area of work / title
• 2019-01: date
File naming: what to avoid…
Dan.doc
My paper.doc
Results.xls
August Mtg.doc
20June.csv
IMPORTANT.pdf
Article_Manuscript October_FINAL.doc
Article_Manuscript October_FINAL FINAL.doc
Article_Manuscript October_FINAL FINALv1.doc
Article_Manuscript October_FINAL FINALv2.doc
Article_Manuscript October_FINAL FINALv2 last version.doc
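Names like these can be caught with a simple automated check. The following is a rough sketch only: the function `naming_problems` and its three rules (no spaces, no "final" style labels, a year-first date present) are assumptions chosen to match the examples on this slide, and a real project would adapt them to its own convention.

```python
import re
from typing import List

# Accepts YYYY-MM or YYYY-MM-DD anywhere in the name.
ISO_DATE = re.compile(r"\d{4}-\d{2}(-\d{2})?")


def naming_problems(filename: str) -> List[str]:
    """Return a list of reasons why a file name breaks the (assumed) rules."""
    stem = filename.rsplit(".", 1)[0]  # ignore the extension
    problems = []
    if " " in stem:
        problems.append("contains spaces")
    if re.search(r"final|last version", stem, re.IGNORECASE):
        problems.append("uses 'final' instead of a version number")
    if not ISO_DATE.search(stem):
        problems.append("no year-first date")
    return problems


for name in ["Slides-RDM-WorkingWithResearchData-2019-01.ppt",
             "My paper.doc",
             "Article_Manuscript October_FINAL FINAL.doc"]:
    print(name, "->", naming_problems(name) or "looks fine")
```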
Documentation and metadata
What do others need to understand your data?
Metadata is additional information that is required to make
sense of your files – it’s data about data.
Project level
• For what purpose was data created
• What does the dataset contain
• How was data collected
• Who collected the data and when
• How was the data processed
• What possible manipulations were
done to the data
• What were the quality assurance
procedures
• How can data be accessed
CESSDA ERIC: https://www.cessda.eu/Research-Infrastructure/Training/Expert-Tour-Guide-on-Data-Management
Object level
• code, field and label descriptions
• descriptive headers or summaries
• recording information in the
Document Properties function of a
file (Microsoft)
Personal and sensitive data
Working with personal data?
• Legal: See ‘OU Data protection procedures’ and start with the
DPIA screening questions
Link to intranet (requires log-in): http://intranet6.open.ac.uk/governance/data-protection/
• Ethical: See Human Research ethics guidance and the ‘HREC
Project Registration and Risk Checklist’
Link to website: http://www.open.ac.uk/research/ethics/human-research
Personal and sensitive data
When working with research participants....
• Inform your participants what will happen with the data during
and after the project
• Ensure you have obtained their consent
• Consider who needs access to the data
• Can data be anonymised or pseudonymised?
• Pre-planning and agreeing with participants during the
consent process, on what may and may not be recorded or
transcribed, can be more effective than anonymisation
For more information, see the UK Data Service guidance:
https://www.ukdataservice.ac.uk/manage-data/legal-ethical/consent-data-sharing/gaining-consent
Personal and sensitive data
Managing sensitive data
• If possible, collect the necessary data without using
personally identifying information
• There is a difference between pseudonymisation and
anonymisation
• Pseudonymise or anonymise your data upon collection or
as soon as possible thereafter
• Avoid transmitting unencrypted personal data electronically
• Consider whether you need to keep original collection
instruments (recordings, surveys etc.) once they have been
transcribed and quality assured.
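To make the difference concrete: pseudonymised data can still be linked back to a person through a separately held key, whereas anonymised data cannot. The sketch below illustrates pseudonymisation only. The function name `pseudonymise`, the `name` field, the `P001` style codes and the participant names are all hypothetical, and removing a name does not by itself deal with other identifying details in the data.

```python
from typing import Dict, List, Tuple


def pseudonymise(records: List[Dict[str, str]],
                 id_field: str = "name") -> Tuple[List[Dict[str, str]], Dict[str, str]]:
    """Replace the identifying field with a code and return the key table."""
    key_table: Dict[str, str] = {}
    output = []
    for record in records:
        identity = record[id_field]
        if identity not in key_table:
            # The same person always gets the same code.
            key_table[identity] = f"P{len(key_table) + 1:03d}"
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["participant_id"] = key_table[identity]
        output.append(cleaned)
    return output, key_table


# Fictional example responses.
responses = [
    {"name": "Alice", "answer": "yes"},
    {"name": "Bob", "answer": "no"},
    {"name": "Alice", "answer": "maybe"},
]
data, keys = pseudonymise(responses)
print(data)   # working copy, no names
print(keys)   # store this separately and securely, or not at all
```

The key table is what makes re-identification possible, so it needs at least the same level of protection as the original data.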
How we can help
• Data Management Plan checking
• Support with setting up new projects
• Advice on preparation of data for sharing
• Data Repository (ORDO)
• Online guidance
• Enquiries
Library-research-support@open.ac.uk
Links
USEFUL RESOURCES
• The OU Library Research Support website: http://www.open.ac.uk/library-research-support/research-data-management
• Open Research Data Online (ORDO): https://ou.figshare.com
• Digital Curation Centre: http://www.dcc.ac.uk/
• DMP Online: https://dmponline.dcc.ac.uk/
• UK Data Archive: http://www.data-archive.ac.uk/
• MANTRA: http://datalib.edina.ac.uk/mantra/
• CESSDA ERIC training: https://www.cessda.eu/Research-Infrastructure/Training/Expert-Tour-Guide-on-Data-Management
• The Orb: http://open.ac.uk/blogs/the_orb
• OU Human Research Ethics Committee:
http://www.open.ac.uk/research/ethics/
• OU Data Protection: http://intranet6.open.ac.uk/governance/data-protection/advice-and-resources (if clicking on the link doesn’t work, copy and paste the address)
• OU Information Security: http://intranet6.open.ac.uk/it/main/information-security (if clicking on the link doesn’t work, copy and paste the address)
FEEDBACK
How did we do?
Before you go, please fill out our very short feedback form to
tell us:
• One thing you liked about today's session
• One thing you would change
Use the online form at:
https://openuniversity.onlinesurveys.ac.uk/library-research-training-19-20
Before I start I’m going to flag up our website where you can find loads of information about Data Management Plans and everything else we support, and how to contact us.
Overview of the webinar
There is quite a lot of content to get through. We will stop for discussion at various points, and there will be time for questions at the end, but if you have a burning question please feel free to interrupt me!
Read the quotation.
This quotation from the Digital Curation Centre sums up what Research Data Management is all about. It covers the management of data throughout your research lifecycle (more on that later) and beyond, when you will be sharing your data with other researchers.
This is relevant to all research which produces data, although you may find that the methods you use differ depending on your type of research or academic discipline.
But if you remember nothing else, remember this: it should help you to do your research.
A quick word on the Digital Curation Centre (DCC). They are the leading experts in the UK on Research Data Management, and gave us a lot of help when we set up the RDM project. Their website is a great source of information and guidance.
There are lots of research data lifecycle models, but this is the one I favour.
Think about what actions are required on your data at each of these points –
For example at the creating data stage you will need to think about a data management plan, gaining consent to share data, finding pre-existing research data as well as the collection/creation of data and metadata.
Processing data will often involve data entry, transcription, anonymization, and quality control as well as thinking about how you will be storing the data.
The analysis phase is what many people think of as “proper research”. If you have managed your data well up to this point, analysis will be much easier.
The preservation phase involves assigning metadata and migrating formats for archive – much of this work can be avoided by getting ahead and doing it earlier on in the project.
By putting your data in a trusted repository or archive you will be providing access. You also need to think about assigning licences. We’ll be talking about data sharing in webinars on 11th June and 9th July.
Re-using data: the data might be re-used by you or by someone else, whether for further research, validation of results, teaching or journalism.
Why spend time and effort on this?
Good data management does require an investment of effort – but ultimately it’s something that can actually save you time, by helping you work more efficiently. Many of us are all too well acquainted with the frustration of trying to track down a fact or a document we know we have somewhere. Good research data management – setting up an organizational system that works for you, and ensuring everything is properly filed or labelled to enable re-identification and retrieval – can make life a lot easier.
And it’s not just a matter of saving time and reducing unnecessary effort (though clearly that’s a major benefit): having everything well ordered can also help you get a better feel of the shape and scope of your research material, which in turn can enable you to spot patterns or connections that might otherwise get missed.
It’s also well worth doing, because the data you’re producing or working with is valuable
As well as this being true for your own research, the data might ultimately be of use to other researchers. Having everything well organized and properly labelled also has the potential to save you a lot of time at the end of a research project, when it comes to deciding what to do with your data – but more of that later.
Finally, there may be requirements imposed by your funding body and/or the university
The OU’s RDM policy:
Make your data open wherever possible (including physical data) – no later than the first date of online publication of research.
Published research papers must include statements on how and on what terms supporting data may be accessed, or if there is no data the paper should make that clear.
Manage it responsibly throughout your project
The university will provide services and facilities, training support and guidance
Note: All those engaged in research at the OU, including those involved in collaborating with other institutions, must take personal responsibility for managing their research data in accordance with University and funder requirements
We talk here and later briefly about data sharing, which is something to be aware of now, but is covered in a separate online session we deliver on 11th June
What do funders expect?
The Concordat on Open Research Data was developed by a UK multi-stakeholder group and is dated 28th July 2016.
Individual funders (including all the separate UK Research Councils) have their own policies. You should familiarise yourself with your own funder’s policy.
It is the researcher’s responsibility to manage their data and comply with institutional and funder policies, but it is the University’s responsibility to support you to be able to do so.
We have research support librarians and a website with detailed guidance and resources…
… and a repository to allow you to preserve and share your data.
So having got to grips with what RDM is, we’ll now look at how to work with and manage your data.
Key piece of advice - start as you mean to go on
Consider all the preparation necessary for making your data shareable and how you can reduce the workload at the end of the project by doing the work during the project
You may have heard of Data Management Plans. A Data Management Plan is a document where you can write these plans down at the start of your project. We had a webinar on this in April which was recorded. You can find it on the page for this online room, in the top right corner.
The preferred option is networked file storage, as this is secure and provides guaranteed backup.
SharePoint will help you to manage your data effectively. It offers sophisticated version control and allows you to create folders and assign tags to different types of documents. It is best for internal teams only, as external collaborators will need to get visitor access.
OneDrive is secure and can be accessed by anyone with a Microsoft account (inside or outside OU). All data is stored within the EU.
ORDO is our data repository. It’s easy to use during projects but doesn’t offer online editing or version control, so it is not ideal for team working.
If you opt for a cloud service, make sure you’re aware of the terms and conditions: where will the data be kept? What is the backup policy? What happens if the system fails?
When using cloud-based services not supported or licensed for OU corporate use (such as Dropbox, Google Drive), accountability rests with the individual choosing to use this method of storage. It is also against OU information security policy to store OU information on unsupported/unlicensed cloud-based storage. Check with IT Helpdesk first.
More information on these options is available on the comparison guide linked to here. If you need help deciding what’s best for your project please get in touch.
Information security enables you to control who has access to your research data and to determine, and keep track of, what others are authorised to do with your data. If you lose your data, recovery could be slow and costly or, even worse, impossible. Furthermore, academic research often results in the creation of sensitive data which, if released, could be damaging to your reputation or that of your institution.
Adopting appropriate security measures will help protect your data from:
Breaches of confidentiality, which could result in reputational damage or claims because of loss of intellectual property and damage to research subjects
Failures of integrity (i.e. the accuracy and consistency of data), which can undermine research credibility
Interruptions to the availability of data, which can impact on the research process
Loss through accidental or malicious damage, modification or theft
Do you have sensitive data? Why is it sensitive? Assign appropriate controls.
If you’re working in a team – think about who needs to access which files. Does everyone need access to everything?
We have already looked at storage – think about what data you are working with and the level of security it requires
Transferring data outside of the OU – we’ll look at the options on the next slide.
If you’re using portable storage devices (USB sticks etc.) to store sensitive data, make sure they’re encrypted, as there is a risk that you may lose them.
Likewise, lock away any paperwork which could cause reputational damage if it got into the wrong hands, as well as storage devices.
If you’re working away from the office, think about who can access your computer – if you’re on the train be aware of people looking over your shoulders!
Use cloud storage services responsibly – be aware of where the data is stored.
OneDrive allows you to share materials rather like Dropbox, but is more secure and stores data within EU locations, so it complies with EU data protection laws concerning the storage of personal data.
We also have ORDO, which you can use to store live research data. You may create a project workspace and then invite external collaborators to share this workspace; you can allow view only permissions, or data upload permissions. The file management in ORDO is a little clunky; you have to download, work on files, and then re-upload, so depending on the extent of your collaboration and other options, you may want to use another of the services provided centrally. If you have any specific queries about which service might be right for your live research data, please get in touch.
You also need to think about what happens to the data once it is uploaded to these stores. If you are working with partners outside of the EU, think about any limitations there may be on them downloading data from these stores for further processing (anonymisation etc.). It may be that you need to consider anonymising data prior to upload. If you are working with international partners, more detailed discussion of such issues with the Data Protection team may be needed at the start of your project.
Think about names and formats before clicking save
Where do you need this file; is it used by another program?
Do the name and location make sense?
Consideration at the beginning makes it easier to find files and related documents later.
Decide on file naming convention at outset
Vocabulary – choose a standard vocabulary for file names, so that everyone uses a common language.
Punctuation – decide on conventions for if and when to use punctuation symbols, capitals, hyphens and spaces.
Dates – agree on a logical use of dates so that they display chronologically i.e. YYYY-MM-DD.
Order - confirm which element should go first, so that files on the same theme are listed together and can therefore be found easily.
Numbers – specify the number of digits that will be used in numbering so that files are listed numerically, e.g. 001, 002, etc.
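A short illustration of why the date and number conventions matter: most file browsers list names in plain text order, so unpadded numbers and day-first dates end up out of sequence. The file names below are made up for the example.

```python
# Unpadded numbers sort as text, so 10 comes before 2.
unpadded = ["interview-10.txt", "interview-2.txt", "interview-1.txt"]
padded = ["interview-010.txt", "interview-002.txt", "interview-001.txt"]

sorted_unpadded = sorted(unpadded)
sorted_padded = sorted(padded)
print(sorted_unpadded)  # ['interview-1.txt', 'interview-10.txt', 'interview-2.txt']
print(sorted_padded)    # ['interview-001.txt', 'interview-002.txt', 'interview-010.txt']

# Day-first dates sort by day of the month; year-first dates sort chronologically.
day_first = ["20-06-2019", "05-01-2020", "11-12-2018"]
year_first = ["2019-06-20", "2020-01-05", "2018-12-11"]

sorted_day_first = sorted(day_first)
sorted_year_first = sorted(year_first)
print(sorted_day_first)   # ['05-01-2020', '11-12-2018', '20-06-2019']
print(sorted_year_first)  # ['2018-12-11', '2019-06-20', '2020-01-05']
```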
This is one example of a naming convention, which we use in the library research support team. It has the right level of detail for the work we are doing and the number of files we produce.
It tells us what type of document it is, what it relates to generally and specifically, and when it was created. The order of the parts of the filename could be changed so that, for example, the date came first; files would then sort chronologically in a folder or list rather than by document type. Sorting by document type suits our team.
Most people do it to a certain extent without thinking
You might organize your collection by artist, title, even colour! This is made much easier in a digital environment
What information is needed for the data to be read and interpreted in the future?
Describe the types of documentation that will accompany the data to help secondary users to understand and reuse it. This should at least include basic details that will help people to find the data, including who created or contributed to the data, its title, date of creation and under what conditions it can be accessed.
Documentation may also include details on the methodology used, analytical and procedural information, definitions of variables, vocabularies, units of measurement, any assumptions made, and the format and file type of the data
Supporting documentation
Working papers or laboratory books
Questionnaires or interview guides
Final project reports and publications
Catalogue metadata
READ ME file
Embedded documentation is integral to the file
Supporting documentation is in addition to the file. We encourage you to upload a readme file alongside any data you deposit in ORDO. Make sure it’s structured in a way that makes it easy to understand.
Anything that will help others understand your data is useful.
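As a starting point, a readme can be generated from a fixed set of headings so that nothing is forgotten. This is a minimal sketch: the headings are taken from the project-level questions and basic details mentioned in this session, but the function name `render_readme`, the field keys and the plain-text layout are assumptions for illustration, not an ORDO requirement.

```python
from typing import Dict

# (key, heading) pairs; the keys are hypothetical names used only in this sketch.
README_FIELDS = [
    ("title", "Title"),
    ("creators", "Created by"),
    ("date_created", "Date of creation"),
    ("purpose", "Purpose of the data"),
    ("contents", "What the dataset contains"),
    ("collection", "How the data was collected"),
    ("processing", "How the data was processed"),
    ("access", "Access conditions"),
]


def render_readme(details: Dict[str, str]) -> str:
    """Build the readme text, flagging any heading that has no entry yet."""
    lines = []
    for key, heading in README_FIELDS:
        # A visible marker makes gaps easy to spot before deposit.
        lines.append(f"{heading}: {details.get(key, 'NOT YET RECORDED')}")
    return "\n".join(lines) + "\n"


text = render_readme({"title": "Example interviews"})
print(text)
```

Saving the result as a plain text file alongside the data keeps it readable without any special software.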
In the past researchers gained consent from participants primarily so that they could collect data.
However, many funders now require researchers to share and preserve their data.
It is therefore important that participants fully understand:
how you will store, publish and share their data
how you will ensure that their data remains confidential and anonymous (where applicable) throughout the duration of the project and after
Failure to obtain consent could result in non-compliance with your funder's requirements and limit the opportunities you have to share, publish and preserve your data.
If things change, you may be able to go back to your participants and change the details of the agreement.
Anonymisation can be time-consuming, so agreeing what can and can’t be recorded or transcribed may well save you time and effort. For example, if they don’t want you to use names, then conduct the interview without using names.
It is your duty to protect participants and comply with data protection
As mentioned before, collecting data without personally identifying information, where possible, saves time and effort as you will have less to anonymise.
Make sure you are storing your sensitive data sensibly. If possible, de-identify your data upon collection; this will reduce the damage if a security breach happens.
Make sure you are encrypting your data if you have to send it electronically (e.g. by email).
Do you need to keep the original recording? If it’s been transcribed, what value does it hold? By destroying it as early as possible you are reducing the risk.
Send DMPs in advance of bid submission! Preferably a week ahead, if possible. But later is better than never!
I am happy to meet with PIs and project teams at the beginning of projects to discuss strategies for managing data and clarify funder requirements. I am also able to set up bespoke training sessions for departments and research groups.
At the end of your project, hopefully your data will have been managed in a way that facilitates sharing, but if in doubt get in touch for help
Guidance is on the library research support website. URL on next slide.
Send enquiries to the email address at the bottom of the screen; this way anyone from the team can pick them up if I’m away.