This breakout session will introduce a case study covering
the development of Research Data Management services and
systems at King’s College London. The focus will be on researcher engagement and analysis of user requirements, both indispensable to developing systems and services. The session will also consider how the requirements of other stakeholders, such as the university and research funders, can be met in this process.
Developing RDM Systems with Researcher Needs in Mind
1. Research Data Management:
developing systems with
researchers’ needs in mind
Vimal Shah
Research Information Manager
Middlesex University
UKSG Conference 2017
2. Session outline
• Development of research data management
systems and support services at two different
institutions
• Researcher engagement and analysis of user
requirements
• Requirements of other stakeholders such as
the university and research funders
10/04/2017 2
3. Two cases
• King’s College London, established in 1829,
founding member of the University of
London, research-intensive
– Started developing systems in-house
• Middlesex University London, granted
university status in 1992, a polytechnic since
1973, teaching-intensive
– Implementing systems through the JISC Research
Data Shared Service as a pilot institution
5. Inherent benefits
• Resources
– Avoids duplication of data, increases efficiency
• Security
– Reduces the risk of data loss
• Integrity
– Enables greater scrutiny of published research
• Collaboration
– Facilitates sharing and re-use of data now and in the future
• Impact
– Increases visibility of research data & associated publications, and potential for citation
6. Compliance and risk management
• Funder expectations
• Publisher policies
• Institutional policies
• Research ethics
• Legislation
• Contracts, agreements
• Exploitation of IP
7. “…as open as possible, as closed
as necessary”
European Commission (2016) Open
access & Data management, Horizon
2020
8. WHAT DO WE MEAN BY
‘RESEARCH DATA’?
9. Dataset
• Represents a citable resource
• One or more files, formats, sizes and content.
• Includes documentation
• Could be:
– digital outputs necessary to substantiate and/or
validate research findings in publications
– produced for a specific work package
– requiring storage to enable potential future
access post study completion
11. Requirements analysis: Oct 2014
• “Research data management: a requirements
analysis for institutional infrastructure
development”
– Literature review
– Qualitative analysis using MaxQDA of ~45
written statements from researchers at King’s
(data collected by Veronica Howe, RD Manager)
– In-depth interviews with researchers in the Social
Sciences
12. Summary of requirements
• Shared/networked/cloud/local storage and
backup
• Access to secure storage and archive
• Research computing support
• Central data management services:
– Database design and maintenance
– Statistical data support
13. Summary of requirements
• Data management planning
• Guidance, training, consultancy and advice
• Communication of research outputs and
supporting research collaborations
14. Design and demo: 2015
• In-house development
• User stories, data flow diagrams, IT project
governance…
• 900TB of storage available from Microsoft
• Jul: Product demonstrations and consultation
with the research community
• Sep: DataCite membership
• Oct: Research data Steering Group approval
15. Org. changes and launch: 2016
• Jan: Open Research Group formed
• Feb: Change in available IT resources
• Apr: Testing, fixing, re-testing, re-re-testing…
• May: Launch of version 1 – data asset
register, mediated deposit, DOIs, 1TB storage
• June: 7 open meetings with a cross-section of
the research community
16. Summary of feedback: 2016
• Medium term goals:
– Improved data storage capacity
– Improved data transfer options
– Indexing of records on search engines
– Integration with the research portal and current
research information system (CRIS)
– Preview/approval before data publication
– Digital archive and preservation
17. Summary of feedback: 2016
• Longer term goals
– Support change in research culture
– More self-service facilities rather than mediated
– Life cycle support for RDM, including
management of active/dynamic data
– Searching, browsing and comparison of datasets
within user-selected criteria
18. Summary of feedback: 2016
• Risk management priorities
– Backup and recovery
– Secure storage for datasets containing
personal/sensitive/highly-restricted data
– Sustainability/cost recovery for publishing,
preserving and archiving data
– Meeting the requirements of the EPSRC policy
framework on research data
19. Challenges to keep us busy
• Publishing data openly is problematic for
some before article acceptance/publication
• Handling extra-large files, e.g. 400-900GB
• Data management planning is seen as a chore
by some
• Metadata requirements – reuse/long term vs.
quick publication
• User experience, interfaces, interoperability
20. Reflection
• Compliance is a process (as opposed to a
destination) when it comes to the EPSRC’s
policy framework for research data
• Development of services and systems has to
go hand in hand with raising awareness and
training
• July 2016: began revising the King’s research
data management policy, now published
22. JISC Research Data Shared Service
• Pilot participant – huge opportunity, JISC
funding
• ‘Outsourced’ through shared procurement
• Implementing a data repository (Figshare)
• Predecessor Jenny Evans ‘accelerated
implementation’ in 10 weeks (Sep – Nov 2016),
including all key stakeholders!
23. Building…
• Now adding a preservation solution
(Preservica) and carrying forward further
implementation tasks for the data repository
• Planning to reconvene project working group
and group of researchers piloting
• Trying to bring in other stakeholders from the
outset and sharing ideas with other pilots
• ORCID – Open Researcher & Contributor ID
24. Placing systems in context
• Data management planning
• Training
• Outreach and consultation
• Enquiry support
• Collaboration and liaison
• Digital repositories
• Research information systems
25. Thank you
• Colleagues at King’s College London, JISC,
Middlesex University
• Published guidance from the UK Data Archive
and the Digital Curation Centre
• Other universities across the UK whose staff
have created and published their data
management guidance
Editor's Notes
What?! Who else are you developing a research data system for?
Technical aspects of the development and implementation of systems will not be covered during this session.
Mainly experience at King’s but some information about what is happening at Middlesex too.
John Kaye from JISC will be sharing more about the shared service in the Group D sessions.
When you are working with a lot of data collected over time, it can become a challenge to find and use older data if it has not been described, organised, backed up etc., especially if you want to share that data with collaborators.
Securing data appropriately reduces the risk of data breaches and loss.
Describing your data accurately, maintaining research records and making data available for scrutiny supports the integrity of research findings. There have been recent reports on the problems with reproducibility of research findings in certain fields of research.
Good data management practice enables good quality data to be shared and re-used by others, potentially with new collaborators.
Publishing datasets with a persistent identifier could support the citability and findability of datasets, and potentially contribute to greater impact. Data publication is becoming a thing in itself.
There are many other stakeholders that may have a bearing on how research data is managed.
Funders now ask for information on how research data will be preserved and shared. Some stipulate retention periods.
Some publishers now expect underlying data to be made available upon article submission.
Many universities now have research data policies in place, partly in response to external factors and partly to manage their own risks; of course there may be other reasons too.
Data sharing agreements, commercial data providers, public/government/NHS data providers. Data owners stipulate requirements that must be adhered to, however even here each case has to be dealt with on its own merits.
Legislation covering the protection of intellectual property, data protection (the DPA) and freedom of information (FOI).
Research ethics – researchers have to consider sensitivity and confidentiality of the data, not just for personal data, but for example data about vulnerable animal species and cultural heritage sites in war-torn countries.
Finally, commercialisation opportunities and the exploitation of intellectual property may have a bearing on what is shared and how it is shared.
The European Commission’s position on open research data is one way to look at how these factors might be balanced.
When re-writing the King’s research data management policy I found it very helpful to think about this statement and how it might manifest itself in what we ask researchers to do.
Research data
While there is no single definition of data, research data can generally be defined as any representation or objects that are created or gathered for the purposes of producing research or scholarship, and which can be used to validate or reproduce original research findings.
The underlying research materials which support research publications can be described as research data.
Research data can cover a diversity of form and content, including (but not limited to) numbers, text, images, audio, simulations, models, interview recordings, questionnaires, maps, laboratory notebooks, videos, algorithms, codebooks, test results, specimens, databases or any combination of these and more.
Data objects can be physical or print as well as digital.
During the development of systems at King’s in our conversations with researchers we found it helpful to talk about what a dataset might constitute.
At that time I was working as an Information Specialist for Social Science and Public Policy.
I found out a lot about researchers’ motivations and approaches to data management, and how data management ‘tasks’ were intertwined with the process of research and were not perceived as distinct tasks until I mentioned it. E.g. is the transcription of interview notes to a computer a ‘data management task’?
Anyway, the focus in this session is on specific requirements rather than some kind of ethnography.
A tall order! Our researchers certainly don’t ask for much, do they? But we had to start somewhere.
I became Research Data Manager and jumped straight into a project that had already begun.
Researchers had asked for a lot but we had to start somewhere with the skills and experience that we had within the project team.
The IT department said: here’s some storage from Microsoft, you must use it. Cost efficiency. And quite often, even now, when I talk to IT colleagues, many think of storage at the first mention of research data management – of course storage is not the only thing.
Jul: researchers and research administrators attended because we had something to show them: a prototype system.
Oct: approval to go ahead with a five-phase development programme to address requirements.
The institutional context always has a bearing on how systems and services develop.
Change in IT resources meant that we had to squeeze as much development time out of the techies as possible.
About a year after the first set of consultations with a cross-section of the research community, we held seven more open meetings to allow as many people as possible to give feedback. We presented version 1 of the system and explained the trajectory.
May: A set of systems that approximated to a data repository.
It was important to explain that we had started something; it would not meet every single requirement, but we wanted users to continue talking to us as we developed the infrastructure further.
By the time we conducted the series of open meetings in June 2016, we noticed the type of feedback we received from researchers was becoming more granular and specific.
In the course of supporting researchers with their data management requirements, we realised there are a lot of areas where we still need to develop services and work with colleagues in IT and the Research Office to find solutions.
Metadata requirements – long term vs immediate – becomes an educational exercise.
July 2016 – our experience, the emerging guidance from RCUK and the Open Data Concordat, H2020, and of course the implications of the EPSRC policy framework all contributed to a revision of the King’s research data policy, which also had to take into account the local context.
Early stages, but being a smaller organisation might mean it is more nimble in getting things off the ground.
Systems are a part of the human and technological infrastructures within the institution. Systems do not exist in a vacuum.
When has a computer system ever solved a problem on its own? Implementing a system without embedding it within existing support structures does not solve the problem.
Data management planning support – most funders now require data management plans. I also see them as part of good information/data governance practice to enable researchers to actively think about risks to their data, as well as how their data will be looked after in the long term. It is only by planning that the most appropriate repository for their data can be identified.
Regular training – is required to raise awareness amongst new researchers and to update experienced researchers about new policies and available tools to help with data management tasks.
Outreach and consultation – is done on an ongoing basis, must be continual. You can’t have a ‘period of outreach and consultation’ in this complex area as things are changing all the time, requirements change, technologies change. So it is more appropriate to have repeated periods of consultation, built into work patterns and flexibility of team structures. Saying yes to every invitation to a committee and every request for training and updates requires flexibility as well as an understanding boss and employer.
Enquiry support – a helpdesk, because data management is a complex task and researchers’ requirements vary greatly between different fields of research and between research projects; requirements for the publication and sharing of results also vary.
Collaboration and liaison – between the Research Office, IT, Library, Information Governance, Research ethics teams (legal compliance and records management)
Digital repositories – Existing publications repositories and digital archives. How well might they serve the data deluge that is often quoted? If you make an assessment that your existing repository is not up to the job of supporting research data/datasets, how will any new data repository integrate with it? Also, if you call it a new digital repository to be inclusive, you have to use careful language, as REF requirements mean publications have to be submitted in a particular way.
Research information systems – A lot of universities now have these in place to manage publications data, grant funding data, research projects, researcher profiles, person identifiers and more. Integration with these systems is an absolutely vital consideration.
As we’ll see later on, researchers are crying out for much better integration between these various systems that they now have to navigate.