The Digital Curation Centre (DCC) helps research institutions and funders develop data management plans and policies. The DCC created an online tool called DMP Online that allows researchers to create customized data management plans that meet funder requirements. DMP Online provides guidance and templates on best practices. The DCC also analyzes funder policies and develops training and resources to help institutions build data management strategies and capabilities.
1. Data management planning
at the DCC
Martin Donnelly
Digital Curation Centre
University of Edinburgh
STATSBIBLIOTEKET, AARHUS
31 October 2012
2. - Digital Curation Centre, est. 2004
- Three partners: Edinburgh, Glasgow and Bath
- Primary funder is JISC
Helping to build capacity, capability and
skills in data management and curation
across the UK's higher education
research community
- DCC Phase 3 Business Plan
www.dcc.ac.uk
5. 7 principles agreed by all of the UK
research councils in May 2011
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
• Public good
• Preservation
• Discovery
• Confidentiality
• First use
• Recognition
• Public funding
6. UK research funder expectations
• timely release of data
  - once patents are filed, or on (acceptance for) publication
• open data sharing
  - minimal or no restrictions
  - deposit in data centres, structured databases, data enclave
• preservation of data
  - most funders expect 5-10 years (or more)
• submission of data management and sharing plans…
7. What is a DMP? (1)
UK research funders typically ask for:
• A short statement/plan to be submitted alongside grant
applications (NERC ask for two versions: one at
application stage, another when project is underway)
• An outline of what you will create/collect, methods,
standards, data management and long-term plans
• How and why: justify your decisions and any limitations
8. Who's involved in this process?
- Just the principal investigator?
- What about the research assistants?
- And the partners based in other institutions?
- And commercial partners?
- And the institution's funding office?
- And the Library/IT?
9. [Diagram: stakeholders around the data management plan: Researcher, Research Support Office, Data Library / Repository, Computing Support, Faculty Ethics Committee, etc. Labels: "Unruly data", "Data management …plan?"]
10. Key things to remember
✓ All research projects are different, so
there's no one-size-fits-all DMP approach
✓ The DMP will depend upon the nature of
the research AND its context
(funder, domain, institution(s) etc)
✓ DMPs are useful communication tools
between multiple stakeholders
11. 2. The DCC and DMP
We've responded to requirements by offering support…
- Analysed requirements
- Developed a Checklist
- Provided tools & guidance
Links to all DMP resources via http://www.dcc.ac.uk/resources/data-management-plans
13. What is a DMP? (2)
In general, funders tend to ask:
- What kinds of data will be created and how?
- How will the data be documented and described?
- Are there ethical and Intellectual Property issues?
- What are the plans for data sharing and third-
party access?
- What is the strategy for longer-term
preservation?
However, different funders ask these
questions in different ways…
14. A Generic and Comprehensive Checklist
§1: Introduction and Context
§2: Data Types, Formats, Standards and Capture
Methods
§3: Ethics and Intellectual Property
§4: Access, Data Sharing and Re-use
§5: Short-Term Storage and Data Management
§6: Deposit and Long-Term Preservation
§7: Resourcing
§8: Adherence and Review
§9: Agreement/Ratification by Stakeholders
§10: Annexes
Source: Checklist for a Data Management Plan v3.0 (Donnelly and Jones, March 2011)
http://www.dcc.ac.uk/resources/data-management-plans
15. Printed DMP resources
- "Dealing with Data" (Lyon, 2008)
- Analysis of Funder Policies (Jones, 2009)
- Checklist for a Data Management Plan
(Donnelly and Jones, 2009)
- "How to Develop a Data Management and
Sharing Plan" (Jones, 2011)
- "Data Management Plans and Planning"
(Donnelly, 2012) in Pryor (ed.) Managing
Research Data, London: Facet
- DMP Online briefing paper (Donnelly and
Richardson, forthcoming 2012)
Links to all DCC resources via http://www.dcc.ac.uk/resources/data-management-plans
17. What does DMP Online do?
A free and open web-based tool enabling users to...
i. Create, store and update multiple versions of Data
Management Plans across the research lifecycle
ii. Meet a variety of specific data-related
requirements (from funders, institutions, publishers,
etc.) in a single place
iii. Get tailored guidance on best practice and helpful
contacts, at the point of need
iv. Customise, export and share DMPs in a variety of
formats in order to facilitate communication within
and beyond research projects
18. New features in v3.0 (May 2012)
- Improved user interface, inc. customisable institutional versions
- New features
  - Overlaying multiple templates for "hybrid" DMPs
  - Multiple template phases (e.g. pre- / during / post-project)
  - Granular read / write / share permissions
  - Multilingual support / boilerplate text
  - API for systems interoperability
- Endorsement from funders
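One way to picture the "hybrid" DMP and template-phase features is the following minimal Python sketch. It is not DMP Online's actual implementation (the tool itself is written in Ruby on Rails); the template structure, the names `overlay_templates` and `questions_for_phase`, and the sample templates are all hypothetical and exist only for illustration.

```python
# Hypothetical templates: each is a name plus a list of phased questions.
FUNDER_TEMPLATE = {
    "name": "Funder",
    "questions": [
        {"text": "What kinds of data will be created and how?", "phase": "pre-project"},
        {"text": "What are the plans for data sharing and third-party access?", "phase": "pre-project"},
    ],
}
INSTITUTION_TEMPLATE = {
    "name": "Institution",
    "questions": [
        {"text": "What kinds of data will be created and how?", "phase": "pre-project"},
        {"text": "What is the strategy for longer-term preservation?", "phase": "post-project"},
    ],
}


def overlay_templates(*templates):
    """Merge templates in order into one hybrid question set.

    A question shared by several templates appears once and records every
    template that asks it. If templates disagree on the phase, the first
    template wins (an arbitrary choice made for this sketch).
    """
    merged = {}  # question text -> {"phase": str, "required_by": [template names]}
    for template in templates:
        for question in template["questions"]:
            entry = merged.setdefault(
                question["text"], {"phase": question["phase"], "required_by": []}
            )
            entry["required_by"].append(template["name"])
    return merged


def questions_for_phase(merged, phase):
    """Return the question texts of a hybrid plan that belong to one phase."""
    return [text for text, entry in merged.items() if entry["phase"] == phase]


hybrid = overlay_templates(FUNDER_TEMPLATE, INSTITUTION_TEMPLATE)
for text in questions_for_phase(hybrid, "pre-project"):
    print(text, "<-", ", ".join(hybrid[text]["required_by"]))
```

Keying the merged set on the question text means a researcher answers a shared question once, while the `required_by` list still shows which stakeholder each answer satisfies.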
19. Technologies involved (v3.0)
- Ruby on Rails (v3.1.3)
- JavaScript (jQuery v1.7.1)
- MySQL database (v5+)
- Hosting: University of Edinburgh Information Services
Virtual Hosting (13 managed servers across 2 sites)
- Authentication: registered users with passwords encrypted
in DB (we have also used Shibboleth for integration with UK
Access Management Federation for Education and Research)
- Various export formats (PDF, DOCX, XLSX, CSV, XML etc)
http://dmponline.dcc.ac.uk
20. HEFCE Institutional Engagements:
from planning to practice
- We are currently working with c. 20 institutions over
an 18-month period to improve their data
management capabilities
- Broad variety of institutional types and sizes, from
research-intensive ancient universities, to new
universities and specialist institutions (e.g. art
schools) from all parts of the UK
- Institutions select from a "menu" of tools and
services, e.g. (next slide)
21. The Menu
Components of a Data Management Strategy (Research and Admin):
- Policy
- Planning
- Advocacy
- Tools
- Training
DCC Tools:
- Data Asset Framework (DAF)
- DMP Online
- CARDIO
- DRAMBORA
DCC Services:
- Policy development
- Strategy development
- Training
- Workflow assessment
- Costing
- Institutional data catalogues (discovery)
22. Institutional workflows
DMP Online can also be used in conjunction
with other tools that support the data
management/curation lifecycle, e.g.…
- DAF (Data Asset Framework)
- DRAMBORA (Digital Repository Audit Method
Based On Risk Assessment)
- CARDIO (Collaborative Assessment of
Research Data Infrastructure and Objectives)
Also non-DCC tools:
- LIFE
- Planets tools
- CRIS systems
- and more
23. External connections
Systems, with the standards / protocols used to connect to them:
- CRIS / admin systems: CERIF*
- RCUK Je-S system
- Institutional Repositories: SWORD2
- DDI repository: DDI*
- DMP Tool (US): (TBC)
- Other instances of DMP Online via federated model (? - TBC): RDF (DMP-Oxford)
- Metadata catalogues (?)
* via the RESTful API
24. How to connect: six export formats
For human readership…
- Pleasant formatting
- Editable. Can be used in conjunction with (e.g.) MS SharePoint
- Minimal formatting
- Removes all formatting
For machine readership…
- Facilitates quick public sharing
- Compatible with API for linking with other systems
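The split between human-readable and machine-readable exports can be sketched with Python's standard library. This is an illustrative sketch only: the plan structure, the sample answers and the function names `to_text`, `to_csv` and `to_xml` are assumptions, and the output layouts do not reproduce DMP Online's real export schemas.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical plan: a title plus (question, answer) pairs. The answers are made up.
plan = {
    "title": "Example project DMP",
    "answers": [
        ("What kinds of data will be created and how?", "Interview transcripts, recorded and transcribed"),
        ("What is the strategy for longer-term preservation?", "Deposit in a data centre"),
    ],
}


def to_text(plan):
    """Plain text for human readers: all formatting removed."""
    lines = [plan["title"], ""]
    for question, answer in plan["answers"]:
        lines += [question, answer, ""]
    return "\n".join(lines)


def to_csv(plan):
    """CSV for spreadsheets and simple scripts: one row per question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["question", "answer"])
    writer.writerows(plan["answers"])
    return buffer.getvalue()


def to_xml(plan):
    """XML for machine readers, e.g. for exchange with other systems."""
    root = ET.Element("plan", title=plan["title"])
    for question, answer in plan["answers"]:
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "question").text = question
        ET.SubElement(item, "answer").text = answer
    return ET.tostring(root, encoding="unicode")


print(to_text(plan))
print(to_csv(plan))
print(to_xml(plan))
```

All three functions read the same in-memory plan, so adding a further format means adding one function rather than changing how plans are stored.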
25. Collaborations
- Guidance
  - Generic data management guidance (in conjunction with
    UK Data Archive)
  - Tailored guidance developed in collaboration with funders
    themselves (ESRC, MRC, Wellcome Trust)
  - Institution-specific guidance developed with key contacts in
    universities
  - Disciplinary guidance developed and deployed through JISC
    MRD projects (e.g. DMT Psych at York, DATUM for Health at
    Northumbria)
- Templates developed with funders and institutions
- Joint training events organised and delivered by DCC
and
26. DMP International
- DCC is a founder member of the US
DMPTool consortium, and we continue
to work together. Joint workshops at
IDCC in Amsterdam (Jan 2013) and
iSchools Conference in Texas (Feb 2013)
- We're working with ANDS in Australia
to deploy DMP Online on the NECTAR
academic cloud
- European Commission has encouraged
DCC to propose a pilot DMP tool for
Horizon 2020. Expecting a DMP
requirement in the next funding
programme
27. Mange Tak! (Many thanks!)
Martin Donnelly
Digital Curation Centre
University of Edinburgh
martin.donnelly@ed.ac.uk
Twitter: @mkdDCC
www.dcc.ac.uk/resources/data-management-plans
For other DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc
Image credits:
Slide 1 - http://upload.wikimedia.org/wikipedia/commons/8/88/LernaeanHydraRephael.jpg
Slide 4 - http://www.flickr.com/photos/axis/
Slide 9 - http://en.wikipedia.org/wiki/File:Hercules_slaying_the_Hydra.jpg
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.
Editor's Notes
The UK Digital Curation Centre was established in 2004. We're based across three universities, and have a remit to support UK Higher Education as a whole.
What I'm going to cover
General expectations
A DMP is a basic statement of how you will create, manage, share and preserve your data. Funders expect the decisions to be justified, particularly where they are not in line with the funder's policy (e.g. limits on data sharing).
The main thing to remember about DMPs is that all research projects are different: the DMP will vary with context. Apart from a few very specialised areas like backup, there are no universal rights and wrongs. Research data management by nature involves multiple stakeholders, so planning is important as a communication mechanism. The process of producing a plan (i.e. engaging with others and deciding on the best way forward) is as important as the plan itself.
Checklist and other DCC resources have been utilised in a number of research-intensive universities, including Columbia, Swinburne, MIT and Heidelberg.
So in summary, these are some of the key DMP-related resources.
The DCC Checklist is by nature very long, and its length was felt to be off-putting to researchers. Most of them don't want to deal with this stuff even at a basic level, and a long Checklist with over 100 questions was not going to enjoy a large take-up. No matter how many times we said "you don't need to fill it all in, just the bits that are relevant to you at this time", the message wasn't going to sink in, so we developed a fairly basic wizard-style tool which asked a few questions about what stage your research was at, who your funder was, etc., and then pulled out only the most relevant questions from the Checklist to help you meet the pertinent requirements. So instead of seeing 115 questions, you might be presented with only 15 or 20. Much better. We then added functionality like export and customisation, and some generic guidance to help with some of the more esoteric sections such as file format selection and metadata.
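The wizard's filtering step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the `Question` class, the `relevant_questions` function, the placeholder funders "Funder A" and "Funder B", and the tagging of each question with funders and stages are all assumptions made for the example. Only the question wording comes from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    funders: frozenset  # funders that ask this question (hypothetical tagging)
    stages: frozenset   # project stages at which it is relevant (hypothetical)


# A five-question stand-in for the full Checklist of 100+ questions.
CHECKLIST = [
    Question("What kinds of data will be created and how?",
             frozenset({"Funder A", "Funder B"}), frozenset({"application", "in-project"})),
    Question("Are there ethical and Intellectual Property issues?",
             frozenset({"Funder A"}), frozenset({"application"})),
    Question("What are the plans for data sharing and third-party access?",
             frozenset({"Funder A", "Funder B"}), frozenset({"application", "in-project", "completion"})),
    Question("What is the strategy for longer-term preservation?",
             frozenset({"Funder B"}), frozenset({"in-project", "completion"})),
    Question("How will the data be documented and described?",
             frozenset({"Funder A", "Funder B"}), frozenset({"in-project"})),
]


def relevant_questions(checklist, funder, stage):
    """Keep only the questions this funder asks at this stage of the project."""
    return [q for q in checklist if funder in q.funders and stage in q.stages]


for question in relevant_questions(CHECKLIST, "Funder A", "application"):
    print(question.text)
```

The researcher supplies only two facts (funder and stage) and the long list shrinks to the subset that matters, which is the effect described above of going from 115 questions to 15 or 20.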
As I mentioned earlier, version 3 launched very recently, and has a number of great new features. The user interface has been tweaked to allow easier (one-click) access to most of the screens, and we're investigating customised institutional versions with, among others, the University of Oxford. The tool now enables the application of multiple templates, so you can create a single DMP that satisfies your institution, your funder and your publisher at the same time. These templates can be phased more elegantly, so that you can ask (for example) a few questions at the application stage, more during the project's lifetime, and then add even more detail when you're close to completion. Users now have the option to make their plans more widely available. Authentication can be managed via the UK Federated Access Shibboleth mechanism, and we have coded the new system to enable easy translation into other languages, and to handle boilerplate text where this is thought to be beneficial. We have also been working behind the scenes to gain more official endorsement from some of the big funding councils, and this is starting to bear fruit.
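The idea of granular read / write / share permissions on a plan can be illustrated with a short Python sketch. The `Permission` and `Plan` names, the rule that only users holding the share permission may grant access, and the user names are all hypothetical; the talk does not describe how DMP Online models permissions internally.

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()
    SHARE = auto()


class Plan:
    """A plan whose access is tracked per user (hypothetical model)."""

    def __init__(self, owner):
        # The creator starts with every permission.
        self.grants = {owner: Permission.READ | Permission.WRITE | Permission.SHARE}

    def allows(self, user, permission):
        """True if the user holds all of the requested permission bits."""
        return permission in self.grants.get(user, Permission.NONE)

    def grant(self, granter, user, permission):
        """Add permissions for a user; only holders of SHARE may do this."""
        if not self.allows(granter, Permission.SHARE):
            raise PermissionError(f"{granter} may not share this plan")
        self.grants[user] = self.grants.get(user, Permission.NONE) | permission


plan = Plan(owner="principal_investigator")
plan.grant("principal_investigator", "research_assistant", Permission.READ | Permission.WRITE)
plan.grant("principal_investigator", "funding_office", Permission.READ)
print(plan.allows("research_assistant", Permission.WRITE))
print(plan.allows("funding_office", Permission.WRITE))
```

Keeping the three permissions as separate flags lets a plan be readable by a funding office, editable by research assistants and shareable only by the principal investigator, which matches the multi-stakeholder picture drawn earlier in the talk.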
For those interested in such things, these are the technologies used in v3.0.
So that's a pretty good high-level summary of what we've done in the data management planning area over the past four years or so. We'd like to end with a quick outline of the DCC's institutional engagement programme, the major job of work that Sarah and I (and about a dozen other colleagues) are currently involved in. From last Autumn until next Spring (UK seasons, so the other way around for colleagues in New Zealand!) the DCC has been funded by the Higher Education Funding Council for England (HEFCE) to support eighteen HEIs in increasing their institutional data management capabilities. We're working with a range of institutional types and sizes, from research-intensive ancient universities, to new universities and small specialist institutions (e.g. art colleges). The way this works is we first of all make contact with someone already interested in this area, often in the Library, and through them we approach a senior academic, usually at Vice Principal level or equivalent, to make the case for working more concertedly in this area. Once an agreement is reached, the institution selects from a "menu" of tools and services, e.g. (next slide)
Developing a Data Management Strategy: DCC services to support aspects of the research and data management lifecycle, as given in Column 3, and the tools to support different strands of this (some tools are simply utilised out of the box, others we can provide help and training with, and others, such as DMP Online, can be customised and tailored to match individual institutions' requirements more closely).
Similarly, DMP Online can also be used in conjunction with other tools that support the data management/curation lifecycle, be these DCC tools or tools from other sources.
… the tool can link to these types of external systems using a variety of standards and protocols. Of course, this list is not exhaustive, and if you see an opportunity for linking DMP Online with other tools we might not have considered, let us know: the API will probably make it possible.
And at an information exchange level, here's what we can do. Plans can be exported in a variety of formats, for human and/or machine readerships, and…
So, in addition to the liaison with the funders, we've developed relationships with a variety of others. Our closest working relationship has probably been with the UK Data Archive, which is the designated place of long term deposit for the Economic and Social Research Council. Working with UKDA we have developed a data management planning template and guidance for ESRC applicants, and we also point to some UKDA guidance in the generic Checklist. We have also liaised with Wellcome Trust, the Medical Research Council and various other funders to develop dedicated DMP templates for them. Continuing in this vein, we've worked with disciplinary specialists and key institutional contacts to develop further DMP templates, and through the JISC Managing Research Data programmes we've contributed to a number of projects creating training materials around this area.
Last but not least, we've shared experiences with a consortium of US universities (including the Universities of California, Virginia, and Illinois, and the Smithsonian Institution), which has helped them to shape their own DMP Tool; we're working with ANDS and the EC, and have presented our work in Canada and New Zealand.