Good afternoon. We are Martin Donnelly and Sarah Jones of the Digital Curation Centre, at the Universities of Edinburgh and Glasgow respectively. We’ll be talking today about the journey from research data management policy to good practice, and how the DCC’s resources, notably the DMP Online tool, can support this journey.
Sarah will give an introduction to the DCC and our interest in research data management, before giving an overview of the policy situation in the UK and how we got involved in data management planning. Martin will then take over, talking in more detail about the DMP Online tool, the various collaborations we’ve formed through this work, and an overview of the major job of work that we’re both currently involved in, namely the DCC’s set of institutional engagements.
The UK Digital Curation Centre was established in 2004. We’re based across three universities, and have a remit to support UK Higher Education as a whole. Our mission has changed over time from a focus on digital curation and preservation, working largely with archives and repositories, to research data management in universities.
The DCC has four main strands of activity:
- We develop tools to help organisations assess their infrastructure and capabilities or to undertake specific tasks, e.g. writing DMPs with DMP Online.
- We run a helpdesk, which is open to all, and provide guidance. How To guides are a new range of pragmatic, practical advice.
- We run training and community building events. The roadshows help institutions develop research data management strategies.
- We support JISC by co-ordinating events, working with projects and synthesising/disseminating findings.
The DCC developed the curation lifecycle model to explain the range of activities involved in creating, preserving and sharing digital content. In RDM terms ‘curation’ is simply managing & sharing data. The DCC argues that this is just part of good research practice.
How datasets are created and managed in the short term affects how much work it is to ingest and preserve them. The transition isn’t always easy, which is why it’s useful to work with researchers early on, helping them to make informed decisions about how to create and manage their data. The KRDS costs and benefits studies found that ingest is by far the most resource-intensive activity.
Last year the seven UK Research Councils released common principles to harmonise their data policies. These push for open data, acknowledge the importance of policies and planning, and cover various aspects of curating data (including meeting costs).
Basic expectations across the board are that:
- Data are released as soon as possible
- Data are shared openly wherever possible
- Data are preserved for 10+ years
- DMPs are submitted that outline plans for data management and sharing
The DCC has responded to these requirements by providing lots of support on data management planning:
- Liz Lyon first called for plans in 2007, in a recommendation in the Dealing with Data report.
- We have since analysed funders’ requirements and put together a checklist for a Data Management Plan.
- The Checklist is the underlying intellectual framework in DMP Online, the flagship of the DCC’s tools and resources.
- We also provide guidance documents and have custom guidance (disciplinary and institutional) built into DMP Online.
A DMP is a basic statement of how you will create, manage, share and preserve your data. Funders expect the decisions to be justified, particularly where they are not in line with the funder’s policy (e.g. limits on data sharing).
The main questions across the board cover:
- Data creation
- Metadata and documentation
- Ethical and legal issues
- Data sharing
- Preservation
You see the common questions come through in the main sections of the DCC Checklist. We also include administrative sections (intro, review, ratification) so you can ensure co-ordination and commitment across all of the stakeholders involved in managing data.
So in summary, these are some of the key DMP-related resources.
The main thing to remember about DMPs is that all research projects are different: the DMP will vary with context. Apart from a few very specialised areas like backup, there are no universal rights and wrongs. Research data management by nature involves multiple stakeholders, so planning is important as a communication mechanism. The process of producing a plan (i.e. engaging with others and deciding on the best way forward) is as important as the plan itself.
SJ > MD
These expectations and trends are not a UK phenomenon. Martin and I have contextualised the UK experience of data policies and planning by reflecting on international initiatives in Managing Research Data. This is why DMP Online is relevant to international audiences, so I’ll let Martin tell you all about it.
Thanks Sarah. We started developing DMP Online in 2009, and launched the first version in 2010. We’re now on to v3.0, which includes some great new features that we’re really excited about.
The DCC Checklist is by nature very long, and its length was felt to be off-putting to researchers. Most of them don’t want to deal with this stuff even at a basic level, and a long Checklist with over 100 questions was not going to enjoy a large take-up. No matter how many times we said “you don’t need to fill it all in, just the bits that are relevant to you at this time” the message wasn’t going to sink in, so we developed a fairly basic wizard-style tool which asked a few questions about what stage your research was at, who your funder was, etc., and then pulled out only the most relevant questions from the Checklist to help you meet the pertinent requirements. So instead of seeing 115 questions, you might be presented with only 15 or 20. Much better. We then added features like export and customisation, and some generic guidance to help with some of the more esoteric sections such as file format selection and metadata.
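To make the wizard idea concrete, here is a minimal sketch of that kind of filtering. It is not the DMP Online code: the `Question` class, its `funders` and `stages` tags, and the sample questions are all hypothetical, and the real tool may well select questions differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """One Checklist question with hypothetical relevance tags."""
    text: str
    funders: frozenset  # assumed tag: funders whose requirements this question serves
    stages: frozenset   # assumed tag: research stages at which it is relevant


def relevant_questions(checklist, funder, stage):
    """Return only the questions that match the given funder and research stage."""
    return [q for q in checklist if funder in q.funders and stage in q.stages]


# Illustrative data only; the real Checklist has over 100 questions.
CHECKLIST = [
    Question("What data will be created?",
             frozenset({"ESRC", "MRC"}), frozenset({"application"})),
    Question("How will the data be documented?",
             frozenset({"ESRC"}), frozenset({"application", "in-project"})),
    Question("Where will the data be deposited?",
             frozenset({"ESRC", "MRC"}), frozenset({"completion"})),
]

selected = relevant_questions(CHECKLIST, funder="ESRC", stage="application")
for question in selected:
    print(question.text)
```

With these sample tags, an ESRC applicant at the application stage sees two of the three questions; the deposit question is held back until completion.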
For those interested in such things, these are the technologies used in v3.0.
As I mentioned earlier, version 3 launched very recently, and has a number of great new features. The user interface has been tweaked to allow easier (one-click) access to most of the screens, and we’re investigating customised institutional versions with, among others, the University of Oxford. The tool now enables the application of multiple templates, so you can create a single DMP that satisfies your institution, your funder and your publisher at the same time. These templates can be phased more elegantly, so that you can ask (for example) a few questions at the application stage, more during the project’s lifetime, and then add even more detail when you’re close to completion. Users now have the option to make their plans more widely available. Authentication can be managed via the UK Federated Access Shibboleth mechanism, and we have coded the new system to enable easy translation into other languages, and to handle boilerplate text where this is thought to be beneficial. We have also been working behind the scenes to gain more official endorsement from some of the big funding councils, and this is starting to bear fruit.
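As a rough illustration of overlaying several templates into one phased, 'hybrid' plan, the sketch below merges the questions that each template asks at a given phase and drops duplicates. The data layout, the phase names and the `merge_templates` function are assumptions made for the example, not a description of how DMP Online v3.0 is implemented.

```python
def merge_templates(templates, phase):
    """Combine the questions several templates ask at one phase.

    `templates` maps a template name to {phase: [question, ...]}.
    Questions asked by more than one template appear once, in order
    of first appearance.
    """
    seen = set()
    merged = []
    for phases in templates.values():
        for question in phases.get(phase, []):
            key = question.strip().lower()  # treat case/whitespace variants as duplicates
            if key not in seen:
                seen.add(key)
                merged.append(question)
    return merged


# Hypothetical templates for a funder and an institution.
TEMPLATES = {
    "funder": {
        "application": ["What data will be created?", "How will data be shared?"],
        "completion": ["Where has the data been deposited?"],
    },
    "institution": {
        "application": ["What data will be created?",
                        "Where will data be stored during the project?"],
    },
}

hybrid = merge_templates(TEMPLATES, "application")
for question in hybrid:
    print(question)
```

Here the shared question about data creation is asked only once, so the researcher answers three questions at the application stage rather than four.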
So, in addition to the liaison with the funders, we’ve developed relationships with a variety of others. Our closest working relationship has probably been with the UK Data Archive, which is the designated place of long term deposit for the Economic and Social Research Council. Working with UKDA we have developed a data management planning template and guidance for ESRC applicants, and we also point to some UKDA guidance in the generic Checklist. We have also liaised with Wellcome Trust, the Medical Research Council and various other funders to develop dedicated DMP templates for them. Continuing in this vein, we’ve worked with disciplinary specialists and key institutional contacts to develop further DMP templates, and through the JISC Managing Research Data programmes we’ve contributed to a number of projects creating training materials around this area. Last but not least, we’ve shared experiences with a consortium of US universities – including the Universities of California, Virginia, and Illinois, and the Smithsonian Institution – which has helped them to shape their own DMP Tool.
These tables show the templates we’ve developed or are in the process of developing. I won’t go through them all now, but the slides will be available for later perusal.
And more templates are being developed all the time. If you’d like to talk about creating one for your institution or organisation, either catch me afterwards or drop me an email.
So that’s a pretty good high-level summary of what we’ve done in the data management planning area over the past four years or so. We’d like to end with a quick outline of the DCC’s institutional engagement programme, the major job of work that Sarah and I (and about a dozen other colleagues) are currently involved in. From last Autumn until next Spring – UK seasons, so the other way around for colleagues in New Zealand! – the DCC has been funded by the Higher Education Funding Council for England (HEFCE) to support eighteen HEIs in increasing their institutional data management capabilities. We’re working with a range of institutional types and sizes, from research intensive ancient universities, to new universities and small specialist institutions (e.g. art colleges). The way this works is we first of all make contact with someone already interested in this area, often in the Library, and through them we approach a senior academic, usually at Vice Principal level or equivalent, to make the case for working more concertedly in this area. Once an agreement is reached, the institution selects from a ‘menu’ of tools and services, e.g. (next slide)
The menu covers three things:
- Developing a Data Management Strategy
- DCC services to support aspects of the research and data management lifecycle, as given in Column 3
- The tools to support different strands of this. Some tools are simply used out of the box, others we can provide help and training with, and others, such as DMP Online, can be customised and tailored to match individual institutions’ requirements more closely.
Similarly, DMP Online can also be used in conjunction with other tools that support the data management/curation lifecycle, be these DCC tools or tools from other sources.
And at an information exchange level, here’s what we can do. Plans can be exported in a variety of formats, for human and/or machine readerships, and…
… the tool can link to these types of external systems using a variety of standards and protocols. Of course, this list is not exhaustive, and if you see an opportunity for linking DMP Online with other tools we might not have considered, let us know: the API will probably make it possible.
So in conclusion, we see the data management plan as a multi-purpose instrument – communication and context – and one that can bring together, if not level, the various stakeholder groups in the research data management endeavour.
Reversing the hydra metaphor somewhat, we hold that research is more than the sum of its parts, and when data management planning acts to facilitate communication for ensuring smooth and accurate interactions, it also serves as a way to bring it all together.
From policy to practice with DMP Online
Research data management: from policy to practice with DMP Online
Martin Donnelly, Digital Curation Centre, University of Edinburgh
Sarah Jones, Digital Curation Centre, University of Glasgow
Future Perfect 2012: Digital Preservation by Design
Te Papa Tongarewa, Wellington, New Zealand
26 – 27 March 2012
Running order (c. 25 mins)
Sarah:
1. Introduction to the DCC & research data management
2. Data-related policies in the UK
3. The DCC & data management planning
Martin:
4. DMP Online v3.0
5. Connections and collaborations
6. Putting it into practice (UMF work and other things)
7. Summary / conclusion
1. The Digital Curation Centre
- Founded in 2004
- Three partners: Edinburgh, Glasgow and Bath
- Primary funder is JISC
“Helping to build capacity, capability and skills in data management and curation across the UK’s higher education research community” - DCC Phase 3 Business Plan
What does the DCC do?
• Develop tools – CARDIO, DAF, DRAMBORA, DMP Online
• Offer guidance – helpdesk, briefing papers, how-to guides
• Run training & events – DC101, roadshow, RDMF, IDCC
• Support the JISC – esp. the Managing Research Data programmes
What is Research Data Management?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
[Diagram labels: Manage, Share]
Data management is part of good research practice
How does RDM affect preservation?
The costs of ingest – receiving data, preparing it for long-term storage, and incorporating it into the digital archive – receives the largest allocation of resources. - Keeping Research Data Safe 2
2. Data-related policies in the UK
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
RCUK Common Principles
• Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.
• Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.
• To enable research data to be discoverable and effectively re-used by others, sufficient metadata should be recorded and made openly available ....
7 principles agreed by all the UK research councils in May 2011
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
UK research funder expectations
• timely release of data
  – once patents are filed or on (acceptance for) publication
• open data sharing
  – minimal or no restrictions
  – deposit in data centres, structured databases, data enclave
• preservation of data
  – most funders expect 5-10+ years
• submission of data management and sharing plans…
3. The DCC and DMP
We’ve responded to requirements by offering support
- Analysed requirements
- Developed a Checklist
- Provided tools & guidance
Links to all DMP resources via http://www.dcc.ac.uk/resources/data-management-plans
What is a DMP?
UK research funders typically ask for:
• A short statement/plan submitted in grant applications
• An outline of what you will create/collect, methods, standards, data management and long-term plans
• How and why – justify your decisions and any limits
Common DMP questions
• What data will be created (format, types) and how?
• How will the data be documented and described?
• How will you manage ethics and Intellectual Property?
• What are the plans for data sharing and access?
• What is the strategy for long-term preservation?
DCC Checklist Coverage
§1: Introduction and Context
§2: Data Types, Formats, Standards and Capture Methods
§3: Ethics and Intellectual Property
§4: Access, Data Sharing and Re-use
§5: Short-Term Storage and Data Management
§6: Deposit and Long-Term Preservation
§7: Resourcing
§8: Adherence and Review
§9: Agreement/Ratification by Stakeholders
§10: Annexes
Checklist for a Data Management Plan v3.0 (Donnelly and Jones, March 2011)
http://www.dcc.ac.uk/resources/data-management-plans
DMP-related resources
– “Dealing with Data” (Lyon, 2008)
– Analysis of Funder Policies (Jones, 2009)
– Checklist for a Data Management Plan (Donnelly and Jones, 2009)
– “How to Develop a Data Management and Sharing Plan” (Jones, 2011) Edinburgh: Digital Curation Centre
– “Data Management Plans and Planning” (Donnelly, 2012) in Pryor (ed.) Managing Research Data, London: Facet
Links to all DCC resources via http://www.dcc.ac.uk/resources/data-management-plans
Key things to remember
- All research projects are different
- The DMP will depend upon the nature of the research AND the context (funder, domain, institution(s) etc)
- DMPs are useful communication tools
Not a UK phenomenon
Read about the international policy and DMP landscape in:
– “Research data policies: principles, requirements and trends” (Jones, 2012) in Pryor (ed.) Managing Research Data, London: Facet
– “Data Management Plans and Planning” (Donnelly, 2012) in Pryor (ed.) Managing Research Data, London: Facet
What does DMP Online do?
A web-based tool that enables users to...
i. Create, store and update multiple versions of Data Management Plans across the research lifecycle
ii. Meet a variety of specific data-related requirements (from funders, institutions, publishers, etc.)
iii. Get tailored guidance on best practice and helpful contacts, at the point of need
iv. Customise, export and share DMPs in a variety of formats in order to facilitate communications within and beyond research projects
* N.B. The templates have varying degrees of endorsement from funders, stakeholder communities, etc. More on this shortly…
DMP Online v3.0: Spring 2012
- Improved user interface, inc. customisable institutional versions
- New features
  - Overlaying multiple templates for ‘hybrid’ DMPs
  - Template phases (e.g. pre- / during / post-project)
  - Granular read / write / share permissions
  - API for systems interoperability (e.g. this project)
  - Shibboleth authentication
  - Multilingual support / boilerplate text
- Endorsement from funders
Collaborations
- Generic data management guidance ( in conjunction with )
- Funder-specific guidance developed in collaboration with the funders themselves
- Institution-specific guidance developed with key institutional contacts
- Discipline-specific guidance developed and deployed with JISC MRD projects (e.g. DMT Psych at York)
- Joint training programmes organised and delivered by DCC and UKDA
- Provided advice to US consortium
Templates: Stakeholder Liaison (i)
RCUK funders | Status
Arts and Humanities Research Council (AHRC) | Discussions beginning
Biotechnology and Biological Sciences Research Council (BBSRC) | Discussions ongoing
Engineering and Physical Sciences Research Council (EPSRC) | No explicit data management plan requirements: DCC referenced in roadmap requirements
Economic and Social Research Council (ESRC) | Template and guidance developed in collaboration with ESRC and ESDS. Funder’s online guidance points applicants towards tool.
Medical Research Council (MRC) | Template in preparation through collaboration with funder
NERC (Natural Environment Research Council) | Discussions ongoing
Science and Technology Facilities Council (STFC) | DCC resources referenced in data requirements
Other funders | Status
The Wellcome Trust | Template and guidance endorsed by funder
National Science Foundation (US) | Template developed by Sherry Lake, University of Virginia
Templates: Stakeholder Liaison (ii)
Disciplinary templates | Status
History | Developed in conjunction with University of Hull and University of Hertfordshire
Psychology | Developed by DMT Psych project, led by University of York
Mechanical Engineering | Developed as part of REDm-MED project, led by University of Bath
Health sciences | Developed by DATUM for Health project, led by University of Northumbria
Spatial information (INSPIRE) | Developed in conjunction with EDINA (UK national data centre) and trialled with Freshwater Biological Association
Institutional templates | Status
University of Northampton | Developed in collaboration with Information Services department
More institutional and subject-based templates are being developed through the JISC RDM projects and UMF institutional engagements…
Institutional Engagements: Putting it into practice
- Working with eighteen institutions over approximately 18 months to improve data management capabilities
- A broad variety of institutional types and sizes, from research intensive ancient universities, to new universities and small specialist institutions (e.g. art colleges)
- Institutions select from a ‘menu’ of tools and services, e.g. (next slide)
The Menu
Components of a Data Management Strategy (Research and Admin): Policy; Planning; Advocacy; Tools; Training
DCC Tools: Data Asset Framework (DAF); DMP Online; CARDIO; DRAMBORA
DCC Services: Policy development; Strategy development; Training; Workflow assessment; Costing; Institutional data catalogues (discovery)
Workflow connections
DMP Online can also be used in conjunction with other tools that support the data management/curation lifecycle, e.g.…
- DAF (Data Asset Framework)
- DRAMBORA (Digital Repository Audit Method Based On Risk Assessment)
- CARDIO (Collaborative Assessment of Research Data Infrastructure and Objectives)
Also non-DCC tools:
- LIFE
- Planets tools
- and more
How to connect: six export formats
For human readership…
- Pleasant formatting
- Editable. Can be used in conjunction with (e.g. MS Sharepoint)
- Removes all formatting
For machine readership…
- Facilitates quick public sharing
- Compatible with API for linking with other systems
- Minimal formatting
External connections
Systems | Standards / protocols
CRIS / admin systems | CERIF*
RCUK Je-S system |
Institutional Repositories | SWORD2
DDI repository | DDI*
DMP Tool (US) |
Other instances of DMP Online via federated model (? - TBC) | RDF (? - TBC)
* via RESTful API
[Diagram: DATA MANAGEMENT PLAN and UNRULY DATA at the centre, surrounded by Researcher(s), Research Support Office, Data Library / Repository / Archive, Computing Support, Faculty Ethics Committee, Etc...]
To sum...
All of our DMP-related resources available online via: www.dcc.ac.uk/dmponline/
Thank you
Martin Donnelly, Digital Curation Centre, University of Edinburgh
email@example.com
Twitter: @mkdDCC
Sarah Jones, Digital Curation Centre, University of Glasgow
firstname.lastname@example.org
Twitter: @sjDCC
Check out DCC at: www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc
Image credits:
Slide 1 - http://upload.wikimedia.org/wikipedia/commons/8/88/LernaeanHydraRephael.jpg
Slide 5 - http://www.dcc.ac.uk/resources/curation-lifecycle-model
Slide 6 (The Scream) - http://www.flickr.com/photos/terryfreedman/6548040049
Slide 6 (OAIS) - http://public.ccsds.org/publications/archive/650x0b1.pdf
Slide 29 - http://en.wikipedia.org/wiki/File:Hercules_slaying_the_Hydra.jpg
Slide 30 - http://www.treehugger.com/picture-is-worth-sum-car-parts.jpg
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.