Developing a Data Management Plan
Martin Donnelly
Digital Curation Centre
University of Edinburgh
AgreenSkills Annual Seminar
Paris, 15 February 2017
Contents
1. About the DCC
2. My involvement with DMPs
3. DMP: what and why?
4. Who’s involved in the DMP process?
5. DMP specifics in H2020
6. Useful resources
7. A few things to note/remember
8. Contacts and opportunity for questions
The Digital Curation Centre (DCC)
• The UK’s national centre of expertise in digital
preservation and data management, est. 2004
• Principal audience is the UK higher education sector, but
we increasingly work further afield (continental Europe,
North America, South Africa, Asia…)
• Provide guidance, training, tools (e.g. DMPonline) and
other services on all aspects of research data
management and Open Science
• Now offering tailored consultancy/training
• Organise national and international events and webinars
(International Digital Curation Conference, Research
Data Management Forum)
Background
• Checklist for a Data Management Plan (v1, 2009)
• A generic list of issues that a DMP could or should cover, derived from
UK funder requirements
• DMPonline (2010-present)
• A wizard-style, Web-based tool to help researchers and other related
professionals to produce and maintain DMPs according to funder or
institutional policies
• Book Chapter (2011)
• “Data Management Plans and Planning” in Pryor G (ed.) Managing
Research Data (New York, Facet)
• DMPTool (2011-present)
• Helped bring the US DMPTool consortium together, and provided
advice as they were starting up
• EC Reviews (2016-17)
• In summer 2016, I was one of two expert reviewers for the first iteration
of Horizon 2020 data management plans, and I’m doing it again for the
next batch in February/March 2017
Recap: what is RDM?
“the active management
and appraisal of data over
the lifecycle of scholarly
and scientific interest”
What sorts of activities?
- Planning and describing data-
related work before it takes
place
- Documenting your data so that
others can find and understand it
- Storing it safely during the
project
- Depositing it in a trusted archive
at the end of the project
- Linking publications to the
datasets that underpin them
The benefits of Openness
• SPEED: The research process becomes faster
• EFFICIENCY: Data collection can be funded once, and
used many times for a variety of purposes
• ACCESSIBILITY: Interested third parties can (where
appropriate) access and build upon publicly-funded
research resources with minimal barriers to access
• IMPACT and LONGEVITY: Open publications and data
receive more citations, over longer periods
• TRANSPARENCY and QUALITY: The evidence that
underpins research can be made open for anyone to
scrutinise, and attempt to replicate findings. This leads
to a more robust scholarly record
Data Management Plans and Planning
• Data management planning (DMP) underpins and
pulls together different strands of RDM activities,
often across multiple project partners
• DMP is the process of planning, describing and communicating
activities carried out during the research lifecycle in order to…
• Keep sensitive data safe
• Maximise data’s reuse potential
• Support longer-term preservation
• A data management plan is usually a short document detailing specifics
of the data that will be created during a research project, together
with information on how it can be accessed and utilised
• Research funders often ask for DMPs to be submitted alongside grant
applications and/or developed over the course of the research project.
(HEIs are increasingly asking their researchers to do this too…)
Benefits of data management planning
• It is intuitive that planned activities stand a better
chance of meeting their goals than unplanned ones. The
process of planning is also a process of communication,
increasingly important in interdisciplinary/multi-partner
research. Collaboration will be more harmonious if
project partners (in industry, other universities, other
countries…) are on the same page
• In terms of data security, if there are good reasons not to
publish/share data, in whole or in part, you will be on
more solid ground if you flag these up early in the process
• DMP also provides an ideal opportunity to engender good
practice with regard to (e.g.) file formats, metadata
standards, storage and risk management practices,
leading to greater longevity of data, and improved quality
standards…
Limits of data management planning
What can a plan not do? It can’t do the
work for you.
The map is not the territory (Korzybski)
or
Chalk’s no shears (Scottish saying)
It is important to remember that the
human challenges in data management
are often more difficult to meet than
the technological ones.
So communication is vital, especially in
international, multi-partner research!
What does a data management plan look like?
• It is usually a couple of pages outlining:
✓ how data will be captured/created
✓ how it will be documented
✓ who will be able to access it
✓ where it will be stored
✓ how it will be backed up, and
✓ whether (and how) it will be shared and preserved long-term
✓ etc
• DMPs are often submitted as part of funding applications – and requirements
vary from funder to funder – but they are useful whenever researchers are
creating (or reusing) data, especially where the research involves multiple
partners, countries, etc…
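The checklist above can also be thought of as a simple structure to fill in. The following is a minimal sketch only, not how DMPonline or any funder template works: the section names and the `DataManagementPlan` class are hypothetical, chosen to mirror the bullets on this slide.

```python
from dataclasses import dataclass, field

# Hypothetical section names, mirroring the checklist on the slide above.
SECTIONS = (
    "capture",               # how data will be captured/created
    "documentation",         # how it will be documented
    "access",                # who will be able to access it
    "storage",               # where it will be stored
    "backup",                # how it will be backed up
    "sharing_preservation",  # whether (and how) it will be shared and preserved
)


@dataclass
class DataManagementPlan:
    """Illustrative container for the answers in a short DMP."""

    title: str
    answers: dict[str, str] = field(default_factory=dict)

    def missing_sections(self) -> list[str]:
        """Return the sections that have no answer, or only whitespace."""
        return [s for s in SECTIONS if not self.answers.get(s, "").strip()]


plan = DataManagementPlan(
    title="Example project",
    answers={"capture": "Online survey", "storage": "University file store"},
)
print(plan.missing_sections())
```

A real plan would follow the funder's own headings, which vary from funder to funder.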
Roles and responsibilities
Like RDM in general, data management planning is a hybrid
activity, involving multiple stakeholder groups…
• The principal investigator (usually ultimately responsible for data)
• Research assistants (may be more involved in day-to-day data
management)
• The institution’s funding office (may have a compliance role)
• Library/IT/Legal (The library may issue PIDs, or liaise with an external
service that does this, e.g. DataCite.)
• Partners based in other institutions
• Commercial partners
• Etc
Other stakeholders in the modern research process include
governments, public services, and the general public (who fund
lots of research via their taxes)
Caveat!
• It’s not necessary – or even desirable – for every
researcher (or research administrator, or librarian, or IT
person…) to become an expert in every aspect of data
management
• Useful expertise may already exist within the research
office, library, IT, departmental support staff, legal
services etc, as well as academic colleagues well versed
in data management
• The trick is to harness this and to make it appear
seamless. Communication and coordination (or at least
the appearance of…) is increasingly important
European policy
• Currently in the midst of an extended pilot for Horizon 2020. Other
projects can participate voluntarily, and opting in has been more
popular than opting out
• Applies as a minimum to research data underlying publications, plus any
other data as decided by the project
• Participants must:
• Write a DMP as a project deliverable
• Deposit data in a repository
• Make it possible for others to access, mine, exploit and reuse the data
• Share information on the tools needed
…unless there are compelling reasons not to do so.
And these reasons should be recorded… in the DMP.
• Approach: “As open as possible, as closed as necessary”
Horizon 2020 – extended pilot (i)
As part of making research data findable,
accessible, interoperable and re-usable (FAIR), a
DMP should include information on:
• the handling of research data during and after the end
of the project
• what data will be collected, processed and/or
generated
• which methodology and standards will be applied
• whether data will be shared/made open access and
• how data will be curated and preserved (including after
the end of the project)
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Horizon 2020 – extended pilot (ii)
• Once a project has had its funding approved and has started, you must
submit a first version of your DMP (as a deliverable) within the first 6
months of the project
• The Commission provides a DMP template, the use of which is
recommended but voluntary
• The DMP needs to be updated over the course of the project whenever
significant changes arise, such as (but not limited to):
• new data
• changes in consortium policies (e.g. new innovation potential, decision to file
for a patent)
• changes in consortium composition and external factors (e.g. new consortium
members joining or old members leaving).
• The DMP should be updated as a minimum in time with the periodic
evaluation/assessment of the project. If there are no other periodic
reviews foreseen within the grant agreement, then such an update
needs to be made in time for the final review at the latest.
Furthermore, the consortium can define a timetable for review in the
DMP itself
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
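As a rough illustration of the six-month rule above, the latest date for the first DMP version can be estimated from the project start date. This is a sketch under an assumption: it reads "within the first 6 months" as six calendar months after the start date, and the function names are hypothetical. The binding deadline is whatever the grant agreement states.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the length of the target month,
    so 31 August plus six months gives the last day of February.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def first_dmp_due(project_start: date) -> date:
    """Estimate the latest date for the first DMP (assumed: start + 6 months)."""
    return add_months(project_start, 6)


print(first_dmp_due(date(2017, 2, 15)))
```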
DCC resources
• Guidance, e.g. “How-To Develop a Data
Management and Sharing Plan”
• DCC Checklist for a Data Management
Plan:
http://www.dcc.ac.uk/resources/data-management-plans/checklist
• DMPonline tool:
https://dmponline.dcc.ac.uk/
• Links to all DCC DMP resources via
http://www.dcc.ac.uk/resources/data-management-plans
DMPonline: overview
• Helps researchers write DMPs
• Provides funder questions and guidance
• Includes a template DMP for Horizon 2020
• Provides help from universities
• Examples and suggested answers
• Free to use
• Mature (v1 launched April 2010)
• Code is Open Source (on GitHub)
https://dmponline.dcc.ac.uk
Registration
Sign up with your email address, organisation and password
Select ‘other organisation’ if yours is not listed
Creating a plan
Select funder (if any)
Select organisation for additional questions and guidance
Select other sources of guidance
Plan details: summary
Summary of the sections and questions in your DMP
Answering questions
Notes who has answered the question and when
Progress bar updates how many questions remain
Sharing plans
Give colleagues read-only or read-write access, or make them co-owners
Co-writing DMPs
Sections are locked for editing when they’re being worked on by colleagues
Exporting DMPs
Can export as plain text, docx, PDF, html...
Institutions can customise the tool by…
• Adding templates
• Adding custom guidance
• Providing example or suggested answers
• Monitoring usage within their organisation
• Offering non-English language versions
www.dcc.ac.uk/news/customising-dmponline-admin-interface-launches
More information
Customising DMPonline
www.dcc.ac.uk/news/customising-dmponline-admin-interface-launches
http://www.screenr.com/PJHN
Get the code, amend it, run a local instance, flag issues, request features...
https://github.com/DigitalCurationCentre/DMPonline_v4
And finally, some sample plans
• There are lots of data management plans available on
the Web. The DCC provides links to a number of
sample DMPs via
http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
• The US National Endowment for the Humanities (NEH)
recently released over 100 of its DMPs. These are
available via:
http://www.neh.gov/divisions/odh/grant-news/data-management-plans-successful-grant-applications-2011-2014-now-available
Nota bene!
• DMP is above all a communication activity, between
the data collectors and their contemporaries (project
partners and funders) and with future data re-users…
• Remember that there is no magic bullet, and no one-
size-fits-all solution!
• Much of the benefit of data management planning lies
in the process of planning, above and beyond the plans
produced at the end of the process
• A DMP should be a living document. Research seldom
goes entirely according to plan, and plans should be
updated to reflect the reality of the research, not the
other way around!
Thank you: any questions?
• For more information about the DCC:
• Website: www.dcc.ac.uk
• Director: Kevin Ashley
(kevin.ashley@ed.ac.uk)
• General enquiries: Alex Delipalta
(alexandra.delipalta@ed.ac.uk)
• Twitter: @digitalcuration
• My contact details:
• Email: martin.donnelly@ed.ac.uk
• Twitter: @mkdDCC
• Slideshare:
http://www.slideshare.net/martindonnelly
This work is licensed
under the Creative
Commons Attribution 2.5
UK: Scotland License.
