Presentation given by Sarah Jones at a seminar run by LSHTM on 6th November 2012. http://www.lshtm.ac.uk/newsevents/events/2012/11/developing-data-management-expertise-in-research---half-day-event
1. Data Management Planning for the health sciences
Sarah Jones
Digital Curation Centre
sarah.jones@glasgow.ac.uk
2. What is a Data Management Plan?
A brief statement outlining how data will be created,
managed, shared and preserved, explaining decisions
and justifying any restrictions that need to be applied.
Often submitted as part of grant applications, but
useful whenever you’re creating data.
3. Why develop a DMP?
• to help you manage your data
• to provide guidelines for everyone to work to
• to anticipate and avoid problems, e.g. data loss
• to make your life easier!
• to comply with funders' requirements...
4. Who requires a DMP or equivalent?
http://researchonline.lshtm.ac.uk/208596/1/Funder_Requirements_Analysis.pdf
5. Cancer Research UK
The following should be considered when developing a data sharing plan:
• The volume, type, content and format of the final dataset
• The standards that will be utilised for data collection and management
• The metadata, documentation or other supporting material that should
accompany the data for it to be interpreted correctly
• The method used to share data
• The timescale for public release of data
• The long-term preservation plan for the dataset
• Whether a data sharing agreement will be required
• Any reasons why there may be restrictions on data sharing
6. Medical Research Council
The MRC provides a template with the following sections:
0. Proposal name
1. Description of the data
2. Data collection / generation
3. Data management, documentation and curation
4. Data security & confidentiality of potentially disclosive personal information
5. Data sharing and access
6. Responsibilities
7. Relevant policies on data sharing and data security
8. Author of this Data Management Plan (Name) and contact details
7. Wellcome Trust
Applicants should consider the following seven questions:
i. What data outputs will your research generate and what data will have
value to other researchers?
ii. When will you share the data?
iii. Where will you make the data available?
iv. How will other researchers be able to access the data?
v. Are any limits to data sharing required – for example, to either
safeguard research participants or to gain appropriate intellectual
property protection?
vi. How will you ensure that key datasets are preserved to ensure their
long-term value?
vii. What resources will you require to deliver your plan?
8. Five common themes in DMPs
1. Provide a description of the data
2. Explain how the data will be collected & documented
3. Outline the plans for data sharing
4. Justify any restrictions on sharing
5. State the long-term preservation plan
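
The five themes above can double as a completeness check on a draft plan. The following is a minimal sketch only: the dictionary keys, the `missing_themes` helper and the sample draft are hypothetical names introduced for illustration, and they do not come from any funder template or from DMP Online.

```python
# Hypothetical checklist: the labels follow the five themes on the slide;
# the short keys are illustrative and not taken from any funder template.
THEMES = {
    "description": "Provide a description of the data",
    "collection": "Explain how the data will be collected & documented",
    "sharing": "Outline the plans for data sharing",
    "restrictions": "Justify any restrictions on sharing",
    "preservation": "State the long-term preservation plan",
}


def missing_themes(plan):
    """Return the theme labels whose section is absent or blank in a draft plan."""
    return [
        label
        for key, label in THEMES.items()
        if not plan.get(key, "").strip()
    ]


# An invented draft with two sections written and one left blank.
draft = {
    "description": "Interview and laboratory data from approximately 500 subjects.",
    "sharing": "Available to users under a data-sharing agreement.",
    "preservation": "",
}

for label in missing_themes(draft):
    print("Still to write:", label)
```

A check like this only confirms that each theme has been addressed; whether the answers satisfy a particular funder still depends on that funder's own guidance.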
9. Provide a description of the data
Why is this important?
A good description of the data to be collected will help reviewers
understand the characteristics of the data, their relationship to
existing data, and any disclosure risks that may apply.
e.g. The proposed research will include data from approximately
500 subjects being screened for three bacterial sexually
transmitted diseases (STDs) at an inner city STD clinic. The final
dataset will include self-reported demographic and behavioural
data from interviews with the subjects and laboratory data from
urine specimens provided.
n.b. EXPLANATIONS AND EXAMPLES COURTESY OF NIH AND ICPSR
10. Explain how the data will be collected
Why is this important?
Creating data in formats preferred for archiving helps to ensure
that they will be usable in the future. Good descriptive metadata
are essential for effective data use.
e.g. Quantitative survey data files generated will be processed as
SPSS system files with DDI XML documentation. The codebook
will contain information on study design, sampling methodology,
fieldwork, variable-level detail, and all information necessary for
a secondary analyst to use the data accurately and effectively.
11. Outline the plans for data sharing
Why is this important?
Sharing data helps to advance science and to maximize the
research investment. Your funder probably expects you to share
data wherever possible.
e.g. We will make the data and associated documentation
available to users under a data-sharing agreement that provides
for: (1) a commitment to using the data only for research
purposes and not to identify any individual participant; (2) a
commitment to securing the data using appropriate computer
technology; and (3) a commitment to destroying or returning the
data after analyses are completed.
12. Justify any restrictions on sharing
Why is this important?
As funders expect data to be shared, any restrictions need to be
valid. Protection of human subjects is a fundamental tenet of
research and an important ethical obligation for everyone.
e.g. Because the STDs being studied are reportable diseases, we
will be collecting identifying information. Even though the final
dataset will be stripped of identifiers prior to release for sharing,
we believe that there remains the possibility of deductive
disclosure of subjects with unusual characteristics. Thus, we will
make the data and associated documentation available to users
only under a data-sharing agreement.
13. State the long-term preservation plan
Why is this important?
Digital data need to be actively managed over time to ensure that
they will always be available and usable. Depositing data resources
with a trusted digital archive can ensure that they are curated and
handled according to good practices in digital preservation.
e.g. The investigators will work with staff at the UKDA to
determine what to archive and how long the deposited data
should be retained. Future long-term use of the data will be
ensured by placing a copy of the data into the repository.
14. Tips for writing DMPs
• Seek advice - consult and collaborate
• Consider good practice for your field
• Base plans on available skills & support
• Make sure implementation is feasible
• Justify the decisions, particularly restrictions
15. Sources of guidance
• ICPSR framework for a data management plan
www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framework.html
• How to develop a data management and sharing plan
www.dcc.ac.uk/resources/how-guides/develop-data-plan
• LSHTM Research Data Management support service
http://blogs.lshtm.ac.uk/rdmss
16. How DMP Online can help
DMP Online is a web-based tool to help
researchers write Data Management Plans
according to different funder requirements
https://dmponline.dcc.ac.uk
17. How DMP Online works
Create a plan based on relevant funder / institutional templates...
...and then answer the questions using the guidance provided
18. Thanks - any questions?
For DCC guidance, tools and case studies see:
www.dcc.ac.uk/resources
Follow us on Twitter @digitalcuration and #ukdcc
Editor's Notes
Some ask for a data sharing or data access plan instead of a DMP, or simply encourage you to describe plans for data cleaning, monitoring & verification. Gareth’s analysis of health-related funders highlighted those listed here as having some form of requirement.
There are various templates in DMP Online based on different funder requirements and institutional customisations.