Data Management Planning
University of Northampton, 27th February 2013
Marieke Guy, DCC, University of Bath
email@example.com
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
What is a data management plan?
A brief plan written at the start of your project to define:
• how your data will be created
• how it will be documented
• who will access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved
DMPs are often submitted as part of grant applications, but are useful whenever you are creating data.
Why develop a DMP?
• to meet funder requirements
• to help you manage your data
• to make informed decisions so you don’t have to figure things out as you go
• to anticipate and avoid problems, e.g. data loss
• to make your life easier!
What should a DMP cover?
1. Provide a description of the data
2. Explain how the data will be collected & documented
3. Outline the plans for data sharing
4. Justify any restrictions on sharing (ethics, IP)
5. State the long-term preservation plan
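The five headings on this slide can be treated as a simple completeness checklist for a draft plan. The following is a minimal Python sketch under that assumption, not a DCC or funder tool; the section keys and the `missing_sections` helper are hypothetical names chosen for illustration.

```python
# Hypothetical section keys derived from the five headings on this slide.
REQUIRED_SECTIONS = [
    "data_description",
    "collection_and_documentation",
    "sharing_plans",
    "sharing_restrictions",
    "preservation_plan",
]


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the required sections that are absent or left blank."""
    return [
        name for name in REQUIRED_SECTIONS
        if not plan.get(name, "").strip()
    ]


# Illustrative draft with two sections missing and one left blank.
draft = {
    "data_description": "Interview and laboratory data from about 500 subjects.",
    "sharing_plans": "Available under a data-sharing agreement.",
    "preservation_plan": "   ",  # whitespace only, so treated as unanswered
}

print(missing_sections(draft))
```

Keeping the checklist as a plain list makes it easy to swap in the headings a particular funder template actually uses.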
Provide a description of the data
Why is this important?
A good description of the data to be collected will help reviewers understand the characteristics of the data, their relationship to existing data, and any disclosure risks that may apply.
• e.g. The proposed research will include data from approximately 500 subjects being screened for three bacterial sexually transmitted diseases (STDs) at an inner city STD clinic. The final dataset will include self-reported demographic and behavioural data from interviews with the subjects and laboratory data from urine specimens provided.
Data collection & documentation
Why is this important?
Creating data in formats preferred for archiving helps to ensure that they will be usable in the future. Good descriptive metadata are essential for effective data use.
• e.g. Quantitative survey data files generated will be processed as SPSS system files with DDI XML documentation. The codebook will contain information on study design, sampling methodology, fieldwork, variable-level detail, and all information necessary for a secondary analyst to use the data accurately and effectively.
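To make "variable-level detail" concrete, here is a minimal Python sketch that writes a small codebook as CSV text. It is a plain CSV stand-in for illustration, not DDI XML and not the format used in the example above; the variables, field names and the `codebook_csv` helper are all assumptions.

```python
import csv
import io

# Illustrative variables only; these are not taken from a real study.
VARIABLES = [
    {"name": "age", "label": "Age at interview", "type": "integer",
     "values": "18-99", "missing_code": "-9"},
    {"name": "sex", "label": "Self-reported sex", "type": "categorical",
     "values": "1=female; 2=male", "missing_code": "-9"},
]

# Hypothetical column set for a minimal codebook.
FIELDS = ["name", "label", "type", "values", "missing_code"]


def codebook_csv(variables: list[dict[str, str]]) -> str:
    """Render variable-level documentation as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(variables)
    return buffer.getvalue()


print(codebook_csv(VARIABLES))
```

Recording the value codes and the missing-data code alongside each variable is what lets a secondary analyst read the data file without asking the original team.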
Outline the plans for data sharing
Why is this important?
Sharing data helps to advance science and to maximize the research investment. Your funder probably expects you to share data wherever possible.
• e.g. We will make the data and associated documentation available to users under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed.
Justify any restrictions on sharing
Why is this important?
As funders expect data to be shared, any restrictions need to be valid. Protection of human subjects is a fundamental tenet of research and an important ethical obligation for everyone.
• e.g. Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement.
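The risk of deductive disclosure described above can be illustrated by counting how many records share each combination of indirect identifiers. The sketch below is a k-anonymity-style count swapped in for illustration; the slide does not say how the risk was actually assessed, and the records, attribute choice and `rare_combinations` helper are hypothetical.

```python
from collections import Counter

# Hypothetical de-identified records: (age band, postcode district, sex).
RECORDS = [
    ("20-29", "NN1", "F"),
    ("20-29", "NN1", "F"),
    ("20-29", "NN1", "M"),
    ("20-29", "NN1", "M"),
    ("70-79", "NN2", "M"),  # only one person has this combination
]


def rare_combinations(records, k=2):
    """Return attribute combinations shared by fewer than k records."""
    counts = Counter(records)
    return sorted(combo for combo, n in counts.items() if n < k)


print(rare_combinations(RECORDS))
```

A combination held by a single subject is the kind of "unusual characteristic" that could identify someone even after names and other direct identifiers are removed, which is one way to justify releasing data only under an agreement.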
State the long-term preservation plan
Why is this important?
Digital data need to be actively managed over time to ensure that they will always be available and usable. Depositing data resources with a trusted digital archive can ensure that they are curated and handled according to good practices in digital preservation.
• e.g. The investigators will work with staff at the UKDA to determine what to archive and how long the deposited data should be retained. Future long-term use of the data will be ensured by placing a copy of the data into the repository.
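One common way to check that stored files remain unchanged over time is to record a checksum for each file and compare it later. The sketch below assumes that approach; it is not a procedure prescribed by the slide or by the UKDA, and the function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file name in the folder to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(folder.iterdir()) if p.is_file()}


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List files whose current checksum differs from the manifest."""
    current = build_manifest(folder)
    return [name for name, digest in manifest.items() if current.get(name) != digest]


# Demonstration in a temporary folder so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "survey.csv").write_text("id,age\n1,34\n")
    manifest = build_manifest(folder)
    print(changed_files(folder, manifest))  # nothing has changed yet
    (folder / "survey.csv").write_text("id,age\n1,35\n")
    print(changed_files(folder, manifest))  # the edited file is reported
```

A file that has been deleted is also reported, because its name is in the manifest but no longer has a current checksum.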
A useful framework to get you started
• Think about why the questions are being asked – why is it useful to consider that?
• Look at examples to help you understand what to write
• www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framework.html
Help from the DCC
• https://dmponline.dcc.ac.uk
• www.dcc.ac.uk/resources/how-guides/develop-data-plan
...a web-based tool to help researchers write Data Management Plans according to different funder requirements
We’re currently enhancing it with practical examples, boilerplate text and tailored support
https://dmponline.dcc.ac.uk
How DMP Online works
Create a plan based on relevant funder / institutional templates...
...and then answer the questions using the guidance provided
Tips for writing DMPs
• Seek advice - consult and collaborate
• Consider good practice for your field
• Base plans on available skills & support
• Make sure implementation is feasible
Advice on what funders look for
Audio clip from presentation by Peter Dukes of the MRC
Sources of guidance
• ICPSR framework for a data management plan: www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framework.html
• How to develop a data management and sharing plan: www.dcc.ac.uk/resources/how-guides/develop-data-plan
• UKDA’s manage and share your data guide: http://data-archive.ac.uk/media/2894/managingsharing.pdf
To summarise: Data Management and Sharing Plans
Funders typically want a short statement covering:
• What data will be created? (format, types, volume)
• How will the data be collected and documented?
• What are the plans for data sharing and access?
• Justify any restrictions on sharing (ethics, IP)
• What is the strategy for long-term preservation?
Thanks - any questions?
For DCC guidance, tools and case studies see: www.dcc.ac.uk/resources
Follow us on Twitter @digitalcuration and #ukdcc
Thanks to Research360 for contribution to slides