An introduction to Research Data Management and Data Management Planning presented at the University of the West of England on Wednesday 9th July 2014.

Published in: Technology, Education

  • Data is increasing in significance. It will unquestionably matter to your research careers, more than it does to your supervisors’ generation.
    Learn good data habits now! You’ll need them later.
  • Some formats are better for data sharing and long-term preservation than others.
    It’s preferable to use formats that are uncompressed (e.g. large, high-quality files like .wav) and based on non-proprietary (i.e. open) standards that are documented and well understood. This aids preservation and interoperability.
    Some data centres have preferred formats for deposit, so it’s worth encouraging researchers to check these.
  • To make sure their data can be understood by themselves, their community and others, researchers should create metadata and documentation.
    Metadata is basic descriptive information to help identify and understand the structure of the data e.g. title, author...
    Documentation provides the wider context. It’s useful to share the methodology / workflow, software and any information needed to understand the data, e.g. an explanation of abbreviations or acronyms.
    There are lots of standards that can be used. The DCC started a catalogue of disciplinary metadata standards, which is now being taken forward as an international initiative via an RDA working group.
  • The EC guidelines suggest selecting a suitable repository. The Databib and Re3data lists can be useful for this. They allow you to search and browse by subject. Re3data also allows you to restrict the search by certificates, open access repositories and persistent identifiers.
  • Guidance from the DCC can also help researchers to understand data licensing. The guide outlines the pros and cons of each approach, e.g. the limitations of some CC options.
    Under Horizon 2020 it’s recommended that researchers use CC-0 or CC-BY to make data as open as possible.
  • I recommend this ICPSR resource.
    It explains why each question matters, which gives a pointer on how to answer it.
    Examples are given. This is the most frequent request we get at the DCC: examples help researchers think of what to write for their context.
  • The DCC has produced a how-to guide on writing DMPs and developed a tool to help.
  • DC101 UWE
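
The notes above separate metadata (basic descriptive information such as title and author) from documentation (wider context such as explanations of abbreviations). As a rough illustration only, the sketch below stores both in a small JSON record kept alongside a dataset. The class name, field names and example values are hypothetical and do not follow any particular metadata standard; in practice a disciplinary standard from the DCC catalogue would be preferable.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class DatasetMetadata:
    """Basic descriptive metadata for one dataset (field names are illustrative)."""
    title: str
    author: str
    date_created: str   # ISO 8601 date string, e.g. "2014-07-09"
    access_rights: str  # e.g. "open", "embargoed", "restricted"
    # Documentation: abbreviation or acronym mapped to its meaning
    abbreviations: dict = field(default_factory=dict)


def missing_fields(record):
    """Return the names of required text fields that are empty."""
    required = ("title", "author", "date_created", "access_rights")
    return [name for name in required if not getattr(record, name).strip()]


def to_json(record):
    """Serialise the record as human-readable JSON (an open, plain-text format)."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Hypothetical example values
record = DatasetMetadata(
    title="Example survey responses",
    author="A. Researcher",
    date_created="2014-07-09",
    access_rights="open",
    abbreviations={"DMP": "data management plan"},
)
print(missing_fields(record))
print(to_json(record))
```

Writing the record as plain-text JSON keeps it in an open, documented format, in line with the format advice in the notes.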

    1. 1. Research Data Management. Sarah Jones, DCC, University of Glasgow. sarah.jones@glasgow.ac.uk Twitter: @sjDCC. University of the West of England, 9th July 2014.
    2. 2. Programme • Quiz of funders’ requirements • Introduction to RDM • Data management planning • Demo of DMPonline • Q&A
    3. 3. What is research data management? “the active management and appraisal of data over the lifecycle of scholarly and scientific interest” Data management is part of good research practice.
    4. 4. Why manage your research data? • To make your research easier! • To stop yourself drowning in irrelevant stuff • In case you need the data later • To avoid accusations of fraud or bad science • To share your data for others to use and learn from • To get credit for producing it • Because somebody else said to do so
    5. 5. RCUK Common Principles on Data Policy “Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.” www.rcuk.ac.uk/research/datapolicy
    6. 6. Why share data?
    7. 7. Benefits of data sharing (1) ... scientific breakthroughs “It was unbelievable. It’s not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.” Dr John Trojanowski, University of Pennsylvania www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
    8. 8. Benefits of data sharing (2) ... more citations “There is evidence that studies that make their data available do indeed receive more citations than similar studies that do not.” Piwowar H. and Vision T.J. 2013 “Data reuse and the open data citation advantage” https://peerj.com/preprints/1.pdf 9% - 30% increase
    9. 9. If you plan to share your data.... • Have you got consent for sharing? • Do any licences you’ve signed permit sharing? • Is your data in suitable formats? Decisions made early on affect what you can do later
    10. 10. Some formats are better for long-term preservation. It’s preferable to opt for formats that are: • Uncompressed • Non-proprietary • Open, documented • Standard representation (ASCII, Unicode) Data centres may have preferred formats for deposit, e.g.
        Type | Recommended | Non-preferred
        Tabular data | CSV, TSV, SPSS portable | Excel
        Text | Plain text, HTML, RTF; PDF/A only if layout matters | Word
        Media | Container: MP4, Ogg; Codec: Theora, Dirac, FLAC | Quicktime, H264
        Images | TIFF, JPEG2000, PNG | GIF, JPG
        Structured data | XML, RDF | RDBMS
        Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
    11. 11. Documentation What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them? Consider the differences between someone inside your research group, someone outside your group but in your field, and someone outside your field.
    12. 12. Documentation and standards Metadata: basic info e.g. title, author, dates, access rights... Documentation: context, workflows, methods, code, data dictionary... Use standards wherever possible for interoperability www.dcc.ac.uk/resources/metadata-standards
    13. 13. Tools for managing data www.dcc.ac.uk/resources/external/tools-services/managing-active-research-data
    14. 14. Where to store your data? • Your own drive (PC, server, flash drive, etc.) – And if you lose it? Or it breaks? • Somebody else’s drive • Departmental drive • “Cloud” drive – Do they care as much about your data as you do?
    15. 15. How to back up? • 3… 2… 1… backup! – at least 3 copies of a file – on at least 2 different media – with at least 1 offsite • Use managed services where possible e.g. University filestores rather than local or external hard drives • Ask central or local IT team for advice
    16. 16. Archiving: data repositories http://databib.org http://service.re3data.org/search Zenodo •OpenAIRE-CERN joint effort •Multidisciplinary repository •Multiple data types – Publications – Long tail of research data •Citable data (DOI) •Links to funding, pubs, data & software www.zenodo.org
    17. 17. License your data for reuse. Creative Commons limitations: • NC Non-Commercial: what counts as commercial? • SA Share Alike: reduces interoperability • ND No Derivatives: severely restricts use. The DCC guide outlines pros and cons of each approach and gives practical advice on how to implement your licence www.dcc.ac.uk/resources/how-guides/license-research-data
    18. 18. Data citation • Makes it easier for readers to locate the data and validate findings • Data citations ensure that data contributors receive proper credit • Can link to reuse to show impact • Less danger of rival researchers ‘stealing’ results from those who publish their data openly www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
    19. 19. ImpactStory: Altmetrics •https://impactstory.org
    20. 20. Getting your research out there www.katiephd.com/twitter-and-science-publications
    21. 21. Managing and sharing data: a best practice guide • How to write a DMP • Formatting your data • Documentation • Data sharing • Ethics and consent • Copyright • … http://data-archive.ac.uk/media/2894/managingsharing.pdf
    22. 22. Putting the pieces together... ...DMPs Photo by Dread Pirate Jeff http://www.flickr.com/photos/justageek/2851643792
    23. 23. What is a data management plan? A brief plan written at the start of your project to define: • how your data will be created • how it will be documented • who will access it • where it will be stored • who will back it up • whether (and how) it will be shared & preserved. DMPs are often submitted as part of grant applications, but are useful whenever you’re creating data.
    24. 24. Why YOU need a Data Management Plan http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan What if this was your laptop?
    25. 25. Which UK funders require a DMP? www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
    26. 26. DCC Checklist for a DMP • 13 questions on what’s asked across the board • Prompts / pointers to help researchers get started • Guidance on how to answer www.dcc.ac.uk/sites/default/files/documents/resource/DMP_Checklist_2013.pdf
    27. 27. Common themes in DMPs 1. Description of data to be collected / created (i.e. content, type, format, volume...) 2. Standards / methodologies for data collection & management 3. Ethics and Intellectual Property (highlight any restrictions on data sharing e.g. embargoes, confidentiality) 4. Plans for data sharing and access (i.e. how, when, to whom) 5. Strategy for long-term preservation
    28. 28. A useful framework to get you started Think about why the questions are being asked – why is it useful to consider that topic? Look at examples to help you understand what to write •www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framework.html
    29. 29. Tips for writing DMPs • Seek advice - consult and collaborate • Consider good practice for your field • Base plans on available skills & support • Make sure implementation is feasible
    30. 30. Example plans • Technical plan submitted to AHRC by Bristol Uni http://data.bris.ac.uk/research/planning/files/2013/08/data.bris-AHRC-example-Technical- • Rural Economy & Land Use (RELU) programme examples http://relu.data-archive.ac.uk/data-sharing/planning/examples • UCSD example DMPs (20+ scientific plans for NSF) http://rci.ucsd.edu/data-curation/examples.html • My DMP – a satire (what not to write!) http://ivory.idyll.org/blog/data-management.html More at: https://dmponline.dcc.ac.uk/help#DMPhelp
    31. 31. Help from the DCC •https://dmponline.dcc.ac.uk •www.dcc.ac.uk/resources/how-guides/develop-data-plan A web-based tool to help researchers write data management plans
    32. 32. DMPonline demo https://dmponline.dcc.ac.uk
    33. 33. Thanks – any questions? DCC guidance, tools and case studies: www.dcc.ac.uk/resources Follow us on twitter: @digitalcuration and #ukdcc Credit to Dorothea Salo, Ryan Schryver and colleagues for content from the “Escaping Datageddon” presentation for slides 4, 11 & 14, available at: http://www.slideshare.net/cavlec/escaping-datageddon And to the Research360 project at the University of Bath for content from the “Managing your research data” presentation for slide 10, available at: http://opus.bath.ac.uk/32296
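
The 3-2-1 backup rule on slide 15 (at least 3 copies, on at least 2 different media, with at least 1 offsite) can be expressed as a simple check. The sketch below is illustrative only: the `BackupCopy` class, the `check_321` function and the example locations are hypothetical, and it checks only a description of your copies that you supply by hand, not the files themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of a file, as described by the researcher (hypothetical model)."""
    location: str  # e.g. "laptop", "university filestore"
    medium: str    # e.g. "internal disk", "network storage"
    offsite: bool  # True if kept away from the primary working location


def check_321(copies):
    """Return the unmet 3-2-1 conditions; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 different media")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


# Hypothetical example: a working copy, a managed filestore and an offsite drive
copies = [
    BackupCopy("laptop", "internal disk", offsite=False),
    BackupCopy("university filestore", "network storage", offsite=False),
    BackupCopy("external drive at home", "usb disk", offsite=True),
]
print(check_321(copies))      # all three conditions are met
print(check_321(copies[:1]))  # a single laptop copy fails every condition
```

A check like this is no substitute for the slide's other advice: use managed services where possible and ask your IT team.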