
The purpose, practicalities, pitfalls and policies of managing and sharing data in the UK



Talk to the Royal Society of Chemistry, Chemical Information and Computer Applications Group conference - Measurement, Information and Innovation: Digital Disruption in the Chemical Sciences. Tuesday 20th October 2015, RSC, Burlington House, Piccadilly, London

Published in: Data & Analytics


  1. 1. The purpose, practicalities, pitfalls and policies of managing and sharing data in the UK AAMG-CICAG Measurement, Information and Innovation meeting 20 October 2015 Dr Danny Kingsley
  2. 2. Can we cover this in 15 minutes (allowing 5 min for questions)? • UK policy landscape • Places to share data • What are we trying to achieve? • Let’s start at the beginning • Basics of Research Data Management • Issues with sharing (or not) data
  3. 3. The data policy landscape Lots of slightly different rules in the UK
  4. 4. Policies • Funder – RCUK Common Principles on Data Policy • Government – Draft Concordat on Open Research Data, released by the RCUK for consultation, which ended on 28 September – Cambridge coordinated a joint response with other universities • Publishers • Institutional – Cambridge University Research Data Management Policy Framework
  5. 5. RCUK Common Principles on Data – “Publicly funded research data are a public good (…), which should be made openly available with as few restrictions as possible”
  6. 6. The principles might be common…
  7. 7. What the researcher hears From Bill Hubbard Getting the rights right: when policies collide
  8. 8. Places to share data There are lots of options
  9. 9. Open repositories (some are free, some charge)
  10. 10. Discipline-specific repositories • Gene Expression Omnibus – Public functional genomics data repository • arXiv – e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics • Oxford Text Archive – Literary and linguistic texts for higher education • UK Data Service – Social science data • Natural Environment Research Council (NERC) run 7 repositories
  11. 11. Journals • Either as supplementary data, or in data-only journals – PLOS data sharing policy (Dec 2013) – Nature’s journal Scientific Data
  12. 12. We are a long way from there
  13. 13. So what’s it all about then? What are we actually trying to achieve with open data policies?
  14. 14. In conversation with Ben Ryan EPSRC • Please share: – the data that underpins publications – the data that validates research findings – the data that is worth keeping • The default position is ‘data should be open’ • Published research findings should be testable • Maximise the impact of publicly funded research • Maintain public trust in science and research • They are trying to create a new research culture
  15. 15. Responses to data sharing policies • What’s the minimum we can get away with? • This is crap • ‘They’ are just doing this because ‘they’ can • But it will take a huge effort to get the data in a useable form • No-one will look at it • What a waste of time
  16. 16. Data excuse bingo
  17. 17. We are trying to start at the end We should begin at the beginning - a stitch in time and all that…
  18. 18. In conversation with Michael Ball BBSRC • Disciplines themselves must establish ways of dealing with data – This is the beginning of an ongoing process • Researchers need to consider how to deal with data from the beginning of a research project • You can ask for money to manage data in the grant application
  19. 19. Research data management • The practice of sharing data requires the data to be: – Accessible – Intelligible – Assessable – Reusable
  20. 20. Some of it is really obvious • How many of you: – Use a file naming protocol? – Ensure all your laptops are backed up? – Have written a data management plan for your current project? – Have determined who in the team owns the data? • PS: this last one REALLY matters
  21. 21. Skillsets required for managing and curating data
  22. 22. Lots of jobs…
  23. 23. Issues with sharing data Both with sharing and not sharing
  24. 24. Issues raised by researchers • There is a very real concern that the UK will become unattractive for collaborations • Researchers are discussing changing the type of research being done to reduce the amount of data being produced • There is discussion in some circles about whether applying for EPSRC funding is worth the hassle
  25. 25. Consequences of not sharing data • Medicine – Having the data publicly available in two trials of deworming pills demonstrated that a population-wide deworming program did not improve school performance • Economics – A study widely cited to justify budget cutting in the US had a mistake in the calculations which was only revealed when the Excel file was released – the-excel-error-that-changed-history • Physics – It took 12.5 years to withdraw Jan Hendrik Schön’s work on ‘organic semiconductors’ because the reviewers were unable to replicate the results without access to the original data or lab books – ss_physics_fraud_gets_last_laugh_whole_book_about_himself
  26. 26. Questions? Dr Danny Kingsley Head of Scholarly Communication University of Cambridge Twitter: @dannykay68