Research data: burden or treasure? (Talk from #fote13)

A talk at #fote13 (fote-conference.com) about why we should *all* - as taxpayers - care about reuse of research data

Published in: Technology

  • External pressures include those from funders such as EPSRC. Looming deadlines this year and in 2015 have caught the attention of senior university management.
  • The expectations that universities need to sign up to are listed here; their roadmaps need to demonstrate how they will deliver on these expectations by 2015. They include a commitment to keep data for 10 years after its last use (note: not just after the project ends). Some worry that this means they will need to keep data for 100 years. I say that if your data is still being used (and cited) 100 years later, you should break out the champagne, not worry about paying for it.
  • Research data: burden or treasure? (Talk from #fote13)

    1. Research data: burden or treasure? Kevin Ashley, Digital Curation Centre, www.dcc.ac.uk, @kevingashley, Kevin.ashley@ed.ac.uk. Reusable with attribution: CC-BY. The DCC is supported by Jisc & FP7.
    2. 164 universities in UK* (*2011 HESA data). 71 (43%) > 5% research income; 115 (70%) > £1m income from research. 2013-10-11 Kevin Ashley – FOTiE 2013 - CC-BY
    3. £4.4 billion total research grants
    4. Funders are making demands
    5. EPSRC expects all those institutions it funds to develop a roadmap that aligns … with EPSRC’s expectations by 1st May 2012; to be fully compliant … by 1st May 2015. http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
    6. • Awareness of regulatory environment • Data access statement • Policies and processes • Data storage • Structured metadata descriptions • DOIs for data • Securely preserved for a minimum of 10 years from last use. 2012-06-15 Kevin Ashley, DCC; IRWM12, ULCC; CC-BY
    7. How much data do we have? • Edinburgh – provision for 5 petabytes • Oxford – guessing 3 PB/year • For comparison – LHC @ CERN – 15 PB/year • £2m investment in storage not unusual
    8. The Data Deluge is upon us. Sensors’ ability to produce data outstrips IT’s ability to process it
    9. Research Data Centres – the solution! Many areas of research have no data centre to serve them
    10. Cloud – sorted! • Sorry, but it isn’t. • See David Rosenthal’s analysis of the economics of Amazon for preservation: “Distributed digital preservation in the cloud”, IJDC 8(1), 2013, doi:10.2218/ijdc.v8i1.248
    11. Cost of data for 100 years – local vs Amazon S3. Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence
    12. Cost of data for 100 years – local vs Amazon S3 AND Glacier. Data from blog.dshr.org/2013/01/talk-at-idcc2013.html © David Rosenthal, used under CC-BY-SA licence
    13. That looks like a problem • Funder requirements exist for a reason: – That data is valuable • Value to funder, society from reuse • Value to the institution is there also. BIS business case: £1.5m investment in research data services pays back 2.5 times after 5 years
    14. Integrity • Not everyone publishes here • Almost all fraud connected to unavailable data • People suffer & die due to research fraud • When your research is reproducible – it gets cited
    15. Citability • Making data available increases citations • Everyone – academic, funder, institution – loves citations • Want evidence? – Alter, Pienta, Lyle – 240%, social sciences * – Piwowar, Vision – 9% (microarray data) † – Henneken, Accomazzi – 20% (astronomy) # * Amy Pienta, George Alter, Jared Lyle (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307 † Piwowar H, Vision TJ (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1 http://dx.doi.org/10.7287/peerj.preprints.1v1 # Edwin Henneken, Alberto Accomazzi (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618
    16. Value in the institution • New research depends on the old – well managed data resources like well-equipped labs • Teaching more effective when real data from research is used
    17. Wherever it is, it has value. Want a 400% -> 1200% return on your investment? Try BADC! http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx
    18. Commercial services
    19. Can we find it? • Data must be discoverable to be reused • Alone, or in conjunction with publication • Institutional catalogues, national data registries – JISC is piloting through DCC
    20. (no transcript text)
    21. (no transcript text)
    22. Jisc – through DCC – can help
    23. Up-skilling for data. Choice of RDM training materials for librarians: http://dataintelligence.3tu.nl/en/home/ and http://datalib.edina.ac.uk/mantra/libtraining.html
    24. Idea Develop Fund Plan Record Process Publish Read
    25. Idea Develop Fund Plan Record Process Publish Read Idea Develop Fund Plan Record Process Publish Read
    26. Idea Develop Fund Plan Record Process Publish Read
    27. Data reuse stories • The palaeontologist who saved years of work with archaeological data • The ‘noise’ from research radar that mapped dust from Eyjafjallajökull • The 19th-century logs and photographs that help us model climate change. Often your data tells stories that your publications do not
    28. 3TU treasure chest
    29. Thanks for your attention. kevin.ashley@ed.ac.uk, www.dcc.ac.uk, @kevingashley
    30. DCC ‘institutional engagement’ • Assess needs • Make the case • Develop support and services • RDM policy development • Customised Data Management Plans • DAF & CARDIO assessments • Guidance and training • Workflow assessment • DCC support team • Advocacy with senior management • Institutional data catalogues • Pilot RDM tools • …and support policy implementation
