User engagement in research data curation


Published in: Education

  1. User engagement in research data curation
     Stuart Macdonald, EDINA National Data Centre, University of Edinburgh
     Luis Martinez-Uribe, Oxford e-Research Centre, University of Oxford
     ECDL Corfu, 30 September 2009
  2. Data deluge
     • An updated IDC white paper reported that the digital universe in 2007 was 281 exabytes and in 2011 should be 1,800 exabytes (or 10 times that produced in 2006).
     • *"The Diverse and Exploding Digital Universe: an updated forecast of worldwide information growth through 2011" (Mar. 2008)
     BBSRC strategic plan (2010-2015) consultation document
  3. Research data definitions
     • The US Office of Management and Budget defines research data as "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings"
     • Words, pictures, numbers, sounds
     • Workflows, methodologies, protocols, standard operating procedures, instrumentation, models, questionnaires, code books, set-up files, algorithms, transcripts
  4. Growing importance of curating research data
     • "it is becoming increasingly clear that effective and efficient management and reuse of research data will be a key component in the UK knowledge economy in years to come, essential for the efficient conduct of research …."
     • *JISC (2008) "Identifying the benefits of curating and sharing research data"
     • Research methods experiencing a radical transformation
     • New tools & infrastructures generating research data
     • New ways to use, share and re-use
  5. Data deposition and publication
     • Departmental websites
     • Domain-specific repositories
     • Centralised data repositories (UKDA, NERC, MRC)
     • Libraries and computing/IT services within academic institutions working together to develop and customise institutional repositories to curate research data
  6. Researchers – key user community overlooked
     • Institutional repositories:
       - open access
       - built for academic publications
       - technology led
     • No formal requirements analysis procedures
     • User engagement required to develop systems that will meet researchers' needs
     • Bottom-up approach to inform top-down thinking
  7. Open data – realism versus altruism
     • DISC-UK DataShare: legal, cultural and technical issues surrounding the sharing of research data in institutional settings
     • Barriers to sharing:
       - time taken to prepare datasets for deposit
       - concerns over making data available before full academic exploitation
       - misuse / misinterpretation (journalists, non-academics)
       - loss of ownership, loss of commercial or competitive advantage
       - concern that repositories will cease to exist
       - unwillingness to change working practices
       - uncertainty about IPR and confidentiality
  8. RIN-funded Disciplinary case studies
     • Charting individual researchers' information practices across 7 sub-disciplines of the life sciences
     • DCC / ISSTI (University of Edinburgh)
     • Deployed a range of methodologies and tools, including short-term ethnographic techniques and semi-structured instruments:
       - Diaries (x55)
       - Face-to-face interviews (x24)
       - Cognitive mapping (1 per case)
       - Focus groups (1 per case)
  9. Some findings from the RIN Disciplinary case studies project:
     • Some disciplines lend themselves more than others to 'open' data sharing
     • Research data are varied, specific and complex
     • Data curation and/or sharing only becomes crucial at certain stages of the research lifecycle
     • Feeling that only researchers have the subject knowledge to curate their own data
     • Keen sense of 'ownership' and protectiveness towards data
  10. Scoping digital repository services for research data management
     • Scope requirements for services to manage research data generated by Oxford researchers from a variety of disciplines:
       - Interviews (x37) conducted to learn about data management practices and identify top requirements
       - Workshop (x46) held to complement findings and to gather examples of good practice regarding use of repository services for research data management
       - Consultation with service units (ORA, Data Library, NGS, Oxford Digital Library) to identify gaps in service and validate researchers' requirements
  11. Scoping digital repository services - top requirements
     • Advice on practical issues related to managing data across their life cycle, incl. data management plans and assistance with formatting
     • Secure storage required for large datasets generated by high-throughput instruments
     • Sustainable & authenticated infrastructure that allows publication and long-term preservation of research data
     • It is now followed up by the intra-institutional, JISC-funded Embedding Institutional Data Curation Services in Research (EIDCSR) project
  12. Tools – Data Audit Framework
     • DAF helps to establish relationships with research communities around the issues of data curation
     • Allows institutions to identify, locate, describe and assess how they are managing their research data
     • Provides information specialists who wish to extend support for research data with a vehicle for engaging with researchers, e.g. through local research data management training
     "staff had numerous comments and suggestions for improvement of data management at different levels indicating an awareness of the issues, even where it had not been made a priority to address" - Edinburgh Data Audit Implementation project
  13. Summary
     • Repository development distant from current research needs
       - due to lack of iterative requirements analysis with researchers
     • Open data ethos detached from disciplinary research needs
     • Trusted relationships
       - dialogue with researchers early in research process
  14. Thank you
     • [email_address]
     • All images - creative commons courtesy of Flickr
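
As a rough worked check of the figures quoted on the "Data deluge" slide, the short sketch below derives the growth they imply. It uses only the two numbers cited from the IDC white paper (281 exabytes in 2007, 1,800 exabytes forecast for 2011). The variable names are illustrative, and reading "10 times that produced in 2006" as a 2006 figure of about 180 exabytes is an assumption, not something the slides state directly.

```python
# Figures quoted on the "Data deluge" slide (IDC white paper, March 2008).
size_2007_eb = 281    # digital universe in 2007, in exabytes
size_2011_eb = 1800   # forecast for 2011, in exabytes

years = 2011 - 2007

# Overall multiplication factor between the two quoted figures.
growth_factor = size_2011_eb / size_2007_eb

# Compound annual growth rate implied by that factor.
annual_rate = growth_factor ** (1 / years) - 1

# Assumption: "10 times that produced in 2006" means 2011 = 10 x the 2006 figure.
implied_2006_eb = size_2011_eb / 10

print(f"Overall growth 2007-2011: {growth_factor:.1f}x")
print(f"Implied compound annual growth: {annual_rate:.0%}")
print(f"Implied 2006 figure: {implied_2006_eb:.0f} EB")
```

On these figures the forecast corresponds to growth of a little over sixfold in four years, or close to 60% per year compounded.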