To share or not to share? Researchers' perspective on managing and sharing data


Published on

Presented by RIN Director Michael Jubb at UCL e-Content Conference 12 May 2010, London.

Published in: Education
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

To share or not to share? Researchers' perspective on managing and sharing data

  1. 1. To Share or not to Share? researchers’ perspectives on managing and sharing data <ul><li>Michael Jubb </li></ul><ul><li>E-content 2010 </li></ul><ul><li>UCL, 12 May 2010 </li></ul>
  2. 2. data and information in the research process: some verbs <ul><li>gather </li></ul><ul><li>evaluate </li></ul><ul><li>create </li></ul><ul><li>analyse </li></ul><ul><li>manage </li></ul><ul><li>transform </li></ul><ul><li>present </li></ul><ul><li>communicate </li></ul><ul><li>disseminate </li></ul>
  3. 3. But………. <ul><li>gathering, creating, evaluating data not usually the primary object of research </li></ul><ul><ul><li>few career rewards from sharing data </li></ul></ul><ul><li>how it’s done in different disciplines varies greatly </li></ul><ul><ul><li>big science not the norm </li></ul></ul>
  4. 4. the research process: animal genetics
  5. 5. research process: transgenesis & embryology
  6. 6. research process: epidemiology
  7. 7. research process: neuroscience
  8. 8. why share? <ul><li>researchers as users </li></ul><ul><li>completeness of the scholarly record </li></ul><ul><ul><li>validation of results </li></ul></ul><ul><li>re-use and integration </li></ul><ul><ul><li>exploit what’s already been done </li></ul></ul><ul><ul><li>avoid duplication of effort </li></ul></ul><ul><li>researchers may have different interests as creators and users </li></ul>
  9. 9. creators: motivations and constraints? <ul><li>evidence of benefits </li></ul><ul><li>citation esteem and successful evaluations </li></ul><ul><li>explicit rewards </li></ul><ul><li>altruism </li></ul><ul><li>reciprocity </li></ul><ul><li>enhanced visibility </li></ul><ul><li>cultural/peer pressures </li></ul><ul><li>opportunities for collaboration, co-authorship </li></ul><ul><li>easy-to-do </li></ul><ul><li>no clear benefits/incentives </li></ul><ul><li>competition; resistance to sharing intellectual capital </li></ul><ul><li>desire for/fear of commercial exploitation </li></ul><ul><li>access restrictions desired or imposed </li></ul><ul><li>legal, ethical problems </li></ul><ul><li>lack of time, funds, expertise </li></ul><ul><li>sheer size of datasets </li></ul><ul><li>nowhere to put it </li></ul>
  10. 10. ownership, protection and trust <ul><li>responsibility, protectiveness and desire for control over data </li></ul><ul><ul><li>concerns about inappropriate use </li></ul></ul><ul><li>preference for co-operative arrangements and direct contact with potential users </li></ul><ul><li>decisions on when and how to share </li></ul><ul><ul><li>commercial, ethical, legal issues </li></ul></ul><ul><li>lack of trust in other researchers’ data </li></ul><ul><ul><li>“ I don’t know if they have done it to the same standards I would have done it” </li></ul></ul><ul><li>lack of standardisation </li></ul><ul><li>intricacies of experimental design and processes </li></ul>
  11. 11. Training and Skills “ This has been identified in every study as a major problem, both training researchers to be e-researchers, and training the people running the systems to deal with researchers, and to understand the technology”
  12. 12. researchers <ul><li>“… a lot of scientists don’t get information and structures at all. It’s not what they’re trained to think about.” </li></ul><ul><li>“…… ..the idea of quality, provenance, and metadata about data is woefully inadequate in most science training.” </li></ul><ul><li>“…… it’s not just about creating the infrastructure, it’s actually about creating the demand for the infrastructure, because……there’s still a lack of demand.” </li></ul><ul><li>engagement between researchers and data management professionals </li></ul><ul><ul><li>“…… .now we manage our data, whereas before we didn’t” </li></ul></ul><ul><li>scalability in training and support, and in promoting cultural change </li></ul>
  13. 13. curators <ul><li>concern about low numbers of people with specialist expertise </li></ul><ul><li>people from two kinds of backgrounds </li></ul><ul><ul><li>library/information professionals </li></ul></ul><ul><ul><li>researchers </li></ul></ul><ul><li>need for co-ordination of effort and funding for capacity-building </li></ul><ul><ul><li>much depends at present on short-term project funding </li></ul></ul><ul><li>lack of career structure </li></ul><ul><li>“ Who’s training these people? We need training at the professional level for people who are actually going to run these data centres” </li></ul><ul><li>“… .the career structure for those people with expertise is miserable, because the number of places they can work is not large, and the universities don’t treat them as key staff” </li></ul><ul><li>“ So there’s a real danger of losing people to the private sector” </li></ul>
  14. 14. curation and sharing <ul><li>little sign that data management or curation yet adopted as standard practice </li></ul><ul><ul><li>except in areas such as astronomy, bioinformatics, genomics etc </li></ul></ul><ul><li>other kinds of information more readily shared </li></ul><ul><ul><li>software, code, tools, protocols etc </li></ul></ul><ul><li>challenges for service providers in meeting diverse needs of wide range of research groups </li></ul><ul><ul><li>disciplinary and subject differences </li></ul></ul><ul><ul><li>subject knowledge </li></ul></ul><ul><ul><li>local, national and international </li></ul></ul><ul><li>relationships and engagement between researchers and information specialists </li></ul>
  15. 15. leadership and co-ordination? <ul><li>co-ordination between different funding bodies </li></ul><ul><ul><li>Research Councils, Higher Education funding bodies, JISC, universities </li></ul></ul><ul><li>clarity about roles and responsibilities </li></ul><ul><ul><li>piecemeal initiatives with limited take-up and impact </li></ul></ul><ul><li>difficulty in establishing a single locus of responsibility </li></ul><ul><li>disciplinary and institutional dimensions of scale and complexity </li></ul><ul><li>“… .we need a more co-ordinated strategy and real leadership to take things forward.” </li></ul><ul><li>“…’s very easy in the current framework to pass the buck and do nothing.” </li></ul><ul><li>“ It would be good to have some national view of process for deciding who is responsible. It’s not clear what routes you have to go through to decide should we do this.” </li></ul><ul><li>“ Things are funded in silos. So I just don’t think there is really a national strategy.” </li></ul>
  16. 16. drivers and leaders? <ul><li>agreement that development of the infrastructure should be “driven by the science” </li></ul><ul><li>securing engagement, commitment, time and resources from research leaders </li></ul><ul><li>dangers of “solutions looking for problems” </li></ul><ul><ul><li>“…… soon as you create a facility that isn’t focused on the science, it develops a life of its own.” </li></ul></ul><ul><ul><li>“…… .you have to work pretty hard to demonstrate there’s a business case for reuse of data………there’s no point in paying to curate and store data if nobody ever does use it again.” </li></ul></ul><ul><li>need for careful management of relationships between specialists and researchers </li></ul>
  17. 17. top-down and/or bottom-up? <ul><li>bottom-up </li></ul><ul><ul><li>develop services locally in response to what researchers themselves want </li></ul></ul><ul><ul><li>develop tools and environments within universities to equip the research community with appropriate processes and skills </li></ul></ul><ul><li>top-down </li></ul><ul><ul><li>establish national body/programme to catalyse change required for sustainable and ubiquitous service </li></ul></ul><ul><li>preferred solution in UK seems to be small-scale initiatives working together </li></ul><ul><ul><li>will that work and will it lead to sustainable services? </li></ul></ul>
  18. 18. some key messages <ul><li>different kinds of data, different values attached to them, different user needs </li></ul><ul><li>the sustainability challenge: co-operation needed between researchers, funders, institutions </li></ul><ul><li>scope for publishers to promote access </li></ul><ul><ul><li>and need for clarity on text mining </li></ul></ul><ul><ul><li>and monitoring of Web 2.0 </li></ul></ul><ul><li>need for incentives for researchers </li></ul><ul><ul><li>support and reward for good practice </li></ul></ul><ul><li>benefits and evidence of value </li></ul><ul><ul><li>scholarly record </li></ul></ul><ul><ul><li>re-use and aggregation </li></ul></ul><ul><li>policies and strategies that don’t take account of of all these issues will have limited success </li></ul>
  19. 19. Thank you Michael Jubb