RDAP13 Cerys Willoughby: Towards a global open scientific notebook infrastructure


Cerys Willoughby, University of Southampton

Jeremy Frey, Andrew Milsted, Simon Coles, Colin Bird, Cerys Willoughby, Cameron Neylon and Matthew Todd: “Towards a global open scientific notebook infrastructure”

Panel: Global scientific data infrastructure
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13

  • This talk discusses applications of work that originated in Southampton on the development of electronic laboratory notebooks (ELNs) to support collaborative investigations, illustrated by work undertaken at Southampton, the ISIS neutron facility (Neylon) and the University of Sydney (Todd). The work comes out of e-Science funding (the CombeChem Project) from RCUK (Research Councils UK) [e-Science maps to Cyber-Infrastructure in the USA], further developed with funding from the Universities Modernization Fund, as collaborative R&D between chemistry, computer science and the library.
  • The Open Access debate has been high profile, but it has been primarily an economic argument. From our perspective the question is: open access to what? We are interested in access to the data, hence the role of data management plans. The Royal Society report is key, as it stresses that access to the data is essential to the whole basis of science: it enables other researchers to build on published work, which is much harder, and can be impossible, if the data is not available (and easier if it is freely available). This holds only if the data is comprehensible, so intelligent access is highlighted as necessary (i.e. the importance of metadata).
  • Infrastructure needs to support the collection and curation of data for high quality dissemination with context and provenance. Infrastructure parallels the DIKW (Data, Information, Knowledge, Wisdom) hierarchy.
  • Having the ELN leads to changes in behaviour.
  • Development of the ELNs involves a trade-off in the effort devoted to Semantics, Usability and IP, building these up over time, as shown by our Smart Tea and LabTrove projects.
  • The LabTrove system – designed to be quite easy to use for open and closed projects. It allows and encourages the use of metadata but does not require or enforce it, an approach needed for adoption. Open Source software, with hosting and advice services.
  • Skip this slide – LabTrove was further developed under the SRF project
  • Process is important! As important as the Data. We need to describe it, as we can’t all “visit” – global tea room [Chemists are big on tea rooms]
  • Images are important: users can add sketch comments as well as text comments, and notes are highly linked. For example, from a record (post) about a substrate one can trace which processes used this substrate and what results were then produced, so if it transpires there was an issue with the material, the consequences can be readily traced.
  • Computational processes can “blog” as well. A Matlab script can be run from a publish script so that all aspects (data, code, figures, output) are added to a Trove, giving full provenance of a figure/result, so a clear record is kept of what material generated what outputs. Very useful once students have left and figures need modifying for a paper.
  • Comments on computational models – in this case GODIVA is a way to show ocean models over the web (University of Reading), and with LabTrove added people can comment on geo-coded regions of the model results and have the video in the post – metadata is taken from the models and put in the Trove.
  • Just shows the use in the x-ray project… computationally intensive image reconstruction in a complex, multi-disciplinary project, use of timelines, I have this to show that my work is grounded in physical science as well as computer science. You may want to stress your background in usability which is as we know so important to actually making this all work
  • Examples from USyd of the Open Notebook science use in malaria drugs. Enables global collaboration, link back to notebook from the publications, has industrial participation, links with other platforms (wiki etc). Pictures of the research are really useful.
  • Social media to disseminate open research, links to Twitter, and perhaps Facebook etc, make sure metadata is good enough for search engines to find, perhaps need some specialist metadata for research findings, researcher and funder ids are certainly useful!
  • Attribution requires similar infrastructure to security, so switching between Open Notebook Science, Open on Publication, and Closed (i.e. industrially funded private research) is not hard: in industry the work may not be public but often does need to be shared within the company, so similar issues to Open Science apply.
  • Well more rapidly and more efficiently, but is viewed by many as a problem when it comes to establishing reputation and advancement in career or potential financial gain. Open does not mean free: perhaps free at the point of use, but someone has paid for the work and is paying to maintain the access. Could comment that collective action across the long tail of laboratory science needs the global collaboration that semantics + the web (not necessarily the formal semantic web) provides.
  • Attitudes to undertaking research need to change so that when data is collected the assumption is that it will be shared (at some point) and that collaboration is essential for rapid progress – don’t wait until it is right before you share, at least with your collaborators. This is something students seem to resist, not understanding that sharing and discussing is the best way to find out what is right.

    1. Towards a global open scientific notebook infrastructure. Jeremy Frey, Andrew Milsted, Simon Coles, Colin Bird, Cerys Willoughby, Cameron Neylon & Matthew Todd
    2. Science is increasingly interdisciplinary
    3. Infrastructures - Architecture: Collaboration, Sharing, Curation, Reuse
    4. Electronic Laboratory Notebooks (ELNs): Communication, Collaboration, Sharing, Linking, Curating. Comparison with traditional paper notebooks: • Higher Quality Record • Natural linking to data and external resources • Easier Collaboration • Improved planning • Improved discussions • Efficiency gain in production of presentations/reports • Change the nature of Professor/Student interactions
    5. Developments in ELN implementation and characteristics [timeline figure, axis 1980, 1990, 2000, 2010; labels: Commercial offerings, Web 2.0, LabTrove, Smart Tea, Semantics, PNNL, User focus, Collaboration, RS/1, Trust in ELNs for IP compliance]
    6. The LabTrove story http://www.labtrove.org
    7. How do we communicate? “If you can't describe what you are doing as a process, you don't know what you're doing.” W. Edwards Deming • Surprisingly difficult to explain what a process involves • Much of the detail is assumed to be understood and not explicitly discussed • This is where the misunderstandings usually arise. Growing need for the global (virtual) equivalent of the “Tea Room”
    8. LabTrove: Easy Communication
    9. AutoTrove from Matlab: Computational processes also blog
    10. BlogMyData Project - Godiva
    11. LabTrove Open Notebooks: Mat Todd’s PZQ Project
    12. Open Notebooks • Troves can be open Read/Comment/Write – Can control this access so it is your choice • All contributions attributable (login needed) – Anonymous contributions not usually enabled • Open contribution does worry the IT services – Provides potential pathway for abuse of systems – Not just our systems
    13. Global open scientific notebook infrastructure • Global collaboration: – International – Interdisciplinary • Open science • To ascend the knowledge pyramid, we need open collaboration and sharing of results
    14. We must speed up the knowledge discovery process. “All I am saying is that now is the time to develop the technology to deflect an asteroid”
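
The speaker notes describe highly linked notebook posts: starting from a record about a substrate, one can trace which processes used it and which results followed, so the consequences of a bad batch can be found. The sketch below illustrates that idea only. It is not LabTrove code or its API; the post names, the `POSTS` mapping and the `downstream` function are all hypothetical, and it assumes each post simply lists the posts it used as inputs.

```python
from collections import deque

# Hypothetical notebook posts (not real LabTrove data):
# each post maps to the list of posts it used as inputs.
POSTS = {
    "substrate-A": [],
    "reaction-1": ["substrate-A"],
    "reaction-2": ["substrate-A"],
    "nmr-result-1": ["reaction-1"],
    "figure-3": ["nmr-result-1"],
    "substrate-B": [],
    "reaction-3": ["substrate-B"],
}


def downstream(posts, source):
    """Return every post that depends, directly or indirectly, on `source`."""
    # Invert the "uses" links so we can walk from a material to its consumers.
    used_by = {name: [] for name in posts}
    for name, inputs in posts.items():
        for parent in inputs:
            used_by[parent].append(name)

    # Breadth-first walk over the inverted links, nearest consumers first.
    affected = []
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in used_by[current]:
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


if __name__ == "__main__":
    # If substrate-A turns out to be suspect, list everything built on it.
    for name in downstream(POSTS, "substrate-A"):
        print(name)
```

A breadth-first walk is used here so that the processes that consumed the material directly are listed before the results and figures derived from them; any graph traversal would serve equally well.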