Capturing Process

A talk given at the Unilever Centre for Molecular Informatics, Chemistry, Cambridge University on 12 May 2009. The talk covers capturing research processes and objects, taking inspiration from linked open data and distributed version control systems. Live-blogged by Nico Adams at http://wwmm.ch.cam.ac.uk/blogs/adams/?p=249

Published in: Technology, Design

  1. Richard Grant Mat Todd Hide Plausible Accuracy Pedro Beltrao John Branwen Rich Apodaca Dupuis Neil Saunders Steve Wilson Simon Coles Noel Tony Hey Pawel Szcsesny Richard Akerman Gorelick Dave de Roure Jon Tim O’Reilly Victoria Stodden Jeremy Frey ISIS LSS Group Udell Jean-Claude Bradley Jeremiah Faith Martyn Bull Michael Barton John Cumbers Clay Shirky Bora David Crotty Helen Egon Willighagen Brian Kelly Tony Williams Tim O’Reilly Berman Zivkovic Maxine Clarke Andrew Michael Nielsen Frank Mitch Martin Fenner Milsted Jenny Rohn Norman Waldrop Wilson Greg Yaroslav Nikolaev Iain Emsley Rafael Sidi Lee Smolin Lorie LeJeune Jonathan Hooker Bill Timo Hannay Gray Ken Shankland Paulo Nuin Deepak Singh Shirley Wu Liz Lyons PLoS STFC Friendfeed Peter Binfield Benjamin Good Dorothea Salo Peter Murray-Rust Richard Akerman Jen Dodd Chad Orzel Lakshmi Shastry ISIS Computing Group Jon Eisen Jenny Hale SciFoo 2008 Flanagan Bill Matt Wood Michael Eisen Jon Tansley Victor Henning Google Björn Brembs campers Rufus Pollock John TIM HUBBARD Gavin Bell Andy Powell Harry Collins Wilbanks Mike Ellis Garret Lisi DUNCAN HULL Euan Adie Peter Suber Gavin Baker The BioGang Sabine Hossenfelder Paul Walk Flickr Kevin Kelly Kaitlin Thaney Richard Curry Atilla Csordas Ian Mulvaney
  2. Capturing Process: in silico, in the lab, and all the messy in-betweens
  3. [Diagram] Laboratory procedures | Computational procedures. Procedure, Experiment, Analysis. Data, Data. Material(s), Sample(s). Physical objects | Digital objects
  4. http://www.flickr.com/photos/halfchinese/113968722 CC-BY
  5. Data is dynamic... http://www.flickr.com/photos/idletype/282855293/ CC-BY
  6. Inspiration from coding best practice: repositories for storage/backup; strong record of who and when; roll-back, diffs, and reversion; testing as part of the process; scripting for solid replication
  7. Working independently... http://www.flickr.com/photos/tswicegood/3233621766/ CC-BY-SA
  8. ...data integration http://www.flickr.com/photos/tbisaacs/3087193160/ CC-BY
  9. ...but commits are freetext
  10. A DVCS can provide who, when, what, and the differences between versions. But it doesn’t provide the relationships between objects...
  11. Have a good provenance trail... http://www.flickr.com/photos/a4gpa/195354385 CC-BY-SA
  12. ...but not a good map of how that relates to everything else http://www.flickr.com/photos/normanbleventhalmapcenter/2674855383 CC-BY
  13. If we have the map... ...if we capture the connections
  14. http://is.gd/thVr
  15. ...and on to a semantic web of data
  16. ...but what about in here? http://www.flickr.com/photos/mararie/2151361243 CC-BY-SA
  17. Lab book as a journal... http://www.flickr.com/photos/nbachiyski/2186228572 CC-BY
  18. Blog as journal...
  19. Description, date categorisation, objects, identity, accessibility... ...not of much interest to most people
  20. http://biolab.isis.rl.ac.uk/projects/blog/
  21. http://is.gd/thMB
  22. http://is.gd/thMB
  23. [Diagram] Laboratory procedures | Computational procedures. Procedure, Experiment, Analysis. Data, Data. Material(s), Sample(s). Physical objects | Digital objects
  24. A web of objects...
  25. A web of objects...
  26. A web of objects... ...and the process that connects them
  27. ...but still not semantic
  28. Tagging goes some way... ...but how to enforce tagging?
  29. Templates create a virtuous circle [table] [row] Lane[col]Sample[col]ul [/row] … [row] 4[col][[Dna:%]][col][[box]] [/row] … [/table] [[Section>Procedure]] [[Procedure_Type>electrophoresis_agarose]] [[Sandpit_group>DrexelDemo]]
  30. Templates create a virtuous circle [table] [row] Lane[col]Sample[col]ul [/row] … [row] 4[col][[Dna:%]][col][[box]] [/row] … [/table] [[Section>Procedure]] [[Procedure_Type>electrophoresis_agarose]] [[Sandpit_group>DrexelDemo]]
  31. Templates create a virtuous circle [table] [row] Lane[col]Sample[col]ul [/row] … [row] 4[col][[Dna:%]][col][[box]] [/row] … [/table] [[Section>Procedure]] [[Procedure_Type>electrophoresis_agarose]] [[Sandpit_group>DrexelDemo]]
  32. Self-assembling ontology? Sequence Ontology: SO:0000696 “oligo”, SO:0000155 “plasmid” ...but... SO:0000006 “PCR product” or SO:0000412 “rest. fragment”? Mixing up the process of production and the material type?
  33. We need a robust ontology or controlled vocabulary for experiments... ...but with that in hand http://www.flickr.com/photos/peterkaminski/5444915 CC-BY
  34. We can build a semantic web of objects ...and the processes that connect them
  35. Linked open data and linked open objects http://is.gd/thVr
  36. Building for the future? http://www.flickr.com/photos/blahflowers/1382374610 CC-BY-SA
  37. Capture it at source... ...in context http://flickr.com/photos/jason_burmeister/2053139930 CC-BY
  38. Capture as much as possible automatically (slide adapted from original by Simon Coles)
  39. In silico capture the process step by step... http://www.flickr.com/photos/stevoarnold/2787234769 CC-BY
  40. In silico capture the process step by step... ...one way or another the semantics can be baked in http://www.flickr.com/photos/stevoarnold/2787234769 CC-BY
  41. In the lab capture each object as it is created...
  42. In the lab capture each object as it is created... ...and capture the plan and track execution step by step
  43. Plan = Template = Minimal Information Foo = Semantics
  44. Data repositories... ...as easy to use as Flickr
  45. More natural interfaces... http://www.flickr.com/photos/bekathwia/2910518374 CC-BY-SA
  46. More natural interfaces... ...to capture and communicate http://www.flickr.com/photos/bekathwia/2910518374 CC-BY-SA
  47. “...Pages from a project need to be linked in a 3D web of relevance...I want to be able to annotate a...collaborator's work by drawing on it...as I would write on [their] whiteboard...” (Mat Todd, http://is.gd/yVQK) http://www.flickr.com/photos/andypowe11/2938538086 CC-BY
  48. But who (and what) can you trust? http://www.flickr.com/photos/joi/2941559903 CC-BY
  49. We trust people... ...not objects
  50. A semantic social web of objects (and data, and process and...)
  51. (Some of) the people I trust... ...in different ways and for different things
  52. http://friendfeed.com
  53. Code, Data, Sample, Process http://friendfeed.com/lists/isisbiolab
  54. “Data finds the data, then people find people.” (Jeff Jonas/Jon Udell via Deepak Singh)
  55. It’s the objects that are the centre of the social interaction and not the people
  56. But that can only work if these objects are...
  57. http://flickr.com/photos/virtualsugar/316200555/ CC-BY
  58. Connected research changes the playing field
  59. Connected research changes the playing field ...availability of resources key
  60. We need to capture objects as they are created...
  61. We need to capture objects as they are created... ...and to capture their relationships
  62. The rest we can build bit by bit as we go
  63. “Communicate first, standardize second.” (Jean-Claude Bradley)
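
Slides 60 to 62 ask for objects to be captured as they are created, together with their relationships, and slide 10 notes that version control records who, when, and what but not how objects relate. A minimal sketch of that idea follows. It is not the system shown in the talk: the `ResearchObject` and `Notebook` classes, the relation labels "used" and "generated_by", and the sample identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ResearchObject:
    """A sample, data file, or procedure, recorded when it is created."""
    identifier: str
    kind: str        # free-text label, e.g. "sample", "data", "procedure"
    creator: str     # who
    created: datetime  # when


@dataclass
class Notebook:
    """Hypothetical store of objects plus the links between them."""
    objects: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (subject, relation, target)

    def capture(self, identifier, kind, creator):
        """Record an object with its creator and a UTC timestamp."""
        obj = ResearchObject(identifier, kind, creator,
                             datetime.now(timezone.utc))
        self.objects[identifier] = obj
        return obj

    def relate(self, subject, relation, target):
        """Record a typed link between two already-captured objects."""
        for identifier in (subject, target):
            if identifier not in self.objects:
                raise KeyError(f"unknown object: {identifier}")
        self.links.append((subject, relation, target))

    def upstream(self, identifier):
        """Return every object reachable by following links outward."""
        seen = set()
        stack = [identifier]
        while stack:
            current = stack.pop()
            for subject, _relation, target in self.links:
                if subject == current and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


# Illustrative usage: a gel run uses a DNA sample and produces an image.
nb = Notebook()
nb.capture("dna_sample_4", "sample", "alice")
nb.capture("gel_run_1", "procedure", "alice")
nb.capture("gel_image_1", "data", "alice")
nb.relate("gel_run_1", "used", "dna_sample_4")
nb.relate("gel_image_1", "generated_by", "gel_run_1")
print(sorted(nb.upstream("gel_image_1")))  # ['dna_sample_4', 'gel_run_1']
```

The relation labels here are plain strings, which mirrors the "communicate first, standardize second" position: the links are captured immediately, and a controlled vocabulary could be mapped onto them later.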