How SADI & SHARE help restore the Scientific Method to in silico science

This is my presentation to the Bioinformatics Open Source Conference (BOSC) in Boston, July 2010. I start with a brief status update on the BioMoby project and then launch into a series of demonstrations of its successor, SADI + SHARE. Rather than discussing how SADI/SHARE work, I focus on the role I think these technologies can play in bringing the traditional "scientific method" back into in silico biology.

  1. From BioMoby to SADI
     The Quest for the Holy Grail!
  2. BioMoby Stats in a nutshell
     >1800 services worldwide (~1300 “alive” at any given time)
     4 major installations of the Moby Service registry:
       Genome Canada, SUN Center of Excellence, Calgary
       Genome España, Barcelona Supercomputing Center
       International Rice Research Institute, Philippines
       Max Planck, Cologne
     Canadian service registry brokers ~400,000 requests/month
     Canadian BioMoby services receive ~700,000 uses/month
     Canadian server just had a significant memory upgrade to improve performance
     “The report of my death was an exaggeration” -- Mark Twain
  3. Model Organism Bring-Your-Own Database Interface Conference
     “MOBY-DIC”
     Emma Lake, Saskatchewan
     Sept 21, 2001
  5. Are we going after The Holy Grail here?
  6. The Holy Grail (this slide created circa 2002):
     Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
     Retrieve and align 2000 nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  7. http://sadiframework.org
     Microsoft Research
     Founding partner
  8. Holy Grail Demo #1
  9. Imagine there is a “virtual database” containing all of the data from all of the databases, together with the output of every conceivable analysis
  10. How do we query that database?
  11. “SHARE”
      Semantic Health And Research Environment
      SADI client application
      http://biordf.net/cardioSHARE (Pellet)
      http://dev.biordf.net/cardioSHARE (Pellet 2)
  12. What pathways does UniProt protein P47989 belong to?
      PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
      PREFIX ont: <http://ontology.dumontierlab.com/>
      PREFIX uniprot: <http://lsrn.org/UniProt:>
      SELECT ?gene ?pathway
      WHERE {
        uniprot:P47989 pred:isEncodedBy ?gene .
        ?gene ont:isParticipantIn ?pathway .
      }
  16. Recap: what we just saw
      A standard SPARQL query was entered into SHARE, a SADI-aware query engine
  17. Recap: what we just saw
      The query was interpreted to extract the “triple” patterns (subject, predicate, object) being requested
  18. Recap: what we just saw
      Triple-patterns are passed to SADI for Web Service discovery
  19. Recap: what we just saw
      Services capable of generating those triple-patterns are automatically executed, the triples are stored, and the query is resolved.
  20. Recap: what we just saw
      We posed, and answered, a ~complex database query
      WITHOUT A DATABASE
      (in fact, the data didn’t even have to exist...)
  21. Recap: what we just saw
      Note that there is no centralized ontology
      Unlike BioMoby, SADI supports all (OWL) ontologies and does not invent any of its own
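
The recap steps above (extract triple patterns, discover services by predicate, execute them, store the triples, resolve the query) can be illustrated with a toy Python sketch. This is not SADI or SHARE code: the registry, the two service functions, and every returned value are hypothetical placeholders, and the sketch assumes each pattern's subject is already bound by an earlier pattern.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]

# Hypothetical stand-ins for remote services. The returned identifiers are
# placeholders, not real UniProt, gene, or pathway data.
def encoded_by_service(subject: str) -> List[str]:
    return {"uniprot:P47989": ["gene:EXAMPLE_GENE"]}.get(subject, [])

def participant_in_service(subject: str) -> List[str]:
    table = {"gene:EXAMPLE_GENE": ["pathway:EXAMPLE_A", "pathway:EXAMPLE_B"]}
    return table.get(subject, [])

# Toy registry (an assumption): predicate -> service that can generate
# triples carrying that predicate.
REGISTRY: Dict[str, Callable[[str], List[str]]] = {
    "pred:isEncodedBy": encoded_by_service,
    "ont:isParticipantIn": participant_in_service,
}

# The two triple patterns from the WHERE clause of the pathway query.
PATTERNS: List[Triple] = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]

def is_var(term: str) -> bool:
    return term.startswith("?")

def resolve(patterns: List[Triple]) -> Tuple[List[Bindings], List[Triple]]:
    """Resolve patterns in order, calling one service per pattern."""
    solutions: List[Bindings] = [{}]
    store: List[Triple] = []  # triples generated on demand
    for s, p, o in patterns:
        service = REGISTRY[p]  # "discovery" by predicate
        next_solutions: List[Bindings] = []
        for bindings in solutions:
            # Use the bound value when the subject is a variable.
            subject = bindings.get(s, s) if is_var(s) else s
            for value in service(subject):
                store.append((subject, p, value))
                if is_var(o):
                    # Skip values that contradict an existing binding.
                    if o in bindings and bindings[o] != value:
                        continue
                    next_solutions.append({**bindings, o: value})
                elif o == value:
                    next_solutions.append(dict(bindings))
        solutions = next_solutions
    return solutions, store

solutions, store = resolve(PATTERNS)
for row in solutions:
    print(row["?gene"], row["?pathway"])
```

The point of the sketch is only that no triple exists until a service is called to produce it, which is the sense in which the query is answered without a database.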
  22. Holy Grail Demo #1
      Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
      Retrieve and align 2000 nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  23. Holy Grail Demo #2
  24. Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
      PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
      SELECT ?patient ?bun ?creat
      FROM <http://sadiframework.org/ontologies/patients.rdf>
      WHERE {
        ?patient rdf:type patient:LikelyRejecter .
        ?patient l:latestBUN ?bun .
        ?patient l:latestCreatinine ?creat .
      }
  25. Start burrowing through the LikelyRejecter OWL class: find that we need a regression model OWL class
  26. Regression models have features like slopes and intercepts, and so on.
      The class is completely decomposed until a set of required Services is discovered that is capable of creating all the necessary properties
  27. Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
  28. VOILA!
  29. We just dynamically evaluated whether individuals matching a particular high-level concept definition exist… or can exist
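
To make the regression step concrete, here is a minimal Python sketch of the kind of analysis such a class could call for. The slides do not spell out the LikelyRejecter definition, so the rule used here (a creatinine trend whose fitted slope exceeds a threshold) is an assumption for illustration only, as are the function names and the sample values.

```python
from typing import Sequence, Tuple

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def is_likely_rejecter(days: Sequence[float],
                       creatinine: Sequence[float],
                       min_slope: float = 0.0) -> bool:
    """Hypothetical membership rule: the fitted slope exceeds min_slope.

    This stands in for whatever restriction the real OWL class places on
    the regression model's slope; it is not taken from the slides.
    """
    slope, _intercept = fit_line(days, creatinine)
    return slope > min_slope

# Invented measurements, used only to exercise the functions.
print(is_likely_rejecter([0, 1, 2], [1.0, 1.5, 2.0]))
print(is_likely_rejecter([0, 1, 2], [2.0, 1.5, 1.0]))
```

In the demo, the slope and intercept are properties produced by discovered services rather than by local code; the sketch only shows what "decomposing" a class down to a regression analysis amounts to numerically.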
  30. Holy Grail Demo #2
      Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
      Retrieve and align 2000 nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  31. How does SADI + SHARE do that?
  32. Please see other presentations uploaded to SlideShare for a full explanation of SADI functionality.
      See also the Taverna and Protégé plug-ins for discovering, running and creating services
      Taverna
  33. The Holy Grail may not yet be in hand, but I think we can at least see it from here!
      So… now what?
  34. Mark’s Manifesto
      What is my next “Holy Grail”?
  35. Science
      Support for the in silico Scientific Method
  37. The Scientific Method
      Discourse: What do you believe? What do I believe?
      Disagreement: You’re wrong! And I’m gonna prove it!
      Clarity: This is the experiment I am going to do
      Reproducibility: This is how I did it (“provenance”)
      Clarity: This is my new hypothesis
  38. The Scientific Method
      Discourse: What do you believe? What do I believe?
      Disagreement: You’re wrong! And I’m gonna prove it!
      Clarity: This is the experiment I am going to do
      Reproducibility: This is how I did it (“provenance”)
      Clarity: This is my new hypothesis
      Workflows (e.g. myExperiment)
  40. In opposition to the lessons we learnt from Web 2.0
      The Semantic Web in Healthcare and Life Sciences is solving the problems of science…
      …by forming institutions
  41. Result:
      Large, centrally-designed and centrally-curated ontologies
      that enforce “community agreement” about “biological reality”
  42. Science ≠ Consensus
  47. To bring the “traditions of Science” to in silico Science
      Web 3.0 must encourage and facilitate personal opinion and debate
  48. What has this got to do with SADI and SHARE?
  49. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
      PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
      SELECT ?patient ?bun ?creat
      FROM <http://sadiframework.org/ontologies/patients.rdf>
      WHERE {
        ?patient rdf:type patient:LikelyRejecter .
        ?patient l:latestBUN ?bun .
        ?patient l:latestCreatinine ?creat .
      }
  50. Likely Rejecter
  51. I created a small ontology describing my definition of a Likely Rejecter
  52. … it was MY ontology!
  53. I can re-use it
  54. I can modify it as I change my world-view
  55. I can publish it for others to use
  56. Others can modify it and/or compare it to THEIR world-view
  57. Sharing my ontology also gives opportunities for micro-attribution; “Citation” of me is transparent and automatic when someone extends my ontology
  58. Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge
  59. Ontology development is distributed and personal rather than centralized: no institutions
  61. …but there’s more…
  62. “Likely Rejecter”
  63. I made that up! It came out of my head!
  64. What’s another word for a world-view that you make up?
      Hypothesis
  65. The “Likely Rejecter” OWL Class is an explicitly-expressed hypothesis;
      Members of that class may or may not exist!
  68. Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity
      [Diagram labels: Hypothesis; Ischemia; Hypertension; Blood Pressure; SADI + SHARE; Analytical Algorithm; Database 1; Database 2]
  69. Join us!
      SADI and CardioSHARE are Open-Source projects
      Come join us – we’re having a lot of fun!!
      http://sadiframework.org
      SADI Semantic Web Services Page
      #SADIFramework
  70. C-BRASS: Canadian Bioinformatics Resources As Semantic Services
      together with Michel Dumontier, Chris Baker
      ~$1M funding to help us deploy SADI services and provide training for new service providers
      We can help you get started!
      “C-BRASS” is on Facebook!
  71. Credits
      Benjamin VanderValk (SADI & SHARE)
      Luke McCarthy (SADI & SHARE)
      Soroush Samadian (CardioSHARE)
  72. Microsoft Research
  73. Credits
      Benjamin VanderValk (SADI & CardioSHARE)
      Luke McCarthy (SADI & CardioSHARE)
      Soroush Samadian (CardioSHARE)
      IO Informatics (Knowledge Explorer API)
      Microsoft Research
      Fin
      This presentation available on SlideShare: keywords ‘wilkinson’ ‘bosc’
