How SADI & SHARE help restore the Scientific Method to in silico science
This is my presentation to the Bio Open Source Convention (BOSC) in Boston, July 2010. I start with a brief status-update on the BioMoby project and then launch into a series of demonstrations of its successor - SADI + SHARE. Rather than discussing how SADI/SHARE work, I focus the discussion on what role I think these technologies can play in bringing the traditional "scientific method" back into in silico biology.

Published in: Technology, Education

Transcript

  • 1. From BioMoby to SADI
    The Quest for the Holy Grail!
  • 2. BioMoby Stats in a nutshell
    >1800 services worldwide (~1300 “alive” at any given time)
    4 major installations of the Moby Service registry
    Genome Canada, SUN Center of Excellence, Calgary
    Genome España, Barcelona Supercomputing Center
    International Rice Research Institute, Philippines
    Max Planck, Cologne
    Canadian service registry brokers ~400,000 requests/month
    Canadian BioMoby services receive ~700,000 uses/month
    Canadian server just had a significant memory upgrade to improve performance
    “The report of my death was an exaggeration”
    -- Mark Twain
  • 3. Model Organism Bring Your-Own Database Interface Conference
    “MOBY-DIC”
    Emma Lake, Saskatchewan
    Sept 21, 2001
  • 4.
  • 5. Are we going after The Holy Grail here?
  • 6. The Holy Grail: (this slide created circa 2002)
    Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
    Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  • 7. http://sadiframework.org
    Microsoft Research
    Founding partner
  • 8. Holy Grail Demo #1
  • 9. Imagine there is a “virtual database” containing all of the data from all of the databases, together with the output of every conceivable analysis
  • 10. How do we query that database?
  • 11. “SHARE”
    Semantic Health And Research Environment
    SADI client application
    http://biordf.net/cardioSHARE (Pellet)
    http://dev.biordf.net/cardioSHARE (Pellet 2)
  • 12. What pathways does UniProt protein P47989 belong to?
    PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
    PREFIX ont: <http://ontology.dumontierlab.com/>
    PREFIX uniprot: <http://lsrn.org/UniProt:>
    SELECT ?gene ?pathway
    WHERE {
    uniprot:P47989 pred:isEncodedBy ?gene .
    ?gene ont:isParticipantIn ?pathway .
    }
  • 13.
  • 14.
  • 15.
  • 16. Recap: what we just saw
    A standard SPARQL query was entered into SHARE, a SADI-aware query engine
  • 17. Recap: what we just saw
    The query was interpreted to extract the “triple” patterns
    (subject, predicate, object) being requested
  • 18. Recap: what we just saw
    Triple-patterns are passed to SADI for Web Service discovery
  • 19. Recap: what we just saw
    Services capable of generating those triple-patterns are automatically executed, the triples are stored, and the query is resolved.
  • 20. Recap: what we just saw
    We posed, and answered, a ~complex database query
    WITHOUT A DATABASE
    (in fact, the data didn’t even have to exist...)
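
The steps recapped in slides 16 to 20 can be sketched in a few lines of Python. This is a toy illustration only, not the actual SADI or SHARE implementation: the registry dictionary, the two service functions, and the identifiers they return are hypothetical placeholders. It assumes each triple pattern has a variable as its object and either a constant or an already-bound variable as its subject. The point is only that each predicate is matched to a service able to generate such triples, and that the triples are produced on demand rather than read from a database.

```python
# Toy sketch of predicate-driven service discovery; NOT the real SADI/SHARE code.
# Service functions and returned identifiers are hypothetical placeholders.

def gene_service(subject):
    # Hypothetical: a real SADI service would be invoked over the Web here.
    return ["gene:EXAMPLE_GENE"]

def pathway_service(subject):
    # Hypothetical: returns pathways in which the given gene participates.
    return ["pathway:EXAMPLE_PATHWAY"]

# Hypothetical registry: predicate -> service able to generate that predicate.
REGISTRY = {
    "pred:isEncodedBy": gene_service,
    "ont:isParticipantIn": pathway_service,
}

# The two triple patterns from the slide-12 query.
PATTERNS = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]

def resolve(patterns, registry):
    """Resolve patterns in order; return (variable bindings, generated triples)."""
    solutions = [{}]   # one empty binding to start from
    triples = []       # triples generated by the services along the way
    for subj, pred, obj in patterns:
        service = registry[pred]           # "discovery": look up by predicate
        next_solutions = []
        for binding in solutions:
            s = binding.get(subj, subj)    # substitute a bound variable, if any
            for value in service(s):       # "execution": generate the triples
                triples.append((s, pred, value))
                next_solutions.append({**binding, obj: value})
        solutions = next_solutions
    return solutions, triples

if __name__ == "__main__":
    bindings, generated = resolve(PATTERNS, REGISTRY)
    print(bindings)
    print(generated)
```

Looking services up by predicate is what lets the query be answered without a pre-built database: nothing is stored until a pattern asks for it.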
  • 21. Recap: what we just saw
    Note that there is no centralized ontology
    Unlike BioMoby, SADI supports all (OWL) ontologies and does not invent any of its own
  • 22. Holy Grail Demo #1
    Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
    Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  • 23. Holy Grail Demo #2
  • 24. Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
    PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
    SELECT ?patient ?bun ?creat
    FROM <http://sadiframework.org/ontologies/patients.rdf>
    WHERE {
    ?patient rdf:type patient:LikelyRejecter .
    ?patient l:latestBUN ?bun .
    ?patient l:latestCreatinine ?creat .
    }
  • 25. Start burrowing through the LikelyRejecter OWL class and find that we need a regression model OWL class
  • 26. Regression models have features like slopes and intercepts, and so on. The class is completely decomposed until a set of required Services are discovered, capable of creating all these necessary properties
  • 27. Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
  • 28. VOILA!
  • 29. We just dynamically evaluated if individuals matching a particular high-level concept definition exist…or can exist
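
As a rough illustration of slides 25 to 29, the sketch below classifies patients with a linear regression over their blood chemistry. The slides do not give the actual definition of the LikelyRejecter class, so the rule used here (a rising creatinine trend, i.e. a positive regression slope) is purely an assumption for illustration, and the patient identifiers and measurements are made up. In SADI/SHARE the regression would be carried out by a discovered Web Service; here it is a plain least-squares function.

```python
# Toy sketch; the classification rule and all data below are ASSUMPTIONS,
# not the LikelyRejecter definition used in the talk.

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data: patient -> list of (day, creatinine level) measurements.
PATIENTS = {
    "patient:A": [(0, 1.0), (1, 1.2), (2, 1.4)],   # rising trend
    "patient:B": [(0, 1.1), (1, 1.0), (2, 0.9)],   # falling trend
}

def likely_rejecters(patients, min_slope=0.0):
    """Return patients whose creatinine slope exceeds min_slope (assumed rule)."""
    members = []
    for patient, series in patients.items():
        days = [day for day, _ in series]
        levels = [level for _, level in series]
        slope, _intercept = fit_line(days, levels)
        if slope > min_slope:
            members.append(patient)
    return members

if __name__ == "__main__":
    print(likely_rejecters(PATIENTS))
```

Class membership is computed from the data at query time, so whether any "likely rejecter" exists is only known after the analysis has run.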
  • 30. Holy Grail Demo #2
    Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
    Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
  • 31. How does SADI + SHARE do that?
  • 32. Please see other presentations uploaded to SlideShare for a full explanation of SADI functionality. See also the Taverna and Protégé plug-ins for discovering, running and creating services
    Taverna
  • 33. The Holy Grail may not yet be in hand, but I think we can at least see it from here! So… now what?
  • 34. Mark’s Manifesto
    What is my next “Holy Grail”?
  • 35. Science
    Support for the in silico Scientific Method
  • 36.
  • 37. The Scientific Method
    Discourse: What do you believe? What do I believe?
    Disagreement: You’re wrong! And I’m gonna prove it!
    Clarity: This is the experiment I am going to do
    Reproducibility: This is how I did it (“provenance”)
    Clarity: This is my new hypothesis
  • 38. The Scientific Method
    Discourse: What do you believe? What do I believe?
    Disagreement: You’re wrong! And I’m gonna prove it!
    Clarity: This is the experiment I am going to do
    Reproducibility: This is how I did it (“provenance”)
    Clarity: This is my new hypothesis
    Workflows
    (e.g. myExperiment)
  • 39.
  • 40. In opposition to the lessons we learnt from Web 2.0
    The Semantic Web in Healthcare and Life Sciences is solving the problems of science…
    …by forming institutions
  • 41. Result:
    Large, centrally-designed and centrally-curated ontologies
    that enforce “community agreement” about “biological reality”
  • 42. Science ≠ Consensus
  • 43.
  • 44.
  • 45.
  • 46.
  • 47. To bring the “traditions of Science” to in silico Science
    Web 3.0 must encourage and facilitate personal opinion and debate
  • 48. What has this got to do with SADI and SHARE?
  • 49. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
    PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
    SELECT ?patient ?bun ?creat
    FROM <http://sadiframework.org/ontologies/patients.rdf>
    WHERE {
    ?patient rdf:type patient:LikelyRejecter .
    ?patient l:latestBUN ?bun .
    ?patient l:latestCreatinine ?creat .
    }
  • 50. Likely Rejecter
  • 51. I created a small ontology describing my definition of a Likely Rejecter
  • 52. … it was MY ontology!
  • 53. I can re-use it
  • 54. I can modify it as I change my world-view
  • 55. I can publish it for others to use
  • 56. Others can modify it and/or compare it to THEIR world-view
  • 57. Sharing my ontology also gives opportunities for micro-attribution; “Citation” of me is transparent and automatic when someone extends my ontology
  • 58. Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge
  • 59. Ontology development is distributed and personal rather than centralized: no institutions
  • 60.
  • 61. …but there’s more…
  • 62. “Likely Rejecter”
  • 63. I made that up! It came out of my head!
  • 64. What’s another word for a world-view that you make up?
    Hypothesis
  • 65. The “Likely Rejecter” OWL Class
    is an explicitly-expressed hypothesis;
    Members of that class may or may not exist!
  • 66.
  • 67.
  • 68. Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity
    Hypothesis
    Ischemia
    SADI
    +
    SHARE
    Hypertension
    Blood Pressure
    Analytical Algorithm
    Database 1
    Database 2
  • 69. Join us!
    SADI and CardioSHARE are Open-Source projects
    Come join us – we’re having a lot of fun!!
    http://sadiframework.org
    SADI Semantic Web Services Page
    #SADIFramework
  • 70. C-BRASS: Canadian Bioinformatics Resources As Semantic Services, together with Michel Dumontier, Chris Baker
    ~$1M funding to help us deploy SADI services and provide training for new service providers
    We can help you get started!
    “C-BRASS” is on Facebook!
    Like
  • 71. Credits
    Benjamin VanderValk (SADI & SHARE)
    Luke McCarthy (SADI & SHARE)
    Soroush Samadian (CardioSHARE)
  • 72. Microsoft Research
  • 73. Credits
    Benjamin VanderValk (SADI & CardioSHARE)
    Luke McCarthy (SADI & CardioSHARE)
    Soroush Samadian (CardioSHARE)
    IO Informatics (Knowledge Explorer API)
    Microsoft Research
    Fin
    This presentation available on SlideShare: keywords ‘wilkinson’ ‘bosc’