From BioMoby to SADI: The Quest for the Holy Grail!
BioMoby Stats in a nutshell
  - >1800 services worldwide (~1300 “alive” at any given time)
  - 4 major installations of the Moby Service registry:
      Genome Canada, SUN Center of Excellence, Calgary
      Genome España, Barcelona Supercomputing Center
      International Rice Research Institute, Philippines
      Max Planck, Cologne
  - Canadian service registry brokers ~400,000 requests/month
  - Canadian BioMoby services receive ~700,000 uses/month
  - Canadian server just had a significant memory upgrade to improve performance
  “The report of my death was an exaggeration” -- Mark Twain
Model Organism Bring Your-Own Database Interface Conference (“MOBY-DIC”), Emma Lake, Saskatchewan, Sept 21, 2001
Are we going after The Holy Grail here?
The Holy Grail (this slide created circa 2002):
  Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
  Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
http://sadiframework.org
  Microsoft Research
  Founding partner
Holy Grail Demo #1
Imagine there is a “virtual database” containing all of the data from all of the databases, together with the output of every conceivable analysis
How do we query that database?
“SHARE”: Semantic Health And Research Environment
  SADI client application
  http://biordf.net/cardioSHARE (Pellet)
  http://dev.biordf.net/cardioSHARE (Pellet 2)
What pathways does UniProt protein P47989 belong to?
    PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
    PREFIX ont: <http://ontology.dumontierlab.com/>
    PREFIX uniprot: <http://lsrn.org/UniProt:>
    SELECT ?gene ?pathway
    WHERE {
        uniprot:P47989 pred:isEncodedBy ?gene .
        ?gene ont:isParticipantIn ?pathway .
    }
Recap (what we just saw): A standard SPARQL query was entered into SHARE, a SADI-aware query engine
Recap (what we just saw): The query was interpreted to extract the “triple” patterns (subject, predicate, object) being requested
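
As a rough illustration of this step, the sketch below pulls (subject, predicate, object) patterns out of the pathway query shown earlier. It is not SHARE's parser: the function name `extract_triple_patterns` is hypothetical, and the regular-expression approach only handles simple WHERE blocks like this one.

```python
import re

# The pathway query from the slide, reproduced as a string.
QUERY = """
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
    uniprot:P47989 pred:isEncodedBy ?gene .
    ?gene ont:isParticipantIn ?pathway .
}
"""


def extract_triple_patterns(query):
    """Return (subject, predicate, object) tuples from a simple WHERE block.

    Hypothetical helper for illustration only. It ignores SPARQL features
    such as ';' shorthand, literals containing dots, FILTER and OPTIONAL.
    """
    match = re.search(r"WHERE\s*\{(.*)\}", query, re.DOTALL | re.IGNORECASE)
    if match is None:
        return []
    patterns = []
    # Statements end with a whitespace-delimited '.'.
    for statement in re.split(r"\s\.(?:\s|$)", match.group(1)):
        terms = statement.split()
        if len(terms) == 3:
            patterns.append(tuple(terms))
    return patterns


patterns = extract_triple_patterns(QUERY)
for subject, predicate, obj in patterns:
    print(subject, predicate, obj)
```

A production client would use a real SPARQL parser; the point here is only that each pattern names a predicate that some service must be able to generate.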
Recap (what we just saw): Triple-patterns are passed to SADI for Web Service discovery
Recap (what we just saw): Services capable of generating those triple-patterns are automatically executed, the triples are stored, and the query is resolved.
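
The discovery-and-execution loop can be sketched in a few lines, under heavy assumptions: the `REGISTRY` dictionary and the two service functions are hypothetical in-memory stand-ins for remote SADI services, and the values they return (`gene:EXAMPLE`, `pathway:EXAMPLE`) are placeholders rather than real data.

```python
# Hypothetical stand-ins for SADI services. Real services are remote Web
# Services found through a registry; these return placeholder values only.
def encoded_by_service(subject):
    return ["gene:EXAMPLE"] if subject == "uniprot:P47989" else []


def participant_in_service(subject):
    return ["pathway:EXAMPLE"] if subject == "gene:EXAMPLE" else []


# "Discovery" is reduced here to a lookup by the predicate a service generates.
REGISTRY = {
    "pred:isEncodedBy": encoded_by_service,
    "ont:isParticipantIn": participant_in_service,
}


def resolve(patterns):
    """Resolve triple patterns left to right.

    Assumes every object is a variable. Returns the variable bindings that
    answer the query and the triples generated along the way.
    """
    solutions = [{}]
    triples = []
    for subj, pred, obj in patterns:
        service = REGISTRY[pred]
        next_solutions = []
        for binding in solutions:
            # Substitute an already-bound variable; constants pass through.
            subject = binding.get(subj, subj)
            for value in service(subject):
                triples.append((subject, pred, value))  # store the new triple
                next_solutions.append({**binding, obj: value})
        solutions = next_solutions
    return solutions, triples


patterns = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]
solutions, triples = resolve(patterns)
print(solutions)
print(triples)
```

No triple exists before the services run, which is the sense in which the query is answered without a database.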
Recap (what we just saw): We posed, and answered, a ~complex database query WITHOUT A DATABASE (in fact, the data didn’t even have to exist...)
Recap (what we just saw): Note that there is no centralized ontology. Unlike BioMoby, SADI supports all (OWL) ontologies and does not invent any of its own
Holy Grail Demo #1
  Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
  Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
Holy Grail Demo #2
Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
    PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
    SELECT ?patient ?bun ?creat
    FROM <http://sadiframework.org/ontologies/patients.rdf>
    WHERE {
        ?patient rdf:type patient:LikelyRejecter .
        ?patient l:latestBUN ?bun .
        ?patient l:latestCreatinine ?creat .
    }
Start burrowing through the LikelyRejecter OWL class: find that we need a regression model OWL class
Regression models have features like slopes and intercepts, and so on. The class is completely decomposed until a set of required Services is discovered, capable of creating all these necessary properties
Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
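
To make the regression step concrete, here is a minimal least-squares sketch. The slides do not give the actual definition of LikelyRejecter, so the rule used in `is_likely_rejecter` (a creatinine trend whose slope exceeds a threshold) is an assumption for illustration, and the sample readings are synthetic.

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def is_likely_rejecter(days, creatinine, slope_threshold=0.0):
    """Assumed rule, not the ontology's: rising creatinine suggests rejection."""
    slope, _ = linear_regression(days, creatinine)
    return slope > slope_threshold


# Synthetic readings for illustration only (not patient data).
days = [0, 1, 2, 3]
creatinine = [1.0, 1.5, 2.0, 2.5]
slope, intercept = linear_regression(days, creatinine)
print(slope, intercept, is_likely_rejecter(days, creatinine))
```

In SADI terms, the slope and intercept would be properties attached to a regression model by a service, and class membership would be decided by the reasoner rather than by a hand-written function like this one.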
VOILA!
We just dynamically evaluated if individuals matching a particular high-level concept definition exist…or can exist
Holy Grail Demo #2
  Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
  Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
How does SADI + SHARE do that?
Please see other presentations uploaded to SlideShare for a full explanation of SADI functionality
  See also the Taverna and Protégé plug-ins for discovering, running and creating services
  Sentient Knowledge Explorer
  Taverna
The Holy Grail may not yet be in-hand, but I think we can at least see it from here!
  So… now what?
Mark’s Manifesto: What is my next “Holy Grail”?
Science: Support for the in silico Scientific Method
The Scientific Method
  Discourse: What do you believe? What do I believe?
  Disagreement: You’re wrong! And I’m gonna prove it!
  Clarity: This is the experiment I am going to do
  Reproducibility: This is how I did it (“provenance”)
  Clarity: This is my new hypothesis
  Workflows (e.g. myExperiment)
In opposition to the lessons we learnt from Web 2.0, the Semantic Web in Healthcare and Life Sciences is currently solving the problems of science… by forming institutions
Result: Large, centrally-designed and centrally-curated ontologies that enforce “community agreement” about “biological reality”
Science ≠ Consensus
To bring the “traditions of Science” to in silico Science we need Web 3.0 tools that encourage and facilitate personal opinion and debate
What has this got to do with SADI and SHARE?
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
    PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
    SELECT ?patient ?bun ?creat
    FROM <http://sadiframework.org/ontologies/patients.rdf>
    WHERE {
        ?patient rdf:type patient:LikelyRejecter .
        ?patient l:latestBUN ?bun .
        ?patient l:latestCreatinine ?creat .
    }
Likely Rejecter
I created a small ontology describing my definition of a Likely Rejecter
… it was MY ontology!
I can re-use it
I can modify it as I change my world-view
I can publish it for others to use
Others can modify it and/or compare it to THEIR world-view
Sharing my ontology also gives opportunities for micro-attribution; “Citation” of me is transparent and automatic when someone extends my ontology
Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge
Ontology development is distributed and personal rather than centralized: no institutions, “an ecosystem of ideas!”
…but there’s more…
“Likely Rejecter”
I made that up!  It came out of my head!
What’s another word for a world-view that you make up? Hypothesis
The “Likely Rejecter” OWL Class is an explicitly-expressed hypothesis; members of that class may or may not exist!
Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity
  [Diagram labels: Hypothesis, Ischemia, SADI + SHARE, Hypertension, Blood Pressure, Analytical Algorithm, Database 1, Database 2]
Join us!
  SADI and CardioSHARE are Open-Source projects
  Come join us – we’re having a lot of fun!!
  http://sadiframework.org
  SADI Semantic Web Services Page
  #SADIFramework
C-BRASS: Canadian Bioinformatics Resources As Semantic Services
  Together with Michel Dumontier, Chris Baker
  ~$1M funding to help us deploy SADI services and provide training for new service providers
  We can help you get started!
  “C-BRASS” is on Facebook!
Credits
  Benjamin VanderValk (SADI & SHARE)
  Luke McCarthy (SADI & SHARE)
  Soroush Samadian (CardioSHARE)
Microsoft Research
Credits
  Benjamin VanderValk (SADI & CardioSHARE)
  Luke McCarthy (SADI & CardioSHARE)
  Soroush Samadian (CardioSHARE)
  IO Informatics (Knowledge Explorer API)
  Microsoft Research
Fin
  This presentation available on SlideShare: keywords ‘wilkinson’ ‘bosc’

Wilkinson bosc2010 moby-to-sadi
