Carole Goble       “Shopping for data        should be as easy          as shopping for                 shoes!!”
Evaluating Hypotheses       using SPARQL-DL as an     abstract workflow language           to choreograph     SADI Semanti...
Start at the end...
We wanted to duplicatea real, peer-reviewed, bioinformatics analysis   simply by providing a model describing             ...
...the machine had to make     every other decision         on it’s own
This is the study we chose:  Gordon, P.M.K., Soliman, M.A., Bose, P.,   Trinh, Q., Sensen, C.W., Riabowol, K.  Interspecie...
Gordon, P.M.K., Soliman, M.A., Bose, P., Trinh, Q., Sensen, C.W., Riabowol, K.: Interspeciesdata mining to predict novel I...
Original Study - simplified and abstracted:Using what is known about interactions in other species,   predict new interact...
For our prototype study, we simplified this further to:                                  X 2                              ...
The tricky part is...In the abstract, the two workflows           are identicalbut in reality they will be differentbecaus...
Modeling the result – Step 1                                           OWL                Web Ontology Language (OWL) is t...
Modeling the result – Step 2ProbableInteractor:  is homologous to (    protein from ModelOrganism1…) # Potential Interacto...
We then publish our OWL model of a Probable Interactor on the Web     In a local data-file we provide the protein we are i...
This is the question we ask:PREFIX i: <http://sadiframework.org/ontologies/InteractingProteins.owl#>SELECT ?proteinFROM <f...
Our system then derives (and executes) the following workflow automatically                                               ...
There are two very cool things about what you just saw...
There are two very cool things about what you just saw...              The system was able to            create a workflow...
There are two very cool things about what you just saw...               The workflow it created              (i.e. the ser...
We got the answer“simply” by designing a model of the answer!
Start at the beginning...
Semantic Automated Discovery and Integration  A Semantic Web-focused Web Services specificationMicrosoftResearch       htt...
Web Services    vs.Semantic Web
Web ServicesXML + XML Schema  Semantic Web   RDF + OWL
Web ServicesPOST of SOAPSemantic Web GET of RDF
Web ServicesNo (rigorous) semantics    Semantic WebRich, flexible semantics
Web Services        &   Semantic WebFundamentally different   Web technologies
A design-practice forWeb Service provision on the      Semantic Web
100% standards-compliantwith no “invented” standards
Lightweight(only 2 inter-related “rules”)
Rules come from observations:
SADI Observation #1:Web Services in Bioinformatics create implicit biological relationships   between their input and output
SADI Observation #1:       SeqRet
SADI Design Practice #1         Make the implicit explicit…A Web Service should create “triples” linking    the input data...
SADI Best Practice #1     This is what bioinformatics Web Services               implicitly do anyway!           Easy to i...
SADI Observation #2:         HTTP GET and POST            GET guaranteesthe response relates to the request URI in a very ...
SADI Observation #2:            HTTP GET and POSTThat’s why Web Services have a fundamentally  different behaviour than th...
SADI Observation #2:              GET and POST       We can fix that! (without breaking any existing rules or standards!)
SADI Best Practice #2SUBJECT URI of the output graph (triples)                 is the same as SUBJECT URI of the input gra...
SADI Best Practice #2  GeneID                               GeneIDrdf:type                                 rdf:type       ...
ConsequenceWeb Services now exhibit a very similar     behavior to the Web itself      POST “behaves like” GET
SADI InterfacesService Interfaces can be described by           two OWL classes  (this is 100% compatible with the SAWSDL ...
SADI InterfacesOWL Class #1: My Input Class
SADI InterfacesOWL Class #2: My Output Class
SADI Service Functionality      Consumes OWL Individuals of Class #1        Returns OWL Individuals of Class #2but the URI...
In practice, of course, you don’t return the input dataStrip it and add the new data provided by the ServiceBut since the ...
Service Description     INPUT OWL Class     NamedIndividual: things with           a “name” property           from “foaf”...
How do we discover services?Input and output are about the same “thing”Therefore, to describe what a service does        s...
Service Description                       INPUT OWL Class                       NamedIndividual: things with              ...
Service Discovery  Index all of the properties added by all of the services   under all circumstances
Real-world ExampleInput Data:           BRCA1 rdf:type    Gene IDOutput Data:          BRCA1    hasDNASequence     AGCTTAG...
Service DiscoverySimply search for the property of interest       based on the data in-hand
e.g. The question:“what is the DNA sequence of BRCA1?”Discover a SADI Web Service that generates the  DNA Sequence propert...
DEMO     Knowledge Explorer          Plug-inFor more information about the Knowledge Explorer surf to:                 htt...
SADI has just filled-in “Encodes” property for the three genes       from the output of discovered Web Service(s)
Discover services that provide the hasGOTerm   property for Protein Sequence datatype
This kind of “Web Service Surfing” is very            intuitive for the Biologist!No need to describe the algorithm or the...
Semantic Health And Research EnvironmentSHARE answers arbitrary SPARQL queries by finding and executing SADI Services
Example #1        What is the phenotype of every allele of the          Antirrhinum majus DEFICIENS geneSELECT ?allele    ...
Example #1        What is the phenotype of every allele of the          Antirrhinum majus DEFICIENS geneSELECT ?allele    ...
Enter that query into      SHARE
Click “Submit”...
...and in a few seconds you get your answer.        Based on predicates in your query, SHARE utilized SADIto automatically...
Because it is the Semantic WebThe query results are live hyperlinksto the respective Database or images
ImportantlyWe posed, and answered a complex database query WITHOUT A DATABASE
(in fact, the data didn’t even have to exist... as I’ll now show you)
Example #2    Show me the latest Blood Urea Nitrogen and Creatinine levels      of patients who appear to be rejecting the...
Likely Rejecter:A patient who has creatinine levels   that are increasing over time                            - - Wilkins...
Likely Rejecter:Our database contains variousblood chemistry measurements    at various time-points
Likely Rejecter: …but there is no “likely rejecter”column or table in our database…
SHARE determines               by itself             the need to do a     Linear Regression analysis overCreatinine blood ...
SHARE determines         by itselfhow and where that analysis       can be done        and does it
The SHARE system utilizes Semantics (via SADI) to discover and access   analytical services on the Web that do linear regr...
VOILA!
How does SHARE work?
Ontologies
Ontology Spectrum                         Thesauri                                                      Frames       Selec...
Ontology Spectrum                                              BETTER!!           Thesauri                    Frames      ...
Ontology Spectrum                                                  Because it                                             ...
Discovery systems - flexible                                                                    Selected                  ...
In the upper end of the Ontology Spectrum,         if the data has the right propertiesIt can be discovered to be a valid ...
...and in the context of a SHARE query those individual properties may have been aggregated              from many differe...
In exactly the same way that the OWL propertyrestrictions of a SADI Input Class tell SHARE what        properties a servic...
Show me the latest Blood Urea Nitrogen and Creatinine levels    of patients who appear to be rejecting their transplantsPR...
The definition of a Likely Rejecter category is encoded in       a machine-readable document written in the OWL Ontology l...
SHARE burrows down through the ontological definitionto learn about what the properties of “regression models” are
SHARE utilizes SADI to discover analytical services on the Web that do linear regression analysis then other services that...
Let’s go through that again from a different perspective
OWL Classes that include property restrictions can be       “executed” as if they were workflows
Ontology = Query = Workflow
WORKFLOW
QUERY:   SELECT images ofmutations from genes in  organism XXX thatshare homology to this gene in organism YYY
Concept:“Homologous Mutant     Image”
As OWL Axioms   HomologousMutantImage     is owl:equivalentTo {   Gene Q   hasImage image PGene Q   hasSequence Sequence Q...
Those axioms combine to  create an OWL Class: Homologous Mutant Image
QUERY:    Retrieveowl:HomologousMutant Image for   gene XXX
SHARE:       Decomposes theowl:Homologous Mutant ImageClass, discovers SADI services  relevant to that class, andpipelines...
“The user experiments showed thatworkflow re-use… is difficult for bioinformaticians.”                                    ...
Under slightly different circumstances(e.g. studying the same phenomenon in a            different organism)              ...
This great idea would be even betterif workflows were a little bit easier to re-purpose!
With SHARE, a workflow is generateddynamically, based on all of the information          presented in the querye.g. to cre...
With SHARE, a workflow is generateddynamically, based on all of the information          presented in the query  Moreover,...
With SHARE, the ontology is the workflow  (not the same as “an ontology that describes workflows”)  The ontology acts as a...
Works (best) with ontologies in the “Frames+” part of the spectrum, if there are SADI services available                  ...
As far as we are aware, SHARE is the only system that           exhibits this particular behaviour        ...and IMO this ...
Gordon, P.M.K., Soliman, M.A., Bose, P., Trinh, Q., Sensen, C.W., Riabowol, K.: Interspeciesdata mining to predict novel I...
That experiment can now be represented ONCE              as an OWL Class     It becomes concretized automatically       fo...
This is far beyond simply changing the parameters entered into a workflow...
Moreover, the idea of automated workflow “individuality” is quite interesting to us...
This is an early prototype of aPatient-driven Personalized Medicine           Web interface
Matching based on official  name, compound name,brand name, trade name, or    “common name” 
Still needs some work...        ??!?!?
Why the alert?                 Link out to PubMed
The SADI+SHARE workflow and reasoning waspersonalized to YOUR medical data
In future iterations, we will enable the workflowto be further customized through “personalized”OWL Classes (e.g. Provided...
These OWL Classes might include information about thecurrent trajectory of your treatment for a chronic disease,for exampl...
Frankly, I think it’s quite cool that “people” are creating and running personalized   workflows at the touch of a button...
...as many of you know, that has been my    dream since I started studying this         problem a decade ago!
Final thoughts
An experiment... based on a hypothesis
An experiment... based on a hypothesis        now modeled in OWL               ...               ...
Does this OWL Class represent a Hypothesis?               I think it does!                  ...                  ...
I believe that we will soon show                 using SADI + SHARE            that we can model a non-trivial            ...
Ontology = Hypothesis = Query = Workflow [= Materials and Methods ]             Most of your publication is done!      All...
Please join us!SADI and SHARE are Open-Source projects       http://sadiframework.org
My New Home!
University of British ColumbiaLuke McCarthy – Lead Dev.                  Edward KawasEverything...                        ...
C-BRASS Collaborators at other sitesU of New Brunswick      Carleton UniversityDr. Chris Baker         Dr. Michel Dumontie...
Microsoft Research
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services
Upcoming SlideShare
Loading in …5
×

Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services

1,195 views

Published on

Some of the recent work we've been doing with SADI and SHARE, using SHARE as a mechanism for dynamically converting OWL Classes into workflows in a data-dependent manner; OWL, in this case, is acting as an abstract workflow model. The slides in the middle are the usual SADI/SHARE explanation; the slides at the end show how we're using these dynamically generated workflows to "personalize" medical information on the Web for a particular patient's profile.

Published in: Technology, Education
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,195
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
12
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web Services

  1. 1. Carole Goble “Shopping for data should be as easy as shopping for shoes!!”
  2. 2. Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to choreograph SADI Semantic Web ServicesMark Wilkinson, Isaac Peral Senior Researcher in Biological InformaticsCentro de Biotecnología y Genómica de Plantas, UPM.
  3. 3. Start at the end...
  4. 4. We wanted to duplicatea real, peer-reviewed, bioinformatics analysis simply by providing a model describing what the answer (if one exists) would look like...
  5. 5. ...the machine had to make every other decision on it’s own
  6. 6. This is the study we chose: Gordon, P.M.K., Soliman, M.A., Bose, P., Trinh, Q., Sensen, C.W., Riabowol, K. Interspecies data mining to predict novel ING-protein interactions in human. BMC Genomics 9, 426 (2008).
  7. 7. Gordon, P.M.K., Soliman, M.A., Bose, P., Trinh, Q., Sensen, C.W., Riabowol, K.: Interspeciesdata mining to predict novel ING-protein interactions in human. BMC genomics. 9, 426 (2008).
  8. 8. Original Study - simplified and abstracted:Using what is known about interactions in other species, predict new interactions in your species of interest Given a protein P in Species X Find proteins similar to P in Species Y Retrieve interactors in Species Y Sequence-compare Y-interactors with Species X genome (1)  Keep only those with homologue in X Find proteins similar to P in Species Z Retrieve interactors in Species Z Sequence-compare Z-interactors with (1)  Putative interactors in Species X
  9. 9. For our prototype study, we simplified this further to: X 2 Then intersect
  10. 10. The tricky part is...In the abstract, the two workflows are identicalbut in reality they will be differentbecause they call for information from different species
  11. 11. Modeling the result – Step 1 OWL Web Ontology Language (OWL) is the language approved by the W3C for representing knowledge on the Web
  12. 12. Modeling the result – Step 2ProbableInteractor: is homologous to ( protein from ModelOrganism1…) # Potential Interactor in previous slide and protein from ModelOrganism2…) # Potential Interactor in previous slide Probable Interactor is defined in OWL as a subclass of Potential Interactor that requires homologous pairs of interacting proteins to exist in both comparator model organisms. (Effectively, an intersection)
  13. 13. We then publish our OWL model of a Probable Interactor on the Web In a local data-file we provide the protein we are interested in, and the two species we wish to use in our comparison taxon:9606 a i:OrganismOfInterest . # human uniprot:Q9UK53 a i:ProteinOfInterest . # ING1 taxon:4932 a i:ModelOrganism1 . # yeast taxon:7227 a i:ModelOrganism2 . # flyThese four lines represent all of the data provided to the query I am about to show you...
  14. 14. This is the question we ask:PREFIX i: <http://sadiframework.org/ontologies/InteractingProteins.owl#>SELECT ?proteinFROM <file:/local/workflow.input.n3>WHERE { ?protein a i:ProbableInteractor .} The reference to our OWL model of the answer
  15. 15. Our system then derives (and executes) the following workflow automatically These are different Web services! ...selected at run-time based on the same model
  16. 16. There are two very cool things about what you just saw...
  17. 17. There are two very cool things about what you just saw... The system was able to create a workflow based on an OWL model
  18. 18. There are two very cool things about what you just saw... The workflow it created (i.e. the services chosen) differed depending on context
  19. 19. We got the answer“simply” by designing a model of the answer!
  20. 20. Start at the beginning...
  21. 21. Semantic Automated Discovery and Integration A Semantic Web-focused Web Services specificationMicrosoftResearch http://sadiframework.org
  22. 22. Web Services vs.Semantic Web
  23. 23. Web ServicesXML + XML Schema Semantic Web RDF + OWL
  24. 24. Web ServicesPOST of SOAPSemantic Web GET of RDF
  25. 25. Web ServicesNo (rigorous) semantics Semantic WebRich, flexible semantics
  26. 26. Web Services & Semantic WebFundamentally different Web technologies
  27. 27. A design-practice forWeb Service provision on the Semantic Web
  28. 28. 100% standards-compliantwith no “invented” standards
  29. 29. Lightweight(only 2 inter-related “rules”)
  30. 30. Rules come from observations:
  31. 31. SADI Observation #1:Web Services in Bioinformatics create implicit biological relationships between their input and output
  32. 32. SADI Observation #1: SeqRet
  33. 33. SADI Design Practice #1 Make the implicit explicit…A Web Service should create “triples” linking the input data to the output data thusexplicitly describing the semantic relationship between them
  34. 34. SADI Best Practice #1 This is what bioinformatics Web Services implicitly do anyway! Easy to implement this as a best-practice...and makes SADI Services a source of Linked Data
  35. 35. SADI Observation #2: HTTP GET and POST GET guaranteesthe response relates to the request URI in a very precise and predictable way POST does not…
  36. 36. SADI Observation #2: HTTP GET and POSTThat’s why Web Services have a fundamentally different behaviour than the Semantic Web
  37. 37. SADI Observation #2: GET and POST We can fix that! (without breaking any existing rules or standards!)
  38. 38. SADI Best Practice #2SUBJECT URI of the output graph (triples) is the same as SUBJECT URI of the input graph (triples)(the output is “about” the input... Now explicitly!)
  39. 39. SADI Best Practice #2 GeneID GeneIDrdf:type rdf:type BRCA1 BRCA1 hasDNASequence SeqRet AGCTTA...
  40. 40. ConsequenceWeb Services now exhibit a very similar behavior to the Web itself POST “behaves like” GET
  41. 41. SADI InterfacesService Interfaces can be described by two OWL classes (this is 100% compatible with the SAWSDL standard)
  42. 42. SADI InterfacesOWL Class #1: My Input Class
  43. 43. SADI InterfacesOWL Class #2: My Output Class
  44. 44. SADI Service Functionality Consumes OWL Individuals of Class #1 Returns OWL Individuals of Class #2but the URI of those two GeneID GeneIDindividuals is the same; they are rdf:type rdf:type BRCA1 BRCA1the same individual, just now a hasDNASequencemember of a new class. SeqRet AGCTTA...
  45. 45. In practice, of course, you don’t return the input dataStrip it and add the new data provided by the ServiceBut since the output is still “rooted” in the input node, Input and output are easily merged client-side (just concatenate the output with the input)
  46. 46. Service Description INPUT OWL Class NamedIndividual: things with a “name” property from “foaf” ontology OUTPUT OWL Class GreetedIndividual: things with a “greeting” property from “hello” ontology POST http://example.org/myservice person:1 foaf:name rdf:type hello:Named Guy Incognito Individual person:1 hello:greeting rdf:type hello:GreetedHello, Guy Incognito! Individual
  47. 47. How do we discover services?Input and output are about the same “thing”Therefore, to describe what a service does simply compare (“diff”) the Input and Output OWL classes This is not prescriptive! Just how we use it
  48. 48. Service Description INPUT OWL Class NamedIndividual: things with a “name” property from “foaf” ontology OUTPUT OWL Class GreetedIndividual: things with a “greeting” property from “hello” ontologyThe service provides POST http://example.org/myservicea “greeting” to aNamed Individual person:1based on its “name” foaf:name rdf:type hello:Named Guy Incognito Individual person:1 hello:greeting rdf:type hello:Greeted Hello, Guy Incognito! Individual
  49. 49. Service Discovery Index all of the properties added by all of the services under all circumstances
  50. 50. Real-world ExampleInput Data: BRCA1 rdf:type Gene IDOutput Data: BRCA1 hasDNASequence AGCTTAGCCA…Registry Index: Service provides “hasDNASequence” property to Gene IDs
  51. 51. Service DiscoverySimply search for the property of interest based on the data in-hand
  52. 52. e.g. The question:“what is the DNA sequence of BRCA1?”Discover a SADI Web Service that generates the DNA Sequence property for gene identifiers
  53. 53. DEMO Knowledge Explorer Plug-inFor more information about the Knowledge Explorer surf to: http://io-informatics.com
  54. 54. SADI has just filled-in “Encodes” property for the three genes from the output of discovered Web Service(s)
  55. 55. Discover services that provide the hasGOTerm property for Protein Sequence datatype
  56. 56. This kind of “Web Service Surfing” is very intuitive for the Biologist!No need to describe the algorithm or the database just describe the properties that will be added
  57. 57. Semantic Health And Research EnvironmentSHARE answers arbitrary SPARQL queries by finding and executing SADI Services
  58. 58. Example #1 What is the phenotype of every allele of the Antirrhinum majus DEFICIENS geneSELECT ?allele ?image ?descWHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc}
  59. 59. Example #1 What is the phenotype of every allele of the Antirrhinum majus DEFICIENS geneSELECT ?allele ?image ?descWHERE { locus:DEF genetics:hasVariant ?allele . ?allele info:visualizedByImage ?image . ?image info:hasDescription ?desc} Note that there is no “FROM” clause! We don’t tell it where it should get the information, The machine has to figure that out by itself...
  60. 60. Enter that query into SHARE
  61. 61. Click “Submit”...
  62. 62. ...and in a few seconds you get your answer. Based on predicates in your query, SHARE utilized SADIto automatically discover the resources required to answer your question.
  63. 63. Because it is the Semantic WebThe query results are live hyperlinksto the respective Database or images
  64. 64. ImportantlyWe posed, and answered a complex database query WITHOUT A DATABASE
  65. 65. (in fact, the data didn’t even have to exist... as I’ll now show you)
  66. 66. Example #2 Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplantsSELECT ?patient ?bun ?creatFROM <http://sadiframework.org/ontologies/patients.rdf>WHERE { ?patient rdf:type patient:LikelyRejecter . ?patient l:latestBUN ?bun . ?patient l:latestCreatinine ?creat .}
  67. 67. Likely Rejecter:A patient who has creatinine levels that are increasing over time - - Wilkinson “MD”
  68. 68. Likely Rejecter:Our database contains variousblood chemistry measurements at various time-points
  69. 69. Likely Rejecter: …but there is no “likely rejecter”column or table in our database…
  70. 70. SHARE determines by itself the need to do a Linear Regression analysis overCreatinine blood chemistry measurements
  71. 71. SHARE determines by itselfhow and where that analysis can be done and does it
  72. 72. The SHARE system utilizes Semantics (via SADI) to discover and access analytical services on the Web that do linear regression analysis
  73. 73. VOILA!
  74. 74. How does SHARE work?
  75. 75. Ontologies
  76. 76. Ontology Spectrum Thesauri Frames Selected “narrower (Properties) LogicalCatalog/ term” Formal ConstraintsID relation is-a (disjointness, inverse, …) Terms/ Informal Formal General Value Logical glossary is-a instance Restrs. constraintsOriginally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty;– updated by McGuinness.Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
  77. 77. Ontology Spectrum BETTER!! Thesauri Frames Selected “narrower (Properties) LogicalCatalog/ term” Formal ConstraintsID relation is-a (disjointness, inverse, …) Terms/ Informal Formal General Value Logical glossary is-a instance Restrs. constraintsBasic SADI/SHARE functionality
  78. 78. Ontology Spectrum Because it fulfils XYZ Thesauri WHY? Frames Selected “narrower (Properties) LogicalCatalog/ term” Formal ConstraintsID relation is-a (disjointness, inverse, …) Terms/ Informal Formal General Value Logical glossary is-a instance Restrs. constraints Because I say so! Why is this data a member of this OWL Class? (and therefore valid input to the service)
  79. 79. Discovery systems - flexible Selected Thesauri LogicalCatalog/ “narrower term” Formal Frames Constraints (disjointness,ID (Properties) relation is-a inverse, …) Terms/ Informal Formal Value General Logical glossary is-a instance Restrs. constraints Categorization Systems – like library shelves, inflexible
  80. 80. In the upper end of the Ontology Spectrum, if the data has the right propertiesIt can be discovered to be a valid input to a Serviceregardless of how it was originally classified or which ontology was used for that classification
  81. 81. ...and in the context of a SHARE query those individual properties may have been aggregated from many different places;The data becomes a valid input as properties aggregate
  82. 82. In exactly the same way that the OWL propertyrestrictions of a SADI Input Class tell SHARE what properties a service requires as input The property restrictions of an OWL Class in the SPARQL query tell SHARE what properties itneeds to retrieve to create members of that class
  83. 83. Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplantsPREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>SELECT ?patient ?bun ?creatFROM <http://sadiframework.org/ontologies/patients.rdf>WHERE { ?patient rdf:type patient:LikelyRejecter . ?patient l:latestBUN ?bun . ?patient l:latestCreatinine ?creat .}
  84. 84. The definition of a Likely Rejecter category is encoded in a machine-readable document written in the OWL Ontology languageBasically: “the regression line over creatinine measurements should have an increasing slope”
  85. 85. SHARE burrows down through the ontological definitionto learn about what the properties of “regression models” are
  86. 86. SHARE utilizes SADI to discover analytical services on the Web that do linear regression analysis then other services that can determine the “latest” measurements from a time-series
  87. 87. Let’s go through that again from a different perspective
  88. 88. OWL Classes that include property restrictions can be “executed” as if they were workflows
  89. 89. Ontology = Query = Workflow
  90. 90. WORKFLOW
  91. 91. QUERY: SELECT images ofmutations from genes in organism XXX thatshare homology to this gene in organism YYY
  92. 92. Concept:“Homologous Mutant Image”
  93. 93. As OWL Axioms HomologousMutantImage is owl:equivalentTo { Gene Q hasImage image PGene Q hasSequence Sequence QGene R hasSequence Sequence RSequence Q similarTo Sequence R Gene R = “my gene of interest” }
  94. 94. Those axioms combine to create an OWL Class: Homologous Mutant Image
  95. 95. QUERY: Retrieveowl:HomologousMutant Image for gene XXX
  96. 96. SHARE: Decomposes theowl:Homologous Mutant ImageClass, discovers SADI services relevant to that class, andpipelines them together into a workflow
  97. 97. “The user experiments showed thatworkflow re-use… is difficult for bioinformaticians.” - Gooderis, A. (2008) Ph.D. Thesis
  98. 98. Under slightly different circumstances(e.g. studying the same phenomenon in a different organism) this workflow WILL NOT SOLVE THE PROBLEMIt must be edited on a case-by-case basis This editing turns out to be extremelydifficult, and can ~only be done manually
  99. 99. This great idea would be even betterif workflows were a little bit easier to re-purpose!
  100. 100. With SHARE, a workflow is generateddynamically, based on all of the information presented in the querye.g. to create a new workflow simply specify a different organism ID in the query!
  101. 101. With SHARE, a workflow is generateddynamically, based on all of the information presented in the query Moreover, the workflow plan CHANGES dynamically based on service outputs!
  102. 102. With SHARE, the ontology is the workflow (not the same as “an ontology that describes workflows”) The ontology acts as an abstraction of a workflow, which is concretized at run-time based on circumstances of the query
  103. 103. Works (best) with ontologies in the “Frames+” part of the spectrum, if there are SADI services available Selected Thesauri Logical “narrower Constraints Formal Frames Catalog/ term” (disjointness, is-a (Properties) inverse, …) ID relation Terms/ Informal Formal Value General Logical glossary is-a instance Restrs. constraints
  104. 104. As far as we are aware, SHARE is the only system that exhibits this particular behaviour ...and IMO this is a pretty big deal...
  105. 105. Gordon, P.M.K., Soliman, M.A., Bose, P., Trinh, Q., Sensen, C.W., Riabowol, K.: Interspeciesdata mining to predict novel ING-protein interactions in human. BMC genomics. 9, 426 (2008).
  106. 106. That experiment can now be represented ONCE as an OWL Class It becomes concretized automatically for each individual researcheras a distinct workflow given any starting protein and any combination of comparator species
  107. 107. This is far beyond simply changing the parameters entered into a workflow...
  108. 108. Moreover, the idea of automated workflow “individuality” is quite interesting to us...
  109. 109. This is an early prototype of aPatient-driven Personalized Medicine Web interface
  110. 110. Matching based on official name, compound name,brand name, trade name, or “common name” 
  111. 111. Still needs some work... ??!?!?
  112. 112. Why the alert? Link out to PubMed
  113. 113. The SADI+SHARE workflow and reasoning waspersonalized to YOUR medical data
  114. 114. In future iterations, we will enable the workflowto be further customized through “personalized”OWL Classes (e.g. Provided by your Clinician!!)
  115. 115. These OWL Classes might include information about thecurrent trajectory of your treatment for a chronic disease,for example, such that what you read on the Web isplaced in the context of your expert Clinical care...
  116. 116. Frankly, I think it’s quite cool that “people” are creating and running personalized workflows at the touch of a button...
  117. 117. ...as many of you know, that has been my dream since I started studying this problem a decade ago!
  118. 118. Final thoughts
  119. 119. An experiment... based on a hypothesis
  120. 120. An experiment... based on a hypothesis now modeled in OWL ... ...
  121. 121. Does this OWL Class represent a Hypothesis? I think it does! ... ...
  122. 122. I believe that we will soon show using SADI + SHARE that we can model a non-trivial hypothetical biological scenario then evaluate if that hypothesis is supported or notbased on whether the automatically-synthesized workflow returns any individuals that conform to the model.
  123. 123. Ontology = Hypothesis = Query = Workflow [= Materials and Methods ] Most of your publication is done! All you need to do now is interpret the results! These can be automatically derived through provenance information during workflow execution
  124. 124. Please join us!SADI and SHARE are Open-Source projects http://sadiframework.org
  125. 125. My New Home!
  126. 126. University of British ColumbiaLuke McCarthy – Lead Dev. Edward KawasEverything... SADI Service auto-generatorBenjamin VanderValk Ian WoodSHARE & SADI & Experimental modeling & Experimental modeling projectmyHeath ButtonSoroush SamadianCardiovascular data modeling and queries
  127. 127. C-BRASS Collaborators at other sitesU of New Brunswick Carleton UniversityDr. Chris Baker Dr. Michel DumontierAlexandre Riazanov Marc-Alexandre Nolin Leonid Chepelev Steve Etlinger Nichaella Kieth Jose Cruz
  128. 128. Microsoft Research

×