
TAMALE Seminar: Evaluating scientific hypotheses using Semantic Web technologies

This presentation was given as part of the University of Ottawa TAMALE group seminar series: http://tamale.uottawa.ca

Published in: Education, Technology
  • Take home message: HyQue uses Semantic Web technologies to represent biological knowledge, and uses those representations to answer questions and do useful work.

    1. Evaluating scientific hypotheses using Semantic Web technologies
       Alison Callahan
       TAMALE Seminar
       November 8, 2010
    2. Why the semantic web?
       PLAIN TEXT
       What you see: “The weather today, November 8, will be cloudy with a high of 7°C”
       What your computer sees: akfalksjdfoaohwoiehroe
       XML
       What you see:
       <weather>
         <date>
           <month>November</month>
           <day>8</day>
           <year>2010</year>
         </date>
         <temperature>
           <value>7</value>
           <unit>Celsius</unit>
         </temperature>
         <conditions>cloudy</conditions>
       </weather>
       What your computer sees:
       <weather>
         <date>
           <month>November</month>
           <day>8</day>
           <year>2010</year>
         </date>
         <temperature>
           <value>5</value>
           <unit>Celsius</unit>
         </temperature>
         <condi>Flurries</condi>
       </weather>
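The contrast on this slide can be made concrete with a few lines of Python. This is only a sketch using the standard library's `xml.etree.ElementTree`; the function name `read_forecast` and the dictionary keys are illustrative choices, not part of the talk. It shows that a program can pull typed values out of the tagged version, which is not possible with the free-text sentence.

```python
import xml.etree.ElementTree as ET

# The "what you see" XML from the slide, compacted onto fewer lines.
WEATHER_XML = """
<weather>
  <date><month>November</month><day>8</day><year>2010</year></date>
  <temperature><value>7</value><unit>Celsius</unit></temperature>
  <conditions>cloudy</conditions>
</weather>
"""

def read_forecast(xml_text):
    """Extract the tagged fields of the weather document into a dict."""
    root = ET.fromstring(xml_text)
    return {
        "month": root.findtext("date/month"),
        "day": int(root.findtext("date/day")),
        "year": int(root.findtext("date/year")),
        "high": int(root.findtext("temperature/value")),
        "unit": root.findtext("temperature/unit"),
        "conditions": root.findtext("conditions"),
    }

forecast = read_forecast(WEATHER_XML)
print(forecast)
```

The tags give the program structure, but nothing here tells it what "temperature" means; that gap is what the RDF and OWL slides go on to address.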
    4. Source: http://www.webcitation.org/5u4OJ6rYe
    5. RDF
       Resource Description Framework: a W3C standard for representing resources and the relationships between them, and for data exchange on the WWW
       Alison                    likes                      reading
       subject                   predicate                  object
       http://people.com#Alison  http://feelings.com#likes  http://activities.com#reading
       <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
         <rdf:Description rdf:about="http://people.com#Alison">
           <likes xmlns="http://feelings.com#"
                  rdf:resource="http://activities.com#reading"/>
         </rdf:Description>
       </rdf:RDF>
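To show how the RDF/XML on this slide corresponds to the subject, predicate, object row above it, here is a minimal sketch that reads the snippet with Python's standard `xml.etree.ElementTree`. It assumes the `rdf` namespace is declared on the root element, and the helper `triples_from_rdfxml` is a hypothetical name. It handles only this one simple shape (`rdf:Description` with `rdf:resource` properties) and is not a general RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://people.com#Alison">
    <likes xmlns="http://feelings.com#"
           rdf:resource="http://activities.com#reading"/>
  </rdf:Description>
</rdf:RDF>
"""

def triples_from_rdfxml(xml_text):
    """Return (subject, predicate, object) IRIs for simple rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; the predicate IRI
            # is the namespace followed directly by the local name.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, namespace + local, obj))
    return triples

triples = triples_from_rdfxml(RDF_XML)
for triple in triples:
    print(triple)
```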
    6. RDF graph (diagram):
       Alison  rdf:type  foaf:Person
       Alison  works at  Carleton University
       Carleton University  rdf:type  educational institution
       Alison  likes  reading
       Alison  has brother  Chris
       Chris  rdf:type  foaf:Person
    7. Querying RDF using SPARQL
       SPARQL = SPARQL Protocol and RDF Query Language
       select ?s where {
         ?s rdf:type foaf:Person .
       }
       Results:
       http://people.com#Alison
       http://people.com#Christopher
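The query on this slide is a single triple pattern, and its effect can be imitated in plain Python. The sketch below is not a SPARQL engine: `match` is a hypothetical helper in which `None` plays the role of a variable such as `?s`, and the `ex:` identifiers are made up to stand in for the diagram labels on the previous slide.

```python
# A tiny in-memory graph: each entry is (subject, predicate, object).
TRIPLES = [
    ("http://people.com#Alison", "rdf:type", "foaf:Person"),
    ("http://people.com#Christopher", "rdf:type", "foaf:Person"),
    ("http://people.com#Alison", "http://feelings.com#likes",
     "http://activities.com#reading"),
    ("http://people.com#Alison", "ex:hasBrother", "http://people.com#Christopher"),
    ("ex:CarletonUniversity", "rdf:type", "ex:EducationalInstitution"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Equivalent in spirit to: select ?s where { ?s rdf:type foaf:Person . }
people = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="foaf:Person")]
print(people)
```

A real endpoint evaluates many such patterns together and joins them on shared variables; this sketch covers only the one-pattern case shown on the slide.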
    8. OWL: The Web Ontology Language
       OWL allows the representation of ontology concepts in a machine-understandable manner
       Diagram labels: Mother, Woman, hasChild, Person
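The slide text does not spell out how the four labels relate. One common reading, assumed here, is the textbook definition "a Mother is a Woman who hasChild some Person". The sketch below checks that definition over a toy dataset; all individuals and the function `is_mother` are hypothetical. It is a closed-world check, whereas OWL reasoners work under the open-world assumption, so it only illustrates the shape of the class definition.

```python
# Toy assertions: which classes each individual belongs to, and who has which child.
TYPES = {
    "alice": {"Woman", "Person"},
    "bob": {"Person"},
    "carol": {"Woman", "Person"},
}
HAS_CHILD = {"alice": ["bob"]}

def is_mother(individual):
    """Assumed definition: Mother = Woman and (hasChild some Person)."""
    is_woman = "Woman" in TYPES.get(individual, set())
    has_person_child = any(
        "Person" in TYPES.get(child, set())
        for child in HAS_CHILD.get(individual, [])
    )
    return is_woman and has_person_child

for name in TYPES:
    print(name, is_mother(name))
```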
    9. Biological sciences and the Semantic Web
       http://bio2rdf.org
       http://bioportal.bioontology.org
    10. HyQue
        Hypothesis-based query and evaluation tool
        http://semanticscience.org/projects/hyque
    11. Source: http://xkcd.com/242/
    12. Source: http://kentsimmons.uwinnipeg.ca/cm1504/introscience.htm
    13. Finding evidence to support or refute a hypothesis is becoming increasingly difficult
        Source: http://upload.wikimedia.org/wikipedia/commons/2/26/EnwikipediaArt.PNG
    14. HyBrow
        Computationally augmented method for hypothesis evaluation
          • developed by Racunas et al. [1]
          • minimum event-based vocabulary
          • uses consistency checking to evaluate hypotheses
          • constraints
          • rules
          • compares hypotheses using neighborhood functions
          • incremental hypothesis improvement
        [1] Racunas S. A., Shah N. H., Albert I. and Fedoroff N. V. (2004). HyBrow: A prototype system for computer-aided hypothesis evaluation. Bioinformatics 20(S. 1): i1-i8.
    21. HyBrow
          • small, manually generated knowledge base
          • hard-coded Perl rules
          • challenging to apply to a new domain
          • needs access to a greater KB
    24. HyBrow → HyQue
          • Hypothesis query and evaluation system
          • Uses RDF/SPARQL/OWL
          • Background knowledge encoded as OWL ontologies
          • Queries Bio2RDF’s dedicated SPARQL endpoints
          • Context-specific rules that consider experimental conditions
          • HyQue consumes and produces RDF
        Paper: Callahan, A., M. Dumontier & N. Shah. 2010. HyQue: Evaluating hypotheses using Semantic Web technologies. Bio-ontologies SIG, ISMB’10, Boston MA.
        On the web: http://semanticscience.org/projects/hyque
    30. HyQue is composed of …
          • HyQue hypothesis ontology
            Describes generic input hypothesis and output hypothesis evaluation classes
            Uses upper level classes e.g. ‘proposition’, ‘measurement value’, ‘event’
          • HyBrow SPARQL endpoint
          • SGD data in Bio2RDF
          • Template event-based SPARQL queries
          • GO, SO, ChEBI, ECO ontologies
    33. A HyQue hypothesis is a collection of propositions
          • proposition: “a statement expressing something true or false”
          • HyQue hypotheses are composed of propositions connected using logical operators (AND, OR…)
        HyQue hypothesis: ‘proposition’ that ‘has part’ some ‘hypothesis part’
        Hypothesis part: ‘proposition’ that ‘has component’ some ‘event’
    35. HyQue events
        Events are composed of conditional assertions on a relation between ‘actor’ and ‘target’
        induces(actor, target, context, location)
        For decidable logic (OWL), an n-ary object is used
        Event
          agent_a: actor
          agent_b: target
          perturbation_context: context
          physical_location: location
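The n-ary pattern on this slide turns a four-argument relation such as `induces(actor, target, context, location)` into one event object with a property per argument. A minimal sketch of that idea follows. The field names come from the slide, while the `Event` class, its `to_triples` method, the `hybrow:` prefix on the context and location properties, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Event:
    """One event node standing in for an n-ary relation."""
    event_type: str                               # the relation, e.g. a GO term
    agent_a: str                                  # actor
    agent_b: str                                  # target
    perturbation_context: Optional[str] = None    # context
    physical_location: Optional[str] = None       # location
    is_negated: bool = False

    def to_triples(self, event_id):
        """Flatten the event into binary (subject, predicate, object) statements."""
        triples = [
            (event_id, "rdf:type", self.event_type),
            (event_id, "hybrow:is_negated", self.is_negated),
            (event_id, "hybrow:agent_a", self.agent_a),
            (event_id, "hybrow:agent_b", self.agent_b),
        ]
        if self.perturbation_context is not None:
            triples.append((event_id, "hybrow:perturbation_context",
                            self.perturbation_context))
        if self.physical_location is not None:
            triples.append((event_id, "hybrow:physical_location",
                            self.physical_location))
        return triples

# Hypothetical example: a binding event between two proteins at a location.
binding = Event("go:0005488", "sgd:Gal3p", "sgd:Gal80p",
                physical_location="go:0005634")
event_triples = binding.to_triples(":e2")
for triple in event_triples:
    print(triple)
```

Because every argument hangs off the event node, optional arguments such as the context can simply be left out without changing the shape of the other statements.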
    36. HyQue data … but first, a biology primer!
    38. How are these processes regulated?
        Source: http://www.webcitation.org/5u4OelqJO
    39. Where do our cells get energy?
    40. HyQue data: the GAL gene network in yeast
          • Genes that encode proteins that transport and metabolize galactose
              • permease – gal2p – transports galactose into cells
              • galactokinase – gal1p
              • uridylyltransferase – gal7p
              • epimerase – gal10p
              • phosphoglucomutase – gal5p
          • Regulation – whether the pathway is on or off
              • gal3p
              • gal4p
              • gal80p
        Diagram label: enzymes that use galactose
    50. Source: Ostergaard et al. (2000). Nature Biotechnology 18: 1283 - 1286
    51. HyQue data
        Experimentally determined interactions between the GAL proteins
        Properties of the genes that encode these proteins (SGD)
        Literature-based evidence (citations)
        Knowledge about cellular locations and events (GO)
        Types of evidence supporting these interactions (ECO)
    52. How does HyQue work?
    53. Individual events parsed from input hypothesis RDF
        :h a hyque:Hypothesis ;
           a hyque:AND ;
           hyque:hasPart :p1 ;
           hyque:hasPart :p2 ;
           hyque:hasPart :p3 .
        :p1 a hyque:AND ;
           hyque:hasComponent :e1 ;
           hyque:hasComponent :e2 .
        :e1 a <http://bio2rdf.org/go:0006810> ;
           hybrow:is_negated "0"^^xsd:boolean ;
           hybrow:agent_a <http://bio2rdf.org/sgd:Gal2p> ;
           hybrow:agent_b <http://bio2rdf.org/chebi:28260> .
        :e2 a <http://bio2rdf.org/go:0005488> ;
           hybrow:is_negated "0"^^xsd:boolean ;
           hybrow:agent_a <http://bio2rdf.org/sgd:Gal3p> ;
           hybrow:agent_b <http://bio2rdf.org/sgd:Gal80p> .
        Diagram: hypothesis, has part, hypothesis part 1, has component, events
          :e1 gal2p transports galactose
          :e2 gal3p binds to gal80p
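The parsing step on this slide amounts to walking from the hypothesis to its parts and from each part to its events. The sketch below mirrors the slide's RDF in a plain Python dictionary rather than using an RDF library; the dictionary layout and the function `events_of` are assumptions for illustration. The parts `:p2` and `:p3` are referenced on the slide but not shown, so the walk skips them.

```python
# The slide's hypothesis, restated as nested dictionaries keyed by node id.
GRAPH = {
    ":h": {"types": ["hyque:Hypothesis", "hyque:AND"],
           "hasPart": [":p1", ":p2", ":p3"]},
    ":p1": {"types": ["hyque:AND"], "hasComponent": [":e1", ":e2"]},
    ":e1": {"type": "http://bio2rdf.org/go:0006810", "is_negated": False,
            "agent_a": "http://bio2rdf.org/sgd:Gal2p",
            "agent_b": "http://bio2rdf.org/chebi:28260"},
    ":e2": {"type": "http://bio2rdf.org/go:0005488", "is_negated": False,
            "agent_a": "http://bio2rdf.org/sgd:Gal3p",
            "agent_b": "http://bio2rdf.org/sgd:Gal80p"},
}

def events_of(graph, hypothesis_id):
    """Collect (event id, event description) pairs reachable from a hypothesis."""
    events = []
    for part_id in graph[hypothesis_id].get("hasPart", []):
        part = graph.get(part_id)
        if part is None:
            continue  # part is referenced but not defined in this excerpt
        for event_id in part.get("hasComponent", []):
            events.append((event_id, graph[event_id]))
    return events

parsed_events = events_of(GRAPH, ":h")
for event_id, event in parsed_events:
    print(event_id, event["agent_a"], event["agent_b"])
```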
    54. Template SPARQL queries completed based on event properties
        :e1 a go:0006810 ;
           hybrow:is_negated "0" ;
           hybrow:agent_a sgd:Gal2p ;
           hybrow:agent_b chebi:28260 .
        construct { … } where {
          …
          ?event hybrow:is_negated ?negated .
          ?event hybrow:physical_operator ?physical_operator .
          ?event hybrow:agent_a <http://bio2rdf.org/sgd:Gal2p> .
          …
          ?event hybrow:agent_b <http://bio2rdf.org/chebi:28260> .
          ?actor semsci:isLocatedIn ?actor_gp_id_location .
          ?actor_gp_id_location rdf:type ?actor_location_type .
          ?actor semsci:hasFunction ?actor_gp_id_function .
          ?actor_gp_id_function rdf:type ?actor_function .
          …
        }
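Completing a template query means substituting an event's actor and target into fixed query text. Below is a minimal sketch using Python's `string.Template`. The template is a shortened, hypothetical one written for this example; it does not reproduce the parts elided with "…" on the slide, and `complete_query` is an illustrative name. Only the expansion of `sgd:Gal2p` to `<http://bio2rdf.org/sgd:Gal2p>` is taken from the slide.

```python
from string import Template

BIO2RDF = "http://bio2rdf.org/"

# Hypothetical, shortened template; $agent_a and $agent_b are the slots to fill.
EVENT_QUERY = Template("""\
construct { ?event ?p ?o } where {
  ?event hybrow:is_negated ?negated .
  ?event hybrow:physical_operator ?physical_operator .
  ?event hybrow:agent_a <$agent_a> .
  ?event hybrow:agent_b <$agent_b> .
  ?event ?p ?o .
}""")

def complete_query(event):
    """Fill the template with the full IRIs of the event's actor and target."""
    return EVENT_QUERY.substitute(
        agent_a=BIO2RDF + event["agent_a"],
        agent_b=BIO2RDF + event["agent_b"],
    )

query = complete_query({"agent_a": "sgd:Gal2p", "agent_b": "chebi:28260"})
print(query)
```

`string.Template` is convenient here because SPARQL variables start with `?` and the query body uses braces, neither of which collides with the `$name` placeholder syntax.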
    55. SPARQL query results retrieved
        hybrow_data:2c1789a3019fd2fe9843d507824fc591
           rdf:type <http://bio2rdf.org/go:0044092> ;
           hybrow:is_negated "0" ;
           hybrow:agent_a sgd:Gal3p ;
           hybrow:agent_b sgd:Gal80p ;
           hybrow:actor_type <http://bio2rdf.org/chebi:36080> ;
           hybrow:target_type <http://bio2rdf.org/chebi:36080> ;
           hybrow:physical_context <http://bio2rdf.org/go:0005634> ;
           hybrow:physical_operator <http://bio2rdf.org/go:0005488> .
        hybrow_data:b09f7cc043201b47610c874499448a23
           rdf:type <http://bio2rdf.org/go:0005488> ;
           hybrow:is_negated "0" ;
           hybrow:agent_a sgd:Gal3p ;
           hybrow:agent_b sgd:Gal80p ;
           hybrow:actor_type <http://bio2rdf.org/chebi:36080> ;
           hybrow:target_type <http://bio2rdf.org/chebi:36080> ;
           hybrow:physical_context <http://bio2rdf.org/go:0005634> ;
           hybrow:physical_operator <http://bio2rdf.org/go:0005488> .
    56. Query results evaluated based on rule sets
        ‘binding’ rule:
          • Is event negated? If yes, subtract 2
          • Is physical operator ‘binding’? If yes, add 1; if no, subtract 1
          • Is actor of type ‘protein’ or ‘small molecule’? If yes, add 1; if of type ‘gene’, subtract 1
          • Is target of type ‘protein’ or ‘small molecule’? If yes, add 1; if of type ‘gene’, subtract 1
          • Does actor have known ‘binding’ function? If yes, add 1
        Identifiers shown: GO:0005488, CHEBI:36080, SO:0000236
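The ‘binding’ rule is a short additive checklist, which the following sketch transcribes step by step. Several things here are assumptions rather than facts from the slide: the mapping of GO:0005488 to ‘binding’, CHEBI:36080 to ‘protein’ and SO:0000236 to ‘gene’ is inferred from the identifiers listed next to the rule; `SMALL_MOLECULE` is a placeholder because no identifier for it appears; and the dictionary keys, including `actor_functions`, are illustrative names.

```python
BINDING = "go:0005488"             # assumed to denote 'binding'
PROTEIN = "chebi:36080"            # assumed to denote 'protein'
GENE = "so:0000236"                # assumed to stand for 'gene' in the rule
SMALL_MOLECULE = "small molecule"  # placeholder; no identifier given on the slide

def score_binding_event(event):
    """Apply the slide's 'binding' rule to one query result and return its score."""
    score = 0
    if event["is_negated"]:
        score -= 2
    score += 1 if event["physical_operator"] == BINDING else -1
    for role in ("actor_type", "target_type"):
        if event[role] in (PROTEIN, SMALL_MOLECULE):
            score += 1
        elif event[role] == GENE:
            score -= 1
    if BINDING in event.get("actor_functions", ()):
        score += 1
    return score

# Shaped like a protein-protein binding result; actor_functions is assumed.
supported = {"is_negated": False, "physical_operator": BINDING,
             "actor_type": PROTEIN, "target_type": PROTEIN,
             "actor_functions": [BINDING]}
# A hypothetical poorly supported case.
unsupported = {"is_negated": True, "physical_operator": "go:0006810",
               "actor_type": GENE, "target_type": GENE}

print(score_binding_event(supported), score_binding_event(unsupported))
```

Under this rule a single event scores between -5 and 4, so the two examples sit at the extremes.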
    57. Result scores based on operators between events
        :p1 a hyque:AND ;
           hyque:hasComponent :e1 ;
           hyque:hasComponent :e2 ;
           hyque:hasComponent :e3 ;
           hyque:hasComponent :e4 .
        Final score (AND) = e1 score + e2 score + e3 score + e4 score
        :p2 a hyque:OR ;
           hyque:hasComponent :e5 ;
           hyque:hasComponent :e6 ;
           hyque:hasComponent :e7 .
        Final score (OR) = maximum of e5, e6 or e7 scores
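The two formulas on this slide can be written as one small function: events joined by AND contribute the sum of their scores, and events joined by OR contribute the maximum. The function name and the event scores below are hypothetical; only the sum and maximum rules come from the slide.

```python
def proposition_score(operator, event_scores):
    """Combine event scores according to the proposition's logical operator."""
    if operator == "hyque:AND":
        return sum(event_scores)
    if operator == "hyque:OR":
        return max(event_scores)
    raise ValueError(f"unsupported operator: {operator}")

# Hypothetical scores for e1..e4 (joined by AND) and e5..e7 (joined by OR).
p1_score = proposition_score("hyque:AND", [4, 3, 2, 1])
p2_score = proposition_score("hyque:OR", [1, 5, -2])
print(p1_score, p2_score)
```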
    58. Hypothesis evaluation RDF generated
    61. HyQue as a SADI service
        SADI – Semantic Automated Discovery and Integration
          • HyQue SADI ontology to describe input and output of service
          • Users can post a hypothesis in RDF and receive the hypothesis evaluation in RDF
    62. Acknowledgements
        My supervisor: Dr. Michel Dumontier
        Project collaborator: Dr. Nigam Shah http://stanford.edu/~nigam
        The Dumontier Lab: http://dumontierlab.com
        Funding: NSERC to MD
    63. Questions?
