Evaluating scientific hypotheses using Semantic Web technologies
Alison Callahan
TAMALE Seminar
November 8, 2010

Why the Semantic Web?

PLAIN TEXT
What you see: “The weather today, November 8, will be cloudy with a high of 7°C”
What your computer sees: akfalksjdfoaohwoiehroe

XML
What you see:
    <weather>
       <date>
          <month>November</month>
          <day>8</day>
          <year>2010</year>
       </date>
       <temperature>
          <value>7</value>
          <unit>Celsius</unit>
       </temperature>
       <conditions>cloudy</conditions>
    </weather>
What your computer sees:
    <weather>
       <date>
          <month>November</month>
          <day>8</day>
          <year>2010</year>
       </date>
       <temperature>
          <value>5</value>
          <unit>Celsius</unit>
       </temperature>
       <condi>Flurries</condi>
    </weather>
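One way to read the XML comparison is that a program can follow the structure of the markup but attaches no meaning to the tag names. The sketch below is a minimal illustration of that reading using Python's standard xml.etree.ElementTree; the function and variable names are illustrative and do not come from the slides. A lookup for "conditions" succeeds on the first document and quietly fails on the second, where the same information sits under "condi".

```python
import xml.etree.ElementTree as ET

# The two weather documents from the slide, condensed onto fewer lines.
FIRST = """
<weather>
  <date><month>November</month><day>8</day><year>2010</year></date>
  <temperature><value>7</value><unit>Celsius</unit></temperature>
  <conditions>cloudy</conditions>
</weather>
"""

SECOND = """
<weather>
  <date><month>November</month><day>8</day><year>2010</year></date>
  <temperature><value>5</value><unit>Celsius</unit></temperature>
  <condi>Flurries</condi>
</weather>
"""


def read_weather(xml_text):
    """Pull a few fields out by tag path; a missing tag yields None."""
    root = ET.fromstring(xml_text.strip())
    return {
        "day": int(root.findtext("date/day")),
        "high": float(root.findtext("temperature/value")),
        "conditions": root.findtext("conditions"),
    }


first = read_weather(FIRST)
second = read_weather(SECOND)
print(first)
print(second)
```

The parser has no way to know that "condi" and "conditions" mean the same thing, which is the gap that shared vocabularies in RDF and OWL are meant to close.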
Source: http://www.webcitation.org/5u4OJ6rYe
RDF
Resource Description Framework: A W3C standard for representing resources and the relationships between them, and for data exchange on the WWW

Alison likes reading
    subject:   http://people.com#Alison
    predicate: http://feelings.com#likes
    object:    http://activities.com#reading

    <rdf:RDF>
        <rdf:Description rdf:about="http://people.com#Alison">
            <likes xmlns="http://feelings.com#" rdf:resource="http://activities.com#reading"/>
        </rdf:Description>
    </rdf:RDF>
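The mapping from the RDF/XML snippet to a subject, predicate, object triple can be sketched with the Python standard library. This is a minimal illustration that handles only the simple rdf:Description pattern shown on the slide, not RDF/XML in general. The xmlns:rdf declaration is added here as an assumption, since the slide's snippet omits it and an XML parser needs it; extract_triples is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's snippet, with the rdf namespace declared (an addition).
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://people.com#Alison">
    <likes xmlns="http://feelings.com#" rdf:resource="http://activities.com#reading"/>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for rdf:Description blocks."""
    root = ET.fromstring(rdf_xml.strip())
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes a namespaced tag as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
print(triples)
```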
[Diagram: RDF graph]
    Alison               rdf:type     foaf:person
    Alison               works at     Carleton University
    Carleton University  rdf:type     educational institution
    Alison               likes        reading
    Alison               has brother  Chris
    Chris                rdf:type     foaf:person

Querying RDF using SPARQL
SPARQL = SPARQL Protocol and RDF Query Language

    select ?s where {
        ?s rdf:type foaf:Person .
    }

Results:
    http://people.com#Alison
    http://people.com#Christopher
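What the query does can be sketched as matching one triple pattern against a list of triples. The code below is a toy matcher in plain Python, not a SPARQL engine; the in-memory data set and the match function are assumptions made for illustration. The rdf:type and foaf:Person IRIs are the standard ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

# Illustrative data set, chosen to match the results shown on the slide.
TRIPLES = [
    ("http://people.com#Alison", RDF_TYPE, FOAF_PERSON),
    ("http://people.com#Christopher", RDF_TYPE, FOAF_PERSON),
    ("http://people.com#Alison", "http://feelings.com#likes",
     "http://activities.com#reading"),
]


def match(triples, pattern):
    """Bind variables (terms starting with '?') for one triple pattern."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Equivalent in spirit to: select ?s where { ?s rdf:type foaf:Person . }
people = [b["?s"] for b in match(TRIPLES, ("?s", RDF_TYPE, FOAF_PERSON))]
print(people)
```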
OWL: The Web Ontology Language
OWL allows the representation of ontology concepts in a machine-understandable manner
[Diagram labels: Mother, Woman, hasChild, Person]

Biological sciences and the Semantic Web
http://bio2rdf.org
http://bioportal.bioontology.org

HyQue
Hypothesis-based query and evaluation tool
http://semanticscience.org/projects/hyque

Source: http://xkcd.com/242/
Source: http://kentsimmons.uwinnipeg.ca/cm1504/introscience.htm
Finding evidence to support/refute a hypothesis is becoming increasingly difficult
Source: http://upload.wikimedia.org/wikipedia/commons/2/26/EnwikipediaArt.PNG

HyBrow
Computationally augmented method for hypothesis evaluation developed by Racunas et al. [1]
  • minimum event-based vocabulary
  • uses consistency checking to evaluate hypotheses
  • constraints
  • rules
  • compares hypotheses using neighborhood functions
  • incremental hypothesis improvement
[1] Racunas S. A., Shah N. H., Albert I. and Fedoroff N. V. (2004). HyBrow: A prototype system for computer-aided hypothesis evaluation. Bioinformatics 20(Suppl. 1): i1-i8.

HyBrow
  • small, manually generated knowledge
  • hard-coded Perl rules
  • challenging to apply to a new domain
  • needs access to a greater KB

HyBrow → HyQue
Hypothesis query and evaluation system
  • Uses RDF/SPARQL/OWL
  • Background knowledge encoded as OWL ontologies
  • Queries Bio2RDF’s dedicated SPARQL endpoints
  • Context-specific rules that consider experimental conditions
  • HyQue consumes and produces RDF
Paper: Callahan, A., M. Dumontier & N. Shah. 2010. HyQue: Evaluating hypotheses using Semantic Web technologies. Bio-ontologies SIG, ISMB’10, Boston MA.
On the web: http://semanticscience.org/projects/hyque

HyQue is composed of …
  • HyQue hypothesis ontology
    Describes generic input hypothesis and output hypothesis evaluation classes
    Uses upper-level classes, e.g. ‘proposition’, ‘measurement value’, ‘event’
  • HyBrow SPARQL endpoint
  • SGD data in Bio2RDF
  • Template event-based SPARQL queries
  • GO, SO, ChEBI, ECO ontologies

A HyQue hypothesis is a collection of propositions
proposition: “a statement expressing something true or false”

HyQue hypotheses are composed of propositions connected using logical operators (AND, OR…)
HyQue hypothesis: ‘proposition’ that ‘has part’ some ‘hypothesis part’
Hypothesis part: ‘proposition’ that ‘has component’ some ‘event’

HyQue events
Events are composed of conditional assertions on a relation between ‘actor’ and ‘target’
    induces(actor, target, context, location)
For decidable logic (OWL), an n-ary object is used
    Event
        agent_a               → actor
        agent_b               → target
        perturbation_context  → context
        physical_location     → location
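The n-ary pattern can be sketched as a single event node that carries each argument of induces(actor, target, context, location) as its own property. The dataclass below is an illustration under assumptions: the Event class, its as_triples method, and the use of a "hybrow:" prefix on every property are hypothetical, while the four property names come from the slide and the example identifiers come from the hypothesis RDF shown later in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One n-ary event node; field names follow the slide's property labels."""
    event_type: str                              # the relation, e.g. a GO term
    agent_a: str                                 # the actor
    agent_b: str                                 # the target
    perturbation_context: Optional[str] = None   # the context, if any
    physical_location: Optional[str] = None      # the location, if any

    def as_triples(self, event_id):
        """Flatten the event into (subject, predicate, object) triples."""
        triples = [(event_id, "rdf:type", self.event_type)]
        for name in ("agent_a", "agent_b",
                     "perturbation_context", "physical_location"):
            value = getattr(self, name)
            if value is not None:
                # Prefixing every property with "hybrow:" is an assumption.
                triples.append((event_id, "hybrow:" + name, value))
        return triples


transport = Event("go:0006810", "sgd:Gal2p", "chebi:28260")
transport_triples = transport.as_triples(":e1")
print(transport_triples)
```

Reifying the event this way keeps every statement binary, which is what RDF and OWL require, while still tying all four arguments to the same event.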
HyQue Data … but first, a biology primer!
How are these processes regulated? Source: http://www.webcitation.org/5u4OelqJO

TAMALE Seminar: Evaluating scientific hypotheses using Semantic Web technologies

  • 39.
    Where do our cells get energy?
  • 40.
    HyQue data: the GAL gene network in yeast
    Genes that encode proteins that transport and metabolize galactose
  • 41.
    permease – gal2p – transports galactose into cells
  • 46.
    Regulation – whether the pathway is on or off
  • 50.
    Source: Ostergaard et al. (2000). Nature Biotechnology 18: 1283-1286
  • 51.
    HyQue data
    Experimentally determined interactions between the GAL proteins
    Properties of the genes that encode these proteins (SGD)
    Literature-based evidence (citations)
    Knowledge about cellular locations and events (GO)
    Types of evidence supporting these interactions (ECO)
  • 53.
    Individual events parsed from input hypothesis RDF

        hypothesis:h a hyque:Hypothesis ;
            a hyque:AND ;
            hyque:hasPart :p1 ;
            hyque:hasPart :p2 ;
            hyque:hasPart :p3 .
        :p1 a hyque:AND ;
            hyque:hasComponent :e1 ;
            hyque:hasComponent :e2 .
        :e1 a <http://bio2rdf.org/go:0006810> ;
            hybrow:is_negated "0"^^xsd:boolean ;
            hybrow:agent_a <http://bio2rdf.org/sgd:Gal2p> ;
            hybrow:agent_b <http://bio2rdf.org/chebi:28260> .
        :e2 a <http://bio2rdf.org/go:0005488> ;
            hybrow:is_negated "0"^^xsd:boolean ;
            hybrow:agent_a <http://bio2rdf.org/sgd:Gal3p> ;
            hybrow:agent_b <http://bio2rdf.org/sgd:Gal80p> .

    [Diagram: hypothesis, has part, hypothesis part 1, has component: “gal2p transports galactose”, “gal3p binds to gal80p”]
  • 54.
    Template SPARQL queries completed based on event properties

        :e1 a go:0006810 ;
            hybrow:is_negated "0" ;
            hybrow:agent_a sgd:Gal2p ;
            hybrow:agent_b chebi:28260 .

        construct { … } where {
            …
            ?event hybrow:is_negated ?negated .
            ?event hybrow:physical_operator ?physical_operator .
            ?event hybrow:agent_a <http://bio2rdf.org/sgd:Gal2p> .
            …
            ?event hybrow:agent_b <http://bio2rdf.org/chebi:28260> .
            ?actor semsci:isLocatedIn ?actor_gp_id_location .
            ?actor_gp_id_location rdf:type ?actor_location_type .
            ?actor semsci:hasFunction ?actor_gp_id_function .
            ?actor_gp_id_function rdf:type ?actor_function .
            …
        }
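The idea of completing a query template from an event's properties can be sketched with Python's string.Template. The slide elides most of the actual query, so the template below is a hypothetical stand-in rather than a reconstruction of HyQue's template; the PREFIX IRI is a placeholder, and complete_query and the dictionary keys are illustrative names. Only the event identifiers are taken from the slides.

```python
from string import Template

# Hypothetical stand-in template: the slide elides most of the real query.
# The PREFIX IRI is a placeholder, not the real hybrow namespace.
QUERY_TEMPLATE = Template("""\
PREFIX hybrow: <http://example.org/hybrow#>
CONSTRUCT { ?event ?property ?value }
WHERE {
  ?event a <$event_type> .
  ?event hybrow:agent_a <$agent_a> .
  ?event hybrow:agent_b <$agent_b> .
  ?event ?property ?value .
}""")


def complete_query(event):
    """Fill the template from an event's type, actor, and target IRIs."""
    return QUERY_TEMPLATE.substitute(
        event_type=event["type"],
        agent_a=event["agent_a"],
        agent_b=event["agent_b"],
    )


# Event :e1 from the input hypothesis (gal2p transports galactose).
e1 = {
    "type": "http://bio2rdf.org/go:0006810",
    "agent_a": "http://bio2rdf.org/sgd:Gal2p",
    "agent_b": "http://bio2rdf.org/chebi:28260",
}
query = complete_query(e1)
print(query)
```

string.Template is used here because its $name placeholders do not collide with SPARQL's ?variables or curly braces.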
  • 55.
    SPARQL query results retrieved

        hybrow_data:2c1789a3019fd2fe9843d507824fc591 rdf:type <http://bio2rdf.org/go:0044092> ;
            hybrow:is_negated "0" ;
            hybrow:agent_a sgd:Gal3p ;
            hybrow:agent_b sgd:Gal80p ;
            hybrow:actor_type <http://bio2rdf.org/chebi:36080> ;
            hybrow:target_type <http://bio2rdf.org/chebi:36080> ;
            hybrow:physical_context <http://bio2rdf.org/go:0005634> ;
            hybrow:physical_operator <http://bio2rdf.org/go:0005488> .
        hybrow_data:b09f7cc043201b47610c874499448a23 rdf:type <http://bio2rdf.org/go:0005488> ;
            hybrow:is_negated "0" ;
            hybrow:agent_a sgd:Gal3p ;
            hybrow:agent_b sgd:Gal80p ;
            hybrow:actor_type <http://bio2rdf.org/chebi:36080> ;
            hybrow:target_type <http://bio2rdf.org/chebi:36080> ;
            hybrow:physical_context <http://bio2rdf.org/go:0005634> ;
            hybrow:physical_operator <http://bio2rdf.org/go:0005488> .
  • 56.
    Query results evaluated based on rule sets
    ‘binding’ rule:
    Is event negated? If yes, subtract 2
    Is physical operator ‘binding’? If yes, add 1; if no, subtract 1
    Is actor of type ‘protein’ or ‘small molecule’? If yes, add 1; if of type ‘gene’, subtract 1
    Is target of type ‘protein’ or ‘small molecule’? If yes, add 1; if of type ‘gene’, subtract 1
    Does actor have known ‘binding’ function? If yes, add 1
    [Identifiers shown: GO:0005488, CHEBI:36080, SO:0000236]
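The ‘binding’ rule reads as a short scoring function. The sketch below follows the five checks in the order listed on the slide. It is an illustration under assumptions: the dictionary keys, the use of plain labels such as "protein" in place of ontology identifiers, and both example events are hypothetical, and the slide does not show how HyQue itself encodes the rule.

```python
def score_binding_event(event):
    """Apply the slide's 'binding' rule to one retrieved event."""
    score = 0
    # Is event negated? If yes, subtract 2.
    if event["is_negated"]:
        score -= 2
    # Is physical operator 'binding'? Add 1 if yes, subtract 1 if no.
    score += 1 if event["physical_operator"] == "binding" else -1
    # Actor and target: protein or small molecule adds 1, gene subtracts 1.
    for role in ("actor_type", "target_type"):
        if event[role] in ("protein", "small molecule"):
            score += 1
        elif event[role] == "gene":
            score -= 1
    # Does actor have a known 'binding' function? If yes, add 1.
    if "binding" in event.get("actor_functions", ()):
        score += 1
    return score


# Hypothetical inputs, loosely modelled on "gal3p binds to gal80p".
supported = {
    "is_negated": False,
    "physical_operator": "binding",
    "actor_type": "protein",
    "target_type": "protein",
    "actor_functions": ["binding"],
}
unsupported = {
    "is_negated": True,
    "physical_operator": "transport",
    "actor_type": "gene",
    "target_type": "protein",
}
print(score_binding_event(supported), score_binding_event(unsupported))
```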
  • 57.
    Result scores based on operators between events
    AND (:p1): Final score = e1 score + e2 score + e3 score + e4 score
    OR (:p2): Final score = maximum of e5, e6 or e7 scores

        :p1 a hyque:AND ;
            hyque:hasComponent :e1 ;
            hyque:hasComponent :e2 ;
            hyque:hasComponent :e3 ;
            hyque:hasComponent :e4 .
        :p2 a hyque:OR ;
            hyque:hasComponent :e5 ;
            hyque:hasComponent :e6 ;
            hyque:hasComponent :e7 .
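The two combination rules, a sum for AND and a maximum for OR, can be written as one small function. This is a minimal sketch; the function name and the example scores are made up for illustration.

```python
def score_proposition(operator, event_scores):
    """Combine event scores: AND sums them, OR takes the maximum."""
    if operator == "AND":
        return sum(event_scores)
    if operator == "OR":
        return max(event_scores)
    raise ValueError(f"unsupported operator: {operator}")


# Made-up event scores for :p1 (e1..e4) and :p2 (e5..e7).
p1_score = score_proposition("AND", [4, 2, -1, 3])
p2_score = score_proposition("OR", [1, 4, -2])
print(p1_score, p2_score)
```

Under AND, every component event contributes, so one poorly supported event lowers the total. Under OR, only the best-supported alternative counts.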
  • 61.
    HyQue as a SADI service
    SADI – Semantic Automated Discovery and Integration
    HyQue SADI ontology to describe input and output of service
  • 62.
    Users can post a hypothesis in RDF and receive the hypothesis evaluation in RDF

    ACKNOWLEDGEMENTS
    My supervisor: Dr. Michel Dumontier
    Project collaborator: Dr. Nigam Shah http://stanford.edu/~nigam
    The Dumontier Lab: http://dumontierlab.com
    Funding: NSERC to MD

Editor's Notes

  • #39 Take-home message: HyQue is a project that uses Semantic Web technologies to represent biological knowledge, and uses those representations to answer questions and do useful work.