Representing chemicals using OWL, Description Graphs and Rules

Janna Hastings, EBI, UK
Michel Dumontier, Carleton University, Canada
Duncan Hull, EBI, UK
Matthew Horridge, Manchester, UK
Christoph Steinbeck, EBI, UK
Ulrike Sattler, Manchester, UK
Robert Stevens, Manchester, UK
Tertia Hörne, University of South Africa
Katarina Britz, Meraka Institute, South Africa

OWLED 2010, San Francisco

Problem
  • We wish to represent and reason over structured objects, i.e. objects whose representation also contains their parts.

Chemical structures
  • Molecules consist of atoms connected by bonds.
  • [Figure: structure of caffeine, with labels for carbon, hydrogen, nitrogen and oxygen atoms and for single and double bonds]
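
The idea that a molecule is a set of atoms connected by bonds can be sketched as a small data structure. This is only an illustration: the class name `Molecule`, its methods and the choice of formic acid (HCOOH) as the example are assumptions made here, not something taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Hypothetical container: atoms keyed by id, bonds as (atom, atom, order)."""
    name: str
    atoms: dict = field(default_factory=dict)   # atom id -> element symbol
    bonds: list = field(default_factory=list)   # (atom id, atom id, bond order)

    def add_atom(self, atom_id, element):
        self.atoms[atom_id] = element

    def add_bond(self, a, b, order=1):
        # A bond may only connect atoms that already belong to the molecule.
        if a not in self.atoms or b not in self.atoms:
            raise ValueError("both atoms must be added before bonding them")
        self.bonds.append((a, b, order))

    def neighbours(self, atom_id):
        """Return the ids of all atoms bonded to the given atom."""
        result = []
        for a, b, _order in self.bonds:
            if a == atom_id:
                result.append(b)
            elif b == atom_id:
                result.append(a)
        return result


# Formic acid, H-C(=O)-O-H, used here as a small stand-in example.
formic_acid = Molecule("formic acid")
for atom_id, element in enumerate(["H", "C", "O", "O", "H"]):
    formic_acid.add_atom(atom_id, element)
formic_acid.add_bond(0, 1)      # H-C
formic_acid.add_bond(1, 2, 2)   # C=O
formic_acid.add_bond(1, 3)      # C-O
formic_acid.add_bond(3, 4)      # O-H
```

Calling `formic_acid.neighbours(1)` lists the three atoms attached to the carbon.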

Chemical ontology
  • A chemical ontology consists of chemical classes, which can be defined by parts of structures and/or properties of structures.
  • Carboxylic acid: a molecule that has as part some carboxy group.
  • Cyclic molecule: a molecule that has the property cyclic, i.e. a self-connected cyclic path exists through the molecule's atoms.

OWL representation
  • Without structure, all parts must be explicitly asserted (combinatorial explosion for larger molecules).
  • But the structure of complex molecules breaks the OWL tree model requirement: it does not have a model in the shape of a tree.
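
Why ring-containing molecules are not tree-shaped can be illustrated with a plain graph check. The sketch below is not an OWL reasoner and does not test the tree model property itself; it only detects whether an undirected bond graph contains a ring, using a union-find structure. The function name `has_ring` and the carbon-skeleton examples (hydrogens omitted) are assumptions for illustration.

```python
def has_ring(num_atoms, bonds):
    """Return True if the undirected bond graph contains a cycle.

    num_atoms: atoms are numbered 0 .. num_atoms - 1
    bonds: list of (atom, atom) pairs
    """
    parent = list(range(num_atoms))

    def find(i):
        # Follow parent links to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # a and b were already connected, so this bond closes a ring.
            return True
        parent[root_a] = root_b
    return False


# Carbon skeletons only; hydrogens are left out for brevity.
propane_skeleton = [(0, 1), (1, 2)]
cyclobutane_skeleton = [(0, 1), (1, 2), (2, 3), (3, 0)]

propane_is_cyclic = has_ring(3, propane_skeleton)
cyclobutane_is_cyclic = has_ring(4, cyclobutane_skeleton)
```

The chain can be laid out as a tree, whereas the four-membered ring cannot.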

Description Graphs
  • A recent, decidable extension to OWL 2, allowing expression of complex structures as graphs within the ontology.
  • A description graph consists of a set of labelled vertices and a set of directed edges.
  • Each description graph has a main class which links the graph to the main OWL ontology.
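
The three ingredients named above (labelled vertices, directed edges, a main class) can be sketched as a small Python structure. This is an informal illustration rather than the formal definition or any tool's API: the class `DescriptionGraph`, its field names and the water example are assumptions, and the encoding of bonds as vertices simply mirrors the `has_atom` / `has_bond` vocabulary used in the rules on later slides.

```python
from dataclasses import dataclass


@dataclass
class DescriptionGraph:
    """Hypothetical in-memory form of a description graph."""
    main_class: str        # class that links the graph to the main ontology
    vertex_labels: dict    # vertex id -> class name
    edges: list            # (source vertex, property name, target vertex)

    def __post_init__(self):
        # Every directed edge must start and end at a known vertex.
        for source, _prop, target in self.edges:
            if source not in self.vertex_labels or target not in self.vertex_labels:
                raise ValueError("edge refers to an unknown vertex")

    def edge_properties(self):
        """Return the set of property names used on the graph's edges."""
        return {prop for _source, prop, _target in self.edges}


# Water: vertex 0 is the molecule, 1-3 are atoms, 4-5 are bonds.
water = DescriptionGraph(
    main_class="water",
    vertex_labels={
        0: "water",
        1: "oxygen_atom",
        2: "hydrogen_atom",
        3: "hydrogen_atom",
        4: "single_bond",
        5: "single_bond",
    },
    edges=[
        (0, "has_atom", 1), (0, "has_atom", 2), (0, "has_atom", 3),
        (1, "has_bond", 4), (2, "has_bond", 4),
        (1, "has_bond", 5), (3, "has_bond", 5),
    ],
)
```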

Strong separation
  • In order to preserve decidability of knowledge bases enriched with description graphs, atomic properties used as graph edges have to be different to those used in axioms in the main OWL ontology.
  • This is known as the strong separation requirement.
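
Read purely syntactically, the requirement says that the set of property names used on graph edges and the set used in ontology axioms must not overlap. A minimal sketch of such a check follows; the function name and the ontology property names `has_part` and `has_role` are hypothetical and chosen only for the example.

```python
def separation_violations(graph_properties, ontology_properties):
    """Return the property names used both on graph edges and in ontology axioms.

    An empty result means the two vocabularies are disjoint.
    """
    return set(graph_properties) & set(ontology_properties)


graph_props = {"has_atom", "has_bond"}

# Disjoint vocabularies: nothing is reported.
ok = separation_violations(graph_props, {"has_part", "has_role"})

# Reusing has_atom in an ontology axiom would be reported.
clash = separation_violations(graph_props, {"has_part", "has_atom"})
```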

Rules
  • Enhance OWL with the capacity to express if-then constructions.
  • Consist of an 'antecedent' (if conditions) and a 'consequent' (then result).
  • Antecedent and consequent are composed of conjunctions of atomic statements.
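
The shape of such a rule (two conjunctions of atomic statements joined by an arrow) can be sketched as follows. The classes `Statement` and `Rule` and the example rule concluding `carbon_containing` are hypothetical; the printed form imitates the notation used on the later rule slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One atomic statement, e.g. has_atom(?x, ?a)."""
    predicate: str
    args: tuple

    def __str__(self):
        return f"{self.predicate}({', '.join(self.args)})"


@dataclass(frozen=True)
class Rule:
    """A rule: conjunction of statements -> conjunction of statements."""
    antecedent: tuple
    consequent: tuple

    def __str__(self):
        body = ", ".join(str(s) for s in self.antecedent)
        head = ", ".join(str(s) for s in self.consequent)
        return f"{body} -> {head}"


# Hypothetical rule: a molecule with some carbon atom is carbon-containing.
rule = Rule(
    antecedent=(
        Statement("molecule", ("?x",)),
        Statement("has_atom", ("?x", "?a")),
        Statement("carbon_atom", ("?a",)),
    ),
    consequent=(Statement("carbon_containing", ("?x",)),),
)
rendered = str(rule)
```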

Goal
  • Can we represent chemical structures using OWL and Description Graphs?
  • Can we reason over the information encoded in chemical structures using OWL, Description Graphs and Rules?

OWL ontology

Chemical description graphs
  • Generated based on structures converted from a chemical database (ChEBI).

Rules
  • Generated for properties, e.g. being cyclic:

      molecule(?x),
      atom(?a1), atom(?a2), atom(?a3), atom(?a4),
      bond(?b1), bond(?b2), bond(?b3), bond(?b4),
      has_atom(?x, ?a1), has_atom(?x, ?a2), has_atom(?x, ?a3), has_atom(?x, ?a4),
      has_bond(?a1, ?b1), has_bond(?a1, ?b4),
      has_bond(?a2, ?b1), has_bond(?a2, ?b2),
      has_bond(?a3, ?b2), has_bond(?a3, ?b3),
      has_bond(?a4, ?b3), has_bond(?a4, ?b4)
      -> cyclic_entity(?x)

  • [Figure: cyclobutane and tetrahedrane]
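
The pattern in this rule (four atoms where each consecutive pair, and the last and first, share a bond) can be checked by brute force in Python. This is a sketch and not how a reasoner evaluates the rule. Two assumptions are made that the rule text itself does not state: the four atoms are required to be distinct, and every bond is taken to join exactly two atoms. The function name `has_four_ring` and the carbon-skeleton inputs (hydrogens omitted) are also illustrative choices.

```python
from itertools import permutations


def has_four_ring(atoms, has_bond):
    """Check the four-membered-ring pattern for one molecule.

    atoms: iterable of atom ids belonging to the molecule
    has_bond: dict mapping atom id -> set of bond ids the atom takes part in
    """
    def shared(a, b):
        # Bonds that both atoms participate in.
        return has_bond.get(a, set()) & has_bond.get(b, set())

    # permutations() only yields tuples of distinct atoms (an assumption here).
    for a1, a2, a3, a4 in permutations(atoms, 4):
        ring = [shared(a1, a2), shared(a2, a3), shared(a3, a4), shared(a4, a1)]
        if all(ring):  # every consecutive pair shares at least one bond
            return True
    return False


# Cyclobutane carbon skeleton: c1-c2-c3-c4-c1.
cyclobutane = {
    "c1": {"b1", "b4"},
    "c2": {"b1", "b2"},
    "c3": {"b2", "b3"},
    "c4": {"b3", "b4"},
}

# Butane carbon skeleton: c1-c2-c3-c4, an open chain.
butane = {
    "c1": {"b1"},
    "c2": {"b1", "b2"},
    "c3": {"b2", "b3"},
    "c4": {"b3"},
}

cyclobutane_matches = has_four_ring(cyclobutane.keys(), cyclobutane)
butane_matches = has_four_ring(butane.keys(), butane)
```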

Rules
  • Generated for parthood, e.g. carboxylic acid:

      molecule(?y),
      atom(?a0), oxygen_atom(?a1), carbon_atom(?a2), oxygen_atom(?a3),
      has_atom(?y, ?a0), has_atom(?y, ?a1), has_atom(?y, ?a2), has_atom(?y, ?a3),
      double_bond(?b0), single_bond(?b1), single_bond(?b2),
      has_bond(?a0, ?b2), has_bond(?a1, ?b1),
      has_bond(?a2, ?b0), has_bond(?a2, ?b1), has_bond(?a2, ?b2),
      has_bond(?a3, ?b0)
      -> carboxylic_acid(?y)

  • [Figure: carboxylic acid and benzoic acid]
  • Benzoic acid has this part, so it is a carboxylic acid.
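
Read as a structural pattern, the rule asks for a carbon atom with a double bond to one oxygen, a single bond to another oxygen and a single bond to some further atom. A minimal Python sketch of that check follows. It is not the reasoner's procedure; it assumes the matched atoms are distinct, uses a hypothetical function name, and tests on formic acid and methanol as small stand-ins rather than the benzoic acid shown on the slide.

```python
def has_carboxy_pattern(elements, bonds):
    """Look for C with =O, -O and one further single-bonded atom.

    elements: dict atom id -> element symbol
    bonds: list of (atom id, atom id, bond order)
    """
    # Build an adjacency list of (neighbour, bond order) pairs.
    neighbours = {}
    for a, b, order in bonds:
        neighbours.setdefault(a, []).append((b, order))
        neighbours.setdefault(b, []).append((a, order))

    for carbon, element in elements.items():
        if element != "C":
            continue
        nbrs = neighbours.get(carbon, [])
        double_o = [n for n, o in nbrs if o == 2 and elements[n] == "O"]
        single_o = [n for n, o in nbrs if o == 1 and elements[n] == "O"]
        single_any = [n for n, o in nbrs if o == 1]
        for a3 in double_o:
            for a1 in single_o:
                for a0 in single_any:
                    # Assumption: the three neighbours must be different atoms.
                    if len({a0, a1, a3}) == 3:
                        return True
    return False


# Formic acid, H-C(=O)-O-H.
formic_elements = {0: "H", 1: "C", 2: "O", 3: "O", 4: "H"}
formic_bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1), (3, 4, 1)]

# Methanol, CH3-OH: no C=O, so the pattern should not match.
methanol_elements = {0: "C", 1: "O", 2: "H", 3: "H", 4: "H", 5: "H"}
methanol_bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (1, 5, 1)]

formic_matches = has_carboxy_pattern(formic_elements, formic_bonds)
methanol_matches = has_carboxy_pattern(methanol_elements, methanol_bonds)
```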

Testing the reasoning
  • Can we use a reasoner to deduce the classification hierarchy based on the graphs and rules?
  • No asserted hierarchy between test classes and molecules with generated graphs.

Results
  • Inferred hierarchy shows classified molecules.

Testing the performance
  • How many molecules (description graphs) can we include in our knowledge base?
  • How does the reasoning task (classification) scale with respect to the number of graphs, both with and without rules in the knowledge base?

Results

Experiences
  • Difficult to debug.
  • Tool support needs to be improved.
  • Difficult to construct rules for properties which depend on all atoms or all bonds in a given molecule, e.g. saturated -> all bonds in the molecule are single.
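
For contrast, when the complete bond list of a molecule is available as ordinary data, a condition over all bonds is a one-line check. The sketch below makes that closed-world assumption explicitly, which is not the setting of the rules described earlier, where antecedents are conjunctions of atomic statements and have no direct way to say "every bond". The function name and the ethane and ethene bond lists are illustrative.

```python
def is_saturated(bonds):
    """Return True if every listed bond is a single bond.

    bonds: list of (atom, atom, bond order); the list is assumed to be
    complete, i.e. the molecule has no bonds other than those given.
    """
    return all(order == 1 for _a, _b, order in bonds)


ethane = [
    ("C1", "C2", 1),
    ("C1", "H1", 1), ("C1", "H2", 1), ("C1", "H3", 1),
    ("C2", "H4", 1), ("C2", "H5", 1), ("C2", "H6", 1),
]
ethene = [
    ("C1", "C2", 2),
    ("C1", "H1", 1), ("C1", "H2", 1),
    ("C2", "H3", 1), ("C2", "H4", 1),
]

ethane_saturated = is_saturated(ethane)
ethene_saturated = is_saturated(ethene)
```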

Conclusion
  • Using OWL, Description Graphs and Rules we can represent chemical structures at the class level in our knowledge base and reason over the structural information.
  • Scalability of the reasoning with the rules is a concern.

Acknowledgements
  • Special thanks to Kirill Degtyarenko, Stefan Schulz, Colin Batchelor, Birte Glimm and the ChEBI team.
  • Funding: Meraka Institute, South Africa; BBSRC (BB/G022747/1); NSERC Discovery Grant.
