Molecular symmetry and specialization of atomic connectivity
by class-based reasoning of chemical structure

Michel Dumontier, Ph.D.

Associate Professor of Bioinformatics
Department of Biology, School of Computer Science, Institute of Biochemistry, Carleton University
Ottawa Institute of Systems Biology
Ottawa-Carleton Institute of Biomedical Engineering
Professeur Associé, Université Laval

1                                                                           OWLED2012::Dumontier
chemical structure:
    molecules consist of atoms connected by bonds

    [Figure: structure of caffeine, with a legend for carbon, hydrogen,
    nitrogen and oxygen atoms and for single and double bonds]




First attempt: class-based representation of chemical functional groups




    HydroxylGroup equivalentTo:
    CarbonGroup that (hasSingleBondWith some (
           OxygenAtom that hasSingleBondWith some HydrogenAtom))




              Describing chemical functional groups in OWL-DL for the classification of chemical compounds.
            Natalia Villanueva-Rosales and Michel Dumontier. OWL: Experiences and Directions (OWLED 2007).
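
The HydroxylGroup expression above reads as a tree-shaped pattern rooted at a carbon: carbon, single bond, oxygen, single bond, hydrogen. As a rough illustration only, the same pattern can be tested on an explicit molecular graph. The sketch below is plain Python, not OWL reasoning; the dictionary layout and the name `hydroxyl_carbons` are assumptions made for this example.

```python
# Hypothetical toy representation: an element symbol per atom id, and
# single bonds stored as an adjacency map (atom id -> set of atom ids).
# Ethanol skeleton with only the hydroxyl hydrogen shown.
ETHANOL_ELEMENTS = {1: "C", 2: "C", 3: "O", 4: "H"}
ETHANOL_SINGLE_BONDS = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}


def hydroxyl_carbons(elements, single_bonds):
    """Return ids of carbons single-bonded to an O that is single-bonded to an H."""
    matches = set()
    for atom, element in elements.items():
        if element != "C":
            continue
        for neighbour in single_bonds.get(atom, ()):
            if elements[neighbour] != "O":
                continue
            # The oxygen must itself carry a single-bonded hydrogen.
            if any(elements[n] == "H" for n in single_bonds.get(neighbour, ())):
                matches.add(atom)
    return matches


print(hydroxyl_carbons(ETHANOL_ELEMENTS, ETHANOL_SINGLE_BONDS))
```

The search is anchored at the carbon, so the choice of starting atom is built into the pattern.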


automatic classification of chemical functional groups

    [Figure: classification result; the only text recovered from the image is "28 OC"]




Problems

    1. Descriptions started at an arbitrary central
      atom, so all descriptions needed to
      “specialize these”
    2. It was not possible to describe chemical
      functional groups that are graph-like,
         e.g. ones that contain a cycle




OWL representation

    We really need to represent and reason over structured objects

               Without structure-based representation,
                 all parts must be explicitly asserted
            (combinatorial explosion for larger molecules)


       But the structure of complex molecules breaks the OWL
       tree-model requirement: such a structure
       does not have a model in the shape of a tree




Description Graphs

     • A decidable extension to OWL 2 that allows complex
       structures to be expressed as graphs within the ontology
     • Strong separation requirement: atomic properties used
       as graph edges must be different from those used in
       axioms in the main OWL ontology
     • Rules can be used to enhance OWL with the capacity to
       express if-then constructions
     • Using OWL, Description Graphs and Rules, we could
       represent and reason over (classify) chemical structures
       at the class level.

Representing Chemicals using OWL, Description Graphs and Rules. J Hastings, M Dumontier, D Hull, M Horridge,
                   C Steinbeck, U Sattler, R Stevens, T Horne, and K Britz. OWLED 2010.


OWL + DG + Rules = Chemical Classification

      [Figure: two panels labelled "Before" and "After"]




So, what can we do with just OWL?

    • generate connectivity descriptions from
      every atom to every other atom
      – overcomes the central atom problem
      – at the cost of an exponential part list
    • reason at different levels of granularity
      – we could describe atoms in terms of
        1. the types of atoms they are connected to
        2. the exact set of atoms they are connected to
        3. the only atoms they are connected to
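
The three levels of granularity listed above can be illustrated by rendering one atom's description three ways. This is a sketch under assumptions: the mapping of level 1 to `some`, level 2 to `exactly 1` and level 3 to `only` is one plausible reading of the list, and the class names (`carbon atom`, `atom N`) and the function `describe_atom` are hypothetical. The output is a set of Manchester-style strings, not a validated ontology.

```python
def describe_atom(atom, elements, bonds):
    """Render three hypothetical descriptions of one atom as class-expression strings."""
    neighbours = sorted(bonds[atom])
    neighbour_types = sorted({elements[n] for n in neighbours})
    return {
        # 1. the types of atoms it is connected to
        "types": " and ".join(
            f"`has bond with` some `{t} atom`" for t in neighbour_types
        ),
        # 2. the exact set of atoms it is connected to
        "exact": " and ".join(
            f"`has bond with` exactly 1 `atom {n}`" for n in neighbours
        ),
        # 3. the only atoms it is connected to (a closure restriction)
        "only": "`has bond with` only ("
        + " or ".join(f"`atom {n}`" for n in neighbours)
        + ")",
    }


# Carbon skeleton of iso-butane: atom 1 is the central carbon.
ISOBUTANE_BONDS = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
ISOBUTANE_ELEMENTS = {atom: "carbon" for atom in ISOBUTANE_BONDS}

for level, expression in describe_atom(1, ISOBUTANE_ELEMENTS, ISOBUTANE_BONDS).items():
    print(f"{level}: {expression}")
```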

Dataset




     A) butane, B) pentane, C) iso-butane, D) iso-pentane,
     E) cyclobutane and F) cyclohexane



Method

     SDF
      |   formalization: SDF2OWL, PHP-based, OWLAPI
      v
     OWL
      |   reasoning: Protégé 4.2, HermiT, Explanation Workbench
      v
     Inference

Formalization separates the chemical graph from the molecule

     `fully connected atom M`
       equivalentTo
        `atom type`
         and `has bond with` exactly 1 `fully connected atom N`
         and ...

     `atom X from molecule A`
       equivalentTo
        `fully connected atom M`
         and `is component part of` some `molecule Y`
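
To make the two-layer template above concrete, the sketch below generates both kinds of definition for every atom in a small molecule. It is not the SDF2OWL tool; it only emits strings that follow the template. The function name `formalize`, the naming scheme for atoms and the use of one molecule name in both places are assumptions for illustration.

```python
def formalize(molecule, elements, bonds):
    """Emit template-style class definitions for each atom of one molecule."""
    axioms = []
    for atom in sorted(bonds):
        graph_class = f"`fully connected atom {atom}`"
        # Layer 1: the atom as a node of the chemical graph.
        parts = [f"`{elements[atom]} atom`"]
        parts += [
            f"`has bond with` exactly 1 `fully connected atom {n}`"
            for n in sorted(bonds[atom])
        ]
        axioms.append(f"{graph_class} equivalentTo " + " and ".join(parts))
        # Layer 2: the same atom as a component part of a molecule.
        axioms.append(
            f"`atom {atom} from {molecule}` equivalentTo {graph_class} "
            f"and `is component part of` some `{molecule}`"
        )
    return axioms


# Carbon skeleton of iso-butane: atom 1 is the central carbon.
ISOBUTANE_GRAPH = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
ISOBUTANE_TYPES = {atom: "carbon" for atom in ISOBUTANE_GRAPH}

for axiom in formalize("iso-butane", ISOBUTANE_TYPES, ISOBUTANE_GRAPH):
    print(axiom)
```

Keeping the graph-level class separate from the molecule-level class means the connectivity definitions can be compared without reference to the molecule they came from.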



Symmetry

     [Figure: iso-butane, with central atom 1 bonded to atoms 2, 3 and 4]

     Atoms 2, 3 and 4 are equivalent, as
     every peripheral atom is connected to the central atom (1)
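
The equivalence described above can be approximated outside a reasoner by grouping atoms that have the same element and the same set of neighbours. This is a simplified stand-in that compares direct neighbour sets only; it is not what HermiT computes internally, and the function name `equivalent_atoms` is an assumption.

```python
from collections import defaultdict


def equivalent_atoms(elements, bonds):
    """Group atoms that share both element and exact neighbour set."""
    groups = defaultdict(list)
    for atom in sorted(bonds):
        groups[(elements[atom], frozenset(bonds[atom]))].append(atom)
    # Only groups with more than one member express a symmetry.
    return [group for group in groups.values() if len(group) > 1]


# Carbon skeleton of iso-butane: atom 1 is the central carbon.
ISOBUTANE = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
CARBONS = {atom: "C" for atom in ISOBUTANE}

print(equivalent_atoms(CARBONS, ISOBUTANE))  # [[2, 3, 4]]
```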




Symmetry

     [Figure: iso-pentane, with atom 1 bonded to atoms 2, 4 and 5, and atom 2 bonded to atom 3]

     • For iso-pentane, we get equivalence
       between atoms 4 & 5 because they are
       both connected to atom 1
     • We get a different relationship, one of
       subsumption, between atoms 2 and 4 and
       between atoms 2 and 5

Atomic specialization

     [Figure: iso-pentane, same atom numbering as above]




     Atom 2 has a bond to atom 1, as do atoms 4 and 5,
     but it also has a bond to atom 3, so its description
     is a specialization of theirs
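
Specialization can be sketched in the same simplified way: one atom specializes another when both have the same element and the first has a strictly larger neighbour set. As before, this compares direct neighbours only and is an illustration, not the reasoner's procedure; `specializations` is a hypothetical name.

```python
def specializations(elements, bonds):
    """Return (specific, general) pairs: same element, strictly more neighbours."""
    pairs = []
    for specific in sorted(bonds):
        for general in sorted(bonds):
            if (
                specific != general
                and elements[specific] == elements[general]
                and bonds[specific] > bonds[general]  # proper superset
            ):
                pairs.append((specific, general))
    return pairs


# Carbon skeleton of iso-pentane, numbered as in the figure.
ISOPENTANE = {1: {2, 4, 5}, 2: {1, 3}, 3: {2}, 4: {1}, 5: {1}}
ISOPENTANE_CARBONS = {atom: "C" for atom in ISOPENTANE}

print(specializations(ISOPENTANE_CARBONS, ISOPENTANE))  # [(1, 3), (2, 4), (2, 5)]
```

Under this test atom 1 also specializes atom 3, because atom 3's only neighbour (atom 2) is among atom 1's neighbours; the slides discuss only the pairs involving atom 2.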

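Symmetry in cyclobutane

     [Figure: cyclobutane ring, atoms numbered 1 to 4 in order around the ring]

     • Atoms 1 & 3 are equivalent, as both are connected
       to atoms 2 & 4; likewise, atoms 2 & 4 are equivalent,
       as both are connected to atoms 1 & 3.
     • Not all four atoms are equivalent to one another, however.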




But not in cyclohexane




     • No two atoms are connected to the same
       pair of atoms.
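
The contrast between the four-membered and six-membered rings can be checked with a short sketch that builds an n-membered ring and lists the atom pairs that share an identical neighbour set. The helper names `ring` and `shared_neighbour_pairs` are assumptions, and hydrogens are ignored.

```python
def ring(n):
    """Adjacency map for an n-membered ring with atoms numbered 1..n."""
    return {i: {(i - 2) % n + 1, i % n + 1} for i in range(1, n + 1)}


def shared_neighbour_pairs(bonds):
    """List pairs of distinct atoms that have exactly the same neighbours."""
    atoms = sorted(bonds)
    return [
        (a, b)
        for index, a in enumerate(atoms)
        for b in atoms[index + 1:]
        if bonds[a] == bonds[b]
    ]


print(shared_neighbour_pairs(ring(4)))  # [(1, 3), (2, 4)]
print(shared_neighbour_pairs(ring(6)))  # []
```

In a ring, atom i neighbours i-1 and i+1, so two different atoms can share both neighbours only when stepping two positions forward and two positions back lands on the same atom, which happens only for a four-membered ring.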



Conclusion

     • We investigated a class-based representation
       in which class descriptions consisted of fully
       qualified cardinality restrictions to other
       fully connected atoms.
     • We found instances of equivalence (symmetry)
       and specialization (additional bonding), all within
       a single molecule.
     • Next, we’ll be looking at reasoning across
       different molecules, but this requires some
       equivalence between atoms of different
       molecules.

dumontierlab.com
     michel_dumontier@carleton.ca
     Website: http://dumontierlab.com
     Presentations: http://slideshare.com/micheldumontier




