10. DEFINITIONS
dicarboxylic acid dianion: organic_molecular_entity and has_part exactly 2 acetate and has_charge value "-2"^int
flavonoid: organic_molecular_entity and has_skeleton some flavones
benzoquinones: organic_molecular_entity and ( has_part some 1,2-benzoquinone or has_part some 1,4-benzoquinone )
gamma-lactam: organic_molecular_entity and has_part some pyrrolidin-2-one and not ( has_part some succinimide )
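A minimal sketch of how definitions like these could be evaluated, assuming each structure has already been reduced to a count of the named parts it contains plus its net charge (the substructure matching itself is not shown). The helpers `some`, `exactly`, `charge`, `any_of`, `all_of` and `not_` are hypothetical names for illustration, not part of ChEBI or of any OWL library. The organic_molecular_entity test is assumed to be true, and the flavonoid definition is left out because it relies on has_skeleton.

```python
# Hypothetical evaluator for the definitions above.
# A structure is a dict: {"parts": {part_name: count}, "charge": int}.

def some(part):
    """has_part some <part>: at least one occurrence."""
    return lambda s: s["parts"].get(part, 0) >= 1

def exactly(n, part):
    """has_part exactly n <part>."""
    return lambda s: s["parts"].get(part, 0) == n

def charge(value):
    """has_charge value <value>; a missing charge is treated as 0."""
    return lambda s: s.get("charge", 0) == value

def any_of(*tests):
    return lambda s: any(t(s) for t in tests)

def all_of(*tests):
    return lambda s: all(t(s) for t in tests)

def not_(test):
    return lambda s: not test(s)

DEFINITIONS = {
    "dicarboxylic acid dianion": all_of(exactly(2, "acetate"), charge(-2)),
    "benzoquinones": any_of(some("1,2-benzoquinone"), some("1,4-benzoquinone")),
    "gamma-lactam": all_of(some("pyrrolidin-2-one"), not_(some("succinimide"))),
}

def classify(structure):
    """Return the names of all definitions the structure satisfies."""
    return sorted(name for name, test in DEFINITIONS.items() if test(structure))

# Part counts below are assumed for illustration, not computed from real structures.
print(classify({"parts": {"acetate": 2}, "charge": -2}))
print(classify({"parts": {"pyrrolidin-2-one": 1, "succinimide": 1}}))
```

The second example returns no class: the negation in the gamma-lactam definition rejects a structure that also contains succinimide.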
16. ACKNOWLEDGEMENTS
COLLABORATORS
Colin Batchelor, RSC
Lian Duan, ETH
Leonid Chepelev, Ottawa
Michel Dumontier, Stanford
Despoina Magka, Oxford
FUNDING
BBSRC “Continued development of ChEBI towards better usability for the systems biology and metabolic modelling communities” BB/K019783/1
The ChEBI ontology has three sub-ontologies: the role, subatomic particle, and chemical ontologies.
In this talk, I will focus only on the chemical ontology.
This ontology captures structural features hierarchically.
This is an example entry for a structural classification.
Reading the graph from top to bottom, we can describe a few structural features of caffeine.
As a polycyclic compound, caffeine has at least two cycles.
Narrowing down, as a heterobicyclic compound it contains exactly two cycles and heteroatoms. It is an imidazopyrimidine: a 6-membered ring containing two nitrogens fused to a 5-membered ring containing two nitrogens.
It is a methylxanthine: an imidazopyrimidine with two ketones in the 6-membered ring.
What challenges can we expect with manual classification of structures?
As a result, we need an auto-classification tool that helps us identify and correct these inconsistencies in the ontology.
It would also allow us to bulk-load structures.
As a result, we could speed up the curation process and make the ontology consistent.
SMiles ARbitrary Target Specification (SMARTS)
Web Ontology Language (OWL)
A fragmentation-based approach that captures the structural features hierarchically in SMARTS and uses OWL to classify.
No support for negation
Only “min” counting is supported, not “max” or “exactly”. Thus, a dicarboxylic acid is also classified as a monocarboxylic acid.
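The counting limitation can be illustrated with a small sketch. It assumes the number of carboxy groups in a structure has already been counted; the `REQUIRED` table and both functions are hypothetical and only contrast the two counting semantics, they are not how the SMARTS and OWL tool is implemented.

```python
# Illustrative only: each class requires a number of carboxy groups.
REQUIRED = {"monocarboxylic acid": 1, "dicarboxylic acid": 2}

def classify_min(n_carboxy):
    """'min' semantics: a class matches if the structure has at least n groups."""
    return [name for name, n in REQUIRED.items() if n_carboxy >= n]

def classify_exact(n_carboxy):
    """'exactly' semantics: a class matches only if the count equals n."""
    return [name for name, n in REQUIRED.items() if n_carboxy == n]

# A structure with two carboxy groups.
print(classify_min(2))    # ['monocarboxylic acid', 'dicarboxylic acid']
print(classify_exact(2))  # ['dicarboxylic acid']
```

With "min" counting alone, nothing stops the two-group structure from also satisfying the one-group class.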
SMARTS is powerful, but it is not a very human-readable notation.
Can we do better at making definitions accessible?
So the newly proposed approach is to make these definitions human-friendly.
Then anyone with chemical knowledge can validate these definitions without specialist computing knowledge.
In this approach, the structural features are encoded in the OWL definitions.
In this example, we say that the basic functional group ketone contains the structure of acetone.
These OWL definitions are parsed and converted into chemoinformatics definitions,
which are matched against the unclassified structures.
As a result, each structure is classified under the highest-ranked structural features.
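One way to read the ranking step is to keep only the most specific classes among all those that matched. The sketch below assumes that interpretation; the `PARENT` table is a simplified is_a chain taken from the caffeine example in these notes, not the actual ChEBI hierarchy, and `ancestors` and `most_specific` are hypothetical helper names.

```python
# Simplified is_a chain from the caffeine example; illustrative only.
PARENT = {
    "methylxanthine": "imidazopyrimidine",
    "imidazopyrimidine": "heterobicyclic compound",
    "heterobicyclic compound": "polycyclic compound",
    "polycyclic compound": None,
}

def ancestors(cls):
    """Return every class above cls in the chain."""
    found = set()
    current = PARENT.get(cls)
    while current is not None:
        found.add(current)
        current = PARENT.get(current)
    return found

def most_specific(matched):
    """Drop any matched class that is an ancestor of another matched class."""
    matched = set(matched)
    covered = set()
    for cls in matched:
        covered |= ancestors(cls)
    return sorted(matched - covered)

# Classes whose definitions matched a structure (assumed for illustration).
print(most_specific(["polycyclic compound", "imidazopyrimidine", "methylxanthine"]))
```

The broader classes are still implied through the hierarchy, so only the most specific match needs to be asserted.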
These definitions are generated manually to make them more sensible.
As an initial exercise, maximum common substructure (MCS) was used to extract the structural features from which to generate definitions.
In this example class, benzoquinones, we have two different substituents and one is more dominant. This is the MCS result for benzoquinones from RDKit and SMSD. This makes automatic definition generation tricky when a class has multiple definitions because of substituents, ring size, and so on.