AUTOMATED STRUCTURE-BASED
CLASSIFICATION USING ChEBI
ONTOLOGY
Venkatesh Muthukrishnan
Software Engineer - ChEBI
CLASSIFICATION IN ChEBI
CHALLENGES WITH
MANUAL CLASSIFICATION
INCOMPLETE
& INCONSISTENT …
BLOCKS BULK LOADING …
http://sourceforge.net/p/chebi/news/2012/11/chebi-release-97-is-now-available/
STRUCTURE BASED
AUTO-CLASSIFICATION
PREVIOUS APPROACHES
SOCO
SELF-ORGANISING
CHEMICAL ONTOLOGIES
SMARTS & OWL
Chepelev et al. BMC Bioinformatics 2012 13:3 doi:10.1186/1471-2105-13-3
PROPOSED APPROACH
SCHEMA
DEFINITIONS
e.g. ketone: organic_molecular_entity and has_part some acetone
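A minimal sketch of how a definition such as the ketone example might be split into a parent class and its restrictions. The function name `parse_definition` and the returned tuple layout are hypothetical, not taken from the ChEBI tooling, and only the simple form `parent and property some filler` is handled.

```python
import re

# Hypothetical pattern for one restriction of the form "<property> some <filler>".
RESTRICTION = re.compile(r"^(\w+)\s+some\s+(.+)$")


def parse_definition(text):
    """Split 'parent and prop some filler ...' into (parent, [(prop, filler), ...])."""
    parts = [p.strip() for p in text.split(" and ")]
    parent, restrictions = parts[0], []
    for part in parts[1:]:
        match = RESTRICTION.match(part)
        if match is None:
            # Counting, negation and disjunction are out of scope for this sketch.
            raise ValueError(f"unsupported restriction: {part!r}")
        restrictions.append((match.group(1), match.group(2)))
    return parent, restrictions


parent, restrictions = parse_definition(
    "organic_molecular_entity and has_part some acetone"
)
print(parent, restrictions)
```

Each `(property, filler)` pair could then be mapped to a substructure query for the named filler; that mapping is not shown here.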
RANKED
DEFINITIONS
dicarboxylic acid dianion: organic_molecular_entity and has_part exactly 2 acetate and has_charge value "-2"^int
flavonoid: organic_molecular_entity and has_skeleton some flavones
benzoquinones: organic_molecular_entity and ( has_part some 1,2-benzoquinone or has_part some 1,4-benzoquinone )
gamma-lactam: organic_molecular_entity and has_part some pyrrolidin-2-one and not ( has_part some succinimide )
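The sketch below illustrates how the exact-count, disjunction and negation parts of these definitions could be evaluated. It is a toy model under stated assumptions: the substructure match counts for each molecule are assumed to be precomputed by a cheminformatics toolkit, the `organic_molecular_entity` parent check is omitted, and the numeric ranks are invented for illustration (the slides do not say how ranks are assigned).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Toy stand-in for a structure; part counts are assumed precomputed."""
    name: str
    parts: Counter = field(default_factory=Counter)
    charge: int = 0


# (class name, hypothetical rank, predicate). Higher rank means more specific.
DEFINITIONS = [
    ("dicarboxylic acid dianion", 2,
     lambda m: m.parts["acetate"] == 2 and m.charge == -2),
    ("benzoquinones", 1,
     lambda m: m.parts["1,2-benzoquinone"] > 0 or m.parts["1,4-benzoquinone"] > 0),
    ("gamma-lactam", 1,
     lambda m: m.parts["pyrrolidin-2-one"] > 0 and m.parts["succinimide"] == 0),
]


def classify(mol):
    """Return the names of all matching classes, highest rank first."""
    hits = [(rank, name) for name, rank, test in DEFINITIONS if test(mol)]
    return [name for rank, name in sorted(hits, key=lambda hit: -hit[0])]


# Illustrative inputs only; the counts are made up for the example.
dianion = Molecule("example dianion", Counter({"acetate": 2}), charge=-2)
imide = Molecule("example imide", Counter({"pyrrolidin-2-one": 1, "succinimide": 1}))
print(classify(dianion))
print(classify(imide))
```

The second molecule shows why the negation matters: it contains the pyrrolidin-2-one pattern but is excluded from gamma-lactam because it also matches succinimide.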
WHY MANUAL
DEFINITION
GENERATION
NO MCS
https://github.com/downloads/asad/SMSD/SMSD20120718.zip
http://dalkescientific.com/writings/diary/archive/2012/05/13/mcs_chebi.html
CHALLENGING CLASSES
PLANNED INTEGRATION
• INTERNAL DATA LOADING
• SUBMISSION TOOL & CURATOR TOOL
ACKNOWLEDGEMENTS
COLLABORATORS
Colin Batchelor, RSC
Lian Duan, ETH
Leonid Chepelev, Ottawa
Michel Dumontier, Stanford
Despoina Magka, Oxford
FUNDING
BBSRC “Continued development of
ChEBI towards better usability for the
systems biology and metabolic
modelling communities” BB/K019783/1
THANK YOU

Structure based auto-classification using ChEBI ontology

Editor's Notes

  1. The ChEBI ontology has three sub-ontologies: role, subatomic particle and chemical. In this talk I will focus only on the chemical ontology, which captures structural features hierarchically.
  2. This is an example entry for a structural classification. Reading the graph from top to bottom, we can describe a few structural features of caffeine. As a polycyclic compound, caffeine certainly has at least two cycles. Narrowing down, as a heterobicyclic compound it contains exactly two cycles and heteroatoms. It is an imidazopyrimidine: a six-membered ring containing two nitrogens fused to a five-membered ring containing two nitrogens. Finally, it is a methylxanthine: an imidazopyrimidine with two ketones in the six-membered ring.
  3. What challenges are expected with manual classification of structures?
  4. As a result, we need an auto-classification tool that helps us identify and correct these inconsistencies in the ontology and also allows us to bulk load structures. This would speed up the curation process and make the ontology consistent.
  5. SMARTS is the SMiles ARbitrary Target Specification; OWL is the Web Ontology Language. This is a fragmentation-based approach that captures structural features hierarchically in SMARTS and uses OWL to classify. It has no support for negation. Only "min" counting is supported, not "max" or "exactly", so a dicarboxylic acid is also classified as a monocarboxylic acid. SMARTS is powerful, but its notation is not very human-readable. Can we do better at making definitions accessible?
  6. So the proposed approach is to make these definitions human-friendly, so that any chemically knowledgeable person can validate them without specialist computing knowledge.
  7. In this approach, the structural features are encoded in OWL definitions. In this example, we say that the basic functional group ketone contains the structure of acetone. These OWL definitions are parsed and converted into cheminformatics definitions, which are then matched against the unclassified structures. As a result, each structure is classified under the highly ranked structural features it matches.
  8. These definitions are generated manually so that they make chemical sense. As an initial exercise, maximum common substructure (MCS) was used to extract the structural features for generating definitions.
  9. In this example class, benzoquinones, there are two different substituents and one is more dominant. This is the MCS result for benzoquinones from RDKit and SMSD. Automatic definition generation becomes tricky when a class needs multiple definitions because of substituents, ring size and so on.
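
The counting limitation described in note 5 can be made concrete with a small sketch. The helper names and the group counts below are hypothetical and only illustrate the difference between "min" and "exactly" semantics; they are not part of any of the tools mentioned.

```python
def matches_min(count, n):
    """'min n' semantics: at least n occurrences of the group."""
    return count >= n


def matches_exactly(count, n):
    """'exactly n' semantics: precisely n occurrences of the group."""
    return count == n


# Assumed number of carboxy groups per structure (illustrative data).
carboxy_groups = {"acetic acid": 1, "succinic acid": 2}

for name, count in carboxy_groups.items():
    print(
        name,
        "| monocarboxylic (min 1):", matches_min(count, 1),
        "| monocarboxylic (exactly 1):", matches_exactly(count, 1),
    )
```

With only "min" counting, the structure with two carboxy groups also satisfies the monocarboxylic test; with "exactly" counting it does not.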