Structure Generation,
Metabolite Space, and
Metabolite-Likeness

Julio E. Peironcely  @peyron
Juliopeironcely.com
PhD student at Leiden University and TNO
Metabolomics

the quantitative and qualitative analysis of all metabolites in samples of cells, body fluids, tissues, etc.


                  Julio E. Peironcely
Metabolomics

Workflow: Biological question -> Experimental design -> Sampling -> Sample preparation -> Data acquisition -> Data pre-processing -> Data analysis -> Biological interpretation

Outputs: Protocol -> Samples -> Raw data -> List of peaks/biomolecules -> Relevant biomolecules/connectivities & Models

Metabolites

De-novo identification
We have
  Elemental Composition
  Fragments (sometimes)
  Experimental Information

We want
  List Of Candidate Structures
  As Short As Possible
  Good Structure Is In The List

We need
  Structure Generator
  Keep only metabolites
  Use experimental information to filter molecules

Elemental Composition -> Structure Generation -> Molecules

Structure Generator

Elemental Formula + Fragments -> Generate -> Keep Molecules if Canonical Augmentation -> Candidate Structures

In collaboration with Jean-Loup Faulon, Evry University

Structure Generator: Adding bonds

Structure Generator: Isomorphism

[Figure: differently labeled 4-vertex graphs grouped into two isomorphic classes, "triangle + 1 edge" and "3-edge chain"]

Output ONLY orange graphs

Structure Generator: Canonical Labeling

[Figure: the labeled graphs of both isomorphic classes passed through a canonizer (Nauty)]

Canonical edge list, "triangle + 1 edge": (1,2) (1,3) (1,4) (2,3)
Canonical edge list, "3-edge chain": (1,2) (1,3) (2,4)

Only 1 canonical
  labeling in each
isomorphic class
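
As a rough illustration of this idea, the sketch below computes a canonical labeling by brute force: it tries every relabeling of the vertices and keeps the lexicographically smallest sorted edge list. This is only a stand-in for a real canonizer such as Nauty, which uses a far more efficient algorithm and may choose a different canonical form. The function name `canonical_form` and the example edge lists are assumptions made for illustration.

```python
from itertools import permutations


def canonical_form(edges, n):
    """Smallest sorted edge list over all relabelings of vertices 1..n."""
    best = None
    for image in permutations(range(1, n + 1)):
        perm = dict(zip(range(1, n + 1), image))  # old label -> new label
        form = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or form < best:
            best = form
    return best


# Two labelings of "triangle + 1 edge" and two of the "3-edge chain"
# (hypothetical labelings, chosen only for this example).
paw_a = [(1, 2), (1, 3), (2, 3), (3, 4)]
paw_b = [(1, 2), (2, 3), (2, 4), (3, 4)]
chain_a = [(1, 2), (2, 3), (3, 4)]
chain_b = [(2, 1), (1, 4), (4, 3)]

for graph in (paw_a, paw_b, chain_a, chain_b):
    print(graph, "->", canonical_form(graph, 4))
```

Graphs in the same isomorphic class map to the same edge list, so comparing canonical forms is enough to detect duplicates. Brute force scales factorially with the number of vertices, so it is only practical for toy examples like these.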
Use canonizer to remove duplicates after each extension

[Figure: generation tree of 5-vertex graphs, extended one edge at a time and annotated with canonical edge lists such as (1,2), then (1,2)(1,3), then (1,2)(1,3)(1,4); one duplicate graph is crossed out (X)]

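
The duplicate-removal strategy can be sketched as follows, under simplifying assumptions: plain unlabeled graphs instead of molecules, no valence rules, and a brute-force canonical form in place of Nauty. The helper names `canonical_form` and `extend_and_deduplicate` are hypothetical.

```python
from itertools import combinations, permutations


def canonical_form(edges, n):
    """Brute-force canonical form: smallest sorted edge list over all relabelings."""
    return min(
        tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        for p in permutations(range(n))
    )


def extend_and_deduplicate(level, n):
    """Add one edge to every graph in `level`, keeping one graph per canonical form."""
    unique = {}
    for graph in level:
        for edge in combinations(range(n), 2):
            if edge not in graph:
                child = graph | {edge}
                unique.setdefault(canonical_form(child, n), child)
    return list(unique.values())


# Start from the empty graph on 5 vertices and extend until no edge can be added.
level = [frozenset()]
counts = []
while level:
    counts.append(len(level))
    level = extend_and_deduplicate(level, 5)

print(counts)       # non-isomorphic graphs per number of edges
print(sum(counts))  # total number of non-isomorphic graphs on 5 vertices
```

The cost of this approach is that every child must be canonized and compared against all graphs already stored for that level, which is the bookkeeping that canonical augmentation tries to avoid.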
Canonical Augmentation

A canonical object augmented in a canonical way produces a canonical object

Check For Canonical Augmentation

Keep object if a canonical deletion takes you to the canonical parent

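
A minimal sketch of this check, in the spirit of McKay-style canonical augmentation and again on plain unlabeled graphs with a brute-force canonizer (not the generator described in the talk). The assumed rule: in the canonical labeling of the child, the last edge of the canonical edge list is the one a "canonical deletion" would remove; the child is kept only if the edge that was just added lies in the same symmetry orbit as that edge. All function names are hypothetical.

```python
from itertools import combinations, permutations


def canonical_form_and_edge_orbit(edges, n):
    """Return the canonical form and the orbit of the edge a canonical deletion removes."""
    best, orbit = None, set()
    for perm in permutations(range(n)):  # perm[old] = new label
        form = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or form < best:
            best, orbit = form, set()
        if form == best:
            # Map the last canonical edge back to the original labels.
            inverse = {new: old for old, new in enumerate(perm)}
            a, b = form[-1]
            orbit.add(tuple(sorted((inverse[a], inverse[b]))))
    return best, orbit


def children(parent, n):
    """Children of `parent` that pass the canonical augmentation check."""
    seen, accepted = set(), []
    for edge in combinations(range(n), 2):
        if edge in parent:
            continue
        child = parent | {edge}
        form, orbit = canonical_form_and_edge_orbit(child, n)
        if edge not in orbit:
            continue  # canonical deletion would not undo this augmentation
        if form not in seen:  # only siblings of the same parent are compared
            seen.add(form)
            accepted.append(child)
    return accepted


def generate(n):
    """Levels of non-isomorphic graphs on n vertices, indexed by number of edges."""
    levels = [[frozenset()]]
    while levels[-1]:
        levels.append([c for p in levels[-1] for c in children(p, n)])
    return levels[:-1]


print([len(level) for level in generate(4)])
```

Unlike deduplication against a stored list, the decision here depends only on the child and its parent, so branches of the generation tree can be explored independently.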
Accept only canonically augmented graphs

[Figure: the same generation tree of 5-vertex graphs with canonical edge lists; two branches are crossed out (X)]

Structure Generator Results (MOLGEN: same # of molecules)

                           Glycine   Phenylalanine   Malic acid   D-Cysteine   p-Cresol sulfate
Elemental Composition      C2H5NO2   C9H11NO2        C4H6O5       C3H7NO2S     C7H8O3S
# Output Molecules         84        277,810,163     8,070        3,838        10,203,389
1 Fragment                 6         4,037,499       1,601        100          19,940
1 Fragment (second value)            93,137                                    948
2 Fragments                          584
3 Fragments                          278

Lots of candidate structures
We are looking for metabolites
Elemental Composition -> Structure Generation -> Molecules -> Metabolite Likeness -> Metabolites

What do metabolites look like?
Understanding and Classifying Metabolite Space and Metabolite-Likeness
Julio E. Peironcely et al. PLoS One (in press)
HMDB (8K)   ZINC (21M)

metabolites vs. non metabolites

  Water Solubility
  MW
  C Atoms
  Struc. Complexity
  PSA

PCA
Not so different
Decision Tree

Metabolite-likeness
Representation + Classification

Data: HMDB (8K), ZINC (21M)

Representation             Classification
  Atom Counts                Support Vector Machines (SVM)
  Physicochemical desc.      Random Forest (RF)
  MDL Public Keys            Naïve Bayes (NB)
  FCFP_4
  ECFP_4

Metabolite-likeness

HMDB (8K) + ZINC (21M) -> Standardization -> Diversity Selection -> Training Set (532 + 532) and Test Set (6.4K + 6.4K)

Training Set: 5-fold CV with SVM, RF, BC
Test Set: Metabolite likeness

3 classifiers x 5 descriptions

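
To make the size of this experiment concrete, the sketch below enumerates the 3 x 5 classifier/description combinations and builds 5-fold cross-validation splits over the 532 + 532 training molecules. It is a sketch under stated assumptions: the slides do not say how the folds were drawn, so a plain shuffled split is assumed, and the helper `k_fold_splits` and its `seed` parameter are hypothetical.

```python
import random
from itertools import product

# Names taken from the slides.
DESCRIPTIONS = ["Atom Counts", "Physicochemical desc.", "MDL Public Keys", "FCFP_4", "ECFP_4"]
CLASSIFIERS = ["SVM", "RF", "BC"]


def k_fold_splits(n_items, k=5, seed=0):
    """Shuffle item indices and return k (train, validation) index pairs."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        splits.append((train, folds[held_out]))
    return splits


models = list(product(CLASSIFIERS, DESCRIPTIONS))
splits = k_fold_splits(532 + 532)

print(len(models), "classifier/description combinations")
for train, validation in splits:
    print(len(train), "training molecules,", len(validation), "validation molecules")
```

Each of the combinations would be trained and validated once per fold; the classifiers themselves are left out here because the slides do not give their settings.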
Metabolite-likeness

Best = RF – MDLPublicKeys
  Sensitivity 99.84%   Specificity 87.52%   AUC 99.20%

Bad = BC – P_desc
  Sensitivity 42.51%   Specificity 86.56%   AUC 61.57%

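
For readers unfamiliar with these three measures, the following sketch shows how they are commonly computed for a two-class problem, with metabolites as the positive class. The labels and scores are made-up toy values, not data from the talk, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Fraction of positives and of negatives that are classified correctly."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = metabolite, 0 = non-metabolite (hypothetical values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
print(sensitivity, specificity, auc(y_true, scores))
```

Sensitivity and specificity depend on the chosen score threshold (0.5 in this toy example), while AUC summarizes the ranking quality over all thresholds.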
Metabolite-likeness, external validation

Inputs: HMDB external validation set, DrugBank, ChEMBL
Steps: Random Selection, Standardization, Metabolite likeness

Met-likeness + structure generation (malic acid) 8K
[Figure labels: 57%, 77%, 100%]

Met-likeness + structure generation (methylhistamine) 260K
[Figure labels: 46%, 71%]

What else do we know
about our molecules?
Molecule        Minimized_Energy   ALogP    Index
Phenylalanine   0.1100             -1.605   5142

C9H11NO2 -> Structure Generation -> 277 M
C9H11NO2 -> Structure Generation -> 41 K (figure labels: 99%, 44%)
C9H11NO2, E < 10 -> Structure Generation -> 8K (figure label: 40%)
C9H11NO2, E < 10, ALogP < -1 -> Structure Generation -> 31 (figure label: 76%)

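
The property constraints above can be pictured as a simple filter over the candidate list. The sketch below is illustrative only: the thresholds (E < 10, ALogP < -1) and the phenylalanine values come from the slides, while the `Candidate` class, the `apply_constraints` function, and the two other candidates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    minimized_energy: float
    alogp: float


def apply_constraints(candidates, max_energy=10.0, max_alogp=-1.0):
    """Keep candidates with minimized energy below max_energy and ALogP below max_alogp."""
    return [
        c for c in candidates
        if c.minimized_energy < max_energy and c.alogp < max_alogp
    ]


candidates = [
    Candidate("Phenylalanine", 0.1100, -1.605),      # values shown on the slide
    Candidate("hypothetical_isomer_A", 35.2, -1.9),  # made up: fails E < 10
    Candidate("hypothetical_isomer_B", 4.7, 0.8),    # made up: fails ALogP < -1
]

kept = apply_constraints(candidates)
print([c.name for c in kept])
```

In practice the property values would come from a modeling package and the thresholds from experimental knowledge about the unknown compound.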
Conclusions
  Met-Likeness prediction is good, interpretation not
  Local models needed
  Structure Generator + Met-Likeness + other constraints = Met Id improvement

Acknowledgements

  Leiden University: Miguel Rojas-Cherto, Piotr Kasper, Michael van Vliet, Theo Reijmers, Rob Vreeken, Ronnie van Doorn, Thomas Hankemeier
  TNO Quality of Life: Leon Coulier, Albert Tas
  University of Cambridge: Andreas Bender
  Evry University: Jean-Loup Faulon, Davide Fichera
  HMP University of Alberta: David Wishart, Ying (Edison) Dong

Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Structure generation, metabolite space, and metabolite likeness

  • 1. Structure Generation, Metabolite Space, and Metabolite-Likeness Julio E. Peironcely  @peyron Juliopeironcely.com PhD student at Leiden University and TNO
  • 2. Metabolomics: the quantitative and qualitative analysis of all metabolites in samples of cells, body fluids, tissues, etc.
  • 3. Metabolomics workflow: Biological question → Experimental design → Sampling → Sample preparation → Data acquisition → Data pre-processing → Data analysis → Biological interpretation. Intermediate products: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models. Callout: Metabolites.
  • 6. We have: elemental composition, fragments (sometimes), experimental information.
  • 7. We want: a list of candidate structures that is as short as possible and still contains the correct structure.
  • 8. We need: a structure generator; to keep only metabolites; to use experimental information to filter molecules.
  • 9. Elemental Composition
  • 10. Elemental Composition → Structure Generation
  • 11. Elemental Composition → Structure Generation → Molecules
  • 12. Structure Generator: Elemental Formula + Fragments → Generate → Candidate Structures. In collaboration with Jean-Loup Faulon, Evry University
  • 13. Structure Generator: Elemental Formula + Fragments → Generate → Keep Molecules if Canonical Augmentation → Candidate Structures
  • 14. Structure Generator: Adding bonds
  • 15. Structure Generator, Isomorphism. [Figure: labeled graphs on vertices 1 to 4, grouped into two isomorphic classes: “triangle + 1 edge” and “3-edge chain”.]
  • 16. Structure Generator, Isomorphism. [Same figure, with the note: Output ONLY orange graphs.]
  • 17. Structure Generator, Canonical Labeling. [Figure: the labeled graphs of both classes pass through the Canonizer (Nauty), which returns one edge list per class: (1,2)(1,3)(1,4)(2,3) and (1,2)(1,3)(2,4).]
  • 18. Only 1 canonical labeling in each isomorphic class
  • 19. Use canonizer to remove duplicates after each extension. [Figure: generation tree on vertices 1 to 5, one bond added per level: (1,2); then (1,2)(1,3); then (1,2)(1,3)(1,4) and (1,2)(1,3)(2,3); then (1,2)(1,3)(1,4)(2,3), (1,2)(1,3)(1,4)(3,4), (1,2)(1,3)(1,4)(4,5) and (1,2)(1,3)(2,3)(2,4). One branch is marked X.]
  • 20. Canonical Augmentation: a canonical object augmented in a canonical way produces a canonical object.
  • 21. Check for Canonical Augmentation: keep an object if a canonical deletion takes you back to its canonical parent.
  • 22. Accept only canonically augmented graphs. [Figure: the same generation tree on vertices 1 to 5, from (1,2) down to the four-bond graphs, with two branches marked X.]
  • 23. Structure Generator Results (MOLGEN: same # of molecules). Molecules and elemental compositions: Glycine (C2H5NO2), Phenylalanine (C9H11NO2), Malic acid (C4H6O5), D-Cysteine (C3H7NO2S), p-Cresol sulfate (C7H8O3S). # Output molecules, in the same order: 84; 277,810,163; 8,070; 3,838; 10,203,389. Rows with fragments (1 Fragment, 2 Fragments, 3 Fragments): 6; 4,037,499; 1,601; 100; 19,940; 93,137; 948; 584; 278. In collaboration with Jean-Loup Faulon, Evry University
  • 24. Lots of candidate structures
  • 25. We are looking for metabolites
  • 26. Elemental Composition → Structure Generation → Molecules → Metabolite Likeness
  • 27. Elemental Composition → Structure Generation → Molecules → Metabolite Likeness → Metabolites
  • 28. What do metabolites look like? Understanding and Classifying Metabolite Space and Metabolite-Likeness. Julio E. Peironcely et al. PLoS One (in press)
  • 29. HMDB (8K) and ZINC (21M)
  • 30. Metabolites vs. non-metabolites: Water Solubility, MW, C Atoms, Struc. Complexity, PSA
  • 31. PCA
  • 32. PCA
  • 34. Decision Tree
  • 35. Elemental Composition → Structure Generation → Molecules → Metabolite Likeness → Metabolites
  • 36. Metabolite-likeness = Representation + Classification. Data: HMDB (8K), ZINC (21M). Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4. Classifiers: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB).
  • 37. Metabolite-likeness: HMDB (8K) and ZINC (21M) → Standardization → Diversity Selection. Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4.
  • 38. Metabolite-likeness: HMDB (8K) and ZINC (21M) → Standardization → Diversity Selection → Training Set (532 + 532) and Test Set (6.4K + 6.4K). Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4.
  • 39. Metabolite-likeness: HMDB (8K) and ZINC (21M) → Standardization → Diversity Selection → Training Set (532 + 532) and Test Set (6.4K + 6.4K). Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4. 5-fold CV; classifiers: SVM, RF, BC.
  • 40. Metabolite-likeness: HMDB (8K) and ZINC (21M) → Standardization → Diversity Selection → Training Set (532 + 532) and Test Set (6.4K + 6.4K). 3 classifiers (SVM, RF, BC) × 5 descriptions, 5-fold CV → Metabolite likeness.
  • 41. Metabolite-likeness: HMDB (8K) and ZINC (21M) → Standardization → Diversity Selection → Training Set (532 + 532) and Test Set (6.4K + 6.4K); 5-fold CV; SVM, RF, BC → Metabolite likeness. Best = RF – MDLPublicKeys: Sensitivity 99.84%, Specificity 87.52%, AUC 99.20%. Bad = BC – P_desc: Sensitivity 42.51%, Specificity 86.56%, AUC 61.57%.
  • 42. Metabolite-likeness, external validation: HMDB, DrugBank, ChEMBL → Random Selection → Standardization → External validation set → Metabolite likeness
  • 44.
  • 45. Met-likeness + structure generation (malic acid): 8K, 100%, 57%, 77%
  • 46. Met-likeness + structure generation (methylhistamine): 260K, 71%, 46%
  • 47. What else do we know about our molecules?
  • 48. Molecule: Phenylalanine; Minimized_Energy: 0.1100; ALogP: -1.605; Index: 5142
  • 49. Table row: Minimized_Energy 0.1100, ALogP -1.605, Index 5142. C9H11NO2 → Structure Generation → 277 M
  • 50. Table row: Minimized_Energy 0.1100, ALogP -1.605, Index 5142. C9H11NO2 → Structure Generation → 41 K (99%, 44%)
  • 51. Table row: Minimized_Energy 0.1100, ALogP -1.605, Index 5142. C9H11NO2 → Structure Generation, E < 10 → 8K (40%)
  • 52. Table row: Minimized_Energy 0.1100, ALogP -1.605, Index 5142. C9H11NO2 → Structure Generation, E < 10, ALogP < -1 → 31 (76%)
  • 53. Conclusions: Met-likeness prediction is good, its interpretation is not. Local models are needed. Structure generator + met-likeness + other constraints = improved metabolite identification.
  • 54. Acknowledgements. Leiden University: Miguel Rojas-Cherto, Piotr Kasper, Michael van Vliet, Theo Reijmers, Rob Vreeken, Ronnie van Doorn, Thomas Hankemeier. University of Cambridge: Andreas Bender. Evry University: Jean-Loup Faulon, Davide Fichera. TNO Quality of Life: Leon Coulier, Albert Tas. HMP, University of Alberta: David Wishart, Ying (Edison) Dong.
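
The duplicate-removal idea from slides 17 to 19 (canonize every extended graph, keep one labeling per isomorphic class) can be illustrated with a minimal Python sketch. It is not the authors' generator: the slides use Nauty as canonizer, whereas the hypothetical `canonical_form` below is a brute-force stand-in that tries every vertex relabeling, so it only works for very small graphs. It also removes duplicates after each extension (slide 19) rather than performing the canonical augmentation check of slides 20 to 22. The function names and the choice of "lexicographically smallest edge list" as canonical form are assumptions made for illustration.

```python
from itertools import combinations, permutations


def canonical_form(edges, n):
    """Brute-force stand-in for a canonizer such as Nauty (assumption).

    Returns the lexicographically smallest sorted edge tuple over all
    relabelings of the vertices 1..n. Isomorphic graphs get the same result.
    """
    vertices = range(1, n + 1)
    best = None
    for perm in permutations(vertices):
        relabel = dict(zip(vertices, perm))
        candidate = tuple(sorted(
            tuple(sorted((relabel[u], relabel[v]))) for u, v in edges
        ))
        if best is None or candidate < best:
            best = candidate
    return best


def extend(graphs, n):
    """Add one edge to every graph in every possible way.

    Each extended graph is canonized, and a set keeps exactly one
    representative per isomorphic class.
    """
    all_edges = list(combinations(range(1, n + 1), 2))
    seen = set()
    for graph in graphs:
        for edge in all_edges:
            if edge not in graph:
                seen.add(canonical_form(graph + (edge,), n))
    return sorted(seen)


def generate(n, n_edges):
    """Grow graphs on n vertices from the empty graph, one edge per level."""
    level = [()]
    for _ in range(n_edges):
        level = extend(level, n)
    return level


if __name__ == "__main__":
    # One representative edge list per isomorphic class.
    for graph in generate(4, 3):
        print(graph)
```

On four vertices with three edges this yields three classes, one of which is the "3-edge chain" of slide 15; with four edges it yields two, one of which is "triangle + 1 edge". The brute-force canonizer costs n! relabelings per graph, which is why a dedicated tool is needed for molecule-sized problems.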