Artificial Intelligence/Machine Learning for Secondary Metabolite Prediction
Yannick Djoumbou Feunang
Corteva Agriscience, Indianapolis, IN, US
Online workshop: Computational Applications in Secondary Metabolite Discovery
March 9, 2021
In Silico Metabolism Prediction
Task: Given a small molecule, use computational tools to predict the outcome of its
interactions with metabolic enzymes.
Here, we will focus on the structural elucidation of potential metabolites.
[Figure: limonene (Citrus) and four of its metabolites, labeled by the organisms in which they occur: (Humans; Bacteria), (Humans; Plants), (Humans; Plants), (Humans; Plants; Bacteria)]
Why is Metabolism so Important?
• Bio/chemo-transformations drive changes in our chemical exposome
• Fewer than 2% of detectable peaks are identifiable in large-scale untargeted metabolomics
• What are these unknowns?
• How to determine their structures?
• How to determine their activities?
• Cheminformatics and AI can be used to:
➢Detect common patterns
➢Predict enzyme/ligand interactions
➢Understand metabolism
➢Generate biologically feasible structures
through simulated reactions
Why is Metabolism so Important?
Pro/soft-drug design
Exposure science
ADMET profiling
http://admet.scbdd.com/
Nutrition science
Jin et al. (2017)
Influence of metabolism on a xenobiotic’s pharmacodynamics (PD), including pharmacological activity (Act) and toxicological effects (Tox). Djoumbou Feunang et al. (2019)
Microbial biosynthesis: artemisinin to artemisitone-9 (Strep. griseus). Liu et al. (2006)
Approaches for In Silico Metabolism Prediction
• Requires:
➢A module to predict (and rank/score) sites of metabolism (SoMs), enzyme-substrate selectivity, or reaction groups
➢A library of reaction templates to apply, or to select from via prediction
• Modules are usually either specific (chemical/enzyme classes) or comprehensive (whole species)
• Prediction approaches can be ligand- or structure-based
• Tools include ML/DL-based (MetaTrans, GLORY), rule-based (MetabolExpert), and hybrid (BioTransformer, Meteor Nexus)
[Figure: prediction workflow illustrated with tolclofos-methyl (rats; mice). Step 1, substrate specificity (Yes/No): ML/DL-, rule-, or data mining-based prediction. Step 2, Sites of Metabolism (SoMs): ML/DL prediction; QM/docking; fingerprints. Step 3, apply a rule from the reaction templates: hydroxylation, desulfurization, O-dealkylation, epoxidation.]
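The three steps on this slide (specificity check, SoM ranking, template application) can be sketched in a few lines of Python. This is only an illustration of the control flow, not the method of any tool named here: the function names, the site labels, and the scores are all hypothetical, and a real system would operate on molecular graphs rather than on string labels.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical template library: maps a kind of site to the reaction applied there.
TEMPLATES: Dict[str, str] = {
    "aromatic C-H": "Hydroxylation",
    "P=S": "Desulfurization",
    "O-CH3": "O-dealkylation",
    "C=C": "Epoxidation",
}


def predict_metabolites(
    substrate: str,
    is_substrate: Callable[[str], bool],
    score_sites: Callable[[str], Dict[str, float]],
    top_k: int = 2,
) -> List[Tuple[str, str, float]]:
    """Return (site, reaction, score) for the top_k scored sites that have a template."""
    # Step 1: substrate specificity (Yes/No).
    if not is_substrate(substrate):
        return []
    # Step 2: rank candidate sites of metabolism by predicted score, best first.
    ranked = sorted(score_sites(substrate).items(), key=lambda kv: kv[1], reverse=True)
    # Step 3: keep only sites for which a reaction template exists.
    hits = [(site, TEMPLATES[site], score) for site, score in ranked if site in TEMPLATES]
    return hits[:top_k]


# Stand-in models with made-up scores (not real predictions for this compound).
def toy_specificity(name: str) -> bool:
    return name == "tolclofos-methyl"


def toy_som_scores(name: str) -> Dict[str, float]:
    return {"aromatic C-H": 0.4, "P=S": 0.9, "O-CH3": 0.7}


if __name__ == "__main__":
    for site, reaction, score in predict_metabolites(
        "tolclofos-methyl", toy_specificity, toy_som_scores
    ):
        print(f"{reaction} at {site} (score {score})")
```

Keeping the predictors as plain callables means an ML model, a rule base, or a docking score could be swapped in without touching the pipeline itself.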
Ligand-based Prediction
• Use structures of ligands/non-ligands to
predict specificity, accessibility to enzymes
• Some methods include, among others:
➢QSAR/QSMR (SoM, BoM, ESS)
➢Data mining/fingerprints (SoM)
➢Substructures/Rules (ESS, Reaction)
• Advantages:
➢Speed
➢Seem to perform as well as structure-based
methods
• Disadvantages:
➢Approaches/models/rule bases tend not to work on novel chemistries
➢Data quantity and quality are limiting factors
Strieth-Kalthoff et al. (2020)
Hydroxylation of methyl carbon adjacent to aliphatic ring
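As a rough illustration of the fingerprint-based idea, the sketch below compares a query compound with known substrates using Tanimoto similarity over sets of "on" fingerprint bits. The bit sets and compound names are invented for the example; in practice the fingerprints would come from a cheminformatics toolkit. A low best-match similarity is one simple signal that a query lies outside the chemistry the model has seen, which is the "novel chemistries" limitation noted above.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def nearest_neighbours(
    query: FrozenSet[int], library: Dict[str, FrozenSet[int]], k: int = 1
) -> List[Tuple[str, float]]:
    """Return the k library entries most similar to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical fingerprints (bit indices are arbitrary, for illustration only).
KNOWN_SUBSTRATES: Dict[str, FrozenSet[int]] = {
    "substrate_A": frozenset({1, 2, 3, 8}),
    "substrate_B": frozenset({2, 3, 4}),
    "substrate_C": frozenset({10, 11, 12}),
}

if __name__ == "__main__":
    query = frozenset({1, 2, 3})
    print(nearest_neighbours(query, KNOWN_SUBSTRATES, k=2))
```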
Structure-based Prediction
• Explicitly models the interaction within the enzyme’s binding pocket
• Helps predict sites of metabolism (SoMs)
• Some methods include, among others:
➢Protein-ligand docking
➢Molecular dynamics
• Advantages:
➢Detailed study of the enzyme-substrate interaction
• Disadvantages:
➢High computational cost of modeling structural flexibility
Ménard et al. (2012)
Examples of In Silico Metabolism Prediction Tools
Based on their coverage, in silico metabolism prediction tools can be classified as:
1. Specific tools, which apply to simple biological systems (e.g. a small set of
enzymes), and often, cover a rather small chemical space.
2. Comprehensive tools, which apply to a more diverse and larger set of enzymes,
species, and cover a larger chemical space.
Table: Examples of tools for metabolite prediction
Prediction Tool   Accessibility               Spectrum        Approach       Methods                            Availability
BioTransformer    Web server / Command line   Comprehensive   Ligand-based   Machine Learning; Expert system    Open source
Meteor Nexus      GUI-based standalone        Comprehensive   Ligand-based   Machine Learning; Expert system    Commercial
GLORYx            Web server                  Comprehensive   Ligand-based   Machine Learning                   Freely available
SmartCYP          Web server                  Specific        Combined       Machine Learning                   Freely available
MetaSite          GUI-based standalone        Specific        Combined       Machine Learning                   Commercial
MetabolExpert     GUI-based standalone        Specific        Ligand-based   Expert system                      Commercial
Enhancing Metabolite Discovery and Identification
Measuring the exposome: MS/NMR-based Metabolomics
• >2×10^5 known metabolites
• >3×10^6 expected metabolites
• Expand the databases with predicted structures
• Add predicted spectra for these metabolites
➢More structures and spectra available for database search
➢More structures available for mass/formula matching to annotate the peaks
➢Higher identification rates
[Figure: a set of unidentified peaks (?), several of which become identified (√) after the databases are expanded]
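Mass/formula matching against an expanded database can be sketched as follows. This is a minimal example that assumes the observed value has already been converted to a neutral monoisotopic mass (adduct handling is omitted), and the candidate list and 5 ppm tolerance are illustrative choices rather than values from the slides. The helper names are hypothetical.

```python
import re
from typing import Dict, List

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC: Dict[str, float] = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "S": 31.97207117,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum element masses for a simple formula such as 'C10H16O' (no brackets or charges)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def annotate(observed_mass: float, candidates: Dict[str, str], ppm: float = 5.0) -> List[str]:
    """Return the candidate names whose theoretical mass lies within `ppm` of the observed mass."""
    hits = []
    for name, formula in candidates.items():
        theoretical = monoisotopic_mass(formula)
        error_ppm = abs(observed_mass - theoretical) / theoretical * 1e6
        if error_ppm <= ppm:
            hits.append(name)
    return hits


# Illustrative database: a parent compound plus two candidate oxidation products.
CANDIDATES: Dict[str, str] = {
    "limonene": "C10H16",
    "perillyl alcohol": "C10H16O",
    "limonene 1,2-epoxide": "C10H16O",
}

if __name__ == "__main__":
    print(annotate(152.1201, CANDIDATES))
```

Both C10H16O entries match the same peak, which shows why predicted spectra are needed in addition to predicted structures: formula matching alone cannot separate isomers.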
BioTransformer: Open Source for Metabolite Identification
• Djoumbou Feunang et al. (2019); J. Cheminform. 2019 Jan 5;11(1):2; DOI 10.1186/s13321-018-0324-5
• Open Source: bitbucket.org/djoumbou/biotransformer
• Docker: https://hub.docker.com/r/djoy2018/biotransformer-cl
• RESTful web server: www.biotransformer.ca
BioTransformer: Open source for Metabolite Identification
Epicatechin
Challenges in Metabolism Prediction
• An incomplete understanding of metabolism limits prediction accuracy
➢ML tools can rank SoMs with high accuracy, but not the resulting biotransformations
➢Knowledge-driven approaches achieve higher recall, but suffer from combinatorial explosion
• Limited accessibility/availability of tools
➢Most tools developed in academia are available only via web server, which raises confidentiality issues for most potential users
➢Restricted shareability of data generated by commercial tools
• Limited availability of high-quality data
➢Crucial for achieving better accuracy, higher coverage, and larger applicability domains
➢Needed to derive more reaction rules
➢However, data curation (Sites of metabolism, reaction annotation) is tedious and expensive
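The combinatorial explosion mentioned for knowledge-driven approaches can be made concrete with a back-of-the-envelope bound. The sketch assumes, purely for illustration, that every structure yields the same number of products at each step and that no duplicates are merged; real tools deduplicate and filter, so actual counts are lower.

```python
def max_candidates(branching: int, steps: int) -> int:
    """Upper bound on generated structures: branching + branching**2 + ... + branching**steps."""
    return sum(branching ** k for k in range(1, steps + 1))


if __name__ == "__main__":
    # With 10 applicable rules per structure, three successive steps give 1110 structures.
    for steps in (1, 2, 3):
        print(steps, max_candidates(10, steps))
```

Growth of this kind is one reason ranking or scoring of predicted metabolites matters as much as recall.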
Outlook
• Data sharing
➢Increase the amount of high-quality, publicly available, curated, and downloadable data
(substrate-product, Sites of Metabolism). E.g.: MetXBioDB, PubChem, XMetDB, etc.
➢Develop publicly accessible databases of predicted metabolites (e.g.: BioTransformerDB)
➢Community-wide efforts needed
• Novel and innovative AI approaches:
➢Seq2Seq transformer architectures have proven applicable to end-to-end learning-based methods, with results comparable to existing tools (MetaTrans: Litsa et al. 2020)
▪ Could improve accuracy while bypassing manual rule design
➢Bonds of Metabolism (BoMs) seem to describe reaction centers better than SoMs, leading to higher accuracy (upcoming: Tian et al. (2021))
• Open source communities
➢Make user feedback loops easier, which is valuable for improving prediction tools
➢Give developers opportunities to improve software tools in an agile, continuous manner
▪ Successful projects include BioTransformer, the Chemistry Development Kit, and KNIME
Thank You
• The organizing committee:
➢Fidele Ntie-Kang
➢J. Ludwig Muller
• The Wishart Lab @UofAlberta, Canada
• Corteva Agriscience
• The BioTransformer community
• The listeners
• To learn more about BioTransformer:
➢HS03: In Silico Prediction and Identification of Metabolites with BioTransformer: Enabling Secondary Metabolite Discovery; March 10, 2021
15

More Related Content

What's hot

Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...SELF-EXPLANATORY
 
Ethical issues related to animal biotechnology
Ethical issues related to animal biotechnologyEthical issues related to animal biotechnology
Ethical issues related to animal biotechnologyKAUSHAL SAHU
 
Synthetic biology for pathway engineering
Synthetic biology for pathway engineeringSynthetic biology for pathway engineering
Synthetic biology for pathway engineeringKarthikeyan Rathinam
 
threading and homology modelling methods
threading and homology modelling methodsthreading and homology modelling methods
threading and homology modelling methodsmohammed muzammil
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by SequencingSenthil Natesan
 
Environmental issues associated with transgenic crops
Environmental issues associated with transgenic cropsEnvironmental issues associated with transgenic crops
Environmental issues associated with transgenic cropsSheetal Mehla
 
An introduction to RNA-seq data analysis
An introduction to RNA-seq data analysisAn introduction to RNA-seq data analysis
An introduction to RNA-seq data analysisAGRF_Ltd
 
Mapping protein to function
Mapping protein to functionMapping protein to function
Mapping protein to functionAbhik Seal
 
Applied Artificial Intelligence & How it's Transforming Life Sciences
Applied Artificial Intelligence & How it's Transforming Life SciencesApplied Artificial Intelligence & How it's Transforming Life Sciences
Applied Artificial Intelligence & How it's Transforming Life SciencesKumaraguru Veerasamy
 
Computational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesComputational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesPablo Carbonell
 
Proteomics, definatio , general concept, signficance
Proteomics,  definatio , general concept, signficanceProteomics,  definatio , general concept, signficance
Proteomics, definatio , general concept, signficanceKAUSHAL SAHU
 
Bio-Molecular Engineering is the Future of Molecular Biology
Bio-Molecular Engineering is the Future of Molecular BiologyBio-Molecular Engineering is the Future of Molecular Biology
Bio-Molecular Engineering is the Future of Molecular BiologyBob Eisenberg
 
Basics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsBasics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsElena Sügis
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicsprateek kumar
 
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...VHIR Vall d’Hebron Institut de Recerca
 

What's hot (20)

Genome annotation
Genome annotationGenome annotation
Genome annotation
 
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
Protein structure classification/domain prediction: SCOP and CATH (Bioinforma...
 
Ethical issues related to animal biotechnology
Ethical issues related to animal biotechnologyEthical issues related to animal biotechnology
Ethical issues related to animal biotechnology
 
Synthetic biology for pathway engineering
Synthetic biology for pathway engineeringSynthetic biology for pathway engineering
Synthetic biology for pathway engineering
 
threading and homology modelling methods
threading and homology modelling methodsthreading and homology modelling methods
threading and homology modelling methods
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by Sequencing
 
Environmental issues associated with transgenic crops
Environmental issues associated with transgenic cropsEnvironmental issues associated with transgenic crops
Environmental issues associated with transgenic crops
 
The STRING database
The STRING databaseThe STRING database
The STRING database
 
An introduction to RNA-seq data analysis
An introduction to RNA-seq data analysisAn introduction to RNA-seq data analysis
An introduction to RNA-seq data analysis
 
Mapping protein to function
Mapping protein to functionMapping protein to function
Mapping protein to function
 
Applied Artificial Intelligence & How it's Transforming Life Sciences
Applied Artificial Intelligence & How it's Transforming Life SciencesApplied Artificial Intelligence & How it's Transforming Life Sciences
Applied Artificial Intelligence & How it's Transforming Life Sciences
 
Composite and Specialized databases
Composite and Specialized databasesComposite and Specialized databases
Composite and Specialized databases
 
Computational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesComputational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design Techniques
 
Proteomics, definatio , general concept, signficance
Proteomics,  definatio , general concept, signficanceProteomics,  definatio , general concept, signficance
Proteomics, definatio , general concept, signficance
 
Transgenic technology
Transgenic technologyTransgenic technology
Transgenic technology
 
Bio-Molecular Engineering is the Future of Molecular Biology
Bio-Molecular Engineering is the Future of Molecular BiologyBio-Molecular Engineering is the Future of Molecular Biology
Bio-Molecular Engineering is the Future of Molecular Biology
 
Basics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsBasics of Data Analysis in Bioinformatics
Basics of Data Analysis in Bioinformatics
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Molecular farming
Molecular farmingMolecular farming
Molecular farming
 
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...
Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensem...
 

Similar to AI and Machine Learning for Secondary Metabolite Prediction

Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardEMBL-ABR
 
Java tutorial: Programmatic Access to Molecular Interactions
Java tutorial: Programmatic Access to Molecular InteractionsJava tutorial: Programmatic Access to Molecular Interactions
Java tutorial: Programmatic Access to Molecular InteractionsRafael C. Jimenez
 
Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...Jarle Pahr
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksAlexander Pico
 
Drug design based on bioinformatic tools
Drug design based on bioinformatic toolsDrug design based on bioinformatic tools
Drug design based on bioinformatic toolsSujeethKrishnan
 
Chemoinformatic File Format.pptx
Chemoinformatic File Format.pptxChemoinformatic File Format.pptx
Chemoinformatic File Format.pptxwadhava gurumeet
 
NanoAgents: Molecular Docking Using Multi-Agent Technology
NanoAgents: Molecular Docking Using Multi-Agent TechnologyNanoAgents: Molecular Docking Using Multi-Agent Technology
NanoAgents: Molecular Docking Using Multi-Agent TechnologyCSCJournals
 
Intro to in silico drug discovery 2014
Intro to in silico drug discovery 2014Intro to in silico drug discovery 2014
Intro to in silico drug discovery 2014Lee Larcombe
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNJeremy Yang
 
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Mathew Varghese
 
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of Action
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of ActionA Biclustering Method for Rationalizing Chemical Biology Mechanisms of Action
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of ActionGerald Lushington
 
T-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesT-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesElia Brodsky
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing Ayesha Aftab
 
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGijbbjournal
 
The Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationThe Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationBennie George
 
The Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationThe Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationBennie George
 

Similar to AI and Machine Learning for Secondary Metabolite Prediction (20)

Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra Orchard
 
Java tutorial: Programmatic Access to Molecular Interactions
Java tutorial: Programmatic Access to Molecular InteractionsJava tutorial: Programmatic Access to Molecular Interactions
Java tutorial: Programmatic Access to Molecular Interactions
 
Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...Project report: Investigating the effect of cellular objectives on genome-sca...
Project report: Investigating the effect of cellular objectives on genome-sca...
 
Technology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential NetworksTechnology R&D Theme 1: Differential Networks
Technology R&D Theme 1: Differential Networks
 
Applied Bioinformatics Assignment 5docx
Applied Bioinformatics Assignment  5docxApplied Bioinformatics Assignment  5docx
Applied Bioinformatics Assignment 5docx
 
Drug design based on bioinformatic tools
Drug design based on bioinformatic toolsDrug design based on bioinformatic tools
Drug design based on bioinformatic tools
 
Varieties of Self-Awareness and Their Uses in Natural and Artificial Systems ...
Varieties of Self-Awareness and Their Uses in Natural and Artificial Systems ...Varieties of Self-Awareness and Their Uses in Natural and Artificial Systems ...
Varieties of Self-Awareness and Their Uses in Natural and Artificial Systems ...
 
Chemoinformatic File Format.pptx
Chemoinformatic File Format.pptxChemoinformatic File Format.pptx
Chemoinformatic File Format.pptx
 
NanoAgents: Molecular Docking Using Multi-Agent Technology
NanoAgents: Molecular Docking Using Multi-Agent TechnologyNanoAgents: Molecular Docking Using Multi-Agent Technology
NanoAgents: Molecular Docking Using Multi-Agent Technology
 
Intro to in silico drug discovery 2014
Intro to in silico drug discovery 2014Intro to in silico drug discovery 2014
Intro to in silico drug discovery 2014
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCN
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
Computational Biology Methods for Drug Discovery_Phase 1-5_November 2015
 
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of Action
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of ActionA Biclustering Method for Rationalizing Chemical Biology Mechanisms of Action
A Biclustering Method for Rationalizing Chemical Biology Mechanisms of Action
 
T-bioinfo overview
T-bioinfo overviewT-bioinfo overview
T-bioinfo overview
 
T-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesT-BioInfo Methods and Approaches
T-BioInfo Methods and Approaches
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing
 
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
 
The Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationThe Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and Application
 
The Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and ApplicationThe Complete Guide for Metabolomics Methods and Application
The Complete Guide for Metabolomics Methods and Application
 

Recently uploaded

linearity concept of significance, standard deviation, chi square test, stude...
linearity concept of significance, standard deviation, chi square test, stude...linearity concept of significance, standard deviation, chi square test, stude...
linearity concept of significance, standard deviation, chi square test, stude...KavyasriPuttamreddy
 
Introducing VarSeq Dx as a Medical Device in the European Union
Introducing VarSeq Dx as a Medical Device in the European UnionIntroducing VarSeq Dx as a Medical Device in the European Union
Introducing VarSeq Dx as a Medical Device in the European UnionGolden Helix
 
DIGITAL RADIOGRAPHY-SABBU KHATOON .pptx
DIGITAL RADIOGRAPHY-SABBU KHATOON  .pptxDIGITAL RADIOGRAPHY-SABBU KHATOON  .pptx
DIGITAL RADIOGRAPHY-SABBU KHATOON .pptxSabbu Khatoon
 
Circulation through Special Regions -characteristics and regulation
Circulation through Special Regions -characteristics and regulationCirculation through Special Regions -characteristics and regulation
Circulation through Special Regions -characteristics and regulationMedicoseAcademics
 
CNN-based plastic waste detection system
CNN-based plastic waste detection systemCNN-based plastic waste detection system
CNN-based plastic waste detection systemBOHRInternationalJou1
 
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale nowSherrylee83
 
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)Dr. Aryan (Anish Dhakal)
 
PT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptxPT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptxdrtabassum4
 
The Orbit & its contents by Dr. Rabia I. Gandapore.pptx
The Orbit & its contents by Dr. Rabia I. Gandapore.pptxThe Orbit & its contents by Dr. Rabia I. Gandapore.pptx
The Orbit & its contents by Dr. Rabia I. Gandapore.pptxDr. Rabia Inam Gandapore
 
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHY
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHYTUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHY
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHYDRPREETHIJAMESP
 
Denture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of actionDenture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of actionDr.shiva sai vemula
 
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptx
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptxSURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptx
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptxSuresh Kumar K
 
End Feel -joint end feel - Normal and Abnormal end feel
End Feel -joint end feel - Normal and Abnormal end feelEnd Feel -joint end feel - Normal and Abnormal end feel
End Feel -joint end feel - Normal and Abnormal end feeldranji1
 
In-service education (Nursing Mangement)
In-service education (Nursing Mangement)In-service education (Nursing Mangement)
In-service education (Nursing Mangement)Monika Kanwar
 
MRI Artifacts and Their Remedies/Corrections.pptx
MRI Artifacts and Their Remedies/Corrections.pptxMRI Artifacts and Their Remedies/Corrections.pptx
MRI Artifacts and Their Remedies/Corrections.pptxDr. Dheeraj Kumar
 
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDF
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDFNCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDF
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDFShahid Hussain
 
180-hour Power Capsules For Men In Ghana
180-hour Power Capsules For Men In Ghana180-hour Power Capsules For Men In Ghana
180-hour Power Capsules For Men In Ghanahealthwatchghana
 
BMK Glycidic Acid (sodium salt) CAS 5449-12-7 Pharmaceutical intermediates
BMK Glycidic Acid (sodium salt)  CAS 5449-12-7 Pharmaceutical intermediatesBMK Glycidic Acid (sodium salt)  CAS 5449-12-7 Pharmaceutical intermediates
BMK Glycidic Acid (sodium salt) CAS 5449-12-7 Pharmaceutical intermediatesdorademei
 
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.Gawad
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.GawadHemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.Gawad
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.GawadNephroTube - Dr.Gawad
 
Muscle Energy Technique (MET) with variant and techniques.
Muscle Energy Technique (MET) with variant and techniques.Muscle Energy Technique (MET) with variant and techniques.
Muscle Energy Technique (MET) with variant and techniques.Anjali Parmar
 

Recently uploaded (20)

linearity concept of significance, standard deviation, chi square test, stude...
linearity concept of significance, standard deviation, chi square test, stude...linearity concept of significance, standard deviation, chi square test, stude...
linearity concept of significance, standard deviation, chi square test, stude...
 
Introducing VarSeq Dx as a Medical Device in the European Union
Introducing VarSeq Dx as a Medical Device in the European UnionIntroducing VarSeq Dx as a Medical Device in the European Union
Introducing VarSeq Dx as a Medical Device in the European Union
 
DIGITAL RADIOGRAPHY-SABBU KHATOON .pptx
DIGITAL RADIOGRAPHY-SABBU KHATOON  .pptxDIGITAL RADIOGRAPHY-SABBU KHATOON  .pptx
DIGITAL RADIOGRAPHY-SABBU KHATOON .pptx
 
Circulation through Special Regions -characteristics and regulation
Circulation through Special Regions -characteristics and regulationCirculation through Special Regions -characteristics and regulation
Circulation through Special Regions -characteristics and regulation
 
CNN-based plastic waste detection system
CNN-based plastic waste detection systemCNN-based plastic waste detection system
CNN-based plastic waste detection system
 
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
 
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)
Book Trailer: PGMEE in a Nutshell (CEE MD/MS PG Entrance Examination)
 
PT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptxPT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptx
 
The Orbit & its contents by Dr. Rabia I. Gandapore.pptx
The Orbit & its contents by Dr. Rabia I. Gandapore.pptxThe Orbit & its contents by Dr. Rabia I. Gandapore.pptx
The Orbit & its contents by Dr. Rabia I. Gandapore.pptx
 
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHY
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHYTUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHY
TUBERCULINUM-2.BHMS.MATERIA MEDICA.HOMOEOPATHY
 
Denture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of actionDenture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of action
 
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptx
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptxSURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptx
SURGICAL ANATOMY OF ORAL IMPLANTOLOGY.pptx
 
End Feel -joint end feel - Normal and Abnormal end feel
End Feel -joint end feel - Normal and Abnormal end feelEnd Feel -joint end feel - Normal and Abnormal end feel
End Feel -joint end feel - Normal and Abnormal end feel
 
In-service education (Nursing Mangement)
In-service education (Nursing Mangement)In-service education (Nursing Mangement)
In-service education (Nursing Mangement)
 
MRI Artifacts and Their Remedies/Corrections.pptx
MRI Artifacts and Their Remedies/Corrections.pptxMRI Artifacts and Their Remedies/Corrections.pptx
MRI Artifacts and Their Remedies/Corrections.pptx
 
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDF
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDFNCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDF
NCLEX RN REVIEW EXAM CONTENT BLUE BOOK PDF
 
180-hour Power Capsules For Men In Ghana
180-hour Power Capsules For Men In Ghana180-hour Power Capsules For Men In Ghana
180-hour Power Capsules For Men In Ghana
 
BMK Glycidic Acid (sodium salt) CAS 5449-12-7 Pharmaceutical intermediates
BMK Glycidic Acid (sodium salt)  CAS 5449-12-7 Pharmaceutical intermediatesBMK Glycidic Acid (sodium salt)  CAS 5449-12-7 Pharmaceutical intermediates
BMK Glycidic Acid (sodium salt) CAS 5449-12-7 Pharmaceutical intermediates
 
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.Gawad
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.GawadHemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.Gawad
Hemodialysis: Chapter 2, Extracorporeal Blood Circuit - Dr.Gawad
 
Muscle Energy Technique (MET) with variant and techniques.
Muscle Energy Technique (MET) with variant and techniques.Muscle Energy Technique (MET) with variant and techniques.
Muscle Energy Technique (MET) with variant and techniques.
 

AI and Machine Learning for Secondary Metabolite Prediction

  • 1. Artificial Intelligence /Machine Learning for Secondary Metabolite Prediction Yannick Djoumbou Feunang Corteva Agriscience, Indianapolis, IN, US Online workshop: Computational Applications in Secondary Metabolite Discovery March 9, 2021
  • 2. 2 In Silico Metabolism Prediction Task: Given a small molecule, use computational tools to predict the outcome of its interactions with metabolic enzymes. Here, we will focus on the structural elucidation of potential metabolites. (Humans; Bacteria) (Humans; Plants) (Humans; Plants) (Humans; Plants; Bacteria) Limonene (Citrus)
  • 3. Why is Metabolism so Important?
      • Bio/chemo-transformations influence changes of our chemical exposome
      • <2% of detectable peaks identifiable in large-scale untargeted Metabolomics
        ➢What are these unknowns?
        ➢How to determine their structures?
        ➢How to determine their activities?
      • Cheminformatics and AI can be used to:
        ➢Detect common patterns
        ➢Predict enzyme/ligand interactions
        ➢Understand metabolism
        ➢Generate biologically feasible structures through simulated reactions
  • 4. Why is Metabolism so Important?
      • Pro/soft-drug design
      • Exposure science
      • ADMET profiling (http://admet.scbdd.com/)
      • Nutrition science
      • Microbial biosynthesis: Strep. griseus, Artemisinin to Artemisitone-9
      [Figure: Influence of metabolism on a xenobiotic’s pharmacodynamics (PD), including pharmacological activity (Act), and toxicological effects (Tox).]
      Cited on this slide: Jin et al. (2017); Djoumbou Feunang et al. (2019); Liu et al. (2006)
  • 5. Approaches for In silico Metabolism Prediction
      • Requires:
        ➢Module to predict (and rank/score) SoMs, enzyme-substrate selectivity, or reaction groups
        ➢Library of reaction templates to apply or select (via prediction) from
      • Modules are usually specific (chemical/enzyme classes), or comprehensive (whole species)
      • Prediction approaches can be ligand- or structure-based
      • Some tools include: ML/DL-based (MetaTrans, Glory), Rule-based (MetabolExpert), Hybrid (BioTransformer, Meteor Nexus)
      [Figure: workflow for Tolclofos-methyl (Rats; mice). Panel labels: Substrate specificity (Yes/No); Sites of Metabolism (SoMs); Reaction Templates (Hydroxylation, Desulfurization, O-dealkylation, Epoxidation; "Apply rule"). Method labels: ML/DL-, Rule/Data mining-based prediction; ML/DL prediction; QM/Docking; fingerprints]
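The two ingredients listed on slide 5 (a module that decides whether and where a molecule is metabolized, plus a library of reaction templates) can be sketched as a toy pipeline. This is only an illustration under strong assumptions: molecules are reduced to tuples of hypothetical functional-group labels instead of real substructure matching, `is_substrate` is a stand-in for an ML or rule-based specificity model, and the group labels given to tolclofos-methyl are a simplification. None of this reflects how any of the named tools is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Groups = Tuple[str, ...]  # toy molecule: one label per candidate site


@dataclass(frozen=True)
class ReactionTemplate:
    name: str
    acts_on: str  # functional-group label the rule matches (hypothetical encoding)
    yields: str   # label that replaces it in the product


# Toy template library using the reaction names shown on the slide.
TEMPLATES = [
    ReactionTemplate("Hydroxylation", "aryl-CH3", "aryl-CH2OH"),
    ReactionTemplate("Desulfurization", "P=S", "P=O"),
    ReactionTemplate("O-dealkylation", "O-CH3", "O-H"),
    ReactionTemplate("Epoxidation", "C=C", "epoxide"),
]


def predict_metabolites(
    groups: Groups,
    is_substrate: Callable[[Groups], bool],
    templates: List[ReactionTemplate] = TEMPLATES,
) -> List[Tuple[str, int, Groups]]:
    """Return (reaction name, site index, product) for every applicable rule."""
    # Stage 1: substrate specificity (yes/no). Stop early on "no".
    if not is_substrate(groups):
        return []
    products = []
    # Stage 2: treat every position as a candidate site of metabolism (SoM)
    # and apply each template whose pattern matches that site.
    for site, group in enumerate(groups):
        for template in templates:
            if template.acts_on == group:
                product = groups[:site] + (template.yields,) + groups[site + 1:]
                products.append((template.name, site, product))
    return products


# Simplified, assumed description of tolclofos-methyl as group labels.
tolclofos_methyl = ("P=S", "O-CH3", "O-CH3", "aryl-CH3")
predictions = predict_metabolites(tolclofos_methyl, is_substrate=lambda g: True)
for name, site, product in predictions:
    print(name, site, product)
```

A real system would replace the label comparison with substructure matching and would rank or filter the candidate sites rather than accepting every match.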
  • 6. Ligand-based Prediction
      • Use structures of ligands/non-ligands to predict specificity, accessibility to enzymes
      • Some methods include, among others:
        ➢QSAR/QSMR (SoM, BoM, ESS)
        ➢Data mining/fingerprints (SoM)
        ➢Substructures/Rules (ESS, Reaction)
      • Advantages:
        ➢Speed
        ➢Seem to perform as well as structure-based methods
      • Disadvantages:
        ➢Approaches/models/rule bases tend not to work on novel chemistries
        ➢Data quantity and quality are limiting factors
      [Figure: Hydroxylation of methyl carbon adjacent to aliphatic ring]
      Cited on this slide: Strieth-Kalthoff et al. (2020)
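As a rough illustration of the fingerprint-based methods on slide 6, the sketch below compares a query molecule to known substrates using the Tanimoto coefficient, |A ∩ B| / |A ∪ B|, over sets of "on" fingerprint bits. The bit sets and reference names are made up for the example; a real workflow would compute fingerprints with a cheminformatics toolkit.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(query, reference):
    """Return (name, similarity) of the reference fingerprint closest to the query."""
    return max(
        ((name, tanimoto(query, fp)) for name, fp in reference.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical fingerprints, given as sets of "on" bit positions.
reference = {
    "known_substrate_A": {1, 4, 7, 9},
    "known_substrate_B": {2, 3, 5},
}
query = {1, 4, 7, 8}

name, score = most_similar(query, reference)
print(name, score)
```

A low best score suggests the query lies outside the chemistry the reference set covers, which is one way to read the "novel chemistries" limitation listed above.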
  • 7. Structure-based Prediction
      • Explicitly models the interaction within the enzyme’s binding pocket
      • Helps predict sites of metabolism (SoMs)
      • Some methods include, among others:
        ➢Protein-ligand docking
        ➢Molecular dynamics
      • Advantages:
        ➢Detailed study of the enzyme-substrate interaction
      • Disadvantages:
        ➢High computational cost of modeling structural flexibility
      Cited on this slide: Ménard et al. (2012)
  • 8. Examples of In Silico Metabolism Prediction Tools
      Based on their coverage, in silico metabolism prediction tools can be classified as:
        1. Specific tools, which apply to simple biological systems (e.g. a small set of enzymes), and often cover a rather small chemical space.
        2. Comprehensive tools, which apply to a more diverse and larger set of enzymes and species, and cover a larger chemical space.
      Table: Examples of tools for metabolite prediction
        Prediction Tool | Accessibility | Spectrum | Approach | Methods | Availability
        BioTransformer | Web server / Command line | Comprehensive | Ligand-based | Machine Learning, Expert system | Open source
        Meteor Nexus | GUI-based standalone | Comprehensive | Ligand-based | Machine Learning, Expert system | Commercial
        GLORYx | Web server | Comprehensive | Ligand-based | Machine Learning | Freely available
        SmartCYP | Web server | Specific | Combined | Machine Learning | Freely available
        MetaSite | GUI-based standalone | Specific | Combined | Machine Learning | Commercial
        MetabolExpert | GUI-based standalone | Specific | Ligand-based | Expert system | Commercial
  • 9. Enhancing Metabolite Discovery and Identification
      • Measuring the exposome: MS/NMR-based Metabolomics
        ➢>2×10^5 known metabolites
        ➢>3×10^6 expected metabolites
      • Expand the databases with predicted structures
        ➢More structures available for mass/formula matching to annotate the peaks
      • Add predicted spectra for these metabolites
        ➢More structures and spectra available for database search
      • Higher identification rates
      [Figure: detected peaks marked "?" before database expansion, several marked "√" afterwards]
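The mass/formula matching step on slide 9 can be sketched as follows: compute each candidate's monoisotopic mass from its formula, add a proton for the [M+H]+ ion, and keep candidates within a ppm tolerance of the observed m/z. The candidate list, the observed value, and the 5 ppm tolerance are assumptions chosen for illustration; only the [M+H]+ adduct is handled.

```python
import re

# Monoisotopic masses of the most abundant isotopes (unified atomic mass units).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}
PROTON = 1.007276466  # mass added for the [M+H]+ adduct


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C15H14O6' (no brackets)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def annotate(observed_mz, candidates, ppm_tol=5.0):
    """Return (name, ppm error) for candidates whose [M+H]+ matches the peak."""
    hits = []
    for name, formula in candidates.items():
        expected = monoisotopic_mass(formula) + PROTON
        ppm = (observed_mz - expected) / expected * 1e6
        if abs(ppm) <= ppm_tol:
            hits.append((name, round(ppm, 2)))
    return hits


# Illustrative database: known compounds plus one predicted metabolite.
candidates = {
    "limonene": "C10H16",
    "epicatechin": "C15H14O6",
    "O-methyl-epicatechin (predicted)": "C16H16O6",
}
hits = annotate(291.0865, candidates)
print(hits)
```

Isomers share a formula and therefore a mass, so this step alone cannot tell them apart; that is where the predicted spectra mentioned on the slide come in.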
  • 10. BioTransformer: Open Source for Metabolite Identification
      • Djoumbou-Feunang et al. J Cheminform. 2019 Jan 5;11(1):2; DOI 10.1186/s13321-018-0324-5
      • Open Source: bitbucket.org/djoumbou/biotransformer
      • Docker: https://hub.docker.com/r/djoy2018/biotransformer-cl
      • RESTful web server: www.biotransformer.ca
  • 11. BioTransformer: Open Source for Metabolite Identification
      [Figure: Epicatechin]
  • 12. Challenges in Metabolism Prediction
      • Challenges in understanding metabolism lead to lower prediction accuracy
        ➢ML tools can rank SoMs with high accuracy, but not the resulting biotransformations
        ➢Knowledge-driven approaches achieve higher recall, but suffer from combinatorial explosion
      • Limited accessibility/availability of tools
        ➢Most tools developed in academia are available only via web server, raising confidentiality issues for most potential users
        ➢Restricted shareability of data generated by commercial tools
      • Limited availability of high-quality data
        ➢Crucial for achieving better accuracy, higher coverage, and larger applicability domains
        ➢Needed to derive more reaction rules
        ➢However, data curation (sites of metabolism, reaction annotation) is tedious and expensive
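The combinatorial explosion mentioned on slide 12 can be made concrete with a back-of-the-envelope count. Assuming, purely for illustration, that every compound yields the same number of products per step and that no two products coincide, the number of candidates after n steps is b + b^2 + ... + b^n. Real rule bases produce duplicates and varying branching, so this is an upper-bound sketch, not a measurement of any tool.

```python
def max_candidates(branching, steps):
    """Upper bound on predicted metabolites after `steps` rounds of rule application.

    Assumes each compound yields `branching` distinct products per round.
    """
    return sum(branching ** k for k in range(1, steps + 1))


for steps in (1, 2, 3, 4):
    print(steps, max_candidates(5, steps))
```

Under these assumptions, five products per compound already gives 155 candidates after three steps, which is why ranking or filtering between steps matters.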
  • 13. Outlook
      • Data sharing
        ➢Increase the amount of high-quality, publicly available, curated, and downloadable data (substrate-product pairs, sites of metabolism), e.g. MetXBioDB, PubChem, XMetDB
        ➢Develop publicly accessible databases of predicted metabolites (e.g. BioTransformerDB)
        ➢Community-wide efforts needed
      • Novel and innovative AI approaches
        ➢Seq2Seq transformer architectures have proven applicable to end-to-end learning-based methods, with results comparable to existing tools (MetaTrans: Litsa et al. 2020)
          ▪ Could improve accuracy, while bypassing manual rule design
        ➢Bonds of Metabolism (BoMs) seem to provide a better description of reaction centers than SoMs, leading to higher accuracy (Upcoming - Tian, et al. (2021))
      • Open source communities
        ➢Provide means for easier user feedback loops; valuable for improving prediction tools
        ➢Provide developers opportunities to improve software tools in an agile, continuous manner
          ▪ Successful projects include BioTransformer, Chemistry Development Kit, and KNIME
  • 14. Thank You
      • The organizing committee:
        ➢Fidele Ntie-Kang
        ➢J. Ludwig Muller
      • The Wishart Lab @UofAlberta, Canada
      • Corteva Agriscience
      • The BioTransformer community
      • The listeners
      • To learn more about BioTransformer:
        ➢HS03: In Silico Prediction and Identification of Metabolites with BioTransformer: Enabling Secondary Metabolite Discovery; March 10, 2021