Prediction of Reaction Conditions
for Michael Additions
G. Marcou, J. Aires de Sousa, A. de Luca, D. Latino, V. Rietsch,
D. Horvath, A. Varnek
ChemAxon UGM, Budapest 19-20 May 2015
• Different scenarios of structure-reactivity modeling
• Descriptors for reactions
- Condensed Graph of Reaction / ISIDA descriptors
- Electron Effect Descriptors
• Michael reaction case
OUTLINE
Chemical reactions are difficult objects
- many species
- two types of species: reactants and products
- multi-step reactions
- dependent on experimental conditions
• How can I synthesize this structure?
• How can I estimate the yield of a given reaction, and its
kinetic and thermodynamic parameters?
• Which reaction conditions should I choose in order
to obtain the desired product selectively?
Chemical reactions in Chemoinformatics
Quantitative Structure-Activity/Property
Relationship (QSAR/QSPR)
Property = f(structure)
Property = f(Descriptors)
Descriptors: parameters directly derived from molecular structure
• substructural fragments
• topological indices
• physico-chemical parameters
• etc.
f: mathematical relationship established with machine learning methods
• Support Vector Machine (SVM)
• Multi-Linear Regression (MLR)
• Artificial Neural Networks
• etc.
• Reactivity = f(Reactants structure)
• Reactivity = f(Products structure)
• Reactivity = f(Reaction structure)
Quantitative Structure-Reactivity Relationship
(QSRR)
Conventional bonds:
single, double, aromatic, …
Dynamical bonds:
created single, broken single,
…
CGR could be viewed as a pseudo-molecule representing a given reaction
Condensed Graph of Reaction
Condensed Graph of Reaction
• CGR condenses the structural information about products
and reactants: several graphs are merged into a single graph
• This simplified representation makes it possible to
apply to CGR the methods developed in
chemoinformatics for individual molecules
A. Varnek, D. Fourches, F. Hoonakker, V. P. Solov’ev, J. Computer-Aided Molecular Design, 2005, 19, 693-703
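The condensation step can be illustrated with a minimal sketch. It assumes atom-mapped reactants and products given as plain dictionaries of bond orders, with hydrogens left implicit; the function name `condense`, the atom numbering and the toy Michael addition are illustrative assumptions, not the ISIDA implementation.

```python
def condense(reactant_bonds, product_bonds):
    """Merge reactant and product bond tables into one CGR bond table.

    Each input maps a frozenset of two atom-map numbers to a bond order
    (1 = single, 2 = double, 3 = triple). The result maps the same keys
    to a label: a conventional bond if the bond is unchanged, a dynamical
    bond if it is created, broken or changes order.
    """
    names = {1: "single", 2: "double", 3: "triple"}
    cgr = {}
    for pair in set(reactant_bonds) | set(product_bonds):
        before = reactant_bonds.get(pair, 0)  # 0 means "no bond"
        after = product_bonds.get(pair, 0)
        if before == after:
            cgr[pair] = names[before]                  # conventional bond
        elif before == 0:
            cgr[pair] = "created " + names[after]      # dynamical bond
        elif after == 0:
            cgr[pair] = "broken " + names[before]      # dynamical bond
        else:
            cgr[pair] = names[before] + " to " + names[after]
    return cgr


# Hypothetical atom map: 1 = Nu, 2 = C(beta), 3 = C(alpha), 4 = C(=O), 5 = O
reactants = {frozenset({2, 3}): 2, frozenset({3, 4}): 1, frozenset({4, 5}): 2}
products = {frozenset({1, 2}): 1, frozenset({2, 3}): 1,
            frozenset({3, 4}): 1, frozenset({4, 5}): 2}

cgr = condense(reactants, products)
for pair in sorted(cgr, key=sorted):
    print(sorted(pair), cgr[pair])
```

In this toy case the Nu-C bond comes out as "created single" and the former C=C bond as "double to single", while the two untouched bonds keep their conventional labels.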
ISIDA fragment descriptors
A reaction can be encoded by a descriptor vector,
which can be used in structure-reactivity modeling
[Figure: condensed graph of reaction and its fragment-count descriptor vector (2 1 2 …)]
ISIDA/CGR fragment descriptors
A. Varnek In: "Chemoinformatics and Computational Chemical Biology", J. Bajorath, Ed., Springer, 2010
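How a pseudo-molecule might be turned into a vector of fragment counts can be sketched as follows. This is only a simplified illustration of counting linear atom/bond sequences; the function `fragment_counts`, the bond symbols `[0>1]` (created single) and `[2>1]` (double to single) and the toy graph are assumptions made for the example and do not reproduce the actual ISIDA fragmentation scheme.

```python
from collections import Counter


def fragment_counts(atoms, bonds, max_atoms=3):
    """Count linear atom/bond sequences containing 2..max_atoms atoms.

    atoms: dict atom index -> element symbol
    bonds: dict (index, index) -> bond symbol (conventional or dynamical)
    """
    adjacency = {a: [] for a in atoms}
    for (a, b), label in bonds.items():
        adjacency[a].append((b, label))
        adjacency[b].append((a, label))

    counts = Counter()

    def walk(path, tokens):
        # Every path is reached from both ends; keep it once and store
        # the lexicographically smaller of its two readings.
        if len(path) >= 2 and path[0] < path[-1]:
            forward = "".join(tokens)
            backward = "".join(reversed(tokens))
            counts[min(forward, backward)] += 1
        if len(path) == max_atoms:
            return
        for neighbour, label in adjacency[path[-1]]:
            if neighbour not in path:
                walk(path + [neighbour], tokens + [label, atoms[neighbour]])

    for start in atoms:
        walk([start], [atoms[start]])
    return counts


# Toy CGR of an aza-Michael addition (hypothetical labels and numbering)
atoms = {1: "N", 2: "C", 3: "C", 4: "C", 5: "O"}
bonds = {(1, 2): "[0>1]", (2, 3): "[2>1]", (3, 4): "-", (4, 5): "="}

two_atom = fragment_counts(atoms, bonds, max_atoms=2)
up_to_three = fragment_counts(atoms, bonds, max_atoms=3)
print(sorted(two_atom.items()))
```

The counter keys play the role of descriptor names and the counts are the vector components; fragments that contain a dynamical bond are the ones that carry information about the transformation itself.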
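Sums of property-dependent contributions P(i) from remote atoms i, at topological distance $\tau_{iK}$
away from the ‘reactive’ center K, modulated by various working hypotheses:

$$\mathrm{EED}_{p,e,o,w,c}(K) = \sum_{i=1}^{N} \delta_c(i,K) \times P_p^{e}(i) \times \exp\left[-\frac{1}{w}\left(\tau_{iK} - o\right)^2\right]$$

$$\tau_{iK} = \begin{cases} 0.5 & \text{if } i = K \\ \text{shortest-path topological distance}(i,K) & \text{otherwise} \end{cases}$$

$$\delta_c(i,K) = \begin{cases} 1 & \text{if } c = 0 \text{ OR (path between } i \text{ and } K \text{ contains only unsaturated atoms)} \\ 0 & \text{otherwise} \end{cases}$$

Property type p = 1..11
Property power e = 1..2
Neighborhood control o = 1..4
Neighborhood width w = 2..5
Conjugation toggle c = 0, 1
11 × 2 × 4 × 4 × 2 = 704 EED terms…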
Electron Effect Descriptors
M.Elhabiri, E. Davioud-Charvet, D. Horvath, A. Varnek et al., Chemistry Eur. J, 2014, 20, 1 – 11
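A small sketch may help to see how one EED term is assembled. It follows one reading of the slide formula, with the Gaussian-like weight taken as exp[-(τ − o)²/w], and it assumes that with c = 1 an atom contributes only if it can be reached from K by walking over unsaturated atoms exclusively (K and the atom included). The function names, the toy three-atom chain and the property values are illustrative assumptions, not the authors' code.

```python
import math
from collections import deque
from itertools import product


def topological_distances(adjacency, source):
    """Breadth-first shortest-path bond counts from source to reachable atoms."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def eed(adjacency, prop, K, e=1, o=1, w=2, c=0, unsaturated=frozenset()):
    """One EED term for reactive center K (assumed reading of the formula)."""
    tau = topological_distances(adjacency, K)
    if c == 1:
        # Assumption: keep only atoms linked to K through unsaturated atoms.
        sub = {a: [b for b in nbs if b in unsaturated]
               for a, nbs in adjacency.items() if a in unsaturated}
        allowed = set(topological_distances(sub, K)) if K in unsaturated else set()
    else:
        allowed = set(tau)
    total = 0.0
    for i in allowed:
        t = 0.5 if i == K else tau[i]
        total += prop[i] ** e * math.exp(-((t - o) ** 2) / w)
    return total


# Toy chain 0-1-2 with a unit property on every atom (hypothetical values)
adjacency = {0: [1], 1: [0, 2], 2: [1]}
prop = {0: 1.0, 1: 1.0, 2: 1.0}

plain = eed(adjacency, prop, K=0)
conjugated = eed(adjacency, prop, K=0, c=1, unsaturated=frozenset({0, 1}))

# Enumerating the parameter ranges listed on the slide gives the 704 terms.
grid = list(product(range(1, 12), range(1, 3), range(1, 5), range(2, 6), (0, 1)))
print(plain, conjugated, len(grid))
```

With o = 1 the weight peaks on the first neighbours of K, so increasing o shifts attention to more remote atoms and increasing w widens the neighbourhood that is taken into account.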
Atom Properties Tracked in EED
The reaction occurs under different conditions, characterized by solvent and catalyst
• Solvent
- Hydrophobic (kerosene, benzene, …)
- Polar aprotic (THF, DMSO, …)
- Polar protic (water, acetic acid, …)
- No solvent: the reaction occurs in the pure reagents
• Catalyst
- Brønsted acids (hydrochloric acid, …)
- Lewis acids (transition metal ions, …)
- Basic (pyridine, Na ethanolate, …)
- No catalyst: a catalyst is not needed, or autocatalysis takes place
Michael reaction
Michael donor
Michael acceptor
For given Nu and R1–R4, which conditions
(catalyst, solvent) lead to a high reaction yield?
Michael reaction
• to build QSRR models predicting optimal reaction
conditions for a query reaction.
Each model answers a specific question such as “Is this process
feasible with Brønsted acid catalysts?”, “Is this process feasible in polar
aprotic solvents?”, etc.
• to develop a public predictive tool adapted to the
Michael-type reaction case
Michael reaction: goals of the modeling
Michael reaction: data
The data set consists of 222 reactions:
• 53 polar aprotic solvent
• 52 no solvent
• 103 polar protic solvent
• 93 Lewis acid catalyst
• 61 no catalyst
• 40 hydrophobic solvent
• 57 Brønsted acid catalyst
• 45 basic catalyst
• 24 decoys (not observed Michael reactions)
Note that the same reaction may proceed under
different conditions!
Non-occurring reaction
Occurring reaction:
condensation of hydroxylamine and aldehyde
Decoy Michael Addition
The compatibility of each Michael reaction in the database with each of the condition
classes is encoded in the condition bitvector.

Bit   Condition class            Example
1     Solvent: hydrophobic       0
2     Solvent: polar aprotic     1
3     Solvent: polar protic      1
4     No solvent                 0
5     Catalyst: Brønsted acid    0
6     Catalyst: Lewis acid       0
7     Catalyst: basic            1
8     No catalyst                0
9     Not observed               0

A bit is set on if the reaction was seen to happen under this condition.
NOTE: if a bit is off, it means we don’t know whether the reaction is feasible under that condition!
An additional ‘no-go’ bit is on if the given Michael addition should never be
observed (because of competing carbonyl addition).
Bitvector of reaction conditions
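The encoding can be written down as a short sketch. The bit order follows the label order on the slide, which is an assumption about how the bits and labels line up; the names `CONDITIONS`, `encode` and `describe` are hypothetical helpers introduced for illustration.

```python
# Assumed bit order, taken from the order of the labels on the slide.
CONDITIONS = (
    "Solvent: hydrophobic",
    "Solvent: polar aprotic",
    "Solvent: polar protic",
    "No solvent",
    "Catalyst: Bronsted acid",
    "Catalyst: Lewis acid",
    "Catalyst: basic",
    "No catalyst",
    "Not observed",
)


def encode(observed):
    """Return the 9-bit condition vector for a set of observed conditions."""
    unknown = set(observed) - set(CONDITIONS)
    if unknown:
        raise ValueError("unknown condition classes: %s" % sorted(unknown))
    return [1 if name in observed else 0 for name in CONDITIONS]


def describe(bits):
    """List the condition classes whose bit is on.

    An off bit only means "not seen in the data", not "infeasible".
    """
    return [name for name, bit in zip(CONDITIONS, bits) if bit]


bits = encode({"Solvent: polar aprotic", "Solvent: polar protic", "Catalyst: basic"})
print(bits)
print(describe(bits))
```

Each of the nine positions then serves as the target of its own two-class model, so one reaction contributes a label to every model.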
Modeling setup
Goal: preparation of 9 two-class classification models:
- 8 models for reaction feasibility, one for each catalyst or solvent type,
- 1 model “Michael / non-Michael”
Descriptors: ISIDA/CGR, EED … and also MOLMAP, CDK
Scenarios: reagent-based, product-based and reaction-based
Performance assessment: ROC AUC in 3-fold cross-validation
Machine-learning methods: Random Forest, SVM, Naïve Bayes
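The performance measure can be made concrete with a dependency-free sketch. It computes ROC AUC as the fraction of (positive, negative) pairs that the model ranks correctly and shows one simple way to split items into 3 folds; the interleaved split, the function names and the toy labels and scores are assumptions for illustration, since the slides do not specify how the folds were drawn.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute ROC AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_items, k=3):
    """Yield (train, test) index lists for a simple interleaved k-fold split."""
    for fold in range(k):
        test = list(range(fold, n_items, k))
        held_out = set(test)
        train = [i for i in range(n_items) if i not in held_out]
        yield train, test


# Toy labels (1 = observed under the condition) and model scores
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))

folds = list(k_fold_indices(6))
print(folds)
```

A value of 0.5 corresponds to random ranking, which is why the performance plots in this deck use an axis running from 0.5 to 1.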
CGR for Michael reaction
Dynamical bonds: created single, double to single
Michael reaction: models performance
[Figures: ROC AUC (axis from 0.5 to 1) of the EED and ISIDA models for each condition class (Solv:A, Solv:NA, Solv:P, Cat:LA, Cat:NA, Solv:H, Cat:BA, Cat:B, Cat:NO); one figure for reaction descriptors, one for reagent descriptors]
ROC AUC = 1 corresponds to an ideal model
G. Marcou, J. Aires de Sousa, D. Latino, A. Deluca, D. Horvath, V. Rietsch, and A. Varnek, J. Chem. Inf. Model., 2015, 55, 239−250.
Web-based expert system predicting
Michael addition feasibility
http://infochim.u-strasbg.fr/webserv/VSEngine.html
Model validation on a set of 52 reactions extracted from the literature

Model                   # in the set   # correctly predicted
Catalyst and solvent    52             8
Aprotic solvent         12             2
Protic solvent          21             21
No catalyst             26             14
Base-catalyzed          19             8
Brønsted acids          2              1

Model failure, or incomplete data for the modeling?
Michael reaction: external validation
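The external validation counts are easier to compare as percentages. The short sketch below only restates the numbers from the table above; the dictionary layout and variable names are illustrative.

```python
# (number in the set, number correctly predicted), copied from the table
validation = {
    "Catalyst and solvent": (52, 8),
    "Aprotic solvent": (12, 2),
    "Protic solvent": (21, 21),
    "No catalyst": (26, 14),
    "Base-catalyzed": (19, 8),
    "Bronsted acids": (2, 1),
}

# Percentage of correctly predicted reactions per model, one decimal place
hit_rates = {
    model: round(100.0 * correct / total, 1)
    for model, (total, correct) in validation.items()
}

for model, rate in hit_rates.items():
    print(f"{model:<22} {rate:5.1f} %")
```

The spread is wide, from all 21 protic-solvent cases down to 8 of 52 for the combined catalyst and solvent prediction, and some rows rest on very few reactions (only 2 for Brønsted acids).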
Both ISIDA/CGR and EED descriptors provide
acceptable models in cross-validation
Reagent, Product and Reaction scenarios perform
similarly
The models cross-validate very well, but external testing
is inconclusive: more data are needed!
Conclusions
J. Chem. Inf. Model., 2015, 55, 239−250
• Campus France for a Franco-Portuguese grant
“PESSOA”
• ChemAxon for the license
Thanks
EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert system for predicting reaction conditions: the Michael reaction case

More Related Content

Similar to EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert system for predicting reaction conditions: the Michael reaction case

Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Atai Rabby
 
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACS
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACSExtracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACS
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACSSimBioSys_Inc
 
Molecular design: One step back and two paths forward
Molecular design:  One step back and two paths forwardMolecular design:  One step back and two paths forward
Molecular design: One step back and two paths forwardPeter Kenny
 
Molecular design: How to and how not to?
Molecular design:  How to and how not to?Molecular design:  How to and how not to?
Molecular design: How to and how not to?Peter Kenny
 
systems biology- Representation of chemical reaction networks
 systems biology- Representation of chemical reaction networks systems biology- Representation of chemical reaction networks
systems biology- Representation of chemical reaction networksShubham Kaushik
 
Drug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIDrug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIIndrajeetKumar124
 
Large scale classification of chemical reactions from patent data
Large scale classification of chemical reactions from patent dataLarge scale classification of chemical reactions from patent data
Large scale classification of chemical reactions from patent dataGreg Landrum
 
ACS 238th Meeting, 2009, Rasulev
ACS 238th Meeting, 2009, RasulevACS 238th Meeting, 2009, Rasulev
ACS 238th Meeting, 2009, RasulevB R
 
Jacob Kleine undergrad. Thesis
Jacob Kleine undergrad. ThesisJacob Kleine undergrad. Thesis
Jacob Kleine undergrad. ThesisJacob Kleine
 
EcoEngines Chemical Kinetics
EcoEngines Chemical KineticsEcoEngines Chemical Kinetics
EcoEngines Chemical KineticsEdward Blurock
 
Lecture 11 developing qsar, evaluation of qsar model and virtual screening
Lecture 11  developing qsar, evaluation of qsar model and virtual screeningLecture 11  developing qsar, evaluation of qsar model and virtual screening
Lecture 11 developing qsar, evaluation of qsar model and virtual screeningRAJAN ROLTA
 
Cadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmCadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmShikha Popali
 
chemical reaction engineering
chemical reaction engineeringchemical reaction engineering
chemical reaction engineeringH.M.Azam Azam
 

Similar to EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert system for predicting reaction conditions: the Michael reaction case (20)

Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)
 
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACS
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACSExtracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACS
Extracting Synthetic Knowledge from Reaction Databases - ARChem at the 246th ACS
 
Drug design
Drug designDrug design
Drug design
 
Molecular design: One step back and two paths forward
Molecular design:  One step back and two paths forwardMolecular design:  One step back and two paths forward
Molecular design: One step back and two paths forward
 
Molecular design: How to and how not to?
Molecular design:  How to and how not to?Molecular design:  How to and how not to?
Molecular design: How to and how not to?
 
systems biology- Representation of chemical reaction networks
 systems biology- Representation of chemical reaction networks systems biology- Representation of chemical reaction networks
systems biology- Representation of chemical reaction networks
 
Drug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIDrug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AI
 
Qsar
QsarQsar
Qsar
 
Large scale classification of chemical reactions from patent data
Large scale classification of chemical reactions from patent dataLarge scale classification of chemical reactions from patent data
Large scale classification of chemical reactions from patent data
 
Qsar
QsarQsar
Qsar
 
ACS 238th Meeting, 2009, Rasulev
ACS 238th Meeting, 2009, RasulevACS 238th Meeting, 2009, Rasulev
ACS 238th Meeting, 2009, Rasulev
 
Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)
 
Presentation
PresentationPresentation
Presentation
 
Virtual sreening
Virtual sreeningVirtual sreening
Virtual sreening
 
Jacob Kleine undergrad. Thesis
Jacob Kleine undergrad. ThesisJacob Kleine undergrad. Thesis
Jacob Kleine undergrad. Thesis
 
EcoEngines Chemical Kinetics
EcoEngines Chemical KineticsEcoEngines Chemical Kinetics
EcoEngines Chemical Kinetics
 
Lecture 11 developing qsar, evaluation of qsar model and virtual screening
Lecture 11  developing qsar, evaluation of qsar model and virtual screeningLecture 11  developing qsar, evaluation of qsar model and virtual screening
Lecture 11 developing qsar, evaluation of qsar model and virtual screening
 
Cadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmCadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.Pharm
 
Research Poster
Research PosterResearch Poster
Research Poster
 
chemical reaction engineering
chemical reaction engineeringchemical reaction engineering
chemical reaction engineering
 

More from ChemAxon

Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?
Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?
Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?ChemAxon
 
Chemaxon EU UGM 2022 | Translating data to predictive models
Chemaxon EU UGM 2022 | Translating data to predictive modelsChemaxon EU UGM 2022 | Translating data to predictive models
Chemaxon EU UGM 2022 | Translating data to predictive modelsChemAxon
 
Translating data to predictive models
Translating data to predictive modelsTranslating data to predictive models
Translating data to predictive modelsChemAxon
 
Efficient biomolecular structural data handling and analysis - Webinar with D...
Efficient biomolecular structural data handling and analysis - Webinar with D...Efficient biomolecular structural data handling and analysis - Webinar with D...
Efficient biomolecular structural data handling and analysis - Webinar with D...ChemAxon
 
Biomolecule structural data management
Biomolecule structural data managementBiomolecule structural data management
Biomolecule structural data managementChemAxon
 
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first release
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first releaseCheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first release
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first releaseChemAxon
 
Enhanced stereochemistry representation
Enhanced stereochemistry representation Enhanced stereochemistry representation
Enhanced stereochemistry representation ChemAxon
 
Intellectual property (IP) intelligence solutions designed for the way resear...
Intellectual property (IP) intelligence solutions designed for the way resear...Intellectual property (IP) intelligence solutions designed for the way resear...
Intellectual property (IP) intelligence solutions designed for the way resear...ChemAxon
 
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...ChemAxon
 
Patent Data for Artificial Intelligence based Drug Discovery
Patent Data for Artificial Intelligence based Drug DiscoveryPatent Data for Artificial Intelligence based Drug Discovery
Patent Data for Artificial Intelligence based Drug DiscoveryChemAxon
 
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...ChemAxon
 
Research data management on the cloud
Research data management on the cloudResearch data management on the cloud
Research data management on the cloudChemAxon
 
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound Registration
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound RegistrationCheminfo Stories APAC 2020 - Introducing Design Hub & Compound Registration
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound RegistrationChemAxon
 
Cheminfo Stories APAC 2020 - JChem Engines introduction
Cheminfo Stories APAC 2020 - JChem Engines introduction Cheminfo Stories APAC 2020 - JChem Engines introduction
Cheminfo Stories APAC 2020 - JChem Engines introduction ChemAxon
 
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...ChemAxon
 
Cheminfo Stories APAC 2020 -- Markush technology
Cheminfo Stories APAC 2020 -- Markush technology Cheminfo Stories APAC 2020 -- Markush technology
Cheminfo Stories APAC 2020 -- Markush technology ChemAxon
 
JChem Microservices
JChem MicroservicesJChem Microservices
JChem MicroservicesChemAxon
 
Migration from joc to jpc or choral
Migration from joc to jpc or choralMigration from joc to jpc or choral
Migration from joc to jpc or choralChemAxon
 
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5ChemAxon
 
Chemicalize Pro - Cheminfo Stories 2020 Day 5
Chemicalize Pro - Cheminfo Stories 2020 Day 5Chemicalize Pro - Cheminfo Stories 2020 Day 5
Chemicalize Pro - Cheminfo Stories 2020 Day 5ChemAxon
 

More from ChemAxon (20)

Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?
Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?
Akos Tarcsay (ChemAxon): How fast is Chemaxon RDBMS Search?
 
Chemaxon EU UGM 2022 | Translating data to predictive models
Chemaxon EU UGM 2022 | Translating data to predictive modelsChemaxon EU UGM 2022 | Translating data to predictive models
Chemaxon EU UGM 2022 | Translating data to predictive models
 
Translating data to predictive models
Translating data to predictive modelsTranslating data to predictive models
Translating data to predictive models
 
Efficient biomolecular structural data handling and analysis - Webinar with D...
Efficient biomolecular structural data handling and analysis - Webinar with D...Efficient biomolecular structural data handling and analysis - Webinar with D...
Efficient biomolecular structural data handling and analysis - Webinar with D...
 
Biomolecule structural data management
Biomolecule structural data managementBiomolecule structural data management
Biomolecule structural data management
 
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first release
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first releaseCheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first release
Cheminfo Stories 2021 | Virtual UGM | Marvin Pro: The first release
 
Enhanced stereochemistry representation
Enhanced stereochemistry representation Enhanced stereochemistry representation
Enhanced stereochemistry representation
 
Intellectual property (IP) intelligence solutions designed for the way resear...
Intellectual property (IP) intelligence solutions designed for the way resear...Intellectual property (IP) intelligence solutions designed for the way resear...
Intellectual property (IP) intelligence solutions designed for the way resear...
 
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...
 
Patent Data for Artificial Intelligence based Drug Discovery
Patent Data for Artificial Intelligence based Drug DiscoveryPatent Data for Artificial Intelligence based Drug Discovery
Patent Data for Artificial Intelligence based Drug Discovery
 
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
Cheminfo Stories APAC 2020 - Chemical Descriptors & Standardizers for Machine...
 
Research data management on the cloud
Research data management on the cloudResearch data management on the cloud
Research data management on the cloud
 
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound Registration
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound RegistrationCheminfo Stories APAC 2020 - Introducing Design Hub & Compound Registration
Cheminfo Stories APAC 2020 - Introducing Design Hub & Compound Registration
 
Cheminfo Stories APAC 2020 - JChem Engines introduction
Cheminfo Stories APAC 2020 - JChem Engines introduction Cheminfo Stories APAC 2020 - JChem Engines introduction
Cheminfo Stories APAC 2020 - JChem Engines introduction
 
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...
Cheminfo Stories APAC 2020 - Database management on desktop with JChem for Of...
 
Cheminfo Stories APAC 2020 -- Markush technology
Cheminfo Stories APAC 2020 -- Markush technology Cheminfo Stories APAC 2020 -- Markush technology
Cheminfo Stories APAC 2020 -- Markush technology
 
JChem Microservices
JChem MicroservicesJChem Microservices
JChem Microservices
 
Migration from joc to jpc or choral
Migration from joc to jpc or choralMigration from joc to jpc or choral
Migration from joc to jpc or choral
 
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5
ChemAxon's Compliance Checker - Cheminfo Stories 2020 Day 5
 
Chemicalize Pro - Cheminfo Stories 2020 Day 5
Chemicalize Pro - Cheminfo Stories 2020 Day 5Chemicalize Pro - Cheminfo Stories 2020 Day 5
Chemicalize Pro - Cheminfo Stories 2020 Day 5
 

Recently uploaded

What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 

Recently uploaded (20)

Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 

EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert system for predicting reaction conditions: the Michael reaction case

  • 1. Prediction of Reaction Conditions for Michael Additions G. Marcou, J. Aires de Sousa, A. de Luca, D. Latino, V. Rietsch, D. Horvath, A. Varnek ChemAxon UGM, Budapest 19-20 May 2015
  • 2. • Different scenarios of structure-reactivity modeling • Descriptors for reactions - Condensed Graph of Reaction / ISIDA descriptors - Electron Effects Descriptors • Michael reaction case OUTLINE
  • 3. Chemical reactions are difficult objects + + - many species; - two types of species: reactants and products; - multi-step reactions, - dependent on experimental conditions
  • 4. • How can I synthesize this structure? • How can I estimate the yield of a given reaction, or its kinetic and thermodynamic parameters? • Which reaction conditions should I choose in order to obtain the desired product selectively? Chemical reactions in Chemoinformatics
  • 5. Property = f(Descriptors). Descriptors (parameters directly derived from molecular structure): • substructural fragments • topological indices • physico-chem. parameters • etc… f (mathematical relationship established with machine learning methods): • Support Vector Machine (SVM) • Multi-Linear Regression (MLR) • Artificial Neural Networks • etc… Quantitative Structure-Activity/Property Relationship (QSAR/QSPR): Property = f(structure)
  • 6. • Reactivity = f(Reactants structure) • Reactivity = f(Products structure) • Reactivity = f(Reaction structure) Quantitative Structure-Reactivity Relationship (QSRR)
  • 7. • Different scenarios of structure-reactivity modeling • Descriptors for reactions - Condensed Graph of Reaction / ISIDA descriptors - Electron Effect Descriptors • Michael reaction case OUTLINE
  • 8. Conventional bonds: single, double, aromatic, … Dynamical bonds: created single, broken single, … A CGR can be viewed as a pseudo-molecule representing a given reaction. Condensed Graph of Reaction
  • 9. Condensed Graph of Reaction • A CGR condenses the structural information about reactants and products (several graphs) into a single graph • This simplified representation makes it possible to apply to CGRs the methods developed in chemoinformatics for individual molecules. A. Varnek, D. Fourches, F. Hoonakker, V. P. Solov’ev, J. Computer-Aided Molecular Design, 2005, 19, 693-703
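The condensation idea can be sketched in a few lines of Python. This is only an illustration, not the ISIDA implementation: the bond-table representation, the helper names `bond`, `label` and `condense`, and the simplified atom numbering of a generic Michael addition (1 = Nu, 2 = H, 3 = C-beta, 4 = C-alpha, 5 = carbonyl C, 6 = O) are all assumptions made for this example.

```python
# Hypothetical sketch: merge atom-mapped reactant and product bond tables
# into one CGR-like table whose bonds carry "conventional" or "dynamical" labels.
BOND_NAMES = {1: "single", 2: "double", 3: "triple"}


def bond(a, b):
    """Order-independent key for a bond between two atom-map numbers."""
    return frozenset((a, b))


def label(before, after):
    """Name a CGR bond from its order in the reactants and in the products (0 = absent)."""
    if before == after:
        return BOND_NAMES[before]                      # conventional bond
    if before == 0:
        return "created " + BOND_NAMES[after]          # dynamical: bond formed
    if after == 0:
        return "broken " + BOND_NAMES[before]          # dynamical: bond cleaved
    return BOND_NAMES[before] + " to " + BOND_NAMES[after]  # dynamical: order change


def condense(reactant_bonds, product_bonds):
    """Return {atom pair: label} for every bond present on either side of the reaction."""
    cgr = {}
    for pair in set(reactant_bonds) | set(product_bonds):
        cgr[pair] = label(reactant_bonds.get(pair, 0), product_bonds.get(pair, 0))
    return cgr


# Assumed toy mapping: Nu-H adds across C=C-C=O.
reactant_bonds = {bond(1, 2): 1, bond(3, 4): 2, bond(4, 5): 1, bond(5, 6): 2}
product_bonds = {bond(1, 3): 1, bond(2, 4): 1, bond(3, 4): 1, bond(4, 5): 1, bond(5, 6): 2}

cgr = condense(reactant_bonds, product_bonds)
for pair, name in sorted(cgr.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), name)
```

Aromatic bonds and stereochemistry are left out of this sketch; a real CGR would need labels for them as well.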
  • 10. ISIDA fragment descriptors: a reaction can be encoded by a descriptor vector, which can be used in structure-reactivity modeling. [Figure: condensed graph of reaction and the corresponding ISIDA/CGR fragment descriptor vector (2, 1, 2, …)] A. Varnek, in: "Chemoinformatics and Computational Chemical Biology", J. Bajorath, Ed., Springer, 2010
  • 11. Sums of property-dependent contributions P(i) from remote atoms i, at distance $\tau_{iK}$ from the ‘reactive’ center K, modulated by various working hypotheses:
    $$\mathrm{EED}_{p,e,o,w,c}(K) = \sum_{i=1}^{N} \delta_c(i,K) \times P_p^{e}(i) \times \exp\left[-\frac{1}{w}\left(\tau_{iK} - o\right)^2\right]$$
    $$\tau_{iK} = \begin{cases} 0.5 & \text{if } i = K \\ \text{shortest-path topological distance}(i,K) & \text{otherwise} \end{cases}$$
    $$\delta_c(i,K) = \begin{cases} 1 & \text{if } c = 0 \text{ OR (path between } i \text{ and } K \text{ contains only unsaturated atoms)} \\ 0 & \text{otherwise} \end{cases}$$
    Property type p = 1..11; property power e = 1..2; neighborhood control o = 1..4; neighborhood width w = 2..5; conjugation toggle c = 0, 1; 11 × 2 × 4 × 4 × 2 = 704 EED terms…
    Electron Effect Descriptors
    M. Elhabiri, E. Davioud-Charvet, D. Horvath, A. Varnek et al., Chemistry Eur. J., 2014, 20, 1-11
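A minimal sketch of how one such term could be evaluated, under stated assumptions: the exponent is read as a Gaussian-like weight exp(-(τ - o)² / w); the conjugation filter is interpreted as "i can be reached from K through unsaturated atoms only"; and the toy graph, the function names and the property values are placeholders invented for this example, not the authors' code or parameters.

```python
import math
from collections import deque


def bfs_distances(adj, start, allowed=None):
    """Shortest-path bond counts from start, optionally walking only through allowed atoms."""
    if allowed is not None and start not in allowed:
        return {}
    dist = {start: 0}
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if b in dist or (allowed is not None and b not in allowed):
                continue
            dist[b] = dist[a] + 1
            queue.append(b)
    return dist


def eed(adj, prop, unsaturated, K, e=1, o=1, w=2, c=0):
    """One EED term for center K (assumed reading of the slide formula)."""
    dist = bfs_distances(adj, K)
    # Assumption: with c = 1, keep only atoms reachable via unsaturated atoms.
    conjugated = bfs_distances(adj, K, allowed=unsaturated) if c == 1 else None
    total = 0.0
    for i, d in dist.items():
        if conjugated is not None and i not in conjugated:
            continue
        tau = 0.5 if i == K else float(d)
        total += prop[i] ** e * math.exp(-((tau - o) ** 2) / w)
    return total


# Toy heavy-atom graph: methyl (5) on C1=C2-C3=O4; property values are placeholders.
adj = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}
prop = {1: 2.55, 2: 2.55, 3: 2.55, 4: 3.44, 5: 2.55}
unsaturated = {1, 2, 3, 4}

all_atoms = eed(adj, prop, unsaturated, K=1, c=0)
conjugated_only = eed(adj, prop, unsaturated, K=1, c=1)
n_terms = 11 * 2 * 4 * 4 * 2  # p x e x o x w x c combinations
print(all_atoms, conjugated_only, n_terms)
```

With c = 1 the saturated methyl atom drops out of the sum, so the two values differ by exactly that atom's weighted contribution.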
  • 13. • Different scenarios of structure-reactivity modeling • Descriptors for reactions - Condensed Graph of Reaction / ISIDA descriptors - Electron Effect Descriptors • Michael reaction case OUTLINE
  • 14. Solvent: • Hydrophobic (kerosene, benzene, …) • Polar aprotic (THF, DMSO, …) • Polar protic (water, acetic acid, …) • No solvent: reaction occurs in pure solutions of the reagents. Catalyst: • Brønsted acids (hydrochloric acid, …) • Lewis acids (transition metal ions, …) • Basic (pyridine, Na ethanolate, …) • No catalyst: a catalyst is not needed, or autocatalysis takes place. The reaction occurs under different conditions characterized by solvent and catalyst. Michael reaction [Scheme labels: Michael donor, Michael acceptor]
  • 15. For given Nu and R1-R4, which conditions (catalyst, solvent) lead to a high reaction yield? Michael reaction
  • 16. • to build QSRR models predicting optimal reaction conditions for a query reaction. Each model will answer a specific question such as “Is this process feasible with Brønsted acid catalysts?”, “Is this process feasible in aprotic polar solvents?”, etc. • to develop a public predictive tool adapted to the Michael-type reaction case. Michael reaction: goals of the modeling
  • 17. The data set consists of 222 reactions: • 53 polar aprotic solvent • 52 no solvent • 103 polar protic solvent • 93 Lewis acid catalyst • 61 no catalyst • 40 hydrophobic solvent • 57 Brønsted acid catalyst • 45 basic catalysis • 24 decoys (Michael reactions that are not observed). Note that the same reaction may proceed under different conditions! Michael reaction: data
  • 18. Non-occurring reaction. Occurring reaction: condensation of hydroxylamine and aldehyde. Decoy Michael Addition
  • 19. Compatibility of the Michael reaction in the database with each of the condition classes is rendered by the condition bitvector.
    Example: Solvent Hydrophobic = 0, Solvent Polar Aprotic = 1, Solvent Polar Protic = 1, No Solvent = 0, Catalyst Brønsted Acid = 0, Catalyst Lewis Acid = 0, Catalyst Basic = 1, No Catalyst = 0, Not observed = 0
    A bit is set on if the reaction was seen to happen under this condition. NOTE: if a bit is off, it means we don’t know whether the reaction is feasible under that condition! An additional ‘no-go’ bit is on if the given Michael addition should never be observed (because of competing carbonyl addition).
    Bitvector of reaction conditions
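The bitvector on this slide can be illustrated with a small encoder. This is a hypothetical sketch: the class names are spelled here for readability, and their order is the one read off the slide.

```python
# Assumed ordering of the nine condition classes, as read from the slide.
CONDITION_CLASSES = [
    "Solvent: hydrophobic",
    "Solvent: polar aprotic",
    "Solvent: polar protic",
    "No solvent",
    "Catalyst: Bronsted acid",
    "Catalyst: Lewis acid",
    "Catalyst: basic",
    "No catalyst",
    "Not observed",  # the 'no-go' bit for decoys
]


def encode(observed):
    """Turn a set of observed condition classes into a 9-bit list."""
    unknown = set(observed) - set(CONDITION_CLASSES)
    if unknown:
        raise ValueError(f"unknown condition classes: {sorted(unknown)}")
    return [1 if name in observed else 0 for name in CONDITION_CLASSES]


def decode(bits):
    """List the condition classes whose bit is on."""
    return [name for name, bit in zip(CONDITION_CLASSES, bits) if bit]


# The example vector from the slide: 0 1 1 0 0 0 1 0 0
bits = encode({"Solvent: polar aprotic", "Solvent: polar protic", "Catalyst: basic"})
print(bits)
print(decode(bits))
```

A 0 in this encoding only records that the condition was not seen in the database; it is not evidence that the reaction fails under that condition.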
  • 20. Modeling setup. Goal: preparation of 9 two-class classification models: - 8 models for reaction feasibility for each catalyst or solvent type, - 1 model “Michael / non-Michael”. Descriptors: ISIDA/CGR, EED, and also MOLMAP, CDK. Scenarios: reagent-based, product-based and reaction-based. Performance assessment: ROC AUC in 3-fold cross-validation. Machine-learning methods: Random Forest, SVM, Naïve Bayes
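For readers unfamiliar with the performance measure, ROC AUC for a two-class model can be computed directly from its rank interpretation: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below is a generic pure-Python illustration with made-up labels and scores; the function names are hypothetical, and the fold split is a plain interleaved one, whereas a real study would typically shuffle or stratify.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def three_folds(n_items):
    """Interleaved index split into 3 folds (no shuffling, no stratification)."""
    return [list(range(k, n_items, 3)) for k in range(3)]


# Made-up example: 1 = feasible under the condition, 0 = not seen.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
print(roc_auc(labels, scores))
print(three_folds(7))
```

A value of 0.5 corresponds to random ranking and 1.0 to a model that ranks every positive above every negative.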
  • 21. CGR for Michael reaction. Dynamical bonds: created single, double to single
  • 22. Reaction descriptors. [Figure: ROC AUC (axis from 0.5 to 1) of the EED and ISIDA models for each condition class: Solv:A, Solv:NA, Solv:P, Cat:LA, Cat:NA, Solv:H, Cat:BA, Cat:B, Cat:NO] ROC AUC = 1 corresponds to an ideal model. G. Marcou, J. Aires de Sousa, D. Latino, A. de Luca, D. Horvath, V. Rietsch, and A. Varnek, J. Chem. Inf. Model., 2015, 55, 239−250. Michael reaction: models performance
  • 23. Reagent descriptors. [Figure: performance of the EED and ISIDA models] Michael reaction: models performance. G. Marcou, J. Aires de Sousa, D. Latino, A. de Luca, D. Horvath, V. Rietsch, and A. Varnek, J. Chem. Inf. Model., 2015, 55, 239−250.
  • 25. Model validation on the set of 52 reactions extracted from the literature
    Model | # in the set | # correctly predicted
    Catalyst and solvent | 52 | 8
    Aprotic solvent | 12 | 2
    Protic solvent | 21 | 21
    No catalyst | 26 | 14
    Base-catalyzed | 19 | 8
    Brønsted acids | 2 | 1
    Model failure or incomplete data for the modeling?
    Michael reaction: external validation
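Taking the two columns of the external validation table at face value, the fraction of correct predictions per model follows by simple division. The snippet below is only a convenience calculation on the numbers shown on the slide; the dictionary layout and the function name are assumptions of this example.

```python
# (number in the external set, number correctly predicted), copied from the slide.
external = {
    "Catalyst and solvent": (52, 8),
    "Aprotic solvent": (12, 2),
    "Protic solvent": (21, 21),
    "No catalyst": (26, 14),
    "Base-catalyzed": (19, 8),
    "Brønsted acids": (2, 1),
}


def hit_rate(n_total, n_correct):
    """Fraction of external reactions predicted correctly."""
    return n_correct / n_total


rates = {name: hit_rate(*counts) for name, counts in external.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

The spread between the protic-solvent model and the combined catalyst-and-solvent prediction is what motivates the question on the slide about model failure versus incomplete data.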
  • 26. Both ISIDA/CGR and EED descriptors provide acceptable models in cross-validation. Reagent, Product and Reaction scenarios perform similarly. The models cross-validate very well, but external testing is inconclusive: more data are needed! Conclusions
  • 27. J. Chem. Inf. Model., 2015, 55, 239−250
  • 28. • Campus France for a Franco-Portuguese grant “PESSOA” • ChemAxon for the license. Thanks