Deep reinforcement learning for de novo drug design
By Popova et al.
Presenter: Nimmi
De novo drug design
Lead compound: “A biological representation of a molecule or chemical compound with sufficient potential to progress to a full drug development program.”
Source: https://www.beckman.com/support/faq/research/what-is-a-lead-compound#:~:text=A%20lead%20compound%20is%20a,a%20full%20drug%20development%20program.
Introduction
This paper is about generating chemical structures of candidate drug molecules.
Problem:
Formulating a well-motivated hypothesis for candidate drug compound generation or selection based on the available data is challenging.
Proposed solution: ReLeaSE (Reinforcement Learning for Structural Evolution)
A novel method, based on deep reinforcement learning (RL), for generating chemical compounds de novo with desired physical, chemical, and/or bioactivity properties.
Background
RL is a subfield of AI that is used to solve dynamic decision problems.
With RL, ML models can learn to make decisions and to explore a complex environment without supervision.
Using RL to generate candidate drug molecules avoids a brute-force examination of every possible solution in the chemical space.
Properties of Drugs
Physical properties
● Partition coefficient (LogP): a critical measure that not only determines how well a drug will be absorbed, transported, and distributed in the body but also dictates how a drug should be formulated and dosed.
● Melting temperature (Tm): the temperature at which a given solid drug changes from its solid state to a liquid, or melts.
Bioactivity properties
● A Janus kinase inhibitor, also known as a JAK inhibitor or jakinib, is a type of immune-modulating medication that inhibits the activity of one or more of the Janus kinase family of enzymes (JAK1, JAK2, JAK3, TYK2).
Structural properties
● The chemical complexity of a drug is measured by the number of benzene rings.
SMILES representations
SMILES: Simplified Molecular-Input Line-Entry System
Figure source: https://www.researchgate.net/figure/Canonical-a-and-randomized-b-SMILES-representations-of-aspirin-Randomized-SMILES_fig1_344805688
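A SMILES string writes a molecule as one line of characters: branches go inside parentheses and a ring bond is marked by a matching pair of digits. The sketch below is only an illustration of that bookkeeping, not a full SMILES parser. The function name `is_balanced_smiles` is hypothetical, and it assumes single-digit ring labels and no bracket atoms such as `[13C]`.

```python
def is_balanced_smiles(smiles: str) -> bool:
    """Check branch parentheses and single-digit ring-closure labels only.

    Assumption: no bracket atoms (e.g. "[13C]") and no two-digit "%nn" ring
    labels, so every digit in the string is a ring-closure label.
    """
    branch_stack = []   # one entry per currently open "("
    open_rings = set()  # ring labels that were opened but not yet closed

    for ch in smiles:
        if ch == "(":
            branch_stack.append(ch)
        elif ch == ")":
            if not branch_stack:
                return False  # closing a branch that was never opened
            branch_stack.pop()
        elif ch.isdigit():
            # The first occurrence of a label opens a ring, the second closes it.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)

    return not branch_stack and not open_rings


aspirin = "CC(=O)Oc1ccccc1C(=O)O"  # canonical SMILES of aspirin
print(is_balanced_smiles(aspirin))
print(is_balanced_smiles("c1ccccc"))  # ring label 1 is never closed
```

Passing this check does not make a string a chemically valid molecule; it only confirms that branches and ring labels are paired.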
Proposed Solution: ReLeaSE
1. Generative Model (G): Agent
Generative models in general
E.g. painting generation, human face generation, etc.
Generative RNN models
E.g. text generation, poem generation, source code generation, etc.
Generative RNN model (G) to generate SMILES strings
● Training step of the generative Stack-RNN
● Generator step of the generative Stack-RNN
Drug molecules corresponding to some of the generated SMILES strings
2. Predictive Model (P): Environment
The predictive model P takes a generated SMILES string s_T as input and outputs the predicted property P(s_T).
Reward: r(s_T) = f(P(s_T))
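The reward composition r(s_T) = f(P(s_T)) can be sketched as below. Everything here is a hypothetical illustration: `toy_predictor` is a stand-in for P and is not a real property model, and the range 1.0 to 4.0 and the reward values 10.0 and 1.0 are made-up numbers, not values taken from the slides.

```python
from typing import Callable


def make_range_reward(low: float, high: float,
                      inside: float = 10.0,
                      outside: float = 1.0) -> Callable[[float], float]:
    """Build f: a high reward when the predicted value falls in [low, high]."""
    def f(predicted_value: float) -> float:
        return inside if low <= predicted_value <= high else outside
    return f


def toy_predictor(smiles: str) -> float:
    """Hypothetical stand-in for P. NOT a real property model."""
    carbons = smiles.count("C") + smiles.count("c")
    oxygens = smiles.count("O")
    return 0.5 * carbons - 1.0 * oxygens


def reward(smiles: str,
           predictor: Callable[[str], float],
           f: Callable[[float], float]) -> float:
    """r(s_T) = f(P(s_T))."""
    return f(predictor(smiles))


f = make_range_reward(1.0, 4.0)
print(reward("CCCC", toy_predictor, f))  # toy prediction 2.0 is inside the range
print(reward("OCO", toy_predictor, f))   # toy prediction -1.5 is outside the range
```

Keeping f separate from P mirrors the slide: the same predictive model can be reused with a different f for a different design goal.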
QSPR analysis
Quantitative structure–property relationship (QSPR) analysis finds correlations
between structural descriptions and material properties through ML models.
● Structural description: the SMILES string s_T
● Material property (e.g. LogP, Tm): the prediction P(s_T)
For predicting LogP using five-fold cross-validation (5CV), the model accuracy is 0.91 and the root mean square error (RMSE) is 0.53.
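To make the evaluation protocol concrete, here is a minimal sketch of k-fold cross-validation with RMSE as the score. It is not the authors' pipeline: the helper names are hypothetical, and `fit_mean` is a deliberately trivial placeholder model that always predicts the training mean.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k folds; return (train, test) index pairs."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


def fit_mean(train_x: Sequence, train_y: Sequence[float]) -> Callable:
    """Placeholder model: ignores the input and predicts the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean


def cross_validated_rmse(xs: Sequence, ys: Sequence[float],
                         fit: Callable, k: int = 5) -> float:
    """Average the held-out RMSE over k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return sum(scores) / len(scores)


xs = list(range(10))
ys = [1.0, 2.0, 1.5, 2.5, 3.0, 2.0, 1.0, 3.5, 2.5, 2.0]  # made-up values
print(cross_validated_rmse(xs, ys, fit_mean, k=5))
```

Each molecule is held out exactly once, so the averaged score reflects predictions on data the model did not see during fitting.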
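3. RL formulation of ReLeaSE
● The generative model G (agent) produces a SMILES string s_T.
● The predictive model P (environment) returns the predicted property P(s_T).
● Reward: r(s_T) = f(P(s_T))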
Objective that the generative model G (agent) optimizes
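The equation on this slide did not survive extraction. One standard way to write an expected-reward objective for such an agent is given below as a sketch. The symbols \Theta (parameters of G), p_\Theta (the probability G assigns to a string) and S^* (the set of terminal strings) are notation assumed here, and the exact form used in the paper may differ, for example by discounting.

```latex
J(\Theta) = \mathbb{E}_{s_T \sim p_\Theta}\left[ r(s_T) \right]
          = \sum_{s_T \in S^*} p_\Theta(s_T)\, r(s_T) \;\to\; \max_\Theta ,
\qquad
\nabla_\Theta J(\Theta) = \mathbb{E}_{s_T \sim p_\Theta}\left[ r(s_T)\, \nabla_\Theta \log p_\Theta(s_T) \right]
```

The second expression is the usual policy-gradient (REINFORCE) identity: it lets the gradient be estimated from sampled strings without summing over all of S^*.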
Reward proportional to the number of benzene rings.
(Figure annotation: reward increases.)
4. Evaluation
Predicted properties of the training drug molecules vs. RL-based generated drug molecules
Comparison of statistics for training vs. generated molecules
5. Related Work
● Olivecrona et al. (2017): No data were provided to show that the predicted properties of molecular compounds were optimized by the RL model. A large fraction of the generated molecules were similar to those in the training and test sets.
● Jaques et al. (2017): The RL model did not directly optimize any physical or biological properties but rather a proxy function that includes a synthetic accessibility score (SAS), drug-likeness, and a ring penalty.
● Segler et al. (2018): Did not use RL, only a vanilla RNN.
6. Limitations
ReLeaSE generates candidate drug molecules while optimizing for each chemical property of interest independently.
● E.g. one model generates molecules optimized for LogP, and a separate model generates molecules optimized for melting temperature.
7. Conclusion
Take home message
● ReLeaSE is a new strategy for designing libraries of compounds with the desired properties that uses both DL and RL approaches.
● ReLeaSE efficiently generates candidate drug molecules while independently optimizing for each chemical property of interest.
What’s next?
● Extending the approach to multiobjective optimization of several target properties concurrently, with respect to potency, selectivity, solubility, and other drug-likeness properties.
Thanks!
Internal diversity of generated libraries
Generating a chemically diverse stream of molecules is important because drug candidates can fail in many unexpected ways later in the drug discovery pipeline. [1]
On the basis of the analysis of respective sets of 10,000 molecules generated by each method, the library obtained without stack memory showed a decrease in internal diversity of 0.2 units of the Tanimoto coefficient and yet a fourfold increase in the number of duplicates, from just about 1 to 5%. [2]
[1] Benhenda M. Can AI reproduce observed chemical diversity? bioRxiv. 2018 Jan 1:292177.
[2] Popova M, Isayev O, Tropsha A. Deep reinforcement learning for de novo drug design. Science Advances. 2018 Jul 25;4(7):eaap7885.
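The two library statistics mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: each molecule is represented by a hypothetical set of integer feature ids standing in for a real fingerprint, internal diversity is taken to be the mean pairwise Tanimoto distance (definitions vary), and duplicates are detected by exact string match, which misses different SMILES strings that encode the same molecule.

```python
from itertools import combinations
from typing import FrozenSet, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two feature sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def internal_diversity(fingerprints: Sequence[FrozenSet[int]]) -> float:
    """Mean pairwise Tanimoto distance (1 - similarity) within one library."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


def duplicate_fraction(smiles_list: Sequence[str]) -> float:
    """Fraction of entries that repeat an earlier string in the library."""
    if not smiles_list:
        return 0.0
    return 1.0 - len(set(smiles_list)) / len(smiles_list)


# Toy fingerprints: the integers are arbitrary feature ids, not real bits.
library = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]
print(internal_diversity(library))
print(duplicate_fraction(["CC", "CC", "CO", "CN"]))
```

A library with many near-identical molecules scores close to 0 on internal diversity, while a library with no shared features scores 1.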
Visualization of new drug libraries
(a) LogP values predicted by the predictive model P
(b) Melting temperatures predicted by the predictive model P


Editor's Notes

  1. What is the problem tackled? The crucial step in many new drug discovery projects is the formulation of a well-motivated hypothesis for new lead compound generation or compound selection from available or synthetically feasible chemical libraries based on the available structure-activity relationship (SAR) data.
  2. To proceed with the rest of the presentation, we need to know what the SMILES representation of a drug molecule is. Suppose this is a drug or any kind of molecule. You can see that it has a benzene ring, two double bonds, etc. [NEXT] The SMILES representation of this molecule is the sequence of atoms in the molecule as we traverse through it. …… So the aim of this research is to generate SMILES representations of candidate drug molecules.
  3. Generative models are often used for tasks such as painting generation, human face generation, etc.
  4. Generative RNN models are often used for tasks such as text generation, poem generation, painting generation, human face generation, etc.
  5. [NEXT] DESCRIBE training step. … What we need to maximize here is how well the Stack-RNN predicts the next character in the SMILES string. The authors minimized a cross-entropy loss function for that. A question somebody might ask here is: why a Stack-RNN? [NEXT molecule image appear] Recall how we represent a molecule in a SMILES string. One must count ring openings and closures, as well as parenthesis sequences, to make sure the SMILES string is a valid one. Regular RNNs such as LSTM and gated recurrent unit (GRU) are unable to keep track of such information in their memory cells. That is why we need stack memory in the RNN. Whenever the model comes across an input character that is an opening parenthesis, it pushes a parenthesis onto the stack, and whenever the model comes across an input character that is a closing parenthesis, it pops the last opening parenthesis off the stack. [BOARD] By the time we predict the <end> token, if we still have some remaining parentheses in the stack, that SMILES string is not a valid SMILES string. This way, the authors trained the model on ~1.5 million molecule structures taken from an existing dataset. [NEXT]
  6. To make sure that the model does not memorize the training molecules, the authors generated 1M molecules using the model and compared them with the molecules used to train the model. They found that less than 0.1% of the generated structures came from the training data set.
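The memorization check in note 6 amounts to measuring how many generated strings also occur in the training set. A minimal sketch follows; the function name `overlap_fraction` and the toy lists are hypothetical, and it assumes both sets are written in the same canonical SMILES form, since plain string comparison would otherwise miss identical molecules written differently.

```python
def overlap_fraction(generated, training):
    """Fraction of generated SMILES strings that also appear in the training set.

    Assumes both collections use the same canonical SMILES form, so that
    identical molecules compare equal as strings.
    """
    training_set = set(training)                     # O(1) membership tests
    hits = sum(1 for s in generated if s in training_set)
    return hits / len(generated)


# Toy data for illustration only; not taken from the paper.
training = ["CCO", "CCCl"]
generated = ["CCO", "CCN", "c1ccccc1", "CCC"]
print(overlap_fraction(generated, training))
```

On the real data this number would be compared against the 0.1% figure quoted in the note.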
  7. In the drug discovery process, generating a chemically diverse stream of candidate drug molecules is important, because drug molecule candidates can fail in many unexpected ways later in the drug discovery pipeline. The authors claimed that the library of drug molecules obtained with stack memory showed an increase in internal chemical diversity compared to the drug molecules obtained without stack memory.
  8. This predictive model P is a feed-forward neural network. The authors used existing datasets to train it, and they trained a separate model for each property to be predicted.
  9. Training this type of predictive model is also called QSPR analysis. Such an analysis finds correlations between structural descriptions and material properties through ML models. [NEXT] In our model, [NEXT] the SMILES string is the structural description of the molecule and [NEXT] the predicted property is some material property like melting temperature or logP. Building ML models directly from SMILES strings is a unique feature of our approach because (1) it completely bypasses the very slow step of descriptor generation in traditional QSAR modeling approaches, and (2) this SMILES-string-based model outperformed the existing QSPR models. [NEXT] For example, for predicting logP using 5-fold cross-validation, the model accuracy is 0.91 and the root mean square error (RMSE) is 0.53.
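For readers unfamiliar with how a cross-validated RMSE like the one in note 9 is obtained, here is a minimal sketch in plain Python. The helper names (`rmse`, `k_fold_indices`, `cross_validated_rmse`) and the `fit`/`predict` callables are hypothetical placeholders for whatever property model is being evaluated. The fold assignment here is a simple interleaved split, which is an assumption and not necessarily how the authors partitioned their data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n, k=5):
    """Yield (train_indices, test_indices) for k interleaved folds over n items."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def cross_validated_rmse(xs, ys, fit, predict, k=5):
    """Average the held-out RMSE over k folds.

    `fit(train_xs, train_ys)` returns a model and `predict(model, x)` returns
    a number; both are supplied by the caller (hypothetical interface).
    """
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [predict(model, xs[i]) for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return sum(scores) / k


# Usage with a trivial "predict the training mean" model on toy numbers.
xs = list(range(10))
ys = [0.1 * x for x in xs]
mean_fit = lambda train_xs, train_ys: sum(train_ys) / len(train_ys)
mean_predict = lambda model, x: model
print(cross_validated_rmse(xs, ys, mean_fit, mean_predict, k=5))
```

Each molecule is held out exactly once, so the reported error reflects predictions on data the model did not see during training.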
  10. The method uses two deep neural networks: generative (G) and predictive (P). Both models are first trained separately with supervised learning algorithms. Then the two models are trained jointly with an RL approach that optimizes target properties. [NEXT] The generative model is used to generate SMILES strings of novel, chemically feasible molecules; that is, it plays the role of the agent. [NEXT] Suppose sT is a SMILES string generated by G. The predictive model takes the generated SMILES string as input and provides one real number as output: [NEXT] P(sT), which is an estimated property value. The property could be logP, melting temperature, etc. [NEXT] The reward is a function of P(sT) -> [NEXT] this function f is chosen depending on the task/the property. Given this reward, the generative model is trained to maximize the expected reward.
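To make the loop in note 10 concrete, here is a toy sketch of "generate, predict, reward, update". Everything in it is a stand-in: the generator is a two-character categorical policy and not a stack-RNN, the predictor simply measures the fraction of aromatic `c` characters, and f is the identity. The update is a REINFORCE-style policy gradient with a mean-reward baseline, one standard way of maximizing an expected reward; the notes do not spell out the update rule, so treat the choice as an assumption.

```python
import math
import random

ALPHABET = ["C", "c"]  # toy action space, not a real SMILES vocabulary


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(logits, length, rng):
    """Toy stand-in for G: sample each character index independently."""
    return rng.choices(range(len(ALPHABET)), weights=softmax(logits), k=length)


def predict(actions):
    """Toy stand-in for P(sT): fraction of aromatic 'c' characters."""
    return sum(1 for a in actions if ALPHABET[a] == "c") / len(actions)


def reward(p_value):
    """Hypothetical f: the reward grows with the predicted value."""
    return p_value


def train(steps=200, batch=16, length=8, lr=0.5, seed=0):
    """REINFORCE-style ascent on the expected reward; returns final probabilities."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    for _ in range(steps):
        samples = [generate(logits, length, rng) for _ in range(batch)]
        rewards = [reward(predict(s)) for s in samples]
        baseline = sum(rewards) / batch          # variance-reduction baseline
        probs = softmax(logits)
        grad = [0.0, 0.0]
        for s, r in zip(samples, rewards):
            for a in s:
                for k in range(len(ALPHABET)):
                    # d log pi(a) / d logit_k = 1[k == a] - probs[k]
                    grad[k] += (r - baseline) * ((1.0 if k == a else 0.0) - probs[k])
        logits = [l + lr * g / batch for l, g in zip(logits, grad)]
    return softmax(logits)


final_probs = train()
print(dict(zip(ALPHABET, final_probs)))
```

After training, the policy should put most of its probability on the character that the toy predictor rewards, which is the same mechanism by which G is steered toward molecules with the desired P(sT).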
  11. Talking a bit more about the reward: it must be based on the property the RL model is trying to optimize. Suppose we want to maximize the number of benzene rings in the generated molecule; the reward is then computed from that count. If the generated molecule has a small number of benzene rings (like this), the reward would be small. On the other hand, when the generated molecule has a large number of benzene rings (like this), the reward would be high.
  12. First we will talk about the predicted properties (specifically the melting temperature and logP) of the training molecules vs. the generated molecules. Here the baseline is the training molecules.
  13. Here is a comparison of training vs. generated molecules. The baseline ones are the training molecules. If you look at the first property, Tm, the melting temperature [NEXT], the baseline has 95% valid molecules. The generated molecules with minimized melting temperatures have 31% valid molecules, which is much lower than the baseline. We can observe a similar thing for the generated molecules with maximized melting temperatures as well. However, the mean melting temperatures [NEXT] show that the minimized and maximized sets are shifted in the intended direction relative to the baseline. If you go down the table, other properties such as [NEXT] logP and [NEXT] the number of benzene rings tend to have percentages of valid molecules somewhat comparable to the baseline.
  14. [NEXT] Compared to these existing models, what is unique about the approach I have been presenting is that it produces valid SMILES strings, and that fewer than 0.1% of the generated SMILES strings match training molecules.
  15. To understand how the generative models populate chemical space with newly generated drug molecule structures, the authors used t-distributed stochastic neighbor embedding (t-SNE) to reduce the dimensionality of the predicted value distribution from the model P. The colour code shows the magnitude of the predicted property values. … The labeled ones are randomly picked generated drug molecules that matched drug molecules in existing datasets.