2. De novo drug design
Lead compound: “A biological representation of a molecule or chemical compound with sufficient potential to progress to a full drug development program.” [1]
[1] https://www.beckman.com/support/faq/research/what-is-a-lead-compound#:~:text=A%20lead%20compound%20is%20a,a%20full%20drug%20development%20program.
3. Introduction
This paper is about generating chemical structures of
candidate drug molecules.
Problem: Formulating a well-motivated hypothesis for candidate drug compound generation or selection based on the available data is challenging.
Proposed solution: ReLeaSE, a novel method for generating chemical compounds with desired physical, chemical, and/or bioactivity properties de novo, based on deep reinforcement learning (RL).
4. Background
RL is a subfield of AI used to solve dynamic decision problems.
Through RL, ML models can learn to make decisions and to explore complex environments without supervision.
Using RL to generate candidate drug molecules avoids brute-force enumeration of every possible solution in chemical space.
5. Properties of Drugs
Physical properties
● Partition coefficient (LogP) is a critical measure that not only determines how well a drug will be absorbed, transported, and distributed in the body, but also dictates how a drug should be formulated and dosed.
● Melting temperature (Tm) is the temperature at which a given solid drug changes from its solid state to a liquid, i.e., melts.
Bioactivity properties
● A Janus kinase inhibitor, also known as a JAK inhibitor or jakinib, is a type of immune-modulating medication that inhibits the activity of one or more of the Janus kinase family of enzymes (JAK1, JAK2, JAK3, TYK2).
Structural properties
● The chemical complexity of a drug is measured here by the number of benzene rings.
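As a minimal sketch (not from the slides or the paper, and assuming RDKit is installed), two of these properties can be computed directly from a SMILES string; the example molecule is arbitrary, and the melting temperature has no such direct calculation, which is why it is predicted with an ML model instead.

```python
# Minimal sketch (not from the paper): computing two of the discussed
# properties for a molecule given as a SMILES string, using RDKit.
from rdkit import Chem
from rdkit.Chem import Crippen

smiles = "c1ccccc1CC(=O)O"          # arbitrary example molecule
mol = Chem.MolFromSmiles(smiles)    # parse SMILES into an RDKit molecule

logp = Crippen.MolLogP(mol)         # Crippen estimate of the partition coefficient (LogP)

# Count benzene rings: aromatic six-membered rings made entirely of carbon.
ring_info = mol.GetRingInfo()
benzene_rings = sum(
    1
    for ring in ring_info.AtomRings()
    if len(ring) == 6
    and all(mol.GetAtomWithIdx(i).GetIsAromatic() for i in ring)
    and all(mol.GetAtomWithIdx(i).GetSymbol() == "C" for i in ring)
)

print(f"LogP = {logp:.2f}, benzene rings = {benzene_rings}")
```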
16. QSPR analysis
Quantitative structure–property relationship (QSPR) analysis finds correlations between structural descriptions and material properties through ML models.
[Diagram] Structural description (SMILES string sT) → predictive model P → material property P(sT) (e.g. LogP, Tm).
For predicting LogP using 5-fold cross-validation (5CV), the model accuracy is 0.91 and the root mean square error (RMSE) is 0.53.
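A minimal sketch, in PyTorch, of what such a SMILES-based property regressor could look like; the SmilesPropertyRegressor class, layer sizes, and toy vocabulary are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch (illustrative, not the authors' exact architecture) of a
# QSPR model that regresses a property such as LogP directly from a SMILES string.
import torch
import torch.nn as nn

class SmilesPropertyRegressor(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)    # one vector per SMILES character
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)                # single real-valued property

    def forward(self, tokens):                              # tokens: (batch, seq_len) int ids
        emb = self.embed(tokens)
        _, (h_n, _) = self.lstm(emb)                        # final hidden state summarizes the string
        return self.head(h_n[-1]).squeeze(-1)               # predicted property, e.g. LogP

# Hypothetical usage with a toy character vocabulary:
vocab = {ch: i for i, ch in enumerate("#()+-=123456789CFHNOSclnos[]")}
model = SmilesPropertyRegressor(vocab_size=len(vocab))
ids = torch.tensor([[vocab[c] for c in "c1ccccc1O"]])      # encode one SMILES string
print(model(ids))                                          # untrained prediction (random)
```

Such a model would be trained with a standard regression loss (e.g. mean squared error) against measured property values.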
25. Related work
Olivecrona et al. (2017)
● No data were provided to show that the predicted properties of the generated molecular compounds are optimized by the RL model.
● A large fraction of the generated molecules were similar to those in the training and test sets.
Jaques et al. (2017)
● The RL model did not directly optimize any physical or biological property, but rather a proxy function combining a synthetic accessibility score (SAS), drug-likeness, and a ring penalty.
Segler et al. (2018)
● Did not use RL, but a vanilla RNN.
27. Limitations
ReLeaSE generate candidate drug molecules while
independently optimizing for each chemical property
of interest.
● E.g. Having one model to generate molecules
optimized for LogP and having another model for
to generate molecules optimized for melting
temperature
27
27
29. Take home message
● ReLeaSE is a new strategy for designing libraries of compounds with desired properties that uses both DL and RL approaches.
● ReLeaSE efficiently generates candidate drug molecules while independently optimizing for each chemical property of interest.
What’s next?
● Extending the approach to afford multiobjective optimization of several target properties concurrently, with respect to potency, selectivity, solubility, and other drug-likeness properties.
Thanks!
31. Internal diversity of generated libraries
Generating a chemically diverse stream of molecules is important, because drug candidates can fail in many unexpected ways later in the drug discovery pipeline.[1]
On the basis of the analysis of respective sets of 10,000 molecules generated by each method, the library obtained without stack memory showed a decrease in internal diversity of 0.2 units of the Tanimoto coefficient and a fourfold increase in the number of duplicates, from about 1% to 5%.[2]
[1] Benhenda M. Can AI reproduce observed chemical diversity? bioRxiv. 2018:292177.
[2] Popova M, Isayev O, Tropsha A. Deep reinforcement learning for de novo drug design. Science Advances. 2018;4(7):eaap7885.
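A minimal sketch of how the internal diversity of a generated library can be estimated, assuming RDKit is available and taking diversity as one minus the mean pairwise Tanimoto similarity of Morgan fingerprints (the cited papers may define it slightly differently); the toy library is arbitrary.

```python
# Minimal sketch (assumptions: RDKit available; diversity defined as
# 1 - mean pairwise Tanimoto similarity of Morgan fingerprints).
from itertools import combinations
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def internal_diversity(smiles_list):
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048)
           for m in mols if m is not None]
    sims = [DataStructs.TanimotoSimilarity(a, b) for a, b in combinations(fps, 2)]
    return 1.0 - sum(sims) / len(sims)

# Arbitrary toy library:
library = ["c1ccccc1", "c1ccccc1O", "CCO", "CCN", "CC(=O)O"]
print(f"internal diversity = {internal_diversity(library):.3f}")
```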
32. Visualization of new drug libraries
(a) LogP values predicted by the predictive model P.
(b) Melting temperatures predicted by the predictive model P.
Editor's Notes
What is the problem tackled?
The crucial step in many new drug discovery projects is the formulation of a well-motivated hypothesis for new lead compound generation or compound selection from available or synthetically feasible chemical libraries based on the available structure-activity relationship (SAR) data.
To proceed with the rest of the presentation, we need to know what is the SMILES representation of a drug molecule.
Suppose this is a drug or any kind of molecule. You can see that it has a benzene ring, and two double bonds, etc.
[NEXT] The SMILES representation of this molecule is the sequence of atoms in the molecule as we traverse through it. …
So the aim of this research is to generate SMILES representations of candidate drug molecules.
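For illustration (aspirin is an arbitrary example, not one of the paper's molecules), a short RDKit snippet showing that a SMILES string is a linear encoding of the molecular graph that can be parsed and written back out:

```python
# Minimal sketch: a SMILES string is a linear traversal of a molecular graph.
from rdkit import Chem

smiles = "CC(=O)Oc1ccccc1C(=O)O"        # aspirin: ester, aromatic ring, carboxylic acid
mol = Chem.MolFromSmiles(smiles)        # parse string -> molecular graph
print(mol.GetNumAtoms())                # heavy atoms in the traversed graph
print(Chem.MolToSmiles(mol))            # canonical SMILES written back from the graph
```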
Generative RNN models are often used for tasks such as text generation, poem generation, painting generation, human face generation, etc.
[NEXT] DESCRIBE training step.
…What we need to maximize here is how well the stack-RNN predicts the next character in the SMILES string.
The authors minimized a cross-entropy loss function for that.
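A minimal sketch of that next-character cross-entropy objective; a plain GRU stands in for the authors' stack-augmented RNN, and the toy vocabulary, layer sizes, and example string are illustrative assumptions.

```python
# Minimal sketch of next-character training on SMILES (illustrative; the paper
# uses a stack-augmented RNN, replaced here by a plain GRU for brevity).
import torch
import torch.nn as nn

chars = "^$#()+-=123456789CFHNOSclnos[]"        # toy vocabulary; ^ = start, $ = end
stoi = {c: i for i, c in enumerate(chars)}

embed = nn.Embedding(len(chars), 32)
rnn = nn.GRU(32, 64, batch_first=True)
head = nn.Linear(64, len(chars))
loss_fn = nn.CrossEntropyLoss()

smiles = "^c1ccccc1O$"                          # one training string with start/end tokens
ids = torch.tensor([[stoi[c] for c in smiles]])
inputs, targets = ids[:, :-1], ids[:, 1:]       # predict each next character

hidden_states, _ = rnn(embed(inputs))
logits = head(hidden_states)                    # (1, seq_len - 1, vocab)
loss = loss_fn(logits.reshape(-1, len(chars)), targets.reshape(-1))
loss.backward()                                 # gradients for cross-entropy minimization
```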
A question somebody might ask here is: why a stack-RNN?
[NEXT molecule image appears] Recall how we represent a molecule as a SMILES string. One must count ring openings and closures, as well as parenthesis sequences, to make sure the SMILES string is a valid one. Regular RNNs such as LSTMs and gated recurrent units (GRUs) are unable to keep track of such information in their memory cells. That is why we need stack memory in the RNN. So whenever the model comes across an opening parenthesis in the input, it pushes a parenthesis onto the stack, and whenever it comes across a closing parenthesis, it pops the last opening parenthesis off the stack.
[BOARD]
By the time we predict the <end> token, if there are still unmatched parentheses remaining in the stack, that SMILES string is not a valid one.
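A minimal sketch of the bookkeeping the stack is meant to capture, as a simplified validity check that requires parentheses to balance and ring-closure digits to appear in pairs (real SMILES syntax has more cases than this):

```python
# Minimal sketch of the bookkeeping a stack can capture for SMILES validity:
# parentheses must balance and ring-closure digits must appear in pairs.
def plausibly_balanced(smiles: str) -> bool:
    stack = []                     # push '(' on open, pop on ')'
    open_rings = set()             # ring-closure digits seen an odd number of times
    for ch in smiles:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False       # closing parenthesis with nothing to match
            stack.pop()
        elif ch.isdigit():
            # a ring-closure digit opens a ring the first time and closes it the second
            open_rings ^= {ch}
    return not stack and not open_rings    # everything matched by <end>

print(plausibly_balanced("CC(C)c1ccccc1"))   # True
print(plausibly_balanced("CC(C)c1ccccc("))   # False: unclosed '(' and ring 1
```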
This way, the authors trained the model on ~1.5 million molecular structures taken from an existing dataset.
[NEXT]
To make sure that the model does not memorize the training molecules, the authors generated 1M molecules using the model and compared them with the molecules used to train it. They found that the model reproduced less than 0.1% of the structures from the training data set.
In the drug discovery process, generating a chemically diverse stream of candidate drug molecules is important, because candidates can fail in many unexpected ways later in the pipeline.
The authors claimed that the library of drug molecules obtained with stack memory showed an increase in internal chemical diversity compared to the library obtained without stack memory.
This predictive model P is a feed-forward neural network. The authors used existing datasets to train it, and they trained different models to predict different properties.
Training these types of predictive models is also called QSPR analysis. Such analysis finds correlations between structural descriptions and material properties through ML models.
[NEXT] In our model, [NEXT] the SMILES string is the structural description of the molecule, and
[NEXT] the predicted property is some material property like the melting temperature or LogP.
Building ML models directly from SMILES strings is a unique feature of our approach,
because (1) it completely bypasses the very slow step of descriptor generation in traditional QSAR modeling approaches.
(2) Moreover, this SMILES-string-based model outperformed the existing QSPR models.
[NEXT] For example, for predicting LogP using 5-fold cross-validation, the model accuracy is 0.91 and the root mean square error (RMSE) is 0.53.
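A minimal sketch of how such 5-fold cross-validation figures can be computed, reading "accuracy" as R²; a generic scikit-learn regressor on random placeholder data stands in for the paper's SMILES-based model.

```python
# Minimal sketch of computing 5-fold cross-validated R^2 and RMSE
# (a generic scikit-learn regressor stands in for the paper's SMILES-based model).
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

X = np.random.rand(500, 32)            # placeholder descriptors for 500 molecules
y = np.random.rand(500) * 6 - 1        # placeholder LogP values

r2s, rmses = [], []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    r2s.append(r2_score(y[test_idx], pred))
    rmses.append(mean_squared_error(y[test_idx], pred) ** 0.5)

print(f"5CV R^2 = {np.mean(r2s):.2f}, RMSE = {np.mean(rmses):.2f}")
```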
Two deep neural networks: generative (G) and predictive (P).
Both models are first trained separately with supervised learning algorithms.
Then, the two models are trained jointly with an RL approach that optimizes target properties.
[NEXT] The generative model is used to generate SMILES strings of novel, chemically feasible molecules; that is, it plays the role of an agent.
[NEXT] Suppose sT is a SMILES string generated by G.
The predictive model takes in the generated SMILES string as the input and provides one real number as an output:
[NEXT] P(sT), which is an estimated property value. The property could be LogP, the melting temperature, etc.
[NEXT] The reward is a function f of P(sT). [NEXT] This function f is chosen depending on the task/property.
Considering this reward, the generative model is trained to maximize the expected reward.
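A minimal runnable REINFORCE-style sketch of this loop; the tiny character-level generator, the toy alphabet, and the reward (which simply counts aromatic-carbon characters) are placeholders, not the authors' models.

```python
# Minimal runnable REINFORCE sketch (toy stand-ins for G, P, and the reward;
# none of the components below are the authors' actual models).
import torch
import torch.nn as nn

chars = "^$Cc1(O)=N"                              # toy SMILES alphabet, ^ start / $ end
stoi = {c: i for i, c in enumerate(chars)}

embed = nn.Embedding(len(chars), 16)
rnn = nn.GRU(16, 32, batch_first=True)
head = nn.Linear(32, len(chars))
optimizer = torch.optim.Adam(
    list(embed.parameters()) + list(rnn.parameters()) + list(head.parameters()), lr=1e-3)

def sample(max_len=20):
    """Sample one string from the generator G, keeping per-step log-probabilities."""
    token = torch.tensor([[stoi["^"]]])
    hidden, log_probs, out = None, [], []
    for _ in range(max_len):
        h, hidden = rnn(embed(token), hidden)
        dist = torch.distributions.Categorical(logits=head(h[:, -1]))
        token = dist.sample().unsqueeze(0)
        log_probs.append(dist.log_prob(token.squeeze(0)))
        ch = chars[token.item()]
        if ch == "$":
            break
        out.append(ch)
    return "".join(out), torch.stack(log_probs).sum()

def reward_fn(s):
    """Toy stand-in for f(P(sT)): reward strings containing aromatic carbons."""
    return float(s.count("c"))

for _ in range(5):                                 # a few REINFORCE updates
    s, logp = sample()
    loss = -logp * reward_fn(s)                    # maximize the expected reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```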
Talking a bit more about the reward: it must be based on the property the RL model is trying to optimize.
Suppose it wants to maximize the number of benzene rings in the generated molecule; the reward is then computed based on that count.
If the generated molecule has a small number of benzene rings (like this), the reward would be small.
On the other hand, when the generated molecule has a large number of benzene rings (like this), the reward would be high.
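Two illustrative reward functions f (the constants and exact shapes used in the paper may differ): one that grows with the benzene-ring count, and one that favours a target LogP window.

```python
# Minimal sketch of task-dependent rewards f(P(sT)) (illustrative choices only).
def reward_maximize_rings(n_rings: int) -> float:
    # more benzene rings -> larger reward; +1 keeps the reward positive for zero rings
    return float(n_rings + 1)

def reward_target_logp(logp: float, low: float = 1.0, high: float = 4.0) -> float:
    # range-targeting reward: high inside the desired LogP window, low outside
    return 10.0 if low <= logp <= high else 1.0

print(reward_maximize_rings(0), reward_maximize_rings(3))   # 1.0 4.0
print(reward_target_logp(2.5), reward_target_logp(6.0))     # 10.0 1.0
```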
First we will talk about the predicted properties (specifically the melting temperature and LogP) of training molecules vs. generated molecules.
Here is a comparison of training vs. generated molecules; the baseline is the set of training molecules.
If you look at the first property, the melting temperature (Tm) [NEXT], the baseline has 95% valid molecules.
The molecules generated with minimized melting temperatures are only 31% valid, which is much lower than the baseline.
We can observe something similar for the molecules generated with maximized melting temperatures as well.
However, the mean melting temperatures [NEXT] show that the molecules generated with minimized and maximized melting temperatures shift in the desired direction relative to the baseline.
If you go down the table, other properties such as [NEXT] LogP and [NEXT] the number of benzene rings tend to have percentages of valid molecules comparable to the baseline.
[NEXT]
Compared to these existing models, what is unique about the approach I have been presenting is that it produces valid SMILES strings, and the generated SMILES strings match less than 0.1% of the training molecules.
To understand how the generative models populate chemical space with newly generated drug molecule structures, the authors used t-distributed stochastic neighbor embedding (t-SNE) to reduce the dimensionality of the predicted value distribution from the model P. The colour code shows the magnitude of the predicted property values.
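A minimal sketch of such a t-SNE projection with scikit-learn and matplotlib; the random feature vectors and property values are placeholders for the model-derived quantities used in the paper.

```python
# Minimal t-SNE sketch (placeholder feature vectors stand in for the
# model-derived representations used in the paper; colour = predicted property).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.rand(1000, 64)          # placeholder vectors, one per generated molecule
predicted_logp = np.random.rand(1000) * 6    # placeholder predicted property values

coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

plt.scatter(coords[:, 0], coords[:, 1], c=predicted_logp, s=5, cmap="viridis")
plt.colorbar(label="predicted LogP")
plt.title("Generated library projected with t-SNE")
plt.show()
```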
…
The labeled ones are randomly picked generated drug molecules matched with drug molecules in existing datasets.