Some new directions for pharmaceutical molecular design
Peter W Kenny (pwk.pub.2008@gmail.com)
Some things that make drug discovery difficult
• Having to exploit targets that are weakly-linked to
human disease
• Inability to predict idiosyncratic toxicity
• Inability to measure free (unbound) physiological
concentrations of drug for remote targets (e.g.
intracellular or on far side of blood brain barrier)

Dans la merde : http://fbdd-lit.blogspot.com/2011/09/dans-la-merde.html
Molecular Design
• Control of behavior of compounds and materials by
manipulation of molecular properties
• Hypothesis-driven or prediction-driven
• Sampling of chemical space
– Does fragment-based screening allow better control of
sampling resolution?
Illustrating hypothesis-driven design
DNA Base Isosteres: Acceptor & Donor Definitions

[Figure: donor (Do1, Do2) and acceptor (Ac1) sites marked on DNA base isosteres]

Kenny (2009) JCIM 49:1234-1244 DOI
Watson-Crick Donor & Acceptor Electrostatic Potentials for Adenine Isosteres

[Plot axes: Vmin (Ac1) and Va (Do1)]
Kenny (2009) JCIM 49:1234-1244 DOI
The lurking menace of correlation inflation

Kenny & Montanari (2013) JCAMD 27:1-13 DOI
Preparation of synthetic data for correlation inflation study

Add Gaussian noise (SD=10) to Y

Kenny & Montanari (2013) JCAMD 27:1-13 DOI
Correlation inflation by hiding variation
See Hopkins, Mason & Overington (2006) Curr Opin Struct Biol 16:127-136 DOI
Leeson & Springthorpe (2007) NRDD 6:881-890 DOI

[Plot panels: R = 0.34; R = 0.30; R = 0.31; R = 0.67; R = 0.93; R = 0.996]

Data is naturally binned (X is an integer) and mean value of Y is calculated for
each value of X. In some studies, averaged data is only presented graphically
and it is left to the reader to judge the strength of the correlation.
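
The effect can be reproduced with a short simulation. The sketch below is illustrative only: it assumes Y = X as the underlying relationship, integer X from 1 to 10 and 100 points per X value (none of which is stated on the slides), adds Gaussian noise (SD=10) to Y, and compares the correlation for the individual data with the correlation for the bin means.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_per_bin=100, x_values=range(1, 11), noise_sd=10.0, seed=0):
    """Return (R for individual data, R for bin means)."""
    rng = random.Random(seed)
    bins = {}
    for x in x_values:
        # Assumed underlying relationship: Y = X, plus Gaussian noise.
        bins[x] = [x + rng.gauss(0.0, noise_sd) for _ in range(n_per_bin)]

    xs = [x for x, ys in bins.items() for _ in ys]
    ys = [y for ys_in_bin in bins.values() for y in ys_in_bin]

    # Data is naturally binned: one mean Y per integer X.
    bin_x = sorted(bins)
    bin_mean_y = [statistics.fmean(bins[x]) for x in bin_x]

    return pearson_r(xs, ys), pearson_r(bin_x, bin_mean_y)


r_raw, r_binned = simulate()
print(f"R (individual data) = {r_raw:.2f}")
print(f"R (bin means)       = {r_binned:.2f}")
```

Averaging leaves the trend unchanged but hides the variation within each bin, so the second correlation coefficient comes out much larger than the first.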
Correlation Inflation in Flatland
See Lovering, Bikker & Humblet (2009) JMC 52:6752-6756 DOI

N      R      95% CI
1202   0.247  0.193 | 0.299
8      0.972  0.846 | 0.995

Kenny & Montanari (2013) JCAMD 27:1-13 DOI
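
The slide does not say how the confidence intervals were obtained. One common choice, assumed here, is the Fisher z-transformation; the function name `pearson_ci` is made up for this sketch.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson R.

    Assumption: Fisher z-transformation with standard error 1/sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


for n, r in [(1202, 0.247), (8, 0.972)]:
    low, high = pearson_ci(r, n)
    print(f"N = {n}: R = {r} (95% CI: {low:.3f} | {high:.3f})")
```

With the values from the slide this gives intervals close to the ones quoted. It also shows that the interval for the 8 bin means is much wider on the z scale than the interval for the 1202 individual points.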
Choosing octanol was the first mistake...
An alternative view of the Rule of 5

[Diagram: ClogP ≤ 5; Acc ≤ 10; Don ≤ 5; Polarity]
Does octanol/water ‘see’ hydrogen bond donors?

[Figure: structures annotated with the values -0.06, -0.23, -1.01, -0.24, -1.05 and -0.66]

Sangster lab database of octanol/water partition coefficients: http://logkow.cisti.nrc.ca/logkow/index.jsp
Octanol/water is not the only partitioning system

Octanol/Water

Alkane/Water
Differences in octanol/water and alkane/water logP values
reflect hydrogen bonding between solute and octanol

Structure  logPoct  logPalk  DlogP
1          2.1      1.9      0.2
2          1.5      -0.8     2.3
3          2.5      -1.8     4.3

Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
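
The DlogP values on this slide follow directly from the two partition coefficients, DlogP = logPoct - logPalk. A minimal check using the three pairs of values quoted above (the function name `dlogp` is chosen only for this sketch):

```python
def dlogp(logp_oct, logp_alk):
    """Difference between octanol/water and alkane/water logP."""
    return logp_oct - logp_alk


# (logPoct, logPalk) pairs as quoted on the slide.
pairs = [(2.1, 1.9), (1.5, -0.8), (2.5, -1.8)]

dlogp_values = [round(dlogp(oct_, alk), 1) for oct_, alk in pairs]
print(dlogp_values)
```

The octanol/water values span only one log unit while DlogP spans about four, which is the sense in which octanol/water hides the hydrogen bonding of the solute.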
Polar Surface Area is not predictive of
hydrogen bond strength

DlogP = 0.5; PSA/Å² = 48
DlogP = 4.3; PSA/Å² = 22

Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
Hydrogen bonding of esters

[Figure: ester structures annotated with the values -0.086, -0.104, -0.091, -0.072, -0.054 and -0.093]

Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
Prediction of contribution of acceptors to DlogP

[Plots, one for N or ether O and one for carbonyl O; axes: DlogP (corrected) and Vmin/(Hartree/electron)]

DlogP = DlogP0 × exp(-k × Vmin)

Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
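
The exponential form can be evaluated as below. This is a sketch only: the values used for DlogP0 and k are hypothetical placeholders and not the fitted parameters from Toulmin et al, and the Vmin values are likewise invented for illustration.

```python
import math


def dlogp_contribution(vmin, dlogp0, k):
    """Acceptor contribution to DlogP: DlogP0 * exp(-k * Vmin)."""
    return dlogp0 * math.exp(-k * vmin)


# Hypothetical parameters, for illustration only.
DLOGP0 = 0.1   # contribution at Vmin = 0
K = 40.0       # units of 1/(Hartree/electron)

for vmin in (-0.05, -0.10):  # hypothetical Vmin values
    print(f"Vmin = {vmin:+.2f}: DlogP = {dlogp_contribution(vmin, DLOGP0, K):.2f}")
```

For positive k, a more negative Vmin (a stronger acceptor) gives a larger predicted contribution to DlogP.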
Basis for ClogPalk model

[Plot axes: logPalk and MSA/Å²]

Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
ClogPalk from perturbation of saturated hydrocarbon
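$$\mathrm{ClogP_{alk}} = \log P_0 + s \times \mathrm{MSA} - \sum_i \Delta \log P_{\mathrm{FG},i} - \sum_j \Delta \log P_{\mathrm{Int},j}$$

• logP0 + s × MSA: logPalk predicted for saturated hydrocarbon
• Sum over i: perturbation by functional groups
• Sum over j: perturbation by interactions between functional groups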

Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
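
The structure of the model can be sketched in a few lines. Every number below (logP0, s, MSA and the perturbation terms) is a hypothetical placeholder chosen to show the bookkeeping; none is a parameter from the published model.

```python
def clogp_alk(msa, logp0, s, fg_terms, int_terms):
    """ClogPalk = logP0 + s*MSA - sum(FG terms) - sum(interaction terms)."""
    hydrocarbon_logp = logp0 + s * msa  # saturated hydrocarbon reference
    return hydrocarbon_logp - sum(fg_terms) - sum(int_terms)


# Hypothetical inputs, for illustration only.
example = clogp_alk(
    msa=100.0,         # molecular surface area / Å²
    logp0=0.0,         # intercept of the hydrocarbon line
    s=0.02,            # slope of logPalk against MSA
    fg_terms=[1.5],    # one functional group perturbation
    int_terms=[0.5],   # one interaction between functional groups
)
print(f"ClogPalk = {example:.2f}")
```

Each polar functional group lowers the prediction relative to the saturated hydrocarbon of the same surface area, and the interaction terms adjust for groups that influence each other.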
Performance of ClogPalk model
[Plot axes: logPalk - ClogPalk and (logPalk + ClogPalk)/2; labeled compounds: cortisone, hydrocortisone, papaverine, atropine, propranolol]

Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
Another way to look at SAR?
(Descriptor-based) QSAR/QSPR:
Some questions
• How valid is methodology (especially for validation)
when distribution of compounds in training/test space
is highly non-uniform?
• Are models predicting activity or locating neighbours?
• To what extent are ‘global’ models just ensembles of
local models?
• How well do the methods handle ‘activity cliffs’?
• How should we account for sizes of descriptor pools
when comparing model performance?
Measures of Diversity & Coverage
[Figure: 2-dimensional map of compounds, with the selected compounds marked]

A 2-dimensional representation of chemical space is used here to illustrate the concepts of diversity and coverage. Stars indicate compounds selected to sample this region of chemical space. In this representation, similar compounds are close together.
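
The slide does not define the measures, so the sketch below uses two simple definitions as assumptions: diversity as the smallest distance between any two selected compounds, and coverage as the largest distance from any compound to its nearest selected compound. The coordinates are invented 2-dimensional points.

```python
import math
from itertools import combinations


def diversity(selected):
    """Assumed measure: smallest pairwise distance within the selection."""
    return min(math.dist(p, q) for p, q in combinations(selected, 2))


def coverage_radius(compounds, selected):
    """Assumed measure: largest distance to the nearest selected compound."""
    return max(min(math.dist(c, s) for s in selected) for c in compounds)


# Hypothetical compounds in a 2-dimensional chemical space.
compounds = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
selected = [(0, 0), (5, 5)]

div = diversity(selected)
cov = coverage_radius(compounds, selected)
print(f"diversity = {div:.2f}, coverage radius = {cov:.2f}")
```

A selection can be diverse (selected compounds far apart) and still cover the space poorly if many compounds have no selected neighbour, which is why the two measures are considered separately.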
Neighborhoods and library design
Examples of relationships between structures

Tanimoto coefficient (foyfi) for structures is 0.90

Ester is methylated acid

Amides are ‘reversed’
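
For reference, the Tanimoto coefficient for two fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below uses plain Python sets of bit positions; the bit positions are invented and it does not reproduce foyfi fingerprints.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for two fingerprints given as sets of set bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


# Hypothetical fingerprints: 18 shared bits out of 20 set in total.
fp_a = set(range(0, 19))
fp_b = set(range(1, 20))

print(f"Tanimoto = {tanimoto(fp_a, fp_b):.2f}")
```

A high coefficient says only that most substructural features are shared. It does not say what the relationship between the structures is, which is the point of the ester/acid and reversed amide examples.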
Leatherface molecular editor
From chain saw to Matched Molecular Pairs
c-[A;!R]
bnd 1 2

[nX2]1c([OH])cccc1
hyd 1 1
hyd 3 -1
bnd 2 3 2

c-Br
cul 2
hyd 1 1

Kenny & Sadowski, Structure modification in chemical databases, Methods and Principles in Medicinal Chemistry (Chemoinformatics in Drug Discovery) 2005, 23, 271-285 DOI
Glycogen Phosphorylase inhibitors:
Series comparison

Comparison  DpIC50       DlogFu        DlogS
1           0.38 (0.06)  -0.30 (0.06)  -0.29 (0.13)
2           0.21 (0.06)  0.13 (0.04)   0.20 (0.09)
3           0.29 (0.07)  -0.42 (0.08)  -0.62 (0.13)

Standard errors of the mean values are given in parentheses; see Birch et al (2009) BMCL 19:850-853 DOI
Effect of bioisosteric replacement
on plasma protein binding

?

Mining the plasma protein binding (PPB) database for carboxylate/tetrazole pairs suggested that bioisosteric replacement would lead to a decrease in fraction unbound (Fu), so the tetrazoles were not synthesised.

Date of Analysis  N   DlogFu  SE    SD    %increase
2003              7   -0.64   0.09  0.23  0
2008              12  -0.60   0.06  0.20  0

Birch et al (2009) BMCL 19:850-853 DOI
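
Summary statistics of this kind can be computed from the property differences of the matched pairs. In the sketch below the DlogFu values are invented, the function name `summarize_pairs` is hypothetical, SD is taken to be the sample standard deviation, and %increase is assumed to mean the percentage of pairs for which the property goes up.

```python
import math
import statistics


def summarize_pairs(deltas):
    """N, mean, SE, SD and %increase for matched-pair property differences."""
    n = len(deltas)
    sd = statistics.stdev(deltas)  # sample SD (assumption)
    return {
        "N": n,
        "mean": statistics.fmean(deltas),
        "SE": sd / math.sqrt(n),
        "SD": sd,
        "%increase": 100.0 * sum(d > 0 for d in deltas) / n,
    }


# Hypothetical DlogFu values for seven matched pairs.
dlogfu = [-0.9, -0.7, -0.6, -0.5, -0.8, -0.4, -0.6]

summary = summarize_pairs(dlogfu)
for key, value in summary.items():
    print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

A mean that is several standard errors from zero, together with a %increase of 0, indicates that the replacement shifts the property in the same direction for every pair examined.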
Bioisosterism: Carboxylate & tetrazole

[Figure: carboxylate and tetrazole structures annotated with the values -0.262, -0.296, -0.268, -0.316, -0.315, -0.268, -0.295 and -0.261]

Kenny (2009) JCIM 49:1234-1244 DOI
Effect of amide N-methylation on aqueous solubility
is dependent on substructural context

Amide                      N    DlogS  SE    SD    %Increase
Acyclic (aliphatic amine)  109  0.59   0.07  0.71  76
Cyclic                     9    0.18   0.15  0.47  44
Benzanilides               9    1.49   0.25  0.76  100

Birch et al (2009) BMCL 19:850-853 DOI
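
The SE and SD values quoted for the three amide classes are linked through N. The check below assumes the usual relationship SE = SD/sqrt(N), which the slide does not state, and compares it with the quoted values.

```python
import math

# (amide class, N, SE, SD) as quoted on the slide.
rows = [
    ("Acyclic (aliphatic amine)", 109, 0.07, 0.71),
    ("Cyclic", 9, 0.15, 0.47),
    ("Benzanilides", 9, 0.25, 0.76),
]

# Difference between the quoted SE and SD/sqrt(N) for each class.
se_gaps = []
for name, n, se, sd in rows:
    expected_se = sd / math.sqrt(n)  # assumed relationship
    se_gaps.append(abs(expected_se - se))
    print(f"{name}: quoted SE = {se:.2f}, SD/sqrt(N) = {expected_se:.2f}")
```

The quoted values agree with this relationship to within rounding. It also makes clear why the acyclic class, with 109 pairs, has the smallest standard error even though its SD is similar to that of the benzanilides.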
Relationships between structures

• Prediction of activity & properties
– Direct prediction (e.g. look up substituent effects)
– Indirect prediction (e.g. apply correction to existing model)
• Discover new bioisosteres & scaffolds
• Recognise extreme data
– Bad measurement or interesting effect?
MUDO Molecule Editor
• SMIRKS-based re-write of Leatherface using
OEChem
• Can process 3D structures (e.g. form covalent bond
between protein and ligand)
• Identification of matched molecular pairs is much
easier than with Leatherface

Kenny, Montanari, Prokopczyk, Sala, Rodrigues Sartori (2013) JCAMD 27:655-664 DOI
Stuff to think about
• Molecular design is not just about prediction so
how can we make hypothesis-driven design more
systematic?
• Data can be massaged and correlations can be
inflated but it won’t extract us from ‘la merde’
• There is life beyond octanol/water (and atom-centered charges) if we choose to look for it
• Even molecules can have meaningful relationships
