Aspects of pharmaceutical molecular design
Peter Kenny
Outline of presentation
• General aspects of molecular design and introducing
pharmaceutical molecular design
• How to enhance your correlations in one easy lesson
• Introduction to partition coefficients
• Using relationships between structures to mine
activity and properties of compounds
Some things that make drug discovery difficult
• Having to exploit targets that are weakly linked to
human disease
• Inability to predict idiosyncratic toxicity
• Inability to measure free (unbound) physiological
concentrations of drug for remote targets (e.g.
intracellular or on far side of blood brain barrier)
Dans la merde : http://fbdd-lit.blogspot.com/2011/09/dans-la-merde.html
[Diagram compartments: Dose of drug; Free in plasma; Bound to plasma protein; In tissues; Eliminated drug]
A simplified view of what happens to drugs after dosing
$$\mathrm{TEP} = \frac{[\mathrm{Drug}(X,t)]_{\mathrm{free}}}{K_d}$$
Target engagement potential (TEP)
A basis for pharmaceutical molecular design?
Design objectives
• Low Kd for target(s)
• High (hopefully undetectable) Kd for antitargets
• Ability to control [Drug(X,t)]free
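The TEP ratio above can be sketched in a few lines of Python. This is only an illustration: the function name and the concentrations and Kd values are hypothetical, chosen to show a compound with high TEP at a target and low TEP at an antitarget.

```python
def target_engagement_potential(free_conc: float, kd: float) -> float:
    """Return TEP = [Drug(X,t)]free / Kd.

    free_conc and kd must be in the same concentration units.
    """
    if kd <= 0:
        raise ValueError("Kd must be positive")
    return free_conc / kd


# Hypothetical values (molar), not taken from the presentation.
free_conc = 50e-9          # free drug concentration at the site of action
kd_target = 5e-9           # low Kd: tight binding to the target
kd_antitarget = 10e-6      # high Kd: weak binding to the antitarget

tep_target = target_engagement_potential(free_conc, kd_target)
tep_antitarget = target_engagement_potential(free_conc, kd_antitarget)
print(f"TEP(target) = {tep_target:.3g}, TEP(antitarget) = {tep_antitarget:.3g}")
```

The sketch makes the design objectives concrete: lowering Kd for the target or raising the free concentration both increase TEP, while a high Kd for an antitarget keeps its TEP small at the same free concentration.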
Pharmaceutical molecular design
• Control of behavior of compounds and materials by
manipulation of molecular properties
• Sampling of chemical space
– Does fragment-based screening allow better control of
sampling resolution?
Molecular Design
• Hypothesis-driven: framework in which to assemble SAR/SPR as efficiently as possible
• Prediction-driven: assume that we can build predictive models with required degree of accuracy
Do1 Do2
Ac1
Kenny (2009) JCIM 49:1234-1244 DOI
Illustrating hypothesis-driven design with
DNA base isosteres: H bond acceptor & donor
definitions
Watson-Crick Donor & Acceptor Electrostatic Potentials for
Adenine Isosteres
Vmin(Ac1)
Va (Do1)
The lurking menace of correlation inflation
Kenny & Montanari (2013) JCAMD 27:1-13 DOI
Preparation of synthetic data for correlation
inflation study
Add Gaussian
noise (SD=10) to Y
Kenny & Montanari (2013) JCAMD 27:1-13 DOI
Correlation inflation by hiding variation
See Hopkins, Mason & Overington (2006) Curr Opin Struct Biol 16:127-136 DOI
Leeson & Springthorpe (2007) NRDD 6:881-890 DOI
Data is naturally binned (X is an integer) and mean value of Y is calculated for
each value of X. In some studies, averaged data is only presented graphically
and it is left to the reader to judge the strength of the correlation.
R = 0.34 R = 0.30 R = 0.31
R = 0.67 R = 0.93 R = 0.996
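The effect of averaging within bins can be reproduced with a small simulation. The sketch below is an assumption-laden illustration, not the procedure used in the cited study: it takes integer X from 1 to 10, sets Y = X plus Gaussian noise (SD = 10), and compares the correlation for the individual points with the correlation for the bin means. The function names, the number of points per bin and the random seed are all hypothetical choices.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n_per_bin=1000, noise_sd=10.0, seed=0):
    """Return (R for individual points, R for bin means)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for x in range(1, 11):                     # X is naturally binned (integer)
        for _ in range(n_per_bin):
            xs.append(x)
            ys.append(x + rng.gauss(0.0, noise_sd))   # add Gaussian noise to Y
    r_raw = pearson_r(xs, ys)

    bin_x = sorted(set(xs))
    bin_mean_y = [
        sum(y for xv, y in zip(xs, ys) if xv == b) / n_per_bin for b in bin_x
    ]
    r_binned = pearson_r(bin_x, bin_mean_y)    # variation within bins is hidden
    return r_raw, r_binned


r_raw, r_binned = simulate()
print(f"individual points: R = {r_raw:.2f}; bin means: R = {r_binned:.2f}")
```

The underlying relationship between X and Y is identical in both calculations; only the within-bin variation has been hidden, which is why the correlation for the means is so much stronger.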
N = 1202; R = 0.247 (95% CI: 0.193 to 0.299)
N = 8; R = 0.972 (95% CI: 0.846 to 0.995)
Correlation Inflation in Flatland
See Lovering, Bikker & Humblet (2009) JMC 52:6752-6756 DOI
Kenny & Montanari (2013) JCAMD 27:1-13 DOI
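The slide does not say how the 95% confidence intervals were obtained, but the quoted values are consistent with the standard Fisher z-transformation. The sketch below assumes that method; the function name is hypothetical.

```python
from math import atanh, sqrt, tanh


def pearson_ci(r: float, n: int, z_crit: float = 1.959964):
    """Approximate confidence interval for a Pearson R via Fisher's z.

    z_crit defaults to the two-sided 95% normal quantile.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = atanh(r)                   # Fisher transformation of R
    se = 1.0 / sqrt(n - 3)         # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


for r, n in [(0.247, 1202), (0.972, 8)]:
    low, high = pearson_ci(r, n)
    print(f"N = {n}, R = {r}: 95% CI {low:.3f} to {high:.3f}")
```

With only 8 points the interval is wide even though R is close to 1, which is a reminder that a high correlation between a handful of averaged values says little about the strength of the trend in the underlying data.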
Choosing octanol was the first mistake...
Why are we interested in partition coefficients?
• Definition and quantification of lipophilicity
• For modelling permeability (e.g. blood brain barrier)
• Insight into hydrophobic effect (molecular recognition)
• ‘Compound quality’ (bad stuff like poor solubility and
metabolic instability associated with high lipophilicity)
Lipophilic & half ionised Hydrophilic & neutral
Introduction to partition coefficients
Polarity
N
ClogP ≤ 5; Acc ≤ 10; Don ≤ 5
An alternative view of the Rule of 5
Does octanol/water ‘see’ hydrogen bond donors?
--0.06 -0.23 -0.24
--1.01 -0.66
Sangster lab database of octanol/water partition coefficients: http://logkow.cisti.nrc.ca/logkow/index.jsp
--1.05
Octanol/Water Alkane/Water
Octanol/water is not the only partitioning system
logPoct = 2.1
logPalk = 1.9
DlogP = 0.2
logPoct = 1.5
logPalk = -0.8
DlogP = 2.3
logPoct = 2.5
logPalk = -1.8
DlogP = 4.3
Differences in octanol/water and alkane/water logP values
reflect hydrogen bonding between solute and octanol
Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
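The DlogP values quoted above are simply the difference between the octanol/water and alkane/water logP values. A minimal sketch using the three pairs of values from the slide (the function name is hypothetical, and the compounds themselves are not identified in the text):

```python
def delta_logp(logp_oct: float, logp_alk: float) -> float:
    """DlogP = logPoct - logPalk."""
    return logp_oct - logp_alk


# (logPoct, logPalk) pairs as quoted on the slide
pairs = [(2.1, 1.9), (1.5, -0.8), (2.5, -1.8)]
dlogp = [round(delta_logp(oct_, alk), 1) for oct_, alk in pairs]
print(dlogp)  # [0.2, 2.3, 4.3]
```

The three compounds differ by only about 1 unit in logPoct but by almost 4 units in logPalk, so a larger DlogP indicates a solute whose hydrogen bonding is accommodated by octanol but not by alkane.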
-0.054
-0.086
-0.091
-0.072
-0.104 -0.093
Hydrogen bonding of esters
Toulmin et al (2008) J Med Chem 51:3720-3730 DOI
Basis for ClogPalk model
logPalk
MSA/Å2
Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
$$\mathrm{ClogP_{alk}} = \log P_0 + s \times \mathrm{MSA} - \sum_i \Delta\log P_{\mathrm{FG},i} - \sum_j \Delta\log P_{\mathrm{Int},j}$$
ClogPalk from perturbation of saturated hydrocarbon
• logP0 + s × MSA: logPalk predicted for saturated hydrocarbon
• Sum over i of ΔlogPFG,i: perturbation by functional groups
• Sum over j of ΔlogPInt,j: perturbation by interactions between functional groups
Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
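The structure of the ClogPalk expression (a hydrocarbon baseline from molecular surface area, minus functional group and interaction perturbations) can be sketched as a short function. This is an illustration of the form of the equation only: the function name and every numerical value below are hypothetical placeholders, not parameters from the published model.

```python
from typing import Sequence


def clogp_alk(
    logp0: float,
    s: float,
    msa: float,
    fg_terms: Sequence[float],
    int_terms: Sequence[float] = (),
) -> float:
    """ClogPalk = logP0 + s*MSA - sum(DlogP_FG) - sum(DlogP_Int)."""
    hydrocarbon_baseline = logp0 + s * msa   # prediction for saturated hydrocarbon
    return hydrocarbon_baseline - sum(fg_terms) - sum(int_terms)


# Hypothetical inputs: intercept, slope per square angstrom, surface area,
# two functional group perturbations and one interaction perturbation.
value = clogp_alk(
    logp0=0.0, s=0.02, msa=200.0, fg_terms=[3.0, 1.5], int_terms=[-0.5]
)
print(f"ClogPalk = {value:.2f}")
```

Keeping the interaction terms separate from the functional group terms mirrors the equation: a negative interaction term partly offsets the functional group penalties.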
Performance of ClogPalk model
[Plot: logPalk − ClogPalk against (logPalk + ClogPalk)/2; labelled compounds: Hydrocortisone, Cortisone, Atropine, Propranolol, Papaverine]
Kenny, Montanari & Prokopczyk et al (2013) JCAMD 27:389-402 DOI
Another way to look at Structure Activity Relationships?
(Descriptor-based) QSAR/QSPR
• How likely is it that we will have sufficient data for useful activity
model before project delivers?
• How valid is methodology (especially for validation) when
distribution of compounds in training/test space is highly non-
uniform?
• Are models predicting activity or locating neighbours?
• To what extent are ‘global’ models ensembles of local models?
• How well do methods handle ‘activity cliffs’?
• How should we account for sizes of descriptor pools when
comparing model performance?
Measures of Diversity & Coverage
2-Dimensional representation of chemical space is used here to illustrate concepts of diversity
and coverage. Stars indicate compounds selected to sample this region of chemical space.
In this representation, similar compounds are close together
Neighborhoods and library design
Examples of relationships between structures
Tanimoto coefficient (foyfi) for structures is 0.90
Ester is methylated acid Amides are ‘reversed’
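For readers unfamiliar with the Tanimoto coefficient, the sketch below shows the usual definition for binary fingerprints: the number of bits set in both fingerprints divided by the number set in either. The fingerprints here are hypothetical sets of bit positions chosen to give 0.90; they are not foyfi fingerprints, and the treatment of two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient for fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 1.0                      # assumed convention for two empty sets
    return len(fp_a & fp_b) / len(fp_a | fp_b)


# Hypothetical fingerprints: 18 bits in common, 20 bits set in total
fp_a = set(range(0, 19))    # bits 0..18
fp_b = set(range(1, 20))    # bits 1..19
similarity = tanimoto(fp_a, fp_b)
print(f"Tanimoto = {similarity:.2f}")
```

A single similarity value of this kind says that two structures are alike but not how they differ, which is the information that explicit relationships such as "ester is methylated acid" capture.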
Leatherface molecular editor
From chain saw to Matched Molecular Pairs
c-[A;!R]
bnd 1 2
c-Br
cul 2
hyd 1 1
[nX2]1c([OH])cccc1
hyd 1 1
hyd 3 -1
bnd 2 3 2
Kenny & Sadowski (2005) Structure modification in chemical databases, Methods and Principles in Medicinal Chemistry (Chemoinformatics in Drug Discovery) 23:271-285 DOI
Glycogen Phosphorylase inhibitors:
Series comparison
Comparison   DpIC50        DlogFu         DlogS
1            0.38 (0.06)   -0.30 (0.06)   -0.29 (0.13)
2            0.21 (0.06)   0.13 (0.04)    0.20 (0.09)
3            0.29 (0.07)   -0.42 (0.08)   -0.62 (0.13)
Standard errors of mean values in parentheses; see Birch et al (2009) BMCL 19:850-853 DOI
Effect of bioisosteric replacement
on plasma protein binding
?
Date of Analysis   N    DlogFu   SE     SD     %increase
2003               7    -0.64    0.09   0.23   0
2008               12   -0.60    0.06   0.20   0
Mining PPB database for carboxylate/tetrazole pairs suggested that bioisosteric
replacement would lead to decrease in Fu so tetrazoles were not synthesised.
Birch et al (2009) BMCL 19:850-853 DOI
-0.316
-0.315
-0.296
-0.295
Bioisosterism: Carboxylate & tetrazole
-0.262
-0.261
-0.268
-0.268
Kenny (2009) JCIM 49:1234-1244 DOI
Amide                       N     DlogS   SE     SD     %Increase
Acyclic (aliphatic amine)   109   0.59    0.07   0.71   76
Cyclic                      9     0.18    0.15   0.47   44
Benzanilides                9     1.49    0.25   0.76   100
Effect of amide N-methylation on aqueous solubility
is dependent on substructural context
Birch et al (2009) BMCL 19:850-853 DOI
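The columns in the matched molecular pair tables (N, mean difference, SE, SD and %increase) can be computed from a list of property differences, one per pair. The sketch below assumes SE = SD/sqrt(N), which is consistent with the tabulated values, and takes %increase to be the percentage of pairs with a positive difference. The function name and the input differences are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarise_pairs(deltas):
    """Summarise property differences (e.g. DlogS) for matched pairs."""
    n = len(deltas)
    if n < 2:
        raise ValueError("need at least two matched pairs")
    sd = stdev(deltas)                          # sample standard deviation
    return {
        "N": n,
        "mean": mean(deltas),
        "SE": sd / sqrt(n),                     # standard error of the mean
        "SD": sd,
        "pct_increase": 100.0 * sum(d > 0 for d in deltas) / n,
    }


# Hypothetical differences for four matched pairs
summary = summarise_pairs([0.5, 1.0, -0.5, 1.0])
print(summary)
```

Reporting SD and %increase alongside the mean shows how consistently a structural change shifts the property, which matters when the effect depends on substructural context.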
Relationships between structures
• Discover new bioisosteres & scaffolds
• Prediction of activity & properties
– Direct prediction (e.g. look up substituent effects)
– Indirect prediction (e.g. apply correction to existing model)
• Recognise extreme data
– Bad measurement or interesting effect?
• Molecular design is not just about prediction so
how can we make hypothesis-driven design more
systematic?
• Inflated correlations will not extract us from ‘la
merde’ even if we can get a journal to print them
• There is life beyond octanol/water (and atom-
centered charges) if we only choose to look for it
• Even molecules can have meaningful relationships
Some stuff to think about
