Ed Griffen, MedChemica Ltd
Extracting actionable knowledge from large scale in vitro pharmacology data
MedChemica
Why improve medicinal chemistry practice?
For an aging population and emerging pathogens
“Eroom’s Law” – The cost of discovering a new drug has
doubled every 9 years consistently for the last 60 years.1
= cost rising about 8%/year
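The 8%/year figure follows from the 9-year doubling time by simple compounding. A minimal sketch of that arithmetic (the function name is illustrative, not from the slides):

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Annual fractional growth implied by a fixed doubling time."""
    return 2 ** (1 / doubling_time_years) - 1

# Doubling every 9 years, as quoted for Eroom's Law.
rate = annual_growth_rate(9)

# Cumulative cost multiple after 60 years of compounding at that rate.
multiple_60y = (1 + rate) ** 60

print(f"annual increase: {rate:.1%}")
print(f"cost multiple over 60 years: {multiple_60y:.0f}x")
```

Sixty years at this rate is between six and seven doublings, which is why the cumulative multiple is so large.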
1. Scannell et al Nature Reviews Drug Discovery (2012), 11, 191-200
2. Paul et al Nature Reviews Drug Discovery (2010), 9, 203-214
[Figure: bar chart of cost ($million, axis 0 to 500) for Cost / project, Cost/Launch and Cost/Launch (capitalized).]
Cost/Launch (2010): $873m
Capitalised: $1.8Bn (ref. 2)
Actionable knowledge
Critical information that the user can immediately choose
a course of action from:
ADME – ways to ‘fix’ your molecule
Toxicology – substructures to avoid
Pharmacology – substructural leads built for practical design
Better medicinal chemistry by combining knowledge
[Diagram: Roche data, Genentech data and AZ data each pass through a rule finder into a per-company database; these feed the Grand Rule Database (ADMET rule database).]
[Diagram: AZ, Roche and Genentech each exploit the Grand Rule Database. Pharma 4 and Pharma 5 each run the rule finder on their own data, feed the Grand Rule Database and exploit it in turn.]
>500 million pairs from companies + 12 million from public data
Current Knowledge sets – GRDv3
Numbers of statistically valid transforms
Grouped datasets | Number of rules
logD7.4 | 153449
Merged solubility | 46655
In vitro microsomal clearance (human, rat, mouse, cyno, dog) | 88423
In vitro hepatocyte clearance (human, rat, mouse, cyno, dog) | 26627
MDCK permeability A-B / B-A efflux | 1852
Cytochrome P450 inhibition (2C9, 2D6, 3A4, 2C19, 1A2) | 40605
Cardiac ion channels (NaV1.5, hERG ion channel inhibition) | 15636
Glutathione stability | 116
Plasma protein or albumin binding (human, rat, mouse, cyno, dog) | 64622
Actionable knowledge
Critical information that the user can immediately choose a course of action from:
ADME – ways to ‘fix’ your molecule
Toxicology – substructures to avoid
Pharmacology – substructural leads built for practical design
Clear structural direction from Big Data
Example: Dopamine Transporter inhibitors
[Figure: histogram of count against pIC50, grouped by Pharmacophore (0 / 1).]
pKi: Predicted 8.6, Measured 9.1
Mean with pharmacophore: 8.3
Mean without: 6.7
n examples: 27
Odds ratio vs ChEMBL: 407
What do I want?
• Substructures associated with potency
• Specificity of model
• Predictions
• Domain of Applicability
[Figure: scatter plot of pIC50 against pIC50_pred; labelled compound CHEMBL538405.]
MedChemica Principles of Pharmacophore Extraction
• Pharmacophores must be clear and understandable
• Pharmacophore generation must be transparent to allow checking and validation
• Use as much measured data as possible
• Look for key elements influencing potency
• Don’t base pharmacophores on a few compounds
• Pharmacophore must be specific
  • (not like phenyl + amine = hERG inhibitor)
• Can be applied quickly (to large libraries)
[Figure: example pharmacophore with features labelled Cation, HyAr, HyAr.]
How do I actually use this?
QSAR and Knowledge extraction
Model as filter or knowledge?
Descriptors: substructures; physical chemistry descriptors (Hansch, Taft, Fujita, Abraham); atomic, pair, triplet descriptors; indices
Method: (M)LR; Free Wilson; PLS; Trees / Forests; SVM; Bayesian NN; Deep Learning
[Diagram: descriptors plotted against methods, with a scale labelled Dark / Black.]
Specific Pharmacophore extraction from MMPA
• Identify key potency-giving changes by matched molecular pair analysis on large datasets
• Extract fragments that are associated with potency
• Find pairs of fragments and linkers that are specific to potent compound subsets
• → easy to use 2D pharmacophores
• 2 potency enhancing fragments joined by a specific linker
Example: Dopamine transporter (CHEMBL339), 1470 compounds
Workflow: Public Data → Find Matched Pairs → Find Potent Fragments → Find Pharmacophore dyads → Pharmacophores → Model
Pharmacophore identification: Fragment 1, Linker, Fragment 2 (example compound CHEMBL538405, pKi 9.1)
[Figure: histogram of count against pIC50, titled "All D2 Compounds".]
Predict potency and show pharmacophore match: pKi Predicted 8.6, Measured 9.1; Mean with pharmacophore 8.3; Mean without 6.7; n examples 27; Odds ratio vs ChEMBL 407
[Figure: scatter plot of pIC50 against pIC50_pred.]
Matched Molecular Pairs
• Molecules that differ only by a particular, well-defined structural transformation
Transformation with environment capture
• MMPs can be recorded as transformations from A → B
• Environment is essential to understand chemistry
Statistical analysis
• Learn what effect the transformation has had on properties in the past
Griffen, E. et al. Matched Molecular Pairs as a Medicinal Chemistry Tool. Journal of Medicinal Chemistry. 2011, 54(22), pp. 7739-7750.
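The idea of pairing molecules that share a context and differ by one transformation can be sketched on toy data. Everything below is hypothetical: the compound ids, the "scaffold" context strings and the pIC50 values are made up, and in practice the context/fragment split would come from a cheminformatics toolkit rather than hand-written tuples.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical records: (compound id, constant context, variable fragment, pIC50).
records = [
    ("c1", "scaffold_X", "H", 6.0),
    ("c2", "scaffold_X", "F", 6.5),
    ("c3", "scaffold_Y", "H", 7.0),
    ("c4", "scaffold_Y", "F", 7.7),
    ("c5", "scaffold_Y", "Cl", 7.2),
]

def find_matched_pairs(records):
    """Pair compounds that share a context and differ only in the fragment."""
    by_context = defaultdict(list)
    for cid, context, fragment, value in records:
        by_context[context].append((cid, fragment, value))
    pairs = []
    for context, members in by_context.items():
        for (id_a, frag_a, val_a), (id_b, frag_b, val_b) in combinations(members, 2):
            if frag_a != frag_b:
                pairs.append({
                    "transform": (frag_a, frag_b),  # recorded as A -> B
                    "context": context,             # environment of the change
                    "pair": (id_a, id_b),
                    "delta": val_b - val_a,         # change in the measured property
                })
    return pairs

def summarise_transforms(pairs):
    """Count and mean property change for each transformation seen so far."""
    deltas = defaultdict(list)
    for p in pairs:
        deltas[p["transform"]].append(p["delta"])
    return {t: (len(d), mean(d)) for t, d in deltas.items()}

pairs = find_matched_pairs(records)
summary = summarise_transforms(pairs)
for transform, (n, avg) in summary.items():
    print(transform, n, round(avg, 2))
```

Keeping the context alongside each transformation is what lets a later analysis ask whether the same change behaves differently in different environments.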
Advanced MMPA
[Figure: matched pair A → B with numbered atoms, annotated Δ Data A-B.]
Matched pair methodology matters because MCSS (maximum common substructure) and F&I (fragment and index) each find different pairings
A – CHEMBL156639, B – CHEMBL2387702
A – CHEMBL100461, B – CHEMBL103900
[Figure: example pairs, each marked with the method that finds it]
MCSS ✓, F&I ✗ | MCSS ✗, F&I ✓
MCSS ✓, F&I ✗
MCSS ✓, F&I ✗
MCSS ✗, F&I ✗ | MCSS ✗, F&I ✗
MCSS ✗, F&I ✓
MCSS ✓, F&I ✗
Does the Matched Pair method really matter?
Using only one technique will miss between 12% and 56% of pairings
Target | Number of compounds | Common pairings | FI only | MCSS only | Total | FI only % | Common % | MCSS only %
VEGF | 4466 | 14631 | 17172 | 14823 | 46626 | 37 | 31 | 32
Dopamine Transporter | 1470 | 4480 | 8930 | 3497 | 16907 | 53 | 26 | 21
GABAA | 848 | 2500 | 1722 | 4205 | 8427 | 20 | 30 | 50
D2 human | 3873 | 12995 | 13811 | 13098 | 39904 | 35 | 33 | 33
D2 rat | 1807 | 5408 | 6595 | 7346 | 19349 | 34 | 28 | 38
Acetylcholine esterase | 383 | 536 | 725 | 1434 | 2695 | 27 | 20 | 53
Monoamine oxidase | 264 | 653 | 1156 | 246 | 2055 | 56 | 32 | 12
min | | | | | | 20 | 20 | 12
max | | | | | | 56 | 33 | 53
[Venn diagram: FI, MCSS, common]
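The "12% to 56%" range can be re-derived from the pairing counts. The sketch below copies the counts from the table; the function and variable names are illustrative.

```python
# Pairing counts from the table above: (common, FI only, MCSS only).
pairings = {
    "VEGF": (14631, 17172, 14823),
    "Dopamine Transporter": (4480, 8930, 3497),
    "GABAA": (2500, 1722, 4205),
    "D2 human": (12995, 13811, 13098),
    "D2 rat": (5408, 6595, 7346),
    "Acetylcholine esterase": (536, 725, 1434),
    "Monoamine oxidase": (653, 1156, 246),
}

def missed_fractions(common, fi_only, mcss_only):
    """Fraction of all pairings lost when only one method is used."""
    total = common + fi_only + mcss_only
    return {
        "missed_using_mcss_alone": fi_only / total,  # pairs only FI finds
        "missed_using_fi_alone": mcss_only / total,  # pairs only MCSS finds
    }

fractions = {name: missed_fractions(*counts) for name, counts in pairings.items()}
all_missed = [value for f in fractions.values() for value in f.values()]
lowest, highest = min(all_missed), max(all_missed)

print(f"missed by a single method: {lowest:.0%} to {highest:.0%}")
```

Both extremes come from the Monoamine oxidase row, where MCSS alone contributes few unique pairs and FI alone contributes most of them.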
Mining transform sets to find potent fragments
Identify the ‘A’ fragments associated with a significant number of potency decreasing changes – irrespective of what they are replaced with
‘A’ is ‘better than anything you replace it with’
[Table columns: Fragment A, Fragment B, Change in binding measurement]
[Diagram: fragment A replaced by fragments B to F, with changes +2.1, +2.2, +1.4, +0.4, +1.8]
[Figure: pKi/pIC50 distributions for compounds containing the potent fragment and for the remaining compounds; effect size = Cohen’s d]
Statistics:
• One-tailed binomial test with Holm–Bonferroni correction at 95% confidence identifies potent fragments
• Compare the mean of the compounds that contain the fragment with the mean of the remaining compounds
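A minimal sketch of a one-tailed binomial test followed by Holm–Bonferroni correction, using only the standard library. The fragment names and counts are invented, and the null hypothesis (a replacement is equally likely to raise or lower potency, p0 = 0.5) is an assumption; the slides do not state the null they use.

```python
from math import comb

def binomial_one_tailed_p(successes, trials, p0=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p0)."""
    return sum(
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(successes, trials + 1)
    )

def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down procedure: True where the null is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return rejected

# Hypothetical fragments: (potency-decreasing replacements, total replacements).
observations = {"frag_A": (18, 20), "frag_B": (9, 12), "frag_C": (5, 10)}

names = list(observations)
p_values = [binomial_one_tailed_p(*observations[name]) for name in names]
rejected = holm_bonferroni(p_values, alpha=0.05)

for name, p, keep in zip(names, p_values, rejected):
    print(name, f"p={p:.4g}", "potent" if keep else "not significant")
```

The step-down correction matters here because many fragments are tested at once; an uncorrected 5% threshold would pass a steady stream of false positives.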
Cohen’s d
Effect sizes:
Large >= 0.8
Medium 0.5 – 0.8
Small 0.2 – 0.5
Trivial 0.1 – 0.2
No effect < 0.1
\[ d = \frac{m_A - m_B}{s'}, \qquad s' = \sqrt{\frac{s_A^2 + s_B^2}{2}} \]
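A small sketch of the effect-size calculation, assuming the formula above reads d = (m_A - m_B) / s' with s' = sqrt((s_A^2 + s_B^2) / 2). The use of the sample variance (n - 1 denominator) is an assumption, and the potency values are invented.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the root mean of the two group variances as s'."""
    s_prime = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / s_prime

def effect_label(d):
    """Map |d| onto the bands listed on the slide."""
    size = abs(d)
    if size >= 0.8:
        return "Large"
    if size >= 0.5:
        return "Medium"
    if size >= 0.2:
        return "Small"
    if size >= 0.1:
        return "Trivial"
    return "No effect"

# Hypothetical pIC50 values for compounds with and without a fragment.
with_fragment = [8.0, 8.5, 9.0]
without_fragment = [6.0, 6.5, 7.0, 7.5]

d = cohens_d(with_fragment, without_fragment)
print(round(d, 2), effect_label(d))
```

Unlike a p-value, d does not grow with sample size, so it indicates whether a statistically significant fragment also shifts potency by a useful amount.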
Mining transform sets to find destructive fragments
Identify the ‘Z’ fragments associated with a significant number of potency increasing changes – irrespective of what they are replaced with
‘Z’ is ‘worse than anything you replace it with’
[Table columns: Fragment A, Fragment B, Change in binding measurement]
[Diagram: fragment Z replaced by other fragments, with changes +2.7, +3.2, +0.6, +0.6]
[Figure: pKi/pIC50 distributions for compounds containing the destructive fragment and for the remaining compounds]
Mining transform sets to find influential fragments
• ‘Z’ fragments are associated with a significant number of potency increasing changes: ‘Z’ is ‘worse than anything you replace it with’
• ‘A’ fragments are associated with a significant number of potency decreasing changes: ‘A’ is ‘better than anything you replace it with’
[Figure: pKi/pIC50 distributions for compounds with destructive fragments and compounds with constructive fragments]
Building Pharmacophores from potent Fragments
But individual fragments are small and often non-specific, so…
• Permute all the pairs of fragments and find the shortest path between them (pharmacophore dyads) in the training set
• The shortest path between them encodes distance & geometry
• Select pharmacophore dyads with PLS to identify the dyads that explain most of the potency
• Check for significance and effect size with Welch’s t-test and Cohen’s d
• But what about specificity?
[Figure: example dyad labelled Fragment 1, Path, Fragment 2; structure string shown: [CH2]CN]
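Finding the shortest path between two fragments in a molecule is a breadth-first search over the bond graph. The sketch below uses a hand-written toy graph (atom indices and bonds are invented) rather than a real molecule parsed by a toolkit.

```python
from collections import defaultdict, deque

def shortest_path(adjacency, sources, targets):
    """Breadth-first search from any source atom to the nearest target atom."""
    targets = set(targets)
    seen = set(sources)
    queue = deque((atom, [atom]) for atom in sources)
    while queue:
        atom, path = queue.popleft()
        if atom in targets:
            return path
        for neighbour in adjacency[atom]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [neighbour]))
    return None  # fragments are not connected

# Toy bond list: a six-atom chain 0-1-2-3-4-5 plus a shortcut 1-6-4.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 6), (6, 4)]
adjacency = defaultdict(list)
for a, b in bonds:
    adjacency[a].append(b)
    adjacency[b].append(a)

fragment_1_atoms = [0]  # hypothetical attachment atom of fragment 1
fragment_2_atoms = [5]  # hypothetical attachment atom of fragment 2

path = shortest_path(adjacency, fragment_1_atoms, fragment_2_atoms)
bond_count = len(path) - 1
print(path, bond_count)
```

Because breadth-first search explores atoms in order of bond distance, the first target reached is guaranteed to lie on a shortest path, which here takes the shortcut through atom 6.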
Testing for specificity - pharmacophores
• How selective is the pharmacophore?
• What are the odds of it hitting a molecule in the test set vs ChEMBL?
• Odds of finding in potency set = n(pharmacophore hits in potency set) / n(in potency set) = 27/1470
• Odds of finding in ChEMBL = n(pharmacophore hits in ChEMBL not in potency set) / n(in ChEMBL) = 62/1351211
• Odds ratio = selectivity = (odds of finding in potency set) / (odds of finding in ChEMBL, not potency set) = (27/1470) / (62/1351211) = 407 (95% confidence limits: 259-642)
Odds of hitting a potent compound are 407 times greater than the odds of hitting a random compound in ChEMBL
[Figure: example dyad labelled Fragment 1, Path, Fragment 2; structure string shown: [CH2]CN]
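The quoted figures can be checked with a few lines of arithmetic. One observation, offered tentatively: 407 and the 259-642 limits are reproduced when the odds are taken as hits / non-hits with a standard log-scale (Woolf) interval, whereas the plain ratio of the two proportions as written comes out nearer 400. The function name below is illustrative.

```python
from math import exp, sqrt

def odds_ratio_with_ci(hits_a, n_a, hits_b, n_b, z=1.96):
    """Odds ratio of set A vs set B with a log-scale (Woolf) confidence interval."""
    odds_a = hits_a / (n_a - hits_a)
    odds_b = hits_b / (n_b - hits_b)
    ratio = odds_a / odds_b
    # Standard error of log(odds ratio) from the four cell counts.
    se = sqrt(1 / hits_a + 1 / (n_a - hits_a) + 1 / hits_b + 1 / (n_b - hits_b))
    return ratio, ratio * exp(-z * se), ratio * exp(z * se)

# Counts from the slide: 27 hits in 1470 potency-set compounds,
# 62 hits in 1351211 other ChEMBL compounds.
ratio, low, high = odds_ratio_with_ci(27, 1470, 62, 1351211)
proportion_ratio = (27 / 1470) / (62 / 1351211)

print(f"odds ratio {ratio:.1f} (95% CI {low:.0f} to {high:.0f})")
print(f"ratio of proportions {proportion_ratio:.1f}")
```

For rare hits the two ratios are close, so the distinction matters little here; it becomes important when a pattern matches a large share of either set.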
How specific is a Pharmacophore?
What does a bad odds ratio look like?
Early simple hERG model
Lipophilic base, usually a tertiary amine
X = 2-5 atom chain, may include rings, heteroatoms or polar groups
[Structure diagram labels: X, N, R1, R2]
e.g. sertindole: 14 nM vs hERG
[$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~*~*~c)]
What is the odds ratio?
Found in ChEMBL: 565658/1352681
Found in CHEMBL240 (hERG) where pIC50 >= 5: 1985/2451
OR = (1985/2451) / (565658/1352681) = 0.81 / 0.42 = 1.94 (95% conf 1.83 – 2.05)
Ar-linker-base has only been found 1.9x more often in hERG inhibitors than at random in ChEMBL
Domain of Applicability
“Whereof one cannot speak, thereof one must be silent.”1
Claiming to have extracted knowledge or making a prediction when we know we don’t have enough evidence is:
• Delusional
• Dangerous
  • it would be more productive to act on a different hypothesis or at random
• Degrades the use of rational analysis at all
Compound activity prediction should have three classes of output:
• Active
• Inactive
• Out of domain – no prediction possible
Only fragments with a sufficient evidential base are used to form pharmacophore dyads
In turn, only pharmacophore dyads that have enough support are used in the model
1. Wittgenstein, Tractatus Logico-Philosophicus, 1922
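The three-class output can be sketched as a small classifier that refuses to predict without evidence. This is an illustration only: the rule used for "in domain" (the compound matches at least one supported dyad), the dyad names and the activity threshold are all assumptions, not the authors' stated implementation.

```python
from enum import Enum

class Call(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    OUT_OF_DOMAIN = "Out of domain"

def classify(matched_dyads, supported_dyads, predicted_pic50, active_threshold=6.0):
    """Return a three-class call; no prediction without a supported dyad match."""
    if not set(matched_dyads) & set(supported_dyads):
        return Call.OUT_OF_DOMAIN
    if predicted_pic50 >= active_threshold:
        return Call.ACTIVE
    return Call.INACTIVE

# Hypothetical dyads that passed the evidence filters.
supported = {"dyad_1", "dyad_2"}

calls = [
    classify(["dyad_1"], supported, 7.5),
    classify(["dyad_2"], supported, 5.0),
    classify([], supported, None),  # nothing matched, so no number is needed
]
print([call.value for call in calls])
```

Checking the domain before looking at the predicted value means an out-of-domain compound never receives a number that could be mistaken for evidence.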
Model activity from presence of Pharmacophores
[Figure: histograms of count against pIC50, split by pharmacophore Matches (0 / 1); caption: Dopamine Transporter +/- pharmacophores.]
• Identify and group Fragment SMARTS from MMPA
• If n ≥ 8, perform a one-tailed binomial test with Holm-Bonferroni adjustment
• Remove non-significant ‘Biophores’
• Compare the mean of the compounds containing the biophore with the mean of the remaining compounds for significance (Welch’s t-test) and effect size (Cohen’s d)
• Permute all the significant Biophores and determine the shortest paths between them in the training set = Pharmacophore dyads
• Select Pharmacophore dyads with n ≥ 6 examples
• Use presence/absence of Pharmacophore dyad as an indicator variable in PLS modelling
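The significance step compares two groups of unequal size and variance, which is what Welch's t-test is for. A minimal standard-library sketch of the statistic and its Welch–Satterthwaite degrees of freedom follows; the potency values are invented, and turning t into a p-value needs a t-distribution CDF (for example `scipy.stats.ttest_ind` with `equal_var=False`), which is left out here.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t, df

# Hypothetical pIC50 values for compounds with and without a biophore.
with_biophore = [8.0, 8.5, 9.0]
without_biophore = [6.0, 6.5, 7.0, 7.5]

t_stat, dof = welch_t(with_biophore, without_biophore)
print(round(t_stat, 3), round(dof, 2))
```

The toy groups are far smaller than the n ≥ 8 and n ≥ 6 thresholds in the workflow; they are kept short only so the arithmetic can be followed by hand.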
Modelling critical safety targets (ref. 1)
Workflow: Public Data → Find Matched Pairs → Find Potent Fragments → Find Pharmacophore dyads → Pharmacophores
Target | Class | Effect | Number of compounds
Acetylcholine esterase - human | Enzyme | CV: drop in BP, drop in HR, bronchoconstriction | 383
β1 adrenergic receptor | GPCR | CV: change in HR, BP, bronchodilation, vasodilation, tremor | 505
Androgen receptor | NHR | Endocrine: agonism: androgenicity / gynecomastia, prostate / breast carcinoma | 1064
CB1 cannabinoid receptor | GPCR | CNS: euphoria, dysphoria, anxiety, memory impairment, analgesia, hypothermia, weight loss, emesis, depression | 1104
CB2 cannabinoid receptor | GPCR | Increased inflammation | 1112
Dopamine D2 receptor - human | GPCR | CNS: hallucinations, drowsiness, confusion, emesis; CV: drop in heart rate | 3873
Dopamine D2 receptor - rat | GPCR | As human | 1807
Dopamine Transporter | Transporter | CNS: addictive psychostimulation, depression, parkinsonism, seizures | 1470
GABA A receptor | Ion channel | CNS: anxiolysis, ataxia, sedation, depression, amnesia | 848
hERG ion channel | Ion channel | CV: QT prolongation | 4189
5HT2a receptor | GPCR | CNS: drop in body temp, anxiogenic | 642
Monoamine oxidase | Enzyme | CV: increase in BP, DDI potential; CNS: dizziness, nausea | 264
Muscarinic acetylcholine receptor M1 | GPCR | CNS: proconvulsant, drop in cognitive function, vision impairment | 628
μ opioid receptor | GPCR | CNS: sedation, abuse liability, respiratory depression, hypothermia | 1128
1. J. Bowes, A. J. Brown, J. Hamon, W. Jarolimek, A. Sridhar, G. Waldron, and S. Whitebread, “Reducing safety-related drug attrition: the use of in vitro pharmacological profiling,” Nat. Rev. Drug Discov., vol. 11, no. 12, pp. 909–922, Nov. 2012
Modelling critical safety targets
Workflow: Public Data → Find Matched Pairs → Find Potent Fragments → Find Pharmacophore dyads → Pharmacophores
Target | Number of compounds | Number of compound pairs | Number of fragments | Number of pharmacophore dyads after filtering | R2 | RMSEP | ROC | Odds ratio (geomean)
Acetylcholine esterase - human | 383 | 27755 | 44 | 10 | 0.43 | 1.57 | 0.80 | 4
β1 adrenergic receptor | 505 | 145447 | 276 | 313 | 0.64 | 0.70 | 0.96 | 833
Androgen receptor | 1064 | 113163 | 186 | 46 | 0.47 | 0.77 | 0.86 | 140
CB1 cannabinoid receptor | 1104 | 88091 | 165 | 90 | 0.61 | 1.02 | 0.87 | 96
CB2 cannabinoid receptor | 1112 | 82130 | 194 | 158 | 0.19 | 0.85 | 0.64 | 5.7
Dopamine D2 receptor - human | 3873 | 230962 | 483 | 602 | 0.42 | 0.88 | 0.69 | 110
Dopamine D2 receptor - rat | 1807 | 118736 | 267 | 377 | 0.29 | 0.85 | 0.78 | 125
Dopamine Transporter | 1470 | 106969 | 282 | 336 | 0.58 | 0.73 | 0.88 | 141
GABA A receptor | 848 | 39494 | 106 | 167 | 0.70 | 0.76 | 0.97 | 560
hERG ion channel | 4189 | 242261 | 392 | 76 | 0.61 | 0.96 | 0.92 | 55
5HT2a receptor | 642 | 50870 | 197 | 267 | 0.61 | 0.59 | 0.83 | 600
Monoamine oxidase | 264 | 15439 | 44 | 11 | 0.12 | 1.25 | 0.48 | 181
Muscarinic acetylcholine receptor M1 | 628 | 48200 | 97 | 510 | 0.62 | 0.94 | 0.89 | 48
μ opioid receptor | 1128 | 37184 | 33 | 11 | 0.69 | 1.30 | 0.87 | 81
• Build models using 10-fold cross validated PLS
• Assess using ROC / BEDROC, R2 vs 100 fold y-scrambled R2 and geomean odds ratio
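Two of the assessment numbers, ROC (area under the curve) and the geometric-mean odds ratio, are simple to compute. The sketch below is a plain-Python illustration on invented scores and labels; it does not reproduce the PLS models, BEDROC or the y-scrambling check.

```python
from math import exp, log

def roc_auc(scores, labels):
    """AUC as the chance a random active outscores a random inactive (ties count half)."""
    actives = [s for s, is_active in zip(scores, labels) if is_active]
    inactives = [s for s, is_active in zip(scores, labels) if not is_active]
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))

def geometric_mean(values):
    """Geometric mean, suited to ratios that span orders of magnitude."""
    return exp(sum(log(v) for v in values) / len(values))

# Hypothetical predicted potencies and measured activity labels (1 = active).
predicted = [8.6, 7.9, 7.1, 6.4, 5.8, 5.2]
is_active = [1, 1, 0, 1, 0, 0]

auc = roc_auc(predicted, is_active)
# Hypothetical per-dyad odds ratios for one target.
geomean_or = geometric_mean([10, 1000])

print(round(auc, 3), round(geomean_or, 1))
```

A geometric rather than arithmetic mean keeps one dyad with a very large odds ratio from dominating the per-target summary.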
Toxophore examples
Detailed, specific & transparent
Dopamine D2 receptor human
Actual: 9.5
Predicted: 9.1
Mean with: 8.0
Mean without: 6.6
Odds Ratio: 340
Dopamine Transporter
Actual: 9.1
Predicted: 8.6
Mean with: 8.3
Mean without: 6.7
Odds Ratio: 407
GABA-A
Actual: 9.0
Predicted: 8.7
Mean with: 8.0
Mean without: 6.8
Odds Ratio: 1506
β1 adrenergic receptor
Actual: 7.8
Predicted: 7.7
Mean with: 6.5
Mean without: 5.7
Odds Ratio: 1501
Safety Target Conclusions
• We can model safety critical targets and extract both predictive models and useful ligand structural information
  • Clear areas to action
  • Clearly defined domain of applicability
  • No prediction where there is insufficient evidence (conservative method)
• The method relies on having large data sets (>= 500 data points)
• MMPA is the computationally intensive phase
  • But of course molecules only need pairing once…
Actionable knowledge
Critical information that the user can immediately choose a course of action from:
ADME – ways to ‘fix’ your molecule
Toxicology – substructures to avoid
Pharmacology – substructural leads built for practical design
Prediction of unseen new molecules
The acid test…
• Vascular endothelial growth factor receptor 2 tyrosine kinase (KDR)
• Inhibitors have oncology and ophthalmic indications
• Large dataset in ChEMBL
• 10-fold cross validated PLS model
• Selected model by minimised RMSEP
Compounds: 4466
Matched Pairs: 288100
Fragments: 678
Pharmacophore dyads: 787
RMSEP: 0.8
R2: 0.64
Y-scrambled R2: 0.0
ROC: 0.95
Geomean odds ratio: 80
[Figure: scatter plot of pIC50 against pIC50_pred for the KDR model.]
Novartis Predictions From Our Model
Domain of Applicability…
• Actual: 8.4 [1], Predicted: 7.5
• Actual: 7.6 [1], Predicted: 7.5
• Actual: 7.7 [2], Predicted: 7.1
• Actual: 9.0 [2], Predicted: Out of Domain
1. J. Med. Chem. (2016), Bold et al.
2. Med. Chem. Lett. (2016), Mainolfi et al.
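As a worked illustration of reading these results, the error can be summarised over the in-domain compounds only, with coverage reported separately. The values are the four quoted above; four compounds are far too few to characterise the model, so this is arithmetic on the example rather than a validation result.

```python
from math import sqrt

# (actual, predicted) potencies as quoted on the slide; None marks "Out of Domain".
predictions = [(8.4, 7.5), (7.6, 7.5), (7.7, 7.1), (9.0, None)]

# Only compounds inside the domain of applicability receive a prediction.
in_domain = [(actual, pred) for actual, pred in predictions if pred is not None]

rmse = sqrt(sum((actual - pred) ** 2 for actual, pred in in_domain) / len(in_domain))
coverage = len(in_domain) / len(predictions)

print(f"RMSE on in-domain compounds: {rmse:.2f}")
print(f"coverage: {coverage:.0%}")
```

Reporting coverage next to the error makes the trade-off explicit: a conservative method predicts fewer compounds but does not guess for the rest.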
Value of Potency prediction from MMPA:
Clear substructures enable rapid actions
Inputs: Compounds + data (Safety data, Potency data, HTS data)
Actions:
• Toxicity alerts
• Virtual Library prioritisation
• Virtual Library design
• Fragment set design
• Retest prioritisation
• Hit re-mining / analogue hunting
• Substructure modification
• Lead design
• Fast Follower design
[Figure: histogram of count against pIC50 marking the mean with pharmacophore and the mean without pharmacophore; 26 examples in training set.]
The MedChemica team
Andrew G Leach
Al Dossetter
Shane Montague
Lauren Reid*
Jess Stacey*
*Royal Society of Chemistry Industrial Placements Grant Scheme
A Collaboration of the willing
Craig Bruce OE
David Cosgrove GalCoz
Andy Grant★
Martin Harrison Elixir
Paul Faulder Elixir
Andrew Griffin Elixir
Huw Jones Base360
Al Rabow
David Riley AZ
Graeme Robb AZ
Attilla Ting AZ
Howard Tucker retired
Dan Warner Myjar
Steve St-Galley Sygnature
David Wood JBA Risk Management
Phil Jewsbury AZ
Mike Snowden AZ
Peter Sjo AZ
Martin Packer AZ
Manos Perros AZ
Nick Tomkinson AZ
Martin Stahl Roche
Jerome Hert Roche
Martin Blapp Roche
Torsten Schindler Roche
Paula Petrone Roche
John Cumming Roche
Jeff Blaney Genentech
Hao Zheng Genentech
Slaton Lipscomb Genentech
James Crawford Genentech

Kamel Mansouri
 
Determining stable ligand orientation
Determining stable ligand orientationDetermining stable ligand orientation
Determining stable ligand orientation
ijaia
 

Extracting actionable knowledge from large scale in vitro pharmacology data

  • 1. Ed Griffen, MedChemica Ltd Extracting actionable knowledge from large scale in vitro pharmacology data
  • 2. MedChemica. Why improve medicinal chemistry practice? For an aging population and emerging pathogens. “Eroom’s Law”: the cost of discovering a new drug has doubled every 9 years consistently for the last 60 years [1], i.e. cost rises by about 8%/year. Cost/Launch (2010): $873m; capitalised: $1.8Bn [2].
     [Figure: bar chart of cost in $million (axis 0 to 500) for cost/project, cost/launch and cost/launch (capitalized)]
     1. Scannell et al., Nature Reviews Drug Discovery (2012), 11, 191-200
     2. Paul et al., Nature Reviews Drug Discovery (2010), 9, 203-214
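The 8%/year figure follows from the 9-year doubling time if cost is assumed to grow at a steady compound rate r (a worked check, not part of the original slide):

```latex
(1 + r)^{9} = 2
\quad\Rightarrow\quad
r = 2^{1/9} - 1 = e^{\ln 2 / 9} - 1 \approx 0.080,
\qquad
2^{60/9} \approx 100
```

Under the same assumption, 60 years of doubling every 9 years corresponds to roughly a 100-fold rise in cost.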
  • 3. Actionable knowledge: critical information from which the user can immediately choose a course of action:
     • ADME: ways to ‘fix’ your molecule
     • Toxicology: substructures to avoid
     • Pharmacology: substructural leads built for practical design
  • 4. Better medicinal chemistry by combining knowledge
     [Diagram labels: Roche, Genentech, AZ, Pharma 4 and Pharma 5 each have their own data, rule finder and database; MedChemica holds the Grand Rule Database (ADMET rule database); each company has a Grand Rule Database for exploitation]
     >500 million pairs from companies + 12 million from public data
  • 5. Current knowledge sets: GRDv3. Numbers of statistically valid transforms.
     Grouped datasets | Number of rules
     logD7.4 | 153449
     Merged solubility | 46655
     In vitro microsomal clearance (human, rat, mouse, cyno, dog) | 88423
     In vitro hepatocyte clearance (human, rat, mouse, cyno, dog) | 26627
     MDCK permeability A-B / B-A efflux | 1852
     Cytochrome P450 inhibition (2C9, 2D6, 3A4, 2C19, 1A2) | 40605
     Cardiac ion channels (NaV 1.5, hERG ion channel inhibition) | 15636
     Glutathione stability | 116
     Plasma protein or albumin binding (human, rat, mouse, cyno, dog) | 64622
  • 6. Actionable knowledge: critical information from which the user can immediately choose a course of action:
     • ADME: ways to ‘fix’ your molecule
     • Toxicology: substructures to avoid
     • Pharmacology: substructural leads built for practical design
  • 7. Clear structural direction from Big Data. Example: Dopamine Transporter inhibitors (CHEMBL538405)
     What do I want?
     • Substructures associated with potency
     • Specificity of model
     • Predictions
     • Domain of Applicability
     pKi predicted: 8.6; measured: 9.1; mean with pharmacophore: 8.3; mean without: 6.7; n examples: 27; odds ratio vs ChEMBL: 407
     [Figure: histogram of count vs pIC50 (4 to 10), split by pharmacophore match (0 or 1)]
     [Figure: scatter plot of pIC50 vs pIC50_pred]
  • 8. Principles of Pharmacophore Extraction
     • Pharmacophores must be clear and understandable
     • Pharmacophore generation must be transparent to allow checking and validation
     • Use as much measured data as possible
     • Look for key elements influencing potency
     • Don’t base pharmacophores on a few compounds
     • Pharmacophore must be specific (not like phenyl + amine = hERG inhibitor)
     • Can be applied quickly (to large libraries)
     [Diagram labels: Cation, HyAr, HyAr]
     How do I actually use this?
  • 9. QSAR and Knowledge extraction. Model as filter or knowledge?
     [Diagram: descriptors plotted against method, shaded from Dark to Black]
     Descriptors: substructures; physical chemistry descriptors (Hansch, Taft, Fujita, Abraham); atomic, pair, triplet descriptors; indices
     Method: (M)LR; Free Wilson; PLS; Trees / Forests; SVM; Bayesian NN; Deep Learning
  • 10. Specific Pharmacophore extraction from MMPA
     • Identify key potency-giving changes by matched molecular pair analysis on large datasets
     • Extract fragments that are associated with potency
     • Find pairs of fragments and linkers that are specific to potent compound subsets
     • Result: easy-to-use 2D pharmacophores
     • 2 potency-enhancing fragments joined by a specific linker
     Model: 1470 compounds, CHEMBL339 Dopamine transporter
     Pharmacophore identification example: CHEMBL538405, pKi 9.1 [Structure labels: Fragment 1, Fragment 2, Linker]
     Predict potency and show pharmacophore match: pKi predicted: 8.6; measured: 9.1; mean with pharmacophore: 8.3; mean without: 6.7; n examples: 27; odds ratio vs ChEMBL: 407
     [Figure: histogram of count vs pIC50 (5 to 9), labelled “All D2 Compounds”]
     [Figure: scatter plot of pIC50 vs pIC50_pred]
     Workflow: Public Data → Find Matched Pairs → Find Potent Fragments → Find Pharmacophore dyads → Pharmacophores
  • 11. Matched Molecular Pairs: Advanced MMPA
     • Matched Molecular Pairs: molecules that differ only by a particular, well-defined structural transformation
     • Transformation with environment capture: MMPs can be recorded as transformations from A to B; the environment is essential to understand the chemistry
     • Statistical analysis: learn what effect the transformation has had on properties in the past
     [Diagram: molecules A and B with atom environment labels; Δ Data A-B]
     Griffen, E. et al. Matched Molecular Pairs as a Medicinal Chemistry Tool. Journal of Medicinal Chemistry. 2011, 54(22), pp. 7739-7750.
     Workflow: Public Data → Find Matched Pairs → Fragments
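The statistical-analysis step of slide 11 can be pictured as grouping pairs by transformation and summarising the property change. The sketch below is only an illustration: the transform labels, the pIC50 values and the `summarise_transforms` helper are hypothetical, and environment capture is reduced to a plain string key.

```python
from collections import defaultdict
from statistics import mean


def summarise_transforms(pairs):
    """Group matched pairs by transformation and summarise the property change.

    pairs: iterable of (transform, value_a, value_b), where value_a and value_b
    are the measured property for molecule A and molecule B of the pair.
    """
    deltas = defaultdict(list)
    for transform, value_a, value_b in pairs:
        deltas[transform].append(value_b - value_a)  # delta = B minus A
    return {t: {"n": len(d), "mean_delta": mean(d)} for t, d in deltas.items()}


# Hypothetical pairs: (transformation A>>B, pIC50 of A, pIC50 of B)
example_pairs = [
    ("H>>F", 6.0, 6.5),
    ("H>>F", 7.0, 7.25),
    ("Me>>Et", 5.0, 4.5),
]
summary = summarise_transforms(example_pairs)
print(summary)
```

A real analysis would add the environment around the changed atoms to the key and apply the significance tests described on the later slides before treating a mean change as a rule.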
  • 12. Matched pair methodology: MCSS and F&I each find different pairings
     Example pairs: A: CHEMBL156639, B: CHEMBL2387702; A: CHEMBL100461, B: CHEMBL103900
     [Figure: example pairings (structures not recoverable), labelled in order: MCSS ✓, F&I ✗; MCSS ✗, F&I ✓; MCSS ✓, F&I ✗; MCSS ✓, F&I ✗; MCSS ✗, F&I ✗; MCSS ✗, F&I ✗; MCSS ✗, F&I ✓; MCSS ✓, F&I ✗]
  • 13. Does the Matched Pair method really matter? Using only one technique will miss between 12% and 56% of pairings.
     Target | Number of compounds | Pairings: common | Pairings: FI only | Pairings: MCSS only | Pairings: total | FI only % | Common % | MCSS only %
     VEGF | 4466 | 14631 | 17172 | 14823 | 46626 | 37 | 31 | 32
     Dopamine Transporter | 1470 | 4480 | 8930 | 3497 | 16907 | 53 | 26 | 21
     GABAA | 848 | 2500 | 1722 | 4205 | 8427 | 20 | 30 | 50
     D2 human | 3873 | 12995 | 13811 | 13098 | 39904 | 35 | 33 | 33
     D2 rat | 1807 | 5408 | 6595 | 7346 | 19349 | 34 | 28 | 38
     Acetylcholine esterase | 383 | 536 | 725 | 1434 | 2695 | 27 | 20 | 53
     Monoamine oxidase | 264 | 653 | 1156 | 246 | 2055 | 56 | 32 | 12
     min | | | | | | 20 | 20 | 12
     max | | | | | | 56 | 33 | 53
     [Figure: chart of FI only, MCSS only and common pairings]
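The "12% to 56%" claim on slide 13 can be checked from the pair counts: pairings found only by MCSS are missed when FI is used alone, and vice versa. The counts below are copied from three rows of the slide 13 table; the `missed_percent` helper is a hypothetical name used for illustration.

```python
# (common, FI only, MCSS only) pair counts taken from the slide 13 table
PAIR_COUNTS = {
    "VEGF": (14631, 17172, 14823),
    "Dopamine Transporter": (4480, 8930, 3497),
    "Monoamine oxidase": (653, 1156, 246),
}


def missed_percent(common, fi_only, mcss_only):
    """Percentage of all pairings missed when only one method is used."""
    total = common + fi_only + mcss_only
    return {
        "missed_using_FI_alone": round(100 * mcss_only / total),
        "missed_using_MCSS_alone": round(100 * fi_only / total),
    }


missed = {target: missed_percent(*counts) for target, counts in PAIR_COUNTS.items()}
for target, result in missed.items():
    print(target, result)
```

For monoamine oxidase this gives the two extremes quoted on the slide: 12% missed with FI alone and 56% missed with MCSS alone.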
  • 14. Mining transform sets to find potent fragments
     Identify the ‘A’ fragments associated with a significant number of potency-decreasing changes, irrespective of what they are replaced with: ‘A’ is ‘better than anything you replace it with’.
     [Diagram: Fragment A replaced by Fragments B, C, D, E, F; change in binding measurement; labels +2.1, +2.2, +1.4, +0.4, +1.8]
     Statistics:
     • A one-tailed binomial test with Holm-Bonferroni correction at 95% confidence identifies potent fragments
     • Compare the mean of the compounds that contain the fragment with the mean of the remaining compounds
     [Figure: pKi/pIC50 distributions for compounds containing the potent fragment vs remaining compounds]
     Effect size = Cohen’s d: $d = \frac{m_A - m_B}{s'}$, where $s' = \sqrt{\frac{s_A^2 + s_B^2}{2}}$
     Cohen’s d effect sizes: Large ≥ 0.8; Medium 0.5 to 0.8; Small 0.2 to 0.5; Trivial 0.1 to 0.2; No effect < 0.1
     Workflow: Public Data → Find Matched Pairs → Find Potent Fragments
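A minimal sketch of the Cohen's d effect size from slide 14, using the root-mean-square of the two group standard deviations as the denominator and the effect-size bands listed on the slide. The pIC50 values and function names are hypothetical, and the use of sample variances (n - 1 denominator) is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with s' = sqrt((s_A^2 + s_B^2) / 2), using sample variances."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


def effect_label(d):
    """Map |d| onto the bands quoted on the slide."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    if size >= 0.1:
        return "trivial"
    return "no effect"


# Hypothetical pIC50 values, for illustration only
with_fragment = [8.0, 9.0, 10.0]
without_fragment = [5.0, 6.0, 7.0]
d = cohens_d(with_fragment, without_fragment)
print(d, effect_label(d))
```

Both toy groups have a standard deviation of 1 and their means differ by 3, so d comes out at 3, well inside the "large" band.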
  • 15. Mining transform sets to find destructive fragments
     Identify the ‘Z’ fragments associated with a significant number of potency-increasing changes, irrespective of what they are replaced with: ‘Z’ is ‘worse than anything you replace it with’.
     [Diagram: Fragment Z and its replacements; change in binding measurement; labels +2.7, +3.2, +0.6, +0.6]
     [Figure: pKi/pIC50 distributions for compounds containing the destructive fragment vs remaining compounds]
     Workflow: Public Data → Find Matched Pairs → Find Potent Fragments
  • 16. Mining transform sets to find influential fragments
     • Identify the ‘Z’ fragments associated with a significant number of potency-increasing changes, irrespective of what they are replaced with: ‘Z’ is ‘worse than anything you replace it with’. [Diagram labels: +2.7, +3.2, +0.6, +0.6]
     • Identify the ‘A’ fragments associated with a significant number of potency-decreasing changes, irrespective of what they are replaced with: ‘A’ is ‘better than anything you replace it with’. [Diagram labels: +2.1, +2.2, +1.4, +0.4, +1.8]
     [Figure: pKi/pIC50 distributions for compounds with the destructive fragment vs compounds with constructive fragments]
     Workflow: Public Data → Find Matched Pairs → Find Potent Fragments
  • 17. Building Pharmacophores from potent Fragments
     But individual fragments are small and often non-specific, so:
     • Permute all the pairs of fragments and find the shortest path between them in the training set (pharmacophore dyads)
     • The shortest path between them encodes distance and geometry
     • Select pharmacophore dyads with PLS to identify the dyads that explain most of the potency
     • Check for significance and effect size with Cohen’s d and Welch’s t-test
     • But what about specificity?
     [Diagram labels: Fragment 1, Path, Fragment 2; path example [CH2]CN]
     Workflow: Public Data → Find Matched Pairs → Find Potent Fragments → Find Pharmacophore dyads → Pharmacophores
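The "shortest path between two fragments" step of slide 17 can be sketched as a breadth-first search over the molecular graph. This is an assumption about one way to do it, not the author's implementation: the molecule is a hypothetical adjacency dictionary of atom indices, and `shortest_linker` is an illustrative name.

```python
from collections import deque


def shortest_linker(adjacency, frag_a, frag_b):
    """Return the shortest atom path from any atom in frag_a to any atom in frag_b.

    adjacency: dict mapping atom index -> list of bonded atom indices.
    Returns a list of atom indices (first in frag_a, last in frag_b), or None.
    """
    frag_a, frag_b = set(frag_a), set(frag_b)
    queue = deque(sorted(frag_a))            # multi-source breadth-first search
    parent = {atom: None for atom in frag_a}
    while queue:
        atom = queue.popleft()
        if atom in frag_b:
            path = []
            while atom is not None:          # walk back to the starting fragment
                path.append(atom)
                atom = parent[atom]
            return path[::-1]
        for neighbour in adjacency[atom]:
            if neighbour not in parent:
                parent[neighbour] = atom
                queue.append(neighbour)
    return None


# Hypothetical molecule: chain 0-1-2-3-4-5 plus a longer detour 2-6-7-5
toy_molecule = {
    0: [1], 1: [0, 2], 2: [1, 3, 6], 3: [2, 4],
    4: [3, 5], 5: [4, 7], 6: [2, 7], 7: [6, 5],
}
path = shortest_linker(toy_molecule, frag_a={0, 1}, frag_b={4, 5})
linker_atoms = path[1:-1]
print(path, linker_atoms)
```

In the toy graph the direct route through atoms 2 and 3 is shorter than the detour through atoms 6 and 7, so those two atoms form the linker between the fragments.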
  • 18. Testing for specificity: pharmacophores
     • How selective is the pharmacophore?
     • What are the odds of it hitting a molecule in the test set vs CHEMBL?
     • Odds of finding in potency set = n(pharmacophore hits in potency set) / n(in potency set)
     • Odds of finding in CHEMBL = n(pharmacophore hits in CHEMBL not in potency set) / n(in CHEMBL)
     • Odds ratio = selectivity = (odds of finding in potency set) / (odds of finding in CHEMBL, not potency set)
     Example: (27/1470) / (62/1351211) = 407 (95% confidence limits: 259-642)
     The odds of hitting a potent compound are 407 times greater than for a random compound in CHEMBL.
     [Diagram labels: Fragment 1, Path, Fragment 2; path example [CH2]CN]
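The slide 18 figures can be reproduced with a short calculation. The slide does not say how the confidence limits were obtained, so this sketch assumes the standard Wald interval on the log odds ratio and takes odds as hits to non-hits; the function name is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(hits_in, n_in, hits_out, n_out, z=1.96):
    """Odds ratio (hits : non-hits) with a Wald confidence interval on the log scale."""
    a, b = hits_in, n_in - hits_in        # potency set: hits, non-hits
    c, d = hits_out, n_out - hits_out     # background set: hits, non-hits
    ratio = (a / b) / (c / d)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(ratio) - z * se_log)
    upper = exp(log(ratio) + z * se_log)
    return ratio, lower, upper


# Counts from the slide: 27 hits in 1470 potency-set compounds,
# 62 hits in 1351211 ChEMBL compounds outside the potency set
ratio, lower, upper = odds_ratio_with_ci(27, 1470, 62, 1351211)
print(round(ratio), round(lower), round(upper))
```

With odds taken as hits to non-hits, this gives an odds ratio of about 408 and an interval of roughly 259 to 643, in line with the quoted 407 (259-642). Dividing the two hit fractions directly, as the expression on the slide is written, gives about 400 instead.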
  • 19. How specific is a Pharmacophore? What does a bad odds ratio look like?
     Early simple hERG model: a lipophilic base, usually a tertiary amine; X = 2-5 atom chain, which may include rings, heteroatoms or polar groups [Structure labels: X, N, R1, R2]; e.g. sertindole: 14 nM vs hERG
     SMARTS: [$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~*~c),$([NX3;H2,H1,H0;!$(N[C,S]=[O,N])]~*~*~*~*~*~c)]
     What is the odds ratio?
     • Found in CHEMBL: 565658/1352681
     • Found in CHEMBL240 (hERG) where pIC50 ≥ 5: 1985/2451
     • OR = (1985/2451) / (565658/1352681) = 0.81 / 0.42 = 1.94 (95% conf 1.83-2.05)
     Ar-linker-base has only been found 1.9x more often in hERG inhibitors than at random in ChEMBL.
  • 20. Domain of applicability
    “Whereof one cannot speak, thereof one must be silent.”[1]
    Claiming to have extracted knowledge, or making a prediction, when we know we don’t have enough evidence is:
    • Delusional
    • Dangerous: it would be more productive to act on a different hypothesis or at random
    • Degrading to the use of rational analysis at all
    Compound activity prediction should have three classes of output:
    • Active
    • Inactive
    • Out of domain: no prediction possible
    Only fragments with a sufficient evidential base are used to form pharmacophore dyads. In turn, only pharmacophore dyads that have enough support are used in the model.
    1. Wittgenstein, Tractatus Logico-Philosophicus, 1922
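A three-class output of this kind could be sketched as follows. Everything specific here is an assumption made for illustration: the rule that a compound matching no supported dyad is out of domain, the additive scoring, the activity threshold of pIC50 6.0, and the dyad names and weights are all hypothetical and not taken from the slides.

```python
def classify_compound(matched_dyads, dyad_weights, intercept, active_threshold=6.0):
    """Return (label, predicted pIC50), where label is one of
    'Active', 'Inactive' or 'Out of domain'.

    matched_dyads: set of dyad identifiers found in the query compound.
    dyad_weights: dict of dyad identifier -> model coefficient (hypothetical).
    """
    supported = {d for d in matched_dyads if d in dyad_weights}
    if not supported:
        # No supported pharmacophore dyad matches: decline to predict.
        return "Out of domain", None
    predicted = intercept + sum(dyad_weights[d] for d in supported)
    label = "Active" if predicted >= active_threshold else "Inactive"
    return label, predicted

# Hypothetical model with two supported dyads.
weights = {"dyad_A": 1.2, "dyad_B": -0.8}
result_active = classify_compound({"dyad_A"}, weights, intercept=5.5)
result_inactive = classify_compound({"dyad_B"}, weights, intercept=5.5)
result_unknown = classify_compound({"dyad_X"}, weights, intercept=5.5)
```

Returning `None` for the predicted value keeps an out-of-domain compound from being mistaken for a low-potency one downstream.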
  • 21. Model activity from presence of pharmacophores
    • Identify and group fragment SMARTS from MMPA
    • If n ≥ 8, perform a one-tailed binomial test with Holm-Bonferroni adjustment
    • Remove non-significant ‘Biophores’
    • Compare the mean of the compounds containing the biophore with the mean of the remaining compounds for significance (Welch’s t-test) and effect size (Cohen’s d)
    • Enumerate all pairs of significant biophores and determine the shortest paths between them in the training set = pharmacophore dyads
    • Select pharmacophore dyads with n ≥ 6 examples
    • Use presence/absence of each pharmacophore dyad as an indicator variable in PLS modelling
    [Figure: histograms of count vs pIC50, split by Matches (0, 1); Dopamine Transporter +/- pharmacophores]
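The binomial screening step in the workflow above might look something like the following sketch. The slide does not say what the null proportion is or how a "success" is defined, so the background rate of 0.5, the 0.05 cut-off and the fragment counts below are assumptions chosen purely for illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """One-tailed p-value: P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

def holm_bonferroni(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical fragments: (name, potent examples, total examples).
fragments = [("frag_1", 9, 10), ("frag_2", 5, 8), ("frag_3", 12, 12), ("frag_4", 3, 5)]
background_rate = 0.5   # assumed null proportion of potent compounds
min_examples = 8        # the n >= 8 requirement from the slide

tested = [(name, k, n) for name, k, n in fragments if n >= min_examples]
raw_p = [binomial_upper_tail(k, n, background_rate) for _, k, n in tested]
adj_p = holm_bonferroni(raw_p)
significant = [name for (name, _, _), p in zip(tested, adj_p) if p < 0.05]
```

Fragments with too few examples are dropped before the correction, so they do not inflate the number of hypotheses being adjusted for.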
  • 22. Modelling critical safety targets
    | Target | Class | Effect | Number of compounds |
    |---|---|---|---|
    | Acetylcholinesterase - human | Enzyme | CV: drop in BP, drop in HR, bronchoconstriction | 383 |
    | β1 adrenergic receptor | GPCR | CV: change in HR, BP, bronchodilation, vasodilation, tremor | 505 |
    | Androgen receptor | NHR | Endocrine: agonism: androgenicity / gynecomastia, prostate / breast carcinoma | 1064 |
    | CB1 cannabinoid receptor | GPCR | CNS: euphoria, dysphoria, anxiety, memory impairment, analgesia, hypothermia, weight loss, emesis, depression | 1104 |
    | CB2 cannabinoid receptor | GPCR | Increased inflammation | 1112 |
    | Dopamine D2 receptor - human | GPCR | CNS: hallucinations, drowsiness, confusion, emesis; CV: drop in heart rate | 3873 |
    | Dopamine D2 receptor - rat | GPCR | As human | 1807 |
    | Dopamine transporter | Transporter | CNS: addictive psychostimulation, depression, parkinsonism, seizures | 1470 |
    | GABA A receptor | Ion channel | CNS: anxiolysis, ataxia, sedation, depression, amnesia | 848 |
    | hERG ion channel | Ion channel | CV: QT prolongation | 4189 |
    | 5HT2a receptor | GPCR | CNS: drop in body temperature, anxiogenic | 642 |
    | Monoamine oxidase | Enzyme | CV: increase in BP, DDI potential; CNS: dizziness, nausea | 264 |
    | Muscarinic acetylcholine receptor M1 | GPCR | CNS: proconvulsant, drop in cognitive function, vision impairment | 628 |
    | μ opioid receptor | GPCR | CNS: sedation, abuse liability, respiratory depression, hypothermia | 1128 |
    1. J. Bowes, A. J. Brown, J. Hamon, W. Jarolimek, A. Sridhar, G. Waldron, and S. Whitebread, “Reducing safety-related drug attrition: the use of in vitro pharmacological profiling,” Nat. Rev. Drug Discov., vol. 11, no. 12, pp. 909–922, Nov. 2012
  • 23. Modelling critical safety targets
    • Build models using 10-fold cross-validated PLS
    • Assess using ROC / BEDROC, R2 vs 100-fold y-scrambled R2, and geomean odds ratio
    | Target | Number of compounds | Number of compound pairs | Number of fragments | Number of pharmacophore dyads after filtering | R2 | RMSEP | ROC | Odds ratio (geomean) |
    |---|---|---|---|---|---|---|---|---|
    | Acetylcholinesterase - human | 383 | 27755 | 44 | 10 | 0.43 | 1.57 | 0.80 | 4 |
    | β1 adrenergic receptor | 505 | 145447 | 276 | 313 | 0.64 | 0.70 | 0.96 | 833 |
    | Androgen receptor | 1064 | 113163 | 186 | 46 | 0.47 | 0.77 | 0.86 | 140 |
    | CB1 cannabinoid receptor | 1104 | 88091 | 165 | 90 | 0.61 | 1.02 | 0.87 | 96 |
    | CB2 cannabinoid receptor | 1112 | 82130 | 194 | 158 | 0.19 | 0.85 | 0.64 | 5.7 |
    | Dopamine D2 receptor - human | 3873 | 230962 | 483 | 602 | 0.42 | 0.88 | 0.69 | 110 |
    | Dopamine D2 receptor - rat | 1807 | 118736 | 267 | 377 | 0.29 | 0.85 | 0.78 | 125 |
    | Dopamine transporter | 1470 | 106969 | 282 | 336 | 0.58 | 0.73 | 0.88 | 141 |
    | GABA A receptor | 848 | 39494 | 106 | 167 | 0.70 | 0.76 | 0.97 | 560 |
    | hERG ion channel | 4189 | 242261 | 392 | 76 | 0.61 | 0.96 | 0.92 | 55 |
    | 5HT2a receptor | 642 | 50870 | 197 | 267 | 0.61 | 0.59 | 0.83 | 600 |
    | Monoamine oxidase | 264 | 15439 | 44 | 11 | 0.12 | 1.25 | 0.48 | 181 |
    | Muscarinic acetylcholine receptor M1 | 628 | 48200 | 97 | 510 | 0.62 | 0.94 | 0.89 | 48 |
    | μ opioid receptor | 1128 | 37184 | 33 | 11 | 0.69 | 1.30 | 0.87 | 81 |
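The y-scrambling check named on this slide compares a model's R2 with the R2 obtained after the response values are shuffled. The sketch below illustrates the idea with a deliberately simple stand-in model (predicting the group mean for dyad present or absent), not PLS, and with in-sample rather than cross-validated R2. The data, seeds and function names are hypothetical.

```python
import random
from statistics import fmean

def indicator_r2(present, y):
    """R2 of a model that predicts the group mean for dyad present / absent."""
    with_dyad = [v for p, v in zip(present, y) if p]
    without = [v for p, v in zip(present, y) if not p]
    mean_with, mean_without = fmean(with_dyad), fmean(without)
    grand_mean = fmean(y)
    ss_res = sum((v - (mean_with if p else mean_without)) ** 2 for p, v in zip(present, y))
    ss_tot = sum((v - grand_mean) ** 2 for v in y)
    return 1 - ss_res / ss_tot

def y_scrambled_r2(present, y, n_rounds=100, seed=0):
    """Mean R2 after shuffling the response n_rounds times."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        scores.append(indicator_r2(present, shuffled))
    return fmean(scores)

# Hypothetical data: 20 compounds with the dyad (pIC50 about 8), 20 without (about 6).
data_rng = random.Random(1)
present = [1] * 20 + [0] * 20
pic50 = ([data_rng.gauss(8.0, 0.5) for _ in range(20)]
         + [data_rng.gauss(6.0, 0.5) for _ in range(20)])

real_r2 = indicator_r2(present, pic50)
scrambled_r2 = y_scrambled_r2(present, pic50)
```

A model whose real R2 is not clearly above its scrambled R2 is fitting noise; that appears to be the situation the slide's Monoamine oxidase row (R2 0.12) is at risk of.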
  • 24. Toxophore examples: detailed, specific and transparent
    | Target | Actual | Predicted | Mean with | Mean without | Odds ratio |
    |---|---|---|---|---|---|
    | Dopamine D2 receptor - human | 9.5 | 9.1 | 8.0 | 6.6 | 340 |
    | Dopamine transporter | 9.1 | 8.6 | 8.3 | 6.7 | 407 |
    | GABA-A | 9.0 | 8.7 | 8.0 | 6.8 | 1506 |
    | β1 adrenergic receptor | 7.8 | 7.7 | 6.5 | 5.7 | 1501 |
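The "mean with" and "mean without" comparison is the one the deck assesses with Welch's t-test and Cohen's d. A minimal sketch of both statistics follows; the pIC50 values are invented for illustration and are not from the slides, and the p-value lookup is left out because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import fmean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t_stat = (fmean(sample_a) - fmean(sample_b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va**2 / (na - 1) + vb**2 / (nb - 1))
    return t_stat, dof

def cohens_d(sample_a, sample_b):
    """Cohen's d using the pooled standard deviation."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    return (fmean(sample_a) - fmean(sample_b)) / sqrt(pooled_var)

# Hypothetical pIC50 values, not taken from the slides.
with_pharmacophore = [8.3, 7.9, 8.6, 8.1, 7.6, 8.4]
without_pharmacophore = [6.5, 6.9, 6.2, 7.1, 6.6, 6.4, 6.8, 6.3]
t_stat, dof = welch_t(with_pharmacophore, without_pharmacophore)
d = cohens_d(with_pharmacophore, without_pharmacophore)
```

Welch's form is the natural choice when the two groups differ in size and spread, which is typical when a pharmacophore is present in only a small subset of compounds.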
  • 25. Safety target conclusions
    • We can model safety-critical targets and extract both predictive models and useful ligand structural information
    • Clear areas to action
    • Clearly defined domain of applicability
    • No prediction where there is insufficient evidence (conservative method)
    • The method relies on having large data sets (>= 500 data points)
    • MMPA is the computationally intensive phase
    • But of course molecules only need pairing once…
  • 26. Actionable knowledge
    Critical information that the user can immediately choose a course of action from:
    • ADME: ways to ‘fix’ your molecule
    • Toxicology: substructures to avoid
    • Pharmacology: substructural leads built for practical design
  • 27. Prediction of unseen new molecules: the acid test…
    • Vascular endothelial growth factor receptor 2 tyrosine kinase (KDR)
    • Inhibitors have oncology and ophthalmic indications
    • Large dataset in ChEMBL
    • 10-fold cross-validated PLS model
    • Model selected by minimised RMSEP
    | Metric | Value |
    |---|---|
    | Compounds | 4466 |
    | Matched pairs | 288100 |
    | Fragments | 678 |
    | Pharmacophore dyads | 787 |
    | RMSEP | 0.8 |
    | R2 | 0.64 |
    | Y-scrambled R2 | 0.0 |
    | ROC | 0.95 |
    | Geomean odds ratio | 80 |
    [Figure: scatter plot of pIC50_pred against pIC50]
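RMSEP from k-fold cross-validation, the quantity used on this slide to select the model, can be computed along the following lines. This is a sketch only: the model here is a trivial stand-in (training-set mean for dyad present or absent) rather than PLS, the fold assignment by index stride is an arbitrary choice, and the data and function names are hypothetical.

```python
from math import sqrt
from statistics import fmean

def kfold_rmsep(x, y, fit, predict, k=10):
    """Root mean squared error of prediction from k-fold cross-validation."""
    n = len(y)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))   # every k-th compound is held out
        train_idx = [i for i in range(n) if i not in test_idx]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        squared_errors.extend((predict(model, x[i]) - y[i]) ** 2 for i in test_idx)
    return sqrt(fmean(squared_errors))

# Stand-in model (not PLS): predict the training mean for dyad present / absent.
def fit_group_means(x_train, y_train):
    overall = fmean(y_train)
    means = {}
    for flag in (0, 1):
        values = [v for f, v in zip(x_train, y_train) if f == flag]
        means[flag] = fmean(values) if values else overall
    return means

def predict_group_mean(model, flag):
    return model[flag]

# Hypothetical data: indicator for one dyad and a measured pIC50 per compound.
x = [1, 0] * 15
y = [(8.0 if flag else 6.0) + 0.1 * (i % 3) for i, flag in enumerate(x)]
rmsep = kfold_rmsep(x, y, fit_group_means, predict_group_mean, k=10)
```

To select between candidate models in this style, one would compute `kfold_rmsep` for each and keep the one with the smallest value.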
  • 28. Novartis predictions from our model: domain of applicability…
    • Actual: 8.4[1], predicted: 7.5
    • Actual: 7.6[1], predicted: 7.5
    • Actual: 7.7[2], predicted: 7.1
    • Actual: 9.0[2], predicted: out of domain
    1. Bold et al., J. Med. Chem. (2016)
    2. Mainolfi et al., Med. Chem. Lett. (2016)
  • 29. Value of potency prediction from MMPA: clear substructures enable rapid actions
    Compounds + data: safety data, potency data, HTS data
    Uses: toxicity alerts, virtual library prioritisation, virtual library design, fragment set design, retest prioritisation, hit re-mining / analogue hunting, substructure modification, lead design, fast follower design
    [Figure: histogram of count vs pIC50 marking mean without pharmacophore and mean with pharmacophore; 26 examples in training set]
  • 30. The MedChemica team
    Andrew G Leach, Al Dossetter, Shane Montague, Lauren Reid*, Jess Stacey*
    *Royal Society of Chemistry Industrial Placements Grant Scheme
  • 31. A collaboration of the willing
    Craig Bruce (OE), David Cosgrove (GalCoz), Andy Grant★, Martin Harrison (Elixir), Paul Faulder (Elixir), Andrew Griffin (Elixir), Huw Jones (Base360), Al Rabow, David Riley (AZ), Graeme Robb (AZ), Attilla Ting (AZ), Howard Tucker (retired), Dan Warner (Myjar), Steve St-Galley (Sygnature), David Wood (JBA Risk Management), Phil Jewsbury (AZ), Mike Snowden (AZ), Peter Sjo (AZ), Martin Packer (AZ), Manos Perros (AZ), Nick Tomkinson (AZ), Martin Stahl (Roche), Jerome Hert (Roche), Martin Blapp (Roche), Torsten Schindler (Roche), Paula Petrone (Roche), John Cumming (Roche), Jeff Blaney (Genentech), Hao Zheng (Genentech), Slaton Lipscomb (Genentech), James Crawford (Genentech)

Editor's Notes

  1. That’s an 8% cost increase per year, and nowhere has budget increases of 8% per year…
  2. qplot(x=pIC50,data=s,geom="histogram",fill=I("blue"))
  3. We may be at the summit, but who can tell? And what is around us? Alternatively, we may want a completely clear view of potential cliffs and valleys, but by the time you get there so much has been published that compounds are probably in the clinic, if not on the market. Of course, there may still be opportunities.
  4. The 4 bottom-left structures contain a raft of problems for F&I and MCSS. F&I probably won’t capture anything, as the indole is smaller than the phenyl. MCSS fails to recognize the amido pyridine to indole change, as the exo NH is aliphatic and the indole NH aromatic; the cyclic amide is matched, but it won’t match the indole… The other sets show the strength of F&I in finding linker and core changes, but its weakness in finding simple changes in macrocycles.
  5. It’s usually downhill from A
  6. It’s usually downhill from A
  7. It’s usually downhill from A
  8. Fragments may be separate, joined or overlapping (but one may not be a subset of the other). PLS is partial least squares, a regression technique that copes with sparse data matrices in which the descriptors may be correlated.
  9. library(ggplot2)
     # Read the per-compound table of pIC50 values and pharmacophore matches
     s <- read.csv("/Users/Ed_Griffen/Dropbox (MedChemica)/MedChemica_Team_Folder/toxophore_finding/pharmacophore_analysis/kinase_analysis/DopTrans/DopTrans_pharm_present.csv")
     s$Matches <- as.factor(s$Matches)
     p <- ggplot(data=s)
     # Overlaid histograms coloured by match status, then one panel per match status
     qplot(x=pIC50, data=s, geom="histogram", color=Matches, fill=Matches)
     qplot(x=pIC50, data=s, geom="histogram", color=Matches, fill=Matches, facets=Matches ~ .)
     Note: errors are more significant
  10. library(ggplot2)
     s <- read.csv("/Users/Ed_Griffen/Desktop/VEGF_test.csv")
     p <- ggplot(data=s)
     # Histogram of pIC50 with reference lines at 6.4 (red) and 8.64 (green)
     qplot(x=pIC50, data=s, geom="histogram", fill=I("blue"), bins=10) +
       geom_vline(xintercept = 6.4, colour = "red") +
       geom_vline(xintercept = 8.64, colour = "green")