Medicinal Chemistry Due Diligence: Computational Predictions of an expert’s evaluation of the NIH chemical probes


• SaaS, easy to use
• Used by academia, industry and biotech
• Private, with selective collaboration
• 100s of published datasets

MM4TB: 25 organizations
Copyright © 2013 All Rights Reserved Collaborative Drug Discovery
New
Old
Neuroscience
Kinetoplastid Drug Development Consortium

• The NIH spent a decade funding HTS efforts as part of the MLSCN and MLPCN
• By 2010, $576.6M in funding
• Various definitions of a probe
• Potency, selectivity, solubility and availability
• Little has been done to learn from this work

• Lajiness et al.: 13 chemists assessed 22,000 compounds (in lists of about 2000 each) for drug- or lead-likeness
• They were not consistent in rejecting undesirable compounds
• (J Med Chem 2004, 47: 4891-6)
• Hack et al.: 145 chemists asked to fill holes in a screening library
• (J Chem Inf Model 2011, 51: 3275-86)
• Kutchukian et al.: medicinal chemists surveyed on selecting fragments for a lead
• Lack of consensus in compound selection
• (PLOS ONE 2012, 7: e48476)
• Since the rule of 5 there has been a considerable focus on more rules: ALERTS, PAINS, QED, BadApple, etc.

• But do we really need a crowd?
• Could one medicinal chemist be enough?
• >40 years of experience

• Chris Lipinski scored the original 64 compounds; he was close to the median
• More probes have been found since 2009
• He has now scored more than 300 NIH probes for desirability
• Extensive due diligence
▪ Based on literature (public/private)
▪ Chemical reactivity

• 79% of the 322 probes are desirable

[Figure: representative molecules of different classes from public and commercial databases: ML010 (CID 17757274); valsartan (CID 60846); CAS 1164083-19-5; US20120040982 (CID 57498937); ML160 (CID 824820)]
(CID 824820)

• Properties from CDD
• Properties from Discovery Studio
• Higher molecular weight, more rotatable bonds and more heavy atoms are associated with desirable probes

[Figure legends: yellow = desirable, blue = undesirable; yellow = chemical probes, blue = MicroSource Spectrum compounds]

• Desirable probes are less likely to be filtered as promiscuous by PAINS or BadApple than those scored as undesirable
• (Fisher's exact test, p<0.0001 for PAINS and p=0.04 for BadApple)
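
The PAINS and BadApple comparison above rests on Fisher's exact test applied to a 2x2 table (desirable vs. undesirable, flagged vs. not flagged). The following is a minimal sketch of a two-sided version of that test in plain Python. The function name and the counts are hypothetical illustrations, not the study's data, and the slide does not say which software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalised hypergeometric weight of the table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    tail = sum(weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed)
    return tail / total


# Hypothetical counts, NOT taken from the study.
# Rows: desirable / undesirable. Columns: flagged / not flagged by a filter.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

Integer weights are compared rather than floating-point probabilities, so ties between equally likely tables are handled exactly.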

• 322 NIH MLP probes clustered into 44 groups using ECFP_6 fingerprints, with a Tanimoto similarity threshold of >0.11 for cluster membership
• Blue: desirable
• Red: undesirable
• Circle area is proportional to cluster size, and singletons are represented as a dot
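
Clustering by Tanimoto similarity, as described above, can be sketched as follows. The slide does not say which clustering algorithm was used, so the simple leader-style grouping here is an assumption chosen for brevity, and the toy fingerprints are hypothetical sets of feature integers rather than real ECFP_6 output.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of integers."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def leader_cluster(fingerprints, threshold=0.11):
    """Assign each fingerprint to the first cluster whose leader it resembles."""
    clusters = []  # list of (leader fingerprint, member indices)
    for index, fp in enumerate(fingerprints):
        for leader, members in clusters:
            if tanimoto(fp, leader) > threshold:
                members.append(index)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            clusters.append((fp, [index]))
    return [members for _, members in clusters]


# Hypothetical toy fingerprints (sets of hashed feature identifiers).
toy = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {20, 21}]
for members in leader_cluster(toy):
    print(members)
```

A molecule that matches no existing leader becomes a singleton, which corresponds to the dots in the cluster plot.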

Drug discovery is repetitive and there are 1000s of diseases
Drug discovery is high risk
Do we need robots or just smarter programs that discover the ideas we test?

• What would happen if we could model Chris's decisions?
NIH probes
• Potential for other non-medicinal chemists to benefit
• Streamline scoring of compounds, save time

• FCFP_6 descriptors + 8 simple descriptors
• Bayesian models validated by leaving out 50% of the data, repeated 100 times
• 5-fold cross-validation for the n307 models
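
To make the Bayesian modelling step concrete, here is a minimal sketch of a Laplacian-corrected naive Bayes scorer over fingerprint features, a variant commonly paired with FCFP_6 descriptors. The slides do not give the exact formulation, so treat the weighting below as an assumption; the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log


def train_laplacian_bayes(fingerprints, labels):
    """Return a per-feature weight: log((A_f + 1) / (N_f * base_rate + 1))."""
    base_rate = sum(labels) / len(labels)
    seen = Counter()         # N_f: molecules containing feature f
    seen_active = Counter()  # A_f: desirable molecules containing feature f
    for fp, label in zip(fingerprints, labels):
        for feature in set(fp):
            seen[feature] += 1
            if label:
                seen_active[feature] += 1
    return {f: log((seen_active[f] + 1) / (seen[f] * base_rate + 1)) for f in seen}


def score(weights, fp):
    """Sum the weights of known features; unseen features contribute nothing."""
    return sum(weights.get(feature, 0.0) for feature in set(fp))


# Hypothetical toy data: 1 = scored desirable, 0 = scored undesirable.
train_fps = [{1, 2}, {1, 3}, {4, 5}, {4, 6}]
train_labels = [1, 1, 0, 0]
weights = train_laplacian_bayes(train_fps, train_labels)
print(score(weights, {1, 7}), score(weights, {4}))
```

Features seen mostly in desirable molecules get positive weights and features seen mostly in undesirable ones get negative weights, so a higher summed score suggests a more desirable compound.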

• The colors on the heat map correspond to the value of the indicated metric for each probe, listed vertically.
• The scale was normalized internally, with green corresponding to the optimal condition within each metric.

Models reside in papers and are not accessible… this is undesirable
How do we share them?
How do we use them?

Open Extended Connectivity Fingerprints
ECFP_6 FCFP_6
• Collected, deduplicated, hashed
• Sparse integers
• Invented for Pipeline Pilot: public method, proprietary details
• Often used with Bayesian models: many published papers
• Built a new implementation: open source, Java, CDK
– stable: fingerprints don't change with each new toolkit release
– well defined: easy to document precise steps
– easy to port: already migrated to iOS (Objective-C) for TB Mobile app
• Provides core basis feature for CDD open source model service
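
The "collected, deduplicated, hashed" idea behind circular fingerprints can be sketched as below. This is a deliberately simplified illustration, not the CDK implementation mentioned above: the atom invariants are just atomic numbers, the hash function (CRC32) is an assumption made here for determinism, and the substructure-level duplicate removal of real ECFP is omitted.

```python
import zlib


def circular_fingerprint(atoms, bonds, diameter=6):
    """atoms: initial integer invariant per atom; bonds: (i, j, order) tuples."""
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbours[i].append((order, j))
        neighbours[j].append((order, i))

    def hash_ints(values):
        # Stable 32-bit hash of a list of integers.
        return zlib.crc32(",".join(str(v) for v in values).encode("ascii"))

    identifiers = [hash_ints([invariant]) for invariant in atoms]
    collected = set(identifiers)  # the set removes duplicate identifiers
    for iteration in range(1, diameter // 2 + 1):
        updated = []
        for i, ident in enumerate(identifiers):
            # Sort the environment so the result ignores atom numbering.
            env = sorted((order, identifiers[j]) for order, j in neighbours[i])
            flat = [iteration, ident] + [v for pair in env for v in pair]
            updated.append(hash_ints(flat))
        identifiers = updated
        collected.update(identifiers)
    return sorted(collected)  # sparse list of integers


# Hypothetical toy input: ethanol heavy atoms C-C-O, invariant = atomic number.
ethanol = circular_fingerprint([6, 6, 8], [(0, 1, 1), (1, 2, 1)])
print(len(ethanol), "features")
```

A diameter of 6 corresponds to three rounds of neighbourhood expansion, which is what the `_6` suffix in ECFP_6 and FCFP_6 refers to.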

Data + One Click =
Uses Bayesian algorithm and FCFP_6 fingerprints

• Rebuilt the n307 model in CDD Models
• 3-fold cross-validation
• ROC AUC = 0.69
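
For readers unfamiliar with the ROC figure quoted above, the sketch below shows how a ROC area under the curve and a k-fold split could be computed in plain Python. The helper names and the toy scores are hypothetical, and this is not the CDD Models code.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def k_fold_splits(n_items, k=3):
    """Yield (train indices, test indices) for each of k interleaved folds."""
    for fold in range(k):
        test = list(range(fold, n_items, k))
        held_out = set(test)
        train = [i for i in range(n_items) if i not in held_out]
        yield train, test


# Hypothetical toy labels (1 = desirable) and model scores.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
for train, test in k_fold_splits(7, k=3):
    print(train, test)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.69 indicates a moderate ability to rank desirable probes above undesirable ones.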

http://goo.gl/PVkQeo
Making the data more accessible as we are drowning in molecules
[Figure: log database size (millions)]

• Ligand efficiency is higher in undesirable compounds
• The Bayesian model is preferable to other molecule quality metrics for classifying desirable compounds
• The model could improve probe selection and score libraries prior to more extensive due diligence
• Probes could be scored by additional chemists depending on needs, e.g. a bias toward CNS or anticancer
[Figure labels: CNS, Anticancer, NIH probes]

• Complexities in finding the NIH MLP probes in PubChem
• Identifier and structure searches in CAS SciFinder™ reveal an extreme disclosure
• The parallel worlds of commercial and public database disclosure do not completely intersect
• Integration and intersections of databases and the need for bioassay ontology adoption
[Figure labels: Public, Commercial]

• Need more collaboration or openness in terms of availability of chemistry and biology data
• Increased communication between the various public and proprietary databases
• Major hurdles prevent this from happening: proprietary databases carry too much commercial value
• Clearly CAS and the other commercial vendors have to take notice

• We acknowledge that the Bayesian model software within CDD was developed with support from Award Number 9R44TR000942-02 “Biocomputation across distributed private datasets to enhance drug discovery” from the NCATS.
• SE gratefully acknowledges Biovia (formerly Accelrys) for providing Discovery Studio.
• SE thanks Jeremy Yang for the link to BadApple.

Litterman NK, Lipinski CA, Bunin BA, Ekins S. Computational Prediction and Validation of an Expert's Evaluation of Chemical Probes. J Chem Inf Model. 2014 Oct 27;54(10):2996-3004. doi: 10.1021/ci500445u. Epub 2014 Oct 7.
Lipinski CA, Litterman NK, Southan C, Williams AJ, Clark AM, Ekins S. The parallel worlds of public and commercial bioactive chemistry data. J Med Chem. Epub 2014 Nov 21.
