2. Traditional Drug Discovery (insert graph)
In Silico Prediction of ADME (insert graph)
◦ Potency
◦ Absorption
◦ Lead
◦ Drug
◦ Toxicity
◦ Excretion
◦ Metabolism
◦ Distribution
3. Target → Lead Discovery → Lead Optimization → Candidate
◦ Target to Lead Discovery: IVY (brute-force virtual screening of
very large compound libraries)
◦ Lead Discovery to Lead Optimization: IVY (utilize predictive models
from Biogen data for more efficient virtual screening)
4. (insert graph)
◦ Potency
◦ Lead
◦ Drug
◦ Toxicity
◦ Excretion
◦ Metabolism
◦ Distribution
◦ Absorption
5. Goal: Identify the crystallographic binding mode and
rank-order ligands with respect to binding to the protein
(insert graph: Receptor, Docking, Ligand, Shape)
Generate plausible trial binding modes using a
docking function, then re-rank the modes with a
scoring function
13. Goal: Predict hit/miss class based on presence of features
(fingerprints)
Method
◦ Given a set of N samples
◦ Given that A of them are good (‘active’)
Then we estimate for a new compound: P(good) ~ A/N
◦ Given a set of binary features
For a given feature F:
It appears in N_F samples
It appears in A_F good samples
Can we estimate P(good | F) ~ A_F/N_F?
(Problem: the error gets worse as N_F becomes small)
◦ P’(good | F) = (A_F + P(good)·k)/(N_F + k)
P’(good | F) → P(good) as N_F → 0
P’(good | F) → A_F/N_F as N_F becomes large
◦ (If k = 1/P(good), this is the Laplacian correction)
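The corrected estimator above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the talk: the function name `laplacian_corrected`, the argument names (`a_f` and `n_f` for the good and total sample counts that contain the feature), and the example counts are all assumptions chosen for the demonstration.

```python
def laplacian_corrected(a_f, n_f, p_good, k=None):
    """Return P'(good | F) = (a_f + p_good * k) / (n_f + k).

    a_f    : number of good ('active') samples containing feature F
    n_f    : number of samples containing feature F
    p_good : baseline P(good) = A / N over the whole training set
    k      : number of virtual samples; defaults to 1 / p_good,
             which gives the Laplacian correction
    """
    if not 0.0 < p_good <= 1.0:
        raise ValueError("p_good must be in (0, 1]")
    if k is None:
        k = 1.0 / p_good
    return (a_f + p_good * k) / (n_f + k)


# Hypothetical training set: 1000 samples, 50 of them active.
n_total, a_total = 1000, 50
p_good = a_total / n_total

# A feature seen once, in an active sample: the raw estimate would be 1.0,
# but the corrected estimate stays close to the baseline.
rare = laplacian_corrected(1, 1, p_good)

# A well-sampled feature: the corrected estimate approaches a_f / n_f.
common = laplacian_corrected(40, 100, p_good)

# A feature never seen: the estimate falls back to P(good).
unseen = laplacian_corrected(0, 0, p_good)

print(f"rare={rare:.4f} common={common:.4f} unseen={unseen:.4f}")
```

With the default k, a feature observed only once cannot dominate the prediction, which is the small-sample problem the correction is meant to address.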
Descriptors (insert)
Advantages
◦ Can describe a huge number of features (up to 4 billion; MDL 1024;
Leadscope 27,000)
◦ Contains tertiary and stereochemistry information
◦ Fast
14. Classification Analysis
◦ Developing Non-Linear Scoring Functions to classify
actives and non-actives
◦ (insert graphs)
◦ Cost Function to Minimize: Gini impurity of node N,
i(N) = 1 - Σ_j P^2(ω_j)
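As a worked illustration of the Gini impurity cost, the sketch below computes it for a list of class labels and for a candidate two-way split. It follows the standard CART definition (one minus the sum of squared class fractions, with a split scored by the size-weighted impurity of its children); the function names and the toy labels are assumptions, not taken from the talk.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_j P(class_j)^2 for the samples at one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two children of a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left)
            + len(right) * gini_impurity(right)) / n


# Toy labels for illustration only.
pure = gini_impurity(["active"] * 4)
mixed = gini_impurity(["active", "active", "inactive", "inactive"])
perfect_split = split_impurity(["active", "active"], ["inactive", "inactive"])

print(pure, mixed, perfect_split)
```

A tree builder of this kind would try candidate thresholds on each descriptor and keep the split with the lowest weighted impurity.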
15. Training Set Prediction Success
(insert table)
10-fold cross-validation
Randomly split training and test sets
Significant Improvement in Separating Actives
from Non-Actives
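The 10-fold scheme with random splitting can be sketched as follows. This is a generic illustration using only the standard library; the function name `k_fold_indices`, the seed, and the sample count are assumptions, and the talk does not say which software performed the split.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Samples are shuffled once, dealt into k folds, and each fold
    serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 compounds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

A model would be fit on each `train` list and scored on the matching `test` list, and the ten scores averaged.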
23. Features found in high OBA
Features found in low OBA
It would be nice if CART offered a similar view
24. Improved scoring functions for separating
hits from non-hits in structure-based drug
design, developed with CART and Bayesian
models
Identified key differences in molecular
physical properties that led to hits
Built a reasonably predictive OBA model
(however, the method cannot be expected to extend
to other systems, given the complexity of OBA)
25. Biogen IDEC
Modeling
◦ Rajiah Denny
◦ Claudio Chuaqui
◦ Juswinder Singh
◦ Herman van Vlijmen
◦ Norman Wang
◦ Anuj Patel
◦ Zhan Deng
Chemistry
◦ Kevin Guckian
◦ Dan Scott
◦ Thomas Durand-Reville
◦ Pat Conlon
◦ Charlie Hammond
◦ Chuck Jewell
Pharmacology
◦ Tonika Bonhert