Improved Predictions in Structure-Based Drug Design Using CART and Bayesian Models

Improved Predictions in Structure-Based Drug Design Using CART and Bayesian Models - Using Salford Systems' data mining tools.

Published in: Technology, Business


  1. Donovan N. Chin & R. Aldrin Denny
  2. Traditional Drug Discovery (insert graph). In Silico Prediction of ADME (insert graph) ◦ Potency ◦ Absorption ◦ Lead ◦ Drug ◦ Toxicity ◦ Excretion ◦ Metabolism ◦ Distribution
  3. Target; IVY (brute-force virtual screening of very large compound libraries); Lead Discovery; IVY (uses predictive models built from Biogen data for more efficient virtual screening); Lead Optimization; Candidate
  4. (insert graph) ◦ Potency ◦ Lead ◦ Drug ◦ Toxicity ◦ Excretion ◦ Metabolism ◦ Distribution ◦ Absorption
  5. Goal: identify the crystallographic binding mode and rank-order ligands by their binding to the protein. (insert graph) Receptor; Docking; Ligand Shape. Generate plausible trial binding modes with a docking function, then re-rank the modes with a scoring function.
  6. (insert graph) 341 active, 47 non-active
  7. (insert graph) After filtering by pharmacophore feature
  8. (insert graph)
  9. (insert functions for) ◦ F_Score* ◦ D_Score ◦ G_Score ◦ PMF_Score ◦ Chem_Score ◦ ICM_Score*
  10. Cell Adhesion Assay (50% serum) ◦ (insert graph). Biochemical Adhesion Assay ◦ (insert graph). Scoring functions are poor more often than not.
  11. Receptor Site View; Library Design; FlexX Score; Consensus Score >= 3 (e.g. Contact Map, CLogP, MW, HBOND, rotatable bonds); Consensus = 5? If yes, substructure exists? If yes, pharmacophore < 4.2 Å? If yes, publish hit report.
  12. (insert graph)
  13. Goal: predict hit/miss class based on the presence of features (fingerprints)
      Method
      ◦ Given a set of N samples
      ◦ Given that some subset A of them are good ("active")
        ▪ Then we estimate for a new compound: $P(\text{good}) \approx A/N$
      ◦ Given a set of binary features F
        ▪ For a given feature F:
        ▪ It appears in N samples
        ▪ It appears in A good samples
        ▪ Can we estimate $P(\text{good} \mid F) \approx A/N$?
        ▪ (Problem: the error gets worse as N becomes small)
      ◦ $P'(\text{good} \mid F) = \dfrac{A + P(\text{good})\,k}{N + k}$
        ▪ $P'(\text{good} \mid F) \to P(\text{good})$ as $N \to 0$
        ▪ $P'(\text{good} \mid F) \to A/N$ as N becomes large
      ◦ (If $k = 1/P(\text{good})$, this is the Laplacian correction)
      Descriptors (insert)
      Advantages
      ◦ Can describe a huge number of features (up to 4 billion; MDL 1024; Lead scope 27,000)
      ◦ Contains tertiary and stereochemistry information
      ◦ Fast
  14. Classification Analysis ◦ Developing non-linear scoring functions to classify actives and non-actives ◦ (insert graphs) ◦ Cost function to minimize: Gini impurity of a node N, $i(N) = 1 - \sum_j P^2(\omega_j)$
  15. Training-set prediction success (insert table). 10-fold cross-validation. Randomly split training and test sets. Significant improvement in separating actives from non-actives.
  16. (insert graph) Significant improvement in finding hits using the new scoring function (SF)
  17. Optimal tree identified (insert graph). No random effects (insert graph).
  18. (insert cluster) Able to identify different molecular property criteria that lead to hits
  19. (insert graph)
  20. (insert graph) Size = magnitude of OBA. OBA values cover the range of descriptor space.
  21. (insert graph) Choose 1D and 2D descriptors for ease of interpretation and lower "noise"
  22. Build model (insert graphs). Apply model.
  23. Features found in high OBA. Features found in low OBA. It would be nice if CART offered a similar view.
  24. ◦ Developed improved scoring functions, using CART and Bayesian models, for separating hits from non-hits in structure-based drug design ◦ Identified key differences in molecular physical properties that led to hits ◦ Built a reasonably predictive OBA model (the method cannot be expected to extend to other systems, however, given the complexity of OBA)
  25. Biogen IDEC. Modeling: ◦ Rajiah Denny ◦ Claudio Chuaqui ◦ Juswinder Singh ◦ Herman van Vlijmen ◦ Norman Wang ◦ Anuj Patel ◦ Zhan Deng. Chemistry: ◦ Kevin Guckian ◦ Dan Scott ◦ Thomas Durand-Reville ◦ Pat Conlon ◦ Charlie Hammond ◦ Chuck Jewell. Pharmacology: ◦ Tonika Bonhert
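
The corrected estimator on slide 13 can be sketched in a few lines of Python. This is only an illustration of the formula as written on the slide, not the authors' implementation: the function name, the argument names, and the example numbers are all assumptions made here.

```python
def corrected_probability(a_with_feature, n_with_feature, p_good, k=None):
    """Estimate P'(good | F) = (A + P(good) * k) / (N + k).

    a_with_feature: number of good ("active") samples containing feature F (A)
    n_with_feature: number of all samples containing feature F (N)
    p_good:         baseline probability that a sample is good, P(good)
    k:              number of virtual samples; defaults to 1 / P(good),
                    which the slide calls the Laplacian correction
    """
    if not 0.0 < p_good <= 1.0:
        raise ValueError("p_good must be in (0, 1]")
    if k is None:
        k = 1.0 / p_good
    return (a_with_feature + p_good * k) / (n_with_feature + k)


# Hypothetical numbers: 10% of all compounds are active.
baseline = 0.10

# A feature never seen before falls back to the baseline rate.
unseen = corrected_probability(0, 0, baseline)

# A feature seen once, in an active compound, is pulled only slightly
# above the baseline instead of jumping to the raw estimate A/N = 1.0.
rare = corrected_probability(1, 1, baseline)

# A feature seen many times stays close to the raw ratio A/N = 0.5.
common = corrected_probability(500, 1000, baseline)
```

The design point is the one the slide makes: the correction matters for rarely seen features and has little effect once N is large.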
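
The Gini impurity on slide 14 is the quantity a CART-style tree reduces when it picks a split. The sketch below computes the impurity of a node and searches one numeric descriptor for the threshold with the lowest weighted impurity. It is a minimal illustration of the general technique, not the Salford Systems CART implementation; the function names and the toy data are assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum_j P(class_j)^2 for the labels in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def best_threshold(values, labels):
    """Find the split `value <= threshold` with the lowest weighted impurity.

    Returns (threshold, weighted_impurity). The threshold is None when no
    split improves on the impurity of the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini_impurity(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # cannot split between two identical descriptor values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini_impurity(left)
                    + len(right) * gini_impurity(right)) / n
        if weighted < best[1]:
            best = ((low + high) / 2, weighted)
    return best


# Toy data (hypothetical): one descriptor value per compound.
scores = [1.0, 2.0, 3.0, 4.0]
classes = ["non-active", "non-active", "active", "active"]
threshold, impurity = best_threshold(scores, classes)
```

A full tree repeats this search over every descriptor at every node and then prunes; this sketch covers only the single-split step.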
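
Slide 15 mentions 10-fold cross-validation with randomly split training and test sets. A minimal sketch of how such splits can be generated follows, using only the standard library. The function name, the seed, and the use of 388 samples (341 + 47, the counts on slide 6) are assumptions for illustration; the slides do not say how the splits were actually produced.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k random folds.

    Every sample index appears in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(388, k=10))
```

Each model is then fitted on `train` and scored on `test`, and the ten test scores are averaged to estimate how well the classifier generalizes.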
