Talk at Yale University April 26th 2011: Applying Computational Models for Toxicology, Drug Discovery and Beyond
Slides presented April 26th at Yale Green Chemistry group

  • CDD Experienced Team Innovates and Executes. Barry Bunin, PhD (Pres. & Cofounder as first Eli Lilly EIR): Libraria (CEO, Pres.-CSO), Arris Pharmaceuticals (Sr. Scientist), Genentech, UC Berkeley (Ellman), Columbia University, author. Moses Hohman, PhD (Director Software Engineering): Northwestern Assoc. Director of Bioinformatics, Thoughtworks, Inc., U of Chicago (PhD), Harvard (magna cum laude, Physics). Sylvia Ernst, PhD (Director Community Growth & Sales): left 800-lb gorillas Accelrys-Scitegic, MDL-Elsevier-Beilstein. Peter Cohan (BOD & Overall Sales Strategy): Symyx (VP Bus Dev & President-Discovery Tools), MDL (VP Customer Marketing), author. Omidyar Network, Founders Fund, & Lilly (BOD observers). WSGR (Corporate Counsel), Rina Accountancy (GAAP compliance). Partners: Hub Consortium Members, ChemAxon, DNDi, MMV, Sandler Center… CDD SAB: Christopher Lipinski PhD, James McKerrow, MD PhD, David Roos PhD, Adam Renslo PhD, Wes Van Voorhis, MD PhD
  • ADME/tox can now be viewed as an iterative process in which molecules are assessed against many properties early on, before molecules are selected for clinical trials. These endpoints may be complex, like toxicity.
  • Transcript

    • 1. Applying Computational Models for Toxicology, Drug Discovery and Beyond Sean Ekins Collaborations in Chemistry, Jenkintown, PA. Collaborative Drug Discovery, Burlingame, CA. Department of Pharmacology, University of Medicine & Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ. School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland, Baltimore, MD.
    • 2. … mathematical learning will be the distinguishing mark of a physician from a quack… Richard Mead A mechanical account of poisons in several essays 2nd Edition, London, 1708.
    • 3. A decade ago we had limited data for modeling. Now we are inundated with it. What can we do with it?
    • 4. The future: crowdsourced drug discovery Williams et al., Drug Discovery World, Winter 2009
    • 5. Pharma reached a productivity tipping point. Cost of drug development high. Failure in clinic due to toxicity. How to predict earlier?
    • 6. Bottleneck
      • Absorption
      • Distribution
      • Metabolism
      • Excretion
      • Toxicology
    • 7. The Iterative ADME/Tox Optimization Process. Ekins et al., Trends Pharm Sci 26: 202-209 (2005). “Drug discovery & development needs to be more like engineering” Janet Woodcock, FDA, PharmaDiscovery, May 10 2006
    • 8. Why We Need Models
      • Define structure activity relationship (SAR)
      • in vitro models - limited throughput
      • in silico – in vitro approach has value in targeting testing of compounds.
      • Computers can beat humans at chess and Jeopardy, so why not help with predicting toxicity?
      • Could it be done on a phone?
    • 9. What has been modeled in ADMET?
      • Lipophilicity (log P, log D), pKa
      • Solubility
      • Passive permeability (BBB, gut, ...)
      • Plasma protein binding
      • Affinity for transporters (P-gp, hOCT, ...)
      • Nature of metabolites
      • Toxicity endpoints (mutagenesis, cytotoxicity, ...)
      • CL_H, CL_R, CL_int
      • V_D, t_1/2, ...
      composite character
    • 10. What is DILI?
      • Drug metabolism in the liver can convert some drugs into highly reactive intermediates,
      • which in turn can adversely affect the structure and functions of the liver.
      • Drug-induced liver injury (DILI) is the number one reason drugs are not approved
        • and also the reason some of them were withdrawn from the market after approval
      • Estimated global annual incidence rate of DILI is 13.9-24.0 per 100,000 inhabitants,
        • and DILI accounts for an estimated 3-9% of all adverse drug reactions reported to health authorities
      • Herbal components can cause DILI too
    • 11. Drug Examples for DILI + and -: Troglitazone (DILI +), Pioglitazone (DILI -), Rosiglitazone (DILI -), Sulindac (DILI +), Aspirin (DILI -), Diclofenac (DILI +). Xu et al., Toxicol Sci 105: 97-105 (2008).
    • 12. Limitations of DILI?
      • Compound has to physically have been made and be available for testing.
      • The screening system is still relatively low throughput compared with any primary screens
      • Whole compound or vendor libraries cannot be cost effectively screened for prioritization.
      • Screening system should be representative of the human organ including drug metabolism capability.
      • Prediction of human therapeutic Cmax is often imprecise before clinical testing in actual patients.
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 13. DILI Computational Models
      • 74 compounds - classification models (linear discriminant analysis, artificial neural networks, and machine learning algorithms (OneR))
        • Internal cross-validation (accuracy 84%, sensitivity 78%, and specificity 90%). Testing on 6 and 13 compounds, respectively > 80% accuracy.
      • (Cruz-Monteagudo et al., J Comput Chem 29: 533-549, 2008).
      • A second study used binary QSAR (248 active and 283 inactive) with support vector machine models:
        • external 5-fold cross-validation procedures and 78% accuracy for a set of 18 compounds
          • (Fourches et al., Chem Res Toxicol 23: 171-183, 2010).
      • A third study created a knowledge base with structural alerts from 1266 chemicals.
        • Alerts created were used to predict results for 626 Pfizer compounds (sensitivity of 46%, specificity of 73%, and concordance of 56% for the latest version)
          • (Greene et al., Chem Res Toxicol 23: 1215-1222, 2010).
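The statistics quoted for these studies (sensitivity, specificity, concordance) all come from a 2x2 confusion matrix. A minimal sketch of the standard definitions follows; the counts are hypothetical and are not taken from any of the cited papers.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, concordance) from confusion-matrix counts.

    tp/fn: DILI-positive compounds predicted positive/negative.
    tn/fp: DILI-negative compounds predicted negative/positive.
    """
    sensitivity = tp / (tp + fn)                    # fraction of true positives found
    specificity = tn / (tn + fp)                    # fraction of true negatives found
    concordance = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    return sensitivity, specificity, concordance


# Hypothetical counts, for illustration only.
sens, spec, conc = classification_metrics(tp=40, fn=10, tn=30, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} concordance={conc:.2f}")
```

A model can have high specificity and low sensitivity (as in the structural-alert study above), which is why the three numbers are usually reported together.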
    • 14. DILI data
      • Tested a panel of orally administered drugs at multiples of the maximum therapeutic concentration (Cmax),
        • taking into account the first-pass effect of the liver and other idiosyncratic toxicokinetic/toxicodynamic factors.
      • The 100-fold Cmax scaling factor represented a reasonable threshold to differentiate safe versus toxic drugs for an orally dosed drug and with regard to hepatotoxicity.
      • Concordance of the in vitro human hepatocyte imaging assay technology (HIAT) for 300 drugs and chemicals was ~75% with regard to clinical hepatotoxicity, with very few false-positive results
      • Xu et al., Toxicol Sci 105: 97-105 (2008).
    • 15. CRIMALDDI Meeting 2010 Linking databases
        • ~25 million compounds.
        • Linking to >300 data sources
        • Underpinning the semantic web: patents and publications, chemical suppliers, etc.; host for crowdsourced data
        • Focus on data curation quality
        • Used multiple databases to validate structures
      Data curation
    • 16. Bayesian machine learning
      • Laplacian-corrected Bayesian classifier models were generated using Discovery Studio (version 2.5.5; Accelrys).
      • Training set = 295, test set = 237 compounds
      • Uses two-dimensional descriptors to distinguish between compounds that are DILI-positive and those that are DILI-negative
        • ALogP
        • ECFC_6
        • Apol
        • logD
        • molecular weight
        • number of aromatic rings
        • number of hydrogen bond acceptors
        • number of hydrogen bond donors
        • number of rings
        • number of rotatable bonds
        • molecular polar surface area
        • molecular surface area
        • Wiener and Zagreb indices
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010 Extended connectivity fingerprints
    • 17. Bayesian machine learning. Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
      • Bayesian classification is a simple probabilistic classification model. It is based on Bayes’ theorem: p(h|d) = p(d|h) p(h) / p(d)
        • h is the hypothesis or model
        • d is the observed data
        • p(h) is the prior belief (probability of hypothesis h before observing any data)
        • p(d) is the data evidence (marginal probability of the data)
        • p(d|h) is the likelihood (probability of data d if hypothesis h is true)
        • p(h|d) is the posterior probability (probability of hypothesis h being true given the observed data d)
      • A weight is calculated for each feature using a Laplacian-adjusted probability estimate to account for the different sampling frequencies of different features.
      • The weights are summed to provide a probability estimate.
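The slide does not give the weight formula, so the following is only a sketch of one commonly described Laplacian-corrected estimator, assumed here for illustration: for a feature seen in N_F training compounds, A_F of them active, with overall active rate P, the weight is log((A_F + 1) / (N_F * P + 1)). The feature names and the toy training set are hypothetical; this is not the Discovery Studio implementation.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """samples: list of (set_of_features, is_active). Returns {feature: weight}.

    Assumed weight formula: log((A_F + 1) / (N_F * P + 1)), where P is the
    fraction of active compounds in the whole training set.
    """
    base_rate = sum(1 for _, active in samples if active) / len(samples)
    n_with, active_with = Counter(), Counter()
    for features, active in samples:
        for f in features:
            n_with[f] += 1
            if active:
                active_with[f] += 1
    return {f: math.log((active_with[f] + 1) / (n_with[f] * base_rate + 1))
            for f in n_with}


def bayes_score(weights, features):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)


# Hypothetical toy data: feature sets stand in for fingerprint bits.
training = [
    ({"phenol", "ketone"}, True),
    ({"phenol"}, True),
    ({"amide"}, False),
    ({"amide", "ketone"}, False),
]
weights = train_laplacian_bayes(training)
print(weights)
print(bayes_score(weights, {"phenol", "amide"}))
```

The "+ 1" terms pull the weight of rarely seen features toward zero, so a feature observed once in an active compound counts for less than one observed many times.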
    • 18. Features in DILI +. Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010. Avoid: long aliphatic chains, phenols, ketones, diols, α-methyl styrene, conjugated structures, cyclohexenones, amides?
    • 19. Features in DILI - Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 20. Results
      • Fingerprints with high Bayesian scores that are present in many DILI compounds appeared to be reactive in nature,
      • Could cause time-dependent inhibition of cytochromes P450 or be precursors for metabolites that are reactive and may covalently bind to proteins.
      • Why are long aliphatic chains important for DILI?
        • generally hydrophobic and perhaps enabling increased accumulation?
        • may be hydroxylated and then form other metabolites that are in turn reactive?
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 21. Test set analysis
      • compounds of most interest
        • well known hepatotoxic drugs (U.S. Food and Drug Administration Guidance for Industry “Drug-Induced Liver Injury: Premarketing Clinical Evaluation,” 2009), plus their less hepatotoxic comparators, if clinically available.
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 22. Training vs test set PCA. Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010. Yellow = test, blue = training.
    • 23. Compare to newer drugs
      • Extracted small molecule drugs from 2006 to 2010 from the Prous Integrity database
      • Structure validation resulted in a set of 77 molecules (mean molecular weight 427.05 ± 280.31, range 94.11–1994.09)
      • These molecules were distributed throughout the combined training and test sets (N = 532), representative of overlap
      • These combined analyses suggest that the test and training sets used for the DILI model are representative of current medicinal chemistry efforts.
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 24. SMARTS Filters. Smartsfilter kindly provided by Dr. Jeremy Yang (University of New Mexico, Albuquerque, NM). Substructure alerts are used to filter libraries (remove reactive groups, etc.). Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 25. SMARTS Filters vs Rule of 5. Ekins and Freundlich, Pharm Res, In press 2011. Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different types of rule sets with FDA drug set from CDD (N = 2804). Suggests the number of Lipinski violations may also be an indicator of undesirable chemical features that result in reactivity.
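For readers unfamiliar with how Lipinski violations are counted, here is a minimal sketch using the standard Rule of 5 cutoffs (molecular weight > 500, log P > 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors). It assumes the descriptors have already been computed elsewhere; the descriptor values in the usage lines are hypothetical.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four Rule of 5 criteria a molecule violates."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # calculated log P above 5
        h_donors > 5,       # more than 5 hydrogen bond donors
        h_acceptors > 10,   # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


# Hypothetical descriptor values, for illustration only.
print(lipinski_violations(427.05, 3.2, 2, 6))    # a typical drug-like profile
print(lipinski_violations(650.0, 6.1, 6, 12))    # violates all four criteria
```

The violation count per molecule could then be compared against the number of SMARTS filter failures for the same molecule to look for the correlation the slide describes.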
    • 26. Conclusions
      • First large-scale testing of DILI machine learning model
        • Concordance lower than with in vitro model
        • Statistics similar to Structural alerts from Pfizer paper
      • SMARTS can be used to filter libraries
        • Machine learning models better than SMARTS
      • Could use models to filter compounds for further testing in vitro
        • Use published knowledge to predict DILI
        • Combinations of models
        • Combine datasets – create models with Open descriptors and algorithms
      • Make models widely available
      Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    • 27.
      • Ideal when we have few molecules for training
      • In silico database searching
      • Accelrys Catalyst in Discovery Studio
      • Geometric arrangement of functional groups necessary for a biological response
      • Generate 3D conformations
      • Align molecules
      • Select features contributing to activity
      • Regress hypothesis
      • Evaluate with new molecules
      • Excluded volumes – relate to inactive molecules
      Pharmacophores applied broadly. Created for CYP2B6, CYP2C9, CYP2D6, CYP3A4, CYP3A5, CYP3A7, hERG, P-gp, OATPs, OCT1, OCT2, BCRP, hOCTN2, ASBT, hPEPT1, hPEPT2, FXR, LXR, CAR, PXR, etc.
    • 28. In silico and in vitro screening for Transporters. [Figure: molecule databases are searched with transporter pharmacophores or other model types (MRP, BCRP, P-gp, hPEPT), followed by in vitro testing and feedback of new substrates or inhibitors.] Ekins, in Ecker G and Chiba P, Transporters as drug carriers, John Wiley and Sons, pp. 215-227, 2009.
    • 29. hOCTN2
      • High affinity cation/carnitine transporter - expressed in kidney, skeletal muscle, heart, placenta and small intestine
      • Inhibition correlation with muscle weakness - rhabdomyolysis
      • A common features pharmacophore developed with 7 inhibitors
      • Searched a database of over 600 FDA approved drugs - selected drugs for in vitro testing.
      • Of 33 tested drugs predicted to map to the pharmacophore, 27 inhibited hOCTN2 in vitro
      • Compounds were more likely to cause rhabdomyolysis if the Cmax/Ki ratio was higher than 0.0025
      Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
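The ratio check described on this slide can be sketched as follows. The 0.0025 cutoff is from the slide; the function name, the units, and the example concentrations are hypothetical and only illustrate the arithmetic.

```python
RATIO_THRESHOLD = 0.0025  # Cmax/Ki cutoff quoted on the slide


def exceeds_ratio_threshold(cmax, ki, threshold=RATIO_THRESHOLD):
    """Return True if Cmax/Ki is above the threshold.

    cmax and ki must be in the same concentration units (e.g. both in uM).
    """
    return (cmax / ki) > threshold


# Hypothetical values, for illustration only.
print(exceeds_ratio_threshold(cmax=1.0, ki=100.0))   # ratio 0.01
print(exceeds_ratio_threshold(cmax=0.1, ki=100.0))   # ratio 0.001
```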
    • 30. Possible Association between Clinical Rhabdomyolysis and hOCTN2 Inhibition Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
    • 31. hOCTN2 quantitative pharmacophore and Bayesian model. [Figure: pharmacophore correlation, r = 0.89, with vinblastine, cetirizine and emetine labeled; Bayesian features marked +ve and -ve.] Diao et al., Mol Pharm, 7: 2120-2131, 2010
    • 32. hOCTN2 quantitative pharmacophore and Bayesian model. Diao et al., Mol Pharm, 7: 2120-2131, 2010
      • Bayesian model, leaving 50% out 97 times: external ROC 0.90, internal ROC 0.79, concordance 73.4%, specificity 88.2%, sensitivity 64.2%.
      • Lab test set (N = 27): Bayesian model has better correct predictions (> 80%) and lower false positives and negatives than pharmacophore (> 70%)
      • Predictions for literature test set (N = 32) not as good as in house; mean max Tanimoto similarity was ~0.6
      • PCA used to assess training and test set overlap
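A "mean max Tanimoto similarity" can be read as: for each test molecule, take its highest Tanimoto similarity to any training molecule, then average over the test set. The sketch below assumes fingerprints are represented as sets of on-bit indices; the toy fingerprints are hypothetical, and the slide does not say which fingerprint type was used.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mean_max_similarity(test_set, training_set):
    """Average, over test molecules, of the best similarity to any training molecule."""
    best = [max(tanimoto(t, tr) for tr in training_set) for t in test_set]
    return sum(best) / len(best)


# Hypothetical toy fingerprints.
training_fps = [{1, 2, 3, 4}, {5, 6, 7}]
test_fps = [{1, 2, 3}, {5, 6, 8, 9}]
result = mean_max_similarity(test_fps, training_fps)
print(result)
```

A low value suggests the test molecules sit outside the chemistry covered by the training set, which is one plausible reason for weaker predictions on the literature set.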
    • 33. hOCTN2 association with rhabdomyolysis. Diao et al., Mol Pharm, 7: 2120-2131, 2010
      • Among the 21 drugs associated with rhabdomyolysis or carnitine deficiency, 14 (66.7%) provided a Cmax/Ki ratio higher than 0.0025.
      • Among 25 drugs that were not associated with rhabdomyolysis or carnitine deficiency, only 9 (36.0%) showed a Cmax/Ki ratio higher than 0.0025.
      • Rhabdomyolysis or carnitine deficiency was associated with a Cmax/Ki value above 0.0025 (Pearson’s chi-square test p = 0.0382).
      • Limitations of Cmax/Ki serving as a predictor for rhabdomyolysis: Cmax/Ki does not consider the effects of drug tissue distribution or plasma protein binding.
    • 34. Summing up
      • Proactive database searching: prioritize compounds for testing in vitro
      • Understand drug interactions
      • In silico allows rapid parallel optimization vs transporters or other properties
      • Provide novel insights into the molecular interaction of inhibitors
      • Repurpose / reposition FDA drugs
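The counts on this slide form a 2x2 table (14 of 21 associated drugs above the cutoff, 9 of 25 non-associated drugs above it). A minimal sketch of Pearson's chi-square test on that table, using only the standard library and assuming no continuity correction, gives a p-value close to the reported 0.0382.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of chi-square with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: associated / not associated with rhabdomyolysis or carnitine deficiency.
# Columns: Cmax/Ki above / not above 0.0025 (counts from the slide).
chi2, p = pearson_chi_square_2x2(14, 21 - 14, 9, 25 - 9)
print(f"chi2={chi2:.3f} p={p:.4f}")
```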
    • 35. Pregnane X Receptor (PXR) is promiscuous
      • Bile salts
      • Cholesterol metabolites
      • Statins
      • PPAR antagonists
      • Calcium channel modulators
      • Synthetic peptide mimetics
      • Anticancer compounds
      • HIV protease inhibitors
      • Herbal components
      • Carotenoids
      • Vitamins
      • Endocrine disrupters
      • Pesticides
      • Plasticizers
      • and many more (hundreds published)
      Endogenous Drugs Exogenous Environmental Contaminants
    • 36. Human PXR – direct downstream interactions
      • PXR increases transcription of CYP3A4 and >37 other genes (transporters, drug metabolizing enzymes)
    • 37. PXR Agonist Machine Learning and Docking Comparison
      • Compared different methods for predicting agonist binding to human PXR:
      • Training set 98 hPXR activators and 79 hPXR non-activators (Ung et al., Mol Pharmacol 71: 158-168 2007)
      • Recursive partitioning (RP)
      • Random forest (RF)
      • Support Vector Machines (SVM)
      • Using VolSurf 3D descriptors
      • Docking (FlexX)
      • Large external test set N = 145 molecules (82 active, 63 inactive)
      Khandelwal et al., Chem Res Toxicol, 21:1457-67 (2008)
    • 38. Khandelwal et al., Chem Res Toxicol, 21:1457-67 (2008)
    • 39. Receptor model for PXR obtained using Raptor (5D-QSAR); Bayesian model (features labeled ACTIVE and INACTIVE). Ekins S, Kortagere S, Iyer M, Reschly EJ, Lill MA, Redinbo MR and Krasowski MD, PLoS Comp Biol 5: e1000594 (2009).
    • 40. ToxCast: docking chemicals in human PXR
      • 10 Groups have contracts with EPA to test ~300 conazoles & pesticides, etc with various biological assays (cell based, receptor etc)
      • We have docked all the molecules into the PXR agonist site of 5 structures
      • GOLD (ver 4): genetic algorithm explores conformations of ligands and flexible receptor side chains
      • 20 independent docking runs
      • Used the regular goldscore to classify compounds,
      • comparing their respective scores to the corresponding goldscores of the co-crystallized ligands.
      • Majority vote across the five structures.
      Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
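The classification step above (compare each compound's score to the co-crystallized ligand's score in each structure, then take a majority vote across the five structures) can be sketched as below. The direction of the comparison (score at or above the reference counts as a vote for "activator") is an assumption, and all scores are hypothetical; the slide does not give these details.

```python
def classify_by_majority(ligand_scores, reference_scores):
    """Majority vote over receptor structures.

    ligand_scores: docking score of the compound in each structure.
    reference_scores: score of the co-crystallized ligand in the same structures.
    A structure votes "active" when the compound scores at least as well as
    the reference (assumed rule, for illustration).
    """
    if len(ligand_scores) != len(reference_scores):
        raise ValueError("one reference score is needed per structure")
    votes = sum(1 for s, ref in zip(ligand_scores, reference_scores) if s >= ref)
    return votes > len(ligand_scores) / 2


# Hypothetical scores for five structures.
references = [58.0, 57.0, 66.0, 50.0, 60.0]
print(classify_by_majority([60.0, 55.0, 70.0, 40.0, 65.0], references))  # 3 of 5 votes
print(classify_by_majority([50.0, 50.0, 50.0, 50.0, 50.0], references))  # 1 of 5 votes
```

Using an odd number of structures avoids ties in the vote.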
    • 41. ToxCast: docking pesticides in PXR
      • Activities of most activators more potent vs NCGC data
      • We correctly predict ~70% of compounds and 75% of activators
      • Including other predicted pesticides from Lemaire, G et al., Toxicol Sci 91: 501-509 (2006).
      • When compared to NCGC data for complete ToxCast set: sensitivity 74%
      Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
    • 42. ToxCast (blue) vs Steroidal (yellow) compounds
      • Different areas in PCA using simple descriptors
      • ToxCast requires a model built with similar molecules
      • General PXR models may be limited in predicting ToxCast data
      Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
    • 43. How Could Green Chemistry Benefit From These Models? Chem Rev. 2010 Oct 13;110(10):5845-82
    • 44. Where Can We Apply Models In Green Chemistry? … Nature, 469: 6 Jan 2011
    • 45. Models are cheaper. Nature, 469: 6 Jan 2011. Is this experimental prediction or computational prediction?
    • 46. … ^ a Chemist - "I think if you study - if you learn too much of what others have done, you may tend to take the same direction as everybody else" - Jim Henson
    • 47. Some observations
    • 48. Computational modeling: from simple to complex models with more data. Ekins et al., Xenobiotica, 37:1152-1170, 2007
    • 49. Need to think about more than one property at a time: multi-objective optimisation. Ekins, Honeycutt and Metz, Drug Disc Today, 15: 451-460, 2010
    • 50. Abbott: evolving molecules using ADME multi-objective optimization. Ekins, Honeycutt and Metz, Drug Disc Today, 15: 451-460, 2010
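One common way to handle several properties at once is Pareto ranking: keep the candidates that no other candidate beats on every objective. The slides do not say which multi-objective method was used, so this is only an illustrative sketch of that general idea; the molecule names and scores are hypothetical, and higher is assumed to be better for every objective.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the non-dominated candidates from {name: (objective scores...)}."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }


# Hypothetical (potency score, safety score) pairs.
molecules = {
    "mol_a": (0.9, 0.2),
    "mol_b": (0.5, 0.5),
    "mol_c": (0.4, 0.4),   # beaten by mol_b on both objectives
    "mol_d": (0.2, 0.9),
}
front = pareto_front(molecules)
print(sorted(front))
```

The front shows the trade-offs explicitly instead of collapsing all properties into a single weighted score.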
    • 51. Could Green Chemistry Modeling Benefit from Collaboration? [Figure: separate groups of modelers]
    • 52.  
    • 53. More collaborations, integrating models into scientific social networks Drug Disc Today, 14: 261-270, 2009
    • 54. Could all pharmas share their data as models with each other?
    • 55.
      • Open source software for molecular descriptors and algorithms
      • Spend only a fraction of the money on QSAR
      • Selectively share your models with collaborators and control access
      • Have someone else host the models / predictions
      The next opportunities for crowdsourcing… [Figure labels: Models; Inside company; Collaborators; Commercial; Descriptors; Algorithms; ADME/Tox data; Current investments >$1M/yr, >$10-100’s M/yr]
    • 56. Open source tools for modeling
      • Open source descriptors CDK and C5.0 algorithm
      • ~60,000 molecules with P-gp efflux data from Pfizer
      • MDR <2.5 (low risk) (N = 14,175) MDR > 2.5 (high risk) (N = 10,820)
      • Test set MDR <2.5 (N = 10,441) > 2.5 (N = 7972)
      • Could facilitate model sharing?
      • Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
      $ $$$$$$
    • 57. Expanding computational model coverage of chemical space. Could combining models give greater coverage of ADME/Tox chemistry space and improve predictions? (Pfizer, Merck, GSK, Novartis, Lilly, BMS, Lundbeck, Allergan, Bayer, AZ, Roche, BI, Merck KGaA)
    • 58. Xenobiotica, 37:1152-1170, 2007 Mobile computing – an opportunity for science
      • Everything is mobile - Devices smaller
      • Chemists move from e-notebook – tablet pc – to smart phones / devices iPhone etc
      • What apps could be provided for data, collaboration, GREEN CHEMISTRY… etc?
      Williams, Ekins et al., In Press; Williams, Chemistry World, May 2010
    • 59. Acknowledgments
      • University of Maryland
        • Xiaowan Zheng
        • Lei Diao
        • Peter W. Swaan
        • James E. Polli
      • Pfizer
        • Rishi Gupta
        • Eric Gifford
        • Ted Liston
        • Chris Waller
      • University of Iowa
        • Matthew D. Krasowski
      • Drexel University
        • Sandhya Kortagere
      • University of Maryland
        • Akash Khandelwal, Peter W. Swaan, Cheng Chang
      • Purdue University
        • Markus Lill
      • Merck
        • Jim Xu
      • RSC
        • Antony J. Williams
      • Accelrys
      • CDD
      • Ingenuity
      • Email:
      • Slideshare:
      • Twitter: collabchem
      • Blog:
      • Website: