UNC slides on computational toxicology


Lecture at UNC in November 2011


    1. 1. Computational Models for Predicting Human Toxicities. Sean Ekins, M.Sc., Ph.D., D.Sc. Collaborations in Chemistry, Fuquay-Varina, NC; Collaborative Drug Discovery, Burlingame, CA; School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland. 215-687-1320
    2. 2. Outline
        • Key enablers
        • What has been modeled: a quick review
        • What will be modeled
        • Future
    3. 3. Why Use Computational Models for Toxicology?
        • Goal of a model: alert you to potential toxicity and let you focus efforts on the best molecules, reducing risk.
        • Selection of a model: a trade-off between interpretability, insights for modifying molecules, speed of calculation, and coverage of chemistry space (the applicability domain).
        • Models can be built with proprietary, open, and commercial tools: software (descriptors + algorithms) + data = model(s).
        • A human operator decides whether a model is acceptable.
    4. 4. Key enablers: Hardware is getting smaller. [Figure: timeline labeled 1930s, 1980s, 1990s; device labels room size, desktop size, laptop, netbook, phone, watch. Not to scale and not equivalent computing power; illustrates mobility.]
    5. 5. Key Enablers: More data available and open tools
    6. 6. What has been modeled
        • Physicochemical properties: logP, logD, solubility, boiling point, melting point
        • QSAR for various proteins, complex properties
        • Homology models, docking
        • Expert systems
        • Hybrid methods that combine different approaches
        • Mutagenicity and related endpoints (Ames, micronucleus, clastogenicity, DNA damage, developmental toxicity, etc.)
        • Environmental toxicity: aquatic, dermatotoxicology
        • Mixtures
    7. 7. Physicochemical properties
        • Solubility: thousands of data points in the literature
        • Model median error is ~0.5 log units, comparable to experimental error
        • LogP: tens of thousands of data points available
        • Fragmental or whole-molecule predictors
        • Not all logP predictors are equal. Median error is ~0.3 log units, comparable to experimental error
        • People now accept solubility and logP predictions as if they were real measurements
        • ACD predictions + EpiSuite predictions are available in www.chemspider.com
        • Mobile molecular data sheet
        • Links to melting point predictor from open notebook science
        • Required curation of data
    8. 8. Simple Rules
        • Rule of 5. Lipinski, Lombardo, Dominy, Feeney. Adv. Drug Deliv. Rev. 23: 3-25 (1997).
        • AlogP98 vs PSA. Egan, Merz, Baldwin. J. Med. Chem. 43: 3867-3877 (2000).
        • More than ten rotatable bonds correlates with decreased rat oral bioavailability. Veber, Johnson, Cheng, Smith, Ward, Kopple. J. Med. Chem. 45: 2615-2623 (2002).
        • Compounds with ClogP < 3 and total polar surface area > 75 Å² have fewer animal toxicity findings. Hughes, et al. Bioorg. Med. Chem. Lett. 18: 4872-4875 (2008).
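The simple rules on this slide can be sketched as threshold checks on precomputed descriptors. This is a minimal illustration, not the cited authors' code: the `Descriptors` class, its field names, and the function names are hypothetical, and the descriptor values are assumed to come from whatever toolkit you already use. The Rule of 5 cutoffs (molecular weight > 500, logP > 5, more than 5 hydrogen bond donors, more than 10 acceptors) are the standard published ones; the Egan AlogP98 vs PSA rule is omitted because the slide does not give its boundaries.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float        # molecular weight, Da
    logp: float              # calculated logP (e.g. ClogP)
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    tpsa: float              # total polar surface area, in square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count Rule of 5 violations (standard Lipinski cutoffs)."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def exceeds_rotatable_bond_limit(d: Descriptors) -> bool:
    """Veber-style flag: more than ten rotatable bonds."""
    return d.rotatable_bonds > 10


def in_lower_tox_space(d: Descriptors) -> bool:
    """Hughes-style flag: ClogP < 3 and TPSA > 75 (fewer animal tox findings)."""
    return d.logp < 3 and d.tpsa > 75


# Illustrative, made-up descriptor values for two hypothetical molecules.
small_polar = Descriptors(350.0, 2.1, 2, 5, 4, 90.0)
large_greasy = Descriptors(650.0, 6.2, 6, 12, 14, 40.0)

for name, d in [("small_polar", small_polar), ("large_greasy", large_greasy)]:
    print(name, lipinski_violations(d),
          exceeds_rotatable_bond_limit(d), in_lower_tox_space(d))
```

Rules like these are alerts rather than verdicts; as slide 3 says, a human operator still decides whether the result is acceptable.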
    9. 9. MetaPrint2D in Bioclipse: a free metabolism site predictor. Uses fingerprint descriptors and a metabolite database to learn frequencies of metabolites in various substructures. L. Carlsson, et al., BMC Bioinformatics 2010, 11: 362
    10. 10. QSAR for Various Proteins
        • Enzymes: predominantly cytochrome P450s, for drug-drug interactions
        • Transporters: predominantly P-gp, but some others, e.g. OATP, BCRP
        • Receptors: PXR, CAR, for hepatotoxicity
        • Ion channels: predominantly hERG, for cardiotoxicity
        • Issues: initially small training sets; public data is a fraction of what drug companies have
    11. 11. Pharmacophores
        • Ideal when we have few molecules for training
        • In silico database searching
        • Accelrys Catalyst in Discovery Studio
        • Geometric arrangement of functional groups necessary for a biological response
        • Generate 3D conformations
        • Align molecules
        • Select features contributing to activity
        • Regress hypothesis
        • Evaluate with new molecules
        • Excluded volumes relate to inactive molecules
        Targets: CYP2B6, CYP2C9, CYP2D6, CYP3A4, CYP3A5, CYP3A7, hERG, P-gp, OATPs, OCT1, OCT2, BCRP, hOCTN2, ASBT, hPEPT1, hPEPT2, FXR, LXR, CAR, PXR, etc.
    12. 12. hOCTN2: Organic Cation Transporter Pharmacophore
        • High-affinity cation/carnitine transporter expressed in kidney, skeletal muscle, heart, placenta, and small intestine
        • Inhibition correlates with muscle weakness (rhabdomyolysis)
        • A common-features pharmacophore was developed with 7 inhibitors
        • Searched a database of over 600 FDA-approved drugs and selected drugs for in vitro testing
        • Of 33 tested drugs predicted to map to the pharmacophore, 27 inhibited hOCTN2 in vitro
        • Compounds were more likely to cause rhabdomyolysis if the Cmax/Ki ratio was higher than 0.0025
        Diao, Ekins, and Polli, Pharm Res, 26, 1890 (2009)
    13. 13. hOCTN2 – Organic Cation transporter Pharmacophore Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
    14. 14. hOCTN2 quantitative pharmacophore and Bayesian model. [Figure labels: +ve, -ve; r = 0.89; vinblastine, cetirizine, emetine.] Diao, Ekins, and Polli, Pharm Res, 26, 1890 (2009); Diao et al., Mol Pharm, 7: 2120-2131, 2010
    15. 15. hOCTN2 quantitative pharmacophore and Bayesian model
        • Bayesian model, leaving 50% out 97 times: external ROC 0.90, internal ROC 0.79, concordance 73.4%, specificity 88.2%, sensitivity 64.2%.
        • Lab test set (N = 27): the Bayesian model gave more correct predictions (> 80%) and fewer false positives and negatives than the pharmacophore (> 70%).
        • Predictions for the literature test set (N = 32) were not as good as for the in-house set; the mean maximum Tanimoto similarity was ~0.6.
        • PCA was used to assess training and test set overlap.
        Diao et al., Mol Pharm, 7: 2120-2131, 2010
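The "mean maximum Tanimoto similarity" quoted for the literature test set can be illustrated as follows. This is a sketch under stated assumptions: fingerprints are represented as plain Python sets of "on" bit indices, the function names are hypothetical, and the toy fingerprints are made up. The study's actual fingerprint type and software are not given on the slide.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def mean_max_similarity(test_fps: list, train_fps: list) -> float:
    """For each test compound take its best match in the training set,
    then average those best-match similarities over the test set."""
    best = [max(tanimoto(t, tr) for tr in train_fps) for t in test_fps]
    return sum(best) / len(best)


# Toy fingerprints (sets of bit indices), for illustration only.
train = [{1, 2, 3, 4}, {2, 3, 5}]
test = [{1, 2, 3}, {5, 6}]

print(mean_max_similarity(test, train))
```

A low value suggests the test compounds sit far from the training set in chemistry space, which is one plausible reason predictions degrade outside the applicability domain.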
    16. 16. hOCTN2 association with rhabdomyolysis
        • Among the 21 drugs associated with rhabdomyolysis or carnitine deficiency, 14 (66.7%) had a Cmax/Ki ratio higher than 0.0025.
        • Among the 25 drugs not associated with rhabdomyolysis or carnitine deficiency, only 9 (36.0%) had a Cmax/Ki ratio higher than 0.0025.
        • Rhabdomyolysis or carnitine deficiency was associated with a Cmax/Ki value above 0.0025 (Pearson's chi-square test, p = 0.0382).
        • Limitation of Cmax/Ki as a predictor for rhabdomyolysis: it does not consider the effects of drug tissue distribution or plasma protein binding.
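The association on this slide can be checked from the counts it reports (14 of 21 associated drugs and 9 of 25 non-associated drugs above the 0.0025 ratio). The sketch below assumes an uncorrected Pearson chi-square test on the 2x2 table with one degree of freedom; the function name is hypothetical, and the p-value uses the identity that the chi-square survival function with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Returns (chi2, p) with p computed for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: associated with rhabdomyolysis/carnitine deficiency (21), not associated (25).
# Columns: Cmax/Ki above 0.0025, Cmax/Ki at or below 0.0025.
chi2, p = pearson_chi_square_2x2(14, 7, 9, 16)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Under these assumptions the result should land close to the p = 0.0382 quoted on the slide, which suggests the reported test was run without a continuity correction.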
    17. 17. Drug-induced liver injury (DILI)
        • Drug metabolism in the liver can convert some drugs into highly reactive intermediates, which in turn can adversely affect the structure and function of the liver.
        • DILI is the number one reason drugs are not approved, and also the reason some were withdrawn from the market after approval.
        • The estimated global annual incidence of DILI is 13.9-24.0 per 100,000 inhabitants, and DILI accounts for an estimated 3-9% of all adverse drug reactions reported to health authorities.
        • Herbal components can cause DILI too.
        https://dilin.dcri.duke.edu/for-researchers/info/
    18. 18. Drug-Induced Liver Injury Models
        • A first study used 74 compounds to build classification models (linear discriminant analysis, artificial neural networks, and machine learning algorithms (OneR)). Internal cross-validation gave accuracy 84%, sensitivity 78%, and specificity 90%. Testing on 6 and 13 compounds, respectively, gave > 80% accuracy (Cruz-Monteagudo et al., J Comput Chem 29: 533-549, 2008).
        • A second study used binary QSAR (248 active and 283 inactive) with support vector machine models, external 5-fold cross-validation procedures, and 78% accuracy for a set of 18 compounds (Fourches et al., Chem Res Toxicol 23: 171-183, 2010).
        • A third study created a knowledge base with structural alerts from 1266 chemicals. The alerts were used to predict results for 626 Pfizer compounds (sensitivity of 46%, specificity of 73%, and concordance of 56% for the latest version) (Greene et al., Chem Res Toxicol 23: 1215-1222, 2010).
    19. 19. DILI Model: Bayesian
        • Laplacian-corrected Bayesian classifier models were generated using Discovery Studio (version 2.5.5; Accelrys).
        • Training set = 295 compounds, test set = 237 compounds
        • Uses two-dimensional descriptors to distinguish between compounds that are DILI-positive and those that are DILI-negative:
            • ALogP
            • ECFC_6 (extended connectivity fingerprints)
            • Apol
            • logD
            • molecular weight
            • number of aromatic rings
            • number of hydrogen bond acceptors
            • number of hydrogen bond donors
            • number of rings
            • number of rotatable bonds
            • molecular polar surface area
            • molecular surface area
            • Wiener and Zagreb indices
        Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
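To make the idea of a Laplacian-corrected Bayesian classifier concrete, here is a minimal sketch of one commonly described formulation for binary features such as fingerprint bits. It is an assumption that this matches what Discovery Studio computes internally; the function names and the toy data are hypothetical. Each feature gets a weight log((A + 1) / (T * P + 1)), where T is the number of training compounds containing the feature, A is how many of those are active (here, DILI-positive), and P is the overall active fraction. A compound's score is the sum of the weights of its features.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """samples: list of (set_of_features, is_active) pairs.

    Returns a dict mapping each feature to its Laplacian-corrected weight.
    """
    if not samples:
        raise ValueError("need at least one training sample")
    base_rate = sum(1 for _, active in samples if active) / len(samples)
    total = Counter()    # T: compounds containing the feature
    actives = Counter()  # A: active compounds containing the feature
    for features, active in samples:
        for f in features:
            total[f] += 1
            if active:
                actives[f] += 1
    # The +1 terms pull rarely seen features toward a weight of zero.
    return {f: math.log((actives[f] + 1) / (total[f] * base_rate + 1))
            for f in total}


def score(weights, features):
    """Sum of feature weights; unseen features contribute nothing."""
    return sum(weights.get(f, 0.0) for f in features)


# Toy training data with made-up feature labels, for illustration only.
training = [
    ({"A", "B"}, True),
    ({"A"}, True),
    ({"C"}, False),
    ({"B", "C"}, False),
]
weights = train_laplacian_bayes(training)
print(weights)
print(score(weights, {"A", "C"}))
```

Positive weights mark features enriched among actives and negative weights mark features enriched among inactives, which is how feature lists such as those on the next slide can be read off a trained model.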
    20. 20. DILI Bayesian. [Figure panels: features in DILI-negative compounds; features in DILI-positive compounds.] Avoid: long aliphatic chains, phenols, ketones, diols, α-methyl styrene, conjugated structures, cyclohexenones, amides
    21. 21. Test set analysis. Compounds of most interest: well-known hepatotoxic drugs (U.S. Food and Drug Administration Guidance for Industry “Drug-Induced Liver Injury: Premarketing Clinical Evaluation,” 2009), plus their less hepatotoxic comparators, if clinically available. Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
    22. 22. Predictions for newly approved EMEA compounds
        • Fingolimod (Gilenya) for MS (EMEA and FDA)
        • Paliperidone for schizophrenia
        • Pirfenidone for idiopathic pulmonary fibrosis
        • Roflumilast for pulmonary disease
        Can we get DILI data for these?
    23. 23. Time-dependent inhibition (TDI) for P450 3A4
        • Pfizer generated a large dataset (~2000 compounds) and went through sequential Bayesian model generation and testing cycles
        • Test set 2: 20 active in 156 compounds
        • Combined both model predictions
        Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
    24. 24. 3A4 TDI
        • The indazole, pyrazole, and methoxy-aminopyridine rings are important for TDI
        • The approach decreased in vitro screening by 30%
        • Helps identify compounds that form reactive metabolites
        Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
    25. 25. Analysis of malaria and TB datasets. Ekins S and Williams AJ, MedChemComm, 1: 325-330, 2010. http://www.slideshare.net/ekinssean
    26. 26. Antimalarial compound libraries and filter failures. Filtering used SMARTS filters to remove thiol reactives, false positives, etc., at the University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter). [Figure axis: % failure.] Ekins and Williams, Drug Disc Today 15: 812-815, 2010
    27. 27. TB compound libraries and filter failures. Filtering used SMARTS filters to remove thiol reactives, false positives, etc., at the University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter). Ekins et al., Mol Biosyst, 6: 2316-2324, 2010
    28. 28. Correlations. Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different types of rule sets, using the FDA drug set from CDD (N = 2804). This suggests that the number of Lipinski violations may also be an indicator of undesirable chemical features that result in reactivity. Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
    29. 29. Increasing Data & Model Access. Could all pharmas share their data as models with each other? Ekins and Williams, Lab On A Chip, 10: 13-22, 2010.
    30. 30. The big idea
        • Challenge: there is limited access to the ADME/Tox data and models needed for R&D
        • How could a company share data but keep the structures proprietary?
        • Sharing models means both parties use costly software
        • What about open source tools?
        • Pfizer had never considered this, so we proposed a study and Rishi Gupta generated models
    31. 31. Pfizer Open models and descriptors
        • What can be developed with very large training and test sets?
        • HLM: training 50,000, testing 25,000 molecules
        • training 194,000 and testing 39,000
        • MDCK: training 25,000, testing 25,000
        • MDR: training 25,000, testing 18,400
        • Open molecular descriptors / models vs commercial descriptors
        Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
    32. 32. Examples: Metabolic Stability. Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
        PCA of training (red) and test (blue) compounds: overlap in chemistry space.
        Panel titles: HLM Model with CDK and SMARTS Keys; HLM Model with MOE2D and SMARTS Keys. Results, in slide order:
        • Descriptors: 578. Training set compounds: 193,650. Cross-validation results: 38,730 compounds. Training R²: 0.79. 20% test set R²: 0.69. Blind data set (2310 compounds): R² = 0.53, RMSE = 0.367. Continuous → categorical: κ = 0.40, sensitivity = 0.16, specificity = 0.99, PPV = 0.80. Time (sec/compound): 0.252
        • Descriptors: 818. Training set compounds: 193,930. Cross-validation results: 38,786 compounds. Training R²: 0.77. 20% test set R²: 0.69. Blind data set (2310 compounds): R² = 0.53, RMSE = 0.367. Continuous → categorical: κ = 0.42, sensitivity = 0.24, specificity = 0.987, PPV = 0.823. Time (sec/compound): 0.303
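The statistics reported on this and earlier slides (R², RMSE, κ, sensitivity, specificity, PPV, concordance) can be computed as sketched below. This is an illustrative implementation with hypothetical function names and made-up inputs, not the code used in the cited study. It assumes R² is the coefficient of determination; some papers report the squared Pearson correlation instead, and the slide does not say which was used.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE), with R^2 as the coefficient of determination."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Binary metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    concordance = (tp + tn) / n  # also called accuracy
    # Agreement expected by chance, from the row and column totals.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "concordance": concordance,
        "kappa": (concordance - chance) / (1 - chance),
    }


# Made-up numbers, for illustration only.
r2, rmse = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
print(r2, rmse)
print(metrics)
```

Reading the slide with these definitions in mind: a model with specificity near 0.99 but sensitivity of 0.16 to 0.24 rarely raises a false alarm yet misses most true positives, which is why κ stays near 0.4 despite the high PPV.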
    33. 33. Examples: P-gp. Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
        • Open source descriptors (CDK) and the C5.0 algorithm
        • ~60,000 molecules with P-gp efflux data from Pfizer
        • Training set: MDR < 2.5 (low risk), N = 14,175; MDR > 2.5 (high risk), N = 10,820
        • Test set: MDR < 2.5, N = 10,441; MDR > 2.5, N = 7972
        • Could facilitate model sharing?
    34. 34. Model coverage of chemistry space. Combining models may give greater coverage of ADME/Tox chemistry space and improve predictions? [Figure labels: Lundbeck, Pfizer, Merck, GSK, Novartis, Lilly, BMS, Allergan, Bayer, AZ, Roche, BI, Merck KGaA.]
    35. 35. Next steps
        • ADME/Tox data crosses diseases
        • Potential to share models selectively with collaborators, e.g. academics, neglected disease researchers
        • We used the proof of concept to submit an SBIR, “Biocomputation across distributed private datasets to enhance drug discovery”
        • Develop a prototype for sharing models securely; collaborate to show how combining data for TB etc. could improve models
        • Phase II: develop a commercial product that leverages CDD
        • Engage Pistoia Alliance to expand the concept to many companies (in progress)
    36. 36. Future: What will be modeled
        • Mitochondrial toxicity, hepatotoxicity
        • More transporters (MATE, OATPs, BSEP, etc.) with bigger datasets, driven by academia
        • Screening centers: more data, more models
        • Understanding differences between ligands for nuclear receptors
            • CAR vs PXR
        • Models will become replacements for data as datasets expand (e.g. like logP)
        • Toxicity models used for green chemistry
        Chem Rev. 2010 Oct 13;110(10):5845-82
    37. 37. What You Might Not Know About Chemistry Databases On The Internet
        • Data-sharing between open databases is cyclic
        • This can proliferate errors in the “Linked Data”
    38. 38. Government Databases Should Come With a Health Warning. Openness can bring serious quality issues. NPC Browser (http://tripod.nih.gov/npc/): the database was released and within days hundreds of errors were found in structures. Williams and Ekins, DDT, 16: 747-750 (2011); Science Translational Medicine 2011
    39. 39. Mobile Apps for Drug Discovery
        • Make science more accessible => communication
        • Mobile: take a phone into the field or lab and do science more readily than on a laptop
        • Green: energy-efficient computing
        • MolSync + DropBox + MMDS = share molecules as SDF files on the cloud = collaborate
        Williams et al., DDT 16: 928-939, 2011
    40. 40. Acknowledgments
        • University of Maryland
            • Lei Diao
            • James E. Polli
        • Pfizer
            • Rishi Gupta
            • Eric Gifford
            • Ted Liston
            • Chris Waller
        • Merck
            • Jim Xu
        • Antony J. Williams (RSC)
        • Accelrys
        • CDD
        • Email: ekinssean@yahoo.com
        • Slideshare: http://www.slideshare.net/ekinssean
        • Twitter: collabchem
        • Blog: http://www.collabchem.com/
        • Website: http://www.collaborations.com/CHEMISTRY.HTM