CDD Experienced Team Innovates and Executes
• Barry Bunin, PhD (Pres. & Cofounder as first Eli Lilly EIR): Libraria (CEO, Pres.-CSO), Arris Pharmaceuticals (Sr. Scientist), Genentech, UC Berkeley (Ellman), Columbia University, author.
• Moses Hohman, PhD (Director Software Engineering): Northwestern Assoc. Director of Bioinformatics, Thoughtworks, Inc., U of Chicago (PhD), Harvard (magna cum laude, Physics)
• Sylvia Ernst, PhD (Director Community Growth & Sales): Left 800-lb Gorillas: Accelrys-Scitegic, MDL-Elsevier-Beilstein
• Peter Cohan (BOD & Overall Sales Strategy): Symyx (VP Bus Dev & President-Discovery Tools), MDL (VP Customer Marketing), www.secondderivative.com, author.
• Omidyar Network, Founders Fund, & Lilly (BOD observers)
• WSGR (Corporate Counsel), Rina Accountancy (GAAP compliance)
• Partners: Hub Consortium Members, ChemAxon, DNDi, MMV, Sandler Center…
• CDD SAB: Christopher Lipinski PhD, James McKerrow, MD PhD, David Roos PhD, Adam Renslo PhD, Wes Van Voorhis, MD PhD
Montreal 8th world congress
Computational Models for Predicting Human Toxicities
Sean Ekins
Collaborations in Chemistry, Fuquay-Varina, NC.
Collaborative Drug Discovery, Burlingame, CA.
School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland.
A LITTLE BACKGROUND: computer-aided drug design 1999 Accelrys UGM 2003
The future: crowdsourced drug discovery
Williams et al., Drug Discovery World, Winter 2009
Hardware is getting smaller
• 1930s: room size
• 1980s: desktop size
• 1990s: laptop
• Then: netbook, phone, watch
Not to scale and not equivalent computing power; illustrates mobility
Models and software are becoming more accessible, and free
Driving change
• Pharma reached a productivity tipping point
• Cost of drug development high
• Failure in clinic due to toxicity
• Initiatives like REACH, ToxCast etc. need to screen many molecules
• Reduce use of animals
• How to predict failure earlier: are we at a turning point?
Examples of Models for Human Toxicities
• Drug induced liver injury (DILI)
• Time dependent inhibition of P450 3A4
• Transporters: hOCTN2
• PXR and ToxCast
• Precompetitive pharma models
Application: Drug induced liver injury (DILI)
• Drug metabolism in the liver can convert some drugs into highly reactive intermediates, which in turn can adversely affect the structure and function of the liver.
• DILI is the number one reason drugs are not approved, and also the reason some were withdrawn from the market after approval.
• The estimated global annual incidence rate of DILI is 13.9-24.0 per 100,000 inhabitants, and DILI accounts for an estimated 3-9% of all adverse drug reactions reported to health authorities.
• Herbal components can cause DILI too.
https://dilin.dcri.duke.edu/for-researchers/info/
Drug Examples for DILI + and -
• Troglitazone: DILI +
• Pioglitazone: DILI -
• Rosiglitazone: DILI -
• Aspirin: DILI -
• Sulindac: DILI +
• Diclofenac: DILI +
Xu et al., Toxicol Sci 105: 97-105 (2008)
Limitations of DILI?
• Compound has to physically have been made and be available for testing.
• The screening system is still relatively low throughput compared with any primary screens.
• Whole compound or vendor libraries cannot be cost effectively screened for prioritization.
• Screening system should be representative of the human organ, including drug metabolism capability.
• Prediction of human therapeutic Cmax is often imprecise before clinical testing in actual patients.
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
DILI Computational Models
• A first study used 74 compounds to build classification models (linear discriminant analysis, artificial neural networks, and machine learning algorithms (OneR)). Internal cross-validation: accuracy 84%, sensitivity 78%, and specificity 90%. Testing on 6 and 13 compounds, respectively: > 80% accuracy (Cruz-Monteagudo et al., J Comput Chem 29: 533-549, 2008).
• A second study used binary QSAR (248 active and 283 inactive) with support vector machine models: external 5-fold cross-validation procedures and 78% accuracy for a set of 18 compounds (Fourches et al., Chem Res Toxicol 23: 171-183, 2010).
• A third study created a knowledge base with structural alerts from 1266 chemicals. The alerts were used to predict results for 626 Pfizer compounds (sensitivity of 46%, specificity of 73%, and concordance of 56% for the latest version) (Greene et al., Chem Res Toxicol 23: 1215-1222, 2010).
DILI data
• Tested a panel of orally administered drugs at multiples of the maximum therapeutic concentration (Cmax), taking into account the first-pass effect of the liver and other idiosyncratic toxicokinetic/toxicodynamic factors.
• The 100-fold Cmax scaling factor represented a reasonable threshold to differentiate safe versus toxic drugs for an orally dosed drug and with regard to hepatotoxicity.
• Concordance of the in vitro human hepatocyte imaging assay technology (HIAT) for 300 drugs and chemicals: ~ 75% with regard to clinical hepatotoxicity, with very few false-positive results.
Xu et al., Toxicol Sci 105: 97-105 (2008)
Bayesian machine learning
• Laplacian-corrected Bayesian classifier models were generated using Discovery Studio (version 2.5.5; Accelrys).
• Training set = 295 compounds, test set = 237 compounds.
• Uses two-dimensional descriptors to distinguish between compounds that are DILI-positive and those that are DILI-negative: ALogP, ECFC_6 (extended connectivity fingerprints), Apol, logD, molecular weight, number of aromatic rings, number of hydrogen bond acceptors, number of hydrogen bond donors, number of rings, number of rotatable bonds, molecular polar surface area, molecular surface area, Wiener and Zagreb indices.
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Features in DILI +
Avoid:
• Long aliphatic chains
• Phenols
• Ketones
• Diols
• α-methyl styrene
• Conjugated structures
• Cyclohexenones
• Amides?
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Features in DILI -
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Results
• Fingerprints with high Bayesian scores that are present in many DILI compounds appeared to be reactive in nature.
• They could cause time-dependent inhibition of cytochromes P450 or be precursors for metabolites that are reactive and may covalently bind to proteins.
• Why are long aliphatic chains important for DILI? Are they generally hydrophobic, perhaps enabling increased accumulation? Might they be hydroxylated and then form other metabolites that are in turn reactive?
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Test set analysis
• Compounds of most interest: well known hepatotoxic drugs (U.S. Food and Drug Administration Guidance for Industry “Drug-Induced Liver Injury: Premarketing Clinical Evaluation,” 2009), plus their less hepatotoxic comparators, if clinically available.
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Training vs test set PCA
[Figure: PCA plot. Yellow = test, blue = training. Labeled point: retinyl palmitate.]
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Compare to newer drugs
• Extracted small molecule drugs from 2006 to 2010 from the Prous Integrity database.
• Structure validation resulted in a set of 77 molecules (mean molecular weight 427.05 ± 280.31, range 94.11–1994.09).
• These molecules were distributed throughout the combined training and test sets (N = 532), indicating overlap.
• These combined analyses suggest that the test and training sets used for the DILI model are representative of current medicinal chemistry efforts.
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Predictions for newly approved EMEA compounds
• Fingolimod (Gilenya) for MS (EMEA and FDA)
• Pirfenidone for idiopathic pulmonary fibrosis
• Roflumilast for pulmonary disease
• Paliperidone for schizophrenia

Name           DILI Bayesian ECFC6 score   Prediction   Closest similarity
fingolimod     0.422051                    TRUE         0.4
paliperidone   8.79189                     TRUE         0.865385
pirfenidone    0.542769                    TRUE         0.322581
roflumilast    3.17631                     TRUE         0.326923

Can we get DILI data for these?
Conclusions
• First large-scale testing of a DILI machine learning model
• Concordance lower than with the in vitro model
• Statistics similar to structural alerts from the Pfizer paper
• Could use models to filter compounds for further testing in vitro
• Use published knowledge to predict DILI
• Combinations of models
• Combine datasets: create models with open descriptors and algorithms
• Make models widely available
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Integrated in Silico-in Vitro Strategy for Addressing Cytochrome P450 3A4 Time-Dependent Inhibition
• Pfizer generated a large dataset (~2000 compounds) and went through sequential Bayesian model generation and testing cycles
• Test set 2: 20 active in 156 compounds
• Combined both model predictions
Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
Important substructures for CYP3A4 time dependent inhibition
• The indazole ring, the pyrazole, and the methoxy-aminopyridine rings are important for TDI
• Approach decreased in vitro screening 30%
• Helps identify reactive metabolite forming compounds
Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
Pharmacophores applied broadly
Ideal when we have few molecules for training
In silico database searching
Accelrys Catalyst in Discovery Studio
Geometric arrangement of functional groups necessary for a biological response
• Generate 3D conformations
• Align molecules
• Select features contributing to activity
• Regress hypothesis
• Evaluate with new molecules
• Excluded volumes: relate to inactive molecules
Created for: CYP2B6, CYP2C9, CYP2D6, CYP3A4, CYP3A5, CYP3A7, hERG, P-gp, OATPs, OCT1, OCT2, BCRP, hOCTN2, ASBT, hPEPT1, hPEPT2, FXR, LXR, CAR, PXR etc.
hOCTN2: Organic Cation Transporter
• High affinity cation/carnitine transporter, expressed in kidney, skeletal muscle, heart, placenta and small intestine
• Inhibition correlates with muscle weakness (rhabdomyolysis)
• A common features pharmacophore was developed with 7 inhibitors
• Searched a database of over 600 FDA approved drugs and selected drugs for in vitro testing
• Of 33 tested drugs predicted to map to the pharmacophore, 27 inhibited hOCTN2 in vitro
• Compounds were more likely to cause rhabdomyolysis if the Cmax/Ki ratio was higher than 0.0025
Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
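The Cmax/Ki cut-off above can be applied as a simple flag. A minimal sketch in Python, assuming Cmax and Ki are supplied in the same concentration units; the function names and the two example drugs are hypothetical and their numbers are made up for illustration.

```python
THRESHOLD = 0.0025  # Cmax/Ki cut-off quoted on the slide


def cmax_ki_ratio(cmax: float, ki: float) -> float:
    """Return Cmax/Ki; both values must use the same concentration units."""
    if ki <= 0:
        raise ValueError("Ki must be positive")
    return cmax / ki


def exceeds_threshold(cmax: float, ki: float, threshold: float = THRESHOLD) -> bool:
    """True when the ratio is above the cut-off."""
    return cmax_ki_ratio(cmax, ki) > threshold


# Hypothetical drugs: (Cmax, Ki), both in micromolar, invented values.
examples = {"drug_A": (0.5, 50.0), "drug_B": (0.01, 200.0)}
flags = {name: exceeds_threshold(c, k) for name, (c, k) in examples.items()}
print(flags)
```

A flag of this kind only ranks compounds for follow-up; it does not account for tissue distribution or plasma protein binding.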
Possible Association between Clinical Rhabdomyolysis and hOCTN2 Inhibition Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
hOCTN2 quantitative pharmacophore and Bayesian model
[Figure: pharmacophore with vinblastine, cetirizine and emetine; labels: +ve, -ve; r = 0.89]
Diao et al., Mol Pharm, 7: 2120-2131, 2010
hOCTN2 quantitative pharmacophore and Bayesian model
• Bayesian model, leaving 50% out 97 times: external ROC 0.90, internal ROC 0.79, concordance 73.4%, specificity 88.2%, sensitivity 64.2%.
• Lab test set (N = 27): the Bayesian model has better correct predictions (> 80%) and lower false positives and negatives than the pharmacophore (> 70%).
• Predictions for the literature test set (N = 32) were not as good as in house; mean max Tanimoto similarities were ~ 0.6.
• PCA used to assess training and test set overlap.
Diao et al., Mol Pharm, 7: 2120-2131, 2010
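To make the "mean max Tanimoto similarity" statistic concrete, here is a minimal sketch in Python. It assumes each molecule is represented as a set of fingerprint feature identifiers; the toy fingerprints are invented, and the original work used its own fingerprints and software.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mean_max_tanimoto(test_fps, train_fps) -> float:
    """For each test molecule take its closest training molecule, then average."""
    best = [max(tanimoto(t, r) for r in train_fps) for t in test_fps]
    return sum(best) / len(best)


# Toy fingerprints (hypothetical feature ids).
train = [{1, 2, 3, 4}, {2, 3, 5}, {7, 8, 9}]
test = [{1, 2, 3}, {7, 8, 10, 11}]
score = mean_max_tanimoto(test, train)
print(round(score, 3))
```

A low value suggests the test molecules sit far from the training set, which is one plausible reason for weaker predictions on a literature test set.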
hOCTN2 association with rhabdomyolysis
• Among the 21 drugs associated with rhabdomyolysis or carnitine deficiency, 14 (66.7%) provided a Cmax/Ki ratio higher than 0.0025.
• Among 25 drugs that were not associated with rhabdomyolysis or carnitine deficiency, only 9 (36.0%) showed a Cmax/Ki ratio higher than 0.0025.
• Rhabdomyolysis or carnitine deficiency was associated with a Cmax/Ki value above 0.0025 (Pearson’s chi-square test p = 0.0382).
• Limitations of Cmax/Ki serving as a predictor for rhabdomyolysis: Cmax/Ki does not consider the effects of drug tissue distribution or plasma protein binding.
Diao et al., Mol Pharm, 7: 2120-2131, 2010
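The counts above form a 2x2 table (14 of 21 versus 9 of 25 above the cut-off). A minimal sketch of Pearson's chi-square test without continuity correction, written in plain Python; treating the slide's test as uncorrected is an assumption, though it gives a p-value of about 0.038, in line with the reported figure.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: associated / not associated with rhabdomyolysis or carnitine deficiency.
# Columns: Cmax/Ki above / not above 0.0025.
chi2, p = pearson_chi2_2x2(14, 7, 9, 16)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```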
hOCTN2 Substrates

Substrate                             Km (µM)
L-carnitine                           5.3
Acetyl-L-carnitine                    9
Mildronate                            26
Ipratropium                           53
Valproyl-L-carnitine                  132 ± 23
Naproxen-L-carnitine                  257 ± 57
Ketoprofen-L-carnitine                77.0 ± 4.0
Ketoprofen-glycine-L-carnitine        58.5 ± 8.7
Valproyl-glycolic acid-L-carnitine    161 ± 50

Data from Polli lab (conjugates) and literature
Ekins et al., submitted 2011
hOCTN2 Substrate + Inhibitor Pharmacophores
• Inhibitor: Hypogen pharmacophore
• Substrate: common feature pharmacophore (used CAESAR and excluded volumes)
• Overlap of pharmacophores: RMSD 0.27 Angstroms
• Substrate pharmacophore mapped 6 out of 7 substrates in a test set.
• After searching ~800 known drugs, 30 were predicted to map to the substrate pharmacophore with L-carnitine shape restriction.
• 16 had case reports documenting an association with rhabdomyolysis
Growing role for PXR agonists
• Interaction between hyperforin in St John's Wort and irinotecan reduces efficacy
• Ablating the inflammatory response mediated by exogenous toxins, e.g. inflammatory diseases of the bowel
• Cholesterol metabolism pathway control: a negative effect
• Mediating blood-brain barrier efflux of drugs: modulation of efflux transporters, e.g. mdr1 and mrp2, decreases retention of CNS drugs (e.g. anti-epileptics and pain killers), decreasing efficacy
• PXR induces cell growth and is pro-carcinogenic
ToxCast: docking chemicals in human PXR
• 10 groups had contracts with EPA to test ~300 conazoles & pesticides, etc. with various biological assays (cell based, receptor etc.)
• We have docked all the molecules into the PXR agonist site of 5 structures
• GOLD (ver 4): genetic algorithm explores conformations of ligands and flexible receptor side
• 20 independent docking runs
• Used the regular goldscore to classify compounds
• Comparing their respective scores to the corresponding goldscores of the co-crystallized ligands
• Majority vote across the five structures
Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
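The classification step (compare each score with the co-crystallized ligand's score, then take a majority vote across structures) can be sketched as follows. This is a minimal Python illustration, not the published workflow: the ">=" comparison rule, the function name and all scores are assumptions, and no docking is performed here.

```python
def classify_by_docking(scores, reference_scores) -> bool:
    """Majority vote across receptor structures.

    scores: docking score of the compound in each structure.
    reference_scores: score of the co-crystallized ligand in the same structures.
    A structure votes "activator" when the compound scores at least as well
    as the reference ligand (assumed rule).
    """
    if len(scores) != len(reference_scores):
        raise ValueError("one score per structure is required")
    votes = sum(s >= ref for s, ref in zip(scores, reference_scores))
    return votes > len(scores) / 2


# Hypothetical scores for five structures.
reference = [60.0, 55.0, 62.0, 58.0, 65.0]
compound_x = [61.0, 57.0, 59.0, 60.0, 50.0]  # wins in 3 of 5 structures
compound_y = [40.0, 56.0, 45.0, 50.0, 48.0]  # wins in 1 of 5 structures
print(classify_by_docking(compound_x, reference), classify_by_docking(compound_y, reference))
```

Using an odd number of structures avoids tied votes.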
ToxCast: docking pesticides in PXR
• Activities of most activators more potent vs NCGC data
• We correctly predict ~70% of compounds and 75% of activators
• Including other predicted pesticides from Lemaire G et al., Toxicol Sci 91: 501-9 (2006)
• When compared to NCGC data for the complete ToxCast set: sensitivity 74%
Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
ToxCast (blue) vs Steroidal (yellow) compounds
• Different areas in PCA using simple descriptors
• ToxCast requires a model built with similar molecules
• General PXR models may be limited in predicting ToxCast data
• Phase II of ToxCast: further testing of models
Kortagere et al., Env Health Perspect, 118: 1412-1417, 2010
How Could Green Chemistry Benefit From These Models?
… Nature, 469: 6 Jan 2011
Chem Rev. 2010 Oct 13;110(10):5845-82
Increasing Data & Model Access
Could all pharmas share their data as models with each other?
Open source tools for modeling
Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
Open source tools for modeling
• Open source descriptors (CDK) and C5.0 algorithm
• ~60,000 molecules with P-gp efflux data from Pfizer
• MDR < 2.5 (low risk) (N = 14,175); MDR > 2.5 (high risk) (N = 10,820)
• Test set: MDR < 2.5 (N = 10,441); MDR > 2.5 (N = 7972)

               CDK + fragment descriptors   MOE 2D + fragment descriptors
Kappa          0.65                         0.67
Sensitivity    0.86                         0.86
Specificity    0.78                         0.8
PPV            0.84                         0.84
Cost           $                            $$$$$$

Could facilitate model sharing?
Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
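For readers less familiar with the statistics in the comparison above, the sketch below shows how kappa, sensitivity, specificity and PPV follow from a binary confusion matrix. It is plain Python with invented counts; the slide does not give the underlying confusion matrices, so these numbers do not reproduce the reported values.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification statistics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n  # observed agreement (accuracy)
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=80, fp=20, tn=70, fn=30)
print({k: round(v, 3) for k, v in m.items()})
```

Kappa corrects raw agreement for chance, which matters when the two classes are unevenly sized.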
…Near Future
• Better & wider applicability domain models available
• Wider use of models
• Selective sharing of models
• Computational ADME/Tox apps?
Williams et al., DDT in press
Bunin & Ekins, DDT in press
Acknowledgments
• University of Maryland: Lei Diao, James E. Polli
• Pfizer: Rishi Gupta, Eric Gifford, Ted Liston, Chris Waller
• Merck: Jim Xu
• Antony J. Williams (RSC)
• Matthew D. Krasowski, Erica J. Reschly (University of Iowa)
• Sandhya Kortagere (Drexel University)
• Sridhar Mani (Albert Einstein)
• Accelrys
• CDD
• Email: firstname.lastname@example.org
• Slideshare: http://www.slideshare.net/ekinssean
• Twitter: collabchem
• Blog: http://www.collabchem.com/
• Website: http://www.collaborations.com/CHEMISTRY.HTM
Bayesian machine learning
Bayesian classification is a simple probabilistic classification model. It is based on Bayes’ theorem:
p(h|d) = p(d|h) p(h) / p(d)
• h is the hypothesis or model
• d is the observed data
• p(h) is the prior belief (probability of hypothesis h before observing any data)
• p(d) is the data evidence (marginal probability of the data)
• p(d|h) is the likelihood (probability of data d if hypothesis h is true)
• p(h|d) is the posterior probability (probability of hypothesis h being true given the observed data d)
A weight is calculated for each feature using a Laplacian-adjusted probability estimate to account for the different sampling frequencies of different features. The weights are summed to provide a probability estimate.
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
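A minimal sketch of how such per-feature weights might be computed, in Python. The slide does not give the formula; the estimator used here, log((A_f + 1) / (N_f * P + 1)) with A_f active samples containing feature f, N_f samples containing f, and P the overall active fraction, is one commonly described Laplacian-corrected form and is an assumption, as are the function names and the toy data. The summed weights behave as a relative score rather than a calibrated probability.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """samples: list of (feature_set, is_active). Returns {feature: weight}."""
    n_total = len(samples)
    n_active = sum(1 for _, active in samples if active)
    base_rate = n_active / n_total  # overall fraction of actives, P
    seen = Counter()         # N_f: samples containing feature f
    seen_active = Counter()  # A_f: active samples containing feature f
    for features, active in samples:
        for f in features:
            seen[f] += 1
            if active:
                seen_active[f] += 1
    # Laplacian-corrected weight (assumed form): rarely seen features shrink toward 0.
    return {
        f: math.log((seen_active[f] + 1) / (seen[f] * base_rate + 1))
        for f in seen
    }


def bayes_score(features, weights) -> float:
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)


# Toy training data with hypothetical feature labels.
training = [
    ({"phenol", "long_chain"}, True),
    ({"phenol"}, True),
    ({"amide"}, False),
    ({"amide", "ring"}, False),
]
weights = train_laplacian_bayes(training)
print(bayes_score({"phenol", "long_chain"}, weights), bayes_score({"amide"}, weights))
```

Positive scores indicate features enriched among actives in the training data, negative scores the opposite.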
Examples of using Bayesian Models
• Integrated in Silico-in Vitro Strategy for Addressing Cytochrome P450 3A4 Time-Dependent Inhibition. Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
• Challenges predicting ligand-receptor interactions of promiscuous proteins: the nuclear receptor PXR. Ekins S, Kortagere S, Iyer M, Reschly EJ, Lill MA, Redinbo MR and Krasowski MD, PLoS Comput Biol 5(12): e1000594 (2009)
• Computational models for drug inhibition of the human apical sodium-dependent bile acid transporter. Zheng X, et al., Mol Pharm, 6: 1591-1603 (2009)
• Quantitative structure activity relationship for inhibition of human organic cation/carnitine transporter. Diao et al., Mol Pharm, 7: 2120-2131 (2010)