
ChEMBL US tour December 2014

Slides from a tour of a number of US labs in December 2014

Published in: Science

ChEMBL US tour December 2014

  1. 1. ChEMBL: Resources for Drug Discovery John P. Overington @johnpoverington jpo@ebi.ac.uk
  2. 2. EMBL-EBI
  3. 3. ChEMBL Strategy • Comprehensively catalogue historical drug discovery • Include successes and failures • Large-scale abstraction and curation of primary literature • Direct depositions • Drugs can be small molecules, peptides, recombinant proteins, siRNA, cells, viruses, etc. • ‘Learn’ rules for drug discovery ‘success’ • Target selection and prioritisation - druggability • Lead discovery, optimisation, clinical candidate selection • Develop approaches to new target classes – e.g. PPIs • Make all data freely available to entire community • Encourage re-use, integration and cross-linking
  4. 4. Target Discovery Lead Discovery Lead Optimisation Preclinical Development Phase 1 Phase 2 Phase 3 Launch (Phase 4) Drug Discovery >1,638,000 compound records >12,800,000 bioactivities ~57,150 abstracted papers ~10,579 targets ~12,000 clinical candidates ~1,600 drugs • Target identification • Microarray profiling • Target validation • Assay development • Biochemistry • Clinical/Animal disease models • High-throughput Screening (HTS) • Fragment-based screening • Focused libraries • Screening collection • Medicinal Chemistry • Structure-based drug design • Selectivity screens • ADMET screens • Cellular/Animal disease models • Pharmacokinetics • Toxicology • In vivo safety pharmacology • Formulation • Dose prediction PK tolerability Efficacy Safety & Efficacy Indication discovery, repurposing & expansion Med. Chem. SAR Clinical Candidates Drugs Discovery Development Use ChEMBL content ChEMBL19 content
  5. 5. Prototype Drug Optimisation: 1st generation, 2nd generation, 3rd generation, 4th generation. [Chemical structures omitted] Azomycin (1956), Streptomyces natural product, trichomonacidal, ‘toxic’; Metronidazole 1962; Tinidazole 1970; Clotrimazole 1970; Miconazole 1970; Econazole 1972; Ketoconazole 1978; Sulconazole 1980; Terconazole 1980; Bifonazole 1981; Itraconazole 1984; Fluconazole 1988; Voriconazole 2002; Fosfluconazole 2004; Posaconazole 2005. Scaffold labels: Imidazole, triazole. After W. Sneader
  6. 6. Overview of EMBL-EBI Chemistry Resources RDF and REST API interfaces 40K 1.5M >15M 15K 750 UniChem – InChI-based resolver (full + relaxed ‘lenses’) 3rd Party Data ZINC, PubChem, ThomsonPharma DOTF, IUPHAR, DrugBank, KEGG, NIH NCC, eMolecules, FDA SRS, PharmGKB, Selleck, …. ChEMBL Bioactivity data from literature and depositions ChEBI Structures and metadata for metabolites. Chemical Ontology Atlas Ligand-induced transcript response PDBe Ligand structures from structurally defined protein complexes SureChEMBL Ligand structures from patent literature REST API Interface ~75M
  7. 7. ChEMBL
  8. 8. What Is the ChEMBL Data?
  9. 9. SAR Data Compound Assay Ki=4.5 nM >Thrombin MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCS YEEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLW RSRYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPR SEGSSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRN PDGDEEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFG SGEADCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLI SDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRD IALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNL PIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGC DRDGKYGFYTHVFRLKKWIQKVIDQFGE ED2=230 nM What Is the ChEMBL Data? Inhibition of human Thrombin PTT (partial thromboplastin time)
  10. 10. ChEMBL Target Types Protein Protein complex Protein family Nucleic Acid e.g. Nicotinic acetylcholine receptor e.g. Muscarinic receptors e.g. DNA e.g. HEK293 cells e.g. Trachea e.g. Drosophila e.g. PDE5 Cell line Tissue Sub-cellular fraction Organism
  11. 11. ChEMBL https://www.ebi.ac.uk/chembl • The world’s largest primary public database of medicinal chemistry data – ~1.6 million compounds, ~10,000 targets, ~12 million bioactivities • Truly Open Data - CC-BY-SA license • ChEMBL data also loaded into BindingDB, PubChem BioAssay and BARD A. Gaulton et al (2012) Nucleic Acids Research Database Issue. 40 D1100-1107
  12. 12. SureChEMBL https://www.surechembl.org • New public chemistry patent resource • ‘Acquired’ SureChem product from Digital Science – Automatically extracted chemical structures from full-text patents – ~15 million chemical structures – Updated daily – Plan to add molecular target, sequence, disease, animal model, cell-line indexing….
  13. 13. https://www.ebi.ac.uk/chembl
  14. 14. About ChEMBL
  15. 15. Compound View - 1
  16. 16. Compound View - 2
  17. 17. Compound View - 3
  18. 18. Compound View - 4
  19. 19. Target Search
  20. 20. Browse Targets
  21. 21. Browse Targets - Organism
  22. 22. Browse Drugs
  23. 23. Drugs
  24. 24. Targets of Launched Drugs Overington et al, Nat. Rev. Drug Disc., 5, pp. 993-996 (2006)
  25. 25. Drug Targets and Drugs Santos et al, unpublished
  26. 26. Different Types of Drugs Santos et al, unpublished Synthetic small molecule Natural product-derived small molecule Monoclonal antibody Other protein Polymer Peptide Oligonucleotide Oligosaccharide Inorganic Other Other Drugs Approved 2013 Assigned USANs 2013
  27. 27. Affinity of Drugs for their ‘Targets’: Ki, Kd, IC50, EC50, & pA2 endpoints for drugs against their ‘efficacy targets’. [Histogram: frequency (0 to 400) against -log10 affinity (2 to 12, i.e. 10 mM to 1 pM)] Overington et al, Nature Rev. Drug Discov. 5 pp. 993-996 (2006); Gleeson et al, Nature Rev. Drug Discov. 10 pp. 197-208 (2011)
  28. 28. Privileged Target Families Rhodopsin-like GPCR PDBe: 3sn6 Ion channels PDBe: 4kfm Nuclear receptors PDBe: 3e00 Protein kinases PDBe: 4foc 22% of drug targets 33% of small mol drugs 12% of drug targets 18% of small mol drugs 6% of drug targets 17% of small mol drugs 13% of drug targets 2.4% of small mol drugs Over 53% of all targets and 70% of drugs modulate these four target classes
  29. 29. Santos, unpublished Privileged Target Families ChEMBL17 Drugs
  30. 30. NFκB Pathway
  31. 31. FDA Approved Drugs
  32. 32. Clinical Candidates
  33. 33. Clinical Candidates
  34. 34. Clinical Candidates • Database of clinical development candidates – Contains ~12,000 2-D structures/sequences • Estimated size ~35-45,000 compounds – Work in progress • Deeper coverage of key gene families • e.g. Protein kinases, 399 distinct clinical candidates
  35. 35. Pharma Industry Productivity: File Registration number vs. USAN date. [Plot: registration number (0 to 800,000) against year (1960 to 2010); series: Phase 2b date, ~Discovery date] Overington, unpublished
  36. 36. Clinical Kinome Overington, Al-Lazikani & Wennerberg, unpublished
  37. 37. Kinase Inhibitors in Clinical Development Overington, Bellis, Al-Lazikani & Wennerberg, unpublished
  38. 38. Clinical Kinome • 399 Clinical stage human small molecule protein kinase inhibitors – 29 Approved small molecule kinase inhibitors – 38 Phase 3 – 143 Phase 2 – 189 Phase 1 • Phase 1:2 ratio is atypical due to many kinase inhibitor trials being phase 1/2 oncology trials • 2D structures for 311 of these
  39. 39. Kinase Inhibitor Polypharmacology Staurosporine (no trials) US launched Sunitinib Sorafenib Imatinib Dasatinib Tofacitinib Tozasertib (Ph. II) Erlotinib Gefitinib Lapatinib Adapted from Ghoreschi et al, Nature Immunology 10, 356-360 (2009)
  40. 40. GSK PKIS Data
  41. 41. ChEMBL – Assay Reliability
  42. 42. F.A. Krüger & J.P. Overington (2012) ‘Global analysis of small molecule binding to related protein targets’ PLoS Comp. Biol. 8, e1002333
  43. 43. Differences Between Human And Rat Orthologs: Distribution of affinity differences, Human vs Rat. [Plots: pKd Human against pKd Rat (-log(Kd)); density of |human pKd - rat pKd|]
  44. 44. Differences Between Different Assays: Distribution of inter-assay affinity differences. Binding affinity in human and rat assays. [Plots: pKd Assay1 against pKd Assay2; density of |human pKd - human pKd|]
  45. 45. Ortholog vs Intra-assay Differences: Density distributions of ortholog and inter-assay differences. [Plot: density of pKii - pKij] Krüger, PLoS Comp. Biol. 8, e1002333, DOI:10.1371/journal.pcbi.1002333
  46. 46. ChEMBL – Domain Annotation
  47. 47. Domain-level Annotation • Site of binding is important in understanding and controlling function • often several sites within same target protein • Recently annotated binding sites (where possible) for entire ChEMBL target dictionary • used Pfam domains http://www.pfam.org
  48. 48. Domain ‘poisoning’ of sequence queries: Kinase SYK (Q64725), R. norvegicus; Phosphatase SH-PTP2 (P35235), R. norvegicus. Krüger BMC Bioinformatics, 13, S11 DOI:10.1186/1471-2105-13-S17-S11
  49. 49. Domain-level Binding Sites: Depleted and Enriched Pfam Domains. Depleted: Neur_chan_memb -1.63; zf-C4 -0.94; ANF_receptor -0.88; SH2 -0.83; Pkinase_C -0.70; fn3 -0.53; SH3_1 -0.51; Lig_chan -0.50; C2 -0.50; C1_1 -0.50; Guanylate_cyc -0.46; HATPase_c -0.46; I-set -0.44; adh_short -0.39; PH -0.39; Ank -0.39; ….. Enriched: Metallophos 0.35; Phospholip_A2_1 0.38; Peptidase_M10 0.41; Asp 0.45; SNF 0.48; Hist_deacetyl 0.48; Carb_anhydrase 0.50; Peptidase_C1 0.51; Trypsin 0.51; Beta-lactamase 0.57; p450 1.00; Hormone_recep 1.19; Ion_trans 1.66; Neur_chan_LBD 2.02; Pkinase_Tyr 2.12; Pkinase 5.87; 7tm_1 7.30. Krueger and Overington, unpublished
  50. 50. Binding Between Multiple Domains: Identified only 12 multi-domain architectures (corresponding to 120 ChEMBL targets) with ligand binding mediated via more than one domain. PDBe: 3goi Krüger BMC Bioinformatics, 13, S11 DOI:10.1186/1471-2105-13-S17-S11
  51. 51. https://www.ebi.ac.uk/chembl/research/ppdms
  52. 52. Better prediction of pathway perturbation Overington, unpublished
  53. 53. Domain specific modulation – mTor HEAT repeat FAT FRB kinase RD FATC mTORC1 DEPTOR MLST8 mSIN1 Raptor Rheb FKBP-38 mSLT8 Sirolimus (rapamycin) PI-103 r Gable Rictor Tel2 FBXW7 DEPTOR FKBP-12 S6K1 Overington, unpublished PRAS40 mTORC2 FKBP-12 binding mTORC binding Immunosuppression, Cancer Cancer
  54. 54. Acknowledgements ChEMBL Database Anne Hersey Anna Gaulton Mark Davies Michal Nowotka George Papadatos Jon Chambers Louisa Bellis Rita Santos Gerard Van Westen Ruth Akhtar Francis Atkinson Patricia Bento Ramesh Donadi John Paul Overington Institute of Cancer Research Bissan Al-Lazikani Paul Workman FIMM, Helsinki Krister Wennerberg University of Dundee Andrew Hopkins http://chembl.blogspot.com
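
Slides 9 and 27 report affinities both as concentrations (e.g. Ki=4.5 nM) and on a -log10 scale. The conversion between the two can be sketched as below. This is a minimal illustration, not code from the slides: the function name `p_affinity` is hypothetical, and it assumes the input value is given in nM.

```python
import math


def p_affinity(value_nm: float) -> float:
    """Return -log10 of an affinity (Ki, Kd, IC50, ...) given in nM.

    Hypothetical helper for illustration; the value is first converted
    from nM to molar units, then the negative decimal logarithm is taken.
    """
    if value_nm <= 0:
        raise ValueError("affinity must be a positive concentration")
    return -math.log10(value_nm * 1e-9)


# The thrombin example from slide 9: Ki = 4.5 nM
thrombin_pki = p_affinity(4.5)
print(f"pKi = {thrombin_pki:.2f}")
```

On this scale a value of 9 corresponds to 1 nM and a value of 6 to 1 µM, which matches the axis range quoted on slide 27.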

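Slides 43 to 45 compare two distributions: the absolute pKd difference between human and rat orthologs, and the absolute pKd difference between two assays on the same target. A rough sketch of that comparison follows. The pKd pairs are invented purely for illustration, and `abs_differences` is a hypothetical helper, not part of the published analysis.

```python
from statistics import median


def abs_differences(pairs):
    """Absolute difference for each (pKd_a, pKd_b) pair of measurements."""
    return [abs(a - b) for a, b in pairs]


# Invented example values, NOT data from ChEMBL or from the slides.
human_vs_rat = [(7.2, 7.0), (8.1, 8.4), (6.5, 6.1)]   # (human pKd, rat pKd)
assay1_vs_assay2 = [(7.2, 7.5), (8.1, 7.9), (6.5, 6.6)]  # same target, two assays

ortholog_diffs = abs_differences(human_vs_rat)
inter_assay_diffs = abs_differences(assay1_vs_assay2)

# Summarise each distribution by its median for a quick side-by-side view.
print("median ortholog difference:", round(median(ortholog_diffs), 2))
print("median inter-assay difference:", round(median(inter_assay_diffs), 2))
```

With real data, the two lists would be plotted as density curves, as on slide 45, rather than reduced to a single median.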