Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Accelerating multiple medicinal chemistry projects using Artificial Intelligence (AI)

208 views

Published on

The technical methods and results of Matched Molecular Pair Analysis (MMPA) applied from a small, individual assay scale through large pharma scale, to multiple pharma data sharing scale have been published and reviewed. The drive behind these efforts has been to derive a medicinal chemistry knowledge base (i.e. definitive textbook) that can be applied to drug discovery projects. The aim is to greatly decrease the time in lead identification and optimization by the synthesis of fewer compounds. Such a system suggests compound designs to expert chemists to triage; such a process is Artificial Intelligence (AI). Given this context, how does this work on projects? How do the chemists make decisions? What are the results? The talk will answer these questions through project examples where MMPA has been applied and how this led to drug candidates. The projects disclosed are from multiple organisations and describe Cathepsin K inhibitors, Glucokinase Inhibitors, 11β-Hydroxysteroid Dehydrogenase Type I Inhibitors (11β-HSD1), Ghrelin inverse antagonists and Tubulin Polymerization inhibitors. An overview of MMPA will be presented and each project will be briefly described with a focus on how the chemists used MMPA to understand SAR and design compounds. The impact of project progress to CD will be quantified.

Published in: Data & Analytics
  • Be the first to comment

Accelerating multiple medicinal chemistry projects using Artificial Intelligence (AI)

  1. 1. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 February 2019 Not for Circulation Accelerating multiple medicinal chemistry projects using Artificial Intelligence (AI) A review from the past 8 years of real world examples
  2. 2. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation • Founded in 2012 by experienced large Pharma medicinal/computational chemists to accelerate drug hunting by exploiting data driven knowledge • Domain leaders in SAR knowledge extraction and knowledge based design • > 10 years experience of building AI systems that suggest actions to chemists (6 years as MedChemica) • Creators of largest ever documented database of medicinal chemistry ADMET knowledge MedChemica Publications
  3. 3. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation …7 Years of working with pharma companies “Our median number of compounds per LO project is 3000 - this is unsustainable… [it should be] 300” – Director of Chemistry (large pharma) “Can we define the text book of medincal chemistry?” – Director of Comp Chem (large pharma) “We are aiming at 300 compound per project – currently we are about 400, we will get better” – ExScienta scientist at SCI ‘What can BigData do for chemistry’ – London Oct 2017 MedChemica is a company using knowledge extraction techniques to build “expert systems” to suggest actions to chemists [Artificial Intelligence – AI] and reduce the time and cost to critical compounds and candidate drugs.
  4. 4. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 AI in Medicinal Chemistry? • Artificial Intelligence (AI) provides (or performs) actions, either fully automated or with additional human experience and final decision making. • The quality of an AI model depends on the number of times the machine can learn from success and failure. – Alpha-Go • fast to learn because a computer can play another computer • Clear success and failure – Drug Discovery ! • Fully documented discovery projects (all cmpds/all measurements) in short supply • DTMA too long for many iterations? • Unclear success / failure in early research “Can we accelerate medicinal chemistry by augmenting the chemist with BigData and Artificial Intelligence?” Griffen E.J. et al Drug Disco. Today, 2018, 23, 7, 1373-1384.
  5. 5. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation Griffen, E. et al. J. Med. Chem. 2011, 54(22), pp.7739-7750. Fully Automated Matched Molecular Pair Analysis (MMPA) What is this form of Artificial Intelligence? Δ Data A-B1 2 2 3 3 3 4 4 4 12 23 3 34 4 4A B • Matched Molecular Pairs – Molecules that differ only by a particular, well-defined structural transformation • Capture the change and environment – MMPs can be recorded as transformations from Aà B • Statistical analysis to define “medicinal chemistry rules” Defined transformations with high probability of improving properties of molecules • Store in a high performance database and provide an intuitive user interface “..it’s like asking 150 of your peers for ideas in just a few seconds” - Principal Scientist (large pharma)
  6. 6. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Identify and group matching SMIRKS Calculate statistical parameters for each unique SMIRKS (n, median, sd, se, n_up/n_down) Is n ≥ 6? Not enough data: ignore transformation Is the |median| ≤ 0.05 and the intercentile range (10-90%) ≤ 0.3? Perform two-tailed binomial test on the transformation to determine the significance of the up/ down frequency transformation is classified as ‘neutral’ Transformation classified as ‘NED’ (No Effect Determined) Transformation classified as ‘increase’ or ‘decrease’ depending on which direction the property is changing pass fail yes no yes no Rule selection 0 +ve-ve Median data difference Neutral IncreaseDecrease NED • No assumption of normal distribution • Manages ‘censored’ = qualified / out-of-range data
  7. 7. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 MMPA Enables knowledge sharing MMPA MMPA MMPA Combine and Extract Rules Multiple Pharma ADMET data >437000 rules Better Project decisions Increased Medicinal Chemistry learning Kramer, Robb, Ting, Zheng, Griffen, et al. J. Med. Chem. 2018, 61(8), 3277-3292 http://pubs.acs.org/doi/10.1021/acs.jmedchem.7b00935 Our MMPA technology enabled knowledge sharing between multiple organisations (AstraZeneca, Hoffman La Roche and Genentech)
  8. 8. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Cleaned Measured Data MMPA Exploitable Knowledge Expert Enumerator System Problem molecule Pharmacophores& toxophores Potency prediction Evaluate New Ideas Virtual screening Library design MCPairs Enterprise Biological / data warehouse SynthesisTest Expert Design Decisions “The rise of the intelligent machines in drug hunting?” Griffen, E.J. Future Medicinal Chemistry 2009, 1, 405-408.
  9. 9. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019February 2019 Not for Circulation Project examples using MMPA Database of encoded knowledge suggests “what to make next” to augment chemists thinking and deliver stronger candidate drugs in fewer iterations
  10. 10. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project Example 1 – GPR 119 anatagonists Changing view of SO2CH3 on solubility Leach et al 2006 Leach et al 2012 ArHgArSO2CH3 h solubility in line with i logP SO2CH3 was found to contribute to tight crystal packing via small molecule single crystal x-ray structure determination 5 years of combined data and the application of MMPA ArHgArSO2CH3 i solubility with i logP Initial dataset was too small(28 pairs) and unrepresentative and did not include environment GPR119 - Project pEC50 7.2 (83%); Sol 0.03 µM pEC50 7.6 (121%); Sol 1.0 µM; hERG 7 µM pEC50 8.1 (84%); Sol 1.8 µM; hERG >33 µM Scott, J.; Leach, A.G.; et al Med. Chem. Commun., 2013, 4, 95- 100 Scott, J.; Leach, A.G.; et al J.Med. Chem. 2012, 55, 5361-5379. Oxadiazoles in medicinal chemistry. Boström, J. et al. (2012) J. Med. Chem. 55, 1817–1830
  11. 11. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 - Fix hERG problem whilst maintaining potency Waring et al, Med. Chem. Commun., 2011, 2, 775 Project example 2 - Glucokinase Activators MMPA ∆pEC50: -0.1 ∆logD: -0.6 ∆hERG pIC50 :-0.5 n=33 n=32 n=22 MMPA ∆pEC50: +0.3 ∆logD: +0.3 ∆hERG pIC50 :-0.3 n=20 n=23 n=19 MMPA ∆pEC50: -0.1 ∆logD: -0.6 ∆hERG pIC50 :-0.5 n=27 n=27 n=7
  12. 12. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 3 - Cathepsin K Inhibitors for OE pIC50 7.95 LogD 0.67 HLM <2.0 Solubility 280µM DTM ~1.0 mg/kg UID Potent Too polar / Renal Cl PDB - 97% of structures Crawford, J.J.; Dossetter, A.G J Med Chem. 2012, 55, 8827. Dossetter, A. G. Bioorg. Med. Chem. 2010, 4405 pIC50 8.2 LogD 2.8 HLM <1.0 Solubility >1400µM DTM 0.01 mg/kg UID High F% / stability maximised Increase in LogP, Properties improved Solubility DpIC50 - 0.1 DLogD +1.4 DpSol +1.2 DHLM + 0.25 No renal Cl low F% DpIC50 +0.1 DLogD - 0.7 DpSol ~0.0 DHLM - 0.25 High F% rat/DogElectrostatic potential minima between oxygens Approx like N from 5-het, new compound can not form a quinoline Incr. selectivity DpIC50 +0.1 DLogD - 0.7 DpSol ~0.0 DHLM - 0.25 High F% rat/Dog
  13. 13. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 4 – Ghrelin Inverse agonists – CNS target – Novel more efficient core required, improve hERG for CD – CNS penetration, good potency and deliver tool for in vivo testing McCoull, W.M.; Dossetter A.G.; et al, Med. Chem. Commun., (2013), 4, 456 DpIC50 -0.4 DlogD -1.8 DhERG pIC50 +0.4 MMPA Cores pIC50 9.9 logD 5.0 hERG pIC5 5.0 LLE 4.9 very potent very lipophilic DpIC50 +0.9 DlogD +0.2 DhERG pIC50 -0.3 pIC50 8.2 logD 1.3 hERG pIC50 4.4 LLE 6.9 DpIC50 -2.2 DlogD -2.2 DhERG pIC50 -0.7 100 compounds made LLE = lipophilic ligand efficiency: LLE=pIC50-logD LLE 6.4 LLE 6.9
  14. 14. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation Project 5 – MCPairs Online 14 Thompson, M. J.; J. Med. Chem., 2015, 58 (23), 9309–9333
  15. 15. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Indole-3-glyoxylamide Based Series of Tubulin Polymerization Inhibitors – Increase potency, solubility and reduce metabolism – Enable in-vivo xenograft studies Thompson, M. et al J. Med. Chem., 2015, 58 (23), pp 9309–9333 MMPA solubility & QSAR calcsIndibulin D-24851 LC50 0.032 XlogP 3.35 ~ potent In-vivo activity poor solubility (~ 1uM) LC50 0.027 XlogP 2.02 LC50 0.055 XlogP 2.91 solubility (~10-80uM) LC50 0.031 XlogP 2.57 solubility (~10-80uM) 59
  16. 16. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 6 – tBu metabolism issue Benchmark compound Predicted to offer most improvement in microsomal stability (in at least 1 species / assay) R2 R1 tBu Me Et iPr 99 392 16 64 78 410 53 550 99 288 78 515 41 35 98 327 92 372 24 247 35 128 24 62 60 395 39 445 3 21 20 27 57 89 54 89 • Data shown are Clint for HLM and MLM (top and bottom, respectively) R1 R2R1tBu Roger Butlin Rebecca Newton Allan Jordan
  17. 17. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 7 - A Less Simple Example Increase logD and gain solubility Property Number of Observations Direction Mean Change Probability logD 8 Increase 1.2 100% Log(Solubility) 14 Increase 1.4 92% What is the effect on lipophilicity and solubility? Roche data is inconclusive! (2 pairs for logD, 1 pair for solubility) logD = 2.65 Kinetic solubility = 84 µg/ml IC50 SST5 = 0.8 µM logD = 3.63 Kinetic solubility = >452 µg/ml IC50 SST5 = 0.19 µM Question: Available Statistics: Roche Example:
  18. 18. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 8 - Base of Success Story from Genentech 193 compounds Enumerated Objective: improve metabolic stability Enumeration Calculated Property Docking 8 compounds synthesized 100 cmpds x ($2K make + $1K test) = $ 300 000 8 cmpds x ($2K make + $1K test) = $ 24 000 It is not just money, it is actually time 100 cmpds make & test ~ 15 – 25 weeks 8 cmpds make & test ~ 2 – 4 weeks
  19. 19. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Value of Potency prediction from MMPA Clear substructures enable rapid actions 19 Compounds + data Safety data Potency data HTS data Toxicity alerts Virtual Library prioritisation Virtual Library design Fragment set design Retest prioritisation Hit re-mining / analogue hunting Substructure modification Lead design Fast Follower design VEGFR 26 examples in training set Mean without pharmacophore Mean with pharmacophore 0 2 4 6 6 7 8 9 10 pIC50 count
  20. 20. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Project 9 - Recent GPCR target, some literature, some patents….. Problem: what’s the SAR in this data set? Pair Finding Rule Finding Structures & Data High potency groups Fragment LLE “rules” for structural changes that increase potency Identify pivotal compounds Pairs to the pivotal compound: potency by clogP = fragment LLE Chemical changes with large DpIC50 /DlogP 443 compounds
  21. 21. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019February 2019 Not for Circulation Potency and Toxicity prediction Analysis of smaller datasets to build pharmacophore models and understand relationships
  22. 22. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 What’s the problem with potency in the AI world? Potency (warhead) hERG (safety) Solubility Low metabolism GLK clinical candidate – designed using MMPA Waring et al, Med. Chem. Commun., (2011), 2, 775 • Demands are high for candidate drugs • Potency at the biological target(s) difficult to predict • How do we suggest actions to chemists… • …that are transparent and understandable • ….from low volume datasets? • How do the models get better over time?
  23. 23. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 What does transparent / understandable mean? substructures Physical chemistry descriptors(Hansch, Taft, Fujita, Abraham) Atomic, pair, triplet descriptors Indices (M)LR Free Wilson PLS Trees / Forests SVM Bayesian NN Deep Learning Dark Black INTERPRETABILITY Descriptors Method Increase clarity?
  24. 24. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Random Forest & Pharmacophores • Feature Importance from RF trees • Predict EGFR potency from RF model [10 fold CV] Pharmacophore fp length 280 10 fold CV Compounds in training 3789 RMSE 0.72 R2 0.53 Feature HD_3atom_Hal HD_3atom_HA pIC50: mean_with 6.7 6.2 pIC50: mean_without 5.5 5.2 Mean_DpIC50 1.2 1.0 n_with 1133 2645 n_without 2547 1035 Cohans_d 0.95 0.80 HD_3atom_Hal HD-3atom-HA Top 2 features by importance Prediction: 18 nM, actual = 7nM
  25. 25. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 MMPA 8 years experience in Medchem Conclusions • Matched Molecular Pair Analysis – This form of Artificial Intelligence can build useful knowledge databases. – Rule finding – highlights “the Gems” • Med chemist need quick and easy to use interface to enable fast science • 9 Project examples illustrate the effectiveness of augmented decision making.
  26. 26. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation Benefits of MCPairs Enterprise and Online • Data driven compound design from extensive knowledge database • Fewer compounds to make, faster to project results • Proven results from more than 21 organisations / projects (1 failure) • Collective 50 years of drug discovery experience on top of AI • Backed by 10 experience of building “AI systems” • Advanced analysis of patent data and potency prediction • Novel compound generation and improvement
  27. 27. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Interfaces to many tools…. Pair & Rule Database Compounds from Rules RESTful API W orkflow Visualis e Shape and electrostatics MMPA MCRules Corporate structures and measurements from DB W eb tool mcpairs.medchemica.com has delivered > 48 months uptime to >100 chemists from 20 organisations SpotDesign MCPairs Enterprise
  28. 28. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 What the Chemist sees… See videos of MCPairs Enterprise / Online on YouTube (search ”MedChemica”) https://www.youtube.com/results?search_query=medchemica
  29. 29. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Data curation and organisation is all for the chemist Company data sets • Cleaned and checked • Matched pairs found(2 methods) • automatically updates with new data and structures Data set organization & descriptions are configurable MCPairs Enterprise does the work for you
  30. 30. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019 Dr Andrew G. Leach Technical Director Liverpool John Moores 12 years experience large Pharma / 6 years academia. Applied computational and medicinal chemistry interests in cheminformatics and QM for problem solving Dr Ed Griffen Technical Director 23 years experience large Pharma, biotech Medicinal chemistry oncology, anti- infectives, CNS, cheminformatics, large scale statistical analysis methods Dr Al Dossetter Managing Director 20 years experience Medicinal chemistry large Pharma respiratory, CV, oncology and extensive cloud computing experience Dr Ali Griffen Business Analyst 21 years experience Team leader University of Manchester & large Pharma, microbiologist, enzymology, assay design, project management and biological data curation Dr Shane Montague Lead Data Scientist PhD Computer Science 15 years experience Microsoft, University of Salford: Data science and information security, machine learning, data base and user interface design.
  31. 31. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019
  32. 32. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation Products and Services • MCPairs Enterprise – In-house server and GUI system – Automated updating and analysis – Custom fitting (Software-as-a-Service) • MCPairs On-line – Analysis of larger datasets – Merging with additional databases • CHEMBL, Patent data • Cross-Organisational knowledge sharing – E.g. ADMET dataset One stop GUI Design tool Biotech, Universities and Foundations Medium to large pharma, agrochemical and materials research
  33. 33. Not for circulation Exploiting medicinal chemistry knowledge to accelerate projects February 2019Exploiting medicinal chemistry knowledge to accelerate projects Febuary 2019 Not for circulation Science As A Service (SaaS) • Pharmacology / Toxicology • Me-Better set(s) • Patent analysis • Bespoke Consultancy – Medicinal chemistry – Project evaluation – Cheminformatics – Drug discovery collaborations

×