Emerging Challenges for Artificial Intelligence in Medicinal Chemistry

Presentation by Dr Ed Griffen of MedChemica Ltd at the IBSA conference "How Artificial Intelligence Can Change the Pharmaceutical Landscape", Lugano, October 9th 2019.

  1. 1. October 2019 Exploiting medicinal chemistry knowledge to accelerate projects Emerging Challenges for Artificial Intelligence in Medicinal Chemistry Dr Ed Griffen IBSA Lugano October 2019
  2. 2. • Founded in 2012 by experienced large-pharma medicinal and computational chemists to accelerate drug hunting by exploiting data-driven knowledge
        • Domain leaders in SAR knowledge extraction and knowledge-based design
        • More than 10 years' experience of building AI systems that suggest actions to chemists (7 years as MedChemica)
        • Creators of the largest ever documented database of medicinal chemistry ADMET knowledge
  3. 3. …7 years of working with pharma companies
        “Our median number of compounds per LO project is 3000 - this is unsustainable… [it should be] 300” – Director of Chemistry (large pharma)
        “Can we define the textbook of medicinal chemistry?” – Director of Comp Chem (large pharma)
        “We are aiming at 300 compounds per project – currently we are about 400, we will get better” – Exscientia scientist at SCI ‘What can BigData do for chemistry’ – London, Oct 2017
        MedChemica uses knowledge extraction techniques to build “expert systems” that suggest actions to chemists and reduce the time and cost to critical compounds and candidate drugs.
  4. 4. Explainable AI
        The future of AI lies in enabling people to collaborate with machines to solve complex problems. Like any efficient collaboration, this requires good communication, trust, clarity and understanding. - Freddy Lecue, Explainable AI Research Lead, Accenture Labs https://www.accenture.com/gb-en/insights/technology/explainable-ai-human-machine
        Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way forward is to design models that are inherently interpretable. - Cynthia Rudin, Nature Machine Intelligence 1 (2019), 206–215.
  5. 5. Use the right Machine Learning tool for the right problem
        Where is Medicinal Chemistry?
        Interpretable: failure cost high; immature science; highly skilled, critical users; Business-2-Business; transparent and auditable
        Black Box: failure cost is low; real time response critical; interactive = self correcting; Business-2-consumer; user agnostic of process
  6. 6. Help the HiPPOs – or they’ll crush you
        “Companies often make most of their important decisions by relying on “HiPPO” – the highest-paid person’s opinion.” (ref. 1)
        Chemistry HiPPs:
        • experts in pattern recognition
        • judged on their ability to make the best decisions with partial data
        • highly trained
        • time poor
        • delivery focused
        • gatekeepers to the adoption of new approaches
        1. McAfee & Brynjolfsson, “Big Data: The Management Revolution”, Harvard Business Review, October 2012
  7. 7. Explainable AI for Medicinal Chemistry Design
        [diagram labels: Data Warehouse; Automated loader; Clean Structures & Data; MMPA; rule finder; Exploitable Knowledge; Explainable QSAR; Molecule problem solving; Idea ranking; Instant SAR analysis; Property Prediction; MCPairs REST API & GUI]
  8. 8. Molecule Problem Solving: Compounds from Rules
        • Exploitable Knowledge is a rule database derived from MMPA
        • User puts in a problem molecule with a property they wish to improve, e.g. solubility, metabolism, hERG…
        • System generates potentially improved molecules based on data
        Flow: problem molecule + property to improve → MC Expert Enumerator System (Exploitable Knowledge) → solution molecules
        Compounds from Rules: https://www.youtube.com/watch?v=lITAT6_-i1E&list=PLtkCAojNL97xs1kd5JHngjIRhl4ZPFTlL&index=3
  9. 9. https://youtu.be/nQxXddJDTfc
  10. 10. MMPA enables knowledge sharing
        Our MMPA technology enabled knowledge sharing between multiple organisations (AstraZeneca, Hoffmann-La Roche and Genentech).
        [diagram labels: Multiple Pharma ADMET data; MMPA (×3); Combine and Extract Rules; >437000 rules; Better Project decisions; Increased Medicinal Chemistry learning]
        Kramer, Robb, Ting, Zheng, Griffen, et al. J. Med. Chem. 2018, 61(8), 3277-3292 http://pubs.acs.org/doi/10.1021/acs.jmedchem.7b00935
  11. 11. Fully Automated Matched Molecular Pair Analysis (MMPA)
        Knowledge extraction that’s understandable by chemists
        [figure: molecules A and B with numbered atom environments; Δ Data A-B]
        • Matched Molecular Pairs – molecules that differ only by a particular, well-defined structural transformation
        • Capture the change and environment – MMPs can be recorded as transformations from A → B
        • Statistical analysis to define “medicinal chemistry rules” – defined transformations with high probability of improving properties of molecules
        • Store in a high performance database and provide an intuitive user interface
        Griffen, E. et al. J. Med. Chem. 2011, 54(22), pp.7739-7750.
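The idea of recording matched pairs as transformations can be illustrated with a small sketch. It assumes each pair has already been reduced to a transformation label plus one measured value per molecule. The SMIRKS-like labels, the numbers, and the sign convention (value of B minus value of A) are illustrative assumptions, not data or conventions taken from the presentation.

```python
from collections import defaultdict

# Hypothetical matched pairs: (transformation label, value for A, value for B).
# Labels and numbers are invented for illustration only.
pairs = [
    ("[*:1]Cl>>[*:1]F", 1.8, 1.5),
    ("[*:1]Cl>>[*:1]F", 2.4, 2.0),
    ("[*:1]C>>[*:1]CC", 0.9, 1.4),
]

def group_deltas(pairs):
    """Group property changes (B minus A, an assumed convention) by transformation."""
    deltas = defaultdict(list)
    for label, value_a, value_b in pairs:
        deltas[label].append(round(value_b - value_a, 3))
    return dict(deltas)

deltas_by_transformation = group_deltas(pairs)
```

Each list of deltas is the raw material for the per-transformation statistics that turn a transformation into a rule.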
  12. 12. Rule selection
        1. Identify and group matching SMIRKS
        2. Calculate statistical parameters for each unique SMIRKS (n, median, sd, se, n_up/n_down)
        3. Is n ≥ 6? If no: not enough data, ignore transformation
        4. Is the |median| ≤ 0.05 and the intercentile range (10-90%) ≤ 0.3? If yes: transformation is classified as ‘neutral’
        5. If no: perform two-tailed binomial test on the transformation to determine the significance of the up/down frequency
           pass: transformation classified as ‘increase’ or ‘decrease’ depending on which direction the property is changing
           fail: transformation classified as ‘NED’ (No Effect Determined)
        [chart: median data difference axis (-ve, 0, +ve) with regions Decrease, Neutral, Increase and NED]
        • No assumption of normal distribution
        • Manages ‘censored’ = qualified / out-of-range data
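The decision flow on this slide can be sketched in a few lines of standard-library Python. This is one reading of the flattened flowchart, not MedChemica's implementation. The significance level `alpha=0.05`, the linear-interpolation percentile, and the exact binomial test against p = 0.5 are all assumptions, because the slide does not specify them. Censored data handling is left out.

```python
from math import comb
from statistics import median

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def binomial_two_tailed_p(n_up, n_down):
    """Exact two-tailed p-value for an up/down split against p = 0.5."""
    n = n_up + n_down
    k = min(n_up, n_down)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # symmetric null, so double the smaller tail

def classify(deltas, alpha=0.05):
    """Classify one transformation from its list of property changes."""
    if len(deltas) < 6:
        return "ignored"  # not enough data
    vals = sorted(deltas)
    spread = percentile(vals, 0.9) - percentile(vals, 0.1)
    if abs(median(vals)) <= 0.05 and spread <= 0.3:
        return "neutral"
    n_up = sum(d > 0 for d in vals)
    n_down = sum(d < 0 for d in vals)
    if binomial_two_tailed_p(n_up, n_down) < alpha:  # alpha is an assumption
        return "increase" if n_up > n_down else "decrease"
    return "NED"  # No Effect Determined
```

Under the assumed 0.05 level, six observations all moving the same way give p = 2/64 ≈ 0.031, whereas five can never do better than 2/32 ≈ 0.063, so n ≥ 6 is the smallest count at which this particular test can pass.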
  13. 13. Base of Success Story from Genentech
        Objective: improve metabolic stability
        193 compounds enumerated (Enumeration, Calculated Property, Docking); 8 compounds synthesized
        100 cmpds x ($2K make + $1K test) = $300 000
        8 cmpds x ($2K make + $1K test) = $24 000
        It is not just money, it is actually time:
        100 cmpds make & test ~ 15 – 25 weeks
        8 cmpds make & test ~ 2 – 4 weeks
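The cost comparison on this slide is simple arithmetic and can be checked with a short sketch. The per-compound costs are the slide's figures. The function and variable names are illustrative, and the derived saving is not a number quoted in the presentation.

```python
MAKE_COST_USD = 2_000  # per compound, from the slide
TEST_COST_USD = 1_000  # per compound, from the slide

def campaign_cost(n_compounds):
    """Cost of making and testing n_compounds at the per-compound rates above."""
    return n_compounds * (MAKE_COST_USD + TEST_COST_USD)

unfocused = campaign_cost(100)        # 300000, matches the slide
focused = campaign_cost(8)            # 24000, matches the slide
saving = unfocused - focused          # derived here, not stated on the slide
saving_fraction = saving / unfocused  # share of the unfocused budget avoided
```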
  14. 14. tBu metabolism issue
        [table: R1/R2 substituents tBu, Me, Et, iPr; benchmark compound and the compounds predicted to offer most improvement in microsomal stability (in at least 1 species / assay) are marked]
        Values as extracted: 99 392 16 64 78 410 53 550 99 288 78 515 41 35 98 327 92 372 24 247 35 128 24 62 60 395 39 445 3 21 20 27 57 89 54 89
        • Data shown are Clint for HLM and MLM (top and bottom, respectively)
        Roger Butlin, Rebecca Newton, Allan Jordan
  15. 15. Tubulin Polymerization Inhibitors
  16. 16. Indole-3-glyoxylamide Based Series of Tubulin Polymerization Inhibitors
        – Increase potency, solubility and reduce metabolism
        – Enable in-vivo xenograft studies
        Indibulin D-24851: LC50 0.032, XlogP 3.35; ~ potent, in-vivo activity, poor solubility (~1 µM)
        MMPA solubility & QSAR calcs:
        LC50 0.027, XlogP 2.02
        LC50 0.055, XlogP 2.91, solubility (~10-80 µM)
        LC50 0.031, XlogP 2.57, solubility (~10-80 µM) 59
        Thompson, M. et al. J. Med. Chem., 2015, 58 (23), pp 9309–9333
  17. 17. Idea Ranking: SpotDesign
        • Use the knowledge database to estimate how good an idea is compared to a benchmark molecule
        • System generates assessment based on data
        Flow: idea molecule + benchmark molecule + property → SpotDesign (Exploitable Knowledge) → assessment of idea molecule compared to benchmark
        SpotDesign: https://www.youtube.com/watch?v=JMhQvNdBOFs&index=2&list=PLtkCAojNL97xs1kd5JHngjIRhl4ZPFTlL
  18. 18. https://youtu.be/fDpFo53IdOE
  19. 19. Property Prediction: Automated Explainable QSAR
        • Chemists get predictions with the substructures driving the prediction highlighted, plus the molecules used to support that part of the model – transparent / explainable AI.
        Flow: molecule structure + property to predict → Explainable QSAR (Clean Structures & Data, Property Prediction) → prediction + clear drivers of prediction
  20. 20. MedChemica Advanced Pharmacophore Pairs
        Feature: Definition
        Basic Group: atom or group most likely protonated at pH 7.4
        Acidic Group: atom or group most likely deprotonated at pH 7.4, includes N and C acids
        Acceptor: definitions derived from Taylor & Cosgrove
        Donor: definitions derived from Taylor & Cosgrove
        Hydrophobic: C4 or greater cyclic or acyclic alkyl group
        Aromatic Attachment: connection of any group to an aromatic atom, excluding connections within rings
        Aliphatic Attachment: connection of any atom to an aliphatic group not in a ring
        Halo: F, Cl, Br, I
        Acid & base definitions are SMARTS including C, N and heteroaromatic acids; bases exclude weak aniline bases and include amidines and guanidines (MedChemica definitions).
        Reference for donor/acceptor feature definitions: Taylor, R.; Cole, J. C.; Cosgrove, D. A.; Gardiner, E. J.; Gillet, V. J.; Korb, O. J Comput Aided Mol Des 2012, 26 (4), 451–472.
        Gobbi, A.; Poppinger, D. Biotechnology and Bioengineering 1998, 61 (1), 47–54.
        Reutlinger, M.; Koch, C. P.; Reker, D.; Todoroff, N.; Schneider, P.; Rodrigues, T.; Schneider, G. Mol. Inf. 2013, 32 (2), 133–138.
  21. 21. Pay attention to your descriptors
        • Chemistry must make sense
        [figure: Simple vs Precise assignment of H bond acceptor, base and acid features; examples Diclofenac (1973), Sulfadiazine (1941), DMAP]
  22. 22. Regression Forest & Pharmacophore understanding
        • hERG – auditable models
        • Identify important chemical features driving potency
        • Predict hERG potency from RF model [10 fold CV]
        Model statistics (10 fold CV):
        Pharmacophore fp length: 280
        Compounds in training: 5968
        RMSE: 0.16
        Pearson R2: 0.27
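For readers unfamiliar with how figures such as RMSE and Pearson R2 come out of 10-fold cross-validation, here is a minimal sketch of the scoring procedure. The model is a one-variable least-squares line used purely as a stand-in. It is not the regression forest or the pharmacophore fingerprint from the slide, and all function names are illustrative.

```python
import math
import random

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Least-squares slope and intercept (stand-in for a real model)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def pearson_r2(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)

def cross_validate(xs, ys, k=10):
    """Predict every point from a model trained without its fold, then score."""
    preds = [0.0] * len(xs)
    for fold in kfold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = slope * xs[i] + intercept
    return rmse(ys, preds), pearson_r2(ys, preds)
```

The point of the procedure is that every prediction being scored comes from a model that never saw that compound during training.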
  23. 23. Regression Forest & Pharmacophore understanding
        • hERG – auditable models
        • Predict hERG potency from RF model [10 fold CV]
        • Example CHEMBL12713 sertindole
        • Colour structure by feature importance weighted sum of pharmacophore pair fingerprints – show the chemists where the hotspots are.
        • Drill deeper to show the most important positive and negative features.
        RF prediction pIC50 7.7
        Feature with positive effect: median_with: 5.1, median_without: 4.7, median_diff: 0.4, n_examples_with: 4585, n_examples_without: 1383
        Feature with negative effect: median_with: 5.1, median_without: 5.3, median_diff: -0.2, n_examples_with: 3106, n_examples_without: 2862
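The per-feature numbers on this slide (median_with, median_without, median_diff and the two example counts) can be reproduced in form with a short sketch. It assumes each training compound is represented as a set of fingerprint features plus a pIC50. The feature names and values below are invented placeholders, and how the slide's own figures were computed is not stated in the source.

```python
from statistics import median

def feature_effect(records, feature):
    """Compare median potency of compounds with and without one feature.

    records: iterable of (set_of_features, pIC50) tuples.
    """
    with_f = [p for feats, p in records if feature in feats]
    without_f = [p for feats, p in records if feature not in feats]
    if not with_f or not without_f:
        raise ValueError("feature must be present in some records and absent in others")
    m_with, m_without = median(with_f), median(without_f)
    return {
        "median_with": m_with,
        "median_without": m_without,
        "median_diff": round(m_with - m_without, 3),
        "n_examples_with": len(with_f),
        "n_examples_without": len(without_f),
    }

# Invented toy data; feature names are placeholders, not MedChemica features.
toy = [
    ({"basic-aromatic"}, 6.0),
    ({"basic-aromatic", "halo"}, 5.5),
    ({"halo"}, 4.5),
    ({"acid"}, 4.0),
    ({"basic-aromatic"}, 5.0),
]
effect = feature_effect(toy, "basic-aromatic")
```

Reporting the supporting counts alongside the medians is what lets a chemist judge how much evidence sits behind each highlighted feature.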
  24. 24. kNN – Understanding from neighbouring structures
        • hERG – auditable models
        • Predict hERG potency from kNN model [10 fold CV]
        • Example CHEMBL12713 sertindole
        • Identify the closest neighbours by Tanimoto distance on ECFP4 fingerprints
        • Show chemists the structures
        kNN prediction pIC50 8.2
        Neighbour distances: 0.17, 0.2, 0.23
        Neighbour pIC50: 7.7, 4.1, 8.2
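A nearest-neighbour prediction of this kind can be sketched without any cheminformatics toolkit if fingerprints are treated as sets of on-bit indices. In the sketch below the bit sets and compound names are invented, real ECFP4 fingerprints would come from a toolkit, and averaging the neighbours' pIC50 values is an assumption since the slide does not say how the neighbours are combined.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - (shared bits / total distinct bits) for fingerprints stored as sets."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / union

def knn_predict(query_fp, training, k=3):
    """Return (prediction, neighbours) using the k closest training compounds.

    training: list of (name, fingerprint_set, pIC50).
    The prediction is the plain mean of neighbour pIC50 values (an assumption).
    """
    ranked = sorted(
        ((tanimoto_distance(query_fp, fp), name, pic50) for name, fp, pic50 in training),
        key=lambda item: item[0],
    )
    neighbours = ranked[:k]
    prediction = sum(pic50 for _, _, pic50 in neighbours) / len(neighbours)
    return prediction, neighbours

# Invented toy fingerprints and potencies.
training = [
    ("cmpd_A", {1, 2, 3, 4, 5}, 7.0),
    ("cmpd_B", {1, 2, 3}, 6.0),
    ("cmpd_C", {7, 8}, 4.0),
]
prediction, neighbours = knn_predict({1, 2, 3, 4}, training, k=2)
```

Returning the neighbours together with the prediction mirrors the slide's point: the chemist sees which structures the number rests on.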
  25. 25. Seizure prediction by Composite Machine Learning
        • ML models built for 20 critical seizure related CNS targets
        • Communicate to chemists activity prediction & if model out of domain
        • Show close structures and/or toxophores
        Example: CHEMBL12713 sertindole – seizure activity observed clinically; predictions in line with measured data
        Legend: More potent than 1 µM; Less potent than 1 µM; Out of Domain – no prediction possible
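The three-way legend on this slide maps onto a very small piece of logic. The sketch below assumes the prediction is a pIC50 and that an in-domain flag is supplied by the caller, since the source does not describe how the domain of applicability is determined. It relies on the standard conversion pIC50 = -log10(IC50 in mol/L), under which 1 µM corresponds to pIC50 6.

```python
import math

THRESHOLD_PIC50 = 6.0  # 1 µM expressed as pIC50

def pic50_from_ic50_um(ic50_um):
    """Convert an IC50 in micromolar to pIC50 (= -log10 of IC50 in mol/L)."""
    return -math.log10(ic50_um * 1e-6)

def potency_label(predicted_pic50, in_domain):
    """Map a prediction to one of the three labels shown on the slide.

    in_domain is assumed to come from a separate applicability check.
    A prediction exactly at the threshold is reported as 'Less potent'.
    """
    if not in_domain:
        return "Out of Domain – no prediction possible"
    if predicted_pic50 > THRESHOLD_PIC50:
        return "More potent than 1 µM"
    return "Less potent than 1 µM"
```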
  26. 26. Estimating Risks, finding toxophores
  27. 27. Engineering and Automation
        [diagram labels: Corporate structures and measurements from DB; Structure and data clean up; Clean Structures & Data; Pair finding; Pair & Rule Database; MCRules; Explainable QSAR; API server; RESTful API; Compounds from Rules; Compound to Pairs; Spot Design; Web GUI MedChemica; CLI MedChemica; In-House Design tools]
  28. 28. Overcoming the Barriers to Implementing AI
        ✓ Data Integrity and curation
        ✓ Knowledge extraction algorithms
        ✓ Engineering, Automation and Interfaces
        ✓ Interpretability
        [diagram labels: Knowledge Database; MCPairs; MC GUI]
  29. 29. Exploiting medicinal chemistry knowledge to accelerate projects
  30. 30. A Less Simple Example: Increase logD and gain solubility
        Question: What is the effect on lipophilicity and solubility?
        Available Statistics:
        Property          Number of Observations   Direction   Mean Change   Probability
        logD              8                        Increase    1.2           100%
        Log(Solubility)   14                       Increase    1.4           92%
        Roche data is inconclusive! (2 pairs for logD, 1 pair for solubility)
        Roche Example:
        logD = 2.65, kinetic solubility = 84 µg/ml, IC50 SST5 = 0.8 µM
        logD = 3.63, kinetic solubility = >452 µg/ml, IC50 SST5 = 0.19 µM
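The Roche example can be turned into property changes with a little arithmetic. The sketch below uses the values quoted on the slide and treats the ">" on the second solubility as a censored (out-of-range) measurement, so the solubility change is only a lower bound. The function names and the qualifier handling are illustrative assumptions.

```python
import math

def delta_logd(before, after):
    """Change in logD between the two compounds of a pair."""
    return round(after - before, 2)

def delta_log_solubility(before, after, after_qualifier="="):
    """Change in log10(solubility).

    A '>' qualifier on the second value means the true change is at least
    the returned number, so the qualifier is carried through to the result.
    """
    if after_qualifier not in ("=", ">"):
        raise ValueError("only '=' and '>' qualifiers are handled in this sketch")
    return after_qualifier, math.log10(after / before)

d_logd = delta_logd(2.65, 3.63)                              # logD values from the slide
qualifier, d_logs = delta_log_solubility(84.0, 452.0, ">")   # µg/ml values from the slide
```

Both changes come out positive, the same direction as the "Increase" entries in the available statistics.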
  31. 31. Instant SAR Analysis: Compound to Pairs
        • Chemists can instantly see the pairs to a compound and explore property changes
        Flow: molecule of interest → Compound to Pairs (Exploitable Knowledge) → all the matched pairs of that molecule
        Compound to Pairs: https://www.youtube.com/watch?v=OFhZJulxsAw&t=0s&list=PLtkCAojNL97xs1kd5JHngjIRhl4ZPFTlL&index=2
  32. 32. https://youtu.be/OFhZJulxsAw
  33. 33. 3 Possible input streams…
        Flat files:
        • Export flat files of data
        • MCPCLI reads in files and deletes
        Direct database access:
        • Direct read access to DB
        • SQL searches compounds / measurements
        ETL custom plugin:
        • Extract Transform Load (ETL)
        • Custom plugin scripted by MedChemica
        • Usually 3 – 4 weeks work
        • On-site work and team interaction required
        • https requests for compounds / measurements
        • Most robust option
        [diagram labels: YOUR FIREWALL; Your DB; assay data; crontab; MCPCLI; ETL custom plugin; REST API; MCPairs Server; Rule Database; Exploitation]
        10 years experience building automated systems
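The flat-file stream ("reads in files and deletes") follows a common read-then-remove ingestion pattern, sketched below. This is not MCPCLI and does not reflect its file format or options. The CSV column names, the `ingest_flat_files` function, and the demo file are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def ingest_flat_files(drop_dir):
    """Read every *.csv in drop_dir into dict rows, deleting each file once read."""
    rows = []
    for path in sorted(Path(drop_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
        path.unlink()  # remove the export so it is not loaded twice
    return rows

# Demo with a throwaway directory and an invented export file.
with tempfile.TemporaryDirectory() as drop_dir:
    export = Path(drop_dir) / "assay1.csv"
    export.write_text("compound_id,assay,value\nCMPD-1,assay1,5.2\nCMPD-2,assay1,6.1\n")
    loaded = ingest_flat_files(drop_dir)
    remaining = list(Path(drop_dir).glob("*.csv"))
```

A loader like this would typically be run on a schedule, for example from crontab as the slide's diagram suggests.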
  34. 34. Example Current Pharma install
        • Every 2 days…
        • Latest compound structures pulled from D360 and loaded
        • Latest measurements from assays pulled and loaded
        • Custom plugin handles data input streaming
        • Update the matched pairs and update rules
        3 ways of exploitation: MedChemica Web tool; MedChemica CLI; In-House Design tools and workflows
        [diagram labels: PHARMA FIREWALL; D360; crontab; MCPCLI; ETL custom plugin; REST API; MCPairs Server; Rule Database]
