Molecular Activity Prediction Using Graph Convolutional Deep Neural Network Considering Distance on a Molecular Graph

Int’l Workshop on Mathematical Modeling and Problem Solving (MPS)
2019 Int’l Conference on Parallel and Distributed Processing Techniques & Applications (PDPTA’19)
Session 2. July 29, 2019 @Luxor, Las Vegas
https://americancse.org/events/csce2019/program/pdp_csc_ipc_msv_gcc_29

Published in: Health & Medicine

  1. Molecular Activity Prediction Using Graph Convolutional Deep Neural Network Considering Distance on a Molecular Graph
     Int’l Workshop on Mathematical Modeling and Problem Solving (MPS), 2019 Int’l Conference on Parallel and Distributed Processing Techniques & Applications (PDPTA’19), Session 2, July 29, 2019 @Las Vegas
     Masahito Ohue, Ryota Ii, Keisuke Yanagisawa, Yutaka Akiyama
     Department of Computer Science, School of Computing, Tokyo Institute of Technology, Japan
  2. Agenda
     • Introduction
       – Computer-Aided Drug Discovery
       – Graph Convolutional Deep Neural Network
       – Weave Module
     • Proposed Method
       A) Modify atom distance in ring structures
       B) Modify convolution of pair features
       C) Modify assembling pair features
     • Computational Experiments
     • Conclusion
  3. Introduction
  4. Computer-Aided Drug Discovery (CADD)
     • Drug discovery and development
       – Takes >10 years and >2 billion US dollars
       – Computational approaches may reduce these costs
         • Activity prediction, toxicity prediction, molecular property prediction
       – Machine learning is a powerful tool for CADD
     Paul SM, et al. Nat Rev Drug Discov. 2010, 9(3):203.
  5. Graph Convolutional Network (GCN)
     • Traditional approach: molecule → convert → molecular vector (fingerprint, descriptor) → input → machine learning model (SVM, Random Forest, LightGBM, …)
     • Graph convolutional network (GCN) approach: molecule → convert → molecular graph → input → convolutional neural network (CNN); figure label: feature vector
     • Represent a molecule as a graph: atoms → nodes, bonds → edges
  6. Related Work
     a) Neural graph fingerprints [Duvenaud+2015]
       ・Generate a molecular fingerprint with a neural network
       ・Update atom features using only adjacent atoms
       ・Use different weights for different node degrees
     b) GCN by Altae-Tran [Altae-Tran+2017]
       ・Update atom features with convolutional and pooling layers, using only adjacent atoms
       ・Properties of edges (bonds) are not considered
       ・Atoms beyond 1-neighbors are not considered
     [Duvenaud+2015] Duvenaud DK, et al. In Proc NIPS, 2215-2223, 2015.
     [Altae-Tran+2017] Altae-Tran H, et al. ACS Central Science, 3: 283–293, 2017.
  7. Related Work
     c) Weave module [Kearnes+2016]
       ・The Weave module considers not only atoms but also atom pairs
       ・Information from distant atoms can be considered (not just 1-neighbors)
       ・The difference in distance between atom pairs was not considered
     [Figure: atom features and pair features pass through Weave layer 0, Weave layer 1, …, Weave layer k, then a fully connected layer and softmax to give the output y]
     [Kearnes+2016] Kearnes S, et al. J Comput-Aided Mol Des, 30(8): 593-608, 2016.
  8. Distances on Molecular Graph
     We counted all atom-atom distances in 3 datasets from MoleculeNet [Wu+2018]
     [Figure: histograms of count versus graph distance (1 to 9) for the HIV, MUV, and PCBA datasets]
     The distribution of interatomic distances is not uniform
     ▶ It is necessary to consider the differences in atom-atom distance
     [Wu+2018] Wu Z, et al. Chem Sci, 9: 513–530, 2018.
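The counting described on slide 8 can be illustrated with a short sketch. It assumes the graph distance is the number of bonds on the shortest path between two heavy atoms, and it uses a hand-written adjacency list for a toy molecule (the heavy atoms of toluene) rather than a MoleculeNet dataset. The function names are hypothetical and are not taken from the authors' code.

```python
from collections import Counter, deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: bond-count distance from source to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_histogram(adjacency):
    """Count unordered atom pairs at each graph distance."""
    counts = Counter()
    for a in adjacency:
        for b, d in shortest_path_lengths(adjacency, a).items():
            if a < b:  # count each unordered pair once
                counts[d] += 1
    return counts


# Toy input (assumption): toluene heavy atoms, ring atoms 0-5, methyl carbon 6 on atom 0.
toluene = {0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4],
           4: [3, 5], 5: [4, 0], 6: [0]}
hist = distance_histogram(toluene)
for d in sorted(hist):
    print(d, hist[d])
```

Summing such histograms over every molecule in a dataset would give a per-dataset distribution of the kind plotted on the slide.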
  9. Graph vs. 3D Structure
     [Figure: a molecular graph next to the corresponding 3D structure]
     The distance on the graph does not necessarily correlate with the Euclidean distance between atoms in the 3D structure
     ▶ The definition of graph distance needs to be modified
  10. Purpose of This Study
      Improve the Weave module by considering the distance between atom pairs on the molecular graph
      Operation in Weave module | Ordinary Weave module | This study
      (A) Generation of pair features | Use ordinary graph distance on molecular graph | Prop. A: Correction of distances related to atoms in ring structures
      (B) Convolution of pair features | Use the same weight regardless of the distance on the graph | Prop. B: Convolution of pair features with different weights
      (C) Assembly of pair features | Add in uniformly | Prop. C: Use weighted sum based on distance
  11. This Study
      We focused on and modified these operations of the Weave module:
      A. Correction of distances related to atoms in ring structures
      B. Convolution of pair features with different weights
      C. Reweighting pair features by their distance
      [Figure: Weave module diagram (atom and pair features through Weave layers 0 to k, fully connected layer, softmax, output y) with the modified operations marked A and B, C]
  12. Preliminary (Weave module)
  13. Weave module
      [Figure: atom features and pair features pass through Weave layer 0, Weave layer 1, …, Weave layer k, then a fully connected layer and softmax to give the output y]
  14. Initial Features (input to Weave layer 0)
      Initial atom vector (per atom):
        – atom type (1-hot or NULL): C, N, O, F, P, S, Cl, Br, I, metal
        – chiral: R, S
        – charge: FC, PC
        – ring size (1-hot or NULL): R=3, R=4, R=5, R=6, R=7, R=8
        – hybridization: sp, sp2, sp3
        – H-bond: HBA, HBD
        – arom
      Initial atom pair vector (per atom pair):
        – bond type (1-hot or NULL): 1, 2, 3, arom
        – shortest path length ≦ d: d=1, d=2, d=3, d=4, d=5, d=6, d=7
        – same ring?: yes/no
  15. Pair→Atom Transform Operation (convolution)
      [Figure: in Weave layer k, the input pair features of atom i are convolved with a weight and a bias vector, passed through an activation, and output as atom features]
  16. Proposed Method
      A. Correction of distances related to atoms in ring structures
      B. Convolution of pair features with different weights
      C. Reweighting pair features by their distance
  17. A. Correction of distances related to atoms in ring structures
      The ring structure is relatively rigid in its actual molecular conformation compared with the chain structure.
      ▶ We shortened the distances on rings.
        Ortho and meta positions → dist = 1
        Para position → dist = 2
      [Figure: examples of the ortho, meta, and para positions]
      This brings the graph distance closer to the trend of atom-atom distances in the 3D structure
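The ring correction on slide 17 can be sketched as follows. The slide only gives the six-membered-ring examples (ortho and meta → 1, para → 2); the rule used here, "subtract one bond from any distance greater than 1 when both atoms lie in the same ring", is an assumption that reproduces those three cases and may differ from the authors' rule for other ring sizes or for paths that merely pass through a ring. The ring membership list is supplied by hand, and all names are hypothetical.

```python
from collections import deque


def graph_distance(adjacency, a, b):
    """Shortest-path length in bonds between atoms a and b (BFS)."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return dist[u]
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("atoms are not connected")


def corrected_distance(adjacency, rings, a, b):
    """Assumed rule: shorten same-ring distances by one bond, never below 1."""
    d = graph_distance(adjacency, a, b)
    same_ring = any(a in ring and b in ring for ring in rings)
    return d - 1 if same_ring and d > 1 else d


# Toy input (assumption): toluene heavy atoms, ring atoms 0-5, methyl carbon 6 on atom 0.
toluene = {0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4],
           4: [3, 5], 5: [4, 0], 6: [0]}
rings = [{0, 1, 2, 3, 4, 5}]

print(corrected_distance(toluene, rings, 0, 1))  # ortho pair
print(corrected_distance(toluene, rings, 0, 2))  # meta pair
print(corrected_distance(toluene, rings, 0, 3))  # para pair
```

In this sketch the methyl carbon is outside the ring, so its distances to ring atoms are left at the ordinary graph distance.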
  18. B. Convolution of pair features with different weights
      Weave module: pair features were convolved using the same weight, regardless of the distance.
      This study: weights that depend on the distance were used for atom pairs when performing the convolution.
      The distances of atom pairs can thus be considered in the convolution process
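The contrast on slide 18 between one shared weight and distance-specific weights can be sketched in a few lines. The exact form of the authors' pair→atom transform is not recoverable from the transcript, so this assumes a common pattern: each pair feature vector goes through an affine map and a ReLU, and the results are summed over the partners of atom i. The weight values, feature values, and function names are all hypothetical.

```python
def relu(x):
    return max(0.0, x)


def affine(weight, bias, vec):
    """Compute weight @ vec + bias for small list-based matrices."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def pair_to_atom(i, pair_feats, pair_dist, weight_for, bias_for):
    """Sum activated, transformed pair features over all pairs (i, j).

    weight_for(d) and bias_for(d) return the parameters used at distance d.
    Returning the same parameters for every d gives the shared-weight case.
    """
    out = None
    for (a, b), feat in pair_feats.items():
        if a != i:
            continue
        d = pair_dist[(a, b)]
        h = [relu(v) for v in affine(weight_for(d), bias_for(d), feat)]
        out = h if out is None else [x + y for x, y in zip(out, h)]
    return out


# Toy data (assumption): atom 0 has partners at graph distances 1 and 2.
pair_feats = {(0, 1): [1.0, 2.0], (0, 2): [3.0, 1.0]}
pair_dist = {(0, 1): 1, (0, 2): 2}
zero_bias = lambda d: [0.0]

# Shared weight: one matrix for every distance.
shared = pair_to_atom(0, pair_feats, pair_dist,
                      lambda d: [[1.0, 1.0]], zero_bias)

# Distance-specific weights: one matrix per distance.
per_distance = {1: [[1.0, 1.0]], 2: [[0.5, 0.5]]}
by_dist = pair_to_atom(0, pair_feats, pair_dist,
                       lambda d: per_distance[d], zero_bias)
print(shared, by_dist)
```

In a trained model each per-distance matrix would be learned separately, which is what allows the convolution to treat near and far pairs differently.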
  19. C. Reweighting pair features by their distance
      Distant atom pairs are less important than nearby atom pairs
      ▶ Represented by a weight that decreases as the distance increases
      The relative importance of distant and nearby atoms can thus be changed
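A small sketch can make the idea on slide 19 concrete. The results slide names three weighting schemes (step, linear, quadratic), but the transcript does not give their formulas, so the three functions below are illustrative assumptions only: each returns 1 at distance 1 and decreases with distance. The cutoff and `d_max` parameters are likewise hypothetical.

```python
def step_weight(d, cutoff=2):
    """Assumed form: full weight up to a cutoff distance, zero beyond it."""
    return 1.0 if d <= cutoff else 0.0


def linear_weight(d, d_max=4):
    """Assumed form: weight falls linearly from 1 at d = 1 to 0 at d = d_max + 1."""
    return max(0.0, 1.0 - (d - 1) / d_max)


def quadratic_weight(d, d_max=4):
    """Assumed form: square of the linear weight, so it decays faster."""
    return linear_weight(d, d_max) ** 2


def weighted_pair_sum(pair_feats, distances, weight_fn):
    """Assemble pair features as a distance-weighted sum instead of a uniform sum."""
    total = [0.0] * len(pair_feats[0])
    for feat, d in zip(pair_feats, distances):
        w = weight_fn(d)
        total = [t + w * x for t, x in zip(total, feat)]
    return total


# Toy data (assumption): two pair feature vectors at graph distances 1 and 3.
feats = [[2.0], [4.0]]
dists = [1, 3]
uniform = weighted_pair_sum(feats, dists, lambda d: 1.0)
linear = weighted_pair_sum(feats, dists, linear_weight)
quadratic = weighted_pair_sum(feats, dists, quadratic_weight)
print(uniform, linear, quadratic)
```

The uniform case corresponds to the ordinary Weave behaviour of adding pair features in uniformly; the other two shrink the contribution of the more distant pair.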
  20. Computational Experiments
  21. Dataset
      • We used benchmark datasets for molecular activity prediction from MoleculeNet [Wu+2018]
        – Hydrogen atoms were omitted
        – Molecules whose number of heavy atoms exceeded the maximum number of atoms, nmax (=60), were excluded
      [Wu+2018] Wu Z, et al. Chem Sci, 9: 513–530, 2018.
  22. Hyperparameters
      [Table: hyperparameters; only the label “layer” survives in this transcript]
  23. Prediction Scheme
      [Figure: the dataset is split into training, validation, and test sets; the prediction model is trained and then used to predict the test set]
      For each task, the epoch that gives the best AUC value on the validation data was selected. The prediction model at the selected epoch was then applied to the test data.
      Results were evaluated by AUC (area under the ROC curve)
        Perfect prediction: 1.0
        Random prediction: 0.5
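The selection rule on slide 23 can be sketched as below. AUC is computed here from its pairwise definition (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half), which is quadratic in the number of samples and only meant for illustration. The validation scores are made-up toy values and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best_epoch(val_labels, val_scores_per_epoch):
    """Return (epoch index, validation AUC) for the epoch with the highest AUC."""
    aucs = [roc_auc(val_labels, s) for s in val_scores_per_epoch]
    best = max(range(len(aucs)), key=aucs.__getitem__)
    return best, aucs[best]


# Toy validation predictions for three epochs (assumed values).
val_labels = [1, 0, 1, 0]
val_scores = [
    [0.6, 0.7, 0.4, 0.3],  # epoch 0
    [0.9, 0.2, 0.8, 0.1],  # epoch 1
    [0.9, 0.8, 0.7, 0.1],  # epoch 2
]
best_epoch, best_auc = select_best_epoch(val_labels, val_scores)
print(best_epoch, best_auc)
```

The model snapshot saved at `best_epoch` would then be the one evaluated on the test split.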
  24. Results
  25. Performance of Props. A and B
      AUC  | Weave | Prop. A | Prop. B | Prop. A&B
      HIV  | 0.801 | 0.803   | 0.806   | 0.806
      MUV  | 0.743 | 0.783   | 0.738   | 0.760
      PCBA | 0.824 | 0.825   | 0.823   | 0.823
      • Prop. A improves accuracy on HIV and MUV, and performs particularly well on MUV.
      • Prop. B is slightly more accurate on HIV.
      • Performance on PCBA did not change ← possibly because of the size of the dataset?
  26. Performance of Prop. C
      AUC | Weave | step  | linear | quadratic
      HIV | 0.801 | 0.772 | 0.807  | 0.803
      MUV | 0.743 | 0.721 | 0.749  | 0.752
      • Performance improved with linear and quadratic weights.
      • As with Props. A and B, the improvement on MUV is remarkable.
  27. Why did accuracy improve so much on MUV?
      • MUV is an unbalanced dataset with extremely few positive samples.
      • Considering the actual hit rate in drug discovery, MUV is the closest to the real activity prediction problem.
      The improvements in this study may be better suited to real-world data in the field of drug discovery.
  28. Why did Prop. B not perform well?
      We examined how the weight matrices changed as learning progressed, using the Frobenius norm.
      [Figure: Frobenius norm versus epoch (0 to 100) for the 0th Weave layer and the 1st Weave layer; curves for Weave and for dist 0, dist 1, dist 2, dist 3, dist 4, dist 5, and dist ∞]
      ▶ It is possible to improve the model performance by using different weights
      The slopes are almost the same ▶ It may not be necessary to use different weights in the 1st layer
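The diagnostic on slide 28 tracks the Frobenius norm of each weight matrix over training. For a matrix W this is the square root of the sum of its squared entries. The sketch below computes the norm for a list of per-epoch weight snapshots and a simple average slope; the snapshot values and the helper names are hypothetical.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a list-of-lists matrix."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def norm_history(snapshots):
    """Frobenius norm of the weight matrix saved at each epoch."""
    return [frobenius_norm(w) for w in snapshots]


def average_slope(history):
    """Mean change in norm per epoch between the first and last snapshot."""
    return (history[-1] - history[0]) / (len(history) - 1)


# Toy snapshots (assumed values) of one weight matrix at three epochs.
snapshots = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[6.0, 8.0], [0.0, 0.0]],
    [[9.0, 12.0], [0.0, 0.0]],
]
history = norm_history(snapshots)
print(history, average_slope(history))
```

Computing such a history separately for each distance-specific matrix gives curves whose slopes can be compared, as on the slide.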
  29. Conclusion
  30. Summary
      This study targeted the Weave module in the activity prediction problem
      We modified these Weave operations:
      A. Correction of distances related to atoms in ring structures
      B. Convolution of pair features with different weights
      C. Reweighting pair features by their distance
      [Figure: Weave module diagram (atom and pair features through Weave layers 0 to k, fully connected layer, softmax, output y) with the modified operations marked A and B, C]
  31. Summary
      A. Correction of the graph distance within ring structures in the compound
        – Prediction accuracy improved compared with Weave
        – Pair features between distant atoms were also used effectively
      B. Convolution of pair features with different weights for different distances
        – A more generalized model, obtained by using different weights in the convolution process
        – Accuracy was slightly higher than with Weave
        – More effective in the 0th Weave layer
      C. Reweighting pair features by their distance
        – Linear and quadratic weights are effective
  32. Future Work
      • Because the Weave transform operations are complicated, a drastic improvement in accuracy may not be achievable simply by improving the pair→atom transform. Other operations could also be improved by using distance information.
      • It is worthwhile to verify whether these improvements carry over to other supervised learning tasks on compounds, e.g. side-effect prediction, toxicity prediction, stability prediction
