Deep Learning for Graph-Structured Data: Applications to Drug Discovery and Materials Science, and Their Problems (26th STAIR Lab AI Seminar)

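Speaker: Masashi Tsubaki (Researcher, Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology)

Abstract: This talk introduces applications of machine learning to drug discovery and materials science. In recent years, graph neural (convolutional) networks, a deep learning approach for graph-structured data, have become popular, and they can now predict with high accuracy the properties and functions of the molecular compounds and crystals handled in drug discovery and materials research. At the same time, deep learning modeling has grown bloated within the machine learning community alone, and problems arise not only in the interpretability of results but also from the viewpoint of quantum physics and chemistry. Through this talk, I would like to discuss the positive and negative sides of applying deep learning to scientific data.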

Link: https://stair.center/archives/events/ai-seminar-026


  1. AIST AIRC machine learning team, Masashi Tsubaki
  2. [Tsubaki, Tomii, and Sese, 2018 in Bioinformatics] [Tsubaki and Mizoguchi, 2018 in Journal of Physical Chemistry Letters]
  3. 🤭 🤐 🤫 publish …
  4. https://github.com/masashitsubaki
  5. GitHub (SMILES), GNN. SMILES: CC(=O)OC1=CC=CC=C1C(=O)O (※ SMILES, RDKit) (e.g., … or not) https://github.com/masashitsubaki/molecularGNN_smiles
  6. SMILES (SMILES, 0 or 1) (SMILES) (※ GitHub)
  7. GitHub, GNN. Water molecule (Atom x y z): O 0.03 0.98 0.008; H 0.06 0.02 0.002; H 0.87 1.30 0.0007 https://github.com/masashitsubaki/molecularGNN_3Dstructure
  8. Data file annotations: molecular property types; data index; each atom and its 3D coordinate in the molecule (CH4); properties in the order of the above types. In the following, each data entry is described in the same format (README).
  9. bash train.sh (train.sh)
  10. bash train.sh (Google Colab)
  11. (QM9) dataset: QM9_under14atoms; property: U0(kcal/mol); dim: 200; layer_hidden: 6; layer_output: 6; batch_train: 32; batch_test: 32; learning_rate: 1e-3; decay of learning rate: 0.99; interval of decay: 10; iteration: 3000. 1.9 kcal/mol; 1.0 kcal/mol
  12. data_train, public; data_test, in-house dataset
  13. preprocess.py, train.py
  14. DeepChem, SchNet, Chainer Chemistry, kGCN
  15. tsubaki.masashi@aist.go.jp GitHub 🙇 …
  16. 🙇 https://github.com/masashitsubaki
  17. [Tsubaki, Tomii, and Sese, 2018 in Bioinformatics] [Tsubaki and Mizoguchi, 2018 in Journal of Physical Chemistry Letters]
  18. 😅 🤔 COOH; 1 0 1; (… or not)
  19. 🤖 (… or not) (end-to-end) Graph neural network (GNN) [Scarselli+ 09; Kearnes+ 16]
  20. GNN: (1) … (2) … (3) … O C H H
  21. GNN: (4) … (5) … NN (transition, propagation, message passing). O C H H. $x_i^{(\ell+1)} = x_i^{(\ell)} + \sum_j f(x_j^{(\ell)})$, where $f$ is a NN, e.g., ReLU(Wx+b)
  22. sum or not. O C H H
  23. sum or not (NN), end-to-end. O C H H
  24. PyTorch-like code. C O H H. f … if 1 else 0
  25. [Tsubaki, Tomii, and Sese, 2018 in Bioinformatics] [Tsubaki and Mizoguchi, 2018 in Journal of Physical Chemistry Letters]
  26. … or not; GNN; GNN assumption
  27. (no text)
  28. β-lactam
  29. 👍 Fingerprint-based GNN (β-lactam). Fingerprint vectors $x_1, x_2, \dots, x_M$ with radius $r$ (e.g., $r=2$). $x_i^{(\ell+1)} = x_i^{(\ell)} + \sum_j f(x_j^{(\ell)})$, where $x_i^{(\ell)}$ is the $i$-th fingerprint vector and $x_j^{(\ell)}$ is a neighboring fingerprint vector
  30. (GitHub) fingerprint-GNN, CNN https://github.com/masashitsubaki/CPI_prediction
  31. (human, C. elegans) Liu et al., 2015 (highly credible negative samples). Dataset: Human; radius of subgraphs (fingerprints): 2; amino acid n-gram: 3; dimensionality: 10; layers of GNN: 3; window size of n-gram: 5; layers of CNN: 3; learning rate: 1e-4; decay of learning rate: 0.5; interval of decay: 10. Dataset: C. elegans; radius of subgraphs (fingerprints): 2; amino acid n-gram: 3; dimensionality: 10; layers of GNN: 3; window size of n-gram: 5; layers of CNN: 3; learning rate: 1e-4; decay of learning rate: 0.5; interval of decay: 10
  32. ... some reports even above 0.99 AUC on standard benchmarks…
  33. …
  34. [Tsubaki, Tomii, and Sese, 2018 in Bioinformatics] [Tsubaki and Mizoguchi, 2018 in Journal of Physical Chemistry Letters]
  35. Water molecule (Atom x y z): O 0.03 0.98 0.008; H 0.06 0.02 0.002; H 0.87 1.30 0.0007. -9.24 eV …
  36. Nature Comm., PRL …
  37. … O C H H ※
  38. SchNet: atoms $i$ and $j$, $x_i^{(\ell+1)} = \sum_j e^{-\gamma_j d_{ij}^2} f(x_j^{(\ell)})$ …
  39. Gaussian. O H H. [Equation: Gaussian function of the form $\exp(-\zeta \cdots^2)$] ※
  40. (※ QM9: 29, 130k) QM9, 14 atoms (15k)
  41. QM9, 14 atoms (10k). 1.90 kcal/mol; chemical accuracy: 1.0
  42. 1.24 kcal/mol; GNN …; 1.90 kcal/mol; 1.0
  43. New! GNN
  44. 14, 15. New!
  45. Water molecule (Atom x y z): O 0.03 0.98 0.008; H 0.06 0.02 0.002; H 0.87 1.30 0.0007. -9.24 eV, etc… 🤭
  46. SchNOrb extends the deep tensor neural network SchNet to represent electronic wavefunctions, ...model uses about 93 million parameters to predict a large Hamiltonian…
  47. https://github.com/masashitsubaki tsubaki.masashi@aist.go.jp
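
The GNN update on slide 21, x_i^(l+1) = x_i^(l) + sum_j f(x_j^(l)) with f = ReLU(Wx+b), can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code from the speaker's repositories: the bond list for the O, C, H, H example (C bonded to O and to both H), the vector size, the random initial values, the number of layers, and the helper names (`gnn_layer`, `readout_sum`) are all assumptions, and a sum over atoms is assumed as the readout for the "sum" mentioned on slides 22 and 23.

```python
import random


def relu(v):
    """Element-wise ReLU on a plain list of floats."""
    return [max(0.0, a) for a in v]


def affine(W, b, x):
    """Return W x + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, x)) + bias for row, bias in zip(W, b)]


def gnn_layer(xs, neighbors, W, b):
    """One update x_i <- x_i + sum_j f(x_j), with f(x) = ReLU(W x + b).

    The sum runs over the neighbors j of atom i (an assumption here).
    """
    messages = [relu(affine(W, b, x)) for x in xs]
    new_xs = []
    for i, x in enumerate(xs):
        agg = [0.0] * len(x)
        for j in neighbors[i]:
            agg = [a + m for a, m in zip(agg, messages[j])]
        new_xs.append([a + g for a, g in zip(x, agg)])
    return new_xs


def readout_sum(xs):
    """Sum the atom vectors into one molecule vector (assumed readout)."""
    return [sum(column) for column in zip(*xs)]


# Atoms as drawn on the slides; the bonds below are an assumption.
atoms = ["O", "C", "H", "H"]
neighbors = [[1], [0, 2, 3], [1], [1]]

rng = random.Random(0)
dim = 4  # hypothetical vector size for illustration
embedding = {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for s in sorted(set(atoms))}
W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
b = [0.0] * dim

xs = [list(embedding[a]) for a in atoms]
for _ in range(2):  # two layers, chosen arbitrarily
    xs = gnn_layer(xs, neighbors, W, b)

molecule_vector = readout_sum(xs)
print(molecule_vector)
```

In an end-to-end setting the molecule vector would be fed to an output network and all weights trained jointly; that part is omitted here.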

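Slides 38 and 39 describe a SchNet-style update in which the message from atom j is weighted by a Gaussian of the interatomic distance d_ij. The sketch below illustrates that idea on the water coordinates shown on the slides. It rests on several assumptions: the weight is taken as exp(-gamma * d_ij^2) with a single hypothetical parameter `gamma`, the sum runs over all atoms other than i, `f` is passed in as an arbitrary function, and the function names are invented for illustration. It is not the SchNet implementation or the speaker's code.

```python
import math


def gaussian_weight(pos_i, pos_j, gamma):
    """Assumed weight exp(-gamma * d_ij^2) between two 3D positions."""
    d2 = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    return math.exp(-gamma * d2)


def distance_weighted_update(xs, positions, f, gamma=1.0):
    """x_i <- sum_{j != i} exp(-gamma * d_ij^2) * f(x_j)."""
    new_xs = []
    for i in range(len(xs)):
        agg = [0.0] * len(xs[i])
        for j in range(len(xs)):
            if j == i:
                continue  # assumption: an atom does not message itself
            w = gaussian_weight(positions[i], positions[j], gamma)
            agg = [a + w * m for a, m in zip(agg, f(xs[j]))]
        new_xs.append(agg)
    return new_xs


# Water coordinates as listed on the slides (O, H, H).
water = [
    (0.03, 0.98, 0.008),
    (0.06, 0.02, 0.002),
    (0.87, 1.30, 0.0007),
]

# Hypothetical one-dimensional atom vectors and an identity f, for illustration.
xs = [[1.0], [0.5], [0.5]]
updated = distance_weighted_update(xs, water, f=lambda x: x, gamma=1.0)
print(updated)
```

Because the weight depends on distances rather than on a bond list, no explicit graph is needed; nearby atoms simply contribute more than distant ones.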