Extension of Chainer-Chemistry
for Large and Sparse Graphs
Preferred Networks
Summer Internship 2019
Kenshin Abe
Introduction to Graph Neural Networks (GNNs)
[Figure: 2D convolution vs. graph convolution, from the GNN survey [2019 Zonghan+], https://arxiv.org/pdf/1901.00596.pdf]
Example of GNN
[Figure: example application; image from https://www.schrodinger.com/science-articles/autoqsardeepchem]
Typical End-to-end GNN Framework
[Diagram: Graph → Graph Conv × 4 → Node Embeddings
  → Linear → Node Classification / Regression (node-level task)
  → Graph Readout → Graph Representation / Graph Embedding → Linear → Graph Classification / Regression (graph-level task)]
Padding Pattern (Adjacency Matrix)
● Matrix multiplication
● Zero padding
Sparse Pattern
● Scatter operation
● Graph concatenation
● etc.
[Figure: a large, sparse network graph]
For Large and Sparse Graphs
Graph Data Pattern
Padding Pattern: Adjacency Matrix + Node Features
Sparse Pattern: Edge List (src, dst) + Node Features
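As a toy illustration (NumPy; my own example, not the library's data classes), the same 3-node path graph in both layouts:

import numpy as np

# Padding pattern: dense adjacency matrix, zero-padded to the max graph size
adj = np.zeros((4, 4), dtype=np.float32)   # padded to 4 nodes
adj[0, 1] = adj[1, 0] = 1.0
adj[1, 2] = adj[2, 1] = 1.0
node_feat = np.zeros((4, 16), dtype=np.float32)  # padded node features

# Sparse pattern: edge list of (src, dst) pairs, no padding needed
edge_list = np.array([[0, 1], [1, 0], [1, 2], [2, 1]])
node_feat_sparse = np.zeros((3, 16), dtype=np.float32)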
Batching
Padding Pattern: zero-pad every Adjacency Matrix and Node Features array to a common size
Sparse Pattern: handle the batch as one big graph (concatenate the Edge Lists (src, dst) and Node Features)
https://github.com/tkipf/gcn/issues/4
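A minimal sketch of sparse-pattern batching (NumPy; variable names are mine): each graph's node ids are shifted so the batch becomes one big disconnected graph.

import numpy as np

edges_a = np.array([[0, 1], [1, 2]])   # graph A: 3 nodes
edges_b = np.array([[0, 1]])           # graph B: 2 nodes
feat_a = np.ones((3, 16), dtype=np.float32)
feat_b = np.ones((2, 16), dtype=np.float32)

offset = feat_a.shape[0]               # shift B's node ids by |V_A|
big_edges = np.concatenate([edges_a, edges_b + offset])
big_feat = np.concatenate([feat_a, feat_b])
graph_idx = np.array([0, 0, 0, 1, 1])  # node -> graph id, used later by readout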
Scatter Operation
[PyTorch Scatter]
https://pytorch-scatter.readthedocs.io/en/latest/functions/add.html
• Adds each value of the input to the element of the output specified by index
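For illustration, the same semantics with NumPy's np.add.at (PyTorch Scatter's scatter_add behaves the same way):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # input values
index = np.array([0, 1, 0, 2])       # output slot for each input value
out = np.zeros(3)
np.add.at(out, index, x)             # scatter-add: out[index[i]] += x[i]
print(out)                           # [4. 2. 4.]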
Graph Convolution
Matrix Multiplication (Padding Pattern)
● H' = A H W (dense matmul with the zero-padded adjacency matrix)
● Inefficient for large sparse graphs
Scatter Operation (Sparse Pattern: src → scatter_add(dst))
● Gather each edge's src-node features, scatter_add them into the dst nodes
● Efficient
● Broadcast-based
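A side-by-side sketch of both convolutions in NumPy (my own simplified version, ignoring normalization and bias):

import numpy as np

def conv_padding(adj, h, w):
    # Padding pattern: dense H' = A H W; cost grows with V^2 even if A is sparse
    return adj @ h @ w

def conv_sparse(edge_list, h, w):
    # Sparse pattern: gather src features per edge, scatter_add into dst
    src, dst = edge_list[:, 0], edge_list[:, 1]
    messages = h[src]                 # one feature row broadcast per edge
    out = np.zeros_like(h)
    np.add.at(out, dst, messages)     # out[dst[e]] += messages[e]
    return out @ w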
Readout
Padding Pattern: Mask and Aggregate
● Mask the node embeddings, then aggregate them per graph (graph-level aggregate → graph embeddings)
Sparse Pattern: Scatter Operation
● Scatter-aggregate the node embeddings by graph index (0, 1, 2, …) → graph embeddings
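A sketch of the sparse-pattern readout (NumPy; sum aggregation, with graph_idx as in the batching example above):

import numpy as np

node_emb = np.ones((5, 16), dtype=np.float32)  # embeddings of a 2-graph batch
graph_idx = np.array([0, 0, 0, 1, 1])          # node -> graph id
n_graphs = int(graph_idx.max()) + 1
graph_emb = np.zeros((n_graphs, node_emb.shape[1]), dtype=np.float32)
np.add.at(graph_emb, graph_idx, node_emb)      # scatter-aggregate per graph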
Experiment - Chemical Dataset -
Dataset                           Padding Pattern (s/epoch)   Sparse Pattern (s/epoch)
QM9  (V=1–9,  133,885 graphs)     6.92                        5.56    → 1.24 times faster!
ZINC (V≈38,   249,455 graphs)     16.67                       11.47   → 1.45 times faster!!
Training of RelGCN [2019 Schlichtkrull+]
layer_num=2, feature_num=16, batchsize=256
Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz
Memory Problem
Scatter Operation (Sparse Pattern: src → scatter_add(dst))
● Efficient
● But broadcasts one feature row per edge → a lot of memory consumption
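A back-of-the-envelope estimate (my own arithmetic, using the Reddit numbers from the experiments below): gathering one float32 feature row per edge materializes roughly

edges, feat_dim, bytes_per_float = 11_606_919, 64, 4
print(edges * feat_dim * bytes_per_float / 2**30)  # ≈ 2.77 GiB per layer

which is why the sparse pattern runs out of memory on Reddit.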
Chainer Sparse Matmul
https://docs.chainer.org/en/stable/reference/generated/chainer.functions.sparse_matmul.html
Memory Problem
Sparse Matmul (Coo Matrix Pattern: sparse_matmul(coo_adj))
● Unnecessary multiplications by 1 (every nonzero adjacency entry is 1)
● Less memory consumption (no per-edge feature copies)
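A minimal usage sketch, assuming Chainer's documented chainer.utils.to_coo and chainer.functions.sparse_matmul (the toy adjacency values are mine):

import numpy as np
import chainer.functions as F
from chainer.utils import to_coo

adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=np.float32)
h = np.random.randn(3, 16).astype(np.float32)

coo_adj = to_coo(adj)                # store only the nonzero entries (COO format)
h_new = F.sparse_matmul(coo_adj, h)  # A @ H without per-edge feature copies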
Experiment - Network Dataset on GPU -
Dataset                              Padding Pattern   Sparse Pattern   Coo Matrix Pattern
                                     (s/100 epoch)     (s/100 epoch)    (s/100 epoch)
Cora     (V=2,708,   E=5,278)        3.3760            3.0190           3.5500
Citeseer (V=3,312,   E=4,660)        6.8128            3.3024           6.2707
Reddit   (V=232,965, E=11,606,919)   Out of Memory     Out of Memory    318.76 (5.452 GB)

Reddit: 230K vertices, 11M edges!!
Training of GIN [2019 Keyulu+]
layer_num=2, feature_num=64
On a single Tesla V100-SXM2
Experiment - Network Dataset on CPU -
Dataset                              Padding Pattern   Sparse Pattern   Coo Matrix Pattern
                                     (s/100 epoch)     (s/100 epoch)    (s/100 epoch)
Cora     (V=2,708,   E=5,278)        224.439           22.8092          12.1168
Citeseer (V=3,312,   E=4,660)        1346.11           23.3707          39.8982
Reddit   (V=232,965, E=11,606,919)   Out of Memory     Out of Memory    28097.187

Reddit: 230K vertices, 11M edges!!
Training of GIN [2019 Keyulu+]
layer_num=2, feature_num=64
On Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz
Conclusion
• The sparse pattern is good in most cases
  – No multiplication involved (scatter additions only)
• For very large graphs, the CooMatrix pattern saves memory
  – Though it is not as fast as the sparse pattern
Summary
                     Chainer Chemistry            Goal
Overall              GNN for Chemical Data        General Framework of GNN
Graph Data Pattern   Padding Pattern              + Sparse Pattern
Dataset              Chemical Dataset (Small)     + Network (Large)
                     ● QM9, Tox21, etc.           ● Citation Networks, Reddit
Task                 Graph Regression             + Node Regression
                     Graph Classification         + Node Classification
Additional                                        + Sparse matmul
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for Large and Sparse Graphs