NS-CUK Seminar: V.T.Hoang, Review on "Are More Layers Beneficial to Graph Transformers?", International Conference on Learning Representations 2023
1. Van Thuy Hoang
Dept. of Artificial Intelligence,
The Catholic University of Korea
hoangvanthuy90@gmail.com
Haiteng Zhao, et al.; ICLR 2023
2. 2
Problems
Proposed model architecture
Novel graph transformer model named DeepGraph
Why more self-attention layers become a disadvantage
Experiments
4. 4
RELATED WORK
Graph transformers
Some other works introduce structure information into attention via graph distance, path embeddings, or features encoded by GNNs
Pure transformers
Recent works apply transformers to graph tasks by designing a variety of structure encoding techniques
Deep neural networks
Graph substructure
Certain substructures can also be pivotal features for graph property prediction
7. 7
SUBSTRUCTURE SAMPLING
The sampled substructures cover every node of the graph as evenly as possible, in order to reduce biases resulting from the uneven density of substructures
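A rough illustration of such even-coverage sampling; the sample_substructures helper, the least-covered-seed heuristic, and the use of k-hop neighborhoods as substructures are assumptions for clarity, not the paper's exact sampling procedure:

```python
import networkx as nx

def sample_substructures(G, num_samples, k=1):
    """Sample k-hop neighborhood substructures, seeding each sample at a
    currently least-covered node (illustrative heuristic for even coverage;
    the paper's sampler may differ)."""
    coverage = {v: 0 for v in G.nodes}
    substructures = []
    for _ in range(num_samples):
        seed = min(coverage, key=coverage.get)          # pick the least-covered node so far
        members = set(nx.ego_graph(G, seed, radius=k))  # k-hop neighborhood around the seed
        substructures.append(members)
        for v in members:
            coverage[v] += 1                            # update how often each node is covered
    return substructures
```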
8. 8
SUBSTRUCTURE TOKEN ENCODING
The formal definition of the substructure token encoder is given as:
A single sample per step is sufficient during training for the model to learn the substructure stably.
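As a minimal sketch of how a substructure token could be produced, assuming simple mean pooling of member-node embeddings followed by a linear projection (an illustrative choice, not necessarily the paper's encoder):

```python
import torch
import torch.nn as nn

class SubstructureTokenEncoder(nn.Module):
    """Illustrative encoder: pool member-node embeddings into one substructure token."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, node_emb, member_idx):
        # node_emb: (num_nodes, dim); member_idx: LongTensor of the substructure's node indices
        pooled = node_emb[member_idx].mean(dim=0)  # mean pooling over member nodes (assumption)
        return self.proj(pooled)                   # substructure token, shape (dim,)
```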
9. 9
LOCAL ATTENTION ON SUBSTRUCTURES
After substructure tokens have been added, each substructure and its corresponding nodes receive localized attention
A mask M is added in the self-attention module
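A minimal sketch of such a mask M, assuming a token layout of [nodes..., substructure tokens...] where node-to-node attention stays global and each substructure token attends only to its member nodes; the build_local_attention_mask helper and the exact masking pattern are illustrative assumptions:

```python
import torch

def build_local_attention_mask(num_nodes, substructures):
    """Additive attention mask over [nodes..., substructure tokens...].
    0 = attention allowed, -inf = blocked."""
    num_sub = len(substructures)
    size = num_nodes + num_sub
    mask = torch.zeros(size, size)
    mask[num_nodes:, :] = float("-inf")   # block all attention involving substructure tokens...
    mask[:, num_nodes:] = float("-inf")
    for i, members in enumerate(substructures):
        t = num_nodes + i
        mask[t, t] = 0.0          # token attends to itself
        mask[t, members] = 0.0    # ...then re-allow token -> its member nodes
        mask[members, t] = 0.0    # and member nodes -> token
    return mask
```

The returned tensor can be supplied as an additive attn_mask to torch.nn.MultiheadAttention.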
11. 11
RESULTS
EFFECT OF DEEPENING
The models are deepened by 2x and 4x compared to their original versions.
12. 12
CONCLUSION
Identifies the performance bottleneck of graph transformers as depth increases
Proposes DeepGraph, a novel graph transformer based on local attention over substructures with additional substructure tokens