Structure-Aware Transformer for Graph Representation Learning
Tien-Bach-Thanh Do
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: osfa19730@catholic.ac.kr
2024/02/26
Dexiong Chen et al.
International Conference on Machine Learning, 2022
Introduction
• The Structure-Aware Transformer (SAT) is a class of simple and flexible graph Transformers built upon a new self-attention mechanism.
• This new self-attention incorporates structural information into the original self-attention by extracting a subgraph representation rooted at each node before computing the attention.
Problem with Traditional Transformers
• Traditional Transformers with positional encoding do not necessarily capture structural similarity between nodes.
• This can be a limitation for graph representation learning.
• Message-passing GNNs, for their part, suffer from the over-smoothing and over-squashing problems.
Problem with Traditional Transformers
• Over-smoothing problem (message-passing strategies): as more message-passing layers are stacked, node representations converge and become nearly indistinguishable.
• Over-squashing problem (message-passing strategies): information from an exponentially growing neighborhood is compressed into a fixed-size vector, creating a bottleneck for long-range interactions.
Background
Transformers on Graphs
● A graph is written G = (V, E, X), where the attributes of node u are x_u and the attributes of all nodes are stored in the matrix X.
● A Transformer is composed of two main blocks: a self-attention module followed by a feed-forward neural network (FFN).
● X is first projected to query (Q), key (K), and value (V) matrices through linear projections: Q = XW_Q, K = XW_K, V = XW_V.
● Self-attention: Attn(X) = softmax(QKᵀ / √d_out) V.
● The output of the self-attention is followed by a skip-connection and an FFN, which together compose a Transformer layer.
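For concreteness, here is a minimal NumPy sketch of the self-attention block described above. It is illustrative only, not the authors' implementation; the names self_attention, W_q, W_k, W_v are ours.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over node features X (n x d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v              # linear projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise attention logits
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted average of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                          # 5 nodes, 8-dim features
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)               # (5, 8) updated features
```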
Background
Absolute encoding
● Absolute encoding refers to adding or concatenating positional or structural representations of the graph to the input node features before the main Transformer model.
● Examples: Laplacian positional encoding, random-walk positional encoding (RWPE).
● Absolute encodings do not provide a measure of the structural similarity between nodes and their neighborhoods.
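As a sketch of one such encoding, the following follows the RWPE definition (the return probability of i-step random walks, i = 1..k); the function name rwpe is ours.

```python
import numpy as np

def rwpe(A, k=4):
    """Random-walk positional encoding (sketch): for each node u, the
    probability that an i-step random walk returns to u, for i = 1..k."""
    M = A / A.sum(axis=1, keepdims=True)   # random-walk transition matrix
    P, M_i = [], np.eye(len(A))
    for _ in range(k):
        M_i = M_i @ M                      # i-step transition probabilities
        P.append(np.diag(M_i))             # diagonal = return probabilities
    return np.stack(P, axis=1)             # (n, k), concatenated to X

# On a 4-cycle every node is structurally identical, so all rows coincide
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
print(rwpe(A, k=3))
```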
Background
Self-attention as kernel smoothing
● Self-attention can be rewritten as a kernel smoother:

Attn(x_u) = Σ_{v∈V} [ κ_exp(x_u, x_v) / Σ_{w∈V} κ_exp(x_u, x_w) ] · v(x_v), with κ_exp(x, x') = exp(⟨W_Q x, W_K x'⟩ / √d_out)

where v is the linear value function (v(x) = W_V x) and κ_exp is a non-symmetric exponential kernel on node features.
● Mialon et al. (2021) propose a relative positional encoding strategy via the product of this kernel and a diffusion kernel on the graph, which captures the positional similarity between nodes; however, this method is only position-aware.
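The equivalence between the kernel-smoothing form and ordinary softmax attention is easy to verify numerically; a small sanity-check sketch (variable names ours):

```python
import numpy as np

# Sanity check: the kernel-smoothing form reproduces softmax attention.
rng = np.random.default_rng(1)
n, d = 4, 6
X = rng.normal(size=(n, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

def kappa_exp(x, y):                        # non-symmetric exponential kernel
    return np.exp((x @ W_q) @ (y @ W_k) / np.sqrt(d))

Kmat = np.array([[kappa_exp(X[u], X[v]) for v in range(n)] for u in range(n)])
smoothed = (Kmat / Kmat.sum(axis=1, keepdims=True)) @ (X @ W_v)  # kernel smoother

logits = (X @ W_q) @ (X @ W_k).T / np.sqrt(d)
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
assert np.allclose(smoothed, attn @ (X @ W_v))   # identical outputs
```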
The Structure-Aware Transformer
Structure-Aware Self-Attention
• To address this issue, the Structure-Aware Transformer was proposed.
• It incorporates structural information into the original self-attention by extracting a subgraph representation rooted at each node before computing the attention.
The Structure-Aware Transformer
Structure-Aware Self-Attention
• The problem with the kernel smoother above is that it cannot filter out nodes that are structurally different from the node of interest when they have the same or similar node features.
• To incorporate the structural similarity between nodes, SAT uses a more generalized kernel that additionally accounts for the local substructures around each node, i.e., a set of subgraphs centered at each node:

SA-attn(v) = Σ_{u∈V} [ κ_graph(S_G(v), S_G(u)) / Σ_{w∈V} κ_graph(S_G(v), S_G(w)) ] · f(x_u)

where S_G(v) denotes a subgraph in G centered at node v, associated with node features X, and κ_graph is a kernel comparing pairs of such subgraphs.
• This takes both the attribute similarity and the structural similarity between subgraphs into account.
• It generates more expressive node representations than the original self-attention.
• The attention is no longer equivariant to arbitrary permutations of nodes, but only to permutations of nodes whose features and subgraphs coincide.
The Structure-Aware Transformer
Structure-Aware Self-Attention
● The subgraph kernel is instantiated through a structure extractor:

κ_graph(S_G(u), S_G(v)) = κ_exp(φ(S_G(u)), φ(S_G(v)))

where φ is a structure extractor that produces a vector representation of the subgraph centered at u with node features X.
● k-subtree GNN extractor: applies a GNN with k layers to the input graph with node features X and takes the output node representation at u as the subgraph representation at u (see the sketch after this list).
● A small value of k already leads to good performance, while not suffering from over-smoothing and over-squashing.
● k-subgraph GNN extractor: a more expressive extractor that uses a GNN to directly compute the representation of the entire k-hop subgraph centered at u, rather than just the node representation at u.
● Using subgraphs rather than subtrees around each node makes the extractor more powerful than the 1-WL test.
● It aggregates the updated node representations of all nodes within the k-hop neighborhood using a pooling function such as summation.
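A minimal NumPy sketch of the k-subtree variant, assuming a simple mean-aggregation GNN as the extractor (any GNN can play this role; this is not the paper's exact architecture). Queries and keys come from the extracted subgraph representations φ(u), while the value function stays on the raw node features, as in the SA-attn formula above.

```python
import numpy as np

def k_subtree_extractor(A, X, Ws, k=2):
    """k-layer mean-aggregation GNN; H[u] is the subgraph representation phi(u).
    A: (n, n) adjacency, X: (n, d) features, Ws: list of k weight matrices."""
    A_hat = A + np.eye(len(A))                   # add self-loops
    A_hat /= A_hat.sum(axis=1, keepdims=True)    # mean aggregation
    H = X
    for W in Ws[:k]:
        H = np.maximum(A_hat @ H @ W, 0.0)       # message passing + ReLU
    return H

def structure_aware_attention(A, X, Ws, W_q, W_k, W_v, k=2):
    """SA-attn: queries/keys from subgraph reps phi(u); values from raw X."""
    H = k_subtree_extractor(A, X, Ws, k)
    logits = (H @ W_q) @ (H @ W_k).T / np.sqrt(W_k.shape[1])
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ (X @ W_v)                         # value function on features

rng = np.random.default_rng(0)
n, d = 6, 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1)
A += A.T                                         # random undirected graph
X = rng.normal(size=(n, d))
Ws = [rng.normal(size=(d, d)) * 0.1 for _ in range(2)]
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
out = structure_aware_attention(A, X, Ws, W_q, W_k, W_v)   # (6, 8)
```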
The Structure-Aware Transformer
Structure-Aware Transformer
● The structure-aware attention is followed by a skip-connection, an FFN, and two normalization layers placed before and after the FFN.
● A degree factor d_v, the degree of node v, is added in the skip-connection, reducing the overwhelming influence of highly connected graph components (see the layer sketch below).
● Each layer yields a new graph with the same structure but different node features, G' = (V, E, X'), where X' corresponds to the output of the Transformer layer.
● For graph property prediction, the node-level representations need to be aggregated into a graph representation, by taking their average or sum, or the embedding of a virtual CLS node (without any connectivity to other nodes).
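Putting the pieces together, a minimal sketch of one SAT encoder layer plus a mean-pooling readout, reusing structure_aware_attention from the sketch above. The 1/√(1 + d_v) scaling of the skip term, and the placement of layer_norm, are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def layer_norm(H, eps=1e-5):
    mu = H.mean(axis=-1, keepdims=True)
    sd = H.std(axis=-1, keepdims=True)
    return (H - mu) / (sd + eps)

def sat_layer(A, X, params, k=2):
    """One layer: SA-attn -> degree-scaled skip -> norm -> FFN -> skip -> norm."""
    deg = A.sum(axis=1, keepdims=True)                # node degrees d_v
    H = structure_aware_attention(A, X, params["Ws"], params["W_q"],
                                  params["W_k"], params["W_v"], k)
    H = layer_norm(H + X / np.sqrt(1.0 + deg))        # degree-scaled skip (assumed form)
    F = np.maximum(H @ params["W_1"], 0.0) @ params["W_2"]  # FFN
    return layer_norm(H + F)                          # X' of G' = (V, E, X')

def readout(H):
    return H.mean(axis=0)    # graph representation (average; sum/CLS also work)

rng = np.random.default_rng(2)
n, d, h = 6, 8, 16
A = np.ones((n, n)) - np.eye(n)                       # toy complete graph
X = rng.normal(size=(n, d))
params = {"Ws": [rng.normal(size=(d, d)) * 0.1 for _ in range(2)],
          "W_q": rng.normal(size=(d, d)), "W_k": rng.normal(size=(d, d)),
          "W_v": rng.normal(size=(d, d)),
          "W_1": rng.normal(size=(d, h)) * 0.1, "W_2": rng.normal(size=(h, d)) * 0.1}
g = readout(sat_layer(A, X, params))                  # (d,) graph embedding
```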
The Structure-Aware Transformer
Combination with Absolute Encoding
● Most absolute encoding techniques are only position-aware, and are therefore complementary to the structure-aware attention.
● The authors chose RWPE over other absolute positional representations.
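Combining the two amounts to concatenating the absolute encoding to the node features before the first SAT layer; a short sketch reusing rwpe from the Background snippet (k = 8 is an arbitrary choice here):

```python
import numpy as np

# Prepend the absolute encoding: concatenate RWPE columns to the input features.
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
X = np.ones((4, 3))                                 # toy node features
X_in = np.concatenate([X, rwpe(A, k=8)], axis=1)    # (4, 3 + 8) Transformer input
```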
Empirical Results
● Comparison to SOTA methods
● SAT models vs. sparse GNNs
● Hyperparameter studies
● Model interpretation
Conclusion
• The Structure-Aware Transformer successfully combines the advantages of GNNs and Transformers.
• It offers a new way to incorporate structural information into graph representation learning, leading to improved performance on various benchmarks.
• Limitations: k-subgraph SAT has higher memory requirements than k-subtree SAT.
• Future work: focus on reducing the high memory cost and time complexity of the self-attention computation.