SlideShare a Scribd company logo
1 of 19
TranSG: Transformer-Based
Skeleton Graph Prototype
Contrastive Learning
with Structure-Trajectory
Prompted Reconstruction for
Person Re-Identification
Tien-Bach-Thanh Do
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: osfa19730@catholic.ac.kr
2024/04/29
Haocong Rao et al.
CVPR 2023
2
Introduction
● Person re-identification (re-ID) is a challenging task of retrieving and matching a specific person across
varying views
● Person re-ID via 3D skeletons has attracted attention in both academia and industry
● Skeleton-based person re-ID methods model unique body and motion representations with 3D
positions of key body joints
● Key contributions:
○ Present a generic TranSG paradigm to learn effective representations form skeleton graphs
for person re-ID
○ Devise a skeleton graph transformer (SGT) to fully capture relations in skeleton graphs and
integrate key correlative noed features into graph representations
○ Propose the graph prototype contrastive learning (GPC) to contrast and learn the most
representative graph features and identity-related semantics from both skeleton and sequence
levels for person re-ID
○ Devise the graph structure-trajectory prompted reconstruction (STPR) to exploit spatial-
temporal graph contexts for reconstruction, to capture more key graph semantics and unique
feature for person re-ID
3
Related Works
Person Re-identification Using Skeleton Data
● [13] 7 Euclidean distances between certain joints are computed to perform distance metric learning
● [12,7] 13 and 16 skeleton descriptors are utilized to identify different persons
● [8] CNN-based model PoseGait is devised to encode body-joint sequences and hand-crafted pose
features
● [6] proposes the AGE model to encode recognizable gait features from 3D skeleton sequences
● [14] proposed to encode prototypes and intra-sequence relations of masked skeleton sequences
● [18,19] perform multi-stage body-component relation learning based on multi-scale graphs
● [20] proposes a general skeleton feature re-ranking mechanism
[13] I. B. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, and V. Murino, “Re-identification with RGB-D sensors,” in European Conference on Computer Vision (ECCV) Workshop, pp. 433–442, Springer, 2012.
[12] M. Munaro, A. Fossati, A. Basso, E. Menegatti, and L. Van Gool, “One-shot person re-identification with a consumer depth camera,” in Person Re-Identification, pp. 161– 181, Springer, 2014.
[7] P. Pala, L. Seidenari, S. Berretti, and A. Del Bimbo, “Enhanced skeleton and face 3D data for person re-identification from depth cameras,” Computers & Graphics, vol. 79, pp. 69–80, 2019.
[8] R. Liao, S. Yu, W. An, and Y. Huang, “A model-based gait recognition method with body pose and human prior knowledge,” Pattern Recognition, vol. 98, p. 107069, 2020.
[6] H. Rao, S. Wang, X. Hu, M. Tan, H. Da, J. Cheng, and B. Hu, “Self-supervised gait encoding with locality-aware attention for person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), vol. 1, pp. 898–905, 2020.
[14] H. Rao and C. Miao, “SimMC: Simple masked contrastive learning of skeleton representations for unsupervised person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 1290–1297, 2022.
[18] H. Rao, S. Xu, X. Hu, J. Cheng, and B. Hu, “Multi-level graph encoding with structural-collaborative relation learning for skeleton-based person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 973–980,
2021.
[19] H. Rao, X. Hu, J. Cheng, and B. Hu, “SM-SGE: A self- supervised multi-scale skeleton graph encoding framework for person re-identification,” in Proceedings of the 29th ACM International Conference on Multimedia, pp. 1812–1820, 2021.
[20] H. Rao, Y. Li, and C. Miao, “Revisiting k-reciprocal distance re-ranking for skeleton-based person re-identification,” IEEE Signal Processing Letters, vol. 29, pp. 2103–2107, 2022.
4
Related Works
Contrastive Learning
● Aim to pull closer homogeneous or positive representation pairs while pushing farther negative pairs in a
certain feature space
● [21] an instance discrimination paradigm with exemplar tasks
● [27] approach based on the context auto-encoding and probabilistic contrastive loss to learn different-
domain representations
● [22] combines k-means clustering and CL
● [2,14] based on consecutive sequences or randomly masked sequences
[21] Z. Wu, Y. Xiong, S. X. Yu, and D. Lin, “Unsupervised feature learning via non-parametric instance discrimination,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3733–3742, 2018. 2
[27] A. van den Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” arXiv preprint arXiv:1807.03748, 2018.
[22] J. Li, P. Zhou, C. Xiong, and S. Hoi, “Prototypical con- trastive learning of unsupervised representations,” in Inter- national Conference on Learning Representation (ICLR), 2021.
[2] H. Rao, S. Wang, X. Hu, M. Tan, Y. Guo, J. Cheng, X. Liu, and B. Hu, “A self-supervised gait encoding approach with locality-awareness for 3D skeleton based person re-identification,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, no. 01, pp. 1–1, 2021.
[14] H. Rao and C. Miao, “SimMC: Simple masked contrastive learning of skeleton representations for unsupervised person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 1290–1297, 2022.
5
Related Works
Prompt Learning
● To provide additional knowledge, instruction, or context for the input of models => give more reliable
outputs for different tasks [28-36]
● [30] leverages language-based prompts to generalize the pre-trained visual representations to many
tasks
● [31] automatically model task-relevant prompt with continuous representations to improve the
downstream task performance
[28] F. Petroni, T. Rocktaschel, S. Riedel, P. Lewis, A. Bakhtin, Y. Wu, and A. Miller, “Language models as knowledge bases?,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2463–2473,
2019
[29] C. Jia, Y. Yang, Y. Xia, Y.-T. Chen, Z. Parekh, H. Pham, Q. Le, Y.-H. Sung, Z. Li, and T. Duerig, “Scaling up visual and vision-language representation learning with noisy text supervision,” in International Conference on Machine Learning
(ICML), pp. 4904–4916, PMLR, 2021
[30] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al., “Learning transferable visual models from natural language supervision,” in International Conference on Machine Learning
(ICML), pp. 8748–8763, PMLR, 2021
[31] K. Zhou, J. Yang, C. C. Loy, and Z. Liu, “Learning to prompt for vision-language models,” International Journal of Computer Vision, vol. 130, no. 9, pp. 2337–2348, 2022
[32] Z. Jiang, F. F. Xu, J. Araki, and G. Neubig, “How can we know what language models know?,” Transactions of the Association for Computational Linguistics, vol. 8, pp. 423–438, 2020.
[33] B. Lester, R. Al-Rfou, and N. Constant, “The power of scale for parameter-efficient prompt tuning,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3045–3059, 2021
[34] X. L. Li and P. Liang, “Prefix-tuning: Optimizing continuous prompts for generation,” in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 4582–4597, 2021
[35] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neu- big, “Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing,” arXiv preprint arXiv:2107.13586, 2021
[36] T. Shin, Y. Razeghi, R. L. Logan IV, E. Wallace, and S. Singh, “Autoprompt: Eliciting knowledge from language models with automatically generated prompts,” in Proceedings of the Conference on Empirical Methods in Natural Language
Processing (EMNLP), pp. 4222–4235, 2020
6
Methodology
7
Methodology
The Proposed Approach
● 3D skeleton sequence X = (x1, … , xf) Rf*J*3, where xt ∈ RJ*3 denotes the tth skeleton with 3D coordinates
of J body joints
● Each skeleton sequence X corresponds to person identity y where y ∈ {1, … , C} and C is the number of
different classes
● ФT = {XT
i}N1
i=1, ФP = {XP
i}N2
i=1, ФG = {XG
i}N3
i=1 denote the training set, probe set, and gallery set that contain
N1, N2, N3 skeleton sequences of different person in different scenes and views
● Aim to learn an encoder to map skeleton sequences into effective representations => encoded skeleton
sequence representations in the probe set can be matched with the representations of the same identity
in the gallery set
8
Methodology
Skeleton Graph Construction
● Skeleton graphs with the body joints as nodes and the structural connections between adjacent joints as
edges
● Gt(Vt, Et) corresponding to the tth skeleton xt, consists of nodes Vt = {vt
1, vt
2, … ,vt
J}, vt
i ∈ R3 and edges
Et = {et
i,j|vt
i, vt
j ∈ Vt}, et
i,j ∈ R
● Vt, Et denote the set of nodes corresponding to J different body joints and the set of their internal
connection relations
9
Methodology
Skeleton Graph Transformer
● Aim to capture discriminative skeleton features
● Problem:
○ Body structural features, which can be inferred from the relations between adjacent body joints
○ Skeleton actional patterns, which are typically characterized by the relations among different body
components
● Each body-joint node as a basic body component, combine the above relation learning as a full-relation
learning of body-joint nodes => fully aggregate key body and motion features from skeleton graphs
● Positional encoding
● Position-encoded node representation
● Multiple independent FR heads, aggregate corresponding relational feature
10
Methodology
Skeleton Graph Transformer
● Apply Feed Forward Network with residual connections and batch normalization
● Final sequence-level graph representation S
11
Methodology
Graph Prototype Contrastive Learning
● Each individual’s skeletons usually share the same anthropometric features (skeletal lengths), while their
continuous sequence can characterize identity-specific walking patterns (gait)
● Given the encoded graph representations of training skeleton sequences , group
them by growth-truth classes as , where denote
● Graph prototype is generated
● Graph prototype contrastive loss
12
Methodology
Graph Structure-Trajectory Prompted Reconstruction Mechanism
● To exploit more valuable graph features and high-level semantics from both spatial and temporal context
of skeleton graphs, proposed a self-supervised graph Structure and Trajectory Prompted Reconstruction
mechanism
● Devise 2 graph context based prompts
○ Partial node positions of the same graph
○ Partial node trajectory among continuous graphs
● Randomly mask node positions to generate the masked graph representation
● Then exploited to prompt the skeleton reconstruction
13
Methodology
Graph Trajectory Prompted Reconstruction
● To encourage the model to capture more unique temporal patterns from skeleton graphs, propose to
reconstruct graph trajectories based on their partial dynamics, randomly mask the trajectory of each node
● The masked trajectory representation preserves partial graph dynamics of different body-joint nodes,
which are then use to prompt the skeleton reconstruction
● Propose the STPR objective to combine both graph structure and trajectory prompt reconstruction
14
Methodology
The Entire Approach
● Combine the proposed GPC and STPR to perform skeleton representation learning with
where λ is the weight coefficient to fuse two losses for training
15
Experimental Setups
4 datasets
• IAS
○ 11 different persons
• KS20
○ 20 different persons
• BIWI
○ 50 different persons
• KGBD
○ 164 different persons
16
Experimental Setups
Implementation Details
• J = 20 in KGBD, IAS, BIWI (number of body joints)
• J = 25 in KS20
• J = 14 in the estimated skeleton data of CAIA-B
• f = 6 for Kinect-based skeleton datasets (IAS, KS20, BIWI, KGBD) (sequence length)
• f = 40 for RGB-estimated skelton data (CASIA-B)
• Evaluation Metrics
○ Cumulative Matching Characteristics (CMC)
○ Rank-1 accuracy (R1), Rank-5 accuracy (R5), Rank-10 accuracy (R10)
○ Mean Average Precision (mAP) is used to evaluate the overall performance
17
Results
SOTA comparison
18
Results
Ablation Study
19
Conclusion
• TranSG learn effective representations from skelton graphs for person re-ID
• Devise a SGT to perform full-relation learning of body-joint nodes to aggregate key body and motion
features into graph representations
• GPC learn discriminative graph representations by contrasting their inherent similarity with the most
representative graph features
• Design STPR mechanism to encourage learning richer graph semantics and key patterns for person re-ID
• TranSG outperform existing SOTA models, scalable to be applied to different scenarios

More Related Content

Similar to 240429_Thanh_LabSeminar[TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification].pptx

Advances of neural networks in 2020
Advances of neural networks in 2020Advances of neural networks in 2020
Advances of neural networks in 2020kevig
 
Extracting individual information using facial recognition in a smart mirror....
Extracting individual information using facial recognition in a smart mirror....Extracting individual information using facial recognition in a smart mirror....
Extracting individual information using facial recognition in a smart mirror....IQRARANI11
 
TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019sipij
 
Deep randomized neural networks
Deep randomized neural networksDeep randomized neural networks
Deep randomized neural networksClaudio Gallicchio
 
TOP 10 STORAGE & RETRIEVAL PAPERS : RECOMMENDED READING
TOP 10 STORAGE & RETRIEVAL PAPERS :  RECOMMENDED READINGTOP 10 STORAGE & RETRIEVAL PAPERS :  RECOMMENDED READING
TOP 10 STORAGE & RETRIEVAL PAPERS : RECOMMENDED READINGsipij
 
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...Multimodal Learning Analytics for Collaborative Learning Understanding and Su...
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...Sambit Praharaj
 
Top Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and AnimationTop Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and Animationijcga
 
Top 5 most viewed articles from academia in 2019 -
Top 5 most viewed articles from academia in 2019 - Top 5 most viewed articles from academia in 2019 -
Top 5 most viewed articles from academia in 2019 - gerogepatton
 
Co-located Collaboration Analytics
Co-located Collaboration AnalyticsCo-located Collaboration Analytics
Co-located Collaboration AnalyticsSambit Praharaj
 
Intention recognition for dynamic role exchange in haptic
Intention recognition for dynamic role exchange in hapticIntention recognition for dynamic role exchange in haptic
Intention recognition for dynamic role exchange in hapticأحلام انصارى
 
New research articles 2020 october issue international journal of multimedi...
New research articles 2020 october  issue  international journal of multimedi...New research articles 2020 october  issue  international journal of multimedi...
New research articles 2020 october issue international journal of multimedi...ijma
 
Mining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization LiteratureMining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization LiteratureNils Gehlenborg
 
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...Cagatay Turkay
 
Bayanno-Net: Bangla Handwritten Digit Recognition using CNN
Bayanno-Net: Bangla Handwritten Digit Recognition using CNNBayanno-Net: Bangla Handwritten Digit Recognition using CNN
Bayanno-Net: Bangla Handwritten Digit Recognition using CNNMohammad Shakirul islam
 
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...kevig
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?Paul Groth
 
Trans-Disciplinary Practice week 6
Trans-Disciplinary Practice week 6Trans-Disciplinary Practice week 6
Trans-Disciplinary Practice week 6R. Sosa
 
Human-centered AI: towards the next generation of interactive and adaptive ex...
Human-centered AI: towards the next generation of interactive and adaptive ex...Human-centered AI: towards the next generation of interactive and adaptive ex...
Human-centered AI: towards the next generation of interactive and adaptive ex...Katrien Verbert
 
Ontology visualization methods—a survey
Ontology visualization methods—a surveyOntology visualization methods—a survey
Ontology visualization methods—a surveyunyil96
 

Similar to 240429_Thanh_LabSeminar[TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification].pptx (20)

Advances of neural networks in 2020
Advances of neural networks in 2020Advances of neural networks in 2020
Advances of neural networks in 2020
 
Extracting individual information using facial recognition in a smart mirror....
Extracting individual information using facial recognition in a smart mirror....Extracting individual information using facial recognition in a smart mirror....
Extracting individual information using facial recognition in a smart mirror....
 
TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019TOP 5 Most View Article From Academia in 2019
TOP 5 Most View Article From Academia in 2019
 
Deep randomized neural networks
Deep randomized neural networksDeep randomized neural networks
Deep randomized neural networks
 
TOP 10 STORAGE & RETRIEVAL PAPERS : RECOMMENDED READING
TOP 10 STORAGE & RETRIEVAL PAPERS :  RECOMMENDED READINGTOP 10 STORAGE & RETRIEVAL PAPERS :  RECOMMENDED READING
TOP 10 STORAGE & RETRIEVAL PAPERS : RECOMMENDED READING
 
EWIC talk - 07 June, 2018
EWIC talk - 07 June, 2018EWIC talk - 07 June, 2018
EWIC talk - 07 June, 2018
 
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...Multimodal Learning Analytics for Collaborative Learning Understanding and Su...
Multimodal Learning Analytics for Collaborative Learning Understanding and Su...
 
Top Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and AnimationTop Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and Animation
 
Top 5 most viewed articles from academia in 2019 -
Top 5 most viewed articles from academia in 2019 - Top 5 most viewed articles from academia in 2019 -
Top 5 most viewed articles from academia in 2019 -
 
Co-located Collaboration Analytics
Co-located Collaboration AnalyticsCo-located Collaboration Analytics
Co-located Collaboration Analytics
 
Intention recognition for dynamic role exchange in haptic
Intention recognition for dynamic role exchange in hapticIntention recognition for dynamic role exchange in haptic
Intention recognition for dynamic role exchange in haptic
 
New research articles 2020 october issue international journal of multimedi...
New research articles 2020 october  issue  international journal of multimedi...New research articles 2020 october  issue  international journal of multimedi...
New research articles 2020 october issue international journal of multimedi...
 
Mining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization LiteratureMining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization Literature
 
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...
The Inquisitive Data Scientist: Facilitating Well-Informed Data Science throu...
 
Bayanno-Net: Bangla Handwritten Digit Recognition using CNN
Bayanno-Net: Bangla Handwritten Digit Recognition using CNNBayanno-Net: Bangla Handwritten Digit Recognition using CNN
Bayanno-Net: Bangla Handwritten Digit Recognition using CNN
 
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...
Top 5 MOST VIEWED LANGUAGE COMPUTING ARTICLE - International Journal on Natur...
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?
 
Trans-Disciplinary Practice week 6
Trans-Disciplinary Practice week 6Trans-Disciplinary Practice week 6
Trans-Disciplinary Practice week 6
 
Human-centered AI: towards the next generation of interactive and adaptive ex...
Human-centered AI: towards the next generation of interactive and adaptive ex...Human-centered AI: towards the next generation of interactive and adaptive ex...
Human-centered AI: towards the next generation of interactive and adaptive ex...
 
Ontology visualization methods—a survey
Ontology visualization methods—a surveyOntology visualization methods—a survey
Ontology visualization methods—a survey
 

More from thanhdowork

[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...
[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...
[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...thanhdowork
 
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...thanhdowork
 
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...thanhdowork
 
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptxthanhdowork
 
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptxthanhdowork
 
240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptxthanhdowork
 
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...thanhdowork
 
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...thanhdowork
 
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...thanhdowork
 
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...thanhdowork
 
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...thanhdowork
 
[20240422_LabSeminar_Huy]Taming_Effect.pptx
[20240422_LabSeminar_Huy]Taming_Effect.pptx[20240422_LabSeminar_Huy]Taming_Effect.pptx
[20240422_LabSeminar_Huy]Taming_Effect.pptxthanhdowork
 
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...thanhdowork
 
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...thanhdowork
 
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...thanhdowork
 
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...thanhdowork
 
240115_Attention Is All You Need (2017 NIPS).pptx
240115_Attention Is All You Need (2017 NIPS).pptx240115_Attention Is All You Need (2017 NIPS).pptx
240115_Attention Is All You Need (2017 NIPS).pptxthanhdowork
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...thanhdowork
 
240122_Attention Is All You Need (2017 NIPS)2.pptx
240122_Attention Is All You Need (2017 NIPS)2.pptx240122_Attention Is All You Need (2017 NIPS)2.pptx
240122_Attention Is All You Need (2017 NIPS)2.pptxthanhdowork
 
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...thanhdowork
 

More from thanhdowork (20)

[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...
[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...
[20240520_LabSeminar_Huy]DSTAGNN: Dynamic Spatial-Temporal Aware Graph Neural...
 
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...
240520_Thanh_LabSeminar[G-MSM: Unsupervised Multi-Shape Matching with Graph-b...
 
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...
240513_Thanh_LabSeminar[Learning and Aggregating Lane Graphs for Urban Automa...
 
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx
240513_Thuy_Labseminar[Universal Prompt Tuning for Graph Neural Networks].pptx
 
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx
[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx
 
240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx
 
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...
[20240506_LabSeminar_Huy]Conditional Local Convolution for Spatio-Temporal Me...
 
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...
240506_Thuy_Labseminar[GraphPrompt: Unifying Pre-Training and Downstream Task...
 
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...
[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traff...
 
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...
240429_Thuy_Labseminar[Simplifying and Empowering Transformers for Large-Grap...
 
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...
240422_Thanh_LabSeminar[Dynamic Graph Enhanced Contrastive Learning for Chest...
 
[20240422_LabSeminar_Huy]Taming_Effect.pptx
[20240422_LabSeminar_Huy]Taming_Effect.pptx[20240422_LabSeminar_Huy]Taming_Effect.pptx
[20240422_LabSeminar_Huy]Taming_Effect.pptx
 
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
 
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...
[20240415_LabSeminar_Huy]Deciphering Spatio-Temporal Graph Forecasting: A Cau...
 
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...
240315_Thanh_LabSeminar[G-TAD: Sub-Graph Localization for Temporal Action Det...
 
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...
240415_Thuy_Labseminar[Simple and Asymmetric Graph Contrastive Learning witho...
 
240115_Attention Is All You Need (2017 NIPS).pptx
240115_Attention Is All You Need (2017 NIPS).pptx240115_Attention Is All You Need (2017 NIPS).pptx
240115_Attention Is All You Need (2017 NIPS).pptx
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
 
240122_Attention Is All You Need (2017 NIPS)2.pptx
240122_Attention Is All You Need (2017 NIPS)2.pptx240122_Attention Is All You Need (2017 NIPS)2.pptx
240122_Attention Is All You Need (2017 NIPS)2.pptx
 
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...
240226_Thanh_LabSeminar[Structure-Aware Transformer for Graph Representation ...
 

Recently uploaded

Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...EduSkills OECD
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMELOISARIVERA8
 
An overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismAn overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismDabee Kamal
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxAdelaideRefugio
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....Ritu480198
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxLimon Prince
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportDenish Jangid
 
Climbers and Creepers used in landscaping
Climbers and Creepers used in landscapingClimbers and Creepers used in landscaping
Climbers and Creepers used in landscapingDr. M. Kumaresan Hort.
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxMarlene Maheu
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSean M. Fox
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital ManagementMBA Assignment Experts
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi RajagopalEADTU
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024Borja Sotomayor
 
How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17Celine George
 

Recently uploaded (20)

Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
 
An overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismAn overview of the various scriptures in Hinduism
An overview of the various scriptures in Hinduism
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
 
Climbers and Creepers used in landscaping
Climbers and Creepers used in landscapingClimbers and Creepers used in landscaping
Climbers and Creepers used in landscaping
 
PSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptxPSYPACT- Practicing Over State Lines May 2024.pptx
PSYPACT- Practicing Over State Lines May 2024.pptx
 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
 
8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management8 Tips for Effective Working Capital Management
8 Tips for Effective Working Capital Management
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17
 

240429_Thanh_LabSeminar[TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification].pptx

  • 1. TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification Tien-Bach-Thanh Do Network Science Lab Dept. of Artificial Intelligence The Catholic University of Korea E-mail: osfa19730@catholic.ac.kr 2024/04/29 Haocong Rao et al. CVPR 2023
  • 2. 2 Introduction ● Person re-identification (re-ID) is a challenging task of retrieving and matching a specific person across varying views ● Person re-ID via 3D skeletons has attracted attention in both academia and industry ● Skeleton-based person re-ID methods model unique body and motion representations with 3D positions of key body joints ● Key contributions: ○ Present a generic TranSG paradigm to learn effective representations form skeleton graphs for person re-ID ○ Devise a skeleton graph transformer (SGT) to fully capture relations in skeleton graphs and integrate key correlative noed features into graph representations ○ Propose the graph prototype contrastive learning (GPC) to contrast and learn the most representative graph features and identity-related semantics from both skeleton and sequence levels for person re-ID ○ Devise the graph structure-trajectory prompted reconstruction (STPR) to exploit spatial- temporal graph contexts for reconstruction, to capture more key graph semantics and unique feature for person re-ID
  • 3. 3 Related Works Person Re-identification Using Skeleton Data ● [13] 7 Euclidean distances between certain joints are computed to perform distance metric learning ● [12,7] 13 and 16 skeleton descriptors are utilized to identify different persons ● [8] CNN-based model PoseGait is devised to encode body-joint sequences and hand-crafted pose features ● [6] proposes the AGE model to encode recognizable gait features from 3D skeleton sequences ● [14] proposed to encode prototypes and intra-sequence relations of masked skeleton sequences ● [18,19] perform multi-stage body-component relation learning based on multi-scale graphs ● [20] proposes a general skeleton feature re-ranking mechanism [13] I. B. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, and V. Murino, “Re-identification with RGB-D sensors,” in European Conference on Computer Vision (ECCV) Workshop, pp. 433–442, Springer, 2012. [12] M. Munaro, A. Fossati, A. Basso, E. Menegatti, and L. Van Gool, “One-shot person re-identification with a consumer depth camera,” in Person Re-Identification, pp. 161– 181, Springer, 2014. [7] P. Pala, L. Seidenari, S. Berretti, and A. Del Bimbo, “Enhanced skeleton and face 3D data for person re-identification from depth cameras,” Computers & Graphics, vol. 79, pp. 69–80, 2019. [8] R. Liao, S. Yu, W. An, and Y. Huang, “A model-based gait recognition method with body pose and human prior knowledge,” Pattern Recognition, vol. 98, p. 107069, 2020. [6] H. Rao, S. Wang, X. Hu, M. Tan, H. Da, J. Cheng, and B. Hu, “Self-supervised gait encoding with locality-aware attention for person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), vol. 1, pp. 898–905, 2020. [14] H. Rao and C. Miao, “SimMC: Simple masked contrastive learning of skeleton representations for unsupervised person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 1290–1297, 2022. [18] H. Rao, S. Xu, X. Hu, J. Cheng, and B. Hu, “Multi-level graph encoding with structural-collaborative relation learning for skeleton-based person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 973–980, 2021. [19] H. Rao, X. Hu, J. Cheng, and B. Hu, “SM-SGE: A self- supervised multi-scale skeleton graph encoding framework for person re-identification,” in Proceedings of the 29th ACM International Conference on Multimedia, pp. 1812–1820, 2021. [20] H. Rao, Y. Li, and C. Miao, “Revisiting k-reciprocal distance re-ranking for skeleton-based person re-identification,” IEEE Signal Processing Letters, vol. 29, pp. 2103–2107, 2022.
  • 4. 4 Related Works Contrastive Learning ● Aim to pull closer homogeneous or positive representation pairs while pushing farther negative pairs in a certain feature space ● [21] an instance discrimination paradigm with exemplar tasks ● [27] approach based on the context auto-encoding and probabilistic contrastive loss to learn different- domain representations ● [22] combines k-means clustering and CL ● [2,14] based on consecutive sequences or randomly masked sequences [21] Z. Wu, Y. Xiong, S. X. Yu, and D. Lin, “Unsupervised feature learning via non-parametric instance discrimination,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3733–3742, 2018. 2 [27] A. van den Oord, Y. Li, and O. Vinyals, “Representation learning with contrastive predictive coding,” arXiv preprint arXiv:1807.03748, 2018. [22] J. Li, P. Zhou, C. Xiong, and S. Hoi, “Prototypical con- trastive learning of unsupervised representations,” in Inter- national Conference on Learning Representation (ICLR), 2021. [2] H. Rao, S. Wang, X. Hu, M. Tan, Y. Guo, J. Cheng, X. Liu, and B. Hu, “A self-supervised gait encoding approach with locality-awareness for 3D skeleton based person re-identification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 01, pp. 1–1, 2021. [14] H. Rao and C. Miao, “SimMC: Simple masked contrastive learning of skeleton representations for unsupervised person re-identification,” in International Joint Conference on Artificial Intelligence (IJCAI), pp. 1290–1297, 2022.
  • 5. 5 Related Works Prompt Learning ● To provide additional knowledge, instruction, or context for the input of models => give more reliable outputs for different tasks [28-36] ● [30] leverages language-based prompts to generalize the pre-trained visual representations to many tasks ● [31] automatically model task-relevant prompt with continuous representations to improve the downstream task performance [28] F. Petroni, T. Rocktaschel, S. Riedel, P. Lewis, A. Bakhtin, Y. Wu, and A. Miller, “Language models as knowledge bases?,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2463–2473, 2019 [29] C. Jia, Y. Yang, Y. Xia, Y.-T. Chen, Z. Parekh, H. Pham, Q. Le, Y.-H. Sung, Z. Li, and T. Duerig, “Scaling up visual and vision-language representation learning with noisy text supervision,” in International Conference on Machine Learning (ICML), pp. 4904–4916, PMLR, 2021 [30] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al., “Learning transferable visual models from natural language supervision,” in International Conference on Machine Learning (ICML), pp. 8748–8763, PMLR, 2021 [31] K. Zhou, J. Yang, C. C. Loy, and Z. Liu, “Learning to prompt for vision-language models,” International Journal of Computer Vision, vol. 130, no. 9, pp. 2337–2348, 2022 [32] Z. Jiang, F. F. Xu, J. Araki, and G. Neubig, “How can we know what language models know?,” Transactions of the Association for Computational Linguistics, vol. 8, pp. 423–438, 2020. [33] B. Lester, R. Al-Rfou, and N. Constant, “The power of scale for parameter-efficient prompt tuning,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3045–3059, 2021 [34] X. L. Li and P. Liang, “Prefix-tuning: Optimizing continuous prompts for generation,” in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 4582–4597, 2021 [35] P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neu- big, “Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing,” arXiv preprint arXiv:2107.13586, 2021 [36] T. Shin, Y. Razeghi, R. L. Logan IV, E. Wallace, and S. Singh, “Autoprompt: Eliciting knowledge from language models with automatically generated prompts,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 4222–4235, 2020
  • 7. 7 Methodology The Proposed Approach ● 3D skeleton sequence X = (x1, … , xf) Rf*J*3, where xt ∈ RJ*3 denotes the tth skeleton with 3D coordinates of J body joints ● Each skeleton sequence X corresponds to person identity y where y ∈ {1, … , C} and C is the number of different classes ● ФT = {XT i}N1 i=1, ФP = {XP i}N2 i=1, ФG = {XG i}N3 i=1 denote the training set, probe set, and gallery set that contain N1, N2, N3 skeleton sequences of different person in different scenes and views ● Aim to learn an encoder to map skeleton sequences into effective representations => encoded skeleton sequence representations in the probe set can be matched with the representations of the same identity in the gallery set
  • 8. 8 Methodology Skeleton Graph Construction ● Skeleton graphs with the body joints as nodes and the structural connections between adjacent joints as edges ● Gt(Vt, Et) corresponding to the tth skeleton xt, consists of nodes Vt = {vt 1, vt 2, … ,vt J}, vt i ∈ R3 and edges Et = {et i,j|vt i, vt j ∈ Vt}, et i,j ∈ R ● Vt, Et denote the set of nodes corresponding to J different body joints and the set of their internal connection relations
  • 9. 9 Methodology Skeleton Graph Transformer ● Aim to capture discriminative skeleton features ● Problem: ○ Body structural features, which can be inferred from the relations between adjacent body joints ○ Skeleton actional patterns, which are typically characterized by the relations among different body components ● Each body-joint node as a basic body component, combine the above relation learning as a full-relation learning of body-joint nodes => fully aggregate key body and motion features from skeleton graphs ● Positional encoding ● Position-encoded node representation ● Multiple independent FR heads, aggregate corresponding relational feature
  • 10. 10 Methodology Skeleton Graph Transformer ● Apply Feed Forward Network with residual connections and batch normalization ● Final sequence-level graph representation S
  • 11. 11 Methodology Graph Prototype Contrastive Learning ● Each individual’s skeletons usually share the same anthropometric features (skeletal lengths), while their continuous sequence can characterize identity-specific walking patterns (gait) ● Given the encoded graph representations of training skeleton sequences , group them by growth-truth classes as , where denote ● Graph prototype is generated ● Graph prototype contrastive loss
  • 12. 12 Methodology Graph Structure-Trajectory Prompted Reconstruction Mechanism ● To exploit more valuable graph features and high-level semantics from both spatial and temporal context of skeleton graphs, proposed a self-supervised graph Structure and Trajectory Prompted Reconstruction mechanism ● Devise 2 graph context based prompts ○ Partial node positions of the same graph ○ Partial node trajectory among continuous graphs ● Randomly mask node positions to generate the masked graph representation ● Then exploited to prompt the skeleton reconstruction
  • 13. 13 Methodology Graph Trajectory Prompted Reconstruction ● To encourage the model to capture more unique temporal patterns from skeleton graphs, propose to reconstruct graph trajectories based on their partial dynamics, randomly mask the trajectory of each node ● The masked trajectory representation preserves partial graph dynamics of different body-joint nodes, which are then use to prompt the skeleton reconstruction ● Propose the STPR objective to combine both graph structure and trajectory prompt reconstruction
  • 14. 14 Methodology The Entire Approach ● Combine the proposed GPC and STPR to perform skeleton representation learning with where λ is the weight coefficient to fuse two losses for training
  • 15. 15 Experimental Setups 4 datasets • IAS ○ 11 different persons • KS20 ○ 20 different persons • BIWI ○ 50 different persons • KGBD ○ 164 different persons
  • 16. 16 Experimental Setups Implementation Details • J = 20 in KGBD, IAS, BIWI (number of body joints) • J = 25 in KS20 • J = 14 in the estimated skeleton data of CAIA-B • f = 6 for Kinect-based skeleton datasets (IAS, KS20, BIWI, KGBD) (sequence length) • f = 40 for RGB-estimated skelton data (CASIA-B) • Evaluation Metrics ○ Cumulative Matching Characteristics (CMC) ○ Rank-1 accuracy (R1), Rank-5 accuracy (R5), Rank-10 accuracy (R10) ○ Mean Average Precision (mAP) is used to evaluate the overall performance
  • 19. 19 Conclusion • TranSG learn effective representations from skelton graphs for person re-ID • Devise a SGT to perform full-relation learning of body-joint nodes to aggregate key body and motion features into graph representations • GPC learn discriminative graph representations by contrasting their inherent similarity with the most representative graph features • Design STPR mechanism to encourage learning richer graph semantics and key patterns for person re-ID • TranSG outperform existing SOTA models, scalable to be applied to different scenarios