240325_Thanh_LabSeminar[SkeletalGNN].pptx

•Download as PPTX, PDF•

0 likes•14 views

thanhdowork

Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation

Education

2
Introduction
● Single-view skeleton-based 3D pose estimation problem plays an important role in numerous
applications, such as human-computer interaction, video understanding, and human behavior analysis
● Task is given the 2D skeletal position regress the 3D positions of the corresponding joints
● Proposed skeletal GNN:
○ Hop-aware hierarchical channel squeezing fusion layer to extract relevant information from
neighboring nodes effectively while suppressing underised nosies
○ Built dynamic skeletal graph
● Reduce errors of 3D HPE

3
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition, 2018
● Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks,
2019
● Optimizing Network Structure for 3D Human Pose Estimation, 2019
● Semantic Graph Convolutional Networks for 3D Human Pose Regression, 2019
● High-order Graph Convolutional Networks for 3D Human Pose Estimation, 2020
● A comprehensive Study of Weight Sharing in Graph Network for 3D Human Pose Estimation, 2020
● Learning Global Pose Features in Graph Convolutional Networks for 3D Human Pose Estimation, 2020

4
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

5
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

6
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

7
Previous Approaches
● Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks

8
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

9
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

10
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

11
Previous Approaches
● Spatial Temporal GCN for Skeleton Based Action Recognition

12
Previous Approaches
● Optimizing Network Structure for 3D Human Pose Estimation

13
Previous Approaches
● Optimizing Network Structure for 3D Human Pose Estimation

14
Previous Approaches
● Optimizing Network Structure for 3D Human Pose Estimation

15
Previous Approaches
● Semantic Graph Convolutional Networks for 3D Human Pose Regression

16
Previous Approaches
● Semantic Graph Convolutional Networks for 3D Human Pose Regression

17
Previous Approaches
● Semantic Graph Convolutional Networks for 3D Human Pose Regression

18
Previous Approaches
● High-order Graph Convolutional Networks for 3D Human Pose Estimation

19
Previous Approaches
● High-order Graph Convolutional Networks for 3D Human Pose Estimation

20
Previous Approaches
● High-order Graph Convolutional Networks for 3D Human Pose Estimation

24
Observation
● Distant neighbors pass not only valuable semantic information but also irrelevant noise
● Dynamic graph construction is useful but should be delicately designed

25
Dynamic Hierarchical Channel-Squeezing Fusion

26
Dynamic Hierarchical Channel-Squeezing Fusion

27
Datasets
2 datasets
• Human3.6M consists of 3.6 million video frames with 15 actions from 4 camera viewpoints, where
accurate 3D human joints positions are captured from high-speed motion capture system
• MPI-INF-3DHP contains both constrained indoor scenes and complex outdoor scenes, covering a greater
diversity of poses and actions, where it is usually taken as a cross-dataset setting to verify the
generalization ability of the proposed methods

31
Conclusion
• A novel skeleton-based representation learning method for better dealing with hard poses in 3D pose
estimation
• The proposed Hierarchical Channel-Squeezing Fusion module can encode short-range and long-range
contexts, keeping essential information while reducing irrelevant noises

Similar to 240325_Thanh_LabSeminar[SkeletalGNN].pptx

AR/SLAM for end-users

Rakuten Group, Inc.

Deep learning for 3 d point clouds presentation

VijaylaxmiNagurkar

Enhanced real time semantic segmentation

AkankshaRawat42

3-d interpretation from single 2-d image for autonomous driving II

Yu Huang

3-d interpretation from stereo images for autonomous driving

Yu Huang

Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...

IRJET Journal

Learning Graph Representation for Data-Efficiency RL

lauratoni4

smart-city-application

Nam Giang

Pose estimation from RGB images by deep learning

Yu Huang

In this project, we propose methods for semantic segmentation with the deep learning state-of-the-art models. Moreover, we want to filterize the segmentation to the specific object in specific application. Instead of concentrating on unnecessary objects we can focus on special ones and make it more specialize and effecient for special purposes. Furtheromore, In this project, we leverage models that are suitable for face segmentation. The models that are used in this project are Mask-RCNN and DeepLabv3. The experimental results clearly indicate that how illustrated approach are efficient and robust in the segmentation task to the previous work in the field of segmentation. These models are reached to 74.4 and 86.6 precision of Mean of Intersection over Union. The visual Results of the models are shown in Appendix part.

Semantic segmentation with Convolutional Neural Network Approaches

Fellowship at Vodafone FutureLab

Indoor scene understanding for autonomous agents

Varun Bhaseen

Machine-Learned_3D_Building_Vectorization_From_Satellite_Imagery__paper.pdf

Yugank Aman

본 논문은 single depth map으로부터의 정확한 3D hand pose estimation을 목표로 한다. 3D hand pose estimation은 HCI, AR등의 기술을 구현함에 있어서 매우 중요한 기술이다. 이를 위해 많은 연구자들이 정확도를 높이기 위해 여러 방법을 제시하였지만, 여전히 손가락들의 비슷한 생김새, 가려짐, 다양한 손가락의 움직임으로 인한 복잡성 때문에 정확도를 올리는데 한계가 있었다. 본 논문은 기존 방법들의 한계를 극복하기 위해 기존 방법들이 사용하는 입력 형태와 출력 형태를 바꾸었다. 2d depth image를 입력으로 받아 hand joint의 3D coordinate를 직접 regress하는 대부분의 기존 방법들과는 달리, 제안하는 모델은 3D voxelized depth map을 입력으로 받아 3D heatmap을 출력한다. 이를 위해 encoder-decoder 형식의 3D CNN을 사용하였고, 달라진 입력과 출력 형태로 인해 제안하는 모델은 널리 사용되는 3개의 3d hand pose estimation dataset, 1개의 3d human pose estimation dataset에서 가장 높은 성능을 내었다. 또한 ICCV 2017에서 주최된 HANDS 2017 challenge에서 우승 하였다.

V2 v posenet

NAVER Engineering

Pratik ibm-open power-ppt

Vaibhav R

Pedestrian behavior/intention modeling for autonomous driving V

Yu Huang

fusion of Camera and lidar for autonomous driving I

Yu Huang

3D perception is crucial for understanding the real world. It offers many benefits and new capabilities over 2D across diverse applications, from XR and autonomous driving to IOT, camera, and mobile. 3D perception with machine learning is creating the new state of the art (SOTA) in areas, such as depth estimation, object detection, and neural scene representation. Making these SOTA neural networks feasible for real-world deployment on mobile devices constrained by power, thermal, and performance has been a challenge. Qualcomm AI Research has developed not only novel AI techniques for 3D perception but also full-stack AI optimizations to enable real-world deployments and energy-efficient solutions. This presentation explores the latest research that is enabling efficient 3D perception while maintaining neural network model accuracy. You’ll learn about: - The advantages of 3D perception over 2D and the need for 3D perception across applications - Advancements in 3D perception research by Qualcomm AI Research - Our future 3D perception research directions

Understanding the world in 3D with AI.pdf

Qualcomm Research

IEEE ALL DOMAIN TITLES 2016 FOR FINAL YEAR PROJECTS

tsysglobalsolutions

Spatial Computing and the Future of Utility GIS

George Percivall

Laplacian-regularized Graph Bandits

lauratoni4

Similar to 240325_Thanh_LabSeminar[SkeletalGNN].pptx (20)

AR/SLAM for end-users

Deep learning for 3 d point clouds presentation

Enhanced real time semantic segmentation

3-d interpretation from single 2-d image for autonomous driving II

3-d interpretation from stereo images for autonomous driving

Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...

Learning Graph Representation for Data-Efficiency RL

smart-city-application

Pose estimation from RGB images by deep learning

Semantic segmentation with Convolutional Neural Network Approaches

Indoor scene understanding for autonomous agents

Machine-Learned_3D_Building_Vectorization_From_Satellite_Imagery__paper.pdf

V2 v posenet

Pratik ibm-open power-ppt

Pedestrian behavior/intention modeling for autonomous driving V

fusion of Camera and lidar for autonomous driving I

Understanding the world in 3D with AI.pdf

IEEE ALL DOMAIN TITLES 2016 FOR FINAL YEAR PROJECTS

Spatial Computing and the Future of Utility GIS

Laplacian-regularized Graph Bandits

Recently uploaded

The Story of Village Palampur Class 9 Free Study Material PDF

Vivekanand Anglo Vedic Academy

The Liver & Gallbladder (Anatomy & Physiology).pptx

Vishal Singh

BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...

Nguyen Thanh Tu Collection

TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...

Nguyen Thanh Tu Collection

Photos from our Spring Gala 2024. Thank you for making our 2024 Spring Gala unforgettable! Whether you attended or supported us with a donation, we extend our heartfelt thanks for celebrating School-Community Partnerships alongside ExpandED Schools. Special appreciation to our honorees, Rachel Skaistis and Cravath, Swaine, & Moore LLP, for their unwavering support in advancing educational equity. Your contributions have profoundly impacted countless young New Yorkers, and we’re proud to have you as part of our community. Our gratitude also extends to our Board of Directors, sponsors, city partners, community-based organizations, and the inspiring young people who joined us. Your support fuels our mission, and we’re endlessly thankful for your dedication. While we celebrate our gala’s success, our work continues. As you view the evening’s highlights, consider how you can further empower young New Yorkers and educators with your donation, no matter the size. Donate today: https://www.expandedschools.org/support-our-work/donate/

Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships

expandedwebsite

MOOD STABLIZERS DRUGS.pptx

PoojaSen20

PSYPACT- Practicing Over State Lines May 2024.pptx

Marlene Maheu

An overview of the various scriptures in Hinduism

Dabee Kamal

Observing-Correct-Grammar-in-Making-Definitions.pptx

AdelaideRefugio

How to Manage Website in Odoo 17 Studio App.pptx

Celine George

An Overview of the Odoo 17 Knowledge App

Celine George

UChicago CMSC 23320 - The Best Commit Messages of 2024

Borja Sotomayor

Scopus Indexed Journals 2024 - ISCOPUS Publications

ISCOPE Publication

diagnosting testing bsc 2nd sem.pptx....

Ritu480198

Improved Approval Flow in Odoo 17 Studio App

Celine George

Graduate Outcomes Presentation Slides - English (v3).pptx

neillewis46

Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx

Limon Prince

When Quality Assurance Meets Innovation in Higher Education - Report launch w...

Gary Wood

How To Create Editable Tree View in Odoo 17

Celine George

Book Review of Run For Your Life Powerpoint

23600690

Recently uploaded (20)

The Story of Village Palampur Class 9 Free Study Material PDF

The Liver & Gallbladder (Anatomy & Physiology).pptx

BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...

TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...

Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships

MOOD STABLIZERS DRUGS.pptx

PSYPACT- Practicing Over State Lines May 2024.pptx

An overview of the various scriptures in Hinduism

Observing-Correct-Grammar-in-Making-Definitions.pptx

How to Manage Website in Odoo 17 Studio App.pptx

An Overview of the Odoo 17 Knowledge App

UChicago CMSC 23320 - The Best Commit Messages of 2024

Scopus Indexed Journals 2024 - ISCOPUS Publications

diagnosting testing bsc 2nd sem.pptx....

Improved Approval Flow in Odoo 17 Studio App

Graduate Outcomes Presentation Slides - English (v3).pptx

Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx

When Quality Assurance Meets Innovation in Higher Education - Report launch w...

How To Create Editable Tree View in Odoo 17

Book Review of Run For Your Life Powerpoint

240325_Thanh_LabSeminar[SkeletalGNN].pptx

1. Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation Tien-Bach-Thanh Do Network Science Lab Dept. of Artificial Intelligence The Catholic University of Korea E-mail: osfa19730@catholic.ac.kr 2024/03/25 Ailing Zeng et al. ICCV 2021

2. 2 Introduction ● Single-view skeleton-based 3D pose estimation problem plays an important role in numerous applications, such as human-computer interaction, video understanding, and human behavior analysis ● Task is given the 2D skeletal position regress the 3D positions of the corresponding joints ● Proposed skeletal GNN: ○ Hop-aware hierarchical channel squeezing fusion layer to extract relevant information from neighboring nodes effectively while suppressing underised nosies ○ Built dynamic skeletal graph ● Reduce errors of 3D HPE

3. 3 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition, 2018 ● Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks, 2019 ● Optimizing Network Structure for 3D Human Pose Estimation, 2019 ● Semantic Graph Convolutional Networks for 3D Human Pose Regression, 2019 ● High-order Graph Convolutional Networks for 3D Human Pose Estimation, 2020 ● A comprehensive Study of Weight Sharing in Graph Network for 3D Human Pose Estimation, 2020 ● Learning Global Pose Features in Graph Convolutional Networks for 3D Human Pose Estimation, 2020

4. 4 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

5. 5 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

6. 6 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

7. 7 Previous Approaches ● Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks

8. 8 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

9. 9 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

10. 10 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

11. 11 Previous Approaches ● Spatial Temporal GCN for Skeleton Based Action Recognition

12. 12 Previous Approaches ● Optimizing Network Structure for 3D Human Pose Estimation

13. 13 Previous Approaches ● Optimizing Network Structure for 3D Human Pose Estimation

14. 14 Previous Approaches ● Optimizing Network Structure for 3D Human Pose Estimation

15. 15 Previous Approaches ● Semantic Graph Convolutional Networks for 3D Human Pose Regression

16. 16 Previous Approaches ● Semantic Graph Convolutional Networks for 3D Human Pose Regression

17. 17 Previous Approaches ● Semantic Graph Convolutional Networks for 3D Human Pose Regression

18. 18 Previous Approaches ● High-order Graph Convolutional Networks for 3D Human Pose Estimation

19. 19 Previous Approaches ● High-order Graph Convolutional Networks for 3D Human Pose Estimation

20. 20 Previous Approaches ● High-order Graph Convolutional Networks for 3D Human Pose Estimation

21. 21 Motivation

22. 22 Motivation

23. 23 Framework

24. 24 Observation ● Distant neighbors pass not only valuable semantic information but also irrelevant noise ● Dynamic graph construction is useful but should be delicately designed

25. 25 Dynamic Hierarchical Channel-Squeezing Fusion

26. 26 Dynamic Hierarchical Channel-Squeezing Fusion

27. 27 Datasets 2 datasets • Human3.6M consists of 3.6 million video frames with 15 actions from 4 camera viewpoints, where accurate 3D human joints positions are captured from high-speed motion capture system • MPI-INF-3DHP contains both constrained indoor scenes and complex outdoor scenes, covering a greater diversity of poses and actions, where it is usually taken as a cross-dataset setting to verify the generalization ability of the proposed methods

28. 28 Datasets Results

29. 29 Datasets Results

30. 30 Datasets Qualitatives

31. 31 Conclusion • A novel skeleton-based representation learning method for better dealing with hard poses in 3D pose estimation • The proposed Hierarchical Channel-Squeezing Fusion module can encode short-range and long-range contexts, keeping essential information while reducing irrelevant noises

240325_Thanh_LabSeminar[SkeletalGNN].pptx

Recommended

Recommended

More Related Content

Similar to 240325_Thanh_LabSeminar[SkeletalGNN].pptx

Similar to 240325_Thanh_LabSeminar[SkeletalGNN].pptx (20)

More from thanhdowork

More from thanhdowork (20)

Recently uploaded

Recently uploaded (20)

240325_Thanh_LabSeminar[SkeletalGNN].pptx