VNect


Yunkyu Choi

Real-time 3D Human Pose Estimation with a Single RGB Camera

Contents
● Overview
● Process
● 3D Pose Estimation
○ CNN Regression
○ Kinematic Skeleton Fitting
● Result
● Limitations
● Conclusion

Overview
● Full global 3D skeleton pose
○ global: not local 3D pose relative to a bounding box
● real-time
○ 30Hz
● a single RGB Camera
● CNN based pose regressor + kinematic skeleton fitting
○ CNN based on (https://arxiv.org/pdf/1611.09813.pdf ), reduced from 100 layers to 50 layers
○ Doesn't require a tightly cropped input frame

Process
● CNN to regress 2D and 3D joint positions
○ trained on annotated 3D human pose datasets => Joint Positions
● Kinematic Skeleton Fitting
○ Optional: skeleton initialization by height
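
The optional height-based initialization can be pictured as scaling a reference skeleton to the subject. The sketch below is only an illustration of that idea: the bone names, the reference lengths and the uniform scaling rule are all assumptions made for this example, not values or a procedure taken from the slides.

```python
# Hypothetical reference skeleton: bone name -> length in metres for a
# person of REFERENCE_HEIGHT_M. All numbers are illustrative placeholders.
REFERENCE_HEIGHT_M = 1.70
REFERENCE_BONES_M = {
    "upper_arm": 0.30,
    "forearm": 0.25,
    "thigh": 0.45,
    "shin": 0.42,
}


def init_skeleton_by_height(height_m,
                            reference_bones=REFERENCE_BONES_M,
                            reference_height_m=REFERENCE_HEIGHT_M):
    """Scale every reference bone length by height_m / reference_height_m.

    Assumption: body proportions are the same for all subjects, so one
    uniform scale factor is enough.
    """
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    scale = height_m / reference_height_m
    return {name: length * scale for name, length in reference_bones.items()}


# Usage: bone lengths for a subject who is 1.87 m tall.
skeleton = init_skeleton_by_height(1.87)
```

A uniform scale is the simplest possible choice; a real system could instead fit per-bone lengths from several frames.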

3D Pose Estimation
● I => PG
○ I : Image
○ PG : Global Pose
○ PG (θ, d): joint angle θ, Global Position in Camera Space d
○ PL : Root-relative 3D Joint position
○ K: 2D keypoints
● CNN Pose Regression

CNN Regression
● Location map
○ No structure imposed
○ 3D position relative to Root
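
A location map can be read as a per-pixel lookup table: for each joint, the root-relative coordinate is taken from the X, Y and Z maps at the pixel where that joint's 2D heatmap peaks. The sketch below shows that readout for one joint. The function names and the use of plain nested lists are assumptions made for illustration, not the original implementation.

```python
def argmax_2d(heatmap):
    """Return (row, col) of the largest value in a 2D list of numbers."""
    best_pos = (0, 0)
    best_val = heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best_val:
                best_val = value
                best_pos = (r, c)
    return best_pos


def read_joint(heatmap, x_map, y_map, z_map):
    """Read one joint's 2D keypoint and root-relative 3D position.

    heatmap: 2D confidence map for the joint.
    x_map, y_map, z_map: location maps of the same size as the heatmap.
    Returns ((u, v), (x, y, z)) with u = column and v = row.
    """
    r, c = argmax_2d(heatmap)
    keypoint_2d = (c, r)
    position_3d = (x_map[r][c], y_map[r][c], z_map[r][c])
    return keypoint_2d, position_3d


# Toy 3x3 maps for a single joint (made-up numbers).
heatmap = [[0.0, 0.1, 0.0],
           [0.1, 0.2, 0.9],
           [0.0, 0.1, 0.0]]
x_map = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.1], [0.0, 0.0, 0.0]]
y_map = [[0.0, 0.0, 0.0], [0.0, 0.0, -0.2], [0.0, 0.0, 0.0]]
z_map = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.05], [0.0, 0.0, 0.0]]

keypoint, position = read_joint(heatmap, x_map, y_map, z_map)
```

Because each joint is read out independently, nothing in this step enforces bone lengths or joint limits, which matches the "no structure imposed" point above.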
Loss Function

CNN Regression
● Training
○ Pretrained for 2D pose estimation on MPII and LSP
○ 3D pose:
■ MPI-INF-3DHP : 100k image samples
■ Human3.6M (except S9, S11): 75k image samples
● Bounding Box Tracker
○ The CNN doesn't require a bounding box (BB)
○ but CNN runtime performance is affected by the image size
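
One way to keep the input small without a separate person detector is to crop around the 2D keypoints from the previous frame. The sketch below is a minimal version of that idea. The function name, the `margin` parameter and its default value are assumptions for illustration; the slides do not give the tracker's details.

```python
def bounding_box_from_keypoints(keypoints, image_w, image_h, margin=0.2):
    """Compute a padded, image-clamped box around 2D keypoints.

    keypoints: list of (x, y) pixel coordinates from the previous frame.
    margin: padding on each side as a fraction of the box width/height
            (hypothetical default, chosen only for this example).
    Returns (left, top, right, bottom) in pixels.
    """
    if not keypoints:
        raise ValueError("at least one keypoint is required")
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    # Clamp so the crop never leaves the image.
    left = max(0.0, x_min - pad_x)
    top = max(0.0, y_min - pad_y)
    right = min(float(image_w), x_max + pad_x)
    bottom = min(float(image_h), y_max + pad_y)
    return (left, top, right, bottom)


# Usage: crop region for the next frame of a 640x480 video.
previous_keypoints = [(100.0, 50.0), (200.0, 250.0), (150.0, 120.0)]
box = bounding_box_from_keypoints(previous_keypoints, 640, 480)
```

The padding leaves room for the subject to move between frames; a loose box is acceptable here because the CNN does not depend on a tight crop.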

Kinematic Skeleton Fitting
● 2D predictions K are temporally filtered
○ then used to obtain the 3D coordinates
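
To make the temporal filtering step concrete, the sketch below smooths per-frame 2D keypoints with simple exponential smoothing. This filter is a stand-in chosen for brevity and is not necessarily the filter used in the original work; the class name and the `alpha` parameter are assumptions for this example.

```python
class KeypointSmoother:
    """Exponential smoothing of 2D keypoints across frames.

    alpha close to 1.0 follows the new measurement closely (less lag,
    more jitter); alpha close to 0.0 smooths more (less jitter, more lag).
    """

    def __init__(self, alpha=0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None  # filtered keypoints from the previous frame

    def update(self, keypoints):
        """Filter one frame of keypoints, given as a list of (x, y)."""
        if self.state is None:
            # First frame: nothing to blend with yet.
            self.state = [tuple(p) for p in keypoints]
        else:
            a = self.alpha
            self.state = [
                (a * x + (1.0 - a) * sx, a * y + (1.0 - a) * sy)
                for (x, y), (sx, sy) in zip(keypoints, self.state)
            ]
        return self.state


# Usage: feed the raw CNN keypoints frame by frame.
smoother = KeypointSmoother(alpha=0.5)
frame_1 = smoother.update([(100.0, 200.0)])
frame_2 = smoother.update([(110.0, 190.0)])
```

The filtered 2D positions would then be the pixel locations at which the 3D coordinates are looked up.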

Limitations
● Depth estimation from single image => ill posed
● Temporal jitter
○ Floor constraint
○ Head angle and pose by HMD
● Implausible 3D pose by misprediction
● Very fast motion
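
The first limitation can be checked numerically: under a pinhole camera model, scaling a 3D point along its viewing ray does not change where it lands in the image, so a single image cannot determine absolute depth on its own. The intrinsics below (`focal_length`, `cx`, `cy`) are made-up values used only for this illustration.

```python
def project(point_3d, focal_length=1000.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-space point (x, y, z) to pixels.

    focal_length, cx and cy are hypothetical camera intrinsics.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z + cx, focal_length * y / z + cy)


# A joint 2 m from the camera, and the same joint pushed twice as far
# away along the same viewing ray.
near = (0.2, -0.1, 2.0)
far = tuple(2.0 * v for v in near)

pixel_near = project(near)
pixel_far = project(far)
# Both points map to the same pixel, so the image alone cannot
# distinguish them.
```

Extra information, such as a known subject height or bone lengths, is what resolves this scale ambiguity in practice.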

Conclusion
● Global 3D skeleton
● Single RGB camera
● Real-time at 30 Hz
● Fully convolutional CNN => regresses 2D and 3D joint positions
● Skeleton fitting
● Temporally stable
● Without strict bounding boxes


1. VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera 2017.08.14 Yunkyu Choi


7. CNN Regression Bone Length


10. Result

11. Result: see the video and the paper for details


14. Thank you. Q&A
