VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
2017.08.14
Yunkyu Choi
Contents
● Overview
● Process
● 3D Pose Estimation
○ CNN Regression
○ Kinematic Skeleton Fitting
● Result
● Limitation
● Conclusion
Overview
● Full global 3D skeleton pose
○ global: not local 3D pose relative to a bounding box
● real-time
○ 30Hz
● a single RGB Camera
● CNN based pose regressor + kinematic skeleton fitting
○ CNN based on (https://arxiv.org/pdf/1611.09813.pdf ), reduced from 100 layers to 50 layers
○ Does not require a tightly cropped input frame
Process
● CNN to regress 2D and 3D joint positions
○ trained on annotated 3D human pose datasets => Joint Positions
● Kinematic Skeleton Fitting
Optional: Skeleton initialization by height
3D Pose Estimation
● I => PG: map an input image to a global pose
○ I: input image
○ PG: global 3D pose
○ PG(θ, d): parameterized by joint angles θ and global position d in camera space
○ PL: root-relative 3D joint positions
○ K: 2D keypoints
● CNN Pose Regression
CNN Regression
● Location map
○ No structure imposed
○ 3D position relative to Root
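A minimal sketch of how a root-relative pose could be read out of location maps, assuming one 2D heatmap and three location maps (x, y, z) per joint, and assuming the 3D value is sampled at the heatmap maximum. The function name, argument names, and array shapes are illustrative and do not come from the slides.

```python
import numpy as np

def read_root_relative_pose(heatmaps, loc_x, loc_y, loc_z):
    """Read 2D keypoints K and root-relative 3D positions PL.

    All inputs are assumed to have shape (num_joints, height, width).
    """
    num_joints, height, width = heatmaps.shape
    keypoints_2d = np.zeros((num_joints, 2), dtype=int)
    pose_3d = np.zeros((num_joints, 3))
    for j in range(num_joints):
        # The 2D keypoint is the pixel where the joint's heatmap peaks.
        row, col = np.unravel_index(np.argmax(heatmaps[j]), (height, width))
        keypoints_2d[j] = (col, row)
        # The 3D coordinates are sampled from the location maps at that pixel.
        pose_3d[j] = (loc_x[j, row, col], loc_y[j, row, col], loc_z[j, row, col])
    return keypoints_2d, pose_3d
```

Because each joint is read independently at its own 2D location, no skeletal structure is imposed at this stage.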
Loss Function
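The formula on this slide did not survive extraction. As a rough sketch only, one way to supervise a location map is to weight the per-pixel error by the ground-truth 2D heatmap, so that the error counts mainly near the joint's 2D location. The function and argument names below are assumptions for illustration.

```python
import numpy as np

def location_map_loss(pred_map, gt_map, gt_heatmap):
    """Heatmap-weighted L2 loss for one joint's location map.

    pred_map, gt_map, gt_heatmap: arrays of shape (height, width).
    """
    # Element-wise weighting suppresses errors far from the joint.
    weighted_residual = gt_heatmap * (pred_map - gt_map)
    # For a 2D array the default norm is the Frobenius norm,
    # i.e. the L2 norm over all pixels.
    return float(np.linalg.norm(weighted_residual))
```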
CNN Regression
Bone Length
CNN Regression
● Training
○ Pretrained for 2D pose estimation on MPII and LSP
○ 3D pose:
■ MPI-INF-3DHP: 100k image samples
■ Human3.6M (excluding S9 and S11): 75k image samples
● Bounding Box Tracker
○ The CNN does not require a bounding box (BB)
○ but CNN runtime depends on the input image size, so cropping still helps
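A minimal sketch of a bounding box tracker that crops the next frame around the previous frame's 2D keypoints, which keeps the CNN input small. The margin and momentum values are hypothetical defaults chosen for illustration, not values taken from the slides.

```python
import numpy as np

def square_box_from_keypoints(keypoints_2d, margin=0.2):
    """Return a padded square box (x0, y0, x1, y1) around the keypoints."""
    points = np.asarray(keypoints_2d, dtype=float)
    x_min, y_min = points.min(axis=0)
    x_max, y_max = points.max(axis=0)
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    # Use the longer side so the crop is square, then pad on every side.
    half = max(x_max - x_min, y_max - y_min) * (1.0 + 2.0 * margin) / 2.0
    return (center_x - half, center_y - half, center_x + half, center_y + half)

def smooth_box(previous_box, new_box, momentum=0.75):
    """Blend the new box with the previous one to avoid a jumpy crop."""
    previous = np.asarray(previous_box, dtype=float)
    current = np.asarray(new_box, dtype=float)
    blended = momentum * previous + (1.0 - momentum) * current
    return tuple(float(v) for v in blended)
```

The box only has to contain the person loosely, since the network itself does not depend on a tight crop.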
Kinematic Skeleton Fitting
● The 2D predictions K are temporally filtered
○ the filtered keypoints are then used to obtain the 3D coordinates
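The slides do not name the temporal filter. The sketch below uses a plain exponential moving average as a stand-in to illustrate the idea of smoothing 2D keypoints over time; the class name and the `alpha` parameter are assumptions, and a real system may use a more adaptive filter.

```python
import numpy as np

class KeypointSmoother:
    """Exponential moving average over per-frame 2D keypoints."""

    def __init__(self, alpha=0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # higher alpha trusts the newest frame more
        self.state = None   # filtered keypoints from the previous frame

    def update(self, keypoints_2d):
        current = np.asarray(keypoints_2d, dtype=float)
        if self.state is None:
            # The first frame has no history, so it passes through unchanged.
            self.state = current.copy()
        else:
            self.state = self.alpha * current + (1.0 - self.alpha) * self.state
        return self.state.copy()
```

A lower `alpha` reduces jitter but adds lag, which matters for fast motion.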
Result
Result
See the video and the paper for details
Limitations
● Depth estimation from a single image => ill-posed
● Temporal jitter
○ Floor constraint
○ Head angle and pose by HMD
● Implausible 3D poses caused by mispredictions
● Very fast motion
Conclusion
● Full global 3D skeleton pose
● Single RGB camera
● Real-time at 30 Hz
● Fully-convolutional CNN => regresses 2D and 3D joint positions
● Skeleton fitting
● Temporally stable
● No strict bounding boxes required
Thank you
Q&A
