[Paper introduction] Performance Capture of Interacting Characters with Handheld Kinects


Mitsuru Nakazawa

Introduction movie

URL: http://media.au.tsinghua.edu.cn/yegenzhi/HandheldKinectsMocap_ECCV2012.jsp
(Accessed on 26th Nov. 2012)


Related works
Multi-view motion capture approaches
• Reconstruct a skeletal motion model & detailed dynamic surface geometry
• Handle people wearing a wide range of apparel
• Require a controlled studio setup (many synchronized video cameras)

Marker-less motion capture from a single range sensor
• Estimates complex poses at real-time frame rates
• Has difficulty capturing a detailed, complex 3D model

Objective
Full performance capture of moving humans using only 3 handheld, moving Kinects
[Figure labels: Operator (freely move), Performer]

• Reconstruct skeleton motion & time-varying surface geometry of humans in general apparel
• Handle fast and complex motion with many self-occlusions & non-rigid surface deformation
• No need for studios with controlled lighting and many stationary cameras

Data capture
[Figure: capture environment (Operator, Performer); captured data from 3 Kinects]

• Asynchronous capture
– Send a start-recording signal to all PCs, which are connected through Wi-Fi
• Intrinsic calibration
– Apply Zhang’s method
• Alignment between the color image and the range data
– Use the OpenNI API
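As a rough illustration of how calibrated intrinsics are typically used once color and range data are aligned, the sketch below back-projects a depth pixel into a 3D point in the Kinect's camera frame using the standard pinhole model. The `Intrinsics` class, the `back_project` function, and the numeric values are placeholders for illustration, not values from the paper; in practice the parameters would come from the intrinsic calibration step.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics; the values used below are illustrative placeholders."""
    fx: float  # focal length in pixels along x
    fy: float  # focal length in pixels along y
    cx: float  # principal point, x coordinate in pixels
    cy: float  # principal point, y coordinate in pixels


def back_project(u: float, v: float, depth_m: float,
                 k: Intrinsics) -> Tuple[float, float, float]:
    """Map pixel (u, v) with depth in metres to a 3D point in the camera frame."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 depth image.
K = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# A pixel 100 px to the right of the principal point, 2 m away.
point = back_project(420.0, 240.0, 2.0, K)
```

Applying this to every valid depth pixel yields one point cloud per Kinect, expressed in that Kinect's own coordinate frame.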

Scene models at time t
• Human model
– A laser scanner provides a static mesh with an embedded skeleton for each performer [*]
• 5,000 mesh vertices
• Skeleton of the k-th performer with 31 degrees of freedom: χ_t^k

• Ground plane model (fixed)
– Center of the environment
– Planar mesh with circular boundary (labeled GND, r = 3 m, in the figure)
• Camera extrinsic parameters of the i-th Kinect
– Translation, rotation: Λ_t^i
[*] F. Remondino: “3-D reconstruction of static human body shape from image sequence,” CVIU, Vol. 93, No. 1, pp. 65-85
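The scene description above can be pictured as a small set of data structures. This is only a sketch: the class names, the axis-angle rotation, and the choice of y as the up axis are assumptions made for illustration, while the 31 degrees of freedom, the 3 m radius, and the three Kinects come from the slides.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

SKELETON_DOF = 31      # degrees of freedom per performer (from the slide)
GROUND_RADIUS_M = 3.0  # radius of the circular ground mesh (from the slide)


@dataclass
class PerformerModel:
    """Static scanned mesh plus the time-varying skeleton pose of one performer."""
    rest_vertices: List[Vec3]
    pose: List[float] = field(default_factory=lambda: [0.0] * SKELETON_DOF)


@dataclass
class CameraExtrinsics:
    """Pose of one handheld Kinect at time t (axis-angle rotation is assumed)."""
    translation: Vec3 = (0.0, 0.0, 0.0)
    rotation: Vec3 = (0.0, 0.0, 0.0)


@dataclass
class SceneState:
    """Everything that is estimated or held fixed at one time step."""
    performers: List[PerformerModel]
    cameras: List[CameraExtrinsics]


def ground_boundary(n: int, radius: float = GROUND_RADIUS_M) -> List[Vec3]:
    """Sample n points on the circular boundary of the ground plane (y = 0)."""
    points = []
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        points.append((radius * math.cos(angle), 0.0, radius * math.sin(angle)))
    return points


# One performer with a one-vertex dummy mesh, observed by three Kinects.
scene = SceneState(
    performers=[PerformerModel(rest_vertices=[(0.0, 0.0, 0.0)])],
    cameras=[CameraExtrinsics() for _ in range(3)],
)
```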

Overview of the proposed method
Steps at time t:
• Point cloud segmentation
• Geometric matching of Kinect points to vertices of a human model
• Optimization of skeleton & camera pose
• Non-rigid deformation of the human surface via Laplacian deformation
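The geometric matching step can be sketched as a nearest-neighbour search from Kinect points to model vertices. The version below is a brute-force illustration under assumed conditions: the function name and the distance threshold are hypothetical, and the slides do not say which matching criteria the paper actually uses.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def match_points_to_vertices(points: Sequence[Vec3],
                             vertices: Sequence[Vec3],
                             max_dist: float) -> List[Tuple[int, int]]:
    """Pair each point with its nearest vertex, skipping points with no vertex in range.

    Returns (point_index, vertex_index) pairs.
    """
    matches = []
    for pi, p in enumerate(points):
        best_vi, best_d = -1, max_dist
        for vi, v in enumerate(vertices):
            d = math.dist(p, v)  # Euclidean distance (Python 3.8+)
            if d <= best_d:
                best_vi, best_d = vi, d
        if best_vi >= 0:
            matches.append((pi, best_vi))
    return matches


# Two model vertices and three observed points; the last point is an outlier.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
points = [(0.1, 0.0, 0.0), (0.9, 0.05, 0.0), (5.0, 5.0, 5.0)]
matches = match_points_to_vertices(points, vertices, max_dist=0.2)
```

With meshes of several thousand vertices, a spatial index such as a k-d tree would normally replace the inner loop.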

Optimization of skeleton and camera pose
The error function E is minimized with an iterative quasi-Newton method. It combines three terms:

• Human region error between model vertices and the Kinect point cloud
• Ground region error between model vertices and the Kinect point cloud
• Difference in matched SIFT feature positions between the previous and current time steps in background regions (SfM approach)
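One plausible way to write the three terms as a single objective is shown below. The notation is assumed for illustration: χ_t and Λ_t denote the skeleton and camera poses at time t, Z_h and Z_g are sets of matched (Kinect point, model vertex) pairs in the human and ground regions, and F is the set of SIFT matches between times t−1 and t. The unweighted sum and the squared Euclidean distance are also assumptions; the slides do not give the exact form.

```latex
E(\chi_t, \Lambda_t) =
    \sum_{(p, v) \in Z_h} \lVert p(\Lambda_t) - v(\chi_t) \rVert^2
  + \sum_{(p, v) \in Z_g} \lVert p(\Lambda_t) - v \rVert^2
  + \sum_{(p, p') \in F} \lVert p(\Lambda_t) - p'(\Lambda_{t-1}) \rVert^2
```

In this form the ground vertices v in the second term do not depend on χ_t because the ground model is fixed, so that term constrains only the camera poses.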

[Figure panels: result using the estimates from time t−1; result based on the SfM approach; optimized result]

Comparison with Multi-view Video Tracking
Multi-view video tracking system with 10 calibrated cameras vs. proposed method

“Rolling” with slow motion → Similar results

“Jump” with fast motion → Proposed method gets better results

Performance capture results on a variety of sequences

Conclusion
• Simultaneous marker-less performance capture system with several hand-held Kinects
– Iterative robust matching of tracked 3D models and input Kinect data

References
• Linear Blend Skinning (Accessed on Nov. 25th 2012)
– http://bit.ly/RaijkQ
• Motion Capture Using Joint Skeleton Tracking and Surface Estimation (Accessed on Nov. 26th 2012)
– http://www.vision.ee.ethz.ch/~gallju/projects/skelsurf/index.html

Notice
• This PowerPoint was made by Mitsuru Nakazawa, NOT an original author, as a paper introduction for ECCV2012

Performance Capture of Interacting Characters with Handheld Kinects
Genzhi Ye (1), Yebin Liu (1), Nils Hasler (2), Xiangyang Ji (1), Qionghai Dai (1), Christian Theobalt (2)
1: Department of Automation, Tsinghua University
2: Graphics, Vision & Video Group, Max-Planck Institute for Informatics
ECCV2012 paper introduction
Presenter: Mitsuru NAKAZAWA
