PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose


This paper presents GeoNet, an algorithm that extracts optical flow, depth, and camera ego-motion from video in an unsupervised manner. The talk first gives a brief overview of the 3D geometry used in computer vision and then introduces the GeoNet algorithm.


GeoNet: Unsupervised Learning of Dense
Depth, Optical Flow and Camera Pose
Hyeongmin Lee
Image and Video Pattern Recognition LAB
Electrical and Electronic Engineering Dept, Yonsei University
5th Semester
2020.2.23

Depth, Optical Flow, Camera Pose
◆ Depth [PR098 - MegaDepth]
A map that gives, for each pixel in the image, how many meters it lies from the camera

Depth, Optical Flow, Camera Pose
◆ Optical Flow [PR214 - FlowNet]
A vector map that gives the motion of each pixel between two consecutive frames (pixel displacement)

Depth, Optical Flow, Camera Pose
◆ Camera Pose (Camera Motion, Ego-Motion)
[Figure: camera coordinate axes x, y, z with origin (0, 0, 0); a transformation T maps a point (x, y, z) to (x′, y′, z′).]

Depth, Optical Flow, Camera Pose
◆ Depth, Optical Flow, Camera Pose
Most pixel motion is caused by the movement of the camera ➔ treat it separately from object motion.

Depth, Optical Flow, Camera Pose
◆ Depth, Optical Flow, Camera Pose
Depth!!

3D Geometry
◆ Real Distance?
Camera information
Distance between the camera and the object
(Depth)

3D Geometry
◆ Camera Calibration
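Image coordinate (x, y), in pixels ↔ normalized coordinate (u, v), in meters on the z = 1 plane
\[
x = f_x u + c_x, \qquad y = f_y v + c_y
\]
\[
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
\]
K: intrinsic parameter matrix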
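A minimal NumPy sketch of the calibration relation above, mapping between normalized and pixel coordinates. The intrinsic values (f_x, f_y, c_x, c_y) and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical intrinsics, chosen only for illustration.
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def normalized_to_pixel(K, uv):
    """Map normalized coordinates (u, v) on the z = 1 plane to pixel coordinates (x, y)."""
    u, v = uv
    x, y, w = K @ np.array([u, v, 1.0])
    return np.array([x / w, y / w])

def pixel_to_normalized(K, xy):
    """Inverse mapping: pixel coordinates (x, y) back to normalized coordinates (u, v)."""
    x, y = xy
    u, v, w = np.linalg.inv(K) @ np.array([x, y, 1.0])
    return np.array([u / w, v / w])

xy = normalized_to_pixel(K, (0.1, -0.2))  # x = 500*0.1 + 320 = 370, y = 500*(-0.2) + 240 = 140
uv = pixel_to_normalized(K, xy)           # recovers (0.1, -0.2)
```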

3D Geometry
◆ Depth
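[Figure: pinhole projection. A 3D point (X, Y, Z) at depth Z is seen from the focal point through the normalized plane at distance 1, point (u, v, 1), and the image plane at focal length f, point (x, y, 1).]
\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= D K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]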
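The back-projection on this slide, [X, Y, Z]^T = D K^{-1} [x, y, 1]^T, can be sketched as follows. The intrinsics, the pixel, and the depth value are hypothetical; D is taken to be the depth along the camera's z axis, as in the equation.

```python
import numpy as np

def backproject(K, xy, depth):
    """Lift pixel (x, y) with depth D to a 3D camera-frame point: D * K^{-1} [x, y, 1]^T."""
    x, y = xy
    ray = np.linalg.inv(K) @ np.array([x, y, 1.0])  # normalized point (u, v, 1)
    return depth * ray

# Hypothetical intrinsics and measurement, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = backproject(K, (370.0, 140.0), 4.0)  # (u, v) = (0.1, -0.2), so P = (0.4, -0.8, 4.0)
```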

3D Geometry
◆ 3D Transformation
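A transformation T maps (x, y, z) to (x′, y′, z′):
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]
\[
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
= R \begin{bmatrix} x \\ y \\ z \end{bmatrix}
+ \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}
\]
Source: Dark Programmer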
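As a small sketch of the rigid transformation on this slide, the snippet below builds the 4×4 homogeneous matrix from a rotation R and a translation t and checks that it agrees with R p + t. The particular rotation (90° about the z axis) and translation are hypothetical examples.

```python
import numpy as np

def make_transform(R, t):
    """Assemble the 4x4 homogeneous matrix with R in the top-left block and t in the last column."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical motion: rotate 90 degrees about the z axis, then shift by 1 along x.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.0])
T = make_transform(R, t)

p = np.array([1.0, 0.0, 0.0, 1.0])  # homogeneous point (x, y, z, 1)
p_new = T @ p                       # approximately (1, 1, 0, 1)
```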

GeoNet
◆ Rigid & Residual Motion
• Rigid Motion: relative motion caused by the camera motion
• Residual Motion: independent motion of each object
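Chaining the three geometry slides gives one way to compute the rigid part of the flow: back-project each pixel with its depth, apply the camera motion T, re-project with K, and subtract the original pixel position. The slides do not spell this formula out, so the sketch below is an assumption about how the pieces combine; `rigid_flow` and all numeric values are hypothetical.

```python
import numpy as np

def rigid_flow(depth, K, T):
    """Flow induced by camera motion alone.

    depth: (H, W) depth map of the target frame
    K:     (3, 3) intrinsic matrix
    T:     (4, 4) camera motion from the target frame to the source frame
    Returns an (H, W, 2) array of pixel displacements.
    """
    H, W = depth.shape
    xs, ys = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous pixels, (H, W, 3)
    rays = pix @ np.linalg.inv(K).T                      # K^{-1} p for every pixel
    pts = rays * depth[..., None]                        # 3D points D * K^{-1} p
    pts_src = pts @ T[:3, :3].T + T[:3, 3]               # R X + t
    proj = pts_src @ K.T                                 # back to homogeneous pixels
    pix_src = proj[..., :2] / proj[..., 2:3]             # perspective divide
    return pix_src - pix[..., :2]

# Hypothetical setup: constant depth of 2 m, camera shifted 0.1 m along x.
K = np.array([[100.0, 0.0, 8.0],
              [0.0, 100.0, 6.0],
              [0.0, 0.0, 1.0]])
depth = np.full((12, 16), 2.0)
T = np.eye(4)
T[0, 3] = 0.1
flow = rigid_flow(depth, K, T)  # every pixel moves by f_x * 0.1 / 2 = 5 px in x
```

Whatever motion this rigid flow cannot explain is what the residual (object) motion has to account for.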

GeoNet
◆ Rigid Warping Loss
◆ Edge-Aware Depth Smoothness Loss
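\[
L_{rw} = \alpha \, \frac{1 - SSIM(I_t, \tilde{I}_s^{rig})}{2} + (1 - \alpha) \, \| I_t - \tilde{I}_s^{rig} \|_1
\]
\[
L_{ds} = \sum_{p_t} | \nabla D(p_t) | \cdot \left( e^{-| \nabla I(p_t) |} \right)^{T}
\]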
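A rough NumPy sketch of the warping loss, the weighted sum of an SSIM term and an L1 term between the target frame and the warped source frame. Several choices here are assumptions rather than details from the slides: grayscale images in [0, 1], a simplified SSIM with a 3×3 box window, the constants c1 and c2, averaging over pixels, and the default α = 0.85.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x, k=3):
    """Mean over every k x k window (valid region only)."""
    return sliding_window_view(x, (k, k)).mean(axis=(-2, -1))

def ssim_map(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified per-window SSIM between two grayscale images in [0, 1]."""
    mu_a, mu_b = box_mean(a), box_mean(b)
    var_a = box_mean(a * a) - mu_a ** 2
    var_b = box_mean(b * b) - mu_b ** 2
    cov = box_mean(a * b) - mu_a * mu_b
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den

def warping_loss(target, warped, alpha=0.85):
    """alpha * (1 - SSIM) / 2 + (1 - alpha) * L1, each averaged over pixels."""
    ssim_term = np.mean((1.0 - ssim_map(target, warped)) / 2.0)
    l1_term = np.mean(np.abs(target - warped))
    return alpha * ssim_term + (1.0 - alpha) * l1_term

rng = np.random.default_rng(0)
target = rng.random((16, 16))
loss_same = warping_loss(target, target)                 # zero for a perfect warp
loss_diff = warping_loss(target, rng.random((16, 16)))   # positive for a mismatched image
```

The same function applies to both the rigid and the full-flow warping losses, since they differ only in which warped image is passed in.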

GeoNet
◆ Flow Warping Loss
◆ Edge-Aware Flow Smoothness Loss
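\[
L_{fw} = \alpha \, \frac{1 - SSIM(I_t, \tilde{I}_s^{full})}{2} + (1 - \alpha) \, \| I_t - \tilde{I}_s^{full} \|_1
\]
\[
L_{fs} = \sum_{p_t} | \nabla f_{t \to s}^{full}(p_t) | \cdot \left( e^{-| \nabla I(p_t) |} \right)^{T}
\]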
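The edge-aware smoothness term penalizes gradients of the predicted field, down-weighted by exp(-|∇I|) wherever the image itself has an edge. The sketch below assumes forward differences for the gradients and a grayscale image; it works for a flow field (2 channels) or a depth map (1 channel). The function name and test data are hypothetical.

```python
import numpy as np

def edge_aware_smoothness(field, image):
    """Sum of |grad field| * exp(-|grad image|) over x and y forward differences.

    field: (H, W, C) predicted field, e.g. flow (C = 2) or depth (C = 1)
    image: (H, W) grayscale image
    """
    dfx = np.abs(field[:, 1:] - field[:, :-1])  # horizontal field gradient, (H, W-1, C)
    dfy = np.abs(field[1:, :] - field[:-1, :])  # vertical field gradient, (H-1, W, C)
    dix = np.abs(image[:, 1:] - image[:, :-1])  # horizontal image gradient, (H, W-1)
    diy = np.abs(image[1:, :] - image[:-1, :])  # vertical image gradient, (H-1, W)
    return (dfx * np.exp(-dix)[..., None]).sum() + (dfy * np.exp(-diy)[..., None]).sum()

# A flow field with a step between columns 1 and 2.
flow = np.zeros((4, 4, 2))
flow[:, 2:, :] = 1.0

flat_image = np.zeros((4, 4))
edge_image = np.zeros((4, 4))
edge_image[:, 2:] = 1.0  # image edge at the same place as the flow step

loss_flat = edge_aware_smoothness(flow, flat_image)  # 8.0: step fully penalized
loss_edge = edge_aware_smoothness(flow, edge_image)  # 8.0 * exp(-1): penalty relaxed at the edge
```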

GeoNet
◆ Geometric Consistency Loss
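\[
L_{gc} = \sum_{p_t} [\delta(p_t)] \cdot \| \Delta f_{t \to s}^{full}(p_t) \|_1
\]
\[
\Delta f_{t \to s}^{full}(p_t) = f_{t \to s}^{full}(p_t) + f_{s \to t}^{full}\left( p_t + f_{t \to s}^{full}(p_t) \right)
\]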
For Occlusion Reasoning
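The forward-backward check behind this loss can be sketched as below: follow the forward flow from each target pixel, read the backward flow at the landing position, and sum the two. The slide does not define δ(p_t), so the mask used here is an assumption: a pixel counts as consistent when the norm of Δf is below max(a, b·‖f‖), with default values a = 3.0 and b = 0.05 that should be treated as illustrative. Nearest-neighbour sampling stands in for the bilinear sampling a real implementation would likely use.

```python
import numpy as np

def fb_consistency_loss(flow_ts, flow_st, a=3.0, b=0.05):
    """Masked L1 norm of the forward-backward flow difference.

    flow_ts: (H, W, 2) flow from target to source
    flow_st: (H, W, 2) flow from source to target
    Returns (loss, mask), where mask marks pixels treated as non-occluded.
    """
    H, W, _ = flow_ts.shape
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    # Landing position of each target pixel in the source frame,
    # rounded to the nearest pixel and clamped to the image.
    xt = np.clip(np.rint(xs + flow_ts[..., 0]).astype(int), 0, W - 1)
    yt = np.clip(np.rint(ys + flow_ts[..., 1]).astype(int), 0, H - 1)
    delta_f = flow_ts + flow_st[yt, xt]  # close to zero where the two flows agree
    diff_norm = np.linalg.norm(delta_f, axis=-1)
    mask = diff_norm < np.maximum(a, b * np.linalg.norm(flow_ts, axis=-1))
    loss = np.sum(mask * np.abs(delta_f).sum(axis=-1))
    return loss, mask

fwd = np.zeros((4, 4, 2))
fwd[..., 0] = 1.0                  # every pixel moves 1 px to the right
bwd_good = -fwd                    # perfectly consistent backward flow
bwd_off = np.zeros((4, 4, 2))
bwd_off[..., 0] = -0.5             # slightly inconsistent backward flow
bwd_bad = np.zeros((4, 4, 2))
bwd_bad[..., 0] = 9.0              # grossly inconsistent, masked out

loss_good, mask_good = fb_consistency_loss(fwd, bwd_good)
loss_off, mask_off = fb_consistency_loss(fwd, bwd_off)
loss_bad, mask_bad = fb_consistency_loss(fwd, bwd_bad)
```

Pixels that fail the check contribute nothing to the loss, which is how the mask keeps occluded regions from being forced into consistency.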


GeoNet
◆ Depth Result

GeoNet
◆ Flow & Pose Result

Thank You!
