GeoNet: Unsupervised Learning of Dense
Depth, Optical Flow and Camera Pose
Hyeongmin Lee
Image and Video Pattern Recognition LAB
Electrical and Electronic Engineering Dept, Yonsei University
5th Semester
2020.2.23
Depth, Optical Flow, Camera Pose
◆ Depth [PR098 - MegaDepth]
A map that gives, for each pixel in the image, its distance from the camera in meters.
◆ Optical Flow [PR214 - FlowNet]
A vector map that gives the motion of each pixel between two consecutive frames (pixel displacement).
◆ Camera Pose (Camera Motion, Ego-Motion)
[Figure: camera coordinate frame with axes x, y, z and origin (0, 0, 0); a transformation T maps a point (x, y, z) to (x′, y′, z′).]
◆ Depth, Optical Flow, Camera Pose
Most pixel motion is caused by the movement of the camera ➔ consider it separately from object motion.
◆ Depth, Optical Flow, Camera Pose
Depth!!
3D Geometry
◆ Real Distance?
Camera information
Distance between the camera and the object (Depth)
◆ Camera Calibration
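Image Coordinate (x, y), in pixels ↔ Normalized Coordinate (u, v), in meters on the plane z = 1

\[ x = f_x u + c_x, \qquad y = f_y v + c_y \]

\[ \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}}_{K} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \]

K: Intrinsic Parameter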
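The calibration relation above can be checked numerically. The following is a minimal sketch, assuming NumPy; the intrinsic values (fx, fy, cx, cy) and the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical intrinsics, for illustration only (not from the slides).
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def normalized_to_pixel(K, uv):
    """Map normalized coordinates (u, v) on the z = 1 plane to pixels (x, y)."""
    u, v = uv
    x, y, w = K @ np.array([u, v, 1.0])
    return np.array([x / w, y / w])

def pixel_to_normalized(K, xy):
    """Inverse mapping: pixels (x, y) back to normalized coordinates (u, v)."""
    x, y = xy
    u, v, w = np.linalg.inv(K) @ np.array([x, y, 1.0])
    return np.array([u / w, v / w])

xy = normalized_to_pixel(K, (0.1, -0.2))   # x = fx*u + cx, y = fy*v + cy
uv = pixel_to_normalized(K, xy)            # round trip recovers (u, v)
```

The round trip through K and its inverse recovers the original normalized coordinates, which is the property the later depth slides rely on.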
◆ Depth
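[Figure: pinhole projection from the focal point, with labels f, 1, Z, (x, y, 1), (u, v, 1), and (X, Y, Z).]

\[ \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]

\[ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = D K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]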
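As a small worked example of the back-projection formula, the sketch below lifts one pixel with a known depth D to a 3D point in camera coordinates. It assumes NumPy; the intrinsics, the pixel, the depth value, and the name `backproject` are hypothetical.

```python
import numpy as np

def backproject(K, xy, depth):
    """Return (X, Y, Z) = D * K^{-1} (x, y, 1)^T in camera coordinates."""
    x, y = xy
    ray = np.linalg.inv(K) @ np.array([x, y, 1.0])  # normalized point (u, v, 1)
    return depth * ray

# Hypothetical intrinsics and measurement, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
point = backproject(K, (370.0, 140.0), depth=4.0)
```

The third component of the result equals the depth itself, since the normalized point always has z = 1.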
◆ 3D Transformation
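A transformation T maps (x, y, z) to (x′, y′, z′):

\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]

\[ \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = [R \mid t] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = R \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} \]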
Source: Dark Programmer
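The two forms of the rigid transformation (homogeneous 4×4 matrix versus rotation plus translation) can be compared directly. This is a minimal sketch, assuming NumPy; the rotation angle, translation, and test point are hypothetical values.

```python
import numpy as np

def make_transform(R, t):
    """Build the 4x4 homogeneous matrix with R in the top-left and t in the last column."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical pose: 90-degree rotation about the y axis, then a shift along x.
theta = np.pi / 2
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.0])
T = make_transform(R, t)

p = np.array([0.0, 0.0, 2.0])
p_homogeneous = T @ np.append(p, 1.0)   # 4-vector (x', y', z', 1)
p_direct = R @ p + t                    # same point via R p + t
```

Both routes give the same 3D point; the homogeneous form is convenient because successive camera motions compose by matrix multiplication.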
GeoNet
◆ Rigid & Residual Motion
• Rigid Motion: relative motion induced by camera motion
• Residual Motion: independent motion of each object
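One way to see how rigid motion follows from depth and camera pose is to chain the 3D Geometry steps: back-project each pixel with its depth, apply the camera transformation, and project back with K. The sketch below does this for a dense depth map. It assumes NumPy; the function name `rigid_flow`, the intrinsics, the depth, and the pose are hypothetical, and this is an illustration of the geometry rather than the authors' implementation.

```python
import numpy as np

def rigid_flow(depth, K, T):
    """Flow induced by camera motion alone.

    depth: (H, W) depth map of the target frame.
    K:     (3, 3) intrinsic matrix.
    T:     (4, 4) transform from the target camera to the source camera.
    Returns an (H, W, 2) array of pixel displacements.
    """
    H, W = depth.shape
    xs, ys = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1)   # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                       # K^{-1} p
    pts = depth[..., None] * rays                         # D K^{-1} p
    pts_src = pts @ T[:3, :3].T + T[:3, 3]                # R X + t
    proj = pts_src @ K.T                                  # back to image plane
    pix_src = proj[..., :2] / proj[..., 2:3]
    return pix_src - pix[..., :2]

# Hypothetical setup: constant depth of 2 m and a 0.1 m shift along x.
K = np.array([[100.0, 0.0, 4.0],
              [0.0, 100.0, 3.0],
              [0.0, 0.0, 1.0]])
depth = np.full((6, 8), 2.0)
T = np.eye(4)
T[0, 3] = 0.1
flow = rigid_flow(depth, K, T)
flow_identity = rigid_flow(depth, K, np.eye(4))
```

With a pure sideways translation and constant depth, every pixel moves by the same amount (fx · tx / Z), and an identity pose gives zero flow. Anything the scene does beyond this is what the residual motion has to explain.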
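◆ Rigid Warping Loss

\[ L_{rw} = \alpha \, \frac{1 - SSIM(I_t, \tilde{I}_s^{rig})}{2} + (1 - \alpha) \, \| I_t - \tilde{I}_s^{rig} \|_1 \]

◆ Edge-Aware Depth Smoothness Loss

\[ L_{ds} = \sum_{p_t} |\nabla D(p_t)| \cdot \left( e^{-|\nabla I(p_t)|} \right)^T \]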
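The edge-aware smoothness term penalizes depth gradients, but less so where the image itself has strong gradients. Below is a minimal sketch, assuming NumPy, a grayscale image, and forward differences as the gradient operator; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def edge_aware_smoothness(depth, image):
    """Sum over pixels of |grad D| * exp(-|grad I|), over the x and y directions.

    depth, image: (H, W) arrays. Gradients are forward differences (an assumption).
    """
    dD_x = np.abs(depth[:, 1:] - depth[:, :-1])
    dD_y = np.abs(depth[1:, :] - depth[:-1, :])
    dI_x = np.abs(image[:, 1:] - image[:, :-1])
    dI_y = np.abs(image[1:, :] - image[:-1, :])
    return float(np.sum(dD_x * np.exp(-dI_x)) + np.sum(dD_y * np.exp(-dI_y)))

# Toy inputs: a vertical depth step in the middle of a 4x4 map.
depth = np.zeros((4, 4))
depth[:, 2:] = 1.0
image_flat = np.zeros((4, 4))   # no image edge at the depth step
image_edge = 5.0 * depth        # image edge aligned with the depth step

loss_flat = edge_aware_smoothness(depth, image_flat)
loss_edge = edge_aware_smoothness(depth, image_edge)
loss_const = edge_aware_smoothness(np.ones((4, 4)), image_flat)
```

The same depth discontinuity costs much less when it coincides with an image edge, which is the intended behavior: depth is allowed to jump at object boundaries.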
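◆ Flow Warping Loss

\[ L_{fw} = \alpha \, \frac{1 - SSIM(I_t, \tilde{I}_s^{full})}{2} + (1 - \alpha) \, \| I_t - \tilde{I}_s^{full} \|_1 \]

◆ Edge-Aware Flow Smoothness Loss

\[ L_{fs} = \sum_{p_t} |\nabla f_{t \to s}^{full}(p_t)| \cdot \left( e^{-|\nabla I(p_t)|} \right)^T \]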
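To make the warping step concrete, the sketch below backward-warps a source image toward the target view with a flow field and evaluates only the L1 part of the warping loss. It assumes NumPy, a single-channel image, bilinear interpolation with border clamping, and a mean instead of a sum for the L1 term; the SSIM term is left out for brevity. All function names and inputs are hypothetical.

```python
import numpy as np

def warp_with_flow(src, flow):
    """Backward warp: out(p) = src(p + flow(p)), bilinear, clamped at the border.

    src:  (H, W) source image.
    flow: (H, W, 2) displacement (dx, dy) from target pixels to source pixels.
    """
    H, W = src.shape
    xs, ys = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
    x = np.clip(xs + flow[..., 0], 0, W - 1)
    y = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = x - x0
    wy = y - y0
    top = (1 - wx) * src[y0, x0] + wx * src[y0, x1]
    bottom = (1 - wx) * src[y1, x0] + wx * src[y1, x1]
    return (1 - wy) * top + wy * bottom

def l1_photometric(target, warped):
    """Mean absolute difference (a normalized stand-in for the L1 term)."""
    return float(np.mean(np.abs(target - warped)))

# Toy data: the target is the source shifted left by one pixel.
src = np.arange(20, dtype=float).reshape(4, 5)
flow = np.zeros((4, 5, 2))
flow[..., 0] = 1.0                       # every target pixel looks one pixel right
warped = warp_with_flow(src, flow)
unchanged = warp_with_flow(src, np.zeros((4, 5, 2)))
interior_error = l1_photometric(src[:, 1:], warped[:, :-1])
```

When the flow is correct, the warped source matches the target wherever the sampled location lies inside the image, so the photometric error there is zero.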
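◆ Geometric Consistency Loss

\[ L_{gc} = \sum_{p_t} [\delta(p_t)] \cdot \| \Delta f_{t \to s}^{full}(p_t) \|_1 \]

\[ \Delta f_{t \to s}^{full}(p_t) = f_{t \to s}^{full}(p_t) + f_{s \to t}^{full}\left( p_t + f_{t \to s}^{full}(p_t) \right) \]

For Occlusion Reasoning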
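The forward-backward check behind this loss can be sketched as follows. The slide does not define δ(p_t), so the code assumes it is an indicator that keeps pixels whose forward-backward error is below a threshold; the threshold form and the values of `alpha` and `beta` are assumptions, as are the nearest-neighbor sampling and all function names. NumPy is assumed.

```python
import numpy as np

def sample_nearest(field, x, y):
    """Sample an (H, W, C) field at float locations using nearest neighbors."""
    H, W = field.shape[:2]
    xi = np.clip(np.rint(x).astype(int), 0, W - 1)
    yi = np.clip(np.rint(y).astype(int), 0, H - 1)
    return field[yi, xi]

def geometric_consistency(flow_ts, flow_st, alpha=3.0, beta=0.05):
    """Masked L1 forward-backward error.

    flow_ts, flow_st: (H, W, 2) forward and backward flows.
    alpha, beta: assumed threshold parameters for the mask (hypothetical values).
    """
    H, W = flow_ts.shape[:2]
    xs, ys = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
    # Backward flow evaluated at p_t + f_ts(p_t).
    flow_back = sample_nearest(flow_st, xs + flow_ts[..., 0], ys + flow_ts[..., 1])
    delta_f = flow_ts + flow_back
    err = np.linalg.norm(delta_f, axis=-1)
    # Assumed form of delta(p_t): keep pixels with small forward-backward error.
    mask = err < np.maximum(alpha, beta * np.linalg.norm(flow_ts, axis=-1))
    loss = float(np.sum(mask * np.sum(np.abs(delta_f), axis=-1)))
    return loss, mask, delta_f

# Toy flows: consistent everywhere except one source pixel.
flow_ts = np.zeros((6, 8, 2))
flow_ts[..., 0] = 2.0
flow_st = np.zeros((6, 8, 2))
flow_st[..., 0] = -2.0
flow_st[2, 5] = (10.0, 0.0)   # inconsistent backward flow at source pixel (x=5, y=2)
loss, mask, delta_f = geometric_consistency(flow_ts, flow_st)
```

The one target pixel that maps onto the inconsistent source pixel is masked out, so it does not contribute to the loss. This is how the term can ignore occluded regions while still enforcing consistency elsewhere.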
◆ Depth Result
◆ Flow & Pose Result
Thank You!

Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Water billing management system project report.pdf
Water billing management system project report.pdfWater billing management system project report.pdf
Water billing management system project report.pdf
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 

PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

  • 1. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose. Hyeongmin Lee, Image and Video Pattern Recognition LAB, Electrical and Electronic Engineering Dept, Yonsei University, 5th Semester, 2020.2.23
  • 2. Depth, Optical Flow, Camera Pose
  • 3. Depth, Optical Flow, Camera Pose ◆ Depth [PR098 - MegaDepth] A map that indicates how many meters each pixel in the image is from the camera
  • 4. Depth, Optical Flow, Camera Pose ◆ Optical Flow [PR214 - FlowNet] A vector map that represents the motion of each pixel between two consecutive frames (Pixel Displacement)
  • 5. Depth, Optical Flow, Camera Pose ◆ Camera Pose (Camera Motion, Ego-Motion) [Figure: camera coordinate axes $x$, $y$, $z$ with origin $(0,0,0)$; a transformation $T$ maps the point $(x, y, z)$ to $(x', y', z')$]
  • 6. Depth, Optical Flow, Camera Pose ◆ Depth, Optical Flow, Camera Pose Most pixel motion is caused by the movement of the camera ➔ treat it separately from object motion.
  • 7. Depth, Optical Flow, Camera Pose ◆ Depth, Optical Flow, Camera Pose Depth!!
  • 9. 3D Geometry ◆ Real Distance? Camera information; distance between the camera and the subject (Depth)
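  • 10. 3D Geometry ◆ Camera Calibration: Image Coordinate $(x, y)$ in pixels; Normalized Coordinate $(u, v)$ in meters (on the plane $z = 1$). $x = f_x u + c_x$, $y = f_y v + c_y$, i.e. $\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$ with $K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$ (Intrinsic Parameter)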
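  • 11. 3D Geometry ◆ Depth [Figure: focal point; 3D point $(X, Y, Z)$ at depth $Z$; normalized plane at distance $1$ containing $(u, v, 1)$; image plane at focal length $f$ containing $(x, y, 1)$] $\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$, $\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = D K^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$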
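  • 12. 3D Geometry ◆ 3D Transformation: $T$ maps $(x, y, z)$ to $(x', y', z')$. $\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = [R|t] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$, equivalently $\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = R \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$. Source: Dark Programmer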
  • 14. GeoNet ◆ Rigid & Residual Motion • Rigid Motion: relative motion caused by camera motion • Residual Motion: independent motion of each object
  • 15. GeoNet ◆ Rigid & Residual Motion =
  • 16. GeoNet ◆ Rigid Warping Loss: $L_{rw} = \alpha \frac{1 - SSIM(I_t, \tilde{I}_s^{rig})}{2} + (1 - \alpha) \| I_t - \tilde{I}_s^{rig} \|_1$ ◆ Edge-Aware Depth Smoothness Loss: $L_{ds} = \sum_{p_t} |\nabla D(p_t)| \cdot \left( e^{-|\nabla I(p_t)|} \right)^T$
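  • 17. GeoNet ◆ Flow Warping Loss: $L_{fw} = \alpha \frac{1 - SSIM(I_t, \tilde{I}_s^{full})}{2} + (1 - \alpha) \| I_t - \tilde{I}_s^{full} \|_1$ ◆ Edge-Aware Flow Smoothness Loss: $L_{fs} = \sum_{p_t} |\nabla f_{t \to s}^{full}(p_t)| \cdot \left( e^{-|\nabla I(p_t)|} \right)^T$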
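  • 18. GeoNet ◆ Geometric Consistency Loss: $L_{gc} = \sum_{p_t} [\delta(p_t)] \cdot \| \Delta f_{t \to s}^{full}(p_t) \|_1$, where $\Delta f_{t \to s}^{full}(p_t) = f_{t \to s}^{full}(p_t) + f_{s \to t}^{full}\left(p_t + f_{t \to s}^{full}(p_t)\right)$ (For Occlusion Reasoning)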
  • 20. GeoNet ◆ Flow & Pose Result
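
The geometry on slides 10 to 12 can be chained into a single computation: back-project a pixel with its depth, apply the camera transformation, and project the result back to the image. The sketch below does this with NumPy for one pixel. The final step (re-projecting with $K$ and subtracting the original pixel to get a "rigid" displacement) is an assumption about how the slides connect to the rigid motion of slide 14, not something the transcript states. The function names and all numeric values (intrinsics, depth, pose) are hypothetical and chosen only for illustration.

```python
import numpy as np

def backproject(x, y, depth, K):
    """Slide 11: [X, Y, Z]^T = D * K^{-1} [x, y, 1]^T."""
    pixel_h = np.array([x, y, 1.0])
    return depth * (np.linalg.inv(K) @ pixel_h)

def rigid_transform(point, R, t):
    """Slide 12: [x', y', z']^T = R [x, y, z]^T + t."""
    return R @ point + t

def project(point, K):
    """Slide 10 in reverse order: normalize by z, then apply K."""
    p = K @ point
    return p[:2] / p[2]

def rigid_flow(x, y, depth, K, R, t):
    """Displacement of pixel (x, y) induced by camera motion alone (assumed definition)."""
    moved = rigid_transform(backproject(x, y, depth, K), R, t)
    return project(moved, K) - np.array([x, y])

# Hypothetical intrinsics: f_x = f_y = 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                    # no rotation
t = np.array([0.1, 0.0, 0.0])    # 0.1 m shift along x

# Pixel at the principal point, 2 m from the camera.
flow = rigid_flow(320.0, 240.0, 2.0, K, R, t)
print(flow)  # approximately [25. 0.]
```

With these values the point $(0, 0, 2)$ moves to $(0.1, 0, 2)$ and re-projects to pixel $(345, 240)$, so the displacement is about 25 pixels along $x$. Halving the depth doubles the displacement, which illustrates why depth is needed to explain pixel motion caused by the camera.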