PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

•

1 like•317 views

이번 논문은, Video로부터 Unsupervised 방식을 통해 Flow, Depth, Camera Ego-motion까지 뽑아내는 GeoNet이라는 알고리즘입니다. Computer Vision에서 다루는 3D Geometry에 대해 간략히 설명 드린 후에 GeoNet 알고리즘을 소개하는 영상입니다.

GeoNet: Unsupervised Learning of Dense
Depth, Optical Flow and Camera Pose
Hyeongmin Lee
Image and Video Pattern Recognition LAB
Electrical and Electronic Engineering Dept, Yonsei University
5th Semester
2020.2.23

Depth, Optical Flow, Camera Pose
◆ Depth [PR098 - MegaDepth]
이미지에 등장하는 각 Pixel이 Camera로부터 몇 m 떨어져 있는지를 나타내는 Map

Depth, Optical Flow, Camera Pose
◆ Optical Flow [PR214 - FlowNet]
연속한 두 Frame 사이에서 각 Pixel의 Motion을 나타내는 Vector Map (Pixel Displacement)

Depth, Optical Flow, Camera Pose
◆ Camera Pose (Camera Motion, Ego-Motion)
𝑧
𝑥
𝑦
(𝑥, 𝑦, 𝑧)
(0,0,0)
(𝑥, 𝑦, 𝑧) (𝑥′, 𝑦′, 𝑧′)
𝑇

Depth, Optical Flow, Camera Pose
◆ Depth, Optical Flow, Camera Pose
대부분의 Pixel Motion은 카메라의 움직임에 의해 발생 ➔ Object Motion과 분리하여 생각.

Depth, Optical Flow, Camera Pose
◆ Depth, Optical Flow, Camera Pose
Depth!!

3D Geometry
◆ Real Distance?
Camera 정보
카메라와 대상 간의 거리
(Depth)

3D Geometry
◆ Camera Calibration
Image Coordinate Normalized Coordinate
pixel Meter(z=1)
(𝑥, 𝑦) (𝑢, 𝑣)
𝑥 = 𝑓𝑥 𝑢 + 𝑐 𝑥
𝑦 = 𝑓𝑦 𝑣 + 𝑐 𝑦
𝑥
𝑦
1
=
𝑓𝑥 0 𝑐 𝑥
0 𝑓𝑦 𝑐 𝑦
0 0 1
𝑢
𝑣
1
𝐾
Intrinsic Parameter

3D Geometry
◆ Depth
초점
𝑍
(𝑋, 𝑌, 𝑍)
1
𝑓
(𝑢, 𝑣, 1)
(𝑥, 𝑦, 1)
𝑢
𝑣
1
= 𝐾−1
𝑥
𝑦
1
𝑋
𝑌
𝑍
= 𝑍
𝑢
𝑣
1
= 𝐷𝐾−1
𝑥
𝑦
1

3D Geometry
◆ 3D Transformation
(𝑥, 𝑦, 𝑧) (𝑥′, 𝑦′, 𝑧′)
𝑇
𝑥′
𝑦′
𝑧′
1
=
𝑟11 𝑟12 𝑟13 𝑡 𝑥
𝑟11 𝑟12 𝑟13 𝑡 𝑥
𝑟11 𝑟12 𝑟13 𝑡 𝑥
0 0 0 1
𝑥
𝑦
𝑧
1
= [𝑅|𝑡]
𝑥
𝑦
𝑧
1
𝑥′
𝑦′
𝑧′
= 𝑅
𝑥
𝑦
𝑧
+
𝑡 𝑥
𝑡 𝑦
𝑡 𝑧
출처: Dark Programmer

GeoNet
◆ Rigid & Residual Motion
• Rigid Motion: Camera Motion에 의한 상대적인 움직임
• Residual Motion: 각 Object의 독립적인 움직임

GeoNet
◆ Rigid Warping Loss
◆ Edge-Aware Depth Smoothness Loss
𝐿 𝑟𝑤 = 𝛼
1 − 𝑆𝑆𝐼𝑀(𝐼𝑡, ෩𝐼𝑠
𝑟𝑖𝑔
)
2
+ 1 − 𝛼 𝐼𝑡 − ෩𝐼𝑠
𝑟𝑖𝑔
1
𝐿 𝑑𝑠 = ෍
𝑝 𝑡
|∇𝐷(𝑝𝑡)| ∙ 𝑒− ∇𝐼 𝑝 𝑡
𝑇

GeoNet
◆ Flow Warping Loss
◆ Edge-Aware Flow Smoothness Loss
𝐿 𝑓𝑤 = 𝛼
1 − 𝑆𝑆𝐼𝑀(𝐼𝑡, ෩𝐼𝑠
𝑓𝑢𝑙𝑙
)
2
+ 1 − 𝛼 𝐼𝑡 − ෩𝐼𝑠
𝑓𝑢𝑙𝑙
1
𝐿 𝑓𝑠 = ෍
𝑝 𝑡
|∇𝑓𝑡→𝑠
𝑓𝑢𝑙𝑙
(𝑝𝑡)| ∙ 𝑒− ∇𝐼 𝑝 𝑡
𝑇

GeoNet
◆ Geometric Consistency Loss
𝐿 𝑔𝑐 = ෍
𝑝 𝑡
[𝛿(𝑝𝑡)] ∙ ∆𝑓𝑡→𝑠
𝑓𝑢𝑙𝑙
𝑝𝑡 1
∆𝑓𝑡→𝑠
𝑓𝑢𝑙𝑙
𝑝𝑡 = 𝑓𝑡→𝑠
𝑓𝑢𝑙𝑙
+ 𝑓𝑠→𝑡
𝑓𝑢𝑙𝑙
(𝑝𝑡 + 𝑓𝑡→𝑠
𝑓𝑢𝑙𝑙
(𝑝𝑡))
For Occlusion Reasoning

제 PR12 첫번째 발표 논문은 FlowNet이라는 논문입니다. Optical Flow는 비디오의 인접한 Frame에 대하여 각 Pixel이 첫 번째 Frame에서 두 번째 Frame으로 얼마나 이동했는지의 Vector를 모든 위치에 대하여 나타낸 Map입니다. Video에 Motion을 분석하는 일은 매우 중요하기 때문에, 이러한 Optical Flow 역시 굉장히 중요한 요소 중 하나인데요, 이번 영상에서는 고전적인 Computer Vision에서 쓰였던 다양한 Optical Flow 알고리즘들과, Deep Learning Based로 Optical Flow를 구하는 Neural Network인 FlowNet에 대하여 알아보겠습니다. 감사합니다!! 영상 링크: https://youtu.be/Z_t0shK98pM 논문 링크: http://openaccess.thecvf.com/content_iccv_2015/html/Dosovitskiy_FlowNet_Learning_Optical_ICCV_2015_paper.html

Deep Learning for Video: Action Recognition (UPC 2018)

Universitat Politècnica de Catalunya

PR-409: Denoising Diffusion Probabilistic Models

Hyeongmin Lee

이번 논문은 요즘 핫한 Diffusion을 처음으로 유행시킨 Denoising Diffusion Probabilistic Models (DDPM) 입니다. ICML 2015년에 처음 제안된 Diffusion의 여러 실용적인 측면들을 멋지게 해결하여 그 유행의 시작을 알린 논문인데요, Generative Model의 여러 분야와 Diffusion, 그리고 DDPM에서는 무엇이 바뀌었는지 알아보도록 하겠습니다. 논문 링크: https://arxiv.org/abs/2006.11239 영상 링크: https://youtu.be/1j0W_lu55nc

Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/2017-dlcv/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

20191019 sinkhorn

Taku Yoshioka

Image segmentation with deep learning

Antonio Rueda-Toicen

Presentation on deformable model for medical image segmentation

Subhash Basistha

Single Image Super Resolution Overview

LEE HOSEONG

발표자: 박태성 (UC Berkeley 박사과정) 발표일: 2017.6. Taesung Park is a Ph.D. student at UC Berkeley in AI and computer vision, advised by Prof. Alexei Efros. His research interest lies between computer vision and computational photography, such as generating realistic images or enhancing photo qualities. He received B.S. in mathematics and M.S. in computer science from Stanford University. 개요: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G: X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F: Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.

Optimization/Gradient Descent

kandelin

Optic flow estimation with deep learning

Yu Huang

Optic Flow Brightness Constancy Constraints Aperture Problem Regularization and Smoothness Constraints Lucas-Kanade algorithm Focus of Expansion (FOE) Discrete Optimization for Optical Flow Large Displacement Optical Flow: Descriptor Matching DeepFlow: Large displ. optical flow with deep matching EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow Optical Flow with Piecewise Parametric Model Flow Fields: Dense Correspondence Fields for Accurate Large Displacement Optical Flow Estimation Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids FlowNet: Learning Optical Flow with Convol. Networks Deep Discrete Flow Optical Flow Estimation using a Spatial Pyramid Network A Large Dataset to Train ConvNets for Disparity, Optical Flow, and Scene Flow Estimation DeMoN: Depth and Motion Network for Learning Monocular Stereo Unsupervised Learning of Depth and Ego-Motion from Video Appendix A: A Database and Evaluation Methodology for Optical Flow Appendix B: Learning and optimization

Multimodal Deep Learning

Universitat Politècnica de Catalunya

Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language, vision and speech. Image captioning, lip reading or video sonorization are some of the first applications of a new and exciting field of research exploiting the generalization properties of deep neural representation. This tutorial will firstly review the basic neural architectures to encode and decode vision, text and audio, to later review the those models that have successfully translated information across modalities. The contents of this tutorial are available at: https://telecombcn-dl.github.io/2019-mmm-tutorial/.

Deep learning for person re-identification

哲东郑

4 Dimensionality reduction (PCA & t-SNE)

Dmytro Fishman

The fourth lecture from the Machine Learning course series of lectures. This lecture first introduces a problem of visualising multi-dimensional data on fewer dimensions and later discusses one of the most popular methods for reducing dimensionality - principal component analysis (PCA). Later, also t-SNE is mentioned briefly as a non-linear alternative to PCA. A link to my github (https://github.com/skyfallen/MachineLearningPracticals) with practicals that I have designed for this course in both R and Python. I can share keynote files, contact me via e-mail: dmytro.fishman@ut.ee.

InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...

홍배 김

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets 오사카 대학 박사과정인 Takato Horii군이 작성한 자료 데이터 생성 모델로 우수한 GAN을 이용하여 비지도학습을 통해 "알기쉬게" 이미지의 정보를 표현하는 특징량을 "간단하게"획득하기 * 특징이 서로 얽혀있는 Physical space에서 서로 독립적인 Eigen space로 변환하는 것과 같은 원리

ConvNeXt: A ConvNet for the 2020s explained

Sushant Gautam

Explained here: https://youtu.be/aBvDPL1jFnI In Nepali A ConvNet for the 2020s (Zhuang Liu et al.) ComvNeXt paper Deep Learning for Visual Intelligence Sushant Gautam MSCIISE Department of Electronics and Computer Engineering Institute of Engineering, Thapathali Campus 13 March 2022 To all the authors (obviously!!) 1. Jinwon Lee's slides at https://www.slideshare.net/JinwonLee9/pr366-a-convnet-for-2020s?qid=274bc524-23ae-4c13-b03b-0d2416976ad5&v=&b=&from_search=1 2. Letitia from AI Coffee Break: https://www.youtube.com/watch?v=SndHALawoag I even edited some of her hard visual works and put them as a slide. :(

Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/2018-dlai/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

“How Transformers are Changing the Direction of Deep Learning Architectures,”...

Edge AI and Vision Alliance

For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2022/08/how-transformers-are-changing-the-direction-of-deep-learning-architectures-a-presentation-from-synopsys/ Tom Michiels, System Architect for DesignWare ARC Processors at Synopsys, presents the “How Transformers are Changing the Direction of Deep Learning Architectures” tutorial at the May 2022 Embedded Vision Summit. The neural network architectures used in embedded real-time applications are evolving quickly. Transformers are a leading deep learning approach for natural language processing and other time-dependent, series data applications. Now, transformer-based deep learning network architectures are also being applied to vision applications with state-of-the-art results compared to CNN-based solutions. In this presentation, Michiels introduces transformers and contrast them with the CNNs commonly used for vision tasks today. He examines the key features of transformer model architectures and shows performance comparisons between transformers and CNNs. He concludes the presentation with insights on why Synopsys thinks transformers are an important approach for future visual perception tasks.

1시간만에 GAN(Generative Adversarial Network) 완전 정복하기

NAVER Engineering

발표자: 최윤제(고려대 석사과정) 최윤제 (Yunjey Choi)는 고려대학교에서 컴퓨터공학을 전공하였으며, 현재는 석사과정으로 Machine Learning을 공부하고 있는 학생이다. 코딩을 좋아하며 이해한 것을 다른 사람들에게 공유하는 것을 좋아한다. 1년 간 TensorFlow를 사용하여 Deep Learning을 공부하였고 현재는 PyTorch를 사용하여 Generative Adversarial Network를 공부하고 있다. TensorFlow로 여러 논문들을 구현, PyTorch Tutorial을 만들어 Github에 공개한 이력을 갖고 있다. 개요: Generative Adversarial Network(GAN)은 2014년 Ian Goodfellow에 의해 처음으로 제안되었으며, 적대적 학습을 통해 실제 데이터의 분포를 추정하는 생성 모델입니다. 최근 들어 GAN은 가장 인기있는 연구 분야로 떠오르고 있고 하루에도 수 많은 관련 논문들이 쏟아져 나오고 있습니다. 수 없이 쏟아져 나오고 있는 GAN 논문들을 다 읽기가 힘드신가요? 괜찮습니다. 기본적인 GAN만 완벽하게 이해한다면 새로 나오는 논문들도 쉽게 이해할 수 있습니다. 이번 발표를 통해 제가 GAN에 대해 알고 있는 모든 것들을 전달해드리고자 합니다. GAN을 아예 모르시는 분들, GAN에 대한 이론적인 내용이 궁금하셨던 분들, GAN을 어떻게 활용할 수 있을지 궁금하셨던 분들이 발표를 들으면 좋을 것 같습니다. 발표영상: https://youtu.be/odpjk7_tGY0

Kernels and Support Vector Machines

Edgar Marca

Computer vision

yusifagalar

YOLOv4: optimal speed and accuracy of object detection review

LEE HOSEONG

Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Cl...

KwanyoungKim7

[DL輪読会]Generative Models of Visually Grounded Imagination

Deep Learning JP

最近(2020/09/13)のarxivの分布外検知の論文を紹介

ぱんいちすみもと

Deep VO and SLAM

Yu Huang

論文読み会@AIST (Deep Virtual Stereo Odometry [ECCV2018])

Masaya Kaneko

Keynote at Tracking Workshop during ISMAR 2014

Darius Burschka

What's hot

Moving object detection

Raviraj singh shekhawat

Wasserstein GAN 수학 이해하기 I

Sungbin Lim

Finding connections among images using CycleGAN

NAVER Engineering

Optimization/Gradient Descent

kandelin

Optic flow estimation with deep learning

Yu Huang

Multimodal Deep Learning

Universitat Politècnica de Catalunya

Deep learning for person re-identification

哲东郑

4 Dimensionality reduction (PCA & t-SNE)

Dmytro Fishman

InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...

홍배 김

ConvNeXt: A ConvNet for the 2020s explained

Sushant Gautam

Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018

Universitat Politècnica de Catalunya

“How Transformers are Changing the Direction of Deep Learning Architectures,”...

Edge AI and Vision Alliance

1시간만에 GAN(Generative Adversarial Network) 완전 정복하기

NAVER Engineering

Kernels and Support Vector Machines

Edgar Marca

Computer vision

yusifagalar

YOLOv4: optimal speed and accuracy of object detection review

LEE HOSEONG

Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Cl...

KwanyoungKim7

[DL輪読会]Generative Models of Visually Grounded Imagination

Deep Learning JP

最近(2020/09/13)のarxivの分布外検知の論文を紹介

ぱんいちすみもと

Deep VO and SLAM

Yu Huang

What's hot (20)

Moving object detection

Wasserstein GAN 수학 이해하기 I

Finding connections among images using CycleGAN

Optimization/Gradient Descent

Optic flow estimation with deep learning

Multimodal Deep Learning

Deep learning for person re-identification

4 Dimensionality reduction (PCA & t-SNE)

InfoGAN: Interpretable Representation Learning by Information Maximizing Gene...

ConvNeXt: A ConvNet for the 2020s explained

Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018

“How Transformers are Changing the Direction of Deep Learning Architectures,”...

1시간만에 GAN(Generative Adversarial Network) 완전 정복하기

Kernels and Support Vector Machines

Computer vision

YOLOv4: optimal speed and accuracy of object detection review

Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Cl...

[DL輪読会]Generative Models of Visually Grounded Imagination

最近(2020/09/13)のarxivの分布外検知の論文を紹介

Deep VO and SLAM

Similar to PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

論文読み会@AIST (Deep Virtual Stereo Odometry [ECCV2018])

Masaya Kaneko

Keynote at Tracking Workshop during ISMAR 2014

Darius Burschka

Motion capture

Aswanth Talaseela

Motion capture technology

ARUN S L

Fast Multi-frame Stereo Scene Flow with Motion Segmentation (CVPR 2017)

Tatsunori Taniai

Motion capture technology

Arun MK

Motion capture document

harini501

Motion Human Detection & Tracking Based On Background Subtraction

International Journal of Engineering Inventions www.ijeijournal.com

Reconstructing and Watermarking Stereo Vision Systems-PhD Presentation

Osama Hosam

We have solved the correspondence problem by applying the matching process in two levels, the first level is Feature based matching, in which we have extracted the features of both images by creating multi-resolution images and applying histogram segmentation. The resulting features are region features; a comparison is done between the regions in the first image with the regions of the second image to get the disparity map. The second level is Area-based matching in which we applied the Wavelet transform to get an expected window size as a search area for each pixel. We have joined the two levels to obtain more accurate pixel by pixel correspondence. We also obtained an adaptive search range and window size for each pixel to reduce the mismatches. Our procedure introduced high accuracy results and denser depth information. The depth information is used to get the final 3D model – using only pair of images will create 2.5D model, using more than pair of images will create 3D model, we will refer to 3D model as a general output of stereo reconstruction– After reconstructing the model, in some applications it is needed to be published online. For example suppose the reconstructed model is a model for Sphinx – Famous statue in Egypt – The reconstruction for the model can be done in many days or months; then the model will be published online to let Internet users around the world watch the model. Therefore, techniques should be used to protect the copyright for that model. We have applied new fragile watermarking technique to secure the 3D reconstructed model and protect its copyright.

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...c.choi

BallCatchingRobotgauravbrd

Presentation Object Recognition And Tracking ProjectPrathamesh Joshi

Motionblur

ozlael ozlael

Androidで出来る!! KinectとiPadを使った亀ロボ

Hirotaka Niisato

Outline

Ashraf Aboshosha

Fundamentals of matchmoving

Dipjoy Routh

Smart Room Gesture Control

Giwrgos Paraskevopoulos

Getmoving as3kinect

Marielle Lange

Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon TransformFadwa Fouad

Edge Detection algorithm and code

Vaddi Manikanta

Similar to PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose (20)

論文読み会@AIST (Deep Virtual Stereo Odometry [ECCV2018])

Keynote at Tracking Workshop during ISMAR 2014

Motion capture

Motion capture technology

Fast Multi-frame Stereo Scene Flow with Motion Segmentation (CVPR 2017)

Motion capture technology

Motion capture document

Motion Human Detection & Tracking Based On Background Subtraction

Reconstructing and Watermarking Stereo Vision Systems-PhD Presentation

Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based V...

BallCatchingRobot

Presentation Object Recognition And Tracking Project

Motionblur

Androidで出来る!! KinectとiPadを使った亀ロボ

Outline

Fundamentals of matchmoving

Smart Room Gesture Control

Getmoving as3kinect

Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform

Edge Detection algorithm and code

More from Hyeongmin Lee

PR-455: CoTracker: It is Better to Track Together

Hyeongmin Lee

이번 영상에서는 제가 PR 278번째로 소개드린 적 있었던 RAFT의 Point Tracking 버전 논문입니다. 보통 Object Traking은 주어진 bounding box를 track하는 task를 말하는데 본 논문에서는 첫 프레임에 주어진 point를 따라가는 task를 다루고 있습니다. 논문 제목에서 이야기 하듯이, 주어진 point 하나를 따라가는 것보다 여러 point를 함께 따라가면서 서로 정보를 주고받는 등의 interaction을 하는 것이 tracking 성능 향상에 도움이 된다는 것이 이 논문의 main idea입니다. 논문 링크: https://arxiv.org/abs/2307.07635 영상 링크: https://youtu.be/BDfTSm3_hys

PR-430: CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retri...

Hyeongmin Lee

이번 영상에서는 제가 최근에 관심 가지기 시작한 Video 검색쪽 논문을 소개드려볼까 합니다. 발표드릴 논문 제목은 CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval 입니다. CLIP을 Video에 확장해서 Video와 Text의 multimodal 학습을 하는 형태의 논문인데요, 논문 자체 내용은 굉장히 심플해서 분량이 남을 것 같기도 하고 CLIP이 PR12에서 제대로 다뤄진 적은 없었던 것 같아서 함께 다뤄보려고 합니다. 영상 링크: https://youtu.be/b543xivGRnI 논문 링크: https://arxiv.org/abs/2104.08860

PR-420: Scalable Model Compression by Entropy Penalized Reparameterization

Hyeongmin Lee

제가 이번에 소개드릴 논문은 Scalable Model Compression by Entropy Penalized Reparameterization이라는 논문입니다. 이전에 꾸준히 Deep Learning을 이용한 이미지 및 비디오 압축에 대해 설명드렸던 바가 있는데, 이번에는 Neural Network의 Model Parameter들을 압축하는 방법에 관한 논문입니다. 논문 링크: https://arxiv.org/abs/1906.06624 영상 링크: https://youtu.be/LJ8WD5MKA2o

PR-395: Variational Image Compression with a Scale Hyperprior

Hyeongmin Lee

제가 이번에 소개드릴 논문은 Variational Image Compression with a Scale Hyperprior라는 논문입니다. 지난 328번째 발표에 이어서 두번째 Deep Learning-based Image Compression이고, 지난번 발표때 다루지 못했던 Variational Autoencoder와의 관계와 이번 논문에서의 새 Contribution까지, Deep Learning을 이용한 Image Compression연구는 어떤 고민을 주로 하고 있는지 등을 전달해드리고자 노력하였습니다. 논문 링크: https://arxiv.org/abs/1802.01436 영상 링크: https://youtu.be/ne9ieHRsfCc

PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...

Hyeongmin Lee

제가 이번에 소개드릴 논문은 NeRF와 같이 view synthesis를 하는 논문입니다. NeRF 이후로 NeRF의 문제점을 보완하기 위해 여러 방법들이 쏟아져 나왔는데요, 다른 한편으로는 발상의 전환을 통해 NeRF와 다른 방법을 활용하고자 하는 시도들도 있는 편입니다. 그러한 가장 대표적인 방법중 하나인 Neural Light Field Rendering 방식에 대해 설명드리겠습니다. 논문 링크: https://arxiv.org/abs/2106.02634 영상 링크: https://youtu.be/gxag8uvA2Sc

PR-376: Softmax Splatting for Video Frame Interpolation

Hyeongmin Lee

이번 PR12 365번째 논문으로 소개드릴 내용은 조금 특이한 접근법입니다. 우리가 실생활에서 접하는 대부분의 비디오는 Compressed 된 형태의 Video인데요, 실제 Computer Vision Task에서 input이 Compressed Video라는 가정을 하게 되면 생각보다 큰 이점을 얻을 수 있습니다. 바로 Compressed Video에는 Motion Vector가 포함되어있다는 점입니다. 이를 이용하면 생각보다 많은 것들을 할 수 있게 됩니다. 그 예시로 Object Detection의 연산량을 크게 줄인 case를 하나 소개드려보고자 합니다. 논문 링크: https://arxiv.org/abs/2003.05534 영상 링크: https://youtu.be/jxKU4pDs2G8

PR-365: Fast object detection in compressed video

Hyeongmin Lee

이번 PR12 365번째 논문으로 소개드릴 내용은 조금 특이한 접근법입니다. 우리가 실생활에서 접하는 대부분의 비디오는 Compressed 된 형태의 Video인데요, 실제 Computer Vision Task에서 input이 Compressed Video라는 가정을 하게 되면 생각보다 큰 이점을 얻을 수 있습니다. 바로 Compressed Video에는 Motion Vector가 포함되어있다는 점입니다. 이를 이용하면 생각보다 많은 것들을 할 수 있게 됩니다. 그 예시로 Object Detection의 연산량을 크게 줄인 case를 하나 소개드려보고자 합니다. paper link: https://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Fast_Object_Detection_in_Compressed_Video_ICCV_2019_paper.html video link: https://youtu.be/9n6OtHtJvJ0

PR-340: DVC: An End-to-end Deep Video Compression Framework

Hyeongmin Lee

이번 PR12 340번째 논문으로 소개드릴 내용은 Deep Learning을 이용한 Video Compression에 관한 내용입니다. 바로 이전 논문으로 Deep Learning을 이용한 Image Compression에 대해 설명드렸었는데요, 시간 여유가 있으신 분들께서는 이전 영상 먼저 보시고 오는 것을 추천드립니다 :) 이전 영상: https://www.youtube.com/watch?v=rtuJqQDWmIA paper link: https://arxiv.org/abs/1812.00101 youtube link: https://youtu.be/Dd8Gj2ZITkA

PR-328: End-to-End OptimizedImage Compression

Hyeongmin Lee

PR 328번째 논문은 ICLR 2017에 발표된 "End-to-End OptimizedImage Compression"이라는 논문입니다. 이미지 압축에 대해 들어보신 적이 있으신가요? 이미지를 더 적은 비트, 즉 더 적은 용량의 데이터로 표현하기 위해 다양한 압축 방법이 제안되어 왔습니다. 가장 대표적인 기술이 JPEG이라고 할 수 있겠는데요, 이 논문에서는 End-to-End Deep Learning을 이용하여 이미지를 압축하는 기법을 제안합니다. 이 논문에서 제안한 방법과 더불어 이미지 압축에 필요한 기본 개념들까지 함께 정리하였으니 이미지 압축이라는 분야가 단순히 무엇인지 궁금하신 분들께서도 앞에서부터 차근차근 봐주시면 감사드리겠습니다 :) paper link: https://arxiv.org/abs/1611.01704 youtube link: https://youtu.be/rtuJqQDWmIA

PR-315: Taming Transformers for High-Resolution Image Synthesis

Hyeongmin Lee

요즘 Transformer 구조를 language랑 vision 관계 없이 여기저기 적용해보려는 시도가 매우 다양하게 이루어지고 있는데요, 그래서 이번주 제 발표에서는 이를 High-resolution image synthesis에 활용한, CVPR 2021 Oral Session에서 발표될 논문 하나를 소개해보려고 합니다! ** 방송 기기 문제로 이번 영상은 아이패드 필기 없이 진행됩니다!! ** 논문 링크: https://arxiv.org/abs/2012.09841 영상 링크: https://youtu.be/GcbT0IGt0xE

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Hyeongmin Lee

드디어 PR12 Season 4가 시작되었습니다! 제가 이번 시즌에서 발표하게 된 첫 논문은 ""NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis"라는 논문입니다. View Synthesis라는 Task는 몇 개의 시점에서 대상을 찍은 영상이 주어지면 주어지지 않은 위치와 방향에서 바라본 대상의 영상을 합성해내는 기술입니다. 이를 위해서 본 논문에서는 대상의 3D 정보를 통째로 Neural Network가 외우게 하는 방법을 선택했는데요, 이 방식은 Implicit Neural Representation이라는 이름으로 유명해지고 있는 추세고, 2D 이미지에 대해서도 적용하려는 접근들이 늘고 있습니다. 영상 링크: https://youtu.be/zkeh7Tt9tYQ 논문 링크: https://arxiv.org/abs/2003.08934

PR-278: RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

Hyeongmin Lee

Pr266

Hyeongmin Lee

이번에 다룰 논문은 "Learning by Analogy: Reliable Supervision From Transformations for Unsupervised Optical Flow Estimation"이라는 논문입니다. 얼마 전에 발표드렸던 FlowNet 논문처럼 이 논문도 Deep Learning을 통해 Optical Flow를 학습하는 방법입니다. 다른 점이 하나 있다면, Unsupervised 방식으로 학습이 진행된다는 점입니다. Supervised 방식 만큼이나 Unsupervised 방식으로 Optical Flow를 학습하는 연구 역시 이미 많이 진행이 되어 왔는데요, 오늘 소개 드릴 논문에서는 Data Augmentation을 통한 Consistency를 활용하여 성능을 높이는 방식을 채용한 경우를 소개드리고자 합니다. 영상 링크: 이번에 다룰 논문은 "Learning by Analogy: Reliable Supervision From Transformations for Unsupervised Optical Flow Estimation"이라는 논문입니다. 얼마 전에 발표드렸던 FlowNet 논문처럼 이 논문도 Deep Learning을 통해 Optical Flow를 학습하는 방법입니다. 다른 점이 하나 있다면, Unsupervised 방식으로 학습이 진행된다는 점입니다. Supervised 방식 만큼이나 Unsupervised 방식으로 Optical Flow를 학습하는 연구 역시 이미 많이 진행이 되어 왔는데요, 오늘 소개 드릴 논문에서는 Data Augmentation을 통한 Consistency를 활용하여 성능을 높이는 방식을 채용한 경우를 소개드리고자 합니다.

PR-252: Making Convolutional Networks Shift-Invariant Again

Hyeongmin Lee

PR-240: Modulating Image Restoration with Continual Levels viaAdaptive Featu...

Hyeongmin Lee

[PR12] Making Convolutional Networks Shift-Invariant Again

Hyeongmin Lee

Latest Frame interpolation Algorithms

Hyeongmin Lee

[Paper Review] Temporal Generative Adversarial Nets with Singular Value Clipping

Hyeongmin Lee

[Paper Review] A Middlebury Benchmark & Context-Aware Synthesis for Video Fra...

Hyeongmin Lee

[Paper Review] Video Frame Interpolation via Adaptive Convolution

Hyeongmin Lee

More from Hyeongmin Lee (20)

PR-455: CoTracker: It is Better to Track Together

PR-430: CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retri...

PR-420: Scalable Model Compression by Entropy Penalized Reparameterization

PR-395: Variational Image Compression with a Scale Hyperprior

PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...

PR-376: Softmax Splatting for Video Frame Interpolation

PR-365: Fast object detection in compressed video

PR-340: DVC: An End-to-end Deep Video Compression Framework

PR-328: End-to-End OptimizedImage Compression

PR-315: Taming Transformers for High-Resolution Image Synthesis

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

PR-278: RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

Pr266

PR-252: Making Convolutional Networks Shift-Invariant Again

PR-240: Modulating Image Restoration with Continual Levels viaAdaptive Featu...

[PR12] Making Convolutional Networks Shift-Invariant Again

Latest Frame interpolation Algorithms

[Paper Review] Temporal Generative Adversarial Nets with Singular Value Clipping

[Paper Review] A Middlebury Benchmark & Context-Aware Synthesis for Video Fra...

[Paper Review] Video Frame Interpolation via Adaptive Convolution

Recently uploaded

weather web application report.pdf

Pratik Pawar

Hierarchical Digital Twin of a Naval Power System

Kerry Sado

A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.

DESIGN AND ANALYSIS OF A CAR SHOWROOM USING E TABS

itech2017

Fundamentals of Induction Motor Drives.pptx

manasideore6

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...

AJAYKUMARPUND1

一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理

bakpo1

SFU毕业证原版定制【微信：176555708】【西蒙菲莎大学毕业证成绩单-学位证】【微信：176555708】（留信学历认证永久存档查询）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。 ◆◆◆◆◆ — — — — — — — — 【留学教育】留学归国服务中心 — — — — — -◆◆◆◆◆ 【主营项目】一.毕业证【微信：176555708】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【微信：176555708】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分→ 【关于价格问题（保证一手价格）我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才选择实体注册公司办理，更放心，更安全！我们的承诺：可来公司面谈，可签订合同，会陪同客户一起到教育部认证窗口递交认证材料，客户在教育部官方认证查询网站查询到认证通过结果后付款，不成功不收费！学历顾问：微信：176555708

Harnessing WebAssembly for Real-time Stateless Streaming Pipelines

Christina Lin

Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.

Steel & Timber Design according to British Standard

AkolbilaEmmanuel1

Forklift Classes Overview by Intella Parts

Intella Parts

Technical Drawings introduction to drawing of prisms

heavyhaig

Fundamentals of Electric Drives and its applications.pptx

manasideore6

Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf

fxintegritypublishin

Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...

thanhdowork

Nuclear Power Economics and Structuring 2024

Massimo Talia

Water billing management system project report.pdf

Kamal Acharya

Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard. The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record. We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular. MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.

AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf

SamSarthak3

Planning Of Procurement o different goods and services

JoytuBarua2

Railway Signalling Principles Edition 3.pdf

TeeVichai

DfMAy 2024 - key insights and contributions

gestioneergodomus

Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf

WENKENLI1

Recently uploaded (20)

weather web application report.pdf

Hierarchical Digital Twin of a Naval Power System

DESIGN AND ANALYSIS OF A CAR SHOWROOM USING E TABS

Fundamentals of Induction Motor Drives.pptx

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...

一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理

Harnessing WebAssembly for Real-time Stateless Streaming Pipelines

Steel & Timber Design according to British Standard

Forklift Classes Overview by Intella Parts

Technical Drawings introduction to drawing of prisms

Fundamentals of Electric Drives and its applications.pptx

Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...

Nuclear Power Economics and Structuring 2024

Water billing management system project report.pdf

AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf

Planning Of Procurement o different goods and services

Railway Signalling Principles Edition 3.pdf

DfMAy 2024 - key insights and contributions

Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf

PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

1. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose Hyeongmin Lee Image and Video Pattern Recognition LAB Electrical and Electronic Engineering Dept, Yonsei University 5th Semester 2020.2.23

2. Depth, Optical Flow, Camera Pose

3. Depth, Optical Flow, Camera Pose ◆ Depth [PR098 - MegaDepth] 이미지에 등장하는 각 Pixel이 Camera로부터 몇 m 떨어져 있는지를 나타내는 Map

4. Depth, Optical Flow, Camera Pose ◆ Optical Flow [PR214 - FlowNet] 연속한 두 Frame 사이에서 각 Pixel의 Motion을 나타내는 Vector Map (Pixel Displacement)

5. Depth, Optical Flow, Camera Pose ◆ Camera Pose (Camera Motion, Ego-Motion) 𝑧 𝑥 𝑦 (𝑥, 𝑦, 𝑧) (0,0,0) (𝑥, 𝑦, 𝑧) (𝑥′, 𝑦′, 𝑧′) 𝑇

6. Depth, Optical Flow, Camera Pose ◆ Depth, Optical Flow, Camera Pose 대부분의 Pixel Motion은 카메라의 움직임에 의해 발생 ➔ Object Motion과 분리하여 생각.

7. Depth, Optical Flow, Camera Pose ◆ Depth, Optical Flow, Camera Pose Depth!!

8. 3D Geometry

9. 3D Geometry ◆ Real Distance? Camera 정보 카메라와 대상 간의 거리 (Depth)

10. 3D Geometry ◆ Camera Calibration Image Coordinate Normalized Coordinate pixel Meter(z=1) (𝑥, 𝑦) (𝑢, 𝑣) 𝑥 = 𝑓𝑥 𝑢 + 𝑐 𝑥 𝑦 = 𝑓𝑦 𝑣 + 𝑐 𝑦 𝑥 𝑦 1 = 𝑓𝑥 0 𝑐 𝑥 0 𝑓𝑦 𝑐 𝑦 0 0 1 𝑢 𝑣 1 𝐾 Intrinsic Parameter

11. 3D Geometry ◆ Depth 초점 𝑍 (𝑋, 𝑌, 𝑍) 1 𝑓 (𝑢, 𝑣, 1) (𝑥, 𝑦, 1) 𝑢 𝑣 1 = 𝐾−1 𝑥 𝑦 1 𝑋 𝑌 𝑍 = 𝑍 𝑢 𝑣 1 = 𝐷𝐾−1 𝑥 𝑦 1

12. 3D Geometry ◆ 3D Transformation (𝑥, 𝑦, 𝑧) (𝑥′, 𝑦′, 𝑧′) 𝑇 𝑥′ 𝑦′ 𝑧′ 1 = 𝑟11 𝑟12 𝑟13 𝑡 𝑥 𝑟11 𝑟12 𝑟13 𝑡 𝑥 𝑟11 𝑟12 𝑟13 𝑡 𝑥 0 0 0 1 𝑥 𝑦 𝑧 1 = [𝑅|𝑡] 𝑥 𝑦 𝑧 1 𝑥′ 𝑦′ 𝑧′ = 𝑅 𝑥 𝑦 𝑧 + 𝑡 𝑥 𝑡 𝑦 𝑡 𝑧 출처: Dark Programmer

13. GeoNet

14. GeoNet ◆ Rigid & Residual Motion • Rigid Motion: Camera Motion에 의한 상대적인 움직임 • Residual Motion: 각 Object의 독립적인 움직임

15. GeoNet ◆ Rigid & Residual Motion =

16. GeoNet ◆ Rigid Warping Loss ◆ Edge-Aware Depth Smoothness Loss 𝐿 𝑟𝑤 = 𝛼 1 − 𝑆𝑆𝐼𝑀(𝐼𝑡, ෩𝐼𝑠 𝑟𝑖𝑔 ) 2 + 1 − 𝛼 𝐼𝑡 − ෩𝐼𝑠 𝑟𝑖𝑔 1 𝐿 𝑑𝑠 = ෍ 𝑝 𝑡 |∇𝐷(𝑝𝑡)| ∙ 𝑒− ∇𝐼 𝑝 𝑡 𝑇

17. GeoNet ◆ Flow Warping Loss ◆ Edge-Aware Flow Smoothness Loss 𝐿 𝑓𝑤 = 𝛼 1 − 𝑆𝑆𝐼𝑀(𝐼𝑡, ෩𝐼𝑠 𝑓𝑢𝑙𝑙 ) 2 + 1 − 𝛼 𝐼𝑡 − ෩𝐼𝑠 𝑓𝑢𝑙𝑙 1 𝐿 𝑓𝑠 = ෍ 𝑝 𝑡 |∇𝑓𝑡→𝑠 𝑓𝑢𝑙𝑙 (𝑝𝑡)| ∙ 𝑒− ∇𝐼 𝑝 𝑡 𝑇

18. GeoNet ◆ Geometric Consistency Loss 𝐿 𝑔𝑐 = ෍ 𝑝 𝑡 [𝛿(𝑝𝑡)] ∙ ∆𝑓𝑡→𝑠 𝑓𝑢𝑙𝑙 𝑝𝑡 1 ∆𝑓𝑡→𝑠 𝑓𝑢𝑙𝑙 𝑝𝑡 = 𝑓𝑡→𝑠 𝑓𝑢𝑙𝑙 + 𝑓𝑠→𝑡 𝑓𝑢𝑙𝑙 (𝑝𝑡 + 𝑓𝑡→𝑠 𝑓𝑢𝑙𝑙 (𝑝𝑡)) For Occlusion Reasoning

19. GeoNet ◆ Depth Result

20. GeoNet ◆ Flow & Pose Result

21. Thank You!

PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose

Similar to PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose (20)

More from Hyeongmin Lee

More from Hyeongmin Lee (20)

Recently uploaded

Recently uploaded (20)

PR-228: Geonet: Unsupervised learning of dense depth, optical flow and camera pose