Frame Interpolation
Yonsei University, Image and Video Pattern Recognition Lab
Hyeongmin Lee
2018.8.17
Frame Interpolation??
(figures: frame interpolation vs. simple blending of the two input frames)
Taxonomy of frame interpolation algorithms:
- Convolutional Neural Network
  - MIND (ECCV, 2016.10, Long)
  - AdaConv (CVPR, 2017.6, Niklaus)
  - SepConv (ICCV, 2017.10, Niklaus)
- Phase Based
  - Phase-Based (CVPR, 2015.6, Meyer)
  - PhaseNet (CVPR, 2018.6, Meyer)
- Optical Flow Based
  - Middlebury (IJCV, 2011.11, Baker)
  - CtxSyn (CVPR, 2018.6, Niklaus)
  - Flow Learning
    - Deep Voxel Flow (ICCV, 2017.10, Liu)
    - SuperSlomo (CVPR, 2018.6, Jiang)
- ETC
  - Moving Gradient (ACM Graphics, 2009.8, Mahajan)
*Algorithms shown in color use deep learning.
(The same taxonomy is shown again as the talk's roadmap: the five methods to be covered are numbered 1-5, and the current SOTA and the most famous method are marked.)
Benchmark & Test Dataset
Middlebury Benchmark
Baker, Simon, et al. "A database and evaluation methodology for optical flow." International Journal of Computer Vision 92.1 (2011): 1-31.
• A paper written for the evaluation of optical flow
• Among its several evaluation methods, one measures the quality of frame interpolation performed with the estimated optical flow
• To evaluate optical flow, it proposes an optical-flow-based frame interpolation algorithm as a by-product
→ It is therefore also regarded as a frame interpolation paper.
Middlebury Benchmark
Compute the RMS error over various masks → ranking
http://vision.middlebury.edu/flow/eval/results/results-i1.php
Middlebury Benchmark
• Two images $I_0$, $I_1$ and the optical flow $u_0$ between them are given.
• Using $u_0$, the flow $u_t$ at any time $t \in [0, 1]$ can be obtained:
  $u_t(\mathrm{round}(x + t\,u_0(x))) = u_0(x)$
• Clearly, two or more values of $u_0$ may land on the same point of $u_t$, while some points may be assigned nothing at all.
• When $u_0(x_1)$ and $u_0(x_2)$ compete, the one with the smaller $|I_0(x) - I_1(x + u_0(x))|$ is adopted.
• Empty spots are filled with an outside-in strategy.
Frame Warping
→ "Move the pixel values of the input image according to the optical flow."
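A minimal NumPy sketch of this forward-warping step, assuming grayscale frames and hypothetical helper names; collision resolution follows the photometric rule above, and the outside-in hole filling is left as a stub:

```python
import numpy as np

def forward_warp_flow(u0, I0, I1, t):
    """Splat the flow u0 (H, W, 2) to time t in [0, 1], producing u_t."""
    H, W = u0.shape[:2]
    ut = np.full((H, W, 2), np.nan)      # NaN marks holes (unassigned pixels)
    best_err = np.full((H, W), np.inf)   # photometric error of the current winner
    for y in range(H):
        for x in range(W):
            dx, dy = u0[y, x]
            tx, ty = int(round(x + t * dx)), int(round(y + t * dy))
            if not (0 <= tx < W and 0 <= ty < H):
                continue
            # score the candidate by |I0(x) - I1(x + u0(x))|
            sx, sy = int(round(x + dx)), int(round(y + dy))
            err = (abs(float(I0[y, x]) - float(I1[sy, sx]))
                   if 0 <= sx < W and 0 <= sy < H else np.inf)
            # when two sources compete for the same target, keep the better one
            if err < best_err[ty, tx]:
                best_err[ty, tx] = err
                ut[ty, tx] = u0[y, x]
    # remaining NaN entries would be filled with the outside-in strategy
    return ut
```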
Middlebury Benchmark
• Compute the occlusion masks $O_0(x)$, $O_1(x)$. $O_i(x) = 1$ means that pixel $I_i(x)$ is not visible in the other frame.
  → Substituting $t = 1$ into $u_t(\mathrm{round}(x + t\,u_0(x))) = u_0(x)$ yields $u_1$; the holes that appear are set to $O_1(x) = 1$.
  → For $O_0(x)$: set $O_0(x) = 1$ wherever $|u_0(x) - u_1(x + u_0(x))| > 0.5$.
• Now, to determine the pixel value $I_t(x)$, we need the position $x_0$ in $I_0$ and the position $x_1$ in $I_1$ to reference; that is, we will sample $I_0(x_0)$ and $I_1(x_1)$:
  → $x_0 = x - t\,u_t(x)$
  → $x_1 = x + (1 - t)\,u_t(x)$
• $I_t(x) = (1 - t)\,I_0(x_0) + t\,I_1(x_1)$
• If $O_0(x_0) = 1$, then $I_t(x) = I_0(x_0)$; the opposite case is handled likewise.
Frame Warping
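A matching sketch of this sampling step (nearest-neighbor rounding instead of bilinear sampling, to keep it short; `ut`, `O0`, `O1` are the quantities defined above):

```python
import numpy as np

def synthesize_frame(I0, I1, ut, O0, O1, t):
    """Blend I0 and I1 into I_t using the time-t flow and occlusion masks."""
    H, W = ut.shape[:2]
    It = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            dx, dy = ut[y, x]
            # reference positions x0 in I0 and x1 in I1
            x0 = min(max(int(round(x - t * dx)), 0), W - 1)
            y0 = min(max(int(round(y - t * dy)), 0), H - 1)
            x1 = min(max(int(round(x + (1 - t) * dx)), 0), W - 1)
            y1 = min(max(int(round(y + (1 - t) * dy)), 0), H - 1)
            if O0[y0, x0] == 1:        # source occluded in frame 1: use I0 only
                It[y, x] = I0[y0, x0]
            elif O1[y1, x1] == 1:      # source occluded in frame 0: use I1 only
                It[y, x] = I1[y1, x1]
            else:                      # visible in both frames: linear blend
                It[y, x] = (1 - t) * I0[y0, x0] + t * I1[y1, x1]
    return It
```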
Middlebury Benchmark
• We have the flow $F_0$ at Frame 0.
• Warping $F_0$ gives the flow $F_t$ at time $t$.
• Using $F_t$, we can find which position in Frame 1 each pixel of the frame being generated corresponds to (forward flow).
• Using $-F_t$, the same holds for Frame 0 (backward flow).
• Warp Frames 0 and 1 with the forward and backward flows, respectively.
• Linearly combine the two warped images.
Frame Warping (Summary)
CNN Based
The fatal drawback of flow-based methods:
they are not robust to occlusion.
Self-Supervised Learning
(figures: several pairs of input frames with the interpolated output; the middle frame of a video triplet serves as the ground truth, so no manual labels are needed)
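A sketch of what this self-supervision looks like in training code (PyTorch-style; `model` is a placeholder for any interpolation network that takes the two outer frames concatenated on the channel axis):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, triplet):
    frame0, frame1, frame2 = triplet   # consecutive video frames, (B, 3, H, W) each
    # the outer frames are the input; the real middle frame is the free label
    pred = model(torch.cat([frame0, frame2], dim=1))
    loss = F.l1_loss(pred, frame1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```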
MIND (Matching by INverting Deep Neural Network)
• U-Net-style encoder-decoder architecture
• Really just the simple idea of "train a CNN" to output the middle frame directly
• Drawback: the output is blurry
Long, Gucan, et al. "Learning image matching by simply watching video." European Conference on Computer Vision. Springer, Cham, 2016.
Adaptive Convolution
"The information the target pixel needs should be held by the nearby pixels in the previous and next frames!!"
"So let's use the result of convolving the two frames as the pixel value!!"
"Then which filter do we use??"
S. Niklaus, L. Mai, and F. Liu. “Video frame interpolation via adaptive convolution.” In IEEE Conference on Computer Vision and Pattern Recognition, July 2017
Adaptive Convolution
(figure: a 79×79 input patch around the target point produces a 41×41 output filter)
Apply a different filter to each part of the image, and obtain each filter as an output of the network.
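A NumPy sketch of the adaptive convolution itself, assuming the network has already predicted a normalized K×K kernel per output pixel for each input frame (an illustration only, not the paper's CUDA implementation):

```python
import numpy as np

K = 41  # per-pixel filter size from the paper

def adaptive_conv(I0, I1, kernels):
    """I0, I1: (H, W) frames. kernels: (H, W, K, K, 2), one filter per output
    pixel, covering a K x K patch in each of the two input frames."""
    H, W = I0.shape
    pad = K // 2
    P0 = np.pad(I0, pad)   # zero-pad so every pixel has a full patch
    P1 = np.pad(I1, pad)
    out = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            out[y, x] = ((kernels[y, x, :, :, 0] * P0[y:y + K, x:x + K]).sum()
                         + (kernels[y, x, :, :, 1] * P1[y:y + K, x:x + K]).sum())
    return out
```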
Adaptive Convolution
(figures: occlusion handling and edge-aware pixel interpolation)
Adaptive Separable Convolution
• Instead of predicting the whole filter, predict two 41-dim vectors (one vertical, one horizontal) and multiply them → a 41 × 41 matrix
• Adaptive Conv: #pixels × filter size (41) × filter size (41) (26 GB)
• Adaptive Separable Conv: #pixels × filter size (41) × 2 (1.27 GB)
S. Niklaus, L. Mai, and F. Liu. “Video frame interpolation via adaptive separable convolution.” In IEEE International Conference on Computer Vision, Oct 2017
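The memory arithmetic behind those two numbers, sketched below: a separable kernel stores two K-vectors per pixel and reconstructs the full K×K filter as their outer product (counting kernels for both input frames roughly reproduces the slide's 26 GB vs. 1.27 GB):

```python
import numpy as np

K, H, W = 41, 1080, 1920                 # filter size, 1080p frame
kv, kh = np.random.rand(K), np.random.rand(K)
full = np.outer(kv, kh)                  # the K x K kernel, rebuilt on the fly
assert full.shape == (K, K)

bytes_full = H * W * K * K * 4 * 2       # float32, full kernels, 2 input frames
bytes_sep = H * W * K * 2 * 4 * 2        # two K-vectors per pixel instead
print(bytes_full / 2**30, bytes_sep / 2**30)   # ~26.0 GiB vs ~1.27 GiB
```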
Optical Flow + Learning
Context Aware Synthesis
• $I_t(x) = (1 - t)\,I_0(x_0) + t\,I_1(x_1)$ [Middlebury Benchmark]
  $I_t(x) = I_0(x_0)$ → forward-warped image
  $I_t(x) = I_1(x_1)$ → backward-warped image
• Instead of linearly combining (blending) the two warped images, let's mix them with a neural network!
• When mixing, let's also warp the features from a pre-trained network and feed them in too!!
S. Niklaus, and F. Liu. “Context-Aware Synthesis for Video Frame Interpolation.” In IEEE Conference on Computer Vision and Pattern Recognition, June 2018
Phase Based
Phase Based Frame Interpolation
S. Meyer, O. Wang, H. Zimmer, M. Grosse, and A. Sorkine-Hornung. “Phase-based frame interpolation for video.” In IEEE Conference on Computer Vision and Pattern Recognition, pages 1410–1418, 2015.
Phase Based Frame Interpolation
Phase Based Frame Interpolation
Steerable Filter
Phase Based Frame Interpolation
But...!!
(figure: the phase-shifted reconstruction no longer matches the true intermediate frame once the motion becomes large)
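A 1D toy illustrating both the idea and its limit (this is not Meyer et al.'s pipeline, which uses a complex steerable pyramid over local image regions rather than a global DFT): interpolating the per-frequency phase between two shifted signals synthesizes the in-between signal, but once the shift grows large the phase difference wraps past ±π and becomes ambiguous, which is exactly the failure mode above.

```python
import numpy as np

N = 256
x = np.arange(N)
f0 = np.exp(-0.5 * ((x - 100) / 6.0) ** 2)   # a bump at position 100
f1 = np.exp(-0.5 * ((x - 106) / 6.0) ** 2)   # the same bump shifted by 6

F0, F1 = np.fft.fft(f0), np.fft.fft(f1)
dphi = np.angle(F1 / F0)        # per-frequency phase shift, wrapped to [-pi, pi]
t = 0.5
Ft = np.abs(F0) * np.exp(1j * (np.angle(F0) + t * dphi))
ft = np.fft.ifft(Ft).real

print(np.argmax(ft))            # ~103: halfway between the two bumps
# for a much larger shift, dphi wraps at high frequencies and this breaks down
```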
PhaseNet
Meyer, Simone, et al. "PhaseNet for Video Frame Interpolation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
Flow Learning
Instead of optical flow,
let's learn a flow dedicated to frame interpolation!!
Difference from optical-flow-based methods
Conventional: Input → Optical Flow → Frame Synthesis
This paper: Input → Flow → Frame Synthesis, trained end-to-end
Deep Voxel Flow
Liu, Ziwei, et al. "Video Frame Synthesis Using Deep Voxel Flow." Proceedings of the IEEE International Conference on Computer Vision. 2017.
Pipeline: Input → Voxel Flow → Frame Synthesis
Deep Voxel Flow
Voxel Flow??
(figure: the voxel flow has the same spatial size as the input and 3 channels (∆x, ∆y, and a temporal weight), predicted by the trained network)
Deep Voxel Flow
(figure: the sampling points $L_0$ and $L_1$ in the two frames)
The target point may not fall on an integer grid position!! → Trilinear Interpolation
Deep Voxel Flow
Trilinear Interpolation
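A sketch of that sampling step, following the DVF formulation as I read it (bilinear in space within each frame, linear along time, which together combine the eight neighboring voxels; sample positions are assumed to stay inside the image):

```python
import numpy as np

def bilinear(I, x, y):
    """Bilinearly sample image I at the fractional position (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    return ((1 - fy) * ((1 - fx) * I[y0, x0] + fx * I[y0, x0 + 1])
            + fy * ((1 - fx) * I[y0 + 1, x0] + fx * I[y0 + 1, x0 + 1]))

def voxel_flow_sample(I0, I1, x, y, dx, dy, w):
    """Trilinear sampling of the (x, y, t) volume spanned by I0 and I1:
    two bilinear spatial samples L0, L1, blended by the temporal weight w."""
    L0 = bilinear(I0, x - dx, y - dy)   # target point in frame 0
    L1 = bilinear(I1, x + dx, y + dy)   # target point in frame 1
    return (1 - w) * L0 + w * L1
```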
Deep Voxel Flow
Training & Result
Super SloMo
Jiang, Huaizu, et al. "Super slomo: High quality estimation of multiple intermediate frames for video interpolation." arXiv preprint arXiv:1712.00080 (2017).
"Why go to the trouble of building a new kind of flow? I'll just implement optical flow and use that :)"
1. First, assume that $F_{0\to1}$ and $F_{1\to0}$ are given.
2. Assuming objects move in straight lines between the two frames, compute $F_{t\to0}$ and $F_{t\to1}$.
3. However, the following approximation reportedly works better (see below).
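For reference, the approximation in question, reconstructed here from Jiang et al. (the slide shows it only as a figure, so treat the exact form as the paper's):

$$\hat{F}_{t\rightarrow 0} = -(1-t)\,t\,F_{0\rightarrow 1} + t^{2}\,F_{1\rightarrow 0}$$
$$\hat{F}_{t\rightarrow 1} = (1-t)^{2}\,F_{0\rightarrow 1} - t\,(1-t)\,F_{1\rightarrow 0}$$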
Super SloMo
* $g(\cdot,\cdot)$: warping function
• The visibility maps take values between 0 and 1.
• $V_{t\leftarrow 0}(p) = 0$ means pixel $p$ is not visible in Frame 0, and 1 means it is; $V_{t\leftarrow 1}(p)$ is defined likewise for Frame 1.
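Putting the pieces together, the synthesis step, again reconstructed from the paper ($\odot$ is element-wise multiplication and $Z$ a normalization term):

$$\hat{I}_t = \frac{1}{Z}\left((1-t)\,V_{t\leftarrow 0}\odot g(I_0, F_{t\rightarrow 0}) + t\,V_{t\leftarrow 1}\odot g(I_1, F_{t\rightarrow 1})\right),\qquad Z = (1-t)\,V_{t\leftarrow 0} + t\,V_{t\leftarrow 1}$$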
Super SloMo
(figure: a network that computes the optical flow, followed by a stage that refines the flow and generates the visibility maps)
http://smartaedi.tistory.com/325
Thank You!!
More Information: https://hyeongminlee.github.io/post/pr002_image_captioning/