Vid-ODE: Continuous-Time Video Generation
with Neural Ordinary Differential Equation
Sunghyun Park1*, Kangyeol Kim1*, Junsoo Lee1
Jaegul Choo1, Joonseok Lee2, Sookyung Kim3, Edward Choi1
1Korea Advanced Institute of Science and Technology (KAIST),
2Google Research, 3Lawrence Livermore Nat'l Lab.
Motivations
• Videos, which record a continuous flow of visual information, inevitably discretize continuous time into a predefined, finite number of units.
• It is challenging for video generation models to accept irregularly sampled frames or to generate frames at unseen timesteps.
[Figure: regular video frames at times 0, 1, 2, 3 versus arbitrary video frames at times 0.5, 1.4, 2.7, 3.8, 4.3, 4.8]
Motivations
• We aim to learn the continuous flow of videos from a sequence of frames (either regular or irregular) and synthesize new frames at any given timestep.
[Figure: continuous-time video generation]
Importance of Continuous Generation
• Due to equipment cost, the time interval between measurements in climate videos often spans minutes to hours, which is insufficient to capture the target dynamics.
• Datasets collected in the wild frequently have missing values, which in turn results in irregular timesteps.
[Figure: climate video timeline (0H, 3H, 6H, 9H) with a missing frame and unobserved time marked]
Introduction to Neural ODE
• A neural ODE reformulates the ResNet forward pass as an integral.
• The formula is interpreted as solving an Ordinary Differential Equation (ODE), where a neural network serves as the derivative estimator.
• The essence of neural ODE lies in learning continuous dynamics, and our paper aims to model the continuous dynamics of a video.
Neural Ordinary Differential Equations, Chen et al., NeurIPS 2018
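The link between a residual update and an ODE solve can be sketched in a few lines. This is a minimal illustration, not the authors' code: the toy derivative `f`, the helper names, and the step counts are all assumptions chosen so the result can be checked against a known solution.

```python
import math

def f(h, t):
    """Toy derivative standing in for a learned network f(h, t; theta) (assumption)."""
    return -0.5 * h

def resnet_forward(h0, num_layers):
    """Residual update h_{k+1} = h_k + f(h_k): one Euler step of size 1 per layer."""
    h = h0
    for k in range(num_layers):
        h = h + f(h, k)
    return h

def odeint_euler(h0, t0, t1, num_steps):
    """Integrate dh/dt = f(h, t) from t0 to t1 with fixed-step Euler."""
    h, t = h0, t0
    dt = (t1 - t0) / num_steps
    for _ in range(num_steps):
        h = h + dt * f(h, t)
        t += dt
    return h

h_resnet = resnet_forward(1.0, num_layers=2)          # two residual blocks
h_coarse = odeint_euler(1.0, 0.0, 2.0, num_steps=2)   # same computation, seen as Euler
h_fine = odeint_euler(1.0, 0.0, 2.0, num_steps=2000)  # finer steps approach the true solution
h_frac = odeint_euler(1.0, 0.0, 1.3, num_steps=1300)  # state at a non-integer time
```

With a step size of 1 the Euler solve reproduces the residual stack exactly, while the same derivative can also be queried at a fractional time such as 1.3, which a fixed stack of layers cannot express.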
Introduction to Neural ODE
Latent ODEs for Irregularly-Sampled Time Series, Rubanova et al., NeurIPS 2019
• Continuous time-series prediction (Interpolation / Extrapolation)
• Latent-ODE with an ODE-RNN encoder can handle irregularly-sampled
time-series data.
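A rough sketch of how an ODE-RNN style encoder can consume irregularly sampled observations: the hidden state evolves under an ODE across each gap, then an RNN cell folds in the new observation. The decay dynamics, the `tanh` cell, and all function names are toy assumptions, not the learned components of the cited work. The timestamps reuse the irregular example from the Motivations slide.

```python
import math

def ode_evolve(h, t_start, t_end, num_steps=100):
    """Evolve the hidden state across a gap with Euler steps on dh/dt = -h (toy dynamics)."""
    dt = (t_end - t_start) / num_steps
    for _ in range(num_steps):
        h = h + dt * (-h)
    return h

def rnn_update(h, x):
    """Toy recurrent cell standing in for a learned GRU update (assumption)."""
    return math.tanh(0.5 * h + 0.5 * x)

def ode_rnn_encode(times, values):
    """Alternate ODE evolution over each gap with an RNN update at each observation."""
    h, t_prev = 0.0, times[0]
    states = []
    for t, x in zip(times, values):
        h = ode_evolve(h, t_prev, t)  # the gap length enters here, so spacing matters
        h = rnn_update(h, x)          # incorporate the observation made at time t
        states.append(h)
        t_prev = t
    return states

times = [0.5, 1.4, 2.7, 3.8, 4.3, 4.8]       # irregular timestamps
values = [math.sin(t) for t in times]        # toy scalar observations
states = ode_rnn_encode(times, values)
```

Because the gap length is passed to the solver, two sequences with identical values but different timestamps yield different hidden states, which a plain RNN would not distinguish.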
Video Generation via Neural ODE
• Previous approach to video generation via neural ODE: ODE²VAE.
• ODE²VAE decomposes the latent representation into position and momentum to model the continuous dynamics.
ODE²VAE: Deep generative second order ODEs with Bayesian neural networks, Yıldız et al., NeurIPS 2019
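The position and momentum split matches the standard reduction of a second-order ODE to a first-order system. In the sketch below, the symbols s (position), v (velocity, playing the role of momentum), and f (the acceleration field) are illustrative names, not necessarily the notation of the ODE²VAE paper.

```latex
\frac{d^{2}\mathbf{s}(t)}{dt^{2}} = f\big(\mathbf{s}(t), \mathbf{v}(t)\big)
\quad\Longleftrightarrow\quad
\begin{cases}
\dfrac{d\mathbf{s}(t)}{dt} = \mathbf{v}(t), \\[6pt]
\dfrac{d\mathbf{v}(t)}{dt} = f\big(\mathbf{s}(t), \mathbf{v}(t)\big).
\end{cases}
```

Written this way, the pair (s, v) can be handed to any first-order ODE solver as a single stacked state.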
Limitation of ODE²VAE
• Although ODE²VAE shows promising directions in continuous time-series modeling, it remains an open question whether it can scale to continuous-time video generation on complicated real-world videos.
Relationship to Existing Video Models
• Video interpolation and extrapolation aim to generate in-between and future frames, respectively, given a set of video frames.
• Technically, existing approaches rely on warping operations or pixel-wise prediction for video generation.
• Since most existing models for these tasks rely on supervision signals defined at fixed timesteps (i.e., in-between and future frames recorded in a discretized manner), they are limited in generating frames at arbitrary timesteps.
• In this paper, we address this limitation by combining neural ODE with various vision methods and propose a novel framework, Vid-ODE, for continuous-time video generation.
Key Contributions
• Vid-ODE can predict video frames at any given timestep (both within and beyond the observed range).
• This is the first ODE-based framework to successfully perform continuous-time video generation on real-world videos.
• Vid-ODE flexibly handles frames that are not restricted to pre-defined time intervals, and is evaluated against several variants of ConvGRU and neural ODE on climate videos where data are sparsely collected.
Overview of Vid-ODE
[Figure: Vid-ODE architecture, consisting of an encoder and a decoder]
Proposed Method: Encoder
• Prior works employ fully connected (FC) layers to model the derivative of the latent state.
• For the encoder, we propose ODE-ConvGRU, a combination of neural ODE and ConvGRU, to handle the spatial aspect of the given video frames.
• Specifically, ODE-ConvGRU processes 3D tensors using convolutional blocks, preserving spatial information.
[Figure: ODE-RNN versus ODE-ConvGRU]
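One way to write an ODE-ConvGRU update, by analogy with ODE-RNN, is as a two-stage rule per observed frame. The notation below is assumed for illustration and may differ from the paper: h is the hidden feature map, f_θ a convolutional derivative network, E a convolutional frame embedding, and X_{t_i} the frame observed at time t_i.

```latex
\begin{aligned}
\mathbf{h}'_{t_i} &= \mathrm{ODESolve}\big(f_{\theta},\ \mathbf{h}_{t_{i-1}},\ (t_{i-1}, t_i)\big), \\
\mathbf{h}_{t_i} &= \mathrm{ConvGRUCell}\big(\mathbf{h}'_{t_i},\ E(\mathbf{X}_{t_i})\big).
\end{aligned}
```

The first line carries the state across a gap of arbitrary length; the second folds in the new frame while keeping the state a spatial tensor rather than a flat vector.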
Proposed Method: Decoder
• Our decoder consists of an ODE solver and a Conv-Decoder.
• The ODE solver produces the hidden states by integrating to the given timesteps.
• Taking adjacent hidden states as input, the Conv-Decoder outputs an optical flow, an image difference, and a composition mask.
Linear Composition
• The three outputs of the decoder are combined via a convex combination.
[Figure: combination procedure]
[Figure: visualization of the 3 intermediate outputs]
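The combination step can be sketched on tiny list-based images. This is one plausible reading, not the authors' implementation: the optical flow warps the previous frame, the image difference supplies a second per-pixel candidate (used directly here, which is an assumption), and a sigmoid of the mask gives per-pixel weights in [0, 1]. The integer horizontal-only flow and all function names are simplifications for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def warp_rows(image, flow_x):
    """Nearest-neighbour backward warp along x with integer flow, clamped at the borders."""
    width = len(image[0])
    warped = []
    for y, row in enumerate(image):
        warped.append([row[min(max(x - flow_x[y][x], 0), width - 1)]
                       for x in range(width)])
    return warped

def compose(prev_frame, flow_x, image_diff, mask_logits):
    """Per-pixel convex combination: m * warped + (1 - m) * image_diff, with m = sigmoid(mask)."""
    warped = warp_rows(prev_frame, flow_x)
    out = []
    for y in range(len(prev_frame)):
        row = []
        for x in range(len(prev_frame[0])):
            m = sigmoid(mask_logits[y][x])
            row.append(m * warped[y][x] + (1.0 - m) * image_diff[y][x])
        out.append(row)
    return out

prev = [[0.0, 1.0, 0.0, 0.0]]   # a single bright pixel
flow = [[1, 1, 1, 1]]           # move everything one pixel to the right
diff = [[0.2, 0.2, 0.2, 0.2]]   # toy second candidate
frame = compose(prev, flow, diff, [[0.0] * 4])  # mask logit 0 gives equal weights
```

Since the weights m and 1 - m are non-negative and sum to one, each output pixel always lies between the two candidates, which is what makes the combination convex.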
Total Objective Functions
• We adopt image and sequence discriminators to improve the output quality through adversarial losses.
• A reconstruction loss computes the pixel-level distance between the predicted video and the ground truth.
• A difference loss helps the model learn the image difference as the pixel-wise difference between consecutive video frames.
• In sum, the total objective function combines the adversarial, reconstruction, and difference losses.
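A weighted sum is one common way to combine such terms. The sketch below covers the two adversarial terms (image and sequence discriminators), the pixel-level distance term, and the image-difference term; the symbol names and the weights λ are assumptions for illustration, not notation taken from the slides.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{recon}}
  + \lambda_{\text{diff}}\,\mathcal{L}_{\text{diff}}
  + \lambda_{\text{img}}\,\mathcal{L}_{\text{adv}}^{\text{img}}
  + \lambda_{\text{seq}}\,\mathcal{L}_{\text{adv}}^{\text{seq}}
```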
Experimental Setup
• Datasets
  • KTH Action: videos of 25 subjects performing 6 different types of actions.
  • Penn Action: videos of humans playing sports.
  • Moving GIF: videos of animated animal characters.
  • CAM5: a hurricane video dataset for evaluating irregularly sampled video prediction.
  • Bouncing Ball: videos containing three balls moving in different directions.
• Evaluation metrics
  • Structural Similarity (SSIM)
  • Peak Signal-to-Noise Ratio (PSNR)
  • Learned Perceptual Image Patch Similarity (LPIPS)
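Of the three metrics, PSNR is simple enough to compute by hand: it is 10 · log10(MAX² / MSE), where MAX is the largest possible pixel value. A minimal sketch on list-based images follows; the helper names and the assumption that pixels lie in [0, 1] are illustrative choices.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(a, b)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

truth = [[0.0, 0.5], [1.0, 0.25]]
pred = [[0.1, 0.5], [0.9, 0.25]]
score = psnr(truth, pred)
```

Higher PSNR and SSIM indicate closer agreement with the ground truth, whereas LPIPS is a distance, so lower is better.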
Neural-ODE Comparison
• The table shows that Vid-ODE significantly outperforms all other baselines in both interpolation and extrapolation tasks.
[Figure: Video interpolation results (KTH Action)]
[Figure: Video extrapolation results (Bouncing Ball)]
Video Interpolation
• As expected, there is some gap between Vid-ODE and the supervised approach, Deep Voxel Flow (DVF).
• However, Vid-ODE outperforms Unsupervised Video Interpolation (UVI) in all cases (especially on MGIF), except for SSIM on KTH Action.
[Table: interpolation results, grouped into unsupervised and supervised methods]
[Figure: Video interpolation results (Penn Action)]
[Figure: Continuous video interpolation (Moving GIF)]
Video Extrapolation
• As shown in the table, Vid-ODE significantly outperforms all other baseline models in all metrics.
• Notably, the performance gap is wider for Moving GIF, which contains more dynamic object movements, indicating Vid-ODE’s superior ability to learn complex dynamics.
[Figure: Continuous video extrapolation (KTH Action)]
[Figure: Video extrapolation results (Penn Action)]
Irregular Video Prediction
• We use the hurricane dataset (CAM5) to test Vid-ODE’s ability to cope with irregularly sampled input.
• The table shows that Vid-ODE is able to process irregularly sampled video frames.
• In addition, we measure MSE and LPIPS on the CAM5 dataset while changing the input’s sampling rate to evaluate the effect of irregularity.
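One simple way to vary the input sampling rate is to randomly drop frames from a regularly sampled clip and keep the surviving timestamps. The sketch below is a hypothetical protocol for illustration; the function name, the rule of always keeping the first frame, and the seeding are assumptions, not the authors' exact procedure.

```python
import random

def sample_irregular(num_frames, keep_ratio, seed=0):
    """Return sorted indices of frames kept after random dropping (first frame always kept)."""
    rng = random.Random(seed)
    num_keep = min(num_frames, max(2, round(num_frames * keep_ratio)))
    kept = rng.sample(range(1, num_frames), num_keep - 1)
    return [0] + sorted(kept)

indices = sample_irregular(num_frames=10, keep_ratio=0.5)
gaps = [b - a for a, b in zip(indices, indices[1:])]  # uneven gaps mean irregular input
```

Lowering `keep_ratio` produces sparser input with more uneven gaps, which is the condition under which a continuous-time model is expected to help most.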
[Figure: Irregular video prediction results (CAM5)]
RNN vs ODE
• To emphasize the need for learning the continuous video dynamics using
the ODE, we compare Vid-ODE to Vid-RNN.
• Vid-RNN replaces ODE components in Vid-ODE with ConvGRU while
retaining all other components.
[Figure: Vid-RNN versus Vid-ODE]
RNN vs ODE
• Vid-ODE successfully infers video frames at unseen timesteps thanks to learning the underlying video dynamics.
• Vid-RNN generates unrealistic video frames because it simply blends two adjacent latent representations.
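A toy example hints at why blending adjacent latent states can fall short of following the dynamics. Assume a 2D latent state that rotates on the unit circle (an illustrative choice, not the learned dynamics of either model). A linear blend of two states a quarter turn apart leaves the circle, while integrating the ODE to the intermediate time stays on it.

```python
import math

def derivative(h):
    """Rotation dynamics dh/dt = (-y, x): trajectories stay on a circle."""
    x, y = h
    return (-y, x)

def rk4_step(h, dt):
    """One classical Runge-Kutta (RK4) step for the dynamics above."""
    def shifted(base, direction, scale):
        return (base[0] + scale * direction[0], base[1] + scale * direction[1])
    k1 = derivative(h)
    k2 = derivative(shifted(h, k1, dt / 2))
    k3 = derivative(shifted(h, k2, dt / 2))
    k4 = derivative(shifted(h, k3, dt))
    return (h[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            h[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def integrate(h, t0, t1, num_steps=100):
    dt = (t1 - t0) / num_steps
    for _ in range(num_steps):
        h = rk4_step(h, dt)
    return h

def blend(h_a, h_b, alpha):
    """Linear interpolation between two latent states."""
    return tuple((1 - alpha) * a + alpha * b for a, b in zip(h_a, h_b))

def norm(h):
    return math.hypot(h[0], h[1])

h_start = (1.0, 0.0)
h_end = integrate(h_start, 0.0, math.pi / 2)   # state a quarter turn later
h_blend = blend(h_start, h_end, 0.5)           # midpoint by blending
h_ode = integrate(h_start, 0.0, math.pi / 4)   # midpoint by following the dynamics
```

The blended state has a norm of roughly 0.71, so it is not a state the system ever visits, whereas the integrated state keeps a norm of 1.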
Conclusion
• We propose Vid-ODE, which leverages the continuous nature of neural ODEs to generate video frames at any given timestep.
• We demonstrate its ability to generate high-quality video frames in the continuous-time domain using four real-world video datasets.
• In future work, we plan to study how to adopt a flexible structure to address the auto-regressive architecture of Vid-ODE.
Thank you!

[AAAI 2021] Vid-ODE: Continuous-Time Video Generation with Neural Ordinary Differential Equation
