Interactive Control over Temporal Consistency while Stylizing Video Streams
Problem: Per-frame stylization of videos often leads to temporal flickering
Input Per-Frame Stylization
Further, most of the techniques do not provide consistency control.
Video Watercolorization using Bidirectional Texture Advection, Bousseau et al., Transactions on Graphics, 2007.
Processing Images and Video for an Impressionist Effect, Peter Litwinowicz, SIGGRAPH, 1997.
Style Specific Offline Processing
Stylizing Animation By Example, Bénard et al., Transactions on Graphics, 2013.
Stylizing Video By Example, Jamriška et al., Transactions on Graphics, 2019.
Fišer et al., Color Me Noisy: Example-based Rendering of Hand-colored Animations with Temporal Noise Control, EGSR 2014.
Temporal inconsistency can add to the artistic look and feel.
To cater to the needs of live video streaming or conferencing.
Stylizing a live video conferencing session
Src: https://towardsdatascience.com/fancy-and-custom-neural-style-transfer-filters-for-video-conferencing-7eba2be1b6d5
Characteristics of a practical tool for stylizing video streams:
• Handles a wide range of stylization techniques
• Provides interactive temporal consistency control
• Processes high-resolution video streams with low latency
Aspects | Bonneel et al., SIGGRAPH 2015 | Yao et al., MM 2017 | Lai et al., ECCV 2018 | Shekhar et al., VMV 2019 | Thiomonier et al., ICME 2021 | Ours
Requires pre-processing? | No | Yes | No | Yes | No | No
Provides consistency control? | Yes | No | No | Yes | No | Yes
Provides interactive consistency control? | No | N/A | N/A | Yes | N/A | Yes
N/A = Not Applicable
These methods do not require knowledge of the underlying stylization technique.
However, what about interactive consistency control?
Bonneel et al., Blind Video Temporal Consistency, SIGGRAPH 2015
Yao et al., Occlusion-aware Video Temporal Consistency, MM 2017
Lai et al., Learning Blind Video Temporal Consistency, ECCV 2018
Shekhar et al., Consistent Filtering of Videos and Dense Light-Fields Without Optic-Flow, VMV 2019
Thiomonier et al., Learning Long Term Style Preserving Blind Video Temporal Consistency, ICME 2021
Temporal consistency (𝜆)
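[Figure: pipeline diagram, steps 1-3. Nodes: $I_{t-1}$, $I_t$, $I_{t+1}$; $P_{t-1}$, $P_t$, $P_{t+1}$; $O_{t-1}$; weights $w_p$, $w_n$; $L_t$ (linear combination, "Use $w_p$ and $w_n$ for combining"); $G_t$.]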
Input:
𝐼𝑡−1, 𝐼𝑡, 𝐼𝑡+1 -- Input images at time instance 𝑡 − 1, 𝑡 , 𝑡 + 1
𝑃𝑡−1, 𝑃𝑡, 𝑃𝑡+1 -- Per-frame stylized images at time instance 𝑡 − 1, 𝑡 , 𝑡 + 1
𝑂𝑡−1 -- Output at previous time instance 𝑡 − 1
Output:
𝑂𝑡 -- Output at time instance 𝑡 ?
Input (at time instance $t$): per-frame stylized results $P_{t-1}, P_t, P_{t+1}$, input images $I_{t-1}, I_t, I_{t+1}$, and the previous output $O_{t-1}$
$\Gamma$ is a warping function towards time instance $t$
Global Consistency
$G_t = \Gamma(O_{t-1})$
• Simple yet effective
• Leads to a loss of stylization (in terms of colors and textures)
• Warping errors keep getting propagated
Local Consistency
$w_p = \exp(-\alpha \lVert I_t - \Gamma(I_{t-1}) \rVert^2)$
$w_n = \exp(-\alpha \lVert I_t - \Gamma(I_{t+1}) \rVert^2)$
$L_t = w_p \cdot \Gamma(P_{t-1}) + w_n \cdot \Gamma(P_{t+1}) + (1 - w_p - w_n) \cdot P_t$
• Backward and forward warping reduces artifacts due to occlusion and flow inaccuracies
• Preserves local temporal variations
• Cannot reduce inconsistencies significantly
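The global and local terms can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the weights are taken to be per-pixel with the norm summed over color channels, the `warp` helper is a hypothetical nearest-neighbour stand-in for Γ, and the caps `k1` and `k2` follow the clamping that the deck describes for $w_p$ and $w_n$. The function names and default values are not from the slides.

```python
import numpy as np

def warp(img, flow):
    """Backward-warp img (H, W, C) towards time t using flow (H, W, 2) = (dx, dy).

    Nearest-neighbour lookup with border clamping; a simple stand-in for Γ.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[src_y, src_x]

def similarity_weight(i_t, i_warped, alpha, k_max):
    """Per-pixel exp(-alpha * ||I_t - Γ(I_neighbour)||^2), capped at k_max."""
    sq_err = np.sum((i_t - i_warped) ** 2, axis=-1, keepdims=True)
    return np.minimum(np.exp(-alpha * sq_err), k_max)

def local_and_global(i_t, p_t, i_prev_w, p_prev_w, i_next_w, p_next_w, o_prev_w,
                     alpha=10.0, k1=0.4, k2=0.4):
    """Return (w_p, w_n, L_t, G_t); inputs suffixed _w are already warped to time t."""
    w_p = similarity_weight(i_t, i_prev_w, alpha, k1)
    w_n = similarity_weight(i_t, i_next_w, alpha, k2)
    # L_t blends the warped neighbouring stylized frames with the current one.
    l_t = w_p * p_prev_w + w_n * p_next_w + (1.0 - w_p - w_n) * p_t
    # G_t is the previous output warped to time t.
    g_t = o_prev_w
    return w_p, w_n, l_t, g_t
```

With perfectly matching warped inputs the weights saturate at `k1` and `k2`, so $P_t$ keeps the remaining share `1 - k1 - k2` of the blend.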
[Figure: full pipeline diagram, steps 1-5. Nodes: $I_{t-1}$, $I_t$, $I_{t+1}$; $P_{t-1}$, $P_t$, $P_{t+1}$; $O_{t-1}$; weights $w_p$, $w_n$; $L_t$ and $G_t$; two linear-combination stages ("Use $w_p$ and $w_n$ for combining"); $A_t$; optimization solving that produces $O_t$.]
$\arg\min_{O_t} \int \left( \lVert \nabla O_t - \nabla P_t \rVert^2 + w_s \lVert O_t - A_t \rVert^2 \right) dx$
Data Term: $\lVert \nabla O_t - \nabla P_t \rVert^2$ (high-frequency details from $P_t$)
Smoothness Term: $\lVert O_t - A_t \rVert^2$ (temporally consistent content from $A_t$)
Weighting Parameter: $w_s$
$P_t$ - Per-frame stylized
$A_t$ - Temporally consistent
$O_t$ - Per-frame output
• Formulation is similar to that employed by Bonneel et al., SIGGRAPH 2015 and Shekhar et al., VMV 2019
• Our novelty is the way in which we construct the consistent image $A_t$
• Through an adaptive combination, the consistent image preserves both local and global consistency aspects
$A_t = (1 - w_p) \cdot L_t + w_p \cdot G_t$
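One way to make the optimization concrete: setting the variation of the energy to zero gives the screened Poisson equation $(w_s - \Delta) O_t = w_s A_t - \Delta P_t$, which a simple Jacobi iteration can solve. The sketch below assumes a 5-point Laplacian with replicated borders and a Jacobi solver; the slides do not state which discretization or solver the authors use, so treat both as illustrative choices, and the function names as hypothetical.

```python
import numpy as np

def neighbor_sum(x):
    """Sum of the 4-connected neighbours of every pixel, replicating the border."""
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def consistent_image(l_t, g_t, w_p):
    """A_t = (1 - w_p) * L_t + w_p * G_t."""
    return (1.0 - w_p) * l_t + w_p * g_t

def solve_output(p_t, a_t, w_s, iterations=150):
    """Jacobi iterations for (w_s - Laplacian) O = w_s * A - Laplacian(P).

    p_t and a_t have shape (H, W, C); w_s is a scalar or an (H, W, 1) map.
    """
    lap_p = neighbor_sum(p_t) - 4.0 * p_t  # discrete Laplacian of P_t
    o = p_t.copy()                         # start from the per-frame result
    for _ in range(iterations):
        o = (w_s * a_t - lap_p + neighbor_sum(o)) / (w_s + 4.0)
    return o
```

Where `w_s` is close to zero the update mostly reproduces the gradients of $P_t$; where it is large the result is pulled towards $A_t$.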
• We want to invoke the Smoothness Term only when the warping accuracy is sufficiently high. $w_s$ is thus driven by the similarity of the warped input image $A_t^I$ to $I_t$:
$A_t^I = w_p \cdot \Gamma(I_{t-1}) + w_n \cdot \Gamma(I_{t+1}) + (1 - w_p - w_n) \cdot I_t$
$w_s = \lambda \cdot \exp(-\alpha \lVert I_t - A_t^I \rVert^2)$
• We clamp the weights $w_p$ and $w_n$ such that $0 < w_p < k_1$, $0 < w_n < k_2$, and $0 < k_1 + k_2 < 1$
• We can control the degree of temporal consistency by varying $k_1$ and $\lambda$
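The smoothness weight can be sketched as follows, assuming per-pixel weights, a norm summed over color channels, and inputs that are already warped towards time $t$. The function name and argument layout are hypothetical; `w_p` and `w_n` are expected to be clamped already as described above.

```python
import numpy as np

def smoothness_weight(i_t, i_prev_w, i_next_w, w_p, w_n, lam, alpha):
    """w_s = lam * exp(-alpha * ||I_t - A_t^I||^2), evaluated per pixel.

    i_prev_w and i_next_w are the neighbouring input frames warped to time t.
    """
    # A_t^I: the same blend that builds L_t, applied to the input frames.
    a_i = w_p * i_prev_w + w_n * i_next_w + (1.0 - w_p - w_n) * i_t
    sq_err = np.sum((i_t - a_i) ** 2, axis=-1, keepdims=True)
    return lam * np.exp(-alpha * sq_err)
```

If the warped neighbours match $I_t$ exactly, the blend equals $I_t$ and the weight reaches its maximum $\lambda$; larger warping errors shrink it towards zero, so the solve falls back to the per-frame result.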
[Figure: comparison panels: Per-frame Stylized; Only Global Consistency ($A_t = G_t$); Only Local Consistency ($A_t = L_t$); Our full approach ($A_t$ as a linear combination of $G_t$ and $L_t$).]
We require interactive performance, and the bottleneck in this regard is slow flow-based warping. To overcome this, we develop a fast optical-flow neural network model.
[Figure: scatter plot of frames per second (higher is better, axis 0-90) against Sintel final test EPE (lower is better, axis 0-9) for GMA, RAFT, VCN, ours, liteflownet2, pwcnet, flownet2, arflow, and spynet.]
Neural network compression steps:
(a) Remove DenseNet connections
(b) Remove last flow estimator
(c) Separable convolutions in refinement
(d) Prune 40% of channels
Results in a speedup factor of approx. 2.8, from 30 FPS to 85 FPS on an RTX 2080.
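To give a feel for why step (c) helps, the sketch below compares weight counts for a standard convolution and a depthwise-separable one, and checks the quoted speedup factor. It assumes "separable" means depthwise-separable; the 3x3 kernel and 128 channels are hypothetical values chosen only for illustration, not the network's actual layer sizes.

```python
def standard_conv_weights(kernel, c_in, c_out):
    """Weights in a dense kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out

def separable_conv_weights(kernel, c_in, c_out):
    """Depthwise kernel x kernel filter per input channel plus a 1x1 pointwise mix."""
    return kernel * kernel * c_in + c_in * c_out

# Hypothetical layer size, for illustration only.
dense = standard_conv_weights(3, 128, 128)
separable = separable_conv_weights(3, 128, 128)
reduction = dense / separable

# Speedup quoted on the slide: 30 FPS before compression, 85 FPS after.
speedup = 85 / 30
print(dense, separable, round(reduction, 2), round(speedup, 2))
```

The weight reduction of a single layer does not translate directly into frame rate; the measured end-to-end figure is the one reported on the slide.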
[Figure: bar chart of runtime in milliseconds (axis 0-90) on an RTX 3090 for 640 x 480 px, 1280 x 720 px, 1920 x 1080 px, and 1920 x 1080 px (Fast Preset), broken down into Optical Flow, Stabilization, and Total.]
“Fast preset” = downscale the flow computation by 2x and use only 50 iterations of stabilization optimization instead of 150.
[Chart annotation: 25 fps]
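The resampling side of the fast preset can be sketched as below. The flow network itself is out of scope here, and the helper names are hypothetical. The one detail worth showing is that a flow field estimated at half resolution must have its vectors multiplied by 2 when it is brought back to full resolution, because displacements are measured in pixels.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (H and W assumed even)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upscale_flow2(flow_half):
    """Nearest-neighbour upsample a half-resolution flow field to full resolution.

    The displacement vectors are doubled so they are expressed in full-resolution pixels.
    """
    up = np.repeat(np.repeat(flow_half, 2, axis=0), 2, axis=1)
    return 2.0 * up
```

A typical use would be to run the flow estimator on `downscale2(frame)` for both frames and pass the result through `upscale_flow2` before warping the full-resolution images.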
[Video comparison: Per-Frame Stylized | Bonneel et al. [SIGGRAPH Asia 2015] | Lai et al. [ECCV 2018] | Ours]
[Figure: bar chart of A/B test votes, Others vs. Ours, for the comparisons against Lai, Bonneel, and Ours-obj.*; bar values 132, 128, 127 and 39, 43, 44 on an axis from 0 to 140.]
*Ours-objective = best performing on benchmarks (vs. Ours = subjectively determined parameters)
For 19 participants and 9 different videos, we compare our method against Bonneel et al., Lai et al., and Ours-objective through a total of 171 randomized A/B tests.
We ask the participants to select the output which best preserves:
(i) temporal consistency and
(ii) similarity with the per-frame processed video.
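A quick arithmetic check ties the study size to the chart values. The pairing of bars below is inferred from the fact that each pair sums to the number of trials; which bar of a pair belongs to which method is deliberately left open, since the extracted chart does not make that explicit.

```python
participants, videos = 19, 9
trials = participants * videos  # one A/B trial per participant and video

# Bar values from the chart, paired so that each pair sums to the trial count.
vote_pairs = [(132, 39), (128, 43), (127, 44)]
for major, minor in vote_pairs:
    assert major + minor == trials

# Share of votes going to the larger bar in each comparison.
shares = [round(major / trials, 3) for major, _ in vote_pairs]
print(trials, shares)
```

Each comparison therefore accounts for all 171 trials, which suggests the stated total is per comparison.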
[Video comparison: Per-Frame Processed | Stabilized - Ours]
Lowering 𝑘1/𝜆 and increasing 𝛼 can remove these artifacts
Prompt: 1920’s car in a roundabout, old movie
Per-Frame Processed: Img2Img Stable Diffusion Stabilized - Ours
• By combining local and global consistency aspects we can achieve consistency while preserving stylization
• Reasonable flow accuracy estimated by a lightweight flow network is enough for making stylized videos consistent
• Existing objective metrics for temporal consistency do not capture the subjective preference
• We propose the first approach that provides interactive consistency control for per-frame stylized videos
• A novel temporal consistency term that combines local and global consistency aspects
• Fast optical-flow inference is achieved by developing a lightweight flow network architecture based on PWC-Net
• The entire pipeline is GPU-based and can handle video streams at full-HD resolution
Future Work
• Use learning-based temporal denoising for local consistency to further improve the quality of results
• Explore the usage of depth-based and saliency-based masks to spatially vary consistency
Thank you
Website and Code!
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Interactive Control over Temporal Consistency while Stylizing Video Streams

  • 1.
  • 2. Problem: Per-frame stylization of videos often leads to temporal flickering. Panels: Input; Per-Frame Stylization.
  • 3. Further, most of the techniques do not provide consistency control. Video Watercolorization using Bidirectional Texture Advection, Bousseau et al., Transactions on Graphics, 2007. Processing Images and Video for an Impressionist Effect, Peter Litwinowicz, SIGGRAPH, 1997. Style Specific Offline Processing. Stylizing Animation By Example, Bénard et al., Transactions on Graphics, 2013. Stylizing Video By Example, Jamriška et al., Transactions on Graphics, 2019.
  • 4. Temporal inconsistency can add to the artistic look and feel. Fišer et al., Color Me Noisy: Example-based Rendering of Hand-colored Animations with Temporal Noise Control, EGSR 2014.
  • 5. Motivation: catering to the needs of live video streaming or conferencing. Figure: stylizing a live video conferencing session. Src: https://towardsdatascience.com/fancy-and-custom-neural-style-transfer-filters-for-video-conferencing-7eba2be1b6d5
  • 6. Characteristics of a practical tool for stylizing video streams:
      • Should be capable of handling a wide range of stylization techniques
      • Should provide interactive temporal consistency control
      • Should be capable of low-latency processing of high-resolution video streams
  • 7. Comparison of aspects across methods (N/A = Not Applicable):
      Aspect | Bonneel et al., SIGGRAPH 2015 | Yao et al., MM 2017 | Lai et al., ECCV 2018 | Shekhar et al., VMV 2019 | Thimonier et al., ICME 2021 | Ours
      Requires pre-processing? | No | Yes | No | Yes | No | No
      Provides consistency control? | Yes | No | No | Yes | No | Yes
      Provides interactive consistency control? | No | N/A | N/A | Yes | N/A | Yes
      The listed methods do not require knowledge of the underlying stylization technique. However, what about interactive consistency control?
      References: Bonneel et al., Blind Video Temporal Consistency, SIGGRAPH 2015. Yao et al., Occlusion-aware Video Temporal Consistency, MM 2017. Lai et al., Learning Blind Video Temporal Consistency, ECCV 2018. Shekhar et al., Consistent Filtering of Videos and Dense Light-Fields Without Optic-Flow, VMV 2019. Thimonier et al., Learning Long Term Style Preserving Blind Video Temporal Consistency, ICME 2021.
  • 9.
  • 10. Pipeline diagram, steps 1 to 3. Diagram labels: $I_{t-1}$, $I_t$, $I_{t+1}$, $P_{t-1}$, $P_t$, $P_{t+1}$, $O_{t-1}$, $w_p$, $w_n$, $L_t$, $G_t$; Linear combination; Use $w_p$ and $w_n$ for combining.
      Input: $I_{t-1}, I_t, I_{t+1}$: input images at time instances $t-1$, $t$, $t+1$. $P_{t-1}, P_t, P_{t+1}$: per-frame stylized images at time instances $t-1$, $t$, $t+1$. $O_{t-1}$: output at the previous time instance $t-1$.
      Output: $O_t$: output at time instance $t$ (marked "?" on the slide).
  • 11. Input (at time instance $t$): per-frame stylized results $P_{t-1}, P_t, P_{t+1}$, input images $I_{t-1}, I_t, I_{t+1}$, and the previous output $O_{t-1}$. $\Gamma$ is a warping function towards time instance $t$.
      Global Consistency: $G_t = \Gamma(O_{t-1})$
      • Simple yet effective
      • Leads to a loss of stylization (in terms of colors and textures)
      • Warping errors keep getting propagated
      Local Consistency: $L_t = w_p \cdot \Gamma(P_{t-1}) + w_n \cdot \Gamma(P_{t+1}) + (1 - w_p - w_n) \cdot P_t$, with $w_p = \exp(-\alpha \lVert I_t - \Gamma(I_{t-1}) \rVert^2)$ and $w_n = \exp(-\alpha \lVert I_t - \Gamma(I_{t+1}) \rVert^2)$
      • Backward and forward warping reduces artifacts due to occlusion and flow inaccuracies
      • Preserves local temporal variations
      • Cannot reduce inconsistencies significantly
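The two formulas on slide 11 can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the frames passed in are taken to be already warped by $\Gamma$, the squared norm is evaluated per pixel over the color channels, and the `upper` argument stands in for the clamping bounds $k_1$, $k_2$ from slide 14. The function names are hypothetical and not taken from the authors' code.

```python
import numpy as np

def similarity_weight(I_t, I_warped, alpha, upper):
    """w = exp(-alpha * ||I_t - I_warped||^2), clamped from above.

    Assumption: the norm is taken per pixel over the channel axis,
    so the result has shape (H, W, 1).
    """
    diff = np.asarray(I_t, dtype=np.float64) - np.asarray(I_warped, dtype=np.float64)
    w = np.exp(-alpha * np.sum(diff ** 2, axis=-1, keepdims=True))
    return np.minimum(w, upper)

def local_consistency(P_prev_w, P_t, P_next_w, w_p, w_n):
    """L_t = w_p * Gamma(P_{t-1}) + w_n * Gamma(P_{t+1}) + (1 - w_p - w_n) * P_t."""
    return w_p * P_prev_w + w_n * P_next_w + (1.0 - w_p - w_n) * P_t

# G_t = Gamma(O_{t-1}) is simply the warped previous output, so it needs no helper.

rng = np.random.default_rng(0)
I_t = rng.random((4, 4, 3))
P_prev_w, P_t, P_next_w = (rng.random((4, 4, 3)) for _ in range(3))

# A perfectly matching previous frame gives exp(0) = 1, clamped to 0.4.
w_p = similarity_weight(I_t, I_t, alpha=10.0, upper=0.4)
# A badly matching next frame gives a weight close to zero.
w_n = similarity_weight(I_t, I_t + 1.0, alpha=10.0, upper=0.4)
L_t = local_consistency(P_prev_w, P_t, P_next_w, w_p, w_n)
```

With these inputs the badly warped neighbor is effectively ignored, so `L_t` blends only the warped previous stylized frame and the current one.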
  • 12. Pipeline diagram, steps 1 to 5. Diagram labels: $I_{t-1}$, $I_t$, $I_{t+1}$, $P_{t-1}$, $P_t$, $P_{t+1}$, $O_{t-1}$, $w_p$, $w_n$, $L_t$, $G_t$, $A_t$, $O_t$; Linear combination (twice); Optimization Solving; Use $w_p$ and $w_n$ for combining.
      Input: $I_{t-1}, I_t, I_{t+1}$: input images at time instances $t-1$, $t$, $t+1$. $P_{t-1}, P_t, P_{t+1}$: per-frame stylized images at time instances $t-1$, $t$, $t+1$. $O_{t-1}$: output at the previous time instance $t-1$.
      Output: $O_t$: output at time instance $t$ (marked "?" on the slide).
  • 13. $\arg\min_{O_t} \int \lVert \nabla O_t - \nabla P_t \rVert^2 + w_s \lVert O_t - A_t \rVert^2$
      Data Term, $\lVert \nabla O_t - \nabla P_t \rVert^2$: high-frequency details from $P_t$. Smoothness Term, $\lVert O_t - A_t \rVert^2$: temporally consistent content from $A_t$. Weighting Parameter: $w_s$.
      $P_t$: per-frame stylized. $A_t$: temporally consistent. $O_t$: per-frame output.
      • Formulation is similar to that employed by Bonneel et al., SIGGRAPH 2015, and Shekhar et al., VMV 2019
      • Our novelty is the way in which we construct the consistent image $A_t$
      • Through an adaptive combination, the consistent image preserves both local and global consistency aspects: $A_t = (1 - w_p) \cdot L_t + w_p \cdot G_t$
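To make the energy on slide 13 concrete, the sketch below builds $A_t$ and minimizes the energy with plain Jacobi iterations. The slides only say that an iterative optimization is used, so the choice of Jacobi, the 5-point Laplacian, and the replicated image border are assumptions of this example. Setting the derivative of the energy to zero gives the screened Poisson equation $w_s O_t - \Delta O_t = w_s A_t - \Delta P_t$, which is what the loop solves. All function names are hypothetical.

```python
import numpy as np

def neighbor_sum(X):
    """Sum of the four axis-aligned neighbors, replicating the image border."""
    Xp = np.pad(X, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return Xp[:-2, 1:-1] + Xp[2:, 1:-1] + Xp[1:-1, :-2] + Xp[1:-1, 2:]

def combine(L_t, G_t, w_p):
    """A_t = (1 - w_p) * L_t + w_p * G_t."""
    return (1.0 - w_p) * L_t + w_p * G_t

def solve_output(P_t, A_t, w_s, iterations=150):
    """Jacobi iterations for  w_s * O - lap(O) = w_s * A_t - lap(P_t).

    w_s may be a positive scalar or an array of shape (H, W, 1).
    """
    P_t = np.asarray(P_t, dtype=np.float64)
    A_t = np.asarray(A_t, dtype=np.float64)
    lap_P = neighbor_sum(P_t) - 4.0 * P_t           # discrete Laplacian of P_t
    rhs = w_s * A_t - lap_P
    O = P_t.copy()                                   # start from the stylized frame
    for _ in range(iterations):
        O = (rhs + neighbor_sum(O)) / (w_s + 4.0)    # Jacobi update
    return O

# If A_t equals P_t, the stylized frame is already the minimizer.
rng = np.random.default_rng(1)
P = rng.random((8, 8, 3))
O_same = solve_output(P, P, w_s=1.0, iterations=20)

# With flat images, the output is pulled to the consistent image A_t.
P_flat = np.full((8, 8, 3), 0.2)
A_flat = np.full((8, 8, 3), 0.7)
O_flat = solve_output(P_flat, A_flat, w_s=4.0, iterations=100)
```

Larger values of `w_s` pull the result toward `A_t`, while smaller values keep it closer to the gradients of `P_t`.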
  • 14.
      • We want to invoke the Smoothness Term only when the warping accuracy is sufficiently high. $w_s$ is thus driven by the similarity of the warped input image $A_t^I$ to $I_t$: $A_t^I = w_p \cdot \Gamma(I_{t-1}) + w_n \cdot \Gamma(I_{t+1}) + (1 - w_p - w_n) \cdot I_t$ and $w_s = \lambda \cdot \exp(-\alpha \lVert I_t - A_t^I \rVert^2)$
      • We clamp the weights $w_p$ and $w_n$ such that $0 < w_p < k_1$ and $0 < w_n < k_2$, with $0 < k_1 + k_2 < 1$
      • We can control the degree of temporal consistency by varying $k_1$ and $\lambda$
      Energy: $\arg\min_{O_t} \int \lVert \nabla O_t - \nabla P_t \rVert^2 + w_s \lVert O_t - A_t \rVert^2$, with the Data Term (high-frequency details from $P_t$) and the Smoothness Term (temporally consistent content from $A_t$). $O_t$: per-frame output. $P_t$: per-frame stylized. $A_t$: temporally consistent.
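The weight $w_s$ from slide 14 can be sketched as follows. As before, this is an illustration under assumptions: the neighboring input frames are taken to be already warped, the weights $w_p$ and $w_n$ are taken to be already clamped, and the norm is evaluated per pixel over the color channels. The function name is hypothetical.

```python
import numpy as np

def smoothness_weight(I_t, I_prev_w, I_next_w, w_p, w_n, alpha, lam):
    """w_s = lam * exp(-alpha * ||I_t - A_t^I||^2).

    A_t^I is the same blend that is used for the stylized frames,
    but applied to the (warped) input images.
    """
    A_I = w_p * I_prev_w + w_n * I_next_w + (1.0 - w_p - w_n) * I_t
    diff = I_t - A_I
    return lam * np.exp(-alpha * np.sum(diff ** 2, axis=-1, keepdims=True))

I_t = np.full((2, 2, 3), 0.5)

# Perfect warping: A_t^I equals I_t, so w_s takes its maximum value lam.
w_s_good = smoothness_weight(I_t, I_t, I_t, 0.4, 0.4, alpha=10.0, lam=2.0)

# Inaccurate warping of the previous frame lowers w_s.
w_s_bad = smoothness_weight(I_t, I_t + 0.5, I_t, 0.4, 0.4, alpha=10.0, lam=2.0)
```

In this example a warping error in one neighbor lowers `w_s`, so the Smoothness Term has less influence at those pixels.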
  • 15. Comparison panels: Per-frame Stylized; Only Global Consistency ($A_t = G_t$); Only Local Consistency ($A_t = L_t$); Our Full Approach ($A_t$ as a linear combination of $G_t$ and $L_t$).
  • 16. Pipeline diagram, steps 1 to 5. Diagram labels: $I_{t-1}$, $I_t$, $I_{t+1}$, $P_{t-1}$, $P_t$, $P_{t+1}$, $O_{t-1}$, $w_p$, $w_n$, $L_t$, $G_t$, $A_t$, $O_t$; Linear combination (twice); Optimization Solving; Use $w_p$ and $w_n$ for combining.
      Input: $I_{t-1}, I_t, I_{t+1}$: input images at time instances $t-1$, $t$, $t+1$. $P_{t-1}, P_t, P_{t+1}$: per-frame stylized images at time instances $t-1$, $t$, $t+1$. $O_{t-1}$: output at the previous time instance $t-1$.
      Output: $O_t$: output at time instance $t$.
      We require interactive performance, and the bottleneck in this regard is slow flow-based warping. To overcome this, we develop a fast optic-flow neural network model.
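For readers unfamiliar with flow-based warping, the following NumPy sketch shows one common way to realize a warping function such as $\Gamma$: backward warping with bilinear interpolation. It is not the authors' implementation. The flow convention (per-pixel offsets `(dx, dy)` that point from the current frame into the neighboring frame) and the clamping at the image border are assumptions of this example.

```python
import numpy as np

def warp(image, flow):
    """Backward-warp `image` (H, W, C) toward the current frame.

    Assumed convention: flow[y, x] = (dx, dy) tells where pixel (x, y)
    of the current frame is found in `image`.
    """
    H, W = image.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Sampling positions, clamped to the image border.
    x = np.clip(xs + flow[..., 0], 0, W - 1)
    y = np.clip(ys + flow[..., 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    fx = (x - x0)[..., None]
    fy = (y - y0)[..., None]
    # Bilinear interpolation between the four surrounding pixels.
    top = (1.0 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1.0 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1.0 - fy) * top + fy * bottom

image = np.arange(12, dtype=np.float64).reshape(3, 4, 1)

zero_flow = np.zeros((3, 4, 2))
identity = warp(image, zero_flow)          # no motion: image is unchanged

shift_flow = np.zeros((3, 4, 2))
shift_flow[..., 0] = 1.0                   # content sits one pixel to the right
shifted = warp(image, shift_flow)
```

In a real pipeline the flow field would come from the optical-flow network, and occluded regions would need extra handling, which this sketch leaves out.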
  • 17. Plot: Sintel final test EPE (lower is better) against frames per second (higher is better). Methods shown: GMA, RAFT, VCN, ours, LiteFlowNet2, PWC-Net, FlowNet2, ARFlow, SPyNet.
  • 18. Neural network compression steps: (a) remove DenseNet connections, (b) remove the last flow estimator, (c) use separable convolutions in the refinement, (d) prune 40% of the channels. This results in a speedup factor of approx. 2.8, from 30 FPS to 85 FPS on an RTX 2080.
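The speedup quoted on slide 18 can be checked with a short calculation, which also converts the two frame rates into per-frame times. Only the figures 30 FPS and 85 FPS come from the slide; the variable names are illustrative.

```python
fps_before = 30.0   # flow network before compression (from the slide)
fps_after = 85.0    # flow network after compression (from the slide)

speedup = fps_after / fps_before
ms_before = 1000.0 / fps_before   # time per frame in milliseconds
ms_after = 1000.0 / fps_after

print(f"speedup: {speedup:.2f}x")
print(f"time per frame: {ms_before:.1f} ms -> {ms_after:.1f} ms")
```

The ratio is about 2.83, which matches the "approx. 2.8" on the slide.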
  • 19. Bar chart: runtime performance on an RTX 3090, time in milliseconds for Optical Flow, Stabilization, and Total at 640 x 480 px, 1280 x 720 px, 1920 x 1080 px, and 1920 x 1080 px (Fast Preset). The chart marks the 25 fps level. "Fast preset" = downscale the flow computation by 2x and use only 50 iterations of the stabilization optimization instead of 150.
  • 20.
  • 21.
  • 22.
  • 23. Comparison panels: Per-Frame Stylized; Bonneel et al. [SIGGRAPH Asia 2015]; Lai et al. [ECCV 2018]; Ours.
  • 24. Comparison panels: Per-Frame Stylized; Bonneel et al. [SIGGRAPH Asia 2015]; Lai et al. [ECCV 2018]; Ours.
  • 25. User study. For 19 participants and 9 different videos, we compare our method against Bonneel et al., Lai et al., and Ours-objective through a total of 171 randomized A/B tests. We ask the participants to select the output which best preserves (i) temporal consistency and (ii) similarity with the per-frame processed video.
      Bar chart (series: Others, Ours; categories: Lai, Bonneel, Ours-obj.). Bar values as extracted: 132, 128, 127 and 39, 43, 44.
      Ours-objective = best performing on benchmarks (vs. Ours = subjectively determined parameters).
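The numbers on slide 25 can be cross-checked with a few lines of arithmetic. The extracted text does not state which bar belongs to which series, so the pairing below (first, second, and third value of each group in the order Lai, Bonneel, Ours-obj.) is an assumption, and the two counts are labeled only as "larger" and "smaller".

```python
participants, videos = 19, 9
tests_per_comparison = participants * videos   # 171

# Assumed pairing of the extracted bar values by position.
bars = {
    "Lai": (132, 39),
    "Bonneel": (128, 43),
    "Ours-obj.": (127, 44),
}

shares = {}
for name, (larger, smaller) in bars.items():
    # Each pair of bars should account for every A/B test of that comparison.
    assert larger + smaller == tests_per_comparison
    shares[name] = larger / tests_per_comparison
    print(f"{name}: {larger} vs. {smaller} votes ({shares[name]:.1%} for the larger bar)")
```

Each pair of bars sums to 171, which equals 19 participants times 9 videos, so the 171 tests appear to apply to each compared method.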
  • 26.
  • 29. Lowering $k_1$/$\lambda$ and increasing $\alpha$ can remove these artifacts.
  • 30. Prompt: 1920’s car in a roundabout, old movie. Per-Frame Processed: Img2Img Stable Diffusion. Stabilized: Ours.
  • 31.
      • By combining local and global consistency aspects, we can achieve consistency while preserving stylization
      • Reasonable flow accuracy estimated by a lightweight flow network is enough for making stylized videos consistent
      • Existing objective metrics for temporal consistency do not capture the subjective preference
  • 32.
      • We propose the first approach that provides interactive consistency control for per-frame stylized videos
      • A novel temporal consistency term that combines local and global consistency aspects
      • Fast optical-flow inference is achieved by developing a lightweight flow network architecture based on PWC-Net
      • The entire pipeline is GPU-based and can handle video streams at full-HD resolution
      Future Work:
      • Use learning-based temporal denoising for local consistency to further improve the quality of results
      • Explore the usage of depth-based and saliency-based masks to spatially vary consistency
  • 33. Tha Website and Code!