Advances in Visual Quality Restoration
with Generative Adversarial Networks
Leonardo Galteri - PhD
University of Florence, MICC
Why does this happen? (in 2019…)
• First Episode of Season 2 had 15M viewers
• Streaming this at reasonable quality costs 0.020 $/GB*
• Resulting cost: -1,125,000 $
*Amazon Cloud outbound traffic
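As a quick sanity check, the quoted total and the price per GB imply a data volume per viewer. The variable names are illustrative, and the assumption that every viewer streams the episode once at the same size is not stated on the slide.

```python
# Figures quoted on the slide
viewers = 15_000_000        # first episode of Season 2
price_per_gb = 0.020        # $/GB, cloud outbound traffic
total_cost = 1_125_000      # $

# Implied data volume per viewer (assumes one equal-sized stream per viewer)
gb_per_viewer = total_cost / (viewers * price_per_gb)
print(f"{gb_per_viewer:.2f} GB per viewer")
```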
Working on the Encoder
ICPR’18
X.264
• Adaptive video coding approach
Deep CNN
Is there Another Way?
Improving Compressed Images
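Given an uncompressed frame $x_{HQ}$, the codec $\mathcal{C}$ produces
$x_{LQ} = \mathcal{C}(x_{HQ}; \theta)$
where $\theta$ are codec parameters. We want to learn a function $G$ such that
$G(x_{LQ}) \approx x_{HQ}$, i.e. $G$ approximates $\mathcal{C}^{-1}(\cdot\,; \theta)$.
[Figure: $x_{LQ}$ and $G(x_{LQ})$]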
ICCV’17
A Deep Residual Network for Reconstruction
ICCV’17
• We use strided convolution to reduce feature map size.
• We avoid checkerboard artifacts with NN upsampling followed by 2 more convolutional layers.
• Trained on 128x128 pixel patches extracted from MS-COCO.
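The nearest-neighbour (NN) upsampling step can be sketched in a few lines. Each output pixel copies exactly one input pixel, so every output location receives the same number of contributions. This is a plain-Python illustration on nested lists, not the network code; the function name and the factor of 2 are assumptions.

```python
def nn_upsample(feature_map, factor=2):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(factor)]     # repeat each column
        out.extend([list(wide) for _ in range(factor)])    # repeat each row
    return out

small = [[1, 2],
         [3, 4]]
big = nn_upsample(small)
```

In a network, the convolutional layers that follow then smooth the blocky result.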
Limitations of MSE and SSIM Losses
JPEG SSIM Loss Original
• SSIM and MSE losses effectively reduce compression artefacts.
• However, reconstructions appear blurry and many details are missing with respect to the uncompressed version of the image.
ICCV’17
Generative Adversarial Network
[Figure: Low Quality Images → G (Generator) → Restored Images (FAKE); Restored Images (FAKE) and High Quality Images (REAL) → D (Discriminator) → REAL or FAKE?]
D is trained to tell apart real from reconstructed images
G is trained to fool D
ICCV’17
The Sub-Patch Discriminator
• 128 x 128 patches are split into smaller 16x16 sub-patches, concatenated with the corresponding input sub-patches and processed by the discriminator.
• The discriminator is trained with a binary cross-entropy loss over all the sub-patches.
ICCV’17
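A minimal sketch of the sub-patch idea, assuming non-overlapping sub-patches and a mean binary cross-entropy over their scores. The helper names are hypothetical, the discriminator itself is omitted, and the concatenation with the input sub-patches is not shown.

```python
import math

def split_into_subpatches(patch, size=16):
    """Split a square 2D list into non-overlapping size x size sub-patches."""
    n = len(patch)
    return [
        [row[c:c + size] for row in patch[r:r + size]]
        for r in range(0, n, size)
        for c in range(0, n, size)
    ]

def bce(predictions, label, eps=1e-7):
    """Mean binary cross-entropy of per-sub-patch scores against one label."""
    total = 0.0
    for p in predictions:
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(predictions)

patch = [[0.0] * 128 for _ in range(128)]
subpatches = split_into_subpatches(patch)
num_subpatches = len(subpatches)   # (128 / 16) ** 2
```

Each 128 x 128 training patch therefore yields 64 real/fake decisions instead of one.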
Generator Loss
ICCV’17
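• The new objective of the Generator is:
$\mathcal{L}_G = \mathcal{L}_P + \lambda \mathcal{L}_{ADV}$
where:
$\mathcal{L}_P = \left( \phi(I^{HQ})_{x,y} - \phi(I^{RQ})_{x,y} \right)^2$
is called perceptual loss, an MSE loss in VGG19 feature space, and:
$\mathcal{L}_{ADV} = -\log D(I^{RQ} \mid I^{LQ})$
is the adversarial loss, which measures how well the generator is fooling the discriminator.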
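The way the two terms combine can be illustrated numerically. In this sketch the feature vectors are plain lists standing in for VGG19 activations, the perceptual term is averaged over features, and the weight `lam=0.01` is a hypothetical value; none of these details come from the slides.

```python
import math

def perceptual_loss(feat_hq, feat_rq):
    """MSE between two flattened feature vectors (stand-ins for VGG19 features)."""
    return sum((a - b) ** 2 for a, b in zip(feat_hq, feat_rq)) / len(feat_hq)

def adversarial_loss(d_score, eps=1e-7):
    """-log D(I_RQ | I_LQ): small when D believes the restored image is real."""
    return -math.log(max(d_score, eps))

def generator_loss(feat_hq, feat_rq, d_score, lam=0.01):
    """L_G = L_P + lambda * L_ADV."""
    return perceptual_loss(feat_hq, feat_rq) + lam * adversarial_loss(d_score)

loss = generator_loss([0.0, 0.0], [1.0, 3.0], d_score=0.5)
```

A discriminator score close to 1 drives the adversarial term towards zero, so the generator is rewarded for restorations that D accepts as real.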
Effect of Sub-Patch Discriminator
• This technique reduces the mosquito noise present in reconstructions.
W/o Sub-Patch With Sub-Patch
ICCV’17
Predicting QF
• We train a CNN regressor, named QF predictor, to drive a finite Ensemble of Generators
• We use the most appropriate Generator to restore the image
[Figure: 𝑥 → QF Predictor → Model Switcher → 𝐺(𝜃 = 𝜃𝑛; 𝑥), selected among 𝐺(𝜃 = 𝜃0; 𝑥), …, 𝐺(𝜃 = 𝜃𝑁; 𝑥)]
TMM’19
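The model switcher can be sketched as a lookup over the ensemble. The selection rule used here (pick the generator whose training QF is nearest to the prediction) is an assumption, and the generators are placeholder functions, not trained models.

```python
def select_generator(predicted_qf, generators):
    """Return the (QF, generator) pair whose training QF is closest to the prediction."""
    nearest_qf = min(generators, key=lambda qf: abs(qf - predicted_qf))
    return nearest_qf, generators[nearest_qf]

# Hypothetical ensemble: one placeholder "generator" per training QF
ensemble = {qf: (lambda x, qf=qf: f"restored({x}, QF={qf})") for qf in (10, 20, 30, 40)}

qf, g = select_generator(23.4, ensemble)   # 23.4 stands in for the QF predictor output
restored = g("x")
```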
Quality Prediction Results
TMM’19
Qualitative Results
JPEG AR-CNN GAN ORIGINAL
TMM’19
Subjective Evaluation
• DSIS setup: the test image is compared to the original and similarity is scored in 0-100
• We compare SSIM Loss vs Adversarial Training using the same Generator architecture.
• Subjects have a strong preference for GAN restored images over SSIM ones.
Method MOS Std. Dev.
SSIM 49.51 22.72
GAN 68.32 20.75
TMM’19
Object Detection Results
Class   GAN AP gain @QF 20
Dog     +18.6
Cat     +16.6
Sheep   +14.3
Cow     +12.5
• Use an object detector, Faster R-CNN, to assess the visual quality of restored images
• Compute mAP on PASCAL VOC using several JPEG quality factors and the corresponding reconstructions.
• Large increase in detector performance
• Largest gainers are deformable ’furry’ objects such as animals
TMM’19
Enter MobileNetV2
• MobileNetV2 was originally proposed to reduce the computational burden of CNNs
• Depthwise separable convolutions are a drop-in replacement for convolutional layers
• Inverted residual blocks better propagate gradients across layers and are more memory efficient
Sandler CVPR’18
Residual Block Inverted Residual Block
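The saving from depthwise separable convolutions shows up in a simple weight count. The kernel size and channel widths below are illustrative values, not the Generator's actual configuration, and biases are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

standard = conv_params(3, 64, 64)
separable = depthwise_separable_params(3, 64, 64)
reduction = standard / separable
```

For a 3x3 layer with 64 input and 64 output channels this gives 36864 versus 4672 weights, close to an 8x reduction.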
A Deep Residual Network for Reconstruction
• Keep the Generator identical except for the Inverted Residual Blocks!
• Train on the small DIV2K dataset
• Augmentation: resizing 256, 384 and 512; random crops of 224x224; mirror flipping.
[Figure: Generator with Inverted Residual Blocks]
CAIP’19
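The augmentation recipe listed for DIV2K training can be sketched as a parameter sampler. It assumes the image is resized to a square of the drawn size and that flipping happens with probability 0.5; neither detail is given on the slide, and the function name is hypothetical.

```python
import random

def sample_augmentation(rng, sizes=(256, 384, 512), crop=224):
    """Draw one set of augmentation parameters for a square-resized image."""
    size = rng.choice(sizes)               # resize target
    left = rng.randint(0, size - crop)     # crop offset, bounds inclusive
    top = rng.randint(0, size - crop)
    flip = rng.random() < 0.5              # horizontal mirror
    return {"size": size, "box": (left, top, left + crop, top + crop), "flip": flip}

params = sample_augmentation(random.Random(0))
```

The returned box uses (left, top, right, bottom) coordinates and always lies inside the resized image.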
Qualitative Results
CAIP’19
All videos 720p
Method      Bit/Pixel   FPS
RAW         12          -
ICCV’17     0.146       4 on Titan Xp GPU
Fast        0.146       20 on Titan Xp GPU
Very Fast   0.146       42 on Titan Xp GPU
Wave.ONE    0.0570      3 on V100 GPU; 1.6 on Titan Xp GPU*
Fast vs Wave.ONE: Bit/Pixel x3, FPS x12
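The x3 and x12 factors in the Fast vs Wave.ONE comparison follow from the reported figures, as this short check shows. The dictionary layout is only for illustration.

```python
fast = {"bpp": 0.146, "fps": 20.0}        # Fast, Titan Xp GPU
wave_one = {"bpp": 0.0570, "fps": 1.6}    # Wave.ONE, Titan Xp GPU*

bitrate_ratio = fast["bpp"] / wave_one["bpp"]   # about 2.6, shown as x3
speed_ratio = fast["fps"] / wave_one["fps"]     # about 12.5, shown as x12
```

The Fast model spends roughly 2.6 times the bits per pixel and runs about 12.5 times faster on the same GPU.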
No-Reference Evaluation
• NIQE and BRISQUE rate GAN images as ’more natural’ than the original ones!
• VIIDEO is likely penalizing reconstruction for lack of temporal coherence
CAIP’19
Specialized Artifact Removal
ACM MM’19 Best Demo
• GANs are well known to work well when the distribution is simpler
• Faces are possibly the most interesting object we are willing to transmit
• Here is what we can do with a severe degradation and a specialized GAN
Semantic Coding + GAN
ACM MM’20
[Figure: Transmitter: Original frame, Semantic mask; Receiver: Received frame]
Semantic Segmentation
ACM MM’20
• Use BiSeNet to label each pixel detected as face or neck as foreground, the remainder as background.
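Turning a per-pixel label map into the binary foreground mask can be sketched as follows. The string labels are illustrative stand-ins; a real segmentation network such as BiSeNet would output integer class IDs, and the mapping to face and neck depends on its label set.

```python
FOREGROUND_CLASSES = {"face", "neck"}   # foreground per the slide; label names are illustrative

def foreground_mask(label_map):
    """Binary mask M: 1 for face/neck pixels, 0 for everything else."""
    return [[1 if label in FOREGROUND_CLASSES else 0 for label in row]
            for row in label_map]

labels = [["hair", "face", "face"],
          ["cloth", "neck", "background"]]
mask = foreground_mask(labels)
```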
[Figure: Generator Loss combining face and background terms through the mask 𝑀]
[Plots: LPIPS and BRISQUE, Our Method vs Baseline; annotated ~30% improvement and ~55% improvement]
RESULTS – QUALITATIVE 1 min
COMPRESSED
COMPRESSED + SALIENCY
COMPRESSED + SALIENCY + GAN
Evaluation using Language
ACM MM ASIA’21
Best Paper Award
• Task: “evaluate the weighted combination of all of the visually significant attributes of an image”
• Use an image captioning algorithm to evaluate the fine semantics of the image
Input   Caption
JPEG    A statue of a woman wearing a christmas tie
GAN     A brown and white dog wearing a tie
HQ      A brown and white dog wearing a red tie
Assessment Methodology
ACM MM ASIA’21
Best Paper Award
[Figure: Input Image → Language Model → Predicted Caption; Reference Image → Language Model → Pseudo Ground Truth Caption; both captions → Language Metric (example score 0.752)]
• “GT” caption can be generated from the reference image in case captions are not available (the usual case)
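The scoring step can be illustrated with a simple token-overlap F1 between the predicted caption and the pseudo ground truth caption. This metric is a stand-in chosen for brevity, not one of the captioning metrics used in the work. The captions are the ones from the example slide, and their assignment to JPEG, GAN and HQ follows the column order there, which is an assumption.

```python
from collections import Counter

def unigram_f1(predicted, reference):
    """Token-overlap F1 between two captions (a stand-in language metric)."""
    pred, ref = predicted.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Caption of the reference (HQ) image acts as pseudo ground truth
pseudo_gt = "A brown and white dog wearing a red tie"
score_gan = unigram_f1("A brown and white dog wearing a tie", pseudo_gt)
score_jpeg = unigram_f1("A statue of a woman wearing a christmas tie", pseudo_gt)
```

Under this toy metric the caption of the restored image scores higher than the caption of the compressed image, which is the behaviour the methodology relies on.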
Evaluating Enhanced Images
ACM MM ASIA’21
Best Paper Award
• Our approach assigns higher scores to GAN reconstructions (REC) of JPEG compressed images across a wide range of QFs
• Results are consistent for all captioning metrics across all qualities/enhancements
Changing the Captioning Model
ACM MM ASIA’21
Best Paper Award
• Using a better captioning model increases the correlation with MOS
• Visual features are shared among [1] and [2]
[1] M. Cornia, M. Stefanini, L. Baraldi, and R. Cucchiara. Meshed-memory transformer for image captioning. In Proc. of CVPR 2020
[2] P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang. Bottom-up and top-down attention for image captioning and visual question answering. In Proc. of CVPR 2018
Qualitative Analysis
ACM MM ASIA’21
Best Paper Award
JPEG GAN
A man riding a wave on a surfboard in the ocea
A couple of people sitting next to a christmas tree.
What’s Next for Restoration?
• Innovative technologies for restoration replacing GANs
• Smaller architectures and new ways to train them
• Blind restoration
References
L. Galteri, M. Bertini, L. Seidenari, A. Del Bimbo, ‘Video Compression for Object Detection Algorithms’, ICPR 2018
L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Deep Generative Adversarial Compression Artifact Removal’, IEEE ICCV 2017
L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Deep Universal Generative Adversarial Compression Artifact Removal’, IEEE TMM 2019
L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Towards Real-Time Image Enhancement GANs’, CAIP 2019
L. Galteri, L. Seidenari, M. Bertini, T. Uricchio, A. Del Bimbo, ‘Fast Video Quality Enhancement Using GANs’, ACM MM Best Demo 2019
L. Galteri, M. Bertini, L. Seidenari, T. Uricchio, A. Del Bimbo, ‘Increasing Video Perceptual Quality with GANs and Semantic Coding’, ACM MM 2020
L. Galteri, L. Seidenari, P. Bongini, M. Bertini, A. Del Bimbo, ‘Language Based Image Quality Enhancement’, ACM MM Asia Best Paper Award 2021

Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdfLeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 

Advances in Visual Quality Restoration with Generative Adversarial Networks

  • 1. Advances in Visual Quality Restoration with Generative Adversarial Networks Leonardo Galteri - PhD University of Florence, MICC
  • 3. −1,125,000 $ Why does this happen? (in 2019…) • The first episode of Season 2 had 15M viewers • Streaming it at reasonable quality costs 0.020 $/GB* (*Amazon Cloud outbound traffic)
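The slide gives the total cost, the audience size and the price per GB, but not the data volume per viewer. Assuming the total is simply viewers × GB per viewer × price (an assumption, not stated on the slide), the implied volume can be back-computed with a short sketch:

```python
# Figures taken from the slide.
viewers = 15_000_000      # viewers of the first episode
price_per_gb = 0.020      # $/GB, outbound cloud traffic
total_cost = 1_125_000    # $ shown on the slide

# Assumption: total_cost = viewers * gb_per_viewer * price_per_gb.
implied_gb_per_viewer = total_cost / (viewers * price_per_gb)
print(f"Implied data volume per viewer: {implied_gb_per_viewer:.2f} GB")
```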
  • 4. Working on the Encoder ICPR’18 • Adaptive video coding approach
  • 5. Working on the Encoder ICPR’18 X.264 • Adaptive video coding approach
  • 6. Deep CNN Is there Another Way?
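  • 7. Improving Compressed Images (ICCV’17) • Given an uncompressed frame $x_{HQ}$, the compressed frame is $x_{LQ} = \mathcal{C}(x_{HQ}; \theta)$, where $\theta$ are the codec parameters. • We want to learn a function $G(x_{LQ}) \approx \mathcal{C}^{-1}(x_{HQ}; \theta)$. • Figure: $x_{LQ}$ and $G(x_{LQ})$.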
  • 8. A Deep Residual Network for Reconstruction (ICCV’17) • We use strided convolution to reduce the feature map size. • We avoid checkerboard artifacts by using nearest-neighbour (NN) upsampling followed by 2 more convolutional layers. • Trained on 128x128-pixel patches extracted from MS-COCO.
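Checkerboard artifacts are commonly associated with transposed convolutions whose kernels overlap unevenly; nearest-neighbour upsampling avoids this by copying each value uniformly before the following convolutions. A minimal pure-Python sketch of the upsampling step alone (the function name and the factor of 2 are illustrative assumptions; the convolutions are omitted):

```python
def upsample_nearest(image, factor=2):
    """Repeat every pixel `factor` times along both axes."""
    out = []
    for row in image:
        # Repeat each value horizontally.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out

small = [[1, 2],
         [3, 4]]
big = upsample_nearest(small, factor=2)
for row in big:
    print(row)
```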
  • 9. Limitations of MSE and SSIM Losses (ICCV’17) • Panels: JPEG, SSIM Loss, Original. • SSIM and MSE losses effectively reduce compression artefacts. • However, reconstructions appear blurry and many details are missing with respect to the uncompressed version of the image.
  • 10. Generative Adversarial Network (ICCV’17) • Diagram: the Generator G turns Low Quality Images into Restored Images (FAKE); the Discriminator D receives High Quality Images (REAL) or Restored Images and answers REAL or FAKE? • D is trained to tell real images apart from reconstructed ones. • G is trained to fool D.
  • 11. The Sub-Patch Discriminator (ICCV’17) • 128x128 patches are split into smaller 16x16 sub-patches, concatenated with the corresponding input sub-patches and processed by the discriminator. • The discriminator is trained with a binary cross-entropy loss over all the sub-patches.
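The sub-patch idea can be sketched in pure Python. This is only an illustration under stated assumptions: sub-patches are taken as non-overlapping tiles, the concatenation with the input sub-patch is omitted, and the discriminator scores are hypothetical constants rather than outputs of a real network.

```python
import math

def split_into_subpatches(patch, size=16):
    """Split a square 2-D patch (list of rows) into non-overlapping size x size tiles."""
    n = len(patch)
    tiles = []
    for top in range(0, n, size):
        for left in range(0, n, size):
            tiles.append([row[left:left + size] for row in patch[top:top + size]])
    return tiles

def binary_cross_entropy(probs, label, eps=1e-12):
    """Mean BCE over discriminator outputs in (0, 1) that all share one label."""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(probs)

patch = [[0.0] * 128 for _ in range(128)]
subpatches = split_into_subpatches(patch)

# Hypothetical discriminator scores, one per sub-patch.
scores = [0.5] * len(subpatches)
loss_real = binary_cross_entropy(scores, label=1)
print(len(subpatches), round(loss_real, 4))
```

With these sizes a 128x128 patch yields 8 × 8 = 64 sub-patches, so the loss averages 64 decisions per patch instead of one.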
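  • 12. Generator Loss (ICCV’17) • The new objective of the Generator is $\mathcal{L}_G = \mathcal{L}_P + \lambda \mathcal{L}_{ADV}$, where $\mathcal{L}_P = \left(\phi(I_{HQ})_{x,y} - \phi(I_{RQ})_{x,y}\right)^2$ is called the perceptual loss, an MSE loss in VGG19 feature space (the squared difference is averaged over the feature positions $x, y$), and $\mathcal{L}_{ADV} = -\log D(I_{RQ} \mid I_{LQ})$ is the adversarial loss, which measures how well the generator fools the discriminator.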
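The combined objective can be illustrated with a small pure-Python sketch. The feature vectors stand in for VGG19 activations, and the value of λ and the discriminator scores are hypothetical, chosen only to show how the two terms interact.

```python
import math

def perceptual_loss(feat_hq, feat_rq):
    """Mean squared difference between two equally sized feature lists."""
    return sum((a - b) ** 2 for a, b in zip(feat_hq, feat_rq)) / len(feat_hq)

def adversarial_loss(d_score):
    """-log D(I_RQ | I_LQ): small when the discriminator rates the restoration as real."""
    return -math.log(d_score)

def generator_loss(feat_hq, feat_rq, d_score, lam=0.01):
    """L_G = L_P + lambda * L_ADV (lam is an illustrative value, not from the slides)."""
    return perceptual_loss(feat_hq, feat_rq) + lam * adversarial_loss(d_score)

# Stand-in feature vectors (hypothetical, not real VGG19 activations).
phi_hq = [0.2, 0.8, 0.5]
phi_rq = [0.1, 0.6, 0.5]

fooled = generator_loss(phi_hq, phi_rq, d_score=0.9)  # discriminator is fooled
caught = generator_loss(phi_hq, phi_rq, d_score=0.1)  # discriminator spots the fake
print(fooled < caught)
```

For a fixed perceptual term, the loss is lower when the discriminator score is closer to 1, which is the behaviour the adversarial term is meant to encourage.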
  • 13. Effect of Sub-Patch Discriminator (ICCV’17) • This technique reduces the mosquito noise present in reconstructions. • Panels: W/o Sub-Patch, With Sub-Patch.
  • 14. Predicting QF (TMM’19) • We train a CNN regressor, named QF predictor, to drive a finite Ensemble of Generators. • We use the most appropriate Generator to restore the image. • Diagram: the input $x$ goes to the QF Predictor; the Model Switcher then selects $G(\theta = \theta_n; x)$ from the ensemble $G(\theta = \theta_0; x)$ … $G(\theta = \theta_N; x)$.
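A minimal sketch of the model-switching step. It assumes that "most appropriate" means the generator trained at the quality factor nearest to the prediction; the ensemble contents and the predicted value are hypothetical placeholders.

```python
def select_generator(predicted_qf, generators):
    """Return the (qf, generator) pair whose training QF is closest to the prediction."""
    return min(generators.items(), key=lambda item: abs(item[0] - predicted_qf))

# Hypothetical ensemble: one generator per training quality factor.
generators = {qf: f"G_qf{qf}" for qf in (10, 20, 30, 40)}

# Hypothetical output of the QF predictor for one image.
chosen_qf, chosen_generator = select_generator(23.7, generators)
print(chosen_qf, chosen_generator)
```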
  • 16. Qualitative Results JPEG AR-CNN GAN ORIGINAL TMM’19
  • 17. Subjective Evaluation (TMM’19) • DSIS setup: the test image is compared to the original and their similarity is scored from 0 to 100. • We compare SSIM loss vs adversarial training using the same Generator architecture. • Subjects have a strong preference for GAN-restored images over SSIM ones. • Results (MOS, std. dev.): SSIM 49.51, 22.72; GAN 68.32, 20.75.
  • 18. Object Detection Results (TMM’19) • Use an object detector, Faster R-CNN, to assess the visual quality of restored images. • Compute mAP on PASCAL VOC using several JPEG quality factors and the corresponding reconstructions. • Large increase in detector performance. • The largest gainers are deformable ’furry’ objects such as animals. • GAN AP gain at QF 20 by class: Dog +18.6, Cat +16.6, Sheep +14.3, Cow +12.5.
  • 19. Enter MobileNetV2 (Sandler, CVPR’18) • MobileNetV2 was originally proposed to reduce the computational burden of CNNs. • Depthwise separable convolutions are a drop-in replacement for convolutional layers. • Inverted residual blocks propagate gradients across layers better and are more memory efficient. • Figure: Residual Block vs Inverted Residual Block.
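The saving from depthwise separable convolutions can be made concrete by counting weights. The sketch below compares a standard k×k convolution with a depthwise k×k convolution followed by a 1×1 pointwise convolution; the layer sizes are illustrative and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights of a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 kernel, 64 input and 64 output channels.
std = standard_conv_params(3, 64, 64)
sep = depthwise_separable_params(3, 64, 64)
print(std, sep, round(std / sep, 2))
```

For this layer the separable version needs roughly an eighth of the weights, which is why it is attractive when the goal is a faster generator.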
  • 20. A Deep Residual Network for Reconstruction (CAIP’19) • Keep the Generator identical except for the Inverted Residual Blocks! • Train on the small DIV2K dataset. • Augmentation: resizing to 256, 384 and 512; random 224x224 crops; mirror flipping.
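The augmentation recipe can be sketched as a sampler of augmentation parameters. Several details are assumptions, since the slide does not state them: the image is taken to be resized to a square of the chosen size, the crop position is uniform, and the flip probability is 0.5.

```python
import random

def sample_augmentation(rng, sizes=(256, 384, 512), crop=224):
    """Draw one random resize / crop / flip configuration."""
    size = rng.choice(sizes)              # target size after resizing (assumed square)
    top = rng.randint(0, size - crop)     # crop origin, inclusive bounds
    left = rng.randint(0, size - crop)
    flip = rng.random() < 0.5             # mirror flip with assumed probability 0.5
    return {"resize": size, "crop_box": (top, left, crop, crop), "flip": flip}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = [sample_augmentation(rng) for _ in range(5)]
for p in params:
    print(p)
```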
  • 21. Qualitative Results (CAIP’19, all videos 720p) • ICCV’17: 0.146 bit/pixel, 4 FPS on Titan Xp GPU • RAW: 12 bit/pixel, FPS -
  • 22. Qualitative Results (CAIP’19, all videos 720p) • Very Fast: 0.146 bit/pixel, 42 FPS on Titan Xp GPU • RAW: 12 bit/pixel, FPS -
  • 23. Qualitative Results (CAIP’19, all videos 720p) • Fast: 0.146 bit/pixel, 20 FPS on Titan Xp GPU • RAW: 12 bit/pixel, FPS -
  • 24. Qualitative Results (CAIP’19, all videos 720p) • Wave.ONE: 0.0570 bit/pixel, 3 FPS on V100 GPU • RAW: 12 bit/pixel, FPS -
  • 25. Qualitative Results (CAIP’19, all videos 720p) • Wave.ONE: 0.0570 bit/pixel, 1.6 FPS on Titan Xp GPU* • Fast: 0.146 bit/pixel (x3), 20 FPS on Titan Xp GPU (x12)
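The "x3" and "x12" annotations on slide 25 appear to be rounded ratios between the Fast model and Wave.ONE. A quick check using only the figures on the slide:

```python
# Figures from slide 25.
wave_one = {"bit_per_pixel": 0.0570, "fps": 1.6}
fast_gan = {"bit_per_pixel": 0.146, "fps": 20}

bitrate_ratio = fast_gan["bit_per_pixel"] / wave_one["bit_per_pixel"]
speed_ratio = fast_gan["fps"] / wave_one["fps"]
print(f"bitrate: x{bitrate_ratio:.2f}, speed: x{speed_ratio:.1f}")
```

The bitrate ratio comes out near 2.6 and the speed ratio near 12.5, which is consistent with the rounded values shown on the slide.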
  • 26. No-Reference Evaluation (CAIP’19) • NIQE and BRISQUE rate GAN images as ’more natural’ than the original ones! • VIIDEO is likely penalizing reconstructions for their lack of temporal coherence.
  • 27. Specialized Artifact Removal (ACM MM’19 Best Demo) • GANs are well known to work well when the distribution is simpler. • Faces are possibly the most interesting object we are willing to transmit. • Here is what we can do with a severe degradation and a specialized GAN.
  • 33. Semantic Coding + GAN ACM MM’20 Received frame Receiver
  • 34. Semantic Segmentation ACM MM’20 • Use BiSeNet to label each pixel detected as face or neck as foreground, the remainder as background.
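Turning a per-pixel label map into a foreground/background mask can be sketched as follows. The label names are hypothetical placeholders; the actual class names or ids produced by BiSeNet are not given in the slides.

```python
# Labels treated as foreground, following the slide (face or neck).
FOREGROUND_LABELS = {"face", "neck"}

def foreground_mask(label_map):
    """Return a binary mask: 1 for face/neck pixels, 0 for everything else."""
    return [[1 if label in FOREGROUND_LABELS else 0 for label in row]
            for row in label_map]

# Tiny hypothetical label map standing in for a segmentation output.
labels = [["hair", "face", "face"],
          ["background", "neck", "cloth"]]
mask = foreground_mask(labels)
for row in mask:
    print(row)
```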
  • 35. Semantic Segmentation (ACM MM’20) • Diagram: Generator Loss with mask 𝑀 (face, background).
  • 36. Semantic Segmentation (ACM MM’20) • Plots: LPIPS and BRISQUE, Our Method vs Baseline; annotated improvements: ~30% and ~55%.
  • 37. RESULTS – QUALITATIVE 1 min COMPRESSED
  • 38. RESULTS – QUALITATIVE 1 min COMPRESSED + SALIENCY
  • 39. RESULTS – QUALITATIVE 1 min COMPRESSED + SALIENCY + GAN
  • 40. Evaluation using Language • Task: “evaluate the weighted combination of all of the visually significant attributes of an image” Quality ACM MM ASIA’21 Best Paper Award
  • 41. Evaluation using Language (ACM MM ASIA’21 Best Paper Award) • Use an image captioning algorithm to evaluate the fine semantics of the image. • Example captions: JPEG: “A statue of a woman wearing a christmas tie”; GAN: “A brown and white dog wearing a tie”; HQ: “A brown and white dog wearing a red tie”.
  • 42. Assessment Methodology (ACM MM ASIA’21 Best Paper Award)
  • 43. Assessment Methodology (ACM MM ASIA’21 Best Paper Award)
  • 44. Assessment Methodology (ACM MM ASIA’21 Best Paper Award) • Diagram: Input Image → Language Model → Predicted Caption; Reference Image → Language Model → Pseudo Ground Truth Caption; both captions → Language Metric (example score: 0.752). • The “GT” caption can be generated from the reference image in case captions are not available (the usual case).
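The assessment pipeline compares the caption of the input image with a pseudo ground-truth caption of the reference image. The sketch below uses a simple unigram F1 overlap as a stand-in for the language metric; it is not one of the captioning metrics used in the paper. The captions are the example strings from slide 41, and assigning them to JPEG, GAN and HQ follows the label order on that slide, which is an assumption.

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """Token-overlap F1 between two captions (a stand-in language metric)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection of tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Captions as they appear on slide 41 (assignment to images is assumed).
pseudo_gt = "A brown and white dog wearing a red tie"         # reference (HQ) image
caption_gan = "A brown and white dog wearing a tie"           # GAN-restored image
caption_jpeg = "A statue of a woman wearing a christmas tie"  # JPEG image

score_gan = unigram_f1(caption_gan, pseudo_gt)
score_jpeg = unigram_f1(caption_jpeg, pseudo_gt)
print(round(score_gan, 3), round(score_jpeg, 3))
```

Even this crude metric ranks the GAN caption above the JPEG caption, which matches the intuition that lost detail changes what a captioner describes.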
  • 45. Evaluating Enhanced Images (ACM MM ASIA’21 Best Paper Award) • Our approach scores GAN reconstructions (REC) of JPEG-compressed images higher across a wide range of QFs. • Results are consistent for all captioning metrics across all qualities/enhancements.
  • 46. Changing the Captioning Model (ACM MM ASIA’21 Best Paper Award) • Using a better captioning model increases the correlation with MOS. • Visual features are shared among [1] and [2]. [1] M. Cornia, M. Stefanini, L. Baraldi, and R. Cucchiara. Meshed-memory transformer for image captioning. In Proc. of CVPR 2020. [2] P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang. Bottom-up and top-down attention for image captioning and visual question answering. In Proc. of CVPR 2018.
  • 47. Qualitative Analysis (ACM MM ASIA’21 Best Paper Award) • Panels: JPEG, GAN. • Captions: “A man riding a wave on a surfboard in the ocean”; “A couple of people sitting next to a christmas tree.”
  • 48. What’s Next for Restoration? • Innovative technologies for restoration replacing GANs • Smaller architectures and new ways to train them • Blind restoration
  • 49. References • L. Galteri, M. Bertini, L. Seidenari, A. Del Bimbo, ‘Video Compression for Object Detection Algorithms’, ICPR 2018 • L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Deep Generative Adversarial Compression Artifact Removal’, IEEE ICCV 2017 • L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Deep Universal Generative Adversarial Compression Artifact Removal’, IEEE TMM 2019 • L. Galteri, L. Seidenari, M. Bertini, A. Del Bimbo, ‘Towards Real-Time Image Enhancement GANs’, CAIP 2019 • L. Galteri, L. Seidenari, M. Bertini, T. Uricchio, A. Del Bimbo, ‘Fast Video Quality Enhancement Using GANs’, ACM MM 2019 (Best Demo) • L. Galteri, M. Bertini, L. Seidenari, T. Uricchio, A. Del Bimbo, ‘Increasing Video Perceptual Quality with GANs and Semantic Coding’, ACM MM 2020 • L. Galteri, L. Seidenari, P. Bongini, M. Bertini, A. Del Bimbo, ‘Language Based Image Quality Assessment’, ACM MM Asia 2021 (Best Paper Award)
