Institut Mines-Télécom
Compression meets Deep Learning:
Breakthrough or breakdown?
Report on research in
DL+Compression @ Multimedia group
Marco Cagnazzo, Attilio Fiandrotti, Andrei Purica
Context
DL and compression
Connection with compression problems
■ Learn the best choices for classical encoders
• E.g., fast mode decision, rate allocation
■ Improve classical tasks of compression algorithms
• Probability models
• Block prediction
• Segmentation for object-based coding
• MPEG contributions
■ Paradigm shift in signal representation
• Autoencoders, GANs
[Figure: IMPACT scale with labels Decisive, Incremental, Disruptive]
Outline
■ Ongoing work @ MM group
• Airplane screen content video compression
• Virtual viewpoint synthesis and super-resolution
• Subjective quality comparison of DL-based compression
algorithms
• Other possible applications
■ The ML-Compression working group
■ Conclusions
Airplane screen content video compression
■ Airplane screen content: critical text information overlaid on a
natural-image or synthetic background
• Sensors, navigation and positioning information, etc.
• No direct access to these data, only to the captured screens
■ Compression is required for
several use cases
■ Semantic (or object-based)
video coding
• Text is recognized and
encoded as such
• Perfect text reconstruction
at the decoder side
Semantic coding
■ Deep learning is a key component of such
schemes since it enables reliable
detection of the semantic information
■ Three NN architectures tested (complexity vs. accuracy trade-off)
■ First results: up to 90% rate reduction with respect to the state of the
art at the same quality, or a 4.6 dB PSNR improvement
■ Example:
• HEVC-SCC: 0.018 bpp, 33.1 dB
• Proposed: 0.007 bpp, 38.2 dB
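The example figures can be turned into a rate saving and a PSNR gain with a few lines of arithmetic. The sketch below only uses the two operating points quoted on the slide; the function and variable names are illustrative assumptions. Note that this single example differs in both rate and quality, so it is not directly comparable to the "same quality" or "same rate" headline figures.

```python
def rate_reduction(ref_bpp, new_bpp):
    """Relative rate saving of new_bpp with respect to ref_bpp, in percent."""
    return 100.0 * (1.0 - new_bpp / ref_bpp)

# Operating points quoted in the example (bits per pixel, PSNR in dB).
hevc_scc = {"bpp": 0.018, "psnr_db": 33.1}
proposed = {"bpp": 0.007, "psnr_db": 38.2}

saving = rate_reduction(hevc_scc["bpp"], proposed["bpp"])
gain = proposed["psnr_db"] - hevc_scc["psnr_db"]

print(f"rate saving: {saving:.1f}%  PSNR gain: {gain:.1f} dB")
```

For this example the proposed scheme uses about 61% fewer bits while being about 5.1 dB better in PSNR.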
Virtual viewpoint synthesis
[Figure: real cameras and virtual viewpoints in a 3D scene with x, y, z axes]
View Synthesis Reference Software (MPEG)
[Figure: VSRS block diagram. Blocks: 3D back-projection (x2), Merging, Filling holes. Inputs: Reference Homography Matrix (x2), Synthesis Homography Matrix]
Proposed scheme
[Figure: proposed block diagram. Blocks: 3D back-projection (x2), CNN-based merge. Inputs: Reference Homography Matrix (x2), Synthesis Homography Matrix]
CNN-based merge
Architecture derived from a video super-resolution technique
[Figure: Concatenate, Convolutional Layer 1, Convolutional Layer 2]
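A minimal sketch of the concatenate-then-convolve structure named on this slide, in plain Python. Everything beyond "concatenate, convolutional layer 1, convolutional layer 2" is an assumption made for illustration: that the inputs are the two back-projected views, the 3x3 kernel size, the zero padding, the ReLU between layers, and the hand-set weights (a real network would learn them).

```python
def conv2d(channels, kernels, bias):
    """Zero-padded 3x3 convolution over a list of 2-D channels.
    kernels[o][i] is the 3x3 kernel from input channel i to output channel o."""
    h, w = len(channels[0]), len(channels[0][0])
    out = []
    for o, per_input in enumerate(kernels):
        plane = [[bias[o]] * w for _ in range(h)]
        for i, k in enumerate(per_input):
            for y in range(h):
                for x in range(w):
                    acc = 0.0
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            yy, xx = y + dy, x + dx
                            if 0 <= yy < h and 0 <= xx < w:
                                acc += k[dy + 1][dx + 1] * channels[i][yy][xx]
                    plane[y][x] += acc
        out.append(plane)
    return out

def relu(channels):
    return [[[max(0.0, v) for v in row] for row in plane] for plane in channels]

def cnn_merge(view_a, view_b, layer1, layer2):
    """Concatenate two views along the channel axis, then apply two conv layers."""
    stacked = [view_a, view_b]
    hidden = relu(conv2d(stacked, *layer1))
    return conv2d(hidden, *layer2)[0]

# Hand-set toy weights (assumption): layer 1 averages the two views,
# layer 2 is an identity kernel. A trained network would replace these.
half = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
layer1 = ([[half, half]], [0.0])
layer2 = ([[identity]], [0.0])

view_a = [[0.0, 2.0], [4.0, 6.0]]
view_b = [[2.0, 2.0], [0.0, 6.0]]
merged = cnn_merge(view_a, view_b, layer1, layer2)
print(merged)
```

With these toy weights the merge reduces to a pixel-wise average of the two views; the point of learning the weights is to do better than such a fixed rule.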
CNN-based view synthesis: results
[Figures: VSRS output, ground truth, proposed method output]
Subjective quality evaluation of DL-compression
methods
■ Deep generative models try to learn the latent distribution generating images
■ A typical architecture is based on auto-encoders, i.e. networks trained to
reproduce their input
■ Autoencoders include an information bottleneck, achieving compression
■ Very low-bitrate compression could also be obtained with GANs
─ Training process stability?
─ Naturalness vs. fidelity
[Figure: 𝑥 → Encoder → Code → Decoder → 𝑦, with loss $L = \lVert x - y \rVert^2$]
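The encoder, code, decoder chain and its squared-error loss can be illustrated with a toy linear autoencoder. This is a sketch under strong assumptions, not one of the evaluated codecs: 2-D inputs, a 1-D code as the bottleneck, purely linear maps, plain gradient descent, and made-up training data lying on a line.

```python
def forward(x, enc, dec):
    """Encode a 2-D input to a scalar code z, then decode back to 2-D."""
    z = enc[0] * x[0] + enc[1] * x[1]
    y = (dec[0] * z, dec[1] * z)
    return z, y

def loss(data, enc, dec):
    """Mean of L = ||x - y||^2 over the data set."""
    total = 0.0
    for x in data:
        _, y = forward(x, enc, dec)
        total += (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    return total / len(data)

def train(data, enc, dec, lr=0.01, steps=2000):
    """Gradient descent on the mean squared reconstruction error."""
    enc, dec = list(enc), list(dec)
    n = len(data)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z, y = forward(x, enc, dec)
            r = (y[0] - x[0], y[1] - x[1])             # reconstruction error
            dz = 2.0 * (r[0] * dec[0] + r[1] * dec[1])  # dL/dz
            for j in range(2):
                g_dec[j] += 2.0 * r[j] * z              # dL/d dec[j]
                g_enc[j] += dz * x[j]                   # dL/d enc[j]
        for j in range(2):
            enc[j] -= lr * g_enc[j] / n
            dec[j] -= lr * g_dec[j] / n
    return enc, dec

# Made-up data on the line x1 = 2 * x0, so one number per sample suffices.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
enc0, dec0 = [0.5, 0.0], [0.5, 0.5]

initial_loss = loss(data, enc0, dec0)
enc, dec = train(data, enc0, dec0)
final_loss = loss(data, enc, dec)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

The 1-D code plays the role of the information bottleneck: each 2-D sample is represented by a single number, and training drives the reconstruction error down. A practical codec would also quantize and entropy-code the code, which this sketch leaves out.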
Subjective quality evaluation of DL-compression
methods
■ Subjective quality evaluation (PSNR is not reliable enough)
■ 6 images, 113 compressed stimuli (uniform span of the
impairment scale)
■ 23 participants
■ Double stimulus impairment scale
■ Four compression methods:
1. Ballé et al.: 3-layer autoencoder with a biologically inspired
nonlinearity and an approximation of rate-distortion optimization
2. Toderici et al.: progressive RNN-based encoder working on 32x32-pixel
patches
3. JP2K: wavelet transform, RDO, arithmetic coding
4. BPG: spatial prediction, variable-size prediction and transform
units, DCT and arithmetic coding
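Scores from a double stimulus impairment scale test are commonly summarized per stimulus as a mean opinion score (MOS) with a confidence interval. The sketch below assumes the usual five-grade impairment scale (5 = imperceptible, 1 = very annoying) and a normal-approximation 95% interval; the 23 scores are invented for illustration and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev

def mos_with_ci(scores, z=1.96):
    """Mean opinion score and half-width of an approximate 95% confidence interval."""
    m = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return m, half_width

# Hypothetical ratings of one compressed stimulus by 23 participants.
scores = [4] * 10 + [3] * 8 + [5] * 5

mos, ci = mos_with_ci(scores)
print(f"MOS = {mos:.2f} +/- {ci:.2f} (n = {len(scores)})")
```

With only 23 participants, a Student-t quantile would give a slightly wider interval than the normal approximation used here.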
[Figure: test images Image 1 and Image 2]
[Figure, Image 1: Ballé at 0.38 bpp and JP2K at 0.43 bpp]
[Figure, Image 2: Toderici at 0.125 bpp and JP2K at 0.1 bpp]
Other applications
■ The flexibility of learning methods makes them suitable for
several other problems in the field of compression and
streaming
• Spatial image prediction
• Probability distribution estimation for lossless coding
• Digital Hologram Compression
• HTTP Adaptive streaming (Q-learning)
[Slide annotations: Digicosme post-doc, BCOM PhD, HUAWEI (CIFRE) PhD]
ML-Compression working group
http://mlcompr.wp.imt.fr
■ Started 4 months ago
■ People
• 2 MdC (associate professors), 2 post-docs, 3 PhD students, interns @ MM
• Possible recruiting of 1-3 PhDs in the next months (CIFRE)
• Researchers from L2S (with PhDs and one post-doc)
• Contributions from other groups (talks, discussions, …)
■ Regular seminars with contributions from
─ IMAGES and S2A groups
─ Former IDS members
• Paris 5, L2S
─ Other universities (Paris 13, Poitiers, CentraleSupélec, …)
─ Companies (Orange, Zodiac, …)
■ Make it an “official research topic” (aka “theme”)?
■ Connection with the Learning theme?
Conclusions
■ Deep learning has triggered a revolution in many
fields: will it be the same for compression?
• Possible, if we consider the impact on closely related fields
(computer vision)
• But not sure: traditional methods still have very
important properties that cannot (yet) be guaranteed
by DL-based methods (robustness, progressivity,
rate-control, low-complexity decoders, …)
■ Will DL provide decisive gains within traditional
architectures?
• Possible, but many difficulties have to be faced
■ Or will DL just be used for incremental
improvements in traditional architectures?
• Almost sure that this minimal target can be achieved
[Figure: IMPACT scale with labels Decisive, Incremental, Disruptive]
Perspectives
■ Increasing activity of the ML-Compression group
• Seminars from AI experts
• Growing network of collaborations
■ Industrial activity
• 2 or 3 CIFRE PhD proposals for next autumn
■ Critical mass?
• Intra-department? NewUni? New recruitment?
Thank you!
Working group: http://mlcompr.wp.imt.fr
Next seminar on July 19th
Subscribe to the mailing list:
https://listes.telecom-paristech.fr/mailman/listinfo/mlcompr
Activity report on Deep-learning based compression

  • 1. Institut Mines-Télécom. Compression meets Deep Learning: Breakthrough or breakdown? Report on research in DL+Compression @ Multimedia group. Marco Cagnazzo, Attilio Fiandrotti, Andrei Purica
  • 2. Context: DL and compression. Connection with compression problems
      ■ Learn the best choices for classical encoders
          • E.g., fast mode decision, rate allocation
      ■ Improve classical tasks of compression algorithms
          • Probability models
          • Block prediction
          • Segmentation for object-based coding
          • MPEG contributions
      ■ Paradigm shift in signal representation
          • Autoencoders, GANs
      IMPACT: Decisive, Incremental, Disruptive
  • 3. Outline
      ■ On-going works @ MM group
          • Airplane screen content video compression
          • Virtual viewpoint synthesis and super-resolution
          • Subjective quality comparison of DL-based compression algorithms
          • Other possible applications
      ■ The ML-Compression working group
      ■ Conclusions
  • 4. Airplane screen content video compression
      ■ Airplane screen content: critical text information embedded over a natural image or synthetic background
          • Sensors, navigation and positioning information, etc.
          • No direct access to these data, only to captured screens
      ■ Compression is required for several use cases
      ■ Semantic (or object-based) video coding
          • Text is recognized and encoded as such
          • Perfect text reconstruction at the decoder side
  • 5. Semantic coding
      ■ Deep learning is a key component of such schemes, since it enables reliable detection of the semantic information
      ■ Three NN architectures tested (complexity vs. accuracy trade-off)
      ■ First results: up to 90% rate reduction with respect to the state of the art for the same quality, or +4.6 dB PSNR improvement
      ■ Example: HEVC-SCC at 0.018 bpp, 33.1 dB; proposed at 0.007 bpp, 38.2 dB
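The single operating point quoted on slide 5 can be turned into a rate saving and a PSNR gain with a few lines of arithmetic. The sketch below is only an illustration of that calculation: the function names are hypothetical, and the slide does not say how its headline figures (up to 90% rate reduction, +4.6 dB) were aggregated, so this one example is not expected to reproduce them.

```python
def rate_saving_percent(reference_bpp: float, proposed_bpp: float) -> float:
    """Relative bitrate saving of the proposed codec versus the reference, in percent."""
    return 100.0 * (reference_bpp - proposed_bpp) / reference_bpp


def psnr_gain_db(reference_psnr_db: float, proposed_psnr_db: float) -> float:
    """PSNR difference in dB (positive means the proposed codec is better)."""
    return proposed_psnr_db - reference_psnr_db


# Operating points quoted on slide 5.
hevc_scc = {"bpp": 0.018, "psnr_db": 33.1}
proposed = {"bpp": 0.007, "psnr_db": 38.2}

saving = rate_saving_percent(hevc_scc["bpp"], proposed["bpp"])
gain = psnr_gain_db(hevc_scc["psnr_db"], proposed["psnr_db"])
print(f"rate saving: {saving:.1f}%, PSNR gain: {gain:.1f} dB")
# prints "rate saving: 61.1%, PSNR gain: 5.1 dB"
```

In this example the proposed scheme is better on both axes at once, which is why neither number alone matches the "same quality" or "same rate" comparisons in the headline results.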
  • 7. Virtual viewpoint synthesis. [Figure: real cameras and virtual viewpoints in an x, y, z coordinate system]
  • 8. View Synthesis Reference Software (MPEG). [Diagram: two 3D back-projection blocks, followed by Merging and Filling holes; inputs labeled Reference Homography Matrix (twice) and Synthesis Homography Matrix]
  • 9. Proposed scheme. [Diagram: two 3D back-projection blocks, followed by a CNN-based merge; inputs labeled Reference Homography Matrix (twice) and Synthesis Homography Matrix]
  • 10. CNN-based merge. Architecture derived from a video super-resolution technique. [Diagram: Concatenate → Convolutional Layer 1 → Convolutional Layer 2]
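The data flow on slide 10 (concatenate the two back-projected views, then apply two convolutional layers) can be sketched in plain Python. Everything specific here is an assumption made for illustration: the 3x3 kernel size, the ReLU after the first layer, and above all the hand-set weights, which simply average the two views. The slides do not give layer sizes, and the actual network would use learned weights.

```python
from typing import List

Plane = List[List[float]]   # one H x W channel
Tensor = List[Plane]        # C x H x W stack of channels


def center_kernel(value: float, size: int = 3) -> Plane:
    """size x size kernel that is zero everywhere except `value` at the center."""
    kernel = [[0.0] * size for _ in range(size)]
    kernel[size // 2][size // 2] = value
    return kernel


def conv2d(x: Tensor, weights: List[List[Plane]], bias: List[float], relu: bool) -> Tensor:
    """Zero-padded 'same' cross-correlation; weights[o][c] maps input c to output o."""
    height, width = len(x[0]), len(x[0][0])
    size = len(weights[0][0])
    pad = size // 2
    out: Tensor = []
    for o, kernels in enumerate(weights):
        plane = [[bias[o]] * width for _ in range(height)]
        for c, kernel in enumerate(kernels):
            for i in range(height):
                for j in range(width):
                    for di in range(size):
                        for dj in range(size):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < height and 0 <= jj < width:
                                plane[i][j] += kernel[di][dj] * x[c][ii][jj]
        if relu:
            plane = [[max(0.0, v) for v in row] for row in plane]
        out.append(plane)
    return out


def merge_views(view_a: Plane, view_b: Plane) -> Plane:
    """Concatenate two warped views and pass them through two conv layers."""
    stacked = [view_a, view_b]                                  # "Concatenate"
    # Hypothetical layer 1: passes each view through unchanged (then ReLU).
    w1 = [[center_kernel(1.0), center_kernel(0.0)],
          [center_kernel(0.0), center_kernel(1.0)]]
    hidden = conv2d(stacked, w1, bias=[0.0, 0.0], relu=True)
    # Hypothetical layer 2: averages the two feature maps into one output view.
    w2 = [[center_kernel(0.5), center_kernel(0.5)]]
    return conv2d(hidden, w2, bias=[0.0], relu=False)[0]


left = [[0.0, 2.0], [4.0, 6.0]]
right = [[2.0, 2.0], [0.0, 2.0]]
merged = merge_views(left, right)
print(merged)  # [[1.0, 2.0], [2.0, 4.0]]
```

With trained kernels in place of the hand-set ones, the same structure lets the network weight each view per pixel instead of averaging blindly, which is the role the merging and hole-filling blocks play in the reference software.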
  • 11. CNN-based view synthesis: results. [Image: VSRS]
  • 12. CNN-based view synthesis: results. [Image: ground truth]
  • 13. CNN-based view synthesis: results. [Image: proposed]
  • 15. Subjective quality evaluation of DL-compression methods
      ■ Deep generative models try to learn the latent distribution generating images
      ■ A typical architecture is based on autoencoders, i.e. networks trained to reproduce their input
      ■ Autoencoders include an information bottleneck, achieving compression
      ■ Very low-bitrate compression could also be obtained with GANs
          • Training process stability?
          • Naturalness vs. fidelity
      [Diagram: x → Encoder → Code → Decoder → y, with loss $L = \|x - y\|^2$]
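The encoder, code, decoder chain on slide 15 can be illustrated with a toy linear autoencoder, reading the loss on the slide as the squared Euclidean error between input x and reconstruction y. The matrices below are hand-set assumptions (the encoder averages pairs of samples, the decoder repeats each code value); a real model would be nonlinear and would learn its weights by minimizing L.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hypothetical weights: 4 input samples are squeezed into a 2-value code.
ENCODER: Matrix = [[0.5, 0.5, 0.0, 0.0],
                   [0.0, 0.0, 0.5, 0.5]]
DECODER: Matrix = [[1.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 1.0]]


def encode(x: Vector) -> Vector:
    """Bottleneck: the code has fewer values than the input."""
    return matvec(ENCODER, x)


def decode(code: Vector) -> Vector:
    """Reconstruct an input-sized vector from the code."""
    return matvec(DECODER, code)


def reconstruction_loss(x: Vector, y: Vector) -> float:
    """Squared Euclidean error ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


x = [1.0, 3.0, 6.0, 6.0]
code = encode(x)                   # [2.0, 6.0]
y = decode(code)                   # [2.0, 2.0, 6.0, 6.0]
loss = reconstruction_loss(x, y)   # 2.0
print(code, y, loss)
```

The second pair of samples is reconstructed exactly because it is constant, while the first pair loses its detail: the bottleneck forces the model to keep only what the code can carry.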
  • 16. Subjective quality evaluation of DL-compression methods
      ■ Subjective quality evaluation (PSNR is not reliable enough)
      ■ 6 images, 113 compressed stimuli (uniform span of the impairment scale)
      ■ 23 participants
      ■ Double stimulus impairment scale
      ■ Four compression methods:
          1. Ballé et al.: 3-layer autoencoder with biologically inspired non-linearity and an approximation of rate-distortion optimization
          2. Toderici et al.: progressive RNN-based encoder working on 32x32-pixel patches
          3. JP2K: wavelet transform, RDO, arithmetic coding
          4. BPG: spatial prediction, variable-size prediction and transform units, DCT and arithmetic coding
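Scores from a double stimulus impairment scale test such as the one on slide 16 are usually summarized per stimulus as a mean opinion score (MOS) with a confidence interval. The sketch below assumes the common five-grade impairment scale (5 = imperceptible, 1 = very annoying) and a normal-approximation 95% interval. The 23 ratings are made-up placeholders for one stimulus, not data from the study.

```python
import math
import statistics
from typing import List, Tuple


def mos_with_ci(scores: List[int], z: float = 1.96) -> Tuple[float, float]:
    """Return (mean opinion score, half-width of the ~95% confidence interval)."""
    mean = statistics.mean(scores)
    # Sample standard deviation over participants, then the standard error of the mean.
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, half_width


# Placeholder ratings from 23 hypothetical participants for a single stimulus.
ratings = [5, 4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 3, 4]

mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Comparing codecs then comes down to plotting MOS against bitrate for each method and checking whether the confidence intervals overlap.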
  • 17. Subjective quality evaluation of DL-compression methods. [Images: Image 1, Image 2]
  • 18. Subjective quality evaluation of DL-compression methods, Image 1. [Images: Ballé, 0.38 bpp; JP2K, 0.43 bpp]
  • 19. Subjective quality evaluation of DL-compression methods, Image 2. [Images: Toderici, 0.125 bpp; JP2K, 0.1 bpp]
  • 21. Other applications
      ■ The flexibility of learning methods makes them suitable for several other problems in the field of compression and streaming
          • Spatial image prediction
          • Probability distribution estimation for lossless coding
          • Digital hologram compression
          • HTTP adaptive streaming (Q-learning)
      Digicosme post-doc; BCOM PhD; HUAWEI (CIFRE) PhD
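Slide 21 names Q-learning as the tool for HTTP adaptive streaming without further detail. As a rough sketch of what that involves, the code below shows the standard tabular Q-learning update, with the player's buffer level as the state and the index of the requested bitrate as the action. The state names, action set, rewards, and learning parameters are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

ACTIONS = (0, 1, 2)  # hypothetical indices of the available bitrate levels

QTable = DefaultDict[Tuple[str, int], float]


def q_update(q: QTable, state: str, action: int, reward: float,
             next_state: str, alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def greedy_action(q: QTable, state: str) -> int:
    """Bitrate index with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q: QTable = defaultdict(float)
# Hypothetical experience: rewards stand in for a quality-of-experience score.
q_update(q, "low_buffer", 0, reward=1.0, next_state="high_buffer")
q_update(q, "high_buffer", 2, reward=2.0, next_state="high_buffer")
q_update(q, "low_buffer", 0, reward=1.0, next_state="high_buffer")

print(greedy_action(q, "high_buffer"))  # 2
```

In a real client the reward would have to balance video quality against rebuffering and quality switches, and the state would likely include a throughput estimate as well as the buffer level.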
  • 23. ML-Compression working group (http://mlcompr.wp.imt.fr)
      ■ Started 4 months ago
      ■ People
          • 2 MdC, 2 post-docs, 3 PhDs, interns @ MM
          • Possible recruiting of 1-3 PhDs in the next months (CIFRE)
          • Researchers from L2S (with PhDs and one post-doc)
          • Contributions from other groups (talks, discussions, …)
      ■ Regular seminars with contributions from
          • IMAGES and S2A groups
          • Former IDS members (Paris 5, L2S)
          • Other universities (Paris 13, Poitiers, CentraleSupélec, …)
          • Companies (Orange, Zodiac, …)
      ■ Make it an “official research topic” (aka “theme”)?
      ■ Connection with the Learning theme?
  • 25. Conclusions
      ■ Deep learning has triggered a revolution in many fields: will it be the same for compression?
          • Possible, if we consider the impact on closely related fields (computer vision)
          • But not certain: traditional methods still have very important properties that cannot (yet) be guaranteed by DL-based methods (robustness, progressivity, rate control, low-complexity decoders, …)
      ■ Will DL provide decisive gains inside traditional architectures?
          • Possible, but many difficulties have to be faced
      ■ Or will DL just be used for incremental improvements in traditional architectures?
          • Almost certain that this minimal target can be achieved
      IMPACT: Decisive, Incremental, Disruptive
  • 26. Perspectives
      ■ Increasing activity of the ML-Compression group
          • Seminars from AI experts
          • Growing network of collaborations
      ■ Industrial activity
          • 2 or 3 CIFRE PhD proposals for next autumn
      ■ Critical mass?
          • Intra-department? NewUni? New recruitment?
  • 27. Thank you!
      ■ Working group: http://mlcompr.wp.imt.fr
      ■ Next seminar on July 19th
      ■ Subscribe to the mailing list: https://listes.telecom-paristech.fr/mailman/listinfo/mlcompr