Machine Learning approaches at
video compression
Roberto Iacoviello
RAI - Radiotelevisione Italiana
Centre for Research, Technological Innovation and
Experimentation (CRITS)
Machine Learning is like sex in high school:
everyone is talking about it, a few know what to
do, and only your teacher is doing it.
There are two or three topics around AI:
Ethics, which sounds to me like: if we don’t teach ethics to
the machines, Skynet will kill all of us.
Academic papers, full of mathematics and
differing notations. After you read them you
feel like: OK, and now?
Then there is real life: sometimes it is good
and sometimes it is bad.
The dear old hybrid block-based approach
VVC (Versatile Video Coding) brings many new tools.
Over 30 years the MPEG group has developed many useful standards,
all based on the same scheme. Now the
group is heading towards new horizons:
neural networks.
Two approaches:
• Non-video approach: coded representation of neural networks
• Neural network video approach:
  • Conservative (one to one): replace one MPEG block with one Deep Learning block
  • Disruptive (end to end): replace the entire MPEG chain
Non-video approach: coded representation of
neural networks
Scope: representation of weights and parameters, not of the architecture
N18162 Marrakech
Coded representation of the weight matrix
Coded representation of neural networks
• Represent different artificial neural networks
• Enable faster inference
• Enable use under resource limitations
Use cases
• Inference may be performed on a large number of devices
• The NNs used in an application can be improved incrementally
• Limitations in terms of processing power and memory
• Several apps would need to store the same base neural network on the device multiple times
W17924 Macao
Type                                            Parameter size
Media content analysis                          From a few KB to several hundred MB
Translation app                                 Currently around 200 MB
Compact Descriptors for Video Analysis (CDVA)   About 500-600 MB
MPEG Use cases
• UC10 Distributed training and evaluation of neural networks for media content analysis
• UC11 Compact Descriptors for Video Analysis (CDVA)
• UC12 Image/Video Compression
• UC13 Distribution of neural networks for content processing
W17924 Macao
Methods
• Dropping connections
• Dropping layers
• Replacing convolutions with lower-dimensional ones → matrix decomposition
• Changing the stride in convolutions without increasing the output size
• Quantization (rate-distortion based)
• Quantization using a codebook
• Entropy coding
Summary: cut something somewhere
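To make "quantization using a codebook" and "entropy coding" concrete, here is a minimal sketch in Python with NumPy. It is not the method of any MPEG contribution: the plain 1-D k-means clustering, the 16-entry codebook and the random toy weights are all assumptions chosen for illustration. The entropy figure is only an estimate of what an ideal entropy coder would spend per weight index.

```python
import numpy as np

def codebook_quantize(weights, n_codes=16, n_iters=20, seed=0):
    """Cluster weights into n_codes centroids with plain 1-D k-means (illustrative choice)."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # Initialise the codebook from randomly chosen weights.
    codebook = rng.choice(flat, size=n_codes, replace=False)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        idx = np.argmin(np.abs(flat[:, None] - codebook[None, :]), axis=1)
        # Move each centroid to the mean of its members (empty clusters are left alone).
        for k in range(n_codes):
            members = flat[idx == k]
            if members.size:
                codebook[k] = members.mean()
    # Final assignment against the updated codebook.
    idx = np.argmin(np.abs(flat[:, None] - codebook[None, :]), axis=1)
    return codebook, idx.reshape(weights.shape)

def entropy_bits_per_symbol(indices, n_codes):
    """Empirical entropy of the index stream: a lower bound for an ideal entropy coder."""
    counts = np.bincount(indices.ravel(), minlength=n_codes)
    p = counts[counts > 0] / indices.size
    return float(-(p * np.log2(p)).sum())

# Toy weight matrix (hypothetical data, not from a real network).
rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)

codebook, idx = codebook_quantize(w, n_codes=16)
w_hat = codebook[idx]                      # reconstructed weights
bits = entropy_bits_per_symbol(idx, 16)    # at most log2(16) = 4 bits per weight
mse = float(np.mean((w - w_hat) ** 2))
```

Only the codebook and the entropy-coded indices would need to be stored, instead of one 32-bit float per weight.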
• Uniform Quantization
• Sequential Quantization
• Nonuniform Quantization
• Low-Rank Approximation
M47704, Geneva
Methods
[Figure: sequential quantization. Original weights (32 bits) → Quantization Stage 1: Quantization 1 (10 bits) → Dequantization 1 → Quantization Stage 2: Quantization 2 (8 bits) → compressed model → Dequantization 2 (for inference)]
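The two-stage chain in the sequential quantization diagram can be sketched as follows. This is an illustration under assumptions: the quantizer is a simple uniform symmetric one, the weights are random toy data, and any retraining between the stages is omitted. The function names are hypothetical and do not come from M47704.

```python
import numpy as np

def quantize(w, n_bits):
    """Uniform symmetric quantization to signed integers of n_bits (assumed scheme)."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = float(np.abs(w).max()) / q_max
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(w / scale), -q_max, q_max).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Map integer levels back to floating-point weights."""
    return q.astype(np.float32) * scale

# Toy 32-bit weights (hypothetical data).
rng = np.random.default_rng(0)
w32 = rng.normal(0.0, 0.1, size=1000).astype(np.float32)

# Stage 1: 32-bit float -> 10-bit levels -> float again.
q1, s1 = quantize(w32, 10)
w1 = dequantize(q1, s1)

# (A retraining step could sit here; it is left out of this sketch.)

# Stage 2: 10-bit reconstruction -> 8-bit levels, stored as the compressed model.
q2, s2 = quantize(w1, 8)
w2 = dequantize(q2, s2)  # dequantized again for inference

err1 = float(np.abs(w32 - w1).max())
err2 = float(np.abs(w32 - w2).max())
```

Each stage adds at most half a quantization step of error, so the final error is bounded by roughly `s1 / 2 + s2 / 2`.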
[Figure: low-rank approximation. A W x H convolution replaced by a W x 1 convolution and a 1 x H convolution; ReLU activations shown]
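Why a W x H convolution can be split into a W x 1 and a 1 x H convolution is easiest to see on a rank-1 kernel. The sketch below is an assumption-laden toy: the kernel is built to be exactly rank 1, the factors come from an SVD, and no ReLU is placed between the two 1-D convolutions, so the two paths match exactly. For a general kernel the rank-1 factors give only an approximation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, written with loops for clarity."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def rank1_factors(kernel):
    """Best rank-1 factors of a 2-D kernel via SVD: a column and a row vector."""
    u, s, vt = np.linalg.svd(kernel)
    col = u[:, :1] * np.sqrt(s[0])   # shape (kh, 1)
    row = vt[:1, :] * np.sqrt(s[0])  # shape (1, kw)
    return col, row

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
# A 5 x 5 kernel that is exactly rank 1 (hypothetical example).
kernel = np.outer(rng.normal(size=5), rng.normal(size=5))

col, row = rank1_factors(kernel)
full = conv2d_valid(image, kernel)                        # 25 multiplications per output
separable = conv2d_valid(conv2d_valid(image, col), row)   # 5 + 5 multiplications per output
```

For a 5 x 5 kernel the separable path needs 10 multiplications per output sample instead of 25, which is the kind of saving the low-rank approximation is after.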
• “Importance” estimation step
• With proper retraining under the constraint of fixed-point weights, the model’s precision can come very close to that of the floating-point model
• Quantize the coefficients with different precision for different layers
Methods
Video approach: Conservative
Neural Network based Filter for Video Coding
Core Experiment 13 on neural network based filter for video coding
Investigate the following problems:
• The impact of the NN filter position in the filter chain
• The generalization capability of the NN: performance change when the test QP is not the same as the training QP
JVET-N0840-v1
CE13-2.1: Convolutional Neural Network Filter (CNNF) for Intra Frame
JVET-N0169
Results over VTM-4.0, All Intra (BD-rate for Y, U, V; encoding and decoding time relative to the anchor):
Configuration      Y        U        V        EncT   DecT
DF+CNNF+SAO+ALF    -3.48%   -5.18%   -6.77%   142%   38414%
CNNF+ALF           -4.65%   -6.73%   -7.92%   149%   37956%
CNNF               -4.14%   -5.49%   -6.70%   140%   38411%
Pay attention to the decoding time: roughly 380 times that of the anchor.
[Figure (CE13-2.1 CNNF for Intra Frame, JVET-N0169): network architecture. Inputs: normalized Y/U/V and normalized QP map → Concat → Conv1 (5,5,64) → Conv2 to Conv7 (3,3,64 each) → Convolution8 (3,3,1) → Summation. Legend: ConvM (N,N,K) = Convolution (N,N,K) followed by ReLU, where N is the kernel size and K is the number of kernels]
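The layer list of the CNNF gives a feel for why decoding time explodes. The back-of-the-envelope count below is only a sketch: the number of input channels after the Concat (taken here as 2, one picture component plus the QP map), stride 1, same-size outputs and the 1920 x 1080 plane are all assumptions that the slides do not state.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights and biases of one square convolution layer."""
    weights = kernel * kernel * in_ch * out_ch
    biases = out_ch
    return weights, biases

def cnnf_cost(in_channels=2):
    """Sum weights and biases over the eight layers listed on the slide.

    in_channels is an assumption (one component plus the QP map)."""
    # (kernel size, number of kernels) per layer: Conv1, Conv2..Conv7, Convolution8.
    layers = [(5, 64)] + [(3, 64)] * 6 + [(3, 1)]
    total_weights = 0
    total_biases = 0
    ch = in_channels
    for kernel, out_ch in layers:
        w, b = conv_params(kernel, ch, out_ch)
        total_weights += w
        total_biases += b
        ch = out_ch
    return total_weights, total_biases

weights, biases = cnnf_cost(in_channels=2)

# With stride 1 and same-size outputs (assumed), every weight is used once per
# output pixel, so the multiply-accumulate count per pixel equals the weight count.
macs_per_pixel = weights
macs_per_frame = macs_per_pixel * 1920 * 1080  # one full-HD plane (assumed size)
```

Under these assumptions the filter costs a few hundred thousand multiply-accumulates for every filtered pixel, which a software decoder without dedicated hardware has to pay on each frame.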
CE13-1.1: Convolutional neural network loop filter
JVET-N0110-v1
Results over VTM-4.0, Random Access:
Y        U         V         EncT   DecT
-1.36%   -14.96%   -14.91%   100%   142%
Each category investigates the following problems:
• The impact of the NN filter position in the filter chain: there is always an objective gain
• The generalization capability of the NN: results indicate that the difference is minor
Neural Network based Filter for Video Coding
JVET-N_Notes_dD
What MPEG decided at the March meeting (25/3/2019):
The performance/complexity trade-off indicates that NN technology
is currently not mature enough to be included in a standard
As I said… sometimes life is bad
PERFORMANCE IS NOTHING WITHOUT COMPLEXITY
Neural Network for Video Coding: Conclusion
The trade-off matters
Neural Network Video approach: Disruptive
Videos are highly redundant in time.
No deep image compression method can compete with state-of-the-art video
compression, which exploits this redundancy.
Optical Flow
• In computer vision tasks, optical flow is widely used to exploit temporal relationships
• Learning-based optical flow methods can provide accurate motion information at pixel level
• Only artificial/synthetic data sets
SpyNet
• Learning-based optical flow estimation is used to obtain the motion information and reconstruct the current frame
• End-to-end deep video compression model that jointly learns motion estimation, motion compression, and residual compression
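How motion information reconstructs the current frame can be sketched with a simple backward warp. This is a toy under assumptions: the flow field is set by hand instead of being estimated by a network, sampling is nearest-neighbour rather than the bilinear warping usually used in learned codecs, and the frames are random data.

```python
import numpy as np

def warp(prev_frame, flow):
    """Backward-warp prev_frame with a per-pixel flow field (nearest-neighbour sampling).

    flow[..., 0] is the horizontal and flow[..., 1] the vertical displacement
    that tells each output pixel where to read from in prev_frame."""
    h, w = prev_frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_frame[src_y, src_x]

rng = np.random.default_rng(0)
prev_frame = rng.random((32, 32))

# Toy motion: the whole picture moves 3 pixels to the right between frames.
cur_frame = np.roll(prev_frame, shift=3, axis=1)

# Hand-made flow (a network would estimate this): each pixel reads 3 pixels to its left.
flow = np.zeros((32, 32, 2))
flow[..., 0] = -3.0

predicted = warp(prev_frame, flow)   # motion-compensated prediction
residual = cur_frame - predicted     # what is left to be coded
```

Away from the picture border the prediction is exact, so only the motion and a small residual would have to be transmitted.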
DVC: An End-to-end Deep Video Compression
Framework
MPEG NN
Architecture² = Architecture of NN architectures
• Optical Flow Net
• Motion Compression: MV Encoder and Decoder Network
• Motion Compensation Network
• Residual Encoder Net
• Bit Rate Estimation Net
Loss Function
• The whole compression system is end-to-end optimized with Rate-Distortion Optimization
• Just one end-to-end formula that jointly learns motion estimation, motion compression, and residual compression
[Formula annotations: residuals entropy, motion entropy]
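The formula itself did not survive on this page. As a hedged reconstruction, the rate-distortion loss published in the DVC paper has the following form; the symbol names follow that paper and are not taken from the slide.

```latex
\lambda D + R \;=\; \lambda\, d(x_t, \hat{x}_t) \;+\; \bigl( H(\hat{m}_t) + H(\hat{y}_t) \bigr)
```

Here $x_t$ is the current frame, $\hat{x}_t$ its reconstruction, $d(\cdot,\cdot)$ the distortion, $H(\hat{m}_t)$ the estimated bits for the quantized motion representation (the motion entropy) and $H(\hat{y}_t)$ the estimated bits for the quantized residual representation (the residuals entropy). The multiplier $\lambda$ sets the trade-off between rate and distortion.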
Advantages of Neural Networks
• Excellent content adaptivity
• Improved coding efficiency by leveraging distant samples
• Neural networks can represent both texture and features well
• The whole compression system is end-to-end optimized
Rai R&D: what we are doing
• End-to-end chain
• Issues:
  • Residuals compression
New EBU Distribution Codecs activity
Please join the EBU Video Group, we’ll have a lot of fun!
https://tech.ebu.ch/video
Machine Learning approaches at
video compression
Roberto Iacoviello
roberto.iacoviello@rai.it
Thank you for your attention
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0
Unported License
To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/
On your left is the reinforcement learning,
meaning: this is the reward if you contact me.

Machine Learning approaches at video compression

  • 1. Machine Learning approaches at video compression Roberto Iacoviello RAI - Radiotelevisione Italiana Centre for Research, Technological Innovation and Experimentation (CRITS)
  • 2. Machine Learning is like sex in high school. Everyone is talking about it, a few know what to do, and only your teacher is doing it There are 2/3 topics around AI: Ethics, that sounds to me like if we don’t teach ethics to the machines, Skynet will kill all of us. Academic paper full of mathematics and different notations. After you read them you feel like: Ok, and now? Then there is the real life: sometime is good and sometimes is bad.
  • 3. Dear old typical hybrid block based approach Many new tools in VVC: Versatile Video Coding. MPEG group in 30 years has developed many useful standards but based on the same schema. Now the group is going towards new horizons: neural networks.
  • 4. Two approaches:  NON Video approach: coded representation of neural network Neural Network Video approach Conservative Disruptive One to One End to End Replace one MPEG block with one Deep Learning block Replace the entire chain MPEG
  • 5. Non-video approach: coded representation of neural networks Scope: Representation of weights and parameters, no architecture N18162 Marrakech
  • 6. Non-video approach: coded representation of neural networks Coded representation of weight matrix
  • 7. Coded representation of neural networks Represent different artificial neural network Enable faster inference Enable use under resource limitations
  • 8. Use cases (W17924, Macao): inference may be performed on a large number of devices; the NNs used in an application can be improved incrementally; limitations in terms of processing power and memory; several apps would need to store the same base neural network on the device multiple times. Parameter size by type: media content analysis, from a few KB to several hundred MB; translation app, currently around 200 MB; Compact Descriptors for Video Analysis (CDVA), about 500-600 MB.
  • 9. MPEG use cases (W17924, Macao): UC10, distributed training and evaluation of neural networks for media content analysis; UC11, Compact Descriptors for Video Analysis (CDVA); UC12, image/video compression; UC13, distribution of neural networks for content processing.
  • 10. Methods: dropping connections; dropping layers; replacing convolutions with lower-dimensional ones (matrix decomposition); changing stride in convolutions without increasing output size; quantization (rate-distortion based); quantization using a codebook; entropy coding. Summary: cut something somewhere.
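"Dropping connections" can be illustrated with magnitude pruning, a common technique that is swapped in here for illustration; the slide does not say which criterion the MPEG proposals use. The function name `prune_by_magnitude` and the sample weights are hypothetical.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude.

    Illustrative only: the criterion (absolute value) is an assumption,
    not something stated in the slides.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_drop = int(len(weights) * fraction)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The zeroed entries cost almost nothing once the weight tensor is entropy coded, which is why pruning and entropy coding tend to appear together in lists like the one above.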
  • 11. Methods (M47704, Geneva): uniform quantization; sequential quantization; nonuniform quantization; low-rank approximation. [Diagram, sequential quantization: original weights (32 bits) -> quantization stage 1: quantization 1 (10 bits), dequantization 1 -> quantization stage 2: quantization 2 (8 bits) -> compressed model -> dequantization 2 (for inference).] [Diagram, low-rank approximation: a W x H convolution with ReLU, compared with a W x 1 convolution and a 1 x H convolution with ReLU.]
  • 12. Methods: an "importance" estimation step; with proper retraining under the constraint of fixed-point weights, the model's precision can be very close to that of the floating-point model; quantize the coefficients with different precision for different layers.
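The two-stage pipeline on this slide can be sketched with plain uniform quantization. This is a minimal sketch under stated assumptions: the weight range `[lo, hi]`, the sample weights, and the helper names are hypothetical, and whatever the proposal does between the two stages (for example retraining) is not modelled. Python floats are 64-bit, so they only stand in for the 32-bit originals.

```python
def quantize(values, bits, lo, hi):
    """Uniformly map floats in [lo, hi] to integer codes 0 .. 2**bits - 1."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    # Round to the nearest level and clamp out-of-range inputs.
    return [min(levels, max(0, round((v - lo) / step))) for v in values]


def dequantize(codes, bits, lo, hi):
    """Map integer codes back to floats on the same uniform grid."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


weights = [-0.91, -0.2, 0.0, 0.13, 0.77]  # hypothetical weights
lo, hi = -1.0, 1.0                         # assumed weight range

# Stage 1: floats -> 10-bit codes -> floats.
stage1 = dequantize(quantize(weights, 10, lo, hi), 10, lo, hi)
# Stage 2: floats -> 8-bit codes; these codes are the compressed model.
codes8 = quantize(stage1, 8, lo, hi)
# Dequantization 2, performed at inference time.
restored = dequantize(codes8, 8, lo, hi)

max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes8, max_error)
```

Each stage adds at most half a quantization step of error, so the worst case here is about 1/1023 + 1/255 for a range of width 2.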
  • 13. Video approach: conservative. Neural network based filter for video coding. Core Experiment 13 on neural network based filters for video coding investigates the following problems: the impact of the NN filter position in the filter chain; the generalization capability of the NN, i.e. the performance change when the test QP is not the same as the training QP. (JVET-N0840-v1)
  • 14. CE13-2.1: Convolutional Neural Network Filter (CNNF) for Intra Frame (JVET-N0169). Results over VTM-4.0, All Intra:
      Configuration    | Y      | U      | V      | EncT | DecT
      DF+CNNF+SAO+ALF  | -3.48% | -5.18% | -6.77% | 142% | 38414%
      CNNF+ALF         | -4.65% | -6.73% | -7.92% | 149% | 37956%
      CNNF             | -4.14% | -5.49% | -6.70% | 140% | 38411%
    Pay attention to the decoding time.
  • 15. CE13-2.1: Convolutional Neural Network Filter (CNNF) for Intra Frame (JVET-N0169), network architecture. Inputs: normalized Y/U/V and normalized QP map, concatenated. Layers: Conv1 (5,5,64); Conv2 to Conv7 (3,3,64) each; Convolution8 (3,3,1); summation. Notation: ConvM (N,N,K) is a convolution (N,N,K) followed by ReLU, where N is the kernel size and K the kernel number.
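To get a feel for the size of this filter, the layer list on the slide can be turned into a rough parameter count. This is a back-of-the-envelope sketch, not a figure from JVET-N0169: it assumes the concatenated input has two channels (one colour plane plus the QP map) and that every kernel has a bias term, neither of which is stated on the slide.

```python
def conv_params(kernel, in_ch, out_ch, bias=True):
    """Number of weights in a kernel x kernel convolution layer."""
    return kernel * kernel * in_ch * out_ch + (out_ch if bias else 0)


IN_CHANNELS = 2  # assumption: one normalized plane + normalized QP map

# (kernel size N, kernel number K) for Conv1 .. Convolution8, as on the slide.
layers = [(5, 64)] + [(3, 64)] * 6 + [(3, 1)]

total = 0
in_ch = IN_CHANNELS
for kernel, out_ch in layers:
    total += conv_params(kernel, in_ch, out_ch)
    in_ch = out_ch  # the output of one layer feeds the next

print(total)  # 225409 under the assumptions above
```

Every one of these weights is applied at each pixel position of the decoded picture, which is consistent with the slide's warning about decoding time.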
  • 16. CE13-1.1: Convolutional neural network loop filter (JVET-N0110-v1). Results over VTM-4.0, Random Access: Y -1.36%, U -14.96%, V -14.91%, EncT 100%, DecT 142%.
  • 17. Neural network based filter for video coding (JVET-N_Notes_dD). Each category investigated the following problems: the impact of the NN filter position in the filter chain (there is always an objective gain); the generalization capability of the NN (results indicate that the difference is minor). What MPEG decided at the March meeting (25/3/2019): the performance/complexity trade-off indicates that NN technology is currently not mature enough to be included in a standard. As I said... sometimes life is bad.
  • 18. Neural network for video coding, conclusion: PERFORMANCE IS NOTHING WITHOUT COMPLEXITY. The trade-off matters.
  • 19. Neural network video approach: disruptive. Videos are temporally highly redundant. No deep image compression method can compete with state-of-the-art video compression, which exploits this redundancy. Optical flow.
  • 20. Optical flow: in computer vision tasks, optical flow is widely used to exploit temporal relationships; learning-based optical flow methods can provide accurate motion information at pixel level; only artificial/synthetic data sets.
  • 22. DVC: An End-to-end Deep Video Compression Framework. Learning-based optical flow estimation is used to obtain the motion information and reconstruct the current frame. An end-to-end deep video compression model that jointly learns motion estimation, motion compression, and residual compression.
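The idea of reconstructing the current frame from motion information can be sketched as backward warping: each pixel of the prediction is sampled from the previous frame at a position displaced by the flow. This is only an illustration under assumptions: the flow here is written by hand rather than estimated by a network, the function name `warp` is hypothetical, and DVC's motion compensation network does more than plain warping.

```python
import math


def warp(prev, flow):
    """Backward-warp a 2-D frame with a per-pixel flow field.

    flow[y][x] = (dx, dy) means the predicted pixel (x, y) is sampled from
    prev at (x + dx, y + dy), using bilinear interpolation and border clamping.
    """
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            top = (1 - ax) * prev[y0][x0] + ax * prev[y0][x1]
            bottom = (1 - ax) * prev[y1][x0] + ax * prev[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


prev = [[0, 0, 0, 0],
        [0, 9, 0, 0],
        [0, 0, 0, 0]]
current = [[0, 0, 0, 0],
           [0, 0, 9, 0],
           [0, 0, 0, 0]]

# The bright pixel moved one step right, so every pixel samples one step left.
flow = [[(-1.0, 0.0)] * 4 for _ in range(3)]
predicted = warp(prev, flow)

# What is left to code after motion compensation.
residual = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, predicted)]
print(residual)
```

With a perfect flow the residual is all zeros; in practice the flow is imperfect and is itself compressed, which is why residual compression remains part of the chain.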
  • 23. DVC: An End-to-end Deep Video Compression Framework. MPEG versus NN. Architecture^2 = architecture of NN architectures.
  • 24. DVC: An End-to-end Deep Video Compression Framework. Optical flow net.
  • 25. DVC: An End-to-end Deep Video Compression Framework. Motion compression.
  • 26. DVC: An End-to-end Deep Video Compression Framework. MV encoder and decoder network.
  • 27. DVC: An End-to-end Deep Video Compression Framework. Motion compensation network.
  • 28. DVC: An End-to-end Deep Video Compression Framework. Residual encoder net; bit rate estimation net.
  • 29. DVC: An End-to-end Deep Video Compression Framework. Loss function. The whole compression system is end-to-end optimized (rate-distortion optimization): just one end-to-end formula that jointly learns motion estimation, motion compression, and residual compression. Formula labels: residual entropy, motion entropy.
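The formula itself did not survive in the transcript. In the DVC paper the objective has the rate-distortion form lambda * D + R, where D is the distortion between the original and the reconstructed frame and R is the sum of the motion and residual bit estimates. The sketch below assumes mean squared error as the distortion and takes the two rate terms as plain numbers; in the real system they come from the bit rate estimation net. The function names and sample values are hypothetical.

```python
def mse(original, reconstructed):
    """Mean squared error between two frames given as flat lists."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def rd_loss(original, reconstructed, motion_bits, residual_bits, lam):
    """Rate-distortion loss: lam * D + R, with R = motion + residual rate.

    motion_bits and residual_bits stand in for the entropy estimates that a
    bit rate estimation network would provide during training.
    """
    distortion = mse(original, reconstructed)
    rate = motion_bits + residual_bits
    return lam * distortion + rate


frame = [1.0, 2.0]
recon = [1.0, 4.0]
loss = rd_loss(frame, recon, motion_bits=0.1, residual_bits=0.3, lam=256)
print(loss)
```

Because every term is differentiable with respect to the network weights, a single loss of this shape lets motion estimation, motion compression, and residual compression be trained together; lam sets the operating point on the rate-distortion curve.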
  • 30. Advantages of neural networks: excellent content adaptivity; improved coding efficiency by leveraging samples from far distance; neural networks can represent both texture and features well; the whole compression system is end-to-end optimized.
  • 31. Rai R&D: what we are doing. End-to-end chain. Issues: residual compression.
  • 32. New EBU Distribution Codecs activity. Please join the EBU Video Group (https://tech.ebu.ch/video), we'll have lots of fun!
  • 33. Machine Learning approaches at video compression. Roberto Iacoviello, roberto.iacoviello@rai.it. Thank you for your attention. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/ On your left there is the reinforcement learning, which means: this is the reward if you contact me.