PR-395: Variational Image Compression with a Scale Hyperprior

Variational Image Compression
with a Scale Hyperprior
Hyeongmin Lee
PR-395
2022.7.17

Entropy Coding
 Image Compression
0011010100111...

Entropy Coding
 Entropy Coding
A,B,C,D: 4 Letters (need at least 2 bits per letter)
Sample Bits
A 00
B 10
C 01
D 11
100100001111110101000100010001
(30 bits)
https://www.programiz.com/dsa/huffman-coding

Entropy Coding
 Entropy Coding
A,B,C,D: 4 Letters (need at least 2 bits per letter)
Sample Bits
A 11
B 100
C 0
D 101
1000111110110110100110110110
(28 bits)
Lower Bound??
Huffman Coding
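
The two code tables above can be checked with a small sketch. It assumes the message is the 15-letter string obtained by decoding the 30-bit example with the fixed-length table (BCAADDDCCACACAC); the helper name `huffman_code_lengths` is hypothetical. It only computes code lengths, since the exact 0/1 labels of a Huffman code are not unique.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside moves one level deeper.
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Assumed message: the slide's 30-bit string decoded with A=00, B=10, C=01, D=11.
message = "BCAADDDCCACACAC"
counts = Counter(message)
lengths = huffman_code_lengths(counts)

fixed_bits = 2 * len(message)                             # 2 bits per letter
total_bits = sum(counts[s] * lengths[s] for s in counts)  # variable-length code

print(lengths)
print(fixed_bits, total_bits)
```

Frequent letters (C) get short codes and rare letters (B, D) get long ones, which is where the saving over the fixed 2-bit table comes from.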

Entropy Coding
 Entropy
$H = -\sum_i p_i \log_2 p_i$
Sample $p_i$
A 5/15
B 1/15
C 6/15
D 3/15
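
As a rough check of the "lower bound" question, the entropy of the table above can be evaluated directly. This is a minimal sketch; the only assumption is that the message has 15 letters, as the denominators suggest.

```python
import math

# Symbol probabilities from the table above.
probs = {"A": 5 / 15, "B": 1 / 15, "C": 6 / 15, "D": 3 / 15}

# Shannon entropy in bits per symbol: H = -sum_i p_i * log2(p_i).
entropy = -sum(p * math.log2(p) for p in probs.values())

# Lower bound on the expected total length for a 15-letter message.
lower_bound_bits = 15 * entropy

print(f"H = {entropy:.4f} bits/symbol, bound = {lower_bound_bits:.2f} bits")
```

The bound comes out a little under 27 bits, so the 28-bit Huffman code is close to, but not below, the entropy limit.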

Image Compression
 Image Compression
0011010100111...
Lossless Coding: BMP → Coding the integers themselves
Lossy Coding: JPEG → Reducing the ‘Entropy’ of the samples

Image Compression
 JPEG
8x8 block splitting → Discrete Cosine Transform → Quantization → Entropy Coding ...
https://www.whydomath.org/node/wavlets/basicjpg.html
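
The transform and quantization stages can be sketched for a single 8x8 block. This is an illustration under stated assumptions, not the JPEG standard: it uses one uniform step size instead of JPEG's quantization table, the test block is a hypothetical smooth gradient, and the entropy-coding stage is left out.

```python
import numpy as np

N = 8  # JPEG works on 8x8 blocks

def dct_matrix(n=N):
    """Orthonormal DCT-II basis as an n x n matrix (rows are frequencies)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    i = np.arange(n).reshape(1, -1)  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix()

def encode_block(block, step):
    """Level-shift, 2-D DCT, then uniform quantization to integers."""
    coeffs = C @ (block - 128.0) @ C.T
    return np.round(coeffs / step).astype(int)

def decode_block(q, step):
    """Dequantize, inverse 2-D DCT, undo the level shift."""
    return C.T @ (q * step) @ C + 128.0

# Hypothetical test block: a smooth gradient.
cols, rows = np.meshgrid(np.arange(N), np.arange(N))
block = 100.0 + 8.0 * cols + 4.0 * rows

results = {}
for step in (4.0, 32.0):
    q = encode_block(block, step)
    rec = decode_block(q, step)
    results[step] = (int(np.count_nonzero(q)), float(np.mean((block - rec) ** 2)))
    print(f"step={step:>4}: nonzero coefficients={results[step][0]}, MSE={results[step][1]:.2f}")
```

A coarser step can only zero out more coefficients, which leaves fewer distinct values for the entropy coder at the cost of reconstruction error.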

Image Compression
 JPEG
High Quality → High Entropy → High Bitrate
Low Quality → Low Entropy → Low Bitrate

Image Compression
 Bitrate-Distortion Tradeoff

Deep Learning-based
Image Compression

Deep Learning-based Image Compression
 Image Compression Pipeline
Transform → Quantization → Entropy Coding → Decoding → Inverse Transform

 Image Compression Pipeline
$L = R + \lambda D$

 Auto Encoder
$y = g_a(x; \phi_g)$
$\hat{y} = Q(y)$
$\hat{x} = g_s(\hat{y}; \theta_g)$
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})]$
$D = \lVert x - \hat{x} \rVert_2^2$
???

Problem of Quantization
 Adding Uniform Noise [PR328]
$\hat{y} = Q(y)$ → $\tilde{y} = y + n$, $n \sim U(-\tfrac{1}{2}, \tfrac{1}{2})$
Ballé, Johannes, Valero Laparra, and Eero P. Simoncelli. "End-to-end optimized image compression." 5th International Conference on Learning Representations, ICLR 2017.
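$\mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(Q(g_a(x; \phi_g)))]$
$\mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)]$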
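
One way to see why additive uniform noise is a reasonable stand-in for rounding: at integer positions, the density of y + n matches the probability mass of round(y). The sketch below checks this by sampling. The Gaussian latent and the values of `mu`, `sigma`, and `k` are hypothetical choices for illustration only.

```python
import math
import numpy as np

def gaussian_cdf(t, mu, sigma):
    """CDF of a normal distribution N(mu, sigma^2) evaluated at t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

mu, sigma, k = 0.3, 1.5, 1  # hypothetical latent distribution and integer bin
rng = np.random.default_rng(0)
y = rng.normal(mu, sigma, size=200_000)

# Probability mass of the quantized latent at the integer k.
pmf_quantized = float(np.mean(np.round(y) == k))

# Density of the noisy latent y + n near k, estimated from a narrow window.
y_noisy = y + rng.uniform(-0.5, 0.5, size=y.shape)
half_width = 0.05
density_noisy = float(np.mean(np.abs(y_noisy - k) < half_width)) / (2 * half_width)

# Both should approach the mass of N(mu, sigma^2) on [k - 1/2, k + 1/2].
analytic = gaussian_cdf(k + 0.5, mu, sigma) - gaussian_cdf(k - 0.5, mu, sigma)

print(pmf_quantized, density_noisy, analytic)
```

Unlike rounding, whose gradient is zero almost everywhere, the noisy version is differentiable with respect to y, which is what makes end-to-end training possible.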

Rate Loss
 Applying Uniform Noise Approximation
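$\mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$
$p_{\tilde{y}}$: Entropy Model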

Non-parametric Density Model
 Defining the density as a function (Neural Network)

 Using Sigmoid at the last layer
All Jacobian elements are positive

 Intermediate Activation Functions
Panels: $a > 0$; $-1 < a < 0$; $a < -1$

 Setting parameter constraints
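
The pieces listed on these slides can be put together in a small NumPy sketch of the density model for one channel. It assumes the structure described in the paper's appendix: a cumulative function built from layers x + a·tanh(x) with a sigmoid at the end, weights made positive with softplus, and a kept above -1 with tanh. The layer sizes and the random (untrained) parameters are hypothetical.

```python
import numpy as np

def softplus(v):
    return np.log1p(np.exp(v))

def sigmoid(v):
    # Numerically stable form of 1 / (1 + exp(-v)).
    return 0.5 * (1.0 + np.tanh(0.5 * v))

rng = np.random.default_rng(0)
dims = (1, 3, 3, 1)  # hypothetical layer widths; input and output are scalars
params = [
    {
        "H_hat": rng.normal(size=(r_out, r_in)),  # unconstrained weights
        "b": rng.normal(size=(r_out, 1)),         # biases
        "a_hat": rng.normal(size=(r_out, 1)),     # unconstrained gate parameters
    }
    for r_in, r_out in zip(dims[:-1], dims[1:])
]

def cumulative(y):
    """Monotone function c(y) with values in (0, 1), applied to a 1-D array."""
    h = np.asarray(y, dtype=float).reshape(1, -1)
    for k, p in enumerate(params):
        h = softplus(p["H_hat"]) @ h + p["b"]  # positive weights: H = softplus(H_hat)
        if k < len(params) - 1:
            # a = tanh(a_hat) lies in (-1, 1), so x + a * tanh(x) stays increasing.
            h = h + np.tanh(p["a_hat"]) * np.tanh(h)
        else:
            h = sigmoid(h)                     # sigmoid at the last layer
    return h.ravel()

def likelihood(y_tilde):
    """Density of the noisy latent: c(y + 1/2) - c(y - 1/2)."""
    y_tilde = np.asarray(y_tilde, dtype=float)
    return cumulative(y_tilde + 0.5) - cumulative(y_tilde - 0.5)

grid = np.arange(-200, 201)
probs = likelihood(grid)
# The sum telescopes to c(200.5) - c(-200.5), which approaches 1 as the grid widens.
print(probs.sum())
```

Because every Jacobian factor is positive, c is monotone and its derivative is a valid density without any fixed parametric form.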

 Experiment on toy example

 Revisiting Loss Function
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$
$D = \lVert x - \hat{x} \rVert_2^2$
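
For a single image the loss L = R + λD reduces to a few lines. This is a minimal sketch; the function name `rate_distortion_loss` and the toy numbers are hypothetical, and the likelihoods would come from the entropy model in practice.

```python
import numpy as np

def rate_distortion_loss(likelihoods, x, x_hat, lam):
    """Return (loss, rate in bits, squared-error distortion) for one sample."""
    rate = float(-np.sum(np.log2(likelihoods)))   # R = -sum log2 p(y_tilde)
    distortion = float(np.sum((x - x_hat) ** 2))  # D = ||x - x_hat||_2^2
    return rate + lam * distortion, rate, distortion

# Toy values: three latent elements and a three-pixel "image".
likelihoods = np.array([0.5, 0.25, 0.25])
x = np.array([1.0, 2.0, 3.0])
x_hat = np.array([1.0, 2.5, 3.0])

loss, rate, distortion = rate_distortion_loss(likelihoods, x, x_hat, lam=0.01)
print(loss, rate, distortion)
```

Raising `lam` weights distortion more heavily, which pushes the trained model toward higher quality and higher bitrate.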

Variational Auto Encoder
 Auto Encoder
$\tilde{y} = y + n$

 Variational Auto Encoder
$\tilde{y} \sim p_{\tilde{y}|x}(\tilde{y} \mid x)$
$x \sim p_{x|\tilde{y}}(x \mid \tilde{y})$

 Defining Generative Model
$\tilde{y} \sim p_{\tilde{y}|x}(\tilde{y} \mid x)$

 Setting the parametric version of posterior

 Setting KL Divergence
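
The link between the KL divergence and the rate-distortion loss can be written out as follows. This is a sketch of the argument as given in the paper, assuming a unit-width uniform posterior q centered on y and a Gaussian likelihood with variance $(2\lambda)^{-1}$; logarithms here are natural, so the rate term is in nats rather than bits.

```latex
\begin{aligned}
q(\tilde{y} \mid x, \phi_g)
  &= \prod_i \mathcal{U}\!\left(\tilde{y}_i \mid y_i - \tfrac{1}{2},\, y_i + \tfrac{1}{2}\right),
  \qquad y = g_a(x; \phi_g) \\
\mathbb{E}_{x \sim p_x} D_{\mathrm{KL}}\!\left[q \,\|\, p_{\tilde{y} \mid x}\right]
  &= \mathbb{E}_{x \sim p_x} \mathbb{E}_{\tilde{y} \sim q}\Big[
     \underbrace{\log q(\tilde{y} \mid x)}_{=\,0}
     \underbrace{-\log p_{x \mid \tilde{y}}(x \mid \tilde{y})}_{\text{weighted distortion}}
     \underbrace{-\log p_{\tilde{y}}(\tilde{y})}_{\text{rate}}
     \Big] + \text{const} \\
-\log p_{x \mid \tilde{y}}(x \mid \tilde{y})
  &= \lambda \lVert x - \hat{x} \rVert_2^2 + \text{const},
  \qquad p_{x \mid \tilde{y}} = \mathcal{N}\!\left(x \mid \hat{x},\, (2\lambda)^{-1} I\right),
  \quad \hat{x} = g_s(\tilde{y}; \theta_g)
\end{aligned}
```

The first term vanishes because a unit-width uniform density equals 1 on its support, so minimizing the KL divergence amounts to minimizing R + λD.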

Scale Hyperprior
 Factorized Prior
$R = -\log_2 p_{\tilde{y}}(\tilde{y})$
$D = \lVert x - \hat{x} \rVert_2^2$
Inference Model
Generative Model
Entropy Model
$\mathrm{Enc}[\tilde{y}]$
$\mathrm{Dec}[\tilde{y}]$

Scale Hyperprior
 Scale of Y
Encoding Scale as ‘Additional Information’

Scale Hyperprior
 Scale Hyperprior

Scale Hyperprior
 Factorized Prior
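$R = -\log_2 p_{\tilde{z}}(\tilde{z}) - \log_2 p_{\tilde{y}|\tilde{z}}(\tilde{y} \mid \tilde{z})$
$D = \lVert x - \hat{x} \rVert_2^2$
Inference Model
Generative Model
Entropy Model
$z = h_a(y; \phi_h)$
$\tilde{z} = z + n$
$\mathrm{Enc}[\tilde{z}]$, $\mathrm{Dec}[\tilde{z}]$
$\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$ (computed on both the encoder and the decoder side)
$\mathrm{Enc}[\tilde{y}; \tilde{\sigma}]$, $\mathrm{Dec}[\tilde{y}; \tilde{\sigma}]$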
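
The benefit of conditioning on a scale can be illustrated numerically. The sketch assumes the conditional entropy model is a zero-mean Gaussian with scale σ convolved with unit-width uniform noise (the form used in the paper). The latent values and scales are hypothetical, and the cost of transmitting the side information z is ignored here.

```python
import math

def gaussian_cdf(t, sigma):
    """CDF of a zero-mean normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(t / (sigma * math.sqrt(2.0))))

def bits(y_tilde, sigma):
    """-log2 of the Gaussian-convolved-with-uniform likelihood of one latent element."""
    p = gaussian_cdf(y_tilde + 0.5, sigma) - gaussian_cdf(y_tilde - 0.5, sigma)
    return -math.log2(p)

# Hypothetical latents: three from a flat region, three from a textured region.
latents = [0.1, -0.2, 0.0, 6.0, -5.5, 7.2]

# Per-element scales that follow the local activity (what the hyperprior predicts).
matched_sigmas = [0.3, 0.3, 0.3, 6.0, 6.0, 6.0]
# One shared scale for every element (no spatial adaptation).
single_sigmas = [3.0] * len(latents)

bits_matched = sum(bits(y, s) for y, s in zip(latents, matched_sigmas))
bits_single = sum(bits(y, s) for y, s in zip(latents, single_sigmas))

print(f"matched scales: {bits_matched:.1f} bits, single scale: {bits_single:.1f} bits")
```

The hyperprior pays off when the bits saved on y this way exceed the bits spent on coding z.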

Scale Hyperprior
 Network Architecture

Scale Hyperprior
 Experiments
