Variational Image Compression with a Scale Hyperprior
Hyeongmin Lee
PR-395
2022.7.17
Entropy Coding
• Image Compression
0011010100111...
Entropy Coding
• Entropy Coding
A, B, C, D: 4 letters (a fixed-length code needs at least 2 bits per letter)
Sample  Bits
A       00
B       10
C       01
D       11
100100001111110101000100010001 (30 bits)
https://www.programiz.com/dsa/huffman-coding
Entropy Coding
• Entropy Coding
A, B, C, D: 4 letters (a fixed-length code needs at least 2 bits per letter)
Sample  Bits
A       11
B       100
C       0
D       101
1000111110110110100110110110 (28 bits)
Lower bound?
Huffman Coding
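The two code tables on these slides can be checked with a short script. This is a minimal sketch: the message `BCAADDDCCACACAC` is what the 30-bit string decodes to under the fixed-length table, and the helper name `encode` is only illustrative.

```python
# Fixed-length and Huffman code tables copied from the slides.
FIXED = {"A": "00", "B": "10", "C": "01", "D": "11"}
HUFFMAN = {"A": "11", "B": "100", "C": "0", "D": "101"}

# Message obtained by decoding the 30-bit string with the fixed-length table.
MESSAGE = "BCAADDDCCACACAC"


def encode(message: str, table: dict[str, str]) -> str:
    """Concatenate the codeword of every letter in the message."""
    return "".join(table[letter] for letter in message)


fixed_bits = encode(MESSAGE, FIXED)
huffman_bits = encode(MESSAGE, HUFFMAN)

print(fixed_bits, len(fixed_bits))      # 30 bits
print(huffman_bits, len(huffman_bits))  # 28 bits
```

The Huffman table spends one bit on the most frequent letter (C) and three bits on the rarest ones, which is where the two saved bits come from.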
Entropy Coding
• Entropy
$H = -\sum_i p_i \log_2 p_i$
Sample  $p_i$
A       5/15
B       1/15
C       6/15
D       3/15
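To answer the "lower bound" question numerically, the entropy of this table can be evaluated directly. The sketch below uses the probabilities from the slide and the Huffman codeword lengths from the earlier table; the function names are illustrative.

```python
import math
from fractions import Fraction

# Letter probabilities from the slide.
PROBS = {"A": Fraction(5, 15), "B": Fraction(1, 15),
         "C": Fraction(6, 15), "D": Fraction(3, 15)}
# Codeword lengths of the Huffman table shown earlier (A=11, B=100, C=0, D=101).
HUFFMAN_LENGTHS = {"A": 2, "B": 3, "C": 1, "D": 3}


def entropy_bits(probs) -> float:
    """Shannon entropy H = -sum_i p_i log2 p_i, in bits per letter."""
    return -sum(float(p) * math.log2(float(p)) for p in probs.values())


def expected_length(probs, lengths) -> float:
    """Average codeword length in bits per letter."""
    return float(sum(probs[s] * lengths[s] for s in probs))


H = entropy_bits(PROBS)
avg_len = expected_length(PROBS, HUFFMAN_LENGTHS)
print(f"entropy      : {H:.3f} bits/letter, {15 * H:.2f} bits for 15 letters")
print(f"Huffman code : {avg_len:.3f} bits/letter, {15 * avg_len:.0f} bits for 15 letters")
```

The entropy comes out at about 1.78 bits per letter, roughly 26.7 bits for the 15-letter sample, so the 28-bit Huffman code is close to the bound but does not reach it.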
Image Compression
• Image Compression
0011010100111...
Lossless coding (BMP) → coding the integers themselves
Lossy coding (JPEG) → reducing the 'entropy' of the samples
Image Compression
• JPEG
8x8 block splitting → Discrete Cosine Transform → Quantization → Entropy Coding...
https://www.whydomath.org/node/wavlets/basicjpg.html
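The transform and quantization stages can be sketched on a single 8x8 block. This is a simplified illustration, not the JPEG standard: it uses one uniform step size instead of a per-frequency quantization table, skips the level shift and the entropy coder, and the test block and all function names are assumptions.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_matrix(n: int = N) -> list[list[float]]:
    """Orthonormal DCT-II matrix C, so that coeffs = C @ block @ C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def encode_block(block, step):
    """2-D DCT followed by uniform scalar quantization with one step size."""
    c = dct_matrix(len(block))
    coeffs = matmul(matmul(c, block), transpose(c))
    return [[round(v / step) for v in row] for row in coeffs]


def decode_block(quantized, step):
    """Dequantize and apply the inverse 2-D DCT."""
    c = dct_matrix(len(quantized))
    coeffs = [[q * step for q in row] for row in quantized]
    return matmul(matmul(transpose(c), coeffs), c)


# Hypothetical smooth test block (a diagonal gradient), not taken from the slides.
block = [[4.0 * (r + col) for col in range(N)] for r in range(N)]

for step in (1.0, 16.0):
    q = encode_block(block, step)
    rec = decode_block(q, step)
    nonzero = sum(v != 0 for row in q for v in row)
    mse = sum((a - b) ** 2 for ra, rb in zip(block, rec) for a, b in zip(ra, rb)) / N**2
    print(f"step={step:5.1f}  nonzero coefficients={nonzero:2d}  MSE={mse:.3f}")
```

A larger step zeroes more coefficients, which leaves fewer distinct symbols for the entropy coder, at the price of a larger reconstruction error bound.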
Image Compression
• JPEG
High quality: high entropy, high bitrate
Low quality: low entropy, low bitrate
Image Compression
• Bitrate-Distortion Tradeoff
Deep Learning-based Image Compression
• Image Compression Pipeline
Transform → Quantization → Entropy Coding → Decoding → Inverse Transform
Deep Learning-based Image Compression
• Image Compression Pipeline
$L = R + \lambda D$
Deep Learning-based Image Compression
• Auto Encoder
$y = g_a(x; \phi_g)$
$\hat{y} = Q(y)$
$\hat{x} = g_s(\hat{y}; \theta_g)$
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})]$
$D = \lVert x - \hat{x} \rVert_2^2$
???
Problem of Quantization
• Adding Uniform Noise [PR328]
$\hat{y} = Q(y)$
$\tilde{y} = y + n, \quad n \sim U(-\tfrac{1}{2}, \tfrac{1}{2})$
Ballé, Johannes, Valero Laparra, and Eero P. Simoncelli. "End-to-end optimized image compression." 5th International Conference on Learning Representations, ICLR 2017.
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(Q(g_a(x; \phi_g)))]$
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)]$
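The slide does not spell out the problem; in the cited paper it is that rounding has zero gradient almost everywhere, so nothing can be backpropagated through $Q$. The sketch below illustrates this with a finite difference and compares it to the additive-noise proxy. The latent value, the seed and the function names are assumptions.

```python
import random


def quantize(y: float) -> float:
    """Hard quantization Q(y): round to the nearest integer."""
    return float(round(y))


def add_uniform_noise(y: float, rng: random.Random) -> float:
    """Training-time proxy: y + n with n ~ U(-1/2, 1/2)."""
    return y + rng.uniform(-0.5, 0.5)


def finite_difference(f, y: float, eps: float = 1e-3) -> float:
    """Numerical slope of f at y."""
    return (f(y + eps) - f(y - eps)) / (2 * eps)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
y = 2.3                 # hypothetical latent value

# Rounding is piecewise constant, so its slope is zero almost everywhere.
slope_quantized = finite_difference(quantize, y)
print(slope_quantized)

# With the noise sample held fixed, y + n has slope one with respect to y.
n = rng.uniform(-0.5, 0.5)
slope_noisy = finite_difference(lambda v: v + n, y)
print(slope_noisy)

# Both perturbations stay within half a quantization bin of y.
print(abs(quantize(y) - y) <= 0.5, abs(add_uniform_noise(y, rng) - y) <= 0.5)
```

Because the noise has the same width as a quantization bin, the noisy latent stays as close to $y$ as the rounded one while remaining differentiable.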
Rate Loss
• Applying Uniform Noise Approximation
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)]$
$p_{\tilde{y}}$
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$
Entropy Model
Non-parametric Density Model
• Defining the density as a function (Neural Network)
Non-parametric Density Model
• Using Sigmoid at the last layer
All Jacobian elements are positive
Non-parametric Density Model
• Intermediate Activation Functions
All Jacobian elements are positive
$a > 0$, $-1 < a < 0$, $a < -1$
Non-parametric Density Model
• Setting parameter constraints
All Jacobian elements are positive
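The three ingredients above (sigmoid output, gated activations, constrained parameters) can be put together in a scalar toy version. The slides do not show the formulas, so the sketch assumes the form used in the paper's appendix as I understand it: activations $g(x) = x + a \tanh(x)$ with $a = \tanh(\hat{a}) > -1$, positive weights via softplus, and a final sigmoid. The class name, the width-1 layers and the parameter values are all hypothetical.

```python
import math


def softplus(v: float) -> float:
    """Smooth map from any real number to a positive one."""
    return math.log1p(math.exp(v))


def sigmoid(v: float) -> float:
    """Logistic function with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


class ScalarCDF:
    """Width-1 stand-in for the learned cumulative c(x) behind the density model."""

    def __init__(self, raw_h, raw_a, b):
        # Parameter constraints: h > 0 via softplus, a > -1 via tanh.
        self.h = [softplus(v) for v in raw_h]   # one slope per layer (K layers)
        self.a = [math.tanh(v) for v in raw_a]  # one gate per hidden layer (K - 1)
        self.b = list(b)                        # one bias per layer (K layers)

    def __call__(self, x: float) -> float:
        for h, a, b in zip(self.h[:-1], self.a, self.b[:-1]):
            x = h * x + b
            x = x + a * math.tanh(x)                 # monotone because a > -1
        return sigmoid(self.h[-1] * x + self.b[-1])  # final sigmoid maps to (0, 1)

    def likelihood(self, y_tilde: float) -> float:
        """p(y~) for y~ = y + U(-1/2, 1/2): c(y~ + 1/2) - c(y~ - 1/2)."""
        return self(y_tilde + 0.5) - self(y_tilde - 0.5)


# Arbitrary, untrained parameters (hypothetical values for illustration only).
cdf = ScalarCDF(raw_h=[0.3, -0.2, 0.5], raw_a=[-2.0, 1.0], b=[0.1, -0.4, 0.0])
grid = [i / 4 for i in range(-40, 41)]
values = [cdf(x) for x in grid]
print(all(lo < hi for lo, hi in zip(values, values[1:])))  # c is increasing
print(cdf.likelihood(0.0))
```

Every layer has a positive derivative, so the composition is an increasing function with values in (0, 1), and the difference of two evaluations half a bin apart is a valid likelihood for the noisy latent.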
Non-parametric Density Model
• Experiment on toy example
Non-parametric Density Model
• Revisiting Loss Function
$R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$
$D = \lVert x - \tilde{x} \rVert_2^2$
$L = R + \lambda D$
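As a small worked example of this loss, the sketch below evaluates $R$, $D$ and $L$ for made-up numbers. The likelihoods stand in for entropy-model outputs and the vectors stand in for pixel values; none of them come from the slides.

```python
import math


def rate_bits(likelihoods) -> float:
    """R: total code length in bits, -sum log2 p(y~_i)."""
    return -sum(math.log2(p) for p in likelihoods)


def distortion(x, x_rec) -> float:
    """D: squared L2 error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec))


def rd_loss(likelihoods, x, x_rec, lam: float) -> float:
    """L = R + lambda * D."""
    return rate_bits(likelihoods) + lam * distortion(x, x_rec)


# Hypothetical numbers standing in for entropy-model outputs and pixel values.
likelihoods = [0.5, 0.25, 0.125]   # costs 1 + 2 + 3 = 6 bits
x = [0.2, 0.4, 0.6]
x_rec = [0.2, 0.5, 0.4]            # squared error 0.01 + 0.04 = 0.05

for lam in (1.0, 100.0):
    print(lam, rd_loss(likelihoods, x, x_rec, lam))
```

A larger $\lambda$ makes the same reconstruction error more expensive relative to the bits spent, which pushes training toward higher quality at a higher bitrate.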
Variational Auto Encoder
• Auto Encoder
$y = g_a(x; \phi_g)$
$\tilde{y} = y + n$
$\tilde{x} = g_s(\tilde{y}; \theta_g)$
Variational Auto Encoder
• Variational Auto Encoder
$\tilde{y} \sim p_{\tilde{y}|x}(\tilde{y} \mid x)$
$x \sim p_{x|\tilde{y}}(x \mid \tilde{y})$
Variational Auto Encoder
• Defining Generative Model
$\tilde{y} \sim p_{\tilde{y}|x}(\tilde{y} \mid x)$
Variational Auto Encoder
• Setting the parametric version of posterior
Variational Auto Encoder
• Setting KL Divergence
Scale Hyperprior
• Factorized Prior
$y = g_a(x; \phi_g)$
$\tilde{y} = y + n$
$\tilde{x} = g_s(\tilde{y}; \theta_g)$
$R = -\log_2 p_{\tilde{y}}(\tilde{y})$
$D = \lVert x - \tilde{x} \rVert_2^2$
$L = R + \lambda D$
Inference Model
Generative Model
Entropy Model
$\mathrm{Enc}[\tilde{y}]$
$\mathrm{Dec}[\tilde{y}]$
Scale Hyperprior
• Scale of Y
Encoding Scale as ‘Additional Information’
Scale Hyperprior
• Scale Hyperprior
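Scale Hyperprior
• Factorized Prior
$y = g_a(x; \phi_g)$
$\tilde{y} = y + n$
$\tilde{x} = g_s(\tilde{y}; \theta_g)$
$R = -\log_2 p_{\tilde{z}}(\tilde{z}) - \log_2 p_{\tilde{y}|\tilde{z}}(\tilde{y} \mid \tilde{z})$
$D = \lVert x - \tilde{x} \rVert_2^2$
$L = R + \lambda D$
Inference Model
Generative Model
Entropy Model
$z = h_a(y; \phi_h)$
$\tilde{z} = z + n$
$\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$
$\mathrm{Enc}[\tilde{y}; \tilde{\sigma}]$
$\mathrm{Dec}[\tilde{z}]$
$\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$
$\mathrm{Enc}[\tilde{z}]$
$\mathrm{Dec}[\tilde{y}; \tilde{\sigma}]$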
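The benefit of conditioning on a per-element scale can be seen in a small numeric sketch. It assumes the conditional model of the paper, a zero-mean Gaussian with standard deviation $\tilde{\sigma}$ convolved with unit uniform noise; the latent values, the scales and the likelihood floor are hypothetical, and the bits needed to transmit $\tilde{z}$ are not counted.

```python
import math

LIKELIHOOD_FLOOR = 1e-9  # assumed lower bound to keep log2 finite


def gaussian_cdf(v: float, sigma: float) -> float:
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0))))


def conditional_bits(y_tilde: float, sigma: float) -> float:
    """-log2 p(y~ | sigma), with p = N(0, sigma^2) convolved with U(-1/2, 1/2)."""
    p = gaussian_cdf(y_tilde + 0.5, sigma) - gaussian_cdf(y_tilde - 0.5, sigma)
    return -math.log2(max(p, LIKELIHOOD_FLOOR))


# Hypothetical latents: a flat region (small values) and an edge (large values).
latents = [0.0, 0.0, 1.0, 8.0, -7.0]
predicted_scales = [0.5, 0.5, 0.5, 6.0, 6.0]  # stand-in for h_s(z~)
shared_scale = 3.0                            # one scale for everything

adaptive = sum(conditional_bits(y, s) for y, s in zip(latents, predicted_scales))
shared = sum(conditional_bits(y, shared_scale) for y in latents)
print(f"per-element scales: {adaptive:.2f} bits")
print(f"single scale      : {shared:.2f} bits")
```

Matching the scale to each latent makes small values cheap in flat regions and large values affordable near edges. In the full model this saving has to outweigh the extra rate term for $\tilde{z}$.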
Scale Hyperprior
• Network Architecture
Scale Hyperprior
• Experiments
Scale Hyperprior
• Experiments
Thank You!

 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 

PR-395: Variational Image Compression with a Scale Hyperprior

  • 1. Variational Image Compression with a Scale Hyperprior. Hyeongmin Lee, PR-395, 2022.7.17
  • 3. Entropy Coding: Image Compression. 0011010100111...
  • 4. Entropy Coding: Entropy Coding. A, B, C, D: 4 letters (need at least 2 bits per letter). Code table (Sample: Bits): A: 00, B: 10, C: 01, D: 11. Encoded string: 100100001111110101000100010001 (30 bits). Source: https://www.programiz.com/dsa/huffman-coding
  • 5. Entropy Coding: Entropy Coding. A, B, C, D: 4 letters (need at least 2 bits per letter). Huffman Coding table (Sample: Bits): A: 11, B: 100, C: 0, D: 101. Encoded string: 1000111110110110100110110110 (28 bits). Lower bound??
  • 6. Entropy Coding: Entropy. $H = -\sum_i p_i \log_2 p_i$. Probabilities (Sample: $p_i$): A: 5/15, B: 1/15, C: 6/15, D: 3/15.
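
As a rough check of the numbers on slides 4 to 6, the sketch below recovers the sample string by decoding the 30-bit sequence with the fixed-length table, builds a Huffman code from the symbol counts, and compares both bit counts with the entropy bound. The function name `huffman_code_lengths` and all variable names are illustrative assumptions. Huffman codewords are not unique, so only the code lengths are expected to agree with the slide.

```python
import heapq
import math
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Fixed-length table and 30-bit string as given on slide 4.
fixed_table = {"A": "00", "B": "10", "C": "01", "D": "11"}
fixed_bitstring = "100100001111110101000100010001"

# Decode the bitstring two bits at a time to recover the sample string.
inverse_table = {bits: sym for sym, bits in fixed_table.items()}
message = "".join(inverse_table[fixed_bitstring[i:i + 2]]
                  for i in range(0, len(fixed_bitstring), 2))

freqs = Counter(message)
n = len(message)
lengths = huffman_code_lengths(freqs)

fixed_bits = len(fixed_bitstring)
huffman_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)
entropy_per_symbol = -sum((c / n) * math.log2(c / n) for c in freqs.values())
entropy_bound_bits = entropy_per_symbol * n

print(message, fixed_bits, huffman_bits, round(entropy_bound_bits, 2))
```

Under these assumptions the Huffman code spends 28 bits against 30 for the fixed-length code, while the entropy bound for the 15 symbols is about 26.7 bits, which answers the "Lower bound??" question on slide 5.
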
  • 8. Image Compression: Image Compression. 0011010100111... Lossless coding (bmp): coding the integers themselves. Lossy coding (JPEG): reducing the ‘entropy’ of the samples.
  • 9. Image Compression: JPEG. 8x8 cutting → Discrete Cosine Transform → Quantization → Entropy Coding... Source: https://www.whydomath.org/node/wavlets/basicjpg.html
  • 10. Image Compression: JPEG. High Quality: High Entropy, High Bitrate. Low Quality: Low Entropy, Low Bitrate.
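
The quality versus bitrate trade-off on slides 9 and 10 can be illustrated with a toy transform-and-quantize step. This is only a sketch under stated assumptions: it uses a single 8-sample row instead of a full 8x8 block, one uniform quantization step instead of a JPEG quantization table, and a made-up pixel row. The helper names `dct`, `idct`, `quantize`, and `dequantize` are illustrative.

```python
import math

N = 8  # JPEG uses 8x8 blocks; one 8-sample row keeps the sketch short

def _scale(k):
    # Orthonormal scaling so that idct(dct(x)) == x.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    return [sum(_scale(k) * coeffs[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

def quantize(coeffs, step):
    """Map each coefficient to an integer index; a larger step is coarser."""
    return [round(c / step) for c in coeffs]

def dequantize(indices, step):
    return [i * step for i in indices]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel values
coeffs = dct(row)

results = {}
for step in (1, 10, 40):
    indices = quantize(coeffs, step)
    recon = idct(dequantize(indices, step))
    mse = sum((a - b) ** 2 for a, b in zip(row, recon)) / N
    nonzero = sum(1 for i in indices if i != 0)
    results[step] = (nonzero, mse)
    print(f"step={step:>2}  nonzero indices={nonzero}  mse={mse:.3f}")
```

A larger step leaves fewer non-zero indices for the entropy coder to spend bits on, at the cost of a larger reconstruction error for most inputs, which is the high quality/high bitrate versus low quality/low bitrate contrast on slide 10.
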
  • 13. Deep Learning-based Image Compression: Image Compression Pipeline. Transform → Quantization → Entropy Coding → Decoding → Inverse Transform
  • 14. Deep Learning-based Image Compression: Image Compression Pipeline. $L = R + \lambda D$
  • 15. Deep Learning-based Image Compression: Auto Encoder. $y = g_a(x; \phi_g)$, $\hat{y} = Q(y)$, $\hat{x} = g_s(\hat{y}; \theta_g)$. $R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})]$, $D = \|x - \hat{x}\|_2^2$. ???
  • 17. Problem of Quantization: Adding Uniform Noise [PR328]. $\hat{y} = Q(y)$ is replaced by $\tilde{y} = y + n$, $n \sim U(-\tfrac{1}{2}, \tfrac{1}{2})$. Ballé, Johannes, Valero Laparra, and Eero P. Simoncelli. "End-to-end optimized image compression." 5th International Conference on Learning Representations, ICLR 2017. With quantization: $R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(\hat{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\hat{y}}(Q(g_a(x; \phi_g)))]$. With uniform noise: $R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)]$.
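
A small numerical sketch of why additive uniform noise is a reasonable stand-in for rounding: for latents that vary smoothly across quantization bins, the rounding error $Q(y) - y$ and the noise $n \sim U(-\tfrac{1}{2}, \tfrac{1}{2})$ have nearly the same statistics (zero mean, variance $1/12$), while $y + n$ stays differentiable in $y$. The Gaussian latents, the sample count, and the seed below are assumptions chosen only for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
num_samples = 100_000

# Hypothetical latent values y; in the slides these would be y = g_a(x; phi_g).
ys = [rng.gauss(0.0, 3.0) for _ in range(num_samples)]

# Hard quantization: the error Q(y) - y always lies in [-1/2, 1/2].
round_err = [round(y) - y for y in ys]

# Relaxation: independent noise n ~ U(-1/2, 1/2), so that y_tilde = y + n.
noise = [rng.uniform(-0.5, 0.5) for _ in ys]

def mean(values):
    return sum(values) / len(values)

def mean_square(values):
    return sum(v * v for v in values) / len(values)

round_mse = mean_square(round_err)
noise_mse = mean_square(noise)

print(f"rounding error: mean={mean(round_err):+.4f}  mse={round_mse:.4f}")
print(f"uniform noise:  mean={mean(noise):+.4f}  mse={noise_mse:.4f}")
print(f"1/12 = {1 / 12:.4f}")
```

Rounding has zero gradient almost everywhere, so it blocks backpropagation through $Q$. The noisy version keeps $\partial \tilde{y} / \partial y = 1$, which is what makes the rate term trainable.
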
  • 19. Rate Loss: Applying Uniform Noise Approximation. $R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(g_a(x; \phi_g) + n)] = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$. Entropy Model: $p_{\tilde{y}}$.
  • 21. Non-parametric Density Model: Defining the density as a function (Neural Network)
  • 22. Non-parametric Density Model: Using Sigmoid at the last layer. All Jacobian elements are positive.
  • 23. Non-parametric Density Model: Intermediate Activation Functions. All Jacobian elements are positive. Cases: $a > 0$, $-1 < a < 0$, $a < -1$.
  • 24. Non-parametric Density Model: Setting parameter constraints. All Jacobian elements are positive.
  • 25. Non-parametric Density Model: Experiment on toy example
  • 26. Non-parametric Density Model: Revisiting Loss Function. $R = \mathbb{E}_{x \sim p_x}[-\log_2 p_{\tilde{y}}(\tilde{y})]$, $D = \|x - \tilde{x}\|_2^2$, $L = R + \lambda D$.
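
One way to read slides 21 to 24 is that the network models a cumulative distribution function: a sigmoid at the last layer maps to (0, 1), and keeping every Jacobian element positive makes the function monotone, so its derivative is a valid density. The sketch below is a minimal scalar version under stated assumptions: weights are made positive with a softplus, the intermediate activation is $g(u) = u + a \tanh(u)$ with $a = \tanh(a_{\text{raw}}) \in (-1, 1)$ so that $g'(u) > 0$, and the layer widths, random initialization, and all names are hypothetical. It is not trained on any data.

```python
import math
import random

def softplus(v):
    return math.log1p(math.exp(v))

def sigmoid(v):
    # Numerically stable form for both signs of v.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def init_params(widths, seed=0):
    """Random unconstrained parameters for layers widths[0] -> widths[1] -> ..."""
    rng = random.Random(seed)
    layers = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        layers.append({
            "H_raw": [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)],
            "b": [rng.gauss(0, 1) for _ in range(d_out)],
            "a_raw": [rng.gauss(0, 1) for _ in range(d_out)],
        })
    return layers

def cdf(x, layers):
    """Monotone map from a scalar x to (0, 1)."""
    h = [x]
    for index, layer in enumerate(layers):
        # Parameter constraint: softplus makes every weight positive.
        h = [sum(softplus(w) * v for w, v in zip(row, h)) + bias
             for row, bias in zip(layer["H_raw"], layer["b"])]
        if index < len(layers) - 1:
            # g(u) = u + a*tanh(u) with |a| < 1 has derivative 1 + a*(1 - tanh(u)^2) > 0.
            h = [u + math.tanh(a) * math.tanh(u) for u, a in zip(h, layer["a_raw"])]
    # Sigmoid at the last layer.
    return sigmoid(h[0])

def likelihood(y_tilde, layers):
    """Probability mass of the unit-width bin centred on y_tilde."""
    return cdf(y_tilde + 0.5, layers) - cdf(y_tilde - 0.5, layers)

layers = init_params([1, 3, 3, 1])
grid = [i / 4 for i in range(-20, 21)]
values = [cdf(x, layers) for x in grid]
print([round(v, 3) for v in values[::8]])
print(likelihood(0.0, layers))
```

The three cases listed on slide 23 correspond to the sign and size of $a$: for $a > -1$ the activation stays monotone, while $a < -1$ would make $g'(0) = 1 + a$ negative, which is why the constraint on $a$ is needed.
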
  • 28. Variational Auto Encoder: Auto Encoder. $y = g_a(x; \phi_g)$, $\tilde{y} = y + n$, $\tilde{x} = g_s(\tilde{y}; \theta_g)$.
  • 29. Variational Auto Encoder: Variational Auto Encoder. $\tilde{y} \sim p_{\tilde{y} \mid x}(\tilde{y} \mid x)$, $x \sim p_{x \mid \tilde{y}}(x \mid \tilde{y})$.
  • 30. Variational Auto Encoder: Defining Generative Model. $\tilde{y} \sim p_{\tilde{y} \mid x}(\tilde{y} \mid x)$.
  • 31. Variational Auto Encoder: Setting the parametric version of posterior
  • 32. Variational Auto Encoder: Setting KL Divergence
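
The transcript for slides 30 to 32 keeps only the titles, so the following is a sketch of the standard argument rather than a reconstruction of the slides. It assumes a uniform approximate posterior $q$ centred on $y$ and a Gaussian likelihood with variance $(2\lambda)^{-1}$; both choices are assumptions stated here for illustration, and the symbols follow slides 28 and 29.

```latex
\begin{aligned}
\mathbb{E}_{x \sim p_x} D_{\mathrm{KL}}\!\left[ q \,\|\, p_{\tilde{y} \mid x} \right]
&= \mathbb{E}_{x \sim p_x} \mathbb{E}_{\tilde{y} \sim q}
\Big[ \underbrace{\log q(\tilde{y} \mid x)}_{=\,0}
 \underbrace{-\,\log p_{x \mid \tilde{y}}(x \mid \tilde{y})}_{\text{weighted distortion}}
 \underbrace{-\,\log p_{\tilde{y}}(\tilde{y})}_{\text{rate}} \Big] + \text{const}, \\
q(\tilde{y} \mid x)
&= \prod_i U\!\left(\tilde{y}_i \,\middle|\, y_i - \tfrac{1}{2},\, y_i + \tfrac{1}{2}\right),
\qquad y = g_a(x; \phi_g), \\
p_{x \mid \tilde{y}}(x \mid \tilde{y})
&= \mathcal{N}\!\left(x \,\middle|\, \tilde{x},\, (2\lambda)^{-1} I\right),
\qquad \tilde{x} = g_s(\tilde{y}; \theta_g)
\;\Rightarrow\;
-\log p_{x \mid \tilde{y}}(x \mid \tilde{y}) = \lambda \,\|x - \tilde{x}\|_2^2 + \text{const}.
\end{aligned}
```

The first term vanishes because a unit-width uniform density equals 1 on its support. Minimizing the KL divergence is then equivalent, up to constants and the base of the logarithm, to minimizing $L = R + \lambda D$.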
  • 34. Scale Hyperprior: Factorized Prior. Inference Model: $y = g_a(x; \phi_g)$, $\tilde{y} = y + n$. Generative Model: $\tilde{x} = g_s(\tilde{y}; \theta_g)$. Entropy Model: $\mathrm{Enc}[\tilde{y}]$, $\mathrm{Dec}[\tilde{y}]$. $R = -\log_2 p_{\tilde{y}}(\tilde{y})$, $D = \|x - \tilde{x}\|_2^2$, $L = R + \lambda D$.
  • 35. Scale Hyperprior: Scale of Y. Encoding Scale as ‘Additional Information’.
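  • 37. Scale Hyperprior: Factorized Prior. Inference Model: $y = g_a(x; \phi_g)$, $\tilde{y} = y + n$, $z = h_a(y; \phi_h)$, $\tilde{z} = z + n$. Generative Model: $\tilde{x} = g_s(\tilde{y}; \theta_g)$, $\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$. Entropy Model, encoder side: $\mathrm{Enc}[\tilde{z}]$, $\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$, $\mathrm{Enc}[\tilde{y}; \tilde{\sigma}]$. Entropy Model, decoder side: $\mathrm{Dec}[\tilde{z}]$, $\tilde{\sigma} = h_s(\tilde{z}; \theta_h)$, $\mathrm{Dec}[\tilde{y}; \tilde{\sigma}]$. $R = -\log_2 p_{\tilde{z}}(\tilde{z}) - \log_2 p_{\tilde{y} \mid \tilde{z}}(\tilde{y} \mid \tilde{z})$, $D = \|x - \tilde{x}\|_2^2$, $L = R + \lambda D$.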
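
To make the benefit of sending the scale as side information concrete, the sketch below evaluates the conditional rate term $-\log_2 p_{\tilde{y} \mid \tilde{z}}(\tilde{y} \mid \tilde{z})$ for a few latent values. It assumes each latent is modelled as a zero-mean Gaussian with standard deviation $\tilde{\sigma}$ convolved with a unit-width uniform, so the likelihood is a difference of two Gaussian CDF values. That model, the likelihood floor `LIKELIHOOD_FLOOR`, the sample latents, and the scale values are all assumptions for illustration and are not taken from the slides.

```python
import math

LIKELIHOOD_FLOOR = 1e-9  # assumed lower bound that keeps log2 finite

def std_normal_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def likelihood(y, sigma):
    """Mass of the unit-width bin centred on y under N(0, sigma^2).

    This equals the density at y of a zero-mean Gaussian convolved with U(-1/2, 1/2).
    """
    p = std_normal_cdf((y + 0.5) / sigma) - std_normal_cdf((y - 0.5) / sigma)
    return max(p, LIKELIHOOD_FLOOR)

def bits(y, sigma):
    return -math.log2(likelihood(y, sigma))

# Hypothetical latents: two from a flat region, two from a busy region.
latents = [0.0, 1.0, -12.0, 9.0]

# Scales as a hyperprior decoder h_s might predict them (made-up values).
predicted_sigmas = [0.5, 0.5, 8.0, 8.0]

# Baseline: one shared scale for every latent.
shared_sigma = 3.0

adaptive_bits = sum(bits(y, s) for y, s in zip(latents, predicted_sigmas))
shared_bits = sum(bits(y, shared_sigma) for y in latents)

print(f"adaptive scales: {adaptive_bits:.2f} bits")
print(f"shared scale:    {shared_bits:.2f} bits")
```

In this toy setting the per-element scales cost fewer bits for the latents than the single shared scale. The total rate also has to pay for $\tilde{z}$ itself, which this sketch does not model.
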