Machine Learning Based Image
Compression: Ready for Prime Time?
Michael Gormish
Clarifai
May 2019
© 2019 Clarifai
What is Clarifai?
Machine Learning is replacing traditional CV techniques by using examples instead of equations
Image Classification
Image Segmentation
Object Detection
Image Matching
Pose Estimation
What about Image Compression?
Traditional techniques (equations): Difference of Gaussian, Histograms of Oriented Gradients, Approximate Nearest Neighbors, Fundamental Matrix
ML approach (examples): Label the input with the desired output
Traditional Image Compression
Component Transform allows independent channel processing
Decorrelating transform - DCT for JPEG, Wavelet for JPEG 2000
Quantization - Quality Loss/Reduced File size
Entropy Coding - fully lossless
All very carefully selected and hand optimized to work together
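The stages listed above can be sketched on a single 8x8 block. This is a minimal illustration, not JPEG itself: the uniform quantization step of 16, the toy gradient block, and the use of zlib (DEFLATE) as a stand-in for the entropy coder are all assumptions made for the example.

```python
import zlib
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    m = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)        # the DC row uses a different scale
    return m

def encode_block(block, step):
    """Decorrelating transform, then uniform quantization (the lossy step)."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ (block - 128.0) @ d.T
    return np.round(coeffs / step).astype(np.int16)

def decode_block(q, step):
    """Dequantize and invert the transform."""
    d = dct_matrix(q.shape[0])
    return d.T @ (q.astype(np.float64) * step) @ d + 128.0

# Toy input (assumption): a smooth 8x8 gradient block
block = np.add.outer(np.arange(8), np.arange(8)) * 8.0 + 60.0

q = encode_block(block, step=16.0)
payload = zlib.compress(q.tobytes())  # lossless stage, DEFLATE as a stand-in
restored = np.frombuffer(zlib.decompress(payload), dtype=np.int16).reshape(8, 8)
recon = decode_block(restored, step=16.0)
max_error = np.abs(recon - block).max()
print(f"{len(payload)} bytes, max pixel error {max_error:.2f}")
```

Only the quantization step loses information: the entropy-coded coefficients come back bit-exact, and a larger step trades quality for a smaller payload.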
Traditional Image Compression Systems
GIF - string compressor on palettized image samples
PNG - string compressor on palettized or full range pixels
JPEG - block coder, Discrete Cosine Transform, Huffman Coding
JPEG 2000 - Tile coder, Wavelet Transform, Binary Entropy coder
H.264 - video block codec, supports stills
HEVC (H.265) - video block codec, supports stills
BPG - image block coder (HEVC subset)
XVC - proprietary video codec - with indemnification, supports stills
Image Compression is a natural topic for ML
Vector quantization used in late 1980s
Auto-encoders are a common neural net topic
Don’t need to label millions of training samples (desired output = input)
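A minimal sketch of the "desired output = input" idea, assuming a tiny linear auto-encoder trained with plain gradient descent on synthetic data. The sizes (16 values squeezed to a 4-value code), the learning rate, and the step count are illustrative choices, not taken from any of the systems discussed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images" (assumption): 256 samples of 16 values lying in a 4-D subspace
x = rng.normal(size=(256, 4)) @ rng.normal(size=(4, 16))
x /= x.std()

# Encoder squeezes 16 values into a 4-value code; decoder expands it back
w_enc = 0.1 * rng.normal(size=(16, 4))
w_dec = 0.1 * rng.normal(size=(4, 16))

def loss(w_enc, w_dec):
    """Mean squared error between the reconstruction and the input itself."""
    return float(np.mean((x @ w_enc @ w_dec - x) ** 2))

initial_loss = loss(w_enc, w_dec)
lr = 0.1
for _ in range(3000):
    code = x @ w_enc
    err = code @ w_dec - x            # the target is x: no labels required
    grad_dec = code.T @ err * (2.0 / err.size)
    grad_enc = x.T @ (err @ w_dec.T) * (2.0 / err.size)
    w_dec -= lr * grad_dec
    w_enc -= lr * grad_enc
final_loss = loss(w_enc, w_dec)
print(f"reconstruction MSE: {initial_loss:.4f} -> {final_loss:.4f}")
```

The 4-value code plays the role of the compressed representation. A real codec would also quantize and entropy code it, which this sketch leaves out.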
Modern ML methods show much better quality
But would you want these images?
GANs
“Generative Compression” provides really small files
Compression changes content, not “quality”!!
But the system only works on 64x64 images and no code is available
[Figure: example images at compression ratios of 140:1, 708:1, and 1416:1]
[Santurkar, PCS 2018]
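To put the quoted ratios on the bits-per-pixel scale used for the other codecs, here is a small conversion sketch. It assumes the ratios are measured against uncompressed 24-bit RGB, which the slide does not state; the function names are illustrative.

```python
def ratio_to_bpp(ratio, source_bpp=24.0):
    """Bits per pixel after compression, assuming a 24-bit RGB source."""
    return source_bpp / ratio

def compressed_bytes(width, height, ratio, source_bpp=24.0):
    """Compressed size in bytes for a width x height image at the given ratio."""
    return width * height * source_bpp / 8.0 / ratio

for ratio in (140, 708, 1416):
    bpp = ratio_to_bpp(ratio)
    size = compressed_bytes(64, 64, ratio)
    print(f"{ratio:5d}:1  {bpp:.4f} bpp  {size:6.1f} bytes for a 64x64 image")
```

Under that assumption, a 64x64 image at 1416:1 occupies fewer than 9 bytes, which helps explain why the decoder has to generate content rather than just restore detail.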
TensorFlow provides Research models for compression
Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
Excellent Writeup and CODE!!!
Google: Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Troy Chinen, Sung Jin Hwang, Joel Shor, George Toderici
https://arxiv.org/abs/1703.10114
https://github.com/tensorflow/models/tree/master/research/compression/image_encoder
http://download.tensorflow.org/models/compression_residual_gru-2016-08-23.tar.gz
[Johnston CVPR 2018]
JPEG artifacts
ML artifacts
[Figure: sample images at 0.125 bpp, 0.250 bpp, and 0.375 bpp]
ML provides more quality per bit
Peak Signal to Noise Ratio (PSNR) - Higher is better
Structural Similarity (SSIM) - Even bigger win for ML
[Plot: PSNR (quality) versus bits per pixel (bpp) for ML-Compression and JPEG-Compression]
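For reference, the two axes of such a rate-quality plot can be computed as follows. This is a minimal sketch assuming 8-bit images (peak value 255); the function names are illustrative and not taken from any particular library.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")           # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

def bits_per_pixel(num_bytes, width, height):
    """Rate of a compressed file in bits per pixel."""
    return 8.0 * num_bytes / (width * height)

# Toy check: every pixel off by exactly one gray level
original = np.full((16, 16), 100, dtype=np.uint8)
decoded = original + 1
print(f"PSNR: {psnr(original, decoded):.2f} dB")
print(f"rate: {bits_per_pixel(4096, 512, 512)} bpp")
```

A 512x512 image stored in 4096 bytes works out to 0.125 bpp, the lowest rate shown in the artifact examples.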
ML Compression Has Better (and More) Basis Functions
More basis functions match the real world better
Higher level basis functions map to concepts
2-D 8x8 Discrete Cosine Transform
Four Levels of ML basis functions
Computational Cost for More Basis Functions
2D DCT can be done with about 2 multiplies per (grayscale) pixel including quantization
A single 11x11 convolution layer with stride 4 will have about 7.5 multiplies per pixel per channel per kernel
A 4096 input 4096 output FC layer for a 224x224 input image is 326 multiplies per pixel
[Figures: 1D 8-point DCT transform; 1D 8-point 3-layer fully connected network]
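The per-pixel multiply counts can be reproduced with simple arithmetic. This sketch assumes border effects are ignored and counts one multiply per weight; the function names are illustrative. Under this formula the quoted figure of 326 corresponds to a 227x227 input, while a 224x224 input gives about 334.

```python
def conv_multiplies_per_pixel(kernel_size, stride):
    """Multiplies per input pixel, per input channel, per kernel (borders ignored)."""
    return kernel_size ** 2 / stride ** 2

def fc_multiplies_per_pixel(n_in, n_out, width, height):
    """Multiplies of a fully connected layer, spread over the input pixels."""
    return n_in * n_out / (width * height)

conv_cost = conv_multiplies_per_pixel(11, 4)
fc_cost_224 = fc_multiplies_per_pixel(4096, 4096, 224, 224)
fc_cost_227 = fc_multiplies_per_pixel(4096, 4096, 227, 227)
print(f"11x11 conv, stride 4: {conv_cost:.2f} multiplies per pixel")
print(f"4096x4096 FC, 224x224 input: {fc_cost_224:.1f} multiplies per pixel")
print(f"4096x4096 FC, 227x227 input: {fc_cost_227:.1f} multiplies per pixel")
```

Either way, one fully connected layer costs on the order of 150 times the roughly 2 multiplies per pixel of the 2D DCT.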
JPEG 10,000x faster in totally unfair comparison
JPEG using one CPU core, ML code using 4 CPU cores
JPEG has been highly optimized, ML code was for demonstration purposes
Bit rates and quality were not matched
GPU/TPU/ASIC will speed up ML but how much energy savings will there be?
                    JPEG    JPEG 2000    ML based
Encode Time (ms)      26          217      292169
Decode Time (ms)      12          157      208363
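The "10,000x" headline follows directly from these timings. A short sketch of the arithmetic, using only the numbers from this slide (the helper name is illustrative):

```python
# Timings in milliseconds, taken from the timing figures on this slide
encode_ms = {"JPEG": 26, "JPEG 2000": 217, "ML based": 292169}
decode_ms = {"JPEG": 12, "JPEG 2000": 157, "ML based": 208363}

def slowdown(times, baseline="JPEG"):
    """How many times slower each codec is than the baseline codec."""
    return {name: t / times[baseline] for name, t in times.items()}

encode_ratio = slowdown(encode_ms)
decode_ratio = slowdown(decode_ms)
for name in encode_ms:
    print(f"{name:10s} encode {encode_ratio[name]:9.1f}x  decode {decode_ratio[name]:9.1f}x")
```

The ML codec comes out at roughly 11,000x slower than JPEG for encoding and roughly 17,000x slower for decoding, with the caveats listed above still applying.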
CLIC 2018 - Workshop on ML image compression
Ranked by MS-SSIM (different from PSNR)
Top codec is 113x slower than JPEG (xvc and BPG are block coders)
Top codec's decoder is 97 megabytes (a JPEG library might be 48K)
Image Compression is often an interchange format
but…
Training on a different data set means a new model is needed at the decoder
Many papers don’t have source code
Some only work on 64x64 images!!
Often only work at one or two bitrates
Some leave training as an exercise for the reader
What can be done with ML in Embedded Systems?
Embedded Camera? Do Machine Learned post-processing
Embedded Display? Do Machine Learned Preprocessing
ML can be used to preprocess or post process
Machine learned image saliency can guide a traditional block coder (image top, saliency below)
Machine learned post processing can improve block coded images (JPEG decoded left, enhanced right)
[Galteri, ICCV 2017]
Conclusion: Is ML based compression ready for prime time?
Best quality per bit requires ML techniques especially at low bit rates
but
Computation/energy usage, and memory requirements are much higher
and
ML systems lack flexibility and interchange specification
therefore
NO.
Unless you have a special use case, e.g. special-purpose ML hardware is available but bandwidth is very limited (transmission cost/bandwidth)
For Further Info
Papers
Preprocessing: Semantic Perceptual Image Compression using Deep Convolution Networks
https://arxiv.org/abs/1612.08712
Post Processing: Deep Generative Adversarial Compression Artifact Removal
http://openaccess.thecvf.com/content_ICCV_2017/papers/Galteri_Deep_Generative_Adversarial_ICCV_2017_paper.pdf
Evaluation of Compression
Kodak PhotoCD dataset.
http://r0k.us/graphics/kodak
CLIC - Workshop and Challenge on Learned Image Compression
http://www.compression.cc/
MS-SSIM - Multi-Scale Structural Similarity for Image Quality Assessment
http://www.cns.nyu.edu/~zwang/files/papers/msssim.pdf