ROYAL HOLLOWAY UNIVERSITY OF LONDON
JPEG compression
How images are generally compressed using JPEG
Candidate Number: 1600085
Contents
Compression using JPEG
YCbCr colour transform
Down Sampling
Discrete Cosine Transform (DCT)
Discrete Cosine Transform formulae
Quantization
Entropy Coding
Conclusion
Questions not answered in this Project
Bibliography
Compression using JPEG
JPEG is a widely known compression method used to store images efficiently. JPEG reduces the size of the original image
at the cost of image quality; the size is greatly reduced, while the change in quality is almost undetectable by the human
eye. The quality is reduced because some data is discarded and is unrecoverable, which classes JPEG as a lossy
compression method. This differs from lossless compression formats such as PNG, in which no image data is lost. Using
JPEG, images can often be reduced to roughly 5% of their original size, which saves a tremendous amount of storage and
is particularly useful for companies that store huge numbers of images.
JPEG compression procedure:
Original image in RGB → YCbCr colour transform → Down sampling → Discrete cosine transform → Quantization → Entropy encoding → Encoded JPEG image
Original Image in RGB colour space
Images are made up of pixels, and the colour of each pixel in the original image can be represented by a 3-dimensional
vector (R, G, B): the colour is specified by the intensities of red, green and blue. Each intensity ranges from 0 to 255,
so each colour component can be represented as an integer. In a typical natural image there is a significant amount of
correlation between neighbouring pixels, i.e. take a pixel and the pixels around it will be similar. This is a
consequence of the fact that surfaces are smooth. Our aim is to find such redundancies in order to reduce the amount of
data required to represent the image.
YCbCr colour transform
We transform the colour space from RGB to YCbCr, whose components represent luminance (Y), blue chrominance (Cb) and red
chrominance (Cr). Note that YCbCr is not an absolute colour space but rather a way of encoding RGB information. The
transformation matrix which converts RGB to YCbCr is constant and, most importantly, invertible, meaning that we can
transform back to RGB when reconstructing the image.
We split the image into blocks (Figure 1), where each block consists of 8x8 pixels; one of the blocks from Figure 1 is
zoomed in to show an 8x8 pixel block in Figure 2. If the image cannot be divided exactly into 8x8 pixel blocks, we pad it
with extra pixel data.
Figure 1 Figure 2
The pixel in the top left corner of figure 2 has RGB(222,138,123).
Figure 3
Hence (Y, Cb, Cr) = (161.406, -21.67417, 43.21965). We do this for each pixel in the 8x8 block to obtain three 8x8
matrices: one for Y, one for Cb and one for Cr.
Figure 4 Luminosity (Y) Figure 5 Chrominance (Cb) Figure 6 Chrominance (Cr)
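The conversion above can be sketched in Python. The report's transformation matrix figure is not reproduced here, so the coefficients below are an assumption: they are the commonly used JFIF-style values (without the usual +128 chroma offset), rounded so that they reproduce the figures quoted in the text to three decimal places.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr), without the +128 chroma offset."""
    y  =  0.299   * r + 0.587   * g + 0.114   * b   # luminance
    cb = -0.16874 * r - 0.33126 * g + 0.5     * b   # blue chrominance
    cr =  0.5     * r - 0.41869 * g - 0.08131 * b   # red chrominance
    return y, cb, cr

# The top-left pixel of Figure 2, RGB (222, 138, 123):
y, cb, cr = rgb_to_ycbcr(222, 138, 123)
```

Applying the inverse matrix to (y, cb, cr) recovers the original RGB values, which is what makes this stage lossless on its own.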
Down Sampling
The human eye is more sensitive to luminance than to chrominance. The image can therefore be down sampled by assuming
the chrominance values are constant on each 2x2 block within our 8x8 block, so fewer values need to be recorded. Each
block is encoded 'almost' independently, so we will assume for now that each 8x8 block is encoded independently. Down
sampling reduces the data but also reduces the quality of the image. Most software uses a down sampling factor of two,
i.e. assumes each 2x2 block is constant (4x less colour data), although this can be increased.
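As a sketch of this step, assuming the constant value for each 2x2 block is taken to be its average (averaging is one common choice; some encoders simply keep the top-left sample):

```python
def downsample_2x2(chroma):
    """Average each 2x2 block of an 8x8 chroma matrix, giving a 4x4 matrix."""
    out = []
    for i in range(0, 8, 2):
        row = []
        for j in range(0, 8, 2):
            total = (chroma[i][j] + chroma[i][j + 1] +
                     chroma[i + 1][j] + chroma[i + 1][j + 1])
            row.append(total / 4)
        out.append(row)
    return out
```

The 64 chrominance samples become 16, and this is done for Cb and Cr but not for Y.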
Discrete Cosine Transform (DCT): There are many types of DCT, but for JPEG, DCT-II is the most commonly used. The
main idea of the DCT is to represent the data of each 8x8 pixel block as a sum of cosine functions. Each 8x8 pixel block
is encoded separately with its own discrete cosine transform. An 8x8 block contains 64 pixels, so 64 cosine basis
functions are enough to replicate it exactly. This is true for all three of our components Y, Cb and Cr; from here on we
will discuss luminance (Y), but Cb and Cr are treated similarly.
What we are essentially doing is representing the image data in terms of cosine waves: by adding cosine waves of
different frequencies we can match the shape of the wave formed by our data.
Figure 7
Red is cos(x), blue is cos(2x) and the black wave is (1/2)cos(x) + (1/2)cos(2x). If we simply added cos(x) and cos(2x)
we would obtain a wave which goes above 1 and below -1, so we take an average (mean) to keep an appropriate range. In
fact we can take a weighted average of cosine waves in order of importance, e.g. (3/4)cos(2x) + (1/4)cos(x), whose
resulting wave resembles cos(2x) more closely. The more cosine waves we have, the more shapes we can make and hence the
better the approximation of our image data. In our case, we use all 64 cosine functions to represent a block.
Figure 8
Every 8x8 block is a linear combination of these 64 patterns, and the DCT computes the weight of each pattern in that
combination. The patterns are called two-dimensional DCT basis functions, and the output values are called transform
coefficients. The top-left region shows low-frequency cosine waves and the bottom-right represents higher-frequency
cosine waves.
Figure 9
Luminance values range from 0 to 255, just like RGB. Figure 9 shows the matrix for the luminance component of a certain
8x8 block. Before computing the DCT coefficients, the values must be centred around zero. This is done by subtracting
128 from each element of the matrix in Figure 9, which gives the modified range [-128, 127].
Figure 10
Discrete Cosine Transform formulae:

G(u,v) = (1/4) α(u) α(v) Σ_{x=0}^{n-1} Σ_{y=0}^{n-1} g(x,y) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

This is the formula for an n x n pixel block (the 1/4 normalisation factor shown corresponds to the 8x8 case); for the
8x8 pixel blocks used in JPEG, n = 8. G(u,v) is the DCT coefficient at coordinates (u,v) in the 8x8 matrix and g(x,y)
is the centred pixel value at (x,y). u is the horizontal spatial frequency with integer values 0 ≤ u ≤ 7, and v is the
vertical spatial frequency with integer values 0 ≤ v ≤ 7.
Here α(u) = 1/√2 when u = 0 and α(u) = 1 otherwise; α(v) is defined similarly.
Below is the calculation of the first entry G(0,0) of the DCT matrix:

G(0,0) = (1/4) · (1/√2) · (1/√2) · Σ_{x=0}^{7} Σ_{y=0}^{7} g(x,y) cos(0) cos(0) = (1/8) Σ_{x=0}^{7} Σ_{y=0}^{7} g(x,y)

Calculating the above for all x and y we obtain:
Figure 11: In this case we sum all the elements in matrix g since cos(0) = 1.
Hence the first entry of the DCT matrix is -415.38, rounded to 2 d.p. Calculating the values for the rest of the matrix gives:
Figure 12
G(0,0) is usually much larger in magnitude than the others, since it represents the average intensity of the 8x8 block;
it is called the DC coefficient. Note that the bottom-right region has numbers of low magnitude compared to the top-left
region. This shows that the high-frequency cosine waves contribute little and have very subtle effects on the output
pixel data. This tendency to gather most of the signal in the top-left corner is one of the main advantages of DCT-II.
Discarding the high-frequency data is the job of the next stage, quantization.
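The calculation above can be checked numerically. The sketch below implements the 8x8 DCT-II formula directly; the sample block is assumed (the report's Figure 9 is not reproduced), taken from the widely circulated worked JPEG example, which matches the DC value -415.38 quoted in the text.

```python
from math import cos, pi, sqrt

# Assumed 8x8 luminance block (reproduces the DC coefficient -415.38 from the text).
g = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 55, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]

def alpha(u):
    """Normalisation factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1 / sqrt(2) if u == 0 else 1.0

def dct_8x8(block):
    """2-D DCT-II of an 8x8 block, after centring values around zero."""
    centred = [[v - 128 for v in row] for row in block]
    G = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(centred[x][y]
                    * cos((2 * x + 1) * u * pi / 16)
                    * cos((2 * y + 1) * v * pi / 16)
                    for x in range(8) for y in range(8))
            G[u][v] = 0.25 * alpha(u) * alpha(v) * s
    return G
```

Running `dct_8x8(g)` gives G(0,0) ≈ -415.38 and small high-frequency coefficients towards the bottom right, matching the behaviour described in the text.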
Quantization:
The human eye is better at seeing small differences in brightness than at judging the exact strength of a high-frequency
brightness variation. Because of this, we can reduce the amount of information by discarding the high-frequency
components. We do this by dividing each value G(i,j) of the DCT matrix by the corresponding value Q(i,j) in our
quantization matrix and rounding to the nearest integer.
Figure 13
Figure 13 shows a commonly used quantization matrix. Dividing the elements of the DCT coefficient matrix by the
corresponding elements of the quantization matrix and rounding to the nearest integer gives:
Figure 14 This is quantized DCT coefficient matrix
The first element is obtained by -415.38 / 16 = -25.96, which rounds to -26; by comparison, the last element is
1.68 / 99 = 0.017, which rounds to 0.
The elements of this matrix represent our 8x8 block. We now have a long run of 0s and a few significant values in the
top-left region. This saves a lot of space, since we can now use Huffman encoding.
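The quantization step can be sketched as follows. The matrix Q below is an assumption standing in for the missing Figure 13: it is the standard luminance quantization matrix whose corner entries 16 and 99 match the divisions shown in the text.

```python
# Standard luminance quantization matrix (assumed to be the one in Figure 13).
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(G, Q):
    """Divide each DCT coefficient by the matching quantizer entry and round."""
    return [[round(G[i][j] / Q[i][j]) for j in range(8)] for i in range(8)]
```

With G(0,0) = -415.38 this gives -26, and the small high-frequency coefficients all round to 0, producing the long runs of zeros exploited by the entropy coder.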
Entropy Coding:
This is a form of lossless data compression. The elements of the quantized DCT coefficient matrix are rearranged in a
zigzag pattern, as shown in Figure 15. This produces the longest runs of 0s, allowing us to use run-length encoding
(RLE); after RLE we can use Huffman encoding to store or send the image data.
Figure 15
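The zigzag traversal of Figure 15 visits the matrix diagonal by diagonal, alternating direction so that low-frequency coefficients come first. A sketch of generating that order:

```python
def zigzag_order(n=8):
    """Return the (row, col) indices of an n x n block in JPEG zigzag order."""
    indices = [(i, j) for i in range(n) for j in range(n)]
    # Sort by anti-diagonal i + j; within a diagonal, alternate the direction.
    return sorted(indices, key=lambda p: (p[0] + p[1],
                                          p[0] if (p[0] + p[1]) % 2 else -p[0]))

order = zigzag_order()
```

Reading the quantized matrix B in this order produces the string of coefficients discussed below.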
The DC coefficient B(0,0) is stored separately and is hence excluded from the string. From matrix B we obtain the string:
-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, followed by 38 zeros.
The Huffman algorithm gives the optimal codeword length for each symbol according to its frequency. However, if many
symbols occur, we must write the codeword for each symbol as it appears.
Using Huffman algorithm on data we obtain the following associated codewords:
Symbol Frequency Codeword
0 44 1
-1 4 010
1 5 001
2 3 0111
-3 3 0110
-6 1 00011
-2 1 00010
5 1 00001
-4 1 00000
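The codewords above can be reproduced, up to tie-breaking choices (which may swap codes of equal length but not their lengths), with a standard heap-based Huffman construction; a sketch:

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free code from a {symbol: frequency} map."""
    # Heap entries: (frequency, tie-breaker, {symbol: codeword-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Frequencies from the table above.
freqs = {0: 44, -1: 4, 1: 5, 2: 3, -3: 3, -6: 1, -2: 1, 5: 1, -4: 1}
codes = huffman_codes(freqs)
```

The resulting codeword lengths match the table: a 1-bit code for the very frequent 0, and longer codes for the rare symbols.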
Encoded string is:
011010110000100001101110000000100100001001011101000101001111111101001000000000000000000000000000
000000000000.
Our encoded string is 108 bits long. The Huffman algorithm gives the optimal codeword length for each symbol according
to its frequency. However, this is not very efficient, in the sense that our original string is 64 characters long and
we must write the codeword for every character as it appears in the string. We can be more efficient by using a simple
lossless data
compression technique called run-length encoding (RLE) before applying Huffman, in order to reduce the number of
characters to be encoded.
Definition (Run): An element appearing more than once consecutively in a string is called a run, e.g. 0 appears five
times consecutively after the symbol 1 in the string 010000010, so we call it a run of 0s.
Definition (Run-length encoding): A lossless data compression method where a run of data is stored as a data value and
its count, e.g. 010000010 is stored as 01(0,5)10.
We use Run-length encoding for our original string obtained from matrix B using zigzag pattern.
Original string: -3 0 -3 -2 -6 2 -4 1 -3 1 1 5 1 2 -1 1 -1 2 0 0 0 0 0 -1 -1 followed by 38 zeros
Encoded string using RLE: -3 0 -3 -2 -6 2 -4 1 -3 (1,2) 5 1 2 -1 1 -1 2 (0,5) (-1,2) (0,38)
Note that we only use RLE for elements appearing two or more times consecutively. We can now use Huffman encoding to
encode our string.
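This run-length scheme, in which only runs of length two or more are collapsed into (value, count) pairs, can be sketched as:

```python
from itertools import groupby

def rle(seq):
    """Collapse consecutive repeats into (value, count) pairs; singletons stay bare."""
    out = []
    for value, group in groupby(seq):
        count = len(list(group))
        out.append((value, count) if count > 1 else value)
    return out

data = ([-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2, -1, 1, -1, 2]
        + [0] * 5 + [-1, -1] + [0] * 38)
encoded = rle(data)
```

Applied to the original string above, this yields exactly the RLE string shown in the text, which is then Huffman encoded.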
Figure 16
Using the code words, our encoded string is: 1101110101000111111100011111010101011111011010
0001110001100011001000000
Note that this encoded string is only 72 bits long, much shorter than the 108 bits above. This is the encoded string for
the luminance component of our 8x8 pixel block that we store. If the image were divided into n blocks, we would send
3n different encoded strings, since each 8x8 pixel block has three 8x8 matrices: Y, Cb and Cr.
Conclusion
This completes the general procedure for JPEG compression. Different software may vary each stage, e.g. a higher down
sampling ratio for chrominance, a different quantization matrix, or a different lossless encoding method for the entropy
coding stage, along with other minor changes to achieve the required size or quality; the general idea, however, remains
the same. Each stage is inverted during decoding in order to reconstruct the image, but some data is lost permanently
and the quality of the image may be lowered. In most cases, though, the human eye cannot distinguish a JPEG from the
original image.
Questions not answered in this Project:
1. How is the 3x3 matrix for RGB to YCbCr derived, and why are there different variations of these matrices?
2. How is the quantization matrix derived? What is the optimal quantizer?
3. How is the DCT formula derived?
4. There are many other transforms, such as the Karhunen–Loève transform and the discrete Fourier transform.
Why use DCT-II?
The Karhunen–Loève transform (KLT) minimizes the total mean square error for the pixels; in fact, it gives the optimal
error. However, the KLT is not used in practice, since its coefficient matrix is not constant but image dependent, which
costs too much and is computationally slow. In fact, for certain types of images the DCT coincides with the
Karhunen–Loève transform. The DCT also assumes that neighbouring pixels are similar, which is a reasonable assumption
since natural images are smooth and pixels are highly correlated. The discrete cosine transform is suboptimal, but it is
very fast and efficient. However, more research is needed to answer this question in more depth.
Bibliography:
[1] David Austin, Image Compression: Seeing What's Not There [online]. Grand Valley State University [viewed 08 Jan 2016]. Available from: http://www.ams.org/samplings/feature-column/fcarc-image-compression
[2] Randell Heyman, How JPEG works. 23 Jan 2015 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=f2odrCGjOFY
[3] Mikulic, Discrete Cosine Transform. 01 Sept 2001 [viewed 04 Jan 2016]. Available from: https://unix4lyfe.org/dct/
[4] JPEG: Wikipedia. 08 Jan 2016 [viewed 06 Jan 2016]. Available from: https://en.wikipedia.org/wiki/JPEG#Discrete_cosine_transform
[5] Discrete Cosine Transform: Wikipedia. 20 Dec 2015 [viewed 04 Jan 2016]. Available from: https://en.wikipedia.org/wiki/Discrete_cosine_transform
[6] Dheera Venkatraman, Online plotting tool. Available from: http://fooplot.com/#W3sidHlwZSI6MTAwMH1d
[7] Timur, Huffman coding calculator. Available from: http://planetcalc.com/2481/
[8] JPEG 'files' & Colour (JPEG Pt1): Computerphile. 21 Apr 2015 [viewed 28 Dec 2015]. Available from: https://www.youtube.com/watch?v=n_uNPbdenRs
[9] JPEG DCT, Discrete Cosine Transform (JPEG Pt2): Computerphile. 22 May 2015 [viewed 28 Dec 2015]. Available from: https://www.youtube.com/watch?v=Q2aEzeMDHMA
[10] Digital image processing: p010 – The Discrete Cosine Transform (DCT): Alireza Saberi. 15 March 2013 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=_bltj_7Ne2c
[11] Digital image processing: p009 JPEGs 8x8 blocks: Alireza Saberi. 15 March 2013 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=pZuaOjfsv0Y
[12] Run-length encoding: Wikipedia. 07 Dec 2015 [viewed 08 Jan 2016]. Available from: https://en.wikipedia.org/wiki/Run-length_encoding

More Related Content

What's hot

Os Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual MemoryOs Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual Memorysgpraju
 
DBMS architecture &; system structure
DBMS architecture &; system  structureDBMS architecture &; system  structure
DBMS architecture &; system structureRUpaliLohar
 
Data base management system and Architecture ppt.
Data base management system and Architecture ppt.Data base management system and Architecture ppt.
Data base management system and Architecture ppt.AnkitAbhilashSwain
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxRadhika R
 
Introduction to files and db systems 1.0
Introduction to files and db systems 1.0Introduction to files and db systems 1.0
Introduction to files and db systems 1.0Dr. C.V. Suresh Babu
 
Rdbms
RdbmsRdbms
Rdbmsrdbms
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
 
Memory Management & Garbage Collection
Memory Management & Garbage CollectionMemory Management & Garbage Collection
Memory Management & Garbage CollectionAbhishek Sur
 
DATA WRANGLING presentation.pptx
DATA WRANGLING presentation.pptxDATA WRANGLING presentation.pptx
DATA WRANGLING presentation.pptxAbdullahAbbasi55
 
Generic types and collections GUIs.pptx
Generic types and collections GUIs.pptxGeneric types and collections GUIs.pptx
Generic types and collections GUIs.pptxAvirup Pal
 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)Ravinder Kamboj
 
Text clustering
Text clusteringText clustering
Text clusteringKU Leuven
 
Introduction & history of dbms
Introduction & history of dbmsIntroduction & history of dbms
Introduction & history of dbmssethu pm
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMSkoolkampus
 

What's hot (20)

Os Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual MemoryOs Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual Memory
 
DBMS architecture &; system structure
DBMS architecture &; system  structureDBMS architecture &; system  structure
DBMS architecture &; system structure
 
Data base management system and Architecture ppt.
Data base management system and Architecture ppt.Data base management system and Architecture ppt.
Data base management system and Architecture ppt.
 
Final exam in advance dbms
Final exam in advance dbmsFinal exam in advance dbms
Final exam in advance dbms
 
Design approach
Design approachDesign approach
Design approach
 
Segmentation in operating systems
Segmentation in operating systemsSegmentation in operating systems
Segmentation in operating systems
 
Nosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptxNosql-Module 1 PPT.pptx
Nosql-Module 1 PPT.pptx
 
Introduction to files and db systems 1.0
Introduction to files and db systems 1.0Introduction to files and db systems 1.0
Introduction to files and db systems 1.0
 
Rdbms
RdbmsRdbms
Rdbms
 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
 
Memory Management & Garbage Collection
Memory Management & Garbage CollectionMemory Management & Garbage Collection
Memory Management & Garbage Collection
 
Data model
Data modelData model
Data model
 
DATA WRANGLING presentation.pptx
DATA WRANGLING presentation.pptxDATA WRANGLING presentation.pptx
DATA WRANGLING presentation.pptx
 
Generic types and collections GUIs.pptx
Generic types and collections GUIs.pptxGeneric types and collections GUIs.pptx
Generic types and collections GUIs.pptx
 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)
 
Text clustering
Text clusteringText clustering
Text clustering
 
Memory Management
Memory ManagementMemory Management
Memory Management
 
Introduction & history of dbms
Introduction & history of dbmsIntroduction & history of dbms
Introduction & history of dbms
 
Data Warehousing ppt
Data Warehousing pptData Warehousing ppt
Data Warehousing ppt
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMS
 

Similar to Compression using JPEG

image compression in data compression
image compression in data compressionimage compression in data compression
image compression in data compressionZaabir Ali
 
Tchebichef moment based hilbert scan for image compression
Tchebichef moment based hilbert scan for image compressionTchebichef moment based hilbert scan for image compression
Tchebichef moment based hilbert scan for image compressionAlexander Decker
 
Image compression- JPEG Compression & its Modes
Image compression- JPEG Compression & its ModesImage compression- JPEG Compression & its Modes
Image compression- JPEG Compression & its Modeskanimozhirajasekaren
 
M4L1.ppt
M4L1.pptM4L1.ppt
M4L1.pptdudoo1
 
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...VLSICS Design
 
Lec_2_Digital Image Fundamentals.pdf
Lec_2_Digital Image Fundamentals.pdfLec_2_Digital Image Fundamentals.pdf
Lec_2_Digital Image Fundamentals.pdfnagwaAboElenein
 
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...VLSICS Design
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Image processing
Image processingImage processing
Image processingmaheshpene
 
Multimedia image compression standards
Multimedia image compression standardsMultimedia image compression standards
Multimedia image compression standardsMazin Alwaaly
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
CyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdfCyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdfMohammadAzreeYahaya
 
Compression: Images (JPEG)
Compression: Images (JPEG)Compression: Images (JPEG)
Compression: Images (JPEG)danishrafiq
 

Similar to Compression using JPEG (20)

Image compression Algorithms
Image compression AlgorithmsImage compression Algorithms
Image compression Algorithms
 
Jpeg compression
Jpeg compressionJpeg compression
Jpeg compression
 
image compression in data compression
image compression in data compressionimage compression in data compression
image compression in data compression
 
Tchebichef moment based hilbert scan for image compression
Tchebichef moment based hilbert scan for image compressionTchebichef moment based hilbert scan for image compression
Tchebichef moment based hilbert scan for image compression
 
B070306010
B070306010B070306010
B070306010
 
Image compression- JPEG Compression & its Modes
Image compression- JPEG Compression & its ModesImage compression- JPEG Compression & its Modes
Image compression- JPEG Compression & its Modes
 
M4L1.ppt
M4L1.pptM4L1.ppt
M4L1.ppt
 
JPEG Image Compression
JPEG Image CompressionJPEG Image Compression
JPEG Image Compression
 
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
Pipelined Architecture of 2D-DCT, Quantization and ZigZag Process for JPEG Im...
 
Lec_2_Digital Image Fundamentals.pdf
Lec_2_Digital Image Fundamentals.pdfLec_2_Digital Image Fundamentals.pdf
Lec_2_Digital Image Fundamentals.pdf
 
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Image processing
Image processingImage processing
Image processing
 
JFEF encoding
JFEF encodingJFEF encoding
JFEF encoding
 
Multimedia image compression standards
Multimedia image compression standardsMultimedia image compression standards
Multimedia image compression standards
 
Jpeg
JpegJpeg
Jpeg
 
Algorithm
AlgorithmAlgorithm
Algorithm
 
International Journal of Engineering Research and Development (IJERD)
 International Journal of Engineering Research and Development (IJERD) International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
CyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdfCyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdf
 
Compression: Images (JPEG)
Compression: Images (JPEG)Compression: Images (JPEG)
Compression: Images (JPEG)
 

Compression using JPEG

  • 1. ROYAL HOLLOWAY UNIVERISTY OF LONDON JPEG compression How images are generally compressed using JPEG Candidate Number: 1600085 Contents Compression using JPEG .....................................................................................................................................................1 YCbCr colour transform........................................................................................................................................................1 Down Sampling ...................................................................................................................................................................2 Discrete Cosine Transform (DCT):.......................................................................................................................................2 Discrete Cosine Transform formulae:.............................................................................................................................4 Quantization: ......................................................................................................................................................................5 Entropy Coding:...................................................................................................................................................................6 Conclusion...........................................................................................................................................................................8 Questions not answered in this Project:.............................................................................................................................8 Bibliography:.......................................................................................................................................................................9
  • 2. 1 Candidate number: 1600085 Compression using JPEG JPEG is a widely known compression method used to store images efficiently. JPEG reduces the size of the original image at the cost of image quality. The size is greatly reduced and the change in quality is almost undetectable by human eye. The quality of image is reduced because some data is discarded and is unrecoverable which classes JPEG as a lossy compression method. This is of course different from the lossless data compression methods such as PNG in which there is no loss of image data. By using JPEG, images can be reduced to roughly 5% of the normal size which saves tremendous amount of storage and is particularly useful for companies that store huge amounts of images. JPEG Compression procedure: Original image in RGB YCbCr colour transform Down Sampling Discrete Cosine Transform Quantization Entropy Encoding Encoded JPEG image Original Image in RGB colour space Images are made up of pixels and the colour of each pixel in the original image can be represented by 3-dimensional vector (R,G,B). The colour of each pixel can be specified using intensities of red, green and blue. The intensity of each colour varies from 0 to 255. Hence each color component can be represented as an integer. In a typical natural image, there is a significant amount of correlation between these components i.e. take a pixel and the pixels around this pixel will be similar. It is consequence of the fast that surfaces are smooth. Our aim is to find redundancies in order to reduce the amount of data required to represent the image. YCbCr colour transform We use colour space transform from RGB to YCbCr whose vector components represent luminance (Y), blue chrominance (Cb) and red chrominance (Cr). Note that YCbCr is not a colour space but rather a way of encoding RGB information. Below is the transformation matrix which converts RGB to YCbCr. 
The matrix is constant and most importantly invertible meaning that we can transform back to RGB when reconstructing the image. We split the image into blocks (Figure 1) where each block consists of 8x8 pixels, one of the blocks from figure 1 is zoomed in to show 8x8 pixel block in Figure 2. If the image cannot be divided exactly into 8x8 pixel blocks then we add extra information. Figure 1 Figure 2
  • 3. 2 Candidate number: 1600085 The pixel in the top left corner of figure 2 has RGB(222,138,123). Figure 3 Hence (Y,Cb,Cr) = (161.406, -21.67417, 43.21965). We do this for each pixel in 8x8 block to obtain three 8x8 matrices one Y component, one for Cb and one for Cr. Figure 4 Luminosity (Y) Figure 5 Chrominance (Cb) Figure 6 Chrominance (Cr) Down Sampling Human eye is more perceptible to luminance compared to chrominance. Therefore image can be down sampled by assuming the chrominance values to be constant on 2x2 block in our 8x8 block hence recording few values Each block is encoded ‘almost’ independently hence we will assume for now that each 8x8 block is encoded independently. Down sampling reduces the data but also reduces the quality of the image. Most software use down sampling of two i.e. assume 2x2 block is constant (4x less colour), however this can be increased. Discrete Cosine Transform (DCT): There are many types of DCT but for JPEG, DCT-II is used most commonly. The main idea of DCT is to represent data of 8x8 pixel blocks as the sum of cosine functions. Each of the 8x8 pixel blocks are separately encoded with its own discrete cosine transform. Each of the 8x8 blocks can be exactly replicated, hence we have 64 cosine waves. This is true for all three of our components Y, Cb and Cr. From here on, we’ll talk about luminance (Y) but Cb and Cr are similar. What we are essentially trying to do is represent image data in terms of cosine waves. We can add different frequencies of cosine waves in order to get the shape of the wave of our data.
Figure 7. Red is cos(x), blue is cos(2x) and the black wave is (½)cos(x) + (½)cos(2x).
If we simply added cos(x) and cos(2x) we would get a wave which goes above 1 and below -1, so we take an average (mean) to keep the appropriate range. In fact, we can take a weighted average of cosine waves in order of importance, e.g. (¾)cos(2x) + (¼)cos(x), and the resulting wave will resemble cos(2x) more closely. The more cosine waves we have, the more possible shapes we can make and hence the better the approximation of our image data. In our case, we use all 64 cosine functions to represent a block.
Figure 8
Every 8x8 block is a linear combination of these 64 patterns, and it is the DCT that computes the combination. The patterns are called two-dimensional DCT basis functions, and the output values are called transform coefficients. The top left region shows low frequency cosine waves and the bottom right represents higher frequency cosine waves.
Figure 9
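The claim about weighted averages is easy to check numerically. A small sketch (the function name `blend` is illustrative, not part of JPEG):

```python
import math

def blend(x, w=0.75):
    """Weighted average of two cosine waves: w*cos(2x) + (1-w)*cos(x).
    Since the weights are non-negative and sum to 1, the result stays
    within [-1, 1] and resembles cos(2x) more than cos(x) for w > 0.5."""
    return w * math.cos(2 * x) + (1 - w) * math.cos(x)

# Sample one full period; every sample stays inside [-1, 1].
samples = [blend(k * 2 * math.pi / 1000) for k in range(1000)]
```

By contrast, the plain sum cos(x) + cos(2x) reaches 2 at x = 0, which is why the averaging is needed.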
Luminance values range from 0 to 255, just like RGB. Figure 9 shows the luminosity matrix of a certain 8x8 block. Before computing the DCT coefficients, the values must be centred around zero. This is done by subtracting 128 from each element of the matrix in Figure 9, giving the modified range [-128, 127].
Figure 10

Discrete Cosine Transform formula:

G(u,v) = (1/4) α(u) α(v) Σ_{x=0}^{n-1} Σ_{y=0}^{n-1} g(x,y) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

This is the general formula for an n x n pixel block; for an 8x8 pixel block, n = 8. G(u,v) is the DCT coefficient at coordinates (u,v) in the 8x8 matrix, where u is the horizontal spatial frequency, with integer values 0 ≤ u ≤ 7, and v is the vertical spatial frequency, with integer values 0 ≤ v ≤ 7. g(x,y) is the centred pixel value at coordinates (x,y). The normalisation factor α(u) equals 1/√2 for u = 0 and 1 otherwise, and similarly for α(v).

Below is the calculation of the first entry G(0,0) of the DCT matrix:

G(0,0) = (1/4) · (1/√2) · (1/√2) · Σ_{x=0}^{7} Σ_{y=0}^{7} g(x,y) cos(0) cos(0) = (1/8) Σ_{x=0}^{7} Σ_{y=0}^{7} g(x,y)

Calculating the above for all x and y we obtain:
Figure 11. In this case we sum all the elements of the matrix g, since cos(0) = 1. Hence the first entry of the DCT matrix is -415.38, rounded to 2 d.p. Calculating the values for the rest of the matrix gives:
Figure 12
G(0,0) is usually much larger in magnitude than the other entries, since it represents the general intensity of the 8x8 block; it is called the DC coefficient. Note that the bottom right region contains numbers of low magnitude compared to the top left region. This shows that the high frequency cosine waves contribute little and have only very subtle effects on the output pixel data. This tendency to gather most of the signal in the top left corner is one of the main advantages of using DCT-II. The high frequency data is discarded in the next stage, quantization.

Quantization:
The human eye is better at seeing small differences in brightness over a large area than at judging the exact strength of a high frequency brightness variation. Because of this, we can reduce the amount of information by getting rid of the high frequency components. We do this by dividing each value G(i,j) of the DCT matrix by the corresponding value Q(i,j) of a quantization matrix.
Figure 13
Figure 13 shows a commonly used quantization matrix. Dividing the elements of the DCT coefficient matrix by the corresponding elements of the quantization matrix and rounding to the nearest integer gives:
Figure 14
This is the quantized DCT coefficient matrix. The first element is obtained from -415.38 / 16 = -25.96, which rounds to -26; by comparison, the last element is 1.68 / 99 = 0.017, which rounds to 0.
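Both the transform and the quantization step can be sketched directly from the definitions above. This is a plain O(n⁴) implementation with illustrative names; the quantization table is the widely used luminance table from Annex K of the JPEG standard, whose corner entries 16 and 99 match the divisions -415.38/16 and 1.68/99 in the example:

```python
import math

def alpha(u):
    """Normalisation factor from the DCT formula: 1/sqrt(2) for u = 0, else 1."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct2(g):
    """Direct 2-D DCT-II of an n x n block g (values already centred on zero).
    The 1/4 factor matches the 8x8 case used by JPEG."""
    n = len(g)
    G = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (g[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            G[u][v] = 0.25 * alpha(u) * alpha(v) * s
    return G

# Luminance quantization table from Annex K of the JPEG standard.
QUANT_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(G, Q=QUANT_LUMA):
    """Divide each DCT coefficient by its quantization step and round
    to the nearest integer; most high-frequency entries become 0."""
    n = len(G)
    return [[int(round(G[i][j] / Q[i][j])) for j in range(n)] for i in range(n)]
```

For a constant block, all the signal lands in G(0,0) and every other coefficient is zero, which is the extreme case of the "energy gathers in the top left corner" behaviour described above.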
The elements in this matrix represent our 8x8 block. We now have a long run of 0s and a handful of non-zero values in the top left region. This saves a lot of space, since we can now use Huffman encoding.

Entropy Coding:
This is a form of lossless data compression. The quantized DCT coefficients are first rearranged into the zigzag pattern shown in figure 15, which produces the longest possible runs of 0s and so allows Run-Length encoding (RLE). After RLE we can use Huffman encoding to store or send the image data.
Figure 15
The DC coefficient B(0,0) is stored separately and is therefore excluded from the string. From matrix B, the zigzag scan gives the sequence:
-3, 0, -3, -2, -6, 2, -4, 1, -3, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, followed by 38 0s.
The Huffman algorithm gives the optimal codeword length for each symbol according to its frequency. Applying it to this data we obtain the following codewords:
Symbol Frequency Codeword
0 44 1
-1 4 010
1 5 001
2 3 0111
-3 3 0110
-6 1 00011
-2 1 00010
5 1 00001
-4 1 00000
Encoded string:
011010110000100001101110000000101100010010000100101110100010100111111111010010111111111111111111111111111111111111111111
Our encoded string is 115 bits long (44×1 + 5×3 + 4×3 + 3×4 + 3×4 + 4×5 = 115). However, this is not very efficient: the sequence contains 63 symbols (the 64 coefficients minus the DC coefficient) and we must write a codeword for every symbol as it appears. We can be more efficient by using a simple lossless data
compression technique called Run-length encoding (RLE) before applying Huffman coding, in order to reduce the number of symbols to be encoded.

Definition (Run): An element appearing more than once consecutively in a string is called a run, e.g. in the string 010000010 the symbol 0 appears five times consecutively after the symbol 1, so we call this a run of 0s.

Definition (Run-Length encoding): A lossless data compression method in which a run is stored as a data value together with its count, e.g. 010000010 is stored as 01(0,5)10.

We apply Run-length encoding to the sequence obtained from matrix B via the zigzag pattern.
Original sequence: -3 0 -3 -2 -6 2 -4 1 -3 1 1 5 1 2 -1 1 -1 2 0 0 0 0 0 -1 -1 followed by 38 0s
Encoded using RLE: -3 0 -3 -2 -6 2 -4 1 -3 (1,2) 5 1 2 -1 1 -1 2 (0,5) (-1,2) (0,38)
Note that we only use RLE for elements appearing twice or more consecutively. We can now use Huffman encoding to encode this shorter sequence.
Figure 16
Using the codewords, our encoded string is: 11011101010001111111000111110101010111110110100001110001100011001000000
Note that this encoded string is only 72 bits long, much shorter than encoding every symbol of the full sequence separately. This is the encoded string for the luminance component of our 8x8 pixel block that we store. If the image is divided into n blocks, we store 3n encoded strings, since each 8x8 pixel block has three 8x8 matrices: Y, Cb and Cr.
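The zigzag scan, the run-length convention just defined, and the per-symbol Huffman lookup can be sketched as follows (the function names are my own; the codeword table is the one tabulated on the previous page):

```python
def zigzag_indices(n=8):
    """Visit the (row, col) positions of an n x n block in JPEG zigzag order."""
    order = []
    for s in range(2 * n - 1):          # each s = row + col is one anti-diagonal
        if s % 2 == 0:
            rows = range(min(s, n - 1), max(0, s - n + 1) - 1, -1)
        else:
            rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        for i in rows:
            order.append((i, s - i))
    return order

def rle(seq):
    """Run-length encode: runs of length >= 2 become (value, count) pairs;
    single occurrences are kept as plain values, as in the text above."""
    out, i = [], 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        out.append((seq[i], j - i) if j - i >= 2 else seq[i])
        i = j
    return out

# Huffman codeword table from the previous page (DC coefficient excluded).
CODES = {0: "1", -1: "010", 1: "001", 2: "0111", -3: "0110",
         -6: "00011", -2: "00010", 5: "00001", -4: "00000"}

def huffman_encode(seq, codes=CODES):
    """Concatenate the codeword of each symbol into one bit string."""
    return "".join(codes[s] for s in seq)
```

Running `rle` on the 63-coefficient zigzag sequence of matrix B reproduces the RLE string shown above.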
Conclusion
This completes the general procedure for JPEG compression. Different software may use variations at each stage, e.g. a higher down-sampling ratio for chrominance, a different quantization matrix, or a different lossless encoding method for entropy encoding, together with other minor changes to achieve the required size or quality; the general idea, however, remains the same. To reconstruct the image, each stage is inverted in reverse order, but the rounding performed during quantization (and any down sampling) cannot be undone, so some data is lost permanently and the quality of the image may be lowered. In most cases, though, the human eye cannot distinguish between the JPEG and the original image.

Questions not answered in this project:
1. How is the 3x3 matrix for RGB to YCbCr derived, and why are there different variations of these matrices?
2. How is the quantization matrix derived? What is the optimal quantizer?
3. How is the DCT formula derived?
4. There are many other transforms, such as the Karhunen–Loève transform, the Discrete Fourier transform, etc. Why use DCT-II? The Karhunen–Loève transform (KLT) minimizes the total mean square error for the pixels; in fact it gives the optimal error. However, the KLT is not used in practice, since its coefficient matrix is not constant but image dependent, which makes it computationally expensive and slow. In fact, for certain types of images the DCT coincides with the Karhunen–Loève transform. The DCT also assumes that neighbouring pixels are similar, which is a reasonable assumption since natural images are smooth and their pixels are highly correlated. The Discrete Cosine Transform is suboptimal, but it is very fast and efficient. More research is needed to answer this question in depth.
Bibliography:
[1] David Austin, Image Compression: Seeing What's Not There [online]. Grand Valley State University [viewed 08 Jan 2016]. Available from: http://www.ams.org/samplings/feature-column/fcarc-image-compression
[2] Randell Heyman, How JPEG works. 23 Jan 2015 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=f2odrCGjOFY
[3] Mikulic, Discrete Cosine Transform. 01 Sept 2001 [viewed 04 Jan 2016]. Available from: https://unix4lyfe.org/dct/
[4] JPEG: Wikipedia. 08 Jan 2016 [viewed 06 Jan 2016]. Available from: https://en.wikipedia.org/wiki/JPEG#Discrete_cosine_transform
[5] Discrete Cosine Transform: Wikipedia. 20 Dec 2015 [viewed 04 Jan 2016]. Available from: https://en.wikipedia.org/wiki/Discrete_cosine_transform
[6] Dheera Venkatraman, Online plotting tool. Available from: http://fooplot.com/#W3sidHlwZSI6MTAwMH1d
[7] Timur, Huffman coding calculator. Available from: http://planetcalc.com/2481/
[8] JPEG 'files' & Colour (JPEG Pt1): Computerphile. 21 Apr 2015 [viewed 28 Dec 2015]. Available from: https://www.youtube.com/watch?v=n_uNPbdenRs
[9] JPEG DCT, Discrete Cosine Transform (JPEG Pt2): Computerphile. 22 May 2015 [viewed 28 Dec 2015]. Available from: https://www.youtube.com/watch?v=Q2aEzeMDHMA
[10] Digital image processing: p010 – The Discrete Cosine Transform (DCT): Alireza Saberi. 15 Mar 2013 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=_bltj_7Ne2c
[11] Digital image processing: p009 JPEGs 8x8 blocks: Alireza Saberi. 15 Mar 2013 [viewed 02 Jan 2016]. Available from: https://www.youtube.com/watch?v=pZuaOjfsv0Y
[12] Run-length encoding: Wikipedia. 07 Dec 2015 [viewed 08 Jan 2016]. Available from: https://en.wikipedia.org/wiki/Run-length_encoding