1
Image Compression: JPEG
Multimedia Systems (Module 4 Lesson 1)
Summary:
r JPEG Compression
m DCT
m Quantization
m Zig-Zag Scan
m RLE and DPCM
m Entropy Coding
r JPEG Modes
m Sequential
m Lossless
m Progressive
m Hierarchical
Sources:
r The JPEG website:
http://www.jpeg.org
r My research notes
2
Why JPEG
r The compression ratio of lossless methods (e.g.,
Huffman, Arithmetic, LZW) is not high enough for
image and video compression.
r JPEG uses transform coding, it is largely based on
the following observations:
m Observation 1: A large majority of useful image contents
change relatively slowly across images, i.e., it is unusual
for intensity values to alter up and down several times in
a small area, for example, within an 8 x 8 image block.
A translation of this fact into the spatial frequency
domain, implies, generally, lower spatial frequency
components contain more information than the high
frequency components which often correspond to less
useful details and noises.
m Observation 2: Experiments suggest that humans are
more immune to loss of higher spatial frequency
components than loss of lower frequency components.
3
JPEG Coding
Y
Cb
Cr
DPCM
RLC
Entropy
Coding
Header
Tables
Data
Coding
Tables
Quant…
Tables
DCT
f(i, j)
8 x 8
F(u, v)
8 x 8
Quantization
Fq(u, v)
Zig Zag
Scan
Steps Involved:
1. Discrete Cosine
Transform of each 8x8
pixel array
f(x,y) T F(u,v)
2. Quantization using a
table or using a constant
3. Zig-Zag scan to exploit
redundancy
4. Differential Pulse Code
Modulation(DPCM) on
the DC component and
Run length Coding of the
AC components
5. Entropy coding
(Huffman) of the final
output
4
DCT : Discrete Cosine Transform
DCT converts the information contained in a block(8x8) of
pixels from spatial domain to the frequency domain.
m A simple analogy: Consider a unsorted list of 12 numbers
between 0 and 3 -> (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a
transformation of the list involving two steps (1.) sort the list
(2.) Count the frequency of occurrence of each of the numbers
->(4,4,3,1 ). : Through this transformation we lost the spatial
information but captured the frequency information.
m There are other transformations which retain the spatial
information. E.g., Fourier transform, DCT etc. Therefore
allowing us to move back and forth between spatial and
frequency domains.
1-D DCT: 1-D Inverese DCT:
5
Example and Comparison
0
2 0
4 0
6 0
8 0
0 1 2 3 4 5 6 7
100 -52 0 -5 0 -2 0 0.4 36 10 10 6 6 4 4 4
36 10 10 6 6 4 4 4
100 -52 0 -5 0 -2 0 0.4
24 12 20 32 40 51 59 48
8 15 24 32 40 48 57 63
0
2 0
4 0
6 0
8 0
0 1 2 3 4 5 6 7
0
2 0
4 0
6 0
8 0
0 1 2 3 4 5 6 7
DCT FFT
Inverse DCT Inverse FFT
Example Description:
r f(n) is given from n = 0 to 7; (N=8)
r Using DCT(FFT) we compute F(ω) for ω = 0 to 7
r We truncate and use Inverse Transform to compute f’(n)
6
2-D DCT
r Images are two-dimensional; How do you perform 2-D DCT?
m Two series of 1-D transforms result in a 2-D transform as
demonstrated in the figure below
1-D
Row-
wise
1-D
Column-
wise
8x8 8x8 8x8
j)
f(i,
v)
F(u,
r F(0,0) is called the DC component and the rest of F(i,j) are called
AC components
7
2-D Transform Example
r The following example will demonstrate the idea behind a 2-
D transform by using our own cooked up transform: The
transform computes a running cumulative sum.
1
1-D
Row-
wise
1-D
Column-
wise
8x8
8x8
8x8
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
8 7 6 5 4 3 2 1
64 56 48 40 32 24 16 8
56
48
40
32
24
16
8
49
42
35
28
21
14
7
42
36
30
24
18
12
6
35
30
25
20
15
10
5
28
24
20
16
12
8
4
21
18
15
12
9
6
3
14
12
10
8
6
4
2
7
6
5
4
3
2
1
 

8
n
my (n)
f
)
(
F 

My Transform:
j)
f(i,
v)
(u,
my
F
Note that this is only a hypothetical
transform. Do not confuse this with DCT
8
Quantization
r Why? -- To reduce number of bits per sample
F’(u,v) = round(F(u,v)/q(u,v))
r Example: 101101 = 45 (6 bits).
Truncate to 4 bits: 1011 = 11. (Compare 11 x 4 =44 against 45)
Truncate to 3 bits: 101 = 5. (Compare 8 x 5 =40 against 45)
Note, that the more bits we truncate the more precision we lose
r Quantization error is the main source of the Lossy
Compression.
r Uniform Quantization:
m q(u,v) is a constant.
r Non-uniform Quantization -- Quantization Tables
m Eye is most sensitive to low frequencies (upper left corner in
frequency matrix), less sensitive to high frequencies (lower right
corner)
m Custom quantization tables can be put in image/scan header.
m JPEG Standard defines two default quantization tables, one
each for luminance and chrominance.

M4L12.ppt

  • 1.
    1 Image Compression: JPEG MultimediaSystems (Module 4 Lesson 1) Summary: r JPEG Compression m DCT m Quantization m Zig-Zag Scan m RLE and DPCM m Entropy Coding r JPEG Modes m Sequential m Lossless m Progressive m Hierarchical Sources: r The JPEG website: http://www.jpeg.org r My research notes
  • 2.
    2 Why JPEG r Thecompression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression. r JPEG uses transform coding, it is largely based on the following observations: m Observation 1: A large majority of useful image contents change relatively slowly across images, i.e., it is unusual for intensity values to alter up and down several times in a small area, for example, within an 8 x 8 image block. A translation of this fact into the spatial frequency domain, implies, generally, lower spatial frequency components contain more information than the high frequency components which often correspond to less useful details and noises. m Observation 2: Experiments suggest that humans are more immune to loss of higher spatial frequency components than loss of lower frequency components.
  • 3.
    3 JPEG Coding Y Cb Cr DPCM RLC Entropy Coding Header Tables Data Coding Tables Quant… Tables DCT f(i, j) 8x 8 F(u, v) 8 x 8 Quantization Fq(u, v) Zig Zag Scan Steps Involved: 1. Discrete Cosine Transform of each 8x8 pixel array f(x,y) T F(u,v) 2. Quantization using a table or using a constant 3. Zig-Zag scan to exploit redundancy 4. Differential Pulse Code Modulation(DPCM) on the DC component and Run length Coding of the AC components 5. Entropy coding (Huffman) of the final output
  • 4.
    4 DCT : DiscreteCosine Transform DCT converts the information contained in a block(8x8) of pixels from spatial domain to the frequency domain. m A simple analogy: Consider a unsorted list of 12 numbers between 0 and 3 -> (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a transformation of the list involving two steps (1.) sort the list (2.) Count the frequency of occurrence of each of the numbers ->(4,4,3,1 ). : Through this transformation we lost the spatial information but captured the frequency information. m There are other transformations which retain the spatial information. E.g., Fourier transform, DCT etc. Therefore allowing us to move back and forth between spatial and frequency domains. 1-D DCT: 1-D Inverese DCT:
  • 5.
    5 Example and Comparison 0 20 4 0 6 0 8 0 0 1 2 3 4 5 6 7 100 -52 0 -5 0 -2 0 0.4 36 10 10 6 6 4 4 4 36 10 10 6 6 4 4 4 100 -52 0 -5 0 -2 0 0.4 24 12 20 32 40 51 59 48 8 15 24 32 40 48 57 63 0 2 0 4 0 6 0 8 0 0 1 2 3 4 5 6 7 0 2 0 4 0 6 0 8 0 0 1 2 3 4 5 6 7 DCT FFT Inverse DCT Inverse FFT Example Description: r f(n) is given from n = 0 to 7; (N=8) r Using DCT(FFT) we compute F(ω) for ω = 0 to 7 r We truncate and use Inverse Transform to compute f’(n)
  • 6.
    6 2-D DCT r Imagesare two-dimensional; How do you perform 2-D DCT? m Two series of 1-D transforms result in a 2-D transform as demonstrated in the figure below 1-D Row- wise 1-D Column- wise 8x8 8x8 8x8 j) f(i, v) F(u, r F(0,0) is called the DC component and the rest of F(i,j) are called AC components
  • 7.
    7 2-D Transform Example rThe following example will demonstrate the idea behind a 2- D transform by using our own cooked up transform: The transform computes a running cumulative sum. 1 1-D Row- wise 1-D Column- wise 8x8 8x8 8x8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1 64 56 48 40 32 24 16 8 56 48 40 32 24 16 8 49 42 35 28 21 14 7 42 36 30 24 18 12 6 35 30 25 20 15 10 5 28 24 20 16 12 8 4 21 18 15 12 9 6 3 14 12 10 8 6 4 2 7 6 5 4 3 2 1    8 n my (n) f ) ( F   My Transform: j) f(i, v) (u, my F Note that this is only a hypothetical transform. Do not confuse this with DCT
  • 8.
    8 Quantization r Why? --To reduce number of bits per sample F’(u,v) = round(F(u,v)/q(u,v)) r Example: 101101 = 45 (6 bits). Truncate to 4 bits: 1011 = 11. (Compare 11 x 4 =44 against 45) Truncate to 3 bits: 101 = 5. (Compare 8 x 5 =40 against 45) Note, that the more bits we truncate the more precision we lose r Quantization error is the main source of the Lossy Compression. r Uniform Quantization: m q(u,v) is a constant. r Non-uniform Quantization -- Quantization Tables m Eye is most sensitive to low frequencies (upper left corner in frequency matrix), less sensitive to high frequencies (lower right corner) m Custom quantization tables can be put in image/scan header. m JPEG Standard defines two default quantization tables, one each for luminance and chrominance.