Basics of Compression
• Goals:
  • To understand how image/audio/video signals are compressed to save storage and increase transmission efficiency.
  • One of the main focus areas in the field of information theory is the issue of source coding: how to efficiently encode ("compress") information into as few bits as possible.
Compression Issues
• Lossless compression
• Coding Efficiency
  • Compression ratio
• Coder Complexity
  • Memory needs
  • Power needs
  • Operations per second
• Coding Delay
Fixed- and Variable-Length Codes
A fixed-length code assigns the same number of bits to each code word.
E.g. ASCII: letter -> 7 bits (up to 128 code words)
So to encode the string "at" we need 14 bits.
A variable-length code assigns a different number of bits to each code word, depending on the frequency of the code word.
Frequent words are assigned short codes; infrequent words are assigned long codes.
e.g. Huffman coding, Lempel-Ziv
Huffman Encoding
• Let an alphabet have N symbols S1 … SN
• Let pi be the probability of occurrence of Si
• Order the symbols by their probabilities:
  p1 ≥ p2 ≥ p3 ≥ … ≥ pN
• Replace symbols SN-1 and SN by a new symbol HN-1 such that it has the probability pN-1 + pN
• Repeat until there is only one symbol
• This generates a binary tree
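A minimal MATLAB-style sketch of this merge procedure is given below. The four probabilities are placeholders chosen only for illustration, and the convention of prepending 0 to one merged branch and 1 to the other is an assumption that matches the codeword-assignment slides later on.

  % Sketch: build Huffman codewords by repeatedly merging the two
  % least probable (compound) symbols.
  p      = [0.4 0.3 0.2 0.1];           % placeholder probabilities (must sum to 1)
  codes  = repmat({''}, 1, numel(p));   % codeword string for each original symbol
  groups = num2cell(1:numel(p));        % original symbols represented by each entry

  while numel(p) > 1
      [p, order] = sort(p, 'descend');  % keep the list ordered by probability
      groups = groups(order);
      last = numel(p);                  % merge the two smallest entries
      for s = groups{last-1}, codes{s} = ['0' codes{s}]; end   % one branch gets 0
      for s = groups{last},   codes{s} = ['1' codes{s}]; end   % the other gets 1
      p(last-1)      = p(last-1) + p(last);                    % combined probability
      groups{last-1} = [groups{last-1} groups{last}];
      p(last)        = [];
      groups(last)   = [];
  end
  disp(codes)   % codewords for the four placeholder symbols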
Huffman Encoding Example
• Symbols are picked (merged) in the order:
  • K+W
  • {K,W}+T
  • {K,W,T}+U
  • {R,L}
  • {K,W,T,U}+E
  • {{K,W,T,U,E},{R,L}}
• Codewords are generated in a tree-traversal pattern

Symbol   Probability   Codeword
K        0.05          10101
L        0.2           01
U        0.1           100
W        0.05          10100
E        0.3           11
R        0.2           00
T        0.1           1011
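From the probabilities and codeword lengths in this table, the average codeword length can be checked directly; the short MATLAB-style computation below is an addition, not part of the original slide.

  % Average codeword length for the table above
  p   = [0.05 0.2 0.1 0.05 0.3 0.2 0.1];   % K L U W E R T
  len = [5    2   3   5    2   2   4  ];   % codeword lengths from the table
  avg = sum(p .* len)                      % = 2.6 bits/symbol
  % A fixed-length code for these 7 symbols would need 3 bits per symbol.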
Properties of Huffman Codes
• Fixed-length inputs become variable-length outputs
• We start assigning binary digits (0, 1) from the root node, going down to the leaf nodes where the characters are.
• We assign 0 or 1 to the node edges and follow the same convention for all nodes.
• Prefix-condition code: no codeword is a prefix of another
  • This makes the code uniquely decodable
Prefix Property
Prefix code (uniquely decodable):
  E 0   T 11   N 100   I 1010   S 1011
  Example bit stream: 01111001001010101

Ambiguous code (not a prefix code):
  E 0   T 10   N 100   I 0111   S 1010
  Example bit stream: 10010010101011011
Building the Huffman Tree
Suppose we have symbols {S, N, E, W} with probabilities {0.5, 0.25, 0.125, 0.125}.

symbol x   p(x)
S          0.5
N          0.25
E          0.125
W          0.125

Compound symbols:
• E and W are merged into (EW), with probability 0.125 + 0.125 = 0.25
• (EW) and N are merged into (NEW), with probability 0.25 + 0.25 = 0.5
• (NEW) and S are merged into the root, with probability 0.5 + 0.5 = 1
Encode the Huffman Tree
At each split of the tree, one edge is labelled 0 and the other 1; a symbol's codeword is the sequence of edge labels read from the root down to its leaf.

Codeword assignment:
symbol x   p(x)    codeword
S          0.5     0
N          0.25    10
E          0.125   110
W          0.125   111
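As an illustration of using this codeword table, here is a short MATLAB-style encoding sketch; the message 'SSNSEW' and the containers.Map lookup are assumptions made for the example, not part of the slides.

  % Sketch: encode a symbol stream with the codeword table above
  cw  = containers.Map({'S','N','E','W'}, {'0','10','110','111'});
  msg = 'SSNSEW';                   % example message (S is most frequent, p(S) = 0.5)
  bits = '';
  for k = 1:numel(msg)
      bits = [bits cw(msg(k))];     % append the codeword for each symbol
  end
  disp(bits)   % '00100110111' -> 11 bits, vs 12 bits with a 2-bit fixed-length code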
Decoding a Huffman Code
• After receiving a Huffman code, decode it by matching groups of bits against the codeword table; if a group of bits cannot be matched to any symbol, this indicates that an error may have occurred (a decoding sketch follows below).
To program the Huffman code we have three possible ways:
• Using a one-dimensional array.
• Using a two-dimensional array.
• Using a binary tree.
The tree method is the best.
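A minimal MATLAB-style sketch of the bit-matching decode described above is shown here, using the prefix code from the Prefix Property slide; the received bit stream is an arbitrary example. A binary-tree decoder does the same work by following one edge per bit instead of comparing strings.

  % Sketch: decode a bit stream by matching groups of bits against the codewords
  symbols = {'E','T','N','I','S'};
  cwords  = {'0','11','100','1010','1011'};
  bits    = '110100';               % received bit stream (example input)

  out = ''; buf = '';
  for k = 1:numel(bits)
      buf = [buf bits(k)];                    % grow the current group of bits
      idx = find(strcmp(buf, cwords), 1);
      if ~isempty(idx)
          out = [out symbols{idx}];           % matched a codeword -> emit its symbol
          buf = '';
      end
  end
  if ~isempty(buf)
      disp('Error: leftover bits do not match any symbol');
  end
  disp(out)    % 'TNE'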
The One-Dimensional Array Method:
• This method is based on arranging the elements in decreasing order of their probabilities in a one-dimensional array.
• Once the elements are sorted, assign a "0" to the element at index [n-1] and a "1" to the element at index [n].
• Sum the probabilities of the last 2 elements and store the result at index [n-1]; you may create a new array.
• Rearrange the new array (total number of elements - 1) and return to step 2.
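A minimal MATLAB-style sketch of this loop is given below; it uses the probabilities from the example a few slides further on and only reproduces the shrinking probability array, since (as noted later) this method does not keep track of which symbols were merged.

  % Sketch of the one-dimensional array method: sort, add the last two
  % probabilities, shrink the array by one, and repeat.
  prob = [0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1];   % probabilities from the later example
  prob = sort(prob, 'descend');                 % decreasing order
  disp(prob)
  while numel(prob) > 1
      n = numel(prob);
      prob(n-1) = prob(n-1) + prob(n);          % add the last two probabilities
      prob(n)   = [];                           % discard the last element
      prob = sort(prob, 'descend');             % re-sort the reduced array
      disp(prob)                                % prints the arrays shown on the later slides
  end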
• The best way to implement the previous method is through a loop of size n (the array size).
• You can add control checks, such as summing the probabilities of the entered array: if the sum equals 1 and there are no negative values, continue; otherwise display a message that there is an error.
The problem with this method is that we cannot track back the symbols.
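Such a control check could look roughly like this (a sketch; the tolerance value is an assumption):

  % Sketch: validate the probability array before running the loop
  if abs(sum(prob) - 1) < 1e-6 && all(prob >= 0)
      % continue with the merging loop
  else
      disp('Error: probabilities must be non-negative and sum to 1');
  end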
Some helpful functions:
• sortm = sort(symbols,'descend')
  this sorts the symbols array in decreasing order and stores the result in a new array sortm.
• sortm(n-1) = sortm(n-1) + sortm(n); sortm(n) = [];
  this adds the last two elements, stores the result at index n-1, and discards the last element.
Example:
• Suppose we have symbols {A B C D E F G H} with probabilities {0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1} respectively.
After applying the one-dimensional method, the arrays generated would look like this:
• prob = [0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1] (not sorted)
• prob = [0.3 0.2 0.1 0.1 0.1 0.1 0.05 0.05] (sorted)
• prob = [0.3 0.2 0.1 0.1 0.1 0.1 0.1] (last 2 elements added)
• prob = [0.3 0.2 0.2 0.1 0.1 0.1]
• prob = [0.3 0.2 0.2 0.2 0.1]
• prob = [0.3 0.3 0.2 0.2]
• prob = [0.4 0.3 0.3]
• prob = [0.6 0.4]
• prob = [1]
As you can see, at each step the array is re-sorted, the last 2 elements are added together, and the last element is removed.
• This method is easy and simple, but the problem with it is that you cannot track back the symbols that were entered.
• This part is kept for Faisal Al-Hajry to explain his approach here.