Huffman Codes + Programming Them
Basics of Compression
• Goals:
• To understand how image/audio/video signals are compressed to save storage and increase transmission efficiency.
• One of the main focus areas in information theory is source coding:
• How to represent ("compress") information in as few bits as possible.
Compression Classification
Compression Methods
• Lossless
  • Statistical: Huffman, Arithmetic
  • Universal: Lempel-Ziv
• Lossy
  • Model-Based: Linear Predictive, AutoRegressive, Polynomial Fitting
  • Waveform-Based
    • Spatial/Time-Domain
    • Frequency-Domain
      • Filter-Based: Subband, Wavelet
      • Transform-Based: Fourier, DCT
Compression Issues
• Lossless compression
• Coding Efficiency
  • Compression ratio
• Coder Complexity
  • Memory needs
  • Power needs
  • Operations per second
• Coding Delay
Fixed and Variable Length Codes
A fixed-length code assigns the same number of bits to each code word.
E.g. ASCII: one letter -> 7 bits (up to 128 code words).
So to encode the string "at" we need 14 bits (a worked check follows below).
A variable-length code assigns a different number of bits to each code word, depending on the frequency of the code word.
Frequent words are assigned short codes; infrequent words are assigned long codes.
E.g. Huffman coding, Lempel-Ziv.
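As a quick check on the bit counts above, a minimal sketch (assuming MATLAB; the toy two-symbol code is illustrative, not from the slides):

  % Fixed-length cost: 7 ASCII bits per character
  s = 'at';
  fixed_bits = 7 * length(s)              % 2 letters x 7 bits = 14 bits
  % Toy variable-length code: frequent 'a' -> '0', rarer 't' -> '10'
  var_bits = length('0') + length('10')   % the same string in 3 bits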
Huffman Encoding
• Let an alphabet have N symbols S1 … SN
• Let pi be the probability of occurrence of Si
• Order the symbols by their probabilities:
  p1 ≥ p2 ≥ p3 ≥ … ≥ pN
• Replace symbols SN-1 and SN by a new symbol HN-1 such that it has the probability pN-1 + pN
• Repeat until there is only one symbol
• This generates a binary tree (a code sketch follows below)
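This merge procedure can be programmed almost verbatim. Below is a minimal sketch (assuming MATLAB; the function name huffman_sketch and its interface are illustrative): each pass sorts by probability, prepends a 0 or 1 to every symbol inside the two least probable groups, and then merges those groups, exactly as in the steps above.

  % Minimal Huffman construction sketch.
  % p: row vector of symbol probabilities; code: cell array of codewords.
  function code = huffman_sketch(p)
      n = numel(p);
      code = repmat({''}, 1, n);        % one (initially empty) codeword per symbol
      groups = num2cell(1:n);           % each group starts as one symbol index
      while numel(p) > 1
          [p, order] = sort(p, 'descend');   % order by probability
          groups = groups(order);
          % prepend '0'/'1' to every symbol in the two least probable groups
          % (prepending keeps root-to-leaf bit order)
          for k = groups{end-1}, code{k} = ['0' code{k}]; end
          for k = groups{end},   code{k} = ['1' code{k}]; end
          % merge the two groups and add their probabilities
          p(end-1) = p(end-1) + p(end);  p(end) = [];
          groups{end-1} = [groups{end-1} groups{end}];  groups(end) = [];
      end
  end

For example, huffman_sketch([0.5 0.25 0.125 0.125]) returns {'0','10','110','111'}, which matches the {S,N,E,W} example later in these slides (up to swapping 0 and 1 at a split).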
Huffman Encoding Example
• Symbols are merged in the order:
• K+W
• {K,W}+T
• {K,W,T}+U
• R+L
• {K,W,T,U}+E
• {{K,W,T,U,E},{R,L}}
• Codewords are generated by traversing the resulting tree from the root (a quick average-length check follows the table).
Symbol Probability Codeword
K 0.05 10101
L 0.2 01
U 0.1 100
W 0.05 10100
E 0.3 11
R 0.2 00
T 0.1 1011
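As a quick check on this table (a sketch, assuming MATLAB), the average codeword length comes out below the 3 bits a fixed-length code would need for 7 symbols:

  % Average codeword length for the table above
  p   = [0.05 0.2 0.1 0.05 0.3 0.2 0.1];   % K L U W E R T
  len = [5 2 3 5 2 2 4];                   % codeword lengths from the table
  avg = sum(p .* len)                      % = 2.6 bits/symbol (fixed length: 3)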
Properties of Huffman Codes
• Fixed-length inputs become variable-length outputs
• We assign binary digits (0, 1) starting from the root node and working down to the leaf nodes, where the characters sit.
• We assign 0 or 1 to the edges out of each node and follow the same convention at every node.
• Prefix-condition code: no codeword is a prefix of another
• This makes the code uniquely decodable (a prefix-condition check is sketched below)
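The prefix condition can be tested mechanically. A minimal sketch (assuming MATLAB; is_prefix_free is an illustrative name, not a built-in):

  % Returns true if no codeword is a prefix of another.
  function ok = is_prefix_free(codes)
      ok = true;
      for i = 1:numel(codes)
          for j = 1:numel(codes)
              % strncmp is true when codes{i} matches the start of codes{j}
              if i ~= j && strncmp(codes{i}, codes{j}, length(codes{i}))
                  ok = false; return;    % codes{i} is a prefix of codes{j}
              end
          end
      end
  end

For the two code tables on the next slide, is_prefix_free({'0','11','100','1010','1011'}) returns true, while the second table fails the test.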
Prefix Property
A prefix code (no codeword is a prefix of another), so any bit stream parses at most one way:
  E 0
  T 11
  N 100
  I 1010
  S 1011
  Bit stream: 01111001001010101
An ambiguous (non-prefix) code: T = 10 is a prefix of N = 100 and E = 0 is a prefix of I = 0111, so a stream such as 10010010101011011 has more than one parse (e.g. 100 reads as N, or as T followed by E):
  E 0
  T 10
  N 100
  I 0111
  S 1010
Building the Huffman Tree
Suppose we have symbols {S, N, E, W} with probabilities {0.5, 0.25, 0.125, 0.125}.
Combining the two least probable symbols at each step creates the compound symbols:
• E (0.125) + W (0.125) -> (EW), p = 0.25
• N (0.25) + (EW) (0.25) -> (NEW), p = 0.5
• S (0.5) + (NEW) (0.5) -> root, p = 1.0
Encode the Huffman Tree
Assign 0 to one branch and 1 to the other at every split, then read each codeword from the root down to the symbol's leaf.
Codeword assignment:
Symbol p(x) Codeword
S 0.5 0
N 0.25 10
E 0.125 110
W 0.125 111
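As a cross-check, feeding [0.5 0.25 0.125 0.125] to the huffman_sketch function sketched earlier reproduces exactly these codewords; swapping 0 and 1 at any split would change the words but not their lengths.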
Decoding a Huffman Code
• After receiving a Huffman-coded bit stream, decode it by reading bits until the accumulated group matches a codeword; if no codeword can be matched, an error has occurred (see the decoder sketch below).
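A minimal decoder sketch along these lines (assuming MATLAB; the name huffman_decode_sketch and the error message are illustrative): it buffers bits until they match a codeword and reports an error if unmatched bits remain.

  % syms: char array of symbols; codes: matching cell array of codewords;
  % bits: the received bit stream as a char vector, e.g. '0111110'.
  function out = huffman_decode_sketch(syms, codes, bits)
      out = '';  buf = '';
      for b = bits
          buf = [buf b];                        % accumulate one bit
          hit = find(strcmp(buf, codes), 1);    % buffer matches a codeword?
          if ~isempty(hit)
              out = [out syms(hit)];  buf = ''; % emit symbol, clear buffer
          end
      end
      if ~isempty(buf)
          error('Leftover bits match no codeword: likely an error.');
      end
  end

For example, huffman_decode_sketch('SNEW', {'0','10','110','111'}, '0111110') returns 'SWE'.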
PROGRAMMING THE HUFFMAN CODE
There are three possible ways to program the Huffman code:
• Using a one-dimensional array.
• Using a two-dimensional array.
• Using a binary tree.
The tree method is the best.
The One-Dimensional array method:
• Arrange the elements in decreasing order of probability in a one-dimensional array.
• Once the elements are sorted, assign a "0" to the element at index array[n-1] and a "1" to the element at index array[n].
• Sum the probabilities of the last 2 elements and store the result at index array[n-1]; you may create a new array for this.
• Re-sort the new array (now one element shorter) and return to step 2.
• The best way to implement the previous method is through a loop of size n (the array size).
• You can add some control checks, such as summing the probabilities of the input array: if the sum equals 1 and no entry is negative, continue; otherwise display an error message. (Both appear in the sketch after the helper functions below.)
• The problem with this method is that we cannot trace the codewords back to the symbols.
Some helpful functions:
• sortm = sort(symbols,'descend')
  sorts the symbols array in decreasing order and stores the result in a new array sortm.
• sortm(n-1) = sortm(n-1) + sortm(n); sortm(n) = [];
  adds the last two elements, stores the sum at index n-1, and discards the last element.
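Putting these pieces together, a minimal sketch of the whole loop (assuming MATLAB; it folds in the control checks mentioned above and prints the array after each pass):

  prob = [0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1];    % example input (next slide)
  if abs(sum(prob) - 1) > 1e-9 || any(prob < 0)  % control checks
      error('Probabilities must be non-negative and sum to 1.');
  end
  while numel(prob) > 1
      prob = sort(prob, 'descend');              % arrange in decreasing order
      disp(prob)                                 % show the current array
      % a full coder would assign '0'/'1' to prob(end-1)/prob(end) here,
      % but nothing records which symbols those probabilities came from
      prob(end-1) = prob(end-1) + prob(end);     % add the last two elements
      prob(end) = [];                            % discard the last one
  end
  disp(prob)                                     % final array: 1

Note that nothing here tracks which original symbols each probability belongs to, which is exactly why the codewords cannot be recovered with this method.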
Example:
• Suppose we have symbols {A B C D E F G H} with probabilities {0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1} respectively.
After applying the one-dimensional method, the arrays generated would look like this:
• prob = [0.1 0.3 0.1 0.2 0.05 0.1 0.05 0.1] (not sorted)
• prob = [0.3 0.2 0.1 0.1 0.1 0.1 0.05 0.05] (sorted)
• prob = [0.3 0.2 0.1 0.1 0.1 0.1 0.1] (last 2 elements added)
• prob = [0.3 0.2 0.2 0.1 0.1 0.1]
• prob = [0.3 0.2 0.2 0.2 0.1]
• prob = [0.3 0.3 0.2 0.2]
• prob = [0.4 0.3 0.3]
• prob = [0.6 0.4]
• prob = [1]
As you can see, at each step the array is sorted, the last two elements are added together, and the last element is removed.
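Running the loop sketch above with this prob vector prints the same sequence of arrays, starting from the sorted line.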
• This method is easy and simple, but the problem with it is that you cannot trace back to the symbols entered.
• This part is left for Faisal Al-Hajry to explain his approach.
