3. Decode the following
First code:
E 0
T 11
N 100
I 1010
S 1011
Bit string: 11010010010101011
Second code:
E 0
T 10
N 100
I 0111
S 1010
Bit string: 100100101010
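One way to check an answer here is to enumerate every split of a bit string into codewords. The sketch below assumes Python; the names `all_decodings`, `first_code`, and `second_code` are illustrative and do not come from the slides. It is exponential in the worst case, which is fine for strings this short.

```python
def all_decodings(code, bits):
    """Return every way to read `bits` as a sequence of codewords from `code`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Consume this codeword and decode the remainder recursively.
            for rest in all_decodings(code, bits[len(word):]):
                results.append(symbol + rest)
    return results


first_code = {"E": "0", "T": "11", "N": "100", "I": "1010", "S": "1011"}
second_code = {"E": "0", "T": "10", "N": "100", "I": "0111", "S": "1010"}

first_readings = all_decodings(first_code, "11010010010101011")
second_readings = all_decodings(second_code, "100100101010")
```

Under these assumptions the first string has exactly one reading, TENNIS, while the second has several (NNST and TETETTT among them). In the second code E is a prefix of I, and T is a prefix of both N and S, so the decoder cannot tell where a codeword ends.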
4. Prefix code
No proper prefix of a codeword is a codeword
Every prefix code is uniquely decodable
Symbol  Code 1  Code 2  Code 3
A       00      1       00
B       010     01      10
C       011     001     11
D       100     0001    0001
E       11      00001   11000
F       101     000001  101
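The prefix condition can be tested mechanically. The sketch below is one possible check, assuming the table is read column by column as three separate codes; `is_prefix_free` and the `code_1` to `code_3` names are illustrative. It relies on the fact that, after sorting the codewords as strings, any codeword that is a prefix of another is also a prefix of its immediate successor.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword in the list."""
    words = sorted(codewords)
    # In sorted order a prefix always sits directly before a word it prefixes.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


code_1 = ["00", "010", "011", "100", "11", "101"]
code_2 = ["1", "01", "001", "0001", "00001", "000001"]
code_3 = ["00", "10", "11", "0001", "11000", "101"]

results = [is_prefix_free(c) for c in (code_1, code_2, code_3)]
```

With these inputs the first two columns pass and the third fails: in the third column 00 is a prefix of 0001, 10 of 101, and 11 of 11000.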
5. Prefix codes and binary trees
Tree representation of prefix codes
A 00
B 010
C 0110
D 0111
E 10
F 11
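A prefix code corresponds to a binary tree whose leaves are the symbols and whose root-to-leaf paths spell the codewords. The sketch below builds that tree as nested dictionaries keyed by "0" and "1" and decodes by walking it. The representation and the names `build_tree` and `decode` are assumptions made for illustration, and the decoder assumes the input is a valid concatenation of codewords.

```python
def build_tree(code):
    """Build a nested-dict binary tree; leaves are the symbols themselves."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol
    return root


def decode(root, bits):
    """Walk the tree bit by bit, emitting a symbol at every leaf."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root  # restart at the root for the next codeword
    return "".join(out)


code = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}
tree = build_tree(code)
message = decode(tree, "1100011010")  # 11 | 00 | 0110 | 10
```

Because no codeword is a prefix of another, every symbol ends up at a leaf and the decoder never needs to look ahead.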
9. Huffman code algorithm
Derivation
Two rarest items will have the longest codewords
Codewords for the two rarest items differ only in the last bit
Idea: suppose the weights are w1 <= w2 <= ... <= wn, with w1 and w2 the smallest weights
Start with an optimal code for the weights w1 + w2, w3, ..., wn
Extend the codeword for w1 + w2 by one bit to get codewords for w1 and w2
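Read as a procedure, the derivation is recursive: merge the two smallest weights into one item, find a code for the smaller problem, then split the merged item's codeword by appending a 0 and a 1. The sketch below follows that reading directly. The function name, the tuple used as a placeholder symbol for the merged item, and the sample weights are all illustrative assumptions.

```python
def huffman_recursive(weights):
    """weights: dict symbol -> weight (non-empty). Returns dict symbol -> codeword."""
    if len(weights) == 1:
        (symbol,) = weights
        return {symbol: ""}  # a single item needs no bits
    # Pick the two rarest items.
    a, b = sorted(weights, key=weights.get)[:2]
    merged = (a, b)  # hypothetical placeholder symbol for the combined item
    smaller = {s: w for s, w in weights.items() if s not in (a, b)}
    smaller[merged] = weights[a] + weights[b]
    # Solve the problem with one fewer item, then extend the merged codeword.
    code = huffman_recursive(smaller)
    prefix = code.pop(merged)
    code[a] = prefix + "0"
    code[b] = prefix + "1"
    return code


code = huffman_recursive({"A": 5, "B": 2, "C": 1, "D": 1})
lengths = {s: len(w) for s, w in code.items()}
```

For these sample weights the two rarest items, C and D, receive the longest codewords and differ only in the last bit, as the derivation predicts. Re-sorting at every level is simple but slow; a heap avoids the repeated sort.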
10. Huffman code
H = new Heap()
for each wi
    T = new Tree(wi)
    H.Insert(T)
while H.Size() > 1
    T1 = H.DeleteMin()
    T2 = H.DeleteMin()
    T3 = Merge(T1, T2)
    H.Insert(T3)
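The pseudocode maps closely onto Python's `heapq` module. The sketch below is one possible translation, not the slides' own implementation: a leaf is represented by its weight, a merged tree by a pair of subtrees, and a running counter breaks ties so that the heap never has to compare trees. `huffman_tree` and `cost` are illustrative names, and the input is assumed to be non-empty.

```python
import heapq
from itertools import count


def huffman_tree(weights):
    """Return a Huffman tree: a leaf is a weight, an internal node a (left, right) pair."""
    tiebreak = count()
    heap = [(w, next(tiebreak), w) for w in weights]  # (weight, tiebreak, tree)
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # DeleteMin
        w2, _, t2 = heapq.heappop(heap)  # DeleteMin
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))  # Merge, Insert
    return heap[0][2]


def cost(tree, depth=0):
    """C(T): sum over leaves of weight times depth."""
    if isinstance(tree, tuple):
        return cost(tree[0], depth + 1) + cost(tree[1], depth + 1)
    return tree * depth


tree = huffman_tree([1, 1, 2, 5])
total = cost(tree)
```

With n weights the loop performs n - 1 merges, each costing O(log n) heap operations.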
12. Draw a Huffman tree for the following
data values and show internal weights:
3, 5, 9, 14, 16, 35
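To check a hand-drawn tree, the internal weights can be listed in the order the merges happen. The sketch below tracks only the weights, not the tree shape; `internal_weights` is an illustrative name.

```python
import heapq


def internal_weights(values):
    """Return the weight of each internal node, in the order it is created."""
    heap = list(values)
    heapq.heapify(heap)
    merged = []
    while len(heap) > 1:
        # Combine the two smallest remaining weights.
        total = heapq.heappop(heap) + heapq.heappop(heap)
        merged.append(total)
        heapq.heappush(heap, total)
    return merged


weights = internal_weights([3, 5, 9, 14, 16, 35])
```

The last entry is the root weight, which equals the sum of all data values. The sum of all internal weights equals the cost of the tree, since each leaf is counted once for every ancestor it has.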
13. Correctness proof
The most amazing induction proof
Induction on the number of code words
Base case: the Huffman algorithm finds an optimal code for n = 1
Inductive step: suppose that the Huffman algorithm finds an optimal code for codes of size n; now consider a code of size n + 1 . . .
14. Key lemma
Given a tree T, we can find a tree T’ with the two minimum-cost leaves as siblings, and C(T’) <= C(T)
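The lemma rests on an exchange argument, sketched below under the assumption that C(T) is the sum over leaves of weight times depth, which these slides do not spell out. The symbols are introduced here for illustration: x is a minimum weight at depth d_x, and a is the weight of a leaf at the deepest level, at depth d_a. Swapping the two leaves changes the cost by

```latex
% Assumptions: x \le a (x is a minimum weight) and d_a \ge d_x (a sits at the deepest level).
\begin{aligned}
C(T_{\text{swap}}) - C(T)
  &= (x\,d_a + a\,d_x) - (x\,d_x + a\,d_a) \\
  &= (x - a)(d_a - d_x) \;\le\; 0 .
\end{aligned}
```

Applying the swap once for each of the two minimum-cost leaves moves them into a pair of sibling positions at the deepest level without increasing the cost.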
16. Finish the induction proof
T – Tree constructed by Huffman
X – Any code tree
Show C(T) <= C(X)
T’ and X’ – Trees from the lemma
C(T’) = C(T)
C(X’) <= C(X)
T’’ and X’’ – Trees with minimum-cost leaves x and y removed (their parent becomes a leaf of weight x + y)
17. X: any tree, X’: modified, X’’: two smallest leaves removed
C(X’’) = C(X’) – x – y
C(T’’) = C(T’) – x – y
C(T’’) <= C(X’’) (by the induction hypothesis)
C(T) = C(T’) = C(T’’) + x + y <= C(X’’) + x + y = C(X’) <= C(X)