DATA COMPRESSION USING
HUFFMAN CODING
Rahul V. Khanwani
Roll No. 47
Department Of Computer Science
  2. HUFFMAN CODING
     • Huffman coding algorithm: a bottom-up approach.
     • Huffman coding is a procedure for generating a binary code tree. The algorithm, invented by David Huffman in 1952, ties each symbol's code length to its probability of occurrence: the more frequent the symbol, the shorter its code.
     • Huffman coding achieves effective data compression by reducing redundancy in the coding of symbols.
     Rahul Khanvani. For more, visit Binarybuzz.wordpress.com
  3. Huffman Coding Algorithm
     1. Initialization: put all symbols on a list sorted according to their frequency counts.
     2. Repeat until the list has only one symbol left:
        1. From the list, pick the two symbols with the lowest frequency counts.
        2. Form a Huffman sub-tree that has these two symbols as child nodes and create a parent node.
        3. Assign the sum of the children's frequency counts to the parent and insert it into the list such that the order is maintained.
        4. Delete the children from the list.
     3. Assign a codeword to each leaf based on the path from the root.
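The steps above can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: it uses a binary heap in place of the sorted list, and the function name `huffman_codes` and the tie-breaking counter are assumptions made for this example. Ties between equal counts may be broken differently than on the slides, so the bit patterns can differ while the code lengths stay the same.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Return {symbol: bitstring} for a {symbol: count} mapping of string symbols."""
    tiebreak = count()  # unique counter so heap entries never compare the trees
    # Heap entry: (count, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(c, next(tiebreak), sym) for sym, c in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Step 2: take the two lowest counts and merge them under a new parent.
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next(tiebreak), (t1, t2)))
    codes = {}

    def walk(tree, prefix):
        # Step 3: the path from the root (0 = left, 1 = right) is the codeword.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


codes = huffman_codes({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5})
lengths = {sym: len(code) for sym, code in codes.items()}
```

For the example counts used in the following slides, `lengths` gives 1 bit for A and 3 bits for each of B, C, D and E.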
  4. Example:
     Symbol  Count
     A       15
     B        7
     C        6
     D        6
     E        5
  5. Constructing a Tree from the Nodes with Minimum Occurrence
     (11)
       D(6)
       E(5)
  6. Constructing a Tree from the Nodes with Minimum Occurrence
     (17)
       C(6)
       (11)
         D(6)
         E(5)
     Pairing C(6) with (11) does not merge the two lowest counts: after the first merge, the lowest are C(6) and B(7). The tree is therefore re-constructed on the next slide.
  7. Re-Constructing a Tree from the Nodes with Minimum Occurrence
     (24)
       (13)
         B(7)
         C(6)
       (11)
         D(6)
         E(5)
  8. Re-Constructing a Tree from the Nodes with Minimum Occurrence
     (39)
       A(15)
       (24)
         (13)
           B(7)
           C(6)
         (11)
           D(6)
           E(5)
  9. Huffman Coding Result
     Symbol  Count  Bits
     A       15     0
     B        7     100
     C        6     101
     D        6     110
     E        5     111
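Because no codeword in the result table is a prefix of another, a bit stream can be decoded by reading bits until a codeword matches. The sketch below uses the table from the slide; the function names `encode` and `decode` are illustrative, not part of the presentation.

```python
# Code table copied from the "Huffman Coding Result" slide.
CODES = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}


def encode(text, codes=CODES):
    """Concatenate the codeword of every symbol in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes=CODES):
    """Read bits left to right, emitting a symbol whenever a codeword completes."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: the first match is the only match
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(out)


bits = encode("ABACADE")
```

Here `bits` is 15 bits long (1 + 3 + 1 + 3 + 1 + 3 + 3), and `decode(bits)` returns the original string.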
  10. Comparison of Huffman and Shannon-Fano Coding Algorithms
      Symbol  Count  Shannon-Fano  Huffman   Shannon-Fano  Huffman
                     bit size      bit size  total bits    total bits
      A       15     2             1         30            15
      B        7     2             3         14            21
      C        6     2             3         12            18
      D        6     3             3         18            18
      E        5     3             3         15            15
      Total                                  89            87
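The totals in the table are the sum of count times bit size over all symbols. A short check, using only the numbers from the slide (the helper name `total_bits` is illustrative):

```python
counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
shannon_fano_bits = {"A": 2, "B": 2, "C": 2, "D": 3, "E": 3}
huffman_bits = {"A": 1, "B": 3, "C": 3, "D": 3, "E": 3}


def total_bits(counts, bits_per_symbol):
    """Total encoded size: each symbol contributes count * code length."""
    return sum(counts[sym] * bits_per_symbol[sym] for sym in counts)


sf_total = total_bits(counts, shannon_fano_bits)   # 30 + 14 + 12 + 18 + 15
huff_total = total_bits(counts, huffman_bits)      # 15 + 21 + 18 + 18 + 15
n_symbols = sum(counts.values())                   # 39 symbols in the message
sf_average = sf_total / n_symbols                  # average bits per symbol
huff_average = huff_total / n_symbols
```

This reproduces the totals of 89 and 87 bits for the 39-symbol message.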
  11. Comparison Conclusion
      • Shannon-Fano and Huffman coding are close in performance.
      • Huffman coding is always at least as efficient as Shannon-Fano coding, so it has become the predominant coding method of its type.
      • Both algorithms take a similar amount of processing power, so it seems sensible to use the one that gives slightly better performance.
      • Huffman was able to prove that this coding method cannot be improved on by any other code that assigns a whole number of bits to each symbol.
  12. Huffman Coding Types:
      • The construction of a code tree for Huffman coding is based on a certain probability distribution.
      • There are three variants, depending on where that distribution comes from:
        – static probability distribution
        – dynamic probability distribution
        – adaptive probability distribution
  13. Static probability distribution
      • Coding procedures with static Huffman codes operate with a predefined code tree.
      • Provided that the source data correspond to the assumed frequency distribution, acceptable coding efficiency can be achieved.
      • It is not necessary to store the Huffman tree or the frequencies within the encoded data; it is sufficient to keep them available within the encoder and decoder software.
      • Additionally, the coding tables do not need to be generated at run-time.
  14. Dynamic probability distribution
      • Instead of using one static tree for every type of data, the probability distribution of the actual data can be analysed dynamically.
      • Codes generated from these code trees match the real conditions clearly better than standard distributions do.
      • The major disadvantage of this procedure is that information about the Huffman tree has to be embedded in the compressed files or data transmissions.
      • A code table or the symbols' frequencies must be part of the header data.
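To make the header overhead concrete, the sketch below stores the symbol frequencies in front of the payload. The layout (a 2-byte entry count followed by 5-byte symbol/count pairs) and the names `pack_header` and `unpack_header` are assumptions made for this illustration; the slides do not prescribe a header format.

```python
import struct
from collections import Counter


def pack_header(data: bytes) -> bytes:
    """Serialize the byte frequencies of data as a header (hypothetical layout)."""
    counts = Counter(data)
    header = struct.pack(">H", len(counts))       # number of table entries
    for sym, c in sorted(counts.items()):
        header += struct.pack(">BI", sym, c)      # 1-byte symbol, 4-byte count
    return header


def unpack_header(blob: bytes):
    """Return (counts, offset of the first payload byte)."""
    (n,) = struct.unpack_from(">H", blob, 0)
    counts, offset = {}, 2
    for _ in range(n):
        sym, c = struct.unpack_from(">BI", blob, offset)
        counts[sym] = c
        offset += 5
    return counts, offset


header = pack_header(b"AAABBC")
counts, payload_start = unpack_header(header)
```

With three distinct symbols the header takes 2 + 3 × 5 = 17 bytes, which is more than the 6 bytes of input. This is the overhead the slide refers to, and it only pays off for larger inputs.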
  15. Adaptive probability distribution
      • The adaptive coding procedure uses a code tree that is continuously adapted to the previously encoded or decoded data, starting with an empty tree or a standard distribution.
      • Each encoded symbol is used to refine the code tree. This achieves continuous adaptation and compensates for local variations at run-time.
  16. Adaptive probability distribution
      • Adaptive Huffman codes that start from an empty tree use a special control character to identify new symbols that are not yet part of the tree.
      • This variant needs only minimal header data, but the attainable compression rate is unfavourable at the beginning of the coding and for small files.
  17. Extended Huffman Coding
      • Extended alphabet: for an alphabet $S = \{s_1, s_2, \ldots, s_n\}$, if $k$ symbols are grouped together, then the extended alphabet is
        $S^{(k)} = \{\, s_1 s_1 \cdots s_1,\ s_1 s_1 \cdots s_2,\ \ldots,\ s_n s_n \cdots s_n \,\}$,
        the set of all $n^k$ groups of $k$ symbols.
      • Problem: if $k$ is relatively large (e.g., $k \ge 3$), then for most practical applications, where $n \gg 1$, $n^k$ implies a huge symbol table that is impractical.
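The growth of the symbol table is easy to see by enumerating the groups. A small sketch, where the function name `extended_alphabet` is illustrative:

```python
from itertools import product


def extended_alphabet(symbols, k):
    """All groups of k symbols drawn from symbols, in lexicographic order."""
    return ["".join(group) for group in product(symbols, repeat=k)]


pairs = extended_alphabet("AB", 2)        # ["AA", "AB", "BA", "BB"]
triples = extended_alphabet("ABCDE", 3)   # 5 ** 3 entries
byte_triples = 256 ** 3                   # table size for 8-bit symbols with k = 3
```

For an alphabet of 256 byte values and k = 3, the table would already need 16,777,216 entries.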
  18. Adaptive (Dynamic) Huffman Coding
      • In adaptive Huffman coding, statistics are gathered and updated dynamically as the data stream arrives.
  19. Adaptive (Dynamic) Huffman Coding
      1. Initial code: assigns symbols some initially agreed-upon codes, without any prior knowledge of the frequency counts.
      2. Update tree: constructs an adaptive Huffman tree. It basically does two things:
         1. increments the frequency counts for the symbols (including any new ones);
         2. updates the configuration of the tree.
      3. The encoder and decoder must use exactly the same initial code and update tree routines.
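The requirement that encoder and decoder share the same routines can be sketched as follows. This is a deliberate simplification and not the incremental update described on the slides: instead of swapping nodes, both sides rebuild a complete Huffman code from the current counts before every symbol. The names `build_codes`, `encode` and `decode`, the `NEW` escape marker written as a string, and the 8-bit initial code that follows `NEW` are all assumptions made for this example.

```python
import heapq
from itertools import count

NEW = "NEW"  # escape marker; cannot collide with a one-character symbol


def build_codes(counts):
    """Huffman code for the current counts; deterministic, so both sides agree."""
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    tiebreak = count()
    heap = [(c, next(tiebreak), s) for s, c in sorted(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next(tiebreak), (t1, t2)))
    codes, stack = {}, [(heap[0][2], "")]
    while stack:
        tree, prefix = stack.pop()
        if isinstance(tree, tuple):
            stack.append((tree[0], prefix + "0"))
            stack.append((tree[1], prefix + "1"))
        else:
            codes[tree] = prefix
    return codes


def encode(text):
    counts, out = {NEW: 0}, []          # the count for NEW stays at 0
    for ch in text:
        codes = build_codes(counts)
        if ch in counts:
            out.append(codes[ch])
        else:                            # first occurrence: NEW + initial code
            out.append(codes[NEW] + format(ord(ch), "08b"))
            counts[ch] = 0
        counts[ch] += 1                  # same update as in decode
    return "".join(out)


def decode(bits):
    counts, out, pos = {NEW: 0}, [], 0
    while pos < len(bits):
        inverse = {code: s for s, code in build_codes(counts).items()}
        word = ""
        while word not in inverse:       # read one codeword
            word += bits[pos]
            pos += 1
        sym = inverse[word]
        if sym == NEW:                   # next 8 bits carry the new symbol
            sym = chr(int(bits[pos:pos + 8], 2))
            pos += 8
            counts[sym] = 0
        counts[sym] += 1                 # same update as in encode
        out.append(sym)
    return "".join(out)


stream = encode("AADCCDD")
```

Because both functions apply the identical count update after every symbol, `decode(stream)` recovers the input without any frequency table in the stream. Rebuilding the whole code at each step is much slower than the incremental tree update; it is used here only to keep the sketch short.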
  20. Notes on Adaptive Huffman Tree Updating
      • Nodes are numbered in order from left to right, bottom to top. The numbers in parentheses indicate the counts.
      • The tree must always maintain its sibling property.
      • When a swap is necessary, the farthest node with count N is swapped with the node whose count has just been increased to N+1.
  21. Adaptive Huffman Coding
      Example: ABCDPPPPPAA
      9.(9)
        7.(4)
          5.(2)
            1.A:(1)
            2.B:(1)
          6.(2)
            3.C:(1)
            4.D:(1)
        8.P:(5)
  22. Adaptive Huffman Coding
      Example: ABCDPPPPPAA
      9.(9)
        7.(4)
          5.(2)
            4.D:(1)
            2.B:(1)
          6.(2)
            3.C:(1)
            1.A:(2)
        8.P:(5)
  23. Adaptive Huffman Coding
      Example: ABCDPPPPPAA
      9.(9)
        7.(4)
          5.(2)
            4.D:(1)
            2.B:(1)
          6.(2)
            3.C:(1)
            1.A:(2+1)
        8.P:(5)
  24. Adaptive Huffman Coding
      Example: ABCDPPPPPAA
      9.(10)
        7.(5+1)
          6.(3)
            4.(2)
              4.D:(1)
              2.B:(1)
            3.C:(1)
          5.A:(3)
        8.P:(5)
  25. Adaptive Huffman Coding
      Example: ABCDPPPPPAA
      9.(11)
        7.P:(5)
        8.(6)
          5.A:(3)
          6.(3)
            3.C:(1)
            4.(2)
              1.D:(1)
              2.B:(1)
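The swaps in this example follow from the numbering rule: with nodes listed by number, counts must never decrease, and the node about to be incremented is first swapped with the highest-numbered node that has the same count. The sketch below checks this on the counts read off the slides. The list representation and the names `has_sibling_property` and `swap_target` are assumptions for illustration, and the check is simplified (a full implementation must also avoid swapping a node with its own parent).

```python
def has_sibling_property(counts):
    """counts[i] is the count of node number i + 1; counts must be non-decreasing."""
    return all(a <= b for a, b in zip(counts, counts[1:]))


def swap_target(counts, node):
    """Highest-numbered node with the same count as node (possibly node itself)."""
    n = counts[node - 1]
    return max(i + 1 for i, c in enumerate(counts) if c == n)


# Nodes 1..9 after ABCDPPPPP: A, B, C, D, (2), (2), (4), P, root.
start = [1, 1, 1, 1, 2, 2, 4, 5, 9]
first_a = swap_target(start, 1)           # A is node 1 and is about to reach 2

# After the first extra A: D, B, C, A, (2), (3), (5), P, root.
after_first_a = [1, 1, 1, 2, 2, 3, 5, 5, 10]
second_a = swap_target(after_first_a, 4)  # A is now node 4 and is about to reach 3

# Nodes 1..9 on the final slide: D, B, C, (2), A, (3), P, (6), root.
final = [1, 1, 1, 2, 3, 3, 5, 6, 11]
```

`first_a` is 4 (A trades places with D) and `second_a` is 5 (A trades places with the internal node of count 2), matching the slides, and all three count lists are in non-decreasing order.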
  26. Another Example: Adaptive Huffman Coding
      • This example illustrates more implementation details: it shows exactly which bits are sent, rather than only how the tree is updated.
      • An additional rule: any character/symbol that is sent for the first time must be preceded by a special symbol, NEW.
      • The initial code for NEW is 0. The count for NEW is always kept at 0 (it is never increased), hence it is always denoted NEW:(0).
  27. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (1)
        NEW:(0)
        A:(1)
      (2)
        NEW:(0)
        A:(2)
  28. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (3)
        A:(2)
        (1)
          NEW:(0)
          D:(1)
  29. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (4)
        A:(2)
        (2)
          (1)
            NEW:(0)
            C:(1)
          D:(1)
  30. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (4)
        A:(2)
        (2)
          (1)
            NEW:(0)
            C:(1+1)
          D:(1)
  31. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (4)
        A:(2)
        (2+1)
          (1)
            NEW:(0)
            D:(1)
          C:(2)
  32. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (5)
        A:(2)
        (3)
          C:(2)
          (1)
            NEW:(0)
            D:(1)
  33. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (6)
        A:(2)
        (4)
          C:(2)
          (2)
            NEW:(0)
            D:(2)
  34. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (6)
        A:(2)
        (4)
          C:(2)
          (2)
            NEW:(0)
            D:(2+1)
  35. Initial code assignment for AADCCDD using adaptive Huffman coding.
      (7)
        D:(3)
        (4)
          C:(2)
          (2)
            NEW:(0)
            A:(2)
  36. Sequence of symbols and codes sent to the decoder
      Symbol  Code
      NEW     0000 0000
      A       0000 0001
      A       0000 0001
      NEW     0000 0000
      D       0000 0100
      NEW     0000 0000
      C       0000 0011
      C       0000 0011
      D       0000 0100
      D       0000 0100
      It is important to emphasize that the code for a particular symbol changes during the adaptive Huffman coding process.
  37. THANK YOU