2. Contents 1. Introduction
2. Steps/Algorithm to build Huffman tree
3. (Example) Building Huffman tree with
sample inputs
4. Steps/Algorithm for traversing the Huffman
tree
5. Real Life applications of Algorithm
6. Advantages/Disadvantages
7. References
2
3. Introduction Huffman coding is a lossless data compression algorithm.
The idea is to assign variable-length codes to input
characters, lengths of the assigned codes are based on the
frequencies of corresponding characters. The most
frequent character gets the smallest code and the least
frequent character gets the largest code. The variable-
length codes assigned to input characters are Prefix
Codes, means the codes (bit sequences) are assigned in
such a way that the code assigned to one character is not
the prefix of code assigned to any other character. This is
how Huffman Coding makes sure that there is no
ambiguity when decoding the generated bit stream.
3
4. Introduction There are mainly two major parts in Huffman Coding-:
1. Build a Huffman Tree from input characters.
2. Traverse the Huffman Tree and assign codes to
characters.
4
5. Steps/Algorithm
to build Huffman
table
1. Create a leaf node for each unique character and build a min heap
of all leaf nodes (Min Heap is used as a priority queue. The value
of frequency field is used to compare two nodes in min heap.
Initially, the least frequent character is at root)
2. Extract two nodes with the minimum frequency from the min
heap.
3. Create a new internal node with a frequency equal to the sum of
the two nodes frequencies. Make the first extracted node as its left
child and the other extracted node as its right child. Add this node
to the min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The
remaining node is the root node and the tree is complete.
5
7. Step A Create a leaf node for each character and build a min heap using all the nodes (The frequency value is used to
compare two nodes in min heap)
Resulted
Leaf Nodes
for each
Character
7
8. Step B- Repeat
the following
steps till heap
has more than
one nodes
Step 1: Extract two
nodes, say x and y,
with minimum
frequency from the
heap
Step 2 : Create a new
internal node z with x
as its left child and y
as its right child. Also
frequency(z)=
frequency(x)+frequen
cy(y)
Step 3: Add z to min
heap. Then Extract
and Combine node
u with an internal
node having 4 as
the frequency then
add the new
internal node to
priority queue
8
9. Step B- Repeat
the following
steps till heap
has more than
one nodes
Step 4: Extract and
combine node a
with an internal
node having 8 as
the frequency then
add the new
internal node to
priority queue
Step 5: Extract and
Combine nodes i
and s then add the
new internal node
to priority queue-
Step 6: Extract and
Combine nodes i
and s then add the
new internal node
to priority queue-
9
10. Step B- Repeat
the following
steps till heap
has more than
one nodes
Step 7: Extract and
Combine node e
with an internal
node having 18 as
the frequency then
add the new
internal node to
priority queue-
Step 8: Finally, Extract
and Combine internal
nodes having 25 and
33 as the frequency
then add the new
internal node to
priority queue-
10
11. Steps to
traversing
Huffman Tree
1. Create an auxiliary array
2. Traverse the tree starting from root node
3. Add 0 to array while traversing the left
child and add 1 to array while traversing
the right child
4. Print the array elements whenever a leaf
node is found
11
12. Need for
traversing
Huffman Tree
Suppose the string “staeiout” needs to be
transmitted from computer A (sender) to
computer B (receiver) across a network. Using
concepts of Huffman encoding, the string gets
encoded to binary codes at sender address
12
13. Conclusion of all the steps used to demonstrated in above sample example
12
11
10
9
8
7
6
5
4
3
2
1
Create leaf nodes for
all the characters and
add them to the min
heap.
Extract two nodes, say
x and y, with minimum
frequency from the
heap Add z to min heap
Since internal node
with frequency 58 is
the only node in the
queue, it becomes the
root of Huffman tree.
Create an auxiliary
array
Add 0 to array while
traversing the left child
and add 1 to array
while traversing the
right child
Repeat the following
steps till heap has
more than one nodes
Create a new internal
node z with x as its left
child and y as its right
child. Also
frequency(z)=
frequency(x)+frequenc
y(y)
Extract and Combine
node u with an internal
node having 4 as the
frequency
Last node in the heap
is the root of Huffman
tree
Traverse the tree
starting from root node
Print the array
elements
whenever a leaf
node is found
13
14. Real-life
Applications of
Huffman Codding
Applications of Huffman Codding-:
◎ Huffman encoding is widely used in compression formats
like GZIP, PKZIP (WinZip) and BZIP2.
◎ Multimedia codecs like JPEG, PNG and MP3 uses Huffman
encoding (to be more precise the prefix codes)
◎ Huffman encoding still dominates the compression industry since
newer arithmetic and range coding schemes are avoided due to
their patent issues.
14
15. Advantages of Huffman Encoding-
◎ This encoding scheme results in saving lot of storage space,
since the binary codes generated are variable in length
◎ It generates shorter binary codes for encoding
symbols/characters that appear more frequently in the input
string
◎ The binary codes generated are prefix-free
Advantages/
Disadvantages
15
16. Disadvantages of Huffman Encoding-
◎ Lossless data encoding schemes, like Huffman encoding, achieve a
lower compression ratio compared to lossy encoding techniques. Thus,
lossless techniques like Huffman encoding are suitable only for
encoding text and program files and are unsuitable for encoding digital
images.
◎ Huffman encoding is a relatively slower process since it uses two
passes- one for building the statistical model and another for encoding.
Thus, the lossless techniques that use Huffman encoding are
considerably slower than others.
◎ Since length of all the binary codes is different, it becomes difficult for
the decoding software to detect whether the encoded data is corrupt.
This can result in an incorrect decoding and subsequently, a wrong
output.
Advantages/
Disadvantages
16
17. References 1. Bao Ergude;Li Weisheng;Fan A Study and Implementation of
the Huffman Algorithm Based on Condensed Huffman Table
2. Rabia Arshad;Adeel Saleem;Danista Khan Performance
comparison of Huffman Coding and double Huffman Coding
3. Hoang-Anh Pham;Van-Hieu 2010 Fifth IEEE International
Symposium on Electronic Design, Test & Applications An
Adaptive Huffman Decoding Algorithm for MP3 Decoder
4. Studytonight Huffman Coding Algorithm
5. GeekforGeeks Application of Huffman algorithm
17