SlideShare a Scribd company logo
1 of 36
Greedy Algorithms
(Huffman Coding)
Prepared by
Md. Shafiuzzaman
Lecturer, Dept. of CSE, JUST
David Huffman’s idea
• A Term paper at MIT
• Build the tree (code) bottom-up in a greedy fashion
Huffman Coding
• A technique to compress data effectively
• Usually between 20%-90% compression
• Lossless compression
• No information is lost
• When decompress, you get the original file
Original file
Compressed file
Huffman coding
Huffman Coding: Applications
• Saving space
• Store compressed files instead of original files
• Transmitting files or data
• Send compressed data to save transmission time and power
• Encryption and decryption
• Cannot read the compressed file without knowing the “key”
Original file
Compressed file
Huffman coding
Main Idea: Frequency-Based Encoding
• Assume in this file only 6 characters appear
• E, A, C, T, K, N
• The frequencies are:
Character Frequency
E 10,000
A 4,000
C 300
T 200
K 100
N 100
• Option I (No Compression)
– Each character = 1 Byte (8 bits)
– Total file size = 14,700 * 8 = 117,600 bits
• Option 2 (Fixed size compression)
– We have 6 characters, so we need
3 bits to encode them
– Total file size = 14,700 * 3 = 44,100 bits
Character Fixed Encoding
E 000
A 001
C 010
T 100
K 110
N 111
Main Idea: Frequency-Based Encoding
(Cont’d)
• Assume in this file only 6 characters appear
• E, A, C, T, K, N
• The frequencies are:
Character Frequency
E 10,000
A 4,000
C 300
T 200
K 100
N 100• Option 3 (Huffman compression)
– Variable-length compression
– Assign shorter codes to more frequent characters and longer
codes to less frequent characters
– Total file size:
Char. HuffmanEncoding
E 0
A 10
C 110
T 1110
K 11110
N 11111
(10,000 x 1) + (4,000 x 2) + (300 x 3) + (200 x 4) + (100 x
5) + (100 x 5) = 20,700 bits
Huffman Coding
• A variable-length coding for characters
• More frequent characters  shorter codes
• Less frequent characters  longer codes
• It is not like ASCII coding where all characters have the same coding
length (8 bits)
• Two main questions
• How to assign codes (Encoding process)?
• How to decode (from the compressed file, generate the original file)
(Decoding process)?
Decoding for fixed-length codes is much
easier
Character Fixed-length
Encoding
E 000
A 001
C 010
T 100
K 110
N 111
010001100110111000
010 001 100 110 111 000
Divide into 3’s
C A T K N E
Decode
Decoding for variable-length codes is not
that easy…
Character Variable-length
Encoding
E 0
A 00
C 001
… …
… …
… …
000001
It means what???
EEEC EAC AEC
Huffman encoding guarantees to avoid this
uncertainty …Always have a single decoding
Huffman Algorithm
• Step 1: Get Frequencies
• Scan the file to be compressed and count the occurrence of each character
• Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
• Build a Huffman-code tree (binary tree)
• Traverse the tree to assign codes
• Step 3: Encode (Compress)
• Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
• Huffman tree is the key to decompress the file
Step 1: Get Frequencies
Eerie eyes seen near lake.
Char Freq. Char Freq. Char Freq.
E 1 y 1 k 1
e 8 s 2 . 1
r 2 n 2
i 1 a 2
space 4 l 1
Input File:
Step 2: Build Huffman Tree & Assign Codes
• It is a binary tree in which each character is a leaf node
• Initially each node is a separate root
• At each step
• Select two roots with smallest frequency and connect them to a new
parent (Break ties arbitrary) [The greedy choice]
• The parent will get the sum of frequencies of the two child nodes
• Repeat until you have one root
Example
Char Freq. Char Freq. Char Freq.
E 1 y 1 k 1
e 8 s 2 . 1
r 2 n 2
i 1 a 2
space 4 l 1
E
1
i
1
y
1
l
1
k
1
.
1
r
2
s
2
n
2
a
2
☐
4
e
8
Each char. has a
leaf node with its
frequency
Find the smallest two frequencies…Replace them with their parent
E
1
i
1
y
1
l
1
k
1
.
1
r
2
s
2
n
2
a
2
☐
4
e
8
E
1
i
1
2
E
1
i
1
y
1
l
1
k
1
.
1
r
2
s
2
n
2
a
2
☐
4
e
8
2
Find the smallest two frequencies…Replace them
with their parent
y
1
l
1
2
E
1
i
1
k
1
.
1
r
2
s
2
n
2
a
2
☐
4
e
8
2
y
1
l
1
2
Find the smallest two frequencies…Replace them
with their parent
k
1
.
1
2
E
1
i
1
r
2
s
2
n
2
a
2
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
Find the smallest two frequencies…Replace them
with their parent
r
2
s
2
4
E
1
i
1
n
2
a
2
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
Find the smallest two frequencies…Replace them
with their parent
n
2
a
2
4
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
Find the smallest two frequencies…Replace them
with their parent
E
1
i
1
2
y
1
l
1
2
4
E
1
i
1
☐
4
e
82
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4 4
Find the smallest two frequencies…Replace them
with their parent
☐
4
k
1
.
1
2
6
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4 4 6
Find the smallest two frequencies…Replace them
with their parent
r
2
s
2
4
n
2
a
2
4
8
E
1
i
1
☐
4
e
82
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6 8
Find the smallest two frequencies…Replace them
with their parent
E
1
i
1
☐
4
2
y
1
l
1
2
k
1
.
1
2
4
6
10
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2r
2
s
2
4
n
2
a
2
4 4
6
8 10
Find the smallest two frequencies…Replace them
with their parent
e
8
r
2
s
2
4
n
2
a
2
4
8
16
E
1
i
1
☐
4
e
82
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6
8
10 16
Find the smallest two frequencies…Replace them
with their parent
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6
8
10
16
26
Now we have a single root…This is the Huffman Tree
Lets Analyze Huffman Tree
• All characters are at the leaf nodes
• The number at the root = # of characters in the file
• High-frequency chars (E.g., “e”) are near the root
• Low-frequency chars are far from the root
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6
8
10
16
26
Lets Assign Codes
• Traverse the tree
• Any left edge  add label 0
• As right edge  add label 1
• The code for each character is its root-to-leaf label sequence
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6
8
10
16
26
• Traverse the tree
• Any left edge  add label 0
• As right edge  add label 1
• The code for each character is its root-to-leaf label sequence
E
1
i
1
☐
4
e
8
2
y
1
l
1
2
k
1
.
1
2
r
2
s
2
4
n
2
a
2
4
4
6
8
10
16
26
0
1
0
0
0
0
0
0 0
1
1
11
1
1
1
10
001 1
Lets Assign Codes
• Traverse the tree
• Any left edge  add label 0
• As right edge  add label 1
• The code for each character is its root-to-leaf label sequence
Lets Assign Codes
Char Code
E 0000
i 0001
y 0010
l 0011
k 0100
. 0101
space☐ 011
e 10
r 1100
s 1101
n 1110
a 1111
Coding Table
Huffman Algorithm
• Step 1: Get Frequencies
• Scan the file to be compressed and count the occurrence of each character
• Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
• Build a Huffman-code tree (binary tree)
• Traverse the tree to assign codes
• Step 3: Encode (Compress)
• Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
• Huffman tree is the key to decompess the file
Generate the
encoded file
Step 3: Encode (Compress) The File
Eerie eyes seen near lake.
Input File:
Char Code
E 0000
i 0001
y 0010
l 0011
k 0100
. 0101
space☐ 011
e 10
r 1100
s 1101
n 1110
a 1111
Coding Table
+
000010 1100 000110 ….
Notice that no code is prefix to any other code 
Ensures the decoding will be unique (Unlike Slide 8)
Step 4: Decode (Decompress)
• Must have the encoded file + the coding tree
• Scan the encoded file
• For each 0  move left in the tree
• For each 1  move right
• Until reach a leaf node  Emit that character and go back to the root
000010 1100 000110 ….
+
Eerie …
Generate the
original file
Huffman Algorithm
• Step 1: Get Frequencies
• Scan the file to be compressed and count the occurrence of each character
• Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
• Build a Huffman-code tree (binary tree)
• Traverse the tree to assign codes
• Step 3: Encode (Compress)
• Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
• Huffman tree is the key to decompess the file
The Algorithm
• An appropriate data structure is a binary min-heap
• Rebuilding the heap is lgn and n-1 extractions are made, so the complexity is
O( nlgn)
• The encoding is NOT unique, other encoding may work just as well, but none
will work better
Lab Assignment
• Example Input: Huffman coding is a data compression algorithm.
• Output:
Huffman Codes

More Related Content

What's hot

Huffman's Alforithm
Huffman's AlforithmHuffman's Alforithm
Huffman's AlforithmRoohaali
 
Huffman coding
Huffman codingHuffman coding
Huffman codingNihal Gupta
 
Huffman Coding
Huffman CodingHuffman Coding
Huffman CodingEhtisham Ali
 
Huffman Code Decoding
Huffman Code DecodingHuffman Code Decoding
Huffman Code DecodingRex Yuan
 
Komdat-Kompresi Data
Komdat-Kompresi DataKomdat-Kompresi Data
Komdat-Kompresi Datamursalinfajri007
 
Text compression
Text compressionText compression
Text compressionSammer Qader
 
Data compression huffman coding algoritham
Data compression huffman coding algorithamData compression huffman coding algoritham
Data compression huffman coding algorithamRahul Khanwani
 
Module 4 Arithmetic Coding
Module 4 Arithmetic CodingModule 4 Arithmetic Coding
Module 4 Arithmetic Codinganithabalaprabhu
 
Clean Code Tutorial maXbox starter24
Clean Code Tutorial maXbox starter24Clean Code Tutorial maXbox starter24
Clean Code Tutorial maXbox starter24Max Kleiner
 

What's hot (16)

Huffman's Alforithm
Huffman's AlforithmHuffman's Alforithm
Huffman's Alforithm
 
Huffman coding
Huffman codingHuffman coding
Huffman coding
 
Huffman Coding
Huffman CodingHuffman Coding
Huffman Coding
 
Huffman Coding
Huffman CodingHuffman Coding
Huffman Coding
 
Lec32
Lec32Lec32
Lec32
 
Huffman Code Decoding
Huffman Code DecodingHuffman Code Decoding
Huffman Code Decoding
 
Komdat-Kompresi Data
Komdat-Kompresi DataKomdat-Kompresi Data
Komdat-Kompresi Data
 
Huffman
HuffmanHuffman
Huffman
 
Text compression
Text compressionText compression
Text compression
 
Data compression huffman coding algoritham
Data compression huffman coding algorithamData compression huffman coding algoritham
Data compression huffman coding algoritham
 
Lossless
LosslessLossless
Lossless
 
Lec5 Compression
Lec5 CompressionLec5 Compression
Lec5 Compression
 
Lossless
LosslessLossless
Lossless
 
Module 4 Arithmetic Coding
Module 4 Arithmetic CodingModule 4 Arithmetic Coding
Module 4 Arithmetic Coding
 
information theory
information theoryinformation theory
information theory
 
Clean Code Tutorial maXbox starter24
Clean Code Tutorial maXbox starter24Clean Code Tutorial maXbox starter24
Clean Code Tutorial maXbox starter24
 

Similar to Huffman Codes

Greedy Algorithms Huffman Coding.ppt
Greedy Algorithms  Huffman Coding.pptGreedy Algorithms  Huffman Coding.ppt
Greedy Algorithms Huffman Coding.pptRuchika Sinha
 
Farhana shaikh webinar_huffman coding
Farhana shaikh webinar_huffman codingFarhana shaikh webinar_huffman coding
Farhana shaikh webinar_huffman codingFarhana Shaikh
 
Huffman > Data Structures & Algorithums
Huffman > Data Structures & AlgorithumsHuffman > Data Structures & Algorithums
Huffman > Data Structures & AlgorithumsAin-ul-Moiz Khawaja
 
Huffman Tree And Its Application
Huffman Tree And Its ApplicationHuffman Tree And Its Application
Huffman Tree And Its ApplicationPapu Kumar
 
Data structures' project
Data structures' projectData structures' project
Data structures' projectBehappy Seehappy
 
Module-IV 094.pdf
Module-IV 094.pdfModule-IV 094.pdf
Module-IV 094.pdfSamrajECE
 
Intro compute
Intro computeIntro compute
Intro computeGHOTRAANGEL
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRRGHOTRAANGEL
 
Intro compute
Intro computeIntro compute
Intro computeUsman Shah
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTIONAmit Sharma
 
CS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfCS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfssuser034ce1
 
CS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfCS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfssuser034ce1
 
introduction for computers
introduction for computersintroduction for computers
introduction for computersYogesh Chaure
 
Intro computer fundamentals
Intro computer fundamentalsIntro computer fundamentals
Intro computer fundamentalsPrabhu Govind
 
huffman algoritm upload for understand.ppt
huffman algoritm upload for understand.ppthuffman algoritm upload for understand.ppt
huffman algoritm upload for understand.pptSadiaSharmin40
 
huffman codes and algorithm
huffman codes and algorithm huffman codes and algorithm
huffman codes and algorithm Vinay379568
 
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhh
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhhhuffman_nyu.ppt ghgghtttjghh hhhhhhhhhhh
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhhBARUNSINGH43
 
23 priority queue
23 priority queue23 priority queue
23 priority queueGodo Dodo
 

Similar to Huffman Codes (20)

Greedy Algorithms Huffman Coding.ppt
Greedy Algorithms  Huffman Coding.pptGreedy Algorithms  Huffman Coding.ppt
Greedy Algorithms Huffman Coding.ppt
 
Farhana shaikh webinar_huffman coding
Farhana shaikh webinar_huffman codingFarhana shaikh webinar_huffman coding
Farhana shaikh webinar_huffman coding
 
Huffman > Data Structures & Algorithums
Huffman > Data Structures & AlgorithumsHuffman > Data Structures & Algorithums
Huffman > Data Structures & Algorithums
 
Huffman tree
Huffman tree Huffman tree
Huffman tree
 
Huffman Tree And Its Application
Huffman Tree And Its ApplicationHuffman Tree And Its Application
Huffman Tree And Its Application
 
Data structures' project
Data structures' projectData structures' project
Data structures' project
 
Module-IV 094.pdf
Module-IV 094.pdfModule-IV 094.pdf
Module-IV 094.pdf
 
Intro compute
Intro computeIntro compute
Intro compute
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRR
 
Intro compute
Intro computeIntro compute
Intro compute
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTION
 
CS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfCS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdf
 
CS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdfCS-102 Data Structures huffman coding.pdf
CS-102 Data Structures huffman coding.pdf
 
introduction for computers
introduction for computersintroduction for computers
introduction for computers
 
Intro computer fundamentals
Intro computer fundamentalsIntro computer fundamentals
Intro computer fundamentals
 
huffman algoritm upload for understand.ppt
huffman algoritm upload for understand.ppthuffman algoritm upload for understand.ppt
huffman algoritm upload for understand.ppt
 
huffman codes and algorithm
huffman codes and algorithm huffman codes and algorithm
huffman codes and algorithm
 
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhh
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhhhuffman_nyu.ppt ghgghtttjghh hhhhhhhhhhh
huffman_nyu.ppt ghgghtttjghh hhhhhhhhhhh
 
Huffman
HuffmanHuffman
Huffman
 
23 priority queue
23 priority queue23 priority queue
23 priority queue
 

More from Md. Shafiuzzaman Hira (20)

Introduction to Web development
Introduction to Web developmentIntroduction to Web development
Introduction to Web development
 
Software measurement and estimation
Software measurement and estimationSoftware measurement and estimation
Software measurement and estimation
 
Why do we test software?
Why do we test software?Why do we test software?
Why do we test software?
 
Software Requirements engineering
Software Requirements engineeringSoftware Requirements engineering
Software Requirements engineering
 
Software architectural patterns
Software architectural patternsSoftware architectural patterns
Software architectural patterns
 
Class based modeling
Class based modelingClass based modeling
Class based modeling
 
Class diagram
Class diagramClass diagram
Class diagram
 
State diagram
State diagramState diagram
State diagram
 
Use case Modeling
Use case ModelingUse case Modeling
Use case Modeling
 
User stories
User storiesUser stories
User stories
 
Dev ops
Dev opsDev ops
Dev ops
 
Agile Methodology
Agile MethodologyAgile Methodology
Agile Methodology
 
Software Process Model
Software Process ModelSoftware Process Model
Software Process Model
 
Introduction to Software Engineering Course
Introduction to Software Engineering CourseIntroduction to Software Engineering Course
Introduction to Software Engineering Course
 
C files
C filesC files
C files
 
C pointers
C pointersC pointers
C pointers
 
C structures
C structuresC structures
C structures
 
How to Create Python scripts
How to Create Python scriptsHow to Create Python scripts
How to Create Python scripts
 
Regular expressions using Python
Regular expressions using PythonRegular expressions using Python
Regular expressions using Python
 
Password locker project
Password locker project Password locker project
Password locker project
 

Recently uploaded

High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...Call Girls in Nagpur High Profile
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEslot gacor bisa pakai pulsa
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube ExchangerAnamika Sarkar
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 

Recently uploaded (20)

High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned ďťżTube Exchanger
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 

Huffman Codes

  • 1. Greedy Algorithms (Huffman Coding) Prepared by Md. Shafiuzzaman Lecturer, Dept. of CSE, JUST
  • 2. David Huffman’s idea • A Term paper at MIT • Build the tree (code) bottom-up in a greedy fashion
  • 3. Huffman Coding • A technique to compress data effectively • Usually between 20%-90% compression • Lossless compression • No information is lost • When decompress, you get the original file Original file Compressed file Huffman coding
  • 4. Huffman Coding: Applications • Saving space • Store compressed files instead of original files • Transmitting files or data • Send compressed data to save transmission time and power • Encryption and decryption • Cannot read the compressed file without knowing the “key” Original file Compressed file Huffman coding
  • 5. Main Idea: Frequency-Based Encoding • Assume in this file only 6 characters appear • E, A, C, T, K, N • The frequencies are: Character Frequency E 10,000 A 4,000 C 300 T 200 K 100 N 100 • Option I (No Compression) – Each character = 1 Byte (8 bits) – Total file size = 14,700 * 8 = 117,600 bits • Option 2 (Fixed size compression) – We have 6 characters, so we need 3 bits to encode them – Total file size = 14,700 * 3 = 44,100 bits Character Fixed Encoding E 000 A 001 C 010 T 100 K 110 N 111
  • 6. Main Idea: Frequency-Based Encoding (Cont’d) • Assume in this file only 6 characters appear • E, A, C, T, K, N • The frequencies are: Character Frequency E 10,000 A 4,000 C 300 T 200 K 100 N 100• Option 3 (Huffman compression) – Variable-length compression – Assign shorter codes to more frequent characters and longer codes to less frequent characters – Total file size: Char. HuffmanEncoding E 0 A 10 C 110 T 1110 K 11110 N 11111 (10,000 x 1) + (4,000 x 2) + (300 x 3) + (200 x 4) + (100 x 5) + (100 x 5) = 20,700 bits
  • 7. Huffman Coding • A variable-length coding for characters • More frequent characters  shorter codes • Less frequent characters  longer codes • It is not like ASCII coding where all characters have the same coding length (8 bits) • Two main questions • How to assign codes (Encoding process)? • How to decode (from the compressed file, generate the original file) (Decoding process)?
  • 8. Decoding for fixed-length codes is much easier Character Fixed-length Encoding E 000 A 001 C 010 T 100 K 110 N 111 010001100110111000 010 001 100 110 111 000 Divide into 3’s C A T K N E Decode
  • 9. Decoding for variable-length codes is not that easy… Character Variable-length Encoding E 0 A 00 C 001 … … … … … … 000001 It means what??? EEEC EAC AEC Huffman encoding guarantees to avoid this uncertainty …Always have a single decoding
  • 10. Huffman Algorithm • Step 1: Get Frequencies • Scan the file to be compressed and count the occurrence of each character • Sort the characters based on their frequency • Step 2: Build Tree & Assign Codes • Build a Huffman-code tree (binary tree) • Traverse the tree to assign codes • Step 3: Encode (Compress) • Scan the file again and replace each character by its code • Step 4: Decode (Decompress) • Huffman tree is the key to decompress the file
  • 11. Step 1: Get Frequencies Eerie eyes seen near lake. Char Freq. Char Freq. Char Freq. E 1 y 1 k 1 e 8 s 2 . 1 r 2 n 2 i 1 a 2 space 4 l 1 Input File:
  • 12. Step 2: Build Huffman Tree & Assign Codes • It is a binary tree in which each character is a leaf node • Initially each node is a separate root • At each step • Select two roots with smallest frequency and connect them to a new parent (Break ties arbitrary) [The greedy choice] • The parent will get the sum of frequencies of the two child nodes • Repeat until you have one root
  • 13. Example Char Freq. Char Freq. Char Freq. E 1 y 1 k 1 e 8 s 2 . 1 r 2 n 2 i 1 a 2 space 4 l 1 E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 ☐ 4 e 8 Each char. has a leaf node with its frequency
  • 14. Find the smallest two frequencies…Replace them with their parent E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 ☐ 4 e 8 E 1 i 1 2
  • 15. E 1 i 1 y 1 l 1 k 1 . 1 r 2 s 2 n 2 a 2 ☐ 4 e 8 2 Find the smallest two frequencies…Replace them with their parent y 1 l 1 2
  • 16. E 1 i 1 k 1 . 1 r 2 s 2 n 2 a 2 ☐ 4 e 8 2 y 1 l 1 2 Find the smallest two frequencies…Replace them with their parent k 1 . 1 2
  • 17. E 1 i 1 r 2 s 2 n 2 a 2 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 Find the smallest two frequencies…Replace them with their parent r 2 s 2 4
  • 18. E 1 i 1 n 2 a 2 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 Find the smallest two frequencies…Replace them with their parent n 2 a 2 4
  • 19. E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 Find the smallest two frequencies…Replace them with their parent E 1 i 1 2 y 1 l 1 2 4
  • 20. E 1 i 1 ☐ 4 e 82 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 Find the smallest two frequencies…Replace them with their parent ☐ 4 k 1 . 1 2 6
  • 21. E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 Find the smallest two frequencies…Replace them with their parent r 2 s 2 4 n 2 a 2 4 8
  • 22. E 1 i 1 ☐ 4 e 82 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 8 Find the smallest two frequencies…Replace them with their parent E 1 i 1 ☐ 4 2 y 1 l 1 2 k 1 . 1 2 4 6 10
  • 23. E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2r 2 s 2 4 n 2 a 2 4 4 6 8 10 Find the smallest two frequencies…Replace them with their parent e 8 r 2 s 2 4 n 2 a 2 4 8 16
  • 24. E 1 i 1 ☐ 4 e 82 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 8 10 16 Find the smallest two frequencies…Replace them with their parent
  • 26. Lets Analyze Huffman Tree • All characters are at the leaf nodes • The number at the root = # of characters in the file • High-frequency chars (E.g., “e”) are near the root • Low-frequency chars are far from the root E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 8 10 16 26
  • 27. Lets Assign Codes • Traverse the tree • Any left edge  add label 0 • As right edge  add label 1 • The code for each character is its root-to-leaf label sequence E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 8 10 16 26
  • 28. • Traverse the tree • Any left edge  add label 0 • As right edge  add label 1 • The code for each character is its root-to-leaf label sequence E 1 i 1 ☐ 4 e 8 2 y 1 l 1 2 k 1 . 1 2 r 2 s 2 4 n 2 a 2 4 4 6 8 10 16 26 0 1 0 0 0 0 0 0 0 1 1 11 1 1 1 10 001 1 Lets Assign Codes
  • 29. • Traverse the tree • Any left edge  add label 0 • As right edge  add label 1 • The code for each character is its root-to-leaf label sequence Lets Assign Codes Char Code E 0000 i 0001 y 0010 l 0011 k 0100 . 0101 space☐ 011 e 10 r 1100 s 1101 n 1110 a 1111 Coding Table
  • 30. Huffman Algorithm • Step 1: Get Frequencies • Scan the file to be compressed and count the occurrence of each character • Sort the characters based on their frequency • Step 2: Build Tree & Assign Codes • Build a Huffman-code tree (binary tree) • Traverse the tree to assign codes • Step 3: Encode (Compress) • Scan the file again and replace each character by its code • Step 4: Decode (Decompress) • Huffman tree is the key to decompess the file
  • 31. Generate the encoded file Step 3: Encode (Compress) The File Eerie eyes seen near lake. Input File: Char Code E 0000 i 0001 y 0010 l 0011 k 0100 . 0101 space☐ 011 e 10 r 1100 s 1101 n 1110 a 1111 Coding Table + 000010 1100 000110 …. Notice that no code is prefix to any other code  Ensures the decoding will be unique (Unlike Slide 8)
  • 32. Step 4: Decode (Decompress) • Must have the encoded file + the coding tree • Scan the encoded file • For each 0  move left in the tree • For each 1  move right • Until reach a leaf node  Emit that character and go back to the root 000010 1100 000110 …. + Eerie … Generate the original file
  • 33. Huffman Algorithm • Step 1: Get Frequencies • Scan the file to be compressed and count the occurrence of each character • Sort the characters based on their frequency • Step 2: Build Tree & Assign Codes • Build a Huffman-code tree (binary tree) • Traverse the tree to assign codes • Step 3: Encode (Compress) • Scan the file again and replace each character by its code • Step 4: Decode (Decompress) • Huffman tree is the key to decompess the file
  • 34. The Algorithm • An appropriate data structure is a binary min-heap • Rebuilding the heap is lgn and n-1 extractions are made, so the complexity is O( nlgn) • The encoding is NOT unique, other encoding may work just as well, but none will work better
  • 35. Lab Assignment • Example Input: Huffman coding is a data compression algorithm. • Output: