DATA COMPRESSION
Simple Dictionary Compression
Manish T I
• It is a two-pass algorithm: the first pass analyzes the data in the
source file, and the second pass compresses the data to a file.
First Pass:-
• The distinct bytes in the source file are identified.
• The number of times each byte occurs in the source file (its
frequency) is counted.
• The list of bytes is sorted in descending order of frequency, so that
the most frequent bytes (characters) appear at the top of the list.
This sorted list is known as the dictionary.
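The first pass can be sketched in Python as follows. This is only an illustration: the function name build_dictionary is hypothetical, and breaking ties between equally frequent bytes by byte value is an assumption, since the slides do not say how ties are ordered.

```python
from collections import Counter


def build_dictionary(data: bytes) -> bytes:
    """First pass: return the distinct bytes of data, most frequent first."""
    counts = Counter(data)  # byte value -> number of occurrences
    # Sort by descending frequency; ties are broken by byte value (an assumption).
    ordered = sorted(counts, key=lambda b: (-counts[b], b))
    return bytes(ordered)


if __name__ == "__main__":
    print(build_dictionary(b"TTVVVEGTVEN"))
```

Since the list holds distinct byte values, the dictionary can never have more than 256 entries.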
Second Pass:-
• The source file is read again, byte by byte.
• Each byte is located in the dictionary by a direct search and its
index is noted.
• The index can take one of 256 values, ranging from 0 to 255, so it
needs between 1 and 8 bits.
• The index is written to the compressed file, preceded by a 3-bit code
denoting the index's length.
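A minimal sketch of the second pass, assuming indexes start at 0 as stated above and representing the output as a string of '0' and '1' characters rather than packed bits. The names encode_index and second_pass are hypothetical, and the dictionary is passed in as a bytes object.

```python
def encode_index(index: int) -> str:
    """Return the 3-bit length code followed by the index bits."""
    if not 0 <= index <= 255:
        raise ValueError("index must be in the range 0..255")
    length = max(1, index.bit_length())  # 1..8 bits; index 0 still needs one bit
    # The 3-bit code stores length - 1, so 000 means a 1-bit index.
    return format(length - 1, "03b") + format(index, f"0{length}b")


def second_pass(data: bytes, dictionary: bytes) -> str:
    """Replace every byte by the code of its index in the dictionary."""
    # bytes.index performs the direct (linear) search described in the slides.
    return "".join(encode_index(dictionary.index(b)) for b in data)


if __name__ == "__main__":
    print(second_pass(b"VVT", b"VT"))
```

A real implementation would pack these bits into bytes and would also need to store the dictionary so that the file can be decoded.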
• Index Table
Binary Value   Value   Index length (bits)
000            0       1
001            1       2
010            2       3
011            3       4
100            4       5
101            5       6
110            6       7
111            7       8
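The table can be generated programmatically. The sketch below also lists, for each length, the range of index values that need exactly that many bits; that range column is an addition for illustration and is not in the slides, and the function name length_table is hypothetical.

```python
def length_table():
    """Return rows of (3-bit code, value, index length, lowest index, highest index)."""
    rows = []
    for value in range(8):
        bits = value + 1  # the 3-bit code stores the index length minus one
        lowest = 0 if bits == 1 else 1 << (bits - 1)
        highest = (1 << bits) - 1
        rows.append((format(value, "03b"), value, bits, lowest, highest))
    return rows


if __name__ == "__main__":
    for row in length_table():
        print(*row)
```

Only indexes 0 and 1 fit in a single bit, while indexes 128 to 255 need the full 8 bits.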
Input File sample data :- TTVVVEGTVEN
Dictionary File :- V T E G N (descending order of frequency, as implied by the codes in the compressed file)
Compressed File (4 to 11 bits per byte)
Byte   Code (3-bit length, then index)   No. of bits used
T      001 10                            5
V      000 1                             4
V      000 1                             4
V      000 1                             4
E      001 11                            5
G      010 100                           6
T      001 10                            5
V      000 1                             4
E      001 11                            5
N      010 101                           6
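The sample codes can be reproduced with a short Python sketch, under two assumptions that the slides do not state outright. First, the dictionary order V, T, E, G, N is inferred from the codes. Second, the sample codes correspond to dictionary positions counted from 1 (V=1, T=2, E=3, G=4, N=5), so the hypothetical constant FIRST_INDEX is set to 1 here; setting it to 0 gives the 0 to 255 convention instead. The sketch encodes the ten symbols exactly as they are listed in the table, whereas the sample data line shows eleven characters.

```python
def code_for(position: int) -> str:
    """3-bit length code followed by the position in binary."""
    length = max(1, position.bit_length())
    return format(length - 1, "03b") + format(position, f"0{length}b")


DICTIONARY = "VTEGN"  # assumed order, inferred from the sample codes
FIRST_INDEX = 1       # the sample codes appear to count positions from 1

symbols = "TVVVEGTVEN"  # the ten symbols listed in the table
codes = [code_for(DICTIONARY.index(s) + FIRST_INDEX) for s in symbols]
bits_used = [len(c) for c in codes]
total_bits = sum(bits_used)

if __name__ == "__main__":
    for s, c, n in zip(symbols, codes, bits_used):
        print(s, c, n)
    print("total:", total_bits, "bits; uncompressed:", 8 * len(symbols), "bits")
```

Under these assumptions the ten symbols take 48 bits instead of 80, not counting the space needed to store the dictionary.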
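• Compression is achieved because the dictionary is sorted by the
frequency of the bytes, so the most common bytes get the smallest
indexes and the shortest codes. Each 8-bit byte is replaced by a
quantity of between 4 and 11 bits (a 3-bit length code followed by a
1- to 8-bit index).
• The dictionary is sorted by frequency, not by byte value.
• Disadvantage :- Compression is slow because it needs two passes over
the source file; decompression is not slow.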
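Decompression needs only one pass and a direct table lookup per symbol, which is why it is not slow. A minimal decoder sketch follows, assuming 0-based indexes, an input given as a string of '0' and '1' characters with no padding bits, and a dictionary already available as a bytes object. The function name decode is hypothetical.

```python
def decode(bits: str, dictionary: bytes) -> bytes:
    """Decode a string of '0'/'1' characters back into the original bytes."""
    out = bytearray()
    pos = 0
    while pos < len(bits):
        length = int(bits[pos:pos + 3], 2) + 1  # 3-bit code stores length - 1
        pos += 3
        index = int(bits[pos:pos + length], 2)  # the next 'length' bits are the index
        pos += length
        out.append(dictionary[index])           # direct lookup, no searching
    return bytes(out)


if __name__ == "__main__":
    print(decode("0000" + "0001" + "00110", b"VTE"))
```

The best case, 4 bits per byte, halves the data; the worst case, 11 bits per byte, expands it by a factor of 11/8 = 1.375, so rare bytes cost more than they would uncompressed.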
Reference:-
David Salomon, Data Compression: The Complete Reference,
Springer Science & Business Media, 2004
For any queries contact:
Web: www.iprg.co.in
E-mail: manishti2004@gmail.com
Facebook: @ImageProcessingResearchGroup
