A comparison of various data compression techniques that clearly differentiates them. It aims to be precise and to focus on the techniques themselves rather than on the topic in general.
What is Data Compression?
The art of representing information in a compact form is called data compression.
These representations are created by identifying and using the structure in the data.
Why is it necessary?
● Data Storage
Data compression reduces the size of a file, which reduces the storage space required to store that file.
● Data Transmission
It saves the time required to transmit a file.
Components of Data Compression
Data compression involves two main components:
1. Encoding Algorithm
This algorithm takes a message and generates a compressed representation of that message.
2. Decoding Algorithm
This algorithm reconstructs the original message, or some approximation of it, from the compressed representation.
Data Compression Techniques
Broadly, all the techniques fall into two basic types: lossless and lossy.
Lossless Compression Technique
As the name suggests, no data is lost.
The original message can be reconstructed exactly from the compressed message.
Generally used for text files, spreadsheet files, and important documents.
Examples of lossless techniques are RLE and Huffman coding.
RLE - Run Length Encoding
A simple compression technique.
Each run of consecutive identical numbers or letters is replaced by the number of times the symbol repeats, followed by the symbol itself.
This method is most effective on numeric data, especially when only the two digits 1 and 0 occur.
For example, given a stream of symbols, count the repetitions in each run.
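The run-counting idea above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `rle_encode` and `rle_decode` and the sample bit stream are assumptions made for this example, not taken from the slides.

```python
from itertools import groupby

def rle_encode(stream):
    """Turn each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(stream)]

def rle_decode(pairs):
    """Expand (count, symbol) pairs back into the original stream."""
    return "".join(symbol * count for count, symbol in pairs)

# Hypothetical sample stream made of only the digits 0 and 1.
sample = "0000011100"
encoded = rle_encode(sample)
print(encoded)
print(rle_decode(encoded) == sample)
```

Because the count comes first and the symbol second, the pairs follow the order described in the text. Streams with short runs gain little or can even grow, which is why long runs of 0s and 1s suit this method best.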
Huffman Coding
It uses a specific method to choose a representation for each symbol; the resulting code is called a Huffman code.
It assigns fewer bits to symbols that occur more often and more bits to symbols that occur less often in the data.
It follows the algorithm described below:
1) Make a base node for each code symbol.
2) Count their occurrences.
For example, take the following phrase:
“the essential feature”
We list each symbol and count its occurrences.
Counting gives 12 different symbols (including the space), as follows:
Symbol  A  E  F  H  I  L  N  R  S  T  U  (space)
Count   2  5  1  1  1  1  1  1  2  3  1  2
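To make the counting step concrete, here is a hedged sketch that counts the symbols in the example phrase and then builds a Huffman code by repeatedly merging the two least frequent nodes. The merging procedure is the standard textbook construction, not necessarily the exact wording of the original slides, and the function name `huffman_codes` is an assumption for this example.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return (symbol counts, symbol -> bit string) for the given text."""
    counts = Counter(text)
    # Heap entries are (weight, tie_breaker, tree). A tree is either a single
    # symbol or a (left, right) pair; the unique tie_breaker keeps Python from
    # ever comparing two trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # least frequent node
        w2, _, t2 = heapq.heappop(heap)   # second least frequent node
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # single-symbol input still gets one bit
    walk(heap[0][2], "")
    return counts, codes

counts, codes = huffman_codes("the essential feature")
for symbol in sorted(codes, key=lambda s: (-counts[s], s)):
    print(repr(symbol), counts[symbol], codes[symbol])
```

The exact bit patterns depend on how ties between equal counts are broken, so different implementations can print different codes while still needing the same total number of bits.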
LZ77
LZ77 compression works by finding sequences of data that are repeated.
It introduces the term “sliding window”, which means that at any point in time there is a record of the characters that went before.
For example, a 32K sliding window means the compressor (and decompressor) have a record of what the last 32768 (32 * 1024) characters were.
When the next sequence of characters to be compressed is identical to one that can be found within the sliding window, the sequence of characters is replaced by two numbers:
1) Distance: how far back into the window the sequence starts
2) Length: the number of characters for which the sequence is identical
For example, take the text
Blah blah blah blah blah!
Here the data starts repeating after the second “b”, so the first compressed output replaces the repeated run with a distance and a length.
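The distance and length idea can be tried out with a small greedy matcher. This is a simplified sketch under stated assumptions: the function names, the brute-force search and the minimum match length of 3 are choices made for this illustration, and real compressors use faster search structures and a binary output format.

```python
def lz77_compress(data, window=32768, min_len=3):
    """Greedy LZ77 sketch: emit literal characters or (distance, length) pairs."""
    out = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Try every start position inside the sliding window.
        for j in range(max(0, i - window), i):
            k = 0
            # A match may run past position i, i.e. overlap the text being copied.
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_len:
            out.append((best_dist, best_len))
            i += best_len
        else:
            out.append(data[i])
            i += 1
    return out

def lz77_decompress(tokens):
    """Rebuild the text by copying `length` characters from `distance` back."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            for _ in range(length):
                out.append(out[-distance])
        else:
            out.append(token)
    return "".join(out)

tokens = lz77_compress("Blah blah blah blah blah!")
print(tokens)
print(lz77_decompress(tokens))
```

With this sketch the first six characters, "Blah b", come out as literals, the next 18 characters become a single pair with distance 5 and length 18, and the final "!" is a literal again.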
Deflate
Deflate relies entirely on the two techniques above, LZ77 and Huffman coding.
It offers three different modes for compressing data:
1) Not compressed at all
2) Compression, first with LZ77 and then with Huffman coding (the trees used to compress in this mode are defined by the Deflate specification itself, so no extra space is needed to store them)
3) Compression, first with LZ77 and then with Huffman coding, with trees that the compressor creates and stores along with the data
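Deflate is available in Python through the standard `zlib` module, so the difference between stored and compressed output can be observed directly. This is only a sketch: the sample data is made up for this example, and which Huffman mode is used for each block is decided inside the library, not by this code.

```python
import zlib

# Hypothetical, highly repetitive sample data.
data = b"Blah blah blah blah blah! " * 40

stored = zlib.compress(data, 0)   # level 0: the data is stored without compression
packed = zlib.compress(data, 9)   # level 9: LZ77 matching followed by Huffman coding

print(len(data), len(stored), len(packed))
print(zlib.decompress(packed) == data)
```

The stored output is slightly larger than the input because of header and checksum bytes, while the compressed output is much smaller for data this repetitive.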
Lossy Compression Technique
Unlike lossless compression, this method reduces data by eliminating specific information.
It can achieve very high compression ratios through data removal.
When the data is decompressed, only part of the original information may still be there.
This method is generally used for video and sound, where a certain amount of information loss occurs without users noticing it. JPEG is one example.
Different Lossy Techniques
Compared with lossless methods, these methods take less time, cost less, and save more space.
Methods based on lossy compression:
JPEG: used for pictures and graphics
MPEG: used for video compression
Pictures & graphics
A picture is divided into 8*8 pixel square blocks.
● Picture preprocessing
This step generates an appropriate digital representation of the information in the medium being processed.
● Picture transformation
This step mainly involves applying the compression algorithm.
● Quantization
This step takes place after the picture transformation. The values determined in the second step are quantized according to specific properties such as resolution.
● Entropy Encoding
In this step, the data is processed as a sequential stream of bits and bytes.
Quantization
Here we look at the values in the transformed image.
Elements near zero are converted to zero.
All quantized values are rounded to integers.
This is what makes it a lossy compression.
-2 -19 -15 -6 -4 -1 -1 0
14 4 -2 -13 0 0 -1 -2
-2 -2 -2 7 -1 -1 0 1
2 -3 -2 2 0 0 1 0
1 0 1 -1 -1 0 0 0
-3 2 1 -1 0 0 0 0
0 0 0 -1 1 0 0 0
1 0 -1 0 0 0 0 0
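The effect of rounding can be illustrated with a small sketch. It assumes a single uniform step size, which is a simplification (JPEG uses a table of step sizes), and the coefficient values and function names below are made up for this example.

```python
def quantize(block, step):
    """Divide every value by the step size and round to the nearest integer."""
    return [[round(value / step) for value in row] for row in block]

def dequantize(block, step):
    """Scale the integers back up; the rounding error cannot be recovered."""
    return [[q * step for q in row] for row in block]

# Hypothetical transformed values: two large ones and two near zero.
coeffs = [[-31.2, 3.9],
          [0.4, -142.7]]

quantized = quantize(coeffs, 10)
restored = dequantize(quantized, 10)
print(quantized)
print(restored)
```

The two small values become exactly zero and the large ones come back only approximately, which is the information loss the slide refers to.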
Finally, we encode the quantized image according to the regular JPEG standard.
In this example the image dimensions are 160 * 240, which means 160 * 240 * 8 = 307,200 bits are needed uncompressed.
In JPEG format we need only 85,143 bits.
This saves about 72% of the bits.
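The size comparison can be checked with a few lines of arithmetic. The figures are the ones given in the example (160 * 240 pixels, 8 bits per pixel, 85143 bits after JPEG); the variable names are assumptions made for this sketch.

```python
width, height, bits_per_pixel = 160, 240, 8   # figures from the example
raw_bits = width * height * bits_per_pixel    # uncompressed size in bits
jpeg_bits = 85143                             # compressed size from the example

saving = 1 - jpeg_bits / raw_bits             # fraction of bits saved
ratio = raw_bits / jpeg_bits                  # compression ratio

print(raw_bits)
print(f"saved: {saving:.1%}, ratio: {ratio:.2f}:1")
```

Under these figures the JPEG version needs a little over a quarter of the original bits.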
JPEG Compressed Image
MPEG
MPEG is ultimately based on JPEG: each frame is spatially compressed by JPEG.
To understand this compression, three frame types are necessary:
1) I-frame (intra coded)
2) P-frame (forward predicted)
3) B-frame (bidirectionally predicted)
Audio Compression
As the name suggests, it is used for speech or music compression.
It has many applications and methods, for example:
MP3, PCM, ADPCM
Dolby TrueHD, Direct Stream Transfer, Apple Lossless
Famous applications are:
MPEG and so on.