The development of data compression algorithms
for a variety of data can be divided into two phases:
modeling and coding.
In the modeling phase we try to extract information
about any redundancy that exists in the data and
describe the redundancy in the form of a model.
In the coding phase, a description of the model and a
“description” of how the data differ from the model
are encoded, generally using a binary alphabet.
The difference between the data and the model is
often referred to as the residual.
In the following three examples we will look at three
different ways that data can be modeled. We will
then use the model to obtain compression.
If binary representations of these numbers are to be
transmitted or stored, we would need to use 5 bits per
sample. By exploiting the structure in the data, we can
represent the sequence using fewer bits.
If we plot this data, as shown in the following figure,
we see that the data seem to fall on a straight line.
A model for the data could therefore be a straight line,
with model values Ān.
To examine the difference between the data and the
model, the difference (or residual) is computed:
En = An − Ān = 0, 1, 0, −1, 1, −1, 0, 1, −1, −1, 1, 1
The residual sequence consists of only three values: −1, 0, and 1.
If we assign a code of 00 to −1, a code of 01 to 0, and a code of
10 to 1, we need only 2 bits to represent each element of the
residual sequence.
Therefore, we can obtain compression by transmitting or
storing the parameters of the model and the residual
sequence.
The encoding can be exact if the required compression is to be
lossless, or approximate if the compression can be lossy.
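The straight-line example above can be sketched in a few lines of Python. The original data sequence is not shown in the text, so the sequence and the model equation below are hypothetical, chosen so that the residual matches the one given above:

```python
# Hypothetical data sequence (the text does not list the original values)
data = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]

# Assumed straight-line model: A_bar(n) = n + 8 for n = 1, 2, ..., 12
model = [n + 8 for n in range(1, len(data) + 1)]

# Residual E_n = A_n - A_bar_n; it takes only the values -1, 0, 1
residual = [a - m for a, m in zip(data, model)]
print(residual)

# Code the residual with 2 bits per element, as in the text
codebook = {-1: "00", 0: "01", 1: "10"}
encoded = "".join(codebook[e] for e in residual)
print(len(encoded), "bits, versus", 5 * len(data), "bits at 5 bits/sample")
```

Transmitting the two model parameters plus 2 bits per residual element is far cheaper than 5 bits per original sample.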
The number of distinct values has been reduced.
Fewer bits are required to represent each number and
compression is achieved.
The decoder adds each received value to the previous decoded
value to obtain the reconstruction corresponding to the received
value.
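The difference-coding scheme just described can be sketched as follows. The sample values are hypothetical; the point is the encoder/decoder pair, where the decoder adds each received difference to the previously decoded value:

```python
def encode_differences(samples):
    """Transmit the difference between each sample and the previous one."""
    prev, out = 0, []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def decode_differences(diffs):
    """Add each received value to the previous decoded value."""
    prev, out = 0, []
    for d in diffs:
        prev += d
        out.append(prev)
    return out

samples = [9, 11, 11, 11, 14, 13]          # hypothetical data
diffs = encode_differences(samples)        # small values, fewer distinct symbols
assert decode_differences(diffs) == samples
```

Because the differences cluster near zero, they have fewer distinct values than the raw samples, so fewer bits per value are needed.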
The sequence is made up of eight different symbols, so with a
fixed-length code we need to use 3 bits per symbol.
Say we have assigned a codeword with only a single
bit to the symbol that occurs most often, and
correspondingly longer codewords to symbols that
occur less often.
If we substitute the codes for each symbol,
we will use 106 bits to encode the entire
sequence.
As there are 41 symbols in the sequence, this
works out to approximately 2.58 bits per
symbol.
This means we have obtained a compression
ratio of 1.16:1.
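The variable-length coding idea can be illustrated numerically. The symbol counts and codewords below are made up (the text does not list them), but they are chosen so the totals match the figures quoted above (8 symbols, 41 occurrences, 106 bits):

```python
# Hypothetical symbol counts (8 symbols, 41 occurrences in total)
counts = {"a": 16, "r": 7, " ": 7, "f": 4, "n": 2, "y": 2, "w": 2, "b": 1}

# Hypothetical prefix-free code: the most frequent symbol gets 1 bit,
# less frequent symbols get correspondingly longer codewords
codes = {"a": "1", "r": "001", " ": "010", "f": "011",
         "n": "0000", "y": "00010", "w": "000110", "b": "000111"}

total_symbols = sum(counts.values())                       # 41
total_bits = sum(counts[s] * len(codes[s]) for s in counts)
fixed_bits = 3 * total_symbols                             # 3-bit fixed-length baseline

print(total_bits, "bits,", total_bits / total_symbols, "bits/symbol")
print(f"compression ratio {fixed_bits / total_bits:.2f}:1")
```

Dividing the 123 bits of the fixed-length code by the 106 bits of the variable-length code gives the 1.16:1 ratio stated above.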
Amount of Information
Condition for maximum entropy: all symbols equally probable
Compression is achieved by removing data redundancy
while preserving information content.
Entropy is the measure of the information content of a
group of bytes (a message).
Messages with higher entropy carry more information
than messages with lower entropy.
Data with low entropy permit a larger compression ratio
than data with high entropy.
How to determine the entropy
Find the probability p(x) of symbol x in the message
The entropy H(x) of symbol x is:
H(x) = −p(x) · log2 p(x)
The entropy of the entire message is the sum of the
entropies of all n distinct symbols in the message:
H = −Σ p(xi) · log2 p(xi)
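The entropy calculation above can be sketched directly: estimate p(x) from symbol frequencies in the message, then sum −p(x)·log2 p(x) over the distinct symbols.

```python
import math
from collections import Counter

def entropy(message):
    """Per-symbol entropy of a message, with p(x) estimated from frequencies."""
    counts = Counter(message)
    n = len(message)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(entropy("aaaa"))      # single symbol: zero information, maximally compressible
print(entropy("abab"))      # two equiprobable symbols: 1 bit/symbol
print(entropy("abcdabcd"))  # four equiprobable symbols: 2 bits/symbol (the maximum)
```

The last case illustrates the maximum-entropy condition: with all symbols equally probable, no fixed-length code can be beaten, while low-entropy messages like the first admit a much larger compression ratio.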