2. CONTENTS
1. What is Source Coding general concept?
2. What are Commonly Used Data compression algorithms?
3. Problems of Huffman and the need of Arithmetic coding , general comparison?
4. Arithmetic Encoding.
5. Arithmetic Decoding.
6/18/2017
2ECE /MIT/2009
3. 1.GENERAL CONCEPT OF
SOURCE CODING
Source Coding : encoding information using fewer bits than the original
representation.
Importance of Data Compression is :
A technique to reduce the quantity of data .
preserve quality of themultimedia data.
Allows more bytes to be packed than uncompressed.
Saving storage ,saving Bandwidth .
Reduce file transfer time . quick encode & send .
i.e,mp4->mp3.
6/18/2017 3ECE /MIT/2009
6. CONTND…..
Figure 1.3 Quality comparison for lossy system
o What is Lossy Compression?
after compression ,file cannot recover.
reduce image size .
redundant information lost.
i.e -mp3,JPEG,wav
6/18/2017 6ECE /MIT/2009
8. 2.COMMONLY USED COMPRESSING
ALGORITHMS
• i.Huffman Coding
• assign short codewords to those input blocks with high probabilities and long
codewords to those with low probabilities.
6/18/2017 8ECE /MIT/2009
9. CONTND…..
ii. Run Length Encoding
Algorithm:
Run length encoding is simple form of data compression.
consecutive runs of data are stored as single data value .
Each count(indicating how many times that data is repeating) .
It is useful for compressing simple graphic images .
EXAMPLE:
Input: ZZZZZZZZZZZZCZZZZZZZZZZZZCCCZZZZZZZZZZZZZZZZZZZZZZZZC
Output:
12ZC12Z3C24ZC
6/18/2017 9ECE /MIT/2009
10. III.ARITHMETIC CODING
Mathematical compression.
Based on the coding of a input sequence using a rational number in ranges [0,1).
Doesn´t use a discrete number of bits for each.
The main idea behind Arithmetic coding is to assign each symbol an interval.
6/18/2017 10ECE /MIT/2009
11. 3.PROBLEMS OF HUFFMAN AND THE NEED OF ARITHMETIC
CODING , GENERAL COMPARISON?
ARITHMETIC CODING
• Huffman coding
Less efficiency
less edible.
……… .....in contrast
Faster
Enough storage
• Arithmetic Coding
redundancy much reduced.
Can be used with any model
conjunction , adaptiveness ,sharp in any
model.
…………………………Dis advantage AC
Too slow because mathematical
operations .
Significant amount of memory.
6/18/2017 11ECE /MIT/2009
13. 4.ARITHMETIC ENCODING STEPS
• To code symbol s ,where symbols are numbered from 1 to n and symbol I has the
probability pr[i];
• low bound = 𝑖=0𝑝𝑟
𝑠−1
[𝑖]
• High bound = 𝑖=0𝑝𝑟
𝑠−1
[𝑖]
• Range=high-low
• Low=low + range *(low bound)
• High=low + range *(high bound)
6/18/2017 13ECE /MIT/2009
14. CONTND----
• Consider encoding the name MIT CAMPAS Again, we need the frequency of all the
characters in the text
• char freq.
• Space 0.1
• A 0.2
• C 0.1
• I 0.1
• M 0.2
• P 0.1
• S 0.1
• T 0.1
6/18/2017 14ECE /MIT/2009
15. CONTND----
• . character probability range
• space 0.1 [0.00, 0.10)
• A 0.2 [0.10, 0.30)
• C 0.1 [0.30, 0.40)
• I 0.1 [0.40, 0.50)
• M 0.2 [0.50, 0.70)
• P 0.1 [0.70, 0.80)
• S 0.1 [0.80, 0.90)
• T 0.1 [0.90, 1.00)
•
6/18/2017 15ECE /MIT/2009
16. CONTND----
• . ENCODING THEWORD MIT CAMPAS
• chr low high
• 0.0 1.0
• M 0.5 0.7
• I 0.54 0.55
• T 0.549 0.550
• Space 0.5490 0.5491
• C 0.54903 0.54941
• A 0.549301 0.549033
• M 0.5493015 0.5493017
• P 0.54930164 0.54930166
A 0.549301643 0.549316466
S 0.5493016438 0.5493016439
6/18/2017 16ECE /MIT/2009
17. CONTND
• .The final low value, 0.5493016438 will uniquely encode the name MIT CAMPAS.
• which in binary is approximately [0.11010 00000, 0.11010 01100).We can uniquely
identify this interval by outputting 1101000.
• Another example Encoding
suppose the alphabet is (a, e, i, O, u, !I, and a fixed model is used with
probabilities shown in Table I. Imagine trans mitting the message eaii! .
Initially, both encoder and decoder know that the range is [0, 1).
After seeing the first symbol, e, the encoder narrows it to [0.2, 04, the range the
model allocates to this symbol.
The second symbol, a, will narrow this new range to the first one-fifth of it,
6/18/2017 17ECE /MIT/2009
20. HOW THE ARITHMETIC DECODER WORKS?
Decoder detects the last suffix o.23355.
Relative to the fixed model of Table I,
.The entropy of the five-symbol message eaii! Is taking logarithm of each term would
be 4.22 .
6/18/2017 20ECE /MIT/2009
21. GENERALLY……..
Arithmetic coding typically has a better compression ratio than Huffman coding, as it
produces a single symbol rather than several separate code words.
Arithmetic coding is a lossless coding technique.
Few disadvantages of arithmetic coding.
I . whole code word must be received to start decoding the symbols.
If corrupt bit in the code word, the entire message could become corrupt.
II .There is a limit to the precision of the number which can be encoded, thus limiting
the number of symbols to encode within a code word.
There also exists many patents upon arithmetic coding, so the use of some of the
algorithms also call upon royalty fees.
6/18/2017 21ECE /MIT/2009