UNIT 4
INFORMATION THEORY AND CODING
By,
Dr. M. Rajkumar, ASP/CSE
R.M.D Engineering College
Remya Rose S, AP/CSE
R.M.D Engineering College
 Measure of information – Entropy – Source coding
theorem – Shannon-Fano coding, Huffman coding,
LZ coding – Channel capacity – Shannon-Hartley law
– Shannon's limit – Error control codes – Cyclic
codes, Syndrome calculation – Convolutional coding,
Sequential and Viterbi decoding
INFORMATION THEORY
 Information is what a communication system, whether
analog or digital, exists to convey. Information theory is a
mathematical approach to the study of the coding of
information, along with its quantification, storage, and
communication.
SOURCE CODING
[Block diagram: SOURCE → SOURCE CODER → CHANNEL ENCODER.
The source coder removes redundancy from the source symbols;
the channel encoder adds controlled redundancy for error protection.]
What is source code?
 In computing, source code is any collection of code, possibly
with comments, written in a human-readable programming
language, usually as plain text. Source code is often
transformed by an assembler or compiler into binary machine
code that can be executed by the computer.
What is the source coding theorem?
 Source coding is a mapping from (a sequence of) symbols
from an information source to a sequence of alphabet
symbols (usually bits) such that the source symbols can be
exactly recovered from the bits (lossless source coding) or
recovered within some distortion (lossy source coding).
 The source coding theorem states that a discrete memoryless
source of entropy H(S) can be losslessly encoded with an
average code word length no smaller than H(S) bits per
symbol, and that this bound can be approached arbitrarily closely.
TWO REQUIREMENTS
1. The code words produced by the encoder are in binary form.
2. The source code is uniquely decodable, so that the original
source sequence can be reconstructed perfectly from the
encoded binary sequence (a small check of this property
appears below).
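A sufficient condition for requirement 2 is that the code be prefix-free: no code word is the prefix of another. A minimal Python sketch of that check:

```python
# Prefix-free check: no code word may be a prefix of another.
# Prefix-freeness is a sufficient condition for unique decodability.
def is_prefix_free(codewords):
    codewords = sorted(codewords)  # a prefix sorts immediately before its extensions
    return all(not b.startswith(a) for a, b in zip(codewords, codewords[1:]))

print(is_prefix_free(["00", "01", "10", "110", "111"]))  # True (uniquely decodable)
print(is_prefix_free(["0", "01", "11"]))                 # False ("0" is a prefix of "01")
```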
ENTROPY
 Entropy is the average information content per source symbol.
For a source S emitting symbols with probabilities pk,
H(S) = Σk pk log2(1/pk) bits/symbol
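A one-line Python sketch of this definition, applied to the source used in the Shannon-Fano example later in this unit:

```python
import math

# H(S) = sum over k of p_k * log2(1/p_k), in bits per symbol.
def entropy(probabilities):
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

print(f"{entropy([0.3, 0.2, 0.2, 0.2, 0.1]):.3f} bits/symbol")  # 2.246
```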
Types of Coding Techniques
[Classification diagram: Shannon-Fano coding and Huffman coding]
Shannon-Fano Coding
 An entropy encoding technique for lossless data compression
 It assigns a code to each symbol based on its probability of
occurrence
 It is a variable-length encoding scheme
Procedure
1. Sort the list of symbols in decreasing order of probability.
2. Split the list into two parts, with the total probability of
the two parts as close to each other as possible.
3. Assign the value 0 to the left (upper) part and 1 to the
right (lower) part.
4. Repeat steps 2 and 3 for each part, until every symbol has
been split off into its own subgroup.
A Python sketch of this procedure follows.
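A minimal sketch of the procedure (an illustration only; the tie-breaking used in step 2 is one reasonable choice, not the only one). Applied to the example that follows, it reproduces the code words derived there:

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability) pairs. Returns {name: codeword}."""
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:          # a single symbol: its code word is complete
            return
        total = sum(p for _, p in group)
        # Step 2: find the split that makes the two parts' totals closest.
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(total - 2 * running)
            if diff < best_diff:
                best_i, best_diff = i, diff
        left, right = group[:best_i], group[best_i:]
        for name, _ in left:
            codes[name] += "0"       # step 3: 0 to the upper part
        for name, _ in right:
            codes[name] += "1"       # step 3: 1 to the lower part
        split(left)                  # step 4: repeat on each part
        split(right)

    # Step 1: sort by decreasing probability.
    split(sorted(symbols, key=lambda s: s[1], reverse=True))
    return codes

print(shannon_fano([("S1", 0.3), ("S2", 0.2), ("S3", 0.2),
                    ("S4", 0.2), ("S5", 0.1)]))
# -> {'S1': '00', 'S2': '01', 'S3': '10', 'S4': '110', 'S5': '111'}
```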
Shannon-Fano Coding: Example 1
The task is to construct Shannon-Fano codes for the given set of
symbols using the Shannon-Fano lossless compression technique.
Given symbols S = {S1, S2, S3, S4, S5} with probabilities
P = {0.3, 0.2, 0.2, 0.2, 0.1}
Step 1: the symbols are already sorted in decreasing order of probability.
Step 2: the first split is {S1, S2} (total 0.5) versus {S3, S4, S5} (total 0.5);
the first group is assigned 0, the second 1.
Repeating on each part: {S1 | S2} gives S1 the bit 0 and S2 the bit 1;
{S3 | S4, S5} gives S3 the bit 0 and {S4, S5} the bit 1;
finally {S4 | S5} gives S4 the bit 0 and S5 the bit 1.

Message | Probability | Bit 1 | Bit 2 | Bit 3 | Codeword
S1      | 0.3         | 0     | 0     | –     | 00
S2      | 0.2         | 0     | 1     | –     | 01
S3      | 0.2         | 1     | 0     | –     | 10
S4      | 0.2         | 1     | 1     | 0     | 110
S5      | 0.1         | 1     | 1     | 1     | 111
Parameters
 Entropy: H(S) = Σk pk log2(1/pk) bits/symbol
 Average code word length: L = Σk pk lk bits/symbol,
where lk is the length of the code word for symbol k
Variance of the Code
 A measure of the variability in the code word lengths of a source code
 pk = probability of the k-th symbol, lk = length of its code word,
L = average code word length
 Variance: σ² = Σk pk (lk − L)²
Parameters
 H(S) = 0.3 log2(1/0.3) + 3 × 0.2 log2(1/0.2) + 0.1 log2(1/0.1)
= 0.521 + 1.393 + 0.332 ≈ 2.246 bits/symbol
(using log2 x = log10 x / log10 2)
 L = 0.3(2) + 0.2(2) + 0.2(2) + 0.2(3) + 0.1(3) = 2.3 bits/symbol
 σ² = 0.7 × (2 − 2.3)² + 0.3 × (3 − 2.3)² = 0.063 + 0.147 = 0.21
Parameters
 Efficiency: η = H(S)/L = 2.246/2.3 ≈ 97.7%
 Redundancy: 1 − η ≈ 0.023 = 2.3%
A sketch that reproduces these numbers follows.
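A small Python sketch that recomputes the parameters above from the symbol probabilities and code word lengths of Example 1 (the function name code_metrics is our own choice, reused later in this unit):

```python
import math

# Recompute entropy, average length, efficiency, and variance
# from symbol probabilities and code word lengths.
def code_metrics(probs, lengths):
    H = sum(p * math.log2(1 / p) for p in probs.values())       # entropy
    L = sum(probs[s] * lengths[s] for s in probs)               # average length
    var = sum(probs[s] * (lengths[s] - L) ** 2 for s in probs)  # variance
    return H, L, H / L, var

H, L, eta, var = code_metrics(
    {"S1": 0.3, "S2": 0.2, "S3": 0.2, "S4": 0.2, "S5": 0.1},
    {"S1": 2, "S2": 2, "S3": 2, "S4": 3, "S5": 3})
print(f"H = {H:.3f}, L = {L:.1f}, efficiency = {eta:.2%}, variance = {var:.2f}")
# -> H = 2.246, L = 2.3, efficiency = 97.67%, variance = 0.21
```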
Shannon-Fano Coding: Example 2
Consider symbols S= { S1 S2 S3 S4 S5} with
probabilities P= {0.4 0.2 0.2 0.1 0.1}
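This exercise can be worked with the shannon_fano() sketch above. Note that the first split is a tie ({S1} = 0.4 versus {S1, S2} = 0.6 on the upper side), so different tie-breaking choices give different but equally efficient codes; both choices yield an average length of 2.2 bits/symbol.

```python
# Applying the shannon_fano() sketch to Example 2.
codes = shannon_fano([("S1", 0.4), ("S2", 0.2), ("S3", 0.2),
                      ("S4", 0.1), ("S5", 0.1)])
print(codes)  # one valid answer; average code word length = 2.2 bits/symbol
```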
Huffman Encoding
 Lossless compression of data
 It uses variable-length encoding
 The symbol that occurs most frequently gets the shortest
code word
 The symbol that occurs least frequently gets the longest
code word
Procedure
1. The source symbols are listed in order of decreasing
probability. The two source symbols of lowest probability
are assigned a 0 and a 1.
2. These two source symbols are then combined into a new
source symbol with probability equal to the sum of the two
original probabilities, and the list is re-sorted.
3. The procedure is repeated until we are left with a final
list of only two source symbols, which are assigned a 0 and a 1.
4. The code word for each original source symbol is found by
working backward, tracing the sequence of 0s and 1s assigned
to that symbol and to its successors.
A Python sketch of this procedure follows.
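A minimal sketch of the procedure, using a priority queue in place of explicit re-sorting at each reduction. The exact code words depend on tie-breaking, but the code word lengths match Example 1 below:

```python
import heapq

def huffman(probabilities):
    """probabilities: dict {symbol: probability}. Returns {symbol: codeword}."""
    # Heap entries: (probability, tie_breaker, {symbol: partial code word}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Steps 1-2: take the two lowest-probability entries, prepend a 0 to
        # one side and a 1 to the other, and merge into a combined symbol.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))  # step 3: repeat
        count += 1
    return heap[0][2]

codes = huffman({"x1": 0.3, "x3": 0.25, "x5": 0.25, "x2": 0.15, "x4": 0.05})
print({s: len(c) for s, c in codes.items()})
# -> code word lengths {'x1': 2, 'x3': 2, 'x5': 2, 'x2': 3, 'x4': 3}
```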
HUFFMAN CODING – Example 1
Find the entropy and efficiency of the following symbols using
the Huffman coding method:
{x1, x2, x3, x4, x5} = {0.3, 0.15, 0.25, 0.05, 0.25}
Message | Probability | 1st Reduction | 2nd Reduction | 3rd Reduction | Codeword
x1      | 0.3         | 0.3           | 0.45          | 0.55          | 00
x3      | 0.25        | 0.25          | 0.3           | 0.45          | 01
x5      | 0.25        | 0.25          | 0.25          |               | 10
x2      | 0.15        | 0.2           |               |               | 110
x4      | 0.05        |               |               |               | 111

1st reduction: combine x4 (0.05) and x2 (0.15) into 0.2.
2nd reduction: combine 0.2 and x5 (0.25) into 0.45.
3rd reduction: combine x3 (0.25) and x1 (0.3) into 0.55.
Assigning a 0 and a 1 at each combination and tracing backward gives:
Code word x1 = 00, x3 = 01, x5 = 10, x2 = 110, x4 = 111
Parameters
 H(S) = 0.3 log2(1/0.3) + 2 × 0.25 log2(1/0.25) + 0.15 log2(1/0.15)
+ 0.05 log2(1/0.05) = 0.521 + 1.0 + 0.411 + 0.216 ≈ 2.148 bits/symbol
 L = 0.3(2) + 0.25(2) + 0.25(2) + 0.15(3) + 0.05(3) = 2.2 bits/symbol
Variance of the Code
 σ² = Σk pk (lk − L)² = 0.8 × (2 − 2.2)² + 0.2 × (3 − 2.2)²
= 0.032 + 0.128 = 0.16
Parameters
 Efficiency: η = H(S)/L = 2.148/2.2 ≈ 97.6%
 Redundancy: 1 − η ≈ 2.4%
These values can be checked with the code_metrics() sketch above.
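Checking the Huffman Example 1 parameters with the code_metrics() sketch defined earlier:

```python
# Recompute the parameters above from Example 1's probabilities and lengths.
H, L, eta, var = code_metrics(
    {"x1": 0.3, "x3": 0.25, "x5": 0.25, "x2": 0.15, "x4": 0.05},
    {"x1": 2, "x3": 2, "x5": 2, "x2": 3, "x4": 3})
print(f"H = {H:.3f}, L = {L:.1f}, efficiency = {eta:.2%}, variance = {var:.2f}")
# -> H = 2.148, L = 2.2, efficiency = 97.62%, variance = 0.16
```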
Efficiency – 100% Possible
 Huffman coding achieves 100% efficiency when every symbol
probability is a negative integer power of 2 (a dyadic
distribution): each code word length then equals log2(1/pk)
exactly, so L = H(S).
HUFFMAN CODING – Example 2
Symbols {S1, S2, S3, S4, S5, S6} with probabilities
{1/2, 1/4, 1/8, 1/16, 1/32, 1/32}
Expressing all probabilities over a common denominator:
{16/32, 8/32, 4/32, 2/32, 1/32, 1/32}
Message | Probability | 1st Reduction | 2nd Reduction | 3rd Reduction | 4th Reduction
S1      | 16/32       | 16/32         | 16/32         | 16/32         | 16/32
S2      | 8/32        | 8/32          | 8/32          | 8/32          | 16/32
S3      | 4/32        | 4/32          | 4/32          | 8/32          |
S4      | 2/32        | 2/32          | 4/32          |               |
S5      | 1/32        | 2/32          |               |               |
S6      | 1/32        |               |               |               |

 The resulting code word lengths are 1, 2, 3, 4, 5, and 5 bits, so
L = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + (1/32)(5) + (1/32)(5)
= 1.9375 bits/symbol = H(S), giving η = 100%.
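A quick check with the huffman() and code_metrics() sketches from earlier in the unit:

```python
# For a dyadic distribution the Huffman lengths equal log2(1/p) exactly,
# so the average length equals the entropy and the efficiency is 100%.
probs = {"S1": 1/2, "S2": 1/4, "S3": 1/8, "S4": 1/16, "S5": 1/32, "S6": 1/32}
lengths = {s: len(c) for s, c in huffman(probs).items()}
H, L, eta, _ = code_metrics(probs, lengths)
print(sorted(lengths.values()))                # [1, 2, 3, 4, 5, 5]
print(f"H = L = {L}, efficiency = {eta:.0%}")  # H = L = 1.9375, efficiency = 100%
```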