The presentation describes measures of information, entropy, source coding, the source coding theorem, Huffman coding, Shannon-Fano coding, the channel capacity theorem, the capacity of discrete and continuous memoryless channels, and error-free communication over a noisy channel.
Information Theory
1. Information Theory & Coding
Unit V:
Measure of Information, Source Encoding, Error-Free Communication over a Noisy Channel, capacity of
a discrete and continuous memoryless channel. Error-correcting codes: Hamming sphere, Hamming
distance and Hamming bound, relation between minimum distance and error-detecting and error-correcting
capability; linear block codes, encoding & syndrome decoding; cyclic codes, encoders and decoders for
systematic cyclic codes; convolutional codes, code tree & trellis diagram, Viterbi and sequential decoding,
burst-error correction, turbo codes.
2. Measure of Information
Let a source with a finite alphabet of K symbols emit one symbol per signalling interval:
𝒮 = {s0, s1, ..., s(K-1)}
with probabilities P(S = si) = pi, i = 0, 1, ..., K-1, such that

$$\sum_{i=0}^{K-1} p_i = 1 \quad \text{and} \quad 0 \le p_i \le 1$$
If pi = 1, the event is certain and there is no surprise → no information
• Thus uncertainty, surprise, and information are all related
• more surprise → more information
• Thus information is inversely related to the probability of occurrence
3. Measure of Information
The information gained when the source emits a symbol si is defined as

$$I(s_i) = \log_2 \frac{1}{p_i} = -\log_2 p_i \ \text{bits}$$

For base 2, the unit is the "bit"
For base 3, the unit is the "trit"
For base 10, the unit is the "digit" or "dit"
For base e, the unit is the "nat" (also "nit" or "nepit")
One bit is the amount of information that we gain when one of two equiprobable events
occurs.
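A minimal Python sketch of this definition (not part of the original slides; the function name self_information is illustrative):

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """I(s_i) = -log_base(p_i): information gained when a symbol of probability p occurs."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log(p, base)

print(self_information(0.5))               # 1.0 bit: one of two equiprobable events
print(self_information(1.0))               # 0.0: a certain event carries no information
print(self_information(0.5, base=math.e))  # ~0.693 nats
```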
4. Entropy
The entropy of a discrete random variable, representing the output of a source of
information, is a measure of the average information content per source symbol.
$$H(\mathcal{S}) = \sum_{i=0}^{K-1} p_i I_i = -\sum_{i=0}^{K-1} p_i \log_2 p_i$$

Entropy is non-negative and bounded by

$$0 \le H(\mathcal{S}) \le \log_2 K, \quad \text{where } K \text{ is the number of symbols in the source alphabet}$$

As 0 ≤ pi ≤ 1, Ii ≥ 0, hence H(𝒮) ≥ 0.
Entropy of an extended source: if a discrete memoryless source 𝒮 with K symbols is
extended by taking blocks of n symbols, it can be viewed as a source 𝒮ⁿ with Kⁿ
symbols and H(𝒮ⁿ) = nH(𝒮)
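As a quick numerical check, a minimal Python sketch of the entropy formula (not from the slides; the first distribution reuses the five-symbol source of the later Huffman example):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """H = -sum(p_i * log_base(p_i)); terms with p_i = 0 contribute 0 (0*log 0 := 0)."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.4, 0.2, 0.2, 0.1, 0.1]))  # ~2.122 bits/symbol
print(entropy([0.25] * 4))                 # 2.0 = log2(4): equiprobable case attains log2(K)
```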
5. Entropy
Let pi and qi be two distributions, with qi = 1/K the equiprobable distribution. The
relative entropy (Kullback-Leibler divergence) is

$$D(p\|q) = \sum_{i=0}^{K-1} p_i \log_2 \frac{p_i}{q_i} = \sum_{i=0}^{K-1} p_i \log_2 (K p_i) = \log_2 K - H(\mathcal{S})$$

$$-D(p\|q) = \sum_{i=0}^{K-1} p_i \log_2 \frac{q_i}{p_i} \le \frac{1}{\ln 2} \sum_{i=0}^{K-1} p_i \left(\frac{q_i}{p_i} - 1\right) = \frac{1}{\ln 2} \sum_{i=0}^{K-1} (q_i - p_i) = 0 \quad (\text{as } \ln x \le x - 1,\ x > 0)$$

Thus D(p‖q) ≥ 0, i.e. relative entropy is non-negative, and therefore H(𝒮) ≤ log₂ K.
[Figure: plot of ln x and x − 1 versus x, showing ln x ≤ x − 1 with equality only at x = 1]
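A small Python sketch of the identity D(p‖q) = log₂K − H(𝒮) for the equiprobable reference distribution (not from the slides; the function name kl_divergence is illustrative):

```python
import math

def kl_divergence(p, q, base: float = 2.0) -> float:
    """D(p||q) = sum(p_i * log_base(p_i / q_i)); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

p = [0.4, 0.2, 0.2, 0.1, 0.1]
q = [1 / len(p)] * len(p)       # equiprobable reference distribution, q_i = 1/K
print(kl_divergence(p, q))      # ~0.200 = log2(5) - H(p), and always >= 0
```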
6. Entropy: Bernoulli Random Variable
Let a random variable have two outcomes '0' and '1' with probabilities p0 and 1 − p0
respectively. The entropy function is

$$H(p_0) = -p_0 \log_2 p_0 - (1 - p_0) \log_2 (1 - p_0)$$

[Figure: H(p0) versus p0; the curve rises from 0 at p0 = 0 to a maximum of 1 at p0 = 0.5 and falls back to 0 at p0 = 1]

Setting the derivative to zero locates the maximum:

$$\frac{d}{dp_0} H(p_0) = -\log_2 p_0 + \log_2 (1 - p_0) = 0 \implies p_0 = \frac{1}{2}$$

$$\frac{d^2}{dp_0^2} H(p_0) = -\frac{1}{\ln 2} \left( \frac{1}{p_0} + \frac{1}{1 - p_0} \right) < 0$$

so p0 = 1/2 is indeed a maximum, with

$$H\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \log_2 2 + \tfrac{1}{2} \log_2 2 = 1 \ \text{bit}$$
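A minimal Python sketch of the binary entropy function (not from the slides):

```python
import math

def binary_entropy(p0: float) -> float:
    """H(p0) = -p0*log2(p0) - (1 - p0)*log2(1 - p0), with H(0) = H(1) = 0."""
    if p0 in (0.0, 1.0):
        return 0.0
    return -p0 * math.log2(p0) - (1 - p0) * math.log2(1 - p0)

print(binary_entropy(0.5))  # 1.0 bit: the maximum, at equiprobable outcomes
print(binary_entropy(0.1))  # ~0.469 bits: a biased source is more predictable
```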
7. Source Encoding
Given a discrete memoryless source of entropy H(𝒮), the average codeword length Ꝉ
is bounded as
Ꝉ ≥ H(𝒮)
Efficiency of a coding scheme: η = H(𝒮) / Ꝉ
A code has to be uniquely decodable,
i.e. every sequence of codewords corresponds to a unique sequence of source symbols. (A small worked example of Ꝉ and η follows.)
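A minimal Python sketch computing Ꝉ and η (not from the slides; the lengths (2, 2, 2, 3, 3) assume the Huffman code derived later in the deck):

```python
def average_length(probs, lengths):
    """Average codeword length: sum of p_i * L_i, in code bits per source symbol."""
    return sum(p * L for p, L in zip(probs, lengths))

p = [0.4, 0.2, 0.2, 0.1, 0.1]       # five-symbol source of the later Huffman example
Lbar = average_length(p, [2, 2, 2, 3, 3])
print(Lbar)                          # 2.2 >= H(S) = 2.122, as the bound requires
print(2.122 / Lbar)                  # efficiency ~0.965
```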
8. Prefix Coding
Prefix Coding
A code in which no codeword is a prefix of any other codeword.

Symbol   Prefix Code
S0       0
S1       10
S2       110
S3       111

[Figure: decision tree of the prefix code, branching '0'/'1' from the initial state to S0, S1, S2, S3]
Decision tree of the prefix code:
• the initial state splits into two branches
• branch '0' ends at the first symbol
• branch '1' splits further in the same way
• at the last split, branch '1' ends at the last symbol
• symbol codes are read off starting from the initial state
A prefix code is always uniquely decodable.
A prefix code is also referred to as an instantaneous code; a decoding sketch follows.
For a prefix code, H(𝒮) ≤ Ꝉ < H(𝒮) + 1
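A minimal decoding sketch for the prefix code in the table above (Python, not from the slides), illustrating why prefix codes are instantaneous: each codeword is recognized the moment its last bit arrives.

```python
# Prefix code from the table above.
CODE = {'0': 'S0', '10': 'S1', '110': 'S2', '111': 'S3'}

def decode(bits: str):
    """Greedy left-to-right decoding; works because no codeword prefixes another."""
    symbols, buf = [], ''
    for b in bits:
        buf += b
        if buf in CODE:               # codeword recognized as soon as it ends
            symbols.append(CODE[buf])
            buf = ''
    assert buf == '', "bit string ended in the middle of a codeword"
    return symbols

print(decode('0101100'))  # ['S0', 'S1', 'S2', 'S0']
```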
9. Huffman Coding
The Huffman code is optimum in the sense that its average codeword length approaches
the fundamental limit H(𝒮).
• List the symbols in decreasing order of probability
• Assign '0' and '1' to the two symbols with the lowest probabilities
• Combine those two symbols (adding their probabilities) and make a new list in
decreasing order of probability
• Repeat the process until only two symbols remain, marked '0' and '1'
• Read the code for each symbol backward, from the last list to the first; a Python sketch follows this list
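A minimal Python sketch of this procedure using a priority queue (not from the slides). The resulting codewords can differ from the slide's, since ties may be broken either way, but the codeword lengths, and hence Ꝉ, match.

```python
import heapq
from itertools import count

def huffman(probs):
    """Binary Huffman code; returns {symbol: codeword}.
    Repeatedly merges the two least probable entries, as described above."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {s: ''}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # lowest probability  -> prepend bit '0'
        p1, _, c1 = heapq.heappop(heap)   # next lowest         -> prepend bit '1'
        merged = {s: '0' + w for s, w in c0.items()}
        merged.update({s: '1' + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

code = huffman({'S0': 0.4, 'S1': 0.2, 'S2': 0.2, 'S3': 0.1, 'S4': 0.1})
print(code)  # codeword lengths (2, 2, 2, 3, 3) -> average length 2.2, as in the table
```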
10. Huffman Coding
Example: Huffman coding of a five-symbol source with probabilities 0.4, 0.2, 0.2, 0.1, 0.1.
[Figure: Huffman merge stages for symbols S0 to S4; probability lists 0.4, 0.2, 0.2, 0.1, 0.1 → 0.4, 0.2, 0.2, 0.2 → 0.4, 0.4, 0.2 → 0.6, 0.4, with '0'/'1' assigned to the two lowest-probability entries at each stage]
si   pi    Ii      pi·Ii    Code   Li   pi·Li
S0   0.4   1.322   0.5288   00     2    0.8
S1   0.2   2.322   0.4644   10     2    0.4
S2   0.2   2.322   0.4644   11     2    0.4
S3   0.1   3.322   0.3322   010    3    0.3
S4   0.1   3.322   0.3322   011    3    0.3
H(𝒮) = Σ pi·Ii = 2.122               Ꝉ = Σ pi·Li = 2.2
11. Shannon-Fano Coding
Similar to the Huffman code:
• List the symbols in decreasing order of probability, divide the list into two
nearly equiprobable parts, and assign '0' to the symbols in one part and '1'
to the symbols in the other
• Continue dividing each part and assigning 0/1 until only one symbol
remains in each part
• Read each code from left to right; a Python sketch follows the figure below
[Figure: two Shannon-Fano partition trees for symbols S0 to S4 with probabilities 0.4, 0.2, 0.2, 0.1, 0.1, corresponding to Code 1 and Code 2 of the next slide]
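A minimal recursive Python sketch of the partitioning procedure (not from the slides). With these probabilities and this split rule it reproduces Code 2 of the next slide; a different but equally balanced split choice yields Code 1.

```python
def shannon_fano(items):
    """items: list of (symbol, probability) sorted by decreasing probability.
    Returns {symbol: codeword}; splits into two nearly equiprobable halves."""
    if len(items) == 1:
        return {items[0][0]: ''}
    total, run, split, best = sum(p for _, p in items), 0.0, 1, float('inf')
    for i in range(1, len(items)):
        run += items[i - 1][1]
        if abs(total - 2 * run) < best:          # most balanced split point so far
            best, split = abs(total - 2 * run), i
    code = {s: '0' + w for s, w in shannon_fano(items[:split]).items()}
    code.update({s: '1' + w for s, w in shannon_fano(items[split:]).items()})
    return code

print(shannon_fano([('S0', 0.4), ('S1', 0.2), ('S2', 0.2), ('S3', 0.1), ('S4', 0.1)]))
# {'S0': '0', 'S1': '10', 'S2': '110', 'S3': '1110', 'S4': '1111'}  (Code 2)
```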
12. Shannon-Fano Coding
                             Code 1               Code 2
si   pi    Ii      pi·Ii    Code  Li  pi·Li      Code   Li  pi·Li
S0   0.4   1.322   0.5288   00    2   0.8        0      1   0.4
S1   0.2   2.322   0.4644   01    2   0.4        10     2   0.4
S2   0.2   2.322   0.4644   10    2   0.4        110    3   0.6
S3   0.1   3.322   0.3322   110   3   0.3        1110   4   0.4
S4   0.1   3.322   0.3322   111   3   0.3        1111   4   0.4
H(𝒮) = Σ pi·Ii = 2.122      Ꝉ1 = Σ pi·Li = 2.2   Ꝉ2 = Σ pi·Li = 2.2
• The average codeword length is the same in both cases
• The variance of Code 2 is larger than that of Code 1
• The code with the lower variance is preferred

$$\text{Variance } \sigma^2 = \sum_i p_i \, (L_i - Ꝉ)^2$$
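A quick numerical check of the variance formula for the two codes (a Python sketch, not from the slides):

```python
def length_variance(probs, lengths):
    """sigma^2 = sum(p_i * (L_i - Lbar)^2) over the codeword lengths."""
    lbar = sum(p * L for p, L in zip(probs, lengths))
    return sum(p * (L - lbar) ** 2 for p, L in zip(probs, lengths))

p = [0.4, 0.2, 0.2, 0.1, 0.1]
print(length_variance(p, [2, 2, 2, 3, 3]))   # Code 1: 0.16
print(length_variance(p, [1, 2, 3, 4, 4]))   # Code 2: 1.36 -> prefer Code 1
```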
13. Discrete Memoryless Channel
A discrete memoryless channel is described by its set of transition probabilities
P(yk | xj) = P(Y = yk, given X = xj) for all j and k
[Figure: channel transition diagram, inputs x0, x1, ..., x(J-1) (X) connected to outputs y0, y1, ..., y(K-1) (Y) with transition probabilities P(yk | xj)]

Transition matrix:

$$P(Y|X) = \begin{bmatrix} p(y_0|x_0) & p(y_1|x_0) & \cdots & p(y_{K-1}|x_0) \\ p(y_0|x_1) & p(y_1|x_1) & \cdots & p(y_{K-1}|x_1) \\ \vdots & \vdots & & \vdots \\ p(y_0|x_{J-1}) & p(y_1|x_{J-1}) & \cdots & p(y_{K-1}|x_{J-1}) \end{bmatrix}$$
$$p(y_k) = P(Y = y_k) = \sum_{j=0}^{J-1} p(y_k|x_j)\, p(x_j) \quad \text{for all } k = 0, 1, \dots, K-1$$

$$\text{Probability of error: } p_e = \sum_{k=0}^{K-1} \sum_{j \ne k} p(y_k|x_j)\, p(x_j)$$
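A minimal numpy sketch of these two formulas (not from the slides), for an illustrative binary symmetric channel; the crossover probability 0.1 is an assumed example value:

```python
import numpy as np

P = np.array([[0.9, 0.1],    # row j holds p(y_k | x_j); here a BSC with crossover 0.1
              [0.1, 0.9]])
px = np.array([0.5, 0.5])    # input distribution p(x_j)

py = px @ P                  # p(y_k) = sum_j p(y_k|x_j) p(x_j)
pe = sum(px[j] * P[j, k] for j in range(2) for k in range(2) if j != k)
print(py)                    # [0.5 0.5]
print(pe)                    # 0.1
```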
14. Mutual Information and Channel Capacity
If Y is the symbol set observed at the output of a discrete memoryless channel in response to the input
symbol set X, the mutual information is given by
I(X;Y) = H(X) − H(X|Y)
• Mutual information is symmetric: I(X;Y) = I(Y;X)
• Mutual information is non-negative: I(X;Y) ≥ 0
• I(X;Y) = H(X) + H(Y) − H(X,Y), where H(X,Y) is the joint entropy
The channel capacity of a discrete memoryless channel is the maximum average mutual
information I(X;Y) in a single use of the channel, where the maximization is over all possible input distributions p(xj):

$$C = \max_{\{p(x_j)\}} I(X;Y) \ \text{bits per channel use}$$
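A minimal sketch computing I(X;Y) for the same illustrative binary symmetric channel (not from the slides), using the equivalent form I(X;Y) = H(Y) − H(Y|X), valid because mutual information is symmetric:

```python
import numpy as np

def mutual_information(px, P):
    """I(X;Y) = H(Y) - H(Y|X) in bits, for input dist px and transitions P[j,k] = p(y_k|x_j)."""
    def H(dist):
        dist = dist[dist > 0]
        return -np.sum(dist * np.log2(dist))
    py = px @ P
    HY_given_X = np.sum(px * np.array([H(row) for row in P]))
    return H(py) - HY_given_X

P = np.array([[0.9, 0.1], [0.1, 0.9]])               # BSC, crossover 0.1
print(mutual_information(np.array([0.5, 0.5]), P))   # ~0.531 = 1 - H(0.1), the BSC capacity
```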
15. Channel Coding Theorem
Let a discrete memoryless source with entropy H(𝒮) produce one symbol every Ts seconds, and
let a discrete memoryless channel of capacity C be used once every Tc seconds. Then if

$$\frac{H(\mathcal{S})}{T_s} \le \frac{C}{T_c}$$

there exists a coding scheme by which the source output can be transmitted over the channel
with an arbitrarily small probability of error. Thus channel capacity is the fundamental limit on the rate at which error-free, reliable
transmission is possible over a discrete memoryless channel. For example, a source with H(𝒮) = 2.122 bits per symbol emitting one symbol per millisecond can be transmitted reliably only if the channel supports C/Tc ≥ 2122 bits/s.
16. Channel Capacity Theorem
The capacity of a channel of bandwidth B Hz, perturbed by additive white
Gaussian noise of power spectral density N0/2 W/Hz and limited to bandwidth B, with signal
power P W, is given as

$$C = B \log_2 \left( 1 + \frac{P}{N_0 B} \right) \ \text{bits/s}$$
Ideal system: bit rate Rb = C, and if the energy per bit is Eb, then P = Eb C, so

$$\frac{C}{B} = \log_2 \left( 1 + \frac{E_b}{N_0} \frac{C}{B} \right) \implies \frac{E_b}{N_0} = \frac{2^{C/B} - 1}{C/B}$$

$$\lim_{B \to \infty} \frac{E_b}{N_0} = \ln 2 = 0.693 = -1.59 \ \text{dB}$$

• (Eb/N0) as B→∞, i.e. −1.59 dB, is called the Shannon limit for error-free transmission
• Eb/N0 can be traded against bandwidth B for a given channel capacity C
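A minimal sketch of the Shannon-Hartley formula and the Shannon limit (not from the slides; the 3 kHz bandwidth and 30 dB SNR are assumed illustration values):

```python
import math

def shannon_capacity(bandwidth_hz: float, signal_power: float, n0: float) -> float:
    """Shannon-Hartley: C = B * log2(1 + P / (N0 * B)) bits/s."""
    return bandwidth_hz * math.log2(1 + signal_power / (n0 * bandwidth_hz))

# B = 3 kHz, N0 = 1, P chosen so P/(N0*B) = 1000, i.e. 30 dB SNR.
print(shannon_capacity(3000, 3e6, 1.0))  # ~29,900 bits/s

# Required Eb/N0 at spectral efficiency C/B; as C/B -> 0 it approaches the Shannon limit.
eta = 1e-6
print((2 ** eta - 1) / eta)              # -> ln 2 ~ 0.693, i.e. -1.59 dB
```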
17. Channel Capacity Theorem or Shannon-Hartley Law
The capacity of a channel of bandwidth B Hz, perturbed by additive white Gaussian noise of
power spectral density N0/2 W/Hz and limited to bandwidth B, with signal power P W, is given as

$$C = B \log_2 \left( 1 + \frac{P}{N_0 B} \right) \ \text{bits/s}$$
18. Error Free Communication over a Noisy Channel
Ideal system: bit rate Rb = C, and if the energy per bit is Eb, then P = Eb C, so

$$\frac{C}{B} = \log_2 \left( 1 + \frac{E_b}{N_0} \frac{C}{B} \right) \implies \frac{E_b}{N_0} = \frac{2^{C/B} - 1}{C/B}, \qquad \lim_{B \to \infty} \frac{E_b}{N_0} = \ln 2 = 0.693 = -1.59 \ \text{dB}$$

• (Eb/N0) as B→∞, i.e. −1.59 dB, is called the Shannon limit for error-free transmission
• Eb/N0 can be traded against bandwidth B for a given channel capacity C
[Figure: bandwidth-efficiency diagram, C/B (bits/s/Hz, from 1/2 to 16) versus Eb/N0 (dB, from −1.59 to 40); the capacity boundary Rb = C separates practical systems (Rb < C) from the error region (Rb > C)]