INFORMATION THEORY
ER. FARUK BIN POYEN, Asst. Professor
DEPT. OF AEIE, UIT, BU, BURDWAN, WB, INDIA
faruk.poyen@gmail.com
Contents..
• Information Theory
• What is Information?
• Axioms of Information
• Information Source
• Information Content of a Discrete Memoryless Source
• Information Content of a Symbol (i.e. Logarithmic Measure of Information)
• Entropy (i.e. Average Information)
• Information Rate
• The Discrete Memoryless Channels (DMC)
• Types of Channels
• Conditional and Joint Entropies
• The Mutual Information
• The Channel Capacity
2
Contents: Contd..
• Capacity of an Additive White Gaussian Noise (AWGN) Channel - Shannon-Hartley Law
• Channel Capacity: Detailed Study - Proof of the Shannon-Hartley Law expression
• The Source Coding
• Few Terms Related to the Source Coding Process
• The Source Coding Theorem
• Classification of Codes
• The Kraft Inequality
• The Entropy Coding - Shannon-Fano Coding; Huffman Coding
• Redundancy
• Entropy Relations for a Continuous Channel
• Analytical Proof of the Shannon-Hartley Law
• Rate Distortion Theory
• Probability Theory
3
Introduction:
• The purpose of a communication system is to carry information-bearing baseband
signals from one place to another over a communication channel.
• Information theory is concerned with the fundamental limits of communication.
• What is the ultimate limit to data compression?
• What is the ultimate limit of reliable communication over a noisy channel, e.g. how
many bits can be sent in one second over a telephone line?
• Information theory is a branch of probability theory which may be applied to the study
of communication systems. It deals with the mathematical modelling and analysis
of a communication system rather than with the physical sources and physical
channels.
• Two important elements presented in this theory are the Binary Source (BS) and the
Binary Symmetric Channel (BSC).
• A binary source is a device that generates one of the two possible symbols '0' and '1'
at a given rate 'r', measured in symbols per second.
4
• These symbols are called bits (binary digits) and are generated randomly.
• The BSC is a medium through which it is possible to transmit one symbol per time
unit. However, this channel is not reliable and is characterized by an error probability 'p'
(0 ≤ p ≤ 1/2) that an output bit can be different from the corresponding input.
• Information theory tries to analyse communication between a transmitter and a receiver
through an unreliable channel. In this approach it performs an analysis of information
sources, especially the amount of information produced by a given source, and states the
conditions for performing reliable transmission through an unreliable channel.
• The source information measure, the channel capacity measure and the coding are all
related by one of the Shannon theorems, the channel coding theorem, which is stated
as: 'If the information rate of a given source does not exceed the capacity of a given
channel, then there exists a coding technique that makes possible transmission through
this unreliable channel with an arbitrarily low error rate.'
5
• There are three main concepts in this theory:
1. The first is the definition of a quantity that can be a valid measurement of information
and that is consistent with a physical understanding of its properties.
2. The second concept deals with the relationship between the information and the
source that generates it. This concept will be referred to as the source information.
Compression and encryption are related to this concept.
3. The third concept deals with the relationship between the information and the
unreliable channel through which it is going to be transmitted. This concept leads to
the definition of a very important parameter called the channel capacity. Error-
correction coding is closely related to this concept.
6
What is Information?
• The information of an event depends only on its probability of occurrence and is not
dependent on its content.
• The randomness in the occurrence of an event, and the unpredictability of it as a piece
of news, is what is measured as information.
• The message associated with the least likely event contains the maximum
information.
7
Axioms of Information:
1. Information is a non-negative quantity: I(p) ≥ 0.
2. If an event has probability 1, we get no information from the
occurrence of the event: I(1) = 0.
3. If two independent events occur (whose joint probability is the product
of their individual probabilities), then the information we get from
observing the events is the sum of the information of each event:
I(p1 * p2) = I(p1) + I(p2).
4. I(p) is monotonic and continuous in p.
8
Information Source:
• An information source may be viewed as an object which produces an event, the
outcome of which is selected at random according to a probability distribution.
• The set of source symbols is called the source alphabet, and the elements of the set are
called symbols or letters.
• Information sources can be classified as having memory or being memoryless.
• A source with memory is one for which a current symbol depends on the previous
symbols.
• A memoryless source is one for which each symbol produced is independent of the
previous symbols.
• A discrete memoryless source (DMS) can be characterized by the list of the symbols, the
probability assignment of these symbols, and the specification of the rate at which the
source generates these symbols.
9
Information Content of a DMS:
• The amount of information contained in an event is closely related to its uncertainty.
• A mathematical measure of information should be a function of the probability of the
outcome and should satisfy the following axioms:
a) Information should be proportional to the uncertainty of an outcome.
b) Information contained in independent outcomes should add up.
10
Information Content of a Symbol
(i.e. Logarithmic Measure of Information):
• Let us consider a DMS denoted by X and having alphabet {x1, x2, ......, xm}.
• The information content of the symbol xi, denoted by I(xi), is defined by
    I(xi) = log_b [1 / P(xi)] = -log_b P(xi)
• where P(xi) is the probability of occurrence of symbol xi.
• For any two independent source messages xi and xj with probabilities Pi and Pj
respectively, and with joint probability P(xi, xj) = Pi Pj, the information of the
messages is the addition of the information in each message: Iij = Ii + Ij.
11
• Note that I(xi) satisfies the following properties:
1. I(xi) = 0 for P(xi) = 1
2. I(xi) ≥ 0
3. I(xi) > I(xj) if P(xi) < P(xj)
4. I(xi, xj) = I(xi) + I(xj) if xi and xj are independent
• Unit of I(xi): The unit of I(xi) is the bit (binary unit) if b = 2, the Hartley or
decit if b = 10, and the nat (natural unit) if b = e. It is standard to use b = 2.
    log2 a = ln a / ln 2 = log a / log 2
12
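As a quick numerical illustration of the logarithmic measure, the short Python sketch below evaluates I(xi) = -log_b P(xi) for a few probabilities (the function name and the example values are ours, not part of the slides):

```python
import math

def information_content(p: float, base: float = 2) -> float:
    """Information content I(x) = -log_b P(x) of a symbol with probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return 0.0 if p == 1 else -math.log(p, base)

print(information_content(1.0))    # 0.0 bits: a certain event carries no information
print(information_content(0.5))    # 1.0 bit
print(information_content(0.125))  # 3.0 bits: rarer symbols carry more information
```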
Entropy (i.e. Average Information):
• Entropy is a measure of the uncertainty in a random variable. The entropy, H, of a
discrete random variable X is a measure of the amount of uncertainty associated with
the value of X.
• For a quantitative representation of average information per symbol we make the
following assumptions:
i) The source is stationary, so that the probabilities remain constant with time.
ii) The successive symbols are statistically independent and come from the source at an
average rate of 'r' symbols per second.
• The quantity H(X) is called the entropy of source X. It is a measure of the average
information content per source symbol.
• The source entropy H(X) can be considered as the average amount of uncertainty
within the source X that is resolved by the use of the alphabet.
• H(X) = E[I(xi)] = Σ P(xi) I(xi) = -Σ P(xi) log2 P(xi) b/symbol.
13
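A minimal sketch of this formula in Python; the probabilities below are those of the six-symbol source used later in the Shannon-Fano and Huffman examples (the function name is ours):

```python
import math

def entropy(probs, base: float = 2) -> float:
    """H(X) = -sum(P(xi) * log_b P(xi)) over symbols with non-zero probability."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(round(entropy([0.30, 0.25, 0.20, 0.12, 0.08, 0.05]), 2))  # 2.36 b/symbol
```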
• Entropy for a Binary Source (both symbols equally likely):
    H(X) = -(1/2) log2 (1/2) - (1/2) log2 (1/2) = 1 b/symbol
• The source entropy H(X) satisfies the relation 0 ≤ H(X) ≤ log2 m,
where m is the size of the alphabet of source X.
• Properties of Entropy:
1. 0 ≤ H(X) ≤ log2 m; m = no. of symbols of the alphabet of
source X.
2. When all the events are equally likely, the average uncertainty
has its largest value, i.e. H(X) = log2 m.
3. H(X) = 0 if all the P(xi) are zero except for one symbol with P = 1.
14
Information Rate:
• If the time rate at which X emits symbols is 'r' (symbols/s), the
information rate R of the source is given by
• R = r H(X) b/s [(symbols/second) * (information bits/symbol)].
• R is the information rate and H(X) is the entropy or average information.
15
The Discrete Memoryless Channels (DMC):
1. Channel Representation: A communication channel may be defined as the path or
medium through which the symbols flow to the receiver end.
A DMC is a statistical model with an input X and output Y. Each possible input-to-output
path is indicated along with a conditional probability P(yj/xi), where P(yj/xi) is the
conditional probability of obtaining output yj given that the input is xi, and is called a
channel transition probability.
2. A channel is completely specified by the complete set of transition probabilities. The
channel is specified by the matrix of transition probabilities [P(Y/X)]. This matrix is
known as the Channel Matrix.
16
             | P(y1/x1)  ...  P(yn/x1) |
    P(Y/X) = |    ...    ...     ...   |
             | P(y1/xm)  ...  P(yn/xm) |
Since each input to the channel results in some output, each row of the channel
matrix must sum to unity. This means that
    Σ_{j=1}^{n} P(yj/xi) = 1   for all i
Now, if the input probabilities P(X) are represented by the row matrix, we have
    P(X) = [P(x1)  P(x2)  ....  P(xm)]
17
• Also, if the output probabilities P(Y) are represented by the row matrix
    P(Y) = [P(y1)  P(y2)  ....  P(yn)]
then
    [P(Y)] = [P(X)][P(Y/X)]
Now if P(X) is represented as a diagonal matrix
              | P(x1)  ...    0   |
    [P(X)]d = |  ...   ...   ...  |
              |   0    ...  P(xm) |
then
    [P(X, Y)] = [P(X)]d [P(Y/X)]
• where the (i, j) element of matrix [P(X, Y)] has the form P(xi, yj).
• The matrix [P(X, Y)] is known as the joint probability matrix, and the element
P(xi, yj) is the joint probability of transmitting xi and receiving yj.
18
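These matrix relations are easy to check numerically. The sketch below uses NumPy with a small made-up 2x2 channel (the transition and input probabilities are illustrative values, not from the slides):

```python
import numpy as np

P_YX = np.array([[0.9, 0.1],      # rows are inputs xi, columns are outputs yj;
                 [0.1, 0.9]])     # each row sums to unity
P_X = np.array([0.6, 0.4])        # input probabilities as a row matrix

P_Y = P_X @ P_YX                  # [P(Y)] = [P(X)][P(Y/X)]
P_XY = np.diag(P_X) @ P_YX        # [P(X, Y)] = [P(X)]d [P(Y/X)]

print(P_Y)    # [0.58 0.42]
print(P_XY)   # [[0.54 0.06]
              #  [0.04 0.36]]
```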
Types of Channels:
• Other than discrete and continuous channels, there are some special types of channels
with their own channel matrices. They are as follows:
• Lossless Channel: A channel described by a channel matrix with only one non-zero
element in each column is called a lossless channel.
             | 3/4  1/4   0   0 |
    P(Y/X) = |  0    0   2/3  0 |
             |  0    0    0   1 |
19
• Deterministic Channel: A channel described by a channel matrix with
only one non-zero element in each row is called a deterministic
channel.
             | 1  0  0 |
    P(Y/X) = | 1  0  0 |
             | 0  1  0 |
             | 0  0  1 |
20
• Noiseless Channel: A channel is called noiseless if it is both lossless
and deterministic. For a noiseless channel, m = n.
             | 1  0  0 |
    P(Y/X) = | 0  1  0 |
             | 0  0  1 |
21
• Binary Symmetric Channel: The BSC has two inputs (x1 = 0 and x2 = 1)
and two outputs (y1 = 0 and y2 = 1).
• This channel is symmetric because the probability of receiving a 1 if a 0
is sent is the same as the probability of receiving a 0 if a 1 is sent.
             | 1 - p    p   |
    P(Y/X) = |   p    1 - p |
22
Conditional and Joint Entropies:
• Using the input probabilities P(xi), output probabilities P(yj), transition
probabilities P(yj/xi) and joint probabilities P(xi, yj), various entropy
functions for a channel with m inputs and n outputs are defined.
    H(X) = -Σ_{i=1}^{m} P(xi) log2 P(xi)
    H(Y) = -Σ_{j=1}^{n} P(yj) log2 P(yj)
    H(X/Y) = -Σ_{j=1}^{n} Σ_{i=1}^{m} P(xi, yj) log2 P(xi/yj)
    H(Y/X) = -Σ_{j=1}^{n} Σ_{i=1}^{m} P(xi, yj) log2 P(yj/xi)
    H(X, Y) = -Σ_{j=1}^{n} Σ_{i=1}^{m} P(xi, yj) log2 P(xi, yj)
23
• H(X) is the average uncertainty of the channel input and H(Y) is the average
uncertainty of the channel output.
• The conditional entropy H(X/Y) is a measure of the average uncertainty remaining
about the channel input after the channel output has been observed. H(X/Y) is also
called the equivocation of X w.r.t. Y.
• The conditional entropy H(Y/X) is the average uncertainty of the channel output given
that X was transmitted.
• The joint entropy H(X, Y) is the average uncertainty of the communication channel as
a whole. A few useful relationships among the above entropies are as under:
• H(X, Y) = H(X/Y) + H(Y)
• H(X, Y) = H(Y/X) + H(X)
• H(X, Y) = H(X) + H(Y), if X and Y are statistically independent
• H(X/Y) = H(X, Y) - H(Y)
24
• The conditional entropy (or conditional uncertainty) of X given the random
variable Y, also called the equivocation of X about Y, is the average of the
conditional entropy over Y.
• The joint entropy of two discrete random variables X and Y is merely
the entropy of their pairing (X, Y). If X and Y are
independent, then their joint entropy is the sum of their individual
entropies.
25
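The sketch below computes these entropies numerically from a joint probability matrix, using the relations listed above to obtain the conditional entropies (P_XY is the illustrative joint matrix from the earlier NumPy example; the helper names are ours):

```python
import numpy as np

def entropies(P_XY):
    """Return H(X), H(Y), H(X/Y), H(Y/X), H(X,Y) for a joint probability matrix."""
    P_X = P_XY.sum(axis=1)                      # marginal over outputs gives P(X)
    P_Y = P_XY.sum(axis=0)                      # marginal over inputs gives P(Y)
    H = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))
    H_X, H_Y, H_XY = H(P_X), H(P_Y), H(P_XY.ravel())
    return H_X, H_Y, H_XY - H_Y, H_XY - H_X, H_XY   # H(X/Y) = H(X,Y) - H(Y), etc.

P_XY = np.array([[0.54, 0.06],
                 [0.04, 0.36]])
print([round(float(h), 3) for h in entropies(P_XY)])
```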
The Mutual Information:
• Mutual information measures the amount of information that can be obtained about
one random variable by observing another.
• It is important in communication, where it can be used to maximize the amount of
information shared between sent and received signals.
• The mutual information, denoted by I(X; Y), of a channel is defined by:
    I(X; Y) = H(X) - H(X/Y)   b/symbol
• Since H(X) represents the uncertainty about the channel input before the channel
output is observed, and H(X/Y) represents the uncertainty about the channel input after
the channel output is observed, the mutual information I(X; Y) represents the
uncertainty about the channel input that is resolved by observing the channel output.
26
• Properties of Mutual Information I(X; Y)
• I(X; Y) = I(Y; X)
• I(X; Y) ≥ 0
• I(X; Y) = H(Y) - H(Y/X)
• I(X; Y) = H(X) + H(Y) - H(X, Y)
• The entropy corresponding to the mutual information I(X; Y)
indicates a measure of the information transmitted through the channel.
Hence, it is called the 'transferred information'.
27
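Continuing the numerical sketch with the same illustrative joint matrix, the mutual information follows directly from the property I(X; Y) = H(X) + H(Y) - H(X, Y):

```python
import numpy as np

P_XY = np.array([[0.54, 0.06],               # illustrative joint probability matrix from above
                 [0.04, 0.36]])
P_X, P_Y = P_XY.sum(axis=1), P_XY.sum(axis=0)
H = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))

I_XY = H(P_X) + H(P_Y) - H(P_XY.ravel())     # I(X;Y) = H(X) + H(Y) - H(X,Y)
print(round(float(I_XY), 3))                 # ~0.512 b/symbol transferred over this channel
```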
The Channel Capacity:
• The channel capacity represents the maximum amount of information that can be
transmitted by a channel per second.
• To achieve this rate of transmission, the information has to be processed properly, i.e.
coded in the most efficient manner.
• Channel Capacity per Symbol Cs: The channel capacity per symbol of a discrete
memoryless channel (DMC) is defined as
    Cs = max over {P(xi)} of I(X; Y)   b/symbol
where the maximization is over all possible input probability distributions {P(xi)} on X.
• Channel Capacity per Second C: If 'r' symbols are being transmitted per second,
then the maximum rate of transmission of information per second is r*Cs. This is the
channel capacity per second and is denoted by C (b/s), i.e.
    C = r Cs   b/s
28
Capacities of Special Channels:
• Lossless Channel: For a lossless channel, H(X/Y) = 0 and I(X; Y) = H(X).
• Thus the mutual information is equal to the input entropy, and no source information is
lost in transmission.
    Cs = max over {P(xi)} of H(X) = log2 m
where m is the number of symbols in X.
• Deterministic Channel: For a deterministic channel, H(Y/X) = 0 for all input
distributions P(xi), and I(X; Y) = H(Y).
• Thus the information transfer is equal to the output entropy. The channel capacity per
symbol will be
    Cs = max over {P(xi)} of H(Y) = log2 n
where n is the number of symbols in Y.
29
• Noiseless Channel: Since a noiseless channel is both lossless and
deterministic, we have I(X; Y) = H(X) = H(Y), and the channel
capacity per symbol is
    Cs = log2 m = log2 n
• Binary Symmetric Channel: For the BSC, the mutual information is
    I(X; Y) = H(Y) + p log2 p + (1 - p) log2 (1 - p)
and the channel capacity per symbol will be
    Cs = 1 + p log2 p + (1 - p) log2 (1 - p)
30
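A small sketch of the BSC capacity formula, evaluated for a few crossover probabilities (function name and sample values are ours):

```python
import math

def bsc_capacity(p: float) -> float:
    """Cs = 1 + p*log2(p) + (1-p)*log2(1-p), the BSC capacity per symbol."""
    if p in (0.0, 1.0):
        return 1.0                      # limit value, since p*log2(p) -> 0
    return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

print(round(bsc_capacity(0.0), 3))   # 1.0   (noiseless binary channel)
print(round(bsc_capacity(0.1), 3))   # 0.531
print(round(bsc_capacity(0.5), 3))   # 0.0   (output tells us nothing about the input)
```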
Capacity of an Additive White Gaussian Noise
(AWGN) Channel - Shannon-Hartley Law
• The Shannon-Hartley law underscores the fundamental role of
bandwidth and signal-to-noise ratio in a communication channel. It
also shows that we can exchange increased bandwidth for decreased
signal power for a system with given capacity C.
• In an additive white Gaussian noise (AWGN) channel, the channel
output Y is given by
    Y = X + n
• where X is the channel input and n is an additive band-limited white
Gaussian noise with zero mean and variance σ².
31
• The capacity Cs of an AWGN channel is given by
    Cs = max over {P(xi)} of I(X; Y) = (1/2) log2 (1 + S/N)   b/sample
• where S/N is the signal-to-noise ratio at the channel output.
• If the channel bandwidth B Hz is fixed, then the output y(t) is also a
band-limited signal completely characterized by its periodic sample
values taken at the Nyquist rate of 2B samples/s.
• Then the capacity C (b/s) of the AWGN channel is limited by
    C = 2B * Cs = B log2 (1 + S/N)   b/s
• This equation is known as the Shannon-Hartley Law.
32
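A quick numerical sketch of the Shannon-Hartley law; the telephone-line figures (3.1 kHz bandwidth, 30 dB SNR) are illustrative assumptions, not values from the slides:

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """C = B * log2(1 + S/N), with the signal-to-noise ratio given in dB."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

print(round(shannon_capacity(3100, 30)))   # ~30898 b/s for an idealized telephone channel
```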
Channel Capacity: Shannon-Hartley Law Proof.
• The bandwidth and the noise power place a restriction upon the rate of information that
can be transmitted by a channel. The channel capacity C for an AWGN channel is
expressed as
    C = B log2 (1 + S/N)
• where B = channel bandwidth in Hz; S = signal power; N = noise power.
• Proof: For a signal mixed with noise, the signal amplitude can be recognized only
to within the root mean square noise voltage.
• Assuming the average signal power and noise power to be S watts and N watts
respectively, the RMS value of the received signal is √(S + N) and that of the noise is √N.
• Therefore the number of distinct levels that can be distinguished without error is
expressed as
    M = √(S + N) / √N = √(1 + S/N)
33
• The maximum amount of information carried by each pulse having √(1 + S/N) distinct
levels is given by
    I = log2 √(1 + S/N) = (1/2) log2 (1 + S/N)   bits
• The channel capacity is the maximum amount of information that can be transmitted
per second by a channel. If a channel can transmit a maximum of K pulses per second,
then the channel capacity C is given by
    C = (K/2) log2 (1 + S/N)   bits/second
• A system of bandwidth nfm Hz can transmit 2nfm independent pulses per second. It follows
that a system with bandwidth B Hz can transmit a maximum of 2B pulses
per second. Replacing K with 2B, we eventually get
    C = B log2 (1 + S/N)   bits/second
• The bandwidth and the signal power can be exchanged for one another.
34
The Source Coding:
• Definition: A conversion of the output of a discrete memoryless source
(DMS) into a sequence of binary symbols, i.e. a binary code word, is
called Source Coding.
• The device that performs this conversion is called the Source Encoder.
• Objective of Source Coding: An objective of source coding is to
minimize the average bit rate required for representation of the source
by reducing the redundancy of the information source.
35
Few Terms Related to the Source Coding Process:
I. Code word Length:
• Let X be a DMS with finite entropy H(X) and an alphabet {x1 ....... xm} with
corresponding probabilities of occurrence P(xi) (i = 1, ..., m). Let the binary code
word assigned to symbol xi by the encoder have length ni, measured in bits. The length
of the code word is the number of binary digits in the code word.
II. Average Code word Length:
• The average code word length L, per source symbol, is given by
    L = Σ_{i=1}^{m} P(xi) ni
• The parameter L represents the average number of bits per source symbol used in the
source coding process.
36
III. Code Efficiency:
• The code efficiency η is defined as
    η = Lmin / L
• where Lmin is the minimum value of L. As η approaches unity, the code is said to be
efficient.
IV. Code Redundancy:
• The code redundancy γ is defined as
    γ = 1 - η
37
The Source Coding Theorem:
• The source coding theorem states that for a DMS X with entropy H(X), the average
code word length L per symbol is bounded as L ≥ H(X).
• Further, L can be made as close to H(X) as desired for some suitably chosen code.
• Thus, with
    Lmin = H(X)
• the code efficiency can be rewritten as
    η = H(X) / L
38
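A minimal sketch tying these definitions together; the code word lengths used here are the ones that come out of the Shannon-Fano and Huffman examples later in the deck:

```python
def average_length(probs, lengths):
    """L = sum of P(xi) * ni, in bits per source symbol."""
    return sum(p * n for p, n in zip(probs, lengths))

probs   = [0.30, 0.25, 0.20, 0.12, 0.08, 0.05]
lengths = [2, 2, 2, 3, 4, 4]              # lengths of the code words 00, 01, 10, 110, 1110, 1111
L = average_length(probs, lengths)
eta = 2.36 / L                            # efficiency = H(X) / L, with H(X) = 2.36 b/symbol
print(round(L, 2), round(eta, 2))         # 2.38 0.99
```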
Classification of Codes:
• Fixed-Length Codes
• Variable-Length Codes
• Distinct Codes
• Prefix-Free Codes
• Uniquely Decodable Codes
• Instantaneous Codes
• Optimal Codes
39
xi   Code 1   Code 2   Code 3   Code 4   Code 5   Code 6
x1     00       00       0        0        0        1
x2     01       01       1        10       01       01
x3     00       10       00       110      011      001
x4     11       11       11       111      0111     0001
• Fixed-Length Codes:
A fixed-length code is one whose code word length is fixed. Code 1 and Code 2
of the above table are fixed-length codes with length 2.
• Variable-Length Codes:
A variable-length code is one whose code word length is not fixed. All codes of
the above table except Code 1 and Code 2 are variable-length codes.
• Distinct Codes:
A code is distinct if each code word is distinguishable from every other. All codes
of the above table except Code 1 are distinct codes.
• Prefix-Free Codes:
A code in which no code word can be formed by adding code symbols to another
code word is called a prefix-free code. In a prefix-free code, no code word is a
prefix of another. Codes 2, 4 and 6 of the above table are prefix-free codes.
40
• Uniquely Decodable Codes:
A distinct code is uniquely decodable if the original source sequence can be reconstructed
perfectly from the encoded binary sequence. A sufficient condition to ensure that a code is
uniquely decodable is that no code word is a prefix of another. Thus the prefix-free
codes 2, 4 and 6 are uniquely decodable codes. The prefix-free condition is not a necessary
condition for unique decodability. Code 5 does not satisfy the prefix-free
condition, and yet it is a uniquely decodable code, since the bit 0 indicates the beginning of
each code word of the code.
• Instantaneous Codes:
A uniquely decodable code is called an instantaneous code if the end of any code word is
recognizable without examining subsequent code symbols. Instantaneous codes have
the property, previously mentioned, that no code word is a prefix of another code word.
Prefix-free codes are sometimes known as instantaneous codes.
• Optimal Codes:
A code is said to be optimal if it is instantaneous and has the minimum average length L for a
given source with a given probability assignment for the source symbols.
41
Kraft Inequality:
• Let X be a DMS with alphabet xi (i = 1, 2, ..., m). Assume that the length of the
assigned binary code word corresponding to xi is ni.
• A necessary and sufficient condition for the existence of an instantaneous binary code
is
    K = Σ_{i=1}^{m} 2^(-ni) ≤ 1
• This is known as the Kraft Inequality.
• It may be noted that the Kraft inequality assures us of the existence of an instantaneously
decodable code with code word lengths that satisfy the inequality.
• But it does not show us how to obtain those code words, nor does it say that any code
satisfying the inequality is automatically uniquely decodable.
42
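Checking the inequality is a one-liner; the length sets below are illustrative:

```python
def kraft_sum(lengths):
    """K = sum of 2^(-ni); an instantaneous binary code with these lengths exists iff K <= 1."""
    return sum(2 ** -n for n in lengths)

print(kraft_sum([2, 2, 2, 3, 4, 4]))   # 1.0  -> satisfied (with equality), such a code exists
print(kraft_sum([1, 1, 2]))            # 1.25 -> violated, no instantaneous code is possible
```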
Entropy Coding:
• The design of a variable-length code such that its average code word
length approaches the entropy of the DMS is often referred to as Entropy
Coding.
• There are basically two types of entropy coding, viz.
1. Shannon-Fano Coding
2. Huffman Coding
43
Shannon-Fano Coding:
• An efficient code can be obtained by the following simple procedure, known as the
Shannon-Fano algorithm:
1. List the source symbols in order of decreasing probability.
2. Partition the set into two sets that are as close to equiprobable as possible, and assign
0 to the upper set and 1 to the lower set.
3. Continue this process, each time partitioning the sets with as nearly equal
probabilities as possible, until further partitioning is not possible.
4. Assign code words by appending the 0s and 1s from left to right.
44
Shannon-Fano Coding - Example
• Let there be six (6) source symbols having probabilities x1 = 0.30, x2 = 0.25, x3 =
0.20, x4 = 0.12, x5 = 0.08, x6 = 0.05. Obtain the Shannon-Fano coding for the given
source symbols.
• Shannon-Fano code words (see the table on the next slide):
• H(X) = 2.36 b/symbol
• L = 2.38 b/symbol
• η = H(X)/L = 0.99
45
xi   P(xi)   Step 1   Step 2   Step 3   Step 4   Code
x1   0.30      0        0                          00
x2   0.25      0        1                          01
x3   0.20      1        0                          10
x4   0.12      1        1        0                 110
x5   0.08      1        1        1        0        1110
x6   0.05      1        1        1        1        1111
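The partitioning procedure can be sketched in a few lines of Python; this is our own illustrative implementation of the algorithm described above, not code from the slides. Applied to the six-symbol source, it reproduces the code words in the table:

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, probability), already sorted by decreasing probability."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # Choose the split that makes the two groups as close to equiprobable as possible.
    best_split, best_diff = 1, float("inf")
    for i in range(1, len(symbols)):
        upper_sum = sum(p for _, p in symbols[:i])
        diff = abs(total - 2 * upper_sum)
        if diff < best_diff:
            best_split, best_diff = i, diff
    upper = shannon_fano(symbols[:best_split])   # this group gets prefix 0
    lower = shannon_fano(symbols[best_split:])   # this group gets prefix 1
    code = {s: "0" + c for s, c in upper.items()}
    code.update({s: "1" + c for s, c in lower.items()})
    return code

source = [("x1", 0.30), ("x2", 0.25), ("x3", 0.20), ("x4", 0.12), ("x5", 0.08), ("x6", 0.05)]
print(shannon_fano(source))
# {'x1': '00', 'x2': '01', 'x3': '10', 'x4': '110', 'x5': '1110', 'x6': '1111'}
```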
Huffman Coding:
• Huffman coding results in an optimal code. It is the code that has the highest
efficiency.
• The Huffman coding procedure is as follows:
1. List the source symbols in order of decreasing probability.
2. Combine the probabilities of the two symbols having the lowest probabilities and reorder the
resultant probabilities; this step is called reduction 1. The same procedure is repeated until there
are two ordered probabilities remaining.
3. Start encoding with the last reduction, which consists of exactly two ordered probabilities. Assign
0 as the first digit in the code words for all the source symbols associated with the first probability;
assign 1 to the second probability.
4. Now go back and assign 0 and 1 to the second digit for the two probabilities that were combined
in the previous reduction step, retaining all the source symbols associated with the first
probability and assigning 1 to the second probability.
5. Keep working back this way until the first column is reached.
6. The code word is obtained by tracing back from right to left.
46
Huffman Encoding - Example
Source Sample xi   P(xi)   Codeword
x1                 0.30    00
x2                 0.25    01
x3                 0.20    11
x4                 0.12    101
x5                 0.08    1000
x6                 0.05    1001
H(X) = 2.36 b/symbol
L = 2.38 b/symbol
η = H(X)/L = 0.99
47
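A compact Huffman sketch using Python's heapq; this is our own illustrative implementation, not code from the slides. Because of arbitrary 0/1 assignments and ties, the exact code words can differ from the table above, but the code word lengths (2, 2, 2, 3, 4, 4) and the average length L = 2.38 b/symbol come out the same:

```python
import heapq
from itertools import count

def huffman(prob_map):
    """Return {symbol: codeword} for a dict of symbol probabilities."""
    tie = count()                                   # tie-breaker so the heap never compares dicts
    heap = [(p, next(tie), {s: ""}) for s, p in prob_map.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)             # the two lowest probabilities ...
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))   # ... are combined and re-inserted
    return heap[0][2]

probs = {"x1": 0.30, "x2": 0.25, "x3": 0.20, "x4": 0.12, "x5": 0.08, "x6": 0.05}
code = huffman(probs)
L = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(round(L, 2))   # 2.38
```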
Redundancy:
• Redundancy in information theory refers to the reduction in information content of a
message from its maximum value.
• For example, consider English with its 26 letters. Assuming all letters are equally
likely to occur, P(xi) = 1/26, and the information contained per letter is
therefore
    log2 26 = 4.7 bits/letter
• The assumption that each letter occurs with equal probability is not correct; since some
letters are more likely to occur than others, the actual information content of English is
reduced from its maximum value of 4.7 bits/letter.
• We define the relative entropy as the ratio of H(Y/X) to H(X), which gives the maximum
compression value, and the redundancy is then expressed as
    Redundancy = 1 - H(Y/X) / H(X)
48
Entropy Relations for a Continuous Channel:
• In a continuous channel, an information source produces a continuous signal x(t).
• The signal offered to such a channel is an ensemble of waveforms generated by some
random ergodic process, and the channel is subject to AWGN.
• If the continuous signal x(t) has a finite bandwidth, it can equally well be described by a
continuous random variable X having a PDF (probability density function) fX(x).
• The entropy of a continuous source X(t) is given by
    H(X) = -∫_{-∞}^{∞} fX(x) log2 fX(x) dx   bits/symbol
• This relation is also known as the Differential Entropy of X.
49
• Similarly, the average mutual information for a continuous channel is given by
    I(X; Y) = H(X) - H(X/Y) = H(Y) - H(Y/X)
• where
    H(Y) = -∫_{-∞}^{∞} fY(y) log2 fY(y) dy
    H(X/Y) = -∫_{-∞}^{∞} ∫_{-∞}^{∞} fXY(x, y) log2 fX(x/y) dx dy
    H(Y/X) = -∫_{-∞}^{∞} ∫_{-∞}^{∞} fXY(x, y) log2 fY(y/x) dx dy
50
Analytical Proof of the Shannon-Hartley Law:
• Let x, n and y represent the samples of the signal x(t), the noise n(t) and the output y(t).
• Assume that the channel is band-limited to B Hz. x(t) and n(t) are also limited to B
Hz, and hence y(t) is also band-limited to B Hz.
• Hence all signals can be specified by taking samples at the Nyquist rate of 2B samples/sec.
• We know
    I(X; Y) = H(Y) - H(Y/X)
    H(Y/X) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} P(x, y) log [1 / P(y/x)] dx dy
    H(Y/X) = ∫_{-∞}^{∞} P(x) dx ∫_{-∞}^{∞} P(y/x) log [1 / P(y/x)] dy
• We know that y = x + n.
• For a given x, y = n + a, where a is a constant (the given value of x).
• Hence, the distribution of y for a given value of x is identical to that of n,
except for a translation by x.
51
• If Pn(n) represents the probability density function of n, then
    P(y/x) = Pn(y - x)
• Therefore
    ∫_{-∞}^{∞} P(y/x) log [1 / P(y/x)] dy = ∫_{-∞}^{∞} Pn(y - x) log [1 / Pn(y - x)] dy
• Let y - x = z; then dy = dz.
• Hence, the above equation becomes
    ∫_{-∞}^{∞} Pn(z) log [1 / Pn(z)] dz = H(n)
• H(n) is the entropy of the noise sample.
• If the mean square value of the signal x(t) is S and that of the noise n(t) is N, then the
mean square value of the output is given by
    y² = S + N
52
• H(n) is the entropy of the noise sample.
• If the mean square value of the signal x(t) is S and that of the noise n(t) is N, then the mean square
value of the output is y² = S + N.
• Channel capacity C = max [I(X; Y)] = max [H(Y) - H(Y/X)] = max [H(Y) - H(n)]
• H(Y) is maximum when y is Gaussian, and the maximum value of H(Y) is B log(2πeσ²),
where B is the band-limiting frequency and σ² is the mean square value, here σ² = S + N.
• Therefore,
    H(Y)max = B log [2πe(S + N)]
    max H(n) = B log (2πeN)
• Therefore
    C = B log [2πe(S + N)] - B log (2πeN)
or
    C = B log [2πe(S + N) / (2πeN)]
• Hence
    C = B log (1 + S/N)
53
Rate Distortion Theory:
• By the source coding theorem for a discrete memoryless source, the average
code word length must be at least as large as the
source entropy for perfect coding (i.e. perfect representation of the
source).
• There are, however, constraints that force the coding to be imperfect, thereby
resulting in unavoidable distortion.
• For example, the source may have a continuous amplitude, as in the case of
speech, and the requirement is to quantize the amplitude of each sample
generated by the source to permit its representation by a code word of
finite length, as in PCM.
• In such a case, the problem is referred to as source coding with a
fidelity criterion, and the branch of information theory that deals with it
is called Rate Distortion Theory.
54
• RDT finds applications in two types of situations:
1. Source coding where the permitted coding alphabet cannot exactly
represent the information source, in which case we are forced to do
'lossy data compression'.
2. Information transmission at a rate greater than the channel capacity.
55
Rate Distortion Function:
• Consider a DMS defined by an M-ary alphabet X: {xi | i = 1, 2, ..., M}, which consists
of a set of statistically independent symbols together with the associated symbol
probabilities {Pi | i = 1, 2, ..., M}. Let R be the average code rate in bits per code word.
• The representation code words are taken from another alphabet Y: {yj | j = 1, 2, ..., N}.
• The source coding theorem states that this second alphabet provides a perfect
representation of the source provided that R > H, where H is the source entropy.
• But if we are forced to have R < H, then there is unavoidable distortion and
therefore loss of information.
• Let P(xi, yj) denote the joint probability of occurrence of source symbol xi and
representation symbol yj.
56
• From probability theory, we have
• P(xi, yj) = P(yj/xi) P(xi), where P(yj/xi) is a transition probability.
• Let d(xi, yj) denote a measure of the cost incurred in representing the source symbol xi
by the symbol yj;
• the quantity d(xi, yj) is referred to as a single-letter distortion measure.
• The statistical average of d(xi, yj) over all possible source symbols and representation
symbols is given by
    d̄ = Σ_{i=1}^{M} Σ_{j=1}^{N} P(xi) P(yj/xi) d(xi, yj)
The average distortion d̄ is a non-negative continuous function of the transition probabilities
P(yj/xi), which are determined by the source encoder-decoder pair.
57
• A conditional probability assignment P(yj/xi) is said to be D-admissible if and only if
the average distortion d̄ is less than or equal to some acceptable value D. The set of all
D-admissible conditional probability assignments is denoted by
    PD = { P(yj/xi) : d̄ ≤ D }
• For each set of transition probabilities, we have a mutual information
    I(X; Y) = Σ_{i=1}^{M} Σ_{j=1}^{N} P(xi) P(yj/xi) log [ P(yj/xi) / P(yj) ]
58
• "A Rate Distortion Function R(D) is defined as the smallest coding rate
possible for which the average distortion is guaranteed not to exceed D".
• Let PD denote the set to which the conditional probability P(yj/xi) belongs for a
prescribed D. Then, for a fixed D, we write
    R(D) = min over P(yj/xi) ∈ PD of I(X; Y)
• subject to the constraint
    Σ_{j=1}^{N} P(yj/xi) = 1   for i = 1, 2, ..., M
• The rate distortion function R(D) is measured in units of bits if the base-2
logarithm is used.
• We expect the distortion D to decrease as R(D) is increased.
• Tolerating a large distortion D permits the use of a smaller rate for coding
and/or transmission of information.
59
Gaussian Source:
• Consider a discrete-time memoryless Gaussian source with zero mean and variance σ².
Let x denote the value of a sample generated by such a source.
• Let y denote a quantized version of x that permits a finite representation of it.
• The squared-error distortion d(x, y) = (x - y)² provides a distortion measure that is
widely used for continuous alphabets.
• The rate distortion function for the Gaussian source with squared-error distortion is
given by
    R(D) = (1/2) log (σ²/D)   for 0 ≤ D ≤ σ²
    R(D) = 0                  for D > σ²
• In this case, R(D) → ∞ as D → 0
• R(D) → 0 for D = σ²
60
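A small sketch of this R(D) curve, using the base-2 logarithm so that R(D) comes out in bits per sample (function name and sample values are ours):

```python
import math

def rate_distortion_gaussian(D: float, sigma2: float) -> float:
    """R(D) = 1/2 * log2(sigma^2 / D) for 0 < D <= sigma^2, and 0 for D > sigma^2."""
    if D >= sigma2:
        return 0.0
    return 0.5 * math.log2(sigma2 / D)

sigma2 = 1.0
for D in (0.01, 0.1, 0.5, 1.0):
    print(D, round(rate_distortion_gaussian(D, sigma2), 3))   # the rate grows as distortion shrinks
```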
Data Compression:
• RDT leads to data compression, which involves a purposeful or
unavoidable reduction in the information content of data from a
continuous or discrete source.
• A data compressor is a device that supplies a code with the least number
of symbols for the representation of the source output, subject to a
permissible or acceptable distortion.
61
Probability Theory:
• The relative frequency of an event A in n trials satisfies
    0 ≤ Nn(A)/n ≤ 1
and the probability of A is defined as
    P(A) = lim_{n→∞} Nn(A)/n
• Axioms of Probability:
1. P(S) = 1
2. 0 ≤ P(A) ≤ 1
3. P(A + B) = P(A) + P(B), if A and B are mutually exclusive
4. Nn(A + B)/n = Nn(A)/n + Nn(B)/n
62
• Property 1:
• P(A') = 1 - P(A)
• S = A + A'
• 1 = P(A) + P(A')
• Property 2:
• If m mutually exclusive events A1, A2, ......, Am have the exhaustive
property
• A1 + A2 + ...... + Am = S
• then
• P(A1) + P(A2) + ..... + P(Am) = 1
63
• Property 3:
• When events A and B are not mutually exclusive, the probability of the union event 'A
or B' equals
• P(A + B) = P(A) + P(B) - P(AB)
• where P(AB) is the probability of the joint event 'A and B'.
• P(AB) is called the 'joint probability' and is given by
    P(AB) = lim_{n→∞} Nn(AB)/n
Property 4:
• Consider a pair of events A and B.
• Let P(B/A) denote the probability of event B given that event A has occurred; P(B/A)
is called the 'conditional probability' and is given by
    P(B/A) = P(AB) / P(A)
• We have Nn(AB)/Nn(A) ≤ 1.
• Therefore,
    P(B/A) = lim_{n→∞} Nn(AB)/Nn(A)
64
More Related Content

What's hot

Sns slide 1 2011
Sns slide 1 2011Sns slide 1 2011
Sns slide 1 2011cheekeong1231
ย 
8085 Architecture & Memory Interfacing1
8085 Architecture & Memory Interfacing18085 Architecture & Memory Interfacing1
8085 Architecture & Memory Interfacing1techbed
ย 
Properties of z transform day 2
Properties of z transform day 2Properties of z transform day 2
Properties of z transform day 2vijayanand Kandaswamy
ย 
Cyclic code non systematic
Cyclic code non systematicCyclic code non systematic
Cyclic code non systematicNihal Gupta
ย 
State space analysis, eign values and eign vectors
State space analysis, eign values and eign vectorsState space analysis, eign values and eign vectors
State space analysis, eign values and eign vectorsShilpa Shukla
ย 
Memory interfacing of microcontroller 8051
Memory interfacing of microcontroller 8051Memory interfacing of microcontroller 8051
Memory interfacing of microcontroller 8051Nilesh Bhaskarrao Bahadure
ย 
linear algebra in control systems
linear algebra in control systemslinear algebra in control systems
linear algebra in control systemsGanesh Bhat
ย 
Digital Communication Pulse code modulation.ppt
Digital Communication Pulse code modulation.pptDigital Communication Pulse code modulation.ppt
Digital Communication Pulse code modulation.pptMATRUSRI ENGINEERING COLLEGE
ย 
Information Theory and coding - Lecture 2
Information Theory and coding - Lecture 2Information Theory and coding - Lecture 2
Information Theory and coding - Lecture 2Aref35
ย 
Classification of signals
Classification of signalsClassification of signals
Classification of signalschitra raju
ย 
T iming diagrams of 8085
T iming diagrams of 8085T iming diagrams of 8085
T iming diagrams of 8085Salim Khan
ย 
Signal classification of signal
Signal classification of signalSignal classification of signal
Signal classification of signal001Abhishek1
ย 
Matlab: Speech Signal Analysis
Matlab: Speech Signal AnalysisMatlab: Speech Signal Analysis
Matlab: Speech Signal AnalysisDataminingTools Inc
ย 
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...Madhumita Tamhane
ย 
Unit II arm 7 Instruction Set
Unit II arm 7 Instruction SetUnit II arm 7 Instruction Set
Unit II arm 7 Instruction SetDr. Pankaj Zope
ย 
register file structure of PIC controller
register file structure of PIC controllerregister file structure of PIC controller
register file structure of PIC controllerNirbhay Singh
ย 
Chapter1 - Signal and System
Chapter1 - Signal and SystemChapter1 - Signal and System
Chapter1 - Signal and SystemAttaporn Ninsuwan
ย 
Pll in lpc2148
Pll in lpc2148Pll in lpc2148
Pll in lpc2148Aarav Soni
ย 

What's hot (20)

Sns slide 1 2011
Sns slide 1 2011Sns slide 1 2011
Sns slide 1 2011
ย 
8085 Architecture & Memory Interfacing1
8085 Architecture & Memory Interfacing18085 Architecture & Memory Interfacing1
8085 Architecture & Memory Interfacing1
ย 
Properties of z transform day 2
Properties of z transform day 2Properties of z transform day 2
Properties of z transform day 2
ย 
Cyclic code non systematic
Cyclic code non systematicCyclic code non systematic
Cyclic code non systematic
ย 
State space analysis, eign values and eign vectors
State space analysis, eign values and eign vectorsState space analysis, eign values and eign vectors
State space analysis, eign values and eign vectors
ย 
Memory interfacing of microcontroller 8051
Memory interfacing of microcontroller 8051Memory interfacing of microcontroller 8051
Memory interfacing of microcontroller 8051
ย 
linear algebra in control systems
linear algebra in control systemslinear algebra in control systems
linear algebra in control systems
ย 
Digital Communication Pulse code modulation.ppt
Digital Communication Pulse code modulation.pptDigital Communication Pulse code modulation.ppt
Digital Communication Pulse code modulation.ppt
ย 
Information Theory and coding - Lecture 2
Information Theory and coding - Lecture 2Information Theory and coding - Lecture 2
Information Theory and coding - Lecture 2
ย 
Classification of signals
Classification of signalsClassification of signals
Classification of signals
ย 
T iming diagrams of 8085
T iming diagrams of 8085T iming diagrams of 8085
T iming diagrams of 8085
ย 
Signal classification of signal
Signal classification of signalSignal classification of signal
Signal classification of signal
ย 
Matlab: Speech Signal Analysis
Matlab: Speech Signal AnalysisMatlab: Speech Signal Analysis
Matlab: Speech Signal Analysis
ย 
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...
Convolution codes - Coding/Decoding Tree codes and Trellis codes for multiple...
ย 
Information theory
Information theoryInformation theory
Information theory
ย 
Unit II arm 7 Instruction Set
Unit II arm 7 Instruction SetUnit II arm 7 Instruction Set
Unit II arm 7 Instruction Set
ย 
8051 Presentation
8051 Presentation8051 Presentation
8051 Presentation
ย 
register file structure of PIC controller
register file structure of PIC controllerregister file structure of PIC controller
register file structure of PIC controller
ย 
Chapter1 - Signal and System
Chapter1 - Signal and SystemChapter1 - Signal and System
Chapter1 - Signal and System
ย 
Pll in lpc2148
Pll in lpc2148Pll in lpc2148
Pll in lpc2148
ย 

Similar to INFORMATION_THEORY.pdf

Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxKIRUTHIKAAR2
ย 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxKIRUTHIKAAR2
ย 
Information theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxInformation theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxprernaguptaec
ย 
Digital Communication: Information Theory
Digital Communication: Information TheoryDigital Communication: Information Theory
Digital Communication: Information TheoryDr. Sanjay M. Gulhane
ย 
Radio receiver and information coding .pptx
Radio receiver and information coding .pptxRadio receiver and information coding .pptx
Radio receiver and information coding .pptxswatihalunde
ย 
Lecture1
Lecture1Lecture1
Lecture1ntpc08
ย 
Information Theory and Coding
Information Theory and CodingInformation Theory and Coding
Information Theory and CodingVIT-AP University
ย 
Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Xuan Phu Nguyen
ย 
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATION
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATIONUnit I.pptx INTRODUCTION TO DIGITAL COMMUNICATION
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATIONrubini Rubini
ย 
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfUnit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfvani374987
ย 
Lecture7 channel capacity
Lecture7   channel capacityLecture7   channel capacity
Lecture7 channel capacityFrank Katta
ย 
Ec eg intro 1
Ec eg  intro 1Ec eg  intro 1
Ec eg intro 1abu55
ย 
Information Theory Coding 1
Information Theory Coding 1Information Theory Coding 1
Information Theory Coding 1Mahafuz Aveek
ย 
Information Theory
Information TheoryInformation Theory
Information TheorySou Jana
ย 
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...mathsjournal
ย 
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...mathsjournal
ย 

Similar to INFORMATION_THEORY.pdf (20)

Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptx
ย 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptx
ย 
Information theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxInformation theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptx
ย 
Digital Communication: Information Theory
Digital Communication: Information TheoryDigital Communication: Information Theory
Digital Communication: Information Theory
ย 
UNIT-2.pdf
UNIT-2.pdfUNIT-2.pdf
UNIT-2.pdf
ย 
DC@UNIT 2 ppt.ppt
DC@UNIT 2 ppt.pptDC@UNIT 2 ppt.ppt
DC@UNIT 2 ppt.ppt
ย 
Radio receiver and information coding .pptx
Radio receiver and information coding .pptxRadio receiver and information coding .pptx
Radio receiver and information coding .pptx
ย 
Lecture1
Lecture1Lecture1
Lecture1
ย 
Information Theory and Coding
Information Theory and CodingInformation Theory and Coding
Information Theory and Coding
ย 
Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01
ย 
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATION
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATIONUnit I.pptx INTRODUCTION TO DIGITAL COMMUNICATION
Unit I.pptx INTRODUCTION TO DIGITAL COMMUNICATION
ย 
Allerton
AllertonAllerton
Allerton
ย 
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfUnit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
ย 
Lecture7 channel capacity
Lecture7   channel capacityLecture7   channel capacity
Lecture7 channel capacity
ย 
Information theory
Information theoryInformation theory
Information theory
ย 
Ec eg intro 1
Ec eg  intro 1Ec eg  intro 1
Ec eg intro 1
ย 
Information Theory Coding 1
Information Theory Coding 1Information Theory Coding 1
Information Theory Coding 1
ย 
Information Theory
Information TheoryInformation Theory
Information Theory
ย 
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...
Entropy Based Assessment of Hydrometric Network Using Normal and Log-normal D...
ย 
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...
ENTROPY BASED ASSESSMENT OF HYDROMETRIC NETWORK USING NORMAL AND LOG-NORMAL D...
ย 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
ย 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
ย 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
ย 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
ย 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
ย 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
ย 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
ย 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
ย 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
ย 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
ย 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
ย 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
ย 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
ย 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
ย 
FULL ENJOY ๐Ÿ” 8264348440 ๐Ÿ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY ๐Ÿ” 8264348440 ๐Ÿ” Call Girls in Diplomatic Enclave | DelhiFULL ENJOY ๐Ÿ” 8264348440 ๐Ÿ” Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY ๐Ÿ” 8264348440 ๐Ÿ” Call Girls in Diplomatic Enclave | Delhisoniya singh
ย 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
ย 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
ย 
WhatsApp 9892124323 โœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 โœ“Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 โœ“Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 โœ“Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
ย 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
ย 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
ย 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
  • 6. ๏ต There are three main concepts in this theory: 1. The first is the definition of a quantity that can be a valid measurement of information which should be consistent with a physical understanding of its properties. 2. The second concept deals with the relationship between the information and the source that generates it. This concept will be referred to as the source information. Compression and encryptions are related to this concept. 3. The third concept deals with the relationship between the information and the unreliable channel through which it is going to be transmitted. This concept leads to the definition of a very important parameter called the channel capacity. Error - correction coding is closely related to this concept. 6
  • 7. What is Information?
  - The information conveyed by an event depends only on its probability of occurrence, not on its content.
  - Information measures the randomness of an event: the less predictable its occurrence, the more news value its announcement carries.
  - The message associated with the least likely event contains the maximum information.
  • 8. Axioms of Information:
  1. Information is a non-negative quantity: I(p) ≥ 0.
  2. If an event has probability 1, we get no information from its occurrence: I(1) = 0.
  3. If two independent events occur (so that their joint probability is the product of their individual probabilities), the information obtained from observing both is the sum of the two individual informations: I(p1 * p2) = I(p1) + I(p2).
  4. I(p) is monotonic and continuous in p.
  • 9. Information Source:
  - An information source may be viewed as an object which produces an event, the outcome of which is selected at random according to a probability distribution.
  - The set of source symbols is called the source alphabet, and the elements of the set are called symbols or letters.
  - An information source can be classified as having memory or being memoryless.
  - A source with memory is one for which the current symbol depends on the previous symbols.
  - A memoryless source is one for which each symbol produced is independent of the previous symbols.
  - A discrete memoryless source (DMS) can be characterized by the list of symbols, the probability assignment of these symbols and the specification of the rate at which the source generates these symbols.
  • 10. Information Content of a DMS:
  - The amount of information contained in an event is closely related to its uncertainty.
  - A mathematical measure of information should be a function of the probability of the outcome and should satisfy the following axioms:
    a) Information should be proportional to the uncertainty of an outcome.
    b) Information contained in independent outcomes should add up.
  • 11. Information Content of a Symbol (i.e. Logarithmic Measure of Information):
  - Let us consider a DMS denoted by x and having alphabet {x1, x2, ..., xm}.
  - The information content of the symbol xi, denoted by I(xi), is defined by
      I(x_i) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i)
    where P(xi) is the probability of occurrence of symbol xi.
  - For any two independent source messages xi and xj with probabilities Pi and Pj respectively, and with joint probability P(xi, xj) = Pi Pj, the information of the pair is the sum of the information in each message: I_{ij} = I_i + I_j.
  • 12. Note that I(xi) satisfies the following properties:
    1. I(xi) = 0 for P(xi) = 1
    2. I(xi) ≥ 0
    3. I(xi) > I(xj) if P(xi) < P(xj)
    4. I(xi, xj) = I(xi) + I(xj) if xi and xj are independent
  - Unit of I(xi): the unit of I(xi) is the bit (binary unit) if b = 2, the Hartley or decit if b = 10, and the nat (natural unit) if b = e. It is standard to use b = 2, with
      \log_2 a = \frac{\ln a}{\ln 2} = \frac{\log a}{\log 2}
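To make the logarithmic measure concrete, here is a small Python sketch (the function name and sample probabilities are illustrative choices of mine, not from the slides) that evaluates I(x) = -log_b P(x) in bits, nats and Hartleys, and checks the additivity axiom for independent events.

    import math

    def information_content(p, base=2):
        """Self-information I = -log_b(p) of an outcome with probability p."""
        if not 0 < p <= 1:
            raise ValueError("probability must lie in (0, 1]")
        return -math.log(p) / math.log(base)

    p = 0.25
    print(information_content(p, 2))        # 2.0 bits
    print(information_content(p, math.e))   # ~1.386 nats
    print(information_content(p, 10))       # ~0.602 Hartleys (decits)

    # A certain event carries no information:
    print(information_content(1.0, 2))      # 0.0

    # Independent events: I(p1*p2) = I(p1) + I(p2)
    p1, p2 = 0.5, 0.25
    assert abs(information_content(p1 * p2) -
               (information_content(p1) + information_content(p2))) < 1e-12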
  • 13. Entropy (i.e. Average Information):
  - Entropy is a measure of the uncertainty in a random variable. The entropy H of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X.
  - For a quantitative representation of the average information per symbol we make the following assumptions:
    i) The source is stationary, so that the probabilities remain constant with time.
    ii) The successive symbols are statistically independent and come from the source at an average rate of r symbols per second.
  - The quantity H(X) is called the entropy of source X. It is a measure of the average information content per source symbol.
  - The source entropy H(X) can be considered as the average amount of uncertainty within the source X that is resolved by the use of the alphabet.
      H(X) = E[I(x_i)] = \sum_i P(x_i) I(x_i) = -\sum_i P(x_i) \log_2 P(x_i) \quad \text{b/symbol}
  • 14. Entropy for a Binary Source:
  - For a binary source with equally likely symbols,
      H(X) = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1 \ \text{b/symbol}
  - The source entropy H(X) satisfies the relation 0 ≤ H(X) ≤ log2 m, where m is the size of the alphabet of source X.
  - Properties of Entropy:
    1. 0 ≤ H(X) ≤ \log_2 m, where m is the number of symbols of the alphabet of source X.
    2. When all the events are equally likely, the average uncertainty attains its largest value, H(X) = \log_2 m.
    3. H(X) = 0 if all the P(xi) are zero except for one symbol, which has P = 1.
  • 15. Information Rate:
  - If the time rate at which source X emits symbols is r (symbols/second), the information rate R of the source is given by
      R = r H(X) \quad \text{b/s} \quad [(\text{symbols/second}) \times (\text{information bits/symbol})]
  - R is the information rate and H(X) is the entropy (average information).
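A minimal Python sketch of the last two slides, with an assumed source distribution and an assumed symbol rate r (both illustrative, not from the slides): it computes H(X), checks 0 ≤ H(X) ≤ log2 m, and forms the information rate R = r·H(X).

    import math

    def entropy(probs, base=2):
        """H(X) = -sum p_i log_b p_i for a discrete memoryless source."""
        assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
        return -sum(p * math.log(p, base) for p in probs if p > 0)

    # Binary source with equally likely symbols: H = 1 b/symbol
    print(entropy([0.5, 0.5]))            # 1.0

    # A skewed 4-symbol source: 0 <= H(X) <= log2(4) = 2
    probs = [0.5, 0.25, 0.125, 0.125]
    H = entropy(probs)
    print(H)                              # 1.75 b/symbol
    assert 0 <= H <= math.log2(len(probs))

    # Information rate R = r * H(X) for r symbols per second (r is an example value)
    r = 2000
    print(r * H)                          # 3500.0 b/s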
  • 16. The Discrete Memoryless Channels (DMC):
  1. Channel Representation: a communication channel may be defined as the path or medium through which the symbols flow to the receiver end. A DMC is a statistical model with an input X and an output Y. Each possible input-to-output path is indicated along with a conditional probability P(yj/xi), where P(yj/xi) is the conditional probability of obtaining output yj given that the input is xi; it is called a channel transition probability.
  2. A channel is completely specified by the complete set of transition probabilities. The channel is therefore specified by the matrix of transition probabilities [P(Y/X)], known as the Channel Matrix.
  • 17. The channel matrix is
      P(Y/X) = \begin{bmatrix} P(y_1/x_1) & \cdots & P(y_n/x_1) \\ \vdots & \ddots & \vdots \\ P(y_1/x_m) & \cdots & P(y_n/x_m) \end{bmatrix}
  - Since each input to the channel results in some output, each row of the channel matrix must sum to unity:
      \sum_{j=1}^{n} P(y_j/x_i) = 1 \quad \text{for all } i
  - Now, if the input probabilities P(X) are represented by the row matrix, we have
      P(X) = [P(x_1) \; P(x_2) \; \ldots \; P(x_m)]
  • 18. Likewise, the output probabilities P(Y) are represented by the row matrix
      P(Y) = [P(y_1) \; P(y_2) \; \ldots \; P(y_n)]
    and then
      [P(Y)] = [P(X)][P(Y/X)]
  - If P(X) is represented as a diagonal matrix,
      [P(X)]_d = \begin{bmatrix} P(x_1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & P(x_m) \end{bmatrix}
    then
      [P(X, Y)] = [P(X)]_d [P(Y/X)]
  - The (i, j) element of the matrix [P(X, Y)] has the form P(xi, yj).
  - The matrix [P(X, Y)] is known as the joint probability matrix, and the element P(xi, yj) is the joint probability of transmitting xi and receiving yj.
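The matrix relations above can be checked numerically. The sketch below uses an illustrative 2-input, 3-output channel of my own choosing (not an example from the slides) to form [P(Y)] = [P(X)][P(Y/X)] and the joint probability matrix [P(X, Y)] = [P(X)]_d [P(Y/X)] with numpy.

    import numpy as np

    # Hypothetical 2-input, 3-output DMC (transition probabilities chosen for illustration)
    P_Y_given_X = np.array([[0.8, 0.1, 0.1],
                            [0.2, 0.3, 0.5]])
    assert np.allclose(P_Y_given_X.sum(axis=1), 1.0)   # each row sums to unity

    P_X = np.array([0.6, 0.4])           # input probabilities as a row matrix

    # Output probabilities: [P(Y)] = [P(X)][P(Y/X)]
    P_Y = P_X @ P_Y_given_X
    print(P_Y)                           # [0.56 0.18 0.26]

    # Joint probability matrix: [P(X,Y)] = [P(X)]_d [P(Y/X)]
    P_XY = np.diag(P_X) @ P_Y_given_X
    print(P_XY)
    assert np.allclose(P_XY.sum(axis=0), P_Y)   # column sums give P(Y)
    assert np.allclose(P_XY.sum(), 1.0)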
  • 19. Types of Channels:
  - Other than discrete and continuous channels, there are some special types of channels with their own channel matrices. They are as follows:
  - Lossless Channel: a channel described by a channel matrix with only one non-zero element in each column is called a lossless channel, for example
      P(Y/X) = \begin{bmatrix} 3/4 & 1/4 & 0 & 0 & 0 \\ 0 & 0 & 1/3 & 2/3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
  • 20. Deterministic Channel: a channel described by a channel matrix with only one non-zero element in each row is called a deterministic channel, for example
      P(Y/X) = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
  • 21. Noiseless Channel: a channel is called noiseless if it is both lossless and deterministic. For a noiseless channel, m = n, and the channel matrix is the identity matrix, for example
      P(Y/X) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
  • 22. Binary Symmetric Channel: the BSC has two inputs (x1 = 0 and x2 = 1) and two outputs (y1 = 0 and y2 = 1).
  - This channel is symmetric because the probability of receiving a 1 if a 0 is sent is the same as the probability of receiving a 0 if a 1 is sent.
      P(Y/X) = \begin{bmatrix} 1-p & p \\ p & 1-p \end{bmatrix}
  • 23. Conditional And Joint Entropies:
  - Using the input probabilities P(xi), output probabilities P(yj), transition probabilities P(yj/xi) and joint probabilities P(xi, yj), various entropy functions for a channel with m inputs and n outputs are defined:
      H(X) = -\sum_{i=1}^{m} P(x_i) \log_2 P(x_i)
      H(Y) = -\sum_{j=1}^{n} P(y_j) \log_2 P(y_j)
      H(X/Y) = -\sum_{j=1}^{n}\sum_{i=1}^{m} P(x_i, y_j) \log_2 P(x_i/y_j)
      H(Y/X) = -\sum_{j=1}^{n}\sum_{i=1}^{m} P(x_i, y_j) \log_2 P(y_j/x_i)
      H(X, Y) = -\sum_{j=1}^{n}\sum_{i=1}^{m} P(x_i, y_j) \log_2 P(x_i, y_j)
  • 24. H(X) is the average uncertainty of the channel input and H(Y) is the average uncertainty of the channel output.
  - The conditional entropy H(X/Y) is a measure of the average uncertainty remaining about the channel input after the channel output has been observed. H(X/Y) is also called the equivocation of X w.r.t. Y.
  - The conditional entropy H(Y/X) is the average uncertainty of the channel output given that X was transmitted.
  - The joint entropy H(X, Y) is the average uncertainty of the communication channel as a whole.
  - A few useful relationships among the above entropies are:
      H(X, Y) = H(X/Y) + H(Y)
      H(X, Y) = H(Y/X) + H(X)
      H(X, Y) = H(X) + H(Y), \ \text{if X and Y are statistically independent}
      H(X/Y) = H(X, Y) - H(Y)
  • 25. The conditional entropy (conditional uncertainty) of X given the random variable Y, also called the equivocation of X w.r.t. Y, is the average of the conditional entropy over Y.
  - The joint entropy of two discrete random variables X and Y is simply the entropy of their pairing (X, Y). This implies that if X and Y are independent, their joint entropy is the sum of their individual entropies.
  • 26. The Mutual Information:
  - Mutual information measures the amount of information that can be obtained about one random variable by observing another.
  - It is important in communication, where it can be used to maximize the amount of information shared between the sent and received signals.
  - The mutual information of a channel, denoted I(X; Y), is defined by
      I(X; Y) = H(X) - H(X/Y) \quad \text{b/symbol}
  - Since H(X) represents the uncertainty about the channel input before the channel output is observed, and H(X/Y) represents the uncertainty about the channel input after the channel output is observed, the mutual information I(X; Y) represents the uncertainty about the channel input that is resolved by observing the channel output.
  • 27. Properties of Mutual Information I(X; Y):
      I(X; Y) = I(Y; X)
      I(X; Y) \geq 0
      I(X; Y) = H(Y) - H(Y/X)
      I(X; Y) = H(X) + H(Y) - H(X, Y)
  - The entropy corresponding to the mutual information I(X; Y) indicates a measure of the information transmitted through the channel. Hence, it is called the 'transferred information'.
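The entropy relations of slides 23-24 and the three equivalent expressions for I(X; Y) can be verified numerically from any joint probability matrix. The numbers in the sketch below are illustrative, not taken from the slides.

    import numpy as np

    def H(p):
        """Entropy in bits of a probability array (zero entries are ignored)."""
        p = np.asarray(p, dtype=float).ravel()
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    # Joint probability matrix P(xi, yj) for an assumed 2x2 channel (illustrative numbers)
    P_XY = np.array([[0.40, 0.10],
                     [0.05, 0.45]])
    P_X = P_XY.sum(axis=1)            # marginal of X
    P_Y = P_XY.sum(axis=0)            # marginal of Y

    H_X, H_Y, H_XY = H(P_X), H(P_Y), H(P_XY)
    H_X_given_Y = H_XY - H_Y          # H(X/Y) = H(X,Y) - H(Y)
    H_Y_given_X = H_XY - H_X          # H(Y/X) = H(X,Y) - H(X)

    # Mutual information computed three equivalent ways
    I1 = H_X - H_X_given_Y
    I2 = H_Y - H_Y_given_X
    I3 = H_X + H_Y - H_XY
    assert abs(I1 - I2) < 1e-12 and abs(I2 - I3) < 1e-12
    print(round(I1, 4), "b/symbol")   # non-negative, ~0.40 here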
  • 28. The Channel Capacity:
  - The channel capacity represents the maximum amount of information that can be transmitted by a channel per second.
  - To achieve this rate of transmission, the information has to be processed properly, i.e. coded in the most efficient manner.
  - Channel Capacity per Symbol, CS: the channel capacity per symbol of a discrete memoryless channel (DMC) is defined as
      C_S = \max_{\{P(x_i)\}} I(X; Y) \quad \text{b/symbol}
    where the maximization is over all possible input probability distributions {P(xi)} on X.
  - Channel Capacity per Second, C: if r symbols are being transmitted per second, then the maximum rate of transmission of information per second is r·CS. This is the channel capacity per second and is denoted by C (b/s):
      C = r C_S \quad \text{b/s}
  • 29. Capacities of Special Channels:
  - Lossless Channel: for a lossless channel, H(X/Y) = 0 and I(X; Y) = H(X).
  - Thus the mutual information is equal to the input entropy, and no source information is lost in transmission.
      C_S = \max_{\{P(x_i)\}} H(X) = \log_2 m
    where m is the number of symbols in X.
  - Deterministic Channel: for a deterministic channel, H(Y/X) = 0 for all input distributions P(xi), and I(X; Y) = H(Y).
  - Thus the information transfer is equal to the output entropy. The channel capacity per symbol is
      C_S = \max_{\{P(x_i)\}} H(Y) = \log_2 n
    where n is the number of symbols in Y.
  • 30. Noiseless Channel: since a noiseless channel is both lossless and deterministic, we have I(X; Y) = H(X) = H(Y), and the channel capacity per symbol is
      C_S = \log_2 m = \log_2 n
  - Binary Symmetric Channel: for the BSC, the mutual information is
      I(X; Y) = H(Y) + p \log_2 p + (1 - p) \log_2 (1 - p)
    and the channel capacity per symbol is
      C_S = 1 + p \log_2 p + (1 - p) \log_2 (1 - p)
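A short sketch of the BSC capacity formula, C_S = 1 + p log2 p + (1 - p) log2(1 - p); the function name is mine and the listed error probabilities are only examples.

    import math

    def bsc_capacity(p):
        """C_S = 1 + p*log2(p) + (1-p)*log2(1-p) bits/symbol for error probability p."""
        if p in (0.0, 1.0):
            return 1.0
        return 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

    for p in (0.0, 0.01, 0.1, 0.5):
        print(p, round(bsc_capacity(p), 4))
    # p = 0   -> 1.0 b/symbol (the channel is effectively noiseless)
    # p = 0.5 -> 0.0 b/symbol (the output tells us nothing about the input)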
  • 31. Capacity of an Additive White Gaussian Noise (AWGN) Channel - Shannon–Hartley Law:
  - The Shannon–Hartley law underscores the fundamental role of bandwidth and signal-to-noise ratio in a communication channel. It also shows that we can exchange increased bandwidth for decreased signal power for a system with given capacity C.
  - In an additive white Gaussian noise (AWGN) channel, the channel output Y is given by
      Y = X + n
    where X is the channel input and n is an additive band-limited white Gaussian noise with zero mean and variance σ².
  • 32. The capacity CS of an AWGN channel is given by
      C_S = \max_{\{P(x_i)\}} I(X; Y) = \frac{1}{2} \log_2 \left(1 + \frac{S}{N}\right) \quad \text{b/sample}
    where S/N is the signal-to-noise ratio at the channel output.
  - If the channel bandwidth B Hz is fixed, then the output y(t) is also a band-limited signal, completely characterized by its periodic sample values taken at the Nyquist rate of 2B samples/s.
  - Then the capacity C (b/s) of the AWGN channel is given by
      C = 2B \cdot C_S = B \log_2 \left(1 + \frac{S}{N}\right) \quad \text{b/s}
  - The above equation is known as the Shannon–Hartley law.
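As a numerical illustration of the Shannon–Hartley law, the sketch below evaluates C = B log2(1 + S/N) for an assumed telephone-grade channel of about 3.1 kHz bandwidth at 30 dB SNR (illustrative figures, not from the slides).

    import math

    def shannon_hartley_capacity(bandwidth_hz, snr_linear):
        """C = B*log2(1 + S/N) in bits per second."""
        return bandwidth_hz * math.log2(1 + snr_linear)

    B = 3100.0                              # assumed bandwidth in Hz
    snr_db = 30.0                           # assumed signal-to-noise ratio in dB
    snr = 10 ** (snr_db / 10)               # 30 dB -> 1000 (linear)
    print(round(shannon_hartley_capacity(B, snr)))   # roughly 30,900 b/s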
  • 33. Channel Capacity: Shannon–Hartley Law Proof:
  - The bandwidth and the noise power place a restriction upon the rate of information that can be transmitted by a channel. The channel capacity C of an AWGN channel is expressed as
      C = B \log_2 \left(1 + \frac{S}{N}\right)
    where B = channel bandwidth in Hz, S = signal power and N = noise power.
  - Proof: assuming the signal is mixed with noise, the signal amplitude can be recognized only to within the root-mean-square noise voltage.
  - Assuming the average signal power and noise power to be S watts and N watts respectively, the RMS value of the received signal is \sqrt{S + N} and that of the noise is \sqrt{N}.
  - Therefore the number of distinct levels that can be distinguished without error is
      M = \sqrt{\frac{S + N}{N}} = \sqrt{1 + \frac{S}{N}}
  • 34. The maximum amount of information carried by each pulse having \sqrt{1 + S/N} distinct levels is
      I = \log_2 \sqrt{1 + \frac{S}{N}} = \frac{1}{2} \log_2 \left(1 + \frac{S}{N}\right) \quad \text{bits}
  - The channel capacity is the maximum amount of information that can be transmitted per second by the channel. If the channel can transmit a maximum of K pulses per second, then the channel capacity C is given by
      C = \frac{K}{2} \log_2 \left(1 + \frac{S}{N}\right) \quad \text{bits/second}
  - A system of bandwidth nfm Hz can transmit 2nfm independent pulses per second, so a system with bandwidth B Hz can transmit a maximum of 2B pulses per second. Replacing K with 2B, we finally get
      C = B \log_2 \left(1 + \frac{S}{N}\right) \quad \text{bits/second}
  - The bandwidth and the signal power can be exchanged for one another.
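The closing remark about exchanging bandwidth for signal power can be made concrete by solving C = B log2(1 + S/N) for the S/N required at different bandwidths; the target capacity and bandwidths below are arbitrary illustrative values.

    import math

    def required_snr(capacity_bps, bandwidth_hz):
        """Solve C = B*log2(1 + S/N) for the S/N needed at a given bandwidth."""
        return 2 ** (capacity_bps / bandwidth_hz) - 1

    C = 10_000.0   # target capacity in b/s (illustrative)
    for B in (1_000.0, 2_000.0, 5_000.0, 10_000.0):
        snr = required_snr(C, B)
        print(f"B = {B:7.0f} Hz  ->  S/N = {snr:10.1f}  ({10 * math.log10(snr):6.1f} dB)")
    # Halving the bandwidth squares the factor (1 + S/N): the same capacity can be
    # bought either with more bandwidth or with more signal power.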
  • 35. The Source Coding:
  - Definition: a conversion of the output of a discrete memoryless source (DMS) into a sequence of binary symbols, i.e. a binary code word, is called source coding.
  - The device that performs this conversion is called the source encoder.
  - Objective of Source Coding: an objective of source coding is to minimize the average bit rate required to represent the source by reducing the redundancy of the information source.
  • 36. Few Terms Related to Source Coding Process:
  I. Code Word Length: let X be a DMS with finite entropy H(X) and an alphabet {x1, ..., xm} with corresponding probabilities of occurrence P(xi) (i = 1, ..., m). Let the binary code word assigned to symbol xi by the encoder have length ni, measured in bits. The length of a code word is the number of binary digits in the code word.
  II. Average Code Word Length: the average code word length L per source symbol is given by
      L = \sum_{i=1}^{m} P(x_i) n_i
    The parameter L represents the average number of bits per source symbol used in the source coding process.
  • 37. III. Code Efficiency: the code efficiency η is defined as
      \eta = \frac{L_{min}}{L}
    where Lmin is the minimum possible value of L. As η approaches unity, the code is said to be efficient.
  IV. Code Redundancy: the code redundancy γ is defined as
      \gamma = 1 - \eta
  • 38. The Source Coding Theorem:
  - The source coding theorem states that for a DMS X with entropy H(X), the average code word length L per symbol is bounded as L ≥ H(X).
  - Further, L can be made as close to H(X) as desired for some suitably chosen code.
  - Thus, with
      L_{min} = H(X)
    the code efficiency can be rewritten as
      \eta = \frac{H(X)}{L}
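A small sketch of these definitions, using an assumed source distribution and an assumed set of code word lengths (both illustrative, not from the slides): it computes L, the efficiency η = H(X)/L and the redundancy γ = 1 - η.

    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def average_length(probs, lengths):
        """L = sum P(xi) * ni."""
        return sum(p * n for p, n in zip(probs, lengths))

    # Source and code word lengths chosen for illustration
    probs   = [0.5, 0.25, 0.125, 0.125]
    lengths = [1, 2, 3, 3]          # e.g. the prefix-free code {0, 10, 110, 111}

    H = entropy(probs)
    L = average_length(probs, lengths)
    efficiency = H / L              # eta = H(X)/L, using Lmin = H(X)
    redundancy = 1 - efficiency
    print(H, L, round(efficiency, 3), round(redundancy, 3))
    # Here L = H(X) = 1.75 b/symbol, so eta = 1: the code is fully efficient.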
  • 39. Classification of Codes:
  - Fixed-Length Codes
  - Variable-Length Codes
  - Distinct Codes
  - Prefix-Free Codes
  - Uniquely Decodable Codes
  - Instantaneous Codes
  - Optimal Codes

      xi    Code 1   Code 2   Code 3   Code 4   Code 5   Code 6
      x1    00       00       0        0        0        1
      x2    01       01       1        10       01       01
      x3    00       10       00       110      011      001
      x4    11       11       11       111      0111     0001
  • 40. Fixed-Length Codes: a fixed-length code is one whose code word length is fixed. Code 1 and Code 2 of the above table are fixed-length codes with length 2.
  - Variable-Length Codes: a variable-length code is one whose code word length is not fixed. All codes of the above table except Code 1 and Code 2 are variable-length codes.
  - Distinct Codes: a code is distinct if each code word is distinguishable from every other. All codes of the above table except Code 1 are distinct codes.
  - Prefix-Free Codes: a code in which no code word can be formed by adding code symbols to another code word is called a prefix-free code. In a prefix-free code, no code word is a prefix of another. Codes 2, 4 and 6 of the above table are prefix-free codes.
  • 41. Uniquely Decodable Codes: a distinct code is uniquely decodable if the original source sequence can be reconstructed perfectly from the encoded binary sequence. A sufficient condition to ensure that a code is uniquely decodable is that no code word is a prefix of another. Thus the prefix-free codes 2, 4 and 6 are uniquely decodable codes. The prefix-free condition is, however, not a necessary condition for unique decodability: Code 5, although it does not satisfy the prefix-free condition, is nevertheless uniquely decodable, since the bit 0 indicates the beginning of each code word.
  - Instantaneous Codes: a uniquely decodable code is called an instantaneous code if the end of any code word is recognizable without examining subsequent code symbols. Instantaneous codes have the property, mentioned previously, that no code word is a prefix of another code word. Prefix-free codes are therefore also known as instantaneous codes.
  - Optimal Codes: a code is said to be optimal if it is instantaneous and has the minimum average length L for a given source with a given probability assignment for the source symbols.
  • 42. Kraft Inequality:
  - Let X be a DMS with alphabet xi (i = 1, 2, ..., m). Assume that the length of the binary code word assigned to xi is ni.
  - A necessary and sufficient condition for the existence of an instantaneous binary code is
      K = \sum_{i=1}^{m} 2^{-n_i} \leq 1
    This is known as the Kraft inequality.
  - Note that the Kraft inequality assures us of the existence of an instantaneously decodable code with code word lengths that satisfy the inequality.
  - It does not show us how to obtain those code words, nor does it say that any code satisfying the inequality is automatically uniquely decodable.
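The Kraft sum is easy to evaluate directly; the sketch below (function name mine) checks two illustrative sets of code word lengths against K = Σ 2^(-ni) ≤ 1.

    def kraft_sum(lengths, r=2):
        """K = sum r**(-ni) for code word lengths ni over an r-ary code alphabet."""
        return sum(r ** (-n) for n in lengths)

    # Lengths of the prefix-free code {0, 10, 110, 111}: satisfies Kraft with equality
    print(kraft_sum([1, 2, 3, 3]))      # 1.0  -> an instantaneous code exists
    # Lengths that are too short for any instantaneous binary code
    print(kraft_sum([1, 1, 2]))         # 1.25 -> no instantaneous binary code exists
    # Satisfying the inequality does not by itself make a given code uniquely decodable;
    # it only guarantees that some instantaneous code with these lengths exists.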
  • 43. Entropy Coding:
  - The design of a variable-length code such that its average code word length approaches the entropy of the DMS is often referred to as entropy coding.
  - There are basically two types of entropy coding:
    1. Shannon–Fano coding
    2. Huffman coding
  • 44. Shannon–Fano Coding:
  - An efficient code can be obtained by the following simple procedure, known as the Shannon–Fano algorithm:
    1. List the source symbols in order of decreasing probability.
    2. Partition the set into two sets that are as close to equiprobable as possible, and assign 0 to the upper set and 1 to the lower set.
    3. Continue this process, each time partitioning the sets with as nearly equal probabilities as possible, until further partitioning is not possible.
    4. Assign code words by appending the 0s and 1s from left to right.
  • 45. Shannon–Fano Coding - Example:
  - Let there be six source symbols with probabilities x1 = 0.30, x2 = 0.25, x3 = 0.20, x4 = 0.12, x5 = 0.08 and x6 = 0.05. Obtain the Shannon–Fano coding for the given source symbols.
  - The resulting Shannon–Fano code words are:

      xi    P(xi)   Code
      x1    0.30    00
      x2    0.25    01
      x3    0.20    10
      x4    0.12    110
      x5    0.08    1110
      x6    0.05    1111

  - H(X) = 2.36 b/symbol, L = 2.38 b/symbol, η = H(X)/L = 0.99.
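One possible Python implementation of the Shannon–Fano partitioning procedure is sketched below (the function and variable names are mine); on the six-symbol source above it reproduces the code words in the table and the average length L = 2.38 b/symbol.

    def shannon_fano(symbols):
        """symbols: list of (symbol, probability); returns dict symbol -> code string."""
        codes = {s: "" for s, _ in symbols}

        def split(items):
            if len(items) <= 1:
                return
            total = sum(p for _, p in items)
            acc, best_i, best_diff = 0.0, 1, float("inf")
            # find the split point that makes the two parts as equiprobable as possible
            for i in range(1, len(items)):
                acc += items[i - 1][1]
                diff = abs(acc - (total - acc))
                if diff < best_diff:
                    best_diff, best_i = diff, i
            upper, lower = items[:best_i], items[best_i:]
            for s, _ in upper:
                codes[s] += "0"      # 0 to the upper set
            for s, _ in lower:
                codes[s] += "1"      # 1 to the lower set
            split(upper)
            split(lower)

        split(sorted(symbols, key=lambda sp: sp[1], reverse=True))
        return codes

    source = [("x1", 0.30), ("x2", 0.25), ("x3", 0.20),
              ("x4", 0.12), ("x5", 0.08), ("x6", 0.05)]
    codes = shannon_fano(source)
    print(codes)        # x1=00, x2=01, x3=10, x4=110, x5=1110, x6=1111
    L = sum(p * len(codes[s]) for s, p in source)
    print(round(L, 2))  # 2.38 b/symbol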
  • 46. Huffman Coding:
  - Huffman coding results in an optimal code: it is the code that has the highest efficiency.
  - The Huffman coding procedure is as follows:
    1. List the source symbols in order of decreasing probability.
    2. Combine the probabilities of the two symbols having the lowest probabilities and reorder the resultant probabilities; this step is called reduction 1. The same procedure is repeated until there are only two ordered probabilities remaining.
    3. Start encoding with the last reduction, which consists of exactly two ordered probabilities. Assign 0 as the first digit in the code words for all the source symbols associated with the first probability; assign 1 to the second probability.
    4. Now go back and assign 0 and 1 to the second digit for the two probabilities that were combined in the previous reduction step, retaining the assignments already made.
    5. Keep working back in this way until the first column is reached.
    6. Each code word is obtained by tracing back from right to left.
  • 47. Huffman Encoding - Example:

      Source symbol xi    P(xi)   Code word
      x1                  0.30    00
      x2                  0.25    01
      x3                  0.20    11
      x4                  0.12    101
      x5                  0.08    1000
      x6                  0.05    1001

  - H(X) = 2.36 b/symbol, L = 2.38 b/symbol, η = H(X)/L = 0.99.
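A possible Huffman implementation using a priority queue is sketched below (names are mine). The particular 0/1 labels it assigns may differ from those in the table above, but the code word lengths, and hence the average length L = 2.38 b/symbol, agree.

    import heapq
    from itertools import count

    def huffman(symbols):
        """symbols: list of (symbol, probability); returns dict symbol -> binary code."""
        tiebreak = count()   # keeps heap entries comparable when probabilities are equal
        heap = [(p, next(tiebreak), {s: ""}) for s, p in symbols]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)   # the two lowest probabilities
            p2, _, c2 = heapq.heappop(heap)
            for s in c1:                      # prepend a bit as we merge (reduction step)
                c1[s] = "0" + c1[s]
            for s in c2:
                c2[s] = "1" + c2[s]
            c1.update(c2)
            heapq.heappush(heap, (p1 + p2, next(tiebreak), c1))
        return heap[0][2]

    source = [("x1", 0.30), ("x2", 0.25), ("x3", 0.20),
              ("x4", 0.12), ("x5", 0.08), ("x6", 0.05)]
    codes = huffman(source)
    L = sum(p * len(codes[s]) for s, p in source)
    print(codes)
    print(round(L, 2))   # 2.38 b/symbol, the same average length as on the slide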
  • 48. Redundancy:
  - Redundancy in information theory refers to the reduction in the information content of a message from its maximum value.
  - For example, consider English with its 26 letters. Assuming all letters are equally likely to occur, P(xi) = 1/26, and the information carried per letter is
      \log_2 26 = 4.7 \ \text{bits/letter}
  - The assumption that each letter occurs with equal probability is not correct; some letters are more likely to occur than others, which reduces the information content of English below its maximum value of 4.7 bits/letter.
  - We define the relative entropy as the ratio of H(Y/X) to H(X), which gives the maximum compression value, and the redundancy is then expressed as
      \text{Redundancy} = 1 - \frac{H(Y/X)}{H(X)}
  • 49. Entropy Relations for a Continuous Channel:
  - In a continuous channel, an information source produces a continuous signal x(t).
  - The signal offered to such a channel is an ensemble of waveforms generated by some random ergodic process, if the channel is subject to AWGN.
  - If the continuous signal x(t) has a finite bandwidth, it can equally well be described by a continuous random variable X having a PDF (probability density function) fX(x).
  - The entropy of a continuous source X(t) is given by
      H(X) = -\int_{-\infty}^{\infty} f_X(x) \log_2 f_X(x)\, dx \quad \text{bits/symbol}
  - This relation is also known as the differential entropy of X.
  • 50. Similarly, the average mutual information for a continuous channel is given by
      I(X; Y) = H(X) - H(X/Y) = H(Y) - H(Y/X)
    where
      H(Y) = -\int_{-\infty}^{\infty} f_Y(y) \log_2 f_Y(y)\, dy
      H(X/Y) = -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y) \log_2 f(x/y)\, dx\, dy
      H(Y/X) = -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x, y) \log_2 f(y/x)\, dx\, dy
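For a concrete check of the differential entropy formula, the sketch below compares the closed-form value for a Gaussian density, h(X) = (1/2) log2(2πeσ²) bits per sample, with a direct numerical evaluation of the integral on a grid (σ is an arbitrary example value).

    import numpy as np

    sigma = 2.0
    # Closed form for a Gaussian: h(X) = 0.5*log2(2*pi*e*sigma^2) bits per sample
    h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)

    # Numerical evaluation of h(X) = -integral f(x) log2 f(x) dx on a fine grid
    x = np.linspace(-40 * sigma, 40 * sigma, 400_001)
    dx = x[1] - x[0]
    f = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    h_numeric = float(-np.sum(f[f > 0] * np.log2(f[f > 0])) * dx)

    print(round(float(h_closed), 4), round(h_numeric, 4))   # both about 3.0471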
  • 51. Analytical Proof of Shannon–Hartley Law:
  - Let x, n and y represent the samples of x(t), n(t) and y(t).
  - Assume that the channel is band-limited to B Hz. x(t) and n(t) are also limited to B Hz, and hence y(t) is band-limited to B Hz as well.
  - Hence all signals can be specified by taking samples at the Nyquist rate of 2B samples/sec.
  - We know that
      I(X; Y) = H(Y) - H(Y/X)
      H(Y/X) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(x, y) \log \frac{1}{P(y/x)}\, dx\, dy = \int_{-\infty}^{\infty} P(x)\, dx \int_{-\infty}^{\infty} P(y/x) \log \frac{1}{P(y/x)}\, dy
  - We know that y = x + n. For a given x, y = n + a, where a is a constant for that x.
  - Hence, the distribution of y for a given value of x is identical to that of n, except for a translation by x.
  • 52. If Pn(n) represents the probability density function of n, then
      P(y/x) = P_n(y - x)
  - Therefore
      \int_{-\infty}^{\infty} P(y/x) \log \frac{1}{P(y/x)}\, dy = \int_{-\infty}^{\infty} P_n(y - x) \log \frac{1}{P_n(y - x)}\, dy
  - Let y - x = z, so that dy = dz. The above integral becomes
      \int_{-\infty}^{\infty} P_n(z) \log \frac{1}{P_n(z)}\, dz = H(n)
  - H(n) is the entropy of the noise sample.
  - If the mean square value of the signal x(t) is S and that of the noise n(t) is N, then the mean square value of the output is
      \overline{y^2} = S + N
  • 53. As before, H(n) is the entropy of the noise sample and \overline{y^2} = S + N.
  - Channel capacity C = max I(X; Y) = max [H(Y) - H(Y/X)] = max [H(Y) - H(n)].
  - H(Y) is maximum when y is Gaussian, and its maximum value is f_m \log(2\pi e \sigma^2), where f_m is the band-limiting frequency (here f_m = B) and σ² is the mean square value, σ² = S + N.
  - Therefore
      H(y)_{max} = B \log [2\pi e (S + N)]
      \max H(n) = B \log (2\pi e N)
  - Therefore
      C = B \log [2\pi e (S + N)] - B \log (2\pi e N) = B \log \frac{2\pi e (S + N)}{2\pi e N}
  - Hence
      C = B \log \left(1 + \frac{S}{N}\right)
  • 54. Rate Distortion Theory:
  - By the source coding theorem for a discrete memoryless source, the average code word length must be at least as large as the source entropy for perfect coding (i.e. perfect representation of the source).
  - There are constraints that force the coding to be imperfect, thereby resulting in unavoidable distortion.
  - For example, the source may have a continuous amplitude, as in the case of speech, and the requirement is to quantize the amplitude of each sample generated by the source to permit its representation by a code word of finite length, as in PCM.
  - In such a case, the problem is referred to as source coding with a fidelity criterion, and the branch of information theory that deals with it is called rate distortion theory.
  • 55. RDT finds applications in two types of situations:
    1. Source coding where the permitted coding alphabet cannot exactly represent the information source, in which case we are forced to do 'lossy data compression'.
    2. Information transmission at a rate greater than the channel capacity.
  • 56. Rate Distortion Function:
  - Consider a DMS defined by an M-ary alphabet X: {xi | i = 1, 2, ..., M}, which consists of a set of statistically independent symbols together with the associated symbol probabilities {Pi | i = 1, 2, ..., M}. Let R be the average code rate in bits per code word.
  - The representation code words are taken from another alphabet Y: {yj | j = 1, 2, ..., N}.
  - The source coding theorem states that this second alphabet provides a perfect representation of the source provided that R > H, where H is the source entropy.
  - But if we are forced to have R < H, then there is unavoidable distortion and therefore loss of information.
  - Let P(xi, yj) denote the joint probability of occurrence of source symbol xi and representation symbol yj.
  • 57. From probability theory, we have P(xi, yj) = P(yj/xi) P(xi), where P(yj/xi) is a transition probability.
  - Let d(xi, yj) denote a measure of the cost incurred in representing the source symbol xi by the symbol yj; the quantity d(xi, yj) is referred to as a single-letter distortion measure.
  - The statistical average of d(xi, yj) over all possible source symbols and representation symbols is given by
      \bar{d} = \sum_{i=1}^{M} \sum_{j=1}^{N} P(x_i)\, P(y_j/x_i)\, d(x_i, y_j)
  - The average distortion \bar{d} is a non-negative continuous function of the transition probabilities P(yj/xi), which are determined by the source encoder-decoder pair.
  • 58. A conditional probability assignment P(yj/xi) is said to be D-admissible if and only if the average distortion \bar{d} is less than or equal to some acceptable value D. The set of all D-admissible conditional probability assignments is denoted by
      P_D = \{ P(y_j/x_i) : \bar{d} \leq D \}
  - For each set of transition probabilities, we have a mutual information
      I(X; Y) = \sum_{i=1}^{M} \sum_{j=1}^{N} P(x_i)\, P(y_j/x_i) \log \frac{P(y_j/x_i)}{P(y_j)}
  • 59. "A rate distortion function R(D) is defined as the smallest coding rate possible for which the average distortion is guaranteed not to exceed D."
  - Let PD denote the set to which the conditional probability P(yj/xi) belongs for a prescribed D. Then, for a fixed D, we write
      R(D) = \min_{P(y_j/x_i) \in P_D} I(X; Y)
    subject to the constraint
      \sum_{j=1}^{N} P(y_j/x_i) = 1 \quad \text{for } i = 1, 2, \ldots, M
  - The rate distortion function R(D) is measured in units of bits if the base-2 logarithm is used.
  - We expect the distortion D to decrease as R(D) is increased.
  - Tolerating a larger distortion D permits the use of a smaller rate for coding and/or transmission of information.
  • 60. Gaussian Source:
  - Consider a discrete-time memoryless Gaussian source with zero mean and variance σ². Let x denote the value of a sample generated by such a source.
  - Let y denote a quantized version of x that permits a finite representation of it.
  - The squared-error distortion d(x, y) = (x - y)² provides a distortion measure that is widely used for continuous alphabets.
  - The rate distortion function of the Gaussian source with squared-error distortion is given by
      R(D) = \frac{1}{2} \log_2 \frac{\sigma^2}{D} \quad \text{for } 0 \leq D \leq \sigma^2, \qquad R(D) = 0 \quad \text{for } D > \sigma^2
  - In this case, R(D) → ∞ as D → 0, and R(D) = 0 for D = σ².
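The sketch below evaluates the Gaussian rate distortion function R(D) = (1/2) log2(σ²/D) for a few illustrative distortion values, together with its inverse, the distortion-rate function D(R) = σ²·2^(-2R).

    import math

    def rate_distortion_gaussian(D, sigma2):
        """R(D) = 0.5*log2(sigma^2/D) for 0 < D < sigma^2, and 0 for D >= sigma^2 (bits/sample)."""
        if D >= sigma2:
            return 0.0
        return 0.5 * math.log2(sigma2 / D)

    sigma2 = 1.0
    for D in (0.01, 0.1, 0.25, 0.5, 1.0, 2.0):
        print(D, round(rate_distortion_gaussian(D, sigma2), 4))
    # R(D) grows without bound as D -> 0 and is 0 once D >= sigma^2.

    # Inverting R(D) gives the distortion-rate function D(R) = sigma^2 * 2**(-2R):
    for R in (0.5, 1.0, 2.0):
        print(R, sigma2 * 2 ** (-2 * R))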
  • 61. Data Compression:
  - RDT leads to data compression, which involves a purposeful or unavoidable reduction in the information content of data from a continuous or discrete source.
  - A data compressor is a device that supplies a code with the least number of symbols for the representation of the source output, subject to a permissible or acceptable distortion.
  • 62. Probability Theory:
  - The relative frequency of an event A in n trials satisfies
      0 \leq \frac{N_n(A)}{n} \leq 1
    where Nn(A) is the number of times event A occurs in n trials, and the probability of A is
      P(A) = \lim_{n \to \infty} \frac{N_n(A)}{n}
  - Axioms of Probability:
    1. P(S) = 1
    2. 0 ≤ P(A) ≤ 1
    3. P(A + B) = P(A) + P(B), when A and B are mutually exclusive
    4. N_n(A + B)/n = N_n(A)/n + N_n(B)/n, for mutually exclusive A and B
  • 63. Property 1:
  - P(A') = 1 - P(A). Since S = A + A', we have 1 = P(A) + P(A').
  - Property 2: if M mutually exclusive events A1, A2, ..., Am have the exhaustive property
      A_1 + A_2 + \ldots + A_m = S
    then
      P(A_1) + P(A_2) + \ldots + P(A_m) = 1
  • 64. Property 3:
  - When events A and B are not mutually exclusive, the probability of the union event 'A or B' equals
      P(A + B) = P(A) + P(B) - P(AB)
    where P(AB) is the probability of the joint event 'A and B'. P(AB) is called the joint probability, with
      P(AB) = \lim_{n \to \infty} \frac{N_n(AB)}{n}
  - Property 4: consider a pair of events A and B. Let P(B/A) denote the probability of event B given that event A has occurred; P(B/A) is called the conditional probability, with
      P(B/A) = \frac{P(AB)}{P(A)}
  - We have N_n(AB)/N_n(A) \leq 1. Therefore
      P(B/A) = \lim_{n \to \infty} \frac{N_n(AB)}{N_n(A)}
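The relative-frequency view of these properties can be illustrated with a short Monte Carlo experiment; the die-rolling events A and B below are arbitrary examples of mine, chosen so that A and B are not mutually exclusive.

    import random

    random.seed(42)
    n = 200_000
    count_A = count_B = count_AB = count_union = 0

    for _ in range(n):
        die = random.randint(1, 6)
        A = die % 2 == 0          # event A: die is even,  P(A) = 1/2
        B = die >= 4              # event B: die is >= 4,  P(B) = 1/2
        count_A += A
        count_B += B
        count_AB += A and B       # joint event,           P(AB) = 2/6
        count_union += A or B     # union event,           P(A+B) = 4/6

    P_A, P_B = count_A / n, count_B / n
    P_AB, P_union = count_AB / n, count_union / n
    # Relative frequencies approach the probabilities as n grows:
    print(P_union, P_A + P_B - P_AB)      # both near 4/6, Property 3
    print(P_AB / P_A, (2 / 6) / (1 / 2))  # conditional P(B/A), both near 2/3, Property 4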
  • 65. References: 1. Digital Communications; Dr. Sanjay Sharma; Katson Books. 2. Communication Engineering; B P Lathi. 65