3. Introduction
• There are two main fields of coding theory, namely
o Source coding, which tries to represent the source symbols in minimal form for storage or
transmission efficiency.
o Channel coding, the purpose of which is to enhance detection and correction of
transmission errors, by choosing symbol representations which are far apart from each
other.
• Data compression can be considered an extension of source coding. It can be divided into two
phases:
o Modelling of the information source means defining suitable units for coding, such as
characters or words, and estimating the probability distribution of these units.
o Source coding (also called statistical or entropy coding) is applied to the units, using their
probabilities.
Asst. Prof. Dr. Hamsa A. Abdullah Advanced Coding Techniques 3
4. Information
• Shannon defined a measure of the information for the event x by using a logarithmic measure
operating over the base b. For a discrete random variable X, the information of an outcome X = x is:
I(x) = −log_b(p(x))
• The information of the event depends only on its probability of occurrence, and is not dependent on
its content.
• The base of the logarithmic measure can be converted by using:
log_a(p(x)) = log_b(p(x)) · (1 / log_b(a))
• If this measure is calculated to base 2, the information is said to be measured in bits.
5. Information
• For independent random variables:
I(x, y) = −log p(x, y)
= −log[p(x)p(y)]
= I(x) + I(y)
• If X is Bernoulli with Pr{X = 1} = p, the information of the outcome X = 1 is I = −log(p).
• The information is always a non-negative quantity (it is zero only for an event with probability 1).
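This additivity for independent events is easy to check numerically. A minimal Python sketch (the helper name `info` is mine, not from the slides):

```python
import math

def info(p, base=2):
    # Self-information I(x) = -log_b(p(x)); measured in bits when base = 2
    return -math.log(p, base)

# Independent events: I(x, y) = I(x) + I(y)
px, py = 0.5, 0.25
assert abs(info(px * py) - (info(px) + info(py))) < 1e-12
print(info(0.5))   # 1.0 bit
```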
6. Entropy
• The entropy of a discrete random variable, X, is defined by:
H(X) = −E[log₂ p(X)] = −Σ_{x∈X} p(x) log₂ p(x)
H(X) = Σ_{x∈X} p(x) log₂(1/p(x))
• For discrete random variables, H(X) ≥ 0.
• The entropy is the average information of the random variable X:
H(X) = E[I(X)]
• When base 2 is used, the entropy is measured in bits per symbol.
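The definition above can be computed directly. A minimal Python sketch (the helper name `entropy` is mine):

```python
import math

def entropy(probs, base=2):
    # H(X) = sum_x p(x) log_b(1/p(x)); terms with p(x) = 0 contribute nothing
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit/symbol (fair coin)
print(entropy([1.0]))        # 0.0 (a certain outcome carries no uncertainty)
```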
7. Entropy
• Note that:
o Entropy is the measure of average uncertainty in X.
o Entropy is the average number of bits needed to describe X.
o Entropy is a lower bound on the average length of the shortest description of X.
• The information rate is then equal to:
R = rH(X) bps
where r is the symbol rate in symbols per second.
8. Example
• Entropy of a Bernoulli R.V. with parameter p.
• Solution:
H(X) = −p log(p) − (1 − p) log(1 − p)
9. Example
• Entropy of a uniform R.V. taking on K values: e.g., X ∈ {1, …, K}.
• Solution:
H(X) = −Σ_{x∈X} p(x) log p(x) = Σ_{i=1}^{K} (1/K) log K = log K
• Note: the entropy does not depend on the values that X takes, only on their probabilities (X and X + a have the same entropy!).
10. Example
• A source characterized in the frequency domain with a bandwidth of W = 4000 Hz is sampled
at the Nyquist rate, generating a sequence of values taken from the range A = {−2, −1, 0, 1, 2}
with the following corresponding set of probabilities {1/2, 1/4, 1/8, 1/16, 1/16}. Calculate the source rate
in bits per second.
11. Solution
H(X) = Σ_x p(x) log₂(1/p(x))
H(X) = (1/2) log₂ 2 + (1/4) log₂ 4 + (1/8) log₂ 8 + (2/16) log₂ 16 = 15/8 bits/sample
• The minimum sampling frequency is equal to 8000 samples per second, so that the information rate is
equal to 15 kbps.
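The arithmetic above can be checked in a few lines of Python (a sketch of the same computation):

```python
import math

probs = [1/2, 1/4, 1/8, 1/16, 1/16]
H = sum(p * math.log2(1 / p) for p in probs)   # entropy in bits/sample
R = 2 * 4000 * H                               # Nyquist rate (2W) times entropy
print(H)   # 1.875 (= 15/8 bits/sample)
print(R)   # 15000.0 bps
```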
• Note that:
• Entropy can be evaluated to a different base by using:
H_b(X) = H(X) / log₂ b
12. Example
• A given source emits r = 3000 symbols per second from a range of four symbols, with the probabilities
given in Table:
Xi Pi Ii
A 1/3 1.5849
B 1/3 1.5849
C 1/6 2.5849
D 1/6 2.5849
13. Solution
• The entropy is:
H(X) = Σ_x p(x) log₂(1/p(x))
H(X) = (2/3) log₂ 3 + (2/6) log₂ 6 = 1.9183 bits/symbol
14. Example
• Find the entropy of the 26 English letters (a–z) plus a space character '-'.
• Solution:
• The entropy is:
H(X) = −Σ_{x∈X} p(x) log p(x) = 4.11 bits/letter
15. Transmission of Information
• The channel usually has a negative effect on information transmission, so that not all the
information (or entropy) is transferred to the receiver; instead, a portion of this information is
discarded by the channel, or the channel adds noise to the transferred information. The
model is shown below:
16. Transmission of Information
• The channel is modelled as a conditional probability p(xᵢ|yⱼ) or p(yⱼ|xᵢ) for all values of xᵢ and yⱼ.
• For the simplest case of a binary channel we have:
X = {x₁, x₂}, Y = {y₁, y₂}
• We have four conditional probabilities p(yⱼ|xᵢ) as follows:
p(y₁|x₁): conditional probability of receiving y₁ when the source produced x₁, i.e., the probability of correct reception of x₁
p(y₂|x₂): conditional probability of receiving y₂ when the source produced x₂, i.e., the probability of correct reception of x₂
p(y₂|x₁): conditional probability of receiving y₂ when the source produced x₁, i.e., the probability of an incorrect transition of x₁
p(y₁|x₂): conditional probability of receiving y₁ when the source produced x₂, i.e., the probability of an incorrect transition of x₂
17. Joint and Conditional Entropy
• The joint probability mass function of two random variables X and Y taking values on alphabets X and Y,
respectively, is:
p(x, y) = Pr{X = x, Y = y}, x ∈ X, y ∈ Y
• If 𝑝 𝑥 = Pr 𝑋 = 𝑥 > 0, the conditional probability of Y=y given that X=x is defined by:
p(y|x) = Pr{Y = y | X = x} = p(x, y) / p(x)
18. Joint and Conditional Entropy
• Independence: The events X = x and Y = y are independent if
𝑝(𝑥, 𝑦) = 𝑝(𝑥)𝑝(𝑦)
• The joint entropy: H(X, Y) of two random variables (X, Y) with pmf p(x,y) is defined as:
H(X, Y) = −E[log p(X, Y)] = −Σ_{x∈X} Σ_{y∈Y} p(x, y) log p(x, y)
• The conditional entropy of Y given X is defined as:
H(Y|X) = −E[log p(Y|X)] = −Σ_{x∈X} Σ_{y∈Y} p(x, y) log p(y|x)
19. Joint and Conditional Entropy
• Note that:
H(Y|X) = −Σ_{x∈X} Σ_{y∈Y} p(x, y) log p(y|x)
= −Σ_{x∈X} p(x) Σ_{y∈Y} p(y|x) log p(y|x)
= Σ_{x∈X} p(x) H(Y|X = x)
20. Chain rule
• We know that p(x, y) = p(x)p(y|x). Therefore, taking logarithms and expectations on both
sides we arrive at:
E[log p(X, Y)] = E[log p(X)] + E[log p(Y|X)]
• So chain rule:
H(X, Y) = H(X) + H(Y|X)
• Similarly:
H(X, Y) = H(Y) + H(X|Y)
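The chain rule can be verified numerically on a small joint pmf (a Python sketch; the pmf below is a hypothetical example):

```python
import math
from collections import defaultdict

def H(probs):
    # entropy in bits; zero-probability terms contribute nothing
    return sum(q * math.log2(1 / q) for q in probs if q > 0)

# hypothetical joint pmf p(x, y) over X, Y in {0, 1}
p = {(0, 0): 1/4, (0, 1): 1/4, (1, 0): 1/2, (1, 1): 0.0}

px = defaultdict(float)
for (x, y), q in p.items():
    px[x] += q                      # marginal p(x)

Hxy = H(p.values())
Hx = H(px.values())
# H(Y|X) = sum_x p(x) H(Y | X = x)
HygX = sum(px[x] * H([p[(x, y)] / px[x] for y in (0, 1)]) for x in px)
assert abs(Hxy - (Hx + HygX)) < 1e-12   # chain rule: H(X,Y) = H(X) + H(Y|X)
print(Hxy, Hx, HygX)   # 1.5 1.0 0.5
```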
21. Chain rule
• Note that:
H(X|Y) ≠ H(Y|X)
• But
H(Y) − H(Y|X) = H(X) − H(X|Y)
• As a corollary of the chain rule, it is easy to prove the following:
H(X, Y|Z) = H(X|Z) + H(Y|X, Z)
22. Chain rule
• Mutual information: The mutual information between
two random variables is the "amount of information"
about one random variable obtained through the other
(mutual dependence); alternate interpretations: how much
is your uncertainty about X reduced by knowing Y, or how
much does X inform Y?
I(X, Y) = Σ_{x,y} P(x, y) log [P(x, y) / (P(x)P(y))]
= H(X) − H(X|Y)
= H(Y) − H(Y|X)
= H(X) + H(Y) − H(X, Y)
• Note that
• 𝐼(𝑋, 𝑌 ) = 𝐼(𝑌, 𝑋) ≥ 0, with equality if and only if X and Y
are independent.
23. Example
• Find:
a. 𝐻 𝑋, 𝑌 ,
b. 𝐻 𝑌 𝑋 , 𝐻 𝑋 𝑌 ,
c. 𝐼(𝑋, 𝑌).
X\Y    0     1
0     1/4   1/4
1     1/2    0
24. Solution
a. H(X, Y) = (2/4) log₂ 4 + (1/2) log₂ 2
H(X, Y) = 1.5 bits
26. Solution
c. I(X, Y) = H(X) + H(Y) − H(X, Y)
I(X, Y) = 1 + 0.81 − 1.5 = 0.31
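The same bookkeeping in Python reproduces these answers from the joint table (a sketch; the marginals are row and column sums):

```python
import math

def H(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

joint = [[1/4, 1/4],   # rows: X = 0, 1
         [1/2, 0.0]]   # columns: Y = 0, 1

px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]
Hxy = H(q for row in joint for q in row)
I = H(px) + H(py) - Hxy        # I(X,Y) = H(X) + H(Y) - H(X,Y)
print(Hxy)            # 1.5
print(round(I, 2))    # 0.31
```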
27. Channel Capacity
• The operational channel capacity is the number of bits needed to represent the maximum number of distinguishable
signals over n uses of a communication channel.
• If in n transmissions we can send M signals without error, the channel capacity is (log M)/n bits per
transmission.
• The information channel capacity is the maximum mutual information. The operational channel capacity
is equal to the information channel capacity.
28. Channel Capacity
• The channel capacity of a discrete memoryless channel is defined as:
C = max_{p(x)} I(X, Y)
C = max_{p(x)} [H(Y) − H(Y|X)]
29. Noiseless Binary Channel
• Consider the channel presented in Figure. Show that the capacity is 1 bit per symbol (or per channel
use).
p(Y = 0) = p(X = 0) = α₀
p(Y = 1) = p(X = 1) = α₁ = 1 − α₀
I(X; Y) = H(Y) − H(Y|X) = H(Y) ≤ 1
with equality when α₀ = α₁ = 0.5
30. Binary Symmetric Channel(BSC)
• The BSC is characterized by a probability p that one binary symbol converts into the other.
• Each binary symbol also has a probability of being transmitted: the probabilities of a 0 or a 1
being transmitted are α and 1 − α, respectively. According to the notation used,
x₁ = 0, x₂ = 1, y₁ = 0, y₂ = 1
31. Binary Symmetric Channel(BSC)
• The probability matrix for the BSC is equal to:
P_ch = [ 1−p   p
          p   1−p ]
• Channel capacity:
C_BSC = 1 − H(p, 1 − p)
32. Binary Erasure Channel (BEC)
• In the binary erasure channel, some bits are lost (rather than corrupted).
• Here the receiver knows which bit has been erased. Figure shows this channel.
• We are to calculate the capacity of binary erasure channel.
33. Binary Erasure Channel (BEC)
• For this channel, 0 ≤ p ≤ 1/2, where p is the erasure probability, and the channel model has two
inputs and three outputs.
• When the received values are unreliable, or if blocks are detected to contain errors, then erasures are
declared, indicated by the symbol ‘?’. The probability matrix of the BEC is the following:
P_ch = [ 1−p   p    0
          0    p   1−p ]
• Channel capacity:
C_BEC = 1 − p bits per channel use
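Both closed-form capacities above are easy to code (a minimal Python sketch; the function names are mine):

```python
import math

def Hb(p):
    # binary entropy H(p, 1 - p) in bits
    return 0.0 if p in (0.0, 1.0) else p * math.log2(1/p) + (1-p) * math.log2(1/(1-p))

def c_bsc(p):
    return 1 - Hb(p)   # BSC: C = 1 - H(p, 1-p)

def c_bec(p):
    return 1 - p       # BEC: C = 1 - p

print(c_bsc(0.0), c_bsc(0.5))   # 1.0 0.0  (perfect channel vs. useless channel)
print(c_bec(0.1))               # 0.9
```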
34. Example
• Consider the binary channel for which the input range and output range are in both cases equal to {0,
1}. Find P(X|Y).
35. Solution:
P(X|Y) = P(X, Y) / P(Y)
P(X|Y) = P(Y|X) P(X) / P(Y)
• The corresponding transition probability matrix is in this case equal to:
P_ch = P(Y|X) = [ 3/4  1/4
                  1/8  7/8 ]
P(X) = [4/5  1/5]
38. Symmetric and Non Symmetric Channel
• Let us consider channel with transition matrix:
P(y|x) = [ 0.3  0.2  0.5
           0.5  0.3  0.2
           0.2  0.5  0.3 ]
• with the entry in the x-th row and y-th column giving the probability that y is received when x is sent.
• All the rows are permutations of each other and the same holds for all columns. We say that such a
channel is symmetric.
39. Symmetric and Non Symmetric Channel
• Definition
• A channel is said to be symmetric if the rows of its transition matrix are permutations of each other, and
the columns are permutations of each other.
• A channel is said to be weakly symmetric if every row of the transition matrix is a permutation of every
other row, and all the column sums are equal.
• If a channel is symmetric or weakly symmetric, the channel capacity is:
𝑪 = 𝒍𝒐𝒈 |𝒀| − 𝑯(𝒓)
• where r is the set of probabilities labeling branches leaving a code symbol X, or, viewed in a transition
matrix, one row of the transition matrix.
40. Example
• Consider a channel with three different inputs 𝑋 = {1, 2, 3} and the same set of outputs 𝑌 = 1, 2, 3
41. Solution
• The transition probability matrix is:
P(y|x) = [ 0.7  0.1  0.2
           0.2  0.7  0.1
           0.1  0.2  0.7 ]
• C = log₂ 3 − H(0.7, 0.1, 0.2)
• C = 1.585 − [0.7 log₂(1/0.7) + 0.1 log₂(1/0.1) + 0.2 log₂(1/0.2)]
• C = 1.585 − (0.360 + 0.332 + 0.464)
• C = 0.428 bits per channel use
42. Example
• Channel with two erasure symbols, one closer to 0 and one closer to 1, is shown:
43. Solution
• The transition probability matrix is:
P(y|x) = [ 1/3  1/4  1/4  1/6
           1/6  1/4  1/4  1/3 ]
• We see that the two rows have the same set of probabilities. Summing each column we get the constant value 1/2,
• so we can conclude that the channel is weakly symmetric. The set of outputs Y has cardinality |Y| = 4, and we can
calculate the capacity for this channel as:
• C = log₂ 4 − H(1/3, 1/4, 1/4, 1/6)
• C = 2 − [(1/3) log₂ 3 + (1/2) log₂ 4 + (1/6) log₂ 6]
• C = 2 − 1.959 = 0.041
• which is a very poor channel.
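A quick numeric check of the weakly symmetric capacity formula C = log₂|Y| − H(r) (a Python sketch):

```python
import math

def H(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def capacity_weakly_symmetric(P):
    # C = log2|Y| - H(r), where r is any row of the transition matrix
    return math.log2(len(P[0])) - H(P[0])

P = [[1/3, 1/4, 1/4, 1/6],
     [1/6, 1/4, 1/4, 1/3]]
print(round(capacity_weakly_symmetric(P), 3))   # 0.041
```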
44. Non Symmetric Channel
P(Y|X) = [ P₁₁  P₁₂
           P₂₁  P₂₂ ]
• Solve for the auxiliary variables Q₁ and Q₂ from the linear system:
[ P₁₁  P₁₂ ] [ Q₁ ]   [ P₁₁ log P₁₁ + P₁₂ log P₁₂ ]
[ P₂₁  P₂₂ ] [ Q₂ ] = [ P₂₁ log P₂₁ + P₂₂ log P₂₂ ]
• where Q₁ and Q₂ are auxiliary variables. Then:
C = log₂(2^Q₁ + 2^Q₂)
45. Example
• Find the mutual information and channel capacity of the channel given below 𝑝(𝑥1) = 0.6
and 𝑝(𝑥2) = 0.4.
52. Solution
C = log₂(2^Q₁ + 2^Q₂)
C = log₂(2^(−0.655) + 2^(−0.977))
C = log₂(0.635 + 0.508)
C = log₂(1.143) ≈ 0.19 bits
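For a general (nonsymmetric) 2×2 channel, the auxiliary-variable method above can be implemented directly (a sketch, assuming the standard Muroga formulation; the hypothetical helper solves the 2×2 linear system by Cramer's rule):

```python
import math

def capacity_binary(P):
    # Solve P @ Q = h with h_i = sum_j P[i][j] log2 P[i][j],
    # then C = log2(2**Q1 + 2**Q2)
    h = [sum(p * math.log2(p) for p in row if p > 0) for row in P]
    det = P[0][0]*P[1][1] - P[0][1]*P[1][0]
    q1 = (h[0]*P[1][1] - h[1]*P[0][1]) / det
    q2 = (P[0][0]*h[1] - P[1][0]*h[0]) / det
    return math.log2(2**q1 + 2**q2)

# sanity check against the BSC closed form: C = 1 - H(p, 1-p)
p = 0.1
c_ref = 1 + (1-p)*math.log2(1-p) + p*math.log2(p)
assert abs(capacity_binary([[1-p, p], [p, 1-p]]) - c_ref) < 1e-9
```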
53. Continuous Sources and channels
• Differential Entropy
• The differential entropy (in nats) of a continuous source with generic probability density function (pdf) f_X
is defined as:
h(X) = −∫_{−∞}^{∞} f_X(x) log f_X(x) dx
54. Example
• A continuous source X with source alphabet [0, 1) and pdf 𝑓(𝑥) = 2𝑥 has
differential entropy equal to:
h(X) = −∫₀¹ 2x log(2x) dx
= [x²(1 − 2 log(2x)) / 2]₀¹
= 1/2 − log 2 ≈ −0.193 nats
(with the logarithms taken to base e)
• Note that the differential entropy, unlike the entropy, can be negative in its value.
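The integral above can also be checked numerically (a midpoint-rule sketch in Python; natural logs, so the answer is in nats):

```python
import math

# numerically approximate h(X) = -∫_0^1 f(x) ln f(x) dx for f(x) = 2x
n = 200_000
dx = 1.0 / n
h = 0.0
for i in range(n):
    x = (i + 0.5) * dx      # midpoint of each subinterval of [0, 1)
    f = 2 * x
    h -= f * math.log(f) * dx
print(round(h, 4))   # -0.1931  (= 1/2 - ln 2)
```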
55. Example
• Differential entropy of a continuous source with uniform generic distribution:
• A continuous source X with uniform generic distribution over (a, b) has the following differential entropy.
• The pdf of a uniform random variable is:
f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise
56. Example
• The differential entropy is simply:
ℎ 𝑋 = 𝐸[− log 𝑓(𝑋)] = log (𝑏 − 𝑎)
• Notice that the differential entropy can be negative or positive depending on whether 𝑏 − 𝑎 is less
than or greater than 1. In practice, because of this property, differential entropy is usually used as
means to determine mutual information and does not have much operational significance by itself.
57. Example
• Differential entropy of Gaussian sources
• The pdf of a Gaussian random variable is:
f(x) = (1/√(2πσ²)) e^(−(x−μ)²/(2σ²))
58. Example
• The differential entropy is simply:
h(X) = E[−log f(X)]
h(X) = ∫ f(x) [½ log(2πσ²) + (x − μ)²/(2σ²)] dx
= ½ log(2πσ²) + (1/(2σ²)) E[(X − μ)²]
= ½ log(2πσ²) + ½
= ½ log(2πσ²e) nats
• A continuous source X with Gaussian generic distribution of mean μ and variance σ² has this differential entropy.
59. The conditional differential entropy
• The conditional differential entropy of 𝑋 given 𝑌 is:
h(X|Y) = −∬_{x,y} f_{XY}(x, y) log₂ f_{X|Y}(x|y) dx dy
60. The mutual information
• The mutual information between 𝑋 and 𝑌 is:
I(X; Y) = h(X) − h(X|Y) = h(Y) − h(Y|X)
I(X; Y) = ∬_{x,y} f_{XY}(x, y) log [f_{XY}(x, y) / (f_X(x) f_Y(y))] dx dy
• Shannon's channel coding theorem holds for continuous alphabets as well: the
capacity of any channel with power constraint P and transition law f_{Y|X} is
C = max_{f_X} I(X; Y)
61. The mutual information
• The Gaussian random variable is very important as we encounter it
frequently in communications and signal processing.
• You can compute the differential entropy of the Gaussian X ~ N(μ, σ²); it is
equal to:
h(X) = ½ log₂(2πeσ²)
62. The mutual information
• For the power-constrained AWGN channel Yᵢ = Xᵢ + Zᵢ with Zᵢ ~ N(0, σ²), the mutual
information is maximized when f_X is N(0, P). Then,
I(X; Y) = h(Y) − h(Y|X) = h(Y) − h(Z)
• Since X and Z are independent Gaussians, Y ~ N(0, P + σ²). Using the formula for the entropy of
a Gaussian and simplifying, we get:
C = ½ log₂(1 + P/σ²)
63. Channel Efficiency and Redundancy
• Channel efficiency: η_ch = (I/C) × 100%
• Channel redundancy: R_ch = ((C − I)/C) × 100%
64. Example
For the following channel:
P(yⱼ|xᵢ) = [ 0.9  0.1   0
              0   0.9  0.1
             0.1   0   0.9 ]
a) Is the channel symmetric? Why?
b) If the three source symbols probabilities are related by: 𝒑(𝒙𝟏) = 𝒑(𝒙𝟐) = 𝟐. 𝒑(𝒙𝟑), find source
probabilities, all entropies, and average mutual information.
c) Find the channel capacity, channel efficiency and redundancy.
65. Solution
a) The channel is symmetric (or a ternary symmetric channel, TSC), because the rows of P(yⱼ|xᵢ) are
permutations of the same set of probabilities.
b) To find the source probabilities, we have 3 unknowns and so we need 3 equations, which are given by:
P(x₁) = P(x₂) ……(1)
P(x₂) = 2·P(x₃) ……(2)
P(x₁) + P(x₂) + P(x₃) = 1 ……(3)
From (1), P(x₂) = P(x₁); from (2), P(x₃) = P(x₂)/2 = P(x₁)/2.
Putting these relations in (3) gives:
P(x₁) + P(x₁) + P(x₁)/2 = 1 → (5/2)·P(x₁) = 1, or P(x₁) = 2/5 = 0.4
• Now, using (1) and (2): P(x₁) = 0.4, P(x₂) = 0.4, and P(x₃) = 0.2
• So P(xᵢ) = [0.4  0.4  0.2]
66. Solution
• From the relation P(xᵢ, yⱼ) = p(xᵢ) p(yⱼ|xᵢ) and the given matrix of p(yⱼ|xᵢ):
P(xᵢ, yⱼ) = [ 0.36  0.04   0
               0    0.36  0.04
              0.02   0    0.18 ]
• Summing the column components gives P(yⱼ) = [0.38  0.40  0.22]
• Other probabilities are unnecessary in this example; now calculate:
H(x) = −Σᵢ P(xᵢ) log P(xᵢ) = 1.5219 bits/symbol
H(y) = −Σⱼ P(yⱼ) log P(yⱼ) = 1.5398 bits/symbol
• Since we have a symmetric channel, H(y|x) = −Σⱼ P(yⱼ|xᵢ) log P(yⱼ|xᵢ) for any row:
• H(y|x) = −[0.9 log 0.9 + 0.1 log 0.1 + 0 log 0] = 0.467 bits/symbol
67. Solution
• I= H(Y) – H(Y|X) = 1.5398 - 0.467 = 1.0728 Bits/Symbol
• H(X|Y) = H(X) – I = 1.5219 - 1.0728 = 0.4491 Bits/Symbol
• H(X,Y) = H(X) + H(Y) – I = 1.5219 + 1.5398 - 1.0728 = 1.9889 Bits/Symbol
c) Using the general expression for channel capacity for symmetric channel:
• C = log M − H(Y|X) = log M + Σⱼ P(yⱼ|xᵢ) log P(yⱼ|xᵢ)
= log₂ 3 − 0.467 = 1.1179 bits/symbol
• Channel efficiency: η_ch = (I/C) × 100% = (1.0728/1.1179) × 100% = 95.96%
• Channel redundancy: R_ch = ((C − I)/C) × 100% = ((1.1179 − 1.0728)/1.1179) × 100% = 4.04%
68. Entropy, Information, and Capacity Rates
• The meaning of rate is the unit of a physical quantity per unit time. For the entropy, the
average mutual information, and the capacity, the rate is measured in bits per second
(bits/sec), or more generally bps. This is more important than the unit bits/symbol.
• In terms of units: (bits/symbol) × (symbols/second) = bits/second, or bps
• Let Rₓ be the source symbol rate; then the time of the symbol (Tₓ) is given by:
Tₓ = 1/Rₓ seconds/symbol
• Now each of H, I, and C can be converted from the bits/symbol unit into a rate in bps by
multiplying each of them by Rₓ, as follows:
H′(x) = Rₓ·H(x)
I′ = Rₓ·I
C′ = Rₓ·C
69. Example
• For the previous example, find H′(x), I′, and C′, if the average time interval of the source symbol is 10 μsec.
• Solution:
o Since Tₓ = 1/Rₓ, then Rₓ = 1/Tₓ = 1/(10 × 10⁻⁶) = 10⁵ = 100000 symbols/sec
o From the results of the previous example:
- H(x) = 1.5219 bits/symbol, so H′(x) = Rₓ·H(x) = 152190 bps
- I = 1.0728 bits/symbol, so I′ = Rₓ·I = 107280 bps
- C = 1.1179 bits/symbol, so C′ = Rₓ·C = 111790 bps
70. Information and Capacity Over a Continuous Channel
a) The bandwidth (B): the bandwidth is the range of frequency occupied by a given signal or system, in
Hz. It can be measured as the difference between fmax and fmin over the positive side of the frequency
domain.
b) Nyquist's theorem: the maximum sample or symbol rate of a signal over a channel having bandwidth
B is limited to 2B symbols/sec. In mathematical representation: R ≤ 2B symbols/sec (Rmax = 2B).
c) The signal-to-noise power ratio (S/N): the ratio of the signal power (S, in Watts) to the noise
power (N, also in Watts) in the channel. It is therefore unit-less (a ratio), and is usually expressed in dB, where:
(S/N)_dB = 10·log₁₀(S/N)_ratio dB
• The inverse conversion is also required: (S/N)_ratio = 10^((S/N)_dB / 10)
71. Information and Capacity Over a Continuous Channel
• The model in the case of a continuous channel is shown below:
• As before, the source output consists of continuous random variable symbols x with pdf
f(x), the received symbol is y with pdf f(y), and the noise is n with pdf f(n). It
is required to find an expression for C.
[Diagram: source x with pdf f(x) → adder (+) ← noise n with pdf f(n) (AWGN); output: received y with pdf f(y)]
72. Information and Capacity Over a Continuous Channel
• Assumptions / their reasons:
1. f(x) is Gaussian (normal) / the maximum source entropy Hmax(x) is achieved by a Gaussian RV.
2. f(n) is also Gaussian / due to the following:
a. Natural noise is totally random in nature, like a Gaussian RV.
b. We need to test the system under the worst case of noise (Gaussian).
c. According to the central limit theorem, "the sum of unknown independent noise sources can be modeled as a
Gaussian RV".
3. The mean values of the source and noise are zero (x̄ = 0 and n̄ = 0) / the DC level or the mean does
not affect the information.
73. Information and Capacity Over a Continuous Channel
• Since the noise is added to the signal and has Gaussian pdf, it is called Additive White Gaussian Noise
(AWGN). The term white here is used to specify that the noise is present in all frequencies with the same
power spectral density. So, the above model is also called AWGN channel model.
• Now we shall use the above definitions and assumptions to derive the channel capacity for a continuous
channel:
• Since the source and noise are both Gaussian with zero means:
- The signal power is S = E[x²] = σₓ²
• Then, H(x) = ½ log(2πeσₓ²) = ½ log(2πeS) (the signal entropy)
- The noise power is N = E[n²] = σₙ²
• Then, H(n) = ½ log(2πeσₙ²) = ½ log(2πeN),
or H(y|x) = ½ log(2πeN) (the noise entropy)
74. Information and Capacity Over a Continuous Channel
- Since y = x + n, and both x and n are Gaussian RVs, y is also a Gaussian RV, with mean ȳ = x̄ + n̄ = 0;
so the received signal power is (S + N) = E[y²] = σ_y², and then
H(y) = ½ log(2πeσ_y²) = ½ log(2πe(S + N)) (the receiver entropy)
• Since we assume maximum source entropy over the given AWGN channel:
C = Imax = H(y) − H(y|x)
= ½ log(2πe(S + N)) − ½ log(2πeN)
= ½ log[2πe(S + N) / (2πeN)]
= ½ log[(S + N)/N]
= ½ log(1 + S/N)
75. Information and Capacity Over a Continuous Channel
• We have C = ½ log₂(1 + S/N) bits/symbol.
• Using Nyquist's theorem above, Rmax = 2B symbols/sec.
• Thus, the capacity rate in bps is given by:
Cr = Rmax·C = B log₂(1 + S/N) bps
• This equation is known as the Shannon–Hartley equation for channel capacity:
Cr = B log₂(1 + S/N) bps
• The above equation relates the bandwidth of the channel with both the signal power and the noise
power. Clearly, when B or S/N is increased, the capacity rate also increases. In practice this is not
always true, since the noise power also increases as the bandwidth increases, where
N = N₀B (in Watts)
• where B is the channel bandwidth as before and N₀ is the noise power spectral density (in
Watts/Hz). It is the power of the noise in each Hz of the channel bandwidth.
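The Shannon–Hartley equation in code, including the dB conversion (a sketch; the function name is mine):

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    snr = 10 ** (snr_db / 10)                  # dB -> power ratio
    return bandwidth_hz * math.log2(1 + snr)   # capacity rate in bps

# e.g. B = 100 MHz, S/N = 20 dB
print(shannon_capacity(100e6, 20) / 1e6)   # ≈ 665.8 Mbps
```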
76. Example
a. A 4G cellular system uses a maximum bandwidth of 100 MHz with an efficient signal that provides
S/N = 20 dB; find the maximum bit rate.
b. If the above is replaced by a Huawei 5G cellular system that provides an extended bandwidth of 500
MHz, using the same signal and an S/N of 20 dB, what is the percentage increase in the system bit rate?
77. Solution
a. We have S/N = 20 dB; this should be converted to a ratio to be used inside the Shannon–Hartley equation, thus:
(S/N)_ratio = 10^((S/N)_dB / 10) = 10^(20/10) = 10² = 100 (ratio)
• Now: Cr = B log₂(1 + S/N)
• = 100×10⁶ × log₂(1 + 100) ≈ 666 Mbps
78. Solution
b. Using the 500 MHz bandwidth:
• Cr = B log₂(1 + S/N) = 500×10⁶ × log₂(1 + 100) ≈ 3330 Mbps (or 3.33 Gbps)
• % increase in rate = ((new rate − old rate)/old rate) × 100%
= ((3330 − 666)/666) × 100% = 400%
• This means, the rate (bps) in 5G is four times that of 4G.
79. Example
• Consider the following specifications for a digital image: image frame resolution (dimensions) =
1200×800 pixels/frame; colored (RGB) information for each pixel = 24 bits/pixel. The pixels are
equally probable to have any color value. Find:
a. the amount of information carried by one frame (in bits/frame).
b. the amount of information produced by 1000 frames.
c. the rate of information (in bps), if the above 1000 frames are sent within 100 sec.
d. the required channel bandwidth if the signal to noise power ratio is 45 dB.
80. Solution
• First, we need to know the details of a digital image. It consists of a number of picture elements, also
called pixels (or pels, or dots). One can notice these small elements when getting very close to a TV screen. The
single image, also known as a frame or just a picture, is a 2-dimensional arrangement of a large number of
pixels. This number is determined by the height (H) and the width (W) of the picture or frame; in the
above example W×H = 1200×800 (also called the resolution). The color depth is the number of
bits in each pixel. Higher resolution produces a better quality picture.
a. Given W×H = 1200×800 and 24 bits/pixel, we need to find the total information of one frame:
I_frame = 1200×800 (pixels/frame) × 24 (bits/pixel) = 2304×10⁴ bits/frame
81. Solution
b. I_total = 2304×10⁴ (bits/frame) × 1000 frames = 2304×10⁷ bits
c. R_b = 2304×10⁷ bits / 100 sec = 2304×10⁵ bps
d. Here Cr = R_b = 2304×10⁵ bps
• and (S/N)_ratio = 10^((S/N)_dB / 10) = 10^(45/10) = 10^4.5 = 31622.8 (ratio)
• Using the channel capacity theorem Cr = B log₂(1 + S/N), then:
• B = Cr / log₂(1 + S/N) = 2304×10⁵ / log₂(1 + 31622.8) = 2304×10⁵ / 14.95 = 15.4×10⁶ Hz = 15.4 MHz
82. Example
• Repeat the requirements of the previous example if the image is a gray-scale image with 8 bits/pixel.
83. Solution
• The image here is a gray-scale image (also called a black-and-white or B/W picture). So instead of 24 colored bits per pixel we
have 8 bits per pixel.
a. Given W×H = 1200×800 as before and 8 bits/pixel:
I_frame = 1200×800 (pixels/frame) × 8 (bits/pixel) = 768×10⁴ bits/frame
b. I_total = 768×10⁴ (bits/frame) × 1000 frames = 768×10⁷ bits
c. R_b = 768×10⁷ bits / 100 sec = 768×10⁵ bps
d. Here Cr = R_b = 768×10⁵ bps
and (S/N)_ratio = 10^((S/N)_dB / 10) = 10^(45/10) = 10^4.5 = 31622.8 (ratio)
Using the channel capacity theorem Cr = B log₂(1 + S/N), then:
B = Cr / log₂(1 + 31622.8) = 768×10⁵ / 14.95 = 5.14×10⁶ Hz = 5.14 MHz