Mustaqbal University
College of Engineering & Computer Sciences
Electronics and Communication Engineering Department
Course: EE301: Probability Theory and Applications
Prerequisite: Stat 219
Text Book: B.P. Lathi, "Modern Digital and Analog Communication Systems", 3rd edition, Oxford University
Press, Inc., 1998
Reference: A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 2005
Dr. Aref Hassan Kurdali
Application: Information Theory
β€’ In the context of communications, information theory deals with
mathematical modeling and analysis of a communication system
rather than with physical sources and physical channels.
β€’ In particular, it provides answers to two fundamental questions
(among others):
1) What is the minimum number of binits (binary digits) per source
symbol required to fully represent the source with acceptable quality?
(Most efficient source coding)
2) What is the ultimate (highest) transmission binit rate for reliable
(error-free) communication over a noisy channel?
(Most efficient channel coding)
The answers to these two questions lie in the entropy of a source and the
capacity of a channel respectively.
Entropy is defined in terms of the probabilistic behavior of a source of
information (how much average uncertainty the information source has);
it is so named in deference to the parallel use of this concept in
thermodynamics (how much average disorder a physical system has).
Capacity is defined as the basic ability of a channel to transmit
information; it is naturally related to the noise characteristics of the
channel.
A remarkable result that emerges from information theory is that if the
entropy of the source is less than the capacity of the channel, then
error-free communication over the channel can be achieved.
The discrete source output is modeled as a discrete random variable, S,
which takes on symbols from a fixed finite alphabet
S={s1, s2, s3, .........., sq}
with probability distribution P(S = si) = pi, i = 1, 2, 3, ..., q,
where the probabilities satisfy Σ (i=1 to q) pi = 1.
Discrete Memoryless Source
A discrete memoryless source (zero-memory source) emits statistically
independent symbols during successive signaling intervals; the symbol
emitted at any time is independent of previously emitted symbols.
Information Measure
How much information I(a) is associated with an event 'a' whose
probability is p(a) = p?
The information measure I(a) should have several properties:
1. Information is a non-negative quantity: I(a) ≥ 0.
2. If an event has probability 1, we get no information from the
occurrence of that event, i.e. I(a) = 0 if p (a) =1.
3. If two independent events (a & b) occur (whose joint probability is the
product of their individual probabilities i.e. p(ab) = p(a)p(b)), then the
total information we get from observing these two events is the sum of
the two informations:
I(ab) = I(a)+I(b). (This is the critical property . . . )
4. The information measure should be a continuous (and, in fact,
monotonic) function of the probability (slight changes in probability
should result in slight changes in information).
Since I(a²) = I(aa) = I(a) + I(a) = 2 I(a),
by continuity we get, for 0 < p(a) ≤ 1 and any real n > 0:
I(a^n) = n I(a)
From this, information can be measured by the logarithm function,
i.e. I(a) = -log_b(p(a)) = log_b(1/p(a)) for some base b.
The base b determines the unit of information used.
The unit can be changed by changing the base, using the following formula:
for b1, b2, x > 0,
log_b1(x) = log_b2(x) / log_b2(b1)
The occurrence of an event S = sk either provides some or no information, but never
brings about a loss of information.
The less probable an event is, the more information we gain when it occurs.
Uncertainty, Surprise, and Information
The amount of uncertainty (before), surprise (at), and information gained
(after) observing the event S = sk, which occurs with probability pk, is
therefore defined using the logarithmic function
I(sk) = log(1/pk)
Units of information
The base of the logarithm in Equation (9.4) is quite arbitrary.
Nevertheless, it is standard practice today to use a logarithm to base 2.
The resulting unit of information is called the bit.
When pk = 1/2, we have I(sk) = 1 bit. Hence, one bit is the amount of
information that we gain when one of two possible and equally likely
(i.e., equiprobable) events occurs.
If a logarithm to base 10 is used, the resulting unit of information is
called the hartley. When pk = 1/10, we have I(sk) = 1 hartley.
If a logarithm to base e is used, the resulting unit of information
is called the nat. When pk = 1/e, we have I(sk) = 1 nat.
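As a quick numerical illustration of these units (an added sketch in Python; the function name self_information is an arbitrary choice):

```python
import math

def self_information(p, base=2):
    """Self-information I = log_base(1/p) of an event of probability p."""
    return math.log(1.0 / p, base)

print(round(self_information(0.5, 2), 3))              # 1.0 bit
print(round(self_information(0.1, 10), 3))             # 1.0 hartley
print(round(self_information(1 / math.e, math.e), 3))  # 1.0 nat
print(round(self_information(0.1, 2), 3))              # 3.322 bits = 1 hartley expressed in bits
```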
Source Entropy H(S)
The entropy of a discrete memoryless source is
H(S) = Σ (i=1 to q) pi I(si) = Σ (i=1 to q) pi log(1/pi)
It is the average amount of information content per source symbol.
The source entropy is bounded as follows:
0 ≤ H(S) ≤ log q
where q is the radix (number of symbols) of the alphabet of the source.
Furthermore, we may make two statements:
1. H(S) = 0, if and only if the probability pi = 1 for some i, and the
remaining probabilities in the set are all zero; this lower bound on
entropy corresponds to no uncertainty.
2. H(S) = log q, if and only if pi = 1/q for all i (i.e., all the symbols in the
alphabet are equiprobable); this upper bound on entropy corresponds to
maximum uncertainty.
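A minimal Python sketch of this definition (the helper name entropy is arbitrary), also illustrating the two bounds:

```python
import math

def entropy(probs, base=2):
    """H(S) = sum of p_i * log(1/p_i); zero-probability terms contribute nothing."""
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

print(round(entropy([1.0, 0.0, 0.0, 0.0]), 3))        # 0.0  -> one certain symbol, no uncertainty
print(round(entropy([0.25, 0.25, 0.25, 0.25]), 3))    # 2.0  -> log2(4), maximum uncertainty
print(round(entropy([0.5, 0.25, 0.125, 0.125]), 3))   # 1.75 -> a non-uniform source lies in between
```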
Consider a binary source for which
symbol 0 occurs with probability p0 and
symbol 1 with probability p1 = 1 - p0.
The source is memoryless, so that
successive symbols emitted by the
source are statistically independent.
The entropy of the binary source is
usually called the entropy function
h(p0) = p0 log (1/p0) + (1-p0) log (1/(1-p0))
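A short numerical illustration of h(p0) (an added sketch): the function is zero at p0 = 0 and p0 = 1 and reaches its maximum of 1 bit at p0 = 1/2.

```python
import math

def h(p0):
    """Binary entropy function h(p0) in bits; h(0) = h(1) = 0 by convention."""
    if p0 in (0.0, 1.0):
        return 0.0
    return p0 * math.log2(1 / p0) + (1 - p0) * math.log2(1 / (1 - p0))

for p0 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p0, round(h(p0), 3))   # 0.0, 0.469, 1.0 (the maximum), 0.469, 0.0
```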
We often find it useful to consider blocks rather than individual symbols,
with each block consisting of n successive source symbols.
We may view each such block as being produced by an extended source
with a source alphabet of q^n distinct blocks, where q is the number
of distinct symbols in the alphabet of the original source.
In the case of a discrete memoryless source, the source symbols are
statistically independent. Hence, the probability of an extended-source
symbol is equal to the product of the probabilities of the n original
source symbols constituting that extended-source symbol. Thus,
it is intuitive to expect that H(S^n), the entropy of the extended
source, is equal to n times H(S), the entropy of the original source. That
is, we may write
H(S^n) = n H(S)
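This relation can be checked numerically; the sketch below (added for illustration, using an arbitrary three-symbol distribution) builds the n-th extension with itertools.product.

```python
import math
from itertools import product

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def extension(probs, n):
    """n-th extension of a memoryless source: each block probability is the
    product of the probabilities of its n constituent symbols."""
    return [math.prod(block) for block in product(probs, repeat=n)]

p = [0.5, 0.3, 0.2]
print(round(entropy(p), 3))                # H(S)   ~ 1.485 bits/symbol
print(round(entropy(extension(p, 2)), 3))  # H(S^2) ~ 2.971, i.e. twice H(S)
print(round(entropy(extension(p, 3)), 3))  # H(S^3) ~ 4.456, i.e. three times H(S)
```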
Problems
1. Find the entropy of a 7-symbol source at uniform distribution.
(Answer: 2.81 bits of information/SS)
2. Given a five-symbol source with the probability distribution
{1/2, 1/4, 1/8, 1/16, 1/16}, calculate the average
amount of information per source symbol. (Answer: 1.875
bits/SS)
3. Given a 3-symbol, zero memory source S (a, b, c), if the
amount of the joint information I(bc) = log(12) bits of
information, find any possible probability distribution of
the source S. (Answer: {5/12, 1/3, 1/4})
4. Consider a zero memory binary source S with P(s1) = 0.8 &
P(s2) = 0.2.
a) Construct 2nd and 3rd extensions of the source S.
b) Find the corresponding probability distribution of each extension.
c) Calculate the average amount of information per source symbol (H(S2) and
H(S3)).
The process by which the data generated by a discrete source (with a finite source
alphabet) is given an efficient representation is called source encoding. The device that performs this
representation is called a source encoder. For the source encoder to be efficient,
knowledge of the statistics of the source is required. In particular, if some source
symbols are known to be more probable than others, then this feature may
be exploited in the generation of a source code by assigning short code words to
frequent source symbols and long code words to rare source symbols, in order to
achieve a lower code rate (number of code symbols/sec) and hence use less communication
channel bandwidth in Hz for transmission, or fewer memory bits for storage. Such a
source code is called a variable-length code.
Let r represent the code radix (the number of symbols in the code alphabet): r = 2 for a binary code,
r = 8 for an octal code, r = 10 for a decimal code, and so on.
Let j be a codeword length (number of code symbols per codeword) and nj the number of codewords
of length j.
An efficient source encoder should satisfy two functional requirements:
1. The code words produced by the encoder are in binary form.
2. The source code is uniquely decodable, so that the original source sequence can
be reconstructed perfectly from the encoded binary sequence.
Source Coding Theory
Prefix (Instantaneous) Code
(Entropy Code - Lossless Data Compression)
For a source variable length code to be of practical use, the code has to
be uniquely decodable (The code and all its extensions must be unique).
This restriction ensures that for each finite sequence of symbols emitted
by the source, the corresponding sequence of code words is unique and
different from the sequence of code words corresponding to any other
source sequence. A prefix (instantaneous) code (a Subclass of uniquely
decodable) is defined as a code in which no code word is the prefix of
any other code word.
Only Code II is a prefix code, and a prefix code is always uniquely decodable.
Code III is also a uniquely decodable code, since the bit 0 indicates the
beginning of each code word, but it is not an instantaneous code. Each codeword
of an instantaneous code can be decoded directly once it has been completely
received. (Code I is not uniquely decodable; for example, when 00 is received it could be either s2 or s0 s0.)
Decision Tree
The decision tree shown is a graphical
representation of the code words. It has
an initial state and four terminal states
corresponding to the source symbols s0, s1, s2,
and s3. Source symbols must not sit at
intermediate states, so that the prefix
condition is satisfied. The decoder always
starts at the initial state. The first received bit
moves the decoder to terminal state s0
if it is 0, or else to a second decision point if
it is 1. In the latter case, the second bit moves
the decoder one step further down the tree,
either to terminal state s1 if it is 0, or else to
a third decision point if it is 1, and so on.
Once a terminal state emits its symbol, the decoder is reset to its initial state. Note
also that each bit in the received encoded sequence is examined only once.
For example, the encoded sequence 1011111000 . . . is readily decoded as the source
sequence s1 s3 s2 s0 s0 . . .
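The decoding walk described above can be written compactly; the following sketch (added for illustration) decodes the example bit stream with code II, i.e. s0 = 0, s1 = 10, s2 = 110, s3 = 111, examining each received bit exactly once.

```python
def decode_prefix(bits, code):
    """Walk the decision tree implicitly: accumulate bits until they match
    a codeword, emit the corresponding symbol, then reset to the initial state."""
    inverse = {word: sym for sym, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit                  # each received bit is examined once
        if current in inverse:          # a terminal state has been reached
            symbols.append(inverse[current])
            current = ""                # reset to the initial state
    return symbols

code_II = {"s0": "0", "s1": "10", "s2": "110", "s3": "111"}
print(decode_prefix("1011111000", code_II))
# ['s1', 's3', 's2', 's0', 's0'] - matches the example above
```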
Kraft-McMillan Inequality
Σ (i=1 to q) r^(-li) = Σ (j=1 to l) nj r^(-j) ≤ 1
Where r is the code radix (number of symbols in the code alphabet, r =2 for
binary code), nj is the # of codewords of length j and l is the maximum
codeword length. Moreover, if a prefix code has been constructed for a discrete
memoryless source with source alphabet (s1, s2, . . . , sq) and source statistics
(P1, P2 , . . . , Pq) and the codeword for symbol si has length li, i = 1, 2, . . . , q,
then the codeword lengths must satisfy the above inequality known as the
Kraft-McMillan Inequality. It does not tell us that a source code is a prefix
code. Rather, it is merely a condition on the codeword lengths of the code and
not on the code words themselves. Referring to the three codes listed in Table
9.2: Code I violates the Kraft-McMillan inequality and therefore cannot be a
prefix code, while the inequality is satisfied by both codes II
and III; however, only code II is a prefix code.
Kraft-McMillan Inequality
Prefix codes are distinguished from other uniquely decodable codes by the fact
that the end of the code word is always recognizable. Hence, the decoding of a
prefix codeword can be accomplished as soon as the binary sequence representing a
source symbol is fully received. For this reason, prefix codes are also referred
to as instantaneous codes.
Code I:
Σ (i=1 to 4) 2^(-li) = 2^(-1) + 2^(-1) + 2^(-2) + 2^(-2) = 1.5 > 1 ⇒ Code I is not uniquely decodable
or, equivalently, Σ (j=1 to 2) nj 2^(-j) = 2 × 2^(-1) + 2 × 2^(-2) = 1.5
Code II:
Σ (i=1 to 4) 2^(-li) = 2^(-1) + 2^(-2) + 2^(-3) + 2^(-3) = 1 ≤ 1 ⇒ Code II is uniquely decodable
Code III:
Σ (i=1 to 4) 2^(-li) = 2^(-1) + 2^(-2) + 2^(-3) + 2^(-4) = 15/16 ≤ 1 ⇒ Code III is uniquely decodable
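These sums are easy to automate; the helper below (an added sketch, with an arbitrary function name) evaluates the Kraft-McMillan sum for a list of codeword lengths and a radix r.

```python
def kraft_sum(lengths, r=2):
    """Kraft-McMillan sum of r**(-l) over the codeword lengths; a value <= 1 is
    necessary for a uniquely decodable code, but does not by itself make a
    given code a prefix code."""
    return sum(r ** -l for l in lengths)

print(kraft_sum([1, 1, 2, 2]))  # 1.5    -> Code I cannot be uniquely decodable
print(kraft_sum([1, 2, 3, 3]))  # 1.0    -> Code II satisfies the inequality
print(kraft_sum([1, 2, 3, 4]))  # 0.9375 -> Code III satisfies it as well
```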
Coding Efficiency
Assume the source has an alphabet with q different symbols, and that the ith symbol si
occurs with probability pi , i = 1, 2,. . . , q. Let the binary code word assigned to symbol
si by the encoder have length li measured in binits.
Then, the average code-word length, L, of the source encoder is defined as
L = Σ (i=1 to q) pi li
In physical terms, the parameter L represents the average number of binits per source
symbol used in the source encoding process. Let Lmin denote the minimum possible
value of L; then the coding efficiency of the source encoder is defined as
η = Lmin / L
With L ≥ Lmin we clearly have η ≤ 1. The source encoder is said to be efficient when η
approaches unity.
Data Compaction
A common characteristic of signals generated by physical sources is that,
in their natural form, they contain a significant amount of information
that is redundant. The transmission of such redundancy is therefore
wasteful of primary communication resources. For efficient signal
transmission, the redundant information should be removed from the
signal prior to transmission.
This operation, performed with no loss of information, is ordinarily carried out on a
signal in digital form, in which case it is called data compaction or
lossless data compression.
According to the source-coding theorem, the entropy H(S) represents a
fundamental limit on the removal of redundancy from the data. i.e. the
average number of bits per source symbol necessary to represent a
discrete memoryless source can be made as small as, but no smaller than,
the entropy H(S).
Thus, with Lmin = H(S), the efficiency of a source encoder may be
rewritten in terms of the source entropy H(S) as
η = H(S) / L
Data Compaction
Problem: Find the efficiency of source codes I, II, and III.
Code I:
L = Σ (i=1 to 4) li pi = 1 × 0.5 + 1 × 0.25 + 2 × 0.125 + 2 × 0.125 = 1.25 binits/SS
H(S) = Σ (i=1 to 4) pi log2(1/pi) = 0.5 log2(1/0.5) + 0.25 log2(1/0.25) + 0.125 log2(1/0.125) + 0.125 log2(1/0.125) = 1.75 bits/SS
η = Lmin/L = H(S)/L = 1.75/1.25 = 1.400 (an "efficiency" greater than 1 is possible only because Code I is not uniquely decodable)
Code II:
L = Σ li pi = 1 × 0.5 + 2 × 0.25 + 3 × 0.125 + 3 × 0.125 = 1.750 binits/SS
η = H(S)/L = 1.75/1.75 = 1
Code III:
L = Σ li pi = 1 × 0.5 + 2 × 0.25 + 3 × 0.125 + 4 × 0.125 = 1.875 binits/SS
η = H(S)/L = 1.75/1.875 = 0.933
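The same numbers can be reproduced in a few lines (an added check, using the Table 9.2 source statistics {0.5, 0.25, 0.125, 0.125}):

```python
import math

p = [0.5, 0.25, 0.125, 0.125]                    # source statistics (Table 9.2)
H = sum(pi * math.log2(1 / pi) for pi in p)      # H(S) = 1.75 bits/SS

codes = {"I": [1, 1, 2, 2], "II": [1, 2, 3, 3], "III": [1, 2, 3, 4]}
for name, lengths in codes.items():
    L = sum(li * pi for li, pi in zip(lengths, p))   # average codeword length
    print(name, L, round(H / L, 3))                  # efficiency eta = H(S)/L
# I   1.25  1.4    (eta > 1 only because Code I is not uniquely decodable)
# II  1.75  1.0
# III 1.875 0.933
```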
Huffman Code
An important class of prefix codes is known as Huffman codes. The Huffman code is the most
efficient prefix code for a given source (it achieves the highest possible efficiency without
coding a source extension).
The radix-r Huffman algorithm proceeds as follows:
1. The source symbols are listed in order of decreasing probability.
2. The total number of source symbols q should equal b(r - 1) + 1 for some b = 0, 1, 2, 3, ...;
otherwise, dummy symbols with zero probability are appended to the end of the list.
3. The r source symbols of lowest probability are combined into a new source symbol whose
probability equals the sum of the original r probabilities. The list of source symbols is therefore
reduced in size by (r - 1). The probability of the new symbol is placed in the list in accordance
with its value (keep the list in descending order of probability at all times).
4. The procedure is repeated until we are left with a final list of r combined symbols, and a
code symbol is assigned to each of them.
5. The codeword for each (original) source symbol is found by working backward and tracing the
sequence of code symbols assigned to that source symbol and to its
successors.
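A compact radix-r implementation of this procedure is sketched below (added for illustration; the function name and the tie-breaking order are arbitrary choices, and different tie-breaking changes the code variance but not the average length).

```python
import heapq
from itertools import count

def huffman_code(probs, r=2):
    """Radix-r Huffman code for a dict {symbol: probability}: a minimal sketch
    of the procedure above (pad with dummy zero-probability symbols, repeatedly
    combine the r least probable entries, and build each codeword by prepending
    the digit chosen at every combination step)."""
    entries = list(probs.items())
    while (len(entries) - 1) % (r - 1) != 0:      # step 2: add dummy symbols
        entries.append((None, 0.0))
    tie = count()                                 # tie-breaker; avoids comparing dicts
    heap = [(p, next(tie), {} if s is None else {s: ""}) for s, p in entries]
    heapq.heapify(heap)
    while len(heap) > 1:                          # steps 3-4: combine r at a time
        total, merged = 0.0, {}
        for digit in range(r):
            p, _, table = heapq.heappop(heap)
            total += p
            for sym, word in table.items():       # step 5: work backward
                merged[sym] = str(digit) + word
        heapq.heappush(heap, (total, next(tie), merged))
    return heap[0][2]

# Five-symbol source of Example 2: with this tie-breaking the codeword
# lengths come out as {1, 3, 3, 3, 3}, giving L = 1.6 binits/SS.
print(huffman_code({"s1": 0.7, "s2": 0.1, "s3": 0.1, "s4": 0.05, "s5": 0.05}))
```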
Example 1: Huffman Binary Code (HC)
Σ (i=1 to q) 2^(-li) = 3 × 2^(-2) + 2 × 2^(-3) = 1 ⇒ the code is a prefix code (Kraft sum equals 1)
η = H(S)/L = 2.12/2.2 = 0.96
Example 2: Huffman Binary Code (HC)
Si Pi HC1
S1 0.7 0 s1 0.7 0 s1 0.7 0 s1 0.7 0
S2 0.1 100 s45 0.1 11 s23 0.2 10 s2-5 0.3 1
S3 0.1 101 s2 0.1 100 s45 0.1 11
S4 0.05 110 s3 0.1 101
S5 0.05 111
Si Pi HC2
S1 0.7 0 s1 0.7 0 s1 0.7 0 s1 0.7 0
S2 0.1 11 s2 0.1 11 s345 0.2 10 s2-5 0.3 1
S3 0.1 100 s3 0.1 100 s2 0.1 11
S4 0.05 1010 s45 0.1 101
S5 0.05 1011
Problem 1
Consider a zero memory binary source S with P(s1) = 0.8 & P(s2) = 0.2 :
a) Construct 2nd and 3rd extensions of the source and find the corresponding probability
distribution of each extension and find the entropy.
b) Write down the binary code of the 2nd extension of the source [T ≡ S²] using each of the
following binary decision trees:
c) Find the average code word length L for each binary code.
d) Encode the following source symbol stream using each of the above binary code:
s2 s1 s1 s1 s1 s2 s2 s2 s1 s1
e) Calculate the binit rate in binits/sec. of each one if the source S emits 2000 symbols/sec.
Problem 2
Consider a zero memory statistical independent binary source S with two source symbols s1 and
s2. If P(s1) = 0.85, calculate:
a) The amount of information of source symbol s1 = I(s1) in bit of information.
b) The amount of information of source symbol s2 = I(s2) in bit of information.
c) The statistical average of information of the source S = H(S) in bits/source symbol
d) The joint information of the events: A={s1s2} and B={s1s1} in Hartley.
e) The conditional information of the event: A={s1/ s2} in Nat.
Problem 2 - Solution
Consider a zero memory statistical independent binary source S with two source symbols s1 and
s2. If P(s1) = 0.85, calculate:
a) The amount of information of source symbol s1, I(s1), in bits of information:
I(s1) = log2(1/0.85) = 0.2345 bits
b) The amount of information of source symbol s2, I(s2), in bits of information:
I(s2) = log2(1/0.15) = 2.737 bits
c) The statistical average of information of the source S, H(S), in bits/source symbol:
H(S) = 0.85 × 0.2345 + 0.15 × 2.737 = 0.61 bits/SS
d) The joint information of the events A = {s1 s2} and B = {s1 s1}, in hartleys:
I(A) = log10(1/(0.85 × 0.15)) = log10(1/0.1275) = 0.8945 hartley
I(B) = log10(1/(0.85 × 0.85)) = log10(1/0.7225) = 0.1412 hartley
e) The conditional information of the event A = {s1 / s2}, in nats:
P(s1/s2) = P(s1) (by statistical independence)
I(A) = ln(1/0.85) = 0.1625 nat
Consider 3-symbol, zero memory source S (a, b, c) with P(a) = 0.8 and P(b) = 0.05.
1) Encode the source S symbols using a binary code. Calculate the average code
length L.
2) Calculate the source entropy H(S). Calculate the code efficiency Ξ· = H(S)/L
3) Construct the second extension of the source [T ≡ S²] and find its probability
distribution.
4) Write down the binary code of the source (T) symbols using each of the following
binary decision trees:
5) Calculate the average code length of source (T) and the code efficiency for each
code (LI, πœ‚I, LII, πœ‚II)
6) Encode the following source symbol stream using each of the above binary code
(b a c c a a b b a c b a )
7) Calculate the binit rate in binits/sec. of each code if the source S emits 3000
symbols/sec.
Problem 3
Consider 3-symbol, zero memory source S (a, b, c) with P(a) = 0.8 and P(b) = 0.05.
1) Encode the source S symbols using a binary code. Calculate the average code length
P(a) = 0.8
P(b) = 0.05
P(c) = 0.15
0.8  a  0
0.05 b  10
0.15 c  11
(Alternative assignments: L = 1 × 0.8 + 2 × 0.05 + 3 × 0.15 = 1.35, or
L = 1 × 0.8 + 3 × 0.05 + 2 × 0.15 = 1.25 binits/SS.)
The chosen code gives L = 1 × 0.8 + 2 × 0.05 + 2 × 0.15 = 0.8 + 2 × 0.2 = 1.2 binits/SS
Problem 3 - Solution
2) Calculate the source entropy H(S). Calculate the code efficiency η = H(S)/L
H(S) = 0.8 log(1/0.8) + 0.05 log(1/0.05) + 0.15 log(1/0.15) = 0.884 bits/SS
η = H(S)/L = 0.884/1.2 = 73.68%
3) Construct the second extension of the source [T ≡ S²] and find its probability
distribution.
P(t1) = P(aa) = 0.8² = 0.64
P(t2) = P(ab) = 0.8 × 0.05 = 0.04
P(t3) = P(ac) = 0.8 × 0.15 = 0.12
P(t4) = P(bb) = 0.05² = 0.0025
P(t5) = P(ba) = 0.8 × 0.05 = 0.04
P(t6) = P(bc) = 0.05 × 0.15 = 0.0075
P(t7) = P(cc) = 0.15² = 0.0225
P(t8) = P(cb) = 0.15 × 0.05 = 0.0075
P(t9) = P(ca) = 0.8 × 0.15 = 0.12
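These nine probabilities can also be generated programmatically (a small added check):

```python
import math
from itertools import product

p = {"a": 0.8, "b": 0.05, "c": 0.15}
T = {x + y: p[x] * p[y] for x, y in product(p, repeat=2)}
print(T)                            # {'aa': 0.64, 'ab': 0.04..., 'ac': 0.12, ...}
print(round(sum(T.values()), 10))   # 1.0 - the extension probabilities sum to one
H_T = sum(pt * math.log2(1 / pt) for pt in T.values())
print(round(H_T, 3))                # 1.768 bits/2SS, i.e. 2 x H(S) with H(S) = 0.884
```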
Problem 3 - Solution
4) Write down the binary code of the source (T) symbols using each of the following
binary decision trees.
Problem 3 - Solution
5) Calculate the average code length of source (T) and the code efficiency for each code
(LI, πœ‚I, LII, πœ‚II)
Code I word lengths: {2, 2, 3, 3, 3, 4, 5, 6, 6}
L = 2 × 0.76 + 3 × 0.2 + 4 × 0.0225 + 5 × 0.0075 + 6 × 0.01 = 2.3075 binits/2SS
ηI = H(T)/L = 2H(S)/L = 2 × 0.884/2.3075 = 76.62%
Code II word lengths: {2, 3, 3, 3, 3, 3, 4, 5, 5}
L = 2 × 0.64 + 3 × 0.3425 + 4 × 0.0075 + 5 × 0.01 = 2.3875 binits/2SS
ηII = H(T)/L = 2H(S)/L = 2 × 0.884/2.3875 = 74.05%
6) Encode the following source symbol stream using each of the above binary code:
b a c c a a b b a c b a
T: t5 t7 t1 t4 t3 t5
Code I 000 0100 11 010111 10 000
Code II 000 011 10 11111 110 000
Problem 3 - Solution
7) Calculate the binit rate in binits/sec. of each code if the source S emits 3000
symbols/sec.
(binit rate = extended-source symbol rate × average code length; the 2nd extension
emits 3000/2 = 1500 symbols/sec.)
Code I binit rate = 2.3075 × 1500 = 3.461 kb/sec
Code II binit rate = 2.3875 × 1500 = 3.581 kb/sec
It is noteworthy that the binit rate without extension = 1.2 × 3000 = 3600 binits/sec = 3.6 kb/sec
Problem 3 - Solution
Can an instantaneous (prefix) code be constructed with the
following codeword lengths? Find the corresponding code using
the decision tree for each eligible case.
a) {1,2,3,3,4,4,5,5}, r = 2
b){1,1,2,2,3,3,4,4}, r = 3
c) {1,1,1,2,2,2,2}, r = 4
Problem 4
Problem 4 - Solution
A zero memory source S emits one of eight symbols
randomly every 1 microsecond with probabilities
{0.13, 0.2, 0.16, 0.3, 0.07, 0.05, 0.03, 0.06}
1. Calculate the source entropy H(S).
2. Construct a Huffman binary code.
3. Calculate the code efficiency.
4. Find the encoder output average binit rate.
Problem 5
A zero memory source S emits one of five symbols
randomly every 2 microsecond with probabilities
{0.25, 0.25, 0.2, 0.15, 0.15}
1. Calculate the source entropy H(S).
2. Construct a Huffman binary code.
3. Calculate the average length of this code.
4. Calculate the code efficiency.
5. Find the encoder output average binit rate.
Problem 6
A zero memory source S emits one of five symbols
randomly every 2 microsecond with probabilities
{0.25, 0.25, 0.2, 0.15, 0.15}
1. Construct a Huffman ternary code.
2. Calculate the average length of this code.
3. Calculate the code efficiency.
4. Calculate the code redundancy (γ = 1 - η).
Problem 7
If r β‰₯ 3, we may not have a sufficient number of symbols so that we can
combine them r at a time. In such a case, we add dummy symbols to the end of
the set of symbols. The dummy symbols have probability 0 and are inserted to
fill the tree. Since at each stage of the reduction, the number of symbols is
reduced by r βˆ’ 1, we want the total number of symbols to be 1 + k(r βˆ’ 1), where
k is the number of merges. Hence, we add enough dummy symbols so that the
total number of symbols is of this form. For example:
A zero memory source S emits one of six symbols randomly with probabilities
{0.25, 0.25, 0.2, 0.1, 0.1, 0.1}
1. Construct a Huffman ternary code.
2. Calculate the average length of this code.
3. Calculate the code efficiency.
4. Calculate the code redundancy (γ = 1 - η).
Problem 8
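As an aside, the number of dummy symbols required by step 2 of the Huffman procedure can be computed directly (an added sketch with an arbitrary helper name):

```python
def dummies_needed(q, r):
    """Dummy symbols to append so the total count has the form 1 + k(r - 1)."""
    return (-(q - 1)) % (r - 1)

print(dummies_needed(6, 3))  # 1: the six-symbol ternary source above is padded to 7 symbols
print(dummies_needed(5, 3))  # 0: the five-symbol source of Problem 7 needs none
print(dummies_needed(8, 2))  # 0: a binary Huffman code never needs dummy symbols
```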
Complete the following probability distribution of the second
extension T of a zero-memory source S with 3 symbols {a, b,
c}.
Problem 9
T      S      Prob
p(t1)  p(aa)  0.25
p(t2)  p(ab)
p(t3)  p(ac)
p(t4)  p(ba)
p(t5)  p(bb)
p(t6)  p(bc)
p(t7)  p(ca)
p(t8)  p(cb)
p(t9)  p(cc)  0.01
1. Find the zero memory source S probability
distribution.
2. Calculate the source entropy H(T).
3. Find the ternary Huffman code for the above
source second extension T and calculate the code
efficiency and redundancy. (Hint: you do not
need to add dummy symbol with zero
probability)
Code Variance
As a measure of the variability in the codeword lengths of a source code,
the variance of the codeword lengths about the average length L, taken over
the ensemble of source symbols, is defined as
σ² = Σ (k=0 to K-1) pk (lk - L)²
where p0, p1, ..., pK-1 are the source statistics, and lk is the length of the
code word assigned to source symbol sk. It is usually found that when a
combined symbol is moved as high as possible in the list, the resulting Huffman
code has a significantly smaller variance σ² (which is preferable) than when
it is moved as low as possible. On this basis, it is reasonable to choose
the former Huffman code over the latter.
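A quick numerical check (an added sketch; the codeword lengths {1, 3, 3, 3, 3} for HC1 and {1, 2, 3, 4, 4} for HC2 are read from the Example 2 tables) confirms that the code with the combined symbol moved high has the smaller variance:

```python
p = [0.7, 0.1, 0.1, 0.05, 0.05]          # source statistics of Example 2

def length_stats(lengths):
    L = sum(pk * lk for pk, lk in zip(p, lengths))                 # average length
    var = sum(pk * (lk - L) ** 2 for pk, lk in zip(p, lengths))    # variance
    return round(L, 3), round(var, 3)

print(length_stats([1, 3, 3, 3, 3]))  # HC1, combined symbol moved high: (1.6, 0.84)
print(length_stats([1, 2, 3, 4, 4]))  # HC2, combined symbol moved low:  (1.6, 1.04)
```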

More Related Content

What's hot

Convolution Codes
Convolution CodesConvolution Codes
Convolution CodesPratishtha Ram
Β 
Data encoding and modulation
Data encoding and modulationData encoding and modulation
Data encoding and modulationShankar Gangaju
Β 
Orthogonal Frequency Division Multiplexing (OFDM)
Orthogonal Frequency Division Multiplexing (OFDM)Orthogonal Frequency Division Multiplexing (OFDM)
Orthogonal Frequency Division Multiplexing (OFDM)Gagan Randhawa
Β 
Modulation techniques
Modulation techniquesModulation techniques
Modulation techniquesSathish Kumar
Β 
Digital Modulation Unit 3
Digital Modulation Unit 3Digital Modulation Unit 3
Digital Modulation Unit 3Anil Nigam
Β 
Equalization techniques
Equalization techniquesEqualization techniques
Equalization techniquesAanchalKumari4
Β 
3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and Systems3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and SystemsINDIAN NAVY
Β 
Digital Communication: Channel Coding
Digital Communication: Channel CodingDigital Communication: Channel Coding
Digital Communication: Channel CodingDr. Sanjay M. Gulhane
Β 
Digital modulation techniques...
Digital modulation techniques...Digital modulation techniques...
Digital modulation techniques...Nidhi Baranwal
Β 
3. free space path loss model part 1
3. free space path loss model   part 13. free space path loss model   part 1
3. free space path loss model part 1JAIGANESH SEKAR
Β 
Channel capacity
Channel capacityChannel capacity
Channel capacityPALLAB DAS
Β 
Digital communication system
Digital communication systemDigital communication system
Digital communication systembabak danyal
Β 
Pulse Code Modulation (PCM)
Pulse Code Modulation (PCM)Pulse Code Modulation (PCM)
Pulse Code Modulation (PCM)Arun c
Β 
microwave-tubes
 microwave-tubes microwave-tubes
microwave-tubesATTO RATHORE
Β 

What's hot (20)

Convolution Codes
Convolution CodesConvolution Codes
Convolution Codes
Β 
Tdm and fdm
Tdm and fdmTdm and fdm
Tdm and fdm
Β 
Data encoding and modulation
Data encoding and modulationData encoding and modulation
Data encoding and modulation
Β 
Orthogonal Frequency Division Multiplexing (OFDM)
Orthogonal Frequency Division Multiplexing (OFDM)Orthogonal Frequency Division Multiplexing (OFDM)
Orthogonal Frequency Division Multiplexing (OFDM)
Β 
Modulation techniques
Modulation techniquesModulation techniques
Modulation techniques
Β 
Sampling Theorem and Band Limited Signals
Sampling Theorem and Band Limited SignalsSampling Theorem and Band Limited Signals
Sampling Theorem and Band Limited Signals
Β 
Digital Modulation Unit 3
Digital Modulation Unit 3Digital Modulation Unit 3
Digital Modulation Unit 3
Β 
Spread spectrum
Spread spectrumSpread spectrum
Spread spectrum
Β 
Equalization techniques
Equalization techniquesEqualization techniques
Equalization techniques
Β 
3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and Systems3.Frequency Domain Representation of Signals and Systems
3.Frequency Domain Representation of Signals and Systems
Β 
Digital Communication: Channel Coding
Digital Communication: Channel CodingDigital Communication: Channel Coding
Digital Communication: Channel Coding
Β 
Digital modulation techniques...
Digital modulation techniques...Digital modulation techniques...
Digital modulation techniques...
Β 
3. free space path loss model part 1
3. free space path loss model   part 13. free space path loss model   part 1
3. free space path loss model part 1
Β 
Channel capacity
Channel capacityChannel capacity
Channel capacity
Β 
BIT Error Rate
BIT Error RateBIT Error Rate
BIT Error Rate
Β 
Digital communication system
Digital communication systemDigital communication system
Digital communication system
Β 
Multiplexing : FDM
Multiplexing : FDMMultiplexing : FDM
Multiplexing : FDM
Β 
information theory
information theoryinformation theory
information theory
Β 
Pulse Code Modulation (PCM)
Pulse Code Modulation (PCM)Pulse Code Modulation (PCM)
Pulse Code Modulation (PCM)
Β 
microwave-tubes
 microwave-tubes microwave-tubes
microwave-tubes
Β 

Similar to Information Theory and coding - Lecture 2

INFORMATION_THEORY.pdf
INFORMATION_THEORY.pdfINFORMATION_THEORY.pdf
INFORMATION_THEORY.pdftemmy7
Β 
Information Theory - Introduction
Information Theory  -  IntroductionInformation Theory  -  Introduction
Information Theory - IntroductionBurdwan University
Β 
Information Theory and coding - Lecture 3
Information Theory and coding - Lecture 3Information Theory and coding - Lecture 3
Information Theory and coding - Lecture 3Aref35
Β 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxKIRUTHIKAAR2
Β 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxKIRUTHIKAAR2
Β 
Information Theory MSU-EEE.ppt
Information Theory MSU-EEE.pptInformation Theory MSU-EEE.ppt
Information Theory MSU-EEE.pptrobomango
Β 
Information Theory Coding 1
Information Theory Coding 1Information Theory Coding 1
Information Theory Coding 1Mahafuz Aveek
Β 
DC Lecture Slides 1 - Information Theory.ppt
DC Lecture Slides 1 - Information Theory.pptDC Lecture Slides 1 - Information Theory.ppt
DC Lecture Slides 1 - Information Theory.pptshortstime400
Β 
Information theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxInformation theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxprernaguptaec
Β 
information_theory_1.ppt
information_theory_1.pptinformation_theory_1.ppt
information_theory_1.pptTrongMinhHoang1
Β 
Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Xuan Phu Nguyen
Β 
Huffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptHuffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptPrincessSaro
Β 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
Β 
Module 1 till huffman coding5c-converted.pdf
Module 1 till huffman coding5c-converted.pdfModule 1 till huffman coding5c-converted.pdf
Module 1 till huffman coding5c-converted.pdfAmoghR3
Β 
Communication engineering -UNIT IV .pptx
Communication engineering -UNIT IV .pptxCommunication engineering -UNIT IV .pptx
Communication engineering -UNIT IV .pptxManoj Kumar
Β 
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfUnit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfvani374987
Β 
Introduction to SMPC
Introduction to SMPCIntroduction to SMPC
Introduction to SMPCsecurityxploded
Β 
Image compression
Image compressionImage compression
Image compressionBassam Kanber
Β 

Similar to Information Theory and coding - Lecture 2 (20)

INFORMATION_THEORY.pdf
INFORMATION_THEORY.pdfINFORMATION_THEORY.pdf
INFORMATION_THEORY.pdf
Β 
Information Theory - Introduction
Information Theory  -  IntroductionInformation Theory  -  Introduction
Information Theory - Introduction
Β 
Information Theory and coding - Lecture 3
Information Theory and coding - Lecture 3Information Theory and coding - Lecture 3
Information Theory and coding - Lecture 3
Β 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptx
Β 
Unit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptxUnit-1_Digital_Communication-Information_Theory.pptx
Unit-1_Digital_Communication-Information_Theory.pptx
Β 
Information Theory MSU-EEE.ppt
Information Theory MSU-EEE.pptInformation Theory MSU-EEE.ppt
Information Theory MSU-EEE.ppt
Β 
Information Theory Coding 1
Information Theory Coding 1Information Theory Coding 1
Information Theory Coding 1
Β 
DC Lecture Slides 1 - Information Theory.ppt
DC Lecture Slides 1 - Information Theory.pptDC Lecture Slides 1 - Information Theory.ppt
DC Lecture Slides 1 - Information Theory.ppt
Β 
Information theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptxInformation theory & coding PPT Full Syllabus.pptx
Information theory & coding PPT Full Syllabus.pptx
Β 
information_theory_1.ppt
information_theory_1.pptinformation_theory_1.ppt
information_theory_1.ppt
Β 
Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01Itblock2 150209161919-conversion-gate01
Itblock2 150209161919-conversion-gate01
Β 
Huffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptHuffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.ppt
Β 
UNIT-2.pdf
UNIT-2.pdfUNIT-2.pdf
UNIT-2.pdf
Β 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Β 
Module 1 till huffman coding5c-converted.pdf
Module 1 till huffman coding5c-converted.pdfModule 1 till huffman coding5c-converted.pdf
Module 1 till huffman coding5c-converted.pdf
Β 
Communication engineering -UNIT IV .pptx
Communication engineering -UNIT IV .pptxCommunication engineering -UNIT IV .pptx
Communication engineering -UNIT IV .pptx
Β 
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdfUnit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Unit I DIGITAL COMMUNICATION-INFORMATION THEORY.pdf
Β 
Introduction to smpc
Introduction to smpc Introduction to smpc
Introduction to smpc
Β 
Introduction to SMPC
Introduction to SMPCIntroduction to SMPC
Introduction to SMPC
Β 
Image compression
Image compressionImage compression
Image compression
Β 

Recently uploaded

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
Β 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
Β 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsvanyagupta248
Β 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdfKamal Acharya
Β 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
Β 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
Β 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
Β 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
Β 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...Amil baba
Β 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
Β 
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...vershagrag
Β 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
Β 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfsumitt6_25730773
Β 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxSCMS School of Architecture
Β 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
Β 
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptxrouholahahmadi9876
Β 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...HenryBriggs2
Β 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
Β 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxMuhammadAsimMuhammad6
Β 

Recently uploaded (20)

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
Β 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
Β 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
Β 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Β 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
Β 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
Β 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Β 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
Β 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
Β 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
Β 
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...
πŸ’šTrustworthy Call Girls Pune Call Girls Service Just Call πŸ‘πŸ‘„6378878445 πŸ‘πŸ‘„ Top...
Β 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Β 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
Β 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
Β 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
Β 
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx
Β 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
Β 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Β 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Β 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Β 

Information Theory and coding - Lecture 2

  • 1. Mustaqbal University College of Engineering &Computer Sciences Electronics and Communication Engineering Department Course: EE301: Probability Theory and Applications Prerequisite: Stat 219 Text Book: B.P. Lathi, β€œModern Digital and Analog Communication Systems”, 3th edition, Oxford University Press, Inc., 1998 Reference: A. Papoulis, Probability, Random Variables, and Stochastic Processes, Mc-Graw Hill, 2005 Dr. Aref Hassan Kurdali
  • 2. Application: Information Theory β€’ In the context of communications, information theory deals with mathematical modeling and analysis of a communication system rather than with physical sources and physical channels. β€’ In particular, it provides answers to two fundamental questions (among others): 1) What is the minimum number of binits (binary digits) per source symbol required to fully represent the source in acceptable quality ? (Most efficient source coding) 2) What is the ultimate (highest) transmission binit rate for reliable communication (no error transmission) over a noisy channel? (Most efficient channel coding)
  • 3. The answers to these two questions lie in the entropy of a source and the capacity of a channel respectively. Entropy is defined in terms of the probabilistic behavior of a source of information (How much average uncertainty of an information source?); it is so named in respect to the parallel use of this concept in thermodynamics (How much average instability of a physical source?). Capacity is defined as the basic ability of a channel to transmit information; it is naturally related to the noise characteristics of the channel. A remarkable result that emerges from information theory is that if the entropy of the source is less than the capacity of the channel, then error-free communication over the channel can be achieved.
  • 4. The discrete source output is modeled as a discrete random variable, S, which takes on symbols from a fixed finite alphabet S={s1, s2, s3, .........., sq} With probability distribution P(S= si) = pi, i=1,2, 3,........,q Where A discrete memoryless source (zero memory source) emits statistically independent symbols during successive signaling intervals where the symbol emitted at any time is independent of previous emitted symbols. οƒ₯ ο€½ ο€½ q i i p 1 1 Discrete Memoryless Source
  • 5. Information Measure How much information I(a) associated with an event β€˜a’ whose probability p(a) = p?. The information measure I(a) should have several properties: 1. Information is a non-negative quantity: I(a) β‰₯ 0. 2. If an event has probability 1, we get no information from the occurrence of that event, i.e. I(a) = 0 if p (a) =1. 3. If two independent events (a & b) occur (whose joint probability is the product of their individual probabilities i.e. p(ab) = p(a)p(b)), then the total information we get from observing these two events is the sum of the two informations: I(ab) = I(a)+I(b). (This is the critical property . . . ) 4. The information measure should be a continuous (and, in fact, monotonic) function of the probability (slight changes in probability should result in slight changes in information).
  • 6. Since, I(a2) = I(aa) = I(a)+I(a) = 2 I(a) Thus, by continuity, we get, for 0 < p(a) ≀ 1, and n > 0 as real number: I(an) = n * I(a) From this, The information can be measured by the logarithm function, i.e. I(a) = βˆ’logb(p(a)) = logb(1/p(a)) for some base b. The base b determines the unit of information used. The unit can be changed by changing the base, using the following formula: For b1, b2 & x > 0, Therefore, logb1 (x) = logb2(x) / logb2(b1)
  • 7. The occurrence of an event S = sk either provides some or no information, but never brings about a loss of information. The less probable an event is, the more information we gain when it occurs. Uncertainty, Surprise, and Information The amount of (uncertainty, surprise), information gained (before, at) after observing the event S = sk, which occurs with probability pk, is therefore defined using the logarithmic function
  • 8. Units of information The base of the logarithm in Equation (9.4) is quite arbitrary. Nevertheless, it is the standard practice today to use a logarithm to base 2. The resulting unit of information is called the bit When pk = 1/2, we have I(sk) = 1 bit. Hence, one bit is the amount of information that we gain when one of two possible and equally likely (i,e., equiprobable) events occurs. If a logarithm to base 10 is used, the resulting unit of information is called the hartly. When pk = 1/10, we have I(sk) = 1 hartly. A logarithm to base e can also be used, the resulting unit of information is called the nat. When pk = 1/e, we have I(sk) = 1 nat.
  • 9. Source Entropy H(S) the entropy of a discrete memoryless source H(S) = It is the average amount of information content per source symbol. The source entropy is bounded as follows: 0 ≀ H(S) ≀ log q where q is the radix (number of symbols) of the alphabet of the source. Furthermore, we may make two statements: 1. H(S) = 0, if and only if the probability pi = 1 for some i, and the remaining probabilities in the set are all zero; this lower bound on entropy corresponds to no uncertainty. 2. H(S) = log q, if and only if pi = 1/q for all i (i.e., all the symbols in the alphabet are equiprobable); this upper bound on entropy corresponds to maximum uncertainty. οƒ₯ οƒ₯ ο€½ ο€½ ο€½ q i i i q i i i p p I p 1 1 ) / 1 log(
  • 10. Consider a binary source for which symbol 0 occurs with probability p0 and symbol 1 with probability pl = 1 – p0. The source is memoryless so that successive symbols emitted by the source are statistically independent. The entropy of the binary source is usually called as the entropy function h(p0) = p0 log (1/p0) + (1-p0) log (1/(1-p0))
  • 11. we often find it useful to consider blocks rather than individual symbols, with each block consisting of n successive source symbols. We may view each such block as being produced by an extended source with a source alphabet that has qn distinct blocks, where q is the number of distinct symbols in the source alphabet of the original source. a)In the case of a discrete memoryless source, the source symbols are statistically independent. Hence, the probability of an extended source symbol is equal to the product of the probabilities of the n original source symbols constituting the particular extended source symbol. Thus, it may be intuitively to expect that H(Sn), the entropy of the extended source, is equal to n times H(S) the entropy of the original source. That is, we may write H(Sn) = n H(S)
  • 12. Problems 1. Find the entropy of a 7-symbol source at uniform distribution. (Answer: 2.81 bits of information/SS) 2. Given a five-symbol source with the following probability distribution {1/2, 1/4, 1/8, 1/16, 1/16, calculate the average amount of information per source symbol. (Answer: 1.875 bits/SS) 3. Given a 3-symbol, zero memory source S (a, b, c). If the amount of the joint information I(bc) = log(12) bits of information. Find any possible source probability distribution the source S. (Answer: {5/12, 1/3, 1/4} ) 4. Consider a zero memory binary source S with P(s1) = 0.8 & P(s2) = 0.2. a) Construct 2nd and 3rd extensions of the source S. b) Find the corresponding probability distribution of each extension. c) Calculate the average amount of information per source symbol (H(S2) and H(S3)).
  • 13. The process by which an efficient representation of data generated by a discrete source ( with finite source alphabet) is called source encoding. The device that performs this representation is called a source encoder. For the source encoder to be efficient, knowledge of the statistics of the source is required. In particular, if some source alphabets (symbols) are known to be more probable than others, then this feature may be exploited in the generation of a source code by assigning short code words to frequent source symbols, and long code words to rare source symbols in order to achieve lower code rate (# of code symbols/sec.)and hence using lower communication channel bandwidth in Hz for transmission or less memory bits for storage. Such a source code is called a variable-length code. Let r represents the code radix (number of code alphabet), ( r =2 for binary code, r = 8 for octal code and r = 10 for decimal code and so on). j is the codeword length (# of code symbol per codeword) and nj is the # of codewords of length j. An efficient source encoder should satisfy two functional requirements: 1. The code words produced by the encoder are in binary form. 2. The source code is uniquely decodable, so that the original source sequence can be reconstructed perfectly from the encoded binary sequence. Source Coding Theory
  • 14. Prefix (Instantaneous) Code (Entropy Code - Lossless Data Compression) For a source variable length code to be of practical use, the code has to be uniquely decodable (The code and all its extensions must be unique). This restriction ensures that for each finite sequence of symbols emitted by the source, the corresponding sequence of code words is unique and different from the sequence of code words corresponding to any other source sequence. A prefix (instantaneous) code (a Subclass of uniquely decodable) is defined as a code in which no code word is the prefix of any other code word. Only Code II is a prefix code which is always uniquely decodable code. Code III is also an uniquely decodable code since the bit 0 indicates the beginning of each code word but not an instantaneous code. Each codeword of an instantaneous code can be directly decoded once it is completely received. (Code I not decodable, example: when 00 is received, it will be s2 or s0 s0)
  • 15. Decision Tree The shown decision tree is a graphical representation of the code words which has an initial state and four terminal states corresponding to source symbols so, s1, s2, and s3. Source symbols must not be in intermediate states to satisfy the prefix condition. The decoder always starts at the initial state. The first received bit moves the decoder to the terminal state so if it is 0, or else to a second decision point if it is 1. In the latter case, the second bit moves the decoder one step further down the tree, either to terminal state s2 if it is 0, or else to a third decision point if it is 1, and so on. Once each terminal state emits its symbol, the decoder is reset to its initial state. Note also that each bit in the received encoded sequence is examined only once. For example, the encoded sequence 1011111000 . . . is readily decoded as the source sequence sl s3 s2 so so.. . .
  • 16. Kraft-McMillan Inequality 1 1 1 ο‚£ ο€½ ο€­ ο€½ ο€½ ο€­ οƒ₯ οƒ₯ j l j j q i l r n r i Where r is the code radix (number of symbols in the code alphabet, r =2 for binary code), nj is the # of codewords of length j and l is the maximum codeword length. Moreover, if a prefix code has been constructed for a discrete memoryless source with source alphabet (s1, s2, . . . , sq) and source statistics (P1, P2 , . . . , Pq) and the codeword for symbol si has length li, i = 1, 2, . . . , q, then the codeword lengths must satisfy the above inequality known as the Kraft-McMillan Inequality. It does not tell us that a source code is a prefix code. Rather, it is merely a condition on the codeword lengths of the code and not on the code words themselves. Referring to the three codes listed in Table 9.2:Code I violates the Kraft-McMillan inequality; it cannot therefore be a prefix code while, the Kraft-McMillan inequality is satisfied by both codes II and III; but only code II is a prefix code.
  • 17. Kraft-McMillan Inequality Prefix codes are distinguished from other uniquely decodable codes by the fact that the end of the code word is always recognizable. Hence, the decoding of a prefix can be accomplished as soon as the binary sequence representing a source symbol is fully received. For this reason, prefix codes are also referred to as instantaneous codes.
  • 18. Code I: 𝑖=1 π‘ž π‘Ÿβˆ’π‘™π‘– = 𝑖=1 4 2βˆ’π‘™π‘– = 2βˆ’1 + 2βˆ’1 + 2βˆ’2 + 2βˆ’2 = 1.5 ⇨ πΆπ‘œπ‘‘π‘’ 𝐼 𝑖𝑠 𝒏𝒐𝒕 π‘’π‘›π‘–π‘žπ‘’π‘’π‘™π‘¦ π‘‘π‘’π‘π‘œπ‘‘π‘Žπ‘π‘™π‘’ π‘œπ‘Ÿ 𝑗=1 𝑙 π‘›π‘—π‘Ÿβˆ’π‘— = 𝑗=1 2 π‘›π‘—π‘Ÿβˆ’π‘— = 2 Γ— 2βˆ’1 + 2 Γ— 2βˆ’2 = 1.5 Code II: 𝑖=1 π‘ž π‘Ÿβˆ’π‘™π‘– = 𝑖=1 4 2βˆ’π‘™π‘– = 2βˆ’1 + 2βˆ’2 + 2βˆ’3 + 2βˆ’3 = 1 ≀ 1 ⇨ πΆπ‘œπ‘‘π‘’ 𝐼𝐼 𝑖𝑠 π‘’π‘›π‘–π‘žπ‘’π‘’π‘™π‘¦ π‘‘π‘’π‘π‘œπ‘‘π‘Žπ‘π‘™π‘’ Code III: 𝑖=1 π‘ž π‘Ÿβˆ’π‘™π‘– = 𝑖=1 4 2βˆ’π‘™π‘– = 2βˆ’1 + 2βˆ’2 + 2βˆ’3 + 2βˆ’4 = 15 16 ≀ 1 ⇨ πΆπ‘œπ‘‘π‘’ 𝐼𝐼𝐼 𝑖𝑠 π‘’π‘›π‘–π‘žπ‘’π‘’π‘™π‘¦ π‘‘π‘’π‘π‘œπ‘‘π‘Žπ‘π‘™π‘’
  • 19. Coding Efficiency Assume the source has an alphabet with q different symbols, and that the ith symbol si occurs with probability pi , i = 1, 2,. . . , q. Let the binary code word assigned to symbol si by the encoder have length li measured in binits. Then, the average code-word length, L, of the source encoder is defined as In physical terms, the parameter L represents the average number of binits per source symbol used in the source encoding process. Let Lmin denote the minimum possible value of L, then, the coding efficiency of the source encoder is defined as Ξ· = Lmin/ L With L β‰₯ Lmin we clearly have Ξ· ≀1. The source encoder is said to be efficient when Ξ· approaches unity. οƒ₯ ο€½ ο€½ q i i i p l L 1
  • 20. Data Compaction A common characteristic of signals generated by physical sources is that, in their natural form, they contain a significant amount of information that is redundant. The transmission of such redundancy is therefore wasteful of primary communication resources. For efficient signal transmission, the redundant information should be removed from the signal prior to transmission. This operation, performed with no loss of information, is ordinarily carried out on a signal in digital form, in which case it is called data compaction or lossless data compression. According to the source-coding theorem, the entropy H(S) represents a fundamental limit on the removal of redundancy from the data; i.e. the average number of bits per source symbol necessary to represent a discrete memoryless source can be made as small as, but no smaller than, the entropy H(S). Thus, with Lmin = H(S), the efficiency of a source encoder may be rewritten in terms of the source entropy H(S) as η = H(S)/L.
  • 21. Data Compaction
Code I: L = Σ_{i=1}^{4} l_i p_i = 1 × 0.5 + 1 × 0.25 + 2 × 0.125 + 2 × 0.125 = 1.25
H(S) = Σ_{i=1}^{4} p_i log2(1/p_i) = 0.5 log2(1/0.5) + 0.25 log2(1/0.25) + 0.125 log2(1/0.125) + 0.125 log2(1/0.125) = 1.75
η = Lmin/L = H(S)/L = 1.75/1.25 = 1.400 (a value greater than 1 is possible here only because Code I is not uniquely decodable)
Code II: L = Σ_{i=1}^{4} l_i p_i = 1 × 0.5 + 2 × 0.25 + 3 × 0.125 + 3 × 0.125 = 1.750; η = H(S)/L = 1.75/1.75 = 1
Code III: L = Σ_{i=1}^{4} l_i p_i = 1 × 0.5 + 2 × 0.25 + 3 × 0.125 + 4 × 0.125 = 1.875; η = H(S)/L = 1.75/1.875 = 0.933
Problem: Find the efficiency of source codes I, II and III.
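The figures on this slide can be reproduced in a few self-contained lines; the probabilities {0.5, 0.25, 0.125, 0.125} and the length lists are the ones used in the calculation above.

```python
from math import log2

probs = [0.5, 0.25, 0.125, 0.125]
H = sum(p * log2(1 / p) for p in probs)                      # H(S) = 1.75 bits/symbol
for name, lengths in [("Code I", [1, 1, 2, 2]),
                      ("Code II", [1, 2, 3, 3]),
                      ("Code III", [1, 2, 3, 4])]:
    L = sum(p * l for p, l in zip(probs, lengths))
    print(f"{name}: L = {L:.3f}, eta = {H / L:.3f}")
# Code I: L = 1.250, eta = 1.400; Code II: L = 1.750, eta = 1.000; Code III: L = 1.875, eta = 0.933
```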
  • 22. Huffman Code An important class of prefix codes is known as Huffman codes. The Huffman code is, by construction, the most efficient code for a given source (highest possible efficiency without coding of a source extension). The Huffman algorithm of radix r proceeds as follows: 1. The source symbols are listed in order of decreasing probability. 2. The total number of source symbols q must equal b(r − 1) + 1 for some b = 0, 1, 2, 3, …; otherwise, dummy symbols with zero probability are appended to the end of the list. 3. The r source symbols of lowest probability are combined into a new source symbol whose probability is the sum of the original r probabilities; the list of source symbols is thereby reduced in size by (r − 1). The probability of the new symbol is placed in the list in accordance with its value (keep the list in descending order of probability at all times). 4. The procedure is repeated until we are left with a final list of r combined symbols, to each of which a code symbol is assigned. 5. The code for each (original) source symbol is found by working backward and tracing the sequence of code symbols assigned to that source symbol and its successors. (A sketch of the binary case is given after this slide.)
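The binary (r = 2) case can be sketched with a priority queue. The helper below is illustrative only, not the textbook's construction tables: it follows the merge-and-trace-back procedure described above, and the function name huffman_binary is assumed.

```python
import heapq

def huffman_binary(probabilities):
    """Return a dict symbol -> binary code word for a dict of symbol probabilities (radix 2)."""
    # Each heap entry: (probability, tie-breaker, list of (symbol, code-so-far)).
    heap = [(p, i, [(sym, "")]) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)     # the two least probable entries
        p1, _, group1 = heapq.heappop(heap)
        merged = ([(s, "0" + c) for s, c in group0] +
                  [(s, "1" + c) for s, c in group1])   # prepend a bit while tracing back
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return dict(heap[0][2])

code = huffman_binary({"s1": 0.7, "s2": 0.1, "s3": 0.1, "s4": 0.05, "s5": 0.05})
print(code)   # one valid Huffman code; the code words may differ from the slides, but L is minimal
```

Any code produced this way attains the minimum possible average length L for the given probabilities; the individual code words (and hence the code variance discussed later) depend on how ties between equal probabilities are broken.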
  • 23. Example 1: Huffman Binary Code (HC)
Σ_{i=1}^{q} r^(-l_i) = 3 × 2^(-2) + 2 × 2^(-3) = 1 ≤ 1 ⇒ the code is a prefix code
η = H(S)/L = 2.12/2.2 = 0.96
  • 24. Example 2: Huffman Binary Code (HC)
Construction of HC1:
Si  Pi    HC1 | s1   0.7  0   | s1    0.7  0  | s1    0.7  0
s1  0.7   0   | s45  0.1  11  | s23   0.2  10 | s2-5  0.3  1
s2  0.1   100 | s2   0.1  100 | s45   0.1  11 |
s3  0.1   101 | s3   0.1  101 |               |
s4  0.05  110 |               |               |
s5  0.05  111 |               |               |
Construction of HC2:
Si  Pi    HC2  | s1   0.7  0   | s1    0.7  0  | s1    0.7  0
s1  0.7   0    | s2   0.1  11  | s345  0.2  10 | s2-5  0.3  1
s2  0.1   11   | s3   0.1  100 | s2    0.1  11 |
s3  0.1   100  | s45  0.1  101 |               |
s4  0.05  1010 |               |               |
s5  0.05  1011 |               |               |
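Both constructions lead to the same average code-word length. The short check below uses the probabilities and code-word lengths read off the two tables above (the variable names are illustrative):

```python
probs = [0.7, 0.1, 0.1, 0.05, 0.05]
len_hc1 = [1, 3, 3, 3, 3]     # HC1 code-word lengths from the first table
len_hc2 = [1, 2, 3, 4, 4]     # HC2 code-word lengths from the second table
for name, lengths in [("HC1", len_hc1), ("HC2", len_hc2)]:
    L = sum(p * l for p, l in zip(probs, lengths))
    print(f"{name}: L = {L:.2f} binits/symbol")   # both give L = 1.60
```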
  • 25. Problem 1 Consider a zero memory binary source S with P(s1) = 0.8 & P(s2) = 0.2: a) Construct the 2nd and 3rd extensions of the source, find the corresponding probability distribution of each extension, and find its entropy. b) Write down the binary code of the 2nd extension of the source [T ≡ S2] using each of the following binary decision trees: c) Find the average code-word length L for each binary code. d) Encode the following source symbol stream using each of the above binary codes: s2 s1 s1 s1 s1 s2 s2 s2 s1 s1 e) Calculate the binit rate in binits/sec. of each one if the source S emits 2000 symbols/sec.
  • 26. Problem 2 Consider a zero-memory (statistically independent) binary source S with two source symbols s1 and s2. If P(s1) = 0.85, calculate: a) The amount of information of source symbol s1 = I(s1) in bits of information. b) The amount of information of source symbol s2 = I(s2) in bits of information. c) The statistical average of information of the source S = H(S) in bits/source symbol. d) The joint information of the events A = {s1s2} and B = {s1s1} in Hartleys. e) The conditional information of the event A = {s1/s2} in nats.
  • 27. Problem 2 - Solution Consider a zero-memory (statistically independent) binary source S with two source symbols s1 and s2. If P(s1) = 0.85, calculate:
a) The amount of information of source symbol s1 = I(s1) in bits of information. I(s1) = log2(1/0.85) = 0.2345 bit
b) The amount of information of source symbol s2 = I(s2) in bits of information. I(s2) = log2(1/0.15) = 2.737 bits
c) The statistical average of information of the source S = H(S) in bits/source symbol. H(S) = 0.85 × 0.2345 + 0.15 × 2.737 = 0.61 bits/SS
d) The joint information of the events A = {s1s2} and B = {s1s1} in Hartleys. I(A) = log10(1/(0.85 × 0.15)) = log10(1/0.1275) = 0.8945 Hartley; I(B) = log10(1/(0.85 × 0.85)) = log10(1/0.7225) = 0.1412 Hartley
e) The conditional information of the event A = {s1/s2} in nats. P(s1/s2) = P(s1) (statistical independence), so I(A) = ln(1/0.85) = 0.1625 nat
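Since the unit of information is set purely by the logarithm base, the whole solution can be checked with math.log. A small verification sketch (numbers rounded as on the slide):

```python
from math import log, log2, log10

p1, p2 = 0.85, 0.15
print(log2(1 / p1), log2(1 / p2))                  # 0.2345, 2.737   (bits)
print(p1 * log2(1 / p1) + p2 * log2(1 / p2))       # 0.6098 ~ 0.61   (bits/source symbol)
print(log10(1 / (p1 * p2)), log10(1 / (p1 * p1)))  # 0.8945, 0.1412  (Hartleys)
print(log(1 / p1))                                 # 0.1625          (nats)
```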
  • 28. Problem 3 Consider a 3-symbol, zero-memory source S (a, b, c) with P(a) = 0.8 and P(b) = 0.05. 1) Encode the source S symbols using a binary code. Calculate the average code length L. 2) Calculate the source entropy H(S). Calculate the code efficiency η = H(S)/L. 3) Construct the second extension of the source [T ≡ S2] and find its probability distribution. 4) Write down the binary code of the source (T) symbols using each of the following binary decision trees: 5) Calculate the average code length of source (T) and the code efficiency for each code (LI, ηI, LII, ηII). 6) Encode the following source symbol stream using each of the above binary codes: (b a c c a a b b a c b a) 7) Calculate the binit rate in binits/sec. of each code if the source S emits 3000 symbols/sec.
  • 29. Problem 3 - Solution Consider a 3-symbol, zero-memory source S (a, b, c) with P(a) = 0.8 and P(b) = 0.05.
1) Encode the source S symbols using a binary code. Calculate the average code length.
P(a) = 0.8, P(b) = 0.05, P(c) = 0.15. Code: a → 0, b → 10, c → 11.
(For comparison, codes with lengths {1, 2, 3} would give L = 0.8 + 2 × 0.05 + 3 × 0.15 = 1.35 or L = 0.8 + 3 × 0.05 + 2 × 0.15 = 1.25.)
L = 0.8 + 2 × 0.05 + 2 × 0.15 = 0.8 + 2 × 0.2 = 1.2 binits/SS
  • 30. Problem 3 - Solution
2) Calculate the source entropy H(S). Calculate the code efficiency η = H(S)/L.
H(S) = 0.8 log2(1/0.8) + 0.05 log2(1/0.05) + 0.15 log2(1/0.15) = 0.884 bits/SS, η = H(S)/L = 0.884/1.2 = 73.68%
3) Construct the second extension of the source [T ≡ S2] and find its probability distribution.
P(t1) = P(aa) = 0.8² = 0.64
P(t2) = P(ab) = 0.8 × 0.05 = 0.04
P(t3) = P(ac) = 0.8 × 0.15 = 0.12
P(t4) = P(bb) = 0.05² = 0.0025
P(t5) = P(ba) = 0.05 × 0.8 = 0.04
P(t6) = P(bc) = 0.05 × 0.15 = 0.0075
P(t7) = P(cc) = 0.15² = 0.0225
P(t8) = P(cb) = 0.15 × 0.05 = 0.0075
P(t9) = P(ca) = 0.15 × 0.8 = 0.12
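For a zero-memory source the extension probabilities are simply products of the single-symbol probabilities, so the distribution of T = S2 can be generated mechanically. A small sketch using itertools.product (the symbol names follow the solution above):

```python
from itertools import product
from math import log2

p = {"a": 0.8, "b": 0.05, "c": 0.15}
second_ext = {x + y: p[x] * p[y] for x, y in product(p, repeat=2)}
print(second_ext)                                    # {'aa': 0.64, 'ab': 0.04, 'ac': 0.12, ...}
H_S = sum(q * log2(1 / q) for q in p.values())
H_T = sum(q * log2(1 / q) for q in second_ext.values())
print(round(H_S, 3), round(H_T, 3))                  # 0.884 and 1.768, i.e. H(T) = 2 H(S)
```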
  • 31. 4) Write down the binary code of the source (T) symbols using each of the following binary decision trees. Problem 3 - Solution
  • 32. Problem 3 - Solution
5) Calculate the average code length of source (T) and the code efficiency for each code (LI, ηI, LII, ηII).
Code I word lengths: {2, 2, 3, 3, 3, 4, 5, 6, 6}
LI = 2 × 0.76 + 3 × 0.2 + 4 × 0.0225 + 5 × 0.0075 + 6 × 0.01 = 2.3075 binits/2SS
ηI = H(T)/LI = 2H(S)/LI = 2 × 0.884/2.3075 = 76.62%
Code II word lengths: {2, 3, 3, 3, 3, 3, 4, 5, 5}
LII = 2 × 0.64 + 3 × 0.3425 + 4 × 0.0075 + 5 × 0.01 = 2.3875 binits/2SS
ηII = H(T)/LII = 2H(S)/LII = 2 × 0.884/2.3875 = 74.05%
6) Encode the following source symbol stream using each of the above binary codes: b a c c a a b b a c b a
T: t5 t7 t1 t4 t3 t5
Code I: 000 0100 11 010111 10 000
Code II: 000 011 10 11111 110 000
  • 33. Problem 3 - Solution
7) Calculate the binit rate in binits/sec. of each code if the source S emits 3000 symbols/sec.
(binit rate = encoder input symbol rate × average code length; the second extension T is fed at 3000/2 = 1500 T-symbols/sec)
Code I binit rate = 2.3075 × 1500 = 3.461 kbinits/sec
Code II binit rate = 2.3875 × 1500 = 3.581 kbinits/sec
Note that the binit rate without extension = 1.2 × 3000 = 3600 binits/sec = 3.6 kbinits/sec
  • 34. Problem 4 Can an instantaneous (prefix) code be constructed with the following code-word lengths? Find the corresponding code using the decision tree for each eligible case. a) {1, 2, 3, 3, 4, 4, 5, 5}, r = 2 b) {1, 1, 2, 2, 3, 3, 4, 4}, r = 3 c) {1, 1, 1, 2, 2, 2, 2}, r = 4
  • 35. Problem 4 - Solution
  • 36. Problem 5 A zero memory source S emits one of eight symbols randomly every 1 microsecond with probabilities {0.13, 0.2, 0.16, 0.3, 0.07, 0.05, 0.03, 0.06} 1. Calculate the source entropy H(S). 2. Construct a Huffman binary code. 3. Calculate the code efficiency. 4. Find the encoder output average binit rate.
  • 37. Problem 6 A zero memory source S emits one of five symbols randomly every 2 microseconds with probabilities {0.25, 0.25, 0.2, 0.15, 0.15} 1. Calculate the source entropy H(S). 2. Construct a Huffman binary code. 3. Calculate the average length of this code. 4. Calculate the code efficiency. 5. Find the encoder output average binit rate.
  • 38. Problem 7 A zero memory source S emits one of five symbols randomly every 2 microseconds with probabilities {0.25, 0.25, 0.2, 0.15, 0.15} 1. Construct a Huffman ternary code. 2. Calculate the average length of this code. 3. Calculate the code efficiency. 4. Calculate the code redundancy (γ = 1 − η).
  • 39. Problem 8 If r ≥ 3, we may not have a sufficient number of symbols to combine them r at a time at every stage. In such a case, we add dummy symbols to the end of the set of symbols. The dummy symbols have probability 0 and are inserted only to fill the tree. Since at each stage of the reduction the number of symbols is reduced by r − 1, we want the total number of symbols to be 1 + k(r − 1), where k is the number of merges. Hence, we add enough dummy symbols so that the total number of symbols is of this form (a sketch of the radix-r construction is given after this slide).
For example: A zero memory source S emits one of six symbols randomly with probabilities {0.25, 0.25, 0.2, 0.1, 0.1, 0.1}
1. Construct a Huffman ternary code.
2. Calculate the average length of this code.
3. Calculate the code efficiency.
4. Calculate the code redundancy (γ = 1 − η).
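A radix-r version of the earlier binary sketch only has to pad the symbol list with zero-probability dummies and merge r entries at a time. The helper below is illustrative, not the textbook's procedure verbatim; the function name huffman_radix and the dummy-symbol naming are assumptions.

```python
import heapq

def huffman_radix(probabilities, r=3, alphabet="0123456789"):
    """Return symbol -> code word using the first r digits of `alphabet` as code symbols."""
    items = list(probabilities.items())
    # Pad with zero-probability dummy symbols so that the count is 1 + k*(r - 1).
    while (len(items) - 1) % (r - 1) != 0:
        items.append((f"dummy{len(items)}", 0.0))
    heap = [(p, i, [(sym, "")]) for i, (sym, p) in enumerate(items)]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        merged, total = [], 0.0
        for digit in alphabet[:r]:                 # combine the r least probable entries
            p, _, group = heapq.heappop(heap)
            total += p
            merged += [(s, digit + c) for s, c in group]
        heapq.heappush(heap, (total, counter, merged))
        counter += 1
    # Drop the padding symbols before returning the code book.
    return {s: c for s, c in heap[0][2] if not s.startswith("dummy")}

code = huffman_radix({"s1": 0.25, "s2": 0.25, "s3": 0.2, "s4": 0.1, "s5": 0.1, "s6": 0.1}, r=3)
print(code)   # one valid ternary Huffman code (the exact code words depend on tie-breaking)
```

For a radix-r code, the efficiency is computed with logarithms to base r, i.e. η = H_r(S)/L = H(S)/(L log2 r), which is what the redundancy γ = 1 − η in Problems 7 and 8 refers to.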
  • 40. Problem 9 Complete the following probability distribution of the second extension T of a 3-symbol memoryless source S with symbols {a, b, c}:
T      S      Prob
p(t1)  p(aa)  0.25
p(t2)  p(ab)
p(t3)  p(ac)
p(t4)  p(ba)
p(t5)  p(bb)
p(t6)  p(bc)
p(t7)  p(ca)
p(t8)  p(cb)
p(t9)  p(cc)  0.01
1. Find the zero-memory source S probability distribution.
2. Calculate the source entropy H(T).
3. Find the ternary Huffman code for the above source second extension T and calculate the code efficiency and redundancy. (Hint: you do not need to add a dummy symbol with zero probability.)
  • 41. Code Variance As a measure of the variability in the code-word lengths of a source code, the variance of the code-word lengths about the average code-word length L, taken over the ensemble of source symbols, is defined as
σ² = Σ_{k=0}^{K−1} p_k (l_k − L)²
where p0, p1, . . . , pK−1 are the source statistics, and lk is the length of the code word assigned to source symbol sk. It is usually found that when a combined symbol is moved as high as possible in the list, the resulting Huffman code has a significantly smaller variance σ² (which is preferable) than when it is moved as low as possible. On this basis, it is reasonable to choose the former Huffman code over the latter.
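Applying this definition to the two codes of Example 2 (slide 24), whose code-word lengths are {1, 3, 3, 3, 3} for HC1 and {1, 2, 3, 4, 4} for HC2 (both with L = 1.6), gives a quick numerical check; the helper name code_variance is illustrative only.

```python
def code_variance(probs, lengths):
    """Variance of the code-word lengths about the average length L."""
    L = sum(p * l for p, l in zip(probs, lengths))
    return sum(p * (l - L) ** 2 for p, l in zip(probs, lengths))

probs = [0.7, 0.1, 0.1, 0.05, 0.05]
print(code_variance(probs, [1, 3, 3, 3, 3]))   # 0.84  (HC1)
print(code_variance(probs, [1, 2, 3, 4, 4]))   # 1.04  (HC2)
```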
