Lecture 2:
Basic Information Theory
Thinh Nguyen
Oregon State University
What is information?
 Can we measure information?
 Consider the two following sentences:
1. There is a traffic jam on I5
2. There is a traffic jam on I5 near Exit 234
Sentence 2 seems to have more information than
sentence 1. From the semantic viewpoint, sentence 2 provides
more useful information.
What is information?
 It is hard to measure the “semantic” information!
 Consider the following two sentences:
1. There is a traffic jam on I5 near Exit 160
2. There is a traffic jam on I5 near Exit 234
It’s not clear whether sentence 1 or 2 would have more information!
What is information?
 Let’s attempt a different definition of information.
 How about counting the number of letters in
the two sentences:
1. There is a traffic jam on I5 (22 letters)
2. There is a traffic jam on I5 near Exit 234 (33 letters)
Definitely something we can measure and compare!
What is information?
 First attempt to quantify information by Hartley
(1928).
 Every symbol of the message has a choice of $s$ possibilities.
 A message of length $l$ can therefore have $s^l$ distinguishable
possibilities.
 The information measure is then the logarithm of $s^l$:
$$I = \log(s^l) = l \log(s)$$
Intuitively, this definition makes sense:
if one symbol (letter) has information $\log(s)$, then a sentence of length $l$
should have $l$ times more information, i.e., $l \log(s)$.
It’s interesting to know that $\log$ is the only function $f$
that satisfies
$$f(s^l) = l \, f(s)$$
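As a quick sanity check, here is a minimal Python sketch of Hartley’s measure applied to the two traffic-jam sentences; the alphabet size s = 26 (letters only, ignoring spaces and digits) is an assumption for illustration, not something the slide specifies.

```python
import math

def hartley_information(s: int, l: int) -> float:
    """Hartley's measure I = log(s^l) = l * log(s), here in bits (base 2)."""
    return l * math.log2(s)

# Assumed alphabet of s = 26 letters; lengths from the earlier slide.
print(hartley_information(26, 22))  # sentence 1 (22 letters) -> ~103.4 bits
print(hartley_information(26, 33))  # sentence 2 (33 letters) -> ~155.1 bits
```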
How about measuring information as the number of
Yes/No questions one has to ask to find the circle
in the simple games below?
A 2×2 grid:
1 2
3 4
How many questions? 2

A 4×4 grid:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
How many questions? 4

Randomness is due to the uncertainty of where the circle is!
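A small sketch of this question-counting idea: with N equally likely cells, each Yes/No answer can at best halve the remaining candidates, so ceil(log2 N) questions suffice. The helper below is illustrative.

```python
import math

def questions_needed(num_cells: int) -> int:
    """Each Yes/No answer can at best halve the set of remaining cells,
    so ceil(log2(N)) questions locate the circle among N cells."""
    return math.ceil(math.log2(num_cells))

print(questions_needed(4))   # 2x2 grid -> 2 questions
print(questions_needed(16))  # 4x4 grid -> 4 questions
```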
Shannon’s Information Theory
Claude Shannon: “A Mathematical Theory of Communication,”
The Bell System Technical Journal, 1948
 Shannon’s measure of information is the number of bits needed to
represent the amount of uncertainty (randomness) in a
data source, and is defined as the entropy
$$H = -\sum_{i=1}^{n} p_i \log(p_i)$$
where there are $n$ symbols $1, 2, \ldots, n$, each with
probability of occurrence $p_i$.
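A minimal sketch of this definition in Python (the helper name `entropy` and the example distributions are our own, for illustration):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """H = -sum(p_i * log(p_i)); zero-probability symbols contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit   (fair binary source)
print(entropy([0.9, 0.1]))  # ~0.469 bits (biased source)
print(entropy([1.0]))       # 0.0 bits  (no uncertainty)
```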
Shannon’s Entropy
 Consider the following string consisting of symbols a and b:
abaabaababbbaabbabab… ….
 On average, there are equal numbers of a’s and b’s.
 The string can be considered as the output of the source below,
which emits symbol a or b with equal probability:
[Diagram: a source emitting a with probability 0.5 and b with probability 0.5.]
We want to characterize the average
information generated by the source!
Intuition on Shannon’s Entropy
Why $H = -\sum_{i=1}^{n} p_i \log(p_i)$?
Suppose you have a long random string of two binary symbols 0 and 1, and the
probabilities of symbols 0 and 1 are $p_0$ and $p_1$.
Ex: 00100100101101001100001000100110001 ….
If the string is long enough, say of length $N$, it is likely to contain $Np_0$ 0’s and $Np_1$ 1’s.
The probability that this string pattern occurs is equal to
$$p = p_0^{Np_0} \, p_1^{Np_1}$$
Hence, the number of possible patterns is
$$1/p = p_0^{-Np_0} \, p_1^{-Np_1}$$
The number of bits to represent all possible patterns is
$$\log\!\left(p_0^{-Np_0} \, p_1^{-Np_1}\right) = -N \sum_{i=0}^{1} p_i \log(p_i)$$
The average number of bits to represent each symbol is therefore
$$-\sum_{i=0}^{1} p_i \log(p_i)$$
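A numeric check of this counting argument, with illustrative values p0 = 0.9, p1 = 0.1, N = 1000 (not from the slide): the bits-per-symbol obtained from the pattern count match the entropy formula.

```python
import math

p0, p1, N = 0.9, 0.1, 1000  # illustrative values

# Probability of one typical string containing N*p0 zeros and N*p1 ones:
p = p0 ** (N * p0) * p1 ** (N * p1)

# Bits to index all ~1/p such patterns, spread over N symbols:
bits_per_symbol = -math.log2(p) / N

H = -(p0 * math.log2(p0) + p1 * math.log2(p1))
print(f"{bits_per_symbol:.4f} {H:.4f}")  # both ~0.4690 bits/symbol
```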
More Intuition on Entropy
 Assume a binary memoryless source, e.g., a flip of a coin. How
much information do we receive when we are told that the
outcome is heads?
 If it’s a fair coin, i.e., P(heads) = P(tails) = 0.5, we say that
the amount of information is 1 bit.
 If we already know that it will be (or was) heads, i.e., P(heads)
= 1, the amount of information is zero!
 If the coin is not fair, e.g., P(heads) = 0.9, the amount of
information is more than zero but less than one bit!
 Intuitively, the amount of information received is the same if
P(heads) = 0.9 or P(heads) = 0.1.
Self Information
 So, let’s look at it the way Shannon did.
 Assume a memoryless source with
 alphabet $A = (a_1, \ldots, a_n)$
 symbol probabilities $(p_1, \ldots, p_n)$.
 How much information do we get when finding out that the
next symbol is $a_i$?
 According to Shannon, the self information of $a_i$ is
$$i(a_i) = -\log(p_i)$$
Why?
Assume two independent events A and B, with
probabilities $P(A) = p_A$ and $P(B) = p_B$.
For both events to happen, the probability is
$p_A \cdot p_B$. However, the amounts of information
should be added, not multiplied.
Logarithms satisfy this! But the plain logarithm decreases as the
probability decreases, and we want the information to increase with
decreasing probability, so let’s use the negative logarithm.
Self Information
Example 1:
Example 2:
Which logarithm? Pick the one you like! If you pick the natural log,
you’ll measure in nats; if you pick the base-10 log, you’ll get Hartleys;
if you pick the base-2 log (like everyone else), you’ll get bits.
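A small sketch covering both points: the same self-information expressed in bits, nats, and Hartleys, plus the additivity for independent events from the previous slide. The probabilities used are illustrative.

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """i(a) = -log(p): base 2 gives bits, base e gives nats, base 10 gives Hartleys."""
    return -math.log(p, base)

print(self_information(0.5))          # 1.0 bit
print(self_information(0.5, math.e))  # ~0.693 nats
print(self_information(0.5, 10))      # ~0.301 Hartleys

# Additivity for independent events (the "Why?" slide):
pA, pB = 0.5, 0.25
print(self_information(pA * pB))                    # 3.0 bits
print(self_information(pA) + self_information(pB))  # 3.0 bits
```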
Self Information
On average over all the symbols, we get:
$$H(X) = \sum_{i=1}^{n} p_i \, i(a_i) = -\sum_{i=1}^{n} p_i \log(p_i)$$
H(X) is called the first-order entropy of the source.
This can be regarded as the degree of uncertainty
about the following symbol.
Entropy
Example: Binary Memoryless Source
BMS: 0 1 1 0 1 0 0 0 …
Let $P(1) = p$ and $P(0) = 1 - p$.
Then $H = -p \log_2(p) - (1-p) \log_2(1-p)$.
[Plot: H in bits versus p, rising from 0 at p = 0 to 1 bit at p = 0.5 and falling back to 0 at p = 1.]
The uncertainty (information) is greatest when $p = 0.5$.
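A sketch tabulating the binary entropy function, assuming the standard convention h(0) = h(1) = 0:

```python
import math

def binary_entropy(p: float) -> float:
    """h(p) = -p*log2(p) - (1-p)*log2(1-p), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}  h(p) = {binary_entropy(p):.3f} bits")
# h(p) peaks at p = 0.5 with 1 bit; h(0.1) = h(0.9) ~ 0.469,
# matching the unfair-coin intuition earlier.
```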
Example
Three symbols a, b, c with corresponding probabilities:
P = {0.5, 0.25, 0.25}
What is H(P)?
Three weather conditions in Corvallis (rain, sunny, cloudy) with
corresponding probabilities:
Q = {0.48, 0.32, 0.20}
What is H(Q)?
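A worked answer for both questions, reusing a small entropy helper (values rounded to three decimals):

```python
import math

def entropy(probs) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

P = [0.5, 0.25, 0.25]
Q = [0.48, 0.32, 0.20]
print(f"H(P) = {entropy(P):.3f} bits")  # 1.500 = 0.5*1 + 0.25*2 + 0.25*2
print(f"H(Q) = {entropy(Q):.3f} bits")  # ~1.499
```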
Entropy: Three properties
1. It can be shown that $0 \leq H \leq \log N$.
2. Maximum entropy (H = log N) is reached when
all symbols are equiprobable, i.e.,
pi = 1/N.
3. The difference log N – H is called the redundancy
of the source.
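A quick numeric illustration of all three properties for N = 4 (the skewed distribution is an arbitrary example):

```python
import math

def entropy(probs) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

N = 4
uniform = [1 / N] * N
skewed = [0.7, 0.1, 0.1, 0.1]

print(entropy(uniform), math.log2(N))  # 2.0 2.0 -> H = log N when p_i = 1/N
print(entropy(skewed))                 # ~1.357, between 0 and log N
print(math.log2(N) - entropy(skewed))  # redundancy ~0.643 bits
```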