
- 1. Noise, Information Theory, and Entropy. CS414, Spring 2007. By Roger Cheng (Huffman coding slides courtesy of Brian Bailey)
- 2. Why study noise? It's present in all systems of interest, and we have to deal with it. By knowing its characteristics, we can fight it better. We can create models to evaluate it analytically.
- 3. Communication system abstraction. Sender side: Information source → Encoder → Modulator → Channel. Receiver side: Channel → Demodulator → Decoder → Output signal.
- 4. The additive noise channel. The transmitted signal s(t) is corrupted by a noise source n(t), and the resulting received signal is r(t) = s(t) + n(t). Noise could result from many sources, including electronic components and transmission interference.
- 5. Random processes. A random variable is the result of a single measurement. A random process is an indexed collection of random variables, or equivalently a non-deterministic signal that can be described by a probability distribution. Noise can be modeled as a random process.
- 6. WGN (White Gaussian Noise) properties:
  - At each time instant t = t0, the value of n(t) is normally distributed with mean 0 and variance σ² (i.e., E[n(t0)] = 0, E[n(t0)²] = σ²)
  - At any two different time instants, the values of n(t) are uncorrelated (i.e., E[n(t0)n(tk)] = 0)
  - The power spectral density of n(t) has equal power in all frequency bands
- 7. WGN continued. When an additive noise channel has a white Gaussian noise source, we call it an AWGN channel. It is the most frequently used model in communications. Reasons why we use this model:
  - It's easy to understand and compute
  - It applies to a broad class of physical channels
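The WGN properties above can be checked numerically; here is a minimal sketch using only the standard library's `random.gauss` (the sample count, σ = 2, and seed are arbitrary choices for illustration):

```python
import random

def awgn_samples(n, sigma, seed=0):
    """Generate n independent samples of discrete-time white Gaussian noise
    with mean 0 and standard deviation sigma."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

sigma = 2.0
noise = awgn_samples(100_000, sigma, seed=42)

# Empirically check the slide's properties: zero mean, variance sigma^2,
# and (nearly) zero correlation between samples at different times.
mean = sum(noise) / len(noise)
var = sum((x - mean) ** 2 for x in noise) / len(noise)
corr = sum(a * b for a, b in zip(noise, noise[1:])) / (len(noise) - 1)

print(f"mean ≈ {mean:.3f}, variance ≈ {var:.3f}, lag-1 correlation ≈ {corr:.3f}")
```

With enough samples the estimates land close to the theoretical values (mean 0, variance 4, correlation 0); the residual error shrinks as 1/√n.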
- 8. Signal energy and power. Energy is defined as ε_x = ∫ from -∞ to ∞ of |x(t)|² dt. Power is defined as P_x = lim as T→∞ of (1/T) ∫ from -T/2 to T/2 of |x(t)|² dt. Most signals are either finite energy and zero power, or infinite energy and finite power. Noise power is hard to compute in the time domain; the power of WGN is its variance σ².
- 9. Signal-to-Noise Ratio (SNR). Defined as the ratio of signal power to the noise power corrupting the signal. It is usually more practical to measure SNR on a dB scale: SNR_dB = 10 log₁₀(P_signal / P_noise). Obviously, we want as high an SNR as possible.
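The dB conversion is a one-liner; a small sketch for illustration:

```python
import math

def snr_db(signal_power, noise_power):
    """SNR in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(signal_power / noise_power)

# A signal with 100x the noise power has an SNR of 20 dB;
# equal signal and noise power gives 0 dB.
print(snr_db(100.0, 1.0))   # 20.0
print(snr_db(10.0, 10.0))   # 0.0
```

Every factor of 10 in the power ratio adds 10 dB, which is why the audio demo on the next slide steps from 40 dB down to 10 dB.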
- 10. SNR example: audio. The original sound file is CD quality (16-bit, 44.1 kHz sampling rate). [Audio clips compare the original against versions at 40 dB, 20 dB, and 10 dB SNR.]
- 11. Analog vs. digital. Analog system: any amount of noise will create distortion at the output. Digital system: a relatively small amount of noise will cause no harm at all, but too much noise will make decoding of the received signal impossible. In both, the goal is to limit the effects of noise to a manageable/satisfactory amount.
- 12. Information theory and entropy. Information theory tries to solve the problem of communicating as much data as possible over a noisy channel. The measure of data is entropy. Claude Shannon first demonstrated that reliable communication over a noisy channel is possible (jump-starting the digital age).
- 13. Entropy definitions. Shannon entropy: H(X) = -∑ p(x) log₂ p(x). Binary entropy formula: H(p) = -p log₂ p - (1-p) log₂ (1-p). Differential entropy (for continuous X with density f): h(X) = -∫ f(x) log f(x) dx.
- 14. Properties of entropy:
  - Can be defined as the expectation of -log p(x) (i.e., H(X) = E[-log p(x)])
  - Is not a function of a variable's values; it is a function of the variable's probabilities
  - Usually measured in "bits" (using logs of base 2) or "nats" (using logs of base e)
  - Maximized when all values are equally likely (i.e., a uniform distribution)
  - Equal to 0 when only one value is possible
  - Cannot be negative
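These properties are easy to verify numerically; a minimal sketch of the Shannon entropy formula (the 0·log 0 = 0 convention is standard):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum p_i * log2(p_i), with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Maximized by the uniform distribution: H = log2(n) bits.
print(entropy([0.25] * 4))      # 2.0
# Zero when only one value is possible, and never negative.
print(entropy([1.0, 0.0]))      # 0.0
# A skewed distribution carries less information than a uniform one.
print(entropy([0.5, 0.5]), entropy([0.9, 0.1]))
```

The fair coin gives exactly 1 bit per outcome, while the 90/10 coin gives noticeably less, matching the "maximized when equally likely" property.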
- 15. Joint and conditional entropy. Joint entropy H(X,Y) is the entropy of the pairing (X,Y). Conditional entropy H(X|Y) is the entropy of X if the value of Y were known. Relationship between the two: H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y).
- 16. Mutual information. Mutual information is how much information about X can be obtained by observing Y: I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X).
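Using the identity I(X;Y) = H(X) + H(Y) - H(X,Y), mutual information can be sketched for a small joint distribution (the two test distributions below are illustrative choices, not from the slides):

```python
import math

def H(probs):
    """Shannon entropy in bits, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint pmf given as a 2-D list."""
    px = [sum(row) for row in joint]            # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]      # marginal of Y (columns)
    pxy = [p for row in joint for p in row]     # flattened joint
    return H(px) + H(py) - H(pxy)

# Y is an exact copy of a fair bit X: observing Y reveals all of X (1 bit).
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
# X and Y independent: observing Y reveals nothing (0 bits).
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
```

The two extremes bracket every real channel: a noisy channel yields something strictly between 0 and H(X).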
- 17. Mathematical model of a channel. Assume that our input to the channel is X, and the output is Y. Then the characteristics of the channel can be defined by its conditional probability distribution p(y|x).
- 18. Channel capacity and rate. Channel capacity is defined as the maximum possible value of the mutual information, where we choose the best input distribution f(x) to maximize C. For any rate R < C, we can transmit information with arbitrarily small probability of error.
- 19. Binary symmetric channel. The correct bit is transmitted with probability 1-p; the wrong bit is transmitted with probability p (sometimes called the "cross-over probability"). Capacity: C = 1 - H(p, 1-p).
- 20. Binary erasure channel. The correct bit is transmitted with probability 1-p; an "erasure" is transmitted with probability p. Capacity: C = 1 - p.
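The two capacity formulas above are short enough to sketch directly:

```python
import math

def binary_entropy(p):
    """H(p, 1-p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Binary symmetric channel: C = 1 - H(p, 1-p)."""
    return 1.0 - binary_entropy(p)

def bec_capacity(p):
    """Binary erasure channel: C = 1 - p."""
    return 1.0 - p

# A noiseless BSC (p = 0) carries a full bit per use, while p = 0.5
# outputs pure coin flips and carries nothing.
print(bsc_capacity(0.0), bsc_capacity(0.5))
print(bec_capacity(0.1))
```

Note that for the same p ≤ 1/2 the erasure channel has higher capacity than the symmetric channel: knowing *which* bits were lost is less damaging than having them silently flipped.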
- 21. Coding theory. Information theory only gives us an upper bound on the communication rate; we need coding theory to find practical methods that achieve a high rate. Two types:
  - Source coding: compress source data to a smaller size
  - Channel coding: add redundancy bits to make transmission across a noisy channel more robust
- 22. Source-channel separation theorem. Shannon showed that when dealing with one transmitter and one receiver, we can break up source coding and channel coding into separate steps without loss of optimality. This does not apply when there are multiple transmitters and/or receivers; network information theory principles are needed in those cases.
- 23. Huffman encoding. Use the probability distribution to determine how many bits to use for each symbol:
  - higher-frequency symbols are assigned shorter codes
  - an entropy-based, block-variable coding scheme
- 24. Huffman encoding. Produces a code which uses a minimum number of bits to represent each symbol:
  - cannot represent the same sequence using fewer bits per symbol when restricted to code words
  - optimal when using code words, but this may differ slightly from the theoretical lower limit
  Build a Huffman tree to assign codes.
- 25. Informal problem description. Given a set of symbols from an alphabet and their probability distribution (assumes the distribution is known and stable), find a prefix-free binary code with minimum weighted path length. Prefix-free means no codeword is a prefix of any other codeword.
- 26. Huffman algorithm. Construct a binary tree of codes:
  - leaf nodes represent symbols to encode
  - interior nodes represent cumulative probability
  - edges are assigned 0 or 1 output codes
  Construct the tree bottom-up: repeatedly connect the two nodes with the lowest probability until no more nodes remain to connect.
- 27. Huffman example. Construct the Huffman coding tree (in class) for:
  Symbol (S)  P(S)
  A           0.25
  B           0.30
  C           0.12
  D           0.15
  E           0.18
- 28. Characteristics of the solution. The lowest-probability symbol is always furthest from the root. The assignment of 0/1 to child edges is arbitrary: other solutions are possible, but the code lengths remain the same; if two nodes have equal probability, either can be selected. Notes: the result is a prefix-free code; the algorithm has O(n lg n) complexity.
  Symbol (S)  Code
  A           11
  B           00
  C           010
  D           011
  E           10
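The bottom-up merging rule can be sketched with a priority queue; a minimal implementation run on the slide's distribution (the dict-of-codes representation is an illustrative choice, not the slides' tree diagram):

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build Huffman codes from {symbol: probability} by repeatedly
    merging the two lowest-probability nodes."""
    tie = count()  # tiebreaker so the heap never compares code dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        # Prepend 0 to one subtree's codewords and 1 to the other's.
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = {"A": 0.25, "B": 0.30, "C": 0.12, "D": 0.15, "E": 0.18}
codes = huffman_codes(probs)
print(codes)
print({s: len(c) for s, c in codes.items()})  # lengths 2,2,3,3,2 as on the slide
```

The exact bit patterns differ from the slide's table (the 0/1 assignment is arbitrary), but the code lengths match, so the average codeword length is the same.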
- 29. Example encoding/decoding. Encode "BEAD" ⇒ 001011011. Decode "0101100" (exercise).
  Symbol (S)  Code
  A           11
  B           00
  C           010
  D           011
  E           10
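Encoding is a table lookup, and because the code is prefix-free, decoding is a greedy left-to-right scan; a small sketch using the slide's code table:

```python
CODES = {"A": "11", "B": "00", "C": "010", "D": "011", "E": "10"}
DECODE = {bits: sym for sym, bits in CODES.items()}

def encode(text):
    """Concatenate the codeword for each symbol."""
    return "".join(CODES[ch] for ch in text)

def decode(bits):
    """Greedy decoding works because no codeword is a prefix of another."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    return "".join(out)

print(encode("BEAD"))    # 001011011, matching the slide
print(decode("0101100"))  # the slide's decoding exercise
```

A useful sanity check is the round trip: decode(encode(s)) should return s for any string over the alphabet.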
- 30. Entropy (theoretical limit). H = -∑ from i=1 to N of p(sᵢ) log₂ p(sᵢ) = -(.25 log₂ .25 + .30 log₂ .30 + .12 log₂ .12 + .15 log₂ .15 + .18 log₂ .18), so H = 2.24 bits.
  Symbol  P(S)  Code
  A       0.25  11
  B       0.30  00
  C       0.12  010
  D       0.15  011
  E       0.18  10
- 31. Average codeword length. L = ∑ from i=1 to N of p(sᵢ) · length(code(sᵢ)) = .25(2) + .30(2) + .12(3) + .15(3) + .18(2), so L = 2.27 bits.
  Symbol  P(S)  Code
  A       0.25  11
  B       0.30  00
  C       0.12  010
  D       0.15  011
  E       0.18  10
- 32. Code length relative to entropy. Huffman reaches the entropy limit when all probabilities are negative powers of 2 (i.e., 1/2, 1/4, 1/8, 1/16, etc.). In general, H ≤ L ≤ H + 1, where H = -∑ p(sᵢ) log₂ p(sᵢ) and L = ∑ p(sᵢ) · length(code(sᵢ)).
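The two figures from the preceding slides (H = 2.24 bits, L = 2.27 bits) and the bound H ≤ L ≤ H + 1 can be verified with a few lines:

```python
import math

probs = {"A": 0.25, "B": 0.30, "C": 0.12, "D": 0.15, "E": 0.18}
# Codeword lengths from the slide's code table (A:11, B:00, C:010, D:011, E:10).
code_lengths = {"A": 2, "B": 2, "C": 3, "D": 3, "E": 2}

H = -sum(p * math.log2(p) for p in probs.values())
L = sum(probs[s] * code_lengths[s] for s in probs)

print(f"H = {H:.2f} bits, L = {L:.2f} bits")  # H = 2.24, L = 2.27
```

L exceeds H by only 0.03 bits per symbol here, well inside the H + 1 bound.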
- 33. Example. H = -.01 log₂ .01 - .99 log₂ .99 = .08 bits, but L = .01(1) + .99(1) = 1 bit.
  Symbol  P(S)  Code
  A       0.01  1
  B       0.99  0
- 34. Limitations. Huffman coding diverges from the lower limit when the probability of a particular symbol becomes high, because it always uses an integral number of bits per symbol. The code book must be sent with the data, lowering overall efficiency. The frequency distribution must be determined in advance and must remain stable over the data set.
