Lec5 Compression

1. Compression
   For sending and storing information: text, audio, images, video.
2. Common Applications
   - Text compression
     - lossless; gzip uses Lempel-Ziv coding, roughly 3:1 compression
     - better than Huffman
   - Audio compression
     - lossy; MPEG, 3:1 to 24:1 compression
     - MPEG = Moving Picture Experts Group
   - Image compression
     - lossy; JPEG, roughly 3:1 compression
     - JPEG = Joint Photographic Experts Group
   - Video compression
     - lossy; MPEG, roughly 27:1 compression
3. Text Compression
   - Prefix code: one of many approaches
     - no code is a prefix of any other code
     - constraint: lossless
     - tasks
       - encode: text (string) -> code
       - decode: code -> text
     - main goal: maximally reduce storage, measured by the compression ratio
     - minor goals:
       - simplicity
       - efficiency in time and space
         - some methods require a code dictionary or two passes over the data
4. Simplest Text Encoding
   - Run-length encoding
   - Requires a special character, say @
   - Example source:
     - ACCCTGGGGGAAAACCCCCC
   - Encoding:
     - A@C3T@G5@A4@C6
   - Method
     - any run of 3 or more identical characters is replaced by @char#
   - +: simple
   - -: needs a special character, non-optimal
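A minimal sketch of this run-length scheme in Python; the @ marker and the 3-character threshold follow the slide, while the function name is just illustrative:

```python
def rle_encode(text, marker="@"):
    """Replace any run of 3 or more identical characters with marker + char + count."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                      # extend the current run
        run = j - i
        if run >= 3:
            out.append(f"{marker}{text[i]}{run}")
        else:
            out.append(text[i] * run)   # short runs are copied verbatim
        i = j
    return "".join(out)

print(rle_encode("ACCCTGGGGGAAAACCCCCC"))   # A@C3T@G5@A4@C6
```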
5. Shannon's Information Theory (1948): How well can we encode?
   - Shannon's goal: reduce the size of messages for improved communication
   - What messages would be easiest/hardest to send?
     - Random bits are hardest: no redundancy or pattern
   - Formal definition: S, a set of symbols s_i with probabilities p_i
   - Information content (entropy) of S = -sum_i p_i * log(p_i)
     - a measure of randomness
     - more random, less predictable, higher information content!
   - Theorem: it is the only measure with several natural properties
   - Information is not knowledge
   - Compression relies on finding regularities or redundancies.
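A quick sketch of the entropy formula, assuming base-2 logarithms so the result is in bits per symbol:

```python
import math

def entropy(probs):
    """H = -sum(p * log2(p)) over symbols with nonzero probability, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0 bits: four equally likely symbols
print(entropy([13/16, 1/16, 1/16, 1/16]))   # ~0.99 bits: the skewed distribution on the next slide
```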
6. Example
   - Send A, C, T, G, each occurring 1/4 of the time
   - Code: A=00, C=01, T=10, G=11
   - 2 bits per letter: no surprise
   - Average message length:
     - prob(A)*codelength(A) + prob(C)*codelength(C) + ...
     - 1/4*2 + ... = 2 bits
   - Now suppose:
     - prob(A) = 13/16 and the others 1/16 each
     - Codes: A=1, C=00, G=010, T=011 (prefix code)
     - 13/16*1 + 1/16*2 + 1/16*3 + 1/16*3 = 21/16 ≈ 1.3 bits
   - What is the best possible result? Part of the answer:
   - The information content! But how do we achieve it?
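To check the arithmetic, a small sketch comparing the average code length of the skewed example with its entropy bound (probabilities and codes taken from the slide):

```python
import math

probs = {"A": 13/16, "C": 1/16, "G": 1/16, "T": 1/16}
codes = {"A": "1", "C": "00", "G": "010", "T": "011"}

avg_len = sum(p * len(codes[s]) for s, p in probs.items())
ent = -sum(p * math.log2(p) for p in probs.values())

print(avg_len)   # 1.3125 = 21/16 bits per symbol for this prefix code
print(ent)       # ~0.99 bits per symbol: the theoretical lower bound
```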
7. Understanding Entropy/Information
8. The Shannon-Fano Algorithm
   - Earliest algorithm: heuristic divide and conquer
   - Illustration: source text with only the letters A B C D E

     Symbol   A    B    C    D    E
     Count    15   7    6    6    5

   - Intuition: frequent letters get short codes
   - 1. Sort symbols by their frequencies/probabilities, i.e. A B C D E.
   - 2. Recursively divide into two parts, each with approximately the same total count.
   - This is an instance of "balancing", which is NP-complete.
   - Note: variable-length codes. (A sketch of this split appears below.)
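A recursive sketch of the Shannon-Fano split, assuming the symbols arrive already sorted by descending count; the split point is chosen greedily to balance the two halves, and the function name is illustrative:

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, count) sorted by descending count.
    Returns a dict mapping symbol -> bit string."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(c for _, c in symbols)
    # Pick the split point that makes the two halves' counts as equal as possible.
    best_i, best_diff, running = 1, float("inf"), 0
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_i, best_diff = i, diff
    codes = {}
    for sym, code in shannon_fano(symbols[:best_i]).items():
        codes[sym] = "0" + code           # left half gets a leading 0
    for sym, code in shannon_fano(symbols[best_i:]).items():
        codes[sym] = "1" + code           # right half gets a leading 1
    return codes

print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```

On this distribution the greedy split reproduces the codes on the "Result" slide.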
9. Shannon-Fano Tree
   [figure: the resulting code tree, with left edges labeled 0 and right edges labeled 1]
10. Result for this distribution

    Symbol   Count   log2(1/p)   Code   # of bits
    ------   -----   ---------   ----   ---------
    A        15      1.38        00     30
    B        7       2.48        01     14
    C        6       2.70        10     12
    D        6       2.70        110    18
    E        5       2.96        111    15
    TOTAL (# of bits): 89

    - Average message length = 89/39 ≈ 2.3 bits per symbol
    - Note: the prefix property makes decoding unambiguous
    - Can you do better?
    - Theoretical optimum = -sum p_i * log(p_i) = the entropy
11. Code Tree Method/Analysis
    - Binary tree method
    - Internal nodes have left/right references:
      - 0 means go to the left
      - 1 means go to the right
    - Leaf nodes store the symbol
    - Decode time cost is O(log N) per symbol
    - Decode space cost is O(N)
      - quick argument: number of leaves > number of internal nodes
      - proof: induction on the number of internal nodes
    - Prefix property: each codeword (path to a leaf) uniquely defines a character.
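A minimal decode sketch over such a binary code tree; the Node representation is an assumption (leaves carry a symbol, internal nodes carry left/right children):

```python
class Node:
    def __init__(self, symbol=None, left=None, right=None):
        self.symbol = symbol   # set only on leaves
        self.left = left       # followed on bit '0'
        self.right = right     # followed on bit '1'

def decode(bits, root):
    """Walk the tree from the root; every time a leaf is reached, emit its symbol."""
    out, node = [], root
    for b in bits:
        node = node.left if b == "0" else node.right
        if node.symbol is not None:
            out.append(node.symbol)
            node = root        # restart at the root for the next codeword
    return "".join(out)

# Tree for the codes A=1, C=00, G=010, T=011 from the earlier example.
root = Node(left=Node(left=Node("C"), right=Node(left=Node("G"), right=Node("T"))),
            right=Node("A"))
print(decode("100010011", root))   # ACGT
```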
12. Code: Encode(character)
    - Again can use the binary prefix tree
    - For encode and decode could use hashing
      - yields O(1) encode/decode time per character
      - O(N) space cost (N is the size of the alphabet)
    - For compression, the main goal is reducing storage size
      - in the example it is the total number of bits
      - code size for a single character = depth of its leaf in the tree
      - code size for the document = sum of (frequency of char * depth of char)
      - different trees yield different storage efficiency
      - What is the best tree?
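A sketch of both points: flattening the tree into a hash table for O(1) encoding, and computing the document cost as frequency times depth. It reuses the Node tree built in the decode sketch above, and the counts are the 13/16, 1/16, ... example scaled to a 16-character document:

```python
def code_table(node, prefix=""):
    """Flatten the code tree into a dict symbol -> bit string (O(1) lookup when encoding)."""
    if node.symbol is not None:
        return {node.symbol: prefix}
    table = {}
    table.update(code_table(node.left, prefix + "0"))
    table.update(code_table(node.right, prefix + "1"))
    return table

def document_cost(freqs, table):
    """Total bits = sum over characters of frequency * code length (= leaf depth)."""
    return sum(freqs[ch] * len(table[ch]) for ch in freqs)

table = code_table(root)   # root was built in the decode sketch above
print(table)               # {'C': '00', 'G': '010', 'T': '011', 'A': '1'}
print(document_cost({"A": 13, "C": 1, "G": 1, "T": 1}, table))   # 13*1 + 1*2 + 1*3 + 1*3 = 21
```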
13. Huffman Code
    - Provably optimal: i.e. yields the minimum storage cost
    - Algorithm: CodeTree huff(document)
      - 1. Compute the frequency of each character and create a leaf node for each
        - a leaf node has a count field and a character
      - 2. Remove the 2 nodes with the smallest counts and create a new node whose count is the sum of their counts and whose children are the removed nodes.
        - an internal node has 2 node pointers and a count field
      - 3. Repeat step 2 until only 1 node is left.
      - 4. That's it: the last remaining node is the root of the code tree.
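A compact sketch of this construction using a min-heap for the "remove the two smallest" step; it reuses the Node class and code_table from the sketches above, and the tiebreak counter exists only to keep heap comparisons well-defined:

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: dict symbol -> count. Returns the root Node of a Huffman code tree."""
    tiebreak = count()
    heap = [(f, next(tiebreak), Node(sym)) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, n1 = heapq.heappop(heap)   # the two least-frequent subtrees...
        f2, _, n2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), Node(left=n1, right=n2)))  # ...become siblings
    return heap[0][2]

root = huffman({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5})
print(code_table(root))   # {'A': '0', 'E': '100', 'C': '101', 'D': '110', 'B': '111'}
```

On the A-E counts from the Shannon-Fano slides this tree costs 15*1 + (5+6+6+7)*3 = 87 bits, versus 89 for the Shannon-Fano codes.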
14. Bad code example
15. Tree, a la Huffman
16. Tree with codes: note the prefix property
17. Tree cost
18. Analysis
    - Intuition: the least frequent chars get the longest codes, or equivalently the most frequent chars get the shortest codes.
    - Let T be a minimal code tree. (Induction)
      - Every internal node has 2 children. (by construction)
      - Lemma: if c1 and c2 are the least frequently used chars, then they lie at the deepest depth.
        - Proof:
          - if they are not at the deepest level, exchange them with deeper chars and the total cost (number of bits) goes down, contradicting minimality.
19. Analysis (continued)
    - Sk: the Huffman algorithm on k chars produces an optimal code.
      - S2: obvious
      - Sk => Sk+1
        - Let T be an optimal code tree on k+1 chars
        - By the lemma, the two least frequent chars are deepest (and may be taken as siblings)
        - Replace the two least frequent chars by a new char with frequency equal to the sum of theirs
        - Now we have a tree on k chars
        - By induction, Huffman yields an optimal tree on those k chars; expanding the merged char back gives an optimal tree on k+1 chars, which is exactly what Huffman builds.
20. Lempel-Ziv
    - Input: a string of characters
    - Internal state: a dictionary of (codeword, word) pairs
    - Output: a string of codewords and characters.
    - Codewords are distinct from characters.
    - In the algorithm, w is a string, c is a character, and w+c means concatenation.
    - When adding a new word to the dictionary, a new codeword is assigned to it.
21. Lempel-Ziv Algorithm

    w = NIL;
    while ( read a character c )
    {
        if w+c exists in the dictionary
            w = w+c;
        else
        {
            output the code for w;
            add w+c to the dictionary;
            w = c;
        }
    }
    output the code for w;
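A runnable LZW-style version of this loop in Python; it assumes the dictionary is seeded with every single character that occurs (so the first match is always found), and the integer codeword numbering is illustrative:

```python
def lzw_encode(text):
    """Return (codewords, dictionary); the dictionary starts with all single characters."""
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    w, out = "", []
    for c in text:
        if w + c in dictionary:
            w = w + c                           # keep extending the current match
        else:
            out.append(dictionary[w])           # output the code for w
            dictionary[w + c] = len(dictionary) # add w+c under the next free codeword
            w = c
    if w:
        out.append(dictionary[w])               # flush the final match
    return out, dictionary

codes, dictionary = lzw_encode("ABABABA")
print(codes)   # [0, 1, 2, 4] with A=0, B=1, AB=2, BA=3, ABA=4
```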
22. Adaptive Encoding
    - Webster's has 157,000 entries: each word could be encoded in X bits
      - but that only works for this document
      - don't want to do two passes over the data
    - Adaptive Huffman
      - modify the model on the fly
    - Lempel-Ziv, 1977
    - LZW: Lempel-Ziv-Welch
      - 1984, used in compress (UNIX)
      - uses the dictionary method
      - maps a variable number of symbols to a fixed-length code
      - better with large documents: finds repetitive patterns
23. Audio Compression
    - Sounds can be represented as a vector-valued function of time
    - At any point in time, a sound is a combination of different frequencies at different strengths
    - For example, each note on a piano yields a specific frequency.
    - Also, our ears, like pianos, have cilia that respond to specific frequencies.
    - Just as sin(x) can be approximated by a small number of terms, e.g. x - x^3/6 + x^5/120 - ..., so can a sound.
    - Transforming a sound into its "spectrum" is done mathematically by a Fourier transform.
    - The spectrum can be played back, as on a computer with a sound card.
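A small sketch of the spectrum idea using NumPy's FFT: synthesize a signal from two known tones, transform it, and read the dominant frequencies back off. The sample rate and tone frequencies are arbitrary choices for illustration:

```python
import numpy as np

rate = 8000                                   # samples per second
t = np.arange(rate) / rate                    # one second of time points
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.fft.rfft(signal)                # Fourier transform: time domain -> frequency domain
freqs = np.fft.rfftfreq(len(signal), d=1/rate)

# Keeping only the strongest components is the germ of lossy audio compression.
top = np.argsort(np.abs(spectrum))[-2:]
print(freqs[top])                             # [880. 440.]: the two tones we put in
```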
24. Audio
    - Using many frequencies, as in CDs, yields a good approximation; using few frequencies, as in telephones, a poor one
    - Sampling frequencies yields compression ratios between 6:1 and 24:1, depending on the sound and the quality wanted
    - High-priced electronic pianos store and reuse "samples" of concert pianos
    - High filter: removes/reduces high frequencies (losing high frequencies is also a common problem with aging)
    - Low filter: removes/reduces low frequencies
    - Can use differential methods:
      - only report changes in the sound
25. Image Compression
    - with or without loss, mostly with
      - who cares about what the eye can't see
    - Black-and-white images can be regarded as functions from the plane (R^2) into the reals (R), as in old TVs
      - positions vary continuously, but our eyes can't see the discreteness at around 100 pixels per inch.
    - Color images can be regarded as functions from the plane into R^3, the RGB space.
      - colors vary continuously, but our eyes sample them with only 3 different receptors (R, G, B)
    - Mathematical theories yield close approximations
      - there are spatial analogues of the Fourier transform
26. Image Compression
    - faces can be done with eigenfaces
      - images can be regarded as points in R^(big)
      - choose a good basis and use only the most important vectors
      - i.e. approximate with fewer dimensions (see the sketch below)
      - JPEG, MPEG, GIF are compressed image formats
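A hedged sketch of the eigenface idea as plain principal-component analysis via the SVD; the "images" here are random placeholder data and the sizes are arbitrary:

```python
import numpy as np

# Hypothetical stack of 100 face images, each 64x64, flattened to points in R^4096.
faces = np.random.rand(100, 64 * 64)

mean = faces.mean(axis=0)
U, S, Vt = np.linalg.svd(faces - mean, full_matrices=False)

k = 20                                   # keep only the k most important directions ("eigenfaces")
coords = (faces - mean) @ Vt[:k].T       # each face is now described by just k numbers
approx = coords @ Vt[:k] + mean          # reconstruct an approximation from those k numbers

print(faces.shape, "->", coords.shape)   # (100, 4096) -> (100, 20)
```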
27. Video Compression
    - Uses the DCT (discrete cosine transform)
      - Note: nice functions can be approximated by
        - a sum of x, x^2, ... with appropriate coefficients
        - a sum of sin(x), sin(2x), ... with the right coefficients
        - almost any infinite family of basis functions
      - The DCT is good because a few terms give good results on images.
      - Differential methods are used:
        - only report changes between frames of the video (a one-dimensional DCT sketch follows)
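A one-dimensional sketch of why the DCT helps, using SciPy; JPEG and MPEG actually work on 8x8 two-dimensional blocks, and the sample values here are arbitrary:

```python
import numpy as np
from scipy.fft import dct, idct

# An 8-sample "scanline": smooth signals concentrate their energy in the low-frequency DCT terms.
block = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)

coeffs = dct(block, norm="ortho")
coeffs[4:] = 0                       # crude quantization: drop the 4 highest-frequency terms
approx = idct(coeffs, norm="ortho")

print(np.round(approx, 1))           # close to the original block, from only 4 stored coefficients
```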
28. Summary
    - Issues:
      - Context: what problem are you solving, and what is an acceptable solution?
      - Evaluation: compression ratios
      - Fidelity, if lossy
        - approximation, quantization, transforms, differential methods
      - Adaptive methods, if on-the-fly, e.g. movies, TV
      - Different sources call for different approaches
        - cartoons versus cities versus outdoor scenes
      - Code book stored separately or not
      - Fixed- or variable-length codes
