
Image compression


  1. Image Compression (Chapter 8)
     CS474/674 – Prof. Bebis
  2. Goal of Image Compression
     • Digital images require huge amounts of space for storage and large bandwidths for transmission.
       – A 640 x 480 color image requires close to 1 MB of space.
     • The goal of image compression is to reduce the amount of data required to represent a digital image.
       – Reduce storage requirements and increase transmission rates.
  3. Approaches
     • Lossless
       – Information preserving
       – Low compression ratios
     • Lossy
       – Not information preserving
       – High compression ratios
     • Trade-off: image quality vs. compression ratio
  4. Data ≠ Information
     • Data and information are not synonymous terms!
     • Data is the means by which information is conveyed.
     • Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.
  5. Data vs. Information (cont'd)
     • The same amount of information can be represented by various amounts of data, e.g.:
       Ex1: Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night.
       Ex2: Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night.
       Ex3: Helen will meet you at Logan at 6:00 pm tomorrow night.
  6. Data Redundancy
     • Compression ratio: C = n1 / n2
       where n1 and n2 are the number of information-carrying units (e.g., bits) in the original and compressed representations.
  7. Data Redundancy (cont'd)
     • Relative data redundancy: R = 1 − 1/C
     • Example: if C = 10, then R = 0.9, i.e., 90% of the data in the original representation is redundant.
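The two quantities above can be computed directly; a minimal sketch (the byte counts below are made-up illustrative values, not from the slides):

```python
# Hypothetical sizes, in bits, of the same image before and after compression.
n1 = 1_048_576   # original representation
n2 = 262_144     # compressed representation

C = n1 / n2      # compression ratio
R = 1 - 1 / C    # relative data redundancy

print(C)  # 4.0
print(R)  # 0.75 -- i.e., 75% of the original data is redundant
```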
  8. Types of Data Redundancy
     (1) Coding
     (2) Interpixel
     (3) Psychovisual
     • Compression attempts to reduce one or more of these redundancy types.
  9. Coding Redundancy
     • Code: a list of symbols (letters, numbers, bits, etc.)
     • Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).
     • Code word length: number of symbols in each code word.
  10. Coding Redundancy (cont'd)
      • N x M image; rk: k-th gray level; P(rk): probability of rk; l(rk): # of bits for rk.
      • Expected value: E(X) = Σx x · P(X = x)
      • Average code word length: Lavg = E[l(rk)] = Σk l(rk) P(rk)
  11. Coding Redundancy (cont'd)
      • l(rk) = constant length
      • Example:
  12. Coding Redundancy (cont'd)
      • l(rk) = variable length
      • Consider the probability of the gray levels.
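The fixed- vs. variable-length comparison boils down to Lavg = Σk l(rk) P(rk); a small sketch with made-up probabilities and code lengths (not the slide's table):

```python
# Hypothetical gray-level probabilities and two candidate codes.
probs = [0.4, 0.3, 0.2, 0.1]
fixed_lengths = [2, 2, 2, 2]   # 2-bit fixed-length code
var_lengths = [1, 2, 3, 3]     # variable-length code: shorter words for likelier levels

def l_avg(lengths, probs):
    """Average code word length Lavg = sum of l(rk) * P(rk)."""
    return sum(l * p for l, p in zip(lengths, probs))

print(l_avg(fixed_lengths, probs))  # 2.0 bits/pixel
print(l_avg(var_lengths, probs))    # 1.9 bits/pixel -- less coding redundancy
```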
  13. Interpixel Redundancy
      • Interpixel redundancy implies that any pixel value can be reasonably predicted by its neighbors (i.e., pixels are correlated).
      • Correlation: f(x) ∘ g(x) = ∫ f(a) g(x + a) da (integral over −∞ to ∞)
      • Autocorrelation: f(x) = g(x)
  14. Interpixel Redundancy (cont'd)
      • To reduce interpixel redundancy, the data must be transformed into another format (i.e., through a transformation).
        – e.g., thresholding, differences between adjacent pixels, DFT
      • Example: original vs. thresholded image (profile – line 100); run-length pairs take (1+10) bits/pair.
  15. Psychovisual Redundancy
      • The human eye does not respond with equal sensitivity to all visual information.
      • It is more sensitive to the lower frequencies than to the higher frequencies in the visual spectrum.
      • Idea: discard data that is perceptually insignificant!
  16. Psychovisual Redundancy (cont'd)
      • Example: quantization from 256 gray levels to 16 gray levels, C = 8/4 = 2:1.
      • Quantization artifacts can be reduced by adding to each pixel a small pseudo-random number prior to quantization.
  17. How Do We Measure Information?
      • What is the information content of a message/image?
      • What is the minimum amount of data that is sufficient to describe an image completely, without loss of information?
  18. Modeling Information
      • Information generation is assumed to be a probabilistic process.
      • Idea: associate information with probability!
      • A random event E with probability P(E) contains: I(E) = log(1/P(E)) = −log P(E) units of information.
      • Note: I(E) = 0 when P(E) = 1.
  19. How Much Information Does a Pixel Contain?
      • Suppose that gray-level values are generated by a random variable; then rk contains I(rk) = −log P(rk) units of information!
  20. How Much Information Does an Image Contain?
      • Average information content of an image (units/pixel): E[I(r)] = Σk=0..L−1 I(rk) P(rk)
      • Using I(rk) = −log P(rk), this is the entropy: H = −Σk=0..L−1 P(rk) log2 P(rk)
      • (Assumes statistically independent random events.)
  21. Redundancy (Revisited)
      • Redundancy: R = Lavg − H
        where Lavg is the average code word length and H is the entropy.
      • Note: if Lavg = H, then R = 0 (no redundancy).
  22. Entropy Estimation
      • It is not easy to estimate H reliably!
  23. Entropy Estimation (cont'd)
      • First-order estimate of H: use the relative frequencies of individual pixel values as estimates of P(rk).
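The first-order estimate can be sketched from a pixel histogram; the 8-pixel toy "image" below is an illustrative assumption with gray-level probabilities 3/8, 3/8, 1/8, 1/8:

```python
import math
from collections import Counter

def first_order_entropy(pixels):
    """First-order entropy estimate: H = -sum p * log2(p) over the pixel histogram."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy 8-pixel image with four gray levels (probabilities 3/8, 3/8, 1/8, 1/8).
pixels = [21, 21, 21, 95, 95, 95, 169, 243]
print(round(first_order_entropy(pixels), 2))  # 1.81 bits/pixel
```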
  24. Estimating Entropy (cont'd)
      • Second-order estimate of H:
        – Use relative frequencies of pixel blocks (e.g., pairs of adjacent pixels).
  25. Estimating Entropy (cont'd)
      • The first-order estimate provides only a lower bound on the compression that can be achieved.
      • Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancy!
      • Need to apply transformations!
  26. Estimating Entropy (cont'd)
      • For example, consider the differences between adjacent pixels.
  27. Estimating Entropy (cont'd)
      • The entropy of the difference image is better than before (i.e., H = 1.81 for the original image).
      • However, an even better transformation could be found.
  28. Image Compression Model
  29. Image Compression Model (cont'd)
      • Mapper: transforms input data in a way that facilitates reduction of interpixel redundancies.
  30. Image Compression Model (cont'd)
      • Quantizer: reduces the accuracy of the mapper's output in accordance with some pre-established fidelity criteria.
  31. Image Compression Model (cont'd)
      • Symbol encoder: assigns the shortest code to the most frequently occurring output values.
  32. Image Compression Model (cont'd)
      • Inverse operations are performed.
      • But… quantization is irreversible in general.
  33. Fidelity Criteria
      • How close is the reconstructed image to the original image?
      • Criteria:
        – Subjective: based on human observers
        – Objective: mathematically defined criteria
  34. Subjective Fidelity Criteria
  35. Objective Fidelity Criteria
      • Root-mean-square error (RMSE): erms = [ (1/MN) Σx Σy ( f̂(x,y) − f(x,y) )² ]^(1/2)
      • Mean-square signal-to-noise ratio (SNRms): SNRms = Σx Σy f̂(x,y)² / Σx Σy ( f̂(x,y) − f(x,y) )²
  36. Objective Fidelity Criteria (cont'd)
      • Example reconstructions: RMSE = 5.17, RMSE = 15.67, RMSE = 14.17.
  37. Lossless Compression
  38. Lossless Methods: Taxonomy
  39. Huffman Coding (coding redundancy)
      • A variable-length coding technique.
      • Optimal code (i.e., minimizes the number of code symbols per source symbol).
      • Assumption: symbols are encoded one at a time!
  40. Huffman Coding (cont'd)
      • Forward pass:
        1. Sort probabilities per symbol.
        2. Combine the lowest two probabilities.
        3. Repeat Step 2 until only two probabilities remain.
  41. Huffman Coding (cont'd)
      • Backward pass: assign code symbols going backwards.
  42. Huffman Coding (cont'd)
      • Lavg using Huffman coding
      • Lavg assuming fixed-length binary codes
  43. Huffman Coding/Decoding
      • After the code has been created, coding/decoding can be implemented using a look-up table.
      • Note that decoding is done unambiguously.
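The two passes above (merge the two least-probable symbols, then assign bits going back up) can be sketched compactly; the four symbols and probabilities below are illustrative assumptions, not the slide's table:

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code. probs maps symbol -> probability.
    Heap entries are (probability, tiebreak, partial codes), so merging the two
    lowest-probability entries and prepending a bit realizes both passes at once."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}   # hypothetical source
code = huffman_code(probs)
lavg = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(lavg)  # 1.9 bits/symbol vs. 2.0 for a fixed-length binary code
```

Because the code is prefix-free, decoding with a look-up table is unambiguous, as the slide notes.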
  44. Arithmetic (or Range) Coding (coding redundancy)
      • Source symbols are not encoded one at a time.
        – Sequences of source symbols are encoded together.
        – There is no one-to-one correspondence between source symbols and code words.
      • Slower than Huffman coding but typically achieves better compression.
  45. Arithmetic Coding (cont'd)
      • A sequence of source symbols is assigned a single arithmetic code word which corresponds to a sub-interval in [0, 1).
      • As the number of symbols in the message increases, the interval used to represent it becomes smaller.
      • Smaller intervals require more information units (i.e., bits) to be represented.
  46. Arithmetic Coding (cont'd)
      • Encode message: a1 a2 a3 a3 a4
        1) Assume the message occupies [0, 1).
        2) Subdivide [0, 1) based on the probability of each symbol ai.
        3) Update the interval by processing source symbols.
  47. Example
      • Encode a1 a2 a3 a3 a4 → final interval [0.06752, 0.0688), or 0.068.
  48. Example (cont'd)
      • The message a1 a2 a3 a3 a4 is encoded using 3 decimal digits, or 3/5 = 0.6 decimal digits per source symbol.
      • The entropy of this message is: −(3 × 0.2 log10(0.2) + 0.4 log10(0.4)) = 0.5786 digits/symbol
      • Note: finite-precision arithmetic might cause problems due to truncation!
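The interval-narrowing procedure can be sketched directly. Assuming symbol probabilities P(a1) = P(a2) = P(a4) = 0.2 and P(a3) = 0.4 (an assumption, but one that reproduces the final interval [0.06752, 0.0688) quoted in the example):

```python
def arithmetic_encode(message, intervals):
    """Narrow [0, 1) by each symbol's sub-interval; any number in the
    final interval encodes the whole message."""
    low, high = 0.0, 1.0
    for sym in message:
        s_low, s_high = intervals[sym]
        span = high - low
        low, high = low + span * s_low, low + span * s_high
    return low, high

# Cumulative sub-intervals of [0, 1) implied by the assumed probabilities.
intervals = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}
low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], intervals)
print(low, high)  # ~0.06752 ~0.0688 -> e.g. 0.068 encodes the message
```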
  49. Arithmetic Decoding
      • Decode 0.572 → a3 a1 a2 a4
  50. LZW Coding (interpixel redundancy)
      • Requires no a priori knowledge of the pixel probability distribution.
      • Assigns fixed-length code words to variable-length sequences.
      • Patented algorithm (US 4,558,302).
      • Included in the GIF, TIFF, and PDF file formats.
  51. LZW Coding (cont'd)
      • A codebook (or dictionary) needs to be constructed.
      • Initially, the first 256 entries of the dictionary are assigned to the gray levels 0, 1, 2, …, 255 (i.e., assuming 8 bits/pixel).
      • Consider a 4x4, 8-bit image:
        39 39 126 126
        39 39 126 126
        39 39 126 126
        39 39 126 126
      • Initial dictionary: locations 0–255 hold entries 0–255; locations 256–511 are empty.
  52. LZW Coding (cont'd)
      • As the encoder examines image pixels, gray-level sequences (i.e., blocks) that are not in the dictionary are assigned to a new entry.
        – Is 39 in the dictionary? …… Yes
        – What about 39-39? …… No
        – Then add 39-39 at entry 256.
  53. Example
      • CR: currently recognized sequence (initially empty); P: next pixel; concatenated sequence CS = CR + P.
      • If CS is found in the dictionary D:
        (1) No output
        (2) CR = CS
      • Else:
        (1) Output D(CR)
        (2) Add CS to D
        (3) CR = P
  54. Decoding LZW
      • The dictionary which was used for encoding need not be sent with the image.
      • It can be built on the fly by the decoder as it reads the received code words.
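The CS = CR + P loop from the example can be sketched as follows, run on the 4x4 image above:

```python
def lzw_encode(pixels, dict_size=256):
    """LZW encoding: grow the dictionary with unseen sequences, emit the
    code of the longest currently recognized sequence (CR)."""
    dictionary = {(i,): i for i in range(dict_size)}   # gray levels 0..255
    next_code = dict_size
    output, current = [], ()                           # current = CR
    for p in pixels:
        candidate = current + (p,)                     # CS = CR + P
        if candidate in dictionary:
            current = candidate                        # CR = CS, no output
        else:
            output.append(dictionary[current])         # output D(CR)
            dictionary[candidate] = next_code          # add CS to D
            next_code += 1
            current = (p,)                             # CR = P
    if current:
        output.append(dictionary[current])             # flush the last CR
    return output

pixels = [39, 39, 126, 126] * 4   # the 4x4 image, row by row
print(lzw_encode(pixels))
# [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```

Sixteen pixels compress to ten code words because repeated pairs like 39-39 and 126-126 are replaced by single dictionary indices.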
  55. Differential Pulse Code Modulation (DPCM) Coding (interpixel redundancy)
      • A predictive coding approach.
      • Each pixel value (except at the boundaries) is predicted based on its neighbors (e.g., via a linear combination) to get a predicted image.
      • The difference between the original and predicted images yields a differential or residual image.
        – i.e., it has a much smaller dynamic range of pixel values.
      • The differential image is encoded using Huffman coding.
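A minimal sketch of the predictive step, using the simplest predictor (each pixel predicted by its left neighbor; the pixel row is a made-up example):

```python
def dpcm_residual(row):
    """Predict each pixel from its left neighbor; the first pixel is sent as-is."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def dpcm_reconstruct(residual):
    """Invert the prediction by accumulating the differences."""
    out = [residual[0]]
    for d in residual[1:]:
        out.append(out[-1] + d)
    return out

row = [100, 102, 104, 104, 103, 101]      # hypothetical scanline
res = dpcm_residual(row)
print(res)  # [100, 2, 2, 0, -1, -2] -- much smaller dynamic range
```

The residuals cluster near zero, which is exactly what makes the subsequent Huffman stage effective.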
  56. Run-Length Coding (RLC) (interpixel redundancy)
      • Used to reduce the size of a repeating string of characters (i.e., runs):
        1 1 1 1 1 0 0 0 0 0 0 1 → (1,5) (0,6) (1,1)
        a a a b b b b b b c c → (a,3) (b,6) (c,2)
      • Encodes a run of symbols into two bytes: (symbol, count).
      • Can compress any type of data but cannot achieve high compression ratios compared to other compression methods.
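The two slide examples can be reproduced with a short sketch:

```python
def run_length_encode(seq):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    runs = []
    for s in seq:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]

print(run_length_encode([1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]))
# [(1, 5), (0, 6), (1, 1)]
print(run_length_encode(list("aaabbbbbbcc")))
# [('a', 3), ('b', 6), ('c', 2)]
```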
  57. Bit-Plane Coding (interpixel redundancy)
      • An effective technique to reduce interpixel redundancy is to process each bit plane individually.
        (1) Decompose an image into a series of binary images.
        (2) Compress each binary image (e.g., using run-length coding).
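Step (1), the decomposition into binary images, can be sketched as (the two sample pixels are arbitrary):

```python
def bit_planes(pixels, bits=8):
    """Split pixels into `bits` binary images; plane 0 is the least significant bit."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits)]

pixels = [0b10110100, 0b01101100]   # two hypothetical 8-bit pixels
planes = bit_planes(pixels)
print(planes[7])  # most significant plane: [1, 0]
```

Each of the eight planes is a binary image that can then be handed to a run-length coder, as in step (2).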
  58. Combining Huffman Coding with Run-Length Coding
      • Assuming that a message has been encoded using Huffman coding, additional compression can be achieved using run-length coding.
        e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)
  59. Lossy Compression
      • Transform the image into a domain where compression can be performed more efficiently (i.e., reduce interpixel redundancies).
      • An N x N image divided into n x n blocks yields ~(N/n)² subimages.
  60. Example: Fourier Transform
      • The magnitude of the FT decreases as u, v increase!
      • Keep only a K x K block of low-frequency coefficients (indices 0 to K−1), with K << N.
  61. Transform Selection
      • T(u,v) can be computed using various transformations, for example:
        – DFT
        – DCT (Discrete Cosine Transform)
        – KLT (Karhunen-Loève Transform)
  62. DCT
      • Forward: T(u,v) = α(u) α(v) Σx=0..n−1 Σy=0..n−1 f(x,y) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]
      • Inverse: f(x,y) = Σu=0..n−1 Σv=0..n−1 α(u) α(v) T(u,v) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]
      • α(u) = sqrt(1/n) if u = 0, sqrt(2/n) if u > 0 (and similarly for α(v)).
  63. DCT (cont'd)
      • Basis set of functions for a 4x4 image (i.e., cosines of different frequencies).
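The standard 2-D DCT (with α(0) = sqrt(1/n), α(u>0) = sqrt(2/n)) can be sketched directly from the summation; the flat 4x4 block is a toy input showing that a constant block puts all its energy in the DC term:

```python
import math

def dct2(block):
    """Direct (O(n^4)) 2-D DCT of an n x n block."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

block = [[128] * 4 for _ in range(4)]   # flat (homogeneous) 4x4 block
coeffs = dct2(block)
print(round(coeffs[0][0], 3))  # 512.0 -- all energy in the DC coefficient
```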
  64. DCT (cont'd)
      • Comparison on 8 x 8 subimages (64 coefficients per subimage, 50% of the coefficients truncated):
        RMS error — DFT: 2.32, WHT: 1.78, DCT: 1.13
  65. DCT (cont'd)
      • The DCT minimizes "blocking artifacts" (i.e., boundaries between subimages do not become very visible).
        – DFT: n-point periodicity gives rise to discontinuities!
        – DCT: 2n-point periodicity prevents discontinuities!
  66. DCT (cont'd)
      • Subimage size selection: compare the original with 2 x 2, 4 x 4, and 8 x 8 subimages.
  67. JPEG Compression
      • JPEG is an image compression standard which was accepted as an international standard in 1992.
      • Developed by the Joint Photographic Experts Group of the ISO/IEC for coding and compression of color/gray-scale images.
      • Yields acceptable compression in the 10:1 range.
      • A scheme for video compression based on JPEG, called Motion JPEG (MJPEG), exists.
  68. JPEG Compression (cont'd)
      • JPEG uses the DCT for handling interpixel redundancy.
      • Modes of operation:
        (1) Sequential DCT-based encoding
        (2) Progressive DCT-based encoding
        (3) Lossless encoding
        (4) Hierarchical encoding
  69. JPEG Compression (Sequential DCT-based Encoding)
      • Block diagram: entropy encoder on the compression side, entropy decoder on the decompression side.
  70. JPEG Steps
      1. Divide the image into 8x8 subimages. For each subimage do:
      2. Shift the gray levels into the range [−128, 127].
         – The DCT requires the range to be centered around 0.
      3. Apply the DCT (i.e., 64 coefficients):
         – 1 DC coefficient: F(0,0)
         – 63 AC coefficients: F(u,v)
  71. Example
      • (i.e., non-centered spectrum)
  72. JPEG Steps (cont'd)
      4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot):
         C(u,v) = round[ F(u,v) / Q(u,v) ], where Q(u,v) is the quantization table.
  73. Example
      • Quantization table Q[i][j]
  74. Example (cont'd)
      • Quantization
  75. JPEG Steps (cont'd)
      5. Order the coefficients using zig-zag ordering.
         – Places non-zero (low-frequency) coefficients first.
         – Creates long runs of zeros (i.e., good for run-length encoding).
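The zig-zag traversal visits the block diagonal by diagonal, alternating direction; a compact sketch of the index order:

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in JPEG zig-zag order.
    Cells are grouped by diagonal (row + col); odd diagonals run top-right to
    bottom-left (sorted by row), even diagonals the opposite way (sorted by col)."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

zz = zigzag_order(8)
print(zz[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Reading quantized coefficients in this order front-loads the low-frequency terms and pushes the (mostly zero) high-frequency terms into one long tail.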
  76. Example
  77. JPEG Steps (cont'd)
      6. Form an intermediate symbol sequence and encode the coefficients:
         6.1 DC coefficients: predictive encoding
         6.2 AC coefficients: variable-length coding
  78. Intermediate Coding
      • DC: symbol_1 (SIZE), symbol_2 (AMPLITUDE) — e.g., DC (6) (61)
      • AC: symbol_1 (RUN-LENGTH, SIZE), symbol_2 (AMPLITUDE) — e.g., AC (0,2) (−3), plus an end-of-block symbol.
      • SIZE: # of bits for encoding the amplitude; RUN-LENGTH: run of zeros preceding the coefficient.
  79. DC/AC Symbol Encoding
      • DC encoding: symbol_1 (SIZE), symbol_2 (AMPLITUDE); predictive coding, amplitude range [−2048, 2047] = [−2^11, 2^11−1], 1 ≤ SIZE ≤ 11.
      • AC encoding: amplitude range [−2^10, 2^10−1], 1 ≤ SIZE ≤ 10.
        – If RUN-LENGTH > 15, use symbol (15,0), i.e., RUN-LENGTH = 16.
        – Example: 0 0 0 0 0 0 476 → (6,9)(476)
  80. Entropy Encoding (e.g., variable length)
  81. Entropy Encoding (e.g., Huffman)
      • Symbol_1 → Variable-Length Code (VLC); Symbol_2 → Variable-Length Integer (VLI).
      • Example: (1,4)(12) → (111110110 1100), where 111110110 is the VLC and 1100 the VLI.
  82. Effect of "Quality"
      • Quality 90 (58 KB): best quality, lowest compression.
      • Quality 50 (21 KB).
      • Quality 10 (8 KB): worst quality, highest compression.
  83. Effect of "Quality" (cont'd)
  84. Example 1: homogeneous 8 x 8 block
  85. Example 1 (cont'd): quantized / de-quantized
  86. Example 1 (cont'd): reconstructed / error
  87. Example 2: less homogeneous 8 x 8 block
  88. Example 2 (cont'd): quantized / de-quantized
  89. Example 2 (cont'd): reconstructed (spatial) / error
  90. JPEG for Color Images
      • Could apply JPEG to the R/G/B components separately.
      • It is more efficient to describe a color in terms of its luminance and chrominance content separately, to enable more efficient processing.
        – e.g., YUV
      • Chrominance can be subsampled because human vision is less sensitive to it.
  91. JPEG for Color Images (cont'd)
      • Luminance: perceived brightness of the light (proportional to the total energy in the visible band).
      • Chrominance: describes the perceived color tone of the light (depends on the wavelength composition of the light).
        – Hue: specifies the color tone (i.e., depends on the peak wavelength of the light).
        – Saturation: describes how pure the color is (i.e., depends on the spread or bandwidth of the light spectrum).
  92. YUV Color Space
      • Y is the luminance component; Cb and Cr are the chrominance components.
      • The values in the YUV coordinates are related to the values in the RGB coordinates by:
        Y  =  0.299 R + 0.587 G + 0.114 B
        Cb = −0.169 R − 0.334 G + 0.500 B + 128
        Cr =  0.500 R − 0.419 G − 0.081 B + 128
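The matrix above applied per pixel is a one-liner per channel; a minimal sketch using the slide's (approximate) coefficients:

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr using the coefficients from the slide's matrix."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.334 * g + 0.500 * b + 128
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr

y, cb, cr = rgb_to_ycbcr(255, 255, 255)   # white
print(y, cb, cr)  # Y = 255.0; Cb and Cr near the neutral value 128
```

Note the Cb/Cr rows sum to roughly zero, so neutral grays map to chrominance values near 128; the small residue comes from the rounded coefficients.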
  93. JPEG for Color Images (cont'd)
      • Encoder and decoder block diagrams.
  94. JPEG Modes
      • JPEG supports several different modes:
        – Sequential mode
        – Progressive mode
        – Hierarchical mode
        – Lossless mode
      • Sequential is the default mode:
        – Each image component is encoded in a single left-to-right, top-to-bottom scan.
        – This is the mode we have been describing.
  95. Progressive JPEG
      • The image is encoded in multiple scans, in order to produce a quick, rough decoded image when the transmission time is long (compare sequential vs. progressive decoding).
  96. Progressive JPEG (cont'd)
      • Send DCT coefficients in multiple scans:
        (1) Progressive spectral selection algorithm
        (2) Progressive successive approximation algorithm
        (3) Hybrid progressive algorithm
  97. Progressive JPEG (cont'd)
      (1) Progressive spectral selection algorithm:
        – Group DCT coefficients into several spectral bands.
        – Send low-frequency DCT coefficients first.
        – Send higher-frequency DCT coefficients next.
  98. Progressive JPEG (cont'd)
      (2) Progressive successive approximation algorithm:
        – Send all DCT coefficients but with lower precision.
        – Refine DCT coefficients in later scans.
  99. Progressive JPEG (cont'd)
      (3) Hybrid progressive algorithm:
        – Combines spectral selection and successive approximation.
 100. Results using spectral selection
 101. Results using successive approximation
 102. Example using successive approximation: after 0.9 s, 1.6 s, 3.6 s, and 7.0 s.
 103. Hierarchical JPEG
      • Hierarchical mode encodes the image at several different resolutions.
      • The image is transmitted in multiple passes with increased resolution at each pass.
 104. Hierarchical JPEG (cont'd)
      • Pyramid of resolutions: N x N (f), N/2 x N/2 (f2), N/4 x N/4 (f4).
 105. Hierarchical JPEG (cont'd)
 106. Hierarchical JPEG (cont'd)
 107. Lossless JPEG
      • Uses predictive coding (see later).
 108. Lossy Methods: Taxonomy
 109. Lossless Differential Pulse Code Modulation (DPCM) Coding
      • Each pixel value (except at the boundaries) is predicted based on certain neighbors (e.g., via a linear combination) to get a predicted image.
      • The difference between the original and predicted images yields a differential or residual image.
      • Encode the differential image using Huffman coding.
      • Block diagram: input xm → predictor output pm; the difference dm is sent to the entropy encoder.
 110. Lossy Differential Pulse Code Modulation (DPCM) Coding
      • Similar to lossless DPCM except that (i) it uses quantization and (ii) the pixels are predicted from the "reconstructed" values of certain neighbors.
 111. Block Truncation Coding
      • Divide the image into non-overlapping blocks of pixels.
      • Derive a bitmap (0/1) for each block using thresholding.
        – e.g., use the mean pixel value in each block as the threshold.
      • For each group of 1s and 0s, determine a reconstruction value.
        – e.g., the average of the corresponding pixel values in the original block.
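The three steps above, on one block, can be sketched as follows (the 4-pixel block is a made-up example):

```python
def btc_block(block):
    """Block truncation: threshold at the block mean, then reconstruct each
    group (0s and 1s) by its own mean."""
    mean = sum(block) / len(block)
    bitmap = [1 if p >= mean else 0 for p in block]
    ones  = [p for p, b in zip(block, bitmap) if b == 1]
    zeros = [p for p, b in zip(block, bitmap) if b == 0]
    hi = sum(ones) / len(ones)
    lo = sum(zeros) / len(zeros) if zeros else hi
    return bitmap, lo, hi

block = [10, 12, 200, 202]          # hypothetical pixel block
bitmap, lo, hi = btc_block(block)
print(bitmap, lo, hi)  # [0, 0, 1, 1] 11.0 201.0
```

The block is then stored as just the bitmap plus the two reconstruction values.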
 112. Subband Coding
      • Analyze the image to produce components containing frequencies in well-defined bands (i.e., subbands).
        – e.g., using the wavelet transform.
      • Optimize quantization/coding in each subband.
 113. Vector Quantization
      • Develop a dictionary of fixed-size vectors (i.e., code vectors), usually blocks of pixel values.
      • Partition the image into non-overlapping blocks (i.e., image vectors).
      • Encode each image vector by the index of its closest code vector.
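The nearest-code-vector step can be sketched with squared-error distance (the tiny codebook and image vectors below are illustrative assumptions):

```python
def vq_encode(vectors, codebook):
    """Map each image vector to the index of its nearest code vector
    (nearest in squared-error distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda i: dist2(v, codebook[i]))
            for v in vectors]

# Hypothetical codebook of 2x2 blocks (flattened) and two image vectors.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
vectors = [(10, 5, 0, 12), (250, 251, 240, 255)]
print(vq_encode(vectors, codebook))  # [0, 1]
```

Each block is thus replaced by a single codebook index, which is where the compression comes from.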
 114. Fractal Coding
      • What is a "fractal"?
        – A rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole.
      • Idea: store images as collections of transformations!
 115. Fractal Coding (cont'd)
      • Example image generated by 4 affine transformations!
 116. Fractal Coding (cont'd)
      • Decompose the image into segments (i.e., using standard segmentation techniques based on edges, color, texture, etc.) and look them up in a library of IFS codes.
      • Best suited for textures and natural images.
 117. Fingerprint Compression
      • An image coding standard for digitized fingerprints, developed and maintained by:
        – FBI
        – Los Alamos National Lab (LANL)
        – National Institute of Standards and Technology (NIST)
      • The standard employs a discrete wavelet transform-based algorithm (Wavelet/Scalar Quantization, or WSQ).
 118. Memory Requirements
      • The FBI digitizes fingerprints at 500 dots per inch with 8 bits of gray-scale resolution.
      • A single fingerprint card turns into about 10 MB of data!
      • A sample fingerprint image: 768 x 768 pixels = 589,824 bytes.
 119. Preserving Fingerprint Details
      • The "white" spots in the middle of the black ridges are sweat pores.
      • They are admissible points of identification in court, as are the little black flesh "islands" in the grooves between the ridges.
      • These details are just a couple of pixels wide!
 120. What Compression Scheme Should Be Used?
      • Better to use a lossless method, to preserve every pixel perfectly.
      • Unfortunately, in practice lossless methods haven't done better than 2:1 on fingerprints!
      • Would JPEG work well for fingerprint compression?
 121. Results Using JPEG Compression
      • File size 45,853 bytes; compression ratio 12.9.
      • The fine details are pretty much history, and the whole image has an artificial "blocky" pattern superimposed on it.
      • The blocking artifacts affect the performance of manual and automated systems!
 122. Results Using WSQ Compression
      • File size 45,621 bytes; compression ratio 12.9.
      • The fine details are preserved better than they are with JPEG.
      • NO blocking artifacts!
 123. WSQ Algorithm
 124. Varying the Compression Ratio
      • The FBI's target bit rate is around 0.75 bits per pixel (bpp).
        – i.e., a target compression ratio of 10.7 (assuming 8-bit images).
      • This target bit rate is set via a "knob" on the WSQ algorithm.
        – i.e., similar to the "quality" parameter in many JPEG implementations.
      • Fingerprints coded with WSQ at a target of 0.75 bpp will actually come in at around 15:1.
 125. Varying the Compression Ratio (cont'd)
      • Original image: 768 x 768 pixels (589,824 bytes).
 126. Varying the Compression Ratio (cont'd)
      • 0.9 bpp — WSQ: 47,619 bytes, ratio 12.4; JPEG: 49,658 bytes, ratio 11.9.
 127. Varying the Compression Ratio (cont'd)
      • 0.75 bpp — WSQ: 39,270 bytes, ratio 15.0; JPEG: 40,780 bytes, ratio 14.5.
 128. Varying the Compression Ratio (cont'd)
      • 0.6 bpp — WSQ: 30,987 bytes, ratio 19.0; JPEG: 30,081 bytes, ratio 19.6.