Image Compression (Chapter 8)CS474/674 – Prof. Bebis
Goal of Image Compression• Digital images require huge amounts of space forstorage and large bandwidths for transmission.– A 640 x 480 color image requires close to 1MB of space.• The goal of image compression is to reduce the amountof data required to represent a digital image.– Reduce storage requirements and increase transmission rates.
Approaches• Lossless– Information preserving– Low compression ratios• Lossy– Not information preserving– High compression ratios• Trade-off: image quality vs compression ratio
Data ≠ Information• Data and information are not synonymous terms!• Data is the means by which information is conveyed.• Data compression aims to reduce the amount of datarequired to represent a given quantity of informationwhile preserving as much information as possible.
Data vs Information (cont’d)• The same amount of information can be representedby various amount of data, e.g.:Your wife, Helen, will meet you at Logan Airportin Boston at 5 minutes past 6:00 pm tomorrownightYour wife will meet you at Logan Airport at 5minutes past 6:00 pm tomorrow nightHelen will meet you at Logan at 6:00 pmtomorrow nightEx1:Ex2:Ex3:
Data Redundancy (cont’d)• Relative data redundancy:Example:
Types of Data Redundancy(1) Coding(2) Interpixel(3) Psychovisual• Compression attempts to reduce one or more of theseredundancy types.
Coding Redundancy• Code: a list of symbols (letters, numbers, bits etc.)• Code word: a sequence of symbols used to represent apiece of information or an event (e.g., gray levels).• Code word length: number of symbols in each codeword
Coding Redundancy (cont’d)N x M imagerk: k-th gray levelP(rk): probability of rkl(rk): # of bits for rk( ) ( )xE X xP X x= =∑Expected value:
Coding Redundancy (cont’d)• l(rk) = variable length• Consider the probability of the gray levels:variable length
Interpixel redundancy• Interpixel redundancy implies that any pixel value can bereasonably predicted by its neighbors (i.e., correlated).( ) ( ) ( ) ( )f x o g x f x g x a da∞−∞= +∫autocorrelation: f(x)=g(x)
Interpixel redundancy (cont’d)• To reduce interpixel redundnacy, the data must betransformed in another format (i.e., through a transformation)– e.g., thresholding, differences between adjacent pixels, DFT• Example:originalthresholded(profile – line 100)threshold(1+10) bits/pair
Psychovisual redundancy• The human eye does not respond with equal sensitivity toall visual information.• It is more sensitive to the lower frequencies than to thehigher frequencies in the visual spectrum.• Idea: discard data that is perceptually insignificant!
Psychovisual redundancy (cont’d)256 gray levels 16 gray levels16 gray levelsC=8/4 = 2:1i.e., add to each pixel asmall pseudo-random numberprior to quantizationExample: quantization
How do we measure information?• What is the information content of a message/image?• What is the minimum amount of data that is sufficientto describe completely an image without loss ofinformation?
Modeling Information• Information generation is assumed to be aprobabilistic process.• Idea: associate information with probability!Note: I(E)=0 when P(E)=1A random event E with probability P(E) contains:
How much information does a pixel contain?• Suppose that gray level values are generated by arandom variable, then rk contains:units of information!
• Average information content of an image:units/pixel10( )Pr( )Lk kkE I r r−== ∑usingHow much information does an image contain?(assumes statistically independent random events)Entropy
• Redundancy:Redundancy (revisited)where:Note: of Lavg= H, the R=0 (no redundancy)
Entropy Estimation• It is not easy to estimate H reliably!image
Entropy Estimation (cont’d)• First order estimate of H:
Estimating Entropy (cont’d)• Second order estimate of H:– Use relative frequencies of pixel blocks :image
Estimating Entropy (cont’d)• The first-order estimate provides only a lower-bound on the compression that can be achieved.• Differences between higher-order estimates ofentropy and the first-order estimate indicate thepresence of interpixel redundancy!Need to apply transformations!
Estimating Entropy (cont’d)• For example, consider differences:16
Estimating Entropy (cont’d)• Entropy of difference image:• However, a better transformation could be found since:• Better than before (i.e., H=1.81 for original image)
Huffman Coding/Decoding• After the code has been created, coding/decoding canbe implemented using a look-up table.• Note that decoding is done unambiguously.
Arithmetic (or Range) Coding(coding redundancy)• No assumption on encode source symbols one at atime.– Sequences of source symbols are encoded together.– There is no one-to-one correspondence between sourcesymbols and code words.• Slower than Huffman coding but typically achievesbetter compression.
Arithmetic Coding (cont’d)• A sequence of source symbols is assigned a singlearithmetic code word which corresponds to a sub-interval in [0,1].• As the number of symbols in the message increases,the interval used to represent it becomes smaller.• Smaller intervals require more information units (i.e.,bits) to be represented.
Arithmetic Coding (cont’d)Encode message: a1 a2 a3 a3 a40 11) Assume message occupies [0, 1)2) Subdivide [0, 1) based on the probability of αi3) Update interval by processing source symbols
Example• The message a1 a2 a3 a3 a4 is encoded using 3 decimaldigits or 3/5 = 0.6 decimal digits per source symbol.• The entropy of this message is:Note: finite precision arithmetic might cause problemsdue to truncations!-(3 x 0.2log10(0.2)+0.4log10(0.4))=0.5786 digits/symbol
LZW Coding (interpixel redundancy)• Requires no priori knowledge of pixel probabilitydistribution values.• Assigns fixed length code words to variable lengthsequences.• Patented Algorithm US 4,558,302• Included in GIF and TIFF and PDF file formats
LZW Coding• A codebook (or dictionary) needs to be constructed.• Initially, the first 256 entries of the dictionary are assignedto the gray levels 0,1,2,..,255 (i.e., assuming 8 bits/pixel)Consider a 4x4, 8 bit image39 39 126 12639 39 126 12639 39 126 12639 39 126 126Dictionary Location Entry0 01 1. .255 255256 -511 -Initial Dictionary
LZW Coding (cont’d)39 39 126 12639 39 126 12639 39 126 12639 39 126 126- Is 39 in the dictionary……..Yes- What about 39-39………….No- Then add 39-39 in entry 256Dictionary Location Entry0 01 1. .255 255256 -511 -39-39As the encoder examines imagepixels, gray level sequences(i.e., blocks) that are not in thedictionary are assigned to a newentry.
Decoding LZW• The dictionary which was used for encoding need notbe sent with the image.• Can be built on the “fly” by the decoder as it reads thereceived code words.
Differential Pulse Code Modulation (DPCM)Coding (interpixel redundancy)• A predictive coding approach.• Each pixel value (except at the boundaries) is predictedbased on its neighbors (e.g., linear combination) to get apredicted image.• The difference between the original and predictedimages yields a differential or residual image.– i.e., has much less dynamic range of pixel values.• The differential image is encoded using Huffmancoding.
Run-length coding (RLC)(interpixel redundancy)• Used to reduce the size of a repeating string of characters(i.e., runs):1 1 1 1 1 0 0 0 0 0 0 1 (1,5) (0, 6) (1, 1)a a a b b b b b b c c (a,3) (b, 6) (c, 2)• Encodes a run of symbols into two bytes: (symbol, count)• Can compress any type of data but cannot achieve highcompression ratios compared to other compression methods.
Bit-plane coding (interpixel redundancy)• An effective technique to reduce inter pixel redundancyis to process each bit plane individually.(1) Decompose an image into a series of binary images.(2) Compress each binary image (e.g., using run-lengthcoding)
Combining Huffman Codingwith Run-length Coding• Assuming that a message has been encoded usingHuffman coding, additional compression can beachieved using run-length coding.e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)
Lossy Compression• Transform the image into a domain where compressioncan be performed more efficiently (i.e., reduce interpixelredundancies).~ (N/n)2subimages
Example: Fourier TransformThe magnitude of theFT decreases, as u, vincrease!K-1 K-1K << N
Transform Selection• T(u,v) can be computed using varioustransformations, for example:– DFT– DCT (Discrete Cosine Transform)– KLT (Karhunen-Loeve Transformation)
DCT (cont’d)• Basis set of functions for a 4x4 image (i.e.,cosines ofdifferent frequencies).
DCT (cont’d)DFT WHT DCTRMS error: 2.32 1.78 1.138 x 8 subimages64 coefficientsper subimage50% of thecoefficientstruncated
DCT (cont’d)• DCT minimizes "blocking artifacts" (i.e., boundariesbetween subimages do not become very visible).DFTi.e., n-point periodicitygives rise todiscontinuities!DCTi.e., 2n-point periodicitypreventsdiscontinuities!
DCT (cont’d)• Subimage size selection:2 x 2 subimagesoriginal 4 x 4 subimages 8 x 8 subimages
JPEG Compression• JPEG is an image compression standard which wasaccepted as an international standard in 1992.• Developed by the Joint Photographic Expert Group ofthe ISO/IEC for coding and compression of color/grayscale images.• Yields acceptable compression in the 10:1 range.• A scheme for video compression based on JPEGcalled Motion JPEG (MJPEG) exists
JPEG Steps1. Divide the image into 8x8 subimages;For each subimage do:2. Shift the gray-levels in the range [-128, 127]- DCT requires range be centered around 03. Apply DCT (i.e., 64 coefficients)1 DC coefficient: F(0,0)63 AC coefficients: F(u,v)
Example 2 (cont’d)Reconstructed – spatial Error
JPEG for Color Images• Could apply JPEG on R/G/B components .• It is more efficient to describe a color in terms of itsluminance and chrominance content separately, to enablemore efficient processing.– YUV• Chrominance can be subsampled due to human visioninsensitivity
JPEG for Color Images• Luminance: Received brightness of the light(proportional to the total energy in the visible band).• Chrominance: Describe the perceived color tone of thelight (depends on the wavelength composition of light– Hue: Specify the color tone (i.e., depends on the peakwavelength of the light).– Saturation: Describe how pure the color is (i.e., depends onthe spread or bandwidth of the light spectrum).
YUV Color Space• YUV color space– Y is the components of luminance– Cb and Cr are the components of chrominance– The values in the YUV coordinate are related to the valuesin the RGB coordinate by:0.299 0.587 0.114 00.169 0.334 0.500 1280.500 0.419 0.081 128Y RCb GCr B ÷ ÷ ÷ ÷= − − + ÷ ÷ ÷ ÷ ÷ ÷ ÷ ÷− −
JPEG Modes• JPEG supports several different modes– Sequential Mode– Progressive Mode– Hierarchical Mode– Lossless Mode• Sequential is the default mode– Each image component is encoded in a single left-to-right,top-to-bottom scan.– This is the mode we have been describing.
Progressive JPEG• The image is encoded in multiple scans, in order toproduce a quick, rough decoded image whentransmission time is long.SequentialProgressive
Lossless Differential Pulse CodeModulation (DPCM) Coding• Each pixel value (except at the boundaries) is predictedbased on certain neighbors (e.g., linear combination) toget a predicted image.• The difference between the original and predictedimages yields a differential or residual image.• Encode differential image using Huffman coding.xmPredictorEntropyEncoderpmdm
Lossy Differential Pulse CodeModulation (DPCM) Coding• Similar to lossless DPCM except that (i) it usesquantization and (ii) the pixels are predicted from the“reconstructed” values of certain neighbors.
Block Truncation Coding• Divide image in non-overlapping blocks of pixels.• Derive a bitmap (0/1) for each block using thresholding.– e.g., use mean pixel value in each block as threshold.• For each group of 1s and 0s, determine reconstructionvalue– e.g., average of corresponding pixel values in original block.
Subband Coding• Analyze image to produce components containingfrequencies in well defined bands (i.e., subbands)– e.g., use wavelet transform.• Optimize quantization/coding in each subband.
Vector Quantization• Develop a dictionary of fixed-size vectors (i.e., codevectors), usually blocks of pixel values.• Partition image in non-overlapping blocks (i.e., imagevectors).• Encode each image vector by the index of its closestcode vector.
Fractal Coding• What is a “fractal”?– A rough or fragmented geometric shape that can be split intoparts, each of which is (at least approximately) a reduced-sizecopy of the whole.Idea: store images as collections of transformations!
Fractal Coding (cont’d)Generated by 4 affinetransformations!
Fractal Coding (cont’d)• Decompose image into segments (i.e., using standardsegmentations techniques based on edges, color, texture,etc.) and look them up in a library of IFS codes.• Best suited for textures and natural images.
Fingerprint Compression• An image coding standard for digitized fingerprints,developed and maintained by:– FBI– Los Alamos National Lab (LANL)– National Institute for Standards and Technology (NIST).• The standard employs a discrete wavelet transform-based algorithm (Wavelet/Scalar Quantization orWSQ).
Memory Requirements• FBI is digitizing fingerprints at 500 dots per inch with8 bits of grayscale resolution.• A single fingerprint card turns into about 10 MB ofdata!A sample fingerprint image768 x 768 pixels =589,824 bytes
Preserving Fingerprint DetailsThe "white" spots in the middle ofthe black ridges are sweat poresThey’re admissible points ofidentification in court, as are the littleblack flesh ‘‘islands’’ in the groovesbetween the ridgesThese details are just a couple pixels wide!
What compression scheme should be used?• Better use a lossless method to preserve everypixel perfectly.• Unfortunately, in practice lossless methodshaven’t done better than 2:1 on fingerprints!• Would JPEG work well for fingerprintcompression?
Results using JPEG compressionfile size 45853 bytescompression ratio: 12.9The fine details are pretty muchhistory, and the whole image hasthis artificial ‘‘blocky’’ patternsuperimposed on it.The blocking artifacts affect theperformance of manual orautomated systems!
Results using WSQ compressionfile size 45621 bytescompression ratio: 12.9The fine details are preservedbetter than they are with JPEG.NO blocking artifacts!
Varying compression ratio• FBI’s target bit rate is around 0.75 bits per pixel (bpp)– i.e., corresponds to a target compression ratio of 10.7(assuming 8-bit images)• This target bit rate is set via a ‘‘knob’’ on the WSQalgorithm.– i.e., similar to the "quality" parameter in many JPEGimplementations.• Fingerprints coded with WSQ at a target of 0.75 bppwill actually come in around 15:1
Varying compression ratio (cont’d)Original image 768 x 768 pixels (589824 bytes)
Varying compression ratio (cont’d)0.9 bpp compressionWSQ image, file size 47619 bytes,compression ratio 12.4JPEG image, file size 49658 bytes,compression ratio 11.9
Varying compression ratio (cont’d)0.75 bpp compressionWSQ image, file size 39270 bytescompression ratio 15.0JPEG image, file size 40780 bytes,compression ratio 14.5
Varying compression ratio (cont’d)0.6 bpp compressionWSQ image, file size 30987 bytes,compression ratio 19.0JPEG image, file size 30081 bytes,compression ratio 19.6