Lossy: Presentation Transcript

  • Lossy Compression, CIS 658, Fall 2005
  • Lossy Compression
    • In order to achieve higher rates of compression, we give up complete reconstruction and consider lossy compression techniques
    • So we need a way to measure how good the compression technique is
      • How close to the original data the reconstructed data is
  • Distortion Measures
    • A distortion measure is a mathematical quantity that specifies how close an approximation is to its original
      • Difficult to find a measure which corresponds to our perceptual distortion
      • The average pixel difference is given by the mean square error (MSE): MSE = (1/N) Σₙ (xₙ − yₙ)², where xₙ and yₙ are the original and reconstructed values
  • Distortion Measures
    • The size of the error relative to the signal is given by the signal-to-noise ratio: SNR = 10 log₁₀(σ_x²/σ_d²), where σ_x² is the average squared signal value and σ_d² is the average squared error
    • Another common measure is the peak signal-to-noise ratio: PSNR = 10 log₁₀(x_peak²/σ_d²) (see the sketch below)
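The following minimal sketch (Python with NumPy; the sample values are made up for illustration) computes all three measures for an original signal x and a reconstruction y:

```python
import numpy as np

def mse(x, y):
    """Mean square error between original x and reconstruction y."""
    return np.mean((x - y) ** 2)

def snr_db(x, y):
    """Signal-to-noise ratio in dB: average signal power over error power."""
    return 10 * np.log10(np.mean(x ** 2) / mse(x, y))

def psnr_db(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak is 255 for 8-bit image data."""
    return 10 * np.log10(peak ** 2 / mse(x, y))

x = np.array([52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0])
y = np.round(x / 10) * 10          # a crude "lossy" reconstruction of x
print(mse(x, y), snr_db(x, y), psnr_db(x, y))
```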
  • Distortion Measures
    • Each of these last two measures is defined in decibel (dB) units
      • 1 dB is a tenth of a bel
      • If a signal has 10 times the power of the error, the SNR is 10 dB
      • The term “decibels” as applied to sounds in our environment usually is in comparison to a just-audible sound with frequency 1kHz
  • Rate-Distortion Theory
      • We trade off rate (number of bits per symbol) against distortion; this trade-off is represented by a rate-distortion function R(D)
  • Quantization
    • Quantization is the heart of any lossy compression scheme
      • The source we are compressing contains a large number of distinct output values (infinitely many, for analog sources)
      • We compress the source output by reducing the distinct values to a smaller set via quantization
      • Each quantizer can be uniquely described by its partition of the input range (encoder side) and set of output values (decoder side)
  • Uniform Scalar Quantization
    • The inputs and outputs can be either scalars or vectors
    • The quantizer can partition the domain of input values into either equally spaced or unequally spaced intervals
    • We now examine uniform scalar quantization
  • Uniform Scalar Quantization
    • A uniform scalar quantizer partitions its input range into equally spaced intervals; the interval endpoints are called decision boundaries
      • The output value for each interval is the midpoint of the interval
      • The length of each interval is called the step size Δ
      • A uniform quantizer can be midrise or midtread
  • Uniform Scalar Quantization
    • Midtread quantizer
      • Has zero as one of its output values
      • Has an odd number of output values
    • Midrise quantizer
      • Has a partition interval that brackets zero
      • Has an even number of output values
    • For Δ = 1, the midtread quantizer rounds to the nearest integer, while the midrise quantizer outputs the nearest half-integer
  • Uniform Scalar Quantization
    • We want to minimize the distortion for a given input source with a desired number of output values
      • Do this by adjusting the step size Δ to match the input statistics
      • Let B = {b₀, b₁, …, b_M} be the set of decision boundaries
      • Let Y = {y₁, y₂, …, y_M} be the set of reconstruction or output values
  • Uniform Scalar Quantization
    • Assume the input is uniformly distributed in the interval [−X_max, X_max]
      • Then the rate of the quantizer is
        • R = ⌈log₂ M⌉
      • R is the number of bits needed to code the M output values
      • The step size Δ is given by
        • Δ = 2X_max/M (this quantizer is sketched in code below)
  • Quantization Error
    • For bounded input, the quantization error is referred to as granular distortion
      • The distortion caused by replacing the whole range of values from the maximum X_max to ∞ (and likewise on the negative side) is called overload distortion
  • Quantization Error
    • The decision boundaries bᵢ for a midrise quantizer are
      • [(i − 1)Δ, iΔ], i = 1..M/2 (for positive data X)
    • Output values yᵢ are the midpoints
      • yᵢ = iΔ − Δ/2, i = 1..M/2 (for positive data)
    • The total distortion (after normalizing) is twice the sum over the positive data
  • Quantization Error
    • Since the reconstruction values yᵢ are the midpoints of each interval, the quantization error must lie within the range [−Δ/2, Δ/2]
      • As shown on a previous slide, the quantization error is uniformly distributed
      • Therefore the average squared error equals the error variance σ_d², which we can compute over a single interval [0, Δ] with errors in the range above
  • Quantization Error
    • The error value at x is e(x) = x − Δ/2 and the mean error ē = 0, so the variance is given by
    • σ_d² = (1/Δ) ∫₀^Δ (e(x) − ē)² dx
    • σ_d² = (1/Δ) ∫₀^Δ [x − (Δ/2) − 0]² dx
    • σ_d² = Δ²/12 (verified numerically below)
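A quick Monte Carlo check of the Δ²/12 result (a sketch, assuming NumPy; the particular Δ and sample count are arbitrary):

```python
import numpy as np

delta = 0.25
x = np.random.uniform(0, delta, 1_000_000)   # inputs within one interval [0, delta]
e = x - delta / 2                            # error against the midpoint output
print(np.var(e), delta ** 2 / 12)            # both close to 0.0052
```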
  • Quantization Error
    • In the same way, we can derive the signal variance σ_x² as (2X_max)²/12, so if the quantizer is n bits, M = 2ⁿ, then
    • SQNR = 10 log₁₀[σ_x²/σ_d²]
    • SQNR = 10 log₁₀{[(2X_max)²/12] · [12/Δ²]}
    • SQNR = 10 log₁₀{[(2X_max)²/12] · [12M²/(2X_max)²]} (substituting Δ = 2X_max/M)
    • SQNR = 10 log₁₀ M² = 20 n log₁₀ 2
    • SQNR ≈ 6.02 n dB (checked empirically below)
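The 6.02 dB-per-bit rule is easy to check empirically. The sketch below assumes NumPy, a uniform source on [-1, 1], and the midrise quantizer defined earlier, and prints the measured SQNR next to 6.02n:

```python
import numpy as np

xmax = 1.0
x = np.random.uniform(-xmax, xmax, 1_000_000)
for n in (4, 6, 8):
    M = 2 ** n
    delta = 2 * xmax / M
    # Uniform midrise quantization, as defined above
    y = (np.clip(np.floor(x / delta), -M // 2, M // 2 - 1) + 0.5) * delta
    sqnr = 10 * np.log10(np.mean(x ** 2) / np.mean((x - y) ** 2))
    print(n, round(sqnr, 2), round(6.02 * n, 2))   # measured vs. 6.02 n
```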
  • Nonuniform Scalar Quantization
    • A uniform quantizer may be inefficient for an input source which is not uniformly distributed
      • Use more decision levels where input is densely distributed
        • This lowers granular distortion
      • Use fewer where sparsely distributed
        • Total number of decision levels remains the same
      • This is nonuniform quantization
  • Nonuniform Scalar Quantization
    • Lloyd-Max quantization
      • Iteratively estimates the optimal decision boundaries from the current estimates of the reconstruction levels, then updates the reconstruction levels, continuing until the levels converge
    • In companded quantization
      • The input is mapped through a compressor function G and then quantized with a uniform quantizer
      • After transmission, the quantized values are mapped back with an expander function G⁻¹
  • Companded Quantization
    • The most commonly used companders are the μ-law and A-law companders from telephony
      • These effectively assign more bits to the small amplitudes, where most of the sound occurs (see the μ-law sketch below)
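A minimal companded-quantization sketch using the standard μ-law curve with μ = 255 (the value used in North American telephony); inputs are assumed normalized to [-1, 1], and the helper names are illustrative:

```python
import numpy as np

MU = 255.0

def compress(x):     # compressor G
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def expand(y):       # expander G^-1
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

def companded_quantize(x, n_bits):
    """Map through G, quantize uniformly, then map back through G^-1."""
    M = 2 ** n_bits
    delta = 2.0 / M                  # uniform step size on [-1, 1]
    y = compress(x)
    k = np.clip(np.floor(y / delta), -M // 2, M // 2 - 1)
    return expand((k + 0.5) * delta)

x = np.array([-0.9, -0.05, 0.001, 0.3])
print(companded_quantize(x, n_bits=8))   # small inputs keep fine resolution
```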
  • Transform Coding
    • Reason for transform coding
      • Coding vectors is more efficient than coding scalars, so we group blocks of consecutive samples from the source into vectors
      • If Y is the result of a linear transformation T of an input vector X such that the elements of Y are much less correlated than those of X, then Y can be coded more efficiently than X
  • Transform Coding
    • With vectors of higher dimensions, if most of the information in the vectors is carried in the first few components, we can coarsely quantize the remaining elements
    • The more decorrelated the elements are, the more we can compress the less important elements without affecting the important ones (a toy illustration follows)
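A toy numerical illustration of this idea (a sketch, assuming NumPy): pairs of consecutive samples from a smooth source are nearly identical, and a single 2x2 orthonormal rotation packs almost all of their energy into the first component:

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.cumsum(rng.normal(size=10_000))        # a smooth, highly correlated source
X = s.reshape(-1, 2)                          # vectors of 2 consecutive samples
T = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # orthonormal 2-point transform
Y = X @ T.T
print(np.corrcoef(X.T)[0, 1])                 # near 1: components are correlated
print(np.var(Y[:, 0]), np.var(Y[:, 1]))       # energy concentrated in Y[:, 0]
```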
  • Discrete Cosine Transform
    • The Discrete Cosine Transform (DCT) is a widely used transform coding technique
      • Spatial frequency indicates how many times pixel values change across an image block
      • The DCT formalizes this notion in terms of how much the image contents change in correspondence to the number of cycles of a cosine wave per block
  • Discrete Cosine Transform
    • The DCT decomposes the original signal into its DC and AC components
      • Following the techniques of Fourier Analysis, any signal can be described as a sum of multiple signals that are sine or cosine waveforms at various amplitudes and frequencies
    • The inverse DCT (IDCT) reconstructs the original signal
  • Definition of DCT
    • Given an input function f(i, j) over two input variables, the 2D DCT transforms it into a new function F(u, v), with u and v having the same range as i and j. The general definition is
    • F(u, v) = [2C(u)C(v)/√(MN)] · Σᵢ Σⱼ cos[(2i+1)uπ/2M] cos[(2j+1)vπ/2N] f(i, j)
    • where i, u = 0, 1, …, M−1; j, v = 0, 1, …, N−1; and the constants C(u) and C(v) are defined by C(ξ) = √2/2 for ξ = 0, and C(ξ) = 1 otherwise
  • Definition of DCT
    • In JPEG, M = N = 8, so we have
    • F(u, v) = [C(u)C(v)/4] · Σᵢ Σⱼ cos[(2i+1)uπ/16] cos[(2j+1)vπ/16] f(i, j)
    • The 2D IDCT is quite similar:
    • f̃(i, j) = Σᵤ Σᵥ [C(u)C(v)/4] cos[(2i+1)uπ/16] cos[(2j+1)vπ/16] F(u, v)
    • with i, j, u, v = 0, 1, …, 7
  • Definition of DCT
    • These DCTs work on 2D signals like images
    • For one-dimensional signals we have
    • F(u) = [C(u)/2] · Σᵢ cos[(2i+1)uπ/16] f(i), i = 0, 1, …, 7 (implemented directly in the sketch below)
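A direct NumPy implementation of this 8-point 1D DCT (a sketch; dct1d is my name for it), applied to the constant signal used in the first worked example below:

```python
import numpy as np

def dct1d(f):
    """8-point 1D DCT: F(u) = (C(u)/2) * sum_i cos((2i+1)u*pi/16) * f(i)."""
    N = len(f)                       # N = 8 here
    i = np.arange(N)
    F = np.empty(N)
    for u in range(N):
        C = np.sqrt(2) / 2 if u == 0 else 1.0
        F[u] = (C / 2) * np.sum(np.cos((2 * i + 1) * u * np.pi / (2 * N)) * f)
    return F

# First example from the slides that follow: a constant signal of value 100
print(dct1d(np.full(8, 100.0)))      # approximately [283, 0, 0, 0, 0, 0, 0, 0]
```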
  • Basis Functions
    • The DCT and IDCT use the same set of cosine functions - the basis functions
  • DCT Examples
    • The first example is a constant signal f₁(i) = 100
      • Remember: C(0) = √2/2
      • Remember: cos(0) = 1
      • F₁(0) = [√2/(2·2)] · (1·100 + 1·100 + 1·100 + 1·100 + 1·100 + 1·100 + 1·100 + 1·100)
      • ≈ 283
  • DCT Examples
    • For u = 1, notice that cos(π/16) = −cos(15π/16), cos(3π/16) = −cos(13π/16), etc. Also, C(1) = 1. So we have:
    • F₁(1) = (1/2) · [cos(π/16)·100 + cos(3π/16)·100 + cos(5π/16)·100 + cos(7π/16)·100 + cos(9π/16)·100 + cos(11π/16)·100 + cos(13π/16)·100 + cos(15π/16)·100] = 0
    • The same holds for F₁(2), F₁(3), …, F₁(7): each equals 0
  • DCT Examples
    • The second example shows a discrete cosine signal f₂(i) with the same frequency and phase as the second cosine basis function and amplitude 100
      • When u = 0, all the cosine terms in the 1D DCT equal 1
      • The eight terms inside the parentheses cancel in pairs of opposites
        • E.g. cos(π/8) = −cos(7π/8)
  • DCT Examples
    • F₂(0) = [√2/(2·2)] · 1 · [100cos(π/8) + 100cos(3π/8) + 100cos(5π/8) + 100cos(7π/8) + 100cos(9π/8) + 100cos(11π/8) + 100cos(13π/8) + 100cos(15π/8)]
    • = 0
    • Similarly, F₂(1), F₂(3), F₂(4), …, F₂(7) = 0
    • For u = 2, because cos(3π/8) = sin(π/8),
    • we have cos²(π/8) + cos²(3π/8) = cos²(π/8) + sin²(π/8) = 1
    • Similarly, cos²(5π/8) + cos²(7π/8) = cos²(9π/8) + cos²(11π/8) = cos²(13π/8) + cos²(15π/8) = 1
  • DCT Examples
    • F₂(2) = 1/2 · [cos(π/8)·cos(π/8) + cos(3π/8)·cos(3π/8) + cos(5π/8)·cos(5π/8) + cos(7π/8)·cos(7π/8) + cos(9π/8)·cos(9π/8) + cos(11π/8)·cos(11π/8) + cos(13π/8)·cos(13π/8) + cos(15π/8)·cos(15π/8)] · 100
    • = 1/2 · (1 + 1 + 1 + 1) · 100 = 200 (verified numerically below)
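This result is easy to confirm numerically. The sketch below (assuming NumPy) builds f₂(i) = 100·cos((2i+1)π/8) and applies the 1D DCT formula, yielding the single nonzero coefficient F₂(2) = 200:

```python
import numpy as np

i = np.arange(8)
f2 = 100 * np.cos((2 * i + 1) * np.pi / 8)   # amplitude-100 copy of basis u = 2

F = np.empty(8)
for u in range(8):
    C = np.sqrt(2) / 2 if u == 0 else 1.0
    F[u] = (C / 2) * np.sum(np.cos((2 * i + 1) * u * np.pi / 16) * f2)
print(np.round(F, 6))                        # [0, 0, 200, 0, 0, 0, 0, 0]
```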
  • DCT Characteristics
    • The DCT produces the frequency spectrum F(u) corresponding to the spatial signal f(i)
      • The 0th DCT coefficient F(0) is the DC coefficient of f(i), and the other 7 DCT coefficients represent the various changing (AC) components of f(i)
        • The 0th component represents the average value of f(i)
  • DCT Characteristics
    • The DCT is a linear transform
      • A transform T is linear iff T(αp + βq) = αT(p) + βT(q)
      • where α and β are constants and p and q are any functions, variables or constants (a numeric spot-check follows)
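A numeric spot-check of linearity (a sketch, assuming NumPy): writing the 8-point DCT as a matrix T, random vectors p and q and constants α and β satisfy the identity above:

```python
import numpy as np

N = 8
u, i = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.where(np.arange(N) == 0, np.sqrt(2) / 2, 1.0)
T = (C[:, None] / 2) * np.cos((2 * i + 1) * u * np.pi / 16)  # the DCT as a matrix

rng = np.random.default_rng(1)
p, q = rng.normal(size=N), rng.normal(size=N)
a, b = 2.5, -1.3
print(np.allclose(T @ (a * p + b * q), a * (T @ p) + b * (T @ q)))  # True
```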
  • Cosine Basis Functions
    • For better decomposition, the basis functions should be orthogonal, so as to have the least amount of redundancy
    • Functions B_p(i) and B_q(i) are orthogonal if B_p · B_q = Σᵢ B_p(i)·B_q(i) = 0 for p ≠ q
      • where "·" is the dot product
  • Cosine Basis Functions
    • Further, the functions B_p(i) and B_q(i) are orthonormal if they are orthogonal and B_p · B_p = B_q · B_q = 1
      • The orthonormal property guarantees reconstructability
      • It can be shown that the cosine basis functions above are orthonormal (checked numerically below)
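The orthonormality claim can be checked numerically (a sketch, assuming NumPy): stacking the eight basis functions as the rows of a matrix B, the matrix of all pairwise dot products B·Bᵀ should be the identity:

```python
import numpy as np

N = 8
u, i = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.where(np.arange(N) == 0, np.sqrt(2) / 2, 1.0)
B = (C[:, None] / 2) * np.cos((2 * i + 1) * u * np.pi / 16)  # row u holds B_u(i)
print(np.allclose(B @ B.T, np.eye(N)))       # True: the basis is orthonormal
```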
  • Graphical Illustration of 2D DCT Basis Functions
  • 2D Separable Basis
    • With block size 8, the 2D DCT can be separated into a sequence of two 1D DCT steps (the Fast DCT), as sketched below
      • This algorithm is much more efficient (linear rather than quadratic in the number of operations per output coefficient)
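A sketch of the separable row-column computation in NumPy: with the 1D DCT written as a matrix T, the 2D DCT of an 8x8 block f is T·f·Tᵀ, and orthonormality makes inversion just as cheap:

```python
import numpy as np

N = 8
u, i = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.where(np.arange(N) == 0, np.sqrt(2) / 2, 1.0)
T = (C[:, None] / 2) * np.cos((2 * i + 1) * u * np.pi / 16)  # 1D DCT matrix

f = np.random.default_rng(2).uniform(0, 255, (N, N))  # an arbitrary 8x8 block
F = T @ f @ T.T          # 1D DCT of every row, then of every column
f_back = T.T @ F @ T     # IDCT: the transpose works because T is orthonormal
print(np.allclose(f, f_back))                         # True
```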
  • Discrete Fourier Transform
    • The DCT is comparable to the more widely known (in mathematical circles) Discrete Fourier Transform
  • Other Transforms
    • The Karhunen-Loeve Transform (KLT) is a reversible linear transform that optimally decorrelates the input (a minimal sketch appears below)
    • The wavelet transform uses a set of basis functions called wavelets which can be implemented in a computationally efficient manner by means of multi-resolution analysis
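To make the KLT description above concrete, here is a minimal sketch (assuming NumPy; the source is synthetic): the KLT basis is the set of eigenvectors of the input covariance matrix, so the transformed components come out uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(3)
s = np.cumsum(rng.normal(size=80_000)).reshape(-1, 8)  # correlated 8-vectors
cov = np.cov(s, rowvar=False)          # 8x8 input covariance matrix
_, V = np.linalg.eigh(cov)             # columns of V are the KLT basis vectors
y = s @ V                              # transformed vectors
cov_y = np.cov(y, rowvar=False)
print(np.allclose(cov_y - np.diag(np.diag(cov_y)), 0))  # True: decorrelated
```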