Adaptive Signal and Image Processing
Slides of a course given at the summer school "Ecole Analyse multirésolution pour l'image", June 20-22, 2012, Auxerre, France.

Transcript

  • 1. Adaptive Signal and Image Processing. Gabriel Peyré, www.numerical-tours.com
  • 2-3. Natural Image Priors: Fourier decomposition vs. wavelet decomposition.
  • 4. Overview: Sparsity for Compression and Denoising; Geometric Representations; L0 vs. L1 Sparsity; Dictionary Learning.
  • 5-7. Sparse Approximation in a Basis.
  • 8-12. Image and Texture Models.
    Uniformly smooth C^α image. Fourier, wavelets: ||f − f_M||^2 = O(M^{−α}).
    Discontinuous image with bounded variation. Wavelets: ||f − f_M||^2 = O(M^{−1}).
    C^2-geometrically regular image. Curvelets: ||f − f_M||^2 = O(log^3(M) M^{−2}).
    More complex images: need adaptivity.
  • 13-19. Compression by Transform-coding.
    Forward transform: a[m] = ⟨f, ψ_m⟩ ∈ ℝ.
    Quantization (bin width T): q[m] = sign(a[m]) ⌊|a[m]| / T⌋ ∈ ℤ.
    Entropic coding: use statistical redundancy (many 0's).
    Dequantization: ã[m] = sign(q[m]) (|q[m]| + 1/2) T.
    Backward transform: f_R = Σ_{m ∈ I_T} ã[m] ψ_m.
    Shown: image f, zoom on f, quantized q[m], reconstruction f_R at R = 0.2 bit/pixel.
    "Theorem": ||f − f_M||^2 = O(M^{−α}) ⟹ ||f − f_R||^2 = O(log^α(R) R^{−α}).
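The quantization/dequantization pair of the transform-coding slides can be sketched directly (a minimal NumPy sketch; the transform itself is abstracted away, the function names are illustrative, and T is the bin width from the slides):

```python
import numpy as np

def quantize(a, T):
    # q[m] = sign(a[m]) * floor(|a[m]| / T): the dead-zone (-T, T) maps to 0
    return np.sign(a) * np.floor(np.abs(a) / T)

def dequantize(q, T):
    # a~[m] = sign(q[m]) * (|q[m]| + 1/2) * T: reconstruct at bin centers
    return np.sign(q) * (np.abs(q) + 0.5) * T

a = np.array([0.3, -1.7, 2.4, 0.0])   # toy transform coefficients
q = quantize(a, T=1.0)                # -> [0., -1., 2., 0.]
a_tilde = dequantize(q, T=1.0)        # -> [0., -1.5, 2.5, 0.]
```

Note that surviving coefficients are reconstructed within T/2 of their original value, while small coefficients (the "many 0's" that entropy coding exploits) are zeroed exactly.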
  • 20. Wavelets: JPEG-2000 vs. JPEG. JPEG2k exploits the statistical redundancy of coefficients: embedded coder; chunks of large coefficients; neighboring coefficients are not independent. Shown: image f, JPEG at R = 0.19 bit/pixel, JPEG2k at R = 0.15 bit/pixel.
  • 21-23. Denoising (Donoho/Johnstone).
    f = Σ_{m=0}^{N−1} ⟨f, ψ_m⟩ ψ_m; thresholding: f̃ = Σ_{|⟨f, ψ_m⟩| > T} ⟨f, ψ_m⟩ ψ_m.
    Theorem: if ||f_0 − f_{0,M}||^2 = O(M^{−α}), then E(||f̃ − f_0||^2) = O(σ^{2α/(α+1)}) for T = σ √(2 log N).
    In practice: T ≈ 3σ.
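The Donoho-Johnstone estimator above amounts to hard thresholding in an orthonormal basis. A minimal sketch (illustrative names; the basis is taken to be the canonical one so the coefficients can be manipulated directly):

```python
import numpy as np

def hard_threshold(c, T):
    # keep coefficients with |c| > T, zero the rest (non-linear approximation)
    return np.where(np.abs(c) > T, c, 0.0)

def universal_threshold(sigma, N):
    # T = sigma * sqrt(2 log N); the slides note T ~ 3 sigma works better in practice
    return sigma * np.sqrt(2.0 * np.log(N))

# toy demo: sparse clean coefficients plus Gaussian noise
rng = np.random.default_rng(0)
c0 = np.zeros(1024)
c0[:8] = 10.0
sigma = 1.0
c = c0 + sigma * rng.standard_normal(1024)
c_hat = hard_threshold(c, universal_threshold(sigma, c.size))
```

With this T, essentially all pure-noise coefficients fall below the threshold and are zeroed, while the 8 large coefficients survive, so ||c_hat − c0|| is much smaller than ||c − c0||.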
  • 24. Overview (outline repeated; next section: Geometric Representations).
  • 25. Piecewise Regular Functions in 1D.
    Theorem: if f is C^α outside a finite set of discontinuities:
    ||f − f_M||^2 = O(M^{−1}) (Fourier), O(M^{−2α}) (wavelets).
    For Fourier, linear and non-linear approximation behave alike: sub-optimal.
    For wavelets, non-linear improves on linear: optimal.
  • 26-27. Piecewise Regular Functions in 2D.
    Theorem: if f is C^α outside a set of finite-length edge curves:
    ||f − f_M||^2 = O(M^{−1/2}) (Fourier), O(M^{−1}) (wavelets).
    Wavelets improve on Fourier; same result for BV functions (optimal for BV).
    Regular C^α edges: wavelets are sub-optimal (exploiting them requires anisotropy).
  • 28. Geometrically Regular Images.
    Geometric image model: f is C^α outside a set of C^α edge curves.
    BV image: level sets have finite length. Geometric image: level sets are regular.
    Geometry = cartoon image, with sharp or smoothed edges.
  • 29-30. Curvelets for Cartoon Images [Candès, Donoho] [Candès, Demanet, Ying, Donoho].
    Redundant tight frame (redundancy ≈ 5): not efficient for compression.
    Denoising by curvelet thresholding: recovers edges and geometric textures.
  • 31-34. Approximation with Triangulations.
    Triangulation (V, F): vertices V = {v_i}_{i=1}^M, faces F ⊂ {1, …, M}^3.
    Piecewise linear approximation: f_M = Σ_{m=1}^M λ_m φ_m, with λ = argmin_μ ||f − Σ_m μ_m φ_m||,
    where φ_m is affine on each face of F and φ_m(v_i) = δ_{m,i}.
    Theorem: there exists (V, F) such that ||f − f_M||^2 ≤ C_f M^{−2},
    using M/2 equilateral triangles of width ~ M^{−1/2} in regular areas
    and M/2 anisotropic triangles of length ~ M^{−1/2} along singularities.
    Finding the optimal (V, F) is NP-hard; provably good greedy schemes exist [Mirebeau, Cohen, 2009].
  • 35. Greedy Triangulation Optimization [Bougleux, Peyré, Cohen, ICCV'09].
    Shown: anisotropic triangulation vs. JPEG2000 at M = 200 and M = 600.
  • 36-39. Geometric Multiscale Processing.
    Image f ∈ ℝ^N → wavelet transform → wavelet coefficients f_j[n] = ⟨f, ψ_{j,n}⟩.
    Geometry → geometric transform → geometric coefficients b[j, ℓ, n] = ⟨f_j, b_{ℓ,n}⟩.
    C^2 cartoon image: ||f − f_M||^2 = O(M^{−2}).
  • 40. Overview (outline repeated; next section: L0 vs. L1 Sparsity).
  • 41-44. Image Representation.
    Dictionary D = {d_m}_{m=0}^{Q−1} of atoms d_m ∈ ℝ^N.
    Image decomposition: f = Σ_{m=0}^{Q−1} x_m d_m = Dx. Image approximation: f ≈ Dx.
    Orthogonal dictionary (N = Q): x_m = ⟨f, d_m⟩.
    Redundant dictionary (N < Q): x is not unique. Examples: translation-invariant wavelets, curvelets, …
  • 45-47. Sparsity.
    Decomposition: f = Σ_{m=0}^{Q−1} x_m d_m = Dx.
    Sparsity: most x_m are small (example: the wavelet transform).
    Ideal sparsity: most x_m are zero; J_0(x) = |{m : x_m ≠ 0}|.
    Approximate sparsity (compressibility): ||f − Dx|| is small with J_0(x) ≤ M.
  • 48-51. Sparse Coding.
    Redundant dictionary D = {d_m}_{m=0}^{Q−1}, Q ≥ N: non-unique representation f = Dx.
    Sparsest decomposition: min_{f = Dx} J_0(x).
    Sparsest approximation: min_x (1/2)||f − Dx||^2 + λ J_0(x);
    equivalent to min_{J_0(x) ≤ M} ||f − Dx|| and to min_{||f − Dx|| ≤ ε} J_0(x).
    Ortho-basis D: solved by keeping the M largest inner products in {⟨f, d_m⟩}_m,
    i.e. x_m = ⟨f, d_m⟩ if |⟨f, d_m⟩| > √(2λ), and x_m = 0 otherwise.
    General redundant dictionary: NP-hard.
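In an orthogonal basis the sparsest approximation reduces to keeping the M largest-magnitude coefficients, which is trivial to sketch (illustrative names, NumPy):

```python
import numpy as np

def best_m_term(c, M):
    # keep the M largest-magnitude coefficients x_m = <f, d_m>, zero the rest
    x = np.zeros_like(c)
    keep = np.argsort(np.abs(c))[-M:]
    x[keep] = c[keep]
    return x

c = np.array([0.1, -4.0, 2.0, 0.5])
x = best_m_term(c, 2)   # keeps -4.0 and 2.0
```

This is exactly why the L0 problem is easy in an ortho-basis: the objective decouples coefficient by coefficient. For a general redundant dictionary the atoms interact and no such greedy shortcut is exact (hence NP-hardness).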
  • 52-54. Convex Relaxation: L1 Prior.
    J_0(x) = |{m : x_m ≠ 0}|. Image with 2 pixels: J_0(x) = 0 (null image), J_0(x) = 1 (sparse image), J_0(x) = 2 (non-sparse image).
    ℓ^q priors: J_q(x) = Σ_m |x_m|^q (convex for q ≥ 1); unit balls shown for q = 0, 1/2, 1, 3/2, 2.
    Sparse ℓ^1 prior: J_1(x) = ||x||_1 = Σ_m |x_m|.
  • 55-56. Inverse Problems. Denoising/approximation: Φ = Id. Examples: inpainting, super-resolution, compressed sensing.
  • 57-59. Regularized Inversion.
    Denoising/compression: y = f_0 + w ∈ ℝ^N. Sparse approximation: f = Dx*, where
    x* ∈ argmin_x (1/2)||y − Dx||^2 + λ||x||_1 (data fidelity + sparsity).
    Inverse problems: y = Φ f_0 + w ∈ ℝ^P. Replace D by ΦD:
    x* ∈ argmin_x (1/2)||y − ΦDx||^2 + λ||x||_1.
    Numerical solvers: proximal splitting schemes. www.numerical-tours.com
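The slides point to proximal splitting solvers for the ℓ^1-regularized problem; the simplest instance is iterative soft-thresholding (ISTA). A minimal sketch under the slides' notation, with illustrative function names:

```python
import numpy as np

def soft_threshold(x, t):
    # proximal operator of t * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(D, y, lam, n_iter=300):
    # iterative soft-thresholding for min_x 1/2 ||y - D x||^2 + lam ||x||_1
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the quadratic part
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # gradient step on the fidelity term, then prox of the l1 term
        x = soft_threshold(x - D.T @ (D @ x - y) / L, lam / L)
    return x
```

For an inverse problem one passes the composed operator (here the matrix ΦD) as `D`. Sanity check: when D = Id the minimizer is soft_threshold(y, lam) in closed form, and ISTA reaches it in one step.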
  • 60. Inpainting Results
  • 61. Overview (outline repeated; next section: Dictionary Learning).
  • 62-66. Dictionary Learning: MAP Energy.
    Set of (noisy) exemplars {y_k}_k.
    Sparse approximation: min_{x_k} (1/2)||y_k − D x_k||^2 + λ||x_k||_1.
    Dictionary learning: min_{D ∈ C} Σ_k min_{x_k} (1/2)||y_k − D x_k||^2 + λ||x_k||_1.
    Constraint: C = {D = (d_m)_m : ∀m, ||d_m|| ≤ 1} (otherwise D → +∞, X → 0).
    Matrix formulation: min_{X ∈ ℝ^{Q×K}, D ∈ C ⊂ ℝ^{N×Q}} f(X, D) = (1/2)||Y − DX||^2 + λ||X||_1.
    Convex with respect to X; convex with respect to D; non-convex with respect to (X, D): local minima.
  • 67-70. Dictionary Learning: Algorithm.
    Step 1: for each k, minimize over x_k: min_{x_k} (1/2)||y_k − D x_k||^2 + λ||x_k||_1 (convex sparse coding).
    Step 2: minimize over D: min_{D ∈ C} ||Y − DX||^2 (convex constrained minimization),
    by projected gradient descent: D^{(ℓ+1)} = Proj_C(D^{(ℓ)} − τ (D^{(ℓ)} X − Y) X^T).
    Convergence: toward a stationary point of f(X, D). Shown: D at initialization and at convergence.
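The two alternating steps above can be sketched end to end (a toy NumPy sketch, not the authors' implementation: sparse coding by ISTA, dictionary update by projected gradient with unit-ball atom projection; all names are illustrative):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proj_atoms(D):
    # projection onto C = {D : ||d_m|| <= 1 for every atom d_m}
    return D / np.maximum(np.linalg.norm(D, axis=0), 1.0)

def learn_dictionary(Y, Q, lam=0.1, n_outer=15, n_inner=40, seed=0):
    # alternating minimization of 1/2 ||Y - D X||^2 + lam ||X||_1 over X and D in C
    rng = np.random.default_rng(seed)
    D = proj_atoms(rng.standard_normal((Y.shape[0], Q)))
    X = np.zeros((Q, Y.shape[1]))
    for _ in range(n_outer):
        Lx = np.linalg.norm(D, 2) ** 2 + 1e-10
        for _ in range(n_inner):               # step 1: sparse coding (ISTA on X)
            X = soft_threshold(X - D.T @ (D @ X - Y) / Lx, lam / Lx)
        Ld = np.linalg.norm(X, 2) ** 2 + 1e-10
        for _ in range(n_inner):               # step 2: projected gradient on D
            D = proj_atoms(D - (D @ X - Y) @ X.T / Ld)
    return D, X
```

Each inner loop uses a 1/L step size, so the joint energy decreases monotonically; as the slides state, this only guarantees convergence to a stationary point, not a global minimum.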
  • 71-72. Patch-based Learning.
    Learning D from exemplar patches y_k → dictionary D [Olshausen, Field 1997].
    State-of-the-art denoising [Elad et al. 2006]; sparse texture synthesis and inpainting [Peyré 2008].
  • 73-74. Comparison with PCA.
    PCA dimensionality reduction: ∀k, min_D ||Y − D^{(k)} X||, with D^{(k)} = (d_m)_{m=0}^{k−1}.
    Linear (PCA): Fourier-like atoms. Sparse (learning): Gabor-like atoms.
    (Figure: DCT vs. PCA/KLT atoms trained on 12×12 patches from Lena, and Gabor vs. learned atoms; from Rubinstein et al., "Dictionaries for Sparse Representation".)
  • 75-77. Patch-based Denoising [Aharon & Elad 2006].
    Noisy image: f = f_0 + w.
    Step 1: extract patches y_k(·) = f(z_k + ·).
    Step 2: dictionary learning: min_{D, (x_k)_k} Σ_k (1/2)||y_k − D x_k||^2 + λ||x_k||_1.
    Step 3: patch averaging: ỹ_k = D x_k, f̃(·) ≈ Σ_k ỹ_k(· − z_k).
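Steps 1 and 3 of the patch-based pipeline (extraction and averaging) can be sketched as follows (illustrative names; the denoising of each patch via D is omitted so the round trip can be checked exactly):

```python
import numpy as np

def extract_patches(f, w, step):
    # step 1: y_k(.) = f(z_k + .), w x w patches at offsets z_k on a regular grid
    patches, offsets = [], []
    for i in range(0, f.shape[0] - w + 1, step):
        for j in range(0, f.shape[1] - w + 1, step):
            patches.append(f[i:i + w, j:j + w])
            offsets.append((i, j))
    return np.stack(patches), offsets

def average_patches(patches, offsets, shape, w):
    # step 3: f(.) ~ sum_k y_k(. - z_k), normalized by the per-pixel patch count
    acc, cnt = np.zeros(shape), np.zeros(shape)
    for p, (i, j) in zip(patches, offsets):
        acc[i:i + w, j:j + w] += p
        cnt[i:i + w, j:j + w] += 1
    return acc / np.maximum(cnt, 1)
```

With overlapping patches (step < w), averaging blends the independently denoised patches ỹ_k = D x_k; with denoising disabled, extract-then-average reproduces the input image on all covered pixels.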
  • 78-81. Learning with Missing Data.
    Inverse problem: y = Φ f_0 + w.
    min_{f, (x_k)_k, D ∈ C} (1/2)||y − Φ f||^2 + Σ_k ((λ/2)||p_k(f) − D x_k||^2 + μ||x_k||_1).
    Patch extractor: p_k(f) = f(z_k + ·).
    Step 1: for each k, minimize over x_k (convex sparse coding).
    Step 2: minimize over D (quadratic, constrained).
    Step 3: minimize over f (quadratic).
  • 82. Inpainting Example [Mairal et al. 2008].
    Shown: original image f_0, damaged observations y = Φ f_0 + w (75% of the data removed; initial PSNR 6.13 dB), and regularized reconstructions f with an adaptive dictionary (J = 100 learning iterations, 50% of patches used): 31.75 dB with N = 1 (8×8 patches) and 33.97 dB with N = 2 (16×16 patches).
  • 83. Adaptive Inpainting and Separation [Peyré, Fadili, Starck 2010].
    Shown: wavelets, local DCT, and learned dictionaries.
  • 84-85. Higher Dimensional Learning [Mairal et al., "Sparse Representation for Color Image Restoration"; "Online Learning for Matrix Factorization and Sparse Coding"].
    Learning D directly on color patches (e.g. 256 atoms of size 7×7×3): a dictionary learned on the image is more colored than a generic global one, and avoids the color artifacts produced by running K-SVD on each channel separately.
    Applications: color image denoising and inpainting.
  • 86. Movie Inpainting
  • 87. Facial Image Compression [Elad et al. 2009]: Image registration. (Shown page: O. Bryt, M. Elad, J. Vis. Commun. Image R. 19 (2008) 270–282. Fig. 1: (left) piecewise-affine warping of the image by triangulation; (right) uniform slicing into disjoint square patches for coding. The scheme uses a database of around 6000 registered facial images, learns a K-SVD dictionary per patch location from 5000 training images, sparse-codes each patch over its local dictionary with hardly any entropy coding stage, and is shown superior to JPEG2000, the VQ-based coder of [17], and a PCA approach.)
  • 88. Facial Image Compression [Elad et al. 2009]: Image registration; non-overlapping patches (fk)k. (Same Bryt & Elad page shown, now annotated with the decomposition of the registered image into non-overlapping patches fk.)
  • 89. Facial Image Compression [Elad et al. 2009]: Image registration; non-overlapping patches (fk)k; dictionary learning (Dk)k. (Shown: Fig. 6 of Bryt & Elad, the dictionary obtained by K-SVD for patch no. 80, the left eye, using the OMP method with L = 4. Each learned dictionary contains 512 atoms of 15 × 15 pixels; training stops after at most 100 K-SVD iterations or when the representation error is small enough, and the compression stage additionally caps the number of atoms per patch, to control the rates and the overall simulation time.)
  • 90. Facial Image Compression [Elad et al. 2009]: Image registration; non-overlapping patches (fk)k; dictionary learning (Dk)k; sparse approximation fk ≈ Dk xk; entropic coding of the xk into the file (≈ 400 bytes). (Shown: reconstructions comparing JPEG-2k, PCA, and the learned-dictionary coder.)
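The coding stage of this compression pipeline stores, for each patch fk, the indices and quantized values of the few atoms selected from its local dictionary Dk. A minimal sketch of that bookkeeping, assuming greedy atom selection and a uniform scalar quantizer (simplifications of ours, not the paper's exact coder):

```python
import numpy as np

def code_patch(Dk, fk, L=4, qbits=6):
    """Greedily pick L atoms of Dk for patch fk; return indices and quantized coefs."""
    residual, idx, coefs = fk.astype(float).copy(), [], None
    for _ in range(L):
        idx.append(int(np.argmax(np.abs(Dk.T @ residual))))
        coefs, *_ = np.linalg.lstsq(Dk[:, idx], fk, rcond=None)
        residual = fk - Dk[:, idx] @ coefs
    lo, hi = coefs.min(), coefs.max() + 1e-12
    q = np.round((coefs - lo) / (hi - lo) * (2**qbits - 1)).astype(int)
    return idx, q, (lo, hi)

def rate_bits(n_atoms, L=4, qbits=6):
    """Fixed-length cost per patch: L atom indices plus L quantized coefficients."""
    return L * (int(np.ceil(np.log2(n_atoms))) + qbits)
```

With 512 atoms and L = 4, each patch costs 4 · (9 + 6) = 60 bits under this fixed-length scheme; entropy coding of the index and value streams reduces this further.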
  • 91. Constraints on the Learning. Dictionary learning: min_{X, D ∈ C} (1/2) ||Y − DX||² + λ ||X||₁, with C = {D : ||dm|| ≤ 1}. (Shown: exemplars Y and the atoms obtained by PCA vs. dictionary learning; panels from "Online Learning for Matrix Factorization and Sparse Coding".)
  • 92. Constraints on the Learning. Dictionary learning: min_{X, D ∈ C} (1/2) ||Y − DX||² + λ ||X||₁. Dictionary learning: C = {D : ||dm|| ≤ 1}. Non-negative matrix factorization: C = {D : ||dm|| ≤ 1, D ≥ 0}. (Shown: exemplars Y and the atoms obtained by PCA, dictionary learning, and NMF.)
  • 93. Constraints on the Learning. Dictionary learning: min_{X, D ∈ C} (1/2) ||Y − DX||² + λ ||X||₁. Dictionary learning: C = {D : ||dm|| ≤ 1}. Non-negative matrix factorization: C = {D : ||dm|| ≤ 1, D ≥ 0}. Sparse-PCA: C = {D : ||dm||₂² + ||dm||₁ ≤ 1}. (Shown: exemplars Y and atoms from PCA, dictionary learning, NMF, and sparse PCA; SPCA panels at τ = 70%, 30%, 10%; figures from "Online Learning for Matrix Factorization and Sparse Coding".)
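Each constraint set C above changes only the projection applied to the atoms dm during learning. Minimal per-atom projection sketches (the elastic-net-style ball for sparse PCA matches the constraint form on this slide; the bisection solver is our own sketch, not taken from the cited paper):

```python
import numpy as np

def proj_unit_ball(d):
    """C = {d : ||d||_2 <= 1}: rescale if outside the ball."""
    return d / max(np.linalg.norm(d), 1.0)

def proj_nmf(d):
    """C = {d : d >= 0, ||d||_2 <= 1} (NMF): clip negatives, then rescale."""
    return proj_unit_ball(np.maximum(d, 0.0))

def proj_elastic_ball(u, gamma=0.1, tol=1e-8):
    """C = {d : ||d||_2^2 + gamma * ||d||_1 <= 1} (sparse-PCA-style atoms).

    Bisection on the Lagrange multiplier lam, using the closed form
    d(lam) = soft(u, lam * gamma) / (1 + 2 * lam)."""
    def d_of(lam):
        return np.sign(u) * np.maximum(np.abs(u) - lam * gamma, 0) / (1 + 2 * lam)
    def val(d):
        return d @ d + gamma * np.abs(d).sum()
    if val(u) <= 1:
        return u                       # already feasible
    lo, hi = 0.0, 1.0
    while val(d_of(hi)) > 1:           # grow bracket until feasible
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if val(d_of(mid)) > 1 else (lo, mid)
    return d_of(hi)
```

A projected-gradient or block-coordinate dictionary update then alternates a descent step on D with one of these projections, which is all that distinguishes the three variants.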
  • 94. Dictionary Signature / Epitome [Aharon & Elad 2008] [Jojic et al. 2003]. Translation invariance + patches: low-dimensional dictionary parameterization D = (dm)m with dm(t) = d(zm + t), where d is a small signature image. (Shown: image f, signature d, dictionary D. Fig. 6: pairs of epitomes of width 46 learned from Lena, Boat and Barbara with patch widths 6, 8, 9, 10 and 12, all other parameters fixed; the size of the patches plays an important role in the visual aspect of the epitome.)
  • 95. Dictionary Signature / Epitome [Aharon & Elad 2008] [Jojic et al. 2003]. Translation invariance + patches: D = (dm)m with dm(t) = d(zm + t). Benefits: faster learning; makes use of the atoms' spatial locations zm.
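In the epitome parameterization, every atom is simply a patch read out of one small signature image d at its location zm, i.e. dm(t) = d(zm + t). A minimal sketch of that readout (function and variable names are ours):

```python
import numpy as np

def atoms_from_epitome(epitome, locations, patch=8):
    """Build D = (d_m)_m with d_m(t) = d(z_m + t): one patch per location z_m."""
    D = np.stack([epitome[i:i+patch, j:j+patch].ravel()
                  for (i, j) in locations], axis=1)
    # normalize columns so each atom is unit-norm
    return D / np.maximum(np.linalg.norm(D, axis=0), 1e-12)
```

Because all atoms share the epitome's pixels, learning updates the small image d (and optionally the locations zm) instead of every atom independently, which is what makes the learning faster.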
  • 96. Conclusion
  • 97. Conclusion
  • 98. Conclusion. • Quest for the best representation: Fourier → wavelets → curvelets → bandlets → texturelets?
  • 99. Conclusion. • Inverse problems regularization: compressed sensing; convex sparsity prior (ℓ¹): Σm |⟨f, ψm⟩|; more sparsity → better prior → better recovery. • Quest for the best representation: Fourier → wavelets → curvelets → bandlets → texturelets?
  • 100. Conclusion. • Inverse problems regularization: compressed sensing; convex sparsity prior (ℓ¹): Σm |⟨f, ψm⟩|; more sparsity → better prior → better recovery. • Texture synthesis (Fourier, wavelets): more sparsity → better synthesis. • Quest for the best representation: Fourier → wavelets → curvelets → bandlets → texturelets?
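The convex sparsity prior J(f) = Σm |⟨f, ψm⟩| can be evaluated with any orthonormal family (ψm); a sketch using an orthonormal DCT-II basis, which is our illustrative choice rather than the deck's:

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis: row m is the atom psi_m."""
    k, t = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    B = np.cos(np.pi * (t + 0.5) * k / n) * np.sqrt(2.0 / n)
    B[0] /= np.sqrt(2.0)
    return B

def l1_prior(f, B):
    """J(f) = sum_m |<f, psi_m>| over the rows psi_m of B."""
    return np.abs(B @ f).sum()
```

For unit-energy signals, a single basis atom (1-sparse) scores J = 1 while white noise of the same energy spreads over all coefficients and scores far higher, which is the sense in which more sparsity means a better prior.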
