1) The document discusses the relationship between biological vision, human vision, and machine vision, and how an understanding of the visual world can inform the development of robust computer vision systems.
2) It examines prior approaches to modeling natural image priors and finds that methods assuming independence between image patches are too simplistic.
3) A simple Gaussian mixture model that allows correlation between patches is presented and outperforms existing prior models, achieving state-of-the-art results on image denoising tasks.
8. Motivation:
Natural Image Priors
Given an N × N matrix x, return Pr(x): "Probability that x is a natural image."
[Example images, ranked from likely to less likely to really unlikely]
Zhu and Mumford, Portilla and Simoncelli, Roth and Black, Weiss and Freeman, Osindero, Welling and Hinton, Ranzato and LeCun, Olshausen, Lewicki, Ng, Aharon and Elad, Mairal, Sapiro, · · ·
Biological Vision ⇔ Computer Vision
9. Roth and Black 2005
Prior based methods vs. prior free methods
• Prior based. Training set of natural images ⇒ filter image prior:
[Grid of training images]
Pr(x; filters, energy) = (1/Z) e^(-E(x))
[Diagram: likelihood links the training images to the learned filters and energy]
• Prior free. No training set. No explicit notion of a natural image prior.
• Best performance in image denoising?
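The filter-prior idea above can be sketched in code. This is a minimal toy version, not Roth and Black's actual Fields of Experts model: the derivative filters and the Student-t style penalty below are illustrative assumptions.

```python
import numpy as np

# Toy energy-based filter prior: Pr(x; filters, energy) = (1/Z) exp(-E(x)),
# where E(x) sums a robust penalty over linear filter responses at every
# image location. Filters and penalty are hypothetical choices.

def energy(x, filters, alpha=1.0):
    """Unnormalized negative log-probability E(x) of image x."""
    E = 0.0
    for f in filters:
        fh, fw = f.shape
        H, W = x.shape
        # valid-mode 2D correlation via explicit sliding windows
        resp = np.zeros((H - fh + 1, W - fw + 1))
        for i in range(resp.shape[0]):
            for j in range(resp.shape[1]):
                resp[i, j] = np.sum(x[i:i + fh, j:j + fw] * f)
        # Student-t style expert: log(1 + 0.5 * response^2)
        E += alpha * np.sum(np.log1p(0.5 * resp ** 2))
    return E

rng = np.random.default_rng(0)
filters = [np.array([[1.0, -1.0]]), np.array([[1.0], [-1.0]])]  # derivative filters
smooth = np.ones((8, 8))            # constant patch: zero derivative responses
noisy = rng.normal(size=(8, 8))     # white-noise patch
print(energy(smooth, filters) < energy(noisy, filters))  # True: smooth has lower energy
```

Under such a prior, "more natural-looking" images (smooth, with sparse filter responses) get lower energy and hence higher probability.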
10. Roth and Black 2005
• Best performance in image denoising? Prior free methods.
11. The BM3D prior free method
Buades et al. 05, Dabov et al. 06, Elad et al. 07, Mairal et al. 10, Liu and Simoncelli 08
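The simplest member of this prior-free family is non-local means (Buades et al. 05): denoise each pixel by averaging pixels whose surrounding patches look similar, with no learned model of natural images. The sketch below is a minimal illustration of that idea, not BM3D (which adds patch grouping and collaborative transform-domain filtering); the patch size and bandwidth are arbitrary choices.

```python
import numpy as np

# Minimal non-local means: weight every pixel by the similarity of its
# surrounding patch to the patch around the pixel being denoised.
def nonlocal_means(img, patch=3, h=0.5):
    H, W = img.shape
    r = patch // 2
    pad = np.pad(img, r, mode='reflect')
    # collect the patch around every pixel (row-major order)
    patches = np.array([pad[i:i + patch, j:j + patch].ravel()
                        for i in range(H) for j in range(W)])
    flat = img.ravel()
    out = np.empty(H * W)
    for k in range(H * W):
        d2 = np.sum((patches - patches[k]) ** 2, axis=1)
        w = np.exp(-d2 / h ** 2)          # similar patches get large weights
        out[k] = np.sum(w * flat) / np.sum(w)
    return out.reshape(H, W)

rng = np.random.default_rng(0)
clean = np.zeros((16, 16)); clean[:, 8:] = 1.0       # simple step edge
noisy = clean + 0.1 * rng.normal(size=clean.shape)
den = nonlocal_means(noisy)
print(np.mean((den - clean) ** 2) < np.mean((noisy - clean) ** 2))  # MSE reduced
```

Note that nothing here refers to a prior over images; the image's own self-similarity does all the work.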
15. Comparison
100 different test images.
• BM3D vs. Fields of Experts (Roth and Black)
• BM3D is better 100/100 times.
• BM3D vs. generic KSVD (Elad and Aharon)
• BM3D is better 89/100 times.
19. Training Images
What's going on?
[Grid of training images and the learned filters]
• "Generic natural image": too general?
• Training using maximum likelihood the wrong thing?
Pr(x; filters, energy) = (1/Z) e^(-E(x))
20. Training Images
• Current prior models are poor (even in the likelihood sense).
21. What's going on
[Figure 4: (a) Whole image denoising with the proposed framework for all the priors discussed (Ind. Pixel, MVG, PCA, ICA); for each prior, the left bar is log L and the right bar is PSNR (dB). Better priors (in the likelihood sense) lead to better denoising performance. (b) The EPLL framework improves performance significantly compared to simple patch averaging (PA).]
• High Likelihood ⇒ Good Denoising Performance.
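To see how a patch prior turns into a denoiser, here is the closed-form MAP estimate under the simplest prior in the comparison, a single multivariate Gaussian (MVG): with y = x + n, n ~ N(0, s²I) and x ~ N(mu, Sigma), the estimate is the Wiener filter x_hat = mu + Sigma (Sigma + s²I)^(-1) (y - mu). All parameters below are synthetic; this is a sketch of the MVG baseline, not the EPLL algorithm.

```python
import numpy as np

# Wiener-filter MAP denoising of a flattened patch under an MVG prior.
def wiener_denoise(y, mu, Sigma, sigma_noise):
    d = y.size
    K = Sigma @ np.linalg.inv(Sigma + sigma_noise ** 2 * np.eye(d))
    return mu + K @ (y - mu)

rng = np.random.default_rng(1)
d = 16                                    # flattened 4x4 patch (toy size)
A = rng.normal(size=(d, d))
Sigma = A @ A.T / d                       # synthetic patch covariance
mu = np.zeros(d)
x = rng.multivariate_normal(mu, Sigma)    # clean patch drawn from the prior
y = x + 0.5 * rng.normal(size=d)          # noisy observation
x_hat = wiener_denoise(y, mu, Sigma, 0.5)
print(np.linalg.norm(x_hat - mu) < np.linalg.norm(y - mu))  # True: shrinks toward the prior mean
```

The filter matrix K has eigenvalues lambda/(lambda + s²) < 1, so the estimate always shrinks the observation toward what the prior considers likely; a better prior (higher likelihood on real patches) shrinks in more useful directions.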
22. What's going on
• But our simple, unconstrained Gaussian Mixture Model (200 mixture components for 8x8 image patches) · · ·
23. What's going on
• Gives log likelihood 164.52. Much better than all existing models.
• Outperforms all existing prior based models in denoising.
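The "log likelihood" score can be illustrated with a toy version of this comparison: a full-covariance mixture assigns higher average log likelihood to structured data than any single Gaussian can. The 2D data and the oracle per-cluster fit below are illustrative assumptions (the talk's actual model uses 200 components learned by EM on 8x8 patches).

```python
import numpy as np

# Average log-likelihood under a full-covariance Gaussian mixture.
def gaussian_logpdf(X, mu, Sigma):
    d = mu.size
    diff = X - mu
    L = np.linalg.cholesky(Sigma)
    sol = np.linalg.solve(L, diff.T)
    logdet = 2 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + logdet + np.sum(sol ** 2, axis=0))

def gmm_avg_loglik(X, weights, mus, Sigmas):
    comp = np.stack([np.log(w) + gaussian_logpdf(X, m, S)
                     for w, m, S in zip(weights, mus, Sigmas)])
    m = comp.max(axis=0)                                  # log-sum-exp
    return np.mean(m + np.log(np.exp(comp - m).sum(axis=0)))

rng = np.random.default_rng(0)
# two correlated "patch" populations (e.g. flat vs. edge-like), 2D for clarity
X1 = rng.multivariate_normal([2, 2], [[1.0, 0.9], [0.9, 1.0]], 500)
X2 = rng.multivariate_normal([-2, -2], [[1.0, -0.9], [-0.9, 1.0]], 500)
X = np.vstack([X1, X2])

# oracle fit: one Gaussian per population vs. a single Gaussian overall
mus = [X1.mean(0), X2.mean(0)]
Sigmas = [np.cov(X1.T), np.cov(X2.T)]
ll_gmm = gmm_avg_loglik(X, [0.5, 0.5], mus, Sigmas)
ll_single = gmm_avg_loglik(X, [1.0], [X.mean(0)], [np.cov(X.T)])
print(ll_gmm > ll_single)  # True: the mixture captures structure a single Gaussian cannot
```

Because each component carries its own full covariance, the mixture encodes dependencies between all pixels of a patch with no independence assumptions between filter responses.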
29. GMM vs. BM3D
100 different test images.
• BM3D vs. GMM.
• GMM is better 81/100 times.
• GMM can be used for any application.
[Figure: deblurring example. (a) Blurred, (b) Krishnan et al., (c) EPLL-GMM]
33. Secret of GMM
[Table 2: Summary of denoising experiment results. Our method is clearly state-of-the-art, competitive with image based methods such as BM3D and LLSC which are state-of-the-art.]
[Figure 6: Eigenvectors of 6 randomly selected covariance matrices from the learned GMM model, sorted by eigenvalue from largest to smallest. Note the richness of the structures: some of the eigenvectors look like PCA components, while others model texture boundaries, edges and other structures at different orientations.]
• Sparse coding, ICA, FoE all assume some sort of independence between filter outputs.
• GMM suggests extremely structured sparse coding: only filters within the same block can be active together (Yu et al. 2010).
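The "filters" of each GMM component are the eigenvectors of its covariance matrix; sorting them by eigenvalue reproduces the kind of display described in Figure 6. The sketch below uses a synthetic covariance built from smooth 1D signals as a stand-in for a learned 8x8 patch component.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# build a covariance from smooth random signals, so dimensions are
# strongly correlated (nothing here is independent, unlike ICA/FoE)
base = rng.normal(size=(1000, d))
smooth = np.cumsum(base, axis=1)          # random walks: correlated coordinates
Sigma = np.cov(smooth.T)

eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # sort largest-first, as in Figure 6
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals[0] >= eigvals[-1])                      # True: sorted by eigenvalue
print(np.allclose(eigvecs.T @ eigvecs, np.eye(d)))    # True: orthonormal "filter" basis
```

For a trained component the leading eigenvectors would be the dominant structures (DC, edges, textures) that are allowed to be active together within that block, which is the structured sparse coding view mentioned above.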
34. Summary
• Robust Computer/Human/Biological Vision ⇒ Properties
of the visual world.
• Natural Image Priors. Biological Vision ⇔ Computer
Vision.
• Simple GMM model for image patches. No independence
assumptions ⇒ much better model.