CVPR2010: higher order models in computer vision: Part 4
1. Schedule
8:30-9:00   Introduction
9:00-10:00  Models: small cliques and special potentials
10:00-10:30 Tea break
10:30-12:00 Inference: Relaxation techniques: LP, Lagrangian, Dual Decomposition
12:00-12:30 Models: global potentials and global parameters + discussion
2. MRF with global potential
GrabCut model [Rother et al. '04]

E(x, θF, θB) = ∑_i Fi(θF)·xi + Bi(θB)·(1−xi) + ∑_{i,j ∈ N} |xi − xj|

with Fi = −log Pr(zi | θF) and Bi = −log Pr(zi | θB)

[Figure: image z, output segmentation x, and the foreground/background Gaussian mixture models θF/B plotted in RGB colour space]

Problem: for unknown x, θF, θB the optimization is NP-hard! [Vicente et al. '09]
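To make the energy concrete, here is a minimal sketch (my own illustration, not the authors' code) of evaluating E for a fixed labelling; it assumes the colour models are fitted sklearn GaussianMixture objects, `neighbors` lists the 4-connected pixel pairs, and `lam` is an illustrative pairwise weight.

```python
import numpy as np

def grabcut_energy(z, x, gmm_fg, gmm_bg, neighbors, lam=1.0):
    """E(x, θF, θB) = Σ_i Fi·xi + Bi·(1−xi) + lam · Σ_{ij} |xi − xj|.

    z: (n, 3) pixel colours; x: (n,) binary labels;
    gmm_fg / gmm_bg: fitted sklearn GaussianMixture models (θF, θB);
    neighbors: iterable of (i, j) index pairs of the 4-connected grid.
    """
    F = -gmm_fg.score_samples(z)              # Fi = -log Pr(zi | θF)
    B = -gmm_bg.score_samples(z)              # Bi = -log Pr(zi | θB)
    unary = np.sum(F * x + B * (1 - x))
    pairwise = lam * sum(abs(int(x[i]) - int(x[j])) for i, j in neighbors)
    return unary + pairwise
```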
3. GrabCut: Iterated Graph Cuts
[Rother et al. Siggraph '04]

Alternate two minimizations until convergence (a sketch follows below):

min over θF, θB of E(x, θF, θB): learning of the colour distributions
min over x of E(x, θF, θB): graph cut to infer the segmentation

Most systems with global variables work like that,
e.g. [ObjCut Kumar et al. '05, PoseCut Bray et al. '06, LayoutCRF Winn et al. '06]
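A hedged sketch of this alternation, assuming a `solve_min_cut` callable (e.g. built on a max-flow library) that returns the exact minimiser of E over x for fixed colour models; all names here are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def iterated_graph_cuts(z, x0, solve_min_cut, n_iters=10, n_components=5):
    """Block-coordinate descent on E(x, θF, θB)."""
    x = x0.copy()
    for _ in range(n_iters):
        # Step 1: min over θF, θB -- fit colour GMMs to the current labelling.
        gmm_fg = GaussianMixture(n_components).fit(z[x == 1])
        gmm_bg = GaussianMixture(n_components).fit(z[x == 0])
        # Step 2: min over x -- globally optimal s-t min-cut for fixed θ's.
        x_new = solve_min_cut(-gmm_fg.score_samples(z),
                              -gmm_bg.score_samples(z))
        if np.array_equal(x_new, x):      # labelling stopped changing
            break
        x = x_new
    return x
```

Each step can only decrease the energy, so the loop converges, but only to a local optimum of E, which motivates the global approaches on the following slides.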
5. Colour Model
[Figure: foreground/background colour distributions in RGB space. Initially the foreground and background distributions overlap; iterated graph cut separates them into distinct foreground and background models.]
6. Optimizing over the θ's helps
[Figures: input images; segmentation with no iteration [Boykov & Jolly '01] vs. after convergence [GrabCut '04]]
7. Global optimality?
[Figure: GrabCut result (a local optimum) vs. the global optimum [Vicente et al. '09]]
Is it a problem of the optimization or of the model?
8. … first attempt to solve it
[Lempitsky et al. ECCV '08]

E(x, wF, wB) = ∑_{i ∈ V} Fi(wF)·xi + Bi(wB)·(1−xi) + ∑_{ij ∈ E} wij·|xi − xj|

[Figure: 8 Gaussians fitted to the whole image in RGB space; wF and wB select subsets of them]

Model a discrete subset: each colour model is a binary selection of the 8 Gaussians, e.g.
wF = (1,1,0,1,0,0,0,0); wB = (1,0,0,0,0,0,0,1)
#solutions: |wF choices| × |wB choices| = 2^16

Global optimum:
Exhaustive search: 65,536 graph cuts
Branch-and-MinCut: ~130-500 graph cuts (depends on the image)
9. Branch-and-MinCut
Branch over partial assignments of (wF, wB), e.g.
wF = (*,*,*,*,*,*,*,*)   wB = (*,*,*,*,*,*,*,*)
wF = (0,0,*,*,*,*,*,*)   wB = (0,*,*,*,*,*,*,*)
wF = (1,1,1,1,0,0,0,0)   wB = (1,0,1,1,0,1,0,0)

Bound:
min_{x,wF,wB} E(x, wF, wB)
  = min_{x,wF,wB} [ ∑_i Fi(wF)·xi + Bi(wB)·(1−xi) + ∑_{ij} wij·|xi − xj| ]
  ≥ min_x [ ∑_i (min_{wF} Fi(wF))·xi + (min_{wB} Bi(wB))·(1−xi) + ∑_{ij} wij·|xi − xj| ]

so the lower bound for a branch is a single graph cut on pixelwise-minimized unaries (a sketch follows below).
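A minimal sketch of the resulting best-first branch-and-bound, simplified to branch by splitting a set of candidate models in half; `min_cut_energy` is an assumed oracle returning min_x of the pairwise energy for given unaries (i.e. one graph cut).

```python
import heapq
import numpy as np

def branch_and_mincut(unary_F, unary_B, min_cut_energy):
    """unary_F / unary_B: dicts mapping each model w to its per-pixel unary
    array. The bound for a set W is one min-cut on the pixelwise minima of
    the unaries over W; it is exact when |W| == 1."""
    def bound(W):
        f = np.min([unary_F[w] for w in W], axis=0)   # min_w Fi(w) per pixel
        b = np.min([unary_B[w] for w in W], axis=0)   # min_w Bi(w) per pixel
        return min_cut_energy(f, b)

    W0 = sorted(unary_F)                    # all candidate models
    best_E, best_w = float("inf"), None
    heap = [(bound(W0), W0)]                # best-first priority queue
    while heap:
        lb, W = heapq.heappop(heap)
        if lb >= best_E:
            continue                        # prune: cannot beat incumbent
        if len(W) == 1:
            best_E, best_w = lb, W[0]       # leaf: the bound is exact
            continue
        mid = len(W) // 2                   # branch: split the model set
        for half in (W[:mid], W[mid:]):
            heapq.heappush(heap, (bound(half), half))
    return best_E, best_w
```

Because the search is best-first and leaf bounds are exact, the first leaf popped is already globally optimal; per the slide, this evaluates only ~130-500 cuts instead of all 65,536.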
10. Results …

            GrabCut     Branch-and-MinCut
Image 1:    E = -618    E = -624 (speed-up 481)
Image 2:    E = -593    E = -584 (speed-up 141)
11. Object Recognition & Segmentation
min_{x,w} E(x, w) with w = Templates × Positions, |w| ~ 2,000,000,
given exemplar shapes.
Test: speed-up ~900; accuracy 98.8%
12. … second attempt to solve it
[Vicente et al. ICCV '09]
Eliminate the global color model θF, θB:
E'(x) = min_{θF,θB} E(x, θF, θB)
13. Eliminate color model
E(x, θF, θB) = ∑_i Fi(θF)·xi + Bi(θB)·(1−xi) + ∑ wij·|xi − xj|

The image is discretized into K = 16³ colour bins (an image histogram).
Given x, θF and θB are the foreground and background colour distributions:
θF ∈ [0,1]^K is a distribution (∑_k θF_k = 1), and likewise for the background.

The optimal θF/B are the empirical histograms: θF_k = nF_k / nF,
with nF = ∑_{i ∈ V} xi (#foreground pixels) and nF_k = ∑_{i ∈ Bk} xi (#foreground pixels in bin k).
14. Eliminate color model
E(x, θF, θB) = ∑_i Fi(θF)·xi + Bi(θB)·(1−xi) + ∑ wij·|xi − xj|
             = ∑_k −nF_k·log θF_k − nB_k·log θB_k + ∑ wij·|xi − xj|

Minimizing over θF, θB (optimum at θF_k = nF_k / nF) gives

E'(x) = g(nF) + ∑_k hk(nF_k) + ∑ wij·|xi − xj|,   with nF = ∑_i xi, nF_k = ∑_{i ∈ Bk} xi

where g is convex with its minimum at nF = n/2, and each hk is concave:
g prefers an "equal area" segmentation; hk prefers each colour to be either entirely foreground or entirely background.
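As a concrete check of this elimination (my own illustration, not the authors' code), the sketch below evaluates the unary part of E'(x): the optimal θ's are the empirical histograms, and the remaining cost −∑_k nF_k·log(nF_k/nF) − nB_k·log(nB_k/nB) regroups into the convex g and concave hk terms above. `bins` is an assumed precomputed colour-bin index per pixel.

```python
import numpy as np

def eliminated_color_energy(x, bins, K):
    """Unary part of E'(x) at the optimal θF_k = nF_k/nF, θB_k = nB_k/nB
    (with the convention 0 * log 0 = 0)."""
    nF_k = np.bincount(bins[x == 1], minlength=K).astype(float)  # fgd counts
    nB_k = np.bincount(bins[x == 0], minlength=K).astype(float)  # bgd counts

    def neg_log_term(n_k, n):              # -Σ_k n_k * log(n_k / n)
        nz = n_k[n_k > 0]                  # skip empty bins (0 * log 0 = 0)
        return -np.sum(nz * np.log(nz / n))

    return neg_log_term(nF_k, nF_k.sum()) + neg_log_term(nB_k, nB_k.sum())
```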
15. How to optimize … Dual Decomposition
E'(x) = g(nF) + ∑_k hk(nF_k) + ∑ wij·|xi − xj| = E1(x) + E2(x)

with E1(x) = g(nF) (simple, no MRF) and E2(x) = ∑_k hk(nF_k) + ∑ wij·|xi − xj| (a Robust Pⁿ Potts model).

min_x E(x) = min_x [ E1(x) + yᵀx + E2(x) − yᵀx ]
           ≥ min_{x'} [ E1(x') + yᵀx' ] + min_x [ E2(x) − yᵀx ] =: L(y)

Goal: maximize the concave function L(y) using the sub-gradient (see the sketch below);
no guarantees on E (the problem is NP-hard).
"Parametric maxflow" gives the optimal y = λ·1 efficiently [Vicente et al. ICCV '09]
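A minimal sketch of the sub-gradient ascent on L(y); `solve_E1` and `solve_E2` are assumed oracles for the two subproblems (each returning a minimiser and its optimal value), and the 1/t step size is one standard diminishing schedule, not necessarily the paper's.

```python
import numpy as np

def maximize_lower_bound(solve_E1, solve_E2, n_pixels, n_iters=100):
    """Sub-gradient ascent on L(y) = min_x'[E1 + y.x'] + min_x[E2 - y.x]."""
    y = np.zeros(n_pixels)
    best_L = -np.inf
    for t in range(1, n_iters + 1):
        x1, val1 = solve_E1(y)             # min over x' of E1(x') + y.x'
        x2, val2 = solve_E2(y)             # min over x  of E2(x)  - y.x
        best_L = max(best_L, val1 + val2)  # L(y) lower-bounds min_x E(x)
        y += (1.0 / t) * (x1 - x2)         # a sub-gradient of L at y
    return best_L, y
```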
16. Some results…
Global optimum in 61% of cases (GrabCut database)
[Figure: input, GrabCut result, local optimum (DD), global optimum (DD)]
17. Insights on the GrabCut model
[Figure: segmentations as the strength of the g term is varied: 0.3·g, 0.4·g, g, 1.5·g]
Recall: g is convex and each hk is concave;
g prefers an "equal area" segmentation, hk prefers each colour to be either entirely foreground or entirely background.
18. Relationship to Soft Pn Potts
Just a different type of clustering:
[Figure: the image, one superpixelization, another superpixelization, and GrabCut, which clusters all colours together]
Image only: pairwise CRF (TextonBoost [Shotton et al. '06]); superpixelizations: robust Pn Potts [Kohli et al. '08]
19. Marginal Probability Field (MPF)
What is the prior of a MAP-MRF solution?
Training image (8 pixels): 60% black, 40% white.
MAP solution (all pixels black): prior(x) = 0.6^8 ≈ 0.016
Other labellings are less likely, e.g. 5 black / 3 white: prior(x) = 0.6^5 · 0.4^3 ≈ 0.005
The MRF is a bad prior since it ignores the shape of the (feature) distribution!
Introduce a global term which controls the global statistic
[Woodford et al. ICCV '09]
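The slide's arithmetic, checked in a couple of lines:

```python
# Independent per-pixel prior with Pr(black) = 0.6, Pr(white) = 0.4, 8 pixels.
p_map = 0.6 ** 8                 # all-black MAP labelling: 0.01679... ~ 0.016
p_mixed = 0.6 ** 5 * 0.4 ** 3    # 5 black, 3 white:        0.00497... ~ 0.005
print(p_map, p_mixed)            # the MAP prior ignores the trained 60/40 split
```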
20. Marginal Probability Field (MPF)
[Figure: cost as a function of a global statistic (from 0 to max): the true energy is non-linear, while an MRF can only impose a linear cost]
Optimization is done with Dual Decomposition (different decompositions)
[Woodford et al. ICCV '09]
21. Examples
In-painting: [Figure: noisy input, ground truth, pairwise MRF result, global gradient prior result]
Segmentation: [Figure: results as the prior strength increases]
22. Schedule
8:30-9:00   Introduction
9:00-10:00  Models: small cliques and special potentials
10:00-10:30 Tea break
10:30-12:00 Inference: Relaxation techniques: LP, Lagrangian, Dual Decomposition
12:00-12:30 Models: global potentials and global parameters + discussion
23. Open Questions
• Many exciting future directions:
  – Exploiting the latest ideas in applications (object recognition, etc.)
  – Many other higher-order cliques: topology, grammars, etc. (this conference)
• A comparison of inference techniques is needed:
  – Factor-graph message passing vs. transformation vs. LP relaxation?
• Learning higher-order Random Fields