CVPR 2010: Higher Order Models in Computer Vision: Part 3

Outline: Belief Propagation, Relaxations, MAP-MRF LP, Lagrangian Relaxation, Dual Decomposition, Polyhedral, Higher-order Interactions
Higher Order Models in Computer Vision:
Inference
Carsten Rother, Sebastian Nowozin
Machine Learning and Perception Group
Microsoft Research Cambridge
San Francisco, 18th June 2010
Carsten Rother, Sebastian Nowozin Microsoft Research Cambridge
Higher Order Models in Computer Vision: Inference
Belief Propagation

Belief Propagation Reminder

Sum-product belief propagation, and max-product/max-sum/min-sum belief propagation.

Uses
- Messages: vectors at the factor graph edges,
  $q_{Y_i \to F} \in \mathbb{R}^{\mathcal{Y}_i}$, the variable-to-factor message, and
  $r_{F \to Y_i} \in \mathbb{R}^{\mathcal{Y}_i}$, the factor-to-variable message.
- Updates: iteratively recomputing these vectors
Belief Propagation

Belief Propagation Reminder: Min-Sum Updates (cont)

Variable-to-factor message

$$q_{Y_i \to F}(y_i) = \sum_{F' \in M(i) \setminus \{F\}} r_{F' \to Y_i}(y_i)$$

Factor-to-variable message

$$r_{F \to Y_i}(y_i) = \min_{\substack{y'_F \in \mathcal{Y}_F \\ y'_i = y_i}} \Big[ E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big]$$
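The two updates can be written out directly for a toy factor graph. The following Python sketch is illustrative only: the pairwise energy table, `var_to_factor`, and `factor_to_var` are stand-in names, not code from the slides.

```python
# Min-sum message updates on a toy factor graph (sketch).
# States are label indices; factors store energy tables.

import itertools

# Pairwise factor F over variables (0, 1), each with 2 labels: |y0 - y1| energy.
energy = {yF: abs(yF[0] - yF[1]) for yF in itertools.product(range(2), repeat=2)}

def var_to_factor(incoming_r, n_labels):
    """q_{Y_i -> F}(y_i): sum of r_{F' -> Y_i}(y_i) over the *other* factors F'."""
    return [sum(r[y] for r in incoming_r) for y in range(n_labels)]

def factor_to_var(energy, scope, i, incoming_q, n_labels):
    """r_{F -> Y_i}(y_i): minimize E_F plus incoming q over joint states with y_i fixed."""
    r = [float("inf")] * n_labels
    for yF, E in energy.items():
        total = E + sum(incoming_q[j][yF[pos]]
                        for pos, j in enumerate(scope) if j != i)
        yi = yF[scope.index(i)]
        r[yi] = min(r[yi], total)
    return r

# One round: variable 1 carries a unary factor with energies [0.0, 2.5],
# forwarded to F as its variable-to-factor message.
q1 = var_to_factor([[0.0, 2.5]], 2)
r0 = factor_to_var(energy, scope=(0, 1), i=0, incoming_q={1: q1}, n_labels=2)
print(r0)  # -> [0.0, 1.0]
```

Reading off the result: label 0 for $Y_0$ can pair with label 0 of $Y_1$ at zero cost, while label 1 must pay either the disagreement cost or the unary cost of $Y_1$.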
Belief Propagation

Higher-order Factors

Factor-to-variable message

$$r_{F \to Y_i}(y_i) = \min_{\substack{y'_F \in \mathcal{Y}_F \\ y'_i = y_i}} \Big[ E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big]$$

- Minimization becomes intractable for general $E_F$
- Tractable for special structure of $E_F$
- See (Felzenszwalb and Huttenlocher, CVPR 2004), (Potetz, CVPR 2007), and (Tarlow et al., AISTATS 2010)
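A brute-force version of the factor-to-variable update makes the cost explicit: for a factor over $k$ variables with $L$ labels each, the inner minimization enumerates $L^{k-1}$ joint states per output entry, which is why special structure in $E_F$ matters. The sketch below uses an illustrative cardinality-style energy and flat incoming messages.

```python
# Brute-force factor-to-variable message for a higher-order factor (sketch).
# Cost grows as n_labels**(k-1) per output entry for general E_F.

import itertools

def higher_order_message(E_F, k, n_labels, i, q):
    """r_{F -> Y_i}(y_i) by enumerating all joint states y_F with y_i fixed."""
    r = []
    for yi in range(n_labels):
        best = float("inf")
        for rest in itertools.product(range(n_labels), repeat=k - 1):
            yF = rest[:i] + (yi,) + rest[i:]          # insert y_i at position i
            val = E_F(yF) + sum(q[j][yF[j]] for j in range(k) if j != i)
            best = min(best, val)
        r.append(best)
    return r

# Cardinality-style energy on k = 4 binary variables: penalize deviations from 2 ones.
E_F = lambda yF: abs(sum(yF) - 2)
q = [[0.0, 1.0]] * 4                                   # incoming messages prefer label 0
print(higher_order_message(E_F, k=4, n_labels=2, i=0, q=q))  # -> [2.0, 1.0]
```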
Relaxations

Problem Relaxations

Optimization problems (minimizing $g : G \to \mathbb{R}$ over $\mathcal{Y} \subseteq G$) can become easier if
- the feasible set is enlarged, and/or
- the objective function is replaced with a bound.

(Figure: objective $g$ over the feasible set $\mathcal{Y}$, and a bound $h$ over the enlarged set $\mathcal{Z} \supseteq \mathcal{Y}$.)
Definition (Relaxation (Geoffrion, 1974))

Given two optimization problems $(g, \mathcal{Y}, G)$ and $(h, \mathcal{Z}, G)$, the problem $(h, \mathcal{Z}, G)$ is said to be a relaxation of $(g, \mathcal{Y}, G)$ if
1. $\mathcal{Z} \supseteq \mathcal{Y}$, i.e. the feasible set of the relaxation contains the feasible set of the original problem, and
2. $\forall y \in \mathcal{Y}: h(y) \le g(y)$, i.e. over the original feasible set the objective function $h$ achieves no larger values than the objective function $g$.

(Minimization version.)
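The two defining properties can be checked numerically on a toy problem. In this illustrative sketch the objective is kept ($h = g$, so property 2 holds trivially) and only the feasible set is enlarged from the integers $\{0,1,2,3\}$ to the interval $[0,3]$, approximated by a fine grid:

```python
# Numeric check of the relaxation definition on a toy problem (illustrative).
g = lambda y: (y - 1.3) ** 2

Y = [0, 1, 2, 3]
orig_opt = min(g(y) for y in Y)            # integer problem: best at y = 1

# Enlarged feasible set Z = [0, 3], approximated by a grid (stand-in for Z).
Z = [i / 1000 for i in range(3001)]
relaxed_opt = min(g(z) for z in Z)         # attained near y = 1.3

# Property 1: Z contains Y; property 2: h(y) <= g(y) on Y (here h = g).
# Consequence: the relaxed value lower-bounds the original optimum.
assert relaxed_opt <= orig_opt
print(orig_opt, relaxed_opt)
```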
Relaxations

Relaxations

Relaxed solution $z^*$ provides a bound:
- $h(z^*) \ge g(y^*)$ (maximization: upper bound),
- $h(z^*) \le g(y^*)$ (minimization: lower bound).

The relaxation is typically tractable.

Evidence that relaxations are "better" for learning: (Kulesza and Pereira, 2007), (Finley and Joachims, 2008), (Martins et al., 2009).

Drawback: $z^* \in \mathcal{Z} \setminus \mathcal{Y}$ is possible, i.e.
- the solution is not integral (but fractional), or
- the solution violates constraints (primal infeasible).

To obtain $z \in \mathcal{Y}$ from $z^*$: rounding, heuristics.
Relaxations

Constructing Relaxations

Principled methods to construct relaxations:
- Linear Programming Relaxations
- Lagrangian relaxation
- Lagrangian/Dual decomposition
MAP-MRF LP

MAP-MRF Linear Programming Relaxation

Simple two-variable pairwise MRF.

(Factor graph: variables $Y_1$, $Y_2$ with unary factors $\theta_1$, $\theta_2$ and a pairwise factor $\theta_{1,2}$ over $\mathcal{Y}_1 \times \mathcal{Y}_2$.)
MAP-MRF LP

MAP-MRF Linear Programming Relaxation

(Example labeling: $y_1 = 2$, $y_2 = 3$, i.e. $(y_1, y_2) = (2, 3)$.)

Energy: $E(y; x) = \theta_1(y_1; x) + \theta_{1,2}(y_1, y_2; x) + \theta_2(y_2; x)$

Probability: $p(y \mid x) = \frac{1}{Z(x)} \exp\{-E(y; x)\}$

MAP prediction: $\operatorname{argmax}_{y \in \mathcal{Y}} p(y \mid x) = \operatorname{argmin}_{y \in \mathcal{Y}} E(y; x)$
MAP-MRF LP

MAP-MRF Linear Programming Relaxation (cont)

$$\begin{aligned}
\min_{\mu}\ & \sum_{i \in V} \sum_{y_i \in \mathcal{Y}_i} \theta_i(y_i)\, \mu_i(y_i) + \sum_{\{i,j\} \in E} \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \theta_{i,j}(y_i, y_j)\, \mu_{i,j}(y_i, y_j) \\
\text{sb.t.}\ & \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V, \\
& \sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_i(y_i) \in \{0, 1\}, \quad \forall i \in V,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_{i,j}(y_i, y_j) \in \{0, 1\}, \quad \forall \{i,j\} \in E,\ \forall (y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j.
\end{aligned}$$
→ NP-hard integer linear program
MAP-MRF LP

MAP-MRF Linear Programming Relaxation (cont)

Relaxing the integrality constraints to the unit interval gives

$$\begin{aligned}
\min_{\mu}\ & \sum_{i \in V} \sum_{y_i \in \mathcal{Y}_i} \theta_i(y_i)\, \mu_i(y_i) + \sum_{\{i,j\} \in E} \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \theta_{i,j}(y_i, y_j)\, \mu_{i,j}(y_i, y_j) \\
\text{sb.t.}\ & \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V, \\
& \sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_i(y_i) \in [0, 1], \quad \forall i \in V,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_{i,j}(y_i, y_j) \in [0, 1], \quad \forall \{i,j\} \in E,\ \forall (y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j.
\end{aligned}$$

→ linear program
→ LOCAL polytope
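For a two-variable model the LOCAL polytope LP is small enough to hand to a generic solver. A sketch assuming SciPy is available; the variable ordering and the energies ($\theta_1$, $\theta_2$, an Ising-like $\theta_{1,2}$) are illustrative. Since a two-variable graph is a tree, the LP is tight and returns an integral labeling.

```python
# MAP-MRF LP over the LOCAL polytope for a 2-variable, 2-label model (sketch).
from scipy.optimize import linprog

# Variable order: mu1(0), mu1(1), mu2(0), mu2(1), mu12(00), mu12(01), mu12(10), mu12(11)
c = [0.0, 1.0, 0.5, 0.0, 0.0, 1.0, 1.0, 0.0]   # theta1, theta2, Ising theta12

A_eq = [
    [1, 1, 0, 0, 0, 0, 0, 0],      # mu1 sums to one
    [0, 0, 1, 1, 0, 0, 0, 0],      # mu2 sums to one
    [-1, 0, 0, 0, 1, 1, 0, 0],     # marginalization onto mu1(0)
    [0, -1, 0, 0, 0, 0, 1, 1],     # marginalization onto mu1(1)
    [0, 0, -1, 0, 1, 0, 1, 0],     # marginalization onto mu2(0)
    [0, 0, 0, -1, 0, 1, 0, 1],     # marginalization onto mu2(1)
]
b_eq = [1, 1, 0, 0, 0, 0]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
print(res.fun, res.x[:4])  # optimum 0.5 at the integral labeling (y1, y2) = (0, 0)
```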
MAP-MRF LP

MAP-MRF LP Relaxation: Related Work

MAP-MRF LP analysis: (Wainwright et al., Trans. Inf. Theory, 2005), (Weiss et al., UAI 2007), (Werner, PAMI 2005); TRW-S (Kolmogorov, PAMI 2006)

Improving tightness: (Komodakis and Paragios, ECCV 2008), (Kumar and Torr, ICML 2008), (Sontag and Jaakkola, NIPS 2007), (Sontag et al., UAI 2008), (Werner, CVPR 2008), (Kumar et al., NIPS 2007)

Algorithms based on the LP: (Globerson and Jaakkola, NIPS 2007), (Kumar and Torr, ICML 2008), (Sontag et al., UAI 2008)
MAP-MRF LP

MAP-MRF LP Relaxation: General Case

For a higher-order factor over variable set $A$ and a subset $B \subseteq A$:

$$\sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$

- Generalization to higher-order factors
- For $B \subseteq A$ factors: marginalization constraints
- Defines the LOCAL polytope (Wainwright and Jordan, 2008)
MAP-MRF LP

MAP-MRF LP Relaxation: Known Results

1. LOCAL is tight iff the factor graph is a forest (acyclic).
2. All labelings are vertices of LOCAL (Wainwright and Jordan, 2008).
3. For cyclic graphs there are additional fractional vertices.
4. If all factors have regular energies, the fractional solutions are never optimal (Wainwright and Jordan, 2008).
5. For models with only binary states: half-integrality; integral variables are certain.

See some examples later...
Lagrangian Relaxation

Lagrangian Relaxation: Motivation

$$\sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$

The MAP-MRF LP with higher-order factors is problematic:
- the number of LP variables grows with $|\mathcal{Y}_A|$ → exponential in the number of MRF variables;
- but the number of constraints stays small ($|\mathcal{Y}_B|$).

Lagrangian relaxation allows one to solve the LP by treating the constraints implicitly.
Lagrangian Relaxation

Lagrangian Relaxation

$$\min_y\ g(y) \quad \text{sb.t.}\ y \in D,\ y \in C.$$

Assumption
- Optimizing $g(y)$ over $y \in D$ is "easy".
- Optimizing over $y \in D \cap C$ is hard.

High-level idea: incorporate the $y \in C$ constraint into the objective function.

For an excellent introduction, see (Guignard, "Lagrangean Relaxation", TOP 2003) and (Lemaréchal, "Lagrangian Relaxation", 2001).
Lagrangian Relaxation

Lagrangian Relaxation (cont)

Recipe

1. Express $C$ in terms of equalities and inequalities:
$$C = \{y \in G : u_i(y) = 0,\ \forall i = 1, \dots, I,\ \ v_j(y) \le 0,\ \forall j = 1, \dots, J\},$$
with $u_i : G \to \mathbb{R}$ differentiable, typically affine, and $v_j : G \to \mathbb{R}$ differentiable, typically convex.

2. Introduce Lagrange multipliers, yielding
$$\min_y\ g(y) \quad \text{sb.t.}\ y \in D,\quad u_i(y) = 0\ :\ \lambda_i,\ i = 1, \dots, I,\quad v_j(y) \le 0\ :\ \mu_j,\ j = 1, \dots, J.$$
3. Build the partial Lagrangian:
$$\min_y\ g(y) + \lambda^\top u(y) + \mu^\top v(y) \quad \text{sb.t.}\ y \in D. \tag{1}$$
Theorem (Weak Duality of Lagrangian Relaxation)
For differentiable functions $u_i : G \to \mathbb{R}$ and $v_j : G \to \mathbb{R}$, any $\lambda \in \mathbb{R}^I$, and any non-negative $\mu \in \mathbb{R}^J$, $\mu \ge 0$, problem (1) is a relaxation of the original problem: its value is lower than or equal to the optimal value of the original problem.
Naming the value of the partial Lagrangian
$$q(\lambda, \mu) := \min_{y \in D}\ g(y) + \lambda^\top u(y) + \mu^\top v(y), \tag{1}$$

4. Maximize the lower bound wrt $\lambda, \mu$:
$$\max_{\lambda, \mu}\ q(\lambda, \mu) \quad \text{sb.t.}\ \mu \ge 0$$

→ efficiently solvable if $q(\lambda, \mu)$ can be evaluated
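The recipe can be exercised end-to-end on a toy problem. In this illustrative Python sketch (problem, easy set $D$, and constants all invented for the example), $q(\lambda)$ is evaluated by enumerating $D$, and the constraint value at the minimizer serves as a supergradient for ascent:

```python
# Lagrangian relaxation of a toy problem (illustrative): minimize x + y over
# D = {0,...,3}^2 subject to the coupling constraint u(x, y) = x + y - 3 = 0.

D = [(x, y) for x in range(4) for y in range(4)]
u = lambda x, y: x + y - 3

def q(lam):
    """Dual function: minimize the partial Lagrangian over the easy set D;
    u evaluated at the minimizer is a supergradient of q at lam."""
    val, (x, y) = min(((x + y + lam * u(x, y)), (x, y)) for x, y in D)
    return val, u(x, y)

lam = 0.0
for t in range(1, 2001):
    _, grad = q(lam)
    lam += grad / t          # diminishing-step supergradient ascent
dual_val, _ = q(lam)
print(round(lam, 2), round(dual_val, 2))  # approaches lambda* = -1, q* = 3
```

Here the original optimum ($x + y = 3$, value 3) is matched by the dual optimum, i.e. there is no duality gap on this example.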
Lagrangian Relaxation

Optimizing Lagrangian Dual Functions

4. Maximize the lower bound wrt $\lambda, \mu$:
$$\max_{\lambda, \mu}\ q(\lambda, \mu) \quad \text{sb.t.}\ \mu \ge 0 \tag{2}$$

Theorem (Lagrangean Dual Function)
1. $q$ is concave in $\lambda$ and $\mu$; problem (2) is a concave maximization problem.
2. If $q$ is unbounded above, then the original problem is infeasible.
3. For any $\lambda$, $\mu \ge 0$, let
$$y(\lambda, \mu) = \operatorname{argmin}_{y \in D}\ g(y) + \lambda^\top u(y) + \mu^\top v(y).$$
Then a supergradient of $q$ can be constructed by evaluating the constraint functions at $y(\lambda, \mu)$:
$$u(y(\lambda, \mu)) \in \frac{\partial}{\partial \lambda} q(\lambda, \mu), \quad \text{and} \quad v(y(\lambda, \mu)) \in \frac{\partial}{\partial \mu} q(\lambda, \mu).$$
Lagrangian Relaxation

Subgradients, Supergradients

(Figure: a convex function $f(x)$ with an affine underestimator touching it at $x$.)

A subgradient $g$ of a convex $f$ at $x$ satisfies
$$f(y) \ge f(x) + g^\top (y - x), \quad \forall y,$$
and $x$ is a global minimizer iff $0 \in \partial f(x)$.

- Subgradients for convex functions: global affine underestimators
- Supergradients for concave functions: global affine overestimators
Lagrangian Relaxation

Example behaviour of objectives

(Figure: primal and dual objective values vs. iteration, 0-250, from a 10-by-10 pairwise binary MRF relaxation, submodular.)
Lagrangian Relaxation

Primal Solution Recovery

Assume we solved the dual problem for $(\lambda^*, \mu^*)$.
1. Can we obtain a primal solution $y^*$?
2. Can we say something about the relaxation quality?

Theorem (Sufficient Optimality Conditions)
If for a given $\lambda$, $\mu \ge 0$ we have $u(y(\lambda, \mu)) = 0$ and $v(y(\lambda, \mu)) \le 0$ (primal feasibility), and further
$$\lambda^\top u(y(\lambda, \mu)) = 0, \quad \text{and} \quad \mu^\top v(y(\lambda, \mu)) = 0$$
(complementary slackness), then
- $y(\lambda, \mu)$ is an optimal primal solution to the original problem, and
- $(\lambda, \mu)$ is an optimal solution to the dual problem.
Lagrangian Relaxation

Primal Solution Recovery (cont)

But,
- we might never see a solution satisfying the optimality condition;
- this is the case only if there is no duality gap, i.e. $q(\lambda^*, \mu^*) = g(y^*)$.

Special case: integer linear programs. We can always reconstruct a primal solution to
$$\min_y\ g(y) \quad \text{sb.t.}\ y \in \operatorname{conv}(D),\ y \in C. \tag{3}$$

For example, for subgradient method updates it is known that (Anstreicher and Wolsey, MathProg 2009)
$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^* \text{ of (3)}.$$
Lagrangian Relaxation

Application to Higher-order Factors

$$\sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$

- (Komodakis and Paragios, CVPR 2009) propose a MAP inference algorithm suitable for higher-order factors
- It can be seen as Lagrangian relaxation of the marginalization constraints of the MAP-MRF LP
- See also (Johnson et al., ACCC 2007), (Schlesinger and Giginyak, CSC 2007), (Wainwright and Jordan, 2008, Sect. 4.1.3), (Werner, PAMI 2005), (Werner, TechReport 2009), (Werner, CVPR 2008)
Lagrangian Relaxation

Komodakis MAP-MRF Decomposition

$$\begin{aligned}
\min_{\mu}\ & \sum_{i \in V} \sum_{y_i \in \mathcal{Y}_i} \theta_i(y_i)\, \mu_i(y_i) + \sum_{\{i,j\} \in E} \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \theta_{i,j}(y_i, y_j)\, \mu_{i,j}(y_i, y_j) \\
\text{sb.t.}\ & \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V, \\
& \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = 1, \quad \forall \{i,j\} \in E, \\
& \sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_i(y_i) \in \{0, 1\}, \qquad \mu_{i,j}(y_i, y_j) \in \{0, 1\}.
\end{aligned}$$
In inner product notation the objective is $\min_{\mu} \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle$, with the same constraints.
Lagrangian Relaxation

Komodakis MAP-MRF Decomposition

Adding a higher-order factor over variable set $A$:

$$\begin{aligned}
\min_{\mu}\ & \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\, \mu_A(y_A) \\
\text{sb.t.}\ & \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V, \\
& \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = 1, \quad \forall \{i,j\} \in E, \\
& \sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i, \\
& \sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) = \mu_B(y_B), \quad \forall B \subset A,\ y_B \in \mathcal{Y}_B, \\
& \sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \\
& \mu_i(y_i) \in \{0, 1\}, \qquad \mu_{i,j}(y_i, y_j) \in \{0, 1\}, \qquad \mu_A(y_A) \in \{0, 1\}.
\end{aligned}$$
Attach Lagrange multipliers $\phi_{A \to B}(y_B)$ to the marginalization constraints:
$$\sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) = \mu_B(y_B)\ :\ \phi_{A \to B}(y_B), \quad \forall B \subset A,\ y_B \in \mathcal{Y}_B.$$
Lagrangian Relaxation

Komodakis MAP-MRF Decomposition

Dualizing the marginalization constraints gives

$$\begin{aligned}
q(\phi) := \min_{\mu}\ & \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\, \mu_A(y_A) \\
& + \sum_{\substack{B \subset A, \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \Big( \sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) - \mu_B(y_B) \Big) \\
\text{sb.t.}\ & \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V, \\
& \sum_{(y_i, y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = 1, \quad \forall \{i,j\} \in E, \\
& \sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \\
& \sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i, \\
& \mu_i(y_i) \in \{0, 1\}, \qquad \mu_{i,j}(y_i, y_j) \in \{0, 1\}, \qquad \mu_A(y_A) \in \{0, 1\}.
\end{aligned}$$
Distributing $\phi_{A \to B}(y_B)$ over the two terms separates the $\mu_B$ part from the $\mu_A$ part, and regrouping splits the objective into a pairwise part and a higher-order part:
$$\begin{aligned}
q(\phi) = \min_{\mu}\ & \underbrace{\sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle - \sum_{\substack{B \subset A, \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B)\, \mu_B(y_B)}_{\text{pairwise subproblem}} \\
& + \underbrace{\sum_{\substack{B \subset A, \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\, \mu_A(y_A)}_{\text{higher-order subproblem}}
\end{aligned}$$

subject to the same normalization, pairwise-consistency, and integrality constraints as before (with the dualized marginalization constraints removed).
Lagrangian Relaxation

Message Passing

(Figure: the higher-order factor $A$ sends messages $\phi_{A \to B}$, $\phi_{A \to C}$, $\phi_{A \to D}$, $\phi_{A \to E}$ to its neighbours $B$, $C$, $D$, $E$.)

Maximizing $q(\phi)$ corresponds to message passing from the higher-order factor to the unary factors.
Lagrangian Relaxation

The Higher-order Subproblem

$$\begin{aligned}
\min_{\mu_A}\ & \sum_{\substack{B \subset A, \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \sum_{\substack{y_A \in \mathcal{Y}_A, \\ [y_A]_B = y_B}} \mu_A(y_A) + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\, \mu_A(y_A) \\
\text{sb.t.}\ & \sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \mu_A(y_A) \in \{0, 1\}, \quad \forall y_A \in \mathcal{Y}_A.
\end{aligned}$$

- Simple structure: select one element
- Simplify to an optimization problem over $y_A \in \mathcal{Y}_A$
Lagrangian Relaxation

The Higher-order Subproblem (cont)

$$\min_{y_A \in \mathcal{Y}_A}\ \theta_A(y_A) + \sum_{\substack{B \subset A, \\ y_B \in \mathcal{Y}_B}} I([y_A]_B = y_B)\, \phi_{A \to B}(y_B)$$

- Solvability depends on the structure of $\theta_A$ as a function of $y_A$
- (For now) assume we can solve for $y_A^* \in \mathcal{Y}_A$
- How can we use this to solve the original large LP?
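For small factors the subproblem can be solved by enumeration. An illustrative Python sketch (the Potts-like `theta_A` and the message values `phi` are invented for the example, with messages here attached to singleton subsets only):

```python
# Higher-order subproblem by enumeration (sketch): minimize over a single joint
# state y_A the factor energy plus the messages the state agrees with.

import itertools

k, n_labels = 3, 2
theta_A = lambda yA: 0.0 if len(set(yA)) == 1 else 2.0   # Potts-like: all-equal is cheap

# phi[(j, yj)] for singleton subsets B = {j}.
phi = {(j, yj): 0.3 * yj for j in range(k) for yj in range(n_labels)}

def solve_subproblem():
    best, best_yA = float("inf"), None
    for yA in itertools.product(range(n_labels), repeat=k):
        val = theta_A(yA) + sum(phi[(j, yA[j])] for j in range(k))
        if val < best:
            best, best_yA = val, yA
    return best_yA, best

print(solve_subproblem())  # the all-zeros state wins here
```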
Lagrangian Relaxation

Maximizing the Dual Function: Subgradient Method

Maximizing $q(\phi)$:
1. Initialize $\phi$
2. Iterate:
   2.1 Given $\phi$, solve the pairwise MAP subproblem, obtaining $\mu^*$ (unary and pairwise indicators)
   2.2 Given $\phi$, solve the higher-order MAP subproblem, obtaining $y_A^*$
   2.3 Update $\phi$

Supergradient: the component for each $B \subset A$, $y_B \in \mathcal{Y}_B$ is
$$\frac{\partial}{\partial \phi_{A \to B}(y_B)} q(\phi) \ni I([y_A^*]_B = y_B) - \mu_B^*(y_B).$$

Supergradient ascent with step size $\alpha^t > 0$:
$$\phi_{A \to B}(y_B) \leftarrow \phi_{A \to B}(y_B) + \alpha^t \big( I([y_A^*]_B = y_B) - \mu_B^*(y_B) \big), \quad \forall B \subset A,\ y_B \in \mathcal{Y}_B.$$

The step size $\alpha^t$ must satisfy the "diminishing step size conditions": i) $\lim_{t \to \infty} \alpha^t = 0$, and ii) $\sum_{t=0}^{\infty} \alpha^t = \infty$.
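One update step of this rule, written out concretely. All values below ($y_A^*$, $\mu^*$, the step size, singleton subsets $B = \{j\}$) are illustrative stand-ins for one iteration's subproblem outputs:

```python
# One supergradient step on phi (sketch): the component for (B, y_B) is the
# indicator that the higher-order solution y*_A restricts to y_B, minus the
# pairwise solution's indicator mu*_B(y_B).

k, n_labels = 3, 2
phi = {(j, yj): 0.0 for j in range(k) for yj in range(n_labels)}

yA_star = (0, 1, 0)                      # from the higher-order subproblem
mu_star = {(0, 0): 1, (0, 1): 0,         # from the pairwise subproblem
           (1, 0): 1, (1, 1): 0,
           (2, 0): 1, (2, 1): 0}

alpha_t = 0.5                            # diminishing step size at iteration t
for (j, yj) in phi:
    phi[(j, yj)] += alpha_t * ((1 if yA_star[j] == yj else 0) - mu_star[(j, yj)])

print(phi[(1, 1)], phi[(1, 0)])  # only variable 1 disagrees, so only its messages move
```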
Lagrangian Relaxation

Subgradient Method (cont)

Typical step size choices:

1. Simple diminishing step size,
$$\alpha^t = \frac{1 + m}{t + m},$$
where $m > 0$ is an arbitrary constant.

2. Polyak's step size,
$$\alpha^t = \beta^t\, \frac{\bar{q} - q(\lambda^t, \mu^t)}{\|u(y(\lambda^t, \mu^t))\|^2 + \|v(y(\lambda^t, \mu^t))\|^2},$$
where $0 < \beta^t < 2$ is a diminishing step size, and $\bar{q} \ge q(\lambda^*, \mu^*)$ is an upper bound on the optimal dual objective.
Lagrangian Relaxation

Subgradient Method (cont)

Primal solution recovery:

- Uniform average,
$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^*.$$

- Geometric series (volume algorithm),
$$\bar{y}^t = \gamma\, y(\lambda^t, \mu^t) + (1 - \gamma)\, \bar{y}^{t-1},$$
where $0 < \gamma < 1$ is a small constant such as $\gamma = 0.1$, and $\bar{y}^t$ might be slightly primal infeasible.
Lagrangian Relaxation

Pattern-based Higher-order Potentials

Example (Pattern-based potentials)
Given a small set of patterns $P = \{y^1, y^2, \dots, y^K\}$, define
$$\theta_A(y_A) = \begin{cases} C_{y_A} & \text{if } y_A \in P, \\ C_{\max} & \text{otherwise.} \end{cases}$$

- (Rother et al., CVPR 2009), (Komodakis and Paragios, CVPR 2009)
- Generalizes the $P^n$-Potts model (Kohli et al., CVPR 2007)
- "Patterns have low energies": $C_y \le C_{\max}$ for all $y \in P$
Can solve
   \min_{y_A \in \mathcal{Y}_A} \; \theta_A(y_A) + \sum_{B \subset A, \; y_B \in \mathcal{Y}_B} I([y_A]_B = y_B) \, \phi_{A \to B}(y_B)
by
1. testing all y_A \in P, and
2. fixing \theta_A(y_A) = C_{\max} and solving for the minimum of the second term.
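A toy sketch of this two-case minimization (illustrative only: binary variables, singleton subsets B = {i}, and made-up message values phi):

```python
# Two-case minimization for a pattern-based factor: test the few patterns
# explicitly, then handle the C_max case, where the message term decouples.

def minimize_pattern_factor(patterns, C, C_max, phi, n):
    """min over y_A of theta_A(y_A) plus unary messages consistent with y_A."""
    def msg_sum(y):
        return sum(phi[(i, y[i])] for i in range(n))

    # Case 1: test every pattern explicitly (|P| is small).
    best_pattern = min(C[p] + msg_sum(p) for p in patterns)

    # Case 2: theta_A(y_A) = C_max; the remaining term decouples over
    # variables, so each one minimizes its own message independently.
    best_free = C_max + sum(min(phi[(i, 0)], phi[(i, 1)]) for i in range(n))
    return min(best_pattern, best_free)

n = 3
patterns = [(0, 0, 0), (1, 1, 1)]
C = {(0, 0, 0): 0.0, (1, 1, 1): 1.0}                        # pattern costs C_{y_A}
phi = {(i, s): float(s) for i in range(n) for s in (0, 1)}  # toy messages favor label 0
val = minimize_pattern_factor(patterns, C, C_max=10.0, phi=phi, n=n)
```

Here the pattern (0, 0, 0) wins with total cost 0; the decoupled C_max case costs 10.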
Dual Decomposition
Constructing Relaxations
Principled methods to construct relaxations
Linear Programming Relaxations
Lagrangian relaxation
Lagrangian/Dual decomposition
Dual Decomposition
Dual/Lagrangian Decomposition
Additive structure
   \min_y \sum_{k=1}^{K} g_k(y)
   sb.t. \; y \in \mathcal{Y}_k, \quad \forall k = 1, \dots, K,
such that \sum_{k=1}^{K} g_k(y) is hard, but for any k,
   \min_y \; g_k(y) \quad sb.t. \; y \in \mathcal{Y}_k
is tractable.
(Guignard, "Lagrangean Decomposition", MathProg 1987)
(Conejo et al., "Decomposition Techniques in Mathematical Programming", 2006)
Dual Decomposition
Dual/Lagrangian Decomposition
Idea
1. Selectively duplicate variables ("variable splitting"),
      \min_{y_1, \dots, y_K, y} \sum_{k=1}^{K} g_k(y_k)
      sb.t. \; y \in \mathcal{Y},
            y_k \in \mathcal{Y}_k, \quad k = 1, \dots, K,
            y = y_k : \lambda_k, \quad k = 1, \dots, K,    (4)
   where \mathcal{Y} \supseteq \bigcap_{k=1,\dots,K} \mathcal{Y}_k.
2. Add coupling equality constraints between the duplicated variables
3. Apply Lagrangian relaxation to the coupling constraints
Also known as dual decomposition and variable splitting.
Dual Decomposition
Dual/Lagrangian Decomposition (cont)
Relaxing the coupling constraints with multipliers \lambda_k gives
   \min_{y_1, \dots, y_K, y} \sum_{k=1}^{K} g_k(y_k) + \sum_{k=1}^{K} \lambda_k^\top (y - y_k)
   sb.t. \; y \in \mathcal{Y},
         y_k \in \mathcal{Y}_k, \quad k = 1, \dots, K,
Dual Decomposition
Dual/Lagrangian Decomposition (cont)
Rearranging,
   \min_{y_1, \dots, y_K, y} \sum_{k=1}^{K} \left( g_k(y_k) - \lambda_k^\top y_k \right) + \left( \sum_{k=1}^{K} \lambda_k \right)^\top y
   sb.t. \; y \in \mathcal{Y},
         y_k \in \mathcal{Y}_k, \quad k = 1, \dots, K,
The problem is decomposed.
There can be multiple possible decompositions of different quality.
Dual Decomposition
Dual/Lagrangian Decomposition (cont)
Dual function
   q(\lambda) := \min_{y_1, \dots, y_K, y} \sum_{k=1}^{K} \left( g_k(y_k) - \lambda_k^\top y_k \right) + \left( \sum_{k=1}^{K} \lambda_k \right)^\top y
   sb.t. \; y \in \mathcal{Y},
         y_k \in \mathcal{Y}_k, \quad k = 1, \dots, K,
Dual maximization problem
   \max_\lambda \; q(\lambda)
   sb.t. \; \sum_{k=1}^{K} \lambda_k = 0,
where \{\lambda \mid \sum_{k=1}^{K} \lambda_k = 0\} is the domain where q(\lambda) > -\infty.
Dual Decomposition
Dual/Lagrangian Decomposition (cont)
Dual maximization problem
   \max_\lambda \; q(\lambda)
   sb.t. \; \sum_{k=1}^{K} \lambda_k = 0,
Projected subgradient method, projecting the subgradient (y - y_k) onto the constraint set:
   (y - y_k) - \frac{1}{K} \sum_{\ell=1}^{K} (y - y_\ell) = \frac{1}{K} \sum_{\ell=1}^{K} y_\ell - y_k.
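The projected update above can be sketched on a toy two-block instance (the feasible sets, costs, and diminishing step size are all hypothetical illustration choices; subproblems are solved by enumeration):

```python
# Projected subgradient for a two-block dual decomposition over binary
# vectors. Each subproblem minimizes g_k(y) - lambda_k^T y over its own
# feasible list; the projected subgradient is ybar - y_k.

def solve_sub(c, lam, Y):
    """argmin over y in Y of (c - lam)^T y."""
    return min(Y, key=lambda y: sum((ci - li) * yi for ci, li, yi in zip(c, lam, y)))

def dual_decomposition(c1, c2, Y1, Y2, iters=100):
    n = len(c1)
    lam1 = [0.0] * n  # lambda_2 = -lambda_1 keeps sum_k lambda_k = 0
    for t in range(1, iters + 1):
        lam2 = [-l for l in lam1]
        y1 = solve_sub(c1, lam1, Y1)
        y2 = solve_sub(c2, lam2, Y2)
        ybar = [(a + b) / 2 for a, b in zip(y1, y2)]  # mean of the y_k
        alpha = 1.0 / t                               # diminishing step size
        lam1 = [l + alpha * (yb - ya) for l, yb, ya in zip(lam1, ybar, y1)]
    return y1, y2

# Hypothetical sets: Y1 enforces y[0] <= y[1]; Y2 enforces y[0] + y[1] <= 1.
Y1 = [(0, 0), (0, 1), (1, 1)]
Y2 = [(0, 0), (0, 1), (1, 0)]
y1, y2 = dual_decomposition([1.0, -1.0], [0.0, 0.0], Y1, Y2)
```

On this instance the two subproblems agree after two iterations on (0, 1), the minimizer of the combined objective over Y1 ∩ Y2.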
Dual Decomposition
Example (Connectivity, Vicente et al., CVPR 2008)
   \min_y \; E(y) + C(y),
where E(y) is a normal binary pairwise segmentation energy, and
   C(y) = \begin{cases} 0 & \text{if the marked points are connected in } y, \\ \infty & \text{otherwise.} \end{cases}
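The hard constraint C(y) is easy to state in code. The following sketch (a hypothetical helper, not part of the solver of Vicente et al.) checks connectivity of two marked pixels on a 4-connected grid mask:

```python
# C(y) = 0 if marked pixels p and q are connected through foreground
# pixels of the binary mask y, infinity otherwise (BFS on the 4-grid).
from collections import deque

def C(y, p, q):
    rows, cols = len(y), len(y[0])
    if not (y[p[0]][p[1]] and y[q[0]][q[1]]):
        return float("inf")  # a marked pixel is not even foreground
    seen, todo = {p}, deque([p])
    while todo:
        r, c = todo.popleft()
        if (r, c) == q:
            return 0.0
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and y[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                todo.append((nr, nc))
    return float("inf")

mask = [[1, 1, 0],
        [0, 1, 0],
        [0, 1, 1]]
```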
Duplicate the variable ("variable splitting"):
   \min_{y,z} \; E(y) + C(z) \quad sb.t. \; y = z.
Attach a multiplier \lambda to the coupling constraint:
   \min_{y,z} \; E(y) + C(z) \quad sb.t. \; y = z : \lambda.
Relaxing the coupling constraint gives
   \min_{y,z} \; E(y) + C(z) + \lambda^\top (y - z).
   q(\lambda) := \underbrace{\min_y \; E(y) + \lambda^\top y}_{\text{Subproblem 1}} + \underbrace{\min_z \; C(z) - \lambda^\top z}_{\text{Subproblem 2}}
Subproblem 1: normal binary image segmentation energy
Subproblem 2: shortest path problem (no pairwise terms)
→ Maximize q(\lambda) to find the best lower bound
The above derivation is simplified; see (Vicente et al., CVPR 2008) for a more sophisticated decomposition using three subproblems.
Dual Decomposition
Other examples
Plenty of applications of decomposition:
(Komodakis et al., ICCV 2007)
(Strandmark and Kahl, CVPR 2010)
(Torresani et al., ECCV 2008)
(Werner, TechReport 2009)
(Kumar and Koller, CVPR 2010)
(Woodford et al., ICCV 2009)
(Vicente et al., ICCV 2009)
Dual Decomposition
Geometric Interpretation
Theorem (Lagrangian Decomposition Primal)
Let \mathcal{Y}_k be finite sets and g(y) = c^\top y a linear objective function. Then the optimal solution of the Lagrangian decomposition attains the value of the following relaxed primal optimization problem:
   \min_y \sum_{k=1}^{K} g_k(y)
   sb.t. \; y \in \operatorname{conv}(\mathcal{Y}_k), \quad \forall k = 1, \dots, K.
Therefore, optimizing the Lagrangian decomposition dual is equivalent to optimizing the primal objective on the intersection of the convex hulls of the individual feasible sets.
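A tiny worked example (hypothetical, not from the tutorial) makes the geometry concrete: the dual optimum can sit at a fractional point of the intersection of the convex hulls.

```latex
% Two feasible sets in the plane with empty integral intersection:
%   \mathcal{Y}_1 = \{(0,0), (1,1)\}, \quad \mathcal{Y}_2 = \{(0,1), (1,0)\},
%   \mathcal{Y}_1 \cap \mathcal{Y}_2 = \emptyset.
% The two segments conv(Y_1) and conv(Y_2) cross in a single point:
\operatorname{conv}(\mathcal{Y}_1) \cap \operatorname{conv}(\mathcal{Y}_2)
  = \bigl\{ (\tfrac{1}{2}, \tfrac{1}{2}) \bigr\}.
% With c = (1,1)^\top the decomposition dual attains c^\top y_D = 1 at the
% fractional point y_D = (1/2, 1/2): a valid lower bound even though no
% integral feasible point exists.
```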
Dual Decomposition
Geometric Interpretation (cont)
[Figure: the polytopes conv(Y_1) and conv(Y_2), their intersection conv(Y_1) ∩ conv(Y_2), the decomposition optimum y_D, the true optimum y^*, and the objective direction c^\top y]
Dual Decomposition
Lagrangian Relaxation vs. Decomposition
Lagrangian Relaxation
Needs explicit complicating constraints
Lagrangian/Dual Decomposition
Can work with black-box solver for subproblem (implicit)
Can yield stronger bounds
[Figure: Lagrangian relaxation optimizes over an outer set Y_1 \supseteq conv(Y_1), giving optimum y_R; decomposition optimizes over conv(Y_1) ∩ conv(Y_2), giving y_D, which lies closer to the true optimum y^*]
Polyhedral Higher-order Interactions
Polyhedral View on Higher-order Interactions
Idea
MAP-MRF LP relaxation: LOCAL polytope
Incorporate higher-order interactions directly by adding constraints to LOCAL
Use techniques from polyhedral combinatorics
See (Werner, CVPR 2008), (Nowozin and Lampert, CVPR 2009), (Lempitsky et al., ICCV 2009)
Polyhedral Higher-order Interactions
Representation of Polytopes
Convex polytopes have two equivalent representations:
- as a convex combination of extreme points,
- as a set of facet-defining linear inequalities.
A linear inequality with respect to a polytope can be
- valid: does not cut off the polytope,
- representing a face: valid and touching,
- facet-defining: representing a face of dimension one less than the polytope.
[Figure: polytope Z with inequalities d_1^\top y ≤ 1, d_2^\top y ≤ 1, d_3^\top y ≤ 1 illustrating the three cases]
Polyhedral Higher-order Interactions
Intersecting Polytopes
Construction
1. Consider the marginal polytope M(G) of a discrete graphical model
2. Vertices of M(G) are feasible labelings
3. Add linear inequalities d^\top \mu \leq b that cut off vertices
4. Optimize over the resulting intersection M'(G),
      E_A(\mu_A) = \begin{cases} 0 & \text{if } \mu \text{ is valid}, \\ \infty & \text{otherwise.} \end{cases}
[Figure: marginal polytope M(G) with a cutting inequality d^\top \mu \leq b producing M'(G)]
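As a minimal illustration (a toy stand-in, not a real marginal polytope), an inequality dᵀµ ≤ b that removes one vertex of the unit square:

```python
# Cutting off a vertex with a linear inequality: the unit square's vertices
# {0,1}^2 stand in for labelings; the inequality mu_0 + mu_1 <= 1 (hypothetical
# choice) cuts off the vertex (1,1) while keeping the other three feasible.
from itertools import product

d, b = (1.0, 1.0), 1.0  # inequality: mu_0 + mu_1 <= 1

vertices = list(product((0, 1), repeat=2))
kept = [v for v in vertices if sum(di * vi for di, vi in zip(d, v)) <= b]
cut = [v for v in vertices if v not in kept]
```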
Polyhedral Higher-order Interactions
Defining the Inequalities
Questions
1. Where do the inequalities come from?
2. Is this construction always possible?
3. How do we optimize over M'(G)?
Polyhedral Higher-order Interactions
1. Deriving Combinatorial Polytopes
“Foreground mask must be connected”
Global constraint on labelings
Polyhedral Higher-order Interactions
Constructing the Inequalities