CVPR2010: Higher Order Models in Computer Vision: Part 3
Transcript

  • 1. Higher Order Models in Computer Vision: Inference
    Carsten Rother and Sebastian Nowozin
    Machine Learning and Perception Group, Microsoft Research Cambridge
    San Francisco, 18th June 2010
  • 2. Belief Propagation Reminder
    Variants: sum-product belief propagation; max-product/max-sum/min-sum belief propagation.
    Messages: vectors at the factor-graph edges, $q_{Y_i \to F} \in \mathbb{R}^{\mathcal{Y}_i}$ (the variable-to-factor message) and $r_{F \to Y_i} \in \mathbb{R}^{\mathcal{Y}_i}$ (the factor-to-variable message).
    Updates: iteratively recompute these vectors.
  • 3. Belief Propagation Reminder: Min-Sum Updates (cont)
    Variable-to-factor message:
    $$q_{Y_i \to F}(y_i) = \sum_{F' \in M(i) \setminus \{F\}} r_{F' \to Y_i}(y_i)$$
    Factor-to-variable message:
    $$r_{F \to Y_i}(y_i) = \min_{\substack{y'_F \in \mathcal{Y}_F \\ y'_i = y_i}} \Big( E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big)$$
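The naive factor-to-variable update translates directly into array operations, at a cost exponential in the factor order, which is exactly the difficulty the next slide raises. A minimal min-sum sketch (not the tutorial's code) for a single third-order factor; the energy table `E_F` and incoming messages `q_in` are made-up placeholders:

```python
import numpy as np

K = 3                                           # labels per variable
rng = np.random.default_rng(0)
E_F = rng.normal(size=(K, K, K))                # energies E_F(y_0, y_1, y_2)
q_in = [rng.normal(size=K) for _ in range(3)]   # messages q_{Y_j -> F}

def factor_to_variable(E_F, q_in, i):
    """r_{F -> Y_i}(y_i): minimize E_F plus all other incoming messages,
    holding the i-th argument fixed."""
    total = E_F.copy()
    for j, q in enumerate(q_in):
        if j != i:
            shape = [1] * E_F.ndim
            shape[j] = K
            total = total + q.reshape(shape)    # add q_j along axis j
    other_axes = tuple(j for j in range(E_F.ndim) if j != i)
    return total.min(axis=other_axes)           # min over all y_j, j != i

print(factor_to_variable(E_F, q_in, i=1))       # vector of length K
# The variable-to-factor message q_{Y_i -> F} is then just the sum of the
# other incoming factor-to-variable messages at Y_i.
```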
  • 5. Higher-order Factors
    Factor-to-variable message:
    $$r_{F \to Y_i}(y_i) = \min_{\substack{y'_F \in \mathcal{Y}_F \\ y'_i = y_i}} \Big( E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big)$$
    The minimization becomes intractable for general $E_F$, but remains tractable for special structure of $E_F$.
    See (Felzenszwalb and Huttenlocher, CVPR 2004), (Potetz, CVPR 2007), and (Tarlow et al., AISTATS 2010).
  • 7. Problem Relaxations
    Optimization problems (minimizing $g : \mathcal{G} \to \mathbb{R}$ over $\mathcal{Y} \subseteq \mathcal{G}$) can become easier if the feasible set is enlarged, and/or the objective function is replaced with a bound.
    (Figure: objective $g$ over the feasible set $\mathcal{Y}$ and a lower bound $h$ over the enlarged set $\mathcal{Z} \supseteq \mathcal{Y}$.)
  • 8. Problem Relaxations (cont)
    Definition (Relaxation (Geoffrion, 1974)). Given two optimization problems $(g, \mathcal{Y}, \mathcal{G})$ and $(h, \mathcal{Z}, \mathcal{G})$, the problem $(h, \mathcal{Z}, \mathcal{G})$ is said to be a relaxation of $(g, \mathcal{Y}, \mathcal{G})$ if
    1. $\mathcal{Z} \supseteq \mathcal{Y}$, i.e. the feasible set of the relaxation contains the feasible set of the original problem, and
    2. $\forall y \in \mathcal{Y}: h(y) \leq g(y)$, i.e. over the original feasible set the objective function $h$ achieves no larger values than the objective function $g$.
    (Minimization version.)
  • 9. Relaxations
    A relaxed solution $z^*$ provides a bound: $h(z^*) \geq g(y^*)$ for maximization (upper bound), $h(z^*) \leq g(y^*)$ for minimization (lower bound).
    The relaxation is typically tractable, and there is evidence that relaxations are "better" for learning (Kulesza and Pereira, 2007), (Finley and Joachims, 2008), (Martins et al., 2009).
    Drawback: $z^* \in \mathcal{Z} \setminus \mathcal{Y}$ is possible: the solution is not integral (but fractional), or the solution violates constraints (primal infeasible). To obtain $y \in \mathcal{Y}$ from $z^*$: rounding, heuristics.
  • 11. Constructing Relaxations
    Principled methods to construct relaxations:
    Linear programming relaxations
    Lagrangian relaxation
    Lagrangian/dual decomposition
  • 13. MAP-MRF Linear Programming Relaxation
    Simple two-variable pairwise MRF.
    (Figure: factors $\theta_1$, $\theta_{1,2}$, $\theta_2$ over label sets $\mathcal{Y}_1$, $\mathcal{Y}_1 \times \mathcal{Y}_2$, $\mathcal{Y}_2$.)
  • 14. MAP-MRF Linear Programming Relaxation (cont)
    Example labeling: $y_1 = 2$, $y_2 = 3$, i.e. $(y_1, y_2) = (2, 3)$.
    Energy: $E(y; x) = \theta_1(y_1; x) + \theta_{1,2}(y_1, y_2; x) + \theta_2(y_2; x)$
    Probability: $p(y|x) = \frac{1}{Z(x)} \exp\{-E(y; x)\}$
    MAP prediction: $\operatorname{argmax}_{y \in \mathcal{Y}} p(y|x) = \operatorname{argmin}_{y \in \mathcal{Y}} E(y; x)$
  • 15. MAP-MRF Linear Programming Relaxation (cont)
    Indicator variables: $\mu_1 \in \{0,1\}^{\mathcal{Y}_1}$, $\mu_{1,2} \in \{0,1\}^{\mathcal{Y}_1 \times \mathcal{Y}_2}$, $\mu_2 \in \{0,1\}^{\mathcal{Y}_2}$.
    Marginalization: $\mu_1(y_1) = \sum_{y_2 \in \mathcal{Y}_2} \mu_{1,2}(y_1, y_2)$ and $\sum_{y_1 \in \mathcal{Y}_1} \mu_{1,2}(y_1, y_2) = \mu_2(y_2)$.
    Normalization: $\sum_{y_1 \in \mathcal{Y}_1} \mu_1(y_1) = 1$, $\sum_{(y_1,y_2) \in \mathcal{Y}_1 \times \mathcal{Y}_2} \mu_{1,2}(y_1, y_2) = 1$, $\sum_{y_2 \in \mathcal{Y}_2} \mu_2(y_2) = 1$.
    The energy is now linear: $E(y; x) = \langle \theta, \mu \rangle$.
  • 16. MAP-MRF Linear Programming Relaxation (cont)
    $$\min_{\mu} \sum_{i \in V} \sum_{y_i \in \mathcal{Y}_i} \theta_i(y_i)\,\mu_i(y_i) + \sum_{\{i,j\} \in E} \sum_{(y_i,y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \theta_{i,j}(y_i,y_j)\,\mu_{i,j}(y_i,y_j)$$
    subject to
    $$\sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V,$$
    $$\sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i,y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E,\ \forall y_i \in \mathcal{Y}_i,$$
    $$\mu_i(y_i) \in \{0,1\}, \qquad \mu_{i,j}(y_i,y_j) \in \{0,1\}.$$
    → an NP-hard integer linear program
  • 18. MAP-MRF Linear Programming Relaxation (cont)
    Relax the integrality constraints to $\mu_i(y_i) \in [0,1]$ and $\mu_{i,j}(y_i,y_j) \in [0,1]$, keeping the normalization and marginalization constraints unchanged.
    → a linear program → the LOCAL polytope
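To make the relaxed problem concrete, here is a minimal sketch (not the tutorial's code) of this LOCAL LP for the two-variable model above, solved with `scipy.optimize.linprog`; the energies `theta1`, `theta2`, `theta12` are made-up illustration values:

```python
import numpy as np
from scipy.optimize import linprog

theta1 = np.array([0.0, 1.0])            # unary energies theta_1(y_1)
theta2 = np.array([0.5, 0.0])            # unary energies theta_2(y_2)
theta12 = np.array([[0.0, 1.0],          # pairwise energies theta_12(y_1, y_2)
                    [1.0, 0.0]])

# Variable order: mu1(0), mu1(1), mu2(0), mu2(1),
#                 mu12(0,0), mu12(0,1), mu12(1,0), mu12(1,1).
c = np.concatenate([theta1, theta2, theta12.ravel()])

A_eq, b_eq = [], []
# Normalization: sum_y mu1(y) = 1 and sum_y mu2(y) = 1.
A_eq.append([1, 1, 0, 0, 0, 0, 0, 0]); b_eq.append(1)
A_eq.append([0, 0, 1, 1, 0, 0, 0, 0]); b_eq.append(1)
# Marginalization: sum_{y2} mu12(y1, y2) = mu1(y1) for each y1 ...
A_eq.append([-1, 0, 0, 0, 1, 1, 0, 0]); b_eq.append(0)
A_eq.append([0, -1, 0, 0, 0, 0, 1, 1]); b_eq.append(0)
# ... and sum_{y1} mu12(y1, y2) = mu2(y2) for each y2.
A_eq.append([0, 0, -1, 0, 1, 0, 1, 0]); b_eq.append(0)
A_eq.append([0, 0, 0, -1, 0, 1, 0, 1]); b_eq.append(0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, 1))
print(res.fun)   # LP lower bound on the MAP energy
print(res.x)     # relaxed marginals mu
```

Because this factor graph is a tree, the LP optimum here is integral, in line with the tightness result three slides below.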
  • 19. MAP-MRF LP Relaxation: Prior Work
    Analysis: (Wainwright et al., IEEE Trans. Inf. Theory, 2005), (Weiss et al., UAI 2007), (Werner, PAMI 2005).
    TRW-S: (Kolmogorov, PAMI 2006).
    Improving tightness: (Komodakis and Paragios, ECCV 2008), (Kumar and Torr, ICML 2008), (Sontag and Jaakkola, NIPS 2007), (Sontag et al., UAI 2008), (Werner, CVPR 2008), (Kumar et al., NIPS 2007).
    Algorithms based on the LP: (Globerson and Jaakkola, NIPS 2007), (Kumar and Torr, ICML 2008), (Sontag et al., UAI 2008).
  • 20. MAP-MRF LP Relaxation: General Case
    Generalization to higher-order factors: for factors $B \subseteq A$, marginalization constraints
    $$\sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \sum_{y_A \in \mathcal{Y}_A,\; [y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$
    This defines the LOCAL polytope (Wainwright and Jordan, 2008).
  • 21. MAP-MRF LP Relaxation: Known Results
    1. LOCAL is tight iff the factor graph is a forest (acyclic).
    2. All labelings are vertices of LOCAL (Wainwright and Jordan, 2008).
    3. For cyclic graphs there are additional fractional vertices.
    4. If all factors have regular energies, the fractional solutions are never optimal (Wainwright and Jordan, 2008).
    5. For models with only binary states: half-integrality, and the integral variables are certain.
    See some examples later.
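Point 3 can be observed directly. A sketch reusing the `linprog` formulation above, now on a 3-cycle of binary variables with made-up "anti-Potts" pairwise energies (cost 1 for equal labels): the LP optimum is the fractional vertex $\mu \equiv 1/2$ with value 0, while every integer labeling of an odd cycle pays at least 1:

```python
import numpy as np
from scipy.optimize import linprog

V, E = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
L = 2                                       # binary labels
nu = len(V) * L                             # number of unary indicators
idx_u = lambda i, y: i * L + y                       # index of mu_i(y)
idx_p = lambda e, a, b: nu + e * L * L + a * L + b   # index of mu_ij(a, b)
n = nu + len(E) * L * L

c = np.zeros(n)
for e in range(len(E)):                     # theta_ij(a, b) = 1 iff a == b
    for a in range(L):
        c[idx_p(e, a, a)] = 1.0

A, b = [], []
for i in V:                                 # normalization of unaries
    row = np.zeros(n); row[[idx_u(i, y) for y in range(L)]] = 1
    A.append(row); b.append(1)
for e, (i, j) in enumerate(E):              # marginalization constraints
    for a in range(L):
        row = np.zeros(n); row[idx_u(i, a)] = -1
        for bb in range(L): row[idx_p(e, a, bb)] = 1
        A.append(row); b.append(0)
    for bb in range(L):
        row = np.zeros(n); row[idx_u(j, bb)] = -1
        for a in range(L): row[idx_p(e, a, bb)] = 1
        A.append(row); b.append(0)

res = linprog(c, A_eq=np.array(A), b_eq=np.array(b), bounds=(0, 1))
print(res.fun, res.x[:nu])  # LP value 0 < integer optimum 1; all mu_i(y) = 1/2
```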
  • 22. Constructing Relaxations
    Principled methods to construct relaxations:
    Linear programming relaxations
    Lagrangian relaxation
    Lagrangian/dual decomposition
  • 23. Lagrangian Relaxation: Motivation
    The MAP-MRF LP with higher-order factors is problematic: the number of LP variables grows with $|\mathcal{Y}_A|$, i.e. exponentially in the number of MRF variables in the factor. But the number of marginalization constraints stays small ($|\mathcal{Y}_B|$).
    Lagrangian relaxation allows one to solve the LP by treating the constraints implicitly.
  • 24. Lagrangian Relaxation
    $$\min_y g(y) \quad \text{s.t. } y \in \mathcal{D},\; y \in \mathcal{C}.$$
    Assumption: optimizing $g(y)$ over $y \in \mathcal{D}$ is "easy"; optimizing over $y \in \mathcal{D} \cap \mathcal{C}$ is hard.
    High-level idea: incorporate the $y \in \mathcal{C}$ constraint into the objective function.
    For an excellent introduction, see (Guignard, "Lagrangean Relaxation", TOP 2003) and (Lemaréchal, "Lagrangian Relaxation", 2001).
  • 25. Lagrangian Relaxation (cont)
    Recipe, step 1: express $\mathcal{C}$ in terms of equalities and inequalities,
    $$\mathcal{C} = \{ y \in \mathcal{G} : u_i(y) = 0,\ \forall i = 1,\dots,I,\quad v_j(y) \leq 0,\ \forall j = 1,\dots,J \},$$
    with $u_i : \mathcal{G} \to \mathbb{R}$ differentiable, typically affine, and $v_j : \mathcal{G} \to \mathbb{R}$ differentiable, typically convex.
    Step 2: introduce Lagrange multipliers, yielding
    $$\min_y g(y) \quad \text{s.t. } y \in \mathcal{D},\quad u_i(y) = 0 : \lambda_i,\quad v_j(y) \leq 0 : \mu_j.$$
  • 26. Lagrangian Relaxation (cont)
    Step 3: build the partial Lagrangian,
    $$\min_y g(y) + \lambda^\top u(y) + \mu^\top v(y) \quad \text{s.t. } y \in \mathcal{D}. \tag{1}$$
  • 27. Lagrangian Relaxation (cont)
    Theorem (Weak Duality of Lagrangian Relaxation). For differentiable functions $u_i : \mathcal{G} \to \mathbb{R}$ and $v_j : \mathcal{G} \to \mathbb{R}$, and for any $\lambda \in \mathbb{R}^I$ and any non-negative $\mu \in \mathbb{R}^J$, $\mu \geq 0$, problem (1) is a relaxation of the original problem: its value is lower than or equal to the optimal value of the original problem.
  • 28. Lagrangian Relaxation (cont)
    The partial Lagrangian defines the dual function
    $$q(\lambda, \mu) := \min_{y \in \mathcal{D}} g(y) + \lambda^\top u(y) + \mu^\top v(y). \tag{1}$$
    Step 4: maximize the lower bound with respect to $\lambda, \mu$:
    $$\max_{\lambda, \mu} q(\lambda, \mu) \quad \text{s.t. } \mu \geq 0 \tag{2}$$
    → efficiently solvable if $q(\lambda, \mu)$ can be evaluated.
  • 29. Optimizing Lagrangian Dual Functions
    Theorem (Lagrangian Dual Function).
    1. $q$ is concave in $\lambda$ and $\mu$; problem (2) is a concave maximization problem.
    2. If $q$ is unbounded above, then the original problem is infeasible.
    3. For any $\lambda$ and $\mu \geq 0$, let $y(\lambda,\mu) = \operatorname{argmin}_{y \in \mathcal{D}} g(y) + \lambda^\top u(y) + \mu^\top v(y)$. Then a supergradient of $q$ can be constructed by evaluating the constraint functions at $y(\lambda,\mu)$: $u(y(\lambda,\mu)) \in \partial_\lambda q(\lambda,\mu)$ and $v(y(\lambda,\mu)) \in \partial_\mu q(\lambda,\mu)$.
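A toy instance (made-up numbers) exercising the whole recipe: the easy set $\mathcal{D} = \{0,1\}^3$ is small enough to enumerate when evaluating $q(\lambda)$, and the supergradient is the constraint value $u(y(\lambda))$, exactly as in point 3:

```python
import itertools
import numpy as np

c = np.array([-1.0, -2.0, 3.0])
a, b = np.array([1.0, 1.0, 1.0]), 1.0   # complicating constraint a.y - b = 0
D = [np.array(y, dtype=float) for y in itertools.product([0, 1], repeat=3)]

def q_and_minimizer(lam):
    """Evaluate q(lambda) = min_{y in D} c.y + lambda * u(y) by enumeration."""
    vals = [c @ y + lam * (a @ y - b) for y in D]
    k = int(np.argmin(vals))
    return vals[k], D[k]

lam = 0.0
for t in range(1, 201):
    q, y = q_and_minimizer(lam)
    g = a @ y - b                # u(y(lambda)): a supergradient of q at lambda
    lam += (1.0 / t) * g         # supergradient ascent, diminishing step
print(lam, q_and_minimizer(lam)[0])  # dual value: a lower bound on the optimum
```

On this instance the dual optimum matches the constrained optimum $-2$; in general, weak duality only guarantees a lower bound.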
  • 30. Subgradients, Supergradients
    (Figure: a convex function $f(x)$ with a supporting affine function at $x$.)
    A subgradient $g$ of $f$ at $x$ satisfies $f(y) \geq f(x) + g^\top (y - x)$ for all $y$; at a minimizer, $0 \in \partial f(x)$.
    Subgradients of convex functions are global affine underestimators; supergradients of concave functions are global affine overestimators.
  • 34. Example Behaviour of Objectives
    (Figure: dual objective increasing and primal objective decreasing over 250 iterations, from a 10-by-10 pairwise binary submodular MRF relaxation.)
  • 35. Primal Solution Recovery
    Assume we solved the dual problem for $(\lambda^*, \mu^*)$. 1. Can we obtain a primal solution $y^*$? 2. Can we say something about the relaxation quality?
    Theorem (Sufficient Optimality Conditions). If for given $\lambda, \mu \geq 0$ we have $u(y(\lambda,\mu)) = 0$ and $v(y(\lambda,\mu)) \leq 0$ (primal feasibility), and further $\lambda^\top u(y(\lambda,\mu)) = 0$ and $\mu^\top v(y(\lambda,\mu)) = 0$ (complementary slackness), then $y(\lambda,\mu)$ is an optimal primal solution to the original problem, and $(\lambda,\mu)$ is an optimal solution to the dual problem.
  • 37. Primal Solution Recovery (cont)
    But we might never see a solution satisfying the optimality conditions; that happens only if there is no duality gap, i.e. $q(\lambda^*, \mu^*) = g(y^*)$.
    Special case: for integer linear programs we can always reconstruct a primal solution to
    $$\min_y g(y) \quad \text{s.t. } y \in \operatorname{conv}(\mathcal{D}),\; y \in \mathcal{C}. \tag{3}$$
    For example, for subgradient method updates it is known (Anstreicher and Wolsey, MathProg 2009) that
    $$\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^* \text{ of } (3).$$
  • 38. Application to Higher-order Factors
    (Komodakis and Paragios, CVPR 2009) propose a MAP inference algorithm suitable for higher-order factors. It can be seen as Lagrangian relaxation of the marginalization constraint $\sum_{y_A \in \mathcal{Y}_A,\,[y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B)$, $\forall y_B \in \mathcal{Y}_B$, of the MAP-MRF LP.
    See also (Johnson et al., ACCC 2007), (Schlesinger and Giginyak, CSC 2007), (Wainwright and Jordan, 2008, Sect. 4.1.3), (Werner, PAMI 2005), (Werner, TechReport 2009), (Werner, CVPR 2008).
  • 39. Komodakis MAP-MRF Decomposition
    Start from the pairwise MAP-MRF integer program, written with inner products:
    $$\min_{\mu} \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle$$
    subject to $\sum_{y_i} \mu_i(y_i) = 1$ for all $i \in V$; $\sum_{(y_i,y_j)} \mu_{i,j}(y_i,y_j) = 1$ for all $\{i,j\} \in E$; $\sum_{y_j} \mu_{i,j}(y_i,y_j) = \mu_i(y_i)$ for all $\{i,j\} \in E$ and $y_i \in \mathcal{Y}_i$; and $\mu_i(y_i), \mu_{i,j}(y_i,y_j) \in \{0,1\}$.
  • 41. Komodakis MAP-MRF Decomposition (cont)
    Add a higher-order factor $A$ with energy term $\sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\, \mu_A(y_A)$ and constraints
    $$\sum_{y_A \in \mathcal{Y}_A,\,[y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B) \;:\; \phi_{A \to B}(y_B), \quad \forall B \subset A,\ y_B \in \mathcal{Y}_B,$$
    $$\sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \mu_A(y_A) \in \{0,1\},$$
    where the $\phi_{A \to B}(y_B)$ are the Lagrange multipliers to be attached to the marginalization constraints.
  • 43. Komodakis MAP-MRF Decomposition (cont)
    Relaxing the marginalization constraints with multipliers $\phi$ and rearranging gives the dual function
    $$q(\phi) := \min_{\mu} \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle - \sum_{B \subset A} \sum_{y_B \in \mathcal{Y}_B} \phi_{A \to B}(y_B)\, \mu_B(y_B) + \sum_{y_A \in \mathcal{Y}_A} \Big( \theta_A(y_A) + \sum_{B \subset A} \phi_{A \to B}([y_A]_B) \Big) \mu_A(y_A),$$
    subject to the remaining normalization, pairwise marginalization, and integrality constraints.
    The objective separates: the first three sums form a pairwise MAP subproblem with $\phi$-modified potentials, and the last sum is an independent subproblem over $\mu_A$.
  • 48. Message Passing
    (Figure: higher-order factor $A$ sending messages $\phi_{A \to B}, \phi_{A \to C}, \phi_{A \to D}, \phi_{A \to E}$ to unary factors.)
    Maximizing $q(\phi)$ corresponds to message passing from the higher-order factor to the unary factors.
  • 49. The Higher-order Subproblem
    $$\min_{\mu_A} \sum_{y_A \in \mathcal{Y}_A} \Big( \theta_A(y_A) + \sum_{B \subset A} \phi_{A \to B}([y_A]_B) \Big) \mu_A(y_A) \quad \text{s.t. } \sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1,\ \ \mu_A(y_A) \in \{0,1\}.$$
    Simple structure: select one element. We can therefore simplify to an optimization problem over $y_A \in \mathcal{Y}_A$.
  • 50. The Higher-order Subproblem (cont)
    $$\min_{y_A \in \mathcal{Y}_A} \theta_A(y_A) + \sum_{B \subset A} \sum_{y_B \in \mathcal{Y}_B} I([y_A]_B = y_B)\, \phi_{A \to B}(y_B)$$
    Solvability depends on the structure of $\theta_A$ as a function of $y_A$. (For now) assume we can solve for $y_A^* \in \mathcal{Y}_A$. How can we use this to solve the original large LP?
  • 51. Maximizing the Dual Function: Subgradient Method
    Maximizing $q(\phi)$: 1. initialize $\phi$; 2. iterate: 2.1 given $\phi$, solve the pairwise MAP subproblem, obtaining $\mu^*$; 2.2 given $\phi$, solve the higher-order MAP subproblem, obtaining $y_A^*$; 2.3 update $\phi$.
    Supergradient: for each $B \subset A$ and $y_B \in \mathcal{Y}_B$, the component at $\phi_{A \to B}(y_B)$ is $I([y_A^*]_B = y_B) - \mu_B^*(y_B)$.
    Supergradient ascent with step size $\alpha_t > 0$:
    $$\phi_{A \to B}(y_B) \leftarrow \phi_{A \to B}(y_B) + \alpha_t \big( I([y_A^*]_B = y_B) - \mu_B^*(y_B) \big).$$
    The step size $\alpha_t$ must satisfy the "diminishing step size conditions": i) $\lim_{t \to \infty} \alpha_t = 0$, and ii) $\sum_{t=0}^{\infty} \alpha_t = \infty$.
  • 52. Subgradient Method (cont)
    Typical step size choices:
    1. Simple diminishing step size, $\alpha_t = \frac{1+m}{t+m}$, where $m > 0$ is an arbitrary constant.
    2. Polyak's step size, $\alpha_t = \beta_t \, \frac{\bar{q} - q(\lambda^t, \mu^t)}{\| u(y(\lambda^t,\mu^t)) \|^2 + \| v(y(\lambda^t,\mu^t)) \|^2}$, where $0 < \beta_t < 2$ is a diminishing step size and $\bar{q} \geq q(\lambda^*, \mu^*)$ is an upper bound on the optimal dual objective.
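The two rules as small helpers (a sketch; `q_bar` stands for the upper bound $\bar{q}$, `subgrad` for the stacked supergradient $(u(y), v(y))$):

```python
import numpy as np

def diminishing_step(t, m=10.0):
    """alpha_t = (1 + m) / (t + m): tends to 0, and its series diverges."""
    return (1.0 + m) / (t + m)

def polyak_step(q_bar, q_t, subgrad, beta_t=1.0):
    """Polyak's rule: alpha_t = beta_t * (q_bar - q_t) / ||g_t||^2."""
    return beta_t * (q_bar - q_t) / (np.dot(subgrad, subgrad) + 1e-12)
```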
  • 53. Subgradient Method (cont)
    Primal solution recovery by averaging:
    Uniform average: $\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^*$.
    Geometric series (volume algorithm): $\bar{y}^t = \gamma\, y(\lambda^t, \mu^t) + (1 - \gamma)\, \bar{y}^{t-1}$, where $0 < \gamma < 1$ is a small constant such as $\gamma = 0.1$; $\bar{y}^t$ might be slightly primal infeasible.
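Both recovery schemes in code form (a minimal sketch):

```python
import numpy as np

def uniform_average(ys):
    """(1/T) sum_t y(lambda_t, mu_t); converges to a primal solution of (3)."""
    return np.mean(ys, axis=0)

def volume_average(y_bar, y_t, gamma=0.1):
    """Volume-algorithm update: gamma * y_t + (1 - gamma) * y_bar_{t-1};
    the result may be slightly primal infeasible."""
    return gamma * np.asarray(y_t) + (1.0 - gamma) * np.asarray(y_bar)
```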
  • 54. Pattern-based Higher-order Potentials
    Example (Pattern-based potentials). Given a small set of patterns $\mathcal{P} = \{y^1, y^2, \dots, y^K\}$, define
    $$\theta_A(y_A) = \begin{cases} C_{y_A} & \text{if } y_A \in \mathcal{P}, \\ C_{\max} & \text{otherwise.} \end{cases}$$
    (Rother et al., CVPR 2009), (Komodakis and Paragios, CVPR 2009). Generalizes the $P^n$-Potts model (Kohli et al., CVPR 2007). "Patterns have low energies": $C_y \leq C_{\max}$ for all $y \in \mathcal{P}$.
  • 55. Pattern-based Higher-order Potentials (cont)
    For pattern-based $\theta_A$ we can solve
    $$\min_{y_A \in \mathcal{Y}_A} \theta_A(y_A) + \sum_{B \subset A} \sum_{y_B \in \mathcal{Y}_B} I([y_A]_B = y_B)\, \phi_{A \to B}(y_B)$$
    by 1. testing all $y_A \in \mathcal{P}$, and 2. fixing $\theta_A(y_A) = C_{\max}$ and solving for the minimum of the second term. A sketch follows below.
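A sketch of the two-step minimization, assuming (as is common) that the messages go to the unary subsets $B = \{i\}$; the pattern set, energies, and messages below are made-up toy values:

```python
import numpy as np

K, n = 3, 4                                   # labels, |A|
rng = np.random.default_rng(1)
phi = rng.normal(size=(n, K))                 # messages phi_{A -> i}(y_i)
P = [(0, 0, 1, 2), (1, 1, 1, 1)]              # the patterns
C = {p: -2.0 for p in P}                      # pattern energies C_y
C_max = 0.0                                   # energy of all other labelings

# Step 1: test every pattern explicitly.
best_val, best_y = np.inf, None
for p in P:
    v = C[p] + sum(phi[i, p[i]] for i in range(n))
    if v < best_val:
        best_val, best_y = v, p

# Step 2: fix theta_A = C_max; the message sum then decomposes per variable.
y_free = tuple(int(np.argmin(phi[i])) for i in range(n))
v_free = C_max + sum(phi[i, y_free[i]] for i in range(n))
if v_free < best_val:     # sound because C_y <= C_max for every pattern y
    best_val, best_y = v_free, y_free
print(best_y, best_val)
```

The total cost is $O(|\mathcal{P}|\,|A| + |A|\,K)$, which is what makes such potentials usable even though $|\mathcal{Y}_A|$ is exponential.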
  • 56. Constructing Relaxations
    Principled methods to construct relaxations:
    Linear programming relaxations
    Lagrangian relaxation
    Lagrangian/dual decomposition
  • 57. Dual/Lagrangian Decomposition
    Additive structure:
    $$\min_y \sum_{k=1}^{K} g_k(y) \quad \text{s.t. } y \in \mathcal{Y}_k,\ \forall k = 1, \dots, K,$$
    such that minimizing $\sum_{k=1}^{K} g_k(y)$ is hard, but for any single $k$, $\min_y g_k(y)$ s.t. $y \in \mathcal{Y}_k$ is tractable.
    (Guignard, "Lagrangean Decomposition", MathProg 1987); (Conejo et al., "Decomposition Techniques in Mathematical Programming", 2006).
  • 58. Dual/Lagrangian Decomposition (cont)
    Idea: 1. selectively duplicate variables ("variable splitting"),
    $$\min_{y_1,\dots,y_K,\,y} \sum_{k=1}^{K} g_k(y_k) \quad \text{s.t. } y \in \mathcal{Y},\ \ y_k \in \mathcal{Y}_k,\ \ y = y_k : \lambda_k,\ \ k = 1, \dots, K, \tag{4}$$
    where $\mathcal{Y} \supseteq \bigcap_{k=1,\dots,K} \mathcal{Y}_k$; 2. add coupling equality constraints between the duplicated variables; 3. apply Lagrangian relaxation to the coupling constraints.
    Also known as dual decomposition and variable splitting.
  • 59. Dual/Lagrangian Decomposition (cont)
    Relaxing the coupling constraints $y = y_k : \lambda_k$ gives
    $$\min_{y_1,\dots,y_K,\,y} \sum_{k=1}^{K} g_k(y_k) + \sum_{k=1}^{K} \lambda_k^\top (y - y_k) \;=\; \min_{y_1,\dots,y_K,\,y} \sum_{k=1}^{K} \big( g_k(y_k) - \lambda_k^\top y_k \big) + \Big( \sum_{k=1}^{K} \lambda_k \Big)^{\!\top} y,$$
    subject to $y \in \mathcal{Y}$ and $y_k \in \mathcal{Y}_k$ for $k = 1, \dots, K$.
    The problem is decomposed. There can be multiple possible decompositions of different quality.
  • 62. Dual/Lagrangian Decomposition (cont)
    The dual function $q(\lambda)$ is the value of the decomposed minimization above. The dual maximization problem is
    $$\max_{\lambda} q(\lambda) \quad \text{s.t. } \sum_{k=1}^{K} \lambda_k = 0,$$
    where $\{\lambda \mid \sum_{k=1}^{K} \lambda_k = 0\}$ is the domain on which $q(\lambda) > -\infty$.
  • 63. Dual/Lagrangian Decomposition (cont)
    Solve the dual by the projected subgradient method: projecting the supergradient onto the subspace $\sum_k \lambda_k = 0$ gives, for each block $\lambda_k$,
    $$(y - y_k) - \frac{1}{K} \sum_{\ell=1}^{K} (y - y_\ell) \;=\; \frac{1}{K} \sum_{\ell=1}^{K} y_\ell - y_k.$$
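A complete toy run (made-up problem): minimize $c^\top y$ over $y \in \{0,1\}^n$ with the side constraint $\sum_i y_i = 1$, split into two easy subproblems coupled by the duplicated variables; the projected update keeps $\lambda_1 + \lambda_2 = 0$:

```python
import numpy as np

n = 5
rng = np.random.default_rng(2)
c = rng.normal(size=n)
c1, c2 = 0.5 * c, 0.5 * c                  # g(y) = g1(y) + g2(y) = c.y

def solve_sub1(w):
    """min w.y over Y1 = {0,1}^n: take every coordinate with w < 0."""
    return (w < 0).astype(float)

def solve_sub2(w):
    """min w.y over Y2 = {y in {0,1}^n : sum(y) = 1}: take the best one."""
    y = np.zeros(n)
    y[int(np.argmin(w))] = 1.0
    return y

lam1, lam2 = np.zeros(n), np.zeros(n)      # maintained with lam1 + lam2 = 0
for t in range(1, 501):
    y1 = solve_sub1(c1 - lam1)             # q(lambda) splits per subproblem
    y2 = solve_sub2(c2 - lam2)
    y_bar = 0.5 * (y1 + y2)                # mean of the subproblem solutions
    alpha = 1.0 / t                        # diminishing step size
    lam1 += alpha * (y_bar - y1)           # projected supergradient step;
    lam2 += alpha * (y_bar - y2)           # stays on the sum-to-zero subspace

q = (c1 - lam1) @ solve_sub1(c1 - lam1) + (c2 - lam2) @ solve_sub2(c2 - lam2)
print(q, c.min())   # dual lower bound vs. the true constrained optimum min(c)
```

Here $\operatorname{conv}(\mathcal{Y}_1) \cap \operatorname{conv}(\mathcal{Y}_2)$ is the probability simplex, so by the geometric interpretation a few slides below the dual bound approaches $\min_i c_i$.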
  • 64. Example (Connectivity, Vicente et al., CVPR 2008)
    (Figure: interactive segmentation where the foreground must connect user-marked points.)
  • 66. Example (Connectivity, Vicente et al., CVPR 2008)
    $$\min_y E(y) + C(y),$$
    where $E(y)$ is a normal binary pairwise segmentation energy, and
    $$C(y) = \begin{cases} 0 & \text{if the marked points are connected in } y, \\ \infty & \text{otherwise.} \end{cases}$$
  • 67. Example (Connectivity, Vicente et al., CVPR 2008)
    Variable splitting and relaxation of the coupling constraint:
    $$\min_{y,z}\ E(y) + C(z) \quad \text{s.t. } y = z : \lambda \qquad \Longrightarrow \qquad \min_{y,z}\ E(y) + C(z) + \lambda^\top (y - z).$$
  • 70. Example (Connectivity, Vicente et al., CVPR 2008)
    $$q(\lambda) := \underbrace{\min_y\ E(y) + \lambda^\top y}_{\text{Subproblem 1}} \;+\; \underbrace{\min_z\ C(z) - \lambda^\top z}_{\text{Subproblem 2}}$$
    Subproblem 1: a normal binary image segmentation energy. Subproblem 2: a shortest-path problem (no pairwise terms). → Maximize $q(\lambda)$ to find the best lower bound.
    The above derivation is simplified; see (Vicente et al., CVPR 2008) for a more sophisticated decomposition using three subproblems.
  • 72. Other Examples
    Plenty of applications of decomposition: (Komodakis et al., ICCV 2007), (Strandmark and Kahl, CVPR 2010), (Torresani et al., ECCV 2008), (Werner, TechReport 2009), (Kumar and Koller, CVPR 2010), (Woodford et al., ICCV 2009), (Vicente et al., ICCV 2009).
  • 73. Geometric Interpretation
    Theorem (Lagrangian Decomposition Primal). Let the $\mathcal{Y}_k$ be finite sets and $g(y) = c^\top y$ a linear objective function. Then the optimal solution of the Lagrangian decomposition attains the value of the following relaxed primal optimization problem:
    $$\min_y \sum_{k=1}^{K} g_k(y) \quad \text{s.t. } y \in \operatorname{conv}(\mathcal{Y}_k),\ \forall k = 1, \dots, K.$$
    Therefore, optimizing the Lagrangian decomposition dual is equivalent to optimizing the primal objective on the intersection of the convex hulls of the individual feasible sets.
  • 75. Geometric Interpretation (cont)
    (Figure: the convex hulls $\operatorname{conv}(\mathcal{Y}_1)$ and $\operatorname{conv}(\mathcal{Y}_2)$, their intersection $\operatorname{conv}(\mathcal{Y}_1) \cap \operatorname{conv}(\mathcal{Y}_2)$, the cost direction $c$, and the decomposition optimum $y_D$ versus the true optimum $y^*$.)
  • 79. Lagrangian Relaxation vs. Decomposition
    Lagrangian relaxation: needs explicit complicating constraints.
    Lagrangian/dual decomposition: can work with a black-box solver for each subproblem (implicit constraints), and can yield stronger bounds.
  • 80. Lagrangian Relaxation vs. Decomposition (cont)
    (Figure: relaxing one feasible set to $\mathcal{Y}_1' \supseteq \operatorname{conv}(\mathcal{Y}_1)$ yields the weaker relaxation optimum $y_R$, compared to the decomposition optimum $y_D$ and the true optimum $y^*$.)
  • 81. Polyhedral View on Higher-order Interactions
    Idea: the MAP-MRF LP relaxation optimizes over the LOCAL polytope; incorporate higher-order interactions directly by constraining LOCAL, using techniques from polyhedral combinatorics.
    See (Werner, CVPR 2008), (Nowozin and Lampert, CVPR 2009), (Lempitsky et al., ICCV 2009).
  • 82. Representation of Polytopes
    Convex polytopes have two equivalent representations: as convex combinations of their extreme points, and as a set of facet-defining linear inequalities.
    (Figure: a polytope $Z$ with inequalities $d_1^\top y \leq 1$, $d_2^\top y \leq 1$, $d_3^\top y \leq 1$.)
    A linear inequality with respect to a polytope can be: valid (does not cut off the polytope); representing a face (valid and touching); or facet-defining (representing a face of dimension one less than the polytope).
  • 83. Intersecting Polytopes
    Construction: 1. consider the marginal polytope $\mathcal{M}(G)$ of a discrete graphical model; 2. the vertices of $\mathcal{M}(G)$ are the feasible labelings; 3. add linear inequalities $d^\top \mu \leq b$ that cut off vertices; 4. optimize over the resulting intersection $\mathcal{M}'(G)$.
    Equivalently, this models the higher-order interaction
    $$E_A(\mu_A) = \begin{cases} 0 & \text{if } \mu \text{ is valid}, \\ \infty & \text{otherwise.} \end{cases}$$
  • 87. Defining the Inequalities
    Questions: 1. Where do the inequalities come from? 2. Is this construction always possible? 3. How to optimize over $\mathcal{M}'(G)$?
  • 88. 1. Deriving Combinatorial Polytopes
    "The foreground mask must be connected": a global constraint on labelings.
  • 90. Constructing the Inequalities
    (Figure sequence: building up a separator-based inequality on a small pixel grid; for the three-pixel case shown it reads $y_0 + y_2 - y_1 \leq 1$.)
  • 96. Generalization and Intuition
    The following inequalities are violated by unconnected subgraphs:
    $$y_i + y_j - \sum_{k \in S} y_k \leq 1, \quad \forall (i,j) \notin E,\ \forall S \in \mathcal{S}(i,j),$$
    where $\mathcal{S}(i,j)$ denotes the set of vertex separator sets of $i$ and $j$.
    Intuition: "If two vertices $i$ and $j$ are selected ($y_i = y_j = 1$), then any set of vertices which separates them (a set $S$) must contain at least one selected vertex."
    (Figure: vertices $i$ and $j$ and one vertex separator set $S \in \mathcal{S}(i,j)$.)
  • 97. Derivation, Formal Steps
    Steps from intuition to inequalities: 1. identify invariants that hold for any "good" solution; 2. formulate the invariants as linear inequalities; 3. prove that the inequalities are facet-defining; 4. prove that together with integrality they are a formulation.
    The final result is a statement like:
    Theorem. For a given graph $G = (V, E)$, the set of all induced subgraphs that are connected can be described exactly by the following constraint set:
    $$y_i + y_j - \sum_{k \in S} y_k \leq 1, \quad \forall (i,j) \notin E,\ \forall S \in \mathcal{S}(i,j), \qquad y_i \in \{0,1\},\ i \in V.$$
  • 99. 2. Is This Always Possible?
    Yes, but identifying and proving the inequalities is not straightforward, and even if all facet-defining inequalities are known, the problem can remain intractable.
    Further reading: the polyhedral combinatorics literature (Schrijver, 2003), (Ziegler, 1994). There exists software to help identify facet-defining inequalities, e.g. polymake.
  • 100. 3. How to Optimize over $\mathcal{M}'(G)$?
    LOCAL has a small number of inequalities; $\mathcal{M}'(G)$ typically has an exponential number of inequalities. Optimization over $\mathcal{M}'(G)$ is still tractable if the separation problem can be solved.
    Definition (Separation problem). Given a point $y$, prove that $y \in \mathcal{M}'(G)$ or produce a valid inequality that is strictly violated. The resulting cutting-plane loop is sketched below.
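A runnable skeleton of that cutting-plane loop on a toy problem (not the tutorial's code): minimize $c^\top y$ over $[0,1]^3$ subject to the implicitly given family $y_i + y_j \leq 1$ for all $i < j$; only the `separate` oracle changes when the constraint family does:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

n = 3
c = np.array([-1.0, -1.0, -1.0])
A_ub, b_ub = [], []                              # cuts found so far

def separate(y, tol=1e-9):
    """Return a most-violated inequality a.y <= 1, or None if y is feasible.
    (In the MRF setting this check is itself an optimization problem.)"""
    worst, row = tol, None
    for i, j in itertools.combinations(range(n), 2):
        if y[i] + y[j] - 1 > worst:
            worst = y[i] + y[j] - 1
            row = np.eye(n)[i] + np.eye(n)[j]
    return row

while True:
    res = linprog(c, A_ub=np.array(A_ub) if A_ub else None,
                  b_ub=np.array(b_ub) if b_ub else None, bounds=(0, 1))
    row = separate(res.x)
    if row is None:
        break                                    # proven inside M'(G): done
    A_ub.append(row); b_ub.append(1.0)           # cut off the current point
print(res.x, res.fun)
```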
  • 101. Example: Separation for Connectivity
    We need to show that for all $(i,j) \notin E$ with $y_i > 0$, $y_j > 0$:
    $$y_i + y_j - \sum_{k \in S} y_k - 1 \leq 0, \quad \forall S \in \mathcal{S}(i,j).$$
    Equivalently, in variational form:
    $$\max_{S \in \mathcal{S}(i,j)} \Big( y_i + y_j - \sum_{k \in S} y_k - 1 \Big) \leq 0. \tag{5}$$
    If (5) is satisfied, no violated inequality for $(i,j)$ exists. Otherwise, the set corresponding to the violated inequality is
    $$S^*(i,j) = \operatorname{argmin}_{S \in \mathcal{S}(i,j)} \sum_{k \in S} y_k. \tag{6}$$
  • 102. Example: Separation for Connectivity (cont)
    We need to solve $S^*(i,j) = \operatorname{argmin}_{S \in \mathcal{S}(i,j)} \sum_{k \in S} y_k$. Transform $G$ into a directed edge-weighted graph (splitting each vertex $v$ into an internal arc of capacity $y_v$, with the original edges becoming arcs of unbounded capacity) and solve a minimum $(i,j)$-cut problem; equivalently, solve a linear max-flow problem (efficient).
    It follows that, by construction, a finite-capacity $(i,j)$-cut always exists, and the cut found corresponds to a violated inequality or proves its absence.
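A sketch of this separation step using `networkx` max-flow/min-cut; node splitting turns the vertex weights $y_v$ into arc capacities (arcs without a capacity attribute are treated as unbounded by networkx), and the graph and $y$ values below are made up:

```python
import networkx as nx

def separate_connectivity(G, y, i, j, tol=1e-9):
    """For non-adjacent i, j: return the separator set S of a violated
    inequality y_i + y_j - sum_{k in S} y_k <= 1, or None."""
    D = nx.DiGraph()
    for v in G.nodes:
        if v in (i, j):
            D.add_edge((v, "in"), (v, "out"))          # unbounded capacity
        else:
            D.add_edge((v, "in"), (v, "out"), capacity=y[v])
    for u, v in G.edges:                               # unbounded arcs
        D.add_edge((u, "out"), (v, "in"))
        D.add_edge((v, "out"), (u, "in"))
    cut, (side_i, _) = nx.minimum_cut(D, (i, "out"), (j, "in"))
    if y[i] + y[j] - cut - 1 > tol:                    # inequality violated
        return {v for v in G.nodes
                if (v, "in") in side_i and (v, "out") not in side_i}
    return None

# Toy check: a path a - b - c with a fractional interior value.
G = nx.path_graph(["a", "b", "c"])
y = {"a": 1.0, "b": 0.4, "c": 1.0}
print(separate_connectivity(G, y, "a", "c"))   # {'b'}: y_a + y_c - y_b > 1
```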
  • 103. Example
    (Figure: a 30-by-30 "X" pattern and its noisy version.)
  • 104. Example (cont)
    (Figure: MRF / MRFcomp / CMRF results on two noisy inputs, with energies $E = -985.61 / -974.16 / -984.21$ and $E = -980.13 / -974.03 / -976.83$.)
  • 105. Bounding-box Prior
    Example (Bounding Box Prior (Lempitsky et al., ICCV 2009)). Prior: the segmentation should be tight with respect to the bounding box. An exponential number of linear inequalities.
  • 106. Relaxation and Decomposition: Summary
    Lagrangian relaxation: requires complicating constraints; widely applicable.
    Problem decomposition: requires multiple tractable subproblems; can make use of efficient algorithms for the subproblems; widely applicable.
    Polyhedral higher-order interactions: requires a combinatorial understanding of the constraint set; universal (can represent any constraint); tractability depends on a tractable separation oracle.
