
CVPR 2010: Higher Order Models in Computer Vision: Part 3


  1. Higher Order Models in Computer Vision: Inference. Carsten Rother and Sebastian Nowozin, Machine Learning and Perception Group, Microsoft Research Cambridge. San Francisco, 18th June 2010. Agenda: Belief Propagation, Relaxations, MAP-MRF LP, Lagrangian Relaxation, Dual Decomposition, Polyhedral Higher-order Interactions.
  2. Belief Propagation Reminder. Two variants: sum-product belief propagation and max-product (equivalently max-sum/min-sum) belief propagation. Both use messages, vectors stored on the factor-graph edges: q_{Y_i→F} ∈ R^{Y_i}, the variable-to-factor message, and r_{F→Y_i} ∈ R^{Y_i}, the factor-to-variable message. Updates: iteratively recompute these vectors. [Figure: the messages q_{Y_i→F} and r_{F→Y_i} on the edge between variable Y_i and factor F.]
  3. Belief Propagation Reminder: Min-Sum Updates. Variable-to-factor message:
$$q_{Y_i \to F}(y_i) = \sum_{F' \in M(i) \setminus \{F\}} r_{F' \to Y_i}(y_i)$$
Factor-to-variable message:
$$r_{F \to Y_i}(y_i) = \min_{y'_F \in \mathcal{Y}_F,\; y'_i = y_i} \Big( E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big)$$
Here M(i) denotes the factors adjacent to variable Y_i, and N(F) the variables adjacent to factor F.
  5. Higher-order Factors. The factor-to-variable message
$$r_{F \to Y_i}(y_i) = \min_{y'_F \in \mathcal{Y}_F,\; y'_i = y_i} \Big( E_F(y'_F) + \sum_{j \in N(F) \setminus \{i\}} q_{Y_j \to F}(y'_j) \Big)$$
requires a minimization over all joint states of the factor. This minimization becomes intractable for general E_F, but remains tractable for special structure of E_F. See (Felzenszwalb and Huttenlocher, CVPR 2004), (Potetz, CVPR 2007), and (Tarlow et al., AISTATS 2010).
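To make concrete why this update is the bottleneck, here is a minimal NumPy sketch (ours, not from the slides) of the brute-force min-sum factor-to-variable message; its cost is exponential in the factor order, which is exactly what the special-purpose methods cited above avoid.

```python
import itertools
import numpy as np

def factor_to_variable_message(energy, incoming, i):
    """Brute-force min-sum message r_{F->Y_i}(y_i).

    energy:   array with one axis per variable in N(F), holding E_F
              for every joint labeling y_F.
    incoming: list of variable-to-factor messages q_{Y_j->F}, one
              length-|Y_j| vector per variable of the factor.
    i:        axis of the target variable Y_i.
    """
    r = np.full(energy.shape[i], np.inf)
    # Enumerate every joint state y'_F: exponential in the factor order.
    for yF in itertools.product(*(range(s) for s in energy.shape)):
        total = energy[yF] + sum(
            incoming[j][yF[j]] for j in range(energy.ndim) if j != i)
        r[yF[i]] = min(r[yF[i]], total)
    return r

# Example: a third-order factor over three binary variables.
E = np.random.rand(2, 2, 2)
q = [np.zeros(2), np.zeros(2), np.zeros(2)]
print(factor_to_variable_message(E, q, i=0))
```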
  7. Problem Relaxations. Optimization problems (minimizing g : G → R over Y ⊆ G) can become easier if the feasible set is enlarged, and/or the objective function is replaced with a bound. [Figure: the objective g on the original feasible set Y, and a lower bound h on the enlarged set Z ⊇ Y.]
  8. Problem Relaxations. Definition (Relaxation (Geoffrion, 1974)). Given two optimization problems (g, Y, G) and (h, Z, G), the problem (h, Z, G) is said to be a relaxation of (g, Y, G) if
1. Z ⊇ Y, i.e. the feasible set of the relaxation contains the feasible set of the original problem, and
2. ∀y ∈ Y : h(y) ≤ g(y), i.e. over the original feasible set the objective function h achieves no larger values than the objective function g.
(Minimization version.)
  9. Relaxations. The relaxed solution z* provides a bound: h(z*) ≥ g(y*) for maximization (upper bound), h(z*) ≤ g(y*) for minimization (lower bound). The relaxation is typically tractable, and there is evidence that relaxations are "better" for learning (Kulesza and Pereira, 2007), (Finley and Joachims, 2008), (Martins et al., 2009). Drawback: z* ∈ Z \ Y is possible, i.e. the solution is not integral (but fractional), or it violates constraints (primal infeasible). To obtain some z ∈ Y from z*: rounding, heuristics.
  11. Constructing Relaxations. Principled methods to construct relaxations: linear programming relaxations, Lagrangian relaxation, Lagrangian/dual decomposition.
  13. MAP-MRF Linear Programming Relaxation. [Figure: a simple two-variable pairwise MRF with nodes Y_1 and Y_2, unary potentials θ_1 and θ_2, and a pairwise potential θ_{1,2} on Y_1 × Y_2.]
  14. MAP-MRF Linear Programming Relaxation. For a labeling such as (y_1, y_2) = (2, 3):
Energy: E(y; x) = θ_1(y_1; x) + θ_{1,2}(y_1, y_2; x) + θ_2(y_2; x)
Probability: p(y|x) = (1/Z(x)) exp{−E(y; x)}
MAP prediction: argmax_{y∈Y} p(y|x) = argmin_{y∈Y} E(y; x)
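For this two-variable example, MAP prediction can be spelled out directly; a tiny sketch with made-up potential values (names are ours):

```python
import itertools
import numpy as np

L1, L2 = 3, 4                      # label set sizes |Y_1|, |Y_2|
theta1 = np.random.rand(L1)        # unary potential theta_1
theta2 = np.random.rand(L2)        # unary potential theta_2
theta12 = np.random.rand(L1, L2)   # pairwise potential theta_{1,2}

def energy(y1, y2):
    return theta1[y1] + theta12[y1, y2] + theta2[y2]

# argmax_y p(y|x) = argmin_y E(y;x), here by exhaustive enumeration.
y_map = min(itertools.product(range(L1), range(L2)),
            key=lambda y: energy(*y))
print("MAP labeling:", y_map, "with energy", energy(*y_map))
```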
  15. MAP-MRF Linear Programming Relaxation. Introduce indicator variables µ_1 ∈ {0,1}^{Y_1}, µ_{1,2} ∈ {0,1}^{Y_1×Y_2}, µ_2 ∈ {0,1}^{Y_2} with marginalization and normalization constraints:
$$\mu_1(y_1) = \sum_{y_2 \in \mathcal{Y}_2} \mu_{1,2}(y_1, y_2), \qquad \sum_{y_1 \in \mathcal{Y}_1} \mu_{1,2}(y_1, y_2) = \mu_2(y_2),$$
$$\sum_{y_1 \in \mathcal{Y}_1} \mu_1(y_1) = 1, \qquad \sum_{(y_1,y_2) \in \mathcal{Y}_1 \times \mathcal{Y}_2} \mu_{1,2}(y_1, y_2) = 1, \qquad \sum_{y_2 \in \mathcal{Y}_2} \mu_2(y_2) = 1.$$
The energy is now linear: E(y; x) = ⟨θ, µ⟩.
  16. MAP-MRF Linear Programming Relaxation (cont). As an integer linear program:
$$\min_{\mu} \; \sum_{i \in V} \sum_{y_i \in \mathcal{Y}_i} \theta_i(y_i)\,\mu_i(y_i) + \sum_{\{i,j\} \in E} \sum_{(y_i,y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \theta_{i,j}(y_i, y_j)\,\mu_{i,j}(y_i, y_j)$$
$$\text{s.t.} \quad \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1, \quad \forall i \in V,$$
$$\sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i), \quad \forall \{i,j\} \in E, \; \forall y_i \in \mathcal{Y}_i,$$
$$\mu_i(y_i) \in \{0,1\}, \qquad \mu_{i,j}(y_i, y_j) \in \{0,1\}.$$
→ NP-hard integer linear program
  18. MAP-MRF Linear Programming Relaxation (cont). Relax the integrality constraints to
$$\mu_i(y_i) \in [0,1], \qquad \mu_{i,j}(y_i, y_j) \in [0,1],$$
keeping the same objective, normalization, and marginalization constraints. → linear program → LOCAL polytope
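A minimal sketch of this LP for the two-variable model, assuming SciPy (variable layout and data are ours). Because this instance is tree-structured, the LOCAL relaxation is tight and the solver returns an integral µ, consistent with the known results two slides below.

```python
import numpy as np
from scipy.optimize import linprog

L1 = L2 = 2
theta1 = np.array([0.0, 1.0])                 # unary on Y_1
theta2 = np.array([0.5, 0.0])                 # unary on Y_2
theta12 = np.array([[0.0, 1.0],               # pairwise on Y_1 x Y_2
                    [1.0, 0.0]])

# Variable order: mu_1 (L1 entries), mu_2 (L2), mu_12 (L1*L2, row-major).
c = np.concatenate([theta1, theta2, theta12.ravel()])
n = c.size
A_eq, b_eq = [], []

# Normalization of the unary marginals.
for lo, hi in [(0, L1), (L1, L1 + L2)]:
    r = np.zeros(n); r[lo:hi] = 1
    A_eq.append(r); b_eq.append(1)
# Marginalization: sum_{y2} mu_12(y1, y2) = mu_1(y1).
for y1 in range(L1):
    r = np.zeros(n); r[y1] = -1
    r[L1 + L2 + y1 * L2 : L1 + L2 + (y1 + 1) * L2] = 1
    A_eq.append(r); b_eq.append(0)
# Marginalization: sum_{y1} mu_12(y1, y2) = mu_2(y2).
for y2 in range(L2):
    r = np.zeros(n); r[L1 + y2] = -1
    r[L1 + L2 + y2::L2] = 1
    A_eq.append(r); b_eq.append(0)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(0, 1)] * n, method="highs")
print(res.fun, res.x)   # integral optimum on this tree-structured model
```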
  19. MAP-MRF LP Relaxation, Related Work.
Analysis: (Wainwright et al., IEEE Trans. Inf. Theory, 2005), (Weiss et al., UAI 2007), (Werner, PAMI 2005).
TRW-S: (Kolmogorov, PAMI 2006).
Improving tightness: (Komodakis and Paragios, ECCV 2008), (Kumar and Torr, ICML 2008), (Sontag and Jaakkola, NIPS 2007), (Sontag et al., UAI 2008), (Werner, CVPR 2008), (Kumar et al., NIPS 2007).
Algorithms based on the LP: (Globerson and Jaakkola, NIPS 2007), (Kumar and Torr, ICML 2008), (Sontag et al., UAI 2008).
  20. MAP-MRF LP Relaxation, General Case. For a higher-order factor over a set A and any subset B ⊆ A, the constraints are
$$\sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \sum_{y_A \in \mathcal{Y}_A,\; [y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$
This generalizes the relaxation to higher-order factors: for factors with B ⊆ A we obtain marginalization constraints, and together they define the LOCAL polytope (Wainwright and Jordan, 2008).
  21. MAP-MRF LP Relaxation, Known Results.
1. LOCAL is tight iff the factor graph is a forest (acyclic).
2. All labelings are vertices of LOCAL (Wainwright and Jordan, 2008).
3. For cyclic graphs there are additional fractional vertices.
4. If all factors have regular energies, the fractional solutions are never optimal (Wainwright and Jordan, 2008).
5. For models with only binary states: half-integrality, and integral variables are certain.
See some examples later.
  22. Constructing Relaxations. Principled methods to construct relaxations: linear programming relaxations, Lagrangian relaxation, Lagrangian/dual decomposition. Next: Lagrangian relaxation.
  23. Lagrangian Relaxation, Motivation. Consider the marginalization constraint
$$\sum_{y_A \in \mathcal{Y}_A,\; [y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$
The MAP-MRF LP with higher-order factors is problematic: the number of LP variables grows with |Y_A|, i.e. exponentially in the number of MRF variables in the factor. But the number of constraints stays small (|Y_B|). Lagrangian relaxation allows one to solve the LP by treating the constraints implicitly.
  24. Lagrangian Relaxation.
$$\min_y \; g(y) \quad \text{s.t.} \quad y \in D, \; y \in C.$$
Assumption: optimizing g(y) over y ∈ D is "easy"; optimizing over y ∈ D ∩ C is hard. High-level idea: incorporate the constraint y ∈ C into the objective function. For an excellent introduction, see (Guignard, "Lagrangean Relaxation", TOP 2003) and (Lemaréchal, "Lagrangian Relaxation", 2001).
  25. Lagrangian Relaxation (cont). Recipe:
1. Express C in terms of equalities and inequalities,
$$C = \{y \in G : u_i(y) = 0,\ \forall i = 1,\dots,I,\quad v_j(y) \le 0,\ \forall j = 1,\dots,J\},$$
with u_i : G → R differentiable, typically affine, and v_j : G → R differentiable, typically convex.
2. Introduce Lagrange multipliers, yielding
$$\min_y \; g(y) \quad \text{s.t.} \quad y \in D, \quad u_i(y) = 0 : \lambda_i, \; i = 1,\dots,I, \quad v_j(y) \le 0 : \mu_j, \; j = 1,\dots,J.$$
  26. Lagrangian Relaxation (cont).
3. Build the partial Lagrangian,
$$\min_y \; g(y) + \lambda^\top u(y) + \mu^\top v(y) \quad \text{s.t.} \quad y \in D. \tag{1}$$
  27. Lagrangian Relaxation (cont). Theorem (Weak Duality of Lagrangean Relaxation). For differentiable functions u_i : G → R and v_j : G → R, and for any λ ∈ R^I and any non-negative µ ∈ R^J, µ ≥ 0, problem (1) is a relaxation of the original problem: its value is lower than or equal to the optimal value of the original problem.
  28. Lagrangian Relaxation (cont).
3. Build the partial Lagrangian,
$$q(\lambda, \mu) := \min_{y \in D} \; g(y) + \lambda^\top u(y) + \mu^\top v(y). \tag{1}$$
4. Maximize the lower bound with respect to λ, µ:
$$\max_{\lambda, \mu} \; q(\lambda, \mu) \quad \text{s.t.} \quad \mu \ge 0.$$
→ efficiently solvable if q(λ, µ) can be evaluated
  29. Optimizing Lagrangian Dual Functions.
4. Maximize the lower bound:
$$\max_{\lambda, \mu} \; q(\lambda, \mu) \quad \text{s.t.} \quad \mu \ge 0. \tag{2}$$
Theorem (Lagrangean Dual Function).
1. q is concave in λ and µ; problem (2) is a concave maximization problem.
2. If q is unbounded above, then the original problem is infeasible.
3. For any λ and any µ ≥ 0, let y(λ, µ) = argmin_{y∈D} g(y) + λ^⊤ u(y) + µ^⊤ v(y). Then a supergradient of q can be constructed by evaluating the constraint functions at y(λ, µ):
$$u(y(\lambda,\mu)) \in \frac{\partial}{\partial \lambda}\, q(\lambda,\mu), \qquad v(y(\lambda,\mu)) \in \frac{\partial}{\partial \mu}\, q(\lambda,\mu).$$
  30. Subgradients, Supergradients. [Figure: a convex function f(x) with an affine underestimator touching it at x.] A subgradient g of f at x satisfies
$$f(y) \ge f(x) + g^\top (y - x), \quad \forall y,$$
and x is a global minimizer iff 0 ∈ ∂f(x). Subgradients for convex functions are global affine underestimators; supergradients for concave functions are global affine overestimators.
  34. Example behaviour of objectives (from a 10-by-10 pairwise binary MRF relaxation, submodular). [Plot: dual objective and primal objective versus iteration; over roughly 250 iterations the dual lower bound increases while the recovered primal objective decreases.]
  35. Primal Solution Recovery. Assume we solved the dual problem for (λ*, µ*). 1. Can we obtain a primal solution y*? 2. Can we say something about the relaxation quality?
Theorem (Sufficient Optimality Conditions). If for a given λ and µ ≥ 0 we have u(y(λ, µ)) = 0 and v(y(λ, µ)) ≤ 0 (primal feasibility), and further λ^⊤ u(y(λ, µ)) = 0 and µ^⊤ v(y(λ, µ)) = 0 (complementary slackness), then y(λ, µ) is an optimal primal solution to the original problem, and (λ, µ) is an optimal solution to the dual problem.
  37. Primal Solution Recovery (cont). But we might never see a solution satisfying the optimality conditions; this happens only if there is no duality gap, i.e. q(λ*, µ*) = g(y*). Special case: for integer linear programs we can always reconstruct a primal solution to
$$\min_y \; g(y) \quad \text{s.t.} \quad y \in \operatorname{conv}(D), \; y \in C. \tag{3}$$
For example, for subgradient method updates it is known that (Anstreicher and Wolsey, MathProg 2009)
$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^* \text{ of (3)}.$$
  38. Application to Higher-order Factors. Recall the marginalization constraint
$$\sum_{y_A \in \mathcal{Y}_A,\; [y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B), \quad \forall y_B \in \mathcal{Y}_B.$$
(Komodakis and Paragios, CVPR 2009) propose a MAP inference algorithm suitable for higher-order factors. It can be seen as a Lagrangian relaxation of the marginalization constraints of the MAP-MRF LP. See also (Johnson et al., ACCC 2007), (Schlesinger and Giginyak, CSC 2007), (Wainwright and Jordan, 2008, Sect. 4.1.3), (Werner, PAMI 2005), (Werner, TechReport 2009), (Werner, CVPR 2008).
  39. Komodakis MAP-MRF Decomposition. Start from the pairwise MAP ILP, written with inner products:
$$\min_{\mu} \; \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle$$
$$\text{s.t.} \quad \sum_{y_i \in \mathcal{Y}_i} \mu_i(y_i) = 1 \;\; \forall i \in V, \qquad \sum_{(y_i,y_j) \in \mathcal{Y}_i \times \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = 1 \;\; \forall \{i,j\} \in E,$$
$$\sum_{y_j \in \mathcal{Y}_j} \mu_{i,j}(y_i, y_j) = \mu_i(y_i) \;\; \forall \{i,j\} \in E, \forall y_i \in \mathcal{Y}_i, \qquad \mu_i(y_i) \in \{0,1\}, \quad \mu_{i,j}(y_i, y_j) \in \{0,1\}.$$
  41. Komodakis MAP-MRF Decomposition. Add a higher-order factor over a set A, with energies θ_A and indicator variables µ_A:
$$\min_{\mu} \; \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\,\mu_A(y_A)$$
subject to the constraints of the pairwise ILP, plus
$$\sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \mu_A(y_A) \in \{0,1\} \;\; \forall y_A \in \mathcal{Y}_A,$$
and the coupling marginalization constraints, to which we attach Lagrange multipliers φ_{A→B}(y_B):
$$\sum_{y_A \in \mathcal{Y}_A,\; [y_A]_B = y_B} \mu_A(y_A) = \mu_B(y_B) \;:\; \phi_{A \to B}(y_B), \quad \forall B \subset A, \; y_B \in \mathcal{Y}_B.$$
  43. Komodakis MAP-MRF Decomposition. Dualizing the coupling constraints gives the partial Lagrangian
$$q(\phi) := \min_{\mu} \; \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\,\mu_A(y_A) + \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \Big( \sum_{\substack{y_A \in \mathcal{Y}_A \\ [y_A]_B = y_B}} \mu_A(y_A) - \mu_B(y_B) \Big)$$
subject to the remaining (easy) constraints: normalization of µ_i, µ_{i,j}, and µ_A, the pairwise marginalization constraints, and integrality.
  45. Komodakis MAP-MRF Decomposition. Regrouping terms, the objective separates:
$$q(\phi) = \min_{\mu} \; \sum_{i \in V} \langle \theta_i, \mu_i \rangle + \sum_{\{i,j\} \in E} \langle \theta_{i,j}, \mu_{i,j} \rangle - \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B)\,\mu_B(y_B) \;+\; \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \sum_{\substack{y_A \in \mathcal{Y}_A \\ [y_A]_B = y_B}} \mu_A(y_A) + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\,\mu_A(y_A)$$
subject to the same remaining constraints. The first group involves only the pairwise-model variables µ_i, µ_{i,j}, µ_B; the second group involves only µ_A, so the two subproblems can be solved independently.
  48. Message Passing. [Figure: a higher-order factor A sending messages φ_{A→B}, φ_{A→C}, φ_{A→D}, φ_{A→E} to the unary factors B, C, D, E.] Maximizing q(φ) corresponds to message passing from the higher-order factor to the unary factors.
  49. The Higher-order Subproblem.
$$\min_{\mu_A} \; \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} \phi_{A \to B}(y_B) \sum_{\substack{y_A \in \mathcal{Y}_A \\ [y_A]_B = y_B}} \mu_A(y_A) + \sum_{y_A \in \mathcal{Y}_A} \theta_A(y_A)\,\mu_A(y_A)$$
$$\text{s.t.} \quad \sum_{y_A \in \mathcal{Y}_A} \mu_A(y_A) = 1, \qquad \mu_A(y_A) \in \{0,1\} \;\; \forall y_A \in \mathcal{Y}_A.$$
Simple structure: select one element. Simplify to an optimization problem over y_A ∈ Y_A.
  50. The Higher-order Subproblem.
$$\min_{y_A \in \mathcal{Y}_A} \; \theta_A(y_A) + \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} I([y_A]_B = y_B)\,\phi_{A \to B}(y_B)$$
Solvability depends on the structure of θ_A as a function of y_A. (For now) assume we can solve for y_A* ∈ Y_A. How can we use this to solve the original large LP?
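As a reference point, a brute-force sketch of this reduced subproblem (our own illustration, with B restricted to the singleton subsets {i} ⊂ A, matching the message-passing picture above):

```python
import itertools
import numpy as np

def solve_higher_order_subproblem(theta_A, phi):
    """min over y_A of theta_A(y_A) + sum_i phi_i(y_A[i]).

    theta_A: function from a label tuple y_A to its energy.
    phi:     list of message vectors phi_{A->{i}}, one per variable in A.
    """
    best_val, best_y = np.inf, None
    for yA in itertools.product(*(range(len(p)) for p in phi)):
        val = theta_A(yA) + sum(p[y] for p, y in zip(phi, yA))
        if val < best_val:
            best_val, best_y = val, yA
    return best_y, best_val
```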
  51. Maximizing the Dual Function: Subgradient Method. Maximizing q(φ):
1. Initialize φ.
2. Iterate:
2.1 Given φ, solve the pairwise MAP subproblem, obtaining µ* (in particular the unary indicators µ*_B).
2.2 Given φ, solve the higher-order MAP subproblem, obtaining y_A*.
2.3 Update φ.
Supergradient:
$$\frac{\partial}{\partial \phi_{A \to B}(y_B)}\, q(\phi) \ni I([y_A^*]_B = y_B) - \mu_B^*(y_B).$$
Supergradient ascent with step size α_t > 0:
$$\phi_{A \to B}(y_B) \leftarrow \phi_{A \to B}(y_B) + \alpha_t \big( I([y_A^*]_B = y_B) - \mu_B^*(y_B) \big).$$
The step size α_t must satisfy the diminishing step size conditions: (i) lim_{t→∞} α_t = 0, and (ii) Σ_{t=0}^∞ α_t = ∞.
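A schematic sketch of the resulting ascent loop; solve_pairwise and solve_higher_order are assumed black-box subproblem solvers (hypothetical names), and messages go to singleton subsets only:

```python
import numpy as np

def supergradient_ascent(solve_pairwise, solve_higher_order, phi0, T=200):
    """Supergradient ascent on q(phi) for one higher-order factor A.

    solve_pairwise(phi)     -> list of unary indicator vectors mu*_i
    solve_higher_order(phi) -> minimizing joint labeling y*_A (tuple)
    phi0                    -> list of initial message vectors
    """
    phi = [np.asarray(v, dtype=float).copy() for v in phi0]
    for t in range(1, T + 1):
        mu = solve_pairwise(phi)           # step 2.1: pairwise subproblem
        yA = solve_higher_order(phi)       # step 2.2: higher-order subproblem
        alpha = 1.0 / t                    # diminishing step size
        for i in range(len(phi)):
            g = -np.asarray(mu[i], dtype=float)   # supergradient I[.] - mu*_i
            g[yA[i]] += 1.0
            phi[i] += alpha * g            # step 2.3: ascent update
    return phi
```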
  52. Subgradient Method (cont). Typical step size choices:
1. Simple diminishing step size,
$$\alpha_t = \frac{1 + m}{t + m},$$
where m > 0 is an arbitrary constant.
2. Polyak's step size,
$$\alpha_t = \beta_t \, \frac{\bar{q} - q(\lambda^t, \mu^t)}{\|u(y(\lambda^t, \mu^t))\|^2 + \|v(y(\lambda^t, \mu^t))\|^2},$$
where 0 < β_t < 2 is a diminishing step size and q̄ ≥ q(λ*, µ*) is an upper bound on the optimal dual objective.
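Both schedules as small helpers (our phrasing of the slide's formulas; the guard against a zero-norm supergradient in the Polyak rule is an added assumption):

```python
def simple_diminishing(t, m=10.0):
    # alpha_t = (1 + m) / (t + m) with an arbitrary constant m > 0;
    # satisfies lim alpha_t = 0 and sum_t alpha_t = infinity.
    return (1.0 + m) / (t + m)

def polyak(q_t, q_bar, grad_sq_norm, beta_t=1.0):
    # alpha_t = beta_t (q_bar - q_t) / ||supergradient||^2, 0 < beta_t < 2,
    # where q_bar upper-bounds the optimal dual value.
    return beta_t * (q_bar - q_t) / max(grad_sq_norm, 1e-12)
```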
  53. Subgradient Method (cont). Primal solution recovery:
Uniform average,
$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} y(\lambda^t, \mu^t) \to y^*.$$
Geometric series (volume algorithm),
$$\bar{y}^t = \gamma\, y(\lambda^t, \mu^t) + (1 - \gamma)\, \bar{y}^{t-1},$$
where 0 < γ < 1 is a small constant such as γ = 0.1, and ȳ^t might be slightly primal infeasible.
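Both recovery schemes in a few lines (our own sketch; iterates is the stored sequence of subproblem solutions y(λ^t, µ^t) as arrays):

```python
import numpy as np

def recover_primal(iterates, gamma=0.1):
    uniform = np.mean(iterates, axis=0)       # uniform average over t
    volume = np.asarray(iterates[0], dtype=float)
    for y in iterates[1:]:                    # volume algorithm
        volume = gamma * np.asarray(y) + (1.0 - gamma) * volume
    return uniform, volume                    # both may be slightly fractional
```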
  54. Pattern-based Higher-order Potentials. Example (Pattern-based potentials). Given a small set of patterns P = {y^1, y^2, ..., y^K}, define
$$\theta_A(y_A) = \begin{cases} C_{y_A} & \text{if } y_A \in P, \\ C_{\max} & \text{otherwise.} \end{cases}$$
(Rother et al., CVPR 2009), (Komodakis and Paragios, CVPR 2009). Generalizes the P^n-Potts model (Kohli et al., CVPR 2007). "Patterns have low energies": C_y ≤ C_max for all y ∈ P.
  55. Pattern-based Higher-order Potentials. With this θ_A we can solve
$$\min_{y_A \in \mathcal{Y}_A} \; \theta_A(y_A) + \sum_{\substack{B \subset A \\ y_B \in \mathcal{Y}_B}} I([y_A]_B = y_B)\,\phi_{A \to B}(y_B)$$
by 1. testing all y_A ∈ P, and 2. fixing θ_A(y_A) = C_max and solving for the minimum of the second term.
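A sketch of this two-step minimization for messages to singleton subsets (names and data layout are ours):

```python
import numpy as np

def solve_pattern_subproblem(patterns, C_pat, C_max, phi):
    """Higher-order subproblem for a pattern-based potential.

    patterns: list of label tuples, the set P
    C_pat:    energies C_y, one per pattern
    C_max:    energy of every non-pattern labeling
    phi:      list of message vectors phi_{A->{i}}, one per variable
    """
    best_y, best = None, np.inf
    # Step 1: test all patterns y_A in P.
    for p, c in zip(patterns, C_pat):
        val = c + sum(phi[i][p[i]] for i in range(len(phi)))
        if val < best:
            best_y, best = tuple(p), val
    # Step 2: fix theta_A = C_max and minimize the message term
    # independently per variable. (If this labeling happens to be a
    # pattern, step 1 already evaluated it with its lower energy.)
    y_free = tuple(int(np.argmin(m)) for m in phi)
    val = C_max + sum(float(np.min(m)) for m in phi)
    if val < best:
        best_y, best = y_free, val
    return best_y, best
```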
  56. Constructing Relaxations. Principled methods to construct relaxations: linear programming relaxations, Lagrangian relaxation, Lagrangian/dual decomposition. Next: Lagrangian/dual decomposition.
  57. Dual/Lagrangian Decomposition. Additive structure:
$$\min_y \; \sum_{k=1}^{K} g_k(y) \quad \text{s.t.} \quad y \in \mathcal{Y}_k, \; \forall k = 1,\dots,K,$$
such that minimizing Σ_k g_k(y) jointly is hard, but for any k the subproblem
$$\min_y \; g_k(y) \quad \text{s.t.} \quad y \in \mathcal{Y}_k$$
is tractable. (Guignard, "Lagrangean Decomposition", MathProg 1987), (Conejo et al., "Decomposition Techniques in Mathematical Programming", 2006).
  58. Dual/Lagrangian Decomposition. Idea:
1. Selectively duplicate variables ("variable splitting"),
$$\min_{y_1,\dots,y_K,\, y} \; \sum_{k=1}^{K} g_k(y_k) \quad \text{s.t.} \quad y \in \mathcal{Y}, \quad y_k \in \mathcal{Y}_k, \; k = 1,\dots,K, \quad y = y_k : \lambda_k, \; k = 1,\dots,K, \tag{4}$$
where Y ⊇ ∩_{k=1,...,K} Y_k.
2. Add coupling equality constraints between the duplicated variables.
3. Apply Lagrangian relaxation to the coupling constraints.
Also known as dual decomposition and variable splitting.
  59. Dual/Lagrangian Decomposition (cont). Dualizing the coupling constraints y = y_k of (4) with multipliers λ_k gives
$$\min_{y_1,\dots,y_K,\, y} \; \sum_{k=1}^{K} g_k(y_k) + \sum_{k=1}^{K} \lambda_k^\top (y - y_k) \quad \text{s.t.} \quad y \in \mathcal{Y}, \quad y_k \in \mathcal{Y}_k, \; k = 1,\dots,K.$$
  61. Dual/Lagrangian Decomposition (cont). Regrouping,
$$\min_{y_1,\dots,y_K,\, y} \; \sum_{k=1}^{K} \big( g_k(y_k) - \lambda_k^\top y_k \big) + \Big( \sum_{k=1}^{K} \lambda_k \Big)^{\!\top} y \quad \text{s.t.} \quad y \in \mathcal{Y}, \quad y_k \in \mathcal{Y}_k, \; k = 1,\dots,K.$$
The problem is decomposed. There can be multiple possible decompositions of different quality.
  62. Dual/Lagrangian Decomposition (cont). Dual function:
$$q(\lambda) := \min_{y_1,\dots,y_K,\, y} \; \sum_{k=1}^{K} \big( g_k(y_k) - \lambda_k^\top y_k \big) + \Big( \sum_{k=1}^{K} \lambda_k \Big)^{\!\top} y \quad \text{s.t.} \quad y \in \mathcal{Y}, \; y_k \in \mathcal{Y}_k, \; k = 1,\dots,K.$$
Dual maximization problem:
$$\max_\lambda \; q(\lambda) \quad \text{s.t.} \quad \sum_{k=1}^{K} \lambda_k = 0,$$
where {λ : Σ_k λ_k = 0} is the domain on which q(λ) > −∞.
  63. Dual/Lagrangian Decomposition (cont). The dual maximization problem is solved with the projected subgradient method, using the projected component
$$\frac{\partial q}{\partial \lambda_k} \ni (y - y_k) - \frac{1}{K} \sum_{\ell=1}^{K} (y - y_\ell) = \bar{y} - y_k, \qquad \bar{y} = \frac{1}{K} \sum_{\ell=1}^{K} y_\ell.$$
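A minimal sketch of the whole loop; solve_subproblem(k, lam_k) is an assumed black box returning the minimizer over Y_k of g_k(y) − λ_k^⊤ y as an array (setup and names are ours):

```python
import numpy as np

def dual_decomposition(solve_subproblem, K, dim, T=200):
    lam = np.zeros((K, dim))          # stays on the subspace sum_k lam_k = 0
    for t in range(1, T + 1):
        ys = np.stack([solve_subproblem(k, lam[k]) for k in range(K)])
        y_bar = ys.mean(axis=0)       # consensus of the subproblem solutions
        alpha = 1.0 / t               # diminishing step size
        lam += alpha * (y_bar - ys)   # projected supergradient step
    return lam, y_bar                 # y_bar doubles as a primal estimate
```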
  64. Dual Decomposition. Example (Connectivity, Vicente et al., CVPR 2008). [Figure: a binary segmentation example with marked points that are required to be connected in the foreground.]
  66. Dual Decomposition. Example (Connectivity, Vicente et al., CVPR 2008).
$$\min_y \; E(y) + C(y),$$
where E(y) is a normal binary pairwise segmentation energy, and
$$C(y) = \begin{cases} 0 & \text{if the marked points are connected,} \\ \infty & \text{otherwise.} \end{cases}$$
  67. Dual Decomposition. Example (Connectivity, cont). Variable splitting, attaching multipliers, and dualization:
$$\min_{y,z} \; E(y) + C(z) \quad \text{s.t.} \quad y = z \;:\; \lambda$$
$$\Rightarrow \quad \min_{y,z} \; E(y) + C(z) + \lambda^\top (y - z).$$
  70. Dual Decomposition. Example (Connectivity, cont).
$$q(\lambda) := \underbrace{\min_y \; E(y) + \lambda^\top y}_{\text{Subproblem 1}} \; + \; \underbrace{\min_z \; C(z) - \lambda^\top z}_{\text{Subproblem 2}}$$
Subproblem 1: a normal binary image segmentation energy. Subproblem 2: a shortest path problem (no pairwise terms). → Maximize q(λ) to find the best lower bound. The derivation above is simplified; see (Vicente et al., CVPR 2008) for a more sophisticated decomposition using three subproblems.
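Schematically, one evaluation of the two subproblems and a supergradient step; graph_cut_segmentation and cheapest_connected_path are assumed black boxes (hypothetical names; the actual solvers in the paper are more involved):

```python
import numpy as np

def connectivity_dual_step(lam, graph_cut_segmentation,
                           cheapest_connected_path, alpha):
    """One supergradient step on q(lambda) = min_y E(y) + lam.y
                                           + min_z C(z) - lam.z."""
    y = graph_cut_segmentation(lam)      # subproblem 1: segmentation with
                                         # unaries shifted by +lambda
    z = cheapest_connected_path(-lam)    # subproblem 2: cheapest connected
                                         # set under per-pixel costs -lambda
    lam = lam + alpha * (y - z)          # (y - z) is a supergradient of q
    return lam, y, z
```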
  72. Other Examples. Plenty of applications of decomposition: (Komodakis et al., ICCV 2007), (Strandmark and Kahl, CVPR 2010), (Torresani et al., ECCV 2008), (Werner, TechReport 2009), (Kumar and Koller, CVPR 2010), (Woodford et al., ICCV 2009), (Vicente et al., ICCV 2009).
  73. Geometric Interpretation. Theorem (Lagrangian Decomposition Primal). Let Y_k be finite sets and g(y) = c^⊤ y a linear objective function. Then the optimal solution of the Lagrangian decomposition obtains the value of the following relaxed primal optimization problem:
$$\min_y \; \sum_{k=1}^{K} g_k(y) \quad \text{s.t.} \quad y \in \operatorname{conv}(\mathcal{Y}_k), \; \forall k = 1,\dots,K.$$
Therefore, optimizing the Lagrangian decomposition dual is equivalent to optimizing the primal objective on the intersection of the convex hulls of the individual feasible sets.
  75. Geometric Interpretation (cont). [Figure: the hulls conv(Y_1) and conv(Y_2) and their intersection conv(Y_1) ∩ conv(Y_2); the decomposition optimum y_D lies on the intersection, compared with the true optimum y* in the objective direction c.]
  79. Lagrangian Relaxation vs. Decomposition. Lagrangian relaxation needs explicit complicating constraints. Lagrangian/dual decomposition can work with a black-box solver for each subproblem (implicit constraints) and can yield stronger bounds.
  80. Lagrangian Relaxation vs. Decomposition. [Figure: relaxing one feasible set to Ỹ_1 ⊇ conv(Y_1) yields the relaxation optimum y_R, whereas decomposition optimizes over conv(Y_1) ∩ conv(Y_2) and attains y_D, closer to the true optimum y* in direction c.]
  81. Polyhedral View on Higher-order Interactions. Idea: the MAP-MRF LP relaxation optimizes over the LOCAL polytope; incorporate higher-order interactions directly by constraining LOCAL, using techniques from polyhedral combinatorics. See (Werner, CVPR 2008), (Nowozin and Lampert, CVPR 2009), (Lempitsky et al., ICCV 2009).
  82. Representation of Polytopes. Convex polytopes have two equivalent representations: as the convex combination of their extreme points, and as a set of facet-defining linear inequalities. [Figure: a polytope Z with inequalities d_1^⊤ y ≤ 1, d_2^⊤ y ≤ 1, d_3^⊤ y ≤ 1.] A linear inequality with respect to a polytope can be: valid (does not cut off the polytope), representing a face (valid and touching), or facet-defining (representing a face of dimension one less than the polytope).
  83. Intersecting Polytopes. Construction:
1. Consider the marginal polytope M(G) of a discrete graphical model.
2. The vertices of M(G) are the feasible labelings.
3. Add linear inequalities d^⊤µ ≤ b that cut off vertices.
4. Optimize over the resulting intersection M'(G), i.e. use the higher-order energy
$$E_A(\mu_A) = \begin{cases} 0 & \text{if } \mu \text{ is valid,} \\ \infty & \text{otherwise.} \end{cases}$$
[Figure: M(G), a cutting inequality d^⊤µ ≤ b, and the intersected polytope M'(G).]
