Upcoming SlideShare
×

# ECCV2008: MAP Estimation Algorithms in Computer Vision - Part 1

492 views

Published on

0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total views
492
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
7
0
Likes
0
Embeds 0
No embeds

No notes for slide
• 3-cycle figure
• Colour code ‘0’ etc. …. Key equations here
• ### ECCV2008: MAP Estimation Algorithms in Computer Vision - Part 1

1. 1. MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I
2. 2. Aim of the Tutorial <ul><li>Description of some successful algorithms </li></ul><ul><li>Computational issues </li></ul><ul><li>Enough details to implement </li></ul><ul><li>Some proofs will be skipped :-( </li></ul><ul><li>But references to them will be given :-) </li></ul>
3. 3. A Vision Application Binary Image Segmentation How ? Cost function Models our knowledge about natural images Optimize cost function to obtain the segmentation
4. 4. A Vision Application Object - white, Background - green/grey Graph G = (V,E) Each vertex corresponds to a pixel Edges define a 4-neighbourhood grid graph Assign a label to each vertex from L = {obj,bkg} Binary Image Segmentation
5. 5. A Vision Application Graph G = (V,E) Cost of a labelling f : V  L Per Vertex Cost Cost of label ‘obj’ low Cost of label ‘bkg’ high Object - white, Background - green/grey Binary Image Segmentation
6. 6. A Vision Application Graph G = (V,E) Cost of a labelling f : V  L Cost of label ‘obj’ high Cost of label ‘bkg’ low Per Vertex Cost UNARY COST Object - white, Background - green/grey Binary Image Segmentation
7. 7. A Vision Application Graph G = (V,E) Cost of a labelling f : V  L Per Edge Cost Cost of same label low Cost of different labels high Object - white, Background - green/grey Binary Image Segmentation
8. 8. A Vision Application Graph G = (V,E) Cost of a labelling f : V  L Cost of same label high Cost of different labels low Per Edge Cost PAIRWISE COST Object - white, Background - green/grey Binary Image Segmentation
9. 9. A Vision Application Graph G = (V,E) Problem: Find the labelling with minimum cost f* Object - white, Background - green/grey Binary Image Segmentation
10. 10. A Vision Application Graph G = (V,E) Problem: Find the labelling with minimum cost f* Binary Image Segmentation
11. 11. Another Vision Application Object Detection using Parts-based Models How ? Once again, by defining a good cost function
12. 12. Another Vision Application H T L1 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ 1 Edges define a TREE Assign a label to each vertex from L = {positions} Graph G = (V,E) L2 L3 L4 Object Detection using Parts-based Models
13. 13. Another Vision Application 2 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE H T L1 L2 L3 L4 Object Detection using Parts-based Models
14. 14. Another Vision Application 3 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE H T L1 L2 L3 L4 Object Detection using Parts-based Models
15. 15. Another Vision Application Cost of a labelling f : V  L Unary cost : How well does part match image patch? Pairwise cost : Encourages valid configurations Find best labelling f* Graph G = (V,E) 3 H T L1 L2 L3 L4 Object Detection using Parts-based Models
16. 16. Another Vision Application Cost of a labelling f : V  L Unary cost : How well does part match image patch? Pairwise cost : Encourages valid configurations Find best labelling f* Graph G = (V,E) 3 H T L1 L2 L3 L4 Object Detection using Parts-based Models
17. 17. Yet Another Vision Application Stereo Correspondence Disparity Map How ? Minimizing a cost function
18. 18. Yet Another Vision Application Stereo Correspondence Graph G = (V,E) Vertex corresponds to a pixel Edges define grid graph L = {disparities}
19. 19. Yet Another Vision Application Stereo Correspondence Cost of labelling f : Unary cost + Pairwise Cost Find minimum cost f*
20. 20. The General Problem b a e d c f Graph G = ( V, E ) Discrete label set L = {1,2,…,h} Assign a label to each vertex f: V  L 1 1 2 2 2 3 Cost of a labelling Q(f) Unary Cost Pairwise Cost Find f* = arg min Q(f)
21. 21. Outline <ul><li>Problem Formulation </li></ul><ul><ul><li>Energy Function </li></ul></ul><ul><ul><li>MAP Estimation </li></ul></ul><ul><ul><li>Computing min-marginals </li></ul></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul>
22. 22. Energy Function V a V b V c V d Label l 0 Label l 1 D a D b D c D d Random Variables V = {V a , V b , ….} Labels L = {l 0 , l 1 , ….} Data D Labelling f: {a, b, …. }  {0,1, …}
23. 23. Energy Function V a V b V c V d D a D b D c D d Q(f) = ∑ a  a;f(a) Unary Potential 2 5 4 2 6 3 3 7 Label l 0 Label l 1 Easy to minimize Neighbourhood
24. 24. Energy Function V a V b V c V d D a D b D c D d E : (a,b)  E iff V a and V b are neighbours E = { (a,b) , (b,c) , (c,d) } 2 5 4 2 6 3 3 7 Label l 0 Label l 1
25. 25. Energy Function V a V b V c V d D a D b D c D d +∑ (a,b)  ab;f(a)f(b) Pairwise Potential 0 1 1 0 0 2 1 1 4 1 0 3 2 5 4 2 6 3 3 7 Label l 0 Label l 1 Q(f) = ∑ a  a;f(a)
26. 26. Energy Function V a V b V c V d D a D b D c D d 0 1 1 0 0 2 1 1 4 1 0 3 Parameter 2 5 4 2 6 3 3 7 Label l 0 Label l 1 +∑ (a,b)  ab;f(a)f(b) Q(f;  ) = ∑ a  a;f(a)
27. 27. Outline <ul><li>Problem Formulation </li></ul><ul><ul><li>Energy Function </li></ul></ul><ul><ul><li>MAP Estimation </li></ul></ul><ul><ul><li>Computing min-marginals </li></ul></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul>
28. 28. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Label l 0 Label l 1
29. 29. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) 2 + 1 + 2 + 1 + 3 + 1 + 3 = 13 Label l 0 Label l 1
30. 30. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Label l 0 Label l 1
31. 31. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) 5 + 1 + 4 + 0 + 6 + 4 + 7 = 27 Label l 0 Label l 1
32. 32. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) f* = arg min Q(f;  ) q* = min Q(f;  ) = Q(f*;  ) Label l 0 Label l 1
33. 33. MAP Estimation 16 possible labellings f* = {1, 0, 0, 1} q* = 13 20 1 1 1 0 27 0 1 1 0 19 1 0 1 0 22 0 0 1 0 20 1 1 0 0 27 0 1 0 0 15 1 0 0 0 18 0 0 0 0 Q(f;  ) f(d) f(c) f(b) f(a) 16 1 1 1 1 23 0 1 1 1 15 1 0 1 1 18 0 0 1 1 18 1 1 0 1 25 0 1 0 1 13 1 0 0 1 16 0 0 0 1 Q(f;  ) f(d) f(c) f(b) f(a)
34. 34. Computational Complexity Segmentation 2 |V| |V| = number of pixels ≈ 320 * 480 = 153600
35. 35. Computational Complexity |L| = number of pixels ≈ 153600 Detection |L| |V|
36. 36. Computational Complexity |V| = number of pixels ≈ 153600 Stereo |L| |V| Can we do better than brute-force? MAP Estimation is NP-hard !!
37. 37. Computational Complexity |V| = number of pixels ≈ 153600 Stereo |L| |V| Exact algorithms do exist for special cases Good approximate algorithms for general case But first … two important definitions
38. 38. Outline <ul><li>Problem Formulation </li></ul><ul><ul><li>Energy Function </li></ul></ul><ul><ul><li>MAP Estimation </li></ul></ul><ul><ul><li>Computing min-marginals </li></ul></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul>
39. 39. Min-Marginals V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 f* = arg min Q(f;  ) such that f(a) = i Min-marginal q a;i Label l 0 Label l 1 Not a marginal (no summation)
40. 40. Min-Marginals 16 possible labellings q a;0 = 15 20 1 1 1 0 27 0 1 1 0 19 1 0 1 0 22 0 0 1 0 20 1 1 0 0 27 0 1 0 0 15 1 0 0 0 18 0 0 0 0 Q(f;  ) f(d) f(c) f(b) f(a) 16 1 1 1 1 23 0 1 1 1 15 1 0 1 1 18 0 0 1 1 18 1 1 0 1 25 0 1 0 1 13 1 0 0 1 16 0 0 0 1 Q(f;  ) f(d) f(c) f(b) f(a)
41. 41. Min-Marginals 16 possible labellings q a;1 = 13 16 1 1 1 1 23 0 1 1 1 15 1 0 1 1 18 0 0 1 1 18 1 1 0 1 25 0 1 0 1 13 1 0 0 1 16 0 0 0 1 Q(f;  ) f(d) f(c) f(b) f(a) 20 1 1 1 0 27 0 1 1 0 19 1 0 1 0 22 0 0 1 0 20 1 1 0 0 27 0 1 0 0 15 1 0 0 0 18 0 0 0 0 Q(f;  ) f(d) f(c) f(b) f(a)
42. 42. Min-Marginals and MAP <ul><li>Minimum min-marginal of any variable = </li></ul><ul><li>energy of MAP labelling </li></ul>min f Q(f;  ) such that f(a) = i q a;i min i min i ( ) V a has to take one label min f Q(f;  )
43. 43. Summary MAP Estimation f* = arg min Q(f;  ) Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Min-marginals q a;i = min Q(f;  ) s.t. f(a) = i Energy Function
44. 44. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul>
45. 45. Reparameterization V a V b 2 5 4 2 0 1 1 0 2 + 2 + - 2 - 2 Add a constant to all  a;i Subtract that constant from all  b;k 6 1 1 5 0 1 10 1 0 7 0 0 Q(f;  ) f(b) f(a)
46. 46. Reparameterization Add a constant to all  a;i Subtract that constant from all  b;k Q(f;  ’) = Q(f;  ) V a V b 2 5 4 2 0 0 2 + 2 + - 2 - 2 1 1 6 + 2 - 2 1 1 5 + 2 - 2 0 1 10 + 2 - 2 1 0 7 + 2 - 2 0 0 Q(f;  ) f(b) f(a)
47. 47. Reparameterization V a V b 2 5 4 2 0 1 1 0 - 3 + 3 Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’ - 3 6 1 1 5 0 1 10 1 0 7 0 0 Q(f;  ) f(b) f(a)
48. 48. Reparameterization V a V b 2 5 4 2 0 1 1 0 - 3 + 3 - 3 Q(f;  ’) = Q(f;  ) Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’ 6 - 3 + 3 1 1 5 0 1 10 - 3 + 3 1 0 7 0 0 Q(f;  ) f(b) f(a)
49. 49. Reparameterization - 2 - 2 - 2 + 2 + 1 + 1 + 1 - 1 - 4 + 4 - 4 - 4  ’ a;i =  a;i  ’ b;k =  b;k  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i Q(f;  ’) = Q(f;  ) V a V b 2 5 4 2 3 1 0 1 2 V a V b 2 5 4 2 3 1 1 0 1 V a V b 2 5 4 2 3 1 2 1 0
50. 50. Reparameterization Q(f;  ’) = Q(f;  ), for all f  ’ is a reparameterization of  , iff  ’    ’ b;k =  b;k Kolmogorov, PAMI, 2006  ’ a;i =  a;i  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i Equivalently V a V b 2 5 4 2 0 0 2 + 2 + - 2 - 2 1 1
51. 51. Recap MAP Estimation f* = arg min Q(f;  ) Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Min-marginals q a;i = min Q(f;  ) s.t. f(a) = i Q(f;  ’) = Q(f;  ), for all f  ’   Reparameterization
52. 52. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><ul><li>Exact MAP for Chains and Trees </li></ul></ul><ul><ul><li>Approximate MAP for general graphs </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul><ul><li>Tree-reweighted Message Passing </li></ul>
53. 53. Belief Propagation <ul><li>Belief Propagation gives exact MAP for chains </li></ul><ul><li>Remember, some MAP problems are easy </li></ul><ul><li>Exact MAP for trees </li></ul><ul><li>Clever Reparameterization </li></ul>
54. 54. Two Variables V a V b 2 5 2 1 0 V a V b 2 5 4 0 1 Choose the right constant  ’ b;k = q b;k Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’
55. 55. Two Variables V a V b 2 5 2 1 0 V a V b 2 5 4 0 1 Choose the right constant  ’ b;k = q b;k  a;0 +  ab;00 = 5 + 0  a;1 +  ab;10 = 2 + 1 min M ab;0 =
56. 56. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant  ’ b;k = q b;k
57. 57. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 Potentials along the red path add up to 0
58. 58. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant  ’ b;k = q b;k  a;0 +  ab;01 = 5 + 1  a;1 +  ab;11 = 2 + 0 min M ab;1 =
59. 59. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 Minimum of min-marginals = MAP estimate
60. 60. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 f*(b) = 0 f*(a) = 1
61. 61. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 We get all the min-marginals of V b
62. 62. Recap We only need to know two sets of equations General form of Reparameterization Reparameterization of (a,b) in Belief Propagation M ab;k = min i {  a;i +  ab;ik } M ba;i = 0  ’ a;i =  a;i  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i  ’ b;k =  b;k
63. 63. Three Variables V a V b 2 5 2 1 0 V c 4 6 0 1 0 1 3 2 3 Reparameterize the edge (a,b) as before l 0 l 1
64. 64. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 f(a) = 1 -2 -1 2 3 l 0 l 1
65. 65. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 -2 -1 2 3 l 0 l 1
66. 66. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 -2 -1 2 3 l 0 l 1
67. 67. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 -2 -1 -4 -3 l 0 l 1
68. 68. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 q c;0 q c;1 -2 -1 -4 -3 l 0 l 1
69. 69. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 f(a) = 1 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0 f*(a) = 1 Generalizes to any length chain -2 -1 -4 -3 l 0 l 1
70. 70. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 f(a) = 1 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0 f*(a) = 1 Only Dynamic Programming -2 -1 -4 -3 l 0 l 1
71. 71. Why Dynamic Programming? 3 variables  2 variables + book-keeping n variables  (n-1) variables + book-keeping Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik } Repeat  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k
72. 72. Why Dynamic Programming? Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik } Repeat Messages Message Passing Why stop at dynamic programming?  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k
73. 73. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (c,b) as before -2 -1 -4 -3 l 0 l 1
74. 74. Three Variables V a V b 2 5 9 -3 V c 11 12 -11 -9 -2 9 Reparameterize the edge (c,b) as before -2 -1 -9 -7  ’ b;i = q b;i l 0 l 1
75. 75. Three Variables V a V b 2 5 9 -3 V c 11 12 -11 -9 -2 9 Reparameterize the edge (b,a) as before -2 -1 -9 -7 l 0 l 1
76. 76. Three Variables V a V b 9 11 9 -9 V c 11 12 -11 -9 -9 9 Reparameterize the edge (b,a) as before -9 -7 -9 -7  ’ a;i = q a;i l 0 l 1
77. 77. Three Variables V a V b 9 11 9 -9 V c 11 12 -11 -9 -9 9 Forward Pass   Backward Pass -9 -7 -9 -7 All min-marginals are computed l 0 l 1
78. 78. Belief Propagation on Chains Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik } Repeat till the end of the chain Start from right, go to left Repeat till the end of the chain  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k
79. 79. Belief Propagation on Chains <ul><li>A way of computing reparam constants </li></ul><ul><li>Generalizes to chains of any length </li></ul><ul><li>Forward Pass - Start to End </li></ul><ul><ul><li>MAP estimate </li></ul></ul><ul><ul><li>Min-marginals of final variable </li></ul></ul><ul><li>Backward Pass - End to start </li></ul><ul><ul><li>All other min-marginals </li></ul></ul>Won’t need this .. But good to know
80. 80. Computational Complexity <ul><li>Each constant takes O(|L|) </li></ul><ul><li>Number of constants - O(|E||L|) </li></ul>O(|E||L| 2 ) <ul><li>Memory required ? </li></ul>O(|E||L|)
81. 81. Belief Propagation on Trees V b V a Forward Pass: Leaf  Root All min-marginals are computed Backward Pass: Root  Leaf V c V d V e V g V h
82. 82. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><ul><li>Exact MAP for Chains and Trees </li></ul></ul><ul><ul><li>Approximate MAP for general graphs </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul><ul><li>Tree-reweighted Message Passing </li></ul>
83. 83. Belief Propagation on Cycles V a V b V d V c Where do we start? Arbitrarily  a;0  a;1  b;0  b;1  d;0  d;1  c;0  c;1 Reparameterize (a,b)
84. 84. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  c;0  c;1 Potentials along the red path add up to 0
85. 85. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0
86. 86. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0
87. 87. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0
88. 88. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0 -  a;0 -  a;1  ’ a;0 -  a;0 = q a;0  ’ a;1 -  a;1 = q a;1
89. 89. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Pick minimum min-marginal. Follow red path. -  a;0 -  a;1  ’ a;0 -  a;0 = q a;0  ’ a;1 -  a;1 = q a;1
90. 90. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  c;0  c;1 Potentials along the red path add up to 0
91. 91. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0
92. 92. Belief Propagation on Cycles V a V b V d V c  a;0  a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0
93. 93. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0 -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≤
94. 94. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Problem Solved -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≤
95. 95. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Problem Not Solved -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≥
96. 96. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 -  a;0 -  a;1 Reparameterize (a,b) again
97. 97. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Reparameterize (a,b) again But doesn’t this overcount some potentials?
98. 98. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Reparameterize (a,b) again Yes. But we will do it anyway
99. 99. Belief Propagation on Cycles V a V b V d V c  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Keep reparameterizing edges in some order Hope for convergence and a good solution
100. 100. Belief Propagation <ul><li>Generalizes to any arbitrary random field </li></ul><ul><li>Complexity per iteration ? </li></ul>O(|E||L| 2 ) <ul><li>Memory required ? </li></ul>O(|E||L|)
101. 101. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><ul><li>Exact MAP for Chains and Trees </li></ul></ul><ul><ul><li>Approximate MAP for general graphs </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul><ul><li>Tree-reweighted Message Passing </li></ul>
102. 102. Computational Issues of BP Complexity per iteration O(|E||L| 2 ) Special Pairwise Potentials  ab;ik = w ab d(|i-k|) O(|E||L|) Felzenszwalb & Huttenlocher, 2004 i - k d Potts i - k d Truncated Linear i - k d Truncated Quadratic
103. 103. Computational Issues of BP Memory requirements O(|E||L|) Half of original BP Kolmogorov, 2006 Some approximations exist But memory still remains an issue Yu, Lin, Super and Tan, 2007 Lasserre, Kannan and Winn, 2007
104. 104. Computational Issues of BP Order of reparameterization Randomly Residual Belief Propagation In some fixed order The one that results in maximum change Elidan et al. , 2006
105. 105. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Run BP. Assume it converges.
106. 106. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a tree. Change their labels. Value of energy does not decrease
107. 107. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a cycle. Change their labels. Value of energy does not decrease
108. 108. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? For cycles, if BP converges then exact MAP Weiss and Freeman, 2001
109. 109. Results Object Detection Felzenszwalb and Huttenlocher, 2004 Labels - Poses of parts Unary Potentials: Fraction of foreground pixels Pairwise Potentials: Favour Valid Configurations H T A1 A2 L1 L2
110. 110. Results Object Detection Felzenszwalb and Huttenlocher, 2004
111. 111. Results Binary Segmentation Szeliski et al. , 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels
112. 112. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels Belief Propagation
113. 113. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Global optimum Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels
114. 114. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels Stereo Correspondence
115. 115. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels Belief Propagation Stereo Correspondence
116. 116. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Global optimum Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels Stereo Correspondence
117. 117. Summary of BP Exact for chains Exact for trees Approximate MAP for general cases Not even convergence guaranteed So can we do something better?
118. 118. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul><ul><ul><li>Integer Programming Formulation </li></ul></ul><ul><ul><li>Linear Programming Relaxation and its Dual </li></ul></ul><ul><ul><li>Convergent Solution for Dual </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul>
119. 119. TRW Message Passing <ul><li>Convex (not Combinatorial) Optimization </li></ul><ul><li>A different look at the same problem </li></ul><ul><li>A similar solution </li></ul><ul><li>Combinatorial (not Convex) Optimization </li></ul>We will look at the most general MAP estimation Not trees No assumption on potentials
120. 120. Things to Remember <ul><li>Forward-pass computes min-marginals of root </li></ul><ul><li>BP is exact for trees </li></ul><ul><li>Every iteration provides a reparameterization </li></ul><ul><li>Basics of Mathematical Optimization </li></ul>
121. 121. Mathematical Optimization min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N <ul><li>Objective function </li></ul><ul><li>Constraints </li></ul><ul><li>Feasible region = {x | g i (x) ≤ 0} </li></ul>x* = arg Optimal Solution g 0 (x*) Optimal Value
122. 122. Integer Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N <ul><li>Objective function </li></ul><ul><li>Constraints </li></ul><ul><li>Feasible region = {x | g i (x) ≤ 0} </li></ul>x* = arg Optimal Solution g 0 (x*) Optimal Value x k  Z
123. 123. Feasible Region Generally NP-hard to optimize
124. 124. Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N <ul><li>Objective function </li></ul><ul><li>Constraints </li></ul><ul><li>Feasible region = {x | g i (x) ≤ 0} </li></ul>x* = arg Optimal Solution g 0 (x*) Optimal Value
125. 125. Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N <ul><li>Linear objective function </li></ul><ul><li>Linear constraints </li></ul><ul><li>Feasible region = {x | g i (x) ≤ 0} </li></ul>x* = arg Optimal Solution g 0 (x*) Optimal Value
126. 126. Linear Programming min c T x subject to Ax ≤ b i=1, … , N <ul><li>Linear objective function </li></ul><ul><li>Linear constraints </li></ul><ul><li>Feasible region = {x | Ax ≤ b} </li></ul>x* = arg Optimal Solution c T x* Optimal Value Polynomial-time Solution
127. 127. Feasible Region Polynomial-time Solution
128. 128. Feasible Region Optimal solution lies on a vertex (obj func linear)
129. 129. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul><ul><ul><li>Integer Programming Formulation </li></ul></ul><ul><ul><li>Linear Programming Relaxation and its Dual </li></ul></ul><ul><ul><li>Convergent Solution for Dual </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul>
130. 130. Integer Programming Formulation V a V b Label l 0 Label l 1 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Labelling f(a) = 1 f(b) = 0 y a;0 = 0 y a;1 = 1 y b;0 = 1 y b;1 = 0 Any f(.) has equivalent boolean variables y a;i
131. 131. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Labelling f(a) = 1 f(b) = 0 y a;0 = 0 y a;1 = 1 y b;0 = 1 y b;1 = 0 Find the optimal variables y a;i Label l 0 Label l 1
132. 132. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Sum of Unary Potentials ∑ a ∑ i  a;i y a;i y a;i  {0,1}, for all V a , l i ∑ i y a;i = 1, for all V a Label l 0 Label l 1
133. 133. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Pairwise Potentials  ab;00 = 0  ab;10 = 1  ab;01 = 1  ab;11 = 0 Sum of Pairwise Potentials ∑ (a,b) ∑ ik  ab;ik y a;i y b;k y a;i  {0,1} ∑ i y a;i = 1 Label l 0 Label l 1
134. 134. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Pairwise Potentials  ab;00 = 0  ab;10 = 1  ab;01 = 1  ab;11 = 0 Sum of Pairwise Potentials ∑ (a,b) ∑ ik  ab;ik y ab;ik y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Label l 0 Label l 1
135. 135. Integer Programming Formulation min ∑ a ∑ i  a;i y a;i + ∑ (a,b) ∑ ik  ab;ik y ab;ik y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k
136. 136. Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k  = [ …  a;i …. ; …  ab;ik ….] y = [ … y a;i …. ; … y ab;ik ….]
137. 137. One variable, two labels y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ] y a;0 y a;1
138. 138. Two variables, two labels <ul><li>= [  a;0  a;1  b;0  b;1 </li></ul><ul><li> ab;00  ab;01  ab;10  ab;11 ] </li></ul>y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y b;0  {0,1} y b;1  {0,1} y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1
139. 139. In General Marginal Polytope
140. 140. In General <ul><li> R (|V||L| + |E||L| 2 ) </li></ul>y  {0,1} (|V||L| + |E||L| 2 ) Number of constraints |V||L| + |V| + |E||L| 2 y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k
141. 141. Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k  = [ …  a;i …. ; …  ab;ik ….] y = [ … y a;i …. ; … y ab;ik ….]
142. 142. Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Solve to obtain MAP labelling y*
143. 143. Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k But we can’t solve it in general
144. 144. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul><ul><ul><li>Integer Programming Formulation </li></ul></ul><ul><ul><li>Linear Programming Relaxation and its Dual </li></ul></ul><ul><ul><li>Convergent Solution for Dual </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul>
145. 145. Linear Programming Relaxation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Two reasons why we can’t solve this
146. 146. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 y ab;ik = y a;i y b;k One reason why we can’t solve this
147. 147. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = ∑ k y a;i y b;k One reason why we can’t solve this
148. 148. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 One reason why we can’t solve this = 1 ∑ k y ab;ik = y a;i ∑ k y b;k
149. 149. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i One reason why we can’t solve this
150. 150. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i No reason why we can’t solve this * * memory requirements, time complexity
151. 151. One variable, two labels y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ] y a;0 y a;1
152. 152. One variable, two labels y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ] y a;0 y a;1
153. 153. Two variables, two labels <ul><li>= [  a;0  a;1  b;0  b;1 </li></ul><ul><li> ab;00  ab;01  ab;10  ab;11 ] </li></ul>y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y b;0  {0,1} y b;1  {0,1} y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1
154. 154. Two variables, two labels <ul><li>= [  a;0  a;1  b;0  b;1 </li></ul><ul><li> ab;00  ab;01  ab;10  ab;11 ] </li></ul>y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1
155. 155. Two variables, two labels <ul><li>= [  a;0  a;1  b;0  b;1 </li></ul><ul><li> ab;00  ab;01  ab;10  ab;11 ] </li></ul>y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 + y ab;01 = y a;0 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1
156. 156. Two variables, two labels <ul><li>= [  a;0  a;1  b;0  b;1 </li></ul><ul><li> ab;00  ab;01  ab;10  ab;11 ] </li></ul>y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 + y ab;01 = y a;0 y ab;10 + y ab;11 = y a;1
157. 157. In General Marginal Polytope Local Polytope
158. 158. In General <ul><li> R (|V||L| + |E||L| 2 ) </li></ul>y  [0,1] (|V||L| + |E||L| 2 ) Number of constraints |V||L| + |V| + |E||L|
159. 159. Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i No reason why we can’t solve this
160. 160. Linear Programming Relaxation Extensively studied Optimization Schlesinger, 1976 Koster, van Hoesel and Kolen, 1998 Theory Chekuri et al, 2001 Archer et al, 2004 Machine Learning Wainwright et al., 2001
161. 161. Linear Programming Relaxation Many interesting Properties <ul><li>Global optimal MAP for trees </li></ul>Wainwright et al., 2001 But we are interested in NP-hard cases <ul><li>Preserves solution for reparameterization </li></ul>
162. 162. Linear Programming Relaxation <ul><li>Large class of problems </li></ul><ul><ul><li>Metric Labelling </li></ul></ul><ul><ul><li>Semi-metric Labelling </li></ul></ul>Many interesting Properties - Integrality Gap Manokaran et al., 2008 <ul><li>Most likely, provides best possible integrality gap </li></ul>
163. 163. Linear Programming Relaxation <ul><li>A computationally useful dual </li></ul>Many interesting Properties - Dual Optimal value of dual = Optimal value of primal Easier-to-solve
164. 164. Dual of the LP Relaxation Wainwright et al., 2001 V a V b V c V d V e V f V g V h V i  min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i
165. 165. Dual of the LP Relaxation Wainwright et al., 2001 V a V b V c V d V e V f V g V h V i  V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i  1  2  3  4  5  6  1  2  3  4  5  6   i  i =   i ≥ 0
166. 166. Dual of the LP Relaxation Wainwright et al., 2001  1  2  3  4  5  6 q*(  1 )   i  i =  q*(  2 ) q*(  3 ) q*(  4 ) q*(  5 ) q*(  6 )   i q*(  i ) Dual of LP  V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i  i ≥ 0 max
167. 167. Dual of the LP Relaxation Wainwright et al., 2001  1  2  3  4  5  6 q*(  1 )   i  i   q*(  2 ) q*(  3 ) q*(  4 ) q*(  5 ) q*(  6 ) Dual of LP  V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i  i ≥ 0   i q*(  i ) max
168. 168. Dual of the LP Relaxation Wainwright et al., 2001   i  i   max   i q*(  i ) I can easily compute q*(  i ) I can easily maintain reparam constraint So can I easily solve the dual?
169. 169. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul><ul><ul><li>Integer Programming Formulation </li></ul></ul><ul><ul><li>Linear Programming Relaxation and its Dual </li></ul></ul><ul><ul><li>Convergent Solution for Dual </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul>
170. 170. TRW Message Passing Kolmogorov, 2006 V a V b V c V d V e V f V g V h V i V a V b V c V d V e V f V g V h V i  1  2  3  1  2  3  4  5  6  4  5  6   i  i     i q*(  i ) Pick a variable V a
171. 171. TRW Message Passing Kolmogorov, 2006   i  i     i q*(  i ) V c V b V a  1 c;0  1 c;1  1 b;0  1 b;1  1 a;0  1 a;1 V a V d V g  4 a;0  4 a;1  4 d;0  4 d;1  4 g;0  4 g;1
172. 172. TRW Message Passing Kolmogorov, 2006  1  1 +  4  4 +  rest    1 q*(  1 ) +  4 q*(  4 ) + K V c V b V a V a V d V g Reparameterize to obtain min-marginals of V a  1 c;0  1 c;1  1 b;0  1 b;1  1 a;0  1 a;1  4 a;0  4 a;1  4 d;0  4 d;1  4 g;0  4 g;1
173. 173. TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest V c V b V a  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1 V a V d V g  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 One pass of Belief Propagation  1 q*(  ’ 1 ) +  4 q*(  ’ 4 ) + K
174. 174. TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   V c V b V a V a V d V g Remain the same  1 q*(  ’ 1 ) +  4 q*(  ’ 4 ) + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1
175. 175. TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest    1 min{  ’ 1 a;0 ,  ’ 1 a;1 } +  4 min{  ’ 4 a;0 ,  ’ 4 a;1 } + K V c V b V a V a V d V g  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1
176. 176. TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   V c V b V a V a V d V g Compute weighted average of min-marginals of V a  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0 ,  ’ 1 a;1 } +  4 min{  ’ 4 a;0 ,  ’ 4 a;1 } + K
177. 177. TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   V c V b V a V a V d V g  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0 ,  ’ 1 a;1 } +  4 min{  ’ 4 a;0 ,  ’ 4 a;1 } + K
178. 178. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest V c V b V a V a V d V g  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0 ,  ’ 1 a;1 } +  4 min{  ’ 4 a;0 ,  ’ 4 a;1 } + K  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4
179. 179. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   V c V b V a V a V d V g  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0 ,  ’ 1 a;1 } +  4 min{  ’ 4 a;0 ,  ’ 4 a;1 } + K  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4
180. 180. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   V c V b V a V a V d V g  1 min{  ’’ a;0 ,  ’’ a;1 } +  4 min{  ’’ a;0 ,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4
181. 181. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   V c V b V a V a V d V g (  1 +  4 ) min{  ’’ a;0 ,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4
182. 182. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   V c V b V a V a V d V g (  1 +  4 ) min{  ’’ a;0 ,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 min {p 1 +p 2 , q 1 +q 2 } min {p 1 , q 1 } + min {p 2 , q 2 } ≥
183. 183. TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   V c V b V a V a V d V g Objective function increases or remains constant  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 (  1 +  4 ) min{  ’’ a;0 ,  ’’ a;1 } + K
184. 184. TRW Message Passing Initialize  i . Take care of reparam constraint Choose random variable V a Compute min-marginals of V a for all trees Node-average the min-marginals REPEAT Kolmogorov, 2006 Can also do edge-averaging
185. 185. Example 1 V a V b 0 1 1 0 2 5 4 2 l 0 l 1 V b V c 0 2 3 1 4 2 6 3 V c V a 1 4 1 0 6 3 6 4  2 =1  3 =1  1 =1 5 6 7 Pick variable V a . Reparameterize.
186. 186. Example 1 V a V b -3 -2 -1 -2 5 7 4 2 V b V c 0 2 3 1 4 2 6 3 V c V a -3 1 -3 -3 6 3 10 7  2 =1  3 =1  1 =1 5 6 7 Average the min-marginals of V a l 0 l 1
187. 187. Example 1 V a V b -3 -2 -1 -2 7.5 7 4 2 V b V c 0 2 3 1 4 2 6 3 V c V a -3 1 -3 -3 6 3 7.5 7  2 =1  3 =1  1 =1 7 6 7 Pick variable V b . Reparameterize. l 0 l 1
188. 188. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.5 7 V b V c -5 -3 -1 -3 9 6 6 3 V c V a -3 1 -3 -3 6 3 7.5 7  2 =1  3 =1  1 =1 7 6 7 Average the min-marginals of V b l 0 l 1
189. 189. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7  2 =1  3 =1  1 =1 6.5 6.5 7 Value of dual does not increase l 0 l 1
190. 190. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7  2 =1  3 =1  1 =1 6.5 6.5 7 Maybe it will increase for V c NO l 0 l 1
191. 191. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7  2 =1  3 =1  1 =1 Strong Tree Agreement Exact MAP Estimate f 1 (a) = 0 f 1 (b) = 0 f 2 (b) = 0 f 2 (c) = 0 f 3 (c) = 0 f 3 (a) = 0 l 0 l 1
192. 192. Example 2 V a V b 0 1 1 0 2 5 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 1 1 0 0 3 4 8  2 =1  3 =1  1 =1 4 0 4 Pick variable V a . Reparameterize. l 0 l 1
193. 193. Example 2 V a V b -2 -1 -1 -2 4 7 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 9  2 =1  3 =1  1 =1 4 0 4 Average the min-marginals of V a l 0 l 1
194. 194. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8  2 =1  3 =1  1 =1 4 0 4 Value of dual does not increase l 0 l 1
195. 195. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8  2 =1  3 =1  1 =1 4 0 4 Maybe it will decrease for V b or V c NO l 0 l 1
196. 196. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8  2 =1  3 =1  1 =1 f 1 (a) = 1 f 1 (b) = 1 f 2 (b) = 1 f 2 (c) = 0 f 3 (c) = 1 f 3 (a) = 1 f 2 (b) = 0 f 2 (c) = 1 Weak Tree Agreement Not Exact MAP Estimate l 0 l 1
197. 197. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8  2 =1  3 =1  1 =1 Weak Tree Agreement Convergence point of TRW l 0 l 1 f 1 (a) = 1 f 1 (b) = 1 f 2 (b) = 1 f 2 (c) = 0 f 3 (c) = 1 f 3 (a) = 1 f 2 (b) = 0 f 2 (c) = 1
198. 198. Obtaining the Labelling Only solves the dual. Primal solutions? V a V b V c V d V e V f V g V h V i  ’ =   i  i   Fix the label Of V a
199. 199. Obtaining the Labelling Only solves the dual. Primal solutions? V a V b V c V d V e V f V g V h V i  ’ =   i  i   Fix the label Of V b Continue in some fixed order Meltzer et al., 2006
200. 200. Outline <ul><li>Problem Formulation </li></ul><ul><li>Reparameterization </li></ul><ul><li>Belief Propagation </li></ul><ul><li>Tree-reweighted Message Passing </li></ul><ul><ul><li>Integer Programming Formulation </li></ul></ul><ul><ul><li>Linear Programming Relaxation and its Dual </li></ul></ul><ul><ul><li>Convergent Solution for Dual </li></ul></ul><ul><ul><li>Computational Issues and Theoretical Properties </li></ul></ul>
201. 201. Computational Issues of TRW <ul><li>Speed-ups for some pairwise potentials </li></ul>Basic Component is Belief Propagation Felzenszwalb & Huttenlocher, 2004 <ul><li>Memory requirements cut down by half </li></ul>Kolmogorov, 2006 <ul><li>Further speed-ups using monotonic chains </li></ul>Kolmogorov, 2006
202. 202. Theoretical Properties of TRW <ul><li>Always converges, unlike BP </li></ul>Kolmogorov, 2006 <ul><li>Strong tree agreement implies exact MAP </li></ul>Wainwright et al., 2001 <ul><li>Optimal MAP for two-label submodular problems </li></ul>Kolmogorov and Wainwright, 2005  ab;00 +  ab;11 ≤  ab;01 +  ab;10
203. 203. Results Binary Segmentation Szeliski et al. , 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels
204. 204. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels TRW
205. 205. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Belief Propagation Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels
206. 206. Results Stereo Correspondence Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels
207. 207. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels TRW Stereo Correspondence
208. 208. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Belief Propagation Pairwise Potentials: 0, if same labels 1 -  exp(|D a - D b |), if different labels Stereo Correspondence
209. 209. Results Non-submodular problems Kolmogorov, 2006 BP TRW-S 30x30 grid K 50 BP TRW-S BP outperforms TRW-S
210. 210. Summary <ul><li>Trees can be solved exactly - BP </li></ul><ul><li>No guarantee of convergence otherwise - BP </li></ul><ul><li>Strong Tree Agreement - TRW-S </li></ul><ul><li>Submodular energies solved exactly - TRW-S </li></ul><ul><li>TRW-S solves an LP relaxation of MAP estimation </li></ul><ul><li>Loopier graphs give worse results </li></ul><ul><li>Rother and Kolmogorov, 2006 </li></ul>
211. 211. Related New(er) Work <ul><li>Solving the Dual </li></ul>Globerson and Jaakkola, 2007 Komodakis, Paragios and Tziritas 2007 Weiss et al., 2006 Schlesinger and Giginyak, 2007 <ul><li>Solving the Primal </li></ul>Ravikumar, Agarwal and Wainwright, 2008
212. 212. Related New(er) Work <ul><li>More complex relaxations </li></ul>Sontag and Jaakkola, 2007 Komodakis and Paragios, 2008 Kumar, Kolmogorov and Torr, 2007 Werner, 2008 Sontag et al., 2008 Kumar and Torr, 2008
213. 213. Questions on Part I ? Code + Standard Data http://vision.middlebury.edu/MRF