ECCV2008: MAP Estimation Algorithms in Computer Vision - Part 1

  • Transcript

    • 1. MAP Estimation Algorithms in Computer Vision - Part I M. Pawan Kumar, University of Oxford; Pushmeet Kohli, Microsoft Research
    • 2. Aim of the Tutorial
      • Description of some successful algorithms
      • Computational issues
      • Enough details to implement
      • Some proofs will be skipped :-(
      • But references to them will be given :-)
    • 3. A Vision Application Binary Image Segmentation How? Define a cost function that models our knowledge about natural images, then optimize the cost function to obtain the segmentation
    • 4. A Vision Application Object - white, Background - green/grey Graph G = (V,E) Each vertex corresponds to a pixel Edges define a 4-neighbourhood grid graph Assign a label to each vertex from L = {obj,bkg} Binary Image Segmentation
    • 5. A Vision Application Graph G = (V,E) Cost of a labelling f : V → L Per Vertex Cost Cost of label ‘obj’ low Cost of label ‘bkg’ high Object - white, Background - green/grey Binary Image Segmentation
    • 6. A Vision Application Graph G = (V,E) Cost of a labelling f : V → L Cost of label ‘obj’ high Cost of label ‘bkg’ low Per Vertex Cost UNARY COST Object - white, Background - green/grey Binary Image Segmentation
    • 7. A Vision Application Graph G = (V,E) Cost of a labelling f : V → L Per Edge Cost Cost of same label low Cost of different labels high Object - white, Background - green/grey Binary Image Segmentation
    • 8. A Vision Application Graph G = (V,E) Cost of a labelling f : V → L Cost of same label high Cost of different labels low Per Edge Cost PAIRWISE COST Object - white, Background - green/grey Binary Image Segmentation
    • 9. A Vision Application Graph G = (V,E) Problem: Find the labelling with minimum cost f* Object - white, Background - green/grey Binary Image Segmentation
    • 10. A Vision Application Graph G = (V,E) Problem: Find the labelling with minimum cost f* Binary Image Segmentation
    • 11. Another Vision Application Object Detection using Parts-based Models How ? Once again, by defining a good cost function
    • 12. Another Vision Application H T L1 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ 1 Edges define a TREE Assign a label to each vertex from L = {positions} Graph G = (V,E) L2 L3 L4 Object Detection using Parts-based Models
    • 13. Another Vision Application 2 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE H T L1 L2 L3 L4 Object Detection using Parts-based Models
    • 14. Another Vision Application 3 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE H T L1 L2 L3 L4 Object Detection using Parts-based Models
    • 15. Another Vision Application Cost of a labelling f : V → L Unary cost : How well does part match image patch? Pairwise cost : Encourages valid configurations Find best labelling f* Graph G = (V,E) 3 H T L1 L2 L3 L4 Object Detection using Parts-based Models
    • 17. Yet Another Vision Application Stereo Correspondence Disparity Map How ? Minimizing a cost function
    • 18. Yet Another Vision Application Stereo Correspondence Graph G = (V,E) Vertex corresponds to a pixel Edges define grid graph L = {disparities}
    • 19. Yet Another Vision Application Stereo Correspondence Cost of labelling f : Unary cost + Pairwise Cost Find minimum cost f*
    • 20. The General Problem b a e d c f Graph G = ( V, E ) Discrete label set L = {1,2,…,h} Assign a label to each vertex f: V → L 1 1 2 2 2 3 Cost of a labelling Q(f) = Unary Cost + Pairwise Cost Find f* = arg min Q(f)
    • 21. Outline
      • Problem Formulation
        • Energy Function
        • MAP Estimation
        • Computing min-marginals
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
    • 22. Energy Function V a V b V c V d Label l 0 Label l 1 D a D b D c D d Random Variables V = {V a , V b , ….} Labels L = {l 0 , l 1 , ….} Data D Labelling f: {a, b, ….} → {0, 1, ….}
    • 23. Energy Function V a V b V c V d D a D b D c D d Q(f) = ∑_a θ_a;f(a) (Unary Potential) 2 5 4 2 6 3 3 7 Label l 0 Label l 1 Easy to minimize Neighbourhood
    • 24. Energy Function V a V b V c V d D a D b D c D d (a,b) ∈ E iff V a and V b are neighbours E = { (a,b) , (b,c) , (c,d) } 2 5 4 2 6 3 3 7 Label l 0 Label l 1
    • 25. Energy Function V a V b V c V d D a D b D c D d Q(f) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) (Pairwise Potential) 0 1 1 0 0 2 1 1 4 1 0 3 2 5 4 2 6 3 3 7 Label l 0 Label l 1
    • 26. Energy Function V a V b V c V d D a D b D c D d Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) θ is the Parameter 0 1 1 0 0 2 1 1 4 1 0 3 2 5 4 2 6 3 3 7 Label l 0 Label l 1
    • 27. Outline
      • Problem Formulation
        • Energy Function
        • MAP Estimation
        • Computing min-marginals
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
    • 28. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) Label l 0 Label l 1
    • 29. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) Q = 2 + 1 + 2 + 1 + 3 + 1 + 3 = 13 Label l 0 Label l 1
    • 30. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) Label l 0 Label l 1
    • 31. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) Q = 5 + 1 + 4 + 0 + 6 + 4 + 7 = 27 Label l 0 Label l 1
    • 32. MAP Estimation V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) f* = arg min_f Q(f; θ), q* = min_f Q(f; θ) = Q(f*; θ) Label l 0 Label l 1
    • 33. MAP Estimation 16 possible labellings; f* = {1, 0, 0, 1}, q* = 13
      f(a) f(b) f(c) f(d)  Q(f; θ)     f(a) f(b) f(c) f(d)  Q(f; θ)
       0    0    0    0      18         1    0    0    0      16
       0    0    0    1      15         1    0    0    1      13
       0    0    1    0      27         1    0    1    0      25
       0    0    1    1      20         1    0    1    1      18
       0    1    0    0      22         1    1    0    0      18
       0    1    0    1      19         1    1    0    1      15
       0    1    1    0      27         1    1    1    0      23
       0    1    1    1      20         1    1    1    1      16
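To make the running example concrete, the following minimal Python sketch (ours, not part of the tutorial) encodes the unary and pairwise values read off the figures on slides 23-26 and brute-forces all 16 labellings; it reproduces the table above, f* = {1, 0, 0, 1} with q* = 13.

```python
from itertools import product

# Unary potentials theta_a;i for the chain Va - Vb - Vc - Vd,
# indexed as unary[variable][label] (label 0 = l0, label 1 = l1).
unary = [[5, 2], [2, 4], [3, 6], [7, 3]]

# Pairwise potentials theta_ab;ik for the edges (a,b), (b,c), (c,d),
# indexed as pairwise[edge][i][k].
pairwise = [[[0, 1], [1, 0]],
            [[1, 4], [2, 1]],
            [[0, 1], [3, 0]]]

def energy(f):
    """Q(f; theta): sum of unary potentials plus pairwise potentials."""
    q = sum(unary[a][f[a]] for a in range(4))
    q += sum(pairwise[e][f[e]][f[e + 1]] for e in range(3))
    return q

# Brute force over all |L|^|V| = 2^4 = 16 labellings.
f_star = min(product(range(2), repeat=4), key=energy)
print(f_star, energy(f_star))  # (1, 0, 0, 1) 13
```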
    • 34. Computational Complexity Segmentation: 2^|V| labellings, |V| = number of pixels ≈ 320 × 480 = 153600
    • 35. Computational Complexity Detection: |L|^|V| labellings, |L| = number of candidate positions ≈ number of pixels ≈ 153600
    • 36. Computational Complexity Stereo: |L|^|V| labellings, |V| = number of pixels ≈ 153600 Can we do better than brute-force? MAP Estimation is NP-hard !!
    • 37. Computational Complexity Stereo: |L|^|V| labellings, |V| = number of pixels ≈ 153600 Exact algorithms do exist for special cases; good approximate algorithms exist for the general case. But first … two important definitions
    • 38. Outline
      • Problem Formulation
        • Energy Function
        • MAP Estimation
        • Computing min-marginals
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
    • 39. Min-Marginals V a V b V c V d 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 4 1 0 3 Min-marginal q_a;i = min_f Q(f; θ) such that f(a) = i Label l 0 Label l 1 Not a marginal (no summation)
    • 40. Min-Marginals 16 possible labellings; q_a;0 = 15, the minimum of Q(f; θ) over the eight labellings with f(a) = 0 in the table on slide 33
    • 41. Min-Marginals 16 possible labellings; q_a;1 = 13, the minimum of Q(f; θ) over the eight labellings with f(a) = 1 in the same table
    • 42. Min-Marginals and MAP
      • Minimum min-marginal of any variable = energy of MAP labelling
      min_i q_a;i = min_i ( min_f Q(f; θ) s.t. f(a) = i ) = min_f Q(f; θ), because V a has to take exactly one label
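Continuing the sketch above (and reusing its `energy` and `product`), a min-marginal is the same brute-force minimum restricted to f(a) = i; it reproduces q_a;0 = 15 and q_a;1 = 13 from slides 40-41 and checks the identity on this slide.

```python
def min_marginal(a, i):
    """q_a;i = minimum of Q(f; theta) over labellings f with f(a) = i."""
    return min(energy(f) for f in product(range(2), repeat=4) if f[a] == i)

print(min_marginal(0, 0), min_marginal(0, 1))  # 15 13
# Va has to take one label, so the minimum min-marginal is the MAP energy:
assert min(min_marginal(0, i) for i in range(2)) == 13
```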
    • 43. Summary Energy Function Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) MAP Estimation f* = arg min Q(f; θ) Min-marginals q_a;i = min Q(f; θ) s.t. f(a) = i
    • 44. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
    • 45. Reparameterization V a V b 2 5 4 2 0 1 1 0 Add a constant (+2) to all θ_a;i; subtract that constant (-2) from all θ_b;k
      f(a) f(b)  Q(f; θ)
       0    0      7
       1    0      5
       0    1     10
       1    1      6
    • 46. Reparameterization Add a constant to all θ_a;i; subtract that constant from all θ_b;k: every labelling gains +2 - 2, e.g. 7 + 2 - 2, 5 + 2 - 2, 10 + 2 - 2, 6 + 2 - 2, so Q(f; θ') = Q(f; θ)
    • 47. Reparameterization V a V b 2 5 4 2 0 1 1 0 Add a constant (+3) to one θ_b;k; subtract that constant (-3) from θ_ab;ik for all ‘i’ (same table of Q values as above)
    • 48. Reparameterization Add a constant to one θ_b;k; subtract that constant from θ_ab;ik for all ‘i’: the labellings with f(b) = 1 gain -3 + 3, e.g. 6 - 3 + 3 and 10 - 3 + 3, so Q(f; θ') = Q(f; θ)
    • 49. Reparameterization (figure: three further examples, with constants -2/+2, +1/-1 and -4/+4 applied to the example potentials) In general, for constants M_ba;i and M_ab;k: θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i, and Q(f; θ') = Q(f; θ)
    • 50. Reparameterization θ' is a reparameterization of θ, written θ' ≡ θ, iff Q(f; θ') = Q(f; θ) for all f (Kolmogorov, PAMI, 2006). Equivalently: θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i (e.g. the +2/-2 example above)
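As a quick check of the definition, this sketch (ours) applies the message-style reparameterization to a two-variable example (unaries θ_a = (5, 2), θ_b = (2, 4), Ising pairwise) with an arbitrary choice of constants, and confirms Q(f; θ') = Q(f; θ) for every labelling.

```python
from itertools import product

theta_a, theta_b = [5, 2], [2, 4]
theta_ab = [[0, 1], [1, 0]]

def Q(f, ta, tb, tab):
    """Energy of labelling f = (f(a), f(b)) under the given potentials."""
    return ta[f[0]] + tb[f[1]] + tab[f[0]][f[1]]

M_ab = [7, -3]  # arbitrary constants: any choice yields a reparameterization
theta_b2 = [theta_b[k] + M_ab[k] for k in range(2)]
theta_ab2 = [[theta_ab[i][k] - M_ab[k] for k in range(2)] for i in range(2)]

for f in product(range(2), repeat=2):
    assert Q(f, theta_a, theta_b2, theta_ab2) == Q(f, theta_a, theta_b, theta_ab)
```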
    • 51. Recap MAP Estimation f* = arg min Q(f; θ), where Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) Min-marginals q_a;i = min Q(f; θ) s.t. f(a) = i Reparameterization θ' ≡ θ iff Q(f; θ') = Q(f; θ) for all f
    • 52. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
        • Exact MAP for Chains and Trees
        • Approximate MAP for general graphs
        • Computational Issues and Theoretical Properties
      • Tree-reweighted Message Passing
    • 53. Belief Propagation
      • Belief Propagation gives exact MAP for chains
      • Remember, some MAP problems are easy
      • Exact MAP for trees
      • Clever Reparameterization
    • 54. Two Variables V a V b 2 5 2 1 0 V a V b 2 5 4 0 1 Choose the right constant: θ'_b;k = q_b;k Add a constant to one θ_b;k Subtract that constant from θ_ab;ik for all ‘i’
    • 55. Two Variables V a V b 2 5 2 1 0 V a V b 2 5 4 0 1 Choose the right constant: θ'_b;k = q_b;k M_ab;0 = min { θ_a;0 + θ_ab;00 , θ_a;1 + θ_ab;10 } = min { 5 + 0 , 2 + 1 }
    • 56. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant: θ'_b;k = q_b;k
    • 57. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant: θ'_b;k = q_b;k f(a) = 1: θ'_b;0 = q_b;0 Potentials along the red path add up to 0
    • 58. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 4 0 1 Choose the right constant: θ'_b;k = q_b;k M_ab;1 = min { θ_a;0 + θ_ab;01 , θ_a;1 + θ_ab;11 } = min { 5 + 1 , 2 + 0 }
    • 59. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant: θ'_b;k = q_b;k f(a) = 1: θ'_b;0 = q_b;0 f(a) = 1: θ'_b;1 = q_b;1 Minimum of min-marginals = MAP estimate
    • 60. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant: θ'_b;k = q_b;k f(a) = 1: θ'_b;0 = q_b;0 f(a) = 1: θ'_b;1 = q_b;1 f*(b) = 0 f*(a) = 1
    • 61. Two Variables V a V b 2 5 5 -2 -3 V a V b 2 5 6 -2 -1 Choose the right constant: θ'_b;k = q_b;k f(a) = 1: θ'_b;0 = q_b;0 f(a) = 1: θ'_b;1 = q_b;1 We get all the min-marginals of V b
    • 62. Recap We only need to know two sets of equations. General form of reparameterization: θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i. Reparameterization of (a,b) in Belief Propagation: M_ab;k = min_i { θ_a;i + θ_ab;ik }, M_ba;i = 0
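A minimal sketch (ours) of the BP choice of constants on the two-variable example, confirming that the reparameterized unaries of V b are exactly its min-marginals, θ'_b = (5, 6), as on slides 56-61:

```python
theta_a, theta_b = [5, 2], [2, 4]
theta_ab = [[0, 1], [1, 0]]

# BP message: M_ab;k = min_i { theta_a;i + theta_ab;ik }
M_ab = [min(theta_a[i] + theta_ab[i][k] for i in range(2)) for k in range(2)]
theta_b_new = [theta_b[k] + M_ab[k] for k in range(2)]
print(M_ab, theta_b_new)  # [3, 2] [5, 6]

# These are exactly the min-marginals q_b;k computed by brute force:
q_b = [min(theta_a[i] + theta_ab[i][k] + theta_b[k] for i in range(2))
       for k in range(2)]
assert theta_b_new == q_b  # q_b;0 = 5, q_b;1 = 6
```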
    • 63. Three Variables V a V b 2 5 2 1 0 V c 4 6 0 1 0 1 3 2 3 Reparameterize the edge (a,b) as before l 0 l 1
    • 64. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 f(a) = 1 -2 -1 2 3 l 0 l 1
    • 65. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 -2 -1 2 3 l 0 l 1
    • 66. Three Variables V a V b 2 5 5 -3 V c 6 6 0 1 -2 3 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 -2 -1 2 3 l 0 l 1
    • 67. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 -2 -1 -4 -3 l 0 l 1
    • 68. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 q c;0 q c;1 -2 -1 -4 -3 l 0 l 1
    • 69. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 f(a) = 1 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0 f*(a) = 1 Generalizes to any length chain -2 -1 -4 -3 l 0 l 1
    • 70. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 f(a) = 1 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0 f*(a) = 1 Only Dynamic Programming -2 -1 -4 -3 l 0 l 1
    • 71. Why Dynamic Programming? 3 variables → 2 variables + book-keeping; n variables → (n-1) variables + book-keeping. Start from left, go to right. Reparameterize current edge (a,b): M_ab;k = min_i { θ_a;i + θ_ab;ik }, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k. Repeat
    • 72. Why Dynamic Programming? Start from left, go to right. Reparameterize current edge (a,b): M_ab;k = min_i { θ_a;i + θ_ab;ik }, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k. Repeat. The constants M are Messages; this is Message Passing. Why stop at dynamic programming?
    • 73. Three Variables V a V b 2 5 5 -3 V c 6 12 -6 -5 -2 9 Reparameterize the edge (c,b) as before -2 -1 -4 -3 l 0 l 1
    • 74. Three Variables V a V b 2 5 9 -3 V c 11 12 -11 -9 -2 9 Reparameterize the edge (c,b) as before -2 -1 -9 -7 θ'_b;i = q_b;i l 0 l 1
    • 75. Three Variables V a V b 2 5 9 -3 V c 11 12 -11 -9 -2 9 Reparameterize the edge (b,a) as before -2 -1 -9 -7 l 0 l 1
    • 76. Three Variables V a V b 9 11 9 -9 V c 11 12 -11 -9 -9 9 Reparameterize the edge (b,a) as before -9 -7 -9 -7 θ'_a;i = q_a;i l 0 l 1
    • 77. Three Variables V a V b 9 11 9 -9 V c 11 12 -11 -9 -9 9 Forward Pass → Backward Pass -9 -7 -9 -7 All min-marginals are computed l 0 l 1
    • 78. Belief Propagation on Chains Start from left, go to right. Reparameterize current edge (a,b): M_ab;k = min_i { θ_a;i + θ_ab;ik }, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k. Repeat till the end of the chain. Then start from right, go to left and repeat till the end of the chain
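A sketch (ours) of the forward pass on the four-variable chain, reusing `unary` and `pairwise` from the brute-force sketch earlier; after the pass the unaries of V d are its min-marginals (16, 13), and backtracking the stored argmins recovers the MAP labelling without a full backward pass.

```python
theta = [row[:] for row in unary]      # reparameterized unaries (work on a copy)
back = []                              # argmins kept for backtracking
for e in range(3):                     # edges (a,b), (b,c), (c,d), left to right
    msgs, args = [], []
    for k in range(2):
        # M_ab;k = min_i { theta_a;i + theta_ab;ik }
        cands = [theta[e][i] + pairwise[e][i][k] for i in range(2)]
        i_best = min(range(2), key=cands.__getitem__)
        msgs.append(cands[i_best])
        args.append(i_best)
    theta[e + 1] = [theta[e + 1][k] + msgs[k] for k in range(2)]
    back.append(args)

print(theta[3])                        # [16, 13]: min-marginals of Vd
f = [min(range(2), key=theta[3].__getitem__)]
for e in (2, 1, 0):                    # backtrack the stored argmins
    f.append(back[e][f[-1]])
print(f[::-1], min(theta[3]))          # [1, 0, 0, 1] 13
```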
    • 79. Belief Propagation on Chains
      • A way of computing reparam constants
      • Generalizes to chains of any length
      • Forward Pass - Start to End
        • MAP estimate
        • Min-marginals of final variable
      • Backward Pass - End to start
        • All other min-marginals
      Won’t need this .. But good to know
    • 80. Computational Complexity
      • Each constant takes O(|L|)
      • Number of constants - O(|E||L|)
      O(|E||L|²)
      • Memory required ?
      O(|E||L|)
    • 81. Belief Propagation on Trees V b V a V c V d V e V g V h Forward Pass: Leaf → Root. Backward Pass: Root → Leaf. All min-marginals are computed
    • 82. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
        • Exact MAP for Chains and Trees
        • Approximate MAP for general graphs
        • Computational Issues and Theoretical Properties
      • Tree-reweighted Message Passing
    • 83. Belief Propagation on Cycles V a V b V d V c Where do we start? Arbitrarily. θ_a;0 θ_a;1 θ_b;0 θ_b;1 θ_d;0 θ_d;1 θ_c;0 θ_c;1 Reparameterize (a,b)
    • 84. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ_d;0 θ_d;1 θ_c;0 θ_c;1 Potentials along the red path add up to 0
    • 85. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ_d;0 θ_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0
    • 86. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0
    • 87. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0
    • 88. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0; subtracting the doubly counted θ_a;0, θ_a;1: θ'_a;0 - θ_a;0 = q_a;0 and θ'_a;1 - θ_a;1 = q_a;1
    • 89. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Pick minimum min-marginal. Follow red path. θ'_a;0 - θ_a;0 = q_a;0 and θ'_a;1 - θ_a;1 = q_a;1
    • 90. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ_d;0 θ_d;1 θ_c;0 θ_c;1 Potentials along the red path add up to 0
    • 91. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ_d;0 θ_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0
    • 92. Belief Propagation on Cycles V a V b V d V c θ_a;0 θ_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0
    • 93. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Potentials along the red path add up to 0; subtracting θ_a;0, θ_a;1: θ'_a;1 - θ_a;1 = q_a;1 but θ'_a;0 - θ_a;0 ≤ q_a;0
    • 94. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Problem Solved. θ'_a;1 - θ_a;1 = q_a;1 and θ'_a;0 - θ_a;0 ≤ q_a;0
    • 95. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Problem Not Solved. θ'_a;1 - θ_a;1 = q_a;1, but θ'_a;0 - θ_a;0 ≤ q_a;0 can undershoot the true min-marginal
    • 96. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ'_b;0 θ'_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 (after subtracting θ_a;0, θ_a;1) Reparameterize (a,b) again
    • 97. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ''_b;0 θ''_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Reparameterize (a,b) again. But doesn't this overcount some potentials?
    • 98. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ''_b;0 θ''_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Reparameterize (a,b) again. Yes. But we will do it anyway
    • 99. Belief Propagation on Cycles V a V b V d V c θ'_a;0 θ'_a;1 θ''_b;0 θ''_b;1 θ'_d;0 θ'_d;1 θ'_c;0 θ'_c;1 Keep reparameterizing edges in some order. Hope for convergence and a good solution
    • 100. Belief Propagation
      • Generalizes to any arbitrary random field
      • Complexity per iteration ?
      O(|E||L|²)
      • Memory required ?
      O(|E||L|)
    • 101. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
        • Exact MAP for Chains and Trees
        • Approximate MAP for general graphs
        • Computational Issues and Theoretical Properties
      • Tree-reweighted Message Passing
    • 102. Computational Issues of BP Complexity per iteration O(|E||L|²) Special Pairwise Potentials θ_ab;ik = w_ab d(|i - k|) reduce it to O(|E||L|) Felzenszwalb & Huttenlocher, 2004 (figure: the distance function d against |i - k| for the Potts, Truncated Linear and Truncated Quadratic models)
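For intuition, here is the O(|L|) Potts-model message (d(x) = 0 if x = 0, else 1), the simplest instance of these speed-ups; the helper name and the uniform weight w are our own illustration, not code from the paper.

```python
def potts_message(h, w):
    """All messages M_k = min_i { h[i] + w * (i != k) } in O(|L|) time.

    h[i] is the sender's unary after adding its incoming messages. The naive
    double loop is O(|L|^2); for Potts it collapses to min(h[k], min(h) + w).
    """
    base = min(h) + w
    return [min(h_k, base) for h_k in h]

print(potts_message([5, 2, 9], w=1))  # [3, 2, 3], same as the naive computation
```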
    • 103. Computational Issues of BP Memory requirements O(|E||L|), half of original BP Kolmogorov, 2006 Some approximations exist (Yu, Lin, Super and Tan, 2007; Lasserre, Kannan and Winn, 2007), but memory still remains an issue
    • 104. Computational Issues of BP Order of reparameterization: randomly; in some fixed order; or the one that results in maximum change (Residual Belief Propagation, Elidan et al., 2006)
    • 105. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Run BP. Assume it converges.
    • 106. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a tree. Change their labels. Value of energy does not decrease
    • 107. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a cycle. Change their labels. Value of energy does not decrease
    • 108. Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? For cycles, if BP converges then exact MAP Weiss and Freeman, 2001
    • 109. Results Object Detection Felzenszwalb and Huttenlocher, 2004 Labels - Poses of parts Unary Potentials: Fraction of foreground pixels Pairwise Potentials: Favour Valid Configurations H T A1 A2 L1 L2
    • 110. Results Object Detection Felzenszwalb and Huttenlocher, 2004
    • 111. Results Binary Segmentation Szeliski et al. , 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels
    • 112. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels Belief Propagation
    • 113. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Global optimum Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels
    • 114. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels Stereo Correspondence
    • 115. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels Belief Propagation Stereo Correspondence
    • 116. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Global optimum Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels Stereo Correspondence
    • 117. Summary of BP Exact for chains Exact for trees Approximate MAP for general cases Not even convergence guaranteed So can we do something better?
    • 118. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
        • Integer Programming Formulation
        • Linear Programming Relaxation and its Dual
        • Convergent Solution for Dual
        • Computational Issues and Theoretical Properties
    • 119. TRW Message Passing
      • A different look at the same problem: Convex (not Combinatorial) Optimization
      • A similar solution: Combinatorial (not Convex) Optimization
      We will look at the most general MAP estimation: not trees, no assumption on potentials
    • 120. Things to Remember
      • Forward-pass computes min-marginals of root
      • BP is exact for trees
      • Every iteration provides a reparameterization
      • Basics of Mathematical Optimization
    • 121. Mathematical Optimization min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N
      • Objective function
      • Constraints
      • Feasible region = {x | g i (x) ≤ 0}
      x* = arg min g_0(x): Optimal Solution; g_0(x*): Optimal Value
    • 122. Integer Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N
      • Objective function
      • Constraints
      • Feasible region = {x | g i (x) ≤ 0}
      x* = arg min g_0(x): Optimal Solution; g_0(x*): Optimal Value; x_k ∈ Z
    • 123. Feasible Region Generally NP-hard to optimize
    • 124. Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N
      • Objective function
      • Constraints
      • Feasible region = {x | g i (x) ≤ 0}
      x* = arg min g_0(x): Optimal Solution; g_0(x*): Optimal Value
    • 125. Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, … , N
      • Linear objective function
      • Linear constraints
      • Feasible region = {x | g i (x) ≤ 0}
      x* = arg min g_0(x): Optimal Solution; g_0(x*): Optimal Value
    • 126. Linear Programming min cᵀx subject to Ax ≤ b
      • Linear objective function
      • Linear constraints
      • Feasible region = {x | Ax ≤ b}
      x* = arg min cᵀx: Optimal Solution; cᵀx*: Optimal Value. Polynomial-time Solution
    • 127. Feasible Region Polynomial-time Solution
    • 128. Feasible Region Optimal solution lies on a vertex (obj func linear)
    • 129. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
        • Integer Programming Formulation
        • Linear Programming Relaxation and its Dual
        • Convergent Solution for Dual
        • Computational Issues and Theoretical Properties
    • 130. Integer Programming Formulation V a V b Label l 0 Label l 1 2 5 4 2 0 1 1 0 2 Unary Potentials θ_a;0 = 5, θ_a;1 = 2, θ_b;0 = 2, θ_b;1 = 4 Labelling f(a) = 1, f(b) = 0 y_a;0 = 0, y_a;1 = 1, y_b;0 = 1, y_b;1 = 0 Any f(.) has equivalent boolean variables y_a;i
    • 131. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Unary Potentials θ_a;0 = 5, θ_a;1 = 2, θ_b;0 = 2, θ_b;1 = 4 Labelling f(a) = 1, f(b) = 0 y_a;0 = 0, y_a;1 = 1, y_b;0 = 1, y_b;1 = 0 Find the optimal variables y_a;i Label l 0 Label l 1
    • 132. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Unary Potentials θ_a;0 = 5, θ_a;1 = 2, θ_b;0 = 2, θ_b;1 = 4 Sum of Unary Potentials: ∑_a ∑_i θ_a;i y_a;i, where y_a;i ∈ {0,1} for all V a, l i and ∑_i y_a;i = 1 for all V a Label l 0 Label l 1
    • 133. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Pairwise Potentials θ_ab;00 = 0, θ_ab;10 = 1, θ_ab;01 = 1, θ_ab;11 = 0 Sum of Pairwise Potentials: ∑_(a,b) ∑_ik θ_ab;ik y_a;i y_b;k, with y_a;i ∈ {0,1}, ∑_i y_a;i = 1 Label l 0 Label l 1
    • 134. Integer Programming Formulation V a V b 2 5 4 2 0 1 1 0 2 Pairwise Potentials θ_ab;00 = 0, θ_ab;10 = 1, θ_ab;01 = 1, θ_ab;11 = 0 Sum of Pairwise Potentials: ∑_(a,b) ∑_ik θ_ab;ik y_ab;ik, with y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k Label l 0 Label l 1
    • 135. Integer Programming Formulation min ∑_a ∑_i θ_a;i y_a;i + ∑_(a,b) ∑_ik θ_ab;ik y_ab;ik s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k
    • 136. Integer Programming Formulation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k, where θ = [ … θ_a;i … ; … θ_ab;ik … ] and y = [ … y_a;i … ; … y_ab;ik … ]
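Concretely, for the two-variable instance of slide 130, stacking the potentials into θ and the indicator variables into y makes θᵀy equal to Q(f; θ); a small sketch (ours):

```python
theta = [5, 2, 2, 4,   # theta_a;0, theta_a;1, theta_b;0, theta_b;1
         0, 1, 1, 0]   # theta_ab;00, theta_ab;01, theta_ab;10, theta_ab;11

def indicator(fa, fb):
    """y for labelling f: y_a;i, y_b;k, and y_ab;ik = y_a;i * y_b;k."""
    ya, yb = [1 - fa, fa], [1 - fb, fb]
    return ya + yb + [ya[i] * yb[k] for i in range(2) for k in range(2)]

y = indicator(1, 0)  # f(a) = 1, f(b) = 0
print(sum(t * v for t, v in zip(theta, y)))  # 5 = theta_a;1 + theta_b;0 + theta_ab;10
```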
    • 137. One variable, two labels y_a;0 ∈ {0,1}, y_a;1 ∈ {0,1}, y_a;0 + y_a;1 = 1; y = [ y_a;0 y_a;1 ], θ = [ θ_a;0 θ_a;1 ] (figure: the feasible points in the (y_a;0, y_a;1) plane)
    • 138. Two variables, two labels
      θ = [ θ_a;0 θ_a;1 θ_b;0 θ_b;1 θ_ab;00 θ_ab;01 θ_ab;10 θ_ab;11 ]
      y = [ y_a;0 y_a;1 y_b;0 y_b;1 y_ab;00 y_ab;01 y_ab;10 y_ab;11 ] y_a;0 ∈ {0,1}, y_a;1 ∈ {0,1}, y_a;0 + y_a;1 = 1; y_b;0 ∈ {0,1}, y_b;1 ∈ {0,1}, y_b;0 + y_b;1 = 1; y_ab;00 = y_a;0 y_b;0 , y_ab;01 = y_a;0 y_b;1 , y_ab;10 = y_a;1 y_b;0 , y_ab;11 = y_a;1 y_b;1
    • 139. In General Marginal Polytope
    • 140. In General
      θ ∈ R^(|V||L| + |E||L|²), y ∈ {0,1}^(|V||L| + |E||L|²)
      Number of constraints: |V||L| + |V| + |E||L|²
      y_a;i ∈ {0,1}; ∑_i y_a;i = 1; y_ab;ik = y_a;i y_b;k
    • 141. Integer Programming Formulation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k, where θ = [ … θ_a;i … ; … θ_ab;ik … ] and y = [ … y_a;i … ; … y_ab;ik … ]
    • 142. Integer Programming Formulation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k Solve to obtain MAP labelling y*
    • 143. Integer Programming Formulation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k But we can’t solve it in general
    • 144. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
        • Integer Programming Formulation
        • Linear Programming Relaxation and its Dual
        • Convergent Solution for Dual
        • Computational Issues and Theoretical Properties
    • 145. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k Two reasons why we can’t solve this
    • 146. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k One reason why we can’t solve this
    • 147. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = ∑_k y_a;i y_b;k One reason why we can’t solve this
    • 148. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i ∑_k y_b;k, and ∑_k y_b;k = 1 One reason why we can’t solve this
    • 149. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i One reason why we can’t solve this
    • 150. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i No reason why we can’t solve this * (* memory requirements, time complexity)
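A minimal sketch of this relaxation on the two-variable instance, using scipy.optimize.linprog as an off-the-shelf LP solver (the solver choice is ours); both marginalization directions are imposed, as in the general local-polytope formulation. Because this instance is submodular, the LP optimum is integral and equals the MAP energy, 5.

```python
import numpy as np
from scipy.optimize import linprog

# Variable order: y_a;0, y_a;1, y_b;0, y_b;1, y_ab;00, y_ab;01, y_ab;10, y_ab;11
c = np.array([5, 2, 2, 4, 0, 1, 1, 0], dtype=float)  # theta

A_eq = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0],   # sum_i y_a;i = 1
    [0, 0, 1, 1, 0, 0, 0, 0],   # sum_k y_b;k = 1
    [-1, 0, 0, 0, 1, 1, 0, 0],  # sum_k y_ab;0k = y_a;0
    [0, -1, 0, 0, 0, 0, 1, 1],  # sum_k y_ab;1k = y_a;1
    [0, 0, -1, 0, 1, 0, 1, 0],  # sum_i y_ab;i0 = y_b;0
    [0, 0, 0, -1, 0, 1, 0, 1],  # sum_i y_ab;i1 = y_b;1
], dtype=float)
b_eq = np.array([1, 1, 0, 0, 0, 0], dtype=float)

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.fun, res.x[:4])  # 5.0 and y_a = (0, 1), y_b = (1, 0): integral optimum
```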
    • 151. One variable, two labels y_a;0 ∈ {0,1}, y_a;1 ∈ {0,1}, y_a;0 + y_a;1 = 1; y = [ y_a;0 y_a;1 ], θ = [ θ_a;0 θ_a;1 ]
    • 152. One variable, two labels y_a;0 ∈ [0,1], y_a;1 ∈ [0,1], y_a;0 + y_a;1 = 1; y = [ y_a;0 y_a;1 ], θ = [ θ_a;0 θ_a;1 ]
    • 153. Two variables, two labels
      θ = [ θ_a;0 θ_a;1 θ_b;0 θ_b;1 θ_ab;00 θ_ab;01 θ_ab;10 θ_ab;11 ]
      y = [ y_a;0 y_a;1 y_b;0 y_b;1 y_ab;00 y_ab;01 y_ab;10 y_ab;11 ] y_a;0 ∈ {0,1}, y_a;1 ∈ {0,1}, y_a;0 + y_a;1 = 1; y_b;0 ∈ {0,1}, y_b;1 ∈ {0,1}, y_b;0 + y_b;1 = 1; y_ab;00 = y_a;0 y_b;0 , y_ab;01 = y_a;0 y_b;1 , y_ab;10 = y_a;1 y_b;0 , y_ab;11 = y_a;1 y_b;1
    • 154. Two variables, two labels θ and y as above, now with y_a;0 ∈ [0,1], y_a;1 ∈ [0,1], y_a;0 + y_a;1 = 1; y_b;0 ∈ [0,1], y_b;1 ∈ [0,1], y_b;0 + y_b;1 = 1; y_ab;00 = y_a;0 y_b;0 , y_ab;01 = y_a;0 y_b;1 , y_ab;10 = y_a;1 y_b;0 , y_ab;11 = y_a;1 y_b;1
    • 155. Two variables, two labels θ and y as above; y_a;0 ∈ [0,1], y_a;1 ∈ [0,1], y_a;0 + y_a;1 = 1; y_b;0 ∈ [0,1], y_b;1 ∈ [0,1], y_b;0 + y_b;1 = 1; y_ab;00 + y_ab;01 = y_a;0 , y_ab;10 = y_a;1 y_b;0 , y_ab;11 = y_a;1 y_b;1
    • 156. Two variables, two labels θ and y as above; y_a;0 ∈ [0,1], y_a;1 ∈ [0,1], y_a;0 + y_a;1 = 1; y_b;0 ∈ [0,1], y_b;1 ∈ [0,1], y_b;0 + y_b;1 = 1; y_ab;00 + y_ab;01 = y_a;0 , y_ab;10 + y_ab;11 = y_a;1
    • 157. In General Marginal Polytope Local Polytope
    • 158. In General
      θ ∈ R^(|V||L| + |E||L|²), y ∈ [0,1]^(|V||L| + |E||L|²)
      Number of constraints: |V||L| + |V| + |E||L|
    • 159. Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i No reason why we can’t solve this
    • 160. Linear Programming Relaxation Extensively studied Optimization Schlesinger, 1976 Koster, van Hoesel and Kolen, 1998 Theory Chekuri et al, 2001 Archer et al, 2004 Machine Learning Wainwright et al., 2001
    • 161. Linear Programming Relaxation Many interesting Properties
      • Global optimal MAP for trees
      Wainwright et al., 2001 But we are interested in NP-hard cases
      • Preserves solution for reparameterization
    • 162. Linear Programming Relaxation
      • Large class of problems
        • Metric Labelling
        • Semi-metric Labelling
      Many interesting Properties - Integrality Gap Manokaran et al., 2008
      • Most likely, provides best possible integrality gap
    • 163. Linear Programming Relaxation
      • A computationally useful dual
      Many interesting Properties - Dual Optimal value of dual = Optimal value of primal Easier-to-solve
    • 164. Dual of the LP Relaxation Wainwright et al., 2001 V a V b V c V d V e V f V g V h V i θ min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i
    • 165. Dual of the LP Relaxation Wainwright et al., 2001 (figure: the grid decomposed into six trees) θ¹ θ² θ³ θ⁴ θ⁵ θ⁶ with tree weights ρ₁ ρ₂ ρ₃ ρ₄ ρ₅ ρ₆, subject to ∑_i ρ_i θ^i = θ, ρ_i ≥ 0
    • 166. Dual of the LP Relaxation Wainwright et al., 2001 q*(θ¹) q*(θ²) q*(θ³) q*(θ⁴) q*(θ⁵) q*(θ⁶) are the tree MAP energies. Dual of LP: max ∑_i ρ_i q*(θ^i) subject to ∑_i ρ_i θ^i = θ, ρ_i ≥ 0
    • 167. Dual of the LP Relaxation Wainwright et al., 2001 Dual of LP: max ∑_i ρ_i q*(θ^i) subject to ∑_i ρ_i θ^i ≡ θ (a reparameterization of θ), ρ_i ≥ 0
    • 168. Dual of the LP Relaxation Wainwright et al., 2001 max ∑_i ρ_i q*(θ^i) s.t. ∑_i ρ_i θ^i ≡ θ. I can easily compute q*(θ^i). I can easily maintain the reparameterization constraint. So can I easily solve the dual?
    • 169. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
        • Integer Programming Formulation
        • Linear Programming Relaxation and its Dual
        • Convergent Solution for Dual
        • Computational Issues and Theoretical Properties
    • 170. TRW Message Passing Kolmogorov, 2006 (figure: two trees over V a … V i) θ¹ θ² θ³ and θ⁴ θ⁵ θ⁶, with ∑_i ρ_i θ^i ≡ θ and objective ∑_i ρ_i q*(θ^i). Pick a variable V a
    • 171. TRW Message Passing Kolmogorov, 2006 ∑_i ρ_i θ^i ≡ θ; objective ∑_i ρ_i q*(θ^i). Tree 1: V c V b V a with unaries θ¹_c;0 θ¹_c;1 θ¹_b;0 θ¹_b;1 θ¹_a;0 θ¹_a;1. Tree 4: V a V d V g with unaries θ⁴_a;0 θ⁴_a;1 θ⁴_d;0 θ⁴_d;1 θ⁴_g;0 θ⁴_g;1
    • 172. TRW Message Passing Kolmogorov, 2006 ρ₁θ¹ + ρ₄θ⁴ + θ_rest ≡ θ; objective ρ₁ q*(θ¹) + ρ₄ q*(θ⁴) + K. Reparameterize trees 1 and 4 to obtain the min-marginals of V a
    • 173. TRW Message Passing Kolmogorov, 2006 ρ₁θ'¹ + ρ₄θ'⁴ + θ_rest; one pass of Belief Propagation in each tree makes the unaries of V a its min-marginals θ'¹_a;0 θ'¹_a;1 and θ'⁴_a;0 θ'⁴_a;1. Objective: ρ₁ q*(θ'¹) + ρ₄ q*(θ'⁴) + K
    • 174. TRW Message Passing Kolmogorov, 2006 ρ₁θ'¹ + ρ₄θ'⁴ + θ_rest ≡ θ; q*(θ'¹) and q*(θ'⁴) remain the same. Objective: ρ₁ q*(θ'¹) + ρ₄ q*(θ'⁴) + K
    • 175. TRW Message Passing Kolmogorov, 2006 ρ₁θ'¹ + ρ₄θ'⁴ + θ_rest ≡ θ; objective = ρ₁ min{θ'¹_a;0, θ'¹_a;1} + ρ₄ min{θ'⁴_a;0, θ'⁴_a;1} + K
    • 176. TRW Message Passing Kolmogorov, 2006 ρ₁θ'¹ + ρ₄θ'⁴ + θ_rest ≡ θ; compute the weighted average of the min-marginals of V a
    • 177. TRW Message Passing Kolmogorov, 2006 ρ₁θ'¹ + ρ₄θ'⁴ + θ_rest ≡ θ; θ''_a;0 = (ρ₁ θ'¹_a;0 + ρ₄ θ'⁴_a;0) / (ρ₁ + ρ₄), θ''_a;1 = (ρ₁ θ'¹_a;1 + ρ₄ θ'⁴_a;1) / (ρ₁ + ρ₄)
    • 178. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest: both trees now carry θ''_a;0, θ''_a;1 at V a (same averaging formulas as above)
    • 179. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest ≡ θ: the averaging preserves the reparameterization constraint
    • 180. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest ≡ θ; objective = ρ₁ min{θ''_a;0, θ''_a;1} + ρ₄ min{θ''_a;0, θ''_a;1} + K
    • 181. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest ≡ θ; objective = (ρ₁ + ρ₄) min{θ''_a;0, θ''_a;1} + K
    • 182. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest ≡ θ; the gain follows from min{p₁ + p₂, q₁ + q₂} ≥ min{p₁, q₁} + min{p₂, q₂}
    • 183. TRW Message Passing Kolmogorov, 2006 ρ₁θ''¹ + ρ₄θ''⁴ + θ_rest ≡ θ; objective function increases or remains constant: (ρ₁ + ρ₄) min{θ''_a;0, θ''_a;1} + K
    • 184. TRW Message Passing Kolmogorov, 2006 Initialize θ^i, taking care of the reparameterization constraint. REPEAT: Choose a random variable V a; compute the min-marginals of V a for all trees; node-average the min-marginals. Can also do edge-averaging
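A sketch (ours) of the node-averaging step in isolation; the two trees, their weights, and the numbers are purely illustrative.

```python
def node_average(unaries, rhos):
    """TRW node-averaging for one variable shared by several trees.

    unaries[t][i] is theta'_a;i in tree t, made a min-marginal by BP;
    rhos[t] is the weight of tree t. Every tree's unary for the shared
    variable is replaced by the rho-weighted average, which preserves
    the reparameterization constraint on sum_t rho_t * theta^t.
    """
    total = sum(rhos)
    labels = range(len(unaries[0]))
    avg = [sum(r * u[i] for r, u in zip(rhos, unaries)) / total for i in labels]
    return [avg[:] for _ in unaries]

# Two trees with rho_1 = rho_4 = 1 sharing Va (illustrative values):
print(node_average([[7.5, 7.0], [10.0, 7.0]], [1, 1]))
# [[8.75, 7.0], [8.75, 7.0]]
```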
    • 185. Example 1 V a V b 0 1 1 0 2 5 4 2 l 0 l 1 V b V c 0 2 3 1 4 2 6 3 V c V a 1 4 1 0 6 3 6 4 ρ₁ = ρ₂ = ρ₃ = 1 5 6 7 Pick variable V a . Reparameterize.
    • 186. Example 1 V a V b -3 -2 -1 -2 5 7 4 2 V b V c 0 2 3 1 4 2 6 3 V c V a -3 1 -3 -3 6 3 10 7 ρ₁ = ρ₂ = ρ₃ = 1 5 6 7 Average the min-marginals of V a l 0 l 1
    • 187. Example 1 V a V b -3 -2 -1 -2 7.5 7 4 2 V b V c 0 2 3 1 4 2 6 3 V c V a -3 1 -3 -3 6 3 7.5 7 ρ₁ = ρ₂ = ρ₃ = 1 7 6 7 Pick variable V b . Reparameterize. l 0 l 1
    • 188. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.5 7 V b V c -5 -3 -1 -3 9 6 6 3 V c V a -3 1 -3 -3 6 3 7.5 7 ρ₁ = ρ₂ = ρ₃ = 1 7 6 7 Average the min-marginals of V b l 0 l 1
    • 189. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7 ρ₁ = ρ₂ = ρ₃ = 1 6.5 6.5 7 Value of dual does not increase l 0 l 1
    • 190. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7 ρ₁ = ρ₂ = ρ₃ = 1 6.5 6.5 7 Maybe it will increase for V c NO l 0 l 1
    • 191. Example 1 V a V b -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 V b V c -5 -3 -1 -3 8.75 6.5 6 3 V c V a -3 1 -3 -3 6 3 7.5 7 ρ₁ = ρ₂ = ρ₃ = 1 Strong Tree Agreement Exact MAP Estimate f 1 (a) = 0 f 1 (b) = 0 f 2 (b) = 0 f 2 (c) = 0 f 3 (c) = 0 f 3 (a) = 0 l 0 l 1
    • 192. Example 2 V a V b 0 1 1 0 2 5 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 1 1 0 0 3 4 8 ρ₁ = ρ₂ = ρ₃ = 1 4 0 4 Pick variable V a . Reparameterize. l 0 l 1
    • 193. Example 2 V a V b -2 -1 -1 -2 4 7 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 9 ρ₁ = ρ₂ = ρ₃ = 1 4 0 4 Average the min-marginals of V a l 0 l 1
    • 194. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8 ρ₁ = ρ₂ = ρ₃ = 1 4 0 4 Value of dual does not increase l 0 l 1
    • 195. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8 ρ₁ = ρ₂ = ρ₃ = 1 4 0 4 Maybe it will increase for V b or V c NO l 0 l 1
    • 196. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8 ρ₁ = ρ₂ = ρ₃ = 1 f 1 (a) = 1 f 1 (b) = 1 f 2 (b) = 1 f 2 (c) = 0 f 3 (c) = 1 f 3 (a) = 1 f 2 (b) = 0 f 2 (c) = 1 Weak Tree Agreement Not Exact MAP Estimate l 0 l 1
    • 197. Example 2 V a V b -2 -1 -1 -2 4 8 2 2 V b V c 1 0 0 1 0 0 0 0 V c V a 0 0 1 -1 0 3 4 8 ρ₁ = ρ₂ = ρ₃ = 1 Weak Tree Agreement Convergence point of TRW l 0 l 1 f 1 (a) = 1 f 1 (b) = 1 f 2 (b) = 1 f 2 (c) = 0 f 3 (c) = 1 f 3 (a) = 1 f 2 (b) = 0 f 2 (c) = 1
    • 198. Obtaining the Labelling Only solves the dual. Primal solutions? V a V b V c V d V e V f V g V h V i θ' = ∑_i ρ_i θ^i ≡ θ. Fix the label of V a
    • 199. Obtaining the Labelling Only solves the dual. Primal solutions? V a V b V c V d V e V f V g V h V i θ' = ∑_i ρ_i θ^i ≡ θ. Fix the label of V b. Continue in some fixed order. Meltzer et al., 2006
    • 200. Outline
      • Problem Formulation
      • Reparameterization
      • Belief Propagation
      • Tree-reweighted Message Passing
        • Integer Programming Formulation
        • Linear Programming Relaxation and its Dual
        • Convergent Solution for Dual
        • Computational Issues and Theoretical Properties
    • 201. Computational Issues of TRW
      • Speed-ups for some pairwise potentials
      Basic Component is Belief Propagation Felzenszwalb & Huttenlocher, 2004
      • Memory requirements cut down by half
      Kolmogorov, 2006
      • Further speed-ups using monotonic chains
      Kolmogorov, 2006
    • 202. Theoretical Properties of TRW
      • Always converges, unlike BP
      Kolmogorov, 2006
      • Strong tree agreement implies exact MAP
      Wainwright et al., 2001
      • Optimal MAP for two-label submodular problems
      Kolmogorov and Wainwright, 2005: θ_ab;00 + θ_ab;11 ≤ θ_ab;01 + θ_ab;10
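The submodularity condition in the last bullet is easy to check edge by edge; a tiny sketch (ours):

```python
def is_submodular(theta_ab):
    """Two-label edge: submodular iff theta_00 + theta_11 <= theta_01 + theta_10."""
    return theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]

print(is_submodular([[0, 1], [1, 0]]))  # True: the Ising edge used throughout
print(is_submodular([[1, 0], [0, 1]]))  # False: a non-submodular edge
```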
    • 203. Results Binary Segmentation Szeliski et al. , 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels
    • 204. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels TRW
    • 205. Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al. , 2008 Belief Propagation Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels
    • 206. Results Stereo Correspondence Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels
    • 207. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels TRW Stereo Correspondence
    • 208. Results Szeliski et al. , 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Belief Propagation Pairwise Potentials: 0, if same labels; 1 - λ exp(|D_a - D_b|), if different labels Stereo Correspondence
    • 209. Results Non-submodular problems Kolmogorov, 2006 (plots: energy of BP vs TRW-S, 30x30 grid, K 50) BP outperforms TRW-S
    • 210. Summary
      • Trees can be solved exactly - BP
      • No guarantee of convergence otherwise - BP
      • Strong Tree Agreement - TRW-S
      • Submodular energies solved exactly - TRW-S
      • TRW-S solves an LP relaxation of MAP estimation
      • Loopier graphs give worse results (Rother and Kolmogorov, 2006)
    • 211. Related New(er) Work
      • Solving the Dual
      Globerson and Jaakkola, 2007 Komodakis, Paragios and Tziritas 2007 Weiss et al., 2006 Schlesinger and Giginyak, 2007
      • Solving the Primal
      Ravikumar, Agarwal and Wainwright, 2008
    • 212. Related New(er) Work
      • More complex relaxations
      Sontag and Jaakkola, 2007 Komodakis and Paragios, 2008 Kumar, Kolmogorov and Torr, 2007 Werner, 2008 Sontag et al., 2008 Kumar and Torr, 2008
    • 213. Questions on Part I ? Code + Standard Data http://vision.middlebury.edu/MRF