ICCV 2009: MAP Inference in Discrete Models: Part 3
1. MAP Inference in Discrete Models
M. Pawan Kumar, Stanford University
2. The Problem
E(x) = ∑i fi(xi) + ∑ij gij(xi, xj) + ∑c hc(xc)
Unary, Pairwise, and Higher Order terms
Problems worthy of attack
Prove their worth by fighting back
Minimize E(x) ….. Done !!!
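The energy above is easy to evaluate for any labeling; minimizing it is the hard part. As a minimal sketch (not the tutorial's code), the function below evaluates the unary and pairwise terms for a labeling x; the potential values reuse the two-variable example from the later slides, and the data layout is an assumption of this sketch.

```python
# Sketch: evaluate E(x) = sum_i f_i(x_i) + sum_ij g_ij(x_i, x_j).
# unary[a][i] is f_a(i); pairwise[(a, b)][i][k] is g_ab(i, k).

def energy(x, unary, pairwise, edges):
    """Evaluate the energy of a labeling x (list of label indices)."""
    e = sum(unary[a][x[a]] for a in range(len(x)))
    e += sum(pairwise[(a, b)][x[a]][x[b]] for (a, b) in edges)
    return e

unary = [[5, 2], [2, 4]]                  # f_a, f_b over labels {0, 1}
pairwise = {(0, 1): [[0, 1], [1, 0]]}     # g_ab: 0 on agreement, 1 otherwise
print(energy([1, 0], unary, pairwise, [(0, 1)]))  # 2 + 2 + 1 = 5
```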
16. Reparameterization
θ' is a reparameterization of θ, iff
Q(f; θ') = Q(f; θ), for all f
Equivalently (Kolmogorov, PAMI, 2006):
θ'a;i = θa;i + Mba;i
θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik - Mab;k - Mba;i
[Figure: two-node example Va–Vb showing the potentials before and after absorbing the messages]
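The reparameterization update can be checked numerically: adding messages to the unary potentials and subtracting them from the pairwise potentials leaves every Q(f; θ) unchanged. A minimal sketch; the potential and message values below are illustrative, not from the slides.

```python
# Sketch: verify that theta' is a reparameterization of theta, i.e.
# Q(f; theta') = Q(f; theta) for every labeling f of the pair (Va, Vb).
import itertools

ta = [5, 2]                 # theta_a;i
tb = [2, 4]                 # theta_b;k
tab = [[0, 1], [1, 0]]      # theta_ab;ik
Mab = [3, 2]                # arbitrary message a -> b
Mba = [1, 4]                # arbitrary message b -> a

ta2 = [ta[i] + Mba[i] for i in range(2)]
tb2 = [tb[k] + Mab[k] for k in range(2)]
tab2 = [[tab[i][k] - Mab[k] - Mba[i] for k in range(2)] for i in range(2)]

for i, k in itertools.product(range(2), range(2)):
    Q = ta[i] + tb[k] + tab[i][k]
    Q2 = ta2[i] + tb2[k] + tab2[i][k]
    assert Q == Q2          # Q(f; theta') = Q(f; theta) for all f
```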
17. Recap
MAP Estimation
f* = arg min_f Q(f; θ)
Q(f; θ) = ∑a θa;f(a) + ∑(a,b) θab;f(a)f(b)
Min-marginals
qa;i = min_f Q(f; θ) s.t. f(a) = i
Reparameterization
Q(f; θ') = Q(f; θ), for all f
18. Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
19. Belief Propagation
• Some MAP problems are easy
• Belief Propagation gives exact MAP for chains
• Exact MAP for trees
• Clever Reparameterization
20. Two Variables
[Figure: chain Va–Vb over labels {0, 1}; unary potentials θa = (5, 2), θb = (2, 4); pairwise potentials 0 on agreement, 1 on disagreement]
Add a constant to one θb;k
Subtract that constant from θab;ik for all i
Choose the right constant: θ'b;k = qb;k
21. Two Variables
[Figure: same chain, computing the message into θb;0]
Mab;0 = min { θa;0 + θab;00, θa;1 + θab;10 } = min { 5+0, 2+1 } = 3
22. Two Variables
[Figure: Mab;0 = 3 subtracted from θab;i0 and added to θb;0, giving θab;00 = -3, θab;10 = -2, θ'b;0 = 5]
23. Two Variables
[Figure: same reparameterized chain; the minimizing path into θ'b;0 passes through f(a) = 1]
θ'b;0 = qb;0
Potentials along the red path add up to 0
24. Two Variables
[Figure: same chain, computing the message into θb;1]
Mab;1 = min { θa;0 + θab;01, θa;1 + θab;11 } = min { 5+1, 2+0 } = 2
25. Two Variables
[Figure: both messages absorbed; the minimizing path into either label of Vb has f(a) = 1; θ'b;0 = 5, θ'b;1 = 6]
θ'b;0 = qb;0 and θ'b;1 = qb;1
Minimum of min-marginals = MAP estimate
26. Two Variables
[Figure: same reparameterized chain]
θ'b;0 = qb;0 and θ'b;1 = qb;1
f*(b) = 0, and backtracking gives f*(a) = 1
27. Two Variables
[Figure: same reparameterized chain]
θ'b;0 = qb;0 and θ'b;1 = qb;1
We get all the min-marginals of Vb
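The two-variable computation of the last few slides fits in a few lines of code. A sketch under the slides' numbers (θa = (5, 2), θb = (2, 4), Ising-like pairwise potentials); the list layout is an assumption of this sketch.

```python
# Sketch: messages M_ab;k give the min-marginals q_b;k of Vb
# after reparameterization, as on the slides.
ta = [5, 2]                 # theta_a;0, theta_a;1
tb = [2, 4]                 # theta_b;0, theta_b;1
tab = [[0, 1], [1, 0]]      # theta_ab;ik

Mab = [min(ta[i] + tab[i][k] for i in range(2)) for k in range(2)]
qb = [tb[k] + Mab[k] for k in range(2)]
print(Mab, qb)   # [3, 2] [5, 6] -> q_b;0 = 5, q_b;1 = 6, MAP value = 5
```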
28. Recap
We only need to know two sets of equations
General form of reparameterization:
θ'a;i = θa;i + Mba;i    θ'b;k = θb;k + Mab;k
θ'ab;ik = θab;ik - Mab;k - Mba;i
Reparameterization of (a,b) in Belief Propagation:
Mab;k = min_i { θa;i + θab;ik }
Mba;i = 0
29. Three Variables
[Figure: chain Va–Vb–Vc over labels {l0, l1} with unary and pairwise potentials]
Reparameterize the edge (a,b) as before
30. Three Variables
[Figure: edge (a,b) reparameterized; the minimizing label of Va is f(a) = 1 for either label of Vb]
Reparameterize the edge (a,b) as before
31. Three Variables
[Figure: same chain]
Reparameterize the edge (a,b) as before
Potentials along the red path add up to 0
32. Three Variables
[Figure: same chain]
Reparameterize the edge (b,c) as before
Potentials along the red path add up to 0
33. Three Variables
[Figure: edge (b,c) reparameterized; the minimizing label of Vb is f(b) = 1 or f(b) = 0 depending on the label of Vc]
Reparameterize the edge (b,c) as before
Potentials along the red path add up to 0
34. Three Variables
[Figure: after reparameterizing both edges, θ'c;0 = qc;0 and θ'c;1 = qc;1]
Reparameterize the edge (b,c) as before
Potentials along the red path add up to 0
37. Why Dynamic Programming?
3 variables = 2 variables + book-keeping
n variables = (n-1) variables + book-keeping
Start from left, go to right
Reparameterize current edge (a,b):
Mab;k = min_i { θa;i + θab;ik }
θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k
Repeat
38. Why Dynamic Programming?
The constants are Messages; the algorithm is Message Passing
Why stop at dynamic programming?
Start from left, go to right
Reparameterize current edge (a,b):
Mab;k = min_i { θa;i + θab;ik }
θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k
Repeat
39. Three Variables
[Figure: the reparameterized chain after the forward pass]
Reparameterize the edge (c,b) as before
40. Three Variables
[Figure: edge (c,b) reparameterized]
Reparameterize the edge (c,b) as before
θ'b;i = qb;i
41. Three Variables
[Figure: same chain]
Reparameterize the edge (b,a) as before
42. Three Variables
[Figure: edge (b,a) reparameterized]
Reparameterize the edge (b,a) as before
θ'a;i = qa;i
43. Three Variables
[Figure: fully reparameterized chain]
Forward Pass + Backward Pass
All min-marginals are computed
44. Belief Propagation on Chains
Start from left, go to right
Reparameterize current edge (a,b):
Mab;k = min_i { θa;i + θab;ik }
θ'b;k = θb;k + Mab;k    θ'ab;ik = θab;ik - Mab;k
Repeat till the end of the chain
Start from right, go to left
Repeat till the start of the chain
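The forward pass above, with backtracking, recovers the MAP value and labeling on a chain. A minimal sketch, not the tutorial's code: the function name `chain_map` and the list-based potential layout are assumptions, and the example reuses the two-variable numbers from the earlier slides.

```python
# Sketch: min-sum belief propagation (forward pass + backtracking) on a chain.
# unary[a][i] = theta_a;i; pairwise[a][i][k] = theta on edge (a, a+1).
def chain_map(unary, pairwise):
    n, L = len(unary), len(unary[0])
    msg = [[0.0] * L for _ in range(n)]    # msg[a][k]: message into node a
    back = [[0] * L for _ in range(n)]     # arg-min bookkeeping
    for a in range(1, n):
        for k in range(L):
            cands = [unary[a - 1][i] + msg[a - 1][i] + pairwise[a - 1][i][k]
                     for i in range(L)]
            back[a][k] = min(range(L), key=lambda i: cands[i])
            msg[a][k] = cands[back[a][k]]
    # MAP value at the last node, then backtrack for the labeling
    last = min(range(L), key=lambda k: unary[-1][k] + msg[-1][k])
    value = unary[-1][last] + msg[-1][last]
    labels = [last]
    for a in range(n - 1, 0, -1):
        labels.append(back[a][labels[-1]])
    return value, labels[::-1]

# Two-variable example from the earlier slides:
print(chain_map([[5, 2], [2, 4]], [[[0, 1], [1, 0]]]))  # (5, [1, 0])
```

Note this sketch only runs the forward pass; the backward pass of the slide would be needed to obtain all min-marginals rather than just the MAP labeling.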
45. Belief Propagation on Chains
• Generalizes to chains of any length
• A way of computing reparameterization constants
• Forward Pass - Start to End
  – MAP estimate
  – Min-marginals of final variable
• Backward Pass - End to Start
  – All other min-marginals
Won't need this... but good to know
46. Computational Complexity
• Each constant takes O(|L|) time
• Number of constants: O(|E||L|)
• Total time: O(|E||L|2)
• Memory required: O(|E||L|)
47. Belief Propagation on Trees
[Figure: tree rooted at Va, with children Vb, Vc and leaves Vd, Ve, Vg, Vh]
Forward Pass: Leaf → Root
Backward Pass: Root → Leaf
All min-marginals are computed
48. Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
49. Belief Propagation on Cycles
[Figure: cycle Va–Vb–Vc–Vd with unary potentials θa;i, θb;i, θc;i, θd;i]
Where do we start? Arbitrarily
Reparameterize (a,b)
50. Belief Propagation on Cycles
[Figure: edge (a,b) reparameterized, giving θ'b;i]
Potentials along the red path add up to 0
51. Belief Propagation on Cycles
[Figure: edge (b,c) also reparameterized, giving θ'c;i]
Potentials along the red path add up to 0
52. Belief Propagation on Cycles
[Figure: edge (c,d) also reparameterized, giving θ'd;i]
Potentials along the red path add up to 0
53. Belief Propagation on Cycles
[Figure: edge (d,a) reparameterized; Va is marked with θ'a;i - θa;i]
Did not obtain min-marginals
Potentials along the red path add up to 0
54. Belief Propagation on Cycles
[Figure: same cycle]
Reparameterize (a,b) again
55. Belief Propagation on Cycles
[Figure: edge (a,b) reparameterized a second time, giving θ''b;i]
Reparameterize (a,b) again
But doesn't this overcount some potentials?
56. Belief Propagation on Cycles
[Figure: same cycle]
Reparameterize (a,b) again
Yes. But we will do it anyway
57. Belief Propagation on Cycles
[Figure: same cycle]
Keep reparameterizing edges in some order
No convergence guarantees
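The "keep reparameterizing edges in some order" recipe is loopy min-sum BP. A sketch, not the tutorial's code: the function name, dictionary-based graph layout, normalization step, and fixed iteration count are all assumptions of this sketch, and (as the slide warns) nothing guarantees convergence in general.

```python
# Sketch: loopy min-sum belief propagation on an arbitrary graph.
# unary: {node: [costs]}; pairwise: {(a, b): cost matrix}, one key per edge.
def loopy_min_sum(unary, pairwise, iters=30):
    L = len(next(iter(unary.values())))
    dir_edges = [d for e in pairwise for d in (e, e[::-1])]
    msg = {e: [0.0] * L for e in dir_edges}

    def theta(a, b, i, k):   # pairwise potential, whichever orientation stored
        return pairwise[(a, b)][i][k] if (a, b) in pairwise else pairwise[(b, a)][k][i]

    for _ in range(iters):
        for (a, b) in dir_edges:          # sweep edges in a fixed order
            def incoming(i):              # messages into a, except from b
                return sum(msg[(c, d)][i] for (c, d) in dir_edges
                           if d == a and c != b)
            new = [min(unary[a][i] + theta(a, b, i, k) + incoming(i)
                       for i in range(L)) for k in range(L)]
            shift = min(new)              # normalize to keep values bounded
            msg[(a, b)] = [v - shift for v in new]

    beliefs = {a: [unary[a][i] + sum(msg[(c, d)][i] for (c, d) in dir_edges
                                     if d == a) for i in range(L)]
               for a in unary}
    return {a: min(range(L), key=lambda i: b[i]) for a, b in beliefs.items()}

# Illustrative 3-cycle with agreement-favoring (Potts) potentials:
potts = [[0, 1], [1, 0]]
print(loopy_min_sum({0: [0, 2], 1: [0, 2], 2: [0, 2]},
                    {(0, 1): potts, (0, 2): potts, (1, 2): potts}))
# {0: 0, 1: 0, 2: 0}
```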
58. Belief Propagation
• Generalizes to any arbitrary random field
• Complexity per iteration ?
O(|E||L|2)
• Memory required ?
O(|E||L|)
59. Outline
• Reparameterization
• Belief Propagation
– Exact MAP for Chains and Trees
– Approximate MAP for general graphs
– Computational Issues and Theoretical Properties
• Tree-reweighted Message Passing
60. Computational Issues of BP
Complexity per iteration: O(|E||L|2)
Special pairwise potentials: θab;ik = wab d(|i-k|)
[Figure: plots of d against (i-k) for the Potts, Truncated Linear and Truncated Quadratic models]
O(|E||L|) (Felzenszwalb & Huttenlocher, 2004)
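For the Potts model the speed-up is easy to see: a node either agrees with its neighbor (cost 0) or pays w regardless of which other label it takes, so the message is min(h_k, min_i h_i + w). A sketch under that observation; `potts_message` and the example values are illustrative, and the truncated linear/quadratic cases need the distance-transform machinery of Felzenszwalb & Huttenlocher rather than this one-liner.

```python
# Sketch: O(|L|) message computation for Potts potentials,
# instead of the naive O(|L|^2) double loop.
# h[i] = theta_a;i (plus any incoming messages); w = disagreement cost.
def potts_message(h, w):
    floor = min(h) + w              # cost of disagreeing, whatever a's label
    return [min(hk, floor) for hk in h]

print(potts_message([5, 2, 7], 1.0))   # [3.0, 2, 3.0]
```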
61. Computational Issues of BP
Memory requirements O(|E||L|)
Half of original BP Kolmogorov, 2006
Some approximations exist
Yu, Lin, Super and Tan, 2007
Lasserre, Kannan and Winn, 2007
But memory still remains an issue
62. Computational Issues of BP
Order of reparameterization:
• Randomly
• In some fixed order
• The one that results in maximum change (Residual Belief Propagation; Elidan et al., 2006)
63. Summary of BP
Exact for chains
Exact for trees
Approximate MAP for general cases
Convergence not guaranteed
So can we do something better?
64. Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical
Properties
80. Linear Programming Relaxation
min θᵀy
ya;i ∈ [0,1]
∑i ya;i = 1
∑k yab;ik = ya;i
No reason why we can't solve this*
*memory requirements, time complexity
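For a tiny instance the relaxation can be handed to an off-the-shelf LP solver. A sketch using `scipy.optimize.linprog` on the earlier two-variable example; the variable ordering y = [ya;0, ya;1, yb;0, yb;1, yab;00, yab;01, yab;10, yab;11] is an assumption of this sketch. Since a single edge is a tree, the relaxation is tight here and the LP value equals the MAP value.

```python
# Sketch: LP relaxation of MAP for the two-variable example, via scipy.
from scipy.optimize import linprog

c = [5, 2, 2, 4, 0, 1, 1, 0]            # theta^T in the assumed ordering
A_eq = [
    [1, 1, 0, 0, 0, 0, 0, 0],           # sum_i y_a;i = 1
    [0, 0, 1, 1, 0, 0, 0, 0],           # sum_k y_b;k = 1
    [1, 0, 0, 0, -1, -1, 0, 0],         # sum_k y_ab;0k = y_a;0
    [0, 1, 0, 0, 0, 0, -1, -1],         # sum_k y_ab;1k = y_a;1
    [0, 0, 1, 0, -1, 0, -1, 0],         # sum_i y_ab;i0 = y_b;0
    [0, 0, 0, 1, 0, -1, 0, -1],         # sum_i y_ab;i1 = y_b;1
]
b_eq = [1, 1, 0, 0, 0, 0]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.fun)   # 5.0: equals the MAP value, since this graph is a tree
```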
81. Dual of the LP Relaxation
Wainwright et al., 2001
min θᵀy, s.t. ya;i ∈ [0,1], ∑i ya;i = 1, ∑k yab;ik = ya;i
[Figure: 3×3 grid graph over Va, ..., Vi]
82. Dual of the LP Relaxation
Wainwright et al., 2001
[Figure: the grid decomposed into row trees θ1, θ2, θ3 and column trees θ4, θ5, θ6]
ρi ≥ 0
∑i ρi θi = θ
83. Dual of the LP Relaxation
Wainwright et al., 2001
[Figure: the same decomposition; each tree i has optimal value q*(θi)]
ρi ≥ 0
Dual of LP: max ∑i ρi q*(θi)
s.t. ∑i ρi θi = θ
84. Dual of the LP Relaxation
Wainwright et al., 2001
[Figure: the same decomposition]
ρi ≥ 0
Dual of LP: max ∑i ρi q*(θi)
s.t. ∑i ρi θi ≡ θ (a reparameterization of θ)
85. Dual of the LP Relaxation
Wainwright et al., 2001
max ∑i ρi q*(θi), s.t. ∑i ρi θi ≡ θ
I can easily compute q*(θi)
I can easily maintain the reparameterization constraint
So can I easily solve the dual?
86. Outline
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical
Properties
87. TRW Message Passing
Kolmogorov, 2006
[Figure: grid decomposed into row trees (θ1, θ2, θ3) and column trees (θ4, θ5, θ6)]
Pick a variable: Va
Objective: ∑i ρi q*(θi), s.t. ∑i ρi θi ≡ θ
88. TRW Message Passing
Kolmogorov, 2006
[Figure: the two trees containing Va, drawn as chains: Vc–Vb–Va with potentials θ1c;i, θ1b;i, θ1a;i, and Va–Vd–Vg with potentials θ4a;i, θ4d;i, θ4g;i]
Objective: ∑i ρi q*(θi), s.t. ∑i ρi θi ≡ θ
99. TRW Message Passing
Kolmogorov, 2006
[Figure: after running BP on both chains, Va holds its min-marginals θ''a;i in each tree]
min {p1+p2, q1+q2} ≥ min {p1, q1} + min {p2, q2}
(ρ1 + ρ4) min { θ''a;0, θ''a;1 } + K
ρ1 θ''1 + ρ4 θ''4 + rest
100. TRW Message Passing
Kolmogorov, 2006
[Figure: the min-marginals of Va node-averaged across the two trees]
Objective function increases or remains constant
(ρ1 + ρ4) min { θ''a;0, θ''a;1 } + K
ρ1 θ''1 + ρ4 θ''4 + rest
101. TRW Message Passing
Initialize θi, taking care of the reparameterization constraint
Choose a random variable Va
Compute the min-marginals of Va for all trees
Node-average the min-marginals
REPEAT (can also do edge-averaging)
Kolmogorov, 2006
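The node-averaging step of the algorithm above can be sketched in a few lines: each tree containing Va holds its own min-marginals for Va, and the update replaces every tree's copy with the ρ-weighted average. A sketch, not Kolmogorov's implementation; the function name, dictionary layout, and example values are illustrative assumptions.

```python
# Sketch: node-average the min-marginals of one variable Va across
# the trees that contain it.
def node_average(q, rho):
    """q: {tree: [min-marginals of Va]}, rho: {tree: weight rho_i > 0}."""
    L = len(next(iter(q.values())))
    total = sum(rho.values())
    avg = [sum(rho[t] * q[t][i] for t in q) / total for i in range(L)]
    return {t: avg[:] for t in q}       # every tree gets the averaged copy

# Illustrative: Va appears in trees 1 and 4 with equal weights.
q = {1: [5.0, 6.0], 4: [3.0, 2.0]}
print(node_average(q, {1: 0.5, 4: 0.5}))   # both trees get [4.0, 4.0]
```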
115. Obtaining the Labeling
Only solves the dual. Primal solutions?
[Figure: grid graph; fix the label of Va]
θ' = ∑i ρi θi
116. Obtaining the Labeling
Only solves the dual. Primal solutions?
[Figure: grid graph; fix the label of Vb]
θ' = ∑i ρi θi
Continue in some fixed order
Meltzer et al., 2006
117. Outline
• Problem Formulation
• Reparameterization
• Belief Propagation
• Tree-reweighted Message Passing
– Integer Programming Formulation
– Linear Programming Relaxation and its Dual
– Convergent Solution for Dual
– Computational Issues and Theoretical Properties
118. Computational Issues of TRW
Basic Component is Belief Propagation
• Speed-ups for some pairwise potentials
Felzenszwalb & Huttenlocher, 2004
• Memory requirements cut down by half
Kolmogorov, 2006
• Further speed-ups using monotonic chains
Kolmogorov, 2006
119. Theoretical Properties of TRW
• Always converges, unlike BP
Kolmogorov, 2006
• Strong tree agreement implies exact MAP
Wainwright et al., 2001
• Optimal MAP for two-label submodular problems
θab;00 + θab;11 ≤ θab;01 + θab;10
Kolmogorov and Wainwright, 2005
120. Summary
• Trees can be solved exactly - BP
• No guarantee of convergence otherwise - BP
• Strong Tree Agreement - TRW-S
• Submodular energies solved exactly - TRW-S
• TRW-S solves an LP relaxation of MAP estimation