# Efficient Identification of Improving Moves in a Ball for Pseudo-Boolean Problems

Paper presentation in GECCO 2014 research conference


**Slide 1/28 — Title** (GECCO 2014, Vancouver, Canada, July 14)
Efficient Identification of Improving Moves in a Ball for Pseudo-Boolean Problems
Francisco Chicano, Darrell Whitley, Andrew M. Sutton
Outline: Introduction · Background · Contribution · Experiments · Conclusions & Future Work
**Slide 2/28 — Solutions in a ball of radius r**
• Considering binary strings of length n and the Hamming distance: how many solutions lie at Hamming distance exactly r from a given solution? There are $\binom{n}{r}$ of them.
• A ball of radius r around a solution therefore contains $\sum_{i=1}^{r} \binom{n}{i}$ solutions.
• If $r \ll n$, this number is $\Theta(n^r)$.
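The ball-size count above can be sketched in a few lines of Python (the helper name `ball_size` is mine, not from the slides):

```python
from math import comb

def ball_size(n: int, r: int) -> int:
    """Number of solutions at Hamming distance 1..r from a binary string of length n."""
    return sum(comb(n, i) for i in range(1, r + 1))

# For the 17-bit strings used in a later slide:
print(ball_size(17, 1))  # 17
print(ball_size(17, 2))  # 153  (17 + 136)
```

For fixed r the leading term $\binom{n}{r} \approx n^r / r!$ dominates, which is where the $\Theta(n^r)$ estimate comes from.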
**Slide 3/28 — Improving moves in a ball of radius r**
• We want to find improving moves in a ball of radius r around a solution x.
• What is the computational cost of this exploration? By complete enumeration it is $O(n^r)$, assuming each fitness evaluation is $O(1)$.
• Our contribution in this work: a way to find improving moves in a ball of radius r in $O(1)$ time (constant, independent of n).
**Slide 4/28 — Previous work**
• Whitley and Chen proposed an $O(1)$ approximate steepest descent for MAX-kSAT and NK-landscapes based on the Walsh decomposition; for k-bounded pseudo-Boolean functions its complexity is $O(k^2 2^k)$.
• Chen, Whitley, Hains and Howe reduced the time required to identify improving moves to $O(k^3)$ using partial derivatives.
• Szeider proved that the exploration of a ball of radius r in MAX-kSAT and kSAT can be done in $O(n)$ if each variable appears in a bounded number of clauses.
• Our result can be obtained by Walsh analysis or by partial derivatives, but neither is used here.

References:
D. Whitley and W. Chen. Constant time steepest descent local search with lookahead for NK-landscapes and MAX-kSAT. GECCO 2012: 1357–1364.
W. Chen, D. Whitley, D. Hains, and A. Howe. Second order partial derivatives for NK-landscapes. GECCO 2013: 503–510.
S. Szeider. The parameterized complexity of k-flip local search for SAT and MAX SAT. Discrete Optimization, 8(1):139–145, 2011.
**Slide 5/28 — k-bounded pseudo-Boolean functions**
• Definition: $f(x) = \sum_{l=1}^{m} f^{(l)}(x)$, where each subfunction $f^{(l)}$ depends on at most k variables (k-bounded epistasis).
• We will also assume that each variable is an argument of at most c subfunctions, and that the number of subfunctions is linear in n, i.e., $m \in O(n)$.
• Example (m = 4, n = 4, k = 2): $f = f^{(1)}(x) + f^{(2)}(x) + f^{(3)}(x) + f^{(4)}(x)$ over the variables $x_1, \ldots, x_4$.
• Is this family of functions too small? Is it interesting?
• MAX-kSAT is a k-bounded pseudo-Boolean optimization problem.
• NK-landscapes are (K+1)-bounded pseudo-Boolean optimization problems.
• Any compressible pseudo-Boolean function can be reduced to a quadratic pseudo-Boolean function (e.g., Rosenberg, 1975).
• k-bounded pseudo-Boolean optimization problems have also been described as embedded landscapes: functions $f(x)$ that can be written as a sum of m subfunctions, each depending on at most k input variables. Embedded landscapes generalize NK-landscapes and MAX-kSAT. For NK-landscapes m = n, and $m \in O(n)$ is a common assumption for MAX-kSAT.
**Slide 6/28 — Scores: definition**
• Represent a potential move from the current solution as a binary vector v with 1s in the positions that should be flipped.
• The score of move v for solution x is the fitness difference between the neighboring and the current solution: $S_v(x) = f(x \oplus v) - f(x)$ (we omit f in the notation $S_v(x)$ for simplicity).
• Scores are useful to identify improving moves: if $S_v(x) > 0$, v is an improving move.
• We keep all the scores in the score vector.

Examples:

| Current solution x | Move v | Neighboring solution x ⊕ v |
|---|---|---|
| 01110101010101001 | 00001110000000000 | 01111011010101001 |
| 01110101010101001 | 01000000100000110 | 00110101110101111 |
| 01110101010101001 | 00110000000000000 | 01000101010101001 |
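A minimal sketch of the score definition, computed by direct evaluation (helper names `flip` and `score` are mine; ONEMAX is a toy fitness, not from the slides):

```python
def flip(x, v):
    """Apply move v (a set of bit positions) to solution x (a tuple of 0/1 bits)."""
    return tuple(b ^ (1 if i in v else 0) for i, b in enumerate(x))

def score(f, x, v):
    """S_v(x) = f(x XOR v) - f(x); a positive value means v is an improving move."""
    return f(flip(x, v)) - f(x)

onemax = lambda x: sum(x)          # toy fitness: number of ones
x = (0, 1, 0, 1)
print(score(onemax, x, {0}))       # 1: flipping a 0 to 1 improves
print(score(onemax, x, {1, 3}))    # -2: flipping two 1s worsens
```

Computed this way each score costs a full evaluation; the point of the paper is to avoid exactly that.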
**Slide 7/28 — Scores update: main idea**
• The key idea of our proposal is to compute the scores from scratch once, at the beginning, and then update their values as the solution moves (which is less expensive): after a selected improving move is applied, the score vector is updated.
**Slide 8/28 — Key facts for efficient score updates**
• How can we make the updates less expensive? There are still $O(n^r)$ scores to maintain! Two key facts help:
• We don't need all $O(n^r)$ scores to know whether an improving move exists.
• Of the scores we do need, only a constant number must be updated after a move, and each update can be done in constant time.
**Slide 9/28 — Decomposition of scores (examples: moves 1 and 4)**
• Any score decomposes over the subfunctions: $S_v(x) = f(x \oplus v) - f(x) = \sum_{l=1}^{m} \left( f^{(l)}(x \oplus v) - f^{(l)}(x) \right) = \sum_{l=1}^{m} S_v^{(l)}(x)$.
• In particular, for the single-bit moves: $S_1(x) = f(x \oplus 1) - f(x)$ and $S_4(x) = f(x \oplus 4) - f(x)$.
**Slide 10/28 — Decomposition of scores (examples: moves 1 and 4, continued)**
• Only the subfunctions that depend on a flipped variable contribute to a score; the remaining terms of the sum cancel. In the example, $S_1(x)$ involves only $f^{(1)}$ (the subfunction on $x_1, x_2$), while $S_4(x)$ involves only $f^{(3)}$ and $f^{(4)}$.
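The cancellation argument can be sketched directly: summing only over subfunctions that touch a flipped variable gives the exact score. The concrete subfunctions below, and which variables each one touches, are my own guesses chosen to be consistent with the slides' m = 4, n = 4, k = 2 example (0-indexed), not taken from the paper:

```python
# Assumed subfunction signatures consistent with the slides' example:
sub_vars = [{0, 1}, {1, 2}, {1, 3}, {2, 3}]
subfuns = [
    lambda x: x[0] * x[1],      # f(1)(x1, x2)
    lambda x: x[1] + x[2],      # f(2)(x2, x3)
    lambda x: x[1] ^ x[3],      # f(3)(x2, x4)
    lambda x: x[2] * x[3],      # f(4)(x3, x4)
]
f = lambda x: sum(g(x) for g in subfuns)

def score_sparse(x, v):
    """S_v(x) using only the subfunctions that depend on a flipped
    variable; all other terms of the sum cancel."""
    y = tuple(b ^ (1 if i in v else 0) for i, b in enumerate(x))
    return sum(g(y) - g(x) for g, vs in zip(subfuns, sub_vars) if vs & v)

x, v = (1, 0, 1, 0), {0, 3}     # the slides' move {1, 4}, 0-indexed
y = tuple(b ^ (1 if i in v else 0) for i, b in enumerate(x))
print(score_sparse(x, v) == f(y) - f(x))  # True
```

Only three of the four subfunctions are evaluated for this move, and with k-bounded, c-bounded functions that count stays constant as n grows.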
**Slide 11/28 — Example: move {1,4}**
• The score of the two-bit move is $S_{1,4}(x) = f(x \oplus \{1,4\}) - f(x)$.
**Slide 12/28 — Example: move {1,4} (continued)**
• Since $x_1$ and $x_4$ share no subfunction, the score decomposes: $S_{1,4}(x) = S_1(x) + S_4(x)$.
• We don't need to store $S_{1,4}(x)$, since it can be computed from the other scores.
• If neither 1 nor 4 is an improving move, then {1,4} cannot be an improving move.
**Slide 13/28 — Example: move {1,2}**
• $S_1(x) = f^{(1)}(x \oplus 1) - f^{(1)}(x)$
• $S_2(x) = f^{(1)}(x \oplus 2) - f^{(1)}(x) + f^{(2)}(x \oplus 2) - f^{(2)}(x) + f^{(3)}(x \oplus 2) - f^{(3)}(x)$
• $S_{1,2}(x) = f^{(1)}(x \oplus \{1,2\}) - f^{(1)}(x) + f^{(2)}(x \oplus \{1,2\}) - f^{(2)}(x) + f^{(3)}(x \oplus \{1,2\}) - f^{(3)}(x)$
• $S_{1,2}(x) \neq S_1(x) + S_2(x)$, because $x_1$ and $x_2$ "interact": they are arguments of a common subfunction.
**Slide 14/28 — Decomposition rule for scores**
• When can we decompose a score as a sum of lower-order scores? When the variables in the move can be partitioned into subsets of variables that DON'T interact.
• Let us define the Variable Interaction Graph (VIG): there is an edge between two variables if there exists a subfunction that depends on both of them (they "interact").
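Building the VIG from the subfunction signatures is straightforward; a sketch (the 4-variable example passed in is one assumed assignment consistent with the slides, 0-indexed):

```python
from itertools import combinations

def build_vig(sub_vars):
    """Variable Interaction Graph as adjacency sets: an edge (i, j)
    whenever some subfunction depends on both x_i and x_j."""
    vig = {}
    for vs in sub_vars:
        for i in vs:
            vig.setdefault(i, set())
        for i, j in combinations(vs, 2):
            vig[i].add(j)
            vig[j].add(i)
    return vig

vig = build_vig([{0, 1}, {1, 2}, {1, 3}, {2, 3}])
print(vig[0])       # {1}: x1 interacts only with x2
print(3 in vig[0])  # False: x1 and x4 do not interact
```

With k-bounded subfunctions and each variable in at most c of them, every vertex of this graph has bounded degree, which is what the later complexity bounds rely on.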
**Slide 15/28 — Scores to store**
• In terms of the VIG, a score can be decomposed if the subgraph induced by the variables in the move is NOT connected.
• Therefore we only need to store the scores of moves whose variables form a connected subgraph of the VIG.
• The number of these scores (up to radius r) is $O((3kc)^r n)$; details of the proof are in the paper.
• With a linear amount of information we can explore a ball of radius r containing $O(n^r)$ solutions.
**Slide 16/28 — Scores to update**
• Let us assume that $x_4$ is flipped. Which scores do we need to update? Those that need to evaluate $f^{(3)}$ or $f^{(4)}$, the subfunctions that depend on $x_4$.
**Slide 18/28 — Scores to update (continued)**
• Let us assume that $x_4$ is flipped. The scores to update are those of moves containing variables adjacent or equal to $x_4$ in the VIG.
**Slide 19/28 — Scores to update and time required**
• The number of neighbors of a variable in the VIG is bounded by $kc$.
• The number of stored scores in which a variable appears is the number of spanning trees of size at most r with that variable at the root; this number is a constant (independent of n).
• Updating each score implies evaluating a constant number of subfunctions, each depending on at most k variables, so it requires constant time.
• The total update cost after a move v is $O(b(k)\,(3kc)^r\,|v|)$, where $b(k)$ is a bound on the time to evaluate any subfunction.
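Selecting the affected scores after a bit flip can be sketched as a filter over the stored moves (helper name and example data are mine, using the same assumed VIG as before):

```python
def scores_to_update(stored_moves, vig, flipped):
    """After flipping one bit, only stored moves containing the flipped
    variable or one of its VIG neighbours need their score recomputed."""
    affected = {flipped} | vig.get(flipped, set())
    return [m for m in stored_moves if m & affected]

vig = {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}}
stored = [frozenset(s) for s in ({0}, {1}, {2}, {3}, {0, 1}, {2, 3})]
# Flip x4 (index 3): every stored move except {x1} is affected.
print(scores_to_update(stored, vig, 3))
```

Because the VIG has bounded degree and each variable appears in a bounded number of stored moves, the returned list has constant size regardless of n.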
**Slide 20/28 — NKq-landscapes: definition**
• An NK-landscape is a pseudo-Boolean optimization problem with objective function $f(x) = \sum_{l=1}^{N} f^{(l)}(x)$, where each subfunction $f^{(l)}$ depends on variable $x_l$ and K other variables.
• In the random model the K other variables are chosen at random; in the adjacent model they are the consecutive variables. There is a polynomial-time algorithm to solve the adjacent model (Wright et al., 2000).
• The subfunctions are randomly generated, with values taken in the range [0,1].
• In NKq-landscapes the subfunctions take integer values in the range [0, q−1]. We use NKq-landscapes rather than NK-landscapes in the experiments to avoid floating-point precision issues.
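An instance generator for the model just described can be sketched as follows. This is my own minimal construction from the definition (lookup tables of uniform integers in {0, …, q−1}), not the generator used in the paper:

```python
import random

def random_nkq(n, K, q, adjacent=True, seed=0):
    """Generate an NKq-landscape: n subfunctions, f(l) depending on x_l and
    K other variables (consecutive ones in the adjacent model), each stored
    as a lookup table of integers drawn uniformly from {0, ..., q-1}."""
    rng = random.Random(seed)
    masks, tables = [], []
    for l in range(n):
        if adjacent:
            vs = [(l + j) % n for j in range(K + 1)]
        else:
            vs = [l] + rng.sample([i for i in range(n) if i != l], K)
        masks.append(vs)
        tables.append([rng.randrange(q) for _ in range(2 ** (K + 1))])

    def f(x):
        total = 0
        for vs, table in zip(masks, tables):
            idx = 0
            for i in vs:                 # pack the relevant bits into a table index
                idx = (idx << 1) | x[i]
            total += table[idx]
        return total

    return f

f = random_nkq(6, K=2, q=8)
print(f((0,) * 6))  # an integer in [0, 6 * 7]
```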
**Slide 21/28 — Results: checking the constant time**
• Sanity check: flip every variable the same number of times (120,000) and measure the time and memory required by the score updates.
• NKq-landscapes, adjacent model; N = 1,000 to 12,000; K = 1 to 4; q = 2K+1; r = 1 to 4; 30 instances per configuration.
• (Figure: wall-clock time of the score updates vs. N for K = 3 and r = 1 to 4.)
**Slide 22/28 — Results: checking the constant time (continued)**
• Same sanity check, same settings (NKq-landscapes, adjacent model; N = 1,000 to 12,000; K = 1 to 4; q = 2K+1; r = 1 to 4; 30 instances per configuration).
• (Figure: number of scores stored in memory vs. N for K = 3 and r = 1 to 4.)
**Slide 23/28 — Results: checking the time in the random model**
• Random model: the number of subfunctions in which a variable appears, c, is not bounded by a constant.
• NKq-landscapes, random model; N = 1,000 to 12,000; K = 1 to 4; q = 2K+1; r = 1 to 4; 30 instances per configuration.
• (Figure: wall-clock time of the score updates vs. N for K = 3 and r = 1 to 3.)
**Slide 24/28 — Results: checking the time in the random model (continued)**
• Same settings as the previous slide (NKq-landscapes, random model).
• (Figure: number of scores stored in memory vs. N for K = 3 and r = 1 to 3.)
**Slide 25/28 — Next improvement algorithm**
• Nearest moves are selected first (e.g., all r = 1 moves before r = 2 moves).
• While an improving move exists ($S_v > 0$ for some $v \in M_r$), the algorithm selects one of the improving moves t (line 6), updates the scores using Algorithm 1 (line 7), and replaces the current solution with the new one (line 8).

Algorithm 3: Hamming-ball next ascent
```
 1: best ← ⊥
 2: while stop condition not met do
 3:   x ← randomSolution()
 4:   S ← computeScores(x)
 5:   while S_v > 0 for some v ∈ M_r do
 6:     t ← selectImprovingMove(S)
 7:     updateScores(S, x, t)
 8:     x ← x ⊕ t
 9:   end while
10:   if best = ⊥ or f(x) > f(best) then
11:     best ← x
12:   end if
13: end while
```

• Regarding the selection of the improving move, our approach in the experiments was to always select the one with the lowest Hamming distance to the current solution.
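Algorithm 3 can be sketched in Python. Note one deliberate simplification: for clarity this version recomputes scores by brute force over all moves of size ≤ r (so each step costs $O(n^r)$), instead of the paper's constant-time incremental score update:

```python
import random
from itertools import combinations

def next_ascent(f, n, r, restarts=3, seed=0):
    """Sketch of Algorithm 3 (Hamming-ball next ascent); scores are
    recomputed by brute force here, not incrementally as in the paper."""
    rng = random.Random(seed)
    moves = [set(c) for k in range(1, r + 1)        # nearest moves first:
             for c in combinations(range(n), k)]    # generated in order of |v|
    best = None
    for _ in range(restarts):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        improved = True
        while improved:
            improved = False
            for v in moves:                         # take the first improving move
                y = tuple(b ^ (1 if i in v else 0) for i, b in enumerate(x))
                if f(y) > f(x):
                    x, improved = y, True
                    break
        if best is None or f(x) > f(best):
            best = x
    return best

# ONEMAX has no local optima besides the global one, so r = 1 suffices:
print(next_ascent(lambda x: sum(x), n=8, r=1))  # (1, 1, 1, 1, 1, 1, 1, 1)
```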
**Slide 26/28 — Results for next improvement**
• The normalized distance to the optimum is $nd(x) = \frac{f^* - f(x)}{f^*}$ (Eq. 11), where $f^*$ is the fitness value of the global optimum, computed using the algorithm by Wright et al. [10].
• (Figure 7: normalized distance to the global optimum for the Hamming-ball next ascent.)
• NKq-landscapes, adjacent model; N = 10,000; K = 1; q = 2K+1; r = 1 to 10; 30 instances.
• From r = 6 to r = 10 the global optimum is always found; with r = 10 it is always found in the first descent; with r = 7 it is always found in less than 2.1 s.
**Slide 27/28 — Conclusions and Future Work**
Conclusions:
• We can identify improving moves in a ball of radius r around a solution in constant time (independent of n).
• The space required to store the information (the scores) is linear in the problem size n.
• This information can be used to design efficient search algorithms.
Future Work:
• Random restarts are costly: study the applicability of soft restarts.
• Apply the method to other pseudo-Boolean problems such as MAX-kSAT.
• Include clever strategies to escape from local optima.
**Slide 28/28 — Acknowledgements**
Efficient Identification of Improving Moves in a Ball for Pseudo-Boolean Problems