GECCO 2007 Presentation Transcript

    • Estimation of Fitness Landscape Contour Lines in Evolutionary Algorithms
      Petr Pošík, Czech Technical University in Prague, Dept. of Cybernetics (posik@labe.felk.cvut.cz)
      Vojtěch Franc, Fraunhofer-FIRST.IDA (fravoj@first.fraunhofer.de)
      July 11, 2007
    • Introduction
      Outline: Motivation · Comparison of Algorithms · The Algorithm · Experimental Evaluation · Conclusions
    • Motivation
      In many EAs, the non-selected individuals have no effect on the evolution, even though they carry valuable information about the local shape of the objective function, and/or about the search-space areas where the search should be suppressed.
    • Comparison of Algorithms
      - An EGNA-like algorithm estimates the distribution of the selected individuals.
      - A CMA-ES-like algorithm estimates the distribution of the selected mutation steps.
      - The suggested algorithm estimates the contour line between the selected and discarded individuals.
      [Figure: 2D illustration of the three approaches on a sample population.]
    • Outline
      This work:
      - introduces a novel idea of using population members,
      - constructs a simple real-valued algorithm using this idea, with the aim of minimizing the number of fitness evaluations,
      - compares it against CMA-ES on selected problems, and
      - suggests ways of augmenting the algorithm for use with general optimization problems.
    • The Algorithm
      EDA vs. Contour Line Estimation · Learning the Classifier · Augmented Perceptron Algorithm · Turning the Classifier into a Gaussian · Notable Related Algorithms
    • EDA vs. Contour Line Estimation
      High-level description:
      Algorithm 1: EDA
      1 begin
      2   Initialize and evaluate the population of size N.
      3   while termination criteria are not met do
      4     Select parents from the population.
      5     Learn a probabilistic model describing their distribution.
      6     Sample new individuals.
      7     Replace the worst individuals.
      8 end
    • EDA vs. Contour Line Estimation (cont.)
      Algorithm 2: Contour Line Estimation
      1 begin
      2   Initialize and evaluate the population of size N.
      3   while termination criteria are not met do
      4     Divide the population into better and worse individuals.
      5     Learn a classifier distinguishing between them.
      6     Turn the description of the better individuals into a probability distribution.
      7     Sample new individuals.
      8     Replace the worst individuals.
      9 end
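      A minimal executable sketch of Algorithm 2 in Python, as one possible reading of the slide rather than the authors' code: the helpers learn_ellipsoid and ellipsoid_to_gaussian are hypothetical names (sketched after the corresponding slides below), the 50% truncation split is an assumption, and the recentering/step-size details from the "Algorithm Settings" slide are omitted.

          import numpy as np

          def cle_minimize(f, dim, pop_size, n_iters, rng=None):
              """Sketch of Algorithm 2 (Contour Line Estimation), minimization."""
              if rng is None:
                  rng = np.random.default_rng()
              pop = rng.uniform(-10.0, -5.0, size=(pop_size, dim))  # init range from the setup slide
              fit = np.apply_along_axis(f, 1, pop)
              for _ in range(n_iters):
                  order = np.argsort(fit)                     # truncation selection (50% assumed)
                  better = pop[order[:pop_size // 2]]
                  worse = pop[order[pop_size // 2:]]
                  A, B, C = learn_ellipsoid(better, worse)    # hypothetical helper, see below
                  mu, sigma = ellipsoid_to_gaussian(A, B, C)  # hypothetical helper, Eqs. (5)-(6)
                  children = rng.multivariate_normal(mu, sigma, size=pop_size)
                  child_fit = np.apply_along_axis(f, 1, children)
                  pool = np.vstack([pop, children])           # replace the worst individuals
                  pool_fit = np.concatenate([fit, child_fit])
                  keep = np.argsort(pool_fit)[:pop_size]
                  pop, fit = pool[keep], pool_fit[keep]
              return pop[np.argmin(fit)], float(fit.min())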
    • Learning the Classifier
      Assuming a minimization problem, we learn an elliptic discrimination function of the form

        \[ x^\top A x + B^\top x + C \;\begin{cases} > 0 & \text{for non-selected individuals,} \\ < 0 & \text{for selected individuals.} \end{cases} \tag{1} \]

      This is done by mapping the points x to their quadratic images z,

        \[ z = \mathrm{qmap}(x) = \left( x_1^2,\ 2x_1x_2,\ \dots,\ 2x_1x_D,\ x_2^2,\ \dots,\ 2x_2x_D,\ \dots,\ x_D^2,\ x_1,\ \dots,\ x_D,\ 1 \right), \tag{2} \]

      and by introducing a weight vector w,

        \[ w = \left( a_{11},\ a_{12},\ \dots,\ a_{1D},\ a_{22},\ \dots,\ a_{2D},\ \dots,\ a_{DD},\ b_1,\ \dots,\ b_D,\ c \right), \tag{3} \]

      so that we can write the quadratic discrimination function as a linear one,

        \[ x^\top A x + B^\top x + C = w^\top z, \tag{4} \]

      and learn it using e.g. an augmented perceptron algorithm (as is done in our paper).
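      The feature map of Eq. (2) is mechanical to implement; a short sketch follows (the name qmap comes from the slide, the implementation is ours):

          import numpy as np

          def qmap(x):
              """Quadratic image of Eq. (2): squares x_i^2, cross terms 2*x_i*x_j (j > i),
              linear terms x_i, and a trailing 1, ordered to pair with w of Eq. (3)."""
              x = np.asarray(x, dtype=float)
              D = len(x)
              feats = []
              for i in range(D):
                  feats.append(x[i] * x[i])              # x_i^2, pairs with a_ii
                  feats.extend(2.0 * x[i] * x[i + 1:])   # 2 x_i x_j, pairs with a_ij
              feats.extend(x)                            # linear part, pairs with b_i
              feats.append(1.0)                          # constant, pairs with c
              return np.array(feats)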
    • Augmented Perceptron Algorithm
      Perceptron:
      - learns a linear decision boundary (if one exists; otherwise it does not stop),
      - in our case, learns a general quadratic decision function,
      - needs to be augmented to learn an elliptic decision function (A must be positive definite, A > 0).
      Augmented Perceptron (see the sketch below):
      - stops after all data points are correctly classified AND all eigenvalues of A are positive,
      - if a negative eigenvalue is found, the corresponding eigenvector is used in the perceptron learning rule as a wrongly classified data point, to adapt A.
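      One way to realize the augmented perceptron in Python, using qmap from above; learn_ellipsoid, unpack and quad_image are our hypothetical names, and the update schedule is a plausible reconstruction of the slide's description, not the paper's exact procedure:

          import numpy as np

          def unpack(w, D):
              """Recover symmetric A, vector B and scalar C from the weight vector of Eq. (3)."""
              A = np.zeros((D, D)); k = 0
              for i in range(D):
                  for j in range(i, D):
                      A[i, j] = A[j, i] = w[k]; k += 1
              return A, w[k:k + D], w[k + D]

          def quad_image(v):
              """qmap(v) with the linear and constant features zeroed, so that
              w . quad_image(v) = v^T A v."""
              z = qmap(v)
              z[-(len(v) + 1):] = 0.0
              return z

          def learn_ellipsoid(better, worse, max_iter=100_000):
              """Find w = (A, B, C) with w.z < 0 on `better`, w.z > 0 on `worse`,
              and A positive definite (elliptic boundary)."""
              D = better.shape[1]
              Z = np.array([qmap(x) for x in np.vstack([better, worse])])
              t = np.r_[-np.ones(len(better)), np.ones(len(worse))]  # required sign of w.z
              w = np.zeros(Z.shape[1])
              for _ in range(max_iter):
                  wrong = np.where(t * (Z @ w) <= 0)[0]
                  if len(wrong) > 0:
                      w += t[wrong[0]] * Z[wrong[0]]        # classic perceptron update
                      continue
                  A, B, C = unpack(w, D)
                  vals, vecs = np.linalg.eigh(A)
                  if vals.min() > 0:                        # A > 0: elliptic, done
                      return A, B, C
                  v = vecs[:, np.argmin(vals)]              # offending eigenvector is fed back
                  w += quad_image(v)                        # as a misclassified point, pushing v^T A v up
              raise RuntimeError("no separating ellipsoid found within the budget")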
    • Turning the Classifier into a Gaussian
      Many quadratic functions share the same decision boundary → normalization, so that \( \min_x \left( x^\top A x + B^\top x + C \right) = -1 \) (assuming A > 0).

      [Figure: 1D illustration of the normalized quadratic function and the induced Gaussian, with σ marked.]

      Having the parameters of the quadratic discrimination function in the matrices A, B, and C, we get the Gaussian search distribution N(µ, Σ) by setting

        \[ \mu = -\tfrac{1}{2} A^{-1} B, \tag{5} \]
        \[ \Sigma = A^{-1}. \tag{6} \]
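      A sketch of Eqs. (5)-(6) including the normalization step, under our reading of the slide (the function name is hypothetical):

          import numpy as np

          def ellipsoid_to_gaussian(A, B, C):
              """Turn the quadratic discriminant q(x) = x^T A x + B^T x + C (A > 0)
              into N(mu, Sigma) per Eqs. (5)-(6), after rescaling so min_x q(x) = -1."""
              mu = -0.5 * np.linalg.solve(A, B)             # Eq. (5): minimizer of q
              q_min = C - 0.25 * B @ np.linalg.solve(A, B)  # value of q at mu; must be < 0
              A_norm = A * (-1.0 / q_min)                   # rescale so the minimum equals -1
              sigma = np.linalg.inv(A_norm)                 # Eq. (6): Sigma = A^{-1}
              return mu, sigma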
    • Notable Related Algorithms
      LEM (Learnable Evolution Model) by Wojtusiak and Michalski [2]:
      - offers several alternative ways of creating offspring (including a GA-like pipeline and a reinitialization pipeline), one of them being
      - a classifier distinguishing between good and bad individuals, and instantiation of the concept of good individuals.
      Features:
      - LEM uses AQ21 classification rules as the model. The rules divide the search space with axis-parallel splits, which is not very suitable for continuous spaces.
      - On the other hand, LEM can be applied to mixed continuous-discrete problems.
    • Notable Related Algorithms (cont.)
      LS-CMA-ES by Auger et al. [1]:
      - provides a way of learning the covariance matrix inside CMA-ES using a local quadratic model of the fitness function,
      - determines the parameters of the quadratic function by multivariate regression,
      - needs D(D+3)/2 evaluated points, i.e. points together with their fitness values.
    • Experimental Evaluation
      Algorithm Settings · Experimental Setup · Contour Line Estimation vs. LEM · Contour Line Estimation vs. CMA-ES · Population Sizing · When trained by SDP...
    • Algorithm Settings
      - The learned Gaussian is centered around the best-so-far solution (similarly to CMA-ES, we learn only the shape, not the position).
      - The global step size is set in such a way that the separating ellipsoid contains 99% of the generated offspring (see the sketch below).
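      The slide does not spell the step-size rule out. One plausible realization, an assumption on our part rather than the paper's formula, scales Σ so that the chosen quantile of the sampling distribution falls inside the unit-Mahalanobis ellipsoid of Σ:

          import numpy as np
          from scipy.stats import chi2

          def scale_for_coverage(sigma, dim, coverage=0.99):
              """If x ~ N(mu, s * Sigma), then (x - mu)^T Sigma^{-1} (x - mu) ~ s * chi2(dim),
              so s = 1 / chi2.ppf(coverage, dim) puts `coverage` of the offspring inside
              the separating ellipsoid {x : (x - mu)^T Sigma^{-1} (x - mu) <= 1}."""
              return sigma / chi2.ppf(coverage, df=dim)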
    • Experimental Setup
      - Truncation selection.
      - Initialization range: x ∈ [−10, −5]^D.
      - Each experiment performed 20 times, with different population sizes.
      - Perceptron → the selected and discarded individuals must be separable by an ellipsoid → restriction to quadratic fitness functions.
      - Fitness functions used (coded below):
        Sphere function: \( f_{\mathrm{sphere}} = \sum_{d=1}^{D} x_d^2 \)
        Elliptic function: \( f_{\mathrm{elli}} = \sum_{d=1}^{D} \left( 10^6 \right)^{\frac{d-1}{D-1}} x_d^2 \)
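      For completeness, the two benchmarks in Python (these are standard definitions; only the implementation is ours):

          import numpy as np

          def f_sphere(x):
              """Sphere benchmark: sum of squared coordinates."""
              x = np.asarray(x, dtype=float)
              return float(np.sum(x * x))

          def f_elli(x):
              """Elliptic benchmark: axis-wise conditioning grows from 1 to 10^6."""
              x = np.asarray(x, dtype=float)
              D = len(x)
              return float(np.sum(10.0 ** (6.0 * np.arange(D) / (D - 1)) * x * x))

      Together with the earlier sketches, a call such as cle_minimize(f_sphere, dim=4, pop_size=8, n_iters=100) would exercise the whole pipeline.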
    • Contour Line Estimation vs. LEM
      [Figure: average best-so-far fitness vs. number of evaluations for Perceptron and LEM; left: Sphere function, right: Ellipsoid function.]
    • Contour Line Estimation vs. CMA-ES
      [Figure: average best-so-far fitness vs. number of evaluations for Perceptron and CMA-ES; left: Sphere function, right: Ellipsoid function.]
    • Population Sizing
      No explicit model given. Optimal population sizes determined by experiments:

      Dimension                     2    4    6    8
      CMA-ES, N = 4 + ⌊3 ln(D)⌋     6    8    9   10
      Our method, Sphere            9    8    7    6
      Our method, Ellipsoidal      11   10    8    6

      Possible reason: [Figure: 2D illustration.]
    • When trained by SDP...
      Optimal population sizes when semidefinite programming is used instead of the augmented perceptron:

      Dimension                     2    4    6    8
      CMA-ES, N = 4 + ⌊3 ln(D)⌋     6    8    9   10
      Sphere                        7    8    9   10
      Ellipsoidal                   6    7    8   11
    • Conclusions
      Bottlenecks, Future Work · Advantages · Thank you
    • Bottlenecks, Future Work
      This is only a very initial study:
      - a better learning algorithm is needed (done: semidefinite programming used, results encouraging),
      - generalization to the non-separable case is needed (on the way),
      - verification on more complex unimodal functions,
      - study of the behavior on multimodal functions,
      - scalability study.
      High time demands: where CMA-ES needs seconds, contour line estimation needs minutes → suitable only for problems with time-demanding fitness evaluations.
    • Advantages
      Compared to LEM:
      - CLE with the ellipsoidal model is more suitable for real-valued problems.
      Compared to LS-CMA-ES:
      - CLE can handle situations where only the selected/discarded indicator is known (no information about the fitness function values),
      - CLE needs fewer data points to estimate the quadratic function.
      Compared to Adaptive Variance Scaling:
      - CLE does not need to decide whether the population is on a slope or in the neighborhood of an optimum.
      Generally:
      - no need to adapt the global step length (the size of the Gaussian),
      - independence of translation and rotation, robustness against skewness (though this implementation needs to invert the covariance matrix...).
    • Thank you

      References:
      [1] Anne Auger, Marc Schoenauer, and Nicolas Vanhaecke. LS-CMA-ES: A second-order algorithm for covariance matrix adaptation. In Xin Yao et al., editors, Parallel Problem Solving from Nature VIII, number 3242 in LNCS, pages 182–191. Springer Verlag, 2004.
      [2] Janusz Wojtusiak and Ryszard S. Michalski. The LEM3 system for non-Darwinian evolutionary computation and its application to complex function optimization. Reports of the Machine Learning and Inference Laboratory MLI 04-1, George Mason University, Fairfax, VA, February 2006.

      Acknowledgments: The project was supported by the Ministry of Education, Youth and Sport of the Czech Republic under grant No. MSM6840770012, "Transdisciplinary Research in Biomedical Engineering II". The second author was also supported by the Marie Curie Intra-European Fellowship grant SCOLES. The authors wish to thank Janusz Wojtusiak for many valuable comments and for help with the LEM3 system.