Complexity bounds for comparison-based optimization and parallel optimization

@article{fournier:inria-00452791,
hal_id = {inria-00452791},
url = {http://hal.inria.fr/inria-00452791},
title = {{Lower Bounds for Comparison Based Evolution Strategies using VC-dimension and Sign Patterns}},
author = {Fournier, Herv{\'e} and Teytaud, Olivier},
abstract = {{We derive lower bounds on the convergence rate of comparison based or selection based algorithms, improving existing results in the continuous setting, and extending them to non-trivial results in the discrete case. This is achieved by considering the VC-dimension of the level sets of the fitness functions; results are then obtained through the use of the shatter function lemma. In the special case of optimization of the sphere function, improved lower bounds are obtained by an argument based on the number of sign patterns.}},
keywords = {Evolutionary Algorithms;Parallel Optimization;Comparison-based algorithms;VC-dimension;Sign patterns;Complexity},
language = {English},
affiliation = {Parall{\'e}lisme, R{\'e}seaux, Syst{\`e}mes d'information, Mod{\'e}lisation - PRISM , Laboratoire de Recherche en Informatique - LRI , TAO - INRIA Saclay - Ile de France},
publisher = {Springer},
journal = {Algorithmica},
audience = {International},
year = {2010},
pdf = {http://hal.inria.fr/inria-00452791/PDF/evolution.pdf},
}

@incollection{teytaud:inria-00593179,
hal_id = {inria-00593179},
url = {http://hal.inria.fr/inria-00593179},
title = {{Lower Bounds for Evolution Strategies}},
author = {Teytaud, Olivier},
abstract = {{The mathematical analysis of optimization algorithms involves upper and lower bounds; we here focus on the latter. Whereas other chapters consider black-box complexity, we here consider complexity based on the key assumption that the only information available on the fitness values is the rank of individuals - we do not make use of the exact fitness values. Such reduced information is known to be efficient in terms of robustness (Gelly et al., 2007), which gives a solid theoretical foundation to the robustness of evolution strategies, often argued without mathematical rigor - and we here show the implications of this reduced information on convergence rates. In particular, our bounds are proved without any infinite-dimension assumption, and they have since been used for designing algorithms with better performance in the parallel setting.}},
language = {English},
affiliation = {Laboratoire de Recherche en Informatique - LRI , TAO - INRIA Saclay - Ile de France},
booktitle = {{Theory of Randomized Search Heuristics}},
publisher = {World Scientific},
pages = {327-354},
volume = {1},
editor = {Anne Auger and Benjamin Doerr},
series = {Series on Theoretical Computer Science},
audience = {International},
year = {2011},
month = May,
pdf = {http://hal.inria.fr/inria-00593179/PDF/ws-book9x6.pdf},
}


Presentation Transcript

  • Complexity bounds in parallel evolution. A. Auger, H. Fournier, N. Hansen, P. Rolet, F. Teytaud, O. Teytaud. Paris, 2010. TAO, INRIA Saclay Ile-de-France, LRI (Université Paris Sud, France), UMR CNRS 8623, I&A team, Digiteo, Pascal Network of Excellence.
  • Outline: Introduction; Complexity bounds; Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • Outline - Introduction: What is optimization? What are comparison-based optimization algorithms? Why are we interested in comparison-based optimization? Why do we consider parallel machines?
  • Introduction: what is optimization? Consider f: X --> R. We look for x* such that ∀x, f(x*) ≤ f(x). With w a random variable, f is randomly drawn: f(x) = f(x,w).
  • Introduction: what is optimization? The quality of "Opt" is quantified as follows (to be minimized), with w a random variable: [criterion shown on slide].
  • Introduction: what is optimization? Consider f: X --> R. We look for x* such that ∀x, f(x*) ≤ f(x). ==> Quasi-Newton, random search, Newton, Simplex, interior points...
  • Comparison-based optimization: Opt is comparison-based if its iterates depend on the fitness values only through comparisons, i.e., through ranks [formal definition on slide].
  • The main rules for step-size adaptation. While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; Update x, update σ }. Main trouble: choosing σ. Main methods: cumulative step-size adaptation, mutative self-adaptation, Estimation of Multivariate Normal Algorithm.
  • Example 1: Estimation of Multivariate Normal Algorithm (EMNA). While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; x = mean of the µ best points; σ = standard deviation of the µ best points }. On the slides: I have a Gaussian; I generate 6 points; I select the three best; I update the Gaussian. Obviously 6-parallel.
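As a concrete reference point, here is a minimal Python sketch of the EMNA loop above; λ=6 and µ=3 match the slide's illustration, while the budget, the seed, the sphere fitness in the demo call and the isotropic-Gaussian simplification are assumptions of this sketch, not part of the slides.

```python
import numpy as np

def emna(fitness, x0, sigma0, lam=6, mu=3, budget=100, seed=0):
    """Estimation of Multivariate Normal Algorithm (isotropic sketch)."""
    rng = np.random.default_rng(seed)
    x, sigma = np.asarray(x0, dtype=float), float(sigma0)
    for _ in range(budget):
        # Generate lambda points distributed as N(x, sigma^2 I).
        pop = x + sigma * rng.standard_normal((lam, x.size))
        # Only the ranking of the fitness values is used (comparison-based).
        order = np.argsort([fitness(p) for p in pop])
        best = pop[order[:mu]]              # select the mu best points
        x = best.mean(axis=0)               # x = mean of the mu best points
        sigma = best.std(axis=0).mean()     # sigma = std of the mu best points
    return x, sigma

# Illustrative run on the sphere function f(p) = ||p||^2.
xbest, _ = emna(lambda p: float(p @ p), x0=np.ones(3), sigma0=1.0)
```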
  • Example 2: Mutative self-adaptation. µ = λ/4. While (I have time) { Generate step-sizes (σ1,...,σλ) as σ × exp(k·N(0,1)); Generate points (x1,...,xλ) with xi distributed as N(x,σi); Select the µ best points; Update x (= mean), update σ (= log-mean) }.
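A matching Python sketch of the mutative self-adaptation loop; the learning rate k (often written τ) is not given on the slide, so the common default 1/√(2d) below is an assumption.

```python
import numpy as np

def mutative_sa(fitness, x0, sigma0, lam=8, budget=100, seed=0):
    """(mu/mu, lambda)-ES with mutative step-size self-adaptation (sketch)."""
    rng = np.random.default_rng(seed)
    x, sigma = np.asarray(x0, dtype=float), float(sigma0)
    d, mu = x.size, max(1, lam // 4)            # mu = lambda / 4, as on the slide
    k = 1.0 / np.sqrt(2 * d)                    # assumed learning rate (tau)
    for _ in range(budget):
        # Each offspring i mutates its own step-size: sigma_i = sigma * exp(k * N(0,1)).
        sigmas = sigma * np.exp(k * rng.standard_normal(lam))
        pop = x + sigmas[:, None] * rng.standard_normal((lam, d))
        sel = np.argsort([fitness(p) for p in pop])[:mu]   # the mu best points
        x = pop[sel].mean(axis=0)                          # update x = mean
        sigma = float(np.exp(np.log(sigmas[sel]).mean()))  # update sigma = log-mean
    return x, sigma
```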
  • Plenty of comparison-based algorithms: EMNA and other EDAs; self-adaptive algorithms; cumulative step-size adaptation; pattern search methods; ...
  • Families of comparison-based algorithms. Main parameter: λ = number of evaluations per iteration = parallelism. Full-Ranking vs Selection-Based (parameter µ): FR: we know the ranking of the µ best; SB: we just know which are the µ best. Elitist or not: elitist: comparisons with all visited points; non-elitist: only within the current offspring.
  • EMNA? Self-adaptation? Main parameter: λ = number of evaluations per iteration = parallelism. Full-Ranking vs Selection-Based: FR: we know the ranking of all visited points; SB: we just know which are the µ best. Elitist or not: elitist: comparisons with all visited points; non-elitist: only within the current offspring. ==> Yet, they work quite well.
  • Comparison-based algorithms are robust. Consider f: X --> R; we look for x* such that ∀x, f(x*) ≤ f(x). ==> What if we see g ∘ f (g increasing)? ==> x* is the same, but x_n might change. ==> Then, comparison-based methods are optimal.
  • Robustness of comparison-based algorithms: formal statement. [Criterion on slide] does not depend on g for a comparison-based algorithm; a comparison-based algorithm is optimal for [the worst case over increasing g, shown on slide]. (I don't give a proof here, but I promise it's true.)
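The invariance behind this robustness claim is easy to check numerically: any increasing g leaves all pairwise comparisons, hence the selected indices, unchanged. A small self-contained check (the particular g below is an arbitrary increasing transformation chosen for illustration):

```python
import numpy as np

f = lambda p: float(p @ p)              # sphere fitness
g = lambda y: np.exp(y) + 3 * y         # any strictly increasing transformation

pop = np.random.default_rng(1).standard_normal((6, 3))
sel_f  = np.argsort([f(p) for p in pop])[:3]       # selection under f
sel_gf = np.argsort([g(f(p)) for p in pop])[:3]    # selection under g o f
assert (sel_f == sel_gf).all()  # same selections ==> identical trajectories
```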
  • Introduction: I like λ large. Grid5000 = 5,000 cores (increasing). Submitting jobs ==> grouping runs ==> λ much bigger than the number of cores. Next generations of computers: tens, hundreds, thousands of cores. Evolutionary algorithms are population-based, but they have a bad speed-up.
  • Introduction: concluding :-) Optimization = finding minima. Many algorithms are comparison-based ==> a good idea for robustness. The parallel case is interesting ==> now we can have fun with bounds.
  • Outline: Introduction; Complexity bounds (on a given domain D, on a space F of objective functions such that {x*(f); f∈F} = D); Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • Complexity bounds (N = dimension). [Bound on slide] = number of fitness evaluations needed for precision ε with probability at least ½, for all f. N(ε) = covering number of the search space. exp(-convergence ratio) = convergence rate; convergence ratio ~ 1 / computational cost ==> more convenient for speed-ups.
  • Complexity bounds: basic technique (ε-balls). We want to know how many iterations an evolutionary algorithm needs to reach precision ε. Key observation: (most) evolutionary algorithms are comparison-based. Let's consider (for simplicity) a deterministic, selection-based, non-elitist algorithm. First idea: how many different branches does a run have? We select µ points among λ, hence at most K = λ! / (µ!(λ-µ)!) different branches per iteration. Second idea: how many different answers should we be able to give? Using packing numbers: at least N(ε) different possible answers. Conclusion: the number n of iterations must satisfy K^n ≥ N(ε).
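Reading the conclusion K^n ≥ N(ε) quantitatively: at least n ≥ log N(ε) / log K iterations are needed. The sketch below plugs in K = λ!/(µ!(λ-µ)!) and, as an assumption for illustration only, a covering number N(ε) ≈ (1/ε)^N for an N-dimensional unit box; all concrete values are arbitrary.

```python
from math import comb, log

N, lam, mu, eps = 10, 20, 5, 1e-3
K = comb(lam, mu)            # at most K branches per iteration (selection-based)
N_eps = (1 / eps) ** N       # assumed covering number of [0,1]^N at precision eps
n_min = log(N_eps) / log(K)  # K**n >= N(eps)  <=>  n >= log N(eps) / log K
print(f"at least {n_min:.1f} iterations ~ {lam * n_min:.0f} evaluations")
```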
  • Complexity bounds on the convergence ratio [table of bounds on slide]. FR: full ranking (selected points are ranked); SB: selection-based (selected points are not ranked). Linear in λ?
  • Linear speed-up? "My bound is tight, I've proved it!" Bounds hold on a given domain D, on a space F of objective functions such that {x*(f); f∈F} = D ==> very strange F possible! ==> much easier than F = {||x - x*|| ; x* ∈ D}.
  • Linear speed-up? "My bound is tight, I've proved it!" "Ok, tight bound. But what happens with a better model?"
  • Complexity bounds on the convergence ratio. Comparison-based optimization (or optimization with limited-precision numbers). We have developed bounds based on: the branching factor (finitely many possible pieces of information on the problem per time step; → communication complexity) and the packing number (a lower bound on the number of possible outcomes). Adding assumptions ==> better bounds?
  • Complexity bounds: improved technique. The counting argument above gives K^n ≥ N(ε), but many of these K branches are very unlikely! We'll use... VC-dimension!
  • (These slides on "shattering + VC-dim" are extracted from Xue Mei's talk at ENEE698A.) Definition of shattering: a set S of points is shattered by a set H of sets if for every dichotomy of S there is a consistent hypothesis in H.
  • Example: shattering. Is this set of points shattered by the set H? [figure on slide]
  • Yes! [figure: every +/- labeling of the points is realized by some hypothesis]
  • Is this set of points shattered by circles?
  • How About This One?
  • VC-dimension. VC-dimension(set of sets) = maximum cardinality of a shattered set. VC-dimension(set of functions) = VC-dimension(sublevel sets). Known (as a function of the dimension) for many sets of functions: in particular, quadratic for ellipsoids, linear for homotheties of a fixed ellipsoid, linear for circles...
  • VC-dimension: the link with optimization? Sauer's lemma: the number of subsets of λ points consistent with a set system of VC-dimension V is at most λ^V. So what? The number of possible selections is at most K ≤ λ^V ==> instead of K = λ! / (µ!(λ-µ)!). (For V at least 3; otherwise a few details change...)
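To see the gain, compare the per-iteration information under the two branch counts: log K is about µ·log(λ/µ) bits for the naive count (it keeps growing with λ), versus at most V·log λ bits under Sauer's lemma, logarithmic in λ once the VC-dimension V is fixed. An illustrative comparison, with λ, µ, V chosen arbitrarily:

```python
from math import comb, log2

lam, mu, V = 1000, 250, 10        # population, selected points, VC-dimension
bits_naive = log2(comb(lam, mu))  # K = lam! / (mu! (lam-mu)!)
bits_vc = V * log2(lam)           # Sauer's lemma: K <= lam**V
print(f"naive: {bits_naive:.0f} bits/iteration, VC bound: {bits_vc:.0f} bits/iteration")
```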
  • Complexity bounds on the convergence ratio, revisited [updated table on slide]. FR: full ranking (selected points are ranked); SB: selection-based (selected points are not ranked). The bound should not be linear in λ! Yet something remains!
  • Sphere: fitness increases with distance to the optimum; 1 comparison = 1 hyperplane [illustrations on slides].
  • Outline: Introduction; Complexity bounds; Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • Branching factor K (more in Gelly06; Fournier08). Rewrite your evolutionary algorithm as follows: [rewriting on slide], where g has values in a finite set of cardinality K, e.g. subsets of {1,2,...,λ} of size µ (K = λ! / (µ!(λ-µ)!)), or ordered subsets (K = λ! / (λ-µ)!), ...
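The two cardinalities quoted for the range of g, as a quick check (µ selected among λ; the values λ=10, µ=3 are arbitrary):

```python
from math import comb, perm

lam, mu = 10, 3
K_selection = comb(lam, mu)   # unordered subsets: lam! / (mu! (lam-mu)!)
K_full_rank = perm(lam, mu)   # ordered subsets:   lam! / (lam-mu)!
print(K_selection, K_full_rank)   # 120 720
```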
  • Outline (upper bounds for the dependency in λ): Introduction; Complexity bounds; Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • Automatic parallelization.
  • Speculative parallelization with branching factor 3: consider the sequential algorithm (iteration 1).
  • Speculative parallelization with branching factor 3: consider the sequential algorithm (iteration 2).
  • Speculative parallelization with branching factor 3: consider the sequential algorithm (iteration 3).
  • Speculative parallelization with branching factor 3: parallel version for D = 2. Population = union of all populations for 2 iterations.
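A hedged sketch of the speculative idea: unroll D iterations, enumerate all K^D possible branch outcomes, evaluate the union of the populations they would generate in one parallel batch, then keep the branch the sequential run actually takes. The `step(state, branch)` interface below is an assumed abstraction of one iteration of the sequential algorithm, not notation from the talk.

```python
from itertools import product

def speculative_batch(step, state, K, D):
    """Union of the populations of all K**D speculative branches over D iterations.

    step(state, branch) is assumed to return (points_to_evaluate, next_state)
    for each of the K possible comparison outcomes `branch`.
    """
    batch = []
    for branches in product(range(K), repeat=D):
        s = state
        for b in branches:
            points, s = step(s, b)
            batch.extend(points)       # evaluate this whole batch in parallel
    return batch
# After the parallel evaluation, the true comparison outcomes pick out, among
# the K**D speculative branches, the one the sequential run would have followed.
```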
  • Outline (tighter lower bounds for specific algorithms?): Introduction; Complexity bounds; Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • Real-world algorithms. Define σ* [normalized step-size, definition on slide]. Necessary condition for a log(λ) speed-up: -E log(σ*) ~ log(λ). But for many algorithms, -E log(σ*) = O(1) ==> constant speed-up.
  • One-fifth rule: -E log(σ*) = O(1). Measure the proportion of mutated points better than x. While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; Update x = mean; Update σ by the 1/5th rule [either of the two update formulas on slide] }.
  • One-fifth rule: -E log(σ*) = O(1), where the proportion of mutated points better than x drives the update. Consider e.g. [first update rule on slide], or consider e.g. [second update rule on slide]. In both cases σ* is lower-bounded independently of λ ==> parameters should strongly depend on λ!
  • Self-adaptation, cumulative step-size adaptation: in many cases, the same result. With parameters depending on the dimension only (and not on λ), the speed-up is limited by a constant!
  • Outline: Introduction; Complexity bounds; Branching factor; Automatic parallelization; Real-world algorithms; Log(λ) corrections.
  • The starting point of this work. We have shown tight bounds. Usual algorithms don't reach the bounds for λ large. Trouble: the algorithms we propose are boring (too complicated); people prefer usual algorithms. A simple patch for these algorithms?
  • Log() corrections ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case: - E log( *) should be linear in log() (this provides corrections which work for SA, EMNA and CSA)Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 75
  • Example 1: Estimation of Multivariate Normal Algorithm with log(λ) correction. While (I have time) { Generate λ points (x1,...,xλ) distributed as N(x,σ); Evaluate the fitness at x1,...,xλ; x = mean of the µ best points; σ = standard deviation of the µ best points; σ /= log(λ/7)^(1/d) }.
  • Example 2: Log(λ) correction for mutative self-adaptation. µ = λ/4 ==> µ = min(λ/4, d). While (I have time) { Generate step-sizes (σ1,...,σλ) as σ × exp(k·N(0,1)); Generate points (x1,...,xλ) with xi distributed as N(x,σi); Select the µ best points; Update x (= mean), update σ (= log-mean) }.
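Both corrections are one-line changes to the earlier sketches; schematically (the constants come from the slides, while d, λ and the starting σ below are purely illustrative):

```python
from math import log

lam, d = 100, 10

# EMNA correction: additionally divide sigma by log(lambda/7)**(1/d) each iteration.
sigma = 1.0
sigma /= log(lam / 7) ** (1 / d)

# Mutative self-adaptation correction: mu = lambda/4 becomes min(lambda/4, d).
mu = min(lam // 4, d)
print(sigma, mu)
```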
  • Log() corrections (SA, dim 3) ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case - E log( *) should be linear in log() (this provides corrections which work for SA and CSA)Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 78
  • Log() corrections ● In the discrete case (XPs): automatic parallelization surprisingly efficient. ● Simple trick in the continuous case - E log( *) should be linear in log() (this provides corrections which work for SA and CSA)Auger, Fournier, Hansen, Rolet, Teytaud, Teytaud parallel evolution 79
  • Conclusion. The case of large population size is not well handled by usual algorithms. We proposed: (I) theoretical bounds; (II) an automatic parallelization matching the bound, which works well in the discrete case; (III) a necessary condition for the continuous case, which provides useful hints.
  • Main limitation (of the application to algorithm design): all this is about a logarithmic speed-up. The computational power grows like this [steep curve on slide], and the result grows like that [flat curve on slide]. ==> Much better speed-ups for noisy optimization.
  • Further work 1: apply VC-bounds for considering only "reasonable" branches in the automatic parallelization. Theoretically easy, but provides extremely complicated algorithms.
  • Further work 2: we have proofs for complicated algorithms and efficient (unproved) hints for usual algorithms. Proofs for the versions with the "trick"? NB: the discrete case is moral: the best algorithm is the proved one :-)
  • Further work 3: what if the optimum is not a point but a subset with topological dimension N' < N?
  • Further work 4: parallel bandits? Experimentally, parallel UCT >> sequential UCT, with a speed-up depending on the number of arms. Theory? Perhaps not very hard, but not done yet.