Using Previous Models to Bias Structural Learning in the Hierarchical BOA
Martin Pelikan
Estimation of distribution algorithms (EDAs) are stochastic optimization techniques that explore the space of potential solutions by building and sampling explicit probabilistic models of promising candidate solutions. While the primary goal of applying EDAs is to discover the global optimum, or at least an accurate approximation of it, any EDA also provides a sequence of probabilistic models that in most cases hold a great deal of information about the problem. Although using problem-specific knowledge has been shown to significantly improve the performance of EDAs and other evolutionary algorithms, this readily available source of problem-specific information has been practically ignored by the EDA community. This paper takes a first step towards using probabilistic models obtained by EDAs to speed up the solution of similar problems in the future. More specifically, we propose two approaches to biasing model building in the hierarchical Bayesian optimization algorithm (hBOA) based on knowledge automatically learned from previous hBOA runs on similar problems. We show that the proposed methods lead to substantial speedups and argue that they should also work well in other applications that require solving a large number of problems with similar structure.
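As an illustration of the build-and-sample loop that produces this sequence of models, here is a minimal univariate EDA (UMDA-style) on the OneMax function; the function names and parameter values are illustrative, not from the paper.

```python
import random

def onemax(x):
    """Illustrative fitness function: the number of ones in the string."""
    return sum(x)

def umda(n=20, pop_size=100, generations=50, seed=1):
    """Minimal univariate EDA loop: select, rebuild marginals, resample."""
    rng = random.Random(seed)
    p = [0.5] * n                    # model: P(bit i = 1) for each bit
    best = None
    for _ in range(generations):
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(pop_size)]
        pop.sort(key=onemax, reverse=True)
        if best is None or onemax(pop[0]) > onemax(best):
            best = pop[0]
        selected = pop[:pop_size // 2]        # truncation selection
        p = [sum(x[i] for x in selected) / len(selected) for i in range(n)]
        p = [min(max(pi, 1 / n), 1 - 1 / n) for pi in p]  # keep diversity
    return best

best = umda()
```

Full EDAs such as hBOA replace the univariate marginals with a multivariate model (a Bayesian network), but the select-model-sample cycle is the same.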
Intelligent Bias of Network Structures in the Hierarchical BOA
Martin Pelikan
One of the primary advantages of estimation of distribution algorithms (EDAs) over many other stochastic optimization techniques is that they supply us with a roadmap of how they solve a problem. This roadmap consists of a sequence of probabilistic models of candidate solutions of increasing quality. The first model in this sequence would typically encode the uniform distribution over all admissible solutions whereas the last model would encode a distribution that generates at least one global optimum with high probability. It has been argued that exploiting this knowledge should improve EDA performance when solving similar problems. This paper presents an approach to bias the building of Bayesian network models in the hierarchical Bayesian optimization algorithm (hBOA) using information gathered from models generated during previous hBOA runs on similar problems. The approach is evaluated on trap-5 and 2D spin glass problems.
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Per Kristian Lehre
We demonstrate how to estimate the expected optimisation time of UMDA, an estimation of distribution algorithm, using the level-based theorem. The talk was given at the GECCO 2015 conference in Madrid, Spain.
Fitness inheritance in the Bayesian optimization algorithm
Martin Pelikan
This paper describes how fitness inheritance can be used to estimate fitness for a proportion of newly sampled candidate solutions in the Bayesian optimization algorithm (BOA). The goal of estimating fitness for some candidate solutions is to reduce the number of fitness evaluations for problems where fitness evaluation is expensive. Bayesian networks used in BOA to model promising solutions and generate the new ones are extended to allow not only for modeling and sampling candidate solutions, but also for estimating their fitness. The results indicate that fitness inheritance is a promising concept in BOA, because population-sizing requirements for building appropriate models of promising solutions lead to good fitness estimates even if only a small proportion of candidate solutions is evaluated using the actual fitness function. This can lead to a reduction of the number of actual fitness evaluations by a factor of 30 or more.
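The inheritance idea can be sketched outside BOA with its simplest variant, where an unevaluated child inherits the mean fitness of its parents; BOA's actual estimator uses the Bayesian network model, so the scheme and the names below are a simplified assumption.

```python
import random

def true_fitness(x):
    """Stands in for an expensive evaluation."""
    return sum(x)

def evaluate_offspring(parent_fitnesses, child, p_eval, rng, evals):
    """Evaluate for real with probability p_eval; otherwise inherit
    the mean parental fitness.  `evals` counts real evaluations."""
    if rng.random() < p_eval:
        evals[0] += 1
        return true_fitness(child)
    return sum(parent_fitnesses) / len(parent_fitnesses)

rng = random.Random(0)
evals = [0]
child = [1] * 10                     # its true fitness would be 10
fits = [evaluate_offspring((8.0, 10.0), child, 0.25, rng, evals)
        for _ in range(1000)]
```

With `p_eval = 0.25`, only about a quarter of the candidates pay for a real evaluation; the rest receive the (possibly biased) inherited estimate.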
Towards billion bit optimization via parallel estimation of distribution algo...
kknsastry
This paper presents a highly efficient, fully parallelized implementation of the compact genetic algorithm to solve very large scale problems with millions to billions of variables. The paper presents principled results demonstrating the scalable solution of a difficult test function on instances of over a billion variables using a parallel implementation of the compact genetic algorithm (cGA). The problem addressed is a noisy, blind problem over a vector of binary decision variables. Noise is added equaling up to a tenth of the variance of the deterministic objective function, thereby making it difficult for simple hill climbers to find the optimal solution. The compact GA, on the other hand, is able to find the optimum in the presence of noise quickly, reliably, and accurately, and its solution scalability follows known convergence theories. These results on a noisy problem, together with other results on problems involving varying modularity, hierarchy, and overlap, foreshadow the routine solution of billion-variable problems across the landscape of search problems.
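A minimal sequential version of the cGA update rule (illustrative parameter values, OneMax as the objective) can be sketched as:

```python
import random

def cga(n=16, pop_equiv=200, max_iters=20000, seed=3):
    """Compact GA: a probability vector replaces the population.  Two
    individuals are sampled; the vector moves by 1/pop_equiv toward the
    winner at every bit where the two competitors differ."""
    rng = random.Random(seed)
    p = [0.5] * n
    def sample():
        return [1 if rng.random() < pi else 0 for pi in p]
    for _ in range(max_iters):
        a, b = sample(), sample()
        winner, loser = (a, b) if sum(a) >= sum(b) else (b, a)
        for i in range(n):
            if winner[i] != loser[i]:
                step = 1 / pop_equiv if winner[i] == 1 else -1 / pop_equiv
                p[i] = min(1.0, max(0.0, p[i] + step))
        if all(pi in (0.0, 1.0) for pi in p):   # fully converged
            break
    return p

p = cga()
```

Because the state is just one probability per bit, memory scales linearly in the number of variables, which is what makes billion-variable parallel runs feasible.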
Empirical Analysis of Ideal Recombination on Random Decomposable Problems
kknsastry
This paper analyzes the behavior of a selectorecombinative genetic algorithm (GA) with an ideal crossover on a class of random additively decomposable problems (rADPs). Specifically, additively decomposable problems of order k whose subsolution fitnesses are sampled from the standard uniform distribution U[0,1] are analyzed. The scalability of the selectorecombinative GA is investigated on 10,000 rADP instances. The validity of facetwise models in bounding the population size, run duration, and number of function evaluations required to successfully solve the problems is also verified. Finally, the easiest and most difficult rADP instances are also investigated.
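A random ADP instance of this kind can be generated directly: each disjoint block of k bits gets a lookup table of 2^k subfunction values drawn from U[0,1] (names and parameter values below are illustrative).

```python
import random

def make_radp(num_blocks=5, k=4, seed=7):
    """Random additively decomposable problem: disjoint blocks of k
    bits, each scored by its own table of U[0,1] subsolution fitnesses."""
    rng = random.Random(seed)
    tables = [[rng.random() for _ in range(2 ** k)] for _ in range(num_blocks)]
    def fitness(x):
        total = 0.0
        for b, table in enumerate(tables):
            bits = x[b * k:(b + 1) * k]        # bits belonging to block b
            index = int("".join(map(str, bits)), 2)
            total += table[index]
        return total
    return fitness, tables

fitness, tables = make_radp()
optimum = sum(max(t) for t in tables)   # best achievable fitness
```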
iBOA: The Incremental Bayesian Optimization Algorithm
Martin Pelikan
This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation of distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably.
Effects of a Deterministic Hill Climber on hBOA
Martin Pelikan
Hybridization of global and local search algorithms is a well-established technique for enhancing the efficiency of search algorithms. Hybridizing estimation of distribution algorithms (EDAs) has been repeatedly shown to produce better performance than either the global or local search algorithm alone. The hierarchical Bayesian optimization algorithm (hBOA) is an advanced EDA which has previously been shown to benefit from hybridization with a local searcher. This paper examines the effects of combining hBOA with a deterministic hill climber (DHC). Experiments reveal that allowing DHC to find the local optima makes model building and decision making much easier for hBOA. This reduces the minimum population size required to find the global optimum, which substantially improves overall performance.
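A deterministic hill climber of the kind discussed here can be sketched as steepest-ascent single-bit-flip search (a common formulation; the paper's exact DHC may differ in detail):

```python
def dhc(x, fitness):
    """Steepest-ascent hill climber: repeatedly flip the single bit
    giving the largest fitness gain until no flip improves."""
    x = list(x)
    while True:
        base = fitness(x)
        best_gain, best_i = 0, None
        for i in range(len(x)):
            x[i] ^= 1                # try the flip ...
            gain = fitness(x) - base
            x[i] ^= 1                # ... and undo it
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:
            return x                 # local optimum reached
        x[best_i] ^= 1

result = dhc([0, 1, 0, 0, 1], sum)   # OneMax as a toy fitness
```

Because every sampled solution is pushed to a local optimum before selection, the model builder only ever sees locally optimal strings, which is what simplifies its decision making.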
Transfer Learning, Soft Distance-Based Bias, and the Hierarchical BOA
Martin Pelikan
An automated technique for transfer learning in the hierarchical Bayesian optimization algorithm (hBOA), based on distance-based statistics, has recently been proposed. The technique enables practitioners to improve hBOA efficiency by collecting statistics from probabilistic models obtained in previous hBOA runs and using them to bias future hBOA runs on similar problems. The purpose of this paper is threefold: (1) test the technique on several classes of NP-complete problems, including MAXSAT, spin glasses, and minimum vertex cover; (2) demonstrate that the technique is effective even when the previous runs were done on problems of a different size; (3) provide empirical evidence that combining transfer learning with other efficiency enhancement techniques can often yield nearly multiplicative speedups.
The Bayesian Optimization Algorithm with Substructural Local Search
Martin Pelikan
This work studies the utility of using substructural neighborhoods for local search in the Bayesian optimization algorithm (BOA). The probabilistic model of BOA, which automatically identifies important problem substructures, is used to define the structure of the neighborhoods used in local search. Additionally, a surrogate fitness model is considered to evaluate the improvement of the local search steps. The results show that performing substructural local search in BOA significantly reduces the number of generations necessary to converge to optimal solutions and thus provides substantial speedups.
Analyzing Probabilistic Models in Hierarchical BOA on Traps and Spin Glasses
Martin Pelikan
The hierarchical Bayesian optimization algorithm (hBOA) can solve nearly decomposable and hierarchical problems of bounded difficulty in a robust and scalable manner by building and sampling probabilistic models of promising solutions. This paper analyzes probabilistic models in hBOA on two common test problems: concatenated traps and 2D Ising spin glasses with periodic boundary conditions. We argue that although Bayesian networks with local structures can encode complex probability distributions, analyzing these models in hBOA is relatively straightforward and the results of such analyses may provide practitioners with useful information about their problems. The results show that the probabilistic models in hBOA closely correspond to the structure of the underlying optimization problem, the models do not change significantly in subsequent iterations of BOA, and creating adequate probabilistic models by hand is not straightforward even with complete knowledge of the optimization problem.
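For reference, the concatenated trap test function used here is easy to write down: each disjoint 5-bit block is fully deceptive, with the optimum at five ones.

```python
def trap5(block):
    """Order-5 trap: optimal at u = 5 ones, otherwise decreasing in u,
    which deceives algorithms that mix bits from different blocks."""
    u = sum(block)
    return 5 if u == 5 else 4 - u

def concatenated_trap5(x):
    """Sum of trap5 over disjoint, consecutive 5-bit blocks."""
    return sum(trap5(x[i:i + 5]) for i in range(0, len(x), 5))
```

A model whose dependencies match the 5-bit blocks can optimize each block independently, which is exactly the structure the analyzed hBOA models are expected to recover.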
Order Or Not: Does Parallelization of Model Building in hBOA Affect Its Scala...
Martin Pelikan
It has been shown that model building in the hierarchical Bayesian optimization algorithm (hBOA) can be efficiently parallelized by randomly generating an ancestral ordering of the nodes of the network prior to learning the network structure and allowing only dependencies consistent with the generated ordering. However, it has not been thoroughly shown that this approach to restricting probabilistic models does not affect scalability of hBOA on important classes of problems. This presentation demonstrates that although the use of a random ancestral ordering restricts the structure of considered models to allow efficient parallelization of model building, its effects on hBOA performance and scalability are negligible.
Estimation of Distribution Algorithms Tutorial
Martin Pelikan
Probabilistic model-building genetic algorithms (PMBGAs), also called estimation of distribution algorithms (EDAs) and iterated density-estimation algorithms (IDEAs), replace the traditional variation operators of genetic and evolutionary algorithms by (1) building a probabilistic model of promising solutions and (2) sampling the built model to generate new candidate solutions.
Replacing traditional crossover and mutation operators by building and sampling a probabilistic model of promising solutions enables the use of machine learning techniques for automatic discovery of problem regularities and exploitation of these regularities for effective exploration of the search space. Using machine learning in optimization enables the design of optimization techniques that can automatically adapt to the given problem. There are many successful applications of PMBGAs, for example, Ising spin glasses in 2D and 3D, graph partitioning, MAXSAT, feature subset selection, forest management, groundwater remediation design, telecommunication network design, antenna design, and scheduling.
This tutorial provides a gentle introduction to PMBGAs with an overview of major research directions in this area. Strengths and weaknesses of different PMBGAs will be discussed, and suggestions will be provided to help practitioners choose the best PMBGA for their problem.
The video of this tutorial presented at GECCO-2008 can be found at
http://medal.cs.umsl.edu/blog/?p=293
Population Dynamics in Conway’s Game of Life and its Variants
Martin Pelikan
This presentation describes a project by high school students Yonatan Biel and David Hua, carried out in the Students and Teachers As Research Scientists (STARS) program at the Missouri Estimation of Distribution Algorithms Laboratory (MEDAL). To see the animations, please download the PowerPoint presentation.
Image segmentation using a genetic algorithm and hierarchical local search
Martin Pelikan
This paper proposes a hybrid genetic algorithm to perform image segmentation based on applying the q-state Potts spin glass model to a grayscale image. First, the image is converted to a set of weights for a q-state spin glass and then a steady-state genetic algorithm is used to evolve candidate segmented images until a suitable candidate solution is found. To speed up the convergence to an adequate solution, hierarchical local search is used on each evaluated solution. The results show that the hybrid genetic algorithm with hierarchical local search is able to efficiently perform image segmentation. The necessity of hierarchical search for these types of problems is also clearly demonstrated.
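The image-to-spin-glass conversion is only summarized above; a simplified Potts-style energy of the kind involved can be sketched as follows, where the weight function 1/(1 + |intensity difference|) is an illustrative assumption, not necessarily the paper's exact formulation.

```python
def potts_energy(labels, image, width, height):
    """Energy of a labeling under a simplified q-state Potts model:
    each pair of 4-connected neighbors contributes its coupling weight
    when the two pixels receive different labels, so flat regions are
    cheap to label uniformly."""
    def w(a, b):
        return 1.0 / (1.0 + abs(image[a] - image[b]))
    energy = 0.0
    for y in range(height):
        for x in range(width):
            i = y * width + x
            for j in ((i + 1) if x + 1 < width else None,
                      (i + width) if y + 1 < height else None):
                if j is not None and labels[i] != labels[j]:
                    energy += w(i, j)
    return energy

img = [10, 10, 200, 200]             # 2x2 image, two flat rows
good = potts_energy([0, 0, 1, 1], img, 2, 2)   # cut along the contrast edge
bad = potts_energy([0, 1, 0, 1], img, 2, 2)    # cut through flat regions
```

Minimizing such an energy over labelings is what the hybrid GA with hierarchical local search does on the converted instance.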
Distance-based bias in model-directed optimization of additively decomposable...
Martin Pelikan
For many optimization problems it is possible to define a distance metric between problem variables that correlates with the likelihood and strength of interactions between the variables. For example, one may define a metric so that the dependencies between variables that are closer to each other with respect to the metric are expected to be stronger than the dependencies between variables that are further apart. The purpose of this paper is to describe a method that combines such a problem-specific distance metric with information mined from probabilistic models obtained in previous runs of estimation of distribution algorithms with the goal of solving future problem instances of similar type with increased speed, accuracy and reliability. While the focus of the paper is on additively decomposable problems and the hierarchical Bayesian optimization algorithm, it should be straightforward to generalize the approach to other model-directed optimization techniques and other problem classes. Compared to other techniques for learning from experience put forward in the past, the proposed technique is both more practical and more broadly applicable.
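To make the idea concrete, the kind of statistic that can be mined from previous-run models is sketched below: for each distance value, the fraction of variable pairs at that distance that carried a dependency edge. The bias mechanism in the paper is more involved; the names and the toy metric here are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def distance_edge_stats(models, n, distance):
    """For each distance value d, the fraction of variable pairs at
    that distance that carried a dependency edge, pooled over all
    models (each model is a set of (i, j) edges)."""
    pairs_at = defaultdict(int)
    edges_at = defaultdict(int)
    for edges in models:
        for i, j in combinations(range(n), 2):
            d = distance(i, j)
            pairs_at[d] += 1
            if (i, j) in edges or (j, i) in edges:
                edges_at[d] += 1
    return {d: edges_at[d] / pairs_at[d] for d in pairs_at}

# toy example: variables on a line, distance = |i - j|
models = [{(0, 1), (1, 2), (2, 3)}, {(0, 1), (1, 2)}]
stats = distance_edge_stats(models, 4, lambda i, j: abs(i - j))
```

Statistics like these show short-distance dependencies dominating, so future runs can favor (without forbidding) edges between nearby variables.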
Pairwise and Problem-Specific Distance Metrics in the Linkage Tree Genetic Al...
Martin Pelikan
The linkage tree genetic algorithm (LTGA) identifies linkages between problem variables using an agglomerative hierarchical clustering algorithm and linkage trees. This enables LTGA to solve many decomposable problems that are difficult with more conventional genetic algorithms. The goal of this paper is two-fold: (1) Present a thorough empirical evaluation of LTGA on a large set of problem instances of additively decomposable problems and (2) speed up the clustering algorithm used to build the linkage trees in LTGA by using a pairwise and a problem-specific metric.
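The tree-building step can be sketched with plain average-linkage agglomerative clustering over a pairwise distance matrix; LTGA's own metric is derived from mutual information, so the matrix below is only a stand-in.

```python
def build_linkage_tree(dist, n):
    """Agglomerative (average-linkage) clustering over variables 0..n-1.
    `dist[i][j]` is a pairwise distance; the returned list contains
    every cluster created, ending with the root of all variables."""
    def d(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
    clusters = [frozenset([i]) for i in range(n)]
    tree = list(clusters)
    while len(clusters) > 1:
        a, b = min(((a, b) for idx, a in enumerate(clusters)
                    for b in clusters[idx + 1:]), key=lambda ab: d(*ab))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        tree.append(a | b)
    return tree

# toy metric: variables 0,1 close; 2,3 close; the two groups far apart
D = [[0, 1, 9, 9],
     [1, 0, 9, 9],
     [9, 9, 0, 1],
     [9, 9, 1, 0]]
tree = build_linkage_tree(D, 4)
```

The clusters in the tree then serve as crossover masks: LTGA mixes the bits of each cluster as a unit.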
http://medal.cs.umsl.edu/files/2011001.pdf
Finding Ground States of Sherrington-Kirkpatrick Spin Glasses with Hierarchic...
Martin Pelikan
This study focuses on the problem of finding ground states of random instances of the Sherrington-Kirkpatrick (SK) spin-glass model with Gaussian couplings. While the ground states of SK spin-glass instances can be obtained with branch and bound, its computational complexity limits solvable instances to not more than about 90 spins. We describe several approaches based on the hierarchical Bayesian optimization algorithm (hBOA) for reliably identifying ground states of SK instances intractable with branch and bound, and present a broad range of empirical results on such problem instances. We argue that the proposed methodology holds great promise for reliably solving large SK spin-glass instances to optimality with practical time complexity. The proposed approaches to reliably identifying global optima can also be applied to other problems and used with many other evolutionary algorithms. The performance of hBOA is compared to that of the genetic algorithm with two common crossover operators.
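The SK model itself is easy to write down: every pair of spins interacts through a Gaussian coupling, and the goal is the spin configuration of minimum energy. Instance generation below is illustrative.

```python
import random

def sk_energy(spins, couplings):
    """Energy of the Sherrington-Kirkpatrick model,
    E = -sum_{i<j} J_ij * s_i * s_j, with spins in {-1, +1}."""
    n = len(spins)
    return -sum(couplings[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))

rng = random.Random(5)
n = 6
J = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        J[i][j] = rng.gauss(0.0, 1.0)   # Gaussian couplings

e_up = sk_energy([1] * n, J)
e_flipped = sk_energy([-1] * n, J)      # global spin flip: same energy
```

Because every spin couples to every other, the problem has no exploitable neighborhood structure, which is part of what makes SK instances so hard for branch and bound.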
Computational complexity and simulation of rare events of Ising spin glasses Martin Pelikan
We discuss the computational complexity of random 2D Ising spin glasses, which represent an interesting class of constraint satisfaction problems for black box optimization. Two extremal cases are considered: (1) the +/- J spin glass, and (2) the Gaussian spin glass. We also study a smooth transition between these two extremal cases. The computational complexity of all studied spin glass systems is found to be dominated by rare events of extremely hard spin glass samples. We show that the complexity of all studied spin glass systems is closely related to the Fréchet extreme value distribution. In a hybrid algorithm that combines the hierarchical Bayesian optimization algorithm (hBOA) with a deterministic bit-flip hill climber, the numbers of steps performed by both the global searcher (hBOA) and the local searcher follow Fréchet distributions. Nonetheless, unlike in methods based purely on local search, the parameters of these distributions confirm good scalability of hBOA with local search. We further argue that standard performance measures for optimization algorithms---such as the average number of evaluations until convergence---can be misleading. Finally, our results indicate that for highly multimodal constraint satisfaction problems, such as Ising spin glasses, recombination-based search can provide qualitatively better results than mutation-based search.
Hybrid Evolutionary Algorithms on Minimum Vertex Cover for Random GraphsMartin Pelikan
This work analyzes the hierarchical Bayesian optimization algorithm (hBOA) on minimum vertex cover for standard classes of random graphs and transformed SAT instances. The performance of hBOA is compared with that of the branch-and-bound problem solver (BB), the simple genetic algorithm (GA) and the parallel simulated annealing (PSA). The results indicate that BB is significantly outperformed by all the other tested methods, which is expected as BB is a complete search algorithm and minimum vertex cover is an NP-complete problem. The best performance is achieved by hBOA; nonetheless, the performance differences between hBOA and other evolutionary algorithms are relatively small, indicating that mutation-based search and recombination-based search lead to similar performance on the tested classes of minimum vertex cover problems.
1. Spurious Dependencies and EDA Scalability
Elizabeth Radetic and Martin Pelikan
Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)
University of Missouri, St. Louis, MO
http://medal.cs.umsl.edu/
pelikan@cs.umsl.edu
Download MEDAL Report No. 2010002
http://medal.cs.umsl.edu/files/2010002.pdf
Elizabeth Radetic and Martin Pelikan Spurious Dependencies and EDA Scalability
2. Motivation
Estimation of distribution algorithms (EDAs)
Replace standard crossover and mutation by
building a probabilistic model of selected solutions, and
sampling the probabilistic model to generate new solutions.
Can solve many problems intractable with standard EAs.
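To make this loop concrete, here is a minimal UMDA-style sketch of one EDA generation (a generic univariate EDA, not the specific algorithms analyzed in this work; the names `umda_step` and `trunc` are illustrative):

```python
import random

def umda_step(pop, fitness, trunc=0.5):
    """One generation of a univariate EDA (UMDA-style):
    select, learn a probabilistic model, sample new solutions."""
    # Truncation selection: keep the best fraction of the population.
    ranked = sorted(pop, key=fitness, reverse=True)
    selected = ranked[: max(1, int(len(pop) * trunc))]
    n = len(pop[0])
    # Model building: probability of a 1 at each string position.
    p = [sum(x[i] for x in selected) / len(selected) for i in range(n)]
    # Model sampling: generate a new population from the marginals.
    return [[1 if random.random() < p[i] else 0 for i in range(n)]
            for _ in range(len(pop))]
```

Multivariate EDAs such as ECGA and hBOA replace the univariate marginals with dependency-based models, which is where the accuracy issues discussed below arise.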
Model accuracy
It is important that the EDA model is accurate.
Types of inaccuracies for dependency-based models
Missing dependencies.
Spurious, unnecessary dependencies.
Most prior work focused on missing dependencies.
This study
Focus on effects of spurious dependencies.
Theoretical study for population sizing.
Empirical study for the number of generations.
3. Outline
1. Model accuracy.
2. Spurious dependencies
Model for spurious dependencies.
Effects on population sizing.
Effects on the number of generations.
3. Experiments.
4. Conclusions and future work.
4. Dependency-Based Probabilistic Models in EDAs
Dependency-based probabilistic models
Encode dependencies and independencies between variables.
Dependency structure decomposes the problem.
Subproblems should be of bounded order.
Examples
Marginal product models.
Bayesian networks.
5. Marginal Product Model
Extended Compact GA (ECGA) (Harik, 1999).
Variables are divided into linkage groups.
Defines problem decomposition into separable subproblems.
Distribution of each group is encoded by a probability table.
We consider groups of string positions and assume a binary representation of candidate solutions.
6. Model Accuracy
Types of inaccuracies
Missing dependencies.
Spurious, unnecessary dependencies.
Example: Trap-5
ftrap5(X1, ..., Xn) = sum_{i=1}^{n/5} trap5(X_{5i-4} + X_{5i-3} + X_{5i-2} + X_{5i-1} + X_{5i})
trap5(u) = 5 if u = 5; 4 - u otherwise
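The trap-5 function defined above translates directly into code; a minimal sketch (candidate solutions as binary lists whose length is a multiple of 5):

```python
def trap5(u):
    """Trap function of order 5: deceptive everywhere except the optimum u = 5."""
    return 5 if u == 5 else 4 - u

def ftrap5(x):
    """Additively decomposable trap-5 fitness; len(x) must be a multiple of 5."""
    assert len(x) % 5 == 0
    return sum(trap5(sum(x[i:i + 5])) for i in range(0, len(x), 5))
```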
7. Onemax Model of Spurious Dependencies
Onemax is the sum of bits in the binary string
onemax(X1, ..., Xn) = sum_{i=1}^{n} Xi
Perfect and spurious models for onemax
Perfect model assumes no dependence at all.
Spurious model assumes linkage groups of order kspurious > 1.
Parameter kspurious controls order of spurious dependencies.
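The perfect and spurious models for onemax can both be sketched as marginal product models over consecutive groups of k bits; k = 1 gives the perfect, fully independent model, and k > 1 models spurious linkage (function names here are illustrative, not from the paper):

```python
import random

def learn_spurious_model(pop, k):
    """Learn joint probability tables over consecutive groups of k bits
    (k = 1 gives the perfect, fully independent model for onemax)."""
    n = len(pop[0])
    model = []
    for start in range(0, n, k):
        counts = {}
        for x in pop:
            key = tuple(x[start:start + k])
            counts[key] = counts.get(key, 0) + 1
        model.append({key: c / len(pop) for key, c in counts.items()})
    return model

def sample(model):
    """Sample one solution by drawing each group from its table independently."""
    x = []
    for table in model:
        keys = list(table)
        probs = [table[key] for key in keys]
        x.extend(random.choices(keys, probs)[0])
    return x
```

With spurious groups of order k, each table must represent up to 2^k partial solutions, which drives the population sizing effects analyzed next.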
8. Effects of Spurious Models on EDA Performance
Two main effects of spurious dependencies
Population size.
Number of generations.
Population sizing decomposition
Population size requirements should increase.
Effects depend on learning, but are sometimes substantial.
Number of generations
Number of generations may decrease due to weaker variation.
Effects are not expected to be substantial.
9. EDA Population Sizing and Spurious Dependencies
Population sizing decomposition
Initial supply
Initial population is random.
Ensure sufficient supply of partial solutions for each group.
Decision making
Decision making between partial solutions is stochastic.
Ensure that best partial solution wins in each group.
Model building
Ensure accurate enough models to find the optimum.
The reason for spurious dependencies, not the effect.
Focus in this work
Initial supply.
Decision making.
10. Population Sizing: Initial Supply
Initial supply for the perfect model (Goldberg et al., 2001), with n groups of one bit:
N = 2 (ln 2 + ln n)
Initial supply for arbitrary kspurious:
N = 2^kspurious (kspurious ln 2 + ln(n / kspurious))
Initial-supply population increase factor:
gamma_is = 2^(kspurious - 1) (kspurious ln 2 + ln(n / kspurious)) / (ln 2 + ln n)
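Assuming the initial-supply bound takes the form above, the population increase factor can be computed directly; a small sketch (illustrative function names):

```python
import math

def initial_supply_N(n, k):
    """Initial-supply population size with linkage groups of order k;
    k = 1 gives the perfect-model bound N = 2 (ln 2 + ln n)."""
    return 2**k * (k * math.log(2) + math.log(n / k))

def gamma_is(n, k):
    """Initial-supply population increase factor relative to the perfect model."""
    return initial_supply_N(n, k) / initial_supply_N(n, 1)
```

The 2^k factor dominates, so the increase factor grows exponentially with the order of spurious linkage.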
11. Population Sizing: Decision Making
Decision making for the perfect model (Harik et al., 1997):
N = -(1/2) ln(alpha) sqrt(pi (n - 1))
Decision making for arbitrary kspurious:
N = -2^(kspurious - 2) ln(alpha) sqrt(pi (n - 1))
Decision-making population increase factor:
gamma_dm = 2^(kspurious - 1)
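Assuming the gambler's ruin bound takes the form above, it can be evaluated directly; a sketch (alpha is the failure probability, names illustrative):

```python
import math

def decision_making_N(n, k, alpha=0.01):
    """Gambler's-ruin decision-making population size with spurious groups
    of order k; k = 1 gives N = -(1/2) ln(alpha) sqrt(pi (n - 1))."""
    return -(2**(k - 2)) * math.log(alpha) * math.sqrt(math.pi * (n - 1))
```

Doubling the spurious group order doubles the required population size, consistent with the 2^(kspurious - 1) increase factor.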
12. Number of Generations
Effects of spurious dependencies on number of generations
Spurious dependencies weaken the mixing.
This reduces the effects of variation.
This should reduce the number of generations until
convergence (assuming a large enough population).
No theoretical model as of now.
13. Description of Experiments
Operators
Binary tournament selection without replacement.
Three replacement types
Full replacement.
Elitist replacement (50% worst are replaced).
Restricted tournament replacement (niching).
Models with various levels of spurious linkage.
Parameters
Optimal population size obtained by bisection.
Runs stop when a solution close enough to the optimum is
reached (one linkage group is allowed to end up incorrect).
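The bisection method for the optimal (near-minimal) population size can be sketched as follows, where `succeeds` is a hypothetical predicate reporting whether runs with population size N reliably reach the optimum:

```python
def bisect_population_size(succeeds, n_max=100000):
    """Find a near-minimal population size, assuming `succeeds(N)` is
    monotone: once some N succeeds, all larger N succeed as well."""
    lo, hi = 1, 2
    while hi < n_max and not succeeds(hi):  # double until the first success
        lo, hi = hi, hi * 2
    while hi - lo > 1:                      # bisect down to the threshold
        mid = (lo + hi) // 2
        if succeeds(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

In practice `succeeds(N)` would run the EDA several times (e.g. 10 runs) and report whether all of them found the optimum.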
14. Population Size (Full Replacement)
[Figure: growth of the population size with the spurious linkage group size. Panel (a) compares the actual population sizes with the gambler's ruin and initial-supply models; panel (b) shows the ratio of the population sizes with and without spurious linkage.]
Increase of population size with kspurious is exponential.
Theory provides a conservative bound.
15. Population Size (All Replacement Strategies)
[Figure 2: Growth of the population size with respect to the group size for a problem of 300 bits. The left-hand side shows the actual population sizes compared to the theoretical model; the right-hand side shows the ratio of the population sizes with spurious linkage and with no spurious linkage.]
[Figure 3: Growth of the population size with respect to the spurious linkage group size for (a) full replacement, (b) elitist replacement, and (c) RTR.]
Increase of population size with kspurious is similar in all cases.
Figure 1(a) shows the average number of spurious linkage groups (groups of size at least 2) for each problem size; the results indicate that the number of such groups increases approximately linearly with problem size. Figure 1(b) shows the average size of spurious linkage groups, which is close to two for each problem size.
16. Number of Generations (All Replacement Strategies)
[Figure 4: Growth of the number of generations with respect to the spurious linkage group size for (a) full replacement, (b) elitist replacement, and (c) RTR.]
Number of generations slightly decreases with kspurious (full and elitist replacement).
Niching (restricted tournament replacement): the number of generations dramatically increases!
17. Spurious Linkage in Multivariate EDAs
Experiment
Use optimal population size in ECGA.
Observe spurious dependencies in actual models.
[Figure 1: (a) average number of spurious linkage groups (size > 1), (b) average size of spurious linkage groups (size > 1), and (c) average linkage group size, plotted against problem size for full, elitist, and RTR replacement.]
Figure 1: The average number of spurious linkage groups (groups of size ≥ 2), the average size
of linkage groups of size ≥ 2, and the average linkage group size (including all linkage groups) for
ECGA on onemax. Three replacement strategies are considered: full replacement, elitist replace-
ment and RTR. For each problem size and replacement strategy, the results represent an average
over 100 runs (10 bisections of 10 runs each).
18. Conclusions and Future Work
Conclusions
Population size increases exponentially with kspurious .
Number of generations mostly unaffected.
But for niching, the number of generations skyrockets!
Spurious dependencies should not be ignored.
Future work
From our model to multivariate EDAs
In most EDAs, population sizing is driven by model building.
The models almost always contain spurious dependencies.
How do the models interact?
Dramatic increase in the number of generations with niching
Explain why.
Propose ways to deal with it.
19. Acknowledgments
Acknowledgments
NSF; NSF CAREER grant ECS-0547013.
University of Missouri; High Performance Computing
Collaboratory sponsored by Information Technology Services;
Research Award; Research Board.