An Adaptive Selection Scheme for Balancing
Exploitation and Exploration
By
Muhammad Riyad Parvez
Student ID: 200605043
Department of Computer Science and Engineering
Bangladesh University of Engineering and Technology
March 2012
DECLARATION
I, hereby, declare that the work presented in this thesis is the outcome of the investigation
performed by me under the supervision of Dr. Md. Monirul Islam, Associate Professor,
Department of Computer Science and Engineering, Bangladesh University of Engineering
and Technology, Dhaka. I also declare that neither this thesis nor any part thereof has been or is
being submitted elsewhere for the award of any degree or diploma.
Signature
(Muhammad Riyad Parvez)
Candidate
Countersigned
(Dr. Md. Monirul Islam)
Supervisor
Acknowledgements
First, I would like to express my profound gratitude and honor to Dr. Md. Monirul Islam,
Associate Professor, Department of Computer Science and Engineering, Bangladesh University of
Engineering and Technology, for his continuous support, advice and care. His endless patience,
scholarly guidance, continual encouragement, constant and energetic supervision, constructive
criticism, valuable advice, reading many inferior drafts and correcting them at all stages have made it
possible to complete this thesis.
I would also like to thank all the faculty members and staff of Department of CSE, BUET, for their
support and cooperation.
Finally, I would like to express my deep respect and gratitude to my parents and my family, and to
Charles Darwin, from whom evolutionary algorithms get their aesthetic beauty.
Abstract
Evolutionary algorithms (EAs) are successfully applied to problems where typical optimization
algorithms and local search methods fail. Like any search algorithm, an EA has to face the
conflicting goals of local exploitation and global exploration during the search process. The
success of an EA lies in its ability to explore and exploit simultaneously. However, an EA is
also prone to losing population diversity in early generations, which results in the population
being trapped in local optima and degrades its performance. Though numerous schemes have been
suggested at various stages of an EA to prevent premature convergence, maintaining the balance
between exploration and exploitation has not been attempted at the survivor selection stage.
Existing selection strategies focus either on exploiting or on exploring.
This thesis introduces the Adaptive Survivor Selection Strategy (ASSS), a new approach that
tries to maintain the required amount of diversity at the survivor selection stage. Along with
the new selection strategy, this thesis also presents new criteria for measuring diversity, both
for an individual and for the whole population. The key idea of this strategy is to measure
diversity across the population, calculate the amount of diversity needed at that time, and try
to attain that diversity level by selecting survivors using the newly introduced diversity
measurement technique. ASSS uses both the fitness of an individual and how diverse the
individual is with respect to the current population to compute an adaptive survivor selection
fitness function. Using that fitness, we simply weed out individuals that are inferior in terms
of fitness and diversity. This survivor fitness function is adaptive, which gives it control
over selection pressure.
Performance in maintaining the required level of diversity at any point of the algorithm is
evaluated on a number of benchmark numerical optimization problems, and the results are compared
with several existing selection schemes. Experimental results show that ASSS yields significant
performance gains in managing diversity for any sort of EA.
Table of Contents
Chapter 1: Introduction
1.1 General Introduction
1.2 Objective of the Thesis
1.3 Thesis Organization
Chapter 2: Background
2.1 When EA is Needed
2.2 Advantages of EA
2.3 Disadvantages of EA
2.4 Canonical Structure of EA
2.5 Representation of Gene
2.6 Major Branches of EA
2.6.1 Genetic Algorithm
2.6.2 Evolutionary Programming
2.6.3 Evolutionary Strategy
2.6.4 Genetic Programming
2.6.5 Memetic Algorithm
2.7 Existing Work
2.7.1 Dynamic Parameter Control
2.7.2 Maintaining Diversity and Multi-population GAs
2.7.3 Memory Based Genetic Algorithm
2.7.4 Mutation Based Work
2.7.5 Survivor Selection Based Work
Chapter 3: Proposed Algorithms
3.1 Dual Population Genetic Algorithm
3.1.1 Advantages
3.1.2 Disadvantages
3.1.3 Recommendation
3.2 Modified DPGA Proposal
3.2.1 Structure of Individual
3.2.2 Initialization
3.2.3 Parent Selection
3.2.4 Generating Parent Individual On The Fly
3.2.5 Mutation
3.2.6 Survivor Selection
3.2.6.1 Exploited Individual
3.2.6.2 Explored Individual
3.2.6.3 Normal Individual
3.2.7 Schedule of T
3.2.8 Advantages
3.3 New Survivor Selection Strategy
3.4 New Mutation Strategy
3.4.1 Laplace Distribution
3.4.2 Slash Distribution
3.4.3 Student's t-Distribution
Chapter 4: Experimental Study
4.1 Modified DPGA
4.1.1 Pitfalls of Modified DPGA
4.2 Adaptive Survivor Selection Strategy
Chapter 5: Conclusion
5.2 Future Works
5.2.1 Modified DPGA
5.2.2 Adaptive Survivor Selection
5.2.3 New Distribution Based Mutation
References
Appendix
List of Figures
Figure 1.1(a): Change in best fitness (best solution) with number of generations
Figure 1.1(b): Comparison between Random Search, EA and Problem Tailored Search
Figure 2.4: Basic skeleton of an Evolutionary Algorithm
Figure 2.6.4: Individual structure of GP
Figure 2.7.4: Probability Density Function (PDF) of Gaussian distribution
Figure 2.7.5: Truncation Selection
Figure 3.1(a): Offspring Generation of DPGA
Figure 3.1(b): Reserve Population Fitness Function
Figure 3.4: Probability Density Function of Stable Family
Figure 3.4.1(a): Probability Density Function of Laplace Distribution
Figure 3.4.1(b): Comparison of Gaussian and Laplace Distribution
Figure 3.4.2: Probability Density Function of Slash Distribution at different parameters
Figure 3.4.3: Probability Density Function for Student's t-distribution with different degrees of freedom
Figure 4.2(a): Change in diversity across generations
Figure 4.2(b): Number of buckets searched
List of Abbreviations
MPGA Multi Population Genetic Algorithm
DPGA Dual Population Genetic Algorithm
GA Genetic Algorithm
StGA Standard Genetic Algorithm
EA Evolutionary Algorithm
EC Evolutionary Computing
EP Evolutionary Programming
MA Memetic Algorithm
FEP Fast Evolutionary Programming
ASSS Adaptive Survivor Selection Strategy
Chapter 1
Introduction
1.1 General Introduction
Evolutionary Algorithm (EA) is the study of computational systems that use ideas and draw
inspiration from natural evolution. An EA is a generic, population-based, meta-heuristic
optimization algorithm, and falls into the category of bio-inspired computing. It uses the
selection, crossover, and mutation mechanisms borrowed from natural evolution, and the
survival-of-the-fittest principle lies at the heart of EA [1] [2]. Evolutionary algorithms are
often viewed as function optimizers, although the range of problems to which EAs are applied is
quite broad. One of the many advantages of EAs is that they do not require very broad domain
knowledge, although domain knowledge can be introduced into them.
The fitness curve over generations for an EA is asymptotic in nature. Fitness improvement in the
earlier generations of an EA is rapid, but the rate of improvement decreases over time. After a
certain number of generations, the improvement in best fitness becomes negligible; at that point
we say the population has converged. It is expected that the population will converge to a good
enough solution, but sometimes the population converges to a local optimum, which is not an
acceptable result. This phenomenon is called premature convergence.
Figure 1.1(a): Change in best fitness (best solution) with number of generations
EAs perform better than random search because of their exploitative behavior: they use random
walks, but also try to exploit good solutions. They also outperform local greedy search. Local
greedy searches are exploitative in nature and are often trapped in local optima, but an EA,
with its random walk and by maintaining the required level of diversity, is less likely to be
trapped in local optima. Problem-tailored searches outperform EAs only on the problem for which
the search is tailored, using deep domain knowledge of that problem. Such deep domain knowledge
is not readily available, and incorporating it into a problem-tailored search is difficult.
Figure 1.1(b): Comparison between Random Search, EA and Problem Tailored Search [4]
1.2 Thesis Objective
This thesis mainly focuses on maintaining the diversity of single-population algorithms. It is
frequently observed that populations lose diversity too early and their individuals become
trapped in local optima. For lack of diversity, trapped individuals cannot escape the basin of a
local optimum. This phenomenon is called Premature Convergence. The objective of this thesis is
to investigate better schemes that can maintain the diversity of a population and also give
control over diversity. The quest is for an adaptive diversity-maintaining scheme.
The thesis work is done in three focused areas:
1. Modifying the Dual Population Genetic Algorithm (DPGA) so that it can properly
manage diversity.
2. Seeking a survivor selection technique that is adaptive and gives more control
over diversity at any point of the algorithm.
3. Examining probability distributions, other than those already used, which can
give an appropriate amount of jump at any stage of evolution.
1.3 Thesis Organization
The rest of the thesis is organized as follows. Chapter 2 introduces the fundamentals of
evolutionary algorithm, with its operators and processes. The essential terms related to
evolutionary algorithm are explained with examples. The strengths, limitations, and
applications of evolutionary algorithm are also mentioned.
In Chapter 3, we introduce new evolutionary strategies, entitled Modified DPGA,
Adaptive Survivor Selection Strategy, and New Mutation Based on Distributions, to balance the
exploitative and explorative features of the standard evolutionary algorithm. The different
stages, operators, and procedures of Modified DPGA, ASSS, and Mutation Based on
Distributions are described in detail. It is also explained how they differ substantially from
other existing works.
Chapter 4 evaluates Modified DPGA and ASSS on a number of benchmark numerical
optimization problems and makes comparisons with several other existing works. Although
Modified DPGA did not perform well, we gained valuable insight into how it can be modified
further to gain more performance. An in-depth experimentation with the parameters,
operators and the stages of ASSS, with their effects on population fitness and diversity, is
also carried out. Finally, in Chapter 5, we summarize our work and provide directions for
future research.
Chapter 2
Background
Evolutionary Algorithms (EAs) consist of several heuristics, which are able to solve
optimization tasks by imitating some aspects of natural evolution. They may use different
levels of abstraction, but they always work on whole populations of possible solutions
for a given task. EAs are an established set of heuristics, which are flexible to use and impose
only negligible requirements on the optimization task.
2.1 When EA is Needed
 The search space is large, complex, or poorly understood.
 Domain knowledge is scarce, or expert knowledge is difficult to encode to narrow the
search space.
 Only a target (fitness) function is provided.
 No mathematical analysis is available.
 Traditional search methods fail.
 A good enough solution, not necessarily the best, is needed.
 Local search methods cannot give good enough solutions.
 Continuous optimization problems.
2.2 Advantages of EA
 Applicable to a wide range of problems.
 Useful in areas without good problem specific techniques.
 No explicit assumptions about the search space necessary.
 Easy to implement.
 Any-time behavior.
2.3 Disadvantages of EA
 Problem representation must be robust.
 No general guarantee of finding an optimum.
 No solid theoretical foundations (yet).
 Parameter tuning: a trial-and-error process (though self-adaptive variants exist in
evolution strategies).
 Sometimes high memory requirements.
 Implementation: a high degree of freedom.
2.4 Canonical Structure of EA
EAs are a family of algorithms; no single definite structure exists among them, although
most EAs more or less follow this structure:
1. Initialization: The initial population of candidate solutions is usually generated
randomly across the search space. However, domain specific knowledge or other
knowledge can easily be incorporated.
2. Evaluation: Once the population is initialized or offspring population is created,
the fitness value of the candidate solutions is evaluated.
3. Parent Selection: Selection allocates more copies of those solutions with higher
fitness values and thus imposes the survival-of-the-fittest mechanism on the
candidate solutions. The main idea of selection is to prefer better solutions to
worse ones, and many selection procedures have been proposed to accomplish this
idea, including roulette-wheel selection, stochastic universal selection, ranking
selection and tournament selection, some of which are described in the next
section.
4. Recombination: Recombination combines parts of two or more parental solutions
to create new, possibly better solutions (i.e. offspring). There are many ways of
accomplishing this (some of which are discussed in the next section), and
competent performance depends on a properly designed recombination
mechanism. The offspring under recombination will not be identical to any
particular parent and will instead combine parental traits in a novel manner.
5. Mutation: While recombination operates on two or more parental chromosomes,
mutation locally but randomly modifies a solution. Again, there are many
variations of mutation, but it usually involves one or more changes being made to
an individual’s trait or traits. In other words, mutation performs a random walk in
the vicinity of a candidate solution.
6. Replacement: The offspring population created by selection, recombination, and
mutation replaces the original parental population. Many replacement techniques
such as elitist replacement, generation-wise replacement and steady-state
replacement methods are used in GAs.
7. Repeat steps 2–6 until a terminating condition is met.
Figure (2.4): Basic skeleton of an Evolutionary Algorithm
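The seven steps above can be sketched as a minimal loop. This is an illustrative sketch, not a specific algorithm from this thesis: the sphere fitness function, binary tournament parent selection, one-point crossover, elitist replacement, and all parameter values are assumptions chosen for the example.

```python
import random

def evolve(fitness, dim, pop_size=20, generations=100,
           mutation_rate=0.1, bounds=(-5.0, 5.0)):
    """Minimal EA skeleton: initialize, then repeat evaluation, parent
    selection, recombination, mutation, and replacement (steps 2-6)."""
    lo, hi = bounds
    # 1. Initialization: random candidates spread across the search space.
    pop = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Evaluation (minimization: lower fitness is better).
        best = min(pop, key=fitness)
        # 3. Parent selection: binary tournament.
        def pick():
            a, b = random.sample(pop, 2)
            return a if fitness(a) <= fitness(b) else b
        offspring = []
        while len(offspring) < pop_size - 1:
            p1, p2 = pick(), pick()
            # 4. Recombination: one-point crossover.
            cut = random.randrange(1, dim) if dim > 1 else 0
            child = p1[:cut] + p2[cut:]
            # 5. Mutation: occasional small Gaussian perturbation per gene.
            child = [g + random.gauss(0, 0.1)
                     if random.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        # 6. Replacement: elitist - carry the best individual over unchanged.
        pop = [best] + offspring
    return min(pop, key=fitness)

best = evolve(lambda x: sum(g * g for g in x), dim=3)
print(len(best))  # 3: one gene per dimension
```

The elitist replacement in step 6 guarantees that the best-so-far solution is never lost, matching the any-time behavior mentioned in Section 2.2.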
2.5 Representation of Gene
Individual representations are typically divided into two types:
1. Genotypic Representation: Genes are internal structures that determine the physical
characteristics of an individual, usually represented by an array of letters like the
genes in human DNA. In an EA, the genotype is represented by a bit-string. Genotypic
representation is used extensively in Genetic Algorithms, but it has some limitations:
most real-world problems are not naturally in genotypic form, so we have to devise a
scheme to represent the genotype as a bit-string, and the performance of the algorithm
depends on that bit-string representation.
2. Phenotypic Representation: Individuals are represented by real-valued vectors, so
there is no need to convert them to any other representation; the algorithm works
directly on the real-valued vectors of the problem. It is used extensively in
Evolution Strategies and Evolutionary Programming, and in real-valued function
optimization.
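As a sketch of how the two representations relate, the function below decodes a genotypic bit-string into the real (phenotypic) value it encodes; the 10-bit precision and the interval [-5, 5] are arbitrary choices made for this example.

```python
def decode(bits, lo, hi):
    """Map a genotype (bit-string) to a phenotype (real value) in [lo, hi]."""
    as_int = int(bits, 2)             # bit-string -> integer
    max_int = 2 ** len(bits) - 1      # largest representable integer
    return lo + (hi - lo) * as_int / max_int

# Genotypic representation: the EA's operators manipulate the raw bit-string...
genotype = "1100000000"
# ...while fitness is evaluated on the decoded phenotypic value.
phenotype = decode(genotype, lo=-5.0, hi=5.0)
print(round(phenotype, 3))
```

The number of bits fixes the precision of the mapping, which is one reason the algorithm's performance depends on the chosen bit-string representation.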
2.6 Major Branches of EA
EAs are divided into four major branches.
2.6.1 Genetic Algorithm
Genetic Algorithm (GA) was first formulated by John Holland. Holland's original GA is
called the standard Genetic Algorithm; it uses two parents and produces two offspring. It
simulates Darwinian evolution. Search operators are applied only to the genotypic representation;
hence it is called a Genotypic Algorithm. It emphasizes the role of crossover and uses mutation
only as a background operator. GAs extensively use binary strings as the representation of
individuals.
2.6.2 Evolutionary Programming
Evolutionary Programming (EP) was first proposed by David Fogel [2]. It is closer to
Lamarckian evolution. It does not use any kind of crossover; only mutation is used, both for
exploitation and exploration. Individuals are represented by two parts: object variables and
mutation step sizes. The object variables are essentially real-valued vectors, i.e. phenotypes,
so EP is called a Phenotypic Algorithm.
2.6.3 Evolutionary Strategies
Evolutionary Strategies (ES) were first proposed by Ingo Rechenberg. Individuals are
represented by real-valued vectors, which makes ES a good optimizer of real-valued functions.
Like EP, ES is a Phenotypic Algorithm. Mutation plays the main role, though crossover is also
used, and ES has a special self-adapting mutation step size. ES has some basic notation:
1. (p,c): The p parents 'produce' c children using mutation. Each of the c children is then
assigned a fitness value, depending on its quality in the problem-specific environment.
The best (fittest) p children become the next generation's parents; that is, the c
children are sorted by their fitness value and the first p individuals are selected
(c must be greater than or equal to p).
2. (p+c): The p parents 'produce' c children using mutation, and each child is assigned a
fitness value as above. The best (fittest) p individuals of both parents and children
become the next generation's parents; that is, the c children together with the p
parents are sorted by their fitness value and the first p individuals are selected.
3. (p/r,c): As in (p,c), but the children are produced using both mutation and
recombination.
4. (p/r+c): As in (p+c), but the children are produced using both mutation and
recombination.
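The difference between the comma and plus schemes above can be sketched directly; the toy fitness values below (where smaller is better) are illustrative assumptions.

```python
def comma_selection(parents, children, fitness, p):
    """(p,c): the next parents come from the children only (requires c >= p)."""
    assert len(children) >= p
    return sorted(children, key=fitness)[:p]

def plus_selection(parents, children, fitness, p):
    """(p+c): the next parents come from parents and children together."""
    return sorted(parents + children, key=fitness)[:p]

# Toy minimization problem: an individual's fitness is its own value.
parents, children = [1, 9], [3, 4, 8, 10]
print(comma_selection(parents, children, fitness=lambda x: x, p=2))  # [3, 4]
print(plus_selection(parents, children, fitness=lambda x: x, p=2))   # [1, 3]
```

Note that comma selection discards the good parent 1, which limits the survival of any one individual and helps self-adaptation, while plus selection never loses the best solution found so far.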
2.6.4 Genetic Programming
Genetic Programming (GP) was put forward by John Koza. GP evolves computer programs. It
is a specialization of genetic algorithms (GA) where each individual is a computer program.
It is a machine learning technique used to optimize a population of computer programs
according to a fitness landscape determined by a program's ability to perform a given
computational task. Trees can be easily evaluated in a recursive manner. Every tree node has
an operator function and every terminal node has an operand, making mathematical
expressions easy to evolve and evaluate. Genetic programming starts with a primordial ooze
of thousands of randomly created computer programs. This population of programs is
progressively evolved over a series of generations. The evolutionary search uses the
Darwinian principle of natural selection (survival of the fittest) and analogs of various
naturally occurring operations, including crossover (sexual recombination), mutation, gene
duplication, and gene deletion. Genetic programming sometimes also employs developmental
processes by which an embryo grows into a fully developed organism. It uses both mutation
and crossover. Trees are often used as the data structure for individuals, although non-tree
representations have been suggested and successfully implemented. While other fields of EA
have matured into mainstream usage, GP is still in its infancy: because of the representation
of programs, the huge search space, and the complex operations needed to generate better
individuals, GP is not mainstream yet.
Figure (2.6.4): Individual structure of GP
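The recursive tree evaluation described above can be sketched as follows; the tuple-based tree encoding and the choice of operators are assumptions made for this example.

```python
import operator

# Operator nodes carry a function; terminal nodes carry an operand.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(node, env):
    """Recursively evaluate a GP expression tree.
    A node is either a terminal (a number, or a variable name looked up
    in env) or a tuple (op, left, right)."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):   # variable terminal
        return env[node]
    return node                 # constant terminal

# The expression (x * x) + 3 encoded as a tree:
tree = ("+", ("*", "x", "x"), 3)
print(evaluate(tree, {"x": 4}))  # 19
```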
2.6.5 Memetic Algorithm
Although memetic algorithms do not fall strictly into the EA category, they incorporate other
search techniques into EAs. The combination of Evolutionary Algorithms with Local Search
Operators that work within the EA loop has been termed "Memetic Algorithms" (MA). Quite
often, MAs are also referred to in the literature as Baldwinian Evolutionary Algorithms (EAs),
Lamarckian EAs, cultural algorithms, or genetic local search. After individuals are generated,
local search is performed on them. The frequency and intensity of individual learning directly
define the degree of evolution (exploration) against individual learning (exploitation) in the
MA search, for a given fixed limited computational budget. Clearly, a more intense
individual learning provides greater chance of convergence to the local optima but limits the
amount of evolution that may be expended without incurring excessive computational
resources. Therefore, care should be taken when setting these two parameters to balance the
computational budget available in achieving maximum search performance. When only a
portion of the population individuals undergo learning, the issues on which subset of
individuals to improve need to be considered to maximize the utility of MA search.
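As a sketch of the "individual learning" step an MA runs inside the EA loop, the following hill-climber improves a single individual; the step count, step size, and sphere fitness function are illustrative assumptions, not a method from this thesis.

```python
import random

def local_search(x, fitness, steps=20, step_size=0.1):
    """Hill-climb around one individual: the individual-learning phase
    that an MA performs after the evolutionary operators have run."""
    best, best_f = x, fitness(x)
    for _ in range(steps):
        cand = [g + random.gauss(0, step_size) for g in best]
        f = fitness(cand)
        if f < best_f:          # keep improvements only (minimization)
            best, best_f = cand, f
    return best

# Lamarckian MA step: the improved individual is written back into the
# population, so the learned traits are inherited by its offspring.
sphere = lambda x: sum(g * g for g in x)
individual = [2.0, -1.5]
improved = local_search(individual, sphere)
print(sphere(improved) <= sphere(individual))  # True
```

Raising `steps` intensifies exploitation at the cost of the evolutionary (explorative) budget, which is exactly the trade-off discussed above.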
2.7 Existing Works
2.7.1 Dynamic Parameter Control
A variety of previous works have proposed methods of dynamically adjusting the parameters
of GA or other evolutionary algorithms. These methods include deterministic parameter
control, adaptive parameter control, and self-adaptive parameter control. The simplest
technique is the deterministic parameter control, which adjusts parameters according to a
predetermined policy. Since it controls the parameters deterministically, it cannot adapt to the
changes that occur during the execution of an algorithm.
Adaptive parameter control exploits feedback from the evolution of a population to control
the parameters. A notable example is the 1/5 success rule for adaptive Gaussian mutation, widely
used in evolution strategy algorithms. According to this method, the mutation step size is
increased if more than 20% of the mutations are successful and reduced otherwise. However, this
method cannot be applied to algorithms adopting anything other than the real-number
representation. Finally, self-adaptive parameter control encodes the parameters into the
chromosomes and lets them evolve with the other genes. Although elegant, its applicability and
effectiveness over a broad range of problems have not yet been shown.
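The 1/5 success rule described above can be sketched as follows; the adjustment factor 0.85 is a commonly used value, assumed here for illustration.

```python
def adapt_step_size(sigma, successes, trials, factor=0.85):
    """1/5 success rule: grow the mutation step size when more than 1/5
    of recent mutations improved fitness, shrink it otherwise."""
    success_rate = successes / trials
    if success_rate > 0.2:
        return sigma / factor   # search is easy here: take bigger steps
    if success_rate < 0.2:
        return sigma * factor   # search is hard: take smaller steps
    return sigma

sigma = 1.0
sigma = adapt_step_size(sigma, successes=6, trials=10)  # 60% success -> grow
print(round(sigma, 3))  # 1.176
```

The feedback loop is what makes this *adaptive* rather than deterministic control: the schedule of sigma depends on what actually happened during the run.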
2.7.2 Maintaining Diversity and Multi-population Genetic Algorithms
Multi-population GAs (MPGAs) maintain diversity by evolving multiple subpopulations which are
spatially separated [6]. The Island-model GA (IMGA), a typical example of an MPGA,
evolves two or more subpopulations and uses periodic migration for the exchange of
information between them. The number and size of the populations of an IMGA
are predetermined and kept unchanged during the algorithm's execution. However, other
MPGAs, such as the multinational GA, the forking GA, the bi-objective multi-population
algorithm, and the variable island GA, can adjust the number and size of populations
dynamically by splitting a population into two smaller ones or combining two similar ones.
The performance of an IMGA is sensitive to the migration policy, migration rates and size,
and the particular topology used, because they determine the speed at which good solutions
spread among the subpopulations. A variety of previous works have studied the effect of these
migration parameters both theoretically and experimentally.
2.7.3 Memory Based Genetic Algorithm
Diploid GA, GA with unexpressed genes, dual GA (dGA), and primal-dual GA (PDGA) have
adopted complementary and dominance mechanisms to maintain or provide population
diversity. Most organisms in nature have a great number of genes in their chromosomes and
only some of the dominant genes are expressed in a particular environment. The repressed
genes are considered as a means of storing additional information and providing a latent
source of population diversity.
Diploid GAs use diploid chromosomes which are different from natural
ones in that the two strands of the diploid chromosomes are not complementary. Only some
genes in a diploid chromosome are expressed and used for fitness evaluation by some
predetermined dominance rules. GAUG is different from diploid GA in that it uses haploid
chromosomes, but it also incorporates some unexpressed genes into its chromosomes. The
unexpressed genes in GAUG are not used for fitness evaluation but used for preserving
diversity.
dGAs and PDGAs also have haploid chromosomes in the population, but the chromosomes
are sometimes interpreted complementarily to provide additional diversity. In dGA, each
chromosome is attached with an additional bit which indicates whether the chromosome
should be interpreted as it is or as complemented. In PDGA, some bad-looking chromosomes
are interpreted both as complemented and original, and the original one is replaced by the
complemented one if the latter gives better evaluations. Since the additional diversity
provided by memory-based algorithms makes it easier to adapt to extreme environmental
changes, these methods are frequently used for dynamic optimization problems.
2.7.4 Mutation Based Work
ES and EP use mutations exclusively, both for maintaining diversity and for exploitation.
Mutations can be divided into several categories. Classified by their uniformity across
generations, they are:
1. Uniform Mutation: When the mutation step size or mutation rate is uniform
regardless of generation, it is called uniform mutation. Its usage is not very
high because of its deterministic behavior across generations.
2. Non-uniform Mutation: If the mutation step size or mutation rate varies with the
generation, it is called non-uniform mutation. Usually the step size or mutation
rate is higher in the initial generations and is decreased gradually as the
generations progress. It is used frequently because it gives an option for
governing the diversity rate: when diversity is needed, it is facilitated by a
large step size, and when convergence is needed, it is facilitated by a small
step size.
For genetic algorithms, random bit-flipping is used for mutation. Random bit changing has
some issues. For example, a bit change in a higher position of the bit-string has more effect
than a bit change in a lower position. Also, for some bit-strings, moving to the immediately
next or previous bit-string requires changing all the bits, so exploitation becomes difficult.
This is called the Hamming Cliff problem. Using Gray code can mitigate the effect of this
problem.
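The Gray-code remedy can be illustrated directly: the classic 0111 → 1000 Hamming cliff disappears under Gray encoding, because adjacent integers always differ in exactly one Gray-coded bit.

```python
def to_gray(n):
    """Convert a binary integer to its Gray-code equivalent."""
    return n ^ (n >> 1)

def hamming(a, b):
    """Number of differing bits between two integers."""
    return bin(a ^ b).count("1")

# 7 -> 8 is a Hamming cliff in plain binary: 0111 -> 1000 flips all 4 bits...
print(hamming(7, 8))                    # 4
# ...but in Gray code the same step flips exactly one bit.
print(hamming(to_gray(7), to_gray(8)))  # 1
```

A one-bit mutation on a Gray-coded individual can therefore always reach the neighboring value, which is exactly the small exploitative step the plain binary encoding makes hard.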
For mutation, a random step size is needed to introduce a random walk into the search space.
For random number generation (RNG), the Gaussian distribution is most used. It has a
bell-shaped curve and is defined by two parameters: a position parameter (mean, µ) and a scale
parameter (standard deviation, σ); it is denoted by N(µ, σ). Always µ = 0, and usually σ = 3,
i.e. N(0, 3), is used for random number generation. Mutation using the Gaussian distribution
is called Gaussian mutation.
Figure (2.7.4): Probability Density Function (PDF) of Gaussian distribution
Xin Yao used two more distributions for RNG:
1. Cauchy Distribution
2. Levy Distribution
The Gaussian, Cauchy and Levy distributions all have the same bell-curve-shaped PDF and have
similar parameter sets to the Gaussian. Both the Cauchy and Levy distributions have fatter
tails than the Gaussian, which means they can give more long jumps, producing more diverse
individuals that are less prone to getting trapped in local optima. Mutations using the Cauchy
and Levy distributions as the RNG are called Cauchy mutation and Levy mutation, respectively.
Xin Yao also uses an adaptive mutation parameter: every individual is represented by a pair
(x, η), where x is a real-valued vector and η is the adaptive mutation parameter, i.e. the
scale parameter (standard deviation) of the distribution.
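The effect of the fatter tail can be sketched by comparing how often each distribution produces a long jump; the threshold, scale values, and sample size are arbitrary choices for the illustration, and the Cauchy sample is drawn with the standard inverse-transform method.

```python
import math
import random

random.seed(1)

def gaussian_step(sigma=1.0):
    """Gaussian mutation step: mostly small, local moves."""
    return random.gauss(0, sigma)

def cauchy_step(scale=1.0):
    """Cauchy mutation step via inverse transform: same bell shape, but a
    fat tail that occasionally gives very long jumps."""
    return scale * math.tan(math.pi * (random.random() - 0.5))

# Count how often each operator jumps farther than 5 scale units.
n = 100_000
gauss_long = sum(abs(gaussian_step()) > 5 for _ in range(n))
cauchy_long = sum(abs(cauchy_step()) > 5 for _ in range(n))
print(gauss_long < cauchy_long)  # the fat tail makes long jumps far likelier
```

For the Gaussian, a jump beyond 5σ is essentially never seen in 100,000 draws, while the Cauchy produces such jumps routinely; this is the mechanism behind the escape-from-local-optima behavior described above.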
2.7.5 Survivor Selection Based Work
Survivor selection is usually deterministic. In this phase of the algorithm, selection pressure
is applied to the individuals. Several survivor selection schemes exist:
1. Naïve Survivor Selection: Basically follows the survival-of-the-fittest principle.
Individuals are selected for the next generation based on their fitness value, and
lower-fitness individuals are weeded out. This is sometimes risky, because low-fitness
individuals can carry latent genes which may produce better individuals in later
generations.
2. Elitist Selection: The population maintains spots for the best individuals so that they
do not get lost across generations. A certain portion of the best individuals is
transferred directly to the next generation without any modification. This ensures that
even if the algorithm cannot improve on the current solutions, the best solution
remains and is returned at the end of the algorithm.
3. Truncation Selection: Truncation selection simply retains the fittest x% of the
population. These fittest individuals are duplicated into the next generation so that
the population size is maintained. Less fit candidates are culled without being given
the opportunity to evolve into something better. It very often results in premature
convergence; its only advantage is rapid convergence.
Figure (2.7.5): Truncation Selection
4. Fitness Proportionate Reproduction: The same as the roulette wheel selection scheme.
Individuals are transferred directly to the next generation based on their proportionate
fitness value. Individuals of lower fitness still have some chance to survive, so that
latent genes can survive through the generations even if they have not yet produced
good individuals.
5. Niching Methods: Niching methods strive to maintain niches [9] [10]; that is, they
ensure that individuals of one niche do not have to compete with individuals of other
niches. The advantage is that pre-existing diversity is maintained, but convergence
also becomes harder as the selection pressure is lower. Niching methods are divided
into two categories:
i) Fitness Sharing: In nature, individuals of the same species compete with each other
for fixed resources [13]. Likewise, in fitness sharing, individuals in the same region
share the fixed fitness value assigned to that region: fitness is a shared resource of
the population. The population is first divided into niches. A region is defined by
the sharing radius, which defines the niche size. This scheme is very sensitive to the
value of the fitness assigned per region and to the sharing radius. The population
does not converge as a whole; convergence takes place within the niches. Sharing can
be done at the genotypic level (Hamming distance) or the phenotypic level (Euclidean
distance). If the sharing radius is too small, it has practically no effect on the
process; if too large, several individual peaks will 'melt' into one.
ii) Crowding: Similar individuals in a natural population, often of the same species,
compete against each other for limited resources, while dissimilar individuals tend to
occupy different niches and typically do not compete. Crowding uses individuals
newly entering a population to replace similar individuals. A random sample of
CF (Crowding Factor) individuals is taken from the population. A larger crowding
factor indicates less tolerance for similar solutions; smaller values indicate that
similar solutions are more welcome. New members of a particular species replace
older members of that species, not members of other species. Crowding does not
increase the diversity of the population; rather, it strives to maintain the pre-
existing diversity, and it is not directly influenced by the fitness value. Crowding is
divided into:
(1) Deterministic Crowding: A new individual always replaces the most similar
individual if it has a better fitness value.
(2) Probabilistic Crowding: Primarily a distance-based niching method. The main
difference is the use of a probabilistic rather than deterministic acceptance
function. Stronger individuals no longer always win over weaker individuals; they
win with probability proportional to their fitness, and thus we get a restorative
pressure. The two core ideas of probabilistic crowding are to hold tournaments
between similar individuals and to let those tournaments be probabilistic.
6. Deterministic Sampling: The average fitness of the population is calculated, and the
fitness of each individual is divided by this average; only the integer part of this ratio
is stored, giving the number of guaranteed copies of the individual in the next
generation (at least one copy if the ratio is one or higher). The remaining free places
in the new population are filled with the individuals having the greatest fractional
parts.
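The deterministic sampling rule above can be sketched as follows. This is an illustrative sketch, not code from the thesis; the function name and the list-of-pairs representation are assumptions.

```python
def deterministic_sampling(population, fitnesses):
    """Copy each individual floor(f / avg_f) times, then fill the
    remaining free places with the individuals of greatest fraction."""
    avg = sum(fitnesses) / len(fitnesses)
    ratios = [f / avg for f in fitnesses]
    next_gen = []
    for ind, r in zip(population, ratios):
        next_gen.extend([ind] * int(r))  # integer part: guaranteed copies
    # Fill the remaining places by largest fractional part.
    frac_order = sorted(range(len(population)),
                        key=lambda i: ratios[i] - int(ratios[i]),
                        reverse=True)
    for i in frac_order:
        if len(next_gen) >= len(population):
            break
        next_gen.append(population[i])
    return next_gen
```

For example, with fitnesses (4, 3, 2, 1) the average is 2.5, so the two fittest individuals each receive one guaranteed copy, and the two remaining places go to the individuals with fractional parts 0.8 and 0.6.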
Chapter 3
Algorithm Proposal
3.1 Dual Population Genetic Algorithm
Dual Population Genetic Algorithm (DPGA) is a genetic algorithm which uses two
populations instead of one, with two different evolutionary objectives, to avoid premature
convergence [11] [12] [13]. The main population plays the role of the population of an
ordinary genetic algorithm: it evolves to find a good solution of high fitness value. The
additional population, called the reserve population, is employed as a reservoir of
chromosomes that are rather different from the chromosomes of the main population. Two
different fitness functions are used: the main population uses the actual fitness function (as
in a normal GA), and the reserve population uses a fitness function which gives better fitness
to chromosomes that are more different from the chromosomes of the main population.
Multi-population genetic algorithms use migration of chromosomes from one population to
another to exchange information. DPGA does not use migration; instead it uses a novel
approach called crossbreeding. Crossbreeding is performed by taking one parent from the
main population and another parent from the reserve population and performing crossover
between them. The newly born offspring are called crossbred offspring. Crossbred offspring
are then evaluated for survival in both the main population and the reserve population.
DPGA also employs inbreeding, which takes two parents from the same population and
produces offspring by crossover. These inbred offspring compete for survival in their
respective parent population.
Figure (3.1a): Offspring Generation of DPGA
Mutation plays a minimal role in DPGA; diversity is mainly provided by the reserve
population through crossbreeding, which thus plays the role of maintaining diversity in
DPGA. The amount of diversity needed at any step of DPGA is specified by a self-adaptive
parameter δ (0 < δ < 1). δ defines the distance between parents from the main population and
parents from the reserve population. As δ determines which individuals participate in
crossbreeding, we can say that δ is roughly analogous to a step size. The fitness function of
the reserve population is
fr(x) = 1 − |δ − d(M, x)| ------------------------------------(1)
Figure (3.1b): Reserve Population Fitness Function
d(M, x) is the average distance of individual x from the main population. Our focus is
therefore on crossbreeding and on the fitness function of the reserve population. δ defines
how distant the reserve population will be from the main population: δ is set to lower values
for exploitation and to higher values for exploration. If δ is kept similar for several
generations, the reserve population will start to converge at distance δ from the main
population.
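The interplay between δ and the reserve fitness can be sketched as follows, assuming the reserve fitness takes the form fr(x) = 1 − |δ − d(M, x)|, with d(M, x) the average Euclidean distance of x from the main population; the function names are illustrative, not from the thesis.

```python
import math

def avg_distance(main_pop, x):
    """Average Euclidean distance of individual x from the main population."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return sum(dist(m, x) for m in main_pop) / len(main_pop)

def reserve_fitness(main_pop, x, delta):
    """Highest for individuals whose average distance from the main
    population equals delta; falls off linearly on either side."""
    return 1.0 - abs(delta - avg_distance(main_pop, x))
```

An individual located exactly δ away from the main population receives the maximum reserve fitness of 1, so the reserve population is pulled towards a shell at distance δ around the main population.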
There are some pros and cons of DPGA:
3.1.1 Advantages
1. The reserve population preserves genes which have gone extinct in the main
population. A surviving crossbred offspring holds genes inherited from the best
individuals of the main population, so genes lost in later generations can be
recovered from the reserve population.
2. DPGA utilizes information from successful breeding. The value of δ which produces
surviving offspring is used later for selecting parents. If a crossover is unsuccessful,
δ is set to its maximum, which influences the selection of future parents.
3.1.2 Disadvantages
1. The reserve population introduces space and computational overhead. Individuals of
the main population need only one fitness evaluation, when they are created or
modified. But every individual of the reserve population must be re-evaluated
whenever δ changes value, in addition to the evaluation when it is created or
modified. If the main population has n chromosomes and the reserve population has
m, the total number of reserve fitness evaluations is O(nm).
2. When selecting parents, DPGA does not measure the diversity of the reserve
population, although the reserve population should be diverse enough for exploration
of the search space. The selected parent from the reserve population may not be
sufficiently dissimilar to the parent from the main population, so the distance between
crossbreeding parents may not be δ; in the worst case, it may be far less or far greater
than δ.
3. A crossbred child may survive in both the main and the reserve population. Diversity
then decreases, as the same individual is copied to both populations. This gives us
another insight: the crossbred child has higher fitness (according to the reserve fitness
function) than the inbred children, and the parents of the reserve population may not
be at the desired distance from the parents of the main population, i.e. the reserve
population may not contain individuals that can breed offspring at the desired step
size.
4. If crossbred offspring cannot survive in the main population, DPGA degenerates into
a single-population algorithm. If this happens for several generations, measures have
to be taken to increase the diversity of the reserve population, which incurs overhead.
5. If crossbred offspring manage to survive in the reserve population, the reserve
population will contain replicated genes of an individual of the main population,
decreasing the diversity of the reserve population further.
6. While converging, DPGA keeps the value of δ low for several generations to
facilitate convergence. From equation (1), individuals at distance δ gain more fitness
than other individuals, so individuals whose distance d(M, x) is much greater or less
than δ are replaced. As δ is set to its minimum during convergence, the individuals
most distant from the main population begin to diminish (DPGA uses the best n
individuals for survivor selection in both populations, with the same fitness function
as for parent selection), and individuals similar to those of the main population take
over the reserve population after a few generations. Hence the reserve population
also begins to converge, like the main population but at distance δ from it. When
DPGA detects that the main population has converged to a local optimum, it sets δ to
its maximum to escape. DPGA then picks the individuals most distant from the main
population, but the reserve population is already similar to the main population and
cannot provide any further diversity.
7. Diversity also depends on the success of crossover. Even if the parents for
crossbreeding are at the desired distance, they may not produce the fittest individuals.
Crossover is always a big jump to an area somewhere "in between" the two parent
areas; offspring seldom go beyond their parents.
8. In DPGA, the total gene frequency remains constant from the very beginning. As
crossover is the only operator used, new genes are never introduced. Crossbreeding
changes gene frequencies within the individual populations, but the total frequency
remains unchanged. In the worst scenario, if the best gene is missing at the
initialization of the populations, DPGA never reaches the optimum.
9. Inbreeding in the reserve population does not introduce new genes. And if the
distances of two parents are δ and −δ (selection is based on distance, not direction),
the inbred offspring will be more similar to the main population.
3.1.3 Recommendation
One of the biggest drawbacks of DPGA is the convergence of the reserve population along
with the main population. As we have seen, the current survivor selection of the reserve
population leads to its convergence. For survivor selection in the reserve population,
probabilistic crowding should therefore be used (the fitness function would remain the same
as before for parent selection). Probabilistic crowding prevents similar genes from taking
over the whole population while simultaneously preserving genes from extinction.
3.2 Modified DPGA Proposal
We have seen above that the selected parent from the reserve population may not be different
enough from the parent of the main population; this parent is merely the best of a bad bunch.
As a result, crossbred offspring are not so different from their parents. And once δ is set to a
low value, near zero, for several generations, the reserve population also becomes almost
identical to the main population. This is no problem if the main population converges to a
good enough solution or the termination criterion is met. But if we detect premature
convergence, we have to increase the diversity of the main population to escape the local
optimum; since the reserve population is identical to the main population, it cannot provide
that diversity, and the algorithm remains trapped in the local optimum.
To address this problem, we propose eliminating the reserve population and instead
generating, on the fly, an individual which plays the role of the reserve-population parent.
The on-the-fly generated individual will be at exactly δ distance from the parent of the main
population.
3.2.1 Structure Of Individual
Every individual in the main population consists of pairs (xi, δi), where xi is the real-valued
component in each dimension and δi determines how distant the on-the-fly generated
individual will be in that dimension. δi is called the jump parameter.
Another parameter, the temperature T, is also introduced. This parameter plays a role similar
to the temperature in simulated annealing: we bring in the concept of simulated annealing as
a local search for rigorously searching newly found potential search regions. The value of T
is larger at the beginning of the algorithm, so that the search region is bigger and more
uniform in all dimensions, i.e. the search region is shaped like an n-dimensional sphere. At
the final stage of evolution, the value of T is scheduled to decrease to facilitate a more
exploitative local search, and the search exploits more in the dimensions where solutions are
improving. The local search region then becomes elliptical, with the major axis of the ellipse
pointing towards the local (maybe global) optimum of that region.
3.2.2 Initialization
xi is initialized in the regular fashion. For the jump parameters, we generate n random
numbers and normalize them so that ∑δi = 1.
3.2.3 Parent Selection
Any selection method can be used. But we prefer tournament selection or restricted
tournament selection (RTS).
3.2.4 Generating Parent Individual On The Fly
One parent is selected from the main population, and the other parent is generated based on δ
and the jump parameters of the selected parent, so that the Euclidean distance between the
generated individual and the main parent is exactly δ · dmax, where
dmax = √(∑i (UpperBoundi − LowerBoundi)²)
is the maximum possible Euclidean distance between two points in the search space.
We then reset the jump parameters of the main parent. If this parent is selected again with
unchanged jump parameters, the same individual will be generated again; as a result, the
same offspring will be produced and the computation of a generation will be wasted. So we
reset the jump parameters as in initialization.
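The per-dimension generation formula did not survive extraction, so the following is only a sketch of one way to realize the stated requirements: the generated individual lies at exactly δ · dmax from the main parent, and the jump parameter δi (with ∑δi = 1) controls dimension i's share of the squared distance. The function name and the random sign choice are assumptions.

```python
import math
import random

def generate_partner(x, jump, delta, lower, upper):
    """Generate the on-the-fly parent at exactly delta * d_max Euclidean
    distance from main parent x.  jump[i] (summing to 1) is the share of
    the squared distance spent in dimension i; signs are random.  The
    result is clamped into the search space, so the exact distance is
    only guaranteed when no clamping occurs."""
    d_max = math.sqrt(sum((u - l) ** 2 for l, u in zip(lower, upper)))
    step = delta * d_max
    y = []
    for i, (xi, ji) in enumerate(zip(x, jump)):
        offset = math.sqrt(ji) * step * random.choice((-1.0, 1.0))
        y.append(min(max(xi + offset, lower[i]), upper[i]))
    return y
```

Because the squared offsets sum to (δ · dmax)², the Euclidean distance from the parent is exactly δ · dmax whenever no coordinate needs clamping.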
3.2.5 Mutation
DPGA uses non-uniform mutation. We will instead use Cauchy mutation when the algorithm
is in the exploration stage, as it gives more long jumps to facilitate exploration. When the
algorithm is in the exploitation stage, we will use Gaussian mutation, which gives short
jumps to facilitate the convergence of individuals.
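The two mutation regimes can be sketched as follows; the function name and the single shared scale parameter are illustrative assumptions. The standard Cauchy variate is drawn through its inverse CDF, tan(π(u − ½)).

```python
import math
import random

def mutate(x, scale, explore):
    """Cauchy mutation (long jumps) during exploration, Gaussian
    mutation (short jumps) during exploitation."""
    if explore:
        # Standard Cauchy sample via the inverse CDF: tan(pi * (u - 0.5)).
        step = lambda: scale * math.tan(math.pi * (random.random() - 0.5))
    else:
        step = lambda: random.gauss(0.0, scale)
    return [xi + step() for xi in x]
```

The heavy Cauchy tail occasionally produces very large steps, which is exactly what makes it useful for escaping local optima, while the Gaussian concentrates steps near zero.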
3.2.6 Survivor Selection
The new algorithm evaluates both the on-the-fly generated individuals and their inbred
offspring. Surviving offspring are divided into 3 categories according to the Euclidean
distance between the offspring and the main parent:
1. Exploited Individual: when the distance is below a lower threshold.
2. Normal Individual: when the distance lies between the two thresholds.
3. Explored Individual: when the distance exceeds an upper threshold.
We have divided them into 3 categories so that we can explore and exploit at the very same
time. When in the exploitation stage, the algorithm can still explore other potential regions of
the search space while exploiting the current region. On the other hand, when the algorithm
is in the exploration stage and a potential region is found, we can exploit that region by
conducting a local search, like a memetic algorithm, while still exploring other regions.
3.2.6.1 Exploited Individual
⁄
when is the explored dimension
⁄
⁄
when is the exploited dimension
Here,
The rationale is exploited individual comes from a region which is already explored or being
explored by another individual. So it doesn’t need to explore surrounding region twice.
3.2.6.2 Explored Individual
⁄
for every dimension
This individual is far away from its parent. It can be assumed that this offspring is in region
where the algorithm never searched before. So this potential region needs exploration.
Exploration is provided because . Even if the algorithm is in exploitation mood, it
can still explore newly found unsearched potentially good region.
3.2.6.3 Normal Individual
A number of dimensions are selected at random, and the jump parameters of the selected
dimensions are updated differently from those of the other dimensions. The individual is not
at a distance which can be called exploited or explored. We select dimensions randomly
because we want to introduce some variation based on the fitness difference.
3.2.7 Schedule of T
The initial value of T depends on the optimization problem. For a complex, multi-modal,
rough search space, T should be larger to facilitate more exploration in the local search; for a
simple, unimodal search space, a smaller value of T is better. The value of T is a function of
the generation count and of the survival of offspring. We propose that T be decreased with
the generation count: if no offspring survive, T should remain the same, and if offspring
survive, T should be decreased, because surviving offspring mean we are making progress
towards convergence, while no survivors mean we still need to explore more regions.
3.2.8 Advantages
1. The extra space of the reserve population is no longer needed, and the evaluation of
reserve population individuals is also eliminated.
2. The on-the-fly generated individual is at exactly δ distance from the parent of the
main population, so exactly as much diversity as we want can be incorporated.
3. The new proposal introduces a δi for each dimension, with ∑δi = 1. δi determines
how much exploitation or exploration takes place in each dimension. If we find a
dimension in which better individuals can be found, we can continue to explore or
exploit in that dimension. On the other hand, if the population is trapped in a deep
local optimum, we can experiment with changing the values of δi to escape it.
4. Every individual has its own δi, so we have granular control over every individual
in each dimension.
5. DPGA does not facilitate exploitation in newly found good regions of the fitness
landscape, whereas the proposed algorithm gives full throttle to exploitation in a
newly found region even if the algorithm is in global exploration mode, providing
full local search capability like a memetic algorithm.
6. DPGA does not evaluate reserve inbred offspring and reserve parents for survival
in the main population. This algorithm evaluates both the on-the-fly generated
individuals and the inbred offspring. Since these individuals are already generated
as by-products of crossbreeding, evaluating them has very little overhead, and if any
of them survives, it can introduce more diversity into the main population and open
a new region to search for a potential global optimum.
3.3 New Survivor Selection Strategy
Current survivor selection schemes fall into two categories:
1. Schemes focused solely on survival of the fittest, i.e. exploitation: for example,
elitist selection, rank selection, fitness proportionate reproduction.
2. Schemes focused solely on maintaining diversity: for example, niching methods
such as fitness sharing, deterministic crowding and probabilistic crowding.
These two categories sit at two extreme ends: those focused on exploitation do not take
diversity into account, and those focused on diversity do not take exploitation into account.
But survivor selection should be based on both. So we propose a new survivor selection
scheme which takes both diversity and exploitation into account. The fitness function for
survivor selection is
F(x) = λ · f(x) + (1 − λ) · v(x)
Here, f(x) is the real fitness function, v(x) is a function of the gene variation with respect to
the chromosomes of the current generation, and λ is an adaptive parameter which determines
how much weight is put on f(x) and v(x).
Usually the value of λ is lower in early generations to preserve diversity and larger in final
generations to facilitate convergence. We propose that the changing value of λ be a function
of the generation count, of the survival of offspring, and of the difference between the fitness
of the best individual and the desired fitness.
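The combined survivor fitness can be sketched as below, assuming a linear blend of the real fitness and the diversity fitness, which is consistent with how the weighting of λ is described later (high λ favours the real fitness, low λ favours diversity); the function name is illustrative.

```python
def survival_fitness(real_fit, diversity_fit, lam):
    """Blend real fitness and diversity fitness: lam near 1 favours
    exploitation, lam near 0 favours diversity preservation."""
    return lam * real_fit + (1.0 - lam) * diversity_fit
```

With λ = 1 the scheme reduces to pure fitness-based survival; with λ = 0 only diversity decides who survives.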
Measuring gene variation can be crucial. One naïve approach is to measure the Euclidean
distance from all individuals of the current population, which is O(n). We can improve this
further with some simple modifications. First, we take a point, or individual, as reference for
measuring distances: let it be the individual (LowerBound0, LowerBound1,……,
LowerBoundn-1). At the beginning, we measure the distance from the reference individual to
every individual of the current population. The normalized distance of any individual x is
nd(x) = √(∑i (xi − LowerBoundi)²) / dmax
where dmax = √(∑i (UpperBoundi − LowerBoundi)²) is the maximum possible distance in the
search space. The mean normalized diversity of the current population of n individuals is
µ = (1/n) ∑i ndi
and the standard deviation, using µ as reference, is
σ = √((1/n) ∑i (ndi − µ)²)
Every individual will also include an additional real value called its relative diversity.
Relative diversity is a measure of the diversity of an individual relative to the rest of the
population; it is found by calculating the standard deviation of the population using the
corresponding individual j as reference:
RDj = √((1/n) ∑i (ndi − ndj)²)
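The normalized-distance and relative-diversity computations described above can be sketched as follows; the reading of "standard deviation using the individual as reference" as replacing the mean by that individual's normalized distance is an assumption, and the function names are illustrative.

```python
import math

def normalized_distances(population, lower, upper):
    """Distance of each individual from the all-lower-bounds reference
    point, normalized by the maximum possible distance d_max."""
    d_max = math.sqrt(sum((u - l) ** 2 for l, u in zip(lower, upper)))
    return [math.sqrt(sum((xi - l) ** 2 for xi, l in zip(x, lower))) / d_max
            for x in population]

def relative_diversity(nds, j):
    """Standard deviation of the normalized distances, taking
    individual j (instead of the mean) as the reference."""
    n = len(nds)
    return math.sqrt(sum((nd - nds[j]) ** 2 for nd in nds) / n)
```

An individual far from the bulk of the population has a large relative diversity, since most normalized distances differ strongly from its own.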
So the diversity fitness function is of the generic form
v(x) = α · g(RD(x))
where g is a monotonically increasing function of the relative diversity and α is a scaling
factor dependent on the optimization problem. The scaling factor α is needed to make the
diversity fitness function comparable to the real fitness function.
Usually, if the relative diversity of the offspring is greater than that of the individual it
replaces, the offspring improves diversity.
Now if the offspring survives, let ndold be the normalized distance of the replaced individual
and ndnew that of the offspring. The new relative diversity of each remaining individual j can
then be updated incrementally as
RD′j = √(RD²j − (1/n)(ndold − ndj)² + (1/n)(ndnew − ndj)²)
We will adopt elitist selection: the top 10% of individuals according to the real fitness
function and the top 10% according to gene variation are reserved. This scheme thus
emphasizes both exploration and exploitation. The remaining 80% of individuals have to
survive through the proposed fitness function.
Careful observation reveals that the normalized diversity of an individual is in the range
[0, 1], so the standard deviation σ is certainly between 0 and 1, and we could use σ as the
adaptive parameter λ. But if two groups of individuals are at maximum distance from each
other while the members of each group are in the same neighborhood, σ will be nearly 0.5,
although the population is not diverse at all: it has simply converged into two groups situated
far away from each other. We could instead take the average relative diversity as λ, but the
same problem persists.
First we have to determine the size of a neighborhood (β) in terms of normalized diversity;
the optimal neighborhood size depends on the optimization problem. We introduce an
n-dimensional array of buckets. A bucket is a small region specified by the neighborhood
size, essentially an n-dimensional hypercube with edge length β; individuals located in a
bucket's region fall into that bucket, so there are ⌈(UpperBoundi − LowerBoundi)/β⌉ buckets
along each dimension. Each element of the array holds the number of individuals in the
corresponding bucket, and every element is initialized to zero. The bucket array index of an
individual x is easily found as
bi = ⌊(xi − LowerBoundi) / β⌋ for each dimension i
For a huge search space this bucket array may require an enormous amount of memory; we
can use a sparse-matrix data structure to address this issue. After an individual is found in a
bucket's region, the value of that bucket is incremented. Each individual also stores the
location of its bucket for easy removal.
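The sparse bucket structure can be sketched with a dictionary keyed by bucket index tuples, so that empty buckets cost no memory; the class and method names are illustrative, not from the thesis.

```python
import math
from collections import defaultdict

class BucketGrid:
    """Sparse n-dimensional bucket grid: maps bucket index tuples to
    occupancy counts, so only occupied buckets use memory."""
    def __init__(self, lower, beta):
        self.lower, self.beta = lower, beta
        self.counts = defaultdict(int)

    def index(self, x):
        # floor((x_i - LowerBound_i) / beta) per dimension
        return tuple(int(math.floor((xi - l) / self.beta))
                     for xi, l in zip(x, self.lower))

    def add(self, x):
        idx = self.index(x)
        self.counts[idx] += 1
        return idx  # stored in the individual for easy removal

    def remove(self, idx):
        self.counts[idx] -= 1
        if self.counts[idx] == 0:
            del self.counts[idx]
```

Adding an individual returns its bucket index, which the individual keeps so that removal is a single decrement rather than a search.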
We can see that the number of occupied buckets is at most the population size. If we have to
remove an individual, we simply decrement the value of the corresponding bucket. The ratio
of occupied buckets to the population size can be used as the parameter λ: when this value is
high, the algorithm puts more weight on the real fitness function, because the current
population is diverse enough; when it is low, the algorithm puts more weight on the diversity
fitness function, as the current population is losing diversity.
Even so, this choice needs to be modified, as it does not take the generation number into
account: when the algorithm is converging, its value is low, which slows the rate of
convergence. A simple approach can be
λ′ = max(λ, g(t))
where g(t) is a monotonically increasing function of the generation count t. This works
because, as the generation count increases, the value of g(t) also increases; and if the
maximum value of g(t) is 1, the value of λ′ never exceeds 1.
Using buckets, we can also mark the regions that have already been searched. We introduce
a boolean variable named isSearched (true if the region is already searched, false if not) for
each bucket. The criteria for being declared searched are defined for each optimization
problem separately; for example, we can declare a bucket region searched when it holds at
least three individuals, each surviving at least 50 generations. The point of marking regions
as searched is that, of two individuals with the same relative diversity, the one belonging to
an unsearched bucket gets more weight in diversity than the one in a searched bucket: for
example, we can reduce the diversity fitness of an individual in a searched bucket to half of
its value.
A rather extreme scheme is to eliminate all individuals of a searched bucket except the best
one, replacing them with new individuals taken from buckets which are unsearched and have
an individual count of 0. But this can be detrimental for complex optimization problems.
The adaptive survival fitness function above is for function maximization problems, but
many real-life problems involve function minimization, i.e. cost minimization. In that case
the above approach will not work, because both the real fitness function and the diversity
fitness function should decrease for better individuals. So we need to assign a lower
diversity fitness value to diverse individuals and a higher diversity fitness value to less
diverse individuals. A simple approach can be
v(x) = α · (1 − g(RD(x)))
This equation assigns more diversity fitness to less diverse individuals and less diversity
fitness to more diverse individuals, so the same diversity management technique can also be
applied to function minimization.
Another approach is to apply local search to the individuals, as in a memetic algorithm;
steepest-ascent hill climbing is adopted here. If during this hill climbing an individual passes
through the region of a bucket, that bucket is marked as searched. The advantage of this
scheme is that we can easily identify searched regions even when they do not meet the
criteria for being marked as searched.
In cases of premature convergence, we can override this function and set the value of λ
manually.
Niching methods such as crowding or fitness sharing maintain diversity, but they lack
control over it: they only try to keep the pre-existing diversity level, neither increasing it in
case of premature convergence nor decreasing it to facilitate exploitation. The proposed
survivor selection scheme, on the other hand, gives full control over the diversity level
needed at any time during evolution.
3.4 New Mutation Strategy
Mutation of real vectors is performed by generating random numbers from probability
distributions to introduce random variation. Until now, only three distributions have been
used successfully for mutation:
1. Gaussian Mutation
2. Cauchy Mutation
3. Levy Mutation
Careful observation reveals that these three distributions are members of the stable family of
distributions. The stable family is a family of distributions in which any linear combination
of two independent copies of a random variable has the same distribution, up to location and
scale parameters; the three distributions above are special cases of the stable distribution. All
stable distributions are infinitely divisible, absolutely continuous and unimodal. Formally, a
random variable X is called stable (has a stable distribution) if, for n independent copies Xi
of X, there exist constants cn > 0 and dn such that
X1 + X2 + ⋯ + Xn =d cn X + dn
where =d denotes equality in distribution.
Figure (3.4): Probability Density Function of Stable Family
So we can try other heavy-tailed distributions related to this family for generating mutation
random numbers. Two such distributions are:
1. Laplace Distribution
2. Slash Distribution
3.4.1 Laplace Distribution
Like Gaussian distribution, it has two parameters: Location parameter, µ and Scale
parameter, σ. Cauchy distribution is the result of Fourier transformation of Laplace
distribution. The probability density function of the Laplace distribution is also reminiscent
of the Gaussian distribution; however, whereas the Gaussian distribution is expressed in
terms of the squared difference from the mean μ, the Laplace density is expressed in terms of
the absolute difference from the mean. Consequently the Laplace distribution has fatter tails
than the Gaussian distribution.
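A Laplace mutation step can be sampled with the standard library alone, using the standard representation of the Laplace distribution as the difference of two independent exponential variates; the function name is illustrative.

```python
import random

def laplace(mu, b):
    """Sample Laplace(mu, b) as mu + b * (E1 - E2), where E1 and E2 are
    independent Exp(1) variates: the difference of two exponentials
    follows the double-exponential (Laplace) distribution."""
    return mu + b * (random.expovariate(1.0) - random.expovariate(1.0))
```

A Laplace mutation of a vector component xi is then simply xi + laplace(0.0, scale).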
Figure (3.4.1a): Probability Density Function of Laplace Distribution
Figure (3.4.1b): Comparison of Gaussian and Laplace Distribution
Above is a graph of the Gaussian and Laplace distributions with the same scale and location
parameters. It is noticeable that the Laplace distribution has a fatter tail and a sharper peak
than the Gaussian, and falls off rather quickly in comparison in the mid-range. It is therefore
expected to have a higher probability of escaping from a local optimum or moving away
from a plateau, especially when "the basin of attraction" of the local optimum or plateau is
large relative to the mean step size. The Gaussian, on the other hand, has greater probability
in the mid-range. We can conclude that the sharp peak of the Laplace distribution facilitates
exploitation, as it has a higher probability of producing short jumps, while it can also give
long jumps more often than the Gaussian; for mid-range jumps, the Gaussian gives better
results.
3.4.2 Slash Distribution
The Slash distribution is a continuous unbounded distribution developed as a deviation from
the Gaussian distribution that allows for fatter tails, with the kurtosis altered by the κ
parameter, as illustrated in the plot below; when κ = 0 the distribution reduces to a
Gaussian(µ, σ). If a Gaussian random variable is divided by a standard uniform random
variable, the resulting distribution is the Slash distribution; it is an example of a ratio
distribution.
It has three parameters: like the Gaussian distribution a location parameter µ and a scale
parameter σ, plus the extra parameter κ.
Figure (3.4.2): Probability Density Function of Slash Distribution at different parameters
From the above graph we see that, as the value of κ gets bigger, the tail and peak of the Slash
distribution get bigger, the slope gets steeper, and the mid-range gets smaller. By controlling
the value of κ, we can get an adaptive probability distribution which facilitates the two
extreme ends: exploitation and exploration.
The Slash distribution is used to fit data that are approximately Gaussian but have a kurtosis
greater than 3, i.e. greater than that of the Gaussian distribution. The Slash distribution can
readily be compared to a Gaussian distribution, since they share the same mean µ and
standard deviation σ parameters.
Another family of distributions of which the Gaussian distribution is a special (limiting)
form is Student's t-distribution.
3.4.3 Student’s t-distribution
Student’s t-distribution (or simply the t-distribution) is a family of continuous probability
distributions that arises when estimating the mean of a normally distributed population in
situations where the sample size is small and population standard deviation is unknown. The
t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails,
meaning that it is more prone to producing values that fall far from its mean.
Figure (3.4.3): Probability Density Function for Student’s t-distribution with different degrees of
freedom
The overall shape of the probability density function of the t-distribution resembles the bell
shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit
lower and wider. As the number of degrees of freedom grows, the t-distribution approaches
the normal distribution with mean 0 and variance 1; when the number of degrees of freedom
reaches infinity, it converges to the Gaussian (for practical purposes, convergence is reached
at about 30 degrees of freedom). Careful observation reveals that when the degrees of
freedom are low, the t-distribution has much fatter tails and a lower peak; as the degrees of
freedom (DOF) increase, the tails become thinner and the peak becomes higher. That means
that at low DOF this distribution gives more long jumps, and with increasing DOF it gives
shorter jumps.
We can exploit this behavior of the t-distribution. At the beginning of the EA, the DOF of
the t-distribution will be low; diversity is needed, so the t-distribution will produce long
jumps to facilitate it. As the generation count increases, we will increase the DOF, so that
the distribution gives shorter jumps and the algorithm becomes less explorative and more
exploitative.
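The DOF schedule described above can be sketched as follows. A Student's t variate is sampled via the standard representation Z / √(V/ν) with Z standard normal and V chi-squared with ν degrees of freedom; the linear 1-to-30 schedule and the function names are illustrative assumptions, not taken from the thesis.

```python
import math
import random

def student_t(dof):
    """Sample Student's t with dof degrees of freedom as Z / sqrt(V / dof),
    with Z standard normal and V ~ chi-squared(dof) = Gamma(dof/2, 2)."""
    z = random.gauss(0.0, 1.0)
    v = random.gammavariate(dof / 2.0, 2.0)  # chi-squared with dof d.o.f.
    return z / math.sqrt(v / dof)

def t_mutate(x, scale, generation, max_generation):
    """Schedule the DOF upward with the generation count: low DOF early
    (fat tails, long explorative jumps), high DOF late (near-Gaussian,
    short exploitative jumps)."""
    dof = 1.0 + 29.0 * generation / max_generation  # ~Cauchy -> ~Gaussian
    return [xi + scale * student_t(dof) for xi in x]
```

At generation 0 the DOF is 1, which is exactly the Cauchy distribution; at the final generation the DOF reaches 30, where the t-distribution is practically Gaussian.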
Chapter 4
Experimental Study
4.1 Modified DPGA
We have implemented
1. Standard GA
2. DPGA
3. Modified DPGA
Parameter settings for the algorithms:
- Maximum generation = 1000
- Population size = 500
- Main population parents = 2
- Reserve population parents = 2
- Inbred main population offspring = 2
- Inbred reserve population offspring = 2
- Crossbred offspring (produced by taking one parent from the main population and
another from the reserve population) = 2
- For crossover, the blend crossover method was used with parameter α.
- Uniform Gaussian mutation was applied for both of them.
Tournament selection was used for parent selection, and the naïve survivor selection method
was adopted.
4.1.1 Pitfalls of Modified DPGA
Theoretically, the proposed algorithm should work better than DPGA, but in practice it does
not. At the beginning of the algorithm we set δ = 1, which means the generated individual
should be at the maximum possible distance. As a result, the individual generated on the fly
always goes to the edge of the search space, so it only searches the extreme ends of the
search space. Consequently, offspring produced by crossover with the generated individual as
a parent also lie on the boundary of the search space or in its neighborhood. Mutating these
offspring seldom works, because short- and mid-range jumps still keep an individual near the
other individuals, and the long jump needed to introduce diversity is very unlikely under the
current mutation operators.
Even if we design a mutation operator that produces this sort of jump, it risks taking an
individual out of the search space, or taking a sufficiently diverse individual back into
already-searched regions. If the search space is n-dimensional and every dimension has the
same lower and upper bounds, the search space is an n-dimensional hypercube; in effect, this
algorithm only searches the faces of the hypercube and their neighborhoods, while the core
region of the hypercube remains unsearched.
Another modification is to start the algorithm with a lower value of δ. This prevents
generating individuals, and hence offspring, at the boundary of the search space. But a lower
value of δ also means the algorithm is unable to make long jumps.
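The boundary-clustering effect can be demonstrated with a small sketch. This is not the thesis's exact generation formula; the step rule and names below are assumptions made purely to illustrate why δ = 1 pins every coordinate to a face of the hypercube.

```python
import random

def generate_diverse_individual(centroid, lower, upper, delta, rng=random):
    """Illustrative sketch: step away from the population centroid in each
    dimension by a fraction `delta` of the full range, in a random direction,
    then clip to the bounds. With delta = 1 the step always overshoots, so
    every coordinate is clipped onto a face of the bounding hypercube."""
    point = []
    for c in centroid:
        step = delta * (upper - lower) * rng.choice([-1.0, 1.0])
        point.append(max(lower, min(upper, c + step)))
    return point
```

Running this with `delta = 1.0` on a centroid at the origin of `[-5, 5]^n` yields only coordinates equal to -5 or 5, i.e. points on the hypercube's surface, matching the pitfall described above.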
4.2 Adaptive Survivor Selection Strategy
We have used the standard GA with two different survivor selection schemes. The
implemented schemes are:
1. Naïve survivor selection
2. Adaptive survivor selection
Initial setting of parameters of adaptive survivor selection:
 Population size = 2000
 Maximum generation = 500
 Bucket edge length = 1.90734863e-6
 Minimum number of generations required to be declared searched = 70
 Minimum number of individuals required to be declared searched = 3
 Survival adaptive parameter = 0.3
 Diversity scaling factor = 50
 Penalty factor = 10
 Reserved number of best individuals (elite) = 20
 Reserved number of most diverse individuals (elite) = 20
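The bucket-based diversity measure that the bucket edge length configures can be sketched as follows. This is a simplified illustration of the idea (partitioning the search space into hypercube cells and counting occupied cells); the function names are assumptions, and the thesis's exact measure may differ.

```python
def bucket_index(individual, edge_length):
    """Map a real-valued individual to its hypercube bucket by flooring
    each coordinate to a multiple of the bucket edge length."""
    return tuple(int(x // edge_length) for x in individual)

def bucket_diversity(population, edge_length):
    """Diversity as the number of distinct occupied buckets: a population
    crowded into few buckets scores low, a spread-out one scores high."""
    return len({bucket_index(ind, edge_length) for ind in population})
```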
Experimental results on the Ackley function for both adaptive survivor selection and naïve
survivor selection are given below:
Figure 4.2(a): Change in diversity across generations.
Figure 4.2(b): Number of buckets searched (y-axis 0–18,000; adaptive vs. naïve survivor
selection).
Chapter 5
Conclusion
5.1 Adaptive Survivor Selection
Adaptive survivor selection is a novel approach: it is the first to combine adaptive survivor
selection with a new diversity measurement. Its key contribution is that the selection method
can be adapted with respect to the generation and the current diversity level. It monitors the
amount of diversity so that at no point does diversity fall below the lowest permitted value. It
also incorporates an elitist selection scheme, not only for the best individuals but also for the
most diverse individuals, so that when the population is trapped in a deep local optimum, the
most diverse individuals found so far help it escape.
Experiments show that adaptive survivor selection beats the currently most used naïve
survivor selection in terms of maintaining diversity. Although niching methods can
sometimes maintain pre-existing diversity better than adaptive survivor selection, this
performance gap can be mitigated by proper initialization of the adaptive diversity parameter
and its update rule. Our scheme also addresses one of the drawbacks of niching methods:
they cannot control the amount of diversity needed at any given generation. In fact, niching
methods and adaptive survivor selection have different goals. Niching methods mainly focus
on growing and maintaining niches of individuals, whereas our scheme focuses on
maintaining the level of diversity that can guide individuals toward the global optimum.
5.2 Future Work
5.2.1 Modified DPGA
It is evident that the value of δ caused the poor performance of this algorithm. If we change
the initialization and the update rule of δ, the algorithm should perform better. One approach
is to initialize δ to a lower value instead of 1, which mitigates the risk of individuals going
beyond the search space or residing only on its boundary. This approach has a flaw,
however: if we restrict δ to low values, the algorithm becomes less capable of escaping local
optima and hence more prone to premature convergence. Alternatively, the value of δ can be
drawn from a probability distribution, so that δ is not vulnerable to being either too high or
too low. Even after the initialization problem of δ is solved, the update rule of δ still needs to
be revised.
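Drawing δ from a probability distribution could look like the following sketch. The choice of a Beta(2, 5) distribution and its shape parameters are assumptions made purely for illustration: it keeps δ inside (0, 1), concentrates mass at moderate distances, and still occasionally yields the large values needed for long escape jumps.

```python
import random

def sample_delta(rng=random, a=2.0, b=5.0):
    """Draw delta from a Beta(a, b) distribution (assumed shape parameters):
    values stay strictly in (0, 1), most mass sits at moderate distances,
    yet occasional large draws still permit long jumps."""
    return rng.betavariate(a, b)
```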
5.2.2 Adaptive Survivor Selection
We have investigated a new diversity measurement technique and, using it, proposed a new
survivor selection strategy that works better than existing survivor selection schemes. A
pitfall of the new diversity measurement is that in some edge cases it reports a high diversity
value although the population is not diverse at all. Detecting these edge cases and mitigating
the error they cause is left as future work. We can also adopt fitness sharing by assigning a
fitness to each bucket, which every individual in that bucket then shares. The fitness assigned
to a bucket will depend on how diverse that bucket is; that is, instead of measuring the
diversity of individuals, we measure the diversity of their container buckets. Once a bucket is
assigned a fitness, the individuals in that bucket share it among themselves.
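The proposed bucket-level fitness sharing can be sketched as follows. This is only one possible reading of the proposal: the choice of "best raw fitness of the bucket, divided by bucket occupancy" is an assumed sharing rule, and all names are illustrative.

```python
def shared_bucket_fitness(population, fitnesses, edge_length):
    """Fitness sharing over buckets (an assumed sharing rule): each bucket
    gets the best raw fitness among its members, scaled down by the bucket's
    occupancy, and every member of the bucket receives that shared value."""
    buckets = {}
    for ind, fit in zip(population, fitnesses):
        key = tuple(int(x // edge_length) for x in ind)
        buckets.setdefault(key, []).append(fit)
    shared = []
    for ind in population:
        key = tuple(int(x // edge_length) for x in ind)
        members = buckets[key]
        shared.append(max(members) / len(members))
    return shared
```

Under this rule, crowding several individuals into one bucket dilutes their shared fitness, which penalizes loss of diversity exactly as the text intends.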
5.2.3 New Distribution Based Mutation
We have investigated distributions that have properties similar to the currently deployed
distributions or share the same family origin. These distributions have bell-shaped curves
similar to the Gaussian and depend on the same set of parameters as the Gaussian, Cauchy, or
Lévy distributions. The three distributions presented earlier have the potential to replace the
current distribution-based mutations. All of them have fatter tails, and the Laplace and slash
distributions also have higher peaks, so theoretically they should give better performance in
both exploration and exploitation. Student's t-distribution has effectively converged to the
Gaussian by 30 degrees of freedom, so we can experiment with the initial DOF at which we
start the algorithm and with how to change the DOF as the generations increase.
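The fat-tail argument can be checked empirically by comparing how often a standard Gaussian and a t-distribution with 2 degrees of freedom produce long jumps. This is a small experiment sketch; the jump threshold of 3 and the sample count are arbitrary choices made for the illustration.

```python
import math
import random

def sample_t(dof, rng):
    """Student's t sample via t = Z / sqrt(V / dof), V ~ chi^2(dof)."""
    z = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dof))
    return z / math.sqrt(v / dof)

def long_jump_rate(sampler, n=20000, threshold=3.0, seed=0):
    """Fraction of mutation steps whose magnitude exceeds `threshold`."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if abs(sampler(rng)) > threshold) / n

gauss_rate = long_jump_rate(lambda rng: rng.gauss(0.0, 1.0))
t2_rate = long_jump_rate(lambda rng: sample_t(2, rng))
# The fat-tailed t(2) produces long jumps far more often than the Gaussian,
# which is the exploratory behavior the text attributes to low-DOF mutation.
```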
References
[1] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning.
Addison-Wesley, Reading, MA, 1989.
[2] L.J. Fogel, A.J. Owens, and M.J. Walsh, Artificial Intelligence through simulated
evolution, New York, John Wiley & Sons, 1966.
[3] A. E. Eiben, R. Hinterding, and Z. Michalewicz, “Parameter control in evolutionary
algorithms,” IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 124–141, Jul. 1999.
[4] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization”, IEEE
Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.
[5] D. E. Goldberg and J. Richardson, “Genetic algorithms with sharing for multimodal
function optimization,” in Proc. 2nd Int. Conf. Genetic Algorithms (ICGA), 1987, pp. 41–49.
[6] T. Jumonji, G. Chakraborty, H. Mabuchi, and M. Matsuhara, “A novel distributed genetic
algorithm implementation with variable number of islands,” in Proc. IEEE Congr. Evolut.
Comput., 2007, pp. 4698–4705.
[7] Y. Yoshida and N. Adachi, “A diploid genetic algorithm for preserving population
diversity-pseudo-Meiosis GA,” in Proc. 3rd Parallel Problem Solving Nature (PPSN), 1994,
pp. 36–45.
[8] M. Kominami and T. Hamagami, “A new genetic algorithm with diploid chromosomes by
using probability decoding for nonstationary function optimization,” in Proc. IEEE Int. Conf.
Syst., Man, Cybern., 2007, pp. 1268–1273.
[9] S. W. Mahfoud, “Crowding and preselection revisited,” in Proc. 2nd Parallel Problem
Solving Nature (PPSN), 1992, pp. 27–37.
[10] S. W. Mahfoud, “Niching methods for genetic algorithms,” Ph.D. dissertation, Dept.
General Eng., Univ. Illinois at Urbana-Champaign, 1995.
[11] T. Park and K. R. Ryu, “A dual population genetic algorithm with evolving diversity,” in
Proc. IEEE Congr. Evol. Comput., 2007, pp. 3516–3522.
[12] T. Park and K. R. Ryu, “Adjusting population distance for dual-population genetic
algorithm,” in Proc. Aust. Joint Conf. Artif. Intell., 2007, pp. 171–180.
[13] T. Park and K. R. Ryu, “A dual-population genetic algorithm for adaptive diversity
control,” in Proc. Aust. Joint Conf. Artif. Intell., 2009, pp. 191–210.
[14] R. McKay, “Fitness sharing in genetic programming,” in Proc. Genetic and
Evolutionary Computation Conference (GECCO), Las Vegas, NV, 2000, pp. 435–442.
[15] R. K. Ursem, “Diversity-guided evolutionary algorithm,” in Proc. Parallel Problem
Solving from Nature (PPSN) VII, vol. 2439, J. J. Merelo, P. Adamidis, and H.-P. Schwefel,
Eds. Granada, Spain, 2002, pp. 462–471.
[16] T. Bäck and H.-P. Schwefel, “An overview of evolutionary algorithms for parameter
optimization,” Evol. Comput., vol. 1, pp. 1–23, 1993.
[17] K. Chellapilla, “Combining mutation operators in evolutionary programming,” IEEE
Trans. Evol. Comput., vol. 2, pp. 91–96, Sep. 1998.
[18] R. Mantegna, “Fast, accurate algorithm for numerical simulation of Lévy stable
stochastic processes,” Phys. Rev. E, vol. 49, no. 5, pp. 4677–4683, 1994.
[19] X. Yao, G. Lin, and Y. Liu, “An analysis of evolutionary algorithms based on
neighborhood and step size,” in Proc. 6th Int. Conf. Evolutionary Programming, 1997, pp.
297–307.
[20] D. Thierens, “Adaptive mutation rate control schemes in genetic algorithms,” in Proc.
Congr. Evol. Comput., vol. 1, 2002, pp. 980–985.
[21] G. Rudolph, “On takeover times in spatially structured populations: Array and ring,” in
Proc. 2nd Asia-Pacific Conf. Genetic Algorithms Applicat., 2000, pp. 144–151.
[22] X. Yao, Y. Liu, and G. Lin, “Evolutionary programming made faster,” IEEE Trans.
Evol. Comput., vol. 3, no. 2, pp. 82–102, Jul. 1999.
  • 4. IV Abstract Evolutionary algorithms (EAs) are successfully applied to problems where typical optimization algorithms and local search methods fail. Like any search algorithm, an EA has to face the conflicting goals of local exploitation and global exploration during the search process. The success of an EA lies in its ability to explore and exploit simultaneously. However, an EA is also prone to losing population diversity in early generations, which traps the population in local optima and degrades its performance. Although numerous schemes have been suggested for various stages of an EA to prevent premature convergence, balancing exploration and exploitation has not been attempted at the survivor selection stage; existing selection strategies focus on either exploiting or exploring. This thesis introduces the Adaptive Survivor Selection Strategy (ASSS), a new approach that tries to maintain the required amount of diversity at the survivor selection stage. Along with the new selection strategy, this thesis also presents new criteria for measuring diversity, both for an individual and for the whole population. The key idea of the strategy is to measure diversity across the population, calculate the amount of diversity needed at that time, and try to reach that diversity level by selecting survivors using the newly introduced diversity measure. ASSS uses both the fitness of an individual and how diverse the individual is with respect to the current population to calculate an adaptive survivor selection fitness function. Using that fitness, inferior individuals, in terms of fitness and diversity, are simply weeded out. The survivor fitness function is adaptive, which gives it control over the selection pressure. Performance in maintaining the required level of diversity at any point of the algorithm is evaluated on a number of benchmark numerical optimization problems, and the results are compared with several existing selection schemes. 
Experimental results show that ASSS yields a significant performance gain in managing diversity for any sort of EA.
  • 5. V Table of Contents Chapter 1: Introduction……………………………………….……1 1.1 General Introduction……………………………………….……...1 1.2 Objective of the Thesis……………………………………….……2 1.3 Thesis Organization……………………………………………….3 Chapter 2: Background………………………………………….….4 2.1 When EA is Needed……………………………………………….4 2.2 Advantages of EA…………………………………………………4 2.3 Disadvantages of EA……………………………………………...4 2.4 Canonical Structure of EA……………………………………….5 2.5 Representation of Gene…………………………………………...6 2.6 Major Branches of EA…………………………………………....7 2.6.1 Genetic Algorithm………………………………………………….…..7 2.6.2 Evolutionary Programming……………………………………….…..7 2.6.3 Evolutionary Strategy…………………………………………………7 2.6.4 Genetic Programming…………………………………………………8 2.6.5 Memetic Algorithm……………………………………………………9 2.7 Existing Work…………………………………………………….9 2.7.1 Dynamic Parameter Control…………………………………………9 2.7.2 Maintaining Diversity and Multi-population GAs…………………9 2.7.3 Memory Based Genetic Algorithm…………………………………10 2.7.4 Mutation Based Work………………………………………………10 2.7.5 Survivor Selection Based Work……………………………………12
  • 6. VI Chapter 3: Proposed Algorithms………………………………..15 3.1 Dual Population Genetic Algorithm………………………….15 3.1.1 Advantages………………………………………………………….16 3.1.2 Disadvantages……………………………………………………….16 3.1.3 Recommendation……………………………………………………17 3.2 Modified DPGA Proposal……………………………………...17 3.2.1 Structure of Individual……………………………………………..17 3.2.2 Initialization………………………………………………………….18 3.2.3 Parent Selection……………………………………………………..18 3.2.4 Generating Parent Individual On The Fly………………………..18 3.2.5 Mutation…………………………………………………………….18 3.2.6 Survivor Selection…………………………………………………..19 3.2.6.1 Exploited Individual……………………………………...19 3.2.6.2 Explored Individual……………………………………....19 3.2.6.3 Normal Individual………………………………………..19 3.2.7 Schedule of T……………………………………………………….20 3.2.8 Advantages…………………………………………………………20 3.3 New Survivor Selection Strategy……………………………..20 3.4 New Mutation Strategy……………………………………….24 3.4.1 Laplace Distribution………………………………………………25 3.4.2 Slash Distribution…………………………………………………27 3.4.3 Students T-Distribution…………………………………………..28 Chapter 4: Experimental Study………………………………...29 4.1 Modified DPGA………………………………………………29 4.1.1 Pitfalls of Modified DPGA………………………………………29
  • 7. VII 4.2 Adaptive Survivor Selection Strategy……………………….30 Chapter 5: Conclusion…………………………………………...32 5.2 Future Works……………………………………………….....32 5.2.1 Modified DPGA…………………………………………………...32 5.2.2 Adaptive Survivor Selection………………………………………33 5.2.3 New Distribution Based Mutation………………………………..33 References………………………………………………………...34 Appendix………………………………………………………….35
  • 8. VIII List of Figures Figure 1.1(a): Change in best fitness (best solution) with number of generations…………….1 Figure 1.1(b): Comparison between Random Search, EA and Problem Tailored Search……..2 Figure 2.4: Basic skeleton of an Evolutionary Algorithm………………………………….…..6 Figure 2.6.4: Individual structure of GP………………………………………………….……8 Figure 2.7.4: Probability Density Function (PDF) of Gaussian distribution…………..…11 Figure 2.7.5: Truncation Selection……………………………………………………………12 Figure 3.1(a): Offspring Generation of DPGA…………………………………………….…15 Figure 3.1(b): Reserve Population Fitness Function……………………………………….…16 Figure 3.4: Probability Density Function of Stable Family………………………….…….….24 Figure 3.4.1(a): Probability Density Function of Laplace Distribution…………………...…..26 Figure 3.4.1(b): Comparison of Gaussian and Laplace Distribution…………………….…....26 Figure 3.4.2: Probability Density Function of Slash Distribution at different parameters........27 Figure 3.4.3: Probability Density Function for Student’s t-distribution with different degrees of freedom……………………………………………………………………………28 Figure 4.2(a): Change in diversity across generations………………………………………...31 Figure 4.2(b): Number of buckets searched…………………………………………………..31
  • 9. IX List of Symbols MPGA Multi Population Genetic Algorithm DPGA Dual Population Genetic Algorithm GA Genetic Algorithm StGA Standard Genetic Algorithm EA Evolutionary Algorithm EC Evolutionary Computing EP Evolutionary Programming MA Memetic Algorithm FEP Fast Evolutionary Programming ASSS Adaptive Survivor Selection Strategy
  • 10. X
  • 11. XI Chapter 1 Introduction 1.1 General Introduction Evolutionary Algorithms (EAs) are computational systems that use ideas and draw inspiration from natural evolution. An EA is a generic population-based meta-heuristic optimization algorithm and falls into the category of bio-inspired computing. It uses selection, crossover and mutation mechanisms borrowed from natural evolution, and the survival-of-the-fittest principle lies at the heart of EA [1] [2]. Evolutionary Algorithms are often viewed as function optimizers, although the range of problems to which EAs are applied is quite broad. One of the many advantages of EAs is that they do not require broad domain knowledge, although domain knowledge can be introduced into them. The fitness curve of an EA over generations is asymptotic in nature: fitness improvement in earlier generations is rapid, and the rate of improvement decreases over time. After a certain number of generations, the improvement in best fitness becomes negligible; that is when we say the population has converged. It is expected that the population will converge to a good enough solution, but sometimes it converges to a local optimum, which is not an acceptable result. This phenomenon is called premature convergence. Figure 1.1(a): Change in best fitness (best solution) with number of generations EAs perform better than random search because of their exploitative behavior. An EA uses random walk, but also tries to exploit good solutions. It also outperforms local greedy
  • 12. XII search. Local greedy searches are exploitative in nature, often trapped into local maxima. But EA has random walk and maintaining required level of diversity it’s less likely to be trapped into local maxima. Problem tailored searches outperform EA only for the problem in which the search is tailored and uses deep domain knowledge of that problem. Such deep domain knowledge isn’t readily available and incorporating to problem tailored search is difficult. Figure (1.1b): Comparison between Random Search, EA and Problem Tailored Search[4] 1.2 Thesis Objective This thesis mainly focuses into maintaining diversity of single population algorithms. It is frequently observed that populations lose diversity too early and their individuals are trapped into local optima. For lack of diversity trapped individuals can’t escape basin of local minima. This phenomenon is called Premature Convergence. Objective of this thesis paper is to investigate better schemes which can maintain diversity of a population and also give control on diversity. The quest is searching for an adaptive diversity maintaining scheme. Thesis is done in three focused areas: 1. Modifying Dual Population Genetic Algorithm (DPGA) so that it can properly manage diversity. 2. Seeking a survivor selection technique which is adaptive and gives more control on diversity at any time of algorithm. 3. Examining probability distributions other than already used distributions which can give appropriate amount of jumps in any stage of evolution.
  • 13. XIII 1.3 Thesis Organization The rest of the thesis is organized as follows. Chapter 2 introduces the fundamentals of evolutionary algorithms, with their operators and processes. The essential terms related to evolutionary algorithms are explained with examples. The strengths, limitations and applications of evolutionary algorithms are also mentioned. In Chapter 3, we introduce new evolutionary strategies, entitled Modified DPGA, Adaptive Survivor Selection Strategy, and New Mutation Based on Distributions, to balance the exploitative and explorative features of the standard evolutionary algorithm. The different stages, operators and procedures of Modified DPGA, ASSS, and Mutation Based on Distributions are described in detail. It is also explained how they differ substantially from other existing works. Chapter 4 evaluates Modified DPGA and ASSS on a number of benchmark numerical optimization problems and makes comparisons with several other existing works. Although Modified DPGA did not perform well, we gained valuable insight into how it can be modified further to improve performance. An in-depth experimentation with the parameters, operators and stages of ASSS, with their effects on population fitness and diversity, is also carried out. Finally, in Chapter 5, we summarize our work and provide directions for future research.
  • 14. XIV Chapter 2 Background Evolutionary Algorithms (EAs) consist of several heuristics which are able to solve optimization tasks by imitating some aspects of natural evolution. They may use different levels of abstraction, but they always work on whole populations of possible solutions for a given task. EAs are a well-established set of heuristics which are flexible to use and impose only minimal requirements on the optimization task. 2.1 When EA is Needed  The search space is large, complex or poorly understood.  Domain knowledge is scarce or expert knowledge is difficult to encode to narrow the search space.  Only the target (fitness) function is provided.  No mathematical analysis is available.  Traditional search methods fail.  Not the best solution but a good enough solution is needed.  Local search methods can’t give good enough solutions.  Continuous optimization problems. 2.2 Advantages of EA  Applicable to a wide range of problems.  Useful in areas without good problem-specific techniques.  No explicit assumptions about the search space necessary.  Easy to implement.  Any-time behavior. 2.3 Disadvantages of EA  Problem representation must be robust.  No general guarantee for an optimum.  No solid theoretical foundations (yet).  Parameter tuning: a trial-and-error process (but self-adaptive variants exist in evolution strategies).  Sometimes high memory requirements.  Implementation: high degree of freedom.
  • 15. XV 2.4 Canonical Structure of EA EAs are a family of algorithms; no single definite structure exists among them. Most EAs, however, follow more or less the following structure: 1. Initialization: The initial population of candidate solutions is usually generated randomly across the search space. However, domain-specific knowledge or other knowledge can easily be incorporated. 2. Evaluation: Once the population is initialized or an offspring population is created, the fitness value of the candidate solutions is evaluated. 3. Parent Selection: Selection allocates more copies of those solutions with higher fitness values and thus imposes the survival-of-the-fittest mechanism on the candidate solutions. The main idea of selection is to prefer better solutions to worse ones, and many selection procedures have been proposed to accomplish this idea, including roulette-wheel selection, stochastic universal selection, ranking selection and tournament selection, some of which are described in the next section. 4. Recombination: Recombination combines parts of two or more parental solutions to create new, possibly better solutions (i.e. offspring). There are many ways of accomplishing this (some of which are discussed in the next section), and competent performance depends on a properly designed recombination mechanism. The offspring under recombination will not be identical to any particular parent and will instead combine parental traits in a novel manner. 5. Mutation: While recombination operates on two or more parental chromosomes, mutation locally but randomly modifies a solution. Again, there are many variations of mutation, but it usually involves one or more changes being made to an individual’s trait or traits. In other words, mutation performs a random walk in the vicinity of a candidate solution. 6. Replacement: The offspring population created by selection, recombination, and mutation replaces the original parental population. 
Many replacement techniques, such as elitist replacement, generation-wise replacement and steady-state replacement, are used in GAs. 7. Repeat steps 2–6 until a terminating condition is met.
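The seven-step loop above can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm proposed in this thesis; the tournament size, crossover rate and Gaussian step size used here are illustrative assumptions.

```python
import random

def evolve(fitness, bounds, pop_size=20, generations=100,
           crossover_rate=0.9, mutation_sigma=0.1):
    """Minimal maximizing EA: initialize, evaluate, select, recombine,
    mutate, replace, repeat."""
    lo, hi = bounds
    dim = 2
    # 1. Initialization: random candidate solutions across the search space.
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):                        # 7. repeat until done
        scored = [(fitness(ind), ind) for ind in pop]   # 2. evaluation
        offspring = []
        while len(offspring) < pop_size:
            # 3. parent selection: binary tournament prefers fitter solutions.
            p1 = max(random.sample(scored, 2))[1]
            p2 = max(random.sample(scored, 2))[1]
            if random.random() < crossover_rate:
                # 4. recombination: uniform crossover mixes parental traits.
                child = [a if random.random() < 0.5 else b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            # 5. mutation: a small Gaussian random walk around the child.
            child = [min(hi, max(lo, g + random.gauss(0, mutation_sigma)))
                     for g in child]
            offspring.append(child)
        pop = offspring                                 # 6. replacement
    return max((fitness(ind), ind) for ind in pop)
```

For example, maximizing f(x, y) = -(x² + y²) over [-5, 5]² drives the best individual toward the origin.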
  • 16. XVI Figure (2.4): Basic skeleton of an Evolutionary Algorithm 2.5 Representation of Gene Individual representations are typically divided into two types: 1. Genotypic Representation: Genes are internal structures that determine the physical characteristics of an individual. They are usually represented by an array of letters, like the genes in human DNA; in the case of EA, by a bit-string. Genotypic representation is used extensively in Genetic Algorithms, but it has some limitations. Most real-world problems are not naturally in genotypic form, so we have to devise a scheme to encode the genotype as a bit-string, and the performance of the algorithm depends on this bit-string representation. 2. Phenotypic Representation: Individuals are represented by real-valued vectors, so there is no need to convert them to any other representation; the algorithm works directly on the real-valued vectors of the problem. It is used extensively in Evolutionary Strategies and Evolutionary Programming, for real-valued function optimization.
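The two representation styles can be contrasted with a small sketch. The 16-bit encoding, the interval [-5, 5] and the linear decoding rule are illustrative assumptions, not a scheme prescribed by the thesis.

```python
def decode(bits, lo=-5.0, hi=5.0):
    """Genotypic representation: a bit-string must first be mapped onto a
    real value in [lo, hi] before the fitness function can be applied."""
    as_int = int(bits, 2)                    # interpret the string as a binary number
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)

# Genotype: the algorithm manipulates the bit-string itself and must decode it.
genotype = "1100000000000000"
print(round(decode(genotype), 3))            # prints 2.5

# Phenotype: the algorithm works directly on the real-valued vector.
phenotype = [2.5, -1.0]
```

Note that a one-bit flip in the genotype can move the decoded value anywhere in the interval, which is exactly why the choice of encoding matters.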
  • 17. XVII 2.6 Major Branches of EA EAs are divided into four major branches. 2.6.1 Genetic Algorithm The Genetic Algorithm (GA) was first formulated by John Holland. Holland’s original GA is called the standard Genetic Algorithm; it uses two parents and produces two offspring. It simulates Darwinian evolution. Search operators are applied only to the genotypic representation; hence it is called a genotypic algorithm. It emphasizes the role of crossover, with mutation as a background operator. GA uses binary strings extensively as the representation of individuals. 2.6.2 Evolutionary Programming Evolutionary Programming (EP) was first proposed by David Fogel [2]. It is closer to Lamarckian evolution. It does not use any kind of crossover; only mutation is used, both for exploitation and exploration. Individuals are represented by two parts: object variables and mutation step sizes. The object variables are essentially real-valued vectors, i.e. phenotypes, so EP is called a phenotypic algorithm. 2.6.3 Evolutionary Strategies Evolutionary Strategies (ES) were first proposed by Ingo Rechenberg. Individuals are represented by real-valued vectors, and ES is a good optimizer of real-valued functions. Like EP, it is also a phenotypic algorithm. Mutation plays the main role, though crossover is also used, and ES has a special self-adapting mutation step size. ES has some basic notation: 1. (p,c) The p parents 'produce' c children using mutation. Each of the c children is then assigned a fitness value, depending on its quality considering the problem-specific environment. The best (the fittest) p children become the next generation's parents. This means the c children are sorted by their fitness value and the first p individuals are selected to be the next generation's parents (c must be greater than or equal to p). 2. (p+c) The p parents 'produce' c children using mutation. Each of the c children is then assigned a fitness value, depending on its quality considering the problem-specific environment.
The best (the fittest) p individuals of both parents and children become the next generation's parents. This means the c children together with the p parents are sorted by their fitness value and the first p individuals are selected to be the next generation's parents. 3. (p/r,c) The p parents 'produce' c children using mutation and recombination. Each of the c children is then assigned a fitness value, depending on its quality considering the problem-specific environment. The best (the fittest) p children become the next generation's parents. This means the c children are sorted by their fitness value and the first p individuals are selected to be the next generation's parents (c must be greater than or equal to p). 4. (p/r+c) The p parents 'produce' c children using mutation and recombination. Each of the c children is then assigned a fitness value, depending on its quality considering the problem-specific environment. The best (the fittest) p individuals of both parents and
  • 18. XVIII children become the next generation's parents. This means the c children together with the p parents are sorted by their fitness value and the first p individuals are selected to be the next generation's parents. 2.6.4 Genetic Programming Genetic Programming (GP) was put forward by John Koza. GP evolves computer programs. It is a specialization of genetic algorithms (GA) in which each individual is a computer program. It is a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task. Trees can be easily evaluated in a recursive manner: every tree node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Genetic programming starts with a primordial ooze of thousands of randomly created computer programs. This population of programs is progressively evolved over a series of generations. The evolutionary search uses the Darwinian principle of natural selection (survival of the fittest) and analogs of various naturally occurring operations, including crossover (sexual recombination), mutation, gene duplication and gene deletion. Genetic programming sometimes also employs developmental processes by which an embryo grows into a fully developed organism. It uses both mutation and crossover. Trees are often used as the data structure for individuals, although non-tree representations have been suggested and successfully implemented. While other branches of EA have developed into mainstream usage, GP is still in its infancy: because of the representation of programs, the huge search space, and the complex operations needed to generate better individuals, GP is not mainstream yet. Figure (2.6.4): Individual structure of GP
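The recursive tree evaluation described above can be sketched as follows. Representing operator nodes as nested tuples is an illustrative assumption, not GP's canonical data structure.

```python
import operator

# Operator nodes are (op, left, right) tuples; terminal nodes are constants
# or variable names looked up in an environment.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(node, env):
    """Recursively evaluate a GP expression tree at a given variable binding."""
    if isinstance(node, tuple):                  # internal node: apply the operator
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):                    # terminal node: variable
        return env[node]
    return node                                  # terminal node: constant

# The tree for x * (x + 1): evaluating at x = 3 gives 12.
tree = ("*", "x", ("+", "x", 1))
print(evaluate(tree, {"x": 3}))                  # prints 12
```

A fitness function for GP would call `evaluate` on each candidate tree over a set of test inputs and score the outputs.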
  • 19. XIX 2.6.5 Memetic Algorithm Although memetic algorithms do not strictly fall into the EA category, they incorporate other search techniques into EAs. The combination of Evolutionary Algorithms with local search operators that work within the EA loop has been termed “Memetic Algorithms” (MA). Quite often, MAs are also referred to in the literature as Baldwinian Evolutionary Algorithms (EAs), Lamarckian EAs, cultural algorithms or genetic local search. After generating individuals, local search is performed on them. The frequency and intensity of individual learning directly define the degree of evolution (exploration) against individual learning (exploitation) in the MA search, for a given fixed, limited computational budget. Clearly, more intense individual learning provides a greater chance of convergence to the local optima but limits the amount of evolution that may be expended without incurring excessive computational resources. Therefore, care should be taken when setting these two parameters to balance the computational budget available in achieving maximum search performance. When only a portion of the population individuals undergo learning, the issue of which subset of individuals to improve needs to be considered to maximize the utility of MA search. 2.7 Existing Works 2.7.1 Dynamic Parameter Control A variety of previous works have proposed methods of dynamically adjusting the parameters of GA or other evolutionary algorithms. These methods include deterministic parameter control, adaptive parameter control, and self-adaptive parameter control. The simplest technique is deterministic parameter control, which adjusts parameters according to a predetermined policy. Since it controls the parameters deterministically, it cannot adapt to the changes that occur during the execution of an algorithm. Adaptive parameter control exploits feedback from the evolution of a population to control the parameters. 
A notable example is the 1/5 success rule for adaptive Gaussian mutation widely used in evolution strategy algorithms. According to this rule, the mutation step size is increased if more than 20% of the mutations are successful and reduced otherwise. However, this method cannot be applied to algorithms adopting a representation other than real numbers. Finally, self-adaptive parameter control encodes the parameters into chromosomes and lets them evolve with the other genes. Although elegant, its applicability and effectiveness in a broad range of problems have not yet been shown. 2.7.2 Maintaining Diversity and Multi-population Genetic Algorithms Multi-population GAs (MPGAs) do so by evolving multiple subpopulations which are spatially separated [6]. The island-model GA (IMGA), which is a typical example of an MPGA, evolves two or more subpopulations and uses periodic migration for the exchange of information between the subpopulations. The number and size of the populations of IMGA
  • 20. XX are predetermined and kept unchanged during the algorithm’s execution. However, other MPGAs, such as the multinational GA, the forking GA, the bi-objective multi-population algorithm and the variable island GA, can adjust the number and size of populations dynamically by splitting a population into two smaller ones or combining two similar ones. The performance of IMGA is sensitive to the migration policy, migration rates and size, and the particular topology used, because they determine the speed at which good solutions spread among the subpopulations. A variety of previous works have studied the effect of these migration parameters both theoretically and experimentally. 2.7.3 Memory Based Genetic Algorithm Diploid GA, GA with unexpressed genes (GAUG), dual GA (dGA), and primal-dual GA (PDGA) have adopted complementary and dominance mechanisms to maintain or provide population diversity. Most organisms in nature have a great number of genes in their chromosomes, and only some of the dominant genes are expressed in a particular environment. The repressed genes are considered a means of storing additional information and providing a latent source of population diversity. Diploid GAs use diploid chromosomes which differ from natural ones in that the two strands of the diploid chromosomes are not complementary. Only some genes in a diploid chromosome are expressed and used for fitness evaluation, by some predetermined dominance rules. GAUG is different from diploid GA in that it uses haploid chromosomes, but it also incorporates some unexpressed genes into its chromosomes. The unexpressed genes in GAUG are not used for fitness evaluation but for preserving diversity. dGAs and PDGAs also have haploid chromosomes in the population, but the chromosomes are sometimes interpreted complementarily to provide additional diversity. In dGA, each chromosome is attached with an additional bit which indicates whether the chromosome should be interpreted as it is or as complemented. 
In PDGA, some bad-looking chromosomes are interpreted both as complemented and as original, and the original one is replaced by the complemented one if the latter gives better evaluations. Since the additional diversity provided by memory-based algorithms makes it easier to adapt to extreme environmental changes, these methods are frequently used for dynamic optimization problems. 2.7.4 Mutation Based Work ES and EP use mutations exclusively, both for maintaining diversity and for exploitation. Mutations can be divided into several categories. Classified by uniformity across generations, they are: 1. Uniform Mutation: When the mutation step size or mutation rate is uniform regardless of the generation, at any time of the algorithm, it is called uniform mutation. Its usage is not very high because of its deterministic behavior regardless of generations.
  • 21. XXI 2. Non-Uniform Mutation: If the mutation step size or mutation rate varies with respect to the generation, it is called non-uniform mutation. Usually, in the initial generations, the step size or mutation rate is higher; as the generations go on, the step size or mutation rate is gradually decreased. It is used frequently because it gives an option for governing the rate of diversity: when diversity is needed, it is facilitated by a large step size, and when convergence is needed, by a small step size. For genetic algorithms, random bit-flipping is used for mutation. Random bit changing has some issues. For example, flipping a bit in a higher position of the bit-string has a much larger effect than flipping a bit in a lower position. Also, for some bit-strings, moving to the immediately next or previous value requires changing all the bits, which makes exploitation difficult. This is called the Hamming cliff problem; using Gray code can mitigate its effect. For mutation, a random step size is needed to introduce a random walk into the search space. For random number generation, the Gaussian distribution is most used. It has a bell-shaped curve and is defined by two parameters, a position parameter (mean, µ) and a scale parameter (standard deviation, σ), and is denoted by N(µ, σ). Always µ = 0, and usually σ = 3, i.e. N(0, 3), is used for random number generation (RNG). Mutation using the Gaussian distribution is called Gaussian mutation. Figure (2.7.4): Probability Density Function (PDF) of Gaussian distribution Xin Yao used two more distributions for RNG: 1. Cauchy Distribution 2. Levy Distribution Gaussian, Cauchy and Levy distributions all have the same bell-shaped PDF, and Cauchy and Levy have the same parameter set as the Gaussian. Both Cauchy and Levy have fatter tails than the Gaussian. That means they are able to make longer jumps, which can give more diverse individuals, less prone to getting trapped in local optima. Mutation using the Cauchy and Levy distributions as
  • 22. XXII RNG is called Cauchy mutation and Levy mutation, respectively. Xin Yao uses an adaptive mutation parameter: every individual is represented by a pair (x, η), where x is a real-valued vector and η is the adaptive mutation parameter, i.e. the scale parameter (standard deviation) of the distribution. 2.7.5 Survivor Selection Based Work Survivor selection is usually deterministic. In this phase of the algorithm, selection pressure is applied to the individuals. Several survivor selection schemes exist: 1. Naïve Survivor Selection: Basically follows the survival-of-the-fittest principle. Individuals are selected for the next generation based on their fitness value, and lower-fitness individuals are weeded out. This is sometimes risky, because lower-fitness individuals can carry latent genes which can give better individuals in later generations. 2. Elitist Selection: The population maintains spots for the best individuals so that they do not get lost across generations. A certain portion of the best individuals is transferred directly to the next generation without any modification. This ensures that even if the algorithm cannot improve on the current solutions, the best solution remains and is returned at the end of the algorithm. 3. Truncation Selection: Truncation selection simply retains the fittest x% of the population. These fittest individuals are duplicated into the next generation so that the population size is maintained. Less fit candidates are culled without even being given the opportunity to evolve into something better. It very often results in premature convergence; its only advantage is rapid convergence. Figure (2.7.5): Truncation Selection
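Truncation selection as described in item 3 can be sketched as follows. Representing individuals as plain numbers with an external fitness function, and duplicating survivors cyclically, are illustrative assumptions.

```python
def truncation_select(population, fitness, keep_fraction=0.5):
    """Retain the fittest fraction of the population and duplicate those
    survivors until the original population size is restored."""
    n = len(population)
    ranked = sorted(population, key=fitness, reverse=True)   # fittest first
    survivors = ranked[:max(1, int(n * keep_fraction))]      # cull the rest
    # Duplicate the survivors cyclically to restore the population size.
    return [survivors[i % len(survivors)] for i in range(n)]

pop = [3, 9, 1, 7, 5, 8]
next_gen = truncation_select(pop, fitness=lambda x: x, keep_fraction=0.5)
print(next_gen)   # prints [9, 8, 7, 9, 8, 7]
```

Note how the bottom half (5, 3, 1) disappears entirely in one step, which is exactly the loss of diversity that makes truncation prone to premature convergence.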
  • 23. XXIII 4. Fitness Proportionate Reproduction: The same as the roulette-wheel selection scheme. Individuals are directly transferred to the next generation based on their proportionate fitness value. Individuals of lower fitness still have some chance to survive, so that latent genes can survive through generations even though they have not yet produced good individuals. 5. Niching Methods: Niching methods strive to maintain niches [9] [10]. That means they ensure that individuals of one niche do not have to compete with individuals of other niches. The advantage is that pre-existing diversity is maintained, but it also makes convergence harder, as the selection pressure is lower. Niching methods are divided into two categories: i) Fitness Sharing: In nature, individuals of the same species compete with each other for fixed resources [13]. Likewise, in fitness sharing, individuals in the same region share the fixed fitness value assigned to that region; fitness is a shared resource of the population. The population is first divided into niches. A region is defined by the sharing radius, which defines the niche size. This scheme is very sensitive to the value of the fitness assigned per region and to the sharing radius. The population does not converge as a whole; rather, convergence takes place within the niches. Sharing can be done at the genotypic level (Hamming distance) or the phenotypic level (Euclidean distance). As for the sharing radius: if it is too small, it has practically no effect on the process; if too large, several individual peaks will ’melt’ into one. ii) Crowding: Similar individuals in a natural population, often of the same species, compete against each other for limited resources. Dissimilar individuals tend to occupy different niches; they typically do not compete. Crowding uses individuals newly entering a population to replace similar individuals. A random sample of CF (crowding factor) individuals is taken from the population. 
A larger crowding factor indicates less tolerance for similar solutions; smaller values mean similar solutions are more welcome. New members of a particular species replace older members of that species, not members of other species. Crowding does not increase the diversity of the population; rather it strives to maintain the pre-existing diversity, and it is not directly influenced by fitness values. Crowding is divided into: (1) Deterministic Crowding: A new individual always replaces the most similar individual if it has a better fitness value. (2) Probabilistic Crowding: Primarily a distance-based niching method. The main difference is the use of a probabilistic rather than a deterministic acceptance function: stronger individuals no longer always win over weaker individuals; they win proportionally to their fitness, and thus we get restorative pressure. The two core ideas of probabilistic crowding are to hold tournaments between similar individuals and to let those tournaments be probabilistic. 6. Deterministic Sampling: The average fitness of the population is calculated, and the fitness of each individual is divided by this average, storing the integer and fractional parts of the result. If the integer part is equal to or higher than one, the individual
  • 24. XXIV is copied to the next generation. The remaining free places in the new population are filled with the individuals having the greatest fractional parts.
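To make the fitness-sharing scheme above concrete, here is a minimal Python sketch. The triangular sharing function, the one-dimensional genotype, and all names are illustrative assumptions, not the thesis's implementation.

```python
def shared_fitness(population, raw_fitness, sigma_share):
    """Fitness sharing at the phenotypic level (Euclidean distance).

    Each individual's raw fitness is divided by its niche count, so that
    individuals in a crowded region share a fixed amount of fitness.
    """
    def sh(d):
        # Sharing function: full sharing at distance 0, none beyond sigma_share.
        return 1.0 - d / sigma_share if d < sigma_share else 0.0

    shared = []
    for x in population:
        # Niche count: how crowded the region around x is.
        niche_count = sum(sh(abs(x - y)) for y in population)
        shared.append(raw_fitness(x) / niche_count)
    return shared

# Two individuals close together share fitness; the isolated one keeps more.
pop = [0.0, 0.1, 5.0]
vals = shared_fitness(pop, raw_fitness=lambda x: 10.0, sigma_share=1.0)
```

Even though all three individuals have the same raw fitness, the two crowded ones end up with a lower shared fitness than the isolated one, which is exactly the pressure that keeps niches apart.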
  • 25. XXV Chapter 3 Algorithm Proposal 3.1 Dual Population Genetic Algorithm Dual Population Genetic Algorithm (DPGA) is a genetic algorithm which uses two populations instead of one, with two different evolutionary objectives, to avoid premature convergence [11] [12] [13]. The main population plays the role of the population of an ordinary genetic algorithm: it evolves to find a good solution of high fitness value. The additional population, called the reserve population, is employed as a reservoir of additional chromosomes that are rather different from the chromosomes of the main population. Two different fitness functions are used: the main population uses the actual fitness function (as in a normal GA), and the reserve population uses a fitness function which gives better fitness to chromosomes that are more different from the chromosomes of the main population. Multi-population genetic algorithms use migration of chromosomes from one population to another to exchange information. DPGA does not use migration; instead it uses another, novel approach called crossbreeding. Crossbreeding is performed by taking one parent from the main population and another parent from the reserve population and making a crossover between them. The newly born offspring are called crossbred offspring; they are then evaluated for survival in both the main population and the reserve population. DPGA also employs inbreeding, which takes two parents from the same population and produces offspring by crossover. These inbred offspring compete for survival in their respective parent population. Figure (3.1a): Offspring Generation of DPGA Mutation plays a minimal role in DPGA; diversity is mainly provided by the reserve population through crossbreeding, which thus plays the role of maintaining diversity in DPGA. The amount of diversity needed at any step of DPGA is specified by a self-adaptive parameter δ (0 < δ < 1). δ defines the distance between parents from the main population and parents from the reserve
  • 26. XXVI population. As δ determines which individuals participate in crossbreeding, we can say that δ is roughly analogous to a step length. The fitness function of the reserve population is fr(x) = 1 − | δ − d(M, x) | ------------------------------------(1) Figure (3.1b): Reserve Population Fitness Function Here d(M, x) is the average distance of individual x from the main population. So we turn our focus to crossbreeding and the fitness function of the reserve population. δ defines how distant the reserve population will be from the main population: δ is set to lower values for exploitation and to higher values for exploration. If δ is kept similar for several generations, the reserve population will start to converge at distance δ from the main population. DPGA has some pros and cons: 3.1.1 Advantages 1. The reserve population preserves genes which have gone extinct in the main population. Since a surviving crossbred offspring holds genes inherited from the best individual of the main population, a gene that is lost in later generations can be recovered from the reserve population. 2. DPGA utilizes information from successful breeding: the value of δ which produces surviving offspring is used later for selecting parents. If a crossover is unsuccessful, δ is set to its maximum, which influences the selection of future parents. 3.1.2 Disadvantages 1. The reserve population introduces space and computational overhead. Individuals of the main population need only one fitness evaluation, when they are created or modified; but every individual of the reserve population must be re-evaluated whenever δ changes value, as well as when it is created or modified. If the main population has n chromosomes and the reserve population has m, the total number of reserve fitness evaluations is O(nm). 2. When selecting parents, the dual population genetic algorithm does not measure the diversity of the reserve population. The reserve population should be diverse enough for
  • 27. XXVII exploration of the search space, but the selected parent of the reserve population may not be very dissimilar to the parent from the main population. For crossbreeding, the distance between parents may not be δ; in the worst case, the distance may be far less or far greater than δ. 3. A crossbred child may survive in both the main and the reserve population. Diversity then decreases, as the same individual is copied to both populations. This gives us another insight: the crossbred child has higher fitness (according to the fitness function of the reserve population) than the inbred child, so the parent from the reserve population may not be at the desired distance from the parent of the main population, i.e. the reserve population may not contain individuals who can breed offspring at the desired step size. 4. If crossbred offspring cannot survive in the main population, DPGA degenerates into a single-population algorithm. If this happens for several generations, measures must be taken to increase the diversity of the reserve population, which incurs overhead. 5. If a crossbred offspring manages to survive in the reserve population, the reserve population will contain replicated genes of an individual of the main population, decreasing the diversity of the reserve population further. 6. When converging, DPGA keeps the value of δ low for several generations to facilitate convergence. From equation (1), individuals at distance δ gain more fitness than other individuals, replacing individuals whose distance d(M, x) is much greater or less than δ. Since δ is set to its minimum during convergence, the individuals most distant from the main population begin to diminish (DPGA selects the best n individuals for survival in both populations, using the same fitness function as for parent selection), and individuals similar to the main population take over the reserve population after a few generations. Hence the reserve population also begins to converge, like the main population, but at distance δ from the main population.
When DPGA detects that the main population has converged to a local optimum, it sets δ to its maximum to escape. DPGA then picks the individuals most distant from the main population, but the reserve population is by now already similar to the main population and cannot provide any further diversity. 7. Diversity also depends on the success of crossover. Even if the parents for crossbreeding are at the desired distance, they may not produce the fittest individuals. Crossover is always a jump to an area somewhere 'in between' the two parent areas; offspring seldom go beyond their parents. 8. In DPGA, the total gene frequency remains constant from the very beginning. As crossover is the only operator used, new genes are never introduced. Crossbreeding changes gene frequencies within the individual populations, but the total frequency remains unchanged. In the worst scenario, if the best gene is missing at the initialization of the populations, DPGA never reaches the optimum. 9. Inbreeding in the reserve population does not introduce new genes. And if the distances of the two parents are δ and −δ (selection is based on their distance, not their direction), then the inbred offspring will be more similar to the main population.
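The reserve-population fitness function discussed above, fr(x) = 1 − |δ − d(M, x)|, can be sketched in Python as follows. The tuple representation and the assumption that coordinates already lie in the unit hypercube (so distances are comparable to δ) are illustrative choices, not the thesis's implementation.

```python
import math

def reserve_fitness(x, main_population, delta):
    """DPGA-style reserve fitness: individuals whose average distance from
    the main population is close to delta score highest."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    # d(M, x): average distance of x from the main population.
    d = sum(dist(x, m) for m in main_population) / len(main_population)
    return 1.0 - abs(delta - d)

main = [(0.0, 0.0), (0.2, 0.0)]
# With delta = 0.5, an individual at average distance 0.4 from the main
# population scores higher than one at average distance 0.1.
near = reserve_fitness((0.1, 0.0), main, delta=0.5)
far = reserve_fitness((0.5, 0.0), main, delta=0.5)
```

This makes the convergence problem in item 6 visible: once δ is held low, only individuals at that small distance from the main population score well, so the reserve population drifts towards the main one.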
  • 28. XXVIII 3.1.3 Recommendation One of the biggest drawbacks of DPGA is the convergence of the reserve population along with the main population. For survivor selection of the reserve population, probabilistic crowding should be used (the fitness function for parent selection would remain the same as before). As we have seen, the current reserve-population survivor selection of DPGA leads to convergence of the reserve population; probabilistic crowding prevents similar genes from taking over the whole population while simultaneously preserving genes from extinction. 3.2 Modified DPGA Proposal We have seen above that the selected parent from the reserve population may not be different enough from the parent of the main population; this reserve parent is the best of a bad bunch. As a result, crossbred offspring are not so different from their parents. And once δ is set to a low value, near zero, for several generations, the reserve population becomes almost identical to the main population. This is no problem if the main population converges to a good enough solution or the terminating criterion is met. But if we detect premature convergence, we have to increase the diversity of the main population to escape the local optimum; as the reserve population is identical to the main population, it cannot supply diversity, and the search remains trapped. To address this problem, we propose eliminating the reserve population; instead we generate, on the fly, an individual which plays the role of the reserve-population parent. The on-the-fly generated individual will be at exactly δ distance from the parent of the main population. 3.2.1 Structure Of Individual Every individual in the main population consists of pairs (xi, δi), where xi is the real value in dimension i and δi determines how large a jump the on-the-fly generated individual makes in that dimension; δi is called the jump parameter. Another parameter, the temperature T, is also introduced.
This parameter plays a role similar to that in simulated annealing; we bring in the concept of simulated annealing as a local search for rigorously searching newly found potential search regions. The value of T is larger at the beginning of the algorithm, so that the search region is bigger and more uniform in all dimensions, i.e. the shape of the search region is an n-dimensional sphere. At the final stage of evolution, the value of T is scheduled to decrease to facilitate a more exploitative local search, and the search exploits more in the dimensions where solutions are improving. The local search region then becomes elliptical, with the major axis of the ellipse pointing towards the local (maybe global) optimum of that region.
  • 29. XXIX 3.2.2 Initialization xi is initialized in the regular fashion. For δi, we generate n random numbers r1, …, rn and normalize them, δi = ri / Σj rj, so that Σi δi = 1. 3.2.3 Parent Selection Any selection method can be used, but we prefer tournament selection or restricted tournament selection (RTS). 3.2.4 Generating Parent Individual On The Fly One parent is selected from the main population, and the other parent is generated based on δ. If yi is the value of the generated individual at dimension i, then yi is offset from xi in proportion to δi, scaled so that the generated individual lies at Euclidean distance δ·Dmax from its main-population parent. Here Dmax = √(Σi (Ui − Li)²) is the maximum possible Euclidean distance between two points in the search space, Ui and Li being the upper and lower bounds of dimension i. Then we reset the main parent's jump parameters: if this parent were selected again with unchanged jump parameters, the same individual, and hence the same offspring, would be generated again, and the computation of a generation would be wasted. So we reset the jump parameters as in initialization. 3.2.5 Mutation DPGA uses non-uniform mutation. We instead use Cauchy mutation when the algorithm is in the exploration stage, as it gives more long jumps to facilitate exploration; when the algorithm is in the exploitation stage, we use Gaussian mutation, which gives short jumps to facilitate convergence of individuals.
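The on-the-fly partner generation of Section 3.2.4 can be sketched as follows. The exact per-dimension formula is not recoverable from this copy, so this sketch simply distributes a total jump of δ·Dmax across dimensions in proportion to the jump parameters, with random signs and clamping to the bounds; treat it as one plausible reading, not the thesis's formula.

```python
import math
import random

def generate_partner(x, jump, delta, lower, upper):
    """Generate a mate at (approximately) delta * D_max from parent x.

    `jump` holds the per-dimension parameters delta_i. Direction signs are
    chosen randomly, and values are clamped to the bounds, so the realized
    distance can be smaller than delta * D_max near the search-space edges.
    """
    d_max = math.sqrt(sum((u - l) ** 2 for u, l in zip(upper, lower)))
    # Scale per-dimension offsets so their Euclidean norm is delta * d_max.
    norm = math.sqrt(sum(j ** 2 for j in jump))
    y = []
    for xi, ji, li, ui in zip(x, jump, lower, upper):
        step = delta * d_max * ji / norm
        step = step if random.random() < 0.5 else -step
        y.append(min(ui, max(li, xi + step)))
    return y

random.seed(1)
x = [0.5, 0.5]
mate = generate_partner(x, jump=[0.5, 0.5], delta=0.1, lower=[0, 0], upper=[1, 1])
```

Starting from the center of the unit square with equal jump parameters, the mate lands exactly at distance 0.1·Dmax from the parent; near the boundary the clamping would shorten the jump, which foreshadows the pitfall discussed in Section 4.1.1.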
  • 30. XXX 3.2.6 Survivor Selection The new algorithm evaluates both the on-the-fly generated individuals and their inbred offspring. If the offspring survive, we divide them into 3 categories according to the Euclidean distance between the offspring and the main parent: 1. Exploited Individual: when this distance is small. 2. Normal Individual: when this distance is intermediate. 3. Explored Individual: when this distance is large. We divide them into 3 categories so that we can explore and exploit at the very same time. When in the exploitation stage, the algorithm can still explore other potential regions of the search space while exploiting the current region. On the other hand, when the algorithm is in the exploration stage and a potential region is found, we can exploit that region by conducting a local search, as in a memetic algorithm, while still exploring other regions. 3.2.6.1 Exploited Individual The jump parameters δi are scaled down, with different factors for the explored and the exploited dimensions. The rationale is that an exploited individual comes from a region which is already explored, or is being explored by another individual, so it does not need to explore the surrounding region twice. 3.2.6.2 Explored Individual The jump parameters δi are rescaled by the same factor in every dimension. This individual is far away from its parent, so it can be assumed to lie in a region where the algorithm has never searched before; this potential region needs exploration, and the rescaled jump parameters provide it. Even if the algorithm is in exploitation mode, it can still explore a newly found, unsearched, potentially good region. 3.2.6.3 Normal Individual Randomly select some dimensions; the jump parameters of the randomly selected dimensions are rescaled differently from those of the other dimensions.
  • 31. XXXI The individual is not at a distance which can be called exploited or explored. We select dimensions randomly because we want to introduce some variation based on the fitness difference. 3.2.7 Schedule of T The initial value of T depends on the optimization problem: for a complex, multi-modal, rough search space T should be larger to facilitate more exploration in the local search, while for a simple, unimodal search space a smaller value of T is better. The value of T is a function of the generation count and of offspring survival. We propose that T be decreased as the generation count grows; if no offspring survive, T remains the same, and if offspring survive, T is decreased. The reason is that surviving offspring mean we are making progress towards convergence, while no survivors mean we still need to explore more regions. 3.2.8 Advantages 1. The extra space of the reserve population is no longer needed, and evaluation of reserve-population individuals is also eliminated. 2. The on-the-fly generated individual is at exactly δ distance from the parent of the main population, so diversity can be incorporated as much as we want. 3. The new proposal introduces a δi for each dimension, with Σi δi = 1. δi determines how much exploitation or exploration takes place in each dimension. If we find a dimension in which better individuals can be found, we can continue to explore or exploit in that dimension. On the other hand, if the population is trapped in a deep local optimum, we can experiment with changing the values of δi to escape it. 4. Every individual has its own δi, so we have granular control over every individual in each dimension. 5. DPGA does not facilitate exploitation in newly found good regions of the fitness landscape, whereas the proposed algorithm gives full throttle to exploitation in a newly found region even while in global exploration mode, giving full local search capability like a memetic algorithm. 6.
DPGA does not evaluate reserve inbred offspring and reserve parents for survival in the main population, but this algorithm evaluates both the on-the-fly generated individuals and the inbred offspring. Since these individuals are already generated as by-products of crossbreeding, evaluating them adds very little overhead, and if any of them survives, it can introduce more diversity into the main population and open a new region to search for a potential global maximum. 3.3 New Survivor Selection Strategy Current survivor selection schemes fall into two categories: 1. Schemes focused solely on survival of the fittest, i.e. exploitation: for example, elitist selection, rank selection and fitness proportionate reproduction. 2. Schemes focused solely on maintaining diversity: for example, the niching methods fitness sharing, deterministic crowding and probabilistic crowding.
  • 32. XXXII The above two categories occupy two extreme ends: schemes focused on exploitation do not take diversity into account, while schemes focused on diversity do not take exploitation into account. But survivor selection should be based on both, so we propose a new survivor selection scheme which takes both diversity and exploitation into account. The fitness function for survivor selection is fs(x) = λ·f(x) + (1 − λ)·fd(x) Here f(x) is the real fitness function, fd(x) is a function of the gene variation of x with respect to the chromosomes of the current generation, and λ is an adaptive parameter which determines how much weight is put on each of the two functions. Usually the value of λ is lower in early generations to preserve diversity and higher in final generations to facilitate convergence. We propose that the value of λ change as a function of the generation count, the survival of offspring, and the difference between the fitness of the best individual and the desired fitness. Measuring gene variation can be crucial. One naïve approach is to measure the Euclidean distance of an individual from all individuals of the current population, which is O(n). We can improve this by some simple modifications. First, we take a point, or individual, as a reference for measuring distances: let us take the individual (LowerBound0, LowerBound1, ……, LowerBoundn-1) as our reference individual. At the beginning, we measure the distance from the reference individual to every individual of the current population. The normalized distance of any individual x is then nd(x) = √(Σi (xi − LowerBoundi)²) / Dmax, where Dmax is the maximum possible Euclidean distance between two points in the search space. So the mean normalized diversity of the current population is µ = (1/n) Σj nd(xj), and the standard deviation, using µ as reference, is σ = √((1/n) Σj (nd(xj) − µ)²). Every individual additionally stores a real value called its relative diversity. Relative diversity measures the diversity of an individual relative to the rest of the population, and is found by calculating the standard deviation of the population using the corresponding individual as the reference: RD(x) = √((1/n) Σj (nd(xj) − nd(x))²)
  • 33. XXXIII So the diversity fitness function is fd(x) = g(RD(x)), where g is a monotonically increasing function of RD(x). A generic fitness function will be of the form fd(x) = α·RD(x), where α is a scaling factor dependent on the optimization problem; the scaling factor α is needed to make the diversity fitness function comparable to the real fitness function. Usually, if the relative diversity of the offspring exceeds that of the individual it replaces, the offspring improves diversity. When an offspring survives, the normalized distance of the replaced individual is removed from, and that of the offspring added to, the sum under the square root, so the relative diversity measurements of all individuals can be updated incrementally. We adopt elitist selection: the top 10% of individuals according to the real fitness function and the top 10% according to gene variation are reserved, so this scheme emphasizes both exploration and exploitation; the remaining 80% of individuals have to survive through the proposed fitness function. Careful observation reveals that the normalized diversity of an individual is in the range [0, 1], so the standard deviation of the normalized distances will certainly be between 0 and 1. We could use σ as the adaptive parameter λ. But if two groups of individuals are at maximum distance from each other while the members of each group are in the same neighborhood, σ will be nearly 0.5, i.e. high, although the population is not diverse at all: it has just converged into two groups situated far away from each other. We could instead take the average relative diversity as λ, but the same problem persists. First we have to determine the size of a neighborhood (β) in terms of normalized diversity. The optimal neighborhood size depends on the optimization problem. We introduce an n-dimensional array of buckets. A bucket is a small region specified by the neighborhood size, essentially an n-dimensional hypercube; individuals located in a bucket's region fall into that bucket, so there will be Πi ⌈(Ui − Li)/β⌉ buckets in total. Each element of the array holds the number of individuals in the corresponding bucket, and every element is initialized to zero. The bucket array index of an individual in dimension i is easily found as ⌊(xi − LowerBoundi) / β⌋ For a huge search space this bucket array may require an enormous amount of memory.
We can use a sparse matrix as the data structure to address this issue. When an individual falls into a bucket's region, the value of that bucket is incremented. Each individual also stores the location of its bucket for easy removal.
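The bucket bookkeeping described above can be sketched with a Python dict standing in for the sparse matrix; only occupied buckets consume memory. All names are illustrative.

```python
def bucket_index(x, lower, beta):
    """Map an individual to its bucket: floor((x_i - lower_i) / beta) per dimension."""
    return tuple(int((xi - li) // beta) for xi, li in zip(x, lower))

class BucketGrid:
    """Sparse n-dimensional bucket array: dict from bucket index to count."""

    def __init__(self, lower, beta):
        self.lower, self.beta = lower, beta
        self.counts = {}  # bucket index -> number of individuals

    def add(self, x):
        idx = bucket_index(x, self.lower, self.beta)
        self.counts[idx] = self.counts.get(idx, 0) + 1
        return idx  # stored on the individual for easy removal

    def remove(self, idx):
        self.counts[idx] -= 1
        if self.counts[idx] == 0:
            del self.counts[idx]

grid = BucketGrid(lower=(0.0, 0.0), beta=0.25)
a = grid.add((0.1, 0.1))   # falls into bucket (0, 0)
b = grid.add((0.9, 0.1))   # falls into bucket (3, 0)
c = grid.add((0.95, 0.2))  # same bucket (3, 0) as b
```

The number of occupied dict entries is at most the population size, regardless of how many buckets the full dense grid would contain.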
  • 34. XXXIV We can see that the number of occupied buckets is at most the population size n, so the fraction of occupied buckets lies in (0, 1]. If we have to remove an individual, we simply decrement the value of the corresponding bucket. We can use this fraction of occupied buckets as the parameter λ: when the value of λ is higher, the algorithm puts more weight on the real fitness function, because the current population is diverse enough; on the other hand, if the value of λ is lower, the algorithm puts more weight on the diversity fitness function, as the current population is losing diversity. Even so, this function needs to be modified: it does not take the generation number into account, so when the algorithm is converging the value of λ is low, which slows the rate of convergence. A simple approach can be λ ← max(λ, tanh(g/c)), where g is the generation count and c a problem-dependent constant. It works because tanh is a monotonically increasing function: as the generation count increases, the value of tanh(g/c) also increases, and since the maximum value of tanh(g/c) is 1, the value of λ remains at most 1. Using buckets, we can also mark the regions that are already searched. We introduce a boolean variable named isSearched (true if the region is already searched, false if not) for each bucket. We define the criterion for being declared searched separately for each optimization problem; for example, we can declare a bucket region searched when it has held at least three individuals, each surviving at least 50 generations. The point of marking regions as searched is that, of two individuals with the same relative diversity, the one belonging to an unsearched bucket gets more weight in diversity than the one in a searched bucket. For example, we can reduce the relative diversity to RD(x)/p, where p is a penalty factor, so that fd(x) = α·RD(x)/p for individuals in searched buckets. A rather extreme scheme would eliminate all individuals of a searched bucket except the best one, replacing them with new individuals taken from unsearched buckets whose individual count is 0; but this can be detrimental for complex optimization problems. The above adaptive survival fitness function is for function maximization problems, but many real-life problems involve function minimization, i.e. cost minimization.
In the case of function minimization the above approach will not work, because both the real fitness function and the diversity fitness function should then decrease in value for better individuals. So we need to assign a lower diversity fitness value to more diverse individuals and a higher diversity fitness value to less diverse ones. A simple approach can be fd(x) = α·(1 − RD(x))
  • 35. XXXV The above equation assigns more diversity fitness to a less diverse individual and less diversity fitness to a more diverse individual, so we can also apply this diversity management technique to function minimization. Another approach is applying local search to the individuals, as in a memetic algorithm; steepest-ascent hill climbing is adopted here. During this hill climbing process, if an individual passes through the region of a bucket, that bucket is marked as searched. The advantage of this scheme is that we can easily identify searched regions even if those regions do not meet the criterion for being marked as searched. In cases of premature convergence, we can override this function and manually set the value of λ. Niching methods such as crowding and fitness sharing do maintain diversity, but they lack control over it: they solely try to keep the pre-existing diversity level, and neither increase diversity in case of premature convergence nor decrease diversity to facilitate exploitation. The proposed survivor selection scheme, on the other hand, gives full control over the diversity level needed at any time during evolution. 3.4 New Mutation Strategy Introducing random variation into real-valued vectors by sampling random numbers from probability distributions is called mutation. Until now, only three distributions have been used successfully in mutation: 1. Gaussian Mutation 2. Cauchy Mutation 3. Levy Mutation Careful observation reveals that the three distributions above are members of the stable family of distributions. The stable family is a family of distributions in which a linear combination of two independent copies of a random variable has the same distribution, up to location and scale parameters. In fact, the three distributions above are special cases of the stable distribution. All stable distributions are infinitely divisible; they are absolutely continuous and unimodal.
A random variable X is called stable (has a stable distribution) if, for n independent copies Xi of X, there exist constants cn > 0 and dn such that X1 + X2 + ⋯ + Xn has the same distribution as cn X + dn.
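The stability property can be illustrated numerically for the Gaussian case, where cn = √n and dn = 0: a sum of n standard normals and √n times a single standard normal should have the same distribution (here we compare sample variances). The snippet is illustrative only.

```python
import math
import random

random.seed(5)

# For a standard Gaussian X, X1 + ... + Xn has the same distribution as
# sqrt(n) * X: both are Gaussian with mean 0 and variance n.
n, trials = 4, 20000
sums = [sum(random.gauss(0.0, 1.0) for _ in range(n)) for _ in range(trials)]
scaled = [math.sqrt(n) * random.gauss(0.0, 1.0) for _ in range(trials)]

def sample_var(xs):
    """Biased sample variance, sufficient for this comparison."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

Both empirical variances come out close to n = 4, as the stability property predicts for the Gaussian member of the family.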
  • 36. XXXVI Figure (3.4): Probability Density Function of Stable Family So we can try other heavy-tailed, unimodal distributions of a similar character to generate random numbers for mutation. Two such candidate distributions are: 1. Laplace Distribution 2. Slash Distribution 3.4.1 Laplace Distribution Like the Gaussian distribution, it has two parameters: a location parameter µ and a scale parameter σ. The characteristic function (Fourier transform) of the Laplace distribution has the form of a Cauchy density. The probability density function of the Laplace distribution is also reminiscent of the Gaussian distribution; however, whereas the Gaussian density is expressed in terms of the squared difference from the mean µ, the Laplace density is expressed in terms of the absolute difference from the mean. Consequently, the Laplace distribution has fatter tails than the Gaussian distribution.
  • 37. XXXVII Figure (3.4.1a): Probability Density Function of Laplace Distribution Figure (3.4.1b): Comparison of Gaussian and Laplace Distribution Above is a graph of the Gaussian and Laplace distributions with the same scale and location parameters. It is noticeable that the Laplace distribution has a fatter tail and a sharper peak than the Gaussian, and falls off rather quickly in comparison. It is therefore expected to have a higher probability of escaping from a local optimum or moving away from a plateau, especially when the 'basin of attraction' of the local optimum or plateau is large relative to the mean step size. The Gaussian, on the other hand, has greater probability in the mid-range. From these observations we can conclude that the sharp peak of the Laplace distribution facilitates exploitation, as it has a higher probability of producing short jumps, while it also produces long jumps more often than the Gaussian; for mid-range jumps, however, the Gaussian gives better results.
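Since Python's standard library offers no Laplace sampler, Laplace mutation can be sketched via inverse-CDF sampling; the mutate helper and its parameter names are illustrative assumptions, not the thesis's operator.

```python
import math
import random

def laplace_sample(mu, b):
    """Draw from Laplace(mu, b) by inverting its CDF.

    For u uniform on (-1/2, 1/2):  x = mu - b * sign(u) * ln(1 - 2|u|).
    """
    u = random.random() - 0.5
    return mu - b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mutate(x, scale):
    """Add independent Laplace noise to every component of a real vector."""
    return [xi + laplace_sample(0.0, scale) for xi in x]

random.seed(7)
samples = [laplace_sample(0.0, 1.0) for _ in range(20000)]
```

The heavy tail is visible empirically: for Laplace(0, 1) the probability of a jump larger than 3 is e^-3 ≈ 5%, noticeably more than for a standard Gaussian (≈ 0.3%).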
  • 38. XXXVIII 3.4.2 Slash Distribution The Slash distribution is a continuous unbounded distribution developed as a deviation from the Gaussian distribution that allows for fatter tails (higher kurtosis) by altering the κ parameter, as illustrated in the plot below. When κ = 0 the distribution reduces to a Gaussian(µ, σ). If a Gaussian random variable is divided by a standard uniform random variable, the resulting distribution is the Slash distribution; it is an example of a ratio distribution. It has three parameters: like the Gaussian distribution a location parameter µ and a scale parameter σ, plus an extra parameter κ. Figure (3.4.2): Probability Density Function of Slash Distribution at different parameters From the above graph we see that as the value of κ gets bigger, the tail and the peak of the Slash distribution get bigger, the slope gets steeper and the mid-range gets smaller. By controlling the value of κ we can get an adaptive probability distribution which facilitates both extremes: exploitation and exploration. The Slash distribution is used to fit data that are approximately Gaussian but have a kurtosis > 3, i.e. greater than that of the Gaussian distribution. The Slash distribution can readily be compared to a Gaussian distribution, since they share the same mean µ and standard deviation σ parameters. Another family of distributions of which the Gaussian distribution is a special case is Student's t-distribution.
  • 39. XXXIX 3.4.3 Student's t-distribution Student's t-distribution (or simply the t-distribution) is a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown. The t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails, meaning that it is more prone to producing values that fall far from its mean. Figure (3.4.3): Probability Density Function for Student's t-distribution with different degrees of freedom The overall shape of the probability density function of the t-distribution resembles the bell shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit lower and wider. As the number of degrees of freedom grows, the t-distribution approaches the normal distribution with mean 0 and variance 1; for practical purposes, at around 30 degrees of freedom the t-distribution is already very close to the Gaussian. Careful observation reveals that at a low degree of freedom the t-distribution has a much fatter tail and a lower peak; as the degree of freedom (DOF) increases, the tails become thinner and the peak becomes higher. That means that at low DOF this distribution gives more long jumps, and with increasing DOF it gives shorter jumps. We can exploit this behavior of the t-distribution: at the beginning of the EA, when diversity is needed, the DOF is kept low, so the t-distribution produces long jumps that facilitate diversity; as the generation count increases we increase the DOF, so it produces shorter jumps and the algorithm becomes more exploitative.
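The DOF schedule described above can be sketched as follows. Sampling a t-variate as a standard normal divided by the square root of a chi-square (a sum of squared normals) over its integer DOF is a textbook construction; the linear DOF schedule from 1 to 30 is an illustrative assumption, not the thesis's rule.

```python
import math
import random

def student_t_sample(dof):
    """Draw from Student's t with integer dof: Z / sqrt(chi2(dof) / dof)."""
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(dof))
    return z / math.sqrt(chi2 / dof)

def t_mutate(x, generation, max_generation, sigma):
    """Mutate with t-noise whose DOF grows with the generation count:
    low DOF early (fat tails, long exploratory jumps), high DOF late
    (near-Gaussian, short exploitative jumps)."""
    # Linear schedule from 1 up to 30 DOF (illustrative choice).
    dof = 1 + int(29 * generation / max_generation)
    return [xi + sigma * student_t_sample(dof) for xi in x]

random.seed(3)
early = t_mutate([0.0] * 5, generation=0, max_generation=100, sigma=0.1)
late = t_mutate([0.0] * 5, generation=100, max_generation=100, sigma=0.1)
```

At generation 0 the sampler works at 1 DOF (Cauchy-like, unbounded variance), and at the final generation at 30 DOF, where the t-distribution is practically Gaussian, matching the exploration-to-exploitation transition the section describes.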
  • 40. XL Chapter 4 Experimental Study 4.1 Modified DPGA We have implemented: 1. Standard GA 2. DPGA 3. Modified DPGA Parameter settings for the algorithms: • Maximum generation = 1000 • Population size = 500 • Main population parents = 2 • Reserve population parents = 2 • Inbred main population offspring = 2 • Inbred reserve population offspring = 2 • Crossbred offspring (produced by taking one parent from the main population and one from the reserve population) = 2 • For crossover, the blend crossover (BLX-α) method was used. • Uniform Gaussian mutation with a fixed standard deviation was applied to both. Tournament selection was used for parent selection, and the naïve survivor selection method was adopted. 4.1.1 Pitfalls of Modified DPGA Theoretically the proposed algorithm should work better than DPGA, but in practice it does not. At the beginning of the algorithm we set δ = 1, which means the generated individual should be at the maximum possible distance. As a result, the on-the-fly generated individual always goes to the edge of the search space, so it only searches the extreme ends of the search space. Thus the offspring produced by crossover with the generated individual as a parent also lie on the boundary of the search space or in its neighborhood. Mutating these offspring seldom works, because short or mid-range jumps still keep an individual near the other individuals, and the long jump needed to introduce diversity is very unlikely under the current mutation operators.
  • 41. XLI Even if we design a mutation operator which gives this sort of jump, it has the risk of taking individual out of search space and taking sufficiently diverse individual to already searched regions. If we have search space of n dimensional, every dimension has same lower bound, upper bound then the search space will be like n-dimensional hypercube. Literally this algorithm only searches the faces of hypercube and their neighborhoods, while core region of hypercube remains unsearched. Another modification can be made to start the algorithm with lower value of . This will prevent generating individual at the boundary of search space as well as offspring. But lower value of also means algorithm is unable to make long jumps. 4.2 Adaptive Survivor Selection Strategy We have used standard GA with different types of survivor selection scheme. Implemented schemes are: 1. Naïve survivor selection 2. Adaptive survivor selection Initial setting of parameters of adaptive survivor selection:  Population size = 2000  Maximum generation = 500  Bucket edge length = 1.90734863e-6  Minimum number of generations required to be declared searched = 70  Minimum number of individuals required to be declared searched = 3  Survival adaptive parameter = 0.3  Diversity scaling factor = 50  Penalty factor = 10  Reserved number of best individuals (elite) = 20  Reserved number of most diverse individuals (elite) = 20 Experiment result for Ackley function by both adaptive survivor selection and naïve survivor selection given below:
Figure (4.2a): Change in diversity across generations

Figure (4.2b): Number of buckets searched (adaptive vs. naïve survivor selection)
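The bucket-based diversity measurement behind these plots can be sketched as follows: the search space is discretized into hypercubic buckets of a fixed edge length, and diversity is the number of distinct buckets the population occupies. The function names and exact bookkeeping are hypothetical; the thesis's implementation may differ.

```python
def bucket_index(ind, edge):
    """Map a real-valued individual to its bucket: each coordinate
    is divided by the bucket edge length and floored."""
    return tuple(int(x // edge) for x in ind)

def bucket_diversity(population, edge):
    """Number of distinct occupied buckets; more buckets = more spread."""
    return len({bucket_index(ind, edge) for ind in population})
```

With a small edge length (such as the 1.90734863e-6 used above), nearby individuals fall into separate buckets, so the count is a fine-grained proxy for how much of the space the population covers.
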
Chapter 5 Conclusion

5.1

Adaptive survivor selection is a novel approach that introduces adaptive survivor selection together with a new diversity measurement. Its key tool is that the selection method can be adapted with respect to the generation and the diversity level. It monitors the amount of diversity so that at no moment does diversity fall below the lowest permitted value. It also incorporates an elitist selection scheme, not only for the best individuals but also for the most diverse ones, so that when individuals are trapped in a deep local optimum, the most diverse individuals found so far help them escape. Experiments show that adaptive survivor selection beats the currently most used naïve survivor selection in terms of maintaining diversity. Although niching methods can sometimes maintain pre-existing diversity better than adaptive survivor selection, this performance gap can be mitigated by proper initialization of the adaptive diversity parameter and its update rule. Our scheme addresses one of the drawbacks of niching methods: they cannot control the amount of diversity needed at any given generation. In fact, niching methods and adaptive survivor selection have different goals. Niching methods mainly focus on growing and maintaining niches of individuals, whereas our scheme focuses on maintaining a level of diversity that can guide individuals to the global optimum.

5.2 Future Work

5.2.1 Modified DPGA

It is evident that the value of δ caused the poor performance of this algorithm. If we change the initialization and update rule of δ, the algorithm will hopefully perform better. One approach is to initialize δ to a lower value instead of 1; this mitigates the risk of individuals going beyond the search space or residing only on its boundary. But this approach has a flaw.
If we restrict δ to lower values, the algorithm becomes less capable of escaping local optima and hence more prone to premature convergence. The value assigned to δ could instead be drawn from a probability distribution, so that δ is not vulnerable to being either too high or too low. Even after the initialization problem of δ is solved, the update rule of δ still needs to be revised.
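A minimal sketch of this remedy, assuming a Beta distribution as the sampling distribution for δ (the text does not specify which distribution to use, so Beta(2, 2) is purely illustrative):

```python
import random

def sample_delta(a=2.0, b=2.0):
    """Draw delta from a Beta(a, b) distribution on (0, 1).
    Beta(2, 2) concentrates mass around 0.5, so delta is rarely
    near 0 (no long jumps) or near 1 (boundary-only search).
    The choice of distribution and parameters is an assumption."""
    return random.betavariate(a, b)
```
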
5.2.2 Adaptive Survivor Selection

We have investigated a new diversity measurement technique and, using it, proposed a new survivor selection strategy that works better than existing survivor selection schemes. A pitfall of the new diversity measurement is that, in some edge cases, it reports a high diversity value although the population is not diverse at all. Detecting these edge cases and mitigating the error they cause can be done in the future. We could also adopt fitness sharing to assign a fitness to each bucket, which every individual in that bucket would share. The fitness assigned to a bucket would depend on how diverse that bucket is; that is, instead of measuring the diversity of individuals, we measure the diversity of their container buckets. Once a bucket is assigned a fitness, the individuals in that bucket share it among themselves.

5.2.3 New Distribution-Based Mutation

We have investigated distributions that have properties similar to currently deployed distributions or come from the same family. These distributions have a bell-shaped curve similar to the Gaussian and depend on the same set of parameters as the Gaussian, Cauchy, or Lévy distributions. The three distributions presented earlier have the potential to replace current distribution-based mutations. All of them have fatter tails, and the Laplace and Slash distributions also have higher peaks, so theoretically both should give better performance in both exploration and exploitation. Student's t-distribution converges to the Gaussian at around 30 degrees of freedom (DOF), so we can experiment with which initial DOF to start the algorithm at and with how to change the DOF as the generation count increases.
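The Student's-t mutation idea and a DOF schedule can be sketched as follows. The scale, mutation rate, and linear annealing schedule are hypothetical choices for illustration, not settings proposed in the thesis.

```python
import math
import random

def t_mutation(ind, dof, scale=0.1, rate=0.1):
    """Mutate by adding Student's-t noise, sampled as a standard
    Gaussian draw divided by sqrt(chi^2(dof)/dof). Low dof gives
    fat tails (long, Cauchy-like exploratory jumps); high dof
    approaches Gaussian mutation (short, exploitative jumps).
    `scale` and `rate` are illustrative assumptions."""
    out = []
    for x in ind:
        if random.random() < rate:
            z = random.gauss(0.0, 1.0)
            chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(dof))
            x = x + scale * z / math.sqrt(chi2 / dof)
        out.append(x)
    return out

def anneal_dof(generation, max_generation, start=1, end=30):
    """Increase dof linearly over the run: explore early (dof near 1,
    Cauchy-like) and exploit late (dof near 30, effectively Gaussian)."""
    frac = generation / max_generation
    return max(1, round(start + frac * (end - start)))
```
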