1. 13
Basics of Genetic Algorithms
and some possibilities
Peter Spijker
Technische Universiteit Eindhoven
Department of Biomedical Engineering
Division of Biomedical Imaging and Modeling
California Institute of Technology
Materials Process and Simulation Center
Biochemistry & Molecular Biophysics
November 25, 2003
12
2. 13Presentation Overview
• Purpose of presentation
• General introduction to Genetic Algorithms (GA’s)
• Biological background
• Origin of species
• Natural selection
• Genetic Algorithm
• Search space
• Basic algorithm
• Coding
• Methods
• Examples
• Possibilities
3. 13Purpose of presentation
• Optimising parameters of force fields is a difficult and time
consuming task
• Use of optimising methods might be of use
• Methods:
- steepest descent
- simulated annealing (Monte Carlo)
- genetic algorithms
• Brief introduction to genetic algorithms in lecture style
4. 13General Introduction to GA’s
• Genetic algorithms (GA’s) are a technique to solve
problems which need optimization
• GA’s are a subclass of Evolutionary Computing
• GA’s are based on
Darwin’s theory of evolution
• History of GA’s
• Evolutionary computing evolved in the 1960’s.
• GA’s were created by John Holland in the mid-70’s.
5. 13Biological Background (1) – The cell
• Every animal cell is a complex of many small
“factories” working together
• The center of this all is the cell nucleus
• The nucleus contains the genetic information
6. 13Biological Background (2) – Chromosomes
• Genetic information is stored in the chromosomes
• Each chromosome is build of DNA
• Chromosomes in humans form pairs
• There are 23 pairs
• The chromosome is divided in parts: genes
• Genes code for properties
• The posibilities of the genes for
one property is called: allele
• Every gene has an unique position
on the chromosome: locus
7. 13Biological Background (3) – Genetics
• The entire combination of genes is called genotype
• A genotype develops to a phenotype
• Alleles can be either dominant or recessive
• Dominant alleles will always express from the genotype to
the fenotype
• Recessive alleles can survive in the population for many
generations, without being expressed.
8. 13Biological Background (4) – Reproduction
• Reproduction of genetical information
• Mitosis
• Meiosis
• Mitosis is copying the same
genetic information to new
offspring: there is no
exchange of information
• Mitosis is the normal way of
growing of multicell structures,
like organs.
9. 13Biological Background (5) – Reproduction
• Meiosis is the basis of sexual reproduction
• After meiotic division 2 gametes
appear in the process
• In reproduction two gametes
conjugate to a zygote wich
will become the new individual
• Hence genetic information is shared
between the parents in order to
create new offspring
10. 13Biological Background (6) – Reproduction
• During reproduction “errors” occur
• Due to these “errors” genetic variation exists
• Most important “errors” are:
• Recombination (cross-over)
• Mutation
11. 13Biological Background (7) – Natural selection
• The origin of species: “Preservation of favourable
variations and rejection of unfavourable variations.”
• There are more individuals born than can survive, so
there is a continuous struggle for life.
• Individuals with an advantage have a greater chance for
survive: survival of the fittest.
12. 13Biological Background (8) – Natural selection
• Important aspects in natural selection are:
• adaptation to the environment
• isolation of populations in different groups which
cannot mutually mate
• If small changes in the genotypes of individuals are
expressed easily, especially in small populations, we
speak of genetic drift
• Mathematical expresses as fitness: success in life
13. 13Presentation Overview
• Purpose of presentation
• General introduction to Genetic Algorithms (GA’s)
• Biological background
• Origin of species
• Natural selection
• Genetic Algorithm
• Search space
• Basic algorithm
• Coding
• Methods
• Examples
• Possibilities
14. 13Genetic Algorithm (1) – Search space
• Most often one is looking for the best solution
in a specific subset of solutions
• This subset is called the search space (or state space)
• Every point in the search space is a possible solution
• Therefore every point has a fitness value, depending on
the problem definition
• GA’s are used to search the
search space for the best
solution, e.g. a minimum
• Difficulties are the local
minima and the starting
point of the search
0 100 200 300 400 500 600 700 800 900 1000
0
0.5
1
1.5
2
2.5
15. 13Genetic Algorithm (2) – Basic algorithm
• Starting with a subset of n randomly chosen solutions
from the search space (i.e. chromosomes). This is the
population
• This population is used to produce a next generation
of individuals by reproduction
• Individuals with a higher fitness have more chance
to reproduce (i.e. natural selection)
16. 13Genetic Algorithm (3) – Basic algorithm
• Outline of the basic algorithm
0 START : Create random population of n chromosomes
1 FITNESS : Evaluate fitness f(x) of each chromosome
in the population
2 NEW POPULATION
0 SELECTION : Based on f(x)
1 RECOMBINATION : Cross-over chromosomes
2 MUTATION : Mutate chromosomes
3 ACCEPTATION : Reject or accept new one
3 REPLACE : Replace old with new population: the new
generation
4 TEST : Test problem criterium
5 LOOP : Continue step 1 – 4 until criterium is
satisfied
17. 13Genetic Algorithm (4) – Coding
• Normal cells are diploid (containing 2 complete sets
of chromosomes)
• On the contrary gametes are haploid
• Formalizing diploid reproduction is much more difficult
than haploid
• Diploid populations have an extra dimension compared to
haploid populations
• For simplicity therefore only haploid genetic algorithms
18. 13Genetic Algorithm (5) – Coding
• Chromosomes are encoded by bitstrings
• Every bitstring therefore is a solution but not necisseraly
the best solution
• The way bitstrings can code differs from problem to
problem
Either: sequence of on/off or the number 9
1
0
0
1
19. 13Genetic Algorithm (6) – Coding
• Recombination (cross-over) can when using
bitstrings schematically be represented:
1
0
0
1
1
0
1
0
1
0
1
1
1
0
X
1
0
0
1
1
1
0
0
1
0
1
1
0
1
20. 13Genetic Algorithm (7) – Coding
• Mutation prevents the algorithm to be trapped in a
local minimum
• In the bitstring approach mutation is simpy the flipping
of one of the bits
1
0
0
1
1
0
1
1
1
0
1
1
0
1
21. 13Genetic Algorithm (8) – Coding
• Both recombination and mutation depend a lot
on the exact definition of the problem and the choice
of representing the chromosomes (e.g. no bitstrings)
• Different encodings can be used:
• Binary encoding
• Permutation encoding
• Value encoding
• Tree encoding
• Focus in this presentation stays with binary encoding
22. 13Example Minimum of Function (1)
• First example shows how to find the minimum
of a function
0 100 200 300 400 500 600 700 800 900 1000
0
0.5
1
1.5
2
2.5
Minimum f(x)
at x = 809
1100101001
23. 13Example Minimum of Function (2)
0 200 400 600 800 1000 1200
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
2.2
Generation 1
0 10 20 30 40 50 60 70 80
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2
1.3
Generations
Fitness
Best Fitness
Mean Fitness
Individual Best individual
Mean
fitness
Best
fitness
Generations
24. 13Example Minimum of Function (3)
• Interactive show of this algorithm with Matlab
• Using the function: genalg2()
• Variables:
• Population size
• Bitstringlength
• Mutation chance
• Recombination chance
• Starting population adaption
25. 13Genetic Algorithm (9) – Remarks
• It is clear from the example that the convergence
speed of the algorithm depends on many factors:
• Population size
• Mutation probability
• Recombination probability
• Elitism
• Selection methods
• Random selection of parents
• Roulette wheel selection of parents
• Strong point GA’s: mutation prevents from falling in
a local minimum, recombination initiates a fast
first convergence
26. 13Example Checkboard (1)
• We are given an n by n checkboard in which
every field can have a different colour from a
set of four colours.
• Goal is to achieve a checkboard in a way that there
are no neighbours with the same colour (not diagonal)
1 2 3 4 5 6 7 8 9 10
1
2
3
4
5
6
7
8
9
10
1 2 3 4 5 6 7 8 9 10
1
2
3
4
5
6
7
8
9
10
27. 13Example Checkboard (2)
• Chromosomes represent the way the checkboard
is coloured.
• Chromosomes are not represented by bitstrings
but by bitmatrices
• The bits in the bitmatrix can have one of the four values
0, 1, 2 or 3, depending on the colour
• Crossing-over involves matrix manipulation instead
of point wise operating. Crossing-over can be
combining the parential matrices in a horizontal,
vertical, triangular or square way
• Mutation remains bitwise changing bits in either one
of the other numbers
28. 13Example Checkboard (3)
• Fitnesscurve for the checkboard example
• This problem can be seen as a graph with n nodes
and (n-1) edges, so the fitness f(x) is easily
defined as: f(x) = 2 · (n-1) ·n
0 100 200 300 400 500 600
130
135
140
145
150
155
160
165
170
175
180
Generations
Fitness
Best Fitness
Mean Fitness
30. 13Example Checkboard (5)
• Interactive show of this algorithm with Matlab
• Using the functions:
• main()
• checkers()
• bestindividual()
• mutate()
• recombine()
• select()
• showbestindividual()
31. 13Possibilities
• Using the genetic algorithm to optimise
parameters for a force field
• Parameters are real numbers, so adaptations of
these algorithms is required
• Value incoding vs. bitstring encoding
• Difficulties:
• Definition fitness function
• Integration algorithm with software