 A search technique used in computing to find true or approximate solutions to optimization and search problems
 Categorized as a global search heuristic
 Uses techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover (also called recombination)
 Implemented as a computer simulation in which a population of abstract representations (chromosomes/genotypes/genomes) of candidate solutions (individuals/creatures) to an optimization problem evolves toward better solutions
 Solutions are traditionally represented in binary, but other encodings are also possible
 Evolution starts from a population of randomly generated individuals and happens in generations
 In each generation, the fitness of every individual is evaluated; multiple individuals are selected from the current population and modified to form a new population
 The new population is then used in the next iteration of the algorithm
 The algorithm terminates when the desired number of generations has been produced or a satisfactory fitness level has been reached
 Individual – any possible solution
 Population – group of all individuals
 Search space – all possible solutions to the problem
 Chromosome – blueprint of an individual
 Trait – possible aspect of an individual
 Allele – possible setting of a trait
 Locus – position of gene on the chromosome
 Genome – collection of all chromosomes for an individual
 Cells are the basic building blocks of the body
 Each cell has a core structure (the nucleus) that contains the chromosomes
 Each chromosome is made up of tightly coiled strands of DNA
 Genes are segments of DNA that determine specific traits, such as eye or hair colour
 A gene mutation is an alteration in DNA; it can be inherited or acquired during a lifetime
 Darwin's theory of evolution – only the organisms best adapted to their environment tend to survive
 Produce an initial population of individuals
 Evaluate the fitness of all individuals
 While termination condition not met do
 Select the fitter individuals for reproduction
 Recombine (cross over) selected individuals
 Mutate individuals
 Evaluate the fitness of the modified individuals
 Generate a new population
 End while
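The loop above can be sketched as a minimal Python program for the maximize-the-ones task that follows. Operator choices here (tournament selection, one-point crossover, a per-bit mutation rate, the population size) are illustrative, not prescribed by the slides:

```python
import random

def fitness(bits):
    """OneMax fitness: the number of 1s in the bit string."""
    return sum(bits)

def genetic_algorithm(l=10, n=6, generations=50, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # Produce an initial population of random individuals
    pop = [[rng.randint(0, 1) for _ in range(l)] for _ in range(n)]
    for _ in range(generations):
        # Select the fitter of two random individuals (tournament selection)
        def select():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        new_pop = []
        while len(new_pop) < n:
            p1, p2 = select(), select()
            # Recombine: one-point crossover
            cut = rng.randrange(1, l)
            child = p1[:cut] + p2[cut:]
            # Mutate: flip each bit with probability p_mut
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop  # the new population is used in the next iteration
    return max(pop, key=fitness)

best = genetic_algorithm()
print("".join(map(str, best)), fitness(best))
```

Termination here is simply a fixed number of generations; a fitness threshold could be checked inside the loop instead.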
 Suppose we want to maximize the number of ones in a string of l binary digits
 An individual is encoded as a string of l binary digits
 Let's say l = 10, so the integer 1 is encoded as 0000000001 (10 bits)
 We start with a population of n random strings. Suppose that l = 10 and n = 6
 We toss a fair coin 60 times to get the following initial population:
s1 = 1111010101 f (s1) = 7
s2 = 0111000101 f (s2) = 5
s3 = 1110110101 f (s3) = 7
s4 = 0100010011 f (s4) = 4
s5 = 1110111101 f (s5) = 8
s6 = 0100110000 f (s6) = 3
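The fitness values above can be checked directly: f simply counts the ones in each string.

```python
def f(s):
    """Fitness of a bit string: the number of 1s it contains."""
    return s.count("1")

# The initial population from the example
population = {
    "s1": "1111010101", "s2": "0111000101", "s3": "1110110101",
    "s4": "0100010011", "s5": "1110111101", "s6": "0100110000",
}
for name, s in population.items():
    print(name, f(s))
# s5 has the highest fitness (8), s6 the lowest (3)
```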
 Generates and combines multiple predictions
 Bagging: Bootstrap Aggregating
 Boosting
 Tends to get better results, since significant diversity is deliberately introduced among the models
 Bagging and boosting are meta-algorithms that pool decisions from multiple classifiers
 Improves stability and accuracy of machine-learning algorithms used in statistical
classification and regression
 Reduces variance and helps avoid overfitting
 Technique: given a standard training set D of size n, bagging generates m new training sets Di, each of size n′, by sampling from D uniformly and with replacement
 If n′ = n, then for large n, each set Di is expected to contain about 63.2% (1 − 1/e) of the unique examples of D, the rest being duplicates
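The sampling-with-replacement step, and the 1 − 1/e ≈ 63.2% unique fraction, can be checked with a short sketch (function and variable names are illustrative):

```python
import random

def bootstrap_sample(D, n_prime=None, rng=random):
    """Draw one bagging training set: n' items from D, uniformly with replacement."""
    n_prime = len(D) if n_prime is None else n_prime
    return [rng.choice(D) for _ in range(n_prime)]

rng = random.Random(42)
n = 10_000
D = list(range(n))            # n distinct training examples
Di = bootstrap_sample(D, rng=rng)

# Fraction of D's examples that appear at least once in Di
unique_fraction = len(set(Di)) / n
print(f"unique fraction: {unique_fraction:.3f}")  # close to 1 - 1/e ≈ 0.632
```

In bagging, one model would be trained on each of the m sets Di and their predictions averaged (regression) or voted on (classification).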
 Let's calculate the average price of a house
 From the population F, draw a sample x = (x1, x2, …, xn) and calculate the average u
 Ideally, we would now draw several more samples from F
 In practice it is impossible to get multiple samples, so we use the bootstrap
 Repeat B times:
 Generate a sample Lk of size n from L by sampling with replacement
 Compute the average xk* for Lk
 We now have B bootstrap values
 X* = (x1*, …, xB*)
X = (3.12, 0, 1.57, 19.67, 0.22, 2.20), Mean = 4.46
X1 = (1.57, 0.22, 19.67, 0, 0, 2.2, 3.12), Mean = 4.13
X2 = (0, 2.20, 2.20, 2.20, 19.67, 1.57), Mean = 4.64
X3 = (0.22, 3.12, 1.57, 3.12, 2.20, 0.22), Mean = 1.74
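The procedure above can be reproduced in spirit (resamples are drawn at random, so the exact resampled values will differ from the slide's):

```python
import random

# The original sample from the slide
X = [3.12, 0, 1.57, 19.67, 0.22, 2.20]
print(round(sum(X) / len(X), 2))  # 4.46, the original sample mean

rng = random.Random(0)
B = 1000
boot_means = []
for _ in range(B):
    # Resample of size n from X, with replacement
    Xk = [rng.choice(X) for _ in X]
    boot_means.append(sum(Xk) / len(Xk))

# The spread of the B bootstrap means estimates the sampling
# variability of the mean, using only the one sample we have
print(round(min(boot_means), 2), round(max(boot_means), 2))
```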
 Based on the question: can a set of weak learners produce a strong learner?
 A weak learner is a classifier that is only slightly correlated with the true classification (better than random guessing)
 A strong learner is a classifier that is well-correlated with the true classification
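A minimal sketch of boosting in the AdaBoost style, assuming decision stumps as the weak learners on a toy 1-D dataset (the data, thresholds, and names are illustrative): no single stump classifies the alternating labels, yet three re-weighted stumps combine into a strong learner.

```python
import math

# Toy 1-D dataset with alternating labels: no single threshold
# separates it, so every stump is only a weak learner
X = [1.0, 2.0, 3.0, 4.0]
y = [+1, -1, +1, -1]

def stump(theta, sign):
    """Weak learner: predict `sign` left of the threshold, `-sign` right of it."""
    return lambda x: sign if x < theta else -sign

candidates = [stump(t, s) for t in (1.5, 2.5, 3.5) for s in (+1, -1)]

w = [1.0 / len(X)] * len(X)  # uniform example weights
ensemble = []                # (alpha, weak_learner) pairs

for _ in range(3):           # three boosting rounds
    # Pick the weak learner with the lowest weighted error
    def weighted_error(h):
        return sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
    h = min(candidates, key=weighted_error)
    err = weighted_error(h)
    alpha = 0.5 * math.log((1 - err) / err)  # vote weight: larger for lower error
    ensemble.append((alpha, h))
    # Increase the weight of misclassified examples, then renormalize
    w = [wi * math.exp(-alpha * yi * h(xi)) for wi, xi, yi in zip(w, X, y)]
    total = sum(w)
    w = [wi / total for wi in w]

def strong_learner(x):
    """Weighted majority vote of the weak stumps."""
    return 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

print([strong_learner(x) for x in X])  # matches y: the vote classifies every point
```

The re-weighting step is what makes later rounds focus on the examples earlier stumps got wrong.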
