SlideShare a Scribd company logo
1 of 20
Genetic Algorithms




                     Jul 18, 2012
Evolution
   Here’s a very oversimplified description of how evolution works
    in biology
   Organisms (animals or plants) produce a number of offspring
    which are almost, but not entirely, like themselves
       Variation may be due to mutation (random changes)
       Variation may be due to sexual reproduction (offspring have some
        characteristics from each parent)
   Some of these offspring may survive to produce offspring of their
    own—some won’t
       The “better adapted” offspring are more likely to survive
       Over time, later generations become better and better adapted
   Genetic algorithms use this same process to “evolve” better
    programs
                                                                           2
Genotypes and phenotypes
   Genes are the basic “instructions” for building an
    organism
   A chromosome is a sequence of genes
   Biologists distinguish between an organism’s genotype
    (the genes and chromosomes) and its phenotype (what
    the organism actually is like)
       Example: You might have genes to be tall, but never grow to
        be tall for other reasons (such as poor diet)
   Similarly, “genes” may describe a possible solution to a
    problem, without actually being the solution

                                                                      3
The basic genetic algorithm
   Start with a large “population” of randomly generated
       “attempted solutions” to a problem
   Repeatedly do the following:
      Evaluate each of the attempted solutions

      Keep a subset of these solutions (the “best” ones)

      Use these solutions to generate a new population

   Quit when you have a satisfactory solution (or you run out of time)




                                                                          4
A really simple example
   Suppose your “organisms” are 32-bit computer words
   You want a string in which all the bits are ones
   Here’s how you can do it:
       Create 100 randomly generated computer words
       Repeatedly do the following:
          Count the 1 bits in each word

          Exit if any of the words have all 32 bits set to 1

          Keep the ten words that have the most 1s (discard the rest)

          From each word, generate 9 new words as follows:

              Pick a random bit in the word and toggle (change) it


   Note that this procedure does not guarantee that the next
    “generation” will have more 1 bits, but it’s likely
                                                                         5
A more realistic example, part I
   Suppose you have a large number of (x, y) data points
       For example, (1.0, 4.1), (3.1, 9.5), (-5.2, 8.6), ...
   You would like to fit a polynomial (of up to degree 5) through
    these data points
       That is, you want a formula y = ax5 + bx4 + cx3 + dx2 +ex + f that gives
        you a reasonably good fit to the actual data
       Here’s the usual way to compute goodness of fit:
            Compute the sum of (actual y – predicted y)2 for all the data points
            The lowest sum represents the best fit
   There are some standard curve fitting techniques, but let’s
    assume you don’t know about them
   You can use a genetic algorithm to find a “pretty good” solution


                                                                                    6
A more realistic example, part II
   Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f
   Your “genes” are a, b, c, d, e, and f
   Your “chromosome” is the array [a, b, c, d, e, f]
   Your evaluation function for one array is:
       For every actual data point (x, y), (I’m using red to mean “actual data”)
            Compute ý = ax5 + bx4 + cx3 + dx2 +ex + f
            Find the sum of (y – ý)2 over all x
            The sum is your measure of “badness” (larger numbers are worse)
       Example: For [0, 0, 0, 2, 3, 5] and the data points (1, 12) and (2, 22):
            ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 2 + 3 + 5 = 10 when x is 1
            ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 8 + 6 + 5 = 19 when x is 2
            (12 – 10)2 + (22 – 19)2 = 22 + 32 = 13
            If these are the only two data points, the “badness” of [0, 0, 0, 2, 3, 5] is 13

                                                                                                7
A more realistic example, part III
   Your algorithm might be as follows:
       Create 100 six-element arrays of random numbers
       Repeat 500 times (or any other number):
          For each of the 100 arrays, compute its badness (using all data

           points)
          Keep the ten best arrays (discard the other 90)

          From each array you keep, generate nine new arrays as

           follows:
               Pick a random element of the six

               Pick a random floating-point number between 0.0 and 2.0

               Multiply the random element of the array by the random

                floating-point number
       After all 500 trials, pick the best array as your final answer


                                                                             8
Asexual vs. sexual reproduction
   In the examples so far,
       Each “organism” (or “solution”) had only one parent
       Reproduction was asexual (without sex)
       The only way to introduce variation was through mutation
        (random changes)
   In sexual reproduction,
       Each “organism” (or “solution”) has two parents
       Assuming that each organism has just one chromosome, new
        offspring are produced by forming a new chromosome from
        parts of the chromosomes of each parent


                                                                   9
The really simple example again
   Suppose your “organisms” are 32-bit computer words,
    and you want a string in which all the bits are ones
   Here’s how you can do it:
       Create 100 randomly generated computer words
       Repeatedly do the following:
          Count the 1 bits in each word

          Exit if any of the words have all 32 bits set to 1

          Keep the ten words that have the most 1s (discard the rest)

          From each word, generate 9 new words as follows:

              Choose one of the other words

              Take the first half of this word and combine it with the

               second half of the other word


                                                                          10
The example continued
   Half from one, half from the other:
    0110 1001 0100 1110 1010 1101 1011 0101
    1101 0100 0101 1010 1011 0100 1010 0101
    0110 1001 0100 1110 1011 0100 1010 0101

   Or we might choose “genes” (bits) randomly:
    0110 1001 0100 1110 1010 1101 1011 0101
    1101 0100 0101 1010 1011 0100 1010 0101
    0100 0101 0100 1010 1010 1100 1011 0101

   Or we might consider a “gene” to be a larger unit:
    0110 1001 0100 1110 1010 1101 1011 0101
    1101 0100 0101 1010 1011 0100 1010 0101
    1101 1001 0101 1010 1010 1101 1010 0101

                                                         11
Comparison of simple examples
   In the simple example (trying to get all 1s):
       The sexual (two-parent, no mutation) approach, if it succeeds,
        is likely to succeed much faster
            Because up to half of the bits change each time, not just one bit
       However, with no mutation, it may not succeed at all
            By pure bad luck, maybe none of the first (randomly generated) words
             have (say) bit 17 set to 1
                Then there is no way a 1 could ever occur in this position

            Another problem is lack of genetic diversity
                Maybe some of the first generation did have bit 17 set to 1, but

                 none of them were selected for the second generation
   The best technique in general turns out to be sexual
    reproduction with a small probability of mutation

                                                                                    12
Curve fitting with sexual reproduction
   Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f
   Your “genes” are a, b, c, d, e, and f
   Your “chromosome” is the array [a, b, c, d, e, f]
   What’s the best way to combine two chromosomes into
    one?
       You could choose the first half of one and the second half of
        the other: [a, b, c, d, e, f]
       You could choose genes randomly: [a, b, c, d, e, f]
       You could compute “gene averages:”
         [(a+a)/2, (b+b)/2, (c+c)/2, (d+d)/2, (e+e)/2,(f+f)/2]
            I suspect this last may be the best, though I don’t know of a good
             biological analogy for it

                                                                                  13
Directed evolution
   Notice that, in the previous examples, we formed the
    child organisms randomly
       We did not try to choose the “best” genes from each parent
       This is how natural (biological) evolution works
            Biological evolution is not directed—there is no “goal”
       Genetic algorithms use biology as inspiration, not as a set of
        rules to be slavishly followed
   For trying to get a word of all 1s, there is an obvious
    measure of a “good” gene
       But that’s mostly because it’s a silly example
       It’s much harder to detect a “good gene” in the curve-fitting
        problem, harder still in almost any “real use” of a genetic
        algorithm
                                                                         14
Probabilistic matching
   In previous examples, we chose the N “best” organisms
    as parents for the next generation
   A more common approach is to choose parents
    randomly, based on their measure of goodness
       Thus, an organism that is twice as “good” as another is likely
        to have twice as many offspring
   This has a couple of advantages:
       The best organisms will contribute the most to the next
        generation
       Since every organism has some chance of being a parent,
        there is somewhat less loss of genetic diversity


                                                                         15
Genetic programming
   A string of bits could represent a program
   If you want a program to do something, you might try to
    evolve one
   As a concrete example, suppose you want a program to
    help you choose stocks in the stock market
       There is a huge amount of data, going back many years
       What data has the most predictive value?
       What’s the best way to combine this data?
   A genetic program is possible in theory, but it might
    take millions of years to evolve into something useful
       How can we improve this?

                                                                16
Shrinking the search space
   There are just too many possible bit patterns!
       99.9999% of these don’t even represent valid programs
   An incredible improvement would result if we could
    somehow restrict the search space to only valid (even if
    nonsensical) programs
       We can do this!
   Programs, as you should know by now, can be
    represented as trees
       Internal nodes are operators: +, *, if, while, ...
       Leaves are values: 2.71818, "AAPL", ...


                                                                17
Programs as trees
   Given a program represented as a tree, we can mutate it
    by changing one of its operators (or one of its values),
    or by adding or removing nodes
   Given two trees, we can form a new tree by taking parts
    of its two parents
   The next big problem: How do we evaluate program
    trees that are (initially) nothing at all like what we
    want?
   I realize this is all very vague—I just wanted to give
    you the general idea

                                                               18
Concluding remarks
   Genetic algorithms are—
       Fun! They are enjoyable to program and to work with
            This is probably why they are a subject of active research
       Mind-bogglingly slow—you don’t want to use them if you
        have any alternatives
       Good for a very few types of problems
            Genetic algorithms can sometimes come up with a solution when you
             can see no other way of tackling the problem




                                                                                 19
The End




          20

More Related Content

Similar to 45 genetic-algorithms

GeneticAlgorithm
GeneticAlgorithmGeneticAlgorithm
GeneticAlgorithm
guestfbf1e1
 
Sheet1Student Name ________________________________10-coin trialO.docx
Sheet1Student Name  ________________________________10-coin trialO.docxSheet1Student Name  ________________________________10-coin trialO.docx
Sheet1Student Name ________________________________10-coin trialO.docx
bjohn46
 
Genetic algorithms
Genetic algorithmsGenetic algorithms
Genetic algorithms
guest9938738
 

Similar to 45 genetic-algorithms (20)

Lec 7 genetic algorithms
Lec 7 genetic algorithmsLec 7 genetic algorithms
Lec 7 genetic algorithms
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Artificial intelligence cs607 handouts lecture 11 - 45
Artificial intelligence   cs607 handouts lecture 11 - 45Artificial intelligence   cs607 handouts lecture 11 - 45
Artificial intelligence cs607 handouts lecture 11 - 45
 
Data Science - Part XIV - Genetic Algorithms
Data Science - Part XIV - Genetic AlgorithmsData Science - Part XIV - Genetic Algorithms
Data Science - Part XIV - Genetic Algorithms
 
GeneticAlgorithm
GeneticAlgorithmGeneticAlgorithm
GeneticAlgorithm
 
A Review On Genetic Algorithm And Its Applications
A Review On Genetic Algorithm And Its ApplicationsA Review On Genetic Algorithm And Its Applications
A Review On Genetic Algorithm And Its Applications
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic Algorithm
 
Gadoc
GadocGadoc
Gadoc
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic Algorithm
 
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error AnalysisRsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
 
Machine Learning - Genetic Algorithm Fundamental
Machine Learning - Genetic Algorithm FundamentalMachine Learning - Genetic Algorithm Fundamental
Machine Learning - Genetic Algorithm Fundamental
 
A Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles
A Genetic Algorithm-Based Solver for Very Large Jigsaw PuzzlesA Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles
A Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles
 
An introduction to Prolog language slide
An introduction to Prolog language slideAn introduction to Prolog language slide
An introduction to Prolog language slide
 
Lec 06
Lec 06Lec 06
Lec 06
 
Introduction to Genetic algorithm
Introduction to Genetic algorithmIntroduction to Genetic algorithm
Introduction to Genetic algorithm
 
Prolog basics
Prolog basicsProlog basics
Prolog basics
 
Sheet1Student Name ________________________________10-coin trialO.docx
Sheet1Student Name  ________________________________10-coin trialO.docxSheet1Student Name  ________________________________10-coin trialO.docx
Sheet1Student Name ________________________________10-coin trialO.docx
 
Genetic algorithms
Genetic algorithmsGenetic algorithms
Genetic algorithms
 
Genetic Agorithm
Genetic AgorithmGenetic Agorithm
Genetic Agorithm
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 

Recently uploaded

Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
fonyou31
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 

Recently uploaded (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 

45 genetic-algorithms

  • 1. Genetic Algorithms Jul 18, 2012
  • 2. Evolution  Here’s a very oversimplified description of how evolution works in biology  Organisms (animals or plants) produce a number of offspring which are almost, but not entirely, like themselves  Variation may be due to mutation (random changes)  Variation may be due to sexual reproduction (offspring have some characteristics from each parent)  Some of these offspring may survive to produce offspring of their own—some won’t  The “better adapted” offspring are more likely to survive  Over time, later generations become better and better adapted  Genetic algorithms use this same process to “evolve” better programs 2
  • 3. Genotypes and phenotypes  Genes are the basic “instructions” for building an organism  A chromosome is a sequence of genes  Biologists distinguish between an organism’s genotype (the genes and chromosomes) and its phenotype (what the organism actually is like)  Example: You might have genes to be tall, but never grow to be tall for other reasons (such as poor diet)  Similarly, “genes” may describe a possible solution to a problem, without actually being the solution 3
  • 4. The basic genetic algorithm  Start with a large “population” of randomly generated “attempted solutions” to a problem  Repeatedly do the following:  Evaluate each of the attempted solutions  Keep a subset of these solutions (the “best” ones)  Use these solutions to generate a new population  Quit when you have a satisfactory solution (or you run out of time) 4
  • 5. A really simple example  Suppose your “organisms” are 32-bit computer words  You want a string in which all the bits are ones  Here’s how you can do it:  Create 100 randomly generated computer words  Repeatedly do the following:  Count the 1 bits in each word  Exit if any of the words have all 32 bits set to 1  Keep the ten words that have the most 1s (discard the rest)  From each word, generate 9 new words as follows:  Pick a random bit in the word and toggle (change) it  Note that this procedure does not guarantee that the next “generation” will have more 1 bits, but it’s likely 5
  • 6. A more realistic example, part I  Suppose you have a large number of (x, y) data points  For example, (1.0, 4.1), (3.1, 9.5), (-5.2, 8.6), ...  You would like to fit a polynomial (of up to degree 5) through these data points  That is, you want a formula y = ax5 + bx4 + cx3 + dx2 +ex + f that gives you a reasonably good fit to the actual data  Here’s the usual way to compute goodness of fit:  Compute the sum of (actual y – predicted y)2 for all the data points  The lowest sum represents the best fit  There are some standard curve fitting techniques, but let’s assume you don’t know about them  You can use a genetic algorithm to find a “pretty good” solution 6
  • 7. A more realistic example, part II  Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f  Your “genes” are a, b, c, d, e, and f  Your “chromosome” is the array [a, b, c, d, e, f]  Your evaluation function for one array is:  For every actual data point (x, y), (I’m using red to mean “actual data”)  Compute ý = ax5 + bx4 + cx3 + dx2 +ex + f  Find the sum of (y – ý)2 over all x  The sum is your measure of “badness” (larger numbers are worse)  Example: For [0, 0, 0, 2, 3, 5] and the data points (1, 12) and (2, 22):  ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 2 + 3 + 5 = 10 when x is 1  ý = 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 8 + 6 + 5 = 19 when x is 2  (12 – 10)2 + (22 – 19)2 = 22 + 32 = 13  If these are the only two data points, the “badness” of [0, 0, 0, 2, 3, 5] is 13 7
  • 8. A more realistic example, part III  Your algorithm might be as follows:  Create 100 six-element arrays of random numbers  Repeat 500 times (or any other number):  For each of the 100 arrays, compute its badness (using all data points)  Keep the ten best arrays (discard the other 90)  From each array you keep, generate nine new arrays as follows:  Pick a random element of the six  Pick a random floating-point number between 0.0 and 2.0  Multiply the random element of the array by the random floating-point number  After all 500 trials, pick the best array as your final answer 8
  • 9. Asexual vs. sexual reproduction  In the examples so far,  Each “organism” (or “solution”) had only one parent  Reproduction was asexual (without sex)  The only way to introduce variation was through mutation (random changes)  In sexual reproduction,  Each “organism” (or “solution”) has two parents  Assuming that each organism has just one chromosome, new offspring are produced by forming a new chromosome from parts of the chromosomes of each parent 9
  • 10. The really simple example again  Suppose your “organisms” are 32-bit computer words, and you want a string in which all the bits are ones  Here’s how you can do it:  Create 100 randomly generated computer words  Repeatedly do the following:  Count the 1 bits in each word  Exit if any of the words have all 32 bits set to 1  Keep the ten words that have the most 1s (discard the rest)  From each word, generate 9 new words as follows:  Choose one of the other words  Take the first half of this word and combine it with the second half of the other word 10
  • 11. The example continued  Half from one, half from the other: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 0110 1001 0100 1110 1011 0100 1010 0101  Or we might choose “genes” (bits) randomly: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 0100 0101 0100 1010 1010 1100 1011 0101  Or we might consider a “gene” to be a larger unit: 0110 1001 0100 1110 1010 1101 1011 0101 1101 0100 0101 1010 1011 0100 1010 0101 1101 1001 0101 1010 1010 1101 1010 0101 11
  • 12. Comparison of simple examples  In the simple example (trying to get all 1s):  The sexual (two-parent, no mutation) approach, if it succeeds, is likely to succeed much faster  Because up to half of the bits change each time, not just one bit  However, with no mutation, it may not succeed at all  By pure bad luck, maybe none of the first (randomly generated) words have (say) bit 17 set to 1  Then there is no way a 1 could ever occur in this position  Another problem is lack of genetic diversity  Maybe some of the first generation did have bit 17 set to 1, but none of them were selected for the second generation  The best technique in general turns out to be sexual reproduction with a small probability of mutation 12
  • 13. Curve fitting with sexual reproduction  Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f  Your “genes” are a, b, c, d, e, and f  Your “chromosome” is the array [a, b, c, d, e, f]  What’s the best way to combine two chromosomes into one?  You could choose the first half of one and the second half of the other: [a, b, c, d, e, f]  You could choose genes randomly: [a, b, c, d, e, f]  You could compute “gene averages:” [(a+a)/2, (b+b)/2, (c+c)/2, (d+d)/2, (e+e)/2,(f+f)/2]  I suspect this last may be the best, though I don’t know of a good biological analogy for it 13
  • 14. Directed evolution  Notice that, in the previous examples, we formed the child organisms randomly  We did not try to choose the “best” genes from each parent  This is how natural (biological) evolution works  Biological evolution is not directed—there is no “goal”  Genetic algorithms use biology as inspiration, not as a set of rules to be slavishly followed  For trying to get a word of all 1s, there is an obvious measure of a “good” gene  But that’s mostly because it’s a silly example  It’s much harder to detect a “good gene” in the curve-fitting problem, harder still in almost any “real use” of a genetic algorithm 14
  • 15. Probabilistic matching  In previous examples, we chose the N “best” organisms as parents for the next generation  A more common approach is to choose parents randomly, based on their measure of goodness  Thus, an organism that is twice as “good” as another is likely to have twice as many offspring  This has a couple of advantages:  The best organisms will contribute the most to the next generation  Since every organism has some chance of being a parent, there is somewhat less loss of genetic diversity 15
  • 16. Genetic programming  A string of bits could represent a program  If you want a program to do something, you might try to evolve one  As a concrete example, suppose you want a program to help you choose stocks in the stock market  There is a huge amount of data, going back many years  What data has the most predictive value?  What’s the best way to combine this data?  A genetic program is possible in theory, but it might take millions of years to evolve into something useful  How can we improve this? 16
  • 17. Shrinking the search space  There are just too many possible bit patterns!  99.9999% of these don’t even represent valid programs  An incredible improvement would result if we could somehow restrict the search space to only valid (even if nonsensical) programs  We can do this!  Programs, as you should know by now, can be represented as trees  Internal nodes are operators: +, *, if, while, ...  Leaves are values: 2.71818, "AAPL", ... 17
  • 18. Programs as trees  Given a program represented as a tree, we can mutate it by changing one of its operators (or one of its values), or by adding or removing nodes  Given two trees, we can form a new tree by taking parts of its two parents  The next big problem: How do we evaluate program trees that are (initially) nothing at all like what we want?  I realize this is all very vague—I just wanted to give you the general idea 18
  • 19. Concluding remarks  Genetic algorithms are—  Fun! They are enjoyable to program and to work with  This is probably why they are a subject of active research  Mind-bogglingly slow—you don’t want to use them if you have any alternatives  Good for a very few types of problems  Genetic algorithms can sometimes come up with a solution when you can see no other way of tackling the problem 19
  • 20. The End 20