Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques

    Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques - Presentation Transcript

    1. Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques
       Xavier Llorà (1,2), Kumara Sastry (2), Tian-Li Yu (3), David E. Goldberg (2)
       (1) National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
       (2) Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign
       (3) Department of Electrical Engineering, National Taiwan University
       Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199
       GECCO 2007 HUMIES
    2. Motivation
       • Competent GBML
         – Use competent GAs to approach GBML problems
         – Take advantage of competent GA scalability
         – Provide insight about problem structure
         – χeCCS by Llorà, Sastry, Goldberg & de la Ossa (2006)
       • Rule matching may threaten practical applications
         – Even for low-dimensional problems (MUX 20, the 20-input multiplexer), rule matching may take more than 85% of the execution time in XCS
         – As the dimensionality or cardinality of the training set increases, rule matching dominates the overall execution time
         – Efficient implementations (Llorà & Sastry, 2006) still require matching rules
       GECCO 2007 Llorà, Sastry, Yu & Goldberg
    3. Motivation
       • Competent GAs
         – Byproduct: models and problem structure insight
         – Revisit fitness relaxation for expensive fitness evaluations
         – Idea: build a cheap surrogate fitness that is accurate enough
         – Successfully applied to GAs (Sastry, Lima & Goldberg, 2006)
         – Helps cut down the number of fitness evaluations
       • GBML
         – Can we transfer the same ideas to GBML approaches?
         – What does competent GBML need in order to benefit from fitness relaxation?
    4. Outline
       • Overview of χeCCS
       • Evaluation relaxation
       • Fitness inheritance using least squares fitting
       • Fitness inheritance and χeCCS
       • Results
       • Conclusions
    5. χ-ary Extended Compact Classifier System
       • No reinforcement learning is used
       • A competent GA is in charge of the learning
       • The idea:
         – A population of single rules
         – For each rule we compute its fitness
         – The χ-ary extended compact genetic algorithm
         – Niching to maintain different accurate rules (restricted tournament replacement)
    6. Maximally Accurate and General Rules
       • Accuracy and generality can be computed as
         $\alpha(r) = \frac{n_{t+}(r) + n_{t-}(r)}{n_t}, \qquad \varepsilon(r) = \frac{n_{t+}(r)}{n_m}$
       • Fitness should combine accuracy and generality
         $f(r) = \alpha(r) \cdot \varepsilon(r)^{\gamma}$
       • Such a measure can be applied either to rules or to rule sets
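The fitness on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a rule predicts class 1 for every instance it matches and class 0 otherwise, that n_t is the training set size, n_m the number of matched instances, n_t+ the correctly classified positives and n_t- the correctly classified negatives. The names `matches` and `rule_fitness` and the default `gamma=1.0` are hypothetical.

```python
def matches(rule, instance):
    """A ternary rule matches when every non-'#' position equals the input bit."""
    return all(r == "#" or r == x for r, x in zip(rule, instance))


def rule_fitness(rule, data, gamma=1.0):
    """Return (alpha, epsilon, fitness) for one rule over labelled data.

    Assumption: the rule predicts class 1 when it matches, class 0 otherwise.
    """
    n_t = len(data)            # size of the training set
    n_m = n_tp = n_tn = 0      # matched, true positives, true negatives
    for instance, label in data:
        if matches(rule, instance):
            n_m += 1
            if label == 1:
                n_tp += 1
        elif label == 0:
            n_tn += 1
    alpha = (n_tp + n_tn) / n_t
    epsilon = n_tp / n_m if n_m else 0.0
    return alpha, epsilon, alpha * epsilon ** gamma


# Toy data: the label is x0 XOR x1, and x2 is irrelevant.
data = [(f"{a}{b}{c}", a ^ b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
print(rule_fitness("01#", data))   # accurate and general rule
print(rule_fitness("0##", data))   # overgeneral rule
```

Note that every call scans the whole training set and matches each instance, which is the cost the talk sets out to avoid.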
    7. Maximally Accurate and General Rules
    8. Extended Compact Genetic Algorithm
       • A probabilistic model-building GA (Harik, 1999)
         – Builds models of good solutions as linkage groups
       • Key idea:
         – Good probability distribution → linkage learning
       • Key components:
         – Representation: marginal product model (MPM)
           • Marginal distribution of a gene partition
         – Quality: minimum description length (MDL)
           • Occam's razor principle
           • All things being equal, simpler models are better
         – Search method: greedy heuristic search
    9. Marginal Product Model (MPM)
       • Partition variables into disjoint sets
       • Product of marginal distributions on a partition of genes
       • Gene partition maps to linkage groups
       • MPM: [1, 2, 3], [4, 5, 6], …, [l-2, l-1, l]
         – Variables: x1 x2 x3 | x4 x5 x6 | … | xl-2 xl-1 xl
         – Marginals of one group: {p000, p001, p00#, p010, p011, p01#, p100, p101, p10#, p110, p111, p11#, …} (27 probabilities)
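A small sketch of how such a model could be estimated from a population of ternary rules. The function names and the 0-based gene indices are assumptions made for illustration. Each linkage group gets its own marginal distribution, and the probability of an individual is the product of its group marginals.

```python
from collections import Counter


def mpm_marginals(population, partition):
    """Estimate one marginal distribution per linkage group (0-based gene indices)."""
    n = len(population)
    marginals = []
    for group in partition:
        counts = Counter("".join(ind[i] for i in group) for ind in population)
        marginals.append({pattern: c / n for pattern, c in counts.items()})
    return marginals


def mpm_probability(individual, partition, marginals):
    """Probability of an individual under the MPM: product of its group marginals."""
    p = 1.0
    for group, dist in zip(partition, marginals):
        p *= dist.get("".join(individual[i] for i in group), 0.0)
    return p


population = ["01#", "01#", "10#", "100"]
partition = [[0, 1], [2]]          # genes 0 and 1 are linked, gene 2 is independent
marginals = mpm_marginals(population, partition)
print(marginals)
print(mpm_probability("010", partition, marginals))
```

Patterns that never occur in the population get probability 0 here. A group of k ternary genes has at most 3^k patterns, which is where the 27 probabilities per three-gene group come from.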
    10. Minimum Description Length Metric
       • Hypothesis: for an optimal model
         – Model size and error are minimal
       • Model complexity, Cm
         – Number of bits required to store all marginal probabilities
       • Compressed population complexity, Cp
         – Entropy of the marginal distribution over all partitions
       • MDL metric, Cc = Cm + Cp
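The slide gives only verbal definitions of Cm and Cp. The sketch below assumes the form commonly used for the extended compact GA: Cm = log2(n + 1) · Σ (χ^k_i − 1) and Cp = n · Σ H(M_i), where n is the population size, k_i the size of group i, χ the alphabet cardinality (3 for {0, 1, #}) and H the entropy of a group's marginal distribution. The function names are hypothetical.

```python
import math
from collections import Counter


def model_complexity(partition, n, chi=3):
    """Cm: bits to store the marginal probabilities of every group (assumed eCGA form)."""
    return math.log2(n + 1) * sum(chi ** len(group) - 1 for group in partition)


def compressed_population_complexity(population, partition):
    """Cp: population size times the summed entropy of the group marginals."""
    n = len(population)
    total_entropy = 0.0
    for group in partition:
        counts = Counter("".join(ind[i] for i in group) for ind in population)
        total_entropy += -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * total_entropy


def mdl(population, partition, chi=3):
    """Cc = Cm + Cp for a candidate partition."""
    n = len(population)
    return model_complexity(partition, n, chi) + compressed_population_complexity(population, partition)


population = ["00#", "11#", "00#", "11#"]
print(mdl(population, [[0], [1], [2]]))
print(mdl(population, [[0, 1], [2]]))
```

Merging two groups raises Cm, because more probabilities must be stored, and lowers Cp only when the merged genes are statistically dependent. That trade-off is what the metric scores.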
    11. Building an Optimal MPM
       1) Assume independent genes ([1], [2], …, [l])
       2) Compute the MDL metric, Cc
       3) Generate all combinations of two-subset merges, e.g. {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), …, ([1],[2],…,[l-1,l])}
       4) Compute the MDL metric for all model candidates
       5) Select the candidate with minimum MDL
       6) If that minimum is lower than the current Cc, accept the model and go to step 2
       7) Else, the current model is optimal
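The greedy search on this slide can be sketched as follows. This is an illustration under assumptions, not the authors' code: the `mdl` helper uses the commonly cited eCGA form of the metric with χ = 3, gene indices are 0-based, and the toy population is invented so that genes 0 and 1 are perfectly correlated.

```python
import math
from collections import Counter
from itertools import combinations


def mdl(population, partition, chi=3):
    """Cc = Cm + Cp (assumed eCGA form of the metric)."""
    n = len(population)
    c_m = math.log2(n + 1) * sum(chi ** len(g) - 1 for g in partition)
    c_p = 0.0
    for g in partition:
        counts = Counter(tuple(ind[i] for i in g) for ind in population)
        c_p -= n * sum((c / n) * math.log2(c / n) for c in counts.values())
    return c_m + c_p


def greedy_mpm(population, chi=3):
    """Greedy merge search: start from independent genes, merge while MDL improves."""
    length = len(population[0])
    partition = [[i] for i in range(length)]          # independent genes
    best = mdl(population, partition, chi)            # score of the current model
    while len(partition) > 1:
        candidates = []
        for a, b in combinations(range(len(partition)), 2):   # every pairwise merge
            merged = sorted(partition[a] + partition[b])
            rest = [g for k, g in enumerate(partition) if k not in (a, b)]
            candidate = rest + [merged]
            candidates.append((mdl(population, candidate, chi), candidate))
        score, candidate = min(candidates, key=lambda t: t[0])
        if score < best:                              # accept and repeat
            partition, best = candidate, score
        else:                                         # no merge helps: stop
            break
    return sorted(partition), best


# Genes 0 and 1 always agree; gene 2 is independent of both.
population = ["000", "001", "110", "111"] * 8
model, score = greedy_mpm(population)
print(model, score)
```

Each pass evaluates on the order of m² candidate merges for m current groups, so the search is cheap for the rule lengths discussed in the talk.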
    12. χeCCS Models for Different Multiplexers (figure; annotation: building block size increases)
    13. Fitness Inheritance using Least Squares
       • Proposed by Sastry, Lima & Goldberg (2006)
       • The surrogate is a regression whose basis is identified by the BBs
       • A simple example: [1,3] [2] [4]
       • The schemas represented are
         – {0*0*, 0*1*, 0*#*, 1*0*, 1*1*, 1*#*, #*0*, #*1*, #*#*, *0**, *1**, *#**, ***0, ***1, ***#}
       • Recode the individuals by [equation not preserved in the transcript]
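The schema list for the example model can be generated mechanically. The sketch below writes the model [1,3] [2] [4] with 0-based positions; the function name `schemas_for_model` is a hypothetical helper, and `*` marks positions outside the linkage group.

```python
from itertools import product


def schemas_for_model(partition, length, alphabet="01#"):
    """List every schema induced by an MPM partition (0-based positions)."""
    schemas = []
    for group in partition:
        for values in product(alphabet, repeat=len(group)):
            schema = ["*"] * length
            for pos, val in zip(group, values):
                schema[pos] = val
            schemas.append("".join(schema))
    return schemas


# The slide's model [1,3] [2] [4], written with 0-based positions.
schemas = schemas_for_model([[0, 2], [1], [3]], length=4)
print(len(schemas))
print(schemas)
```

A group of k ternary genes contributes 3^k schemas, so this model yields 9 + 3 + 3 = 15 of them, matching the list on the slide.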
    14. Fitness Inheritance using Least Squares
       • Recoding defines matrix A
       • Normalize the fitness
    15. Fitness Inheritance using Least Squares
       • Solve using least squares
       • Once solved, the fitness surrogate takes the following form [equation not preserved in the transcript]
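Slides 13 to 15 can be sketched end to end. The equations did not survive in the transcript, so everything specific here is an assumption: each row of A marks which schemas an individual contains, the fitness is normalized by subtracting the population mean, and the system is solved through ridge-regularized normal equations (a small `ridge` term stands in for a pseudo-inverse, because the indicator columns are linearly dependent). All function names and the toy fitness are hypothetical.

```python
from itertools import product


def schemas_for_model(partition, length, alphabet="01#"):
    """Schemas induced by a partition; '*' marks positions outside the group."""
    out = []
    for group in partition:
        for values in product(alphabet, repeat=len(group)):
            s = ["*"] * length
            for pos, val in zip(group, values):
                s[pos] = val
            out.append("".join(s))
    return out


def contains(individual, schema):
    return all(s == "*" or s == x for s, x in zip(schema, individual))


def solve(M, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_surrogate(individuals, fitnesses, partition, ridge=1e-6):
    """Fit schema coefficients and return a surrogate that needs no rule matching."""
    schemas = schemas_for_model(partition, len(individuals[0]))
    A = [[1.0 if contains(ind, s) else 0.0 for s in schemas] for ind in individuals]
    mean_f = sum(fitnesses) / len(fitnesses)
    y = [f - mean_f for f in fitnesses]               # assumed normalization
    k = len(schemas)
    AtA = [[sum(row[i] * row[j] for row in A) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(k)]
    coeffs = solve(AtA, Aty)

    def surrogate(ind):
        return mean_f + sum(c for c, s in zip(coeffs, schemas) if contains(ind, s))
    return surrogate


def toy_fitness(ind):
    """Invented fitness with an XOR-like interaction between genes 0 and 1."""
    return (1.0 if ind[:2] in ("01", "10") else 0.0) + (0.5 if ind[2] == "#" else 0.0)


individuals = ["".join(t) for t in product("01#", repeat=3)]
fitnesses = [toy_fitness(i) for i in individuals]
good = fit_surrogate(individuals, fitnesses, [[0, 1], [2]])    # basis covers the interaction
weak = fit_surrogate(individuals, fitnesses, [[0], [1], [2]])  # basis misses it
print(toy_fitness("01#"), good("01#"), weak("01#"))
```

With the linkage group [0, 1] in the model, the surrogate reproduces the toy fitness closely. With independent genes it cannot express the interaction, which is a small-scale version of the missing-basis problem reported for the 6-input multiplexer.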
    16. Fitness Inheritance and χeCCS
       • Two different problems
         – Hidden XOR
         – 6-input multiplexer
    17. Hidden XOR
       • Evolved rules and model
       • Surrogate accuracy
    18. 6-input Multiplexer
       • The evolved solution and model
       • The surrogate is totally off
    19. 6-input Multiplexer
       • The key = missing basis
       • χeCCS is able to solve the problem quickly, reliably, and accurately
       • However, the model's bases are not accurate enough to build a proper surrogate
    20. Overlapping BBs using DSMGA
       • Proposed by Yu, Yassine, Goldberg and Chen (2003)
       • Based on organizational theory
       • Main property = the DSMGA model builder (DSMcluster) deals with overlapping building blocks
       • The main issue = translating a population of rules into a dependency structure matrix (DSM)
       • The intuition = specific bits are the ones responsible for the kind of linkage we seek
    21. Jumping to the Results
       • DSMcluster model for the hidden XOR
         – [i0 i1 i2] [i3] [i4] [i5]
       • DSMcluster model for the 6-input multiplexer
         – [i0 i1] <i2 i3 i4 i5>
         – It identifies a BB [i0 i1] of variables interacting with a bus <i2 i3 i4 i5>
         – Translated into χeCCS language: [i0 i1 i2] [i0 i1 i3] [i0 i1 i4] [i0 i1 i5]
         – The right model, which provides the right set of bases
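The translation from the DSMcluster model to overlapping χeCCS groups is mechanical: the shared building block is paired with each bus variable in turn. The sketch below also includes a 6-input multiplexer, assuming i0 and i1 are the address bits with i0 the most significant; that ordering and both function names are assumptions.

```python
def expand_bus_model(building_block, bus):
    """Pair the shared building block with each bus variable (overlapping groups)."""
    return [building_block + [b] for b in bus]


def mux6(bits):
    """6-input multiplexer: two address bits select one of four data bits.

    Assumption: bits[0] and bits[1] are the address, bits[0] most significant.
    """
    address = 2 * bits[0] + bits[1]
    return bits[2 + address]


groups = expand_bus_model(["i0", "i1"], ["i2", "i3", "i4", "i5"])
print(groups)
print(mux6([0, 1, 0, 1, 0, 0]))   # address 01 selects bits[3]
```

Every group contains the address bits, so the groups overlap. A marginal product model cannot express that, because its partitions are disjoint.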
    22. Conclusions
       • The matching process is crucial and expensive
       • Efficient implementations can only take us so far
       • Relaxation can remove the need for matching
       • For some types of problems, overlapping BBs are required
       • DSMGA provides the machinery to identify the proper basis for such a surrogate

    CC Attribution-NonCommercial-ShareAlike License
