Population Sizing for Genetic Programming Based Upon Decision Making

    Population Sizing for Genetic Programming Based Upon Decision Making - Presentation Transcript

    1. Population Sizing in Genetic Programming: Building-Block-Wise Decision Making. Kumara Sastry (1), Una-May O’Reilly (2), David E. Goldberg (1). (1) Illinois Genetic Algorithms Laboratory (IlliGAL), Department of General Engineering, University of Illinois at Urbana-Champaign. (2) Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. ksastry@uiuc.edu, unamay@csail.mit.edu, deg@uiuc.edu. Supported by AFOSR F49620-03-1-0129, NSF/DMR at MCC DMR-99-76550, NSF/ITR at CPSD DMR-0121695, DOE at FS-MRL DEFG02-91ER45439.
    2. How do we size populations? It is a tradeoff: Expense = #runs × popsize × #generations. We need to understand population sizing for three reasons. Competency: if the problem is not the population size, it must be something else. Scalability: how does the required population size scale with problem size? Efficiency: how should the tradeoff be chosen?
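The expense relation on this slide can be made concrete with a small sketch. The function name and the numbers below are illustrative assumptions, not values from the presentation; the point is only that the same evaluation budget can be spent on a small population for many generations or a large population for few.

```python
def total_expense(num_runs: int, pop_size: int, num_generations: int) -> int:
    """Fitness evaluations spent: #runs x popsize x #generations."""
    return num_runs * pop_size * num_generations


# Two hypothetical allocations of the same budget (illustrative numbers only).
small_pop = total_expense(num_runs=10, pop_size=500, num_generations=100)
large_pop = total_expense(num_runs=10, pop_size=5000, num_generations=10)

print(small_pop, large_pop)
```

Which allocation actually solves the problem is what a population-sizing model is meant to answer.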
    3. Decision-Making-Based Popsize: The Quick Answer. The required population size increases when: collateral building block (BB) noise increases; alphabet size increases; building block size increases; signal-to-noise ratio decreases; error tolerance decreases; problem size increases; tree size decreases; bloat increases.
    4. Outline: background; GP building blocks and competitions; GA decision-making population-sizing model; GP decision-making model; empirical validation; using this!?
    5. Design Decomposition & Modeling in the Middle [Goldberg (1991); Goldberg, Deb, & Clark (1992); Goldberg (2002)]. (1) Know what GAs are processing: building blocks. (2) Know thy BB challengers: BB-wise difficult problems. (3) Ensure an adequate supply of raw BBs. (4) Ensure increased market share for superior BBs. (5) Know BB takeover and convergence time. (6) Make decisions well among competing BBs. (7) Mix BBs well. [Figure: modeling spectrum running from low cost/high error to high cost/low error: unarticulated wisdom, articulated qualitative models, dimensional models, facetwise models, equations of motion.]
    6. A Building Block Supply Model. How large must the population be to ensure that *EVERY* raw building block appears at least once, within some error? Following intuition, our model indicated that the bigger the primitive set, the greater the required population size, and the bigger the trees, the smaller the required population size [Sastry, O’Reilly, Goldberg, & Hill, 2003].
    7. What else is critical? Decision making. The initial population contains an unbiased sample of BBs. Ensure the success of high-quality BBs: competition between the best and second-best BB in a given partition, from the first generation onward. In GAs, decision making usually bounds BB supply. The model does not factor in crossover or growth; it is a facetwise analysis.
    8. GP Building Blocks. Consider a symbolic regression problem with F = {+, -, *} and T = {w, x, y, z}. Alphabet cardinality: χ = χ_F + χ_T. A typical tree has sub-solutions, i.e. BBs. [Figure: example tree over these primitives with three highlighted BBs: k = 1 (N_F = 0, N_T = 1), k = 2 (N_F = 1, N_T = 1), and k = 3 (N_F = 1, N_T = 2).]
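As a worked instance of the alphabet cardinality on this slide, the sketch below counts the primitives of the symbolic regression example. The helper `competitors_in_partition` rests on an assumption borrowed from the GA setting, where a partition of k positions over an alphabet of size χ holds χ^k competing schemata; whether that count carries over unchanged to GP trees is not stated in the transcript.

```python
functions = {"+", "-", "*"}      # F from the slide
terminals = {"w", "x", "y", "z"}  # T from the slide

chi_f = len(functions)
chi_t = len(terminals)
chi = chi_f + chi_t  # alphabet cardinality


def competitors_in_partition(alphabet_size: int, k: int) -> int:
    """GA-style count of competing schemata of size k (assumed to apply here)."""
    return alphabet_size ** k


print(chi, [competitors_in_partition(chi, k) for k in (1, 2, 3)])
```

The rapid growth with k is one reason the later slides advise keeping k small.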
    9. BB Competition: size of a competition. [Figure: full binary tree of function (F) and terminal (T) nodes.] For a full binary tree of size λ and a competition of size k, there are Φ BBs competing in the tree.
    10. GA Population Sizing: Decision-Making Model. Make the correct decision between competing sub-solutions. Decision making is statistical in nature, since fitness is a measure of the entire chromosome [De Jong (1975); Goldberg (1989); Goldberg & Rudnick (1992); Goldberg, Deb, & Clark (1992); Harik, Cantú-Paz, Goldberg, & Miller (1997)]. Goldberg, Deb, & Clark, Complex Systems, 6, 1992.
    11. Improving the Likelihood of a Correct Decision. Increasing the number of trials decreases the variance.
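The claim that more trials reduce variance can be checked with a small simulation. This is a sketch under the assumption that each trial is an independent Gaussian fitness sample with variance σ²; the function name and parameters are illustrative and not taken from the presentation. Under that assumption the variance of the mean of τ trials should be close to σ²/τ.

```python
import random
import statistics


def variance_of_mean(tau: int, sigma: float = 1.0,
                     repeats: int = 2000, seed: int = 0) -> float:
    """Empirical variance of the average of `tau` noisy trials."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(tau))
        for _ in range(repeats)
    ]
    return statistics.pvariance(means)


for tau in (1, 4, 16):
    # Expect values near sigma**2 / tau, up to sampling error.
    print(tau, round(variance_of_mean(tau), 3))
```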
    12. GA Decision Making. Step 1: Ease of decision making: the signal-to-noise ratio. Step 2: Probability of correct decision making: P_dm is the cumulative distribution function of a unit normal variate evaluated at the signal-to-noise ratio.
    13. GA Decision Making. Step 3: Probability of making an error on a single trial: the ordinate of a unit, one-sided normal variate. Step 4: GA BB variance and collateral noise.
    14. GA Decision Making. Step 5: Increase the number of trials τ to decrease the variance.
    15. Population Sizing: GA Decision Making. Step 5: Rearrange and examine… The terms of the resulting expression are: number of sub-components, probabilistic safety factor, component complexity, and ease of decision making.
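The equations on the GA decision-making slides did not survive in the transcript. The sketch below assumes the commonly cited form of the Goldberg, Deb, & Clark (1992) model, n = 2 c χ^k m' σ²_BB / d², with c the square of the one-sided unit normal ordinate for error α and m' = m − 1. Treat the formula, the function names, and the parameter names as assumptions rather than a reproduction of the slides. Its four factors line up with the four labels listed above.

```python
import math
from statistics import NormalDist


def prob_correct_decision(d: float, sigma_bb_sq: float, m: int,
                          trials: float = 1.0) -> float:
    """P(correct decision) between best and second-best BB (assumed model)."""
    # Collateral noise from the other m - 1 BBs, averaged over `trials` samples.
    noise_var = 2.0 * (m - 1) * sigma_bb_sq / trials
    return NormalDist().cdf(d / math.sqrt(noise_var))


def ga_population_size(alpha: float, chi: int, k: int, m: int,
                       sigma_bb_sq: float, d: float) -> float:
    """Assumed form: n = 2 * c * chi**k * (m - 1) * sigma_bb_sq / d**2."""
    z = NormalDist().inv_cdf(1.0 - alpha)  # one-sided normal ordinate
    c = z * z                              # probabilistic safety factor
    return 2.0 * c * (chi ** k) * (m - 1) * sigma_bb_sq / (d * d)


# Illustrative values only: binary alphabet, 10 BBs of size 4, alpha = 1/m.
n = ga_population_size(alpha=0.1, chi=2, k=4, m=10, sigma_bb_sq=1.0, d=1.0)
print(math.ceil(n))
```

Under these assumptions, giving each of the χ^k competitors n / χ^k trials brings the decision probability to exactly 1 − α, which is the rearrangement the slide refers to.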
    16. GP Decision Making. Step 1: Ease of decision making: the signal-to-noise ratio. Step 2: Probability of correct decision making: P_dm is the cumulative distribution function of a unit normal variate evaluated at the signal-to-noise ratio.
    17. GP Decision Making. Step 3: Probability of making an error on a single trial: the ordinate of a unit, one-sided normal variate. Step 4: GA BB variance and collateral noise.
    18. GP Decision Making. Step 3: GP BB variance and collateral noise.
    19. GP Decision Making. Step 3: Increase the number of trials τ to decrease the variance (shown for the case of binary trees).
    20. GP Decision Making. Step 4: Rearrange and examine… The terms of the resulting expression are: component complexity, probabilistic safety factor, ease of decision making, and sub-component quantity.
    21. Determining the Best Population Size. Use a bisection method: search in an interval that is shortened from the appropriate end until the interval width is 1. Determine the minimum population size required such that m-1 BBs converge to the correct value.
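The bisection procedure described on this slide can be sketched as follows. The predicate `succeeds` is hypothetical: in practice it would run GP at population size n and report whether m-1 BBs converged to the correct value. The initial doubling phase that finds an upper bracket is an added assumption, since the slide only describes shrinking the interval, and the sketch also assumes success is monotone in n.

```python
from typing import Callable


def min_population_size(succeeds: Callable[[int], bool],
                        low: int = 1, high: int = 2) -> int:
    """Smallest n > low with succeeds(n), assuming low itself is insufficient."""
    # Phase 1 (assumption): double until a successful upper bound is found.
    while not succeeds(high):
        low, high = high, high * 2
    # Phase 2: shorten the interval from the appropriate end until width is 1.
    while high - low > 1:
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid  # mid is enough, so the minimum is at or below mid
        else:
            low = mid   # mid is too small, so the minimum is above mid
    return high


# Stand-in predicate for illustration: pretend 37 is the true threshold.
print(min_population_size(lambda n: n >= 37))
```

With a stochastic predicate, each call would typically aggregate several independent runs before returning a verdict.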
    22. Run Parameters. Error tolerance: α = 1/m. Ramped half-and-half or full trees. Trees grow up to 1024 nodes. Tournament selection, size 4. Crossover 100%, elite 5%. 50 independent runs to determine convergence accuracy. Results for population size and convergence time were taken over 30 bisection runs. Results for function evaluations were averaged over 1500 runs.
    23. ORDER: Empirical Validation. Expression mechanism for the primitive set: parse the program tree in order; the primitive first encountered is expressed (parsed primitives versus expressed primitives). Fitness: number of expressed primitives X_i. Similar to OneMax in GAs. A BB is expressed at most once.
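The expression rule on this slide can be illustrated with a short sketch. The primitive set itself did not survive in the transcript, so the code assumes the usual ORDER definition: a binary JOIN function and, for each index i, a leaf X_i and its complement, written here as `"X1"` and `"N1"`. The tuple-based tree encoding and the function names are illustrative choices.

```python
from typing import List, Tuple, Union

# A tree is either a leaf name such as "X1"/"N1" or ("JOIN", left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def leaves_inorder(tree: Tree) -> List[str]:
    """Leaves in the order an inorder parse visits them."""
    if isinstance(tree, str):
        return [tree]
    _, left, right = tree
    return leaves_inorder(left) + leaves_inorder(right)


def expressed_primitives(tree: Tree) -> List[str]:
    """For each index i, only the first of X_i / N_i encountered is expressed."""
    seen = set()
    expressed = []
    for leaf in leaves_inorder(tree):
        index = leaf[1:]
        if index not in seen:
            seen.add(index)
            expressed.append(leaf)
    return expressed


def order_fitness(tree: Tree) -> int:
    """Number of expressed X_i primitives."""
    return sum(1 for leaf in expressed_primitives(tree) if leaf.startswith("X"))


example = ("JOIN", ("JOIN", "X1", "N2"), ("JOIN", "X2", ("JOIN", "N1", "X3")))
print(expressed_primitives(example), order_fitness(example))
```

In the example, N2 blocks the later X2 and X1 blocks the later N1, so only X1 and X3 count toward fitness.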
    24. Population Sizing in GP: Decision-Making Model. Terms: error tolerance, signal-to-noise ratio, number of sub-components, sub-component complexity. A BB can be expressed more than once. Accounts for bloat. Related to Kolmogorov complexity [Sastry, O’Reilly, & Goldberg, GPTP 2004].
    25. ORDER: Time and Evaluations. Linear convergence-time scaling and cubic function-evaluation scaling together imply quadratic population-size scaling.
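The implication on this slide follows from one line of arithmetic, assuming the number of function evaluations in a run is roughly population size times convergence time. The symbols n (population size), t_c (convergence time in generations), n_fe (function evaluations), and m (problem size) are notation chosen here; the slide's own equations were not captured.

```latex
\[
n_{fe} \approx n \, t_c, \qquad
t_c = \Theta(m), \quad n_{fe} = \Theta(m^{3})
\;\Longrightarrow\;
n \approx \frac{n_{fe}}{t_c} = \Theta(m^{2}).
\]
```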
    26. LOUD: Empirical Validation. Primitive set: 0, 1, 4. All primitives are expressed, but only 2 of the 3 influence fitness. Fitness: Theory:
    27. LOUD: Empirical Validation. Convergence time is constant with problem size.
    28. ON-OFF: Empirical Validation. Primitive set: x1, x2, EXP, NO-EXP. BB expression is tunable via the frequency of NO-EXP. A leaf must have only EXP nodes on its path to the root to express itself. Fitness:
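The expression rule for ON-OFF can be sketched as a tree walk. The encoding is an assumption: EXP and NO-EXP are taken to be internal function nodes, x1 and x2 are leaves, and trees are nested tuples. The fitness function is not reproduced because it did not survive in the transcript; the sketch only determines which leaves are expressed.

```python
from typing import List, Tuple, Union

# A tree is a leaf name ("x1" or "x2") or (op, child, child, ...),
# where op is "EXP" or "NO-EXP".
Tree = Union[str, Tuple]


def expressed_leaves(tree: Tree, path_all_exp: bool = True) -> List[str]:
    """Leaves whose every ancestor up to the root is an EXP node."""
    if isinstance(tree, str):
        return [tree] if path_all_exp else []
    op, *children = tree
    still_on = path_all_exp and op == "EXP"
    result: List[str] = []
    for child in children:
        result.extend(expressed_leaves(child, still_on))
    return result


example = ("EXP", "x1", ("NO-EXP", "x2", "x1"))
print(expressed_leaves(example))
```

A single NO-EXP node silences its whole subtree, which is how the frequency of NO-EXP tunes BB expression.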
    29. ON-OFF: Empirical Validation
    30. Using the Population-Sizing Model. Pick a fairly convenient λ. Create a coarse sample and use the std-dev of average fitness to estimate noise. If possible, estimate bloat and relate it to tree size. Establish d, k, and the minimal solution size; keep k small. Estimate φ (use binary trees if clueless). Estimate as l/k * inverse of bloat frequency.
    31. If You Have a Small-Scale Version of the Problem. Find the best population size for the small problem. Consider the ratios of alphabet size and minimal solution size and the inverse ratio of bloat, and keep the tree sizes for the initial population the same. This was done for the 6- and 11-multiplexer.
    32. The Numbers
    33. Decision-Making-Based Popsize: The Quick Answer. The required population size increases when: collateral BB noise increases; alphabet size increases; building block size increases; signal-to-noise ratio decreases; error tolerance decreases; problem size increases; tree size decreases; bloat increases.
    34. Population Sizing: Decision-Making Model. Error in a single trial. Gambler’s ruin model [Harik et al., ECJ, 7(3), 1999]. Terms: initial market share, population size, number of sub-components, error tolerance, signal-to-noise ratio, sub-component complexity.
    35. Convergence Time in GP. Understanding time in GP: Goldberg & O’Reilly (1998), O’Reilly & Goldberg (1998). Expression matters. Key factors: selection method, varying size, context, … Convergence time: sequential [Sastry, O’Reilly, & Goldberg, GPTP 2004].
    36. Multiplexer Problem

    kknsastry, 3 years ago

    This paper derives a population sizing relationship ...

    CC Attribution-NonCommercial-NoDerivs License
