Evolving Rules to Solve Problems: The Learning Classifier Systems Way

Invited talk at EPIA 2007 the Portuguese Conference on Artificial Intelligence, Guimarães, Portugal.

http://epia2007.appia.pt/


  1. 1. Evolving Rules to Solve Problems: The Learning Classifier Systems Way Pier Luca Lanzi EPIA 2007, Guimarães, Portugal, September 4th, 2007
  2. 2. Evolving
  3. 3. Early Evolutionary Research 3 Box (1957): evolutionary operations, which led to simplex methods and Nelder-Mead. Other early evolutionary work: Friedman (1959), Bledsoe (1961), Bremermann (1961). Rechenberg (1964), Schwefel (1965): evolution strategies. Fogel, Owens & Walsh (1966): evolutionary programming. Common view: evolution = random mutation + save the best. Pier Luca Lanzi
  4. 4. Early intuitions 4 “There is the genetical or evolutionary search by which a combination of genes is looked for, the criterion being the survival value.” Alan M. Turing, Intelligent Machinery, 1948 “We cannot expect to find a good child-machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications “Structure of the child machine = Hereditary material “Changes of the child machine = Mutations “Natural selection = Judgment of the experimenter” Alan M. Turing, “Computing Machinery and Intelligence” 1950. Pier Luca Lanzi
  5. 5. Meanwhile… in Ann Arbor… 5 Holland (1959): iterative circuit computers. Holland (1962): outline for a logical theory of adaptive systems. Role of recombination (Holland 1965). Role of schemata (Holland 1968, 1971). Two-armed bandit (Holland 1973, 1975). First dissertations (Bagley, Rosenberg 1967). Simple genetic algorithm (De Jong 1975). Pier Luca Lanzi
  6. 6. What are Genetic Algorithms? 6 Genetic algorithms (GAs) are search algorithms based on the mechanics of natural selection and genetics. Two components: natural selection (survival of the fittest) and genetics (recombination of structures, variation). Underlying metaphor: individuals in a population must be adapted to the environment to survive and reproduce. A problem can be viewed as an environment; we evolve a population of solutions to solve it. Different individuals are differently adapted: to survive, a solution must be “adapted” to the problem. Pier Luca Lanzi
  7. 7. A Peek into Genetic Algorithms 7 Population: a set of candidate solutions. Representation: the coding of solutions; originally, binary strings. Fitness function: evaluates candidate solutions. Operators inspired by nature: selection, recombination, mutation. Genetic algorithm: generate an initial random population; repeat (select promising solutions; create new solutions by applying variation; incorporate new solutions into the original population) until the stop criterion is met. Pier Luca Lanzi
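The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not from the talk: the bitstring `onemax` fitness, tournament selection, one-point crossover, and bit-flip mutation are all assumed choices.

```python
import random

def onemax(bits):                      # toy fitness: number of 1s
    return sum(bits)

def tournament(pop, k=2):              # selection: fitter of k random picks
    return max(random.sample(pop, k), key=onemax)

def crossover(a, b):                   # recombination: one-point
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.01):           # variation: bit-flip mutation
    return [1 - b if random.random() < rate else b for b in bits]

def genetic_algorithm(length=20, pop_size=30, generations=50):
    # generate an initial random population
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # select promising solutions, create new ones by variation,
        # and incorporate them into the population
        pop = [mutate(crossover(tournament(pop), tournament(pop)))
               for _ in range(pop_size)]
    return max(pop, key=onemax)        # best solution found

best = genetic_algorithm()
```

With this setup the best individual typically converges to (or near) the all-ones string within a few dozen generations.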
  8. 8. Rules
  9. 9. Holland’s Vision, Cognitive System One 9 To state, in concrete technical form, a model of a complete mind and its several aspects A cognitive system interacting with an environment Binary detectors and effectors Knowledge = set of classifiers Condition-action rules that recognize a situation and propose an action Payoff reservoir for the system’s needs Payoff distributed through an epochal algorithm Internal memory as message list Genetic search of classifiers Pier Luca Lanzi
  10. 10. What was the goal? 10 1#11:buy ⇒30 0#0#:sell ⇒-2 … A real system with an unknown underlying dynamics. Use a classifier system online to generate behavior that matches the real system. The evolved rules would provide a plausible, human-readable model of the unknown system. Pier Luca Lanzi
  11. 11. Holland’s Learning Classifier Systems 11 Explicit representation of the incoming reward: good classifiers are the ones that predict high rewards. Credit assignment using the bucket brigade. Rule discovery through a genetic algorithm applied to the whole rule base (the whole solution). The description was vast. It did not work right off! Very limited success. David E. Goldberg: Computer-aided gas pipeline operation using genetic algorithms and rule learning. PhD thesis, University of Michigan, Ann Arbor, MI. Pier Luca Lanzi
  12. 12. Learning System LS-1 & Pittsburgh Classifier Systems 12 Holland models learning as an adaptation process; De Jong models learning as an optimization process: a genetic algorithm applied to a population of rule sets. 1. t := 0 2. Initialize the population P(t) 3. Evaluate the rule sets in P(t) 4. While the termination condition is not satisfied 5. Begin 6. Select the rule sets in P(t) and generate Ps(t) 7. Recombine and mutate the rule sets in Ps(t) 8. P(t+1) := Ps(t) 9. t := t+1 10. Evaluate the rule sets in P(t) 11. End No apportionment of credit; offline evaluation of rule sets. Pier Luca Lanzi
  13. 13. As time goes by… 13 1970s: genetic algorithms and CS-1; research flourishes; success is limited. 1980s: research follows Holland’s vision (reinforcement learning, machine learning, evolving rules as optimization); success is still limited. 1990s: Stewart Wilson creates XCS; robotics applications; first results on classification; but the interest fades away. 2000s: classifier systems finally work; large development of models, facetwise theory, and applications. Pier Luca Lanzi
  14. 14. Stewart W. Wilson & the XCS Classifier System 14 1. Simplify the model. 2. Go for accurate predictions, not high payoffs. 3. Apply the genetic algorithm to subproblems, not to the whole problem. 4. Focus on classifier systems as reinforcement learning with rule-based generalization. 5. Use reinforcement learning (Q-learning) to distribute reward. The most successful model developed so far. Wilson, S.W.: Classifier Fitness Based on Accuracy. Evolutionary Computation 3(2), 149-175 (1995). Pier Luca Lanzi
  15. 15. The Classifier System Way
  16. 16. Learning Classifier Systems as Reinforcement Learning Methods 16 The system observes state st, performs action at, and the environment returns the next state st+1 and reward rt+1. The goal: maximize the amount of reward received. How much future reward when at is performed in st? What is the expected payoff for st and at? We need to compute a value function, Q(st,at) → payoff. Pier Luca Lanzi
  17. 17. How does reinforcement learning work? 17 Define the inputs, the actions, and how the reward is determined. Define the expected payoff. Compute a value function Q(st,at) mapping state-action pairs into expected payoffs. Pier Luca Lanzi
  18. 18. How does reinforcement learning work? 18 First we define the expected payoff as Q(st,at) = E[ rt+1 + γ rt+2 + γ² rt+3 + … | st, at ], where γ is the discount factor. Pier Luca Lanzi
  19. 19. How does reinforcement learning work? 19 Then, Q-learning is an option. At the beginning, Q is initialized with random values. At time t, the previous value is moved toward the new estimate, Q(st,at) ← Q(st,at) + β ( rt+1 + γ maxa Q(st+1,a) − Q(st,at) ), combining the incoming reward with the discounted value of the best next action. Parameters: the discount factor γ, the learning rate β, and the action-selection strategy. Pier Luca Lanzi
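In code, the tabular update reads as below. This is a minimal sketch: the state/action encoding and the ε-greedy strategy are illustrative assumptions, and the table is 0-initialized for simplicity where the slide uses random values.

```python
import random
from collections import defaultdict

beta, gamma = 0.2, 0.9            # learning rate and discount factor
Q = defaultdict(float)            # Q[(state, action)]; 0-initialized here

def q_update(s, a, reward, s_next, actions):
    # previous value moved toward the new estimate:
    # incoming reward plus the discounted best value of the next state
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += beta * (reward + gamma * best_next - Q[(s, a)])

def epsilon_greedy(s, actions, eps=0.1):
    # one possible action-selection strategy
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```

One such update after each interaction with the environment is all Q-learning needs; the value function converges under the usual conditions on β and exploration.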
  20. 20. This looks simple… 20 Let’s bring RL to the real world! Reinforcement learning assumes that Q(st,at) is represented as a table But the real world is complex, the number of possible inputs can be huge! We cannot afford an exact Q(st,at) Pier Luca Lanzi
  21. 21. Example: The Mountain Car 21 Task: drive an underpowered car up a steep mountain road. State st = (position, velocity); actions a = accelerate left, no acceleration, accelerate right; reward rt = 0 when the goal is reached, -1 otherwise. The value function Q(st,at) must be learned. Pier Luca Lanzi
  22. 22. What are the issues? 22 Exact representation infeasible Approximation mandatory The function is unknown, it is learnt online from experience Pier Luca Lanzi
  23. 23. What are the issues? 23 Learning an unknown payoff function while also trying to approximate it: the approximator works on intermediate estimates while also providing the information that drives learning. Convergence is not guaranteed. Pier Luca Lanzi
  24. 24. What does this have to do with Learning Classifier Systems? 24 They solve reinforcement learning problems. They represent the payoff function Q(st,at) as a population of rules, the classifiers. Classifiers are evolved while Q(st,at) is learnt online. Pier Luca Lanzi
  25. 25. What is a classifier? 25 IF condition C is true for input s THEN the payoff of action A is p. The condition is an interval, C(s) = l≤s≤u. We want accurate approximations of the payoff surface for A, and general conditions covering large portions of the problem space. Pier Luca Lanzi
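Such a rule with an interval condition can be written down directly. A minimal sketch; the field names are mine, not from the slides.

```python
from dataclasses import dataclass

@dataclass
class Classifier:
    lower: float       # l, lower bound of the interval condition
    upper: float       # u, upper bound of the interval condition
    action: int        # A, the action the rule advocates
    prediction: float  # p, the payoff predicted for A

    def matches(self, s: float) -> bool:
        # condition C(s) = l <= s <= u
        return self.lower <= s <= self.upper

# IF 0.2 <= s <= 0.8 THEN the payoff of action 1 is 30
cl = Classifier(lower=0.2, upper=0.8, action=1, prediction=30.0)
```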
  26. 26. What types of solutions? 26 Pier Luca Lanzi
  27. 27. How do learning classifier systems work? 27 The main performance cycle Pier Luca Lanzi
  28. 28. How do learning classifier systems work? 28 The main performance cycle: the classifiers predict an expected payoff; the incoming reward is used to update the rules which helped in getting the reward. Any reinforcement learning algorithm can be used to estimate the classifier prediction. Pier Luca Lanzi
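A stripped-down version of this cycle might look as follows: build the match set, compute a system prediction per action, and update the rules that advocated the chosen action with a delta-rule (Widrow-Hoff) step. The names and the plain averaging are simplifying assumptions; XCS, for instance, uses a fitness-weighted average.

```python
class Rule:
    """Interval rule: IF lower <= s <= upper THEN `action` pays `prediction`."""
    def __init__(self, lower, upper, action, prediction):
        self.lower, self.upper = lower, upper
        self.action, self.prediction = action, prediction

    def matches(self, s):
        return self.lower <= s <= self.upper

def match_set(population, s):
    # the classifiers whose condition is satisfied by the current input
    return [r for r in population if r.matches(s)]

def system_prediction(match, action):
    # expected payoff for `action`: here, a plain average over its advocates
    preds = [r.prediction for r in match if r.action == action]
    return sum(preds) / len(preds) if preds else 0.0

def update(match, action, target, beta=0.2):
    # the incoming reward (target) updates the rules which helped earn it
    for r in match:
        if r.action == action:
            r.prediction += beta * (target - r.prediction)
```

In a full system this cycle runs once per input, and the `target` comes from the reinforcement learning algorithm of choice (e.g., a Q-learning-style backup).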
  29. 29. How do learning classifier systems work? 29 The main performance cycle Pier Luca Lanzi
  30. 30. Where do classifiers come from? 30 In principle, any search method may be used. I prefer genetic algorithms because they are representation independent. A genetic algorithm selects, recombines, and mutates existing classifiers to search for better ones. Pier Luca Lanzi
  31. 31. What are the good classifiers? 31 What is the classifier fitness? The goal is to approximate a target value function with as few classifiers as possible We wish to have an accurate approximation One possible approach is to define fitness as a function of the classifier prediction accuracy Pier Luca Lanzi
  32. 32. What about getting as few classifiers as possible? 32 The genetic algorithm can take care of this. General classifiers apply more often, thus they are reproduced more. But since fitness is based on classifier accuracy, only accurate classifiers are likely to be reproduced. The genetic algorithm evolves maximally general, maximally accurate classifiers. Pier Luca Lanzi
  33. 33. How to apply learning classifier systems 33 Environment: determine the inputs, the actions, and how reward is distributed; determine the expected payoff that must be maximized; decide an action-selection strategy; set up the parameters β and γ. Learning classifier system: select a representation for conditions and the recombination and mutation operators; select a reinforcement learning algorithm; set up the parameters, mainly the population size and the parameters for the genetic algorithm. Pier Luca Lanzi
  34. 34. Things can be extremely simple! 34 For instance, in supervised classification: the environment presents an example; the system proposes a class; the reward is 1 if the class is correct, 0 if it is not. Then select a representation for conditions and the recombination and mutation operators, and set up the parameters, mainly the population size and the parameters for the genetic algorithm. Pier Luca Lanzi
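The 0/1 reward scheme really is a couple of lines. The tiny dataset and the stand-in `propose_class` rule below are hypothetical; in a real system the proposed class would come from the evolved rule population.

```python
# hypothetical labelled examples: (attributes, class)
data = [((0, 1), 1), ((1, 1), 1), ((0, 0), 0), ((1, 0), 0)]

def propose_class(example):
    # stand-in for the classifier system's proposed class;
    # here a fixed rule: predict the second attribute
    return example[1]

def reward(proposed, true_class):
    # 1 if the class is correct, 0 if the class is not correct
    return 1 if proposed == true_class else 0

total = sum(reward(propose_class(x), y) for x, y in data)
```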
  35. 35. About Classifiers 35 Genetics-based generalization, accurate estimates (powerful RL), and classifier representation. Pier Luca Lanzi
  36. 36. One Representation, One Principle 36 Data described by 6 variables a1, …, a6. They represent the simple concept “a1=a2 ∨ a5=1”. A rather typical approach: select a representation; select an algorithm which produces such a representation; apply the algorithm. Decision rules (attribute-value): if (a5 = 1) then class = 1 [95.3%]; if (a1=3 ∧ a2=3) then class = 1 [92.2%]; … FOIL: Clause 0: is_0(a1,a2,a3,a4,a5,a6) :- a1≠a2, a5≠1 Pier Luca Lanzi
  37. 37. Learning Classifier Systems: One Principle, Many Representations 37 Ternary rules over the alphabet 0, 1, #: ####1#:1 (if a5&lt;2, class = 1), 22####:1 (if a1=a2, class = 1). The framework (genetic search, estimates, knowledge representation, RL & ML) stays the same whether conditions are ternary, attribute-value, or symbolic. No need to change the framework: just plug in your favourite representation. Pier Luca Lanzi
  38. 38. Better Estimations
  39. 39. What is computed prediction? 39 Replace the prediction p by a parametrized function p(x,w). Which type of approximation? For example, linear: p(x,w) = w0 + w1x over the condition C(s) = l≤s≤u. Which representation? Pier Luca Lanzi
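A classifier with a linear computed prediction can be sketched as below. This uses a plain delta-rule update toward the observed payoff; XCSF proper normalizes the step and augments the input with a constant x0, so treat this as an illustration under those simplifications.

```python
class LinearRule:
    """IF lower <= x <= upper THEN the payoff of `action` is p(x,w) = w0 + w1*x."""
    def __init__(self, lower, upper, action):
        self.lower, self.upper, self.action = lower, upper, action
        self.w = [0.0, 0.0]                    # w0 (offset) and w1 (slope)

    def matches(self, x):
        return self.lower <= x <= self.upper

    def prediction(self, x):
        return self.w[0] + self.w[1] * x       # computed, not stored

    def update(self, x, target, eta=0.2):
        # delta-rule step toward the observed payoff
        error = target - self.prediction(x)
        self.w[0] += eta * error
        self.w[1] += eta * error * x

cl = LinearRule(0.0, 1.0, action=0)
for _ in range(200):                           # learn the payoff line 3 + 2x
    for x in (0.0, 0.5, 1.0):
        cl.update(x, 3.0 + 2.0 * x)
```

After a few hundred updates the weights settle near w0 = 3 and w1 = 2, so one rule covers the whole interval with an accurate prediction instead of a single scalar.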
  40. 40. Same example with computed prediction 40 Again, no need to change the framework: just plug in your favourite estimator. Linear, polynomial, NNs, SVMs, tile coding. Pier Luca Lanzi
  41. 41. Any Theory?
  42. 42. 42 Learning Classifier Systems involve representation, reinforcement learning, and genetics-based search. A unified theory is impractical: develop facetwise models. Pier Luca Lanzi
  43. 43. Facetwise Models for a Theory of Evolution and Learning 43 Prof. David E. Goldberg, University of Illinois at Urbana-Champaign. David Goldberg & Kumara Sastry: Genetic Algorithms: The Design of Innovation. Springer-Verlag, May 2008. A facetwise approach to analysis and design for genetic algorithms. In learning classifier systems: separate learning from evolution; simplify the problem by focusing only on the relevant aspects; derive facetwise models. Applied to model several aspects of evolution, e.g., the time to convergence is O(L 2^k). Pier Luca Lanzi
  44. 44. What Applications?
  45. 45. 45 Computational models of cognition; complex adaptive systems; autonomous robotics; classification & data mining; others (traffic controllers, target recognition, fighter maneuvering, …). Pier Luca Lanzi
  46. 46. Modeling Cognition
  47. 47. What Applications? 47 Computational Models of Cognition. Learning classifier systems model certain aspects of cognition: human language learning, perceptual category learning, affect theory, anticipatory and latent learning. Learning classifier systems provide good models for animals in experiments in which the subjects must learn internal models to perform as well as they do. Martin V. Butz, University of Würzburg, Department of Cognitive Psychology III, Cognitive Bodyspaces: Learning and Behavior (COBOSLAB); Wolfgang Stolzmann, Daimler Chrysler; Rick R. Riolo, University of Michigan, Center for the Study of Complex Systems. Pier Luca Lanzi
  48. 48. References 48 Butz, M.V.: Anticipatory Learning Classifier Systems. Genetic Algorithms and Evolutionary Computation, vol. 4. Springer-Verlag (2000). Riolo, R.L.: Lookahead Planning and Latent Learning in a Classifier System. In: J.A. Meyer, S.W. Wilson (eds.) From Animals to Animats 1: Proceedings of the First International Conference on Simulation of Adaptive Behavior (SAB90), pp. 316-326. A Bradford Book, MIT Press (1990). Stolzmann, W., Butz, M.V., Hoffman, J., Goldberg, D.E.: First Cognitive Capabilities in the Anticipatory Classifier System. In: From Animals to Animats: Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior. MIT Press (2000). Pier Luca Lanzi
  49. 49. Computational Economics
  50. 50. What Applications? 50 Computational Economics. To model a single agent acting in the market (B.W. Arthur, J.H. Holland, B. LeBaron). To model many interacting agents, each one controlled by its own classifier system: modeling the behavior of agents trading risk-free bonds and risky assets; different trader types modeled by supplying different input information sets to a group of homogeneous agents; later extended to a multi-LCS architecture applied to portfolio optimization. A technology startup company founded in March 2005. Pier Luca Lanzi
  51. 51. References 51 Sor Ying (Byron) Wong, Sonia Schulenburg: Portfolio allocation using XCS experts in technical analysis, market conditions and options market. GECCO (Companion) 2007: 2965-2972. Sonia Schulenburg, Peter Ross: An Adaptive Agent Based Economic Model. Learning Classifier Systems 1999: 263-282. B.W. Arthur, J.H. Holland, B. LeBaron, R. Palmer, and P. Tayler: “Asset Pricing Under Endogenous Expectations in an Artificial Stock Market,” in The Economy as an Evolving Complex System II, edited with S. Durlauf and D. Lane, Addison-Wesley, 1997. B.W. Arthur, R. Palmer, J. Holland, B. LeBaron, and P. Taylor: “Artificial Economic Life: a Simple Model of a Stockmarket,” Physica D, 75, 264-274, 1994. Pier Luca Lanzi
  52. 52. Data analysis
  53. 53. What Applications? 53 Classification and Data Mining. Bull, L. (ed.): Applications of Learning Classifier Systems. Springer (2004). Bull, L., Bernado Mansilla, E. & Holmes, J. (eds.): Learning Classifier Systems in Data Mining. Springer (2008). Nowadays, by far the most important application domain for LCSs. Many models: GA-Miner, REGAL, GALE, GAssist. Performance comparable to state-of-the-art machine learning. Human Competitive Results 2007: X. Llorà, R. Reddy, B. Matesic, R. Bhargava: Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infrared Spectroscopic Imaging. Pier Luca Lanzi
  54. 54. Hyper Heuristics
  55. 55. What Applications? 55 Hyper-Heuristics. Ross P., Marin-Blazquez J., Schulenburg S., and Hart E.: Learning a Procedure that can Solve Hard Bin-packing Problems: A New GA-Based Approach to Hyper-Heuristics. In Proceedings of GECCO 2003. Bin-packing and timetabling problems: pick a set of non-evolutionary heuristics; use a classifier system to learn a solution process, not a solution. The classifier system learns a sequence of heuristics which should be applied to gradually transform the problem from its initial state to its final solved state. Pier Luca Lanzi
  56. 56. Medical data
  57. 57. What Applications? 57 Epidemiologic Surveillance. John H. Holmes, Center for Clinical Epidemiology & Biostatistics, Department of Biostatistics & Epidemiology, University of Pennsylvania School of Medicine. Epidemiologic surveillance data need adaptivity to abrupt changes; readable rules are attractive. Performance similar to state-of-the-art machine learning, but several important feature-outcome relationships missed by other methods were discovered. Similar results were reported by Stewart Wilson for breast cancer data. Pier Luca Lanzi
  58. 58. References 58 John H. Holmes, Jennifer A. Sager: Rule Discovery in Epidemiologic Surveillance Data Using EpiXCS: An Evolutionary Computation Approach. AIME 2005: 444-452 John H. Holmes, Dennis R. Durbin, Flaura K. Winston: A New Bootstrapping Method to Improve Classification Performance in Learning Classifier Systems. PPSN 2000: 745-754 John H. Holmes, Dennis R. Durbin, Flaura K. Winston: The learning classifier system: an evolutionary computation approach to knowledge discovery in epidemiologic surveillance. Artificial Intelligence in Medicine 19(1): 53-74 (2000) Pier Luca Lanzi
  59. 59. Autonomous Robotics
  60. 60. What Applications? 60 Autonomous Robotics. In the 1990s, a major testbed for learning classifier systems. Marco Dorigo and Marco Colombetti: Robot Shaping: An Experiment in Behavior Engineering, 1997. They introduced the concept of robot shaping, defined as the incremental training of an autonomous agent, and a behavior engineering methodology named BAT: Behavior Analysis and Training. Recently, the University of the West of England applied several learning classifier system models to several robotics problems. Pier Luca Lanzi
  61. 61. Artificial Ecosystems
  62. 62. What Applications? 62 Modeling Artificial Ecosystems. Jon McCormack, Monash University: Eden, an interactive, self-generating, artificial ecosystem. A world populated by collections of evolving virtual creatures: creatures move about the environment, make and listen to sounds, forage for food, encounter predators, and mate with each other. Creatures evolve to fit their landscape. Eden has four seasons per year (15 mins) and simple physics for rocks, biomass, and sonic animals. Pier Luca Lanzi
  63. 63. References 63 McCormack, J.: Impossible Nature: The Art of Jon McCormack. Published by the Australian Centre for the Moving Image. ISBN 1 920805 08 7, ISBN 1 920805 09 5 (DVD). McCormack, J.: New Challenges for Evolutionary Music and Art. ACM SIGEVOlution Newsletter, Vol. 1(1), April 2006, pp. 5-11. McCormack, J. 2005, 'On the Evolution of Sonic Ecosystems', in Adamatzky et al. (eds), Artificial Life Models in Software, Springer, Berlin. McCormack, J. 2003, 'Evolving Sonic Ecosystems', Kybernetes, 32(1/2), pp. 184-202. Pier Luca Lanzi
  64. 64. Chemical & Neuronal Networks
  65. 65. What Applications? 65 Chemical and Neuronal Networks. L. Bull, A. Budd, C. Stone, I. Uroukov, B. De Lacy Costello and A. Adamatzky, University of the West of England. The behaviour of non-linear media is controlled automatically through evolutionary learning: unconventional computing realised by such an approach. Learning classifier systems control a light-sensitive sub-excitable Belousov-Zhabotinski reaction and the electrical stimulation of cultured neuronal networks. Pier Luca Lanzi
  66. 66. What Applications? 66 Chemical and Neuronal Networks. To control a light-sensitive sub-excitable BZ reaction, pulses of wave fragments are injected into the checkerboard grid, resulting in rich spatio-temporal behaviour; the learning classifier system can direct the fragments to an arbitrary position through control of the light intensity within each cell. Learning classifier systems control the electrical stimulation of cultured neuronal networks such that they display elementary learning, responding to a given input signal in a pre-specified way. Results indicate that the learned stimulation protocols identify seemingly fundamental properties of in vitro neuronal networks. Pier Luca Lanzi
  67. 67. References 67 Larry Bull, Adam Budd, Christopher Stone, Ivan Uroukov, Ben De Lacy Costello and Andrew Adamatzky: Towards Unconventional Computing through Simulated Evolution: Learning Classifier System Control of Non-Linear Media. Artificial Life (to appear). Budd, A., Stone, C., Masere, J., Adamatzky, A., De Lacy Costello, B., Bull, L.: Towards machine learning control of chemical computers. In: A. Adamatzky, C. Teuscher (eds.) From Utopian to Genuine Unconventional Computers, pp. 17-36. Luniver Press. Bull, L., Uroukov, I.S.: Initial results from the use of learning classifier systems to control in vitro neuronal networks. In: Lipson [189], pp. 369-376. Pier Luca Lanzi
  68. 68. Conclusions
  69. 69. Conclusions 69 Cognitive modeling, complex adaptive systems, machine learning, reinforcement learning, metaheuristics, … Many blocks to plug in: several representations, several RL algorithms, several evolutionary methods, … Pier Luca Lanzi
  70. 70. Further Readings 70 Martin V. Butz “Rule-Based Evolutionary Online Learning Systems: A Principled Approach to LCS Analysis and Design” Studies in Fuzziness and Soft Computing, Springer-Verlag 2005 Proceedings of IWLCS from 2000 to 2007 published by Springer Verlag Pier Luca Lanzi
  71. 71. Thank you! 71 Any Question? Pier Luca Lanzi
