Gbml - Genetics Based Machine Learning


Published in: Education, Technology
1. Introduction to Genetics-based Machine Learning
   Mahalingam P. R., Semester III, M.Tech CSESIS, RSET
2. Contents
    Background of Machine Learning
      About Machine Learning
      GA in Machine Learning
    Evolution of GBML
    Classifier System
      Rule and Message System
      Apportionment of Credit System
      Genetic Algorithm
    A simple classifier system in Pascal
    Conclusion
3. Introduction
4. Machine Learning
    A machine learns whenever it changes its structure, program, or data (based on its inputs or in response to external information) in such a manner that its expected future performance improves.
    Machine learning usually refers to changes in systems that perform tasks associated with artificial intelligence (AI), such as recognition, diagnosis, planning, robot control, and prediction.
    The "changes" might be either enhancements to already performing systems or ab initio synthesis of new systems.
5. A typical AI system
   Here, the model progressively learns from the experience it gains from the environment.
6. Need for Machine Learning
    Some tasks cannot be defined well except by example.
      We might be able to specify input/output pairs, but not a concise relationship between inputs and desired outputs.
      We would like machines to adjust their internal structure to produce correct outputs for a large number of sample inputs, thereby suitably constraining their input/output function to approximate the relationship implicit in the examples.
    Important relationships and correlations may be hidden among large piles of data.
      Machine learning methods can often be used to extract these relationships (data mining).
7. Need for Machine Learning (continued)
    Machine learning methods can be used for on-the-job improvement of existing machine designs.
    The amount of knowledge available about certain tasks might be too large for explicit encoding by humans.
      Machines that learn this knowledge gradually might be able to capture more of it than humans would want to write down.
    Environments change over time.
      Machines that can adapt to a changing environment would reduce the need for constant redesign.
    New knowledge about tasks is constantly being discovered by humans (vocabulary changes), and there is a constant stream of new events in the world.
      Continuing redesign of AI systems to conform to new knowledge is impractical, but machine learning methods might be able to track much of it.
8. More on Machine Learning
   Contributions to Machine Learning:
    Statistics
    Brain Models
    Adaptive Control Theory
    Psychological Models
    Artificial Intelligence
    Evolutionary Models
   Varieties of Machine Learning:
    Functions
    Logic Programs and Rule Sets
    Finite-state Machines
    Grammars
    Problem Solving Systems
9. GA in Machine Learning
    Conventional GA systems work with the following properties: probabilistic, random, enumerative.
    The structures adapted cater to the human-like mechanism adopted in the methodology.
    This adaptation itself is the biggest hindrance when incorporating a GA into more complex, less completely defined environments.
      Too much guessing when it comes to searching
      The methodology is safe only within the "sandbox" of searching
10. Coming Up
     Origins of GBML
     Classifier Systems
     Operation of Classifier Systems
     Implementation of a Simple Classifier System in Pascal
     Testing the classifier in a problem domain: learning a Boolean function
    The goal is to study machine learning systems that use genetic search as their primary discovery heuristic, adapting GA structures to work in complex environments like machine learning.
11. Origin of GBML systems
12. In nature, not only do individual animals learn to perform better, but species evolve to be better fitted to their individual niches. Since the distinction between evolving and learning can be blurred in computer systems, techniques that model certain aspects of biological evolution have been proposed as learning methods to improve the performance of computer programs. Genetic algorithms [Holland, 1975] and genetic programming [Koza, 1992; Koza, 1994] are the most prominent computational techniques for evolution.
13. Theoretical Foundation for GBML
     Laid by Holland (1962): an outline for adaptive systems theory, including the role of program replication as a method of emphasizing past programs.
     Fundamental role of recombination: Holland (1965).
     Schemata processors: Holland (1968–1971).
     Application of a classifier system: Holland and Reitman (1978).
14.  Modern classifier systems resemble schemata processors in both outline and detail.
     Holland suggested four prototypes in the initial proposal, but no experiments or implementations were reported at the time.
     The proposal coincided with the development of the theory of schemata.
15. Prototype 1: a stimulus-response (SR) processor that would link environmental schemata (conditions) with particular action effectors.
    Prototype 2: extends type 1 by adding internal effectors (internal states).
16. Prototype 3: builds upon types 1 and 2 by including explicit environmental state prediction (a model of the real world) and an internal evaluation mechanism.
    Prototype 4: extends the other types by incorporating the capability to modify its own effectors and detectors, permitting a greater (or lesser) range of data detection and a larger behavioural response.
17. Broadcast Language
     Derived from the early proposals: broad, but unimplemented.
     Creates broadcast units (production rules) from a 10-letter alphabet.
     The alphabet added wild-card (single- and multiple-match) characters to an underlying binary alphabet.
     If included, the following would have given sufficient power for computational completeness and representational convenience:
       A fundamental punctuation mark: a persistence symbol for continued broadcast of a message.
       A quotation character for taking the next character literally.
18. The proposal for the broadcast language was instrumental in unifying the earlier suggestions for schemata operators by theoretically permitting a consistent representation of all operators, data, and rules or instructions. The generality gained in theory, however, has not been realized in practice.
19. First Practical Implementation
     Came three years after the broadcast language proposal: Cognitive System Level One (CS-1), by Holland and Reitman (1978).
       Trained to learn two maze-running tasks.
       Performance system with a message list and simple string rules (classifiers).
       GA composed of reproduction, crossover, mutation, and crowding.
       Epochal learning mechanism in which reward is apportioned to all classifiers active between successive payoff events.
     This learning mechanism has largely been supplanted by another mechanism: the bucket brigade.
20. GBML Applications
21.–24. (GBML application examples, shown as figures)
25. Classifier System
26. A classifier system is a machine learning system that learns syntactically simple string rules (called classifiers) to guide its performance in an arbitrary environment. A classifier system consists of three main components:
     Rule and message system
     Apportionment of credit system
     Genetic algorithm
27. The rule and message system of a classifier system is a special kind of production system. A production system is a computational scheme that uses rules as its only algorithmic device. The general syntax of such systems is as follows:
    if <condition> then <action>
    The production means that the action may be taken (the rule is "fired") when the condition is satisfied.
28. Production systems have been shown to be computationally complete as well as convenient: a simple rule or set of rules can represent a complex set of thoughts compactly. Many production systems permit involved grammatical constructions for the condition and action parts of a rule, but in situations that call for learning, such complex rule syntax is inadvisable.
29. Classifier systems depart from the mainstream by restricting a rule to a fixed-length representation. This restriction has two benefits:
     All strings under the permissible alphabet are syntactically meaningful.
     A fixed string representation permits string operators of the genetic kind.
    This leaves the door open for a genetic algorithm search of the space of permissible rules.
30. Classifier systems use parallel rule activation, whereas traditional systems use serial rule activation; this permits multiple activities to be coordinated simultaneously. When choices must be made between mutually exclusive environmental actions, or when the matched rule set must be pruned to fit the fixed-length message list, the choices are postponed to the last possible moment, and the arbitration is then performed competitively.
31. In traditional expert systems, the value or rating of a rule relative to others is fixed by the programmer in conjunction with the expert or group of experts to be emulated. This is not possible in a rule-learning system: the relative importance has to be learned. Classifiers are therefore forced to coexist in an information-based service economy. A competition is held among classifiers in which the right to answer relevant messages goes to the highest bidder, and subsequent bids serve as income for previously successful message senders. This competitive nature ensures that good (profitable) rules survive and bad (unprofitable) rules die off.
32. Internal Currency
    The exchange and accumulation of an internal currency provides a natural figure of merit for applying a GA. Using the "bank balance" as a fitness function, classifiers may be reproduced, crossed, and mutated. The system can thus rank extant rules and discover new, possibly better rules through innovative combinations of old ones. Here the stress is on "who gets replaced", not on replacing entire populations.
33. Thus, apportionment of credit via competition, together with rule discovery using a GA, forms a reasonable basis for constructing a machine learning system atop the computationally convenient and complete framework of classifiers.
    Next, each component is discussed in detail and the interconnections are studied:
    • Rule and message system
    • Apportionment of credit
    • Genetic algorithm
    • Implementation in Pascal
    • Testing on a real-world problem: learning a Boolean function
34. Rule and Message System
35. This is the schematic of a complete classifier system.
36. The rule and message system forms the backbone of the silicon beast. Information flows from the environment through the detectors, where it is decoded into one or more finite-length messages. The environmental messages are fed into a finite-length message list, where they activate string rules called classifiers. When activated, a classifier posts a message to the message list; such messages may in turn invoke other classifiers, or cause an action to be taken through the system's action triggers, called effectors.
37. Classifiers combine environmental cues and internal thoughts to determine what the system should do and think next. The rule and message system coordinates the flow of information from where it is sensed (detectors) to where it is processed (message list and classifier store) to where it is called to action (effectors). Its informational units:
     Messages
     Classifiers
38. A message within a classifier system is simply a finite-length string over some finite alphabet. If we limit ourselves to the binary alphabet, we get
    <message> ::= {0, 1}^L
    that is, the concatenation of 0s and 1s, L characters long. Messages are the basic token of information exchange in a classifier system. The messages on the message list may match one or more classifiers (string rules).
39. A classifier is a production rule with the syntax
    <classifier> ::= <condition>:<message>
    The condition is a simple pattern-recognition device in which a wild card (#) is added to the underlying alphabet:
    <condition> ::= {0, 1, #}^L
    The condition matches a message if, at every position, a 0 in the message matches a 0 in the condition, a 1 matches a 1, or a # matches either a 0 or a 1.
     For example, #01# matches 0010, 0011, 1010, and 1011, but it does not match 0000.
     This is similar to a schema in a GA.
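The matching rule above fits in a few lines of code. The following is an illustrative Python sketch (not the Pascal listing discussed later in the talk), and the function name is an assumption:

```python
# Illustrative sketch of classifier-condition matching over the ternary
# alphabet {0, 1, #}; the function name `matches` is an assumption, not
# taken from the Pascal implementation presented later.

def matches(condition: str, message: str) -> bool:
    """A condition matches a message of the same length if, position by
    position, 0 matches 0, 1 matches 1, and the wild card # matches either."""
    return len(condition) == len(message) and all(
        c == '#' or c == m for c, m in zip(condition, message)
    )

# The slide's example: #01# matches 0010, 0011, 1010 and 1011, but not 0000.
assert matches("#01#", "0010") and matches("#01#", "1011")
assert not matches("#01#", "0000")
```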
40. Once a classifier's condition is matched, it becomes a candidate to post its message to the message list in the next time step. Whether the candidate classifier posts its message is determined by the outcome of an activation auction, which in turn depends on an evaluation of the classifier's value or weighting.
41. Sample simulation by hand
    The initial message list is as follows.
42.–49. (hand simulation worked through in figures)
50. Apportionment of Credit Algorithm: the Bucket Brigade
51. Many classifier systems attempt to rank or rate individual classifiers according to each classifier's role in achieving reward from the environment. The most prevalent method is the bucket brigade algorithm, due to John Holland.
52. About the Algorithm
     An information economy in which the right to trade information is bought and sold by classifiers.
     Classifiers form a chain of middlemen from the information manufacturer (the environment) to the information consumer (the effectors).
     Components of the service economy:
       Auction
       Clearinghouse
53. When classifiers are matched, they do not directly post their messages. A matching message entitles a classifier to participate in an activation auction.
     Each classifier maintains a record of its net worth, called its strength (S).
     Each matched classifier makes a bid B proportional to its strength.
    In this way, rules that are highly fit (have accumulated net worth) are given preference over other rules.
54. Once a classifier is selected for activation, it must clear its payment through the clearinghouse, paying its bid to the other classifiers for matching messages rendered. A matched and activated classifier sends its bid B to those classifiers responsible for sending the messages that matched the bidder's condition, the bid amount being divided among the matching classifiers. This division of payoff among contributing classifiers helps ensure the formation of an appropriately sized subpopulation of rules, so that different types of rules can cover different types of behavioral requirements without undue interspecies competition.
55. In a rule-learning system of any experience, we cannot search for one master rule. We must instead search for a co-adapted set of rules that together cover a range of behavior and provide ample payoff to the learning system. Consider the same classifiers as before.
56. Now let's follow the payments, starting with an initial strength of 200.
57.–62. (payment sequence worked through in figures)
63. For steady receipts, the bid value approaches the receipt. For time-varying receipts, the bid is a geometrically weighted average of the input; as such, it acts as a filter of possibly intermittent and noisy receipt values.
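The steady-receipt claim can be illustrated numerically: a classifier that repeatedly bids a fraction of its strength and collects a fixed receipt sees its bid converge to that receipt. The update rule and coefficient below are illustrative assumptions, not the book's exact accounting.

```python
# Numerical illustration of the steady-state claim: with a constant receipt R,
# a classifier whose strength is updated as S <- S - B + R, with bid B = c*S,
# has its bid converge to R. The coefficient c = 0.1 is an assumed value.

def final_bid(receipt, bid_coeff=0.1, strength=200.0, steps=200):
    for _ in range(steps):
        bid = bid_coeff * strength
        strength = strength - bid + receipt
    return bid_coeff * strength

# Starting far from equilibrium, the bid still settles at the receipt value.
assert abs(final_bid(50.0) - 50.0) < 1e-6
```

The fixed point is S = R/c, at which the bid B = cS exactly equals the receipt R; deviations decay geometrically by a factor (1 - c) per step, which is why the bid behaves as a geometrically weighted average of past receipts.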
64. Genetic Algorithm
65. The bucket brigade is a clean procedure for evaluating rules and deciding among competing alternatives, but we still have to devise a way of injecting new rules into the system. As in the simple genetic algorithm (SGA), we can inject new rules using the tripartite operators:
     Reproduction
     Crossover
     Mutation
66. The new rules are placed in the population and processed by the auction, payment, and reinforcement mechanisms so that their role in the system is properly evaluated. Pay attention to "who replaces whom": we do not replace the entire population. The GA in a classifier system strongly resembles those used in search and optimization. The main difference in machine learning is that a non-overlapping population model is not acceptable here; with non-overlapping generations, a complete generation is selected and replaced by a new population at every run.
67. Machine learning emphasizes a high level of on-line performance (learning to perform more proficiently), whereas search and optimization emphasize convergence and off-line performance.
68. De Jong's Experiments
     In machine learning, the whole population should not be replaced; instead, implement and test overlapping-population genetic algorithms.
     The relevant conventional GA parameter is the generation gap (G), a quantity giving the selection proportion.
       Replace that proportion of the population at a given algorithm run.
       Coupled with a number of other parameters.
69. Other Parameters
     GA period, represented as Tga: the number of time steps (rule and message cycles) between GA calls.
       The period can be treated deterministically: the GA is called every Tga cycles.
       Or stochastically: the GA is called probabilistically, with average period Tga.
     Invocation of GA learning may also be conditioned on particular events, such as lack of a match or poor performance.
70. Selection
     Roulette-wheel selection, with the classifier's strength S used as the fitness.
     Since entire populations are no longer generated, care is needed when choosing population members for replacement.
     De Jong's crowding procedure encourages replacement of similar population members.
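Roulette-wheel selection over strengths can be sketched as below; the data layout (dictionaries with a "strength" field) is an illustrative assumption.

```python
# Sketch of roulette-wheel selection using strength S as the fitness:
# each classifier is picked with probability proportional to its strength.
import random

def roulette_select(classifiers):
    """Spin a wheel whose slots are sized by strength; return the winner."""
    total = sum(c["strength"] for c in classifiers)
    spin = random.uniform(0.0, total)
    running = 0.0
    for c in classifiers:
        running += c["strength"]
        if running >= spin:
            return c
    return classifiers[-1]  # guard against floating-point round-off
```

Over many spins, a classifier holding 90 of the total 100 units of strength is chosen roughly 90% of the time, which is how the "bank balance" steers reproduction.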
71. Mutation
     Modified procedure: a ternary alphabet is used here, whereas the SGA used a binary alphabet.
     The mutation probability pm is defined as before.
     When a mutation is called for, we change the mutated character to one of the other two with equal probability:
       0 → { 1, # }
       1 → { 0, # }
       # → { 0, 1 }
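The ternary mutation rule above can be sketched directly; function and variable names here are illustrative, not from the Pascal code.

```python
# Sketch of mutation over the ternary alphabet {0, 1, #}: each position is
# changed with probability p_m to one of the OTHER two symbols, chosen with
# equal probability, exactly as in the 0/1/# table on the slide.
import random

TERNARY = "01#"

def mutate(rule: str, p_m: float) -> str:
    out = []
    for ch in rule:
        if random.random() < p_m:
            # pick uniformly from the two symbols that differ from ch
            out.append(random.choice([c for c in TERNARY if c != ch]))
        else:
            out.append(ch)
    return "".join(out)
```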
72. With all these changes to the normal SGA routine, the GA may be dropped into the classifier system and used in a manner not too different from normal search and optimization applications.
    Next: a simple classifier system in Pascal.
    • Simple classifier system data structure
    • The performance system
    • Apportionment of credit algorithm
    • Genetic search within the simple classifier system
    • Real-world testing
    • Results
    • Comparison with and without GA
73. A Simple Classifier System in Pascal
74.  Construct a system designed to learn a Boolean function: a multiplexer.
     Collapse the finite-length message list to a single message (the environmental message).
       Immediate feedback
       Simple payoff
75. Components
     Simple classifier system data structures, adapted to the learning strategies.
     Performance system: the heart of the SCS; matching procedures are the heart of the performance system.
     Apportionment of credit procedures: auction, clearinghouse, tax collector.
     Genetic search within the simple classifier system: similar to the SGA.
     Learning the multiplexing system.
     Main procedure: reinforcement.
76. The six-bit multiplexing system
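The target function itself is small. The sketch below assumes the usual convention for the six-bit multiplexer, in which the first two bits form an address that selects one of the remaining four data bits.

```python
# Six-bit multiplexer, assuming the common convention: two address bits
# select which of the four data bits is passed through as the output.

def multiplexer6(bits: str) -> int:
    assert len(bits) == 6 and set(bits) <= {"0", "1"}
    address = int(bits[:2], 2)     # first two bits, read as a binary number
    return int(bits[2 + address])  # output the addressed data bit

# Address 00 selects data bit 0; address 11 selects data bit 3.
assert multiplexer6("001000") == 1
assert multiplexer6("110001") == 1
```

A classifier system learning this function sees the six input bits as the environmental message and is paid off when its effector output equals the addressed data bit.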
77. Results using the Simple Classifier System: without GA vs. with GA
78. Conclusion
    The following were discussed:
     Machine learning
     The role of the GA in machine learning
     The evolution of GA concepts in machine learning
     Some applications of GBML
     The classifier system and its components:
       Rule and message system
       Apportionment of credit (the bucket brigade)
       Genetic algorithm
     A practical implementation in Pascal
     The increase in output efficiency when using the GA
79. References
     Nils J. Nilsson, "Introduction to Machine Learning", Robotics Laboratory, Department of Computer Science, Stanford University.
     David E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning", pp. 217–260.
80. Thank you