Academic Course: 10 On-line adaptation, learning, evolution

By Gusz Eiben & Mark Hoogendoorn

1. On-line adaptation, learning, evolution
   Designed by Gusz Eiben & Mark Hoogendoorn
2. Outline
   • Population-based Adaptive Systems
   • Types of adaptation: evolution, individual (lifetime) learning, social learning
   • Machine learning
   • Reinforcement learning
   • Off-line vs. on-line adaptation
3. Population-based Adaptive Systems
   PAS have two essential features:
   • They consist of a group of basic units that can perform actions, e.g., computation, communication, interaction, etc.
   • They can adapt at the
     – individual level (modify an agent) and/or
     – group level (add/remove an agent).
4. Types of adaptation
   • Evolutionary learning (EL): changes at the population level (assumed non-Lamarckian)
   • Lifetime learning (LL): changes at the agent level
     – Individual learning (IL): the agent adapts autonomously through a purely internal procedure
     – Social learning (SL): the agent adapts through interaction/communication
5. Taxonomy of adaptation
   Adaptation
   • Evolutionary Learning
   • Lifetime Learning
     – Individual Learning
     – Social Learning
6. Taxonomy of adaptation 2
   Adaptation
   • Evolutionary Learning (labelled "Evolution")
   • Lifetime Learning (labelled "Learning")
     – Individual Learning
     – Social Learning
7. Adaptation ≠ operation
   • Operation: the controller is being used
     – Sensory inputs → outputs (motor, communication device)
     – The robot's behavior changes, not the controller
   • Adaptation: the controller is being changed
     – Present controller → new controller
     – Uses utility/reward/fitness information
     – May require
       • one single robot – learning
       • more robots – evolution, social learning
   • Adaptation + operation = generate + test
   • Off-line adaptation (initial controller design, before the start) vs. on-line adaptation (after the start)
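The "adaptation + operation = generate + test" idea on slide 7 can be sketched as a loop. This is an illustrative sketch only: the list-of-weights controller, the `mutate` step, and the `evaluate` function are hypothetical stand-ins for a real robot trial.

```python
import random

def evaluate(controller):
    """Hypothetical utility measure standing in for a robot trial:
    higher is better, with the maximum at all weights = 0.5."""
    return -sum((w - 0.5) ** 2 for w in controller)

def mutate(controller, sigma=0.1):
    """Generate: perturb the current controller to get a candidate."""
    return [w + random.gauss(0.0, sigma) for w in controller]

# On-line generate + test: adaptation runs while the robot operates.
controller = [random.random() for _ in range(4)]
best_score = evaluate(controller)            # test the current controller
for step in range(1000):
    candidate = mutate(controller)           # generate a variant
    score = evaluate(candidate)              # test it (operation yields reward)
    if score > best_score:                   # keep only improvements
        controller, best_score = candidate, score
```

The same skeleton covers both loops on the later slides: replace `mutate` with genetic operators over a population and you get evolution; keep one controller and reward-driven updates and you get lifetime learning.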
8.–11. [Diagram, built up over four slides: Genotype → Developmental Engine (decoder) → Phenotype = controller (in slides 10–11 the phenotype comprises both controller and shape); genetic operators (mutation & crossover) act on the genotype, learning operators act on the phenotype; robot behavior changes the state of the environment, which yields reward and fitness; selection operators use the fitness]
12. Evolutionary loop
    [Same diagram with the evolutionary loop highlighted: fitness drives the selection operator(s); genetic operators (mutation & crossover) create new genotypes, which the developmental engine decodes into new controllers (phenotypes)]
13. Learning loop
    [Same diagram with the learning loop highlighted: reward from robot behavior drives the learning operator(s), which modify the controller (phenotype) directly; the genotype is not changed]
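The two loops differ in what they modify: the evolutionary loop selects and varies genotypes across a population, while the learning loop adjusts a single controller from reward during its lifetime. A minimal sketch of this contrast, with an invented fitness function and toy parameters (none of the names below come from the slides):

```python
import random

random.seed(42)  # reproducibility for this sketch

def fitness(genotype):
    """Hypothetical fitness: higher when all weights are near 0.7."""
    return -sum((g - 0.7) ** 2 for g in genotype)

def evolutionary_loop(pop_size=20, generations=50):
    """Evolutionary loop: selection + variation over a population of genotypes."""
    population = [[random.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)            # selection
        parents = population[: pop_size // 2]
        children = [[g + random.gauss(0.0, 0.05) for g in random.choice(parents)]
                    for _ in range(pop_size - len(parents))]  # mutation
        population = parents + children
    return max(population, key=fitness)

def learning_loop(controller, steps=200):
    """Learning loop: one controller adjusted from reward during its lifetime."""
    for _ in range(steps):
        candidate = [w + random.gauss(0.0, 0.05) for w in controller]
        if fitness(candidate) > fitness(controller):          # accept if reward improves
            controller = candidate
    return controller

best_evolved = evolutionary_loop()
learned = learning_loop([0.0, 0.0, 0.0])
```

Note the structural difference: `evolutionary_loop` never touches an existing individual (non-Lamarckian, changes happen only between generations), while `learning_loop` repeatedly rewrites the phenotype of one agent.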
14. [Diagram: agent–environment interaction in reinforcement learning: the AGENT sends action a(t) to the ENVIRONMENT, which returns state s(t) and reward r(t)]
15. Reinforcement learning
    • The agent in situation/state s_t chooses action a_t
    • The world changes to situation/state s_{t+1}
    • The agent perceives situation s_{t+1} and gets reward r_{t+1}
    • The POLICY tells the agent what to do:
      π_t(s, a) = Pr{a_t = a | s_t = s}
      Given that the situation at time t is s, the policy gives the probability that the agent's action will be a.
      For example: π_t(s, go forward) = 0.5, π_t(s, go backward) = 0.5.
    • Reinforcement learning ⇒ get/find/learn the policy
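The policy on slide 15 can be represented as a table of action probabilities per state, and "learning the policy" then means improving that table from reward. Below is a sketch of the slide's example policy plus a standard tabular Q-learning update; the single-state toy world and its reward of 1 for going forward are invented for illustration.

```python
import random
from collections import defaultdict

random.seed(0)  # reproducibility for this sketch

# The slide's example policy: pi_t(s, forward) = 0.5, pi_t(s, backward) = 0.5
policy = {"s": {"forward": 0.5, "backward": 0.5}}

def sample_action(state):
    """Draw action a with probability pi(state, a)."""
    actions, probs = zip(*policy[state].items())
    return random.choices(actions, weights=probs)[0]

# "Get/find/learn the policy": standard tabular Q-learning update
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
Q = defaultdict(float)
alpha, gamma = 0.1, 0.9
ACTIONS = ("forward", "backward")

def q_update(s, a, r, s_next):
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Toy single-state world (invented): forward earns reward 1, backward earns 0.
for _ in range(200):
    a = sample_action("s")
    r = 1.0 if a == "forward" else 0.0
    q_update("s", a, r, "s")
```

After the episode, Q(s, forward) exceeds Q(s, backward), so a greedy policy derived from Q would always go forward; turning such value estimates into a better policy is exactly what the slide means by learning the policy.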
16. Further reading
    • Evert Haasdijk, A.E. Eiben, and Alan F.T. Winfield, "Individual, Social and Evolutionary Adaptation in Collective Systems", in: Serge Kernbach (ed.), Handbook of Collective Robotics, Pan Stanford, 2011.
