1. A NEAT Way for Evolving Echo State Networks
   Kyriakos C. Chatzidimitriou, Pericles A. Mitkas
   Intelligent Systems and Software Engineering Labgroup, Electrical and Computer Eng. Dept., Aristotle University of Thessaloniki
   Informatics and Telematics Institute, Centre for Research and Technology-Hellas
   Thessaloniki, Greece
2. The problem
   - Engineer fully autonomous, intelligent agents
   - Model the task as a reinforcement learning problem
   - This requires good function approximators (FAs)
   - Plus properties like:
     - Non-linearity
     - Non-Markovian dynamics
     - Generalization
3. Adaptive function approximators
   - Problems remain:
     - Q: Which FA to choose?
     - A: Something powerful and suitable (good for the user)
     - Q: How to adjust its parameters?
       - Neural nets: number of neurons? Topology? Weights?
     - A: Adaptive function approximators (good for autonomy)
   - FAs built automatically, ad hoc, per problem/environment
   - How? Through the synthesis of learning and evolution
4. Our proposal
   - An adaptive FA methodology
     - Built bottom-up, combining powerful ideas and algorithms from the research literature into a single methodology
     - Each ingredient fills a different gap, developing into something complete
     - Design goal: cover as many aspects as possible
5. The Ingredients
   - 1 ESN (Echo State Network) [H. Jaeger]
   - 50% NEAT (NeuroEvolution of Augmenting Topologies) [K. Stanley]
   - 50% TD-learning
6. Basic Echo State Network
   - If the output units are linear: y(t) = w·u(t) + w′·x(t)
   - A linear function of (a) linear and (b) non-linear, temporal features
   - Reservoir properties:
     - Large number of features
     - Sparse connectivity
     - Weights with mean around 0
     - Spectral radius less than 1
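To make the picture concrete, here is a minimal ESN sketch in Python (illustrative names and sizes, not the paper's code; the 50 neurons, 10% density, and radius 0.9 are assumptions) that builds a reservoir with the listed properties and applies the linear readout y(t) = w·u(t) + w′·x(t):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 2, 50

# Input weights: random in [-1, 1]
W_in = rng.uniform(-1.0, 1.0, (n_res, n_in))

# Reservoir: sparse, zero-mean, spectral radius < 1
W_res = rng.uniform(-1.0, 1.0, (n_res, n_res))
W_res[rng.random((n_res, n_res)) > 0.1] = 0.0      # keep ~10% of connections
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # rescale to radius 0.9

# Linear readout: y(t) = w . u(t) + w' . x(t)
w = rng.uniform(-1.0, 1.0, n_in)
w_prime = np.zeros(n_res)

def esn_step(x, u):
    """Update the reservoir state, then apply the linear readout."""
    x = np.tanh(W_in @ u + W_res @ x)
    return x, w @ u + w_prime @ x

x = np.zeros(n_res)
x, y = esn_step(x, np.array([0.5, -0.3]))
```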
7. NEAT – Basic Principles
   - Start minimally and complexify
   - Weight and structural mutation
   - Speciation through clustering, to protect innovation
   - Crossover of networks through historical markings on connections
   - We adapt NEAT's principles and use it as a meta-search algorithm for ESNs
   [Figure: example network topologies with numbered nodes]
8. Evolution and learning
   - Combine global and local search
   - Evolution helps learning "avoid traps"
   - Learning helps evolution "find better locations" nearby
   - ESNs allow us to do that easily:
     - Linear readouts, so one can use all the classic linear RL/SL learning algorithms (TD, LS, LSTD, etc.)
     - Only the evolutionary part needs adjusting – adapt NEAT
9. Initialization
   - Start minimally, with 1 reservoir neuron
     - XOR problem
   - Input weights: random in [-1, 1]
   - Output weights: 0
   - Reservoir weights:
     - In [-1, 1]
     - With a given density
     - Mean(W_res) = 0
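A sketch of this initialization, assuming a simple dictionary-based genome (the field names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng()

def init_genome(n_in):
    """Minimal genome: one reservoir neuron, as described on the slide."""
    return {
        "W_in": rng.uniform(-1.0, 1.0, (1, n_in)),  # input weights in [-1, 1]
        "W_out": np.zeros(n_in + 1),                # readout over [u, x] starts at 0
        "W_res": np.zeros((1, 1)),                  # single neuron; connections
                                                    # appear later through mutation
        "node_ids": [1],                            # historical marking of node 1
    }

genome = init_genome(n_in=2)
```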
10. Mutation – Add node
   - A node is added:
     - Adds a new feature; the genome grows
     - The node receives a historical marking
     - All of its reservoir connections start disabled
       - They are enabled later through link mutation or crossover
   [Figure: reservoir of nodes 1, 3, 2 with new node 4 added]
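A sketch of the add-node mutation on the genome above (again with our own illustrative field names): the new node gets a fresh historical marking, and its reservoir connections start at zero, i.e. disabled:

```python
import numpy as np

def mutate_add_node(genome, rng, next_node_id):
    """Grow the reservoir by one node; its connections start disabled."""
    n = genome["W_res"].shape[0]
    W = np.zeros((n + 1, n + 1))
    W[:n, :n] = genome["W_res"]                  # keep existing connections
    genome["W_res"] = W                          # new row/column stay zero
    n_in = genome["W_in"].shape[1]
    genome["W_in"] = np.vstack([genome["W_in"],
                                rng.uniform(-1.0, 1.0, (1, n_in))])
    genome["W_out"] = np.append(genome["W_out"], 0.0)
    genome["node_ids"].append(next_node_id)      # historical marking
    return genome
```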
11. More Mutations
   - Add/remove connections
   - Mutate weights:
     - Restart
     - Perturb
   - Weights are added/changed so as to keep Mean(W_res) = 0
   - Mutate density / spectral radius:
     - Restart
     - Perturb
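The weight mutations could look like the following sketch (probabilities and step sizes are illustrative assumptions): each enabled reservoir weight is either restarted in [-1, 1] or perturbed, and the spectral radius is adjusted by rescaling the whole matrix:

```python
import numpy as np

def mutate_weights(W_res, rng, p_restart=0.2, sigma=0.1):
    """Restart or perturb the enabled (non-zero) reservoir weights."""
    enabled = W_res != 0
    restart = rng.random(W_res.shape) < p_restart
    W = np.where(enabled & restart, rng.uniform(-1.0, 1.0, W_res.shape), W_res)
    W = np.where(enabled & ~restart, W + rng.normal(0.0, sigma, W_res.shape), W)
    return W

def mutate_spectral_radius(W_res, rho_new):
    """Rescale the reservoir matrix toward a new spectral radius."""
    rho = max(abs(np.linalg.eigvals(W_res)))
    return W_res * (rho_new / rho) if rho > 0 else W_res
```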
12. Crossover
   [Figure: two parent genomes, with reservoir nodes 1, 3, 2, 4, 5 and 1, 3, 2]
   Let's assume the smallest genome is also the fittest.
13. Alignment
   [Figure: the parents' genes aligned by historical markings; nodes 1, 3, 2 match, nodes 4 and 5 are unmatched]
14. Fittest
   [Figure: offspring keeping the structure of the fittest parent: nodes 1, 3, 2]
15. Alignment
   [Figure: the same alignment by historical markings]
16. Largest
   [Figure: offspring keeping the structure of the largest parent: nodes 1, 3, 2, 4, 5]
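A sketch of the alignment step in plain Python (hypothetical gene dictionaries keyed by historical marking, not the paper's data structures): matching genes can be inherited from either parent, while unmatched genes follow the parent whose structure the offspring keeps, i.e. the fittest in slide 14, the largest in slide 16:

```python
import random

def crossover(parent_a, parent_b, structure_parent):
    """Align genes by historical marking; keep structure_parent's topology."""
    child = {}
    for marking, gene in structure_parent.items():
        if marking in parent_a and marking in parent_b:   # matching gene
            child[marking] = random.choice([parent_a[marking],
                                            parent_b[marking]])
        else:                                             # unmatched gene
            child[marking] = gene
    return child

# Parents from the figures: nodes {1, 3, 2, 4, 5} and {1, 3, 2}
a = {1: "a1", 3: "a3", 2: "a2", 4: "a4", 5: "a5"}
b = {1: "b1", 3: "b3", 2: "b2"}

fittest_child = crossover(a, b, structure_parent=b)   # nodes 1, 3, 2
largest_child = crossover(a, b, structure_parent=a)   # nodes 1, 3, 2, 4, 5
```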
17. Speciation
   - ESNs are sparse by design
   - Structural similarity on connections, as in NEAT, would eliminate the notion of speciation
   - Instead, similarity is measured on the basic macroscopic ESN properties:
     - Spectral radius, density, number of nodes
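A compatibility distance along these lines might look as follows (the coefficients and threshold are illustrative, not taken from the paper):

```python
def compatibility(esn_a, esn_b, c_rho=1.0, c_den=1.0, c_nodes=0.1):
    """Distance over the macroscopic ESN properties named on the slide."""
    return (c_rho * abs(esn_a["rho"] - esn_b["rho"])
            + c_den * abs(esn_a["density"] - esn_b["density"])
            + c_nodes * abs(esn_a["n_nodes"] - esn_b["n_nodes"]))

# Two networks join the same species when the distance is below a threshold:
same_species = compatibility({"rho": 0.9, "density": 0.1, "n_nodes": 10},
                             {"rho": 0.8, "density": 0.2, "n_nodes": 12}) < 0.5
```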
18. Learning
   - Simple gradient-descent TD-learning for RL
   - Least squares for time series, where online updating is not required
   - Many methods could be used here (either under TD-learning or doing policy search with EC)
   - We also tested Darwinian versus Lamarckian evolution
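Since the readout is linear in the reservoir features, the classic linear TD(0) update applies directly; a sketch (step size and discount are illustrative):

```python
import numpy as np

def td0_update(w_out, phi, reward, phi_next, alpha=0.01, gamma=0.99):
    """One gradient-descent TD(0) step on the linear readout weights.

    phi, phi_next: feature vectors, e.g. the concatenation [u(t), x(t)].
    """
    delta = reward + gamma * (w_out @ phi_next) - (w_out @ phi)  # TD error
    return w_out + alpha * delta * phi

# For time series, a batch least-squares readout suffices instead:
#   W_out = np.linalg.lstsq(X_states, y_targets, rcond=None)[0]
```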
19. Basic Flow
   Init Pop → Simulation → Learning → Fitness → Speciation → Selection → Mutation → Crossover → Next Gen → Champion → Generalization Performance
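In code terms, the flow amounts to a standard generational loop; this structural sketch uses hypothetical helper names standing in for the boxes of the diagram (none of them are defined here):

```python
def evolve(pop_size, n_generations):
    population = [init_genome(n_in=2) for _ in range(pop_size)]
    for _ in range(n_generations):
        for genome in population:
            simulate(genome)                      # Simulation
            learn(genome)                         # Learning (TD / least squares)
            genome["fitness"] = evaluate(genome)  # Fitness
        species = speciate(population)            # Speciation
        parents = select(species)                 # Selection
        population = [mutate(crossover_pair(p))   # Crossover + Mutation
                      for p in parents]           # -> Next Gen
    champion = max(population, key=lambda g: g["fitness"])
    return generalization_performance(champion)   # Generalization Performance
```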
20. Experiments
   - Reinforcement learning:
     - Mountain Car
     - Single & double pole balancing
   - Time series:
     - Mackey-Glass
21. Time Series – Mackey-Glass
   - Better prediction errors than another recent TWEANN-on-ESN algorithm
     - By one order of magnitude, for both test and generalization errors
   - The main differences are that we start minimally and use crossover and speciation
22. Mountain Car
   - Same behavior as the NEAT+Q algorithm
     - NEAT+Q = "NEAT" + "Q-learning through back-propagation" – "recurrences"
   - Same generalization behavior (around -50)
   - With learning, our approach also solves the non-Markovian 2D and 3D car problems (only a position signal, not a speed signal, is available)
23. Pole Balancing
   - Performance comparable to NEAT with respect to networks evaluated
   - Our approach takes more time due to the learning procedure
   - A first signal that a more advanced learning algorithm than simple GD should be accommodated
24. Vs.
   - Simple ESN:
     - Problems, probably due to online learning (online learning, RL and NNs are not a good triplet)
     - Not a problem for our approach, since neuroevolution finds ESNs "that are better able to learn"
   - Linear TD-learning (no reservoir):
     - No reservoir means no non-Markovian signals, and worse behavior
   - No learning, only evolution:
     - No clear conclusions, but it holds up against simple GD TD-learning (a second signal about simple GD)
25. Experiments
   - Reinforcement learning:
     - Mountain Car
     - Single & double pole balancing
     - 3D Mountain Car
     - Server job scheduling [+++]: 15% improvement over NEAT+Q
   - Time series:
     - Mackey-Glass [+]
     - MSO [-]
     - Lorenz attractor [+]
26. Future Work
   - Even more automation, driven by the problem at hand
     - For example, adapting operator probabilities online
   - Test new RL/TD learning techniques
     - e.g., iLSTD, GQ
   - More difficult test-beds:
     - TAC Ad Auctions
     - Poker
     - Real-time strategy games
27. Thank you for your attention – Questions?
   Kyriakos Chatzidimitriou
   [email_address]
   http://issel.ee.auth.gr