Demolition Derby 2011 at GECCO

This presentation was given at GECCO 2011 on the "Demolition Derby" competition. In the competition, separate programs control the cars in a racing game, each receiving only local sensor information. The goal is to cause as much damage as possible to the other cars while avoiding damage to one's own car.

The slides briefly introduce the participants' controllers and then present the results and the winner.

See also http://www.youtube.com/watch?v=YZ1Pi7V83Ao

Demolition Derby 2011 at GECCO

  1. Martin V. Butz
     Department of Psychology III
     University of Würzburg
     Röntgenring 11, 97070 Würzburg, Germany
     butz@psychologie.uni-wuerzburg.de
     GECCO 2011 Competitions
  2. Demolition Derby
     http://www.coboslab.psychologie.uni-wuerzburg.de/competitions
     Martin V. Butz, University of Würzburg, Germany
     Matthias J. Linhardt, University of Würzburg, Germany
     Daniele Loiacono, Politecnico di Milano, Italy
     Luigi Cardamone, Politecnico di Milano, Italy
     Pier Luca Lanzi, Politecnico di Milano, Italy
  3. Demolition Derby: Purpose
     Optimize opponent interactions:
       • Avoid being hit; run away when necessary
       • Try to hit others at the right moment
     Enables (co-)optimization of interaction behavior:
       • Fitness may be based on damage caused to other cars.
       • Co-development of two or more competitors is possible (possibly with different approaches).
  4. Goal & Setup
     Goal: Wreck all opponent cars by crashing into them without getting wrecked yourself.
     Setup: Local sensor information as in the Simulated Car Racing Competition.
     Modifications:
       • Sensors:
           Same simulated sensors, but without noise.
           The range of the 36 opponent sensors has been increased to 300 m.
       • Damage model:
           Cars do not take any damage when colliding with walls.
           Cars do not take any damage in the front when colliding with each other.
           Cars only take damage when their rear is hit by another car.
  5. Winner Determination
     Arena: Large circular track (surface: asphalt; length: 640 m; width: 90 m)
     Qualifying:
       • 1-vs-1 matches evaluating all against all (winner = 1 point)
       • The eight best controllers qualify for the final showdown.
     Final match:
       • The best eight controllers fight each other.
       • The last car standing in the final match is declared the winner.
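
The qualifying stage above amounts to a round-robin tournament with one point per win. The following is a minimal Python sketch of that scoring, not the organizers' code. It assumes a hypothetical `play_match(a, b)` callback that runs one 1-vs-1 match and returns the winner's name. The slides do not say how equal point totals are resolved, so the sketch simply keeps entry order for ties.

```python
from itertools import combinations


def qualify(controllers, play_match, finalists=8):
    """Score an all-against-all qualifying round and pick the finalists.

    controllers: list of controller names.
    play_match:  hypothetical callback, play_match(a, b) -> name of the winner.
    finalists:   number of controllers that advance (eight in the slides).
    """
    points = {name: 0 for name in controllers}
    # Every pair of controllers meets exactly once; the winner earns 1 point.
    for a, b in combinations(controllers, 2):
        points[play_match(a, b)] += 1
    # sorted() is stable, so tied controllers keep their entry order
    # (an assumption; the slides do not define a tie-break rule).
    ranking = sorted(controllers, key=lambda name: points[name], reverse=True)
    return ranking[:finalists], points
```

With only three distinct controllers entered, as in this competition, every entry would qualify under this rule and the qualifying points would serve only as a ranking.
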
  6. Additional Goodies for a Quick Start
     Basic controller clients for Java and C++ that are easy to extend with additional functionality.
     COBOSTAR client in Java:
       • With an opponent monitor that tracks opponents over time.
       • With a simple crashing strategy that targets the closest car in range.
     Evolvable client setup that
       • receives a signal for the damage it has caused,
       • applies CMA evolution strategy-based optimization,
       • runs continuously with or (even faster) without visualization for as many generations as desired.
  7. Entries
     Thies Lönneker
     Dep. of Computer Science
     University of Würzburg
     GERMANY
     Controller: DemoStar

     Zygmunt Horodyski
     Piłsudskiego 39/1
     66-530 Drezdenko
     POLAND
     Controller: Spartiat
  8. Entry Information
     Thies Lönneker
     Dep. of Computer Science
     University of Würzburg
     GERMANY
     Controller: DemoStar
  9. DemoStar: Orientation
     Task:
       • Get an idea of the car's position
     Approach 1:
       • gather local information
       • sensors: track edge sensors, focus sensors
       • result: 1st-person perspective
       • works in unknown environments
     Approach 2 (DemoStar):
       • gather global information
       • sensors: distance from start line, track position
       • result: 3rd-person perspective
       • requires known track geometry
  10. DemoStar: Opponents
      Task:
        • Locate opponents
      Opponent sensors provide information about the sector and the distance of an opponent from a 1st-person point of view.
        • Accuracy is restricted by the sector's central angle
        • Sector transitions lead to the most exact sensor readings
  11. DemoStar: Opponent Tracking
      Task:
        • Get more information about the opponents
        • Record the last opponent sensor readings
        • Remember information about sector transitions
        • Get the opponent's moving direction from the last sector transitions
        • Extrapolate the opponent's current position
      In case of multiple opponents:
        • Map sensor readings to extrapolated positions
  12. DemoStar: Agility
      Task:
        • Find the best way to quickly head for an opponent
        • Parameter optimization for best steering efficiency
        • Method: Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
        • The resulting parameter set makes use of drift effects at the limit of the car's controllability
  13. DemoStar: Recovery
      Task:
        • Escape stuck situations
      Stuck face-to-face:
        • Indications: opponent close to the car, combined with involuntary negative velocity
        • Resolution: go into reverse with maximum steering
      Stuck at a wall:
        • Indications: wall close to the car, combined with involuntary zero velocity
        • Resolution: go into reverse for a short time
  14. Entry Information
      Zygmunt Horodyski
      Piłsudskiego 39/1
      66-530 Drezdenko
      POLAND
      Controller: Spartiat
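
Returning to DemoStar's opponent tracking: the idea of taking exact bearings at sector transitions and extrapolating from them can be sketched in Python. This is an illustration, not DemoStar's code. It assumes that the 36 opponent sensors tile the full circle in equal 10° sectors starting at -180°, that positions are expressed in a fixed frame (the motion of the own car is ignored), and that the opponent moves in a straight line at constant speed. All function names are hypothetical.

```python
import math

NUM_SECTORS = 36                           # number of opponent sensors (from the slides)
SECTOR_ANGLE = 2 * math.pi / NUM_SECTORS   # assumption: equal sectors covering 360 degrees


def boundary_bearing(prev_sector, new_sector):
    """Bearing (radians) of the edge an opponent crossed between two adjacent sectors.

    Assumes sector k spans [-pi + k * SECTOR_ANGLE, -pi + (k + 1) * SECTOR_ANGLE).
    At the moment of a transition the opponent sits on the shared edge, so the
    bearing is known exactly instead of only to within one sector.
    """
    if (new_sector - prev_sector) % NUM_SECTORS == 1:
        edge = new_sector        # moved to the next sector: crossed its lower edge
    elif (prev_sector - new_sector) % NUM_SECTORS == 1:
        edge = prev_sector       # moved to the previous sector: crossed our lower edge
    else:
        raise ValueError("sectors are not adjacent")
    return -math.pi + edge * SECTOR_ANGLE


def to_xy(bearing, distance):
    """Convert a polar reading (bearing, distance) to Cartesian coordinates."""
    return (distance * math.cos(bearing), distance * math.sin(bearing))


def extrapolate(t0, p0, t1, p1, t_now):
    """Linearly extrapolate a position from two timed fixes (t0, p0) and (t1, p1)."""
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("fixes must be ordered in time")
    vx = (p1[0] - p0[0]) / dt
    vy = (p1[1] - p0[1]) / dt
    lead = t_now - t1
    return (p1[0] + vx * lead, p1[1] + vy * lead)
```

For multiple opponents, a new sensor reading could then be assigned to whichever extrapolated position lies closest to it, which is one way to read the "map sensor readings to extrapolated positions" step.
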
  15. Overview of how the controller works
      [Diagram: the main class collects data from the SensorModel (SM) and sends an Action. The Algorithm updates the Model (M); the model is updated every 1 min (the Algorithm runs every 1 min). The Controller selects one of the behaviors, and the Behavior computes the Action (GetAction).]
      Zygmunt Horodyski
  16. Model
      The model is a group of elements that determine how the controller moves on the track.
      The model is also an individual solution selected by the algorithm.
      Components of the model are:
        • Maximum speed / speed when we turn back [depends on strategy]
        • [A] Average distance from the edge when we move back [outer edge]
        • [B] Average distance from the edge when we move back [inner edge]
        • [C] Average distance from the edge when we slow down
        • [D] Distance from the edge when we slow down
        • [E] Distance from the edge when we use the brake
        • Time after which the controller will act as if it is stuck
        • Strategy
      [Diagram: cars near the track edges, annotated with the parameters "D, E", "D", "B, C", and "A, C"; legend: track edge, sensor, car]
  17. Algorithm
      The algorithm is inspired by observations from "life", and a solution is a kind of model. In reality, we do not think about every single move. From time to time we have to update the factors that determine our behavior; in other words, we change our model.
      The selection is my interpretation of tournament selection. The winner of each tournament (10 in total) has the possibility to crossbreed with the best solution from the previous population. In addition, the two worst solutions from each tournament are crossed with each other (to eliminate useless solutions).
      Each solution has its own probability of mutation based on its rating. The probability of mutation is checked for each parameter of the solution, and a Gaussian-distributed value is added to the parameter if applicable.
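
The selection, crossover, and mutation steps described in the Algorithm slide can be sketched as follows. This is a hedged illustration in Python, not the Spartiat source. The slide does not specify the crossover operator, how the rating maps to a mutation probability, or the mutation step size, so the sketch assumes uniform crossover, a mutation probability of `1 - rating` with ratings in [0, 1], and a hypothetical `sigma` of 0.1.

```python
import random


def crossover(a, b, rng):
    """Uniform crossover (assumption): each parameter comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(solution, probability, rng, sigma=0.1):
    """Check the mutation probability per parameter; add Gaussian noise on a hit."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < probability else x
            for x in solution]


def next_generation(population, ratings, prev_best, rng, tournaments=10):
    """Build children from `tournaments` random groups of the population.

    population: list of parameter lists; ratings: matching scores in [0, 1];
    prev_best:  best solution of the previous population.
    """
    indices = list(range(len(population)))
    rng.shuffle(indices)
    size = len(indices) // tournaments
    if size < 2:
        raise ValueError("need at least two solutions per tournament")
    children = []
    for t in range(tournaments):
        group = sorted(indices[t * size:(t + 1) * size],
                       key=lambda i: ratings[i], reverse=True)
        winner, (a, b) = group[0], group[-2:]
        # The tournament winner crossbreeds with the previous best solution.
        child = crossover(population[winner], prev_best, rng)
        children.append(mutate(child, 1.0 - ratings[winner], rng))
        # The two worst solutions of the tournament are crossed with each other.
        child = crossover(population[a], population[b], rng)
        children.append(mutate(child, 1.0 - min(ratings[a], ratings[b]), rng))
    return children
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible, which helps when comparing parameter sets between generations.
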
  18. Evaluation function
      The evaluation function is also related to life. In many areas of life, especially in cars and racing, we know the best way, the best solution. But we do not always follow that best solution, even though we can calculate it. And that makes us unpredictable.
      In my evaluation function, we first check which of the strategies suits the situation best. The selection is based on the time remaining, the current damage (ours and the others'), the current fuel level, and other factors.
      Two different strategies determine two different best solutions. Each judged solution is compared to the best one and rated. The perfect score is granted to solutions that differ from the best solution by 10-20%.
  19. Controller
      The controller is the part of the program that, based on the model and the current state of the environment, sets the action of our car.
      There are two different controllers, each specialized for a different task: one that avoids damage and saves fuel (Dumbass) and one that tries to cause damage (Pro). Having two strategies protects us from programs that learn their opponent's behavior.
      First we select one behavior, then we calculate each Action parameter (based on the selected behavior), and then we return the Action to the main class.
      [Diagram: excerpt of the controller overview showing SensorModel, Model (M), Behavior, and Action, with the steps "Collect data", "Update", "Select one of behaviors", and "Send action"]
  20. Behavior
      The selection of a behavior depends on the current model and environment. Five behaviors can be selected in both strategies, Dumbass (sd) and Pro (sp); two more can be selected in Pro only:
        • Search (sd+sp): The controller will move around the map searching for the opponent.
        • [C] Charge (sd+sp): If we see an opponent and it is possible, the controller will attack.
        • [M] MoveBack (sd+sp): If there is no way we can move, the controller will move back.
        • [B] Brake (sd+sp): If it is necessary, the controller will use the brake to stop the car.
        • [T] TurnBack (sd+sp): If there is an opponent behind us and it is possible, the controller will turn back.
        • [S] Strike (sp): If an opponent is directly in front of us, the controller will attack.
        • [E] Evade (sp): If the controller is near an edge, it will evade it.
      Examples are on the next slide.
      (How the behavior is chosen can be checked in the source code, in DD_D_Pro and DD_D_Dumbass.)
  21. Behavior examples
      [Diagrams illustrating the behaviors M, B, C, E, S, and T; legend: track edge, car, enemy]
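
One way to organize the behavior selection described in the Behavior slide is a fixed priority list. The Python sketch below is purely illustrative: the actual rules live in the DD_D_Pro and DD_D_Dumbass sources, which are not reproduced in the slides. The `Situation` fields and the priority order are assumptions; only the behavior names and the split between shared (sd+sp) and Pro-only (sp) behaviors come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical summary of the model and environment for one time step."""
    stuck: bool = False              # no way to move forward
    must_stop: bool = False          # braking is necessary
    near_edge: bool = False          # close to a track edge
    opponent_in_front: bool = False  # opponent directly ahead
    opponent_behind: bool = False    # opponent behind us, turning is possible
    opponent_visible: bool = False   # opponent seen, attack is possible


def select_behavior(s, strategy):
    """Pick one behavior; `strategy` is "sd" (Dumbass) or "sp" (Pro)."""
    pro = strategy == "sp"
    # Assumed priority: safety behaviors first, then attacks, then searching.
    if s.stuck:
        return "MoveBack"
    if s.must_stop:
        return "Brake"
    if pro and s.near_edge:
        return "Evade"       # Pro only
    if pro and s.opponent_in_front:
        return "Strike"      # Pro only
    if s.opponent_behind:
        return "TurnBack"
    if s.opponent_visible:
        return "Charge"
    return "Search"
```

For example, `select_behavior(Situation(opponent_in_front=True, opponent_visible=True), "sd")` falls through to `"Charge"` because Strike is not available in the Dumbass strategy.
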
  22. Results
        • 8 cars (2x DemoStar, 2x Spartiat, 4x Base-Client)
        • 3 cars (DemoStar, Spartiat, Base-Client)
        • 1 vs. 1
  23. Results with 8 Cars (2x DemoStar, 2x Spartiat, 4x Base-Client)
  24. Results with Three Cars (DemoStar, Spartiat, Base-Client)
      DemoStar wins; Spartiat performs similarly to Base-Client.
  25. Results 1 vs. 1 (DemoStar, Spartiat, Base-Client)
      DemoStar wins; Spartiat performs better than Base-Client.
  26. Thies Lönneker wins with DemoStar
      Check out the competition video online:
      http://www.youtube.com/watch?v=YZ1Pi7V83Ao
