Optimizing avoidance is as important as optimizing hitting, and the two can be co-optimized. Various optimization methods can be applied, for example policy-gradient algorithms or CMA-ES. Strategy components can be optimized individually or in parallel.
Last year: Neither competitor considered opponent avoidance; only chasing and crashing were optimized. The outcomes of matches with 8 competitors were therefore rather unclear.
Institute of Computer Science
Chair of Cognitive Modeling
Demolition Derby 2012
Based on TORCS: The Open Racing Car Simulator
07/2012, Martin V. Butz
Demolition Derby 2012
http://cm.inf.uni-tuebingen.de/competitions
• Organized by:
 - Martin V. Butz, University of Tübingen, Germany
• Supported by:
 - Andreas Alin, University of Tübingen, Germany
 - Dennis Schwartz, University of Tübingen, Germany
• And with previous help by:
 - Matthias J. Linhardt, University of Bamberg, Germany
 - Daniele Loiacono, Politecnico di Milano, Italy
 - Luigi Cardamone, Politecnico di Milano, Italy
 - Pier Luca Lanzi, Politecnico di Milano, Italy
Demolition Derby: Purpose
• Optimize opponent interactions
 - Avoid being hit – run away when necessary.
 - Try to hit others at the right moment.
• Enables (co-)optimization of interaction behavior
 - Fitness may be based on damage caused to other cars.
 - Co-development of two or more competitors is possible (possibly with different approaches).
 - Policy-gradient-based optimization is possible.
• Various strategy components are relevant
 - Avoidance optimization
 - Chasing optimization
 - Forwards & backwards steering control
 - Opponent monitoring
 - Meta-strategies
Goal & Setup
• Goal: Wreck all opponent cars by crashing into them without getting wrecked yourself.
• Setup: Local sensor information as in the Simulated Car Racing Competition.
• Sensors:
 - Simulated distance sensors (noiseless)
   36 surrounding opponent sensors with a range of 300 m.
   19 track sensors with a range of 200 m.
 - Other sensors
   Current damage of own car.
   Damage produced on other cars.
   Status of car (speed, wheels, gear…).
   Relative position on track.
• Damage model:
 - Cars do not take any damage when colliding with walls.
 - Cars do not take any damage in the front when colliding with each other.
 - Cars only take damage when their rear is hit by another car.
Winner Determination
• Arena: Large circular track (surface: asphalt; length: 640 m; width: 90 m)
• Qualifying
 - 1-vs-1 matches, all against all (the car with less damage wins the match and earns 1 point).
 - The eight best controllers qualify for the final showdown.
• Final demolition derby matches:
 - The best eight controllers fight each other.
 - Ten matches are played.
 - The car that wins most often wins the competition.
 - Alternative scoring with rank-based points is also considered.
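The qualifying procedure on this slide can be sketched in a few lines. This is a minimal illustration, not the competition software: `qualify` and the `play_match` callback are hypothetical names, and awarding no point on equal damage is an assumption, since the slide does not say how ties are handled.

```python
from itertools import combinations


def qualify(controllers, play_match, n_finalists=8):
    """Round-robin qualifying: every pair meets once; less damage earns 1 point.

    play_match(a, b) is a hypothetical callback returning the damage taken
    by a and by b in one 1-vs-1 match, as a pair (damage_a, damage_b).
    """
    points = {name: 0 for name in controllers}
    for a, b in combinations(controllers, 2):
        damage_a, damage_b = play_match(a, b)
        if damage_a < damage_b:
            points[a] += 1
        elif damage_b < damage_a:
            points[b] += 1
        # Equal damage: nobody scores (assumption, not stated on the slide).
    # sorted() is stable, so controllers with equal points keep their input order.
    ranked = sorted(controllers, key=lambda name: points[name], reverse=True)
    return ranked[:n_finalists], points
```

With `n_finalists=8` the returned list corresponds to the field for the final showdown.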
Additional Goodies for a Quick Start
• Basic controller clients for Java and C++ that are easy to extend with additional functionality.
• COBOSTAR client in Java
 - With an opponent monitor that tracks opponents over time.
 - With a simple crashing strategy that targets the closest car in range.
• Evolvable client setup that
 - receives the caused-damage signal,
 - applies CMA evolution strategy-based optimization,
 - runs continuously with or (even faster) without visualization for as many generations as desired.
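The evolvable setup amounts to a loop that samples controller parameters, scores them by damage caused, and moves the search distribution toward the best samples. The sketch below shows that loop with a plain (mu, lambda) evolution strategy with a fixed step size. It is deliberately not CMA-ES (there is no covariance or step-size adaptation), and it is not the provided client: `evolve`, `evaluate`, and all default values are hypothetical.

```python
import random


def evolve(evaluate, n_params, generations=50, pop_size=12, n_parents=3,
           sigma=0.3, seed=0):
    """Maximize evaluate(params) with a simple (mu, lambda) evolution strategy.

    evaluate is a hypothetical fitness callback, e.g. the damage caused to
    other cars during one simulated match with the given parameters.
    """
    rng = random.Random(seed)
    mean = [0.0] * n_params
    best, best_fit = list(mean), evaluate(mean)
    for _ in range(generations):
        # Sample offspring around the current mean with isotropic Gaussian noise.
        population = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(pop_size)]
        scored = sorted(((evaluate(p), p) for p in population),
                        key=lambda fp: fp[0], reverse=True)
        # The new mean is the average of the fittest offspring.
        parents = [p for _, p in scored[:n_parents]]
        mean = [sum(values) / n_parents for values in zip(*parents)]
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
    return best, best_fit
```

In a real setup each call to `evaluate` would run a match, so running without visualization keeps the loop fast.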
Last Year's Entries
• Base Client
 - Dep. of Computer Science, University of Würzburg, Germany
• DemoStar
 - Thies Lönneker, Dep. of Computer Science, University of Würzburg, Germany
• Spartiat
 - Zygmunt Horodyski, Piłsudskiego 39/1, 66-530 Drezdenko, Poland
USM Rule-Based Agents
• Agent behaviors are determined using a rule-based approach.
• The entries here use no learning, but the approach is designed for EC learning.
• Conditions and actions are drawn from a discrete “vocabulary” of pre-designed options.
• Each rule is a condition-action vector.
• Conflict resolution: rule order is used here (though specificity is usually helpful).
Gagne, Knowlton, Tellier, and Congdon, GECCO
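The two conflict-resolution options named on this slide can be contrasted in a short sketch. The rule encoding, the function names, and the two example rules are hypothetical and chosen only to make the difference visible; they are not taken from the USM entries.

```python
from typing import Dict, List, Optional, Tuple

# A rule pairs a condition pattern with an action name. Conditions that are
# absent from the pattern are "don't care". This encoding is an assumption.
Rule = Tuple[Dict[str, bool], str]


def matches(pattern: Dict[str, bool], state: Dict[str, bool]) -> bool:
    return all(state[name] == wanted for name, wanted in pattern.items())


def first_match(rules: List[Rule], state: Dict[str, bool]) -> Optional[str]:
    """Conflict resolution by rule order: the earliest matching rule fires."""
    for pattern, action in rules:
        if matches(pattern, state):
            return action
    return None


def most_specific_match(rules: List[Rule], state: Dict[str, bool]) -> Optional[str]:
    """Alternative: among matching rules, fire the one testing the most conditions."""
    candidates = [(len(p), a) for p, a in rules if matches(p, state)]
    return max(candidates, key=lambda c: c[0])[1] if candidates else None


rules = [
    ({"edge": True}, "Get Clear"),
    ({"ahead": True, "edge": True}, "Ram"),
]
state = {"ahead": True, "edge": True}
print(first_match(rules, state))          # Get Clear
print(most_specific_match(rules, state))  # Ram
```

Both rules match the example state; rule order picks the first, specificity picks the one with two conditions.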
USM Rule-Based Agents
• Abstractions interface between game and rules.
• Low-level game sensors are abstracted to high-level rule conditions.
• High-level rule actions are translated to low-level game controls.
[Diagram: Game (low-level details: sensors, controls) ↔ Abstractions (inputs, outputs) ↔ Agent (high-level abstractions: conditions, actions)]
Conditions – Input Abstractions
Condition     TRUE when…
Ahead         An enemy is ahead and within 100 m
Close Ahead   An enemy is ahead and within 20 m
Behind        An enemy is behind and within 100 m
Advantage     Opponent has 2,000 more damage than agent
Edge          Agent is near the track edge
Wounded       Agent has more than 7,000 damage
No Change     Agent has been doing the same thing for a long time
Duel          There is only one other opponent
Turning       Agent has started a U-turn, but has not finished
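The condition table can be read as a function from a sensor summary to nine booleans. The sketch below assumes a hypothetical, pre-processed `Snapshot` of the raw sensors; its field names are not part of the competition API. The 100 m, 20 m, 2,000, and 7,000 thresholds come from the table, while `NO_CHANGE_STEPS` and the `near_edge` flag stand in for values the slide leaves unspecified.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Snapshot:
    """Hypothetical summary of the raw sensors (all field names are assumptions)."""
    nearest_ahead_m: Optional[float]   # None if no opponent is seen ahead
    nearest_behind_m: Optional[float]  # None if no opponent is seen behind
    own_damage: float
    opponent_damage: float             # damage of the opponent being compared
    near_edge: bool                    # the slide gives no distance threshold
    steps_same_action: int
    opponents_left: int
    u_turn_in_progress: bool


NO_CHANGE_STEPS = 500  # assumed value; the slide only says "a long time"


def within(distance_m: Optional[float], limit_m: float) -> bool:
    return distance_m is not None and distance_m <= limit_m


def conditions(s: Snapshot) -> Dict[str, bool]:
    """Abstract the low-level snapshot into the nine high-level rule conditions."""
    return {
        "Ahead": within(s.nearest_ahead_m, 100.0),
        "Close Ahead": within(s.nearest_ahead_m, 20.0),
        "Behind": within(s.nearest_behind_m, 100.0),
        "Advantage": s.opponent_damage - s.own_damage >= 2000.0,
        "Edge": s.near_edge,
        "Wounded": s.own_damage > 7000.0,
        "No Change": s.steps_same_action >= NO_CHANGE_STEPS,
        "Duel": s.opponents_left == 1,
        "Turning": s.u_turn_in_progress,
    }
```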
Actions – Output Abstractions as Pictures
[Figure: pictorial description of each action: Ram, Run, Bait, Get Clear, U-Turn, Circle Track]
Actions – Output Abstractions in English
Action        Description
Ram           • Steer toward opponent
              • Slow down if necessary
              • Otherwise, full acceleration
Run           • Steer away from opponent
              • Circle the track
              • Full acceleration
Bait          • Circle the track
              • Speed limit 110 kph
              • When opponent is close, swerve
Get Clear     • Turn away from track edge
              • If very close to edge, back up
U-Turn        • Cut wheel hard left or right (coin flip)
              • Keep wheel cut for 100 steps
Circle Track  • Stay centered and in line with track axis
              • Speed limit 110 kph
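As an illustration of how a high-level action becomes low-level control, the U-Turn description can be sketched as a small stateful object. The class name, the steering range of -1.0 to +1.0, and the mapping of sign to left or right are assumptions; only the coin flip and the 100-step hold come from the slide.

```python
import random


class UTurn:
    """Hold full steering lock in one randomly chosen direction for a fixed time."""

    def __init__(self, duration_steps=100, rng=None):
        self.duration_steps = duration_steps
        self.rng = rng or random.Random()
        self.steps_left = 0
        self.direction = 0.0

    @property
    def turning(self):
        # Mirrors the "Turning" condition: a U-turn has started but not finished.
        return self.steps_left > 0

    def step(self):
        """Return the steering command for one control step (-1.0 or +1.0)."""
        if not self.turning:
            # Coin flip between the two steering extremes; which sign means
            # "left" depends on the client and is not specified here.
            self.direction = self.rng.choice([-1.0, 1.0])
            self.steps_left = self.duration_steps
        self.steps_left -= 1
        return self.direction
```

Calling `step()` once per control cycle keeps the wheel cut in the same direction until the 100 steps are used up, after which `turning` becomes false again.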
Sloppy Jalopy Entry – Rule Set
Conditions                                                                      Action
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning
F      *            *       *          T     *        *          *              Get Clear
*      *            *       F          *     F        T          *              U-Turn
*      *            *       *          F     T        *          F              Run
*      *            *       T          F     *        *          T              Run
*      *            T       *          F     F        *          F              Bait
T      *            F       *          *     *        *          F              Ram
T      *            *       *          *     F        F          T              Ram
*      *            T       F          F     *        F          T              Bait
*      *            *       *          *     *        F          *              Circle Track
‘*’ means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
Sloppy Jalopy Behavior
• Runs away when it is wounded or winning a duel by a margin.
• Rams only when there is nobody behind it.
• Tries to score mainly by baiting opponents.
• Is more timid in multiplayer, more aggressive in a duel.
• Does a U-turn if it has been doing one thing for a while.
• Circles the track by default.
Crash and Segfault Entry – Rule Set
Conditions                                                                      Action
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning
F      F            *       *          T     *        *          *     *        Get Clear
*      F            T       *          F     *        F          *     F        Run
T      F            F       T          F     *        *          *     *        Run
T      F            F       F          *     *        *          *     F        Ram
F      F            F       T          F     *        *          *     *        Circle Track
*      T            *       *          *     *        *          *     *        Ram
*      F            T       *          F     *        T          *     *        U-Turn
F      F            F       F          F     *        F          *     F        Circle Track
F      F            F       F          F     *        T          *     *        U-Turn
F      F            *       F          F     *        *          *     T        U-Turn
‘*’ means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
Crash and Segfault Behavior
• Always attempts to ram if an opponent is close ahead.
• Runs away if an opponent is behind it.
• Runs away if it has a damage advantage in a duel.
• Attempts to ram if there is an opponent ahead and it isn't running.
• Makes a U-turn if it has gone a complete lap around the track while either running or circling.
• If no other action is called for, circles the track to try to find opponents.
JustDetermined Entry – Rule Set
Conditions                                                                      Action
Ahead  Close Ahead  Behind  Advantage  Edge  Wounded  No Change  Duel  Turning
*      *            *       *          T     *        *          *     *        Get Clear
*      *            T       *          *     *        *          *     *        U-Turn
T      *            F       *          F     *        *          *     *        Ram
*      *            F       T          F     *        *          *     *        Run
F      *            F       *          F     *        *          *     *        Circle Track
‘*’ means any state satisfies this condition (Don’t Care).
Grayed-out conditions are ignored by this agent.
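A rule set like this one can be stored as plain data and evaluated top to bottom. The sketch below transcribes the five JustDetermined rows as nine-character patterns in the column order of the table. The string encoding and the `select_action` helper are illustrative choices, not the authors' implementation.

```python
CONDITIONS = ["Ahead", "Close Ahead", "Behind", "Advantage", "Edge",
              "Wounded", "No Change", "Duel", "Turning"]

# One entry per rule, in priority order. Each character is T, F, or * (don't
# care) and lines up with the condition at the same index in CONDITIONS.
JUST_DETERMINED = [
    ("****T****", "Get Clear"),
    ("**T******", "U-Turn"),
    ("T*F*F****", "Ram"),
    ("**FTF****", "Run"),
    ("F*F*F****", "Circle Track"),
]


def select_action(rules, state):
    """Return the action of the first rule whose pattern matches the state."""
    for pattern, action in rules:
        if all(p == "*" or (p == "T") == state[name]
               for p, name in zip(pattern, CONDITIONS)):
            return action
    return None  # no rule matched


# Example: nothing is true except an opponent behind the agent.
state = {name: False for name in CONDITIONS}
state["Behind"] = True
print(select_action(JUST_DETERMINED, state))  # U-Turn
```

Because rule order resolves conflicts, an agent that is at the track edge with an opponent behind it still gets Get Clear, the earlier rule.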
JustDetermined Behavior
• Uses basic wall detection to avoid walls.
• An opponent in front of the controller is its main focus.
• If an opponent is behind the controller, it turns around as fast as possible to hit the opponent.
• When no opponent is near, the controller circles the track to maintain speed.
Future Work
• Further evaluation of agents against a wider variety of drivers.
• This basic approach is designed as a step towards using EC on the rule sets.
• In addition to evolving the rule sets, EC can refine the values of parameters such as “close”.
And the Winner is...
SEALbot
Anderson Rocha Tavares & Gabriel de Oliveira Ramos & Renato de Pontes Pereira & Sérgio Montazzolli Silva & Ana L. C. Bazzan
Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil