CIG case study: car racing
 • A prolonged example of applying CI to a
   game: car racing
 • Sensor representation and inp...
Racing games
• On the charts for the last three decades
• Can be technically simple (computationally
  cheap) or very soph...
CI in racing games
• Learning to race
 • on your own, against specific opponents,
    against opponents in general, on one ...
A simple car game


• Optimised for speed, not for prettiness
• 2D dynamics (momentum, understeer, etc.)
• Intended to qua...
• Walls are solid
• Waypoints must be
  passed in order

• Fitness: continuous
  approximation of
  waypoints passed in
  ...
• Inputs
 • Six range-finder sensors (evolvable pos.)
 • Waypoint sensor, Speed, Bias
• Networks
 • Standard MLP, 9:6:2
 • ...
Track        10                 50                  100                 200               Pr.
                          ...
Example video




Evolved with 50+50 ES, 100 Generations
Choose your inputs
(+their representation)
• Using third-person inputs (cartesian inputs)
  seems not to work
• Either ran...
If you don’t know
       your inputs...
• Memetic techniques (e.g. memetic ES) can
  sort out useful from useless inputs
•...
Learning controllers with irrelevant inputs present




   Togelius, Gomez and Schmidhuber (2008)
Generalization and
     specialization
• A controller evolved for one track does
  not necessarily perform well on other
 ...
[Paper excerpt on the car dynamics model and the tracks; the text is cut off at the column edges.]
Incremental evolution
• Introduced by Gomez & Miikkulainen (1997)
• Change the fitness function f (to make it
  more demandi...
Incremental evolution
• Controllers evolved for specific tracks
  perform poorly on other tracks
• General controllers, that can drive almost
  a...
Two cars on a track
• Two cars with solo-evolved controllers on
  one track: disaster
  • they don’t even see each other!
•...
Video: navigating
a complex track
Competitive
          coevolution
• The fitness function evaluates at least two
  individuals
• One individual’s success is...
Competitive
          coevolution
• Standard 15+15 ES; each individual is
  evaluated through testing against the
  curren...
Video: absolute fitness
Video: 50/50 fitness
Video: relative fitness
Problems with
        coevolution
• Over-specialization and cycling
 • Can be battled with e.g. archives
• Loss of gradien...
Multi-population
      coevolution
• Typically, competitive coevolution uses one
  or two populations
• Many more populati...
Example:
1 versus 9 populations




   Togelius, Burrow, Lucas (2007)
Player modelling
• Can we create players that drive just like
  specific human players?
• The models need to be...
 • Simil...
Direct modelling
• Let a player drive a number of tracks
• Use supervised learning to associate inputs
  (sensors) with ou...
Indirect modelling
• Let a human drive a test track, record
  performance, speed and orthogonal
  deviation at the various...
The test track supposedly requires
a varied repertoire of driving skills
                                                 ...
Content creation
• Creating interesting, enjoyable levels, worlds,
  tracks, opponents etc.
  • Not the same as well-playi...
Track evolution
• Using the controllers we evolved to model
  human players, we evolve tracks that are fun
  to drive for ...
Fig. 5.   Track evolved using the random walk initialisation and mutation.
[Surrounding paper text is cut off at the column edges.]
[Paper excerpt; the text is cut off at the column edges.]
[Paper excerpt on b-spline track construction; the text is cut off at the column edges.]
but only sometimes causes the car to collide. Those elements are believed to
be the main source of final progress variabili...
Video: evolved
TORCS drivers
Video: real car control
More on these topics
• http://julian.togelius.com
 • e.g. Togelius, Lucas and De Nardi:
    “Computational Intelligence in...
WCCI 2008 Tutorial on Computational Intelligence and Games, part 2 of 3
WCCI 2008 Tutorial on Computational Intelligence and Games by Simon Lucas, Julian Togelius and Thomas Runarsson, part 2 of 3


  1. 1. CIG case study: car racing • A prolonged example of applying CI to a game: car racing • Sensor representation and input selection • Incremental evolution • Competitive coevolution • Player modelling • Content creation
  2. 2. Racing games • On the charts for the last three decades • Can be technically simple (computationally cheap) or very sophisticated • Easy to pick up and play, but possess almost unlimited “depth” (a lifetime to master) • Can be played on your own or with others
  3. 3. CI in racing games • Learning to race • on your own, against specific opponents, against opponents in general, on one or several tracks, using simple or complex cars/physics models, etc. • Modelling driving styles • Creating entertaining game content: tracks and opponent drivers
  4. 4. A simple car game • Optimised for speed, not for prettiness • 2D dynamics (momentum, understeer, etc.) • Intended to qualitatively replicate a standard toy R/C car driven on a table • Bang-bang control (9 possible commands)
  5. 5. • Walls are solid • Waypoints must be passed in order • Fitness: continuous approximation of waypoints passed in 700 time steps
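The "continuous approximation of waypoints passed" can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the argument layout, and the exact interpolation between two waypoints are assumptions.

```python
import math

def progress_fitness(passed, n_waypoints, car, prev_wp, next_wp):
    """Progress in laps: 1.0 means one full lap of the track.

    `passed` counts waypoints passed in order. The fractional term is an
    assumed interpolation based on the car's distances to the previous
    and the next waypoint.
    """
    d_prev = math.dist(car, prev_wp)
    d_next = math.dist(car, next_wp)
    total = d_prev + d_next
    fraction = d_prev / total if total > 0 else 0.0
    return (passed + fraction) / n_waypoints

# Halfway between waypoints 3 and 4 on a six-waypoint track.
print(progress_fitness(3, 6, (5.0, 0.0), (0.0, 0.0), (10.0, 0.0)))
```

In an evaluation loop this value would be read once, after the 700 time steps named on the slide.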
  6. 6. • Inputs • Six range-finder sensors (evolvable pos.) • Waypoint sensor, Speed, Bias • Networks • Standard MLP, 9:6:2 • Outputs interpreted as thrust/steering
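A 9:6:2 network of this kind might look like the sketch below. The nine inputs are the six range-finders, the waypoint sensor, speed and bias. The tanh activation, the flat weight layout, and the 0.3 threshold used to turn the two outputs into one of nine bang-bang commands are assumptions made for illustration.

```python
import math
import random

N_IN, N_HID, N_OUT = 9, 6, 2  # layer sizes from the slide

def random_weights(rng):
    """Flat weight vector: input-to-hidden weights, then hidden-to-output."""
    return [rng.gauss(0, 1) for _ in range(N_IN * N_HID + N_HID * N_OUT)]

def forward(weights, inputs):
    """Feed-forward pass with tanh units (activation is an assumption)."""
    assert len(inputs) == N_IN
    w1 = weights[:N_IN * N_HID]
    w2 = weights[N_IN * N_HID:]
    hidden = [math.tanh(sum(inputs[i] * w1[h * N_IN + i] for i in range(N_IN)))
              for h in range(N_HID)]
    return [math.tanh(sum(hidden[h] * w2[o * N_HID + h] for h in range(N_HID)))
            for o in range(N_OUT)]

def to_command(outputs, threshold=0.3):
    """Map (thrust, steering) outputs to -1/0/1 each: 3 x 3 = 9 commands."""
    def tri(x):
        return -1 if x < -threshold else (1 if x > threshold else 0)
    return tri(outputs[0]), tri(outputs[1])

weights = random_weights(random.Random(0))
sensors = [0.0] * 6 + [0.2, 0.5, 1.0]  # range-finders, waypoint, speed, bias
print(to_command(forward(weights, sensors)))
```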
  7. 7. Tables and figure captions excerpted from the paper. Fitness is given with the standard deviation in parentheses; the columns 10, 50, 100 and 200 are generations; Pr. is the number of runs producing proficient controllers.

     TABLE I. The fitness of the best controller of various generations on the different tracks, and number of runs producing proficient controllers. Fitness averaged over 10 separate evolutionary runs; standard deviation between parentheses.

     Track   10            50            100           200           Pr.
     1       0.32 (0.07)   0.54 (0.2)    0.7 (0.38)    0.81 (0.5)    2
     2       0.38 (0.24)   0.49 (0.38)   0.56 (0.36)   0.71 (0.5)    2
     3       0.32 (0.09)   0.97 (0.5)    1.47 (0.63)   1.98 (0.66)   7
     4       0.53 (0.17)   1.3 (0.48)    1.5 (0.54)    2.33 (0.59)   9
     5       0.45 (0.08)   0.95 (0.6)    0.95 (0.58)   1.65 (0.45)   8
     6       0.4 (0.08)    0.68 (0.27)   1.02 (0.74)   1.29 (0.76)   5
     7       0.3 (0.07)    0.35 (0.05)   0.39 (0.09)   0.46 (0.13)   0
     8       0.16 (0.02)   0.19 (0.03)   0.2 (0.01)    0.2 (0.01)    0

     TABLE V. Fitness of a further evolved general controller with evolvable sensor parameters on the different tracks. Compound fitness 2.22 (0.09).

     Track   Fitness (sd)
     1       1.66 (0.12)
     2       1.86 (0.02)
     3       2.27 (0.45)
     4       2.66 (0.3)
     5       2.19 (0.23)
     6       2.47 (0.18)
     7       0.22 (0.15)
     8       0.15 (0.01)

     TABLE VI. Fitness of best controllers, evolving controllers specialised for each track, starting from a further evolved general controller with evolved sensor parameters.

     Track   10            50            100           200           Pr.
     1       1.9 (0.1)     1.99 (0.06)   2.02 (0.01)   2.04 (0.02)   10
     2       2.06 (0.1)    2.12 (0.04)   2.14 (0)      2.15 (0.01)   10
     3       3.25 (0.08)   3.4 (0.1)     3.45 (0.12)   3.57 (0.1)    10
     4       3.35 (0.11)   3.58 (0.11)   3.61 (0.1)    3.67 (0.1)    10
     5       2.66 (0.13)   2.84 (0.02)   2.88 (0.06)   2.88 (0.06)   10
     6       2.64 (0)      2.71 (0.08)   2.72 (0.08)   2.82 (0.1)    10
     7       1.53 (0.29)   1.84 (0.13)   1.88 (0.12)   1.9 (0.09)    10
     8       0.59 (0.15)   0.73 (0.22)   0.85 (0.21)   0.93 (0.25)   0

     Fig. 2. The initial sensor setup, which is kept throughout the evolutionary run for those runs where sensor parameters are not evolvable. Here, the car is seen in close-up moving upward-leftward. At this particular position, the front-right sensor returns a positive number very close to 0, as it detects a wall near the limit of its range; the front-left sensor returns a number close to 0.5, and the back sensor a slightly larger number. [...] sensors do not detect any walls at all and thus return 0.

     Fig. 5. Sensor setup of controller specialized for track 5. While more or less retaining the two longest-range sensors from the further evolved general controller it is based on, it has added medium-range sensors in the front and back, and a very short-range sensor to the left.

     Fig. 6. Sensor setup of a controller specialized for, and able to consistently reach good fitness on, track 7. Presumably the use of all but one sensor and their angular spread reflects the large variety of different situations the car has to handle in order to navigate this more difficult track.

     Legible body text: [...] number of waypoints passed, divided by the number of waypoints in the track, plus an intermediate term representing how far it is on its way from one to the next waypoint, calculated from the relative distances between the car and the previous and next waypoint. A fitness of 1.0 thus means having completed one full track within the allotted time. Waypoints can only be passed in the correct order, and a waypoint is counted as passed when the centre of the car is within 30 pixels from the waypoint. [...] has three sensors pointing forward-left, forward-right and backward respectively. The two other sensors, which point left and right, have reach 100; this is illustrated in figure 2. [...] For each track, 10 evolutionary runs were made, where the initial population was seeded with the general controller and evolution was allowed to continue for 200 [...]
  8. 8. Example video Evolved with 50+50 ES, 100 Generations
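For readers unfamiliar with the notation, a "50+50" evolution strategy keeps 50 parents, creates 50 offspring, and selects the next 50 parents from the union of both. The sketch below is a generic (mu+lambda) ES under stated assumptions: Gaussian mutation with a fixed step size, uniformly random parent choice, and a toy fitness function stand in for details the slides do not give.

```python
import random

def evolve(fitness, dim, mu=50, lam=50, generations=100, sigma=0.1, seed=0):
    """Generic (mu+lambda) ES; returns the best genome found."""
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        # Each offspring is a mutated copy of a randomly chosen parent.
        offspring = [[g + rng.gauss(0, sigma) for g in rng.choice(pop)]
                     for _ in range(lam)]
        # Plus-selection: parents and offspring compete for the mu slots.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:mu]
    return max(pop, key=fitness)

def toy_fitness(genome):
    """Stand-in objective (an assumption): maximal at the zero vector."""
    return -sum(g * g for g in genome)

best = evolve(toy_fitness, dim=3, mu=5, lam=5, generations=30)
print(toy_fitness(best))
```

In the racing setting the genome would be the network weight vector and the fitness would come from simulating the car on a track.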
  9. 9. Choose your inputs (+their representation) • Using third-person inputs (cartesian inputs) seems not to work • Either range-finders or waypoint sensor can be taken away, but some fitness lost • A little bit of noise is not a problem, actually it’s desirable • Adding extra inputs (while keeping core inputs) can reduce evolvability drastically!
  10. 10. If you don’t know your inputs... • Memetic techniques (e.g. memetic ES) can sort out useful from useless inputs • Principle: evolve neural network weights together with a mask: whether connections are on or off • Masks and weights are evolved at different time scales; after every mask mutation, weight space is searched - if no fitness increase, the mask is reverted
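The mask-and-weights principle described above can be sketched as a two-level search. This is an illustrative simplification, not the published algorithm: the hill-climbing local search, the single-bit mask mutation, and the toy fitness in the usage example are all assumptions.

```python
import random

def local_search(fitness, weights, mask, rng, steps=20, sigma=0.1):
    """Inner loop: hill-climb the weights while the mask stays fixed."""
    best, best_f = weights, fitness(weights, mask)
    for _ in range(steps):
        cand = [w + rng.gauss(0, sigma) for w in best]
        f = fitness(cand, mask)
        if f > best_f:
            best, best_f = cand, f
    return best, best_f

def memetic_search(fitness, n, outer_steps=50, seed=0):
    """Outer loop: flip one mask bit, search weights, keep only if better."""
    rng = random.Random(seed)
    mask = [True] * n
    weights = [rng.gauss(0, 1) for _ in range(n)]
    weights, best_f = local_search(fitness, weights, mask, rng)
    for _ in range(outer_steps):
        trial = mask[:]
        i = rng.randrange(n)
        trial[i] = not trial[i]
        cand_w, cand_f = local_search(fitness, weights, trial, rng)
        if cand_f > best_f:
            mask, weights, best_f = trial, cand_w, cand_f
        # Otherwise the trial mask is discarded, i.e. the mask is reverted.
    return mask, weights, best_f

def toy_fitness(weights, mask):
    """Toy objective (an assumption): connection 0 matters, 1 only hurts."""
    eff = [w if m else 0.0 for w, m in zip(weights, mask)]
    return -(eff[0] - 1.0) ** 2 - abs(eff[1])

print(memetic_search(toy_fitness, n=2))
```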
  11. 11. Learning controllers with irrelevant inputs present Togelius, Gomez and Schmidhuber (2008)
  12. 12. Generalization and specialization • A controller evolved for one track does not necessarily perform well on other tracks • How do we achieve more general game- playing skills? • Is there a tradeoff between generality and performance?
  13. 13. [Paper excerpt on the car dynamics model and the track design; the text is cut off at the column edges.] Fig. 1. The eight tracks. Notice how tracks 1 and 2 (at the top), 3 and 4, 5 and 6 differ in the clockwise/anti-clockwise layout of waypoints and associated starting points. Tracks 7 and 8 have no relation to each other [...]
  14. 14. Incremental evolution • Introduced by Gomez & Miikkulainen (1997) • Change the fitness function f (to make it more demanding) as soon as a certain fitness is achieved • In this case, add new tracks to f as soon as the controller can drive 1.5 rounds on all tracks currently in f
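The track-adding rule can be written as a small curriculum update. The function and argument names are hypothetical; only the 1.5-round threshold and the "all tracks currently in f" condition come from the slide.

```python
def update_curriculum(best_scores, active, pending, threshold=1.5):
    """Add the next pending track once every active track is mastered.

    best_scores maps a track name to the progress (in rounds) reached by
    the current best controller on that track.
    """
    if pending and all(best_scores[t] >= threshold for t in active):
        return active + [pending[0]], pending[1:]
    return active, pending

active, pending = ["track1"], ["track2", "track3"]
active, pending = update_curriculum({"track1": 1.6}, active, pending)
print(active, pending)
```

Such an update would be called once per generation, after the best controller has been evaluated on the active tracks.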
  15. 15. Incremental evolution
  16. 16. • Controllers evolved for specific tracks perform poorly on other tracks • General controllers, that can drive almost any track, can be incrementally evolved • Starting from a general controller, a controller can be further evolved for specialization on a particular track • drive faster than the general controller • works even when evolution from scratch did not work!
  17. 17. Two cars on a track • Two cars with solo-evolved controllers on one track: disaster • they don’t even see each other! • How do we train controllers that take other drivers into account? (avoiding collisions or using them to their advantage) • Solution: car sensors (rangefinders, like the wall sensors) and competitive coevolution
  18. 18. Video: navigating a complex track
  19. 19. Competitive coevolution • The fitness function evaluates at least two individuals • One individual’s success is adversely affected by the other’s (directly or indirectly) • Very potent, but seldom straightforward; e.g. Hillis (1991), Rosin and Belew (1996)
  20. 20. Competitive coevolution • Standard 15+15 ES; each individual is evaluated through testing against the current best individual in the population • Fitness function a mix of... • Absolute fitness: progress in n time steps • Relative fitness: distance ahead of or behind the other car after n time steps
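One plausible way to mix the two fitness terms is a weighted sum, sketched below. The linear blend and the parameter name `relative_weight` are assumptions; the slides only say that the fitness function is a mix of the two.

```python
def mixed_fitness(own_progress, opponent_progress, relative_weight=0.5):
    """Blend absolute and relative fitness after n time steps.

    relative_weight = 0.0 gives pure absolute fitness, 1.0 pure relative
    fitness, and 0.5 an even mix.
    """
    absolute = own_progress
    relative = own_progress - opponent_progress  # ahead (+) or behind (-)
    return (1.0 - relative_weight) * absolute + relative_weight * relative

for w in (0.0, 0.5, 1.0):
    print(w, mixed_fitness(2.0, 1.5, w))
```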
  21. 21. Video: absolute fitness
  22. 22. Video: 50/50 fitness
  23. 23. Video: relative fitness
  24. 24. Problems with coevolution • Over-specialization and cycling • Can be battled with e.g. archives • Loss of gradient • Can be battled with careful fitness function design, e.g. combining absolute and relative fitness • Much more research needed here!
  25. 25. Multi-population coevolution • Typically, competitive coevolution uses one or two populations • Many more populations can be used! • Can help against cycling and overspecialization • The phenotypical diversity between populations can be useful in itself
  26. 26. Example: 1 versus 9 populations Togelius, Burrow, Lucas (2007)
  27. 27. Player modelling • Can we create players that drive just like specific human players? • The models need to be... • Similar in terms of performance • Similar in terms of playing (driving) style • Robust
  28. 28. Direct modelling • Let a player drive a number of tracks • Use supervised learning to associate inputs (sensors) with outputs (driving commands) • e.g. MLP/Backpropagation or k-nearest neighbour • Suffers from generalization problems, and any approximation is likely to lead to worse playing performance
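As a rough illustration of the k-nearest-neighbour variant, the sketch below picks the command the player issued most often in the k recorded situations closest to the current sensor reading. The data layout, the Euclidean distance and the majority vote are assumptions, and the recorded samples are made up for the example.

```python
import math
from collections import Counter

def knn_command(recorded, sensors, k=3):
    """Return the majority command among the k nearest recorded samples.

    recorded is a list of (sensor_vector, command) pairs logged while the
    human player drove.
    """
    nearest = sorted(recorded, key=lambda s: math.dist(s[0], sensors))[:k]
    votes = Counter(command for _, command in nearest)
    return votes.most_common(1)[0][0]

# Made-up log: two sensor values per sample, one command label each.
log = [((0.0, 0.0), "left"), ((0.1, 0.0), "left"),
       ((1.0, 1.0), "right"), ((0.9, 1.0), "right"), ((1.0, 0.9), "right")]
print(knn_command(log, (0.95, 0.95)))
```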
  29. 29. Indirect modelling • Let a human drive a test track, record performance, speed and orthogonal deviation at the various waypoints of the track • Start from a good, general evolved neural network controller, and evolve it further • Fitness: negative difference between controller and player for the three measures above
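The three-part fitness could be computed along these lines. The dictionary keys, the use of a mean absolute difference per waypoint, and returning the three terms as a tuple rather than a single number are assumptions; the slide does not say how the measures are combined.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def modelling_fitness(player, controller):
    """Negative differences for progress, speed and orthogonal deviation.

    Each argument holds 'progress' (a float) plus 'speed' and 'deviation'
    (one value per waypoint). Values closer to 0 mean a closer match.
    """
    return (
        -abs(player["progress"] - controller["progress"]),
        -mean_abs_diff(player["speed"], controller["speed"]),
        -mean_abs_diff(player["deviation"], controller["deviation"]),
    )

human = {"progress": 1.5, "speed": [2.0, 4.0], "deviation": [0.0, 1.0]}
model = {"progress": 1.0, "speed": [3.0, 4.0], "deviation": [0.0, 3.0]}
print(modelling_fitness(human, model))
```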
  30. 30. The test track supposedly requires a varied repertoire of driving skills. [Figures from the paper: Fig. 2. The test track and the car. Fig. 3 is a plot with the vertical axis "Fitness (progress, speed)"; its caption is cut off after "Evolving a".] Paper excerpt: "First of all, we design a test track, featuring a number of different types of racing challenges. The track, as pictured in (fig), has two long straight sections where the player can [...]"
  31. 31. Content creation • Creating interesting, enjoyable levels, worlds, tracks, opponents etc. • Not the same as well-playing opponents • Probably the area where commercial game developers need most help • What makes game content fun? Many theories, e.g. Thomas Malone, Raph Koster, Mihály Csíkszentmihályi
  32. 32. Track evolution • Using the controllers we evolved to model human players, we evolve tracks that are fun to drive for the modelled player • Fitness function: • Right amount of progress • Variation in progress • High maximum speed
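The three track-fitness criteria might be computed as below from several runs of the player model on a candidate track. This is a sketch under assumptions: the target progress value, the use of the population standard deviation as "variation", and the tuple return are illustrative choices, not the authors' definition.

```python
from statistics import mean, pstdev

def track_fitness(progress_runs, max_speed, target_progress=1.5):
    """Score a candidate track for one modelled player.

    progress_runs holds the progress the player model reached in repeated
    runs on the track. Returns three terms, each to be maximised:
    closeness to the target progress, variation in progress, top speed.
    """
    return (
        -abs(mean(progress_runs) - target_progress),
        pstdev(progress_runs),
        max_speed,
    )

print(track_fitness([1.0, 2.0], max_speed=5.0))
```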
  33. 33. [Paper excerpt; the body text is cut off at the column edges.] Fig. 5. Track evolved using the random walk initialisation and mutation. Fig. 6. A track evolved (using the radial method) to be fun for the first author, who plays too many racing games anyway. It is not easy to drive, which is just as it should be.
  34. 34. [Paper excerpt; the body text is cut off at the column edges.] Fig. 7. A track evolved (using the radial method) to be fun for the second author, who is a bit more careful in his driving. Note the absence of sharp turns.
  35. 35. [Paper excerpt on constructing tracks from b-splines; the body text is cut off at the column edges.] Fig. 5. Track evolved using the random walk initialisation and mutation.
  36. 36. [...] but only sometimes causes the car to collide. Those elements are believed to be the main source of final progress variability. These features are also notably absent from track c, on which the good player model has very low variability. The progress of the controller is instead limited by many broad curves. Fig. 3. Three evolved tracks: (a) evolved for a bad player with target progress 1.1, (b) evolved for a good player with target fitness 1.5, (c) evolved for a good player with target progress 1.5 using only progress fitness.
  37. 37. Video: evolved TORCS drivers
  38. 38. Video: real car control
  39. 39. More on these topics • http://julian.togelius.com • e.g. Togelius, Lucas and De Nardi: “Computational Intelligence in Racing Games” • Togelius, Gomez and Schmidhuber: “Learning what to ignore” on Friday, 11.10, room 606 • Car Racing Competition on Tuesday 15.00, room 402
