nsga.ppt

Evolving Multimodal Networks for Multitask Games
Jacob Schrum – schrum2@cs.utexas.edu
Risto Miikkulainen – risto@cs.utexas.edu
University of Texas at Austin
Department of Computer Science

• Evolution in videogames
  • Automatically learn interesting behavior
  • Complex but controlled environments
  • Stepping stone to real world
    • Robots
    • Training simulators
• Complexity issues
  • Multiple contradictory objectives
  • Multiple challenging tasks

Multitask Games
• NPCs perform two or more separate tasks
• Each task has its own performance measures
• Task linkage
  • Independent
  • Dependent
• Not blended
• Inherently multiobjective

Test Domains
• Designed to study multimodal behavior
• Two tasks in similar environments
• Different behavior needed to succeed
• Main challenge: perform well in both
[Figure panels: Front Ramming | Back Ramming]

Front/Back Ramming
• Front Ramming
  • Attack with front ram
  • Avoid counterattacks
• Back Ramming
  • Attack with back ram
  • Avoid counterattacks
• Same goal, opposite embodiments

Predator/Prey
• Predator
  • Attack prey
  • Prevent escape
• Prey
  • Avoid attack
  • Stay alive
• Same embodiment, opposite goals

Multiobjective Optimization
• Game with two objectives:
  • Damage Dealt
  • Remaining Health
• A dominates B iff A is strictly better in at least one objective and at least as good in all others
• Points in the population that no other point dominates are best: the Pareto Front
• Weighted sum provably incapable of capturing a non-convex front
[Figure annotations on the Pareto front: "Dealt lots of damage, but lost lots of health" | "Tradeoff between objectives" | "High health but did not deal much damage"]
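A minimal sketch of the dominance test and Pareto front described on this slide, assuming both objectives (damage dealt, remaining health) are maximized. The function names and the five score pairs are made up for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b, assuming every objective is maximized."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (damage dealt, remaining health) scores for five agents.
scores = [(90, 10), (60, 50), (20, 95), (50, 40), (15, 90)]
front = pareto_front(scores)
print(front)
```

Here (50, 40) is dominated by (60, 50) and (15, 90) by (20, 95), so three points remain on the front, one for each regime annotated in the figure.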

NSGA-II
• Evolution: natural approach for finding optimal population
• Non-Dominated Sorting Genetic Algorithm II*
  • Population P with size N; Evaluate P
  • Use mutation to get P´ with size N; Evaluate P´
  • Calculate non-dominated fronts of P ∪ P´ (size 2N)
  • New population of size N from highest fronts of P ∪ P´
*K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. Evol. Comp. 2002
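The four steps on this slide can be sketched as one generation of a mutation-only loop. This is an illustrative simplification, not the authors' implementation: the sort is a plain repeated scan rather than the fast bookkeeping of Deb et al., each parent is assumed to produce exactly one mutated child, crowding distance (from the cited paper, not the slide) breaks ties in the last front, and the toy `evaluate` and `mutate` functions are hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b when every objective is maximized."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(scores: Sequence[Scores]) -> List[List[int]]:
    """Peel off fronts of indices; front 0 is the non-dominated set."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(scores[j], scores[i]) for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(front: List[int], scores: Sequence[Scores]) -> Dict[int, float]:
    """Larger distance means a less crowded region of the front."""
    dist = {i: 0.0 for i in front}
    for m in range(len(scores[front[0]])):
        ordered = sorted(front, key=lambda i: scores[i][m])
        lo, hi = scores[ordered[0]][m], scores[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep extremes
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (scores[nxt][m] - scores[prev][m]) / (hi - lo)
    return dist


def select_survivors(scores: Sequence[Scores], n: int) -> List[int]:
    """Fill n slots front by front; truncate the last front by crowding."""
    survivors: List[int] = []
    for front in non_dominated_sort(scores):
        if len(survivors) + len(front) <= n:
            survivors.extend(front)
        else:
            dist = crowding_distance(front, scores)
            front.sort(key=lambda i: dist[i], reverse=True)
            survivors.extend(front[: n - len(survivors)])
            break
    return survivors


def nsga2_generation(population: list, evaluate: Callable, mutate: Callable,
                     rng: random.Random) -> list:
    """P (size N) -> mutated P' (size N) -> best N of the combined 2N."""
    children = [mutate(p, rng) for p in population]
    combined = population + children
    scores = [evaluate(x) for x in combined]
    return [combined[i] for i in select_survivors(scores, len(population))]


# Toy usage: individuals are numbers in [0, 1]; both objectives are hypothetical.
rng = random.Random(0)
population = [rng.random() for _ in range(8)]
population = nsga2_generation(
    population,
    evaluate=lambda x: (x, 1.0 - x * x),
    mutate=lambda x, r: min(1.0, max(0.0, x + r.gauss(0.0, 0.1))),
    rng=rng,
)
```

Because survivors are drawn from parents and children together, a good parent is never lost to a worse child, which is the elitism named in the cited paper's title.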

Constructive Neuroevolution
• Genetic Algorithms + Neural Networks
• Build structure incrementally (complexification)
• Good at generating control policies
• Three basic mutations (no crossover used)
[Figure panels: Perturb Weight | Add Connection | Add Node]
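One way to picture the three mutations is on a bare-bones genome that stores a node count and a dictionary of weighted links. The slides do not specify the encoding, so the `Genome` class, the weight ranges, and the choice to add a node by splicing it into an existing link are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Genome:
    """Hypothetical encoding: nodes are 0..num_nodes-1, links map (src, dst) to weight."""
    num_nodes: int
    links: Dict[Tuple[int, int], float] = field(default_factory=dict)


def perturb_weight(g: Genome, rng: random.Random, scale: float = 0.5) -> None:
    """Add Gaussian noise to one randomly chosen link weight."""
    if g.links:
        key = rng.choice(sorted(g.links))
        g.links[key] += rng.gauss(0.0, scale)


def add_connection(g: Genome, rng: random.Random) -> None:
    """Link two random nodes if they are not already linked (recurrence allowed)."""
    src, dst = rng.randrange(g.num_nodes), rng.randrange(g.num_nodes)
    if (src, dst) not in g.links:
        g.links[(src, dst)] = rng.uniform(-1.0, 1.0)


def add_node(g: Genome, rng: random.Random) -> None:
    """Splice a new node into a random link: src -> dst becomes src -> new -> dst."""
    if not g.links:
        return
    src, dst = rng.choice(sorted(g.links))
    weight = g.links.pop((src, dst))
    new = g.num_nodes
    g.num_nodes += 1
    g.links[(src, new)] = 1.0      # assumed: incoming link starts at 1.0
    g.links[(new, dst)] = weight   # assumed: outgoing link keeps the old weight


rng = random.Random(1)
genome = Genome(num_nodes=3, links={(0, 2): 0.5})
add_node(genome, rng)
print(genome)
```

Starting small and growing only through such mutations is what the slide calls complexification.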

Multimodal Networks (1)
• Multitask Learning*
  • One mode per task
  • Shared hidden layer
  • Knows current task
• Previous work
  • Supervised learning context
  • Multiple tasks learned together more quickly than individually
  • Not tried with evolution yet
* R. A. Caruana, "Multitask learning: A knowledge-based source of inductive bias" ICML 1993
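A rough sketch of the multitask idea: every task shares one hidden layer, each task has its own set of output weights (its mode), and the known task index picks which mode drives the agent. The fixed two-layer shape, the tanh activation, and all weights below are assumptions for illustration; evolved networks would have arbitrary topology.

```python
import math
from typing import List

Matrix = List[List[float]]


def forward_multitask(inputs: List[float], w_hidden: Matrix,
                      w_modes: List[Matrix], task: int) -> List[float]:
    """Compute the shared hidden layer, then only the output mode for `task`."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    w_out = w_modes[task]  # the task is given, so no arbitration is needed
    return [math.tanh(sum(w * h for w, h in zip(row, hidden)))
            for row in w_out]


# Hypothetical weights: two inputs, two shared hidden units, one output per mode.
w_hidden = [[1.0, 0.0], [0.0, 1.0]]
w_modes = [
    [[1.0, 0.0]],  # mode for task 0 reads hidden unit 0
    [[0.0, 1.0]],  # mode for task 1 reads hidden unit 1
]
sensors = [0.5, -0.5]
out_task0 = forward_multitask(sensors, w_hidden, w_modes, task=0)
out_task1 = forward_multitask(sensors, w_hidden, w_modes, task=1)
```

The same sensor reading yields different actions depending on the task, while anything learned in the hidden layer is available to both modes.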

Multimodal Networks (2)
• Mode Mutation
  • Extra modes evolved
  • Networks choose mode
  • Chosen via preference neurons
• MM Previous, abbreviated MM(P)
  • Links from previous mode
  • Weights = 1.0
• MM Random, abbreviated MM(R)
  • Links from random sources
  • Random weights
  • Supports mode deletion
[Figure panels: Starting network with one mode | MM(R) | MM(P)]
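Mode choice via preference neurons might look like the sketch below. It assumes each mode is a contiguous block of policy outputs followed by one preference neuron, and that the mode with the highest preference activation controls the agent on that time step. The layout and the function name are illustrative assumptions, not taken from the slides.

```python
from typing import List, Tuple


def choose_mode(outputs: List[float], policy_size: int) -> Tuple[int, List[float]]:
    """Return (mode index, policy outputs) for the most-preferred mode.

    Assumed layout: [policy_0..., pref_0, policy_1..., pref_1, ...].
    """
    block = policy_size + 1
    if not outputs or len(outputs) % block != 0:
        raise ValueError("output count must be a positive multiple of policy_size + 1")
    modes = [outputs[i:i + block] for i in range(0, len(outputs), block)]
    best = max(range(len(modes)), key=lambda m: modes[m][-1])
    return best, modes[best][:-1]


# Two modes with two policy outputs each; the preferences are -0.3 and 0.9.
raw_outputs = [0.1, 0.2, -0.3, 0.5, -0.5, 0.9]
mode, action = choose_mode(raw_outputs, policy_size=2)
print(mode, action)
```

Unlike the multitask networks, nothing tells the network which task it is in; the preference neurons let it decide for itself, so a mode mutation only has to append one more block of outputs.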

Experiment
• Compare 4 conditions:
  • Control: Unimodal networks
  • Multitask: One mode per task
  • MM(P): Mode Mutation Previous
  • MM(R): Mode Mutation Random + Delete Mutation
• 500 generations
• Population size 52
• “Player” behavior scripted
• Network controls homogeneous team of 4

MO Performance Assessment
• Reduce Pareto front to single number
  • Hypervolume of dominated region
• Pareto compliant
  • Front A dominates front B implies HV(A) > HV(B)
• Standard statistical comparisons of average HV
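For two maximized objectives, the hypervolume is the area of the region dominated by the front, measured from a reference point. The sketch below assumes a reference point at the origin by default; the function name and the sample front are illustrative.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Iterable[Point], reference: Point = (0.0, 0.0)) -> float:
    """Area dominated by `front` above `reference`, both objectives maximized."""
    rx, ry = reference
    # Sweep from the largest first objective to the smallest.
    pts = sorted((p for p in front if p[0] > rx and p[1] > ry),
                 key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ry
    for x, y in pts:
        if y > best_y:  # dominated points add no new area
            area += (x - rx) * (y - best_y)
            best_y = y
    return area


# Hypothetical (damage dealt, remaining health) front.
front = [(90, 10), (60, 50), (20, 95)]
print(hypervolume_2d(front))  # 90*10 + 60*40 + 20*45 = 4200.0
```

Adding a dominated point such as (50, 40) leaves the value unchanged, while any new non-dominated point increases it, which is why one number per run is enough for the statistical comparison.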

Front/Back Ramming Behaviors
[Figure labels: Multitask, MM(R); Front Ramming, Back Ramming]

Predator/Prey Behaviors
[Figure labels: Multitask, MM(R); Prey, Predator]

Discussion (1)
• Front/Back Ramming
  • Control < MM(P), MM(R) < Multitask
  • Multiple modes help
  • Explicit knowledge of task helps

Discussion (2)
• Predator/Prey
  • MM(P), Control, Multitask < MM(R)
  • Multiple modes not necessarily helpful
  • Disparity in relative difficulty of tasks
    • Multitask ends up wasting effort
  • Mode deletion aids search for one good mode

How To Apply
• Multitask good if:
  • Task division known, and
  • Tasks are comparably difficult
• Mode mutation good if:
  • Task division is unknown, or
  • “Obvious” task division is misleading

Future Work
• Games with more tasks
  • Does method scale?
  • Control mode bloat
• Games with independent tasks
  • Ms. Pac-Man
    • Collect pills while avoiding ghosts
    • Eat ghosts after eating power pill
• Games with blended tasks
  • Unreal Tournament 2004
    • Fight while avoiding damage
    • Fight or run away?
    • Collect items or seek opponents?

Conclusion
• Domains with multiple tasks are common
  • Both in real world and games
• Multimodal networks improve learning in multitask games
• Will allow interesting/complex behavior to be developed in future

Questions?
Jacob Schrum – schrum2@cs.utexas.edu
Risto Miikkulainen – risto@cs.utexas.edu
University of Texas at Austin
Department of Computer Science
