Comparison of the impact of heuristic complexity and evolutionary populations in the
development of the co-evolution of neural network based Cartagena players
John Faherty
March 2009
Abstract
A series of genetic evolution experiments was conducted in which multilayered feed-forward neural net game-playing agents competed in an
evolutionary environment based on their ability to play a relatively simple board game called Cartagena (described below). Two series of
experiments are presented in this thesis. Firstly, three cohorts of Cartagena agents with increasingly complex heuristics were evolved and
compared to investigate the impact of heuristic complexity on agent evolution. The second series of experiments involved three cohorts of neural
net based game-playing agents (with the same level of heuristic complexity) evolving in environments with differing levels of supervision, i.e.
unsupervised, semi-supervised and fully supervised by expert agents. The strength of the evolving agents was assessed against external fixed
benchmarking agents.
The heuristic complexity experiments showed that more complex agents, which did not need to learn the spatial details of
the board, evolved more quickly, but that some of the more complex heuristics actually hindered the evolution. The experiments with different levels of
supervision showed that agents evolved in an unsupervised environment gained strength through successive evolutionary
cycles, whereas agents evolved in semi-supervised and fully supervised environments did not show an increase in strength through the
evolutionary process.
Context and Aim of the research
Game description: The game of Cartagena is based on the tale of a 1672 breakout from a Spanish prison in Cartagena. Each player controls
6 pirates, and the objective is to navigate those pirates from Cartagena, through an underground passage, to safety (see Figure 1).
Each space in the passage bears one of six distinct symbols (dagger, hat, pistol, bottle, skull and
keys), and there is a pack of cards, each card bearing one of the symbols. Players move their pirates towards
the Sloop by playing cards, or they may choose to pick up cards by moving a pirate backwards along the
passageway. Players take turns sequentially, and each turn comprises three “moves”, each of which is either
playing a card or picking up cards. The winner is the first player to navigate all of their pirates to the Sloop.
Agent design: Each game playing agent comprises a neural net. Move selection is undertaken by
identifying all of the potential one-ply game positions reachable from the current position. Features of each candidate
position (i.e. the heuristics) are then fed into the neural net, and the position with the highest
output from the neural net defines the move to be played. The agents play each other in round robin
competitions within evolutionary pools, with the winners of these games assigned positive pay-offs
(and the losers negative pay-offs).
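As a rough illustration, the following minimal sketch shows this one-ply selection scheme, assuming a hypothetical game interface (legal_moves, apply_move, extract_features) and a small feed-forward evaluator; it is not the dissertation's actual implementation.

```python
import numpy as np

def forward(net, features):
    """One feed-forward pass; net is a list of (weights, biases) per layer."""
    x = np.asarray(features, dtype=float)
    for W, b in net[:-1]:
        x = np.tanh(W @ x + b)              # hidden layers with tanh activation
    W, b = net[-1]
    return float((W @ x + b)[0])            # single output unit: the board evaluation

def select_move(net, position, legal_moves, apply_move, extract_features):
    """Score every one-ply successor and play the move with the highest net output."""
    best_move, best_score = None, -np.inf
    for move in legal_moves(position):      # all moves legal from the current position
        successor = apply_move(position, move)
        score = forward(net, extract_features(successor))
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```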
Research question: This research investigates two subjects: the impact of heuristic complexity on
agent evolution, and the impact of supervision of the evolutionary pool on agent evolution.
Figure 1: The Cartagena board, showing the starting space (Cartagena), the direction of travel along the passageway, and the Sloop.
Figure 2 (Evolutionary process): Initialise Population → Randomly Vary Individuals → Evaluate Fitness → Apply Selection, then repeat.
Research method/Techniques
Overview: An overview of the co-evolutionary process is given in Figure 2 opposite. The first stage is to initialise the population, where the
weights and biases of the neural nets in the initial population are set randomly. The next stage is to evaluate the fitness
of the individuals within the pool, via ‘round robin’ competition within the evolutionary pool. The fittest individuals
are selected and mutated to form the next generation, and the process is repeated. Periodically the strength of the
fittest individuals within a pool is assessed objectively through benchmarking by playing against an ensemble of three
fixed external agents.
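A compact sketch of this loop is given below; truncation selection, Gaussian weight mutation and the ±1 pay-offs are illustrative assumptions rather than the dissertation's exact parameters, and the periodic benchmarking step is omitted.

```python
import random
import numpy as np

def init_population(size, layer_sizes):
    """Create a pool of feed-forward nets with randomly set weights and biases."""
    def random_net():
        return [(np.random.randn(n_out, n_in), np.random.randn(n_out))
                for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
    return [random_net() for _ in range(size)]

def round_robin_fitness(pool, play_game):
    """Every pair plays once; winners receive +1 and losers -1."""
    fitness = [0] * len(pool)
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            winner = play_game(pool[i], pool[j])   # returns 0 or 1
            fitness[i] += 1 if winner == 0 else -1
            fitness[j] += 1 if winner == 1 else -1
    return fitness

def mutate(net, sigma=0.05):
    """Gaussian perturbation of every weight and bias."""
    return [(W + sigma * np.random.randn(*W.shape),
             b + sigma * np.random.randn(*b.shape)) for W, b in net]

def evolve(pool, play_game, generations, keep=0.5):
    """Repeatedly evaluate fitness, keep the fittest fraction and refill by mutation."""
    for _ in range(generations):
        fitness = round_robin_fitness(pool, play_game)
        ranked = [net for _, net in sorted(zip(fitness, pool),
                                           key=lambda t: t[0], reverse=True)]
        parents = ranked[:max(1, int(len(pool) * keep))]
        children = [mutate(random.choice(parents))
                    for _ in range(len(pool) - len(parents))]
        pool = parents + children              # next generation
        # (periodic benchmarking against the fixed external agents would go here)
    return pool
```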
Heuristic Complexity: Heuristics are defined as features of potential game boards that are used as inputs to the neural nets,
and this experiment compared the evolution of agents with three different levels of heuristic complexity (a simplified feature sketch follows the list):
• ‘Simple’ heuristics, where the location of each pirate in the passageway is the main input.
• ‘Spatial’ heuristics, which attempt to capture spatial information about the pirates’ relative positions, as
well as the distance travelled by each pirate along the passageway.
• ‘Spatial and Cards’ heuristics, which are as the Spatial heuristics but also look further ahead to cards playable
in subsequent moves and include some consideration of the positions of the opponent’s pirates.
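To make the distinction concrete, here is a hypothetical, heavily simplified sketch of the three feature extractors; the actual board encoding and feature lists are defined in the dissertation, and these stand-ins only mirror the descriptions above.

```python
# Simplified, assumed representation: a pirate is just its space index along the
# passageway (0 = Cartagena, larger = closer to the Sloop); a hand is a dict of
# card symbol -> count.
SYMBOLS = ["dagger", "hat", "pistol", "bottle", "skull", "keys"]

def simple_features(own_pirates):
    """'Simple': the raw passageway location of each of the player's pirates."""
    return sorted(own_pirates)

def spatial_features(own_pirates):
    """'Spatial': distance travelled plus the relative spacing between the pirates."""
    positions = sorted(own_pirates)
    gaps = [b - a for a, b in zip(positions, positions[1:])]   # relative positions
    return positions + gaps

def spatial_and_cards_features(own_pirates, opponent_pirates, hand):
    """'Spatial and Cards': spatial features plus cards playable later and opponent info."""
    feats = spatial_features(own_pirates)
    feats += [hand.get(symbol, 0) for symbol in SYMBOLS]       # cards in hand by symbol
    feats += sorted(opponent_pirates)                          # opponent pirate positions
    return feats
```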
Levels of Supervision: This experiment also investigated the influence of the level of supervision on agent evolution, by considering three levels of
supervision. In the “unsupervised” experiment the round robin fitness assessment only involved play against other members of the evolutionary
pool. In the “semi-supervised” trial half of the round robin games were against other members of the pool and half were against expert agents,
and in the “fully supervised” trial all of the games that the agents played were against expert agents.
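One way to express the three supervision levels is as a single parameter controlling the fraction of fitness games played against the expert agents (0, 0.5 or 1.0, following the descriptions above); the sketch below is an assumed scheduling helper, not the dissertation's code.

```python
import random

def fitness_opponents(pool, member_index, expert_agents, expert_fraction, n_games):
    """Build the opponent list for one pool member's round of fitness games.

    expert_fraction = 0.0 -> unsupervised     (pool members only)
    expert_fraction = 0.5 -> semi-supervised  (half pool, half experts)
    expert_fraction = 1.0 -> fully supervised (experts only)
    """
    n_expert = round(n_games * expert_fraction)
    peers = [agent for i, agent in enumerate(pool) if i != member_index]
    opponents = random.choices(expert_agents, k=n_expert)
    opponents += random.choices(peers, k=n_games - n_expert)
    random.shuffle(opponents)
    return opponents
```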
Results: Heuristic complexity trials
Figure 3 opposite shows the results of the experiments into the
effect of heuristic complexity on strength development. The
figure shows the average strength of the three fittest agents from
each evolutionary generation, measured against the “Random” benchmark
(which selects moves at random).
• The Spatial & Cards heuristic based agents ultimately evolved to a higher strength level than the Spatial heuristic based agents.
• The Spatial based agents and the Spatial with Cards based agents evolved more rapidly between generations 0 and 200 than the Simple Heuristic based agents.
• By generation 500 the Simple Heuristic based agents had greater strength than the Spatial Heuristic based agents. Unfortunately the Simple Heuristic trial
was shorter than the other heuristic trials (due to time constraints), which is one of the limitations of this research.
Figure 3: Strength against the Random benchmark by generation (Strength −100 to 900 vs Generation 0 to 1800; series: Spatial & Cards average, Spatial average, Simple average).
Figure 4: Strength against the Random benchmark for the different cohort populations (Strength −600 to 800 vs Generation 0 to 140; series: Fully Supervised, Unsupervised, Semi Supervised).
Results: Cohort population trials
Figure 4 opposite shows the results of the trials with different
levels of supervision.
• The agents in the unsupervised trial became steadily stronger relative to the random benchmark, shown by a continuous improvement in strength
(measured against the external benchmark) with increasing numbers of generations.
• The agents evolved in the semi-supervised and fully supervised cohorts did not show any increase in strength with successive evolutionary cycles.
• The average strength of the agents in the fully and semi-supervised environments is below zero against the random benchmark, meaning they are
routinely beaten by a benchmark player that plays random moves. This is because a random player will generally play cards and so move its pirates
forward, whereas an agent whose heuristic evaluation is effectively random will actively choose moves that favour arbitrary features of the position.
Conclusions
Heuristic Complexity Trials:
• The strength development of the agents did vary with heuristic complexity, although more complex heuristics did not necessarily result in a
stronger evolved agent. It is thought that the Spatial Heuristic agents may have been hindered by less pertinent inputs interfering with the more
basic salient ones, whereas the Simple Heuristics contained only those basic salient inputs.
• The initial strength development of the Simple Heuristic based agents was slower than that of the Spatial and Spatial with Cards Heuristic based agents. This
resulted from the fact that the Simple Heuristic agents had to learn the spatial relationships between the different sections of the passage, whereas these spatial
relationships were inherent in the inputs used by the Spatial and Spatial with Cards Heuristics.
• The extra board features included in the Spatial with Cards heuristics, compared to the Spatial Heuristics, allowed the Spatial with Cards heuristic based
agents to evolve to a higher strength, implying that the extra heuristic inputs captured board features that were relevant to board
evaluation.
Evolutionary Pool Trials:
• Evolution of agents in an unsupervised evolutionary cohort was a far more effective method for developing neural net based agents than evolution
within semi-supervised or fully supervised cohorts. In these trials, only the agents evolved in an unsupervised cohort showed a
consistent increase in strength through evolutionary cycles; the agents evolved in the semi-supervised and fully supervised cohorts did not show any
increase in strength with successive evolutionary cycles.
References (limited by word count to key references; full reference list in dissertation text)
Chellapilla, K. and Fogel, D. (1999) Evolution, Neural Networks, Games, and Intelligence, Proceedings of the IEEE 87 (9), 1471-1496
Yao, X. (1999) Evolving Artificial Neural Networks, Proceedings of the IEEE 87 (9), 1423-1447
Runarsson, T. and Lucas, S. (2005) Coevolution Versus Self-Play Temporal Difference Learning for Acquiring Position Evaluation in Small-Board Go, IEEE Transactions on Evolutionary Computation 9 (6), 628-640
Russell, S. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach, Prentice Hall
Darwen, P. (2001) Why Co-evolution Beats Temporal Difference Learning at Backgammon for a Linear Architecture, but Not a Non-linear Architecture, Proceedings of the 2001 Congress on Evolutionary Computation, 1003-1010
Mandziuk, J., Kusiak, M. and Waledzik, K. (2007) Evolutionary-based Heuristic Generators for Checkers and Give-away Checkers, Expert Systems 24 (4), 189-211
