Comparison of the impact of heuristic complexity and evolutionary populations in the
development of the co-evolution of neural network based Cartagena players
John Faherty
March 2009
Abstract
A series of genetic evolution experiments was conducted in which multilayered feed-forward neural net game-playing agents competed in an
evolutionary environment based on their ability to play a relatively simple board game called Cartagena (described below). Two series of
experiments are presented in this thesis. Firstly, three cohorts of Cartagena agents with increasingly complex heuristics were evolved and
compared to investigate the impact of heuristic complexity on agent evolution. The second series of experiments involved three cohorts of neural
net based game-playing agents (with the same level of heuristic complexity) evolving in environments with differing levels of supervision, i.e.
unsupervised, semi-supervised and fully supervised by expert agents. The strength of the evolving agents was assessed against external fixed
benchmarking agents.
The heuristic complexity experiments showed that more complex agents, which did not need to learn the spatial details of
the board, evolved more quickly, but that some of the more complex heuristics actually hindered the evolution. The experiments with different levels of
supervision showed that agents evolved in an unsupervised environment gained strength through successive evolutionary
cycles, whereas agents evolved in semi-supervised and fully supervised environments did not show an increase in strength through the
evolutionary process.
Context and Aim of the research
Game description: The game of Cartagena is based on the tale of a 1672 breakout from a Spanish prison in Cartagena. Each player controls
6 pirates, and the objective is to navigate those pirates from Cartagena, through an underground passage, to safety (see Figure 1).
Each space in the passage bears one of six distinct symbols (dagger, hat, pistol, bottle, skull and
keys), and there is a pack of cards, each card bearing one of the symbols. Players move their pirates towards
the Sloop by playing cards, or they may choose to pick up cards by moving a pirate backwards along the
passageway. Players take turns sequentially, and each turn comprises three “moves”, each of which is either
playing a card or picking up cards. The winner is the first player to navigate all of their pirates to the Sloop.
Agent design: Each game playing agent comprises a neural net. Move selection is undertaken by
identifying all of the potential one-ply game positions reachable from the current position. Features of each candidate
position (i.e. the heuristics) are then fed into the neural net, and the position with the highest
output from the neural net defines the move to be played. The agents play each other in round robin
competitions within evolutionary pools, with the winners of these games assigned positive pay-offs
(and the losers negative pay-offs).
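As a rough illustration, the following minimal sketch shows this one-ply selection scheme, assuming a hypothetical game interface (legal_moves, apply_move, extract_features) and a small feed-forward evaluator; it is not the dissertation's actual implementation.

```python
import numpy as np

def forward(net, features):
    """One feed-forward pass; net is a list of (weights, biases) per layer."""
    x = np.asarray(features, dtype=float)
    for W, b in net[:-1]:
        x = np.tanh(W @ x + b)              # hidden layers with tanh activation
    W, b = net[-1]
    return float((W @ x + b)[0])            # single output unit: the board evaluation

def select_move(net, position, legal_moves, apply_move, extract_features):
    """Score every one-ply successor and play the move with the highest net output."""
    best_move, best_score = None, -np.inf
    for move in legal_moves(position):      # all moves legal from the current position
        successor = apply_move(position, move)
        score = forward(net, extract_features(successor))
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```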
Research question: This research investigates two subjects: the impact of heuristic complexity on
agent evolution, and the impact of supervision of the evolutionary pool on agent evolution.
Figure 1: The Cartagena board, showing the starting space (Cartagena), the direction of travel along the passageway, and the Sloop.
Figure 2 (Evolutionary process): Initialise Population → Randomly Vary Individuals → Evaluate Fitness → Apply Selection, then repeat.
Research method/Techniques
Overview: An overview of the co-evolutionary process is given in Figure 2 opposite. The first stage is to initialise the population, where the
weights and biases of the neural nets in the initial population are set randomly. The next stage is to evaluate the fitness
of the individuals within the pool, via ‘round robin’ competition within the evolutionary pool. The fittest individuals
are selected and mutated to form the next generation, and the process is repeated. Periodically the strength of the
fittest individuals within a pool is assessed objectively through benchmarking by playing against an ensemble of three
fixed external agents.
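A compact sketch of this loop is given below; truncation selection, Gaussian weight mutation and the ±1 pay-offs are illustrative assumptions rather than the dissertation's exact parameters, and the periodic benchmarking step is omitted.

```python
import random
import numpy as np

def init_population(size, layer_sizes):
    """Create a pool of feed-forward nets with randomly set weights and biases."""
    def random_net():
        return [(np.random.randn(n_out, n_in), np.random.randn(n_out))
                for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
    return [random_net() for _ in range(size)]

def round_robin_fitness(pool, play_game):
    """Every pair plays once; winners receive +1 and losers -1."""
    fitness = [0] * len(pool)
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            winner = play_game(pool[i], pool[j])   # returns 0 or 1
            fitness[i] += 1 if winner == 0 else -1
            fitness[j] += 1 if winner == 1 else -1
    return fitness

def mutate(net, sigma=0.05):
    """Gaussian perturbation of every weight and bias."""
    return [(W + sigma * np.random.randn(*W.shape),
             b + sigma * np.random.randn(*b.shape)) for W, b in net]

def evolve(pool, play_game, generations, keep=0.5):
    """Repeatedly evaluate fitness, keep the fittest fraction and refill by mutation."""
    for _ in range(generations):
        fitness = round_robin_fitness(pool, play_game)
        ranked = [net for _, net in sorted(zip(fitness, pool),
                                           key=lambda t: t[0], reverse=True)]
        parents = ranked[:max(1, int(len(pool) * keep))]
        children = [mutate(random.choice(parents))
                    for _ in range(len(pool) - len(parents))]
        pool = parents + children              # next generation
        # (periodic benchmarking against the fixed external agents would go here)
    return pool
```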
Heuristic Complexity: Heuristics are defined as features of potential game boards that are used as inputs to the neural nets,
and this experiment compared the evolution of agents with three different levels of heuristic complexity (a simplified feature sketch follows the list):
• ‘Simple’ heuristics, where the location of each pirate in the passageway is the main input.
• ‘Spatial’ heuristics, which attempt to capture spatial information about the pirates’ relative positions, as
well as the distance travelled by each pirate along the passageway.
• ‘Spatial and Cards’ heuristics, which are as the Spatial heuristics but also look further ahead to cards playable
in subsequent moves and include some consideration of the positions of the opponent’s pirates.
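To make the distinction concrete, here is a hypothetical, heavily simplified sketch of the three feature extractors; the actual board encoding and feature lists are defined in the dissertation, and these stand-ins only mirror the descriptions above.

```python
# Simplified, assumed representation: a pirate is just its space index along the
# passageway (0 = Cartagena, larger = closer to the Sloop); a hand is a dict of
# card symbol -> count.
SYMBOLS = ["dagger", "hat", "pistol", "bottle", "skull", "keys"]

def simple_features(own_pirates):
    """'Simple': the raw passageway location of each of the player's pirates."""
    return sorted(own_pirates)

def spatial_features(own_pirates):
    """'Spatial': distance travelled plus the relative spacing between the pirates."""
    positions = sorted(own_pirates)
    gaps = [b - a for a, b in zip(positions, positions[1:])]   # relative positions
    return positions + gaps

def spatial_and_cards_features(own_pirates, opponent_pirates, hand):
    """'Spatial and Cards': spatial features plus cards playable later and opponent info."""
    feats = spatial_features(own_pirates)
    feats += [hand.get(symbol, 0) for symbol in SYMBOLS]       # cards in hand by symbol
    feats += sorted(opponent_pirates)                          # opponent pirate positions
    return feats
```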
Levels of Supervision: This experiment also investigated the influence of the level of supervision on agent evolution, by considering three levels of
supervision. In the “unsupervised” experiment the round robin fitness assessment only involved play against other members of the evolutionary
pool. In the “semi-supervised” trial half of the round robin games were against other members of the pool and half were against expert agents,
and in the “fully supervised” trial all of the games that the agents played were against expert agents.
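One way to express the three supervision levels is as a single parameter controlling the fraction of fitness games played against the expert agents (0, 0.5 or 1.0, following the descriptions above); the sketch below is an assumed scheduling helper, not the dissertation's code.

```python
import random

def fitness_opponents(pool, member_index, expert_agents, expert_fraction, n_games):
    """Build the opponent list for one pool member's round of fitness games.

    expert_fraction = 0.0 -> unsupervised     (pool members only)
    expert_fraction = 0.5 -> semi-supervised  (half pool, half experts)
    expert_fraction = 1.0 -> fully supervised (experts only)
    """
    n_expert = round(n_games * expert_fraction)
    peers = [agent for i, agent in enumerate(pool) if i != member_index]
    opponents = random.choices(expert_agents, k=n_expert)
    opponents += random.choices(peers, k=n_games - n_expert)
    random.shuffle(opponents)
    return opponents
```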
Results: Heuristic complexity trials
Figure 3 opposite shows the results of the experiments into the
effect of heuristic complexity on strength development. The
figure shows the average strength of the three fittest agents from
each evolutionary generation, measured against the “Random” benchmark
(which selects moves at random).
• The Spatial & Cards heuristic based agents ultimately evolved to a higher strength level than the Spatial heuristic based agents.
• The Spatial based agents and the Spatial with Cards based agents evolved more rapidly between generations 0 and 200 than the Simple Heuristic based agents.
• By generation 500 the Simple Heuristic based agents had greater strength than the Spatial Heuristic based agents. Unfortunately the Simple Heuristic trial
was shorter than the other heuristic trials (due to time constraints), which is one of the limitations of this research.
Figure 3: Strength against the Random benchmark by generation (Strength −100 to 900 vs Generation 0 to 1800; series: Spatial & Cards average, Spatial average, Simple average).
Figure 4: Strength against the Random benchmark for the different cohort populations (Strength −600 to 800 vs Generation 0 to 140; series: Fully Supervised, Unsupervised, Semi Supervised).
Results: Cohort population trials
Figure 4 opposite shows the results of the trials with different
levels of supervision.
• The agents in the unsupervised trial became steadily stronger relative to the random benchmark, shown by a continuous improvement in strength
(measured against the external benchmark) with increasing numbers of generations.
• The agents evolved in the semi-supervised and fully supervised cohorts did not show any increase in strength with successive evolutionary cycles.
• The average strength of the agents in the fully and semi-supervised environments is below zero against the random benchmark, meaning they are
routinely beaten by a benchmark player that plays random moves. This is because a random player will generally play cards and so move its pirates
forward, whereas an agent whose heuristic evaluation is effectively random will actively choose moves that favour arbitrary features of the position.
Conclusions
Heuristic Complexity Trials:
• The strength development of the agents did vary with heuristic complexity, although more complex heuristics did not necessarily result in a
stronger evolved agent. It is thought that the Spatial Heuristic agents may have been hindered by less pertinent inputs interfering with the more
basic salient ones, whereas the Simple Heuristics contained only those basic salient inputs.
• The initial strength development of the Simple Heuristic based agents was slower than that of the Spatial and Spatial with Cards Heuristic based agents. This
resulted from the fact that the Simple Heuristic agents had to learn the spatial relationships between the different sections of the passage, whereas these spatial
relationships were inherent in the inputs used by the Spatial and Spatial with Cards Heuristics.
• The extra board features included in the Spatial with Cards heuristics, compared to the Spatial Heuristics, allowed the Spatial with Cards heuristic based
agents to evolve to a higher strength, implying that the extra heuristic inputs captured board features that were relevant to board
evaluation.
Evolutionary Pool Trials:
• Evolution of agents in an unsupervised evolutionary cohort was a far more effective method for developing neural net based agents than evolution
within semi-supervised or fully supervised cohorts. In these trials, only the agents evolved in an unsupervised cohort showed a
consistent increase in strength through evolutionary cycles; the agents evolved in the semi-supervised and fully supervised cohorts did not show any
increase in strength with successive evolutionary cycles.
References (limited by word count to key references; full reference list in dissertation text)
Chellapilla, K. and Fogel, D. (1999) Evolution, Neural Networks, Games, and Intelligence, Proceedings of the IEEE 87 (9), 1471-1496
Yao, X. (1999) Evolving Artificial Neural Networks, Proceedings of the IEEE 87 (9), 1423-1447
Runarsson, T. and Lucas, S. (2005) Coevolution Versus Self-Play Temporal Difference Learning for Acquiring Position Evaluation in Small-Board Go, IEEE Transactions on Evolutionary Computation 9 (6), 628-640
Russell, S. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach, Prentice Hall
Darwen, P. (2001) Why Co-evolution Beats Temporal Difference Learning at Backgammon for a Linear Architecture, but Not a Non-linear Architecture, Proceedings of the 2001 Congress on Evolutionary Computation, 1003-1010
Mandziuk, J., Kusiak, M. and Waledzik, K. (2007) Evolutionary-based Heuristic Generators for Checkers and Give-away Checkers, Expert Systems 24 (4), 189-211
