Evolution of Coordination and Communication in Groups of Embodied Agents

A PhD Thesis Presentation
by Olaf Witkowski
Department of Computer Science

University of Tokyo

19 January 2015

Introduction
• Biological cells, insect swarms, and bird flocks all self-organize into groups that display collective behavior.

• Interacting individuals produce adaptive behavior, i.e. behavior that increases their chances of survival and reproduction.

Examples:
• Myxobacteria form wolf packs to share digestive enzymes
• Honey bees exchange information to optimize foraging
• Weaver ants build bridges with their own bodies
• Bigeye fish form schools to avoid predation

Research questions
• Under which conditions does collective behavior emerge in a group of autonomous agents?

• Can individuals work together more effectively when they rely on a communication system?

Significance is twofold
• This thesis is relevant to both science and technology.

• First, it helps shed light on the evolution of coordination and communication.

• Second, a better understanding of the fundamental principles of collective behavior may also lead to innovative methods in multi-agent systems, ubiquitous computing devices, and swarm computation.

Outline
• Introduction & Background

• Methods

• Gene-culture coevolution (ch. 7)

• Synchronization vs. variability (ch. 6)

• 3D signal-swarming models (ch. 4)

• 3D spatial Prisoner’s Dilemma (ch. 5)

• Conclusion
contributions

Methods
• 3-block model = Darwinian evolution + “Robots” + Environment
[Figure: three blocks labeled Darwinian evolution, Robots, Environment]

Methods — Agent-Based Model (ABM)
• Agent-based modeling: computational models which simulate the actions
and interactions of autonomous creatures in a simulated environment.

• The agent’s actions affect its survival, just as in real environments.
Example of ABM by Wischmann, Floreano & Keller (2012)

Methods - Artificial Neural Networks (ANN)
• Artificial neural network (McCulloch & Pitts 1943, Rosenblatt 1958)
[Figure: neural network (“brain”), with labels for a neuron and the connection weights]

Methods - Genetic Algorithm (GA)
• Connection weights are encoded in a genotype and evolved by a genetic algorithm (Fisher 1958, Holland 1995).

• Genotype = vector of ANN connection weights = (w1, w2, …, wn)

• The fitness value of each genotype is determined by the agent’s performance on a predefined task.
[Figure: a population of genotypes (w1 w2 w3 … wn) is evaluated in the evolution environment and modified by GA operators]
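As a rough illustration of how a genotype of connection weights can be evolved, here is a minimal generational GA sketch. The mutation scheme, truncation selection, population size, and the `toy_fitness` task are all assumptions chosen for brevity; the slides do not specify the operators actually used.

```python
import random

def mutate(genotype, rate=0.1, sigma=0.2, rng=random):
    """Return a copy of the genotype with Gaussian noise added to some weights."""
    return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w
            for w in genotype]

def evolve(fitness, n_weights=8, pop_size=20, generations=50, seed=0):
    """Generational GA with truncation selection and elitism (assumed operators)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]
        # Parents are kept unchanged; the rest of the population is mutated copies.
        children = [mutate(rng.choice(parents), rng=rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(genotype):
    # Hypothetical task: weights close to 0.5 score highest (maximum is 0.0).
    return -sum((w - 0.5) ** 2 for w in genotype)

best = evolve(toy_fitness)
```

In an evolutionary robotics setting, `toy_fitness` would be replaced by running the agent controlled by the decoded network in its environment and scoring its performance.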

Methods — Evolutionary Robotics (ER)
• Evolutionary robotics = Genetic algorithm + Agent-based modeling
[Figure: three blocks labeled Darwinian evolution, Robots, Environment]
Cliff, Harvey & Husbands (1992)
Floreano & Mondada (1994)

Methods - Asynchronous GA
• Generations of genotypes overlap: each agent’s fitness is evaluated at every iteration.

• When an agent has gathered enough energy, it replicates, and the offspring is added to the running simulation.
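The asynchronous scheme can be sketched as a single update loop with no generation boundary. The threshold of 10 and the base cost of 0.01 echo figures shown on the later survival slide, but everything else here (the energy split between parent and offspring, the mutation noise, the population cap, and the `energy_gain` callback) is an assumption made for illustration.

```python
import random

REPLICATION_THRESHOLD = 10.0   # illustrative value
BASE_COST = 0.01               # illustrative value

class Agent:
    def __init__(self, genotype, energy=1.0):
        self.genotype = genotype
        self.energy = energy

def step(population, energy_gain, rng, sigma=0.1, max_size=200):
    """One iteration: each agent pays a cost, gains energy, then may replicate or die."""
    newborns = []
    for agent in population:
        agent.energy += energy_gain(agent) - BASE_COST
        room_left = len(population) + len(newborns) < max_size
        if agent.energy > REPLICATION_THRESHOLD and room_left:
            agent.energy /= 2.0  # assumption: parent shares its energy with the offspring
            child_genes = [w + rng.gauss(0.0, sigma) for w in agent.genotype]
            newborns.append(Agent(child_genes, agent.energy))
    survivors = [a for a in population if a.energy > 0.0]
    return survivors + newborns

# Usage: agents whose first gene is positive are (hypothetically) better foragers.
rng = random.Random(1)
population = [Agent([rng.uniform(-1.0, 1.0) for _ in range(4)]) for _ in range(10)]
for _ in range(300):
    population = step(population, lambda a: 0.5 if a.genotype[0] > 0 else 0.0, rng)
```

Because offspring join the running simulation immediately, selection acts continuously rather than at the end of a fixed evaluation period.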

Outline
• Introduction & Background

• Methods

• Gene-culture coevolution (ch. 7): generic gene/culture coordination, 0D

• Synchronization vs. variability (ch. 6): seasonal coordination through communication, 0-2D

• 3D signal-swarming models (ch. 4) and 3D spatial Prisoner’s Dilemma (ch. 5): spatial coordination with communication, 3D

• Conclusion
Neutral selection in gene-culture coevolution
Goal: analyze the evolution of generic communication in a gene-culture model
Signal matching task

[Figure: spread of Indo-European languages through time. Source: Bouckaert et al. (2012), Mapping the Origins and Expansion of the Indo-European Language Family, Science, vol. 337, no. 6097, pp. 957-960.]

• Gene-culture models have been used to investigate language evolution because empirical data are lacking (Boyd & Richerson 1992, Christiansen & Kirby 2003).
• We use a genetic algorithm, artificial neural networks, and different social networks for learning.
Signal matching task
• Agents produce signals
• Agents need to match their signals with their neighbors
• The best-performing agents are selected and replicated through a genetic algorithm
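One way to picture the matching task is to score each agent by how many of its neighbors emit the same signal. The sketch below is only an assumption about the scoring: it uses discrete symbols and a ring neighborhood for brevity, whereas the model on the slides uses neural network outputs and a lattice.

```python
def ring_neighbors(n):
    """Hypothetical neighborhood: each agent sees its left and right neighbor."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def matching_fitness(signals, neighbors):
    """Map each agent to the fraction of its neighbors emitting the same signal."""
    scores = {}
    for agent, signal in signals.items():
        near = neighbors[agent]
        matches = sum(1 for other in near if signals[other] == signal)
        scores[agent] = matches / len(near) if near else 0.0
    return scores

signals = {0: "a", 1: "a", 2: "b", 3: "a"}
scores = matching_fitness(signals, ring_neighbors(4))
```

Here agent 0 matches both neighbors, agents 1 and 3 match one of two, and agent 2 matches none, so a genetic algorithm selecting on `scores` would favor the majority signal.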

• Culture: each agent learns by imitating its neighbors’ signals (learning phase: learner and teacher, linked by the social network)
• Gene: each agent is then evaluated for reproduction (evaluation phase: learner and neighbor)

• If the learned culture becomes uniform across the population, the selection pressure on the genes is relaxed, leading to a neutral selection space.
Social networks: learning in lattice; fitness in lattice; reproduction in rows
Genes = weights before learning
Cultures = weights after learning
[Figure: genes and cultures over time; reproduction network = rows, communication network = lattice]

Conclusion
• In this model, the agents’ task was to coordinate their communication directly.

• The results show neutral selection and offer new insights through analogies with the Potts model, oscillator theory, and swarming models.

• Next, we go further by studying tasks that require coordination via communication only indirectly.

Synchronization in dynamic environments
Goal: study agent strategies for coping with a variable resource, comparing energy saving with synchronization via communication

Animal behavior in winter (source: National Geographic & BBC documentaries, 2014)
• Food hoarding
• Bird migration
• Hibernation

• Population of agents in an environment with seasonal food availability

• Each agent is controlled by a simple neural network (Elman 1990) evolved by a genetic algorithm

3 experimental setups
Dimensions | 1D | 2D | 0D
Model | Ring world | Grid world | Action-based
Results | Synced wake-up using signaling | Synced wake-up using signaling | Speciated resource-saving behaviors

Legend (ring world figure):
FP-x: food patch x; x ∈ {0, ..., P}
A-y: agent y; y ∈ {0, ..., N}
A-y(sv): agent y signal value; sv ∈ {0, ..., Patch Spacing}

• Signaling agents showed better collective performance than non-signaling agents.

• The agents wake up from hibernation based on other agents’ signals.
[Figure: ring map (1D) and lattice map (2D) with food and agents, in summer and winter]

Population vs. size vs. time: shows evolutionarily stable strategy
• Without direct communication, agents develop specific strategies to survive winters.

• Strategies: fast reproduction, resource saving, and hibernation.
[Figure: action-cost model, cycles detected. Number of individuals against agent’s size and time step, for small, mid-sized, and large agents]

Conclusion
• In dynamic environments, agents synchronize foraging with the seasons using communication.

• Without direct communication, agents use specific strategies to save resources.

• Next, we consider static resources in a minimalist system.

Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. July 2012. When is happy hour: An agent’s concept of time. Proceedings of the Thirteenth International Conference on the Synthesis and Simulation of Living Systems, 13, 544–545.
Olaf Witkowski and Geoff Nitschke. September 2013. The Transmission of Migratory Behaviors. Proceedings of the Twelfth European Conference on Artificial Life, 12, 1218–1220.
Olaf Witkowski and Nathanaël Aubert. July 2013. Size Does Matter: The Impact of Size on Hoarding Behaviour. Proceedings of the Thirteenth International Conference on the Synthesis and Simulation of Living Systems, 13, 542–543.

Signal-based swarming
Goal: use a minimalist 3D simulation to explore the emergence of swarming based on signaling

[Figure: starling murmuration, from “A Bird Ballet” by Neels Castillon]

• Reynolds’ basic flocking model (1986) consisted of three simple steering behaviors that determined how individual boids should maneuver based on their velocity and position within the flock:
Separation, Alignment, Cohesion
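The three steering behaviors can be sketched for a single boid as follows. This is a simplified 2D illustration, not Reynolds’ original implementation: the perception `radius`, the `min_dist` used for separation, and the unweighted form of each rule are assumptions.

```python
def boid_steering(index, positions, velocities, radius=5.0, min_dist=1.0):
    """Return (separation, alignment, cohesion) steering vectors for one boid."""
    px, py = positions[index]
    vx, vy = velocities[index]
    sep_x = sep_y = 0.0
    sum_px = sum_py = sum_vx = sum_vy = 0.0
    count = 0
    for j, (qx, qy) in enumerate(positions):
        if j == index:
            continue
        dx, dy = qx - px, qy - py
        dist = (dx * dx + dy * dy) ** 0.5
        if dist > radius:
            continue  # outside the perception radius
        count += 1
        sum_px += qx
        sum_py += qy
        sum_vx += velocities[j][0]
        sum_vy += velocities[j][1]
        if dist < min_dist:
            # Separation: steer away from neighbors that are too close.
            sep_x -= dx
            sep_y -= dy
    if count == 0:
        return (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)
    # Alignment: steer toward the average heading of neighbors.
    alignment = (sum_vx / count - vx, sum_vy / count - vy)
    # Cohesion: steer toward the average position of neighbors.
    cohesion = (sum_px / count - px, sum_py / count - py)
    return (sep_x, sep_y), alignment, cohesion

positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
velocities = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
separation, alignment, cohesion = boid_steering(0, positions, velocities)
```

A full flocking simulation would add a weighted sum of the three vectors to each boid’s velocity at every time step.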

• The model was gradually improved by adding rules or fixed leaders (Mataric 1992, Hartman & Benes 2006, Cucker & Huepe 2008, Su et al. 2009, Yu et al. 2010, Chiew et al. 2013)

• Swarming can also be developed using an evolutionary robotics approach, often with complex sensors and pressures such as predators (Tu and Terzopoulos 1994, Ward et al. 2001, Olson et al. 2013)
[Figure: Hartman & Benes (2006)]

• In our 3D simulation, blind sound-emitting agents look for a hidden food resource. An asynchronous reproduction scheme is used to evolve the agents’ controllers.
• The model shows that (a) collective motion emerges from the combination of a signaling system and a foraging task, and (b) clustering improves the search.

• Each agent is equipped with 1 signaling device and 6 sensors.

• The sensors detect signals produced by other agents from 6 directions.
[Figure: an emitter sends a signal to a receiver, whose sensors are numbered 1 to 6]

Simulation - Agent survival & reproduction
• Energy cost = 0.01 + [0.0 ; 0.001]
• Energy > 10 -> replication with mutation
• No energy (energy = 0) -> death
[Energy gain formula: not recoverable from the extracted text; its terms are carrying capacity, distance to goal, and survival]

Model - Neural controller
• Inputs: S1..6 = sensed signals
• Outputs: M1 = pitch, M2 = yaw, S = produced signal
• Elman simple recurrent network architecture (Elman 1990)
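An Elman network differs from a plain feedforward network in that the hidden layer is copied into context units and fed back at the next step, giving the controller a short-term memory. The sketch below assumes 6 inputs and 3 outputs to mirror the labels on this slide; the hidden layer size of 10, the tanh activation, the absence of biases, and the random weights are assumptions.

```python
import math
import random

class ElmanNetwork:
    def __init__(self, n_in, n_hidden, n_out, rng):
        def matrix(rows, cols):
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
        self.w_in = matrix(n_hidden, n_in)           # input -> hidden
        self.w_context = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)         # hidden -> output
        self.context = [0.0] * n_hidden

    def step(self, inputs):
        hidden = []
        for w_i, w_c in zip(self.w_in, self.w_context):
            total = sum(w * x for w, x in zip(w_i, inputs))
            total += sum(w * c for w, c in zip(w_c, self.context))
            hidden.append(math.tanh(total))
        self.context = hidden  # context units store a copy of the hidden layer
        return [math.tanh(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]

net = ElmanNetwork(n_in=6, n_hidden=10, n_out=3, rng=random.Random(0))
pitch, yaw, signal = net.step([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
# With silent sensors, the output still depends on what was sensed one step earlier.
pitch_next, yaw_next, signal_next = net.step([0.0] * 6)
```

In the evolutionary setting, the three weight matrices would be flattened into the genotype rather than drawn at random.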

Results - Emergence of swarming
• Agents self-organize into swarms without any external control other than the fitness they gain from being closer to the goal.

• The agents go through three phases: (1) random motion, (2) dynamically changing clusters, and (3) a compact ball around the resource.

Results - Neighborhood analysis
[Figure: average number of neighbors (10 runs) over time steps, with signaling on vs. off]

Results - Distance to goal areas (signal on/off)
[Figure: average distance to goal at every iteration over simulation steps, for a regular run (signal on) and a silent control simulation (signal off)]

Results - Transfer entropy
• The transfer entropy (Schreiber 2000) T from a random process X to another process Y is a measure of the amount of directed transfer of information from X to Y. It is defined in terms of the Shannon entropy H (Shannon & Weaver 1949).
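For discrete time series, transfer entropy is commonly written as H(Y_t | Y_{t-1}) - H(Y_t | Y_{t-1}, X_{t-1}) when a history length of 1 is used. The sketch below is a simple plug-in estimator of that quantity; the history length, the discretization of the agents’ data, and the toy series are assumptions and may differ from what was used in the thesis.

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate, in bits, of T(X -> Y) with history length 1."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))  # (y_t, y_prev, x_prev)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))        # (y_prev, x_prev)
    pairs_yy = Counter(zip(y[1:], y[:-1]))         # (y_t, y_prev)
    singles = Counter(y[:-1])                      # y_prev
    n = len(y) - 1
    te = 0.0
    for (y_t, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y_prev, x_prev)]
        p_given_own = pairs_yy[(y_t, y_prev)] / singles[y_prev]
        te += p_joint * math.log2(p_given_both / p_given_own)
    return te

# Toy data: y copies x with a one-step delay, so information flows from x to y only.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]
te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
```

In this toy case the estimate from x to y is close to 1 bit and the reverse direction is close to 0, which is the asymmetry that makes the measure useful for detecting who follows whom.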

Results - Measure of following behavior
[Figure: inward neighborhood transfer entropy over time steps, signal on vs. signal off]

Results - Measure of individual leadership
[Figure: outward neighborhood transfer entropy over time steps]

Phylogenetic tree & neutral selection
[Figure: Principal Component Analysis, PC1 against PC2 (color = iteration, radius = swarming)]
Biplot of a PCA on the genotypes of all agents in a typical run, over one million iterations. Each circle represents one agent’s genotype; its diameter represents the average number of neighbors around the agent over its lifetime, and its color shows its time of death.

Conclusion
• In this chapter, we used a minimalist model to demonstrate the emergence of swarming behavior.

• The agents exchange signals in order to swarm together, which in turn improves their foraging.

• Next, we explore the same model with a different task.

Olaf Witkowski and Takashi Ikegami. Expected mid-2015 (In preparation). Signal-based swarming and neutral selection. Submitted to PLoS Computational Biology. <Paper>
Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. January 2015 (In press). Signal drives genetic diversity: an agent-based approach to speciation. Proceedings of the Twentieth International Symposium on Artificial Life and Robotics, 20. <Paper, Talk>
Olaf Witkowski and Takashi Ikegami. July 2014. Asynchronous evolution: Emergence of signal-based swarming. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 302–309. <Paper, Talk>

Swarming in dynamic 3D Prisoner’s Dilemma
Goal: find the impact of a cooperation/defection game on the agents’ collective behavior

[Figure: food sharing in vampire bats. Source: Attenborough, D. (2011). Friends and Rivals. BBC documentary.]

Iterated Prisoner’s Dilemma (IPD)
• Prisoner’s Dilemma (Flood & Dresher 1950): each player can Cooperate (C) or Defect (D)

• Iterated version (Axelrod 1984)

• Spatial version (Nowak & May 1993)
• Our version: dynamic & spatial
[Figures: spatial Prisoner’s Dilemma; PD reward matrix]

Dynamic Spatial IPD
• Agent moves on a 3D map

• Agent controls its direction (constant speed)

• Communication through signals (2 channels) to detect “friendly neighbors”

• Agent chooses to cooperate or defect
[Figure: simulation visualization, with cooperation in blue and defection in red]

Differences with previous model
 | Ch. 4 | Ch. 5
Task | distance to resource | play Prisoner’s Dilemma
Reproduction | offspring added globally | offspring added locally

Agent’s Controller
[Figure: Elman (1990) network with sensor inputs, hidden units, and context units; outputs for movement, communication, and the choice to cooperate or defect. Shown next to the previous controller.]

Coop. vs Def. Costs & Payoff Matrix
• We extend the reward per iteration from Chiong & Kirley (2012) to take into account spatial continuity. It is defined by:

\[
\begin{cases}
C: & b \sum_{\text{coop} \in \text{radius}} \dfrac{1}{1 + \text{distance}(\text{coop}, \text{me})} - c \sum_{\text{any} \in \text{radius}} \dfrac{1}{1 + \text{distance}(\text{any}, \text{me})} \\[2ex]
D: & b \sum_{\text{coop} \in \text{radius}} \dfrac{1}{1 + \text{distance}(\text{coop}, \text{me})}
\end{cases}
\tag{1}
\]

With b the bonus, c the cooperation cost, b > c > 0, and distance the Euclidean distance between two agents. Radius represents the sphere of radius radius around the agent. Note that the agent itself is not considered part of its neighborhood. The distance is not part of the original fitness, which made sense since Chiong and Kirley (2012) base their simulation on a lattice, where the distance is always the same.

Figure 2: Architecture of the agent’s controller, composed of 12 input neurons, 10 hidden neurons, 10 context neurons and 5 output neurons.
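To make the distance-weighted reward concrete, here is a small sketch that evaluates it for one agent. The data layout (tuples of a 3D position and a cooperation flag) and the values b = 2 and c = 1 are assumptions that merely satisfy b > c > 0; the slides do not give the values used in the experiments.

```python
import math

def reward(me, others, radius, b=2.0, c=1.0):
    """Reward for one iteration; me and others are (position, cooperates) tuples."""
    my_position, i_cooperate = me
    from_cooperators = 0.0  # bonus received from cooperators within the radius
    to_everyone = 0.0       # cost paid toward every neighbor within the radius
    for position, cooperates in others:
        d = math.dist(my_position, position)
        if d > radius:
            continue
        weight = 1.0 / (1.0 + d)  # closer neighbors count more
        to_everyone += weight
        if cooperates:
            from_cooperators += weight
    if i_cooperate:
        return b * from_cooperators - c * to_everyone
    return b * from_cooperators

neighbors = [((1.0, 0.0, 0.0), True),    # cooperator at distance 1
             ((0.0, 3.0, 0.0), False),   # defector at distance 3
             ((10.0, 0.0, 0.0), True)]   # cooperator outside the radius
as_cooperator = reward(((0.0, 0.0, 0.0), True), neighbors, radius=5.0)
as_defector = reward(((0.0, 0.0, 0.0), False), neighbors, radius=5.0)
```

With these neighbors, defecting pays more than cooperating in the same spot, which is the dilemma: a cooperator only does well when it is surrounded mostly by other cooperators.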

Simulation
Observed behaviors:
(a) seek and destroy
(b) cluster with high mobility / high reproduction rate

Simulation - Cooperators increase
[Figure: proportion of cooperators in the population over time steps]

Simulation - Cooperators’ invasion

Simulation - Cooperators’ stronger signal
[Figure: signaling strength in the population over time steps]

Simulation - Cooperators are moving faster
[Figure: average displacement of agents in the population over a 100-step sliding window, over time steps]

Conclusion
• In this chapter, we gained the insight that cooperation requires grouping of collaborating agents.

• This grouping emerges as a degenerate form of the swarming behavior from the previous chapter, using the communication channel to find other cooperators.

Olaf Witkowski and Nathanaël Aubert-Kato. July 2014. Pseudo-static cooperators: Moving isn’t always about going somewhere. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 392–397. <Paper, Talk>

Conclusion
Summary of the specific focus of every chapter:
• 3D signal-swarming models (ch. 4)
• 3D spatial Prisoner’s Dilemma (ch. 5)
• Synchronization vs. variability (ch. 6)
• Gene-culture coevolution (ch. 7)

Conclusion
• In this thesis, using evolutionary robotics, we demonstrated how groups of agents can evolve efficient collective behavior based on communication.

• The way groups of animals come to cooperate by exchanging information is essential to optimizing their behavior in an environment.

• Future swarm computation will need to build robots that are not directly controlled by human rules, but interact with each other to solve problems.

I am so thankful to…
Takashi Ikegami

Nathanaël Aubert-Kato, Geoff Nitschke, Julien Hubert, Luke McCrohon

Everyone in Ikegami Lab

Jun’ichi Tsujii, Reiji Suda, Masami Hagiya, all the committee members

My loving family & truly awesome friends

Publications and conferences
(In reverse chronological order)
Olaf Witkowski and Takashi Ikegami. Expected mid-2015 (In preparation). Signal-based swarming and neutral selection. PLoS Computational Biology. <Paper>
Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. January 2015 (In press). Signal drives genetic diversity: an agent-based approach to speciation. Proceedings of the Twentieth International Symposium on Artificial Life and Robotics, 20. <Paper, Talk>
Olaf Witkowski and Nathanaël Aubert-Kato. July 2014. Pseudo-static cooperators: Moving isn’t always about going somewhere. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 392–397. <Paper, Talk>
Olaf Witkowski and Takashi Ikegami. July 2014. Asynchronous evolution: Emergence of signal-based swarming. Proceedings of the Fourteenth International Conference on the Simulation and Synthesis of Living Systems, 14, 302–309. <Paper, Talk>
Olaf Witkowski and Geoff Nitschke. September 2013. The Transmission of Migratory Behaviors. Proceedings of the Twelfth European Conference on Artificial Life, 12, 1218–1220. <Paper, Talk>
Olaf Witkowski and Nathanaël Aubert. July 2013. Size Does Matter: The Impact of Size on Hoarding Behaviour. Proceedings of the Thirteenth International Conference on the Synthesis and Simulation of Living Systems, 13, 542–543. <Extended Abstract, Talk>
Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. July 2012. When is happy hour: An agent’s concept of time. Proceedings of the Thirteenth International Conference on the Synthesis and Simulation of Living Systems, 13, 544–545. <Extended Abstract, Poster>
Olaf Witkowski and Nathanaël Aubert. May 2012. Size Does Matter: The Impact of Size on Hoarding Behaviour. Bio UT International Life Sciences Symposium. <Abstract, Poster>
Olaf Witkowski, Geoff Nitschke and Takashi Ikegami. March 2012. Time To Migrate: The Effect of Lifespan on Imitation and Culturally Learned Migration. Seventh International Workshop on Natural Computing. <Abstract, Talk>
Luke McCrohon and Olaf Witkowski. August 2011. Devil in the details: Analysis of a coevolutionary model of language evolution via relaxation of selection. Advances in Artificial Life, ECAL 2011. Proceedings of the Eleventh European Conference on the Synthesis and Simulation of Living Systems, 522–529. <Paper, Talk>
Olaf Witkowski. September 2011. A Two-Speed Language Evolution: Exploring the Linguistic Carrying Capacity. Proceedings of Ways to Protolanguage 2 (Protolang 2011). <Paper, Talk>
Olaf Witkowski. July 2011. Can Cultural Adaptation Lead to Evolutionary Suicide? At HBES 2011 (23rd Annual Human Behavior & Evolution Society Conference). <Abstract, Poster>
Olaf Witkowski. August 2010. A Two-Speed Language Evolution. At Freelinguistics 2010 (4th Annual International Free Linguistics Conference). <Abstract, Talk>
