A Reactive Agent-Based Problem-Solving
Model: Application to Localization
and Tracking
FRANCK GECHTER
UTBM-SeT Laboratory
VINCENT CHEVRIER
Loria-UHP Nancy 1
and
FRANÇOIS CHARPILLET
LORIA-INRIA Lorraine
For two decades, multi-agent systems have been an attractive approach for problem solving and
have been applied to a wide range of applications. Despite the lack of generic methodology, the
reactive approach is interesting considering the properties it provides. This article presents a
problem-solving model based on a swarm approach where agents interact using physics-inspired
mechanisms. The initial problem and its constraints are represented through the agents' environment,
the dynamics of which is part of the problem-solving process. This model is then applied to localiza-
tion and target tracking. Experiments assess our approach and compare it to widely-used classical
algorithms.
Categories and Subject Descriptors: I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence – Multiagent systems
General Terms: Experimentation
Additional Key Words and Phrases: Reactive multi-agent systems, localization, tracking, mobile
robots
1. INTRODUCTION
For fifteen years multi-agent systems have been an attractive approach for
problem solving. This approach has been applied to a wide range of applica-
tions. Among the classical models, the reactive approach is one of the most
Authors' addresses: F. Gechter, UTBM, SeT Laboratory, Computer Science Team, F-90010 Belfort
Cedex; email: franck.gechter@utbm.fr; V. Chevrier, UHP-Nancy 1 Loria UMR 7503, MAIA Team,
BP 239, F-54506 Vandoeuvre cedex; email: chevrier@loria.fr; F. Charpillet, INRIA-Lorraine Loria
UMR 7503, MAIA Team, BP 239, F-54506 Vandoeuvre cedex; email: francois.charpillet@loria.fr.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is
granted without fee provided that copies are not made or distributed for profit or direct commercial
advantage and that copies show this notice on the first page or initial screen of a display along
with the full citation. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
to redistribute to lists, or to use any component of this work in other works requires prior specific
permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn
Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org.
© 2006 ACM 1556-4665/06/1200-0189 $5.00
ACM Transactions on Autonomous and Adaptive Systems, Vol. 1, No. 2, December 2006, Pages 189–222.
interesting. Such systems rely on reactive agents, which are simple entities that
behave based on their perceptions [Ferber 1999]. According to Muller [2004] and
Muller and Parunak [1998], the difference between a multi-agent system (MAS)
and a classical problem-solving method is the role and the significance of the
interactions, which prevail over the definition of the agents themselves. Moreover,
the environment also plays a preponderant role, as shown in Weyns et al. [2005]
or Parunak [1997], since it is the main place where the system computes, builds,
and communicates. Indeed, in the reactive MAS framework, one single agent
can neither handle a representation of the problem nor compute the global
solution. Instead, this solution is obtained from numerous agent-agent and/or
agent-environment interactions [Ferber 1999; Kennedy and Eberhart 2001].
These sorts of systems usually show such features as self-organization or
emergent phenomena, robustness, adaptability, simplicity, and redundancy of
the agents (and consequently low-cost agent design). It has been shown that
this approach is efficient for tackling complex problems such as life-systems
simulation and study [Parunak 1997; Kennedy and Eberhart 2001; Mamei and
Zambonelli 2005], cooperation of situated agents/robots [Steels 1989; Mataric
1995; Drogoul and Ferber 1993; Mamei and Zambonelli 2005], and problem
and game-solving [Drogoul et al. 1991; Drogoul and Dubreuil 1993]. As for
the behavioral models, two main trends are generally used. The first one is
a biologically-inspired approach such as DiMarzo-Serugendo et al. [2004],
Bonabeau et al. [1999], Parunak [1997], Bourjot et al. [2002] or Brueckner
[2000]. The second trend is the use of physics-inspired behavioral models such
as Simonin and Ferber [2000], Reynolds [1987], Mamei and Zambonelli [2004]
or Zeghal and Ferber [1994].
One of the main problems with reactive approaches is the small number of
methodologies and guidelines. The most well-known are Bernon et al. [2004],
Muller [2004], Drogoul et al. [1991], Ferber and Jacopin [1991], and Parunak
[1997].
The goal of this article is to present a problem-solving model based on reactive
MAS applied to localization and tracking. In this model, the environment is
the main component of the problem-solving process. Indeed, it makes the link
between the problem and its constraints (topology and dynamics) on one side
and the problem-solving mechanism on the other.
Localization and target tracking are hard but necessary tasks when we want
to control and follow mobile robots in a dynamical and uncertain environment.
Localization can be defined as finding the position of an object, mobile or not, in
a well-known frame of reference. There are two kinds of methods: localization
with on-board sensors (also called self-localization) and localization with
external sensors. The algorithms used stem generally from signal or image pro-
cessing [Pourraz and Crowley 1999; Henkel 2000; Demirli and Turksen 2000;
Bernardino and Santos-Victor 1998; Baerveldt 2001; Adorni et al. 2001], or
from stochastic methods based on Markov Decision Processes (MDP) [Gechter
and Charpillet 2000; Gechter et al. 2001; Fox et al. 2001; Thrun 1999]. The
standard localization algorithms depend heavily on the nature of the sensors
used and deal only with one single target. To our knowledge, there are no reac-
tive MAS-based localization and tracking devices. However, some related work
using cognitive agents can be found, including environment mapping
[Grabowski et al. 1999] or data fusion [Chong 1998] algorithms. Target tracking
is considered to be a collection of temporally and spatially coherent localiza-
tions, and the resulting algorithms are based on signal processing. Among the
most well-known are Kalman filter [Roumeliotis et al. 1999; Welch and Bishop
2000], optical flow algorithms [Nagel and Gehrke 1998], and particle filtering
[Kwok et al. 2003].
The article is structured as follows. The first section outlines the key concepts
we used to propose our problem-solving methods, and then details how these
concepts have been used to build a concrete architecture. Finally, this section
ends with an overview of the related work on both the model and application
domains. The next section reports the experiments done in order to assess our
approach and compare it with a classical method. The last section stresses the
main contributions of this work and draws some conclusions.
2. THE REACTIVE PROBLEM-SOLVING MODEL
This section explains the problem-solving model. After a short description of
the mechanism, the model is described in detail.
2.1 General Overview
The proposed model can be considered an application of the methodology devel-
oped in Simonin and Gechter [2006]. This methodology puts the environment
in the center of the problem-solving process as the place where the problem and
its constraints are specified and presented to the perception of the agents. Then
interactions are defined in order to take into account the problem's dynamics.
Finally, environment, agents, and interactions lead to an emergent structure
that can be considered a solution to the problem.
Localization and tracking are based on the use of sensors that are spread out
in the real world. This can be considered an area, observable by the sensors,
where the targets are expected to move. The environment is depicted using
an occupancy grid that represents the observable areas of the real world
according to the sensors' range. The dynamics of the problem depend on the
dynamics of the targets, which can (i) appear, that is, arrive in the observation field of
the sensors, (ii) move, that is, go from one observable point of the real world to
another observable point, (iii) disappear, that is, go out of the observation field.
These dynamics have been accounted for using two main trends. First, accu-
mulation of the sensing information deals with the appearance of the targets.
This accumulation leads to the construction of a plot (namely, a local probability
distribution) that represents a possible position for a target. This construction
can be considered a deformation of the environment that has to be perceived
by the agents. Second, there is attenuation of the plot in order to deal with
target disappearance. Together, these two trends take into account the targets'
movements.
The perceptions of the agents then have to be defined. The agents perceive
the plots through the environment by means of an attraction force (formulated
to account for the nearsightedness of the agents). This force is induced by the
appearance of a plot and depends on its size.
Fig. 1. General overview.
As for the interaction mechanisms, they have to be defined taking into ac-
count individual and collective points of view. Moreover, regulation mechanisms
are required. A repulsion mechanism is defined between agents in order to
spread them in the informationless areas of the environment. This mechanism
is inhibited when the agents are on a plot. There, the agents cooperate by ampli-
fying the attenuation of the plot in order to limit the size of the resulting group.
Finally, the emergent organization is characterized by both a gathering of the
agents on the plots, which leads to group construction, and a homogeneous
distribution of them in the informationless areas. Each group can thus be con-
sidered a localized target. The output of the system is stable when equilibrium
is established between refreshing and solving dynamics. Figure 1 summarizes
the architecture of our problem-solving model.
We can now describe the model in detail.
2.2 The Environment
2.2.1 Representation. We assume that the environment topology is known
a priori. This topology corresponds to the union of the known boundaries of
the observed area according to the range of the sensors used. The agents'
environment is an abstract representation of the real world composed of a set of
states. Each such state is associated with a bounded value, corresponding to the
probability that a target is present. Each probable position is thus represented
by a perception plot given locally by a state value that corresponds to the per-
ceived reliability of the assumption that a target is at that location. This local
value can be taken to be the altitude of the given plot at the associated state of
the environment. The model of the environment is as follows:

\[
E = \{S_i\}_i \quad \text{with} \quad
\begin{cases}
S_i = \{x_i, y_i, z_i(t)\} \\
z_i : \mathbb{R}^+ \to [0, z_{max}], \; t \mapsto z_i(t)
\end{cases}
\tag{1}
\]
In this equation, S_i is a state characterized by its position (x_i, y_i) and the
function of time z_i(t) that represents the evolution of the altitude of state S_i. In
order to construct the agents' environment by using observations of the real
world, a generic structure, called a perceptive unit (PU), has been defined. This
structure enables us to formalize the construction process by designing required
properties for the devices expected to furnish data to the system.
A PU is composed of a set of sensors linked to signal processing algorithms.
They furnish structured data that are integrated dynamically in the environ-
ment through the accumulation mechanism.
2.2.2 Dynamics. The environment's dynamics have to fit those of the real
world and the targets. Targets appear, move, and disappear; the environment
must be able to account for these three phenomena. In our solving model, they
are represented by two opposite principles: accumulation, which takes into
account the appearance of targets, and attenuation¹, which deals with their
disappearance. Target movement can be regarded as a result of these two
tendencies.
- Accumulation. When new information is furnished to the environment by a
PU, the value of the state S_i increases according to the following equation:

\[
z_{i_{ac}}(t) = z_i(t-1) + z_{PU_{S_i}}. \tag{2}
\]

In this equation, z_{i_{ac}}(t) represents the altitude of the plot i at time t, taking
into account only the accumulation, and z_{PU_{S_i}} is the value furnished by the
perceptive unit. z_i(t-1) is the plot's altitude at the previous time step.
- Attenuation. In order to deal with the disappearance of targets and to avoid
the accumulation of out-of-date information in the environment, an attenuation
phenomenon has been introduced. It is able to smoothly eliminate
the plots that are not renewed. Thus, the mechanism tends to make the
outdated data disappear without using a specific agent behavior. The following
equation represents the attenuation mechanism:

\[
z_{i_{at}}(t) = z_i(t-1) \cdot (1 - \epsilon). \tag{3}
\]

As was the case for accumulation, z_{i_{at}}(t) represents the ith plot's altitude at
time t, taking into account only the attenuation mechanism. \epsilon represents the
coefficient of attenuation, drawn from [0, 1].
- Putting accumulation and attenuation together. The dynamics of the targets
are represented by the conjunction of the attenuation and accumulation mechanisms.
Thanks to these two phenomena, the environment has its own dynamics,
which serve to maintain the temporal coherence of the data. Accumulation
takes into account new probable positions and reinforces the old ones,
whereas attenuation allows us to control the pertinence of the information
in the environment. The movement of a target generates new information
that corresponds to another state. This will lead to the appearance of a new
probable position (by accumulation) when attenuation starts to make the old
position disappear (see Figure 2).

¹ Attenuation can be compared to the evaporation mechanism used in digital pheromone models.

Fig. 2. How to formalize the target's movements. (A) A plot appears in the environment. (B)
Thanks to attenuation, the plot starts to disappear. (C) The target has moved, creating another
plot in the environment (accumulation phenomenon). The altitude of the first plot is still decreasing.
(D) The first plot has disappeared and the new one starts to attenuate.
The environment, as it has been described, is conservative; that is, no energy is
lost by either the environment or the agents. Thus, the effects of the behaviors
are added at each time step without any limitation, implying what is called
an infinitely amplified movement in signal processing.² In addition, without
any energy loss, an individual agent that is not externally influenced will
move continuously at a constant speed until it reaches a border of the world,
and, as a result, the overall system will not stabilize. In order to cope with
this problem, energy dissipation has been introduced thanks to a fluid friction
force applied to each agent:

\[
\vec{F}_{friction} = -\lambda \vec{V}. \tag{4}
\]

In this equation, \lambda is the isotropic fluid friction coefficient of the environment
and \vec{V} is the speed of the agent. The influence of the friction force, and thus
the inertia of the system, are controlled by \lambda: the greater \lambda, the more the
environment opposes the agents' movements.
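As an illustrative sketch (not the authors' implementation), the accumulation and attenuation rules of Equations (2) and (3) can be simulated for a single state of the environment; the values of z_max, ε, and the perceptive-unit readings below are assumptions chosen for illustration:

```python
# Illustrative sketch of the plot dynamics of Eqs. (2)-(3); z_max,
# epsilon and the PU readings are assumed values, not the paper's.

Z_MAX = 10.0   # upper bound of the altitude range [0, z_max]

def attenuate(z_prev, eps):
    """Eq. (3): multiplicative decay of a plot that is not renewed."""
    return z_prev * (1.0 - eps)

def accumulate(z_prev, z_pu):
    """Eq. (2): add the perceptive-unit reading, bounded by z_max."""
    return min(z_prev + z_pu, Z_MAX)

# A target reported for three steps, then lost: the plot rises, then fades,
# reproducing the appearance/disappearance cycle of Figure 2.
z, eps, history = 0.0, 0.2, []
for t in range(6):
    reading = 2.0 if t < 3 else 0.0   # the PU reports the target, then loses it
    z = accumulate(attenuate(z, eps), reading)
    history.append(round(z, 3))
print(history)
```

The decay-then-add ordering mirrors the combined update of Equation (14): the altitude peaks while observations arrive and smoothly vanishes once they stop.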
2.3 From Collective Organization to Local Behavior
2.3.1 General Overview. On a macroscopic level, we want to obtain a gath-
ering of the agents over the plots and a homogeneous distribution of them in
the plotless areas of the environment. A gathering of agents corresponds to a
detected target. These two opposed collective behaviors make the system tend
toward a state of balance which can be taken to be an organization that solves
the problem specified by the environment.
These two macroscopic trends have to be translated to the microscopic level.
Gathering is associated with an attraction between the agents and the plots
and the homogeneous distribution is tied to interagent repulsion. Moreover,
two regulation phenomena have been defined in order to balance the agents'
main tendencies. Attraction is thus inhibited by attenuation amplification on
the part of agents, which regulates their population in proportion to the altitude
of the plot. In contrast, repulsion is inhibited when the agents are on a plot.
This behavior allows the groups to be coherent, avoiding ejection of agents from
the group.
² As opposed to periodic, absorbed, or absorbed periodic movements.
Since the interaction model is inspired by physics, the agents are treated
as small mass particles in a force field. An agent a is described by its position
(x_a, y_a) in the environment, its speed (V_{x_a}, V_{y_a}), its acceleration
(\gamma_{x_a}, \gamma_{y_a}), its mass m_a, and a potential energy level E_a. An agent
has the following possible behaviors:

- movement induced by the sum of the forces applied to it,
- attenuation amplification.
We will now describe in detail the attraction and repulsion forces.
2.3.2 Attraction and Repulsion Forces
- Repulsion behavior. Repulsion can be treated as a negative gravitation force
between two weighted elements. It is aimed at maintaining a homogeneous
and uniform distribution of the agents in the plotless areas. As with natural
gravitation, repulsion follows a 1/r² law, where r is the distance between
agents.

The following equation gives the analytic expression of the repulsion force
applied to agent A_j, considering the influence of agent A_i. \alpha is a scalar
multiplier that takes into account the environmental gravitational constant and
the proportion of repulsion compared with the other forces. In practice, since
the agents' environment is virtual, this constant allows us to tune the importance
of the repulsion behavior relative to the other forces. In this equation, m_i and
m_j are, respectively, the weights of the agents A_i and A_j. In our case, since
the population is homogeneous, the weights are the same.

\[
\vec{R}_{ij} = \alpha \, m_i m_j \, \frac{\overrightarrow{A_i A_j}}{\left\| \overrightarrow{A_i A_j} \right\|^3}. \tag{5}
\]
When the vector is replaced by the coordinates of each agent and the equation
is generalized to the whole population, the following is obtained:

\[
\begin{cases}
R_{X_j} = \sum_{i \neq j} \alpha \, m_i m_j \, \dfrac{(x_j - x_i)}{\left((y_j - y_i)^2 + (x_j - x_i)^2\right)^{3/2}} \\[2ex]
R_{Y_j} = \sum_{i \neq j} \alpha \, m_i m_j \, \dfrac{(y_j - y_i)}{\left((y_j - y_i)^2 + (x_j - x_i)^2\right)^{3/2}}
\end{cases} \tag{6}
\]

In this equation, x_i, y_i, m_i and x_j, y_j, m_j represent the coordinates and the
weights of the agents i and j.
- Attraction behavior. The attraction force, as opposed to the repulsion force,
acts between the agents and their environment, and particularly the centers of
plots; it can also be compared to a gravitational force.

From a mathematical point of view, it is natural to define this behavior
by the following equation:

\[
\vec{F} = \beta \, m_A \, alt_P \, \frac{\overrightarrow{AP}}{\left\| \overrightarrow{AP} \right\|^3}. \tag{7}
\]

This equation represents the force applied to an agent A by the plot P. The
vector between A and P is computed by taking into account the respective
positions of the agent and of the plot. Coefficients m_A and alt_P are the weight of
Fig. 3. The three steps of the attraction behavior.
the agent and the altitude of the plot, respectively. As shown for the repulsion,
β represents the gravitational constant of the environment.
The drawback of this formulation is that as an agent nears a plot, the force
tends toward infinity. Consequently, it is impossible to keep the agent
on the plot. Instead, we observe a gravitational sling phenomenon well
known to physicists.

One solution can be the use of a heuristic that forces the agent to stay on the
plot by inhibiting the attraction force as the agent nears it. This is the solution
used by Simonin and Ferber [2000] to solve the same problem.
As for our model, we decided to find a physically coherent solution instead
of adding a heuristic to the behaviors. This allows us to define a continuous
vectorial attraction function, producing a coherent measure of an agent's potential
energy. This energy can be regarded as a measure of the agent's local
state and can be used, for instance, in learning or optimization algorithms. The
attraction model we develop has three steps (see Figure 3).

The first step, between ∞ and d, uses the standard definition (see
Equation (7)). Then, until d′, the attraction behavior uses a constant law aimed
at reducing the influence of the distance on the force's amplitude, as shown in
Equation (8). Finally, between d′ and 0, the attraction law is linear, involving a
decrease in the agents' speed within the plot's area of influence, per the following
equations:

\[
\vec{F}_{dd'} = \beta_1 \, m_A \, alt_P \, \frac{\overrightarrow{AP}}{\left\| \overrightarrow{AP} \right\|} \quad \text{with} \quad \beta_1 = \frac{\beta}{d^2}. \tag{8}
\]

\[
\vec{F}_{0d'} = \beta_2 \, m_A \, alt_P \, \overrightarrow{AP} \quad \text{with} \quad \beta_2 = \frac{\beta}{d' d^2}. \tag{9}
\]
When the previous equation is modified taking into account the coordinates
of the agent and all the plots in the environment, it becomes

\[
\begin{cases}
F_{x_A} = \sum_i k \, \beta \, m_A \, alt_{P_i} \, \dfrac{(x_{P_i} - x_A)}{\left((x_{P_i} - x_A)^2 + (y_{P_i} - y_A)^2\right)^{n/2}} \\[2ex]
F_{y_A} = \sum_i k \, \beta \, m_A \, alt_{P_i} \, \dfrac{(y_{P_i} - y_A)}{\left((x_{P_i} - x_A)^2 + (y_{P_i} - y_A)^2\right)^{n/2}}
\end{cases} \tag{10}
\]

with

\[
(k, n) =
\begin{cases}
\left(\frac{1}{d' d^2}, 0\right) & \text{if } (x_{P_i} - x_A)^2 + (y_{P_i} - y_A)^2 \in [0, d'] \\[1ex]
\left(\frac{1}{d^2}, 1\right) & \text{if } (x_{P_i} - x_A)^2 + (y_{P_i} - y_A)^2 \in [d', d] \\[1ex]
(1, 3) & \text{if } (x_{P_i} - x_A)^2 + (y_{P_i} - y_A)^2 \in [d, \infty].
\end{cases}
\]

This equation is the one used in our model. Here, \beta represents the environment's
attraction constant, m_A and alt_{P_i} the weight of agent A and the altitude of
the plot P_i, and (x_A, y_A) and (x_{P_i}, y_{P_i}) their coordinates. Moreover, k and n are
modulation parameters, dependent on the distance between agent and plot. d
and d′ are model parameters that have to be tuned.
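The pairwise repulsion of Equation (6) and the piecewise attraction of Equation (10) can be sketched as follows. This is an illustrative reading of the formulas, not the authors' code; α, β, d, d′, the weights, and the altitudes are arbitrary values:

```python
# Sketch of the two physics-inspired forces: pairwise repulsion (Eq. (6))
# and the three-step attraction law (Eq. (10)). All parameters are
# illustrative and would have to be tuned as the paper notes.
import math

def repulsion(agents, j, alpha=1.0):
    """Total repulsive force (Rx, Ry) on agent j; agents = [(x, y, m), ...]."""
    xj, yj, mj = agents[j]
    rx = ry = 0.0
    for i, (xi, yi, mi) in enumerate(agents):
        if i == j:
            continue
        r3 = ((xj - xi) ** 2 + (yj - yi) ** 2) ** 1.5
        rx += alpha * mi * mj * (xj - xi) / r3
        ry += alpha * mi * mj * (yj - yi) / r3
    return rx, ry

def attraction(agent_xy, plots, beta=1.0, d=5.0, d_prime=1.0, m_a=1.0):
    """Force (Fx, Fy) pulling an agent toward every plot (x, y, altitude)."""
    xa, ya = agent_xy
    fx = fy = 0.0
    for xp, yp, alt in plots:
        r = math.hypot(xp - xa, yp - ya)
        if r >= d:                        # far: 1/r^2 gravitational law
            k, n = 1.0, 3
        elif r >= d_prime:                # middle: constant amplitude
            k, n = 1.0 / d ** 2, 1
        else:                             # near: linear decrease toward 0
            k, n = 1.0 / (d_prime * d ** 2), 0
        fx += k * beta * m_a * alt * (xp - xa) / r ** n
        fy += k * beta * m_a * alt * (yp - ya) / r ** n
    return fx, fy

print(repulsion([(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)], 1))   # pushed along +x
print(attraction((0.0, 0.0), [(5.0, 0.0, 1.0)]))          # pulled along +x
```

One can check that the (k, n) choices above make the force magnitude continuous at the two boundary distances d and d′, which is the point of the three-step construction.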
2.3.3 Regulation Processes
- Attenuation amplification. As previously described, the goal of attenuation
amplification is to regulate the number of agents on plots. Since attraction
increases when the plot has a high amplitude, this behavior has to be reduced
when the agents are too numerous.

At each time step, an agent amplifies the attenuation of the plot according
to an amplification rate called \mu. The next equation shows how the altitude
of a plot evolves according to the agents that are on it:

\[
z_i(t) = z_i(t-1) \cdot \mu^{nb}. \tag{11}
\]

In this equation, z_i(t) represents the amplitude of the plot i at time t, \mu
is the amplification rate of the agents, and nb the number of agents on that
plot. Since the population is homogeneous, \mu is the same for all agents. Its
value lies between 0 and 1.
- Repulsion attenuation. A state measure can be defined for each agent in
order to represent its situation in the system (e.g., on a plot or far from the
other agents). Since a physically-inspired model is used, this measure can be
treated as an energy. It could be expressed as a potential energy
using the classical formulation \left(E_p = -\delta W = -\int \vec{F} \cdot d\vec{u}\right).
Table I. Summary of the Agents' Behaviors

Action name            Attraction                              Repulsion
Equation               F = k β m_A alt_P · AP/‖AP‖^n           R_ij = I α m_i m_j · A_iA_j/‖A_iA_j‖³
Feedback name          Attenuation amplification               Inhibition
Equation               z_i(t) = z_i(t-1) · μ^nb                I = 1/(1 + ν z_A)
Interacting element    Agent/Plot                              Agent/Agent
In the case of regulation of the repulsion behavior, only the classical potential
energy³ has been taken into account. The next equation represents this
value for an agent A:

\[
E_A = \nu \cdot z_A. \tag{12}
\]

In this equation, z_A represents the altitude of agent A (equal to that of its
local state), and \nu is a proportionality coefficient that takes into account the
environment's characteristics.

Then, the inhibition I can be defined as follows:

\[
I = \frac{1}{1 + E_A} = \frac{1}{1 + \nu \cdot z_A}. \tag{13}
\]

This coefficient I is used as a multiplier for the repulsion behavior (see Equation (6)).
When the energy is high, that is, when an agent is on a plot, repulsion de-
creases relative to attraction, allowing the group to stay coherent. In contrast,
when the energy is low, repulsion is predominant, and agents return to their
basic stable state of distribution.
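As a sketch, the two regulation mechanisms of Equations (11) and (13) reduce to two one-line functions; μ and ν below are illustrative rates, not the authors' tuned values:

```python
# Illustrative sketch of the regulation mechanisms (Eqs. (11) and (13));
# mu and nu are assumed values.

def amplify_attenuation(z_prev, mu, nb):
    """Eq. (11): nb agents on a plot multiply its altitude by mu^nb."""
    return z_prev * mu ** nb

def repulsion_inhibition(z_agent, nu):
    """Eq. (13): I = 1/(1 + nu * z_A), a multiplier on the repulsion force."""
    return 1.0 / (1.0 + nu * z_agent)

# The more agents sit on a plot, the faster it erodes...
print(amplify_attenuation(8.0, 0.9, 3))
# ...and an agent sitting on a high plot barely repels its neighbours,
# while an agent in a plotless area repels at full strength.
print(repulsion_inhibition(8.0, 0.5), repulsion_inhibition(0.0, 0.5))
```

Together they implement the negative feedback described above: crowded plots shrink, which weakens attraction, while agents on plots stop ejecting each other, which keeps groups coherent.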
2.4 Group Construction
The defined behaviors lead agents to accumulate on a plot according to their
own local altitudes and distances from their neighbors. When an agent situated
on a plot is linked to its neighbors, a group is obtained.
Considering the nature of the interactions between agents, they can be near
each other only if they are on a plot (due to attraction and the inhibition of
repulsion). As soon as agents are separated by a distance lower than d′, a group is
created, the position of which corresponds to the center of weight of the collected
agents.⁴ The number of agents from which a group is created is a key parameter
of the model; in principle, 2 are sufficient given the nature of the behavioral model,
but experiments show that 3 is an adequate and sufficient value. Finally,
each group so created is linked to a potential target to be followed. This link
between groups and targets and its dynamics are detailed in Section 2.6.3.
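Group construction can be sketched as labelling connected components of agents that lie closer than d′ to one another, keeping only components of at least 3 agents; the positions and the value of d′ below are illustrative:

```python
# Sketch of group construction: agents closer than d_prime get a common
# label (a union-find pass), and components of >= 3 agents become groups
# positioned at their centre of weight. Positions and d_prime are
# illustrative; masses are taken as equal (homogeneous population).
import math

def build_groups(positions, d_prime=1.0, min_size=3):
    """Return groups as (centre_x, centre_y, member_indices)."""
    n = len(positions)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) < d_prime:
                parent[find(i)] = find(j)
    components = {}
    for i in range(n):
        components.setdefault(find(i), []).append(i)
    groups = []
    for members in components.values():
        if len(members) >= min_size:
            cx = sum(positions[i][0] for i in members) / len(members)
            cy = sum(positions[i][1] for i in members) / len(members)
            groups.append((cx, cy, members))
    return groups

pts = [(0.0, 0.0), (0.5, 0.1), (0.4, 0.6), (9.0, 9.0)]
print(build_groups(pts))   # one group of three; the isolated agent is ignored
```

Each returned centre is the position reported as a localized target.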
2.5 Summary of the Agent Behaviors and the Environmental Evolution
Table I summarizes agent-agent and agent-environment interactions.
³ In physics, the potential energy of a mass at altitude z is given by E_p = -m·g·z, with a vertical
axis oriented from bottom to top.
⁴ Creating a group consists in giving a common label to agents that are near one another.
The following equation sums up how the environment evolves, taking into
account all agent behaviors (as given in Table I) and the environment's own
dynamics:

\[
z_i(t) = z_i(t-1) \cdot \mu^{nb} \cdot (1 - \epsilon) + \sum_{PU} z_{PU_{S_i}}. \tag{14}
\]
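Equation (14) combines the three terms into a single per-state update rule; a sketch with illustrative values for μ and ε:

```python
# Illustrative sketch of the global environment update of Eq. (14);
# mu and eps are assumed values.

def environment_step(z_prev, nb_agents, pu_values, mu=0.9, eps=0.1):
    """One state's altitude: agent amplification and natural attenuation
    applied to the old value, then accumulation of all PU readings."""
    return z_prev * mu ** nb_agents * (1.0 - eps) + sum(pu_values)

# A plot occupied by two agents and refreshed by one perceptive unit:
print(environment_step(5.0, 2, [1.5]))
# The same plot with no agents and no fresh observation simply decays:
print(environment_step(5.0, 0, []))
```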
2.6 Dynamics and Interpretation
Since the forces and their corresponding mechanisms have been defined, we
now have to compute the state parameters of the agents (position, speed, and
acceleration) and to study the system's dynamical characteristics.

2.6.1 Resolving Dynamical Equations. The position, speed, and acceleration
of each agent are computed in a continuous world. The agents' dynamical
characteristics are computed following the laws of classical Newtonian
physics. Each behavior applied to an agent corresponds to a force influencing its
movement and is selected according to the agent's local state. For instance, repulsion
is not taken into account when inhibition is too high. Moreover, an agent
is dynamically restricted (speed and acceleration are bounded). This avoids oscillation
phenomena during the convergence of agents on a plot. Furthermore,
such restrictions are required because the world itself is bounded.
By applying the fundamental law of dynamics, we can compute the acceleration
of each agent (Equation (15)). Here, \vec{\gamma} represents acceleration, m the
agent's mass, and \vec{F}_b the force resulting from behavior b.

\[
\vec{\gamma} = \frac{1}{m} \sum_{behaviors} \vec{F}_b. \tag{15}
\]
By substituting in the definition of the friction force (see Equation (4)),
collecting terms in the velocity vector \vec{V}, and integrating twice, we obtain the
following equation:

\[
\vec{X}_t = \vec{X}_{t-1} + \left( \vec{V}_{t-1} \, \delta t + \frac{(\delta t)^2}{2m} \left( \vec{F}_{attraction} + \vec{F}_{repulsion} \right) \right) \frac{1}{1 + \frac{\delta t}{m} \lambda}, \tag{16}
\]

with

\[
\vec{X}_t = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}.
\]
This physically inspired model of behavior is particularly interesting because
it allows the various influences on an agent's position to be composed
straightforwardly. Moreover, it is easy to add or remove a behavioral item by modifying
its corresponding force.
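A minimal one-dimensional sketch of the resulting update rule (Equation (16)); the time step, mass, friction coefficient, and speed bound are illustrative, and the finite-difference speed recovery is a simplification of our own, not taken from the paper:

```python
# Illustrative 1-D sketch of the position update of Eq. (16); dt, m,
# lam (the friction coefficient lambda) and v_max are assumed values.

def step(x, v, force, m=1.0, lam=0.5, dt=0.1, v_max=2.0):
    """Advance one agent by one time step under a net behavioral force."""
    damping = 1.0 / (1.0 + dt * lam / m)            # fluid-friction factor
    x_new = x + (v * dt + dt ** 2 / (2.0 * m) * force) * damping
    v_new = (x_new - x) / dt                        # simplified speed recovery
    v_new = max(-v_max, min(v_max, v_new))          # bounded, as in the paper
    return x_new, v_new

# With no applied force, friction alone makes a moving agent slow down,
# illustrating the dissipation introduced to stabilize the system.
x, v, speeds = 0.0, 1.0, []
for _ in range(3):
    x, v = step(x, v, 0.0)
    speeds.append(v)
print(speeds)
```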
Our dynamical model uses agent inertia in order to account for the history
of each agent's motion and to foresee future positions. Even if agents,
therefore, seem to be reduced to particles in a dynamical computation, they are
still agents according to the definition given in Ferber [1999]:

- They have limited actions and perceptions, since attraction and repulsion
obey the 1/r² law.
- They possess autonomy, as reflected by the fact that repulsion is inhibited by
increases in an agent's internal energy, which can be regarded as an internal
state that is under the agent's control.
- They interact with each other and with the environment through the twin
forces of attraction and attenuation amplification.
2.6.2 How Does It Work?. When the system is initialized, the agents are
randomly spread throughout their environment. If no targets are perceived (i.e.,
no plots), only the repulsion force influences the agents. As a result, the agents
are homogeneously spread in the environment in such a way that they are as
far as possible from each other. This equilibrium corresponds to the fundamental
stable state of the system. Considered in physical terms, it corresponds to
maximizing each agent's energy, a state that can be regarded as the agent's
local goal. Indeed, since there are no percepts, the potential energy of each agent
is equal to zero. The only way for agents to maximize their internal energy is to
reduce the contribution of agent-agent repulsion. This is achieved when the
distance between agents is as high as possible.
When information arrives, it disturbs this stable state. The agents near the
corresponding plot tend to be attracted by it. Thus, the nearest agents will con-
gregate at the point of perturbation. Meanwhile, other agents, further from the
perceived target, will move to maintain maximum distance from their neigh-
bors. Ignoring the effect of attenuation amplification for the moment, the system
tends to another stable state that corresponds to a maximum local energy for
each agent.
Once the agents are on the plot, they will amplify its attenuation, independently
of the environment's own evolutionary rules. The amplitude of the plot
starts decreasing, causing a decrease in the energy level of each agent on the
plot. Consequently, the agents start to separate.
Two cases are now possible, depending upon whether the target is still per-
ceptible or not.
- In the first case, if the target did not move (appearance of a new plot at the
same place), the stable state is obtained since the group is already in place.
If the target has moved to a nearby position, the system will evolve to a
new stable state. If the states of the environment are refreshed quickly enough,
there will be only a small difference between target positions. Thus, the agents
gathered around the new position will be almost the same as those around
the prior location. In this case, we can consider the group to have moved
from one position at time t to the new position at time t + 1. The trajectory
of one target is, consequently, marked over time by a sequence of group
positions.
- When the target is no longer detected, the system returns to its fundamental
stable state after the disappearance of the agents' group.
2.6.3 What is the Link Between the Dynamics of Agents and Targets?. The
goal of tracking is to build a trajectory for each target corresponding as pre-
cisely as possible to the real trajectory. In our case, we only have a discrete
set of localizations that have to be linked based on the profile of the speed
vector.
In our method, this task is made more difficult since the dynamical target
model is unknown. The target's apparent speed profile must be determined
ACM Transactions on Autonomous and Adaptive Systems, Vol. 1, No. 2, December 2006.
using the dynamics of the group of agents. However, we must avoid a profile in
which speed is reduced to zero whenever a group collects upon some plot. This
case will be encountered if the group has been defined as a rigid entity moving
from one plot to another. In fact, a group is a dynamical emerging structure
where the agents are continuously moving (Figure 4). The structure can be
compared to a set of particles the movement of which is the superposition of a
Brownian influence and a translation.
Attenuation amplification of a plot starts when the first agent of the group
arrives. If the rate of attenuation amplification is high enough, the plot will
have almost disappeared by the time the last agents arrive on it. Thanks to
inertia, those agents will continue to move in the same direction. From a global
point of view, a group never stops on a plot. Instead, we use its speed profile to
estimate the targetâs actual speed.
Figure 4 shows the evolution of the groupâs speed profile.
- Figure A. Each agent of the group (which has been constructed previously
around a percept) detects the presence of a new plot. This is characterized by
the movement of the agents to this new plot, involving changes in the global
speed of the group.
- Figure B. Some agents are too far from the new plot to be under its influence.
They separate from the group due to the repulsion behavior. The other agents
in the group collect at the new plot. The group's speed smoothly decreases.
In addition, new agents near the new plot are recruited.
- Figure C. A new plot appears. Since its amplitude is higher than the old one,
the nearest agents of the group will be attracted. This involves an increase
in the global speed of the group. The agents that stay on the old plot continue
to attenuate it. New agents are recruited due to the arrival of the new plot.
The speed profile of the agents' group is similar to the one shown in Figure 5.
The difference between the maximum and the minimum speeds (denoted by
ΔV) depends on the rate at which percepts are refreshed, along with the rates
of environmental attenuation and agent-based attenuation amplification. The
real speed of the target lies between these two extremes. Thus, model
parameters must be adapted in order to reduce ΔV as much as possible,
consequently providing the most precise possible estimate of the target's speed.
Locally adapting the model can then allow us to determine the target's
dynamical properties and to identify it.
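The trajectory-building idea above can be sketched as follows: track the group's centroid over time, compute its speed profile, and take the midpoint of the extreme speeds as the target-speed estimate, their difference being the quantity to minimize by tuning the model. The function names are illustrative, not part of the article's implementation:

```python
import math

def speed_profile(centroids, dt):
    """Instantaneous speeds of a group from its successive centroid positions."""
    return [math.hypot(x1 - x0, y1 - y0) / dt
            for (x0, y0), (x1, y1) in zip(centroids, centroids[1:])]

def estimate_target_speed(centroids, dt):
    """Estimate the target speed as the midpoint of the extreme group speeds.

    Returns (estimate, spread); the spread (max - min) is what the model
    parameters should be adapted to reduce."""
    speeds = speed_profile(centroids, dt)
    v_min, v_max = min(speeds), max(speeds)
    return (v_min + v_max) / 2.0, v_max - v_min
```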
2.7 Related Work
The literature contains several examples of related work comparable to our
proposal in terms of either the models used or the applications considered.
Here, we consider some of the most influential of such work.
2.7.1 Swarm Algorithms. Some multi-agent systems found in the liter-
ature utilize a swarm approach. Examples include work based on physically-
inspired force field approaches as well as models taken from biology. Among the
physically-inspired work, we can mention Simonin and Ferber [2000], Zeghal
and Ferber [1994], and Reynolds [1987].
Fig. 4. Dynamics of a group of agents tracking a target.
Fig. 5. Speed profile of a group of agents (the A, B, and C areas are the same as those studied
in Figure 4).
For instance, the Co-Fields approach [Mamei and Zambonelli 2004] is based
on similar physical principles:
- information is represented in the environment in the form of computation
fields;
- agents can perceive these fields and move according to them and to some
coordination policy; movement is equivalent to following the gradient of a
given field, specifically related to a particular task;
- the movements of agents are treated as feedback to the environment and
modify the fields.
The Co-Fields approach differs somewhat from the one we present here in that
the goal in Co-Fields work is explicit coordination of agent motion: agents move
and arrange themselves spatially in accordance with some particular policy
such as "surround the prey". This implies that agents have at least minimal
knowledge about their goals. In our case, the model is itself constructed so that
it produces a pattern which can then be interpreted at some higher level (using
group detection) as implicitly solving a problem.
One other interesting swarm approach is the one proposed in Brueckner
[2000]. This approach is based on digital pheromones [Colorni et al. 1991;
Parunak 1997]. The goal of this work is to control and monitor a manufacturing
process based on, in particular, workpiece agents that drop pheromones in their
environment. The dynamics of such an environment are then linked to evapo-
ration and propagation of these pheromones. The local strength of pheromones
can then be studied in order to monitor the number of workpiece agents on each
processing unit. Moreover, this strength can also be used as an input to the rout-
ing decision process. The main point in common with our proposal is the way
in which the environment dynamics are defined (accumulation, evaporation,
and propagation). However, in the pheromone approach, these dynamics are
induced by the agents alone, whereas, in our model, they arise from properties
of the world itself (accumulation) and its own dynamics (attenuation) as well
as its interactions with agents (attenuation amplification).
Even if swarm approaches use models similar to ours, they do not deal with
the same sorts of localization and tracking problems. In contrast, both particle
filtering and Kalman filter methods share many characteristics with Reactive
Multi-Agent Systems.
2.7.2 Particle Filtering. The main idea behind particle filtering is to build
a probabilistic estimator of a state vector based on a model of the underlying
process, an observation model, and a model of statistical noise applied to the
estimation and measure of states. Such an approach relies on exploration of the
state space by a set of randomly evolving particles which are distributed based
on the probabilistic characterization of the observed process. Particle filtering
stems from both Markovian algorithms and Kalman filtering methods. It has
the same information requirements as Kalman filters (as discussed in Section
2.7.3) but can explore several hypotheses in parallel and consequently provides
better reliability relative to noise in the observation model [Hue et al. 2000]. As
for Markovian methods, particle filtering employs environment models similar
to Markov Decision Processes (MDP) or Partially Observable Markov Decision
Processes (POMDP), but can overcome the problem of combinatorial explosion
by sampling the state space rather than computing every possible belief state
independently. One of
the best known such methods for localization is Monte Carlo Localization as
developed by Fox et al. [1999] and Dellaert et al. [1999].
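The predict/weight/resample loop at the heart of particle filtering can be sketched as follows. The Gaussian motion and observation models are generic placeholders, not the models of the cited Monte Carlo Localization work:

```python
import math
import random

def particle_filter_step(particles, observation, motion_noise=1.0, obs_sigma=2.0):
    """One bootstrap-filter iteration: predict, weight by likelihood, resample.

    Particles are 2-D positions; the motion model is a simple random walk and
    the observation likelihood is an isotropic Gaussian (both placeholders)."""
    # Predict: propagate each particle through the motion model plus noise.
    predicted = [(x + random.gauss(0.0, motion_noise),
                  y + random.gauss(0.0, motion_noise)) for (x, y) in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    ox, oy = observation
    weights = [math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2.0 * obs_sigma ** 2))
               for (x, y) in predicted]
    total = sum(weights)
    if total == 0.0:
        return predicted  # degenerate case: no particle explains the observation
    weights = [w / total for w in weights]
    # Resample: draw particles proportionally to their weights.
    return random.choices(predicted, weights=weights, k=len(particles))
```

Repeating this step concentrates the particle set around the states that best explain the observations.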
As we have mentioned, our method has several things in common with par-
ticle filtering approaches, such as:
- the model of agent dynamics is similar to the one used classically to simulate
particles;
- from an external point of view, the movements of agents and particles can
appear similar;
- both methods have a real-time and anytime character;
- the application fields are similar (localization and tracking);
- they are both filtering methods;
- they have some common advantages, such as a natural response to global
localization problems [Fox et al. 2000] and multitarget localization.
In contrast to these common characteristics, there are also some fundamental
conceptual differences between the two approaches.
- In multi-agent systems, the environment is an active participant in the
solution process, while in particle filtering it only represents the background
state space.
- In our approach, there is no specific external model of agent dynamics (i.e.,
the dynamical model of agents is separate from that of targets). In contrast,
all movements in a particle filtering context depend upon a central model
of the dynamics or transitions, such as in a POMDP. Moreover, movements
of agents in our framework are computed deterministically based on states of
the environment rather than according to some probabilistic description of
the system.
- Our agents are more than simple particles. In particular, they have a limited
form of perception (simulated by the formulation of the attraction and repul-
sion behaviors), they make decisions (repulsion inhibition), and they interact
with each other and with the environment (attenuation amplification).
- In particle filtering, each particle is used to give part of the solution based on
its weight. In the agent-based approach, on the other hand, no individual
agent can be considered part of the solution; instead, each agent has its own
individual behavioral model, and together these lead to a global solution
arising from the emerging organization.
- Reasoning with a multi-agent system is reasoning in terms of interaction
instead of reasoning based upon individual particle models.
- Unlike particle filtering, it is not yet possible to provide any optimality criteria
for agent-based methods, since these are based on emergent behavior which
is difficult to characterize mathematically.
Consequently, despite their similarities, the two methods differ somewhat
on a conceptual level and so the method proposed in this article cannot be
considered simply a novel particle filtering approach. However, it would be
interesting to combine the two approaches in order to take advantage of both
(allowing particles to interact in filtering models or introducing probabilistic
behavior into our agents, for example).
2.7.3 Kalman Filter Algorithms. The Kalman filter is widely used in im-
age processing and, most particularly, in robotics. Its main drawback is the
knowledge required in order to succeed at tracking since the algorithm must
compute based on models of perception and dynamics. In addition, knowledge
of the initial state of the system is crucial to the method and must generally
be established empirically. The basic idea behind Kalman filtering is to predict
the position of a moving object by using an initial position, a model of the
object's dynamics, and an observation model. The initial state is computed using
classical features-detection algorithms similar to the one used in our percep-
tive units. The moving object is defined by a state vector, including the position
coordinate and the dynamic parameters (speed for instance), and a model of
probable perturbations in its movement. These perturbations are usually de-
fined as Gaussian white noise. Tracking is performed recursively by estimating
the state at each time step.
The Kalman Filtering method consists of five stages:
- assignment of a kinematic model to each feature;
- prediction of feature positions at the next step, using both the dynamical
model and a measure of uncertainty;
- determination of a search area based on the amount of uncertainty in the
prediction;
- determination of the best possible location by means of some normalized
distance measure, such as the Mahalanobis criterion;
- update of the current state in accord with the perception model.
The main steps and required parameters of Kalman filtering methods are
detailed in Welch and Bishop [2000].
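For a constant-velocity model like the one used later in Section 3.3, the five stages reduce to the standard predict/update recursion. The sketch below is one-dimensional (position and speed only), and the noise variances q and r are placeholders, not the tuned values reported in the article:

```python
def kalman_step(x, v, P, z, dt=1.0, q=0.1, r=1.0):
    """One predict/update iteration of a 1-D constant-velocity Kalman filter.

    State is (position x, speed v) with 2x2 covariance P; we observe position
    only. q and r are placeholder process/observation noise variances."""
    # Predict: linear motion model (stages 1-2 above).
    x_pred = x + v * dt
    v_pred = v
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    # Update: the Kalman gain weighs the prediction against the measurement z
    # (stages 3-5 above, with the search area given by the innovation variance).
    s = p00 + r                   # innovation variance
    k0, k1 = p00 / s, p10 / s     # Kalman gain
    innov = z - x_pred
    x_new, v_new = x_pred + k0 * innov, v_pred + k1 * innov
    P_new = [[(1.0 - k0) * p00, (1.0 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, v_new, P_new
```

Fed with measurements from a linearly moving target, the estimate converges to the target's position and speed, which is exactly the matched-model case in which the filter is optimal.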
The biggest problem with Kalman filtering is its need for explicit informa-
tion about the target that is being tracked. Indeed, since the goal of target
tracking is to determine the trajectory of an unknown moving object, it would
seem to be difficult to obtain a model specifically adapted to its dynamics and
initial position. In general, this model is chosen from among a set of models
specific to the sorts of objects being tracked, and its deviation from the real
model is estimated through the error covariance matrix. Finally, when the cho-
sen dynamical model in fact fits that of the target, Kalman filtering performs
optimally; in other cases, no such performance is guaranteed. In contrast, our
method does not require such a model in order to achieve relevant results, as
described in the next section.
3. EXPERIMENTAL RESULTS
This section presents experiments conducted to assess the relevance of the
proposed model according to different points of view:
- feasibility of such an approach for the localization and tracking issues;
- efficiency of the proposal with respect to the problem being solved, especially
with real robots and in comparison with the Kalman filter approach, which is
one of the best bases of comparison due to its optimality;
- the problem-solving process and its main properties, such as its ease of use,
its robustness in the face of perturbations (such as a varying number of
targets or PU), and its ability to perform data fusion.
The section is structured as follows. The first part presents the experimental
setup used. The second deals with experiments with simulated data sources,
including analysis of the influence of the various model parameters in specific
situations. Finally, a presentation of experiments with real targets is made,
including a comparison with a Kalman filter on a specific scenario.
3.1 Experimental Setup
The model described in the previous section has been implemented with the
MadKit5
development kit proposed by J. Ferber and O. Gutknecht. In particular,
we use its synchronous engine to manage the different dynamics of our system.
The software architecture developed is composed of three modules.
- The Environment Construction Module is made up of virtual perceptive units
that send simulated trajectories to the system and/or perceptive units tied
to real sensors.
5 http://www.madkit.org.
Fig. 6. Experimental setup: platform (left) and robots (right).
- The Solving Process Module computes and controls agent behavior in the
environment (movement, interaction forces, plot dynamics, etc.).
- The Group Detection Module detects groups of agents and extracts their
positions.
Communication between these modules is handled using a UDP socket on spe-
cific ports.
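A minimal sketch of such inter-module communication follows; the message format (JSON) and the port handling are assumptions, as the article does not specify the wire format:

```python
import json
import socket

def send_plot(position, amplitude, host="127.0.0.1", port=9473):
    """Send one perceptive-unit plot to the solving module over UDP.

    The JSON payload and the default port are hypothetical; the article only
    states that modules exchange data over UDP sockets on specific ports."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        payload = json.dumps({"x": position[0], "y": position[1], "amp": amplitude})
        sock.sendto(payload.encode("utf-8"), (host, port))
    finally:
        sock.close()
```

UDP fits this design: a lost plot message is simply an absent percept that the environment's own dynamics tolerate, so no retransmission logic is needed.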
As for the real target, small 2-wheel drive Mirosot6
soccer robots have been
used (Figure 6). These robots move over a playground of size 2.20m by 1.80m.
They are controlled by a standard PC computer that sends data to each robot
through a RF device. Perception is handled using a CCD camera placed over
the playground (Figure 6) and associated with a variety of processing algo-
rithms in order to construct several perceptive units. To compute the dynamics
and behavior of the multi-agent system, a standard PC computer (Pentium III
800MHz, 512MB RAM) has been used.
3.2 Simulated Targets
The preliminary experiments have been mostly made in simulation. Their goal
is to check the suitability of the proposed model for localization and track-
ing. These tests allow us to identify various characteristics and advantages of
the approach, thus validating our conceptual choices. Finally, they point out the
influence of specific parameters on the quality of the emerging result. Among the
experiments done, only the most significant have been chosen for presentation.
In the next sections, the following tests are detailed:
- experiments with specific trajectories (measuring the efficiency of our
proposal),
- experiments with crossing trajectories (measuring discrimination and
separation distances),
6 http://www.yujinrobot.com.
Table II. Agent-Device Parameters for the Experiment with Simulated Targets

Name of the   attenuation   attenuation     attraction   repulsion   attraction   size of    inhibition   friction
parameter     rate          amplification                            distance     the plot
Symbol        ε             δ               β            α           d            d′         ν            λ
Value         0.7           0.07            100.0        100.0       25.0         3.0        10.0         0.07
Table III. Preliminary Error Results for Simulations with Either Static Plots or
Rectilinear Target Trajectories (The percentage is computed relative to the
overall size of the environment.)

        Static             Rectilinear
        ME       SD        ME       SD
In X    0.09%    0.19%     0.83%    2.40%
In Y    0.15%    0.52%     0.16%    2.31%
- experiments with noisy trajectories and with various speeds (measuring
robustness).
In order to perform these experiments, virtual perceptive units have been
used. These units are able to simulate the work of real ones. As far as the core
system is concerned, there is no difference between virtual and real perceptive
units since the data, received by the UDP connections, has the same structure.
3.2.1 Early Localization and Tracking Experiments. The main goal of
these tests is to check whether the emerging organization accords with ex-
pectations, namely, agents gathering on plots and following target trajectories.
Moreover, they allow us to measure the localization error (i.e., the distance
between the agents' group and the point in the trajectory that is sent at
the corresponding time by the perceptive unit). These experiments first use a
single static plot at a specific location, and then consider plots moving along
trajectories that are, for instance, linear or sinusoidal. The set of parameters
used for the agent model is presented in Table II.
The results obtained for the static plot and linear trajectories are presented in
Table III, which shows the mean error and standard deviation in localization for
each case. Those results have been computed in a 200 by 200-state environment
and with 150 agents. About 200 tests have been conducted.
These first results show that localization is precise given the developed mech-
anisms. But it still has residual error linked to the problem-solving dynamics
used. Indeed, variations in the rate of attenuation and agent attraction result in
oscillating group positions (and, therefore, increased localization error) which
is amplified by possible changes in population as agents enter the group or
escape from it.
Now we can focus on the sinusoidal trajectory. The choice of a sinusoidal
trajectory is not arbitrary. This kind of curve allows us to study the behavior
of a system featuring huge changes in target speed and acceleration. Such a
trajectory can be divided into two main types of regions: rectilinear sections and
Table IV. Preliminary Results Obtained in Simulation with a Sinusoidal Trajectory
(The percentage is computed relative to the size of the entire world.)

                                 Mean Error   Standard Deviation
Linear sections                  1.48%        1.50%
Orientation-changing sections    4.24%        4.70%
Global                           2.86%        3.11%
orientation-changing sections where the target changes course and its speed
passes through some minimum. Table IV shows the results obtained.
The system gives the same kind of results as before in the rectilinear sec-
tion. In contrast, the quality of the result decreases in the orientation-changing
areas. This decrease is linked to inertia in the group of agents similar to that
explained in Section 2.6.3. This inertia is due to the cumulative mass of the
group of agents and the fluid friction force. In fact, the error rate in orientation-
changing areas is closely linked to the λ coefficient (see Equation (4)). If λ is low,
the error will decrease but there will be more oscillations around the trajectory
in rectilinear sections.
3.2.2 Discrimination and Separation Distances. The goal of this experi-
ment is to test the systemâs ability to discriminate when two (or more) targets
are near each other. For this purpose, we define two criteria: the discrimination
distance and the separation distance. The discrimination distance
is defined as the minimum distance that must exist between two targets whose
trajectories meet in one point in order that the system can continue to discern
between them (see Figure 7 (top)). The separation distance is the minimum
distance that must exist between two targets whose trajectories start from the
same point so that the system can begin to discern between them (see Figure 7
(bottom)). These experiments have again been performed using the parameters
detailed in Table II.
The experiments show that the main parameter that determines the val-
ues of these two criteria is the radius d of the attraction area of the plot (see
Figure 3). Figure 8 shows the variation of the discrimination distance as com-
pared to changes of the radius of the attraction area. This figure is generated
based on 10 experiments for each value of d. The curve presented in Figure 8 is
quasilinear. We obtain the same results with the separation distance. However,
the test for separation is more difficult for the system to solve. In such cases,
due to inertia, the agents favor a trajectory identical to the one plotted before
the point of discrimination was reached.7
Following this point, after a short
distance (of the same order of magnitude as d), the system recovers two targets to
follow. This is the result of two phenomena: the separation of the initial group
(as agents from the first group are attracted by the top trajectory) and, more
significantly, the recruitment of new agents to the second group.
3.2.3 Noisy Trajectories and Speed Limit. The goal of these experiments
is to evaluate the reliability of the system relative to the noise in the data
7 The bottom trajectory in Figure 7.
Fig. 7. Discrimination distance (measuring 19.41 states) for an attraction distance d of 25
environment states (top) and separation distance (measuring 38.12 states) for an attraction
distance d of 45 environment states (bottom).
Fig. 8. Evolution of the discrimination distance relative to the attraction distance d.
furnished by the PU on the one hand, and the difference between the speed of
the target and the speed limit of the agents on the other.
- Filtering the noise. In this experiment, both classical trajectories and noisy
information are sent to the system. The goal is to analyze the reliability
of the system in filtering false information. The critical parameter is the
attenuation rate. Indeed, if this coefficient is too high, the system is not able
to detect any target; however, if it is too low, noise may be falsely identified as
a new target. Trajectories are generated by the same virtual perceptive units
as described in the previous section. The noise is generated by other virtual
perceptive units. The false information is sent with a random frequency and
amplitude. We measure the number of detected targets (including the real
one) during a fixed time period and then compute the number of detected
targets per minute. Figure 9 shows the result obtained.
Figure 9 shows that the system is able to filter the noise, provided the
attenuation parameter is small enough (a value under 0.2 is adequate).
Moreover, these results show the importance in our model of the prop-
erties and dynamics of the environment as compared to the behavioral
model used. Indeed, even if all the behavioral parameters are optimized,
Fig. 9. Evolution of the number of detected targets relative to the attenuation coefficient.
a small error in the environmental parameters can make the entire system
inefficient.
- Speed limit. Limits on agent speeds may seem restrictive in a physically-
inspired system, but they are tied to the chosen structure of the environment
(i.e., the finite number of states). As we will show, however, this choice is
not as critical as it could be; instead, what matters is the difference between
the speed of an agent and a target. We have measured this effect in two
ways. First, we determine how mean error changes as compared to an agent's
relative speed, which is the difference between its maximum speed and the
speed of the target. The second experiment measures changes in the detection
rate (number of detected groups/number of generated plots) compared to the
relative speed. Figures 10 and 11 show the results obtained.
These results show that groups are able to track targets whose speed is
three times higher than their own (with a mean error around 1.5%). This
is due to persistence (due to repulsion inhibition) and inertia (due to the
weight of agents and the friction force) in the groups. This result can be used
in order to tune the maximum speed of agents, taking into account the size
of the environment and the supposed maximum speed of targets.
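The role of the attenuation rate in the noise-filtering experiment above can be illustrated with a toy amplitude update; the multiplicative form and the function name are assumptions, not the article's exact environment equations:

```python
def plot_amplitude(a0, epsilon, delta, agents_on_plot, steps):
    """Evolve a plot's amplitude over several time steps.

    Each step the environment attenuates the plot by rate epsilon, and every
    agent standing on the plot amplifies that attenuation by delta. The
    multiplicative form is an illustrative assumption."""
    a = a0
    for _ in range(steps):
        a *= (1.0 - epsilon)                   # environment's own attenuation
        a *= (1.0 - delta) ** agents_on_plot   # agent-driven attenuation amplification
    return a
```

A spurious plot that no further percepts reinforce decays quickly when the attenuation rate is high, while a plot backed by repeated percepts (accumulation) survives long enough for agents to gather on it, which is the trade-off observed in Figure 9.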
3.2.4 Conclusions About the Simulated Experiments. Thanks to these pre-
liminary experiments, we observed interesting properties at the application
level.
- Agent inertia allows a group to anticipate the movements of the target, just
as with other tracking algorithms such as Kalman filters.
- The behaviors built into our model allow information in the environment to
persist (as shown in the speed limit and separation distance experiments).
Fig. 10. Mean error and standard deviation (vertical lines) compared to the relative speed.
- The dynamics of the environment allows the system to filter noisy
information.
- The appearance and disappearance of targets is handled automatically (as
shown in the separation and discrimination experiments).
Moreover, from the point of view of our model, these experiments demon-
strate the essential characteristics of this type of solution method.
- Flexibility. The system is able to deal with a changing number of targets.
- Reliability. The system can anticipate the movements of the targets even if
their speed is high compared to that of the agents.
- Adaptability. Since the system specification is flexible, we can add or subtract
agents at run time, thus allowing us to adapt solution resources based upon
the complexity of a particular problem.
Finally, these experiments allowed us to isolate critical parameters, directly
linked to the quality of the result produced.
As for the antagonistic attraction/repulsion behaviors, we determined that
the attraction behavior has the greatest influence on the quality of the result.
The repulsion behavior is influential only in the plotless areas. In this case,
what is important is the proportion of each behavior in order to obtain a stable
equilibrium. If attraction dominates, the agents will concentrate on the interesting
Fig. 11. Detection rate compared to the relative speed.
areas, deserting the plotless ones. Moreover, the specific attraction distance is
also critical to the systemâs ability to discriminate between two targets in close
proximity.
Inertia is important to target tracking. It allows us to take the dynamical
parameters of agent groups into account indirectly, preventing these groups
from changing unnecessarily as time goes on. Inertia can be tuned based on
two parameters, the weight of the agents and the friction of the environment.
Experience has shown that it is easier to control inertia through the friction
coefficient instead of by means of agent weights.
The regulation parameters are also important in order to obtain a good esti-
mate of target location. They directly influence the size of the group of agents
on plots and thus their coherence. The attenuation parameter makes it easier
for the system to filter out false targets. This parameter can be considered to be
isotropic since its influence does not depend on the particular location in the en-
vironment. Attenuation amplification can also influence the number of agents
on plots. The goal of this tendency is to ensure that the number of agents on any
plot is proportionate to its importance (altitude). As opposed to the attenuation
parameter, this one is anisotropic since its influence depends on the number of
agents situated on the plot.
Finally, the last critical parameter is the number of agents which must be in
proximity for a group to be identified. If this number is low (near 2), numerous
phantom groups will appear. In contrast, if it is too high, some real targets will
not be detected. Experimentally, 3 seems to be the best value in terms of the
detection error rate.
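The group-detection criterion can be sketched as a naive single-linkage proximity clustering. The clustering radius is an assumption, while the minimum group size of 3 follows the experimentally best value reported above:

```python
import math

def detect_groups(agents, radius=10.0, min_size=3):
    """Cluster agents by proximity (single linkage) and keep clusters of
    at least min_size agents; return the centroid of each detected group.

    min_size = 3 follows the best value reported in the article; the radius
    value is an assumption."""
    unvisited = set(range(len(agents)))
    groups = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if math.hypot(agents[i][0] - agents[j][0],
                                  agents[i][1] - agents[j][1]) <= radius]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        if len(cluster) >= min_size:
            xs = [agents[k][0] for k in cluster]
            ys = [agents[k][1] for k in cluster]
            groups.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return groups
```

The centroids returned by such a detector are the group positions whose sequence over time marks the target trajectory.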
3.3 Experiments with Real Targets
Our real-world experiments have proceeded in two directions. On the one hand,
the agent-based system has been compared to a reference algorithm for
localization and tracking: the Kalman filter. On the other hand, the relevance
of the data fusion process has been evaluated. In order to perform these
experiments, small robots have been used, associated with several kinds of
perceptive units.
3.3.1 Comparison with a Kalman Filter. The goal of these experiments
is to evaluate the performance of our agent-based system as compared to the
standard Kalman filter algorithm. Our aim is not to prove that the agent-based
system is better than the Kalman filter (this is not possible since the Kalman
algorithm generates an optimal filter), but to compare them in terms of their
flexibility and the knowledge required about the targets.
- Experimental protocol. In these experiments, mean error and standard
deviation have been compared for the two methods. Since the real trajectory
of the target cannot be known in advance, the performance of the methods
is compared over a reference trajectory composed of linear sections linked
by round corners. This trajectory has been chosen in order to distinguish
the relative performance of each algorithm. In order to measure mean er-
ror and standard deviation, the position given by the perceptive unit and
the estimates given by the two algorithms have been compared at each time
step. Then the same experiment has been made with a noisy perceptive unit
(Gaussian noise with a standard deviation of 2.0).
The filter used is a classical Kalman filter as discussed previously. The
state vector is composed of the target's position and speed at time t. As for
the dynamical model, a linear trajectory has been chosen. Thus the predicted
position corresponds to a linear movement of the target from its previous po-
sition based on the instantaneous speed vector at the previous time step.
This simple model is well adapted to the actual trajectory chosen since the
latter is composed of linear portions linked by small curves. Based on the
linear model it employs, the Kalman filter is optimal over linear parts of the
trajectory. In contrast, over the small curved areas, the model is no longer
optimal. Thus, the observed position will differ from the one that is predicted.
The tuning of the Kalman filter concerns only the prediction and the obser-
vation error matrix. The other parameters of the filter have been determined
analytically based on the linear dynamical model.
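The constant-velocity predict/update cycle described above can be sketched as follows. The matrices mirror those listed in Table V; this is a generic linear Kalman filter sketch, not the authors' implementation, and the observation used at the end is an arbitrary example.

```python
import numpy as np

dt = 1.0
# Constant-velocity model over the state [x, y, vx, vy] (cf. Table V).
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.eye(4)                                 # position and speed are both observed
Q = np.diag([0.021, 0.021, 0.996, 0.996])     # a priori (process) error covariance
R = np.diag([0.993, 0.993, 7.501, 7.501])     # observation error covariance
P = np.diag([8.875, 8.875, 0.937, 0.937])     # initial estimation error covariance

def kalman_step(x, P, z):
    # Predict: linear motion from the previous position and speed.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: blend prediction and observation via the Kalman gain.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new

x = np.zeros(4)
x, P = kalman_step(x, P, np.array([1.0, 0.0, 1.0, 0.0]))
```

Over the linear portions of the reference trajectory this prediction matches the motion exactly; on the rounded corners it lags, which is the effect discussed above.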
The model used for the Kalman filter is given in Table V. The parameters
used for the agent-based system are given in Table VI. Figure 12 compares
the Kalman filter and the agent-based system for a single run. Additionally, in
order to obtain a more significant comparison between the two algorithms, we
have measured mean error and standard deviation. The results obtained are
presented in Table VII.
ACM Transactions on Autonomous and Adaptive Systems, Vol. 1, No. 2, December 2006.
Table V. Parameters of the Kalman Filter Used (Numerical values have been
computed in order to optimize the output of the filter.)

Name                                         Value
State vector (time t)                        [x(t), y(t), vx(t), vy(t)]^T
Projection matrix                            [1 0 dt 0; 0 1 0 dt; 0 0 1 0; 0 0 0 1]
Observation vector (time t)                  [x_obs(t), y_obs(t), vx_obs(t), vy_obs(t)]^T
A priori error covariance matrix             diag(0.021, 0.021, 0.996, 0.996)
Initial estimation error covariance matrix   diag(8.875, 8.875, 0.937, 0.937)
Observation error covariance matrix          diag(0.0993, 0.993, 7.501, 7.501)
Observation matrix                           4x4 identity
Table VI. Agent-Device Parameters for the Experiment with Real Targets (These values have
been computed using an optimization algorithm.)

Name                              Symbol   Value
attenuation parameter             ε        0.9
attenuation rate                  δ        0.655
amplification of the attraction   β        99.912
attraction                        α        100.0
repulsion distance                d        30.0
size of the plot                  d′       2.0
inhibition                        ν        10.0
friction                          λ        0.08
Results and Discussion. As shown in Figure 12, the multi-agent device seems
to follow the trajectory with more precision than the Kalman filter, especially
in curved regions, where the Kalman dynamical model does not suit the real
trajectory. This figure does not, however, show the time lag between the
moment the target reaches a position and the moment the agent-based
system furnishes a corresponding estimate of that location. Thus, even
if the trajectory of the group of agents is more precise than the Kalman filter's
trajectory on curves, the time lag is higher. Table VII presents results that
take this time lag into account. They show that the Kalman filter is more
precise than our system for a noiseless information source (i.e., this is a
ACM Transactions on Autonomous and Adaptive Systems, Vol. 1, No. 2, December 2006.
29. A Reactive Agent-Based Problem-Solving Model for Localization and Tracking ⢠217
Fig. 12. Comparison between Kalman (top) and Agent (bottom) methods.
normal result since the Kalman filter is optimal for more than 80% of the
trajectory). However, it is interesting to note that the difference between the
two systems decreases when the information source becomes noisy.
Another interesting point is that the results for the agent-based system
are better in real scenarios than in simulation. This is due to the choice of
the parameters used for the real experiment: these have been computed
using the gradient-descent optimization algorithm developed in Jeanpierre
[2002]. In contrast, the parameters used for the simulation were tuned
experimentally.

Table VII. Comparison between the Kalman Filter and the
Multi-Agent Method (ME = Mean Error, SD = Standard
Deviation). The percentages have been computed relative
to the maximum size of the playground.

                    Kalman Filter       Multi-agent device
Source              ME       SD         ME       SD
Ideal reference     0.32%    0.38%      0.54%    0.97%
Noisy reference     0.80%    0.75%      0.72%    0.89%
The main difference between the two algorithms is the amount of infor-
mation required concerning the target and its dynamics. The Kalman filter
requires a large amount of information such as the initial position of the tar-
get, the transition matrix between two consecutive states (dynamical model),
the prediction covariance matrix, and the observation covariance matrix.
The agent-based device does not require as much information. First, the
initial position of the target is unknown; new targets are detected
automatically. This resolves a crucial issue generally encountered in target
tracking, except where stochastic methods are used.
Furthermore, our method has no need for knowledge concerning the ob-
servation error. This can be considered a drawback to our system since a
false observation leads to a false estimation. However, this drawback can be
partially overcome thanks to the systemâs data-fusion abilities (described in
the next section). This cannot be achieved directly with a standard Kalman
filter.
Even if the use of physically-inspired interactions can be considered a kind
of model, the agent-based system, unlike the Kalman filter, does not require
specific knowledge about the dynamical model of the targets. Furthermore, the
dynamical model used in Kalman filtering cannot be easily modified at runtime
except through manipulation of the covariance error matrix. In the agent-based
system, however, the target model can be directly modified during tracking in
order to fit the particular target followed.
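As a rough sketch of this kind of physics-inspired interaction, one agent update step might combine attraction toward a perception plot, short-range repulsion from the other agents, and friction. The force laws, parameter names, and numerical values below are illustrative assumptions, not the authors' exact equations.

```python
import numpy as np

def agent_step(pos, vel, plot, others, attraction=100.0, repulsion=30.0,
               rep_range=2.0, friction=0.08, dt=0.1):
    """One illustrative update of a reactive agent."""
    force = np.zeros(2)
    to_plot = plot - pos
    d = np.linalg.norm(to_plot)
    if d > 1e-9:
        force += attraction * to_plot / d        # pull toward the plot
    for other in others:
        away = pos - other
        d = np.linalg.norm(away)
        if 1e-9 < d < rep_range:                 # push away from close agents
            force += repulsion * away / d**2
    vel = (1.0 - friction) * vel + force * dt    # friction damps the speed
    return pos + vel * dt, vel

# One agent, starting at rest, attracted by a plot at (10, 0).
pos, vel = agent_step(np.zeros(2), np.zeros(2), np.array([10.0, 0.0]), [])
```

Changing the target model at runtime then amounts to changing these few scalar parameters, rather than replacing a transition matrix inside a filter.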
Therefore, the system we have constructed can be viewed as a filtering
method, the performance of which is structurally similar to that of a Kalman
filter. The main difference is the requirement of knowledge about the targets.
The agent-based device does not require a kinetic model of the target or an
observation model linked to the sensors used. Thus, it is able to adapt to the
dynamical nature of the targets even if that nature is unknown. Moreover, it
is able to track multiple targets at the same time even if their models differ.
Finally, it is possible to use multiple information sources as input to the system.
This is considered in the next section.
3.3.2 Merging Data. The goal of these experiments is to show the impor-
tance of the ability to perform data fusion in our system. In particular, we show
improvements in results for real-world situations when multiple information
sources, which may be noisy, are used.
Table VIII. Summary of the Data Fusion Results

                      Reference     3 noisy     3 Kalman Filters       3 Kalman Filters
Input                 trajectory    sources     (uniformly weighted)   (best overweighted)
Mean Error            0.54%         0.31%       0.46%                  0.35%
Standard Deviation    0.97%         0.29%       0.58%                  0.36%
In this context, two sets of experiments were performed. The first employs
three noisy sources of information about target trajectories. Noise is Gaussian,
analogous to that used in the previous section, with a standard deviation of
2.0%. As it turns out, a collection of Kalman filters, each using its own sensors, can
also be considered a perceptive unit, in accordance with the definition given
in Section 3.1. This allows our system to employ several highly sophisticated
signal-processing methods in order to obtain better results. Thus, for the sec-
ond series of experiments, the input to the agent-based system was itself drawn
from the output of three different Kalman filters. The first such filter is identi-
cal to the one used in the previous experiments. The two others have the same
parameters except for their evaluation of the time elapsed between two succes-
sive data given by the associated sensors. For one of these Kalman filters, the
time delay is underestimated by 20%, and for the other, it is overestimated by
20%. The system is first tested using the same weight given to each Kalman
filter. Then we test performance when the Kalman filter that better suits the
reference trajectory is given double weight. Error and standard deviation are
measured for these experiments as before. The results are shown in Table VIII.
These experiments show the influence of data fusion on target tracking. The
method defined for the agent-based system allows it to:
- compensate for the noise of each source,
- decrease the oscillation of the group around the ideal trajectory,
- combine information sources with classical signal-processing algorithms,
- regulate the information furnished by the perceptive units based on its
quality.
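The overweighting used in the second series of experiments can be illustrated by a simple weighted average of source estimates. In the actual system, fusion emerges from the environment dynamics rather than from an explicit average; the helper `fuse` below is a hypothetical sketch.

```python
import numpy as np

def fuse(estimates, weights):
    """Weighted fusion of position estimates from several perceptive
    units (e.g., three Kalman filters). Doubling one weight corresponds
    to the 'best overweighted' configuration of Table VIII."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return (w[:, None] * np.asarray(estimates)).sum(axis=0)

# Three sources; the second (best-suited) one is given double weight.
fused = fuse([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]], [1, 2, 1])
```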
4. CONCLUSION AND FUTURE WORK
The model proposed in this article is based on a swarm approach where agents
interact based on laws inspired by physics. The model is thus based on simple
agents with neither cognitive abilities nor a representation of the collective goal.
The initial problem and its constraints are represented through the agents'
environment.
From the problem-solving point of view, we proposed a method based on the
environment and its role in development of an emergent solution. It includes
primitives whose role is to specify the problem on both topological and dynam-
ical levels and mechanisms that take part in the solution process.
One of the most interesting points is the way the emergent structure is
obtained. It is the result of an equilibrium between the problem dynamics
and its corresponding mechanisms (accumulation and attenuation) on the one
hand, and the agent dynamics (attenuation amplification and repulsion) on the
other.
The use of physically-inspired forces enables easier tuning of the behavioral
parameters. Indeed, tuning is performed on the macroscopic level, based on the
parameters of the environment and the quality of the collective result obtained.
Thus, tuning happens in response to the influence of the environment on its
components rather than in terms of how agents will react to changes in that
environment. Finally, the dynamics of information gathering allow us to deal
with the data fusion problem in a reactive way. This reactive fusion is possible
thanks to the nonsymbolic nature of the information sources that participate
in the construction of the environment.
The system exhibits all the main characteristics of a reactive multi-agent
system, including flexibility, adaptation to changes in problem constraints, and
reliability. This model has been successfully applied to localization and target-
tracking. Performance in this task demonstrates the various characteristics of
the architecture.
- The problem-solving process requires neither the construction nor the
manipulation of a complex representation of the initial problem.
- The agents have neither elaborate cognitive abilities nor an internal
representation of the collective goal to achieve.
- No knowledge of the information sources (perceptive units) is required.
- The targets are unknown (no dynamical model and no initial position, for
instance).
The experiments performed, using both simulated and real targets, have
shown that the solution model can be applied to localization and tracking tasks,
producing interesting results compared to classical methods. Furthermore,
these experiments revealed the following key characteristics of the system.
- The device is able to deal with a variable number of targets.
- Since the population is homogeneous, the solution process is the same
whatever the initial configuration of the problem and whatever its variations.
Even if the tuning of the parameters has been simplified by the use of physics,
it is not efficient enough to compete with classical optimal methods. Thus, we
are now studying the use of learning and optimization algorithms in order to
tune the parameters as precisely as possible in response to the dynamics of
each specific target.
Furthermore, we are also studying the possibility of merging particle filter-
ing methods with our agent-based model. We are particularly interested in a
mathematical characterization of the emerging organization.
ACKNOWLEDGMENTS
The authors would like to thank Martin Allen for correcting the English of
this article.
REFERENCES
ADORNI, G., CAGNONI, S., ENDERLE, S., KRAETZSCHMAR, G., MORDONINI, M., PLAGGE, M., RITTER, M.,
SABLATNOG, S., AND ZELL, A. 2001. Vision-based localization for mobile robots. J. Robot. Auton.
Syst. 36, 2–3, 103–119.
BAERVELDT, A.-J. 2001. A vision system for object verification and localization based on local
features. Robot. Auton. Syst. 34, 83–92.
BERNARDINO, A. AND SANTOS-VICTOR, J. 1998. Visual behaviours for binocular tracking. Robot.
Auton. Syst. 25, 137–146.
BERNON, C., CAMPS, V., GLEIZES, M., AND PICARD, G. 2004. Designing agents' behaviours within the
framework of the ADELFE methodology. In Proceedings of Engineering Societies in the Agents
World (ESAW'03). Lecture Notes in Artificial Intelligence, vol. 3071. Springer Verlag, 311–32.
BONABEAU, E., DORIGO, M., AND THERAULAZ, G. 1999. Swarm Intelligence: From Natural to Artificial
Systems. Oxford University Press, Oxford, UK.
BOURJOT, C., CHEVRIER, V., AND THOMAS, V. 2002. How social spiders inspired an approach to region
detection. In Proceedings of the International Conference on Autonomous Agents and Multi-Agent
Systems (AAMAS'02). 426–433.
BRUECKNER, S. 2000. Return from the ant: Synthetic eco-systems for manufacturing control. PhD
thesis, Department of Computer Science, Humboldt University, Berlin.
CHONG, C.-Y. 1998. Distributed architectures for data fusion. In Proceedings of the International
Conference on Multisource-Multisensor Information Fusion (Fusion'98).
COLORNI, A., DORIGO, M., AND MANIEZZO, V. 1991. Distributed optimization by ant colonies. In
Proceedings of the European Conference on Artificial Life (ECAL'91). Paris, 134–142.
DELLAERT, F., FOX, D., BURGARD, W., AND THRUN, S. 1999. Monte Carlo localization for mobile robots.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
DEMIRLI, K. AND TURKSEN, I. 2000. Sonar based mobile robot localization by using fuzzy
triangulation. Robot. Auton. Syst. 33, 2–3, 109–123.
DIMARZO-SERUGENDO, G., KARAGEORGOS, A., RANA, O., AND ZAMBONELLI, F. 2004. Engineering
self-organising systems: Nature-inspired approaches to software engineering. Lecture Notes in
Artificial Intelligence, vol. 2977.
DROGOUL, A. AND DUBREUIL, C. 1993. A distributed approach to n-puzzle solving. In Proceedings
of the Distributed Artificial Intelligence Workshop. Seattle, WA.
DROGOUL, A. AND FERBER, J. 1993. From Tom Thumb to the Dockers: Some experiments with
foraging robots. In Proceedings of From Animals to Animats II. 451–459.
DROGOUL, A., FERBER, J., AND JACOPIN, E. 1991. Pengi: Applying eco-problem-solving for behavior
modelling in an abstract eco-system. In Proceedings of Modelling and Simulation. Copenhagen,
Denmark, 337–342.
FERBER, J. 1999. Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence.
Addison-Wesley.
FERBER, J. AND JACOPIN, E. 1991. The framework of eco-problem solving. In Decentralized Artificial
Intelligence 2, Y. Demazeau and J.-P. Muller, Eds. North-Holland, 181–193.
FOX, D., BURGARD, W., DELLAERT, F., AND THRUN, S. 1999. Monte Carlo localization: Efficient
position estimation for mobile robots. In Proceedings of the AAAI National Conference on Artificial
Intelligence. Orlando, FL.
FOX, D., BURGARD, W., DELLAERT, F., AND THRUN, S. 2000. Particle filters for mobile robot localization.
In Sequential Monte Carlo Methods in Practice. Springer Verlag, New York, NY.
FOX, D., BURGARD, W., AND THRUN, S. 2001. Active Markov localization for mobile robots. Robot.
Auton. Syst. 25, 195–207.
GECHTER, F. AND CHARPILLET, F. 2000. Vision based localisation for a mobile robot. In Proceedings
of the IEEE International Conference on Tools with Artificial Intelligence (ICTAI). 229–236.
GECHTER, F., THOMAS, V., AND CHARPILLET, F. 2001. Robot localization by stochastic vision based
device. In Proceedings of the 5th World Multi-Conference on Systemics, Cybernetics and
Informatics. Orlando, FL.
GRABOWSKI, R., NAVARRO, L., PAREDIS, C., AND KHOSLA, P. 1999. Heterogeneous teams of modular
robots for mapping and exploration. Auton. Rob. Special Issue on Heterogeneous Multirobot
Systems.
HENKEL, R. 2000. Synchronization, coherence-detection and three-dimensional vision. Tech. rep.,
Institute of Theoretical Neurophysics, University of Bremen.
HUE, C., CADRE, J.-P. L., AND PEREZ, P. 2000. Tracking multiple objects with particle filtering.
Tech. rep., INRIA.
JEANPIERRE, L. 2002. Apprentissage et adaptation pour la modélisation stochastique de systèmes
dynamiques réels. PhD thesis, UHP Nancy 1.
KENNEDY, J. AND EBERHART, R. 2001. Swarm Intelligence. Morgan Kaufmann Publishers.
KWOK, C., FOX, D., AND MEILA, M. 2003. Adaptive real-time particle filters for robot localization.
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
MAMEI, M. AND ZAMBONELLI, F. 2004. Co-fields: A physically inspired approach to distributed
motion coordination. IEEE Pervasive Computing 3, 2.
MAMEI, M. AND ZAMBONELLI, F. 2005. Programming stigmergic coordination with the TOTA mid-
dleware. In Proceedings of the 4th International Conference on Autonomous Agents and Multia-
gent Systems. 415â422.
MATARIC, M. J. 1995. Designing and understanding adaptive group behavior. Adaptive Behavior
4, 1, 51–80.
MULLER, J.-P. 2004. Emergence of collective behavior and problem solving. In Proceedings of
Engineering Societies in the Agents World (ESAW'03). Lecture Notes in Artificial Intelligence,
vol. 3071. Springer Verlag, 1–21.
MULLER, J.-P. AND PARUNAK, H. 1998. Multi-agent systems and manufacturing. In Proceedings of
the 9th Symposium on Information Control in Manufacturing (INCOM'98). Nancy-Metz, France.
NAGEL, H.-H. AND GEHRKE, A. 1998. Bildbereichsbasierte Verfolgung von Straßenfahrzeugen
durch adaptive Schätzung und Segmentierung von Optischen-Fluss-Feldern. In DAGM
Symposium. 314–321.
PARUNAK, H. 1997. Go to the ant: Engineering principles from natural agent systems. Ann. Oper.
Res.
POURRAZ, F. AND CROWLEY, J. L. 1999. Continuity properties of the appearance manifold for mobile
robot position estimation. In Workshop on Perception for Mobile Agents (CVPRI'99).
REYNOLDS, C. 1987. Flocks, herds, and schools: A distributed behavioral model. In SIGGRAPH
Conference Proceedings. 25–34.
ROUMELIOTIS, S., SUKHATME, G., AND BEKEY, G. 1999. Circumventing dynamic modeling: Evaluation
of the error-state Kalman filter applied to mobile robot localization. In Proceedings of the IEEE
ICRA. Detroit, MI.
SIMONIN, O. AND FERBER, J. 2000. Modeling self satisfaction and altruism to handle action selection
and reactive cooperation. In Proceedings of the 6th International Conference on the Simulation
of Adaptive Behavior (From Animals to Animats 6). Paris, France, 314–323.
SIMONIN, O. AND GECHTER, F. 2006. An environment-based methodology to design reactive multi-
agent systems for problem solving. Lecture Notes in Artificial Intelligence, vol. 383. Springer,
32–49.
STEELS, L. 1989. Cooperation between distributed agents through self-organization. Workshop
on Multi-Agent Cooperation. North Holland, Cambridge, UK.
THRUN, S. 1999. Monte Carlo hidden Markov models: Learning non-parametric models of par-
tially observable stochastic processes. In Proceedings of the 16th International Conference on
Machine Learning.
WELCH, G. AND BISHOP, G. 2000. An introduction to the Kalman filter. Tech. rep. TR 95-041,
Department of Computer Science, University of North Carolina at Chapel Hill.
WEYNS, D., PARUNAK, V., MICHEL, F., HOLVOET, T., AND FERBER, J. 2005. Environments for multiagent
systems, state of the art and research challenges. In Post-proceedings of the 1st International
Workshop on Environments for Multiagent Systems. Lecture Notes in Artificial Intelligence, vol.
3374. Springer.
ZEGHAL, K. AND FERBER, J. 1994. A reactive approach for distributed air traffic control. In Pro-
ceedings of Avignon. 381â390.
Received July 2005; revised March 2006; accepted July 2006