The RoboCup Rescue simulation models an earthquake in an urban centre presented in the form of a map. The goal of this project is to develop a machine learning technique able to predict the expected time of death (ETD) of civilians and use it in the task planning of the ambulance team in order to save the maximum number of civilians.
Machine Learning techniques for the Task Planning of the Ambulance Rescue Team
Francesco Cucari - fracu758
TDDD10 AI Programming - Individual Report
Linköping University
1 Introduction
The RoboCup Rescue project started after the earthquake that hit Kobe City
in January 1995, causing an enormous number of victims and extensive damage.
The aim of this project is to propose solutions for overcoming such disastrous
scenarios with minimal loss. To achieve this goal, the RoboCup Rescue
simulation models an earthquake in an urban centre presented in the form of a
map. The simulation matches real-world limits and problems as accurately as
possible: the simulated earthquake causes buildings to collapse, roads to be
blocked, fires to ignite, and civilians to be trapped and buried inside
collapsed buildings.
In the simulator, three teams are responsible for all rescue operations:
the ambulance team, the fire brigade and the police force. The main task of the
ambulance team is to rescue civilians and carry them safely to the refuge; the
fire brigade extinguishes burning buildings; and the police force clears
blocked roads.
1.1 Motivation
This project focuses mainly on the tasks of the ambulance team. Since the
score function of the simulator depends heavily on the number of civilians
alive and on the percentage of health remaining across all civilians [1],
achieving a high score requires giving the highest priority to saving as many
civilians as possible.
This requires each agent to make the best possible use of its remaining time:
agents should be confident that no time is wasted on targets that will die
either during or after the rescue. This problem arises, for example, when
civilians are prioritized solely by their distance from each agent, closest
first.
1.2 Aim
The proposed solution is inspired by the approach of the GUC ArtSapience team
described in [6]. It is based on learning the Expected Time of Death (ETD) of
each civilian: realistic estimates should improve the performance of the
ambulance team by allowing the agents' tasks to be prioritized accordingly.
Because it is difficult to find a realistic formula for estimating the ETD, a
learning algorithm can be used to learn implicitly the factors that are hard to
calculate from the civilian parameters [5]. The goal of this project is to
achieve better performance by introducing the ETD into the prioritization task,
compared with the results of the shortest-distance approach.
1.3 Research questions
Given the domain described above, the project and this report revolve around
the following research questions:
1. Is it possible to introduce the ETD into the prioritization task?
(a) If so, does this lead to better performance than the shortest-distance
approach?
2 Methods
This work can be divided into two main steps: Learning and Planning.
2.1 Learning
At each time step of the simulation, the state of any given civilian may
change. These state values are collected and preprocessed into a dataset that
is used for the learning phase.
2.1.1 Data collection
Data are collected over many runs on multiple maps, in which the agents log the
state of the civilians at each time step instead of rescuing them. Some further
changes were made to the simulator setup: maps were run without blockades, with
the fire simulator enabled, and with all civilians static. To collect more
data, maps were also run with different scenarios in which more fires were
added to the map. Tab.1 shows an example of the collected data, which represent
the history of each civilian.
ID civilian   Timestep   HP     Damage   Buriedness
406067950     177        3000   1000     0
406067950     178        2000   1300     0
406067950     179        0      1800     0
1769673037    79         1000   100      60
1769673037    80         0      200      60
Table 1: An example of collected data before preprocessing
2.1.2 Preprocessing and Labelling
The collected data need some preprocessing. For a supervised learning
algorithm to be applied to a training dataset, each example in that set must
have an output attribute, which is the value the model should predict. In the
proposed approach this attribute is the ETD of the civilian. The civilian ID
and the time steps are used to label each example with its ETD value. Each
example in the resulting dataset contains the values of the input attributes
and the time step at which that civilian dies. If the civilian does not die
during the simulation, this value is set to the maximum time step, which is
300. Tab.2 shows an extract of the final dataset, which is used as input to the
learning algorithm.
HP     Damage   Buriedness   ETD
3000   1000     0            179
2000   1300     0            179
0      1800     0            179
1000   100      60           80
0      200      60           80
Table 2: Final dataset after preprocessing and labelling
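The labelling step that turns Tab.1 into Tab.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `label_with_etd` and the tuple layout are hypothetical, and the rule "a civilian dies at the first time step where HP reaches 0" is inferred from the two tables rather than stated explicitly in the text.

```python
MAX_TIMESTEP = 300  # ETD assigned to civilians that survive the whole run


def label_with_etd(records, max_timestep=MAX_TIMESTEP):
    """Turn (civilian_id, timestep, hp, damage, buriedness) rows into
    (hp, damage, buriedness, etd) training examples."""
    # Assumption: the time of death is the first time step with HP == 0.
    death_time = {}
    for civ_id, timestep, hp, _damage, _buriedness in records:
        if hp <= 0:
            death_time[civ_id] = min(timestep, death_time.get(civ_id, timestep))
    # Drop the identifier and the time step, and attach the ETD label.
    return [
        (hp, damage, buriedness, death_time.get(civ_id, max_timestep))
        for civ_id, _timestep, hp, damage, buriedness in records
    ]


# The rows of Tab.1.
records = [
    (406067950, 177, 3000, 1000, 0),
    (406067950, 178, 2000, 1300, 0),
    (406067950, 179, 0, 1800, 0),
    (1769673037, 79, 1000, 100, 60),
    (1769673037, 80, 0, 200, 60),
]
dataset = label_with_etd(records)
for row in dataset:
    print(row)
```

Applied to the rows of Tab.1, the sketch reproduces the rows of Tab.2; a civilian whose HP never reaches 0 receives the label 300.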
2.1.3 Classifier
Given the dataset, the goal is to learn the relation between the input
attributes (HP, Damage, Buriedness) and the output (ETD), following the
approach developed in [2]: the relation is obtained by first training a model
on the dataset and then using the learned model for future predictions. A
learning method that supports both steps is therefore needed; multiple linear
regression was chosen.
Linear regression is an approach for predicting a quantitative response (a
numeric value) from multiple features (also called "predictors" or "input
variables") [3]. It takes the following form:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \tag{1}$$
where $\beta_0$ is the intercept, each $x_i$ represents a different feature
(HP, damage and buriedness), and each feature has its own coefficient
$\beta_i$. Once these coefficients have been learned, it is possible to predict
the output value $y$, that is, the ETD of a civilian.
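To make Eq. (1) concrete, the following Python sketch fits the coefficients by ordinary least squares and then evaluates the equation for one civilian. It is an illustration only: the project itself used Weka, whose linear regression implementation may differ in its details, and the names `fit_linear_regression` and `predict_etd` as well as the synthetic data are hypothetical.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares for y = b0 + b1*x1 + ... + bn*xn.

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination and returns [b0, b1, ..., bn].
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]  # 1.0 = intercept
    n = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(design, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def predict_etd(coefficients, hp, damage, buriedness):
    """Evaluate Eq. (1) for one civilian."""
    b0, b1, b2, b3 = coefficients
    return b0 + b1 * hp + b2 * damage + b3 * buriedness


# Synthetic, noise-free data generated from y = 1 + 2*x1 + 3*x2 - x3.
# These numbers are made up for the demonstration, not simulator values.
rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (2, 1, 3), (3, 5, 1)]
targets = [1 + 2 * x1 + 3 * x2 - x3 for x1, x2, x3 in rows]
coefficients = fit_linear_regression(rows, targets)
print(coefficients)
print(predict_etd(coefficients, 2, 2, 2))
```

Because the synthetic data are noise-free, the fit recovers the generating coefficients up to floating-point rounding. With real simulator logs the fit would be approximate, which is why the validation step matters.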
For model validation, 10-fold cross-validation was used. As defined in [4],
cross-validation is a statistical method for evaluating and comparing learning
algorithms by dividing the data into two segments: one used to learn or train a
model and the other used to validate it. The division process can be repeated
k times, using different subsets of the data. The idea of cross-validation is
thus to estimate how well a model trained on the current dataset can predict
the output value for a previously unseen input instance.
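The splitting scheme behind k-fold cross-validation can be sketched as follows. This is a generic illustration, not Weka's implementation: the function name `k_fold_indices`, the fixed seed and the example size of 25 are assumptions made for the demonstration.

```python
import random


def k_fold_indices(n_examples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps runs reproducible
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, validation


splits = list(k_fold_indices(25, k=10))
for train, validation in splits:
    print(len(train), len(validation))
```

Each example is used for validation exactly once and for training in the other k - 1 rounds; the validation errors of the k rounds are then aggregated into a single estimate.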
Finally, the Weka¹ tool is used for all learning purposes. Weka is a
comprehensive suite of Java class libraries that implement many
state-of-the-art machine learning and data mining algorithms [7]. It is easy to
integrate, since the learning algorithms can be called directly from Java code.
2.2 Planning
Rescue agents need to plan their decisions to reach their goal, which is to
save as many civilians as possible. An ambulance agent reports the parameters
of each civilian it finds to the ambulance centre. The ambulance centre uses
these parameters and the learned model (Section 3.1) to predict the ETD and
then prioritizes the agents' tasks accordingly.
¹ http://www.cs.waikato.ac.nz/ml/weka/
2.2.1 Exploring-Rescuing trade-off
The first decision of the centre concerns the exploring-rescuing trade-off.
The ambulance centre must answer the question: when should it decide which
civilian is to be rescued? The first option is to start a rescue as soon as a
victim is reported. This is not efficient, because time may be wasted on that
victim while there is another civilian who needs help more urgently, who has
more HP, and so on. The chosen approach is to introduce a priority value based
on the ETD: if the priority value is greater than a certain threshold, the
ambulance centre orders an agent to rescue the civilian.
2.2.2 Task prioritization
The ambulance centre can prioritize the agents' tasks according to:
• shortest distance, where the closest targets have high priority;
• ETD, where targets with a low ETD have high priority;
• shortest distance + ETD, where close targets with a low ETD have high
priority. This combined approach is inspired by A*: f(n) = g(n) + h(n),
where g(n) is the normalized cost of the shortest path and h(n) is the ETD of
the target.
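The combined criterion can be illustrated with a small ranking sketch in Python. Several details are assumptions not given in the text: the path cost is normalized by the largest cost among the candidate targets, the ETD is divided by the maximum time step (300) so that both terms lie on a comparable scale, and targets are ordered by increasing f(n). The names `combined_priority` and `rank_targets` and the three example targets are hypothetical.

```python
MAX_TIMESTEP = 300  # assumed upper bound used to scale the ETD


def combined_priority(path_cost, max_path_cost, etd, max_timestep=MAX_TIMESTEP):
    """f(n) = g(n) + h(n); a lower value means a more urgent target."""
    g = path_cost / max_path_cost if max_path_cost > 0 else 0.0  # normalized cost
    h = etd / max_timestep  # assumption: ETD scaled to [0, 1]
    return g + h


def rank_targets(targets):
    """Sort (civilian_id, path_cost, etd) tuples by increasing f(n)."""
    max_cost = max((cost for _, cost, _ in targets), default=0.0)
    return sorted(targets, key=lambda t: combined_priority(t[1], max_cost, t[2]))


# Made-up targets: A is closest, B dies soonest, C is a compromise.
targets = [("A", 100, 250), ("B", 400, 90), ("C", 200, 120)]
ranking = [civ_id for civ_id, _, _ in rank_targets(targets)]
print(ranking)  # ['C', 'A', 'B']
```

In this made-up case, pure shortest distance would pick A first and pure ETD would pick B first, while the combined value selects C, which is both reasonably close and reasonably urgent.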
3 Results
3.1 Learning
Applying the learning method described in 2.1.3 to the training dataset
described in 2.1.2 with the Weka tool yields the following model:
Figure 1: Resulting output learning model
This model is used by the ambulance centre to predict the ETD and to
prioritize the agents' tasks accordingly, using the parameters of each civilian
reported by the agents. A summary of the model validation described in 2.1.3 is
shown in Fig.2.
Figure 2: Summary of the model validation
3.2 Planning
The number of rescued civilians was chosen as the evaluation measure for the
proposed approaches, since the score does not depend only on civilians. Fig.3
shows the number of rescued civilians for each of the approaches described in
2.2.2.
Figure 3: Comparison of the number of rescued civilians using the three
approaches
4 Discussion
The proposed approach, which introduces the ETD into the prioritization task
and in particular uses the combined distance+ETD strategy, leads to better
performance than the shortest-distance strategy.
It is worth noting that some statistical measures of the learned model in this
work are better than those described in [6]. The correlation coefficient, which
quantifies the statistical relationship between two sets of observed values
(here, the predicted and the actual ETD), is higher by 7%. Furthermore, the
root relative squared error is lower by 5%.
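For reference, the two measures mentioned above can be computed as in the following Python sketch, using their standard textbook definitions. The function names and the sample values are hypothetical and are not results from this project.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def root_relative_squared_error(actual, predicted):
    """Squared error relative to always predicting the mean of the actual
    values; 0 is a perfect fit, 1 is no better than that baseline."""
    mean_a = sum(actual) / len(actual)
    squared_error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    baseline_error = sum((a - mean_a) ** 2 for a in actual)
    return math.sqrt(squared_error / baseline_error)


# Made-up ETD values for illustration only.
actual = [80, 179, 300, 120]
predicted = [90, 170, 280, 140]
print(correlation_coefficient(actual, predicted))
print(root_relative_squared_error(actual, predicted))
```

A higher correlation coefficient and a lower root relative squared error both indicate predictions that track the actual ETD more closely.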
Finally, with the proposed approach the number of rescued civilians increases
by 7% compared with the shortest-distance strategy and by 15% compared with the
ETD-only strategy.
5 Conclusion
Introducing a learning model into the task planning of the ambulance rescue
team helped to achieve better results in the rescue operation. The model was
obtained by applying a linear regression algorithm to a training dataset. It
was then used to predict the ETD for task prioritization and planning. In
particular, the ETD was used to optimize the search algorithm that constructs
paths for the agents to move from one location on the map to another. This was
done by replacing the traditional breadth-first search with a heuristic search,
which includes the ETD as a heuristic in the evaluation function for expanding
nodes.
The proposed solution did not only help to optimize the task planning of the
agents and achieve better results. It also helped to overcome the obstacles
imposed by the inaccurate values retrieved from the simulator regarding
civilians. Having a training dataset makes it possible to learn the relation
between the parameters of civilians.
In future work, the training dataset can be augmented with other parameters,
for example the shortest distance between a civilian and the nearest agent.
Furthermore, combining the proposed approach with a division of the map into
clusters and the assignment of one or more agents to specific clusters could
lead to better results, increasing the number of rescued civilians and reducing
the time wasted in rescue operations.
References
[1] Cyrille Berger, Developing a team, 2015.
[2] Fadwa Sakr, Slim Abdennadher, Harnessing Supervised Learning Techniques
for the Task Planning of Ambulance Rescue Agents, ICAART 2016, 2016.
[3] Gareth James et al., An Introduction to Statistical Learning, Vol. 6, New
York: Springer, 2013.
[4] Payam Refaeilzadeh, Lei Tang, Huan Liu, Cross-validation, Encyclopedia of
Database Systems, Springer US, 2009, 532-538.
[5] Sameh Metias, Mahmoud Walid and others, RoboCup 2015 Rescue Simulation
League Team Description GUC ArtSapience (Egypt), 2015.
[6] Sameh Metias, Mohammed Waheed and others, RoboCup 2016 Rescue Simulation
League Team Description GUC ArtSapience (Egypt), 2014.
[7] Ian H. Witten, Eibe Frank, Leonard E. Trigg, Mark A. Hall, Geoffrey Holmes,
Sally Jo Cunningham, Weka: Practical machine learning tools and techniques with
Java implementations, 1999.