Search algorithm for optimal execution of incident commander guidance 355
agents and multi-agent systems, automated planning and scheduling algorithms,
and spatial agent-based modelling and geospatial simulation. His current research
fields include distributed autonomous GIS, machine learning, software
application development (mobile app development, software system engineering,
cloud communication platforms, and location-based web services), and the internet
of things.
Hirokazu Tatano is a Professor at the Disaster Prevention Research Institute and
the Director of the Social Systems for Disaster Risk Governance Lab at Kyoto
University, Japan. He is a Co-Founder of the IDRiM Society (International
Society for Integrated Disaster Risk Management). His fields of specialisation
include disaster risk management, economic analysis of disasters, and infrastructure
economics.
Hossein Aghamohammadi holds a Master's degree and a PhD in GIS from the
K.N. Toosi University of Technology. Since 2007, he has taught remote sensing,
GIS, database, and geomatics courses at Azad University.
1 Introduction
Action planning is a key research topic in the emergency management of disasters, especially
in the search and rescue (SAR) domain, where the commander of a team aims to
control and coordinate the actions of field units. Crisis response to an earthquake disaster starts
with SAR carried out by a team of field units.
Effective coordination is an essential ingredient of efficient emergency response
management (Chen et al., 2008). Coordination is the act of managing interdependencies
among activities performed to achieve a goal. Inefficient coordination causes the team to
fail to reach its goal. At least five reasons necessitate coordination among field units
in the SAR problem domain:
• The ‘enabling’ dependency between two tasks states that accomplishment of the
predecessor enables field units to start carrying out the successor. For example, a
search task and a rescue task that are associated with a victim should be done
sequentially. Field units can do tasks whose state is ‘enabled’. Accomplishing a task
reveals a new ‘enabled’ task or changes the state of a ‘not yet enabled’ task to
‘enabled’. Not yet enabled tasks result in idle and inactive field units.
• The ‘enabling’ dependency between field units’ actions states that the sub-actions of
a field unit may depend on the outcomes of actions of other field units. A field unit
might be capable of doing only a subset of tasks because of the distribution of
capabilities (and also resources and expertise) among field units. Therefore, this field
unit is enabled to do a subset of ‘enabled’ tasks
I which have already been revealed by actions of other field units or by itself
II for which this field unit can provide the required capabilities.
It is reasonable to involve all field units in performing tasks by decreasing the amount of
time a field unit is idle.
• Redundant actions result in anarchy or chaos. Because a subset of field units may
possess overlapping capabilities, conflicts arise between field units who intend to do
the same tasks, while a single field unit is sufficient to accomplish an instance of these tasks.
• Information sharing allows field units and the commander to have a better
perception of the world state, both in real time and in the future. Information may
include task information, task schedules, or action plans. This enables them to make
better action plans.
• Task assignment and scheduling treat a field unit as a machine that should
perform tasks with regard to some optimisation criteria. To optimise the global
utility, it is crucial for the team to define what tasks a certain field unit should do,
when, and where.
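The ‘enabling’ dependency described above can be sketched as a small state model. The class and names below are illustrative assumptions, not the paper's formalism:

```python
# Hypothetical minimal model of the 'enabling' dependency: a task becomes
# 'enabled' only once all of its predecessor tasks are accomplished.

class Task:
    def __init__(self, name, predecessors=()):
        self.name = name
        self.predecessors = list(predecessors)
        self.done = False

    def enabled(self):
        # Not yet done, and every predecessor has been accomplished.
        return not self.done and all(p.done for p in self.predecessors)

search = Task("search victim")
rescue = Task("rescue victim", predecessors=[search])

assert search.enabled() and not rescue.enabled()  # rescue is 'not yet enabled'
search.done = True                                # accomplishing the search...
assert rescue.enabled()                           # ...reveals/enables the rescue
```

Field units would only be dispatched to tasks for which `enabled()` holds; dependent tasks would otherwise leave units idle, as noted in the first bullet.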
Many studies have applied AI planning techniques to the problem domain of
disaster crisis response, each taking into account a specific problem (problem
characteristics, assumptions, requirements, etc.). Some works have adopted a centralised
approach, where an agent, e.g., the incident commander, is responsible for making an action
plan for the field units.
To sharpen our understanding, we consider the following characteristics and
assumptions:
• A single objective: The global goal is to carry out SAR by a team of field units. The
optimisation criterion is to minimise the overall time, called the makespan, needed to
achieve this joint objective through the field units’ actions.
• Commander as human decision-maker: The commander is the team planner. This
paper focuses on the role of commander in making an action plan for field units.
Field units are considered autonomous entities that are capable of reasoning about
their own actions, taking into account the action plan made by the commander.
• Centralised approach: A centralised approach is required to calculate a central plan.
The action plan, which is made by the commander in a centralised manner, will be
executed by field units in a distributed and decentralised approach.
• Location-based temporal macro tasks: The incident commander has a big picture of
task environment, which is formed by location-based temporal macro tasks (LoTeM
tasks). See Section 2 for details.
• Macro action: The plan that the commander makes has a macro character. A macro
action plan does not specify the actions of field units in detail. It partially specifies their
actions and delegates these intelligent agents to autonomously and explicitly
plan (reason and decide) their own micro actions. A macro action states
what subset of field units should execute what subset of actions (or what subset of
tasks) within what sub-area during what time window. A micro action specifies a
definite action which a definite field unit is planning to execute at a definite time at a
definite location. See Section 2 for more details.
• Human supervisor: Humans should be involved in the planning loop and should have
a supervisory role in the planning process. Because of the complexity of commanding
crisis response operations, a human planner cannot be completely replaced by a fully
automated system. In fact, it is not feasible for fully automated planning systems to
effectively plan for field units by reasoning about all the possibilities that might arise
during the execution of tasks in a complex environment. This is especially true in
operation centres.
• Partial observation of the task environment: The commander has a partial view of
the task environment, because tasks are discovered (revealed, observed) by field units’
actions over time. There is uncertainty in
1 the outcomes of field units’ actions
2 the amount of time in which a task is accomplished.
Therefore, the commander should act according to a loop consisting of five steps:
1 estimate tasks and integrate them with observed data
2 plan, re-plan, or revise (adapt, refine) the current plan in response to new
situations in a timely manner
3 disseminate the plan’s information for execution
4 continuously monitor how the plan is executed by the field units, and gather and
integrate data reported by field units to maintain timely situational awareness
5 learn to act better.
This loop is continuously repeated over time until ultimate goals are achieved.
• Quick answer: The commander should make the action plan for the field units
quickly, since planning is done under time pressure.
• Time: Executing tasks is a time-consuming operation. Therefore, we should
determine the amount of time that an action performed by a definite field unit needs
to accomplish a definite task.
• Planning versus scheduling: The commander aims to define what macro action should
be done in real time and to remake a new decision in a timely manner to revise the
previous one. The thread assignment problem is not a scheduling problem that is
capable of changing the task environment. In fact, task/action scheduling is done
under a given set of strategic decisions.
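The commander's five-step loop described above can be expressed as a control skeleton. The helper callables are placeholders with assumed names, not components of an actual system:

```python
# Skeleton of the commander's sense-plan-act loop (steps 1-5 above).
# All helpers are caller-supplied placeholders for the real activities.

def command_loop(goal_achieved, estimate_tasks, make_or_revise_plan,
                 disseminate, monitor_and_integrate, learn):
    plan, world = None, {}
    while not goal_achieved(world):
        world = estimate_tasks(world)             # 1. estimate tasks, integrate data
        plan = make_or_revise_plan(plan, world)   # 2. plan / re-plan / revise
        disseminate(plan)                         # 3. disseminate the plan
        world = monitor_and_integrate(world)      # 4. monitor execution, gather reports
        learn(plan, world)                        # 5. learn to act better
    return plan

# Toy run: the 'world' just counts planning rounds until the goal is met.
rounds = []
final = command_loop(
    goal_achieved=lambda w: w.get("t", 0) >= 2,
    estimate_tasks=lambda w: {**w, "t": w.get("t", 0) + 1},
    make_or_revise_plan=lambda p, w: w["t"],
    disseminate=lambda p: rounds.append(p),
    monitor_and_integrate=lambda w: w,
    learn=lambda p, w: None,
)
assert final == 2 and rounds == [1, 2]
```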
This paper addresses the problem of optimal execution of human strategy, in which the best
choice should be made from a set of presented alternatives. The selected choice is the best
strategic decision the commander should make for specifying macro actions of field units
in SAR.
This is a difficult problem for the human, because it is not feasible for the commander, as
the human planner, to quickly and appropriately reason about which alternative might be the
best in real time. In fact, it is beyond human capabilities in the problem addressed in this paper. In
addition, a review of the related works shows that neither have they thoroughly addressed the
stated problem, nor has a proper solution been developed for solving this type of problem.
As a result, it is necessary to assist the commander and support the human planner in
the strategic decision-making procedure by optimally executing the human strategy. Our
motivation in this paper is to propose an ideal approach for solving this problem.
Our objective in this paper is to apply artificial intelligence techniques to this problem.
The paper aims to design and implement a heuristic search algorithm that is capable
of automated reasoning to select the best choice. In addition, this algorithm is capable of
calculating, in the shortest amount of time, an optimal macro action plan by which the goal
will be satisfied.
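As a rough illustration of the kind of heuristic search intended here, the sketch below runs a best-first search over sequences of strategic decisions, minimising estimated makespan. The state encoding, successor generator, and heuristic are all assumptions for illustration; the paper's actual algorithm is developed in later sections:

```python
import heapq
import itertools

# Best-first search over decision sequences: each successor is a
# (decision, duration, next_state) triple; the heuristic estimates the
# remaining makespan from a state (0 here, i.e. uniform-cost search).

def best_first_search(start, is_goal, successors, heuristic):
    counter = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(heuristic(start), 0.0, next(counter), start, [])]
    seen = set()
    while frontier:
        est, elapsed, _, state, decisions = heapq.heappop(frontier)
        if is_goal(state):
            return decisions, elapsed
        if state in seen:
            continue
        seen.add(state)
        for decision, duration, nxt in successors(state):
            t = elapsed + duration
            heapq.heappush(frontier,
                           (t + heuristic(nxt), t, next(counter), nxt,
                            decisions + [decision]))
    return None, float("inf")

# Toy example: three 1-minute steps beat one 5-minute jump to the goal.
def successors(s):
    if s == 0:
        return [("step", 1, 1), ("jump", 5, 3)]
    if s < 3:
        return [("step", 1, s + 1)]
    return []

plan, makespan = best_first_search(0, lambda s: s == 3, successors, lambda s: 0)
assert plan == ["step", "step", "step"] and makespan == 3
```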
2 Background to execution problem of incident commander guidance
SAR plays a major role in the response to disasters occurring in urban areas. SAR is concerned
with reducing the number of fatalities in the first few days after the occurrence of a disaster.
Tasks in the SAR problem domain are categorised into four types:
1 conduct reconnaissance and assessment by collecting information on the extent of
damage
2 search and locate victims trapped in collapsed structures
3 extract and rescue trapped victims
4 transport injured survivors to hospitals or refuges.
A disaster-affected area contains a number of tasks geographically dispersed in the area, and
task information is associated with geographic objects such as buildings, road segments,
and city blocks. Tasks entail interdependencies: completing one task might enable (discover,
release, reveal) another. To save one victim, who is located in a damaged
building at a certain spatial location, a sequence of four tasks should be accomplished. A
task to be performed requires one or several capabilities synchronously as well as a considerable
amount of time. Not all tasks are known in advance; accomplishing tasks reveals new
tasks and changes the state of tasks over time. Figure 1 presents a simple task tree that is
associated with saving a person. It is notable that the attributes of these tasks may differ
from each other and are changeable over time.
Field units are hierarchically organised as a team. A team consists of
I a commander situated at the top level called the strategic level
II field units at the lower level called the tactical level.
This paper refers to a team as a society of cooperative intelligent agents. The team is faced
with the problem of carrying out geographically dispersed tasks, under evolving execution
circumstances, in a manner that achieves global objectives. Field units, which are spatially
distributed in the geographic area, might be robots or humans that are responsible for doing
SAR. A field unit, which can move from one location to another:
• would possess different capabilities that are required by tasks
• perceives its local environment
• cooperates and coordinates with other field units in order to maximise the global
utility
• might have a set of actions, each of which is associated with a definite speed and a set
of capabilities, and autonomously reasons about its actions to select and execute its
own actions
• has a partial and local view of the world state
• reports to the operations centre
• executes the commander’s orders (action plan).
In the SAR domain, field units are categorised into several types according to capabilities
such as
1 reconnaissance
2 canine search
3 electronic search
4 light rescue
5 medium rescue
6 heavy rescue
7 volunteer.
Figure 1 A simple task structure consisting of three tasks, which should be accomplished in order
to save a victim in SAR domain (see online version for colours)
The role of the commander is to control and command the field units. The commander has
a large global picture of the world’s state, which enables him or his unit to define global
objectives for the team and make strategic decisions for the field units. One example of a
joint objective is to finish all rescue tasks located within a zone by all field units in minimum
time. Figure 2 presents a team of four field units commanded by an incident commander.
We study this team in detail in Section 3.
Figure 2 Structure of a team of four heterogeneous field units (see online version for colours)
The global perception, which the team commander has of the task environment, is formed
by information on location-based temporal macro tasks (LoTeM tasks). A LoTeM task states
how many basic tasks of a certain type are spatially located within a definite geographic
area during a definite time window. The concept of a LoTeM task is shown in Figure 3. Three LoTeM
tasks are spatially contained within a geographic area, such as a road segment, city block,
etc. These tasks have dynamic and temporal attributes.
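One possible (assumed) encoding of a LoTeM task follows; the field names are illustrative, and the sample values are taken from item 10 of Table 4 in Section 3:

```python
from dataclasses import dataclass

# Hypothetical record for a location-based temporal macro (LoTeM) task:
# how many basic tasks of one type lie within a geographic area at a given
# time, split into 'enabled' and 'not yet enabled' amounts.

@dataclass
class LoTeMTask:
    location: str         # geographic object id, e.g. a road segment
    task_type: str        # e.g. "T2" (light rescue)
    not_yet_enabled: int  # tasks estimated to be revealed later
    enabled: int          # tasks ready to be carried out now

    @property
    def total(self):
        return self.not_yet_enabled + self.enabled

# Item 10 of Table 4 (Section 3): light rescue tasks near road s5 at time 0.
t10 = LoTeMTask(location="s5", task_type="T2", not_yet_enabled=5, enabled=6)
assert t10.total == 11
```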
Planning and scheduling techniques are two major coordination mechanisms in multi-
agent systems. The problem of how agents should get from the current world’s state to the
desired goal state through a sequence of actions (an action plan) represents a multi-agent
planning problem. An action plan specifies a sequence of actions that a definite agent should
do. The multi-agent scheduling is the problem of assignment of limited resources (agents)
to time-consuming tasks within a defined time window and coping with a set of constraints
and requirements over time in order to maximise an optimisation criterion.
Macro action planning for a team of field units responding to disaster crisis is important
to the team’s commander in order to achieve a global/joint objective, e.g., accomplishing
search and rescue tasks in a minimum overall time.
Figure 3 A LoTeM task structure of three task types which are spatially contained within a
geographic area at a given time (see online version for colours)
Strategic decision making is a technique by which the commander can coordinate and
control the actions of field units by making strategic decisions in the SAR domain with regard
to the defined requirements. Figure 4 briefly presents the strategic decision-making process,
which consists of four main phases as follows:
1 Specify a strategy: Strategy specification enables a commander, as a human, to
express and encode his or her intuition for action planning. A strategy decomposes a
problem (e.g., carrying out SAR by a team within an operational area) into a finite set
of small sub-problems, each called a thread. A strategy is a set of prioritised threads
that are ordered from high to low according to their importance under human
supervision. A thread consists of:
I a subset of task types (a subgoal)
II a subset of zones (a sublocation)
III a subset of field units (a subteam).
A strategy might include a field unit in several threads. If we completely ignore the
commander, a strategy will be defined using a single thread that includes all task types,
the whole operational area, and all field units.
2 Calculate a set of alternatives: Execution of a strategy is the problem of the appropriate
assignment of field units to the threads of a human strategy according to the world’s state
in real time. A strategy needs to be executed or re-executed whenever a new
set of field units enters one or several threads. A thread receives a new set
of field units from two sources:
I the commander, who can directly send a set of (free) field units to this thread, or
II the higher thread, which releases a set of field units that were assigned or
entered into it, and sends this set to the lower thread during the
adaption phase (phase 4).
Human strategy execution is done in two steps. The first step is to calculate the possible
alternatives that present feasible choices for solving the thread assignment problem
within a time frame. An alternative presents which field units are assigned to which
threads, provided that a field unit cannot be assigned to more than one thread in an
alternative. The calculated alternatives are input data for the next phase.
3 Make a choice: The second step in the execution of the human strategy is to make a
choice by selecting an alternative from the calculated alternatives. The selected
choice is referred to as a strategic decision that the commander should select for
controlling and coordinating the field units. With regard to the strategy definition, a
strategic decision constrains a subteam to a subset of task types within a subarea. In
other words, a strategic decision assigns a field unit to either one thread or no thread.
A field unit which is assigned to a thread is constrained to the thread definition and is
delegated to do any task contained by the thread. The selected strategic decision
specifies the macro actions of field units for a time window; therefore, field units are
required to specify and execute their actions taking this decision into account. As a
result, a strategic decision is considered a macro action.
4 Adapt the strategic decision in a timely manner: A strategic decision, made in the
previous phase at a certain time and being executed by field units, is valid for a
limited time. Therefore, the commander is required to adapt (revise, refine) this
strategic decision at the right time. The adaption of a strategic decision includes
I identifying, at the right time, a subset of field units that should be released from
the threads to which they were assigned via the strategic decision
II sending the released field units to the lower threads. The adaption of a strategic
decision results in re-executing phase 2.
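Phases 1 and 2 above can be sketched together: a strategy as a priority-ordered list of threads, and a brute-force enumeration of feasible alternatives in which each field unit is assigned to at most one eligible thread. The data model and the exact feasibility rules are assumptions for illustration:

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical data model: a strategy is a priority-ordered list of threads;
# each thread pairs a sub-goal (task types) with a sub-location and a sub-team.

@dataclass(frozen=True)
class Thread:
    tid: int
    task_types: frozenset  # sub-goal
    zones: frozenset       # sub-location
    units: frozenset       # sub-team eligible to serve this thread

def alternatives(unit_ids, threads):
    # Each unit picks one thread whose sub-team lists it, or None (unassigned);
    # the Cartesian product enumerates every feasible alternative.
    options = [[t.tid for t in threads if u in t.units] + [None]
               for u in unit_ids]
    for combo in product(*options):
        yield dict(zip(unit_ids, combo))

threads = [
    Thread(1, frozenset({"T0", "T1"}), frozenset({"s3", "s4", "s5"}),
           frozenset({"a0", "a2"})),
    Thread(2, frozenset({"T2"}), frozenset({"s3", "s4", "s5"}),
           frozenset({"a2"})),
]
alts = list(alternatives(["a0", "a2"], threads))
# a0: thread 1 or unassigned (2 options); a2: thread 1, 2, or unassigned (3).
assert len(alts) == 6
assert {"a0": 1, "a2": 2} in alts
```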
In the strategic decision-making chain, strategic decisions are made sequentially and in a
timely manner until the desirable goal is met. These decisions generate a macro action
plan that states how and when a team reaches a pre-defined goal. The plan is initiated from
one of the presented alternatives at the current time and then evolves over time whenever
a new decision is made in Phase 3. It is obvious that, due to the existence of
different choices for making a strategic decision at a given time, several different plans may
be generated, each having a strong effect on the efficiency of the SAR operation.
Among these plans, there is an optimal plan that guarantees the team’s ability to maximise
the joint objective. One of the alternatives included in the optimal macro action plan is
considered the best strategic decision the commander should select. The optimal execution
of the commander’s strategy aims to reason about which choice could be the best strategic
decision.
Figure 4 Strategic decision-making process and role of step 3 in this process as the problem
addressed by this paper (see online version for colours)
3 A simulated scenario
3.1 SAR scenario
To better understand and address the problem, this section explains a simple simulated
SAR scenario to which a team has been assigned.
Imagine that an earthquake disaster has occurred in an urban area and a team has
been dispatched to this area to engage in SAR. Figure 5 shows a map, created with
geographic information systems (GIS), that visualises the locations of four field units and
the spatial distribution of five highlighted road segments, serving as five operational zones
at time 0. This map provides timely situational awareness for the commander of SAR.
Figure 5 A simulated SAR scenario (see online version for colours)
Table 1 lists three task types with their associated properties to present the SAR domain data. For
example, doing an instance of task type T0, which represents one reconnaissance task, requires
one capability of type C0 and a duration of 5 min. The domain data are essentially defined and
modified by the commander.
Table 1 Matrix of task types to present the SAR domain data
∆ t Capability requirements
Task-type (min) C0 C1 C2
T0 5 1 0 0
T1 20 0 1 0
T2 60 0 0 1
Capabilities description: C0: Reconnaissance; C1: Search; C2: Light rescue.
Task types description: T0: Reconnaissance; T1: Search; T2: Light rescue.
The team includes four field units and a commander. We assume that all of the field units
are free (or idle) at time 0. The action matrix shown in Table 2 lists these field units with
an action set associated with each field unit. For example, field unit a2 has two actions
and is capable of doing one of them at a time. The second action provides one unit of
capability type C1 at a speed twice the base level. In summary, by this
action, this field unit can carry out one unit of a T1 task within 10 min.
Table 2 Matrix of field units’ actions
Number of capabilities
Field unit ID Action speed C0 C1 C2
a0 2 1 0 0
a2 1 1 0 0
2 0 1 0
a6 1 0 1 0
2 0 0 1
a7 1 0 1 0
2 0 0 1
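The speed semantics of Table 2 can be checked with a one-line formula; the division rule below is an assumption consistent with the a2 example in the text:

```python
# Assumed formula: an action with speed k accomplishes a task of base
# duration d in d / k minutes.

BASE_DURATION = {"T0": 5, "T1": 20, "T2": 60}  # minutes, from Table 1

def task_time(task_type, speed):
    return BASE_DURATION[task_type] / speed

# Field unit a2's second action provides capability C1 at speed 2,
# so it carries out one T1 (search) task in 10 minutes.
assert task_time("T1", 2) == 10
```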
Table 3 shows the shortest distances among six locations at time 0. We assume that the
average moving speed of all field units equals 20 metres per minute through the road
network. To keep the problem simple, we assume that the information provided by this table
does not change over time. In a real situation, this table would be calculated and updated by
a team of road-clearing vehicles.
A set of 12 LoTeM tasks, which are observed or estimated at time 0, are geo-located in
five road segments. Table 4 presents the state of the task environment at time 0. For example,
the 10th item concerns the proximity of road s5: five tasks of the light rescue type
are estimated to be revealed in the future, and six tasks of the same type have already been
revealed and are ready to be carried out by field units. Because of the ‘enabling’ dependency
among tasks, e.g., between the 9th and 10th tasks, if the whole 9th task (2 not yet enabled
plus 5 enabled) is completely done, it is estimated that five tasks of the light rescue
type will be revealed in the proximity of road s5. From another point of view, six civilians
have been successfully located (searched) under debris and now need to be rescued at
location s5. In addition, five persons are estimated to be under debris, and to rescue them,
10 tasks of type T0 and seven tasks of type T1 should be completely carried out in the
same geographic area. Over time, estimated information is replaced with real, observed
information.
These data are used to provide timely situational awareness of the world’s state and to
form a big picture of the crisis situation for the commander.
Table 3 The shortest distances (given in metre) among six locations visualised in Figure 5
s2 s1 s3 s4 s5 s6
s2 0 225 447 764 364 625
s1 225 0 370 687 418 548
s3 447 370 0 343 452 221
s4 764 687 343 0 618 224
s5 364 418 452 618 0 476
s6 625 548 221 224 476 0
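Travel times follow directly from Table 3 and the assumed average speed of 20 metres per minute; the lookup below exploits the symmetry of the distance table:

```python
# A few entries of the symmetric distance matrix in Table 3 (metres).
DIST = {("s2", "s1"): 225, ("s3", "s6"): 221, ("s2", "s4"): 764}
SPEED = 20  # metres per minute (assumed average moving speed)

def travel_time(a, b):
    # The table is symmetric, so look the pair up in either order.
    d = DIST.get((a, b)) or DIST.get((b, a))
    return d / SPEED

assert travel_time("s2", "s1") == 11.25  # 225 m / 20 m per min
assert travel_time("s4", "s2") == 38.2   # 764 m / 20 m per min
```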
Table 4 State of a set of 12 LoTeM tasks geo-located in five road segments at time 0
LoTeM Location Not yet enabled Enabled
task Id (Road S.) Task-type amount amount
1 s1 T0 0 25
2 s1 T2 0 4
3 s2 T1 0 5
4 s2 T2 10 5
5 s3 T0 0 15
6 s3 T1 8 0
7 s3 T2 2 8
8 s5 T0 0 10
9 s5 T1 2 5
10 s5 T2 5 6
11 s4 T1 0 18
12 s4 T2 10 2
3.2 A scenario of strategic decision making process
This subsection presents a simple scenario of the strategic decision making process, in
which the commander of the team is faced with the problem of making the best strategic
decision.
The commander first specifies a strategy, as Table 5 presents. The strategy in this
scenario is composed of three threads that partition the objective into three sub-problems.
Thread 1 states that the first and highest priority for the team is to do two task types
{T0, T1} at three geographic zones {s3, s4, s5}. Any appropriate and available subset of the
four field units {a0, a2, a6, a7} can be assigned to this thread, in order to do the tasks that this
thread will contain after the strategy is executed in real time. The human strategy has included
field unit a2 in all three threads; it means that this field unit can be assigned to
one of the three threads via a strategic decision. In addition, abstractly, the task environment
defined by thread 2 is completely dependent on thread 1. We assume the commander has
sent all four field units, as free ones, to thread 1 at time 0.
Table 5 An example of incident commander guidance (human strategy)
Thread Id   A sub-location   A sub-goal (task type)   A sub-team
1 s3, s4, s5 T0, T1 a0, a2, a6, a7
2 s3, s4, s5 T2 a2, a6, a7
3 s1, s2 T0, T1, T2 a0, a2, a6, a7
Strategy execution is the next step, and it includes two phases. First, the whole problem needs
to be partitioned into three sub-problems, taking into account the human strategy and the
real-time state of the world (see Table 6).
Table 6 Partitioning the whole problem into three sub-problems during strategy execution at time 0
Thread Id   LoTeM task Id   Field units assigned to   Field units sent or released into
1 5, 6, 8, 9, 11 a0, a2, a6, a7
2 7, 10, 12
3 1, 2, 3, 4
The calculation of feasible alternatives results in 10 choices, presented in Table 7. Each
alternative is considered a potential candidate for the strategic decision that the commander
can select/make at time 0.
Table 7 A set of 10 alternatives (choices) calculated for making a strategic decision at time 0
Alternative no.   Assignment to Thread 1   Assignment to Thread 2   Assignment to Thread 3
1 a2 a7 a0, a6
2 a2 a6, a7 a0
3 a0, a7 a6 a2
4 a2, a7 a6 a0
5 a0, a2 a7 a6
6 a0, a2 a6, a7
7 a0, a2, a6 a7
8 a2, a6, a7 a0
9 a0, a2, a7 a6
10 a0, a2, a6, a7
Phase 2 of strategy execution is to select, from the 10 alternatives, the best choice, which
represents the optimal strategic decision the commander should make at time 0. Therefore, the
key question is: which choice is the optimal strategic decision?
4 Literature review
Several works focusing on the optimisation of emergency response operations with a
centralised approach have been carried out by different schools. This section reviews
some of them. A review of the related works shows that they have not thoroughly addressed
the stated problem, and a proper solution has not been developed for solving this type of
problem.
A system of distributed autonomous GIS (DAGIS) was proposed by Nourjou and
Gelernter (2015) to solve the coalition formation problem within a human team for public
safety applications via an automated mechanism. DAGIS implies a system (social network)
of autonomous software agents that carry out the coalition formation method on behalf of
human users with some degree of independence or autonomy and, in doing so, automatically
communicate with each other and employ problem-solving algorithms. DAGIS
runs on mobile devices such as smartphones, and an instance of DAGIS is used by one
type of human user: field unit, civilian, or incident commander. A DAGIS that interacts
with the incident commander only follows the human’s guidance, without optimisation of
his decision.
The incident command system (ICS) is the official designation for a particular approach
used by many public safety professions (e.g., firefighters and police) to assemble and
control the temporary systems they employ to manage personnel and equipment at a wide
range of emergencies, such as fires, multi-casualty accidents (air, rail, water, roadway),
natural disasters, hazardous materials spills, and so forth (Bigley et al., 2001). The Incident
Commander, the highest-ranking position within the ICS, is ultimately responsible for all
activities that take place at an incident, including the development and implementation of
strategic decisions and the ordering and releasing of resources. The planning section, one
of the sections that reports directly to the incident commander, develops the action plan
to accomplish the system’s objectives. It collects, evaluates, and disseminates information
about the development of the incident and status of resources. Information is needed to
understand the situation, predict probable courses of events, prepare alternative strategies,
and control operations. An action plan is made of five phases:
• understand the situation
• establish incident objectives (priorities, objectives, strategies, tactics/tasks)
• develop an action plan
• prepare and disseminate the plan
• continually execute, evaluate, and revise the plan (FEMA, 2012).
The ICS uses the strategic planning approach for the coordination of emergency operations
through goal selection, goal decomposition, grouping people into units, and assigning units to
subgoals. Unfortunately, FEMA provides a set of useful guidelines about practices but does
not explicitly identify the algorithms and design requirements for information systems needed
to make incident action plans. It is not clear how the incident commander is involved in
the action planning process or how the information system assists the incident commander
via a mixed-initiative planning system.
STaC addresses multi-agent planning problems in dynamic environments where most
goals are revealed during execution, where uncertainty in the duration and outcome of
actions plays a significant role, and where unexpected events can cause large disruptions
to existing plans (Maheswaran et al., 2011). STaC is composed of a strategy specification
language that captures human-generated high-level strategies and corresponding algorithms
executing them in dynamic and uncertain settings. This partitions the problem into strategy
generation, designed by humans and understood by the system, and tactics, orchestrated by
the system with information to and from responders on the ground. STaC gives the ability
to create changing subteams with task threads under constraints (e.g., focus on the injured).
The connection between a STaC strategy and the STaC execution algorithm is the notion of
capabilities: agents have capabilities; tasks require capabilities. STaC dynamically updates
the total capability requirements for the tasks in the strategy and assigns agents to tasks
during execution following the human guidance. The big inefficiency in this approach is
that the execution algorithm searches for and assigns a subteam providing the minimum total
capability requirement for a thread, while the best decision may be to select a subteam
providing the maximum one. This paper proposes an AI problem-solving technique that aims
to optimise the decision-making problem by identifying the best choice from a set of
alternatives effectively and efficiently.
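The inefficiency noted above can be made concrete with a toy calculation; the numbers and the parallel-rounds model are assumptions for illustration, not STaC's actual scheduler:

```python
import math

# Toy model: `capability` units work in parallel on `tasks` identical tasks of
# `duration` minutes each, so the thread finishes after
# ceil(tasks / capability) rounds.

def thread_makespan(tasks, capability, duration=20):
    return math.ceil(tasks / capability) * duration

tasks = 8
assert thread_makespan(tasks, capability=1) == 160  # minimal subteam: 8 rounds
assert thread_makespan(tasks, capability=3) == 60   # larger subteam: 3 rounds
```

Under this toy model, a subteam that merely meets the thread's capability requirement finishes in 160 min, while a subteam maximising provided capability finishes in 60 min, which is the kind of gap the proposed optimisation targets.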
DEFACTO is a multi-agent based tool for training incident commanders for large scale
disasters (man-made or natural) (Schurr and Tambe, 2008). One key aspect of the proxy-
based coordination is ‘adjustable autonomy’ that refers to an agent’s ability to dynamically
change its own autonomy, possibly to transfer control over a decision to a human or another
agent. A transfer-of-control strategy is a pre-planned sequence of actions to transfer control
over a decision among multiple entities. For example, an AH1H2 strategy implies that an
agent (A) attempts a decision and if the agent fails in the decision, then the control over the
decision is passed to a human H1, and then if H1 cannot reach a decision, then the control is
passed to H2. The adjustable autonomy concept differs from the strategic decision-making
problem discussed here; therefore, this approach cannot be used to solve the problem stated
in this paper.
The RoboCup Rescue Simulation program aims to advance research in the area of disaster
management and SAR (Kitano and Tadokoro, 2001). It provides a platform for disaster
management where heterogeneous field agents (police, fire brigades, and ambulances)
coordinate with each other to deal with a simulated disaster scenario. Police agents have
to clear road blockades to provide access to the disaster sites, ambulance agents have to
rescue civilians, and fire brigade agents have to control the spread of fire and extinguish
it. The simulator also provides centres, a Police Office, a Fire Station, and an Ambulance
Centre, to help the field agents coordinate. A partitioning strategy can divide the
disaster space among agents in a pre-determined and homogeneous fashion (Paquet et al.,
2004); another strategy (Nanjanath et al., 2010) lets the centres partition the city into
clusters of roads and buildings using the k-means algorithm, assigning each cluster to an agent.
These strategies can be considered simplified versions of the strategy proposed in this paper.
Unfortunately, the approaches that have been proposed for the RoboCup Rescue Simulation do
not take into account the macro action planning problem or human guidance.
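The k-means partitioning strategy mentioned above can be sketched as follows. This is an illustrative rendering, not the cited implementation: the building coordinates and agent names are invented, and the pure-Python k-means below is a minimal stand-in.

```python
# Sketch of the partitioning idea of Nanjanath et al. (2010): cluster city
# locations with k-means and give each cluster to one agent. Coordinates and
# agent names are invented for illustration; no external libraries are used.
import random

def kmeans(points, k, iters=20, seed=0):
    rnd = random.Random(seed)
    centers = rnd.sample(points, k)              # deterministic initial centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared distance)
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        for c, pts in enumerate(clusters):       # recompute cluster centroids
            if pts:
                centers[c] = (sum(x for x, _ in pts) / len(pts),
                              sum(y for _, y in pts) / len(pts))
    return clusters

buildings = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
agents = ["police_1", "police_2"]
partition = dict(zip(agents, kmeans(buildings, k=len(agents))))
```

With these well-separated toy clusters, each agent ends up responsible for one spatial group of buildings.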
One of the most widely used techniques for problem-solving in artificial intelligence
is state-space search (Tambe and Norvig, 2009). Formulating a problem in a state-
space search framework requires four basic components: a state representation, an initial state,
an expansion operator, and a goal state. The objective is to find a sequence of actions that
transforms the start state into a goal state and also optimises some measure of the quality
of the solution (Kwok and Ahmad, 2005). Heuristic searches, such as A* search, are
a highly popular means of finding least-cost plans due to their generality, strong theoretical
guarantees of completeness and optimality, and simplicity of implementation (Cohen et al.,
2010). In this algorithm, a cost function f(s) is attached to each state s in the search
space, and the algorithm always chooses the state with the minimum value of f(s) for
expansion. The function f(s) can be decomposed into two components g(s) and h(s)
such that f(s) = g(s) + h(s), where g(s) is the cost from the initial state to state s, and
h(s) (also called the heuristic function) is the estimated cost from state s to a
goal state. Since g(s) represents the actual cost of reaching a state, it is h(s) that
captures the problem-dependent heuristic information. Indeed, h(s) is only an estimate of
the actual cost from state s to a goal state, denoted by h*(s). An h(s) is called admissible
if it satisfies h(s) ≤ h*(s), which in turn implies f(s) ≤ f*(s). Many problems that
can be formalised in the state-space search model are solvable using different versions
of this technique, including the optimal task assignment/allocation problem
(Kwok and Ahmad, 2005; Shahul et al., 2010), finding the shortest paths on real
road networks (Zeng and Church, 2009), and action planning (Bulitko and Lee, 2006; Bonet
and Geffner, 2001). A search algorithm is typically designed for, and suited to, a specific
type of problem; as a result, different search algorithms have been developed for different
problems. This paper applies the state-space search technique to the optimal execution of
incident commander strategy.
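As an illustrative sketch of the f(s) = g(s) + h(s) mechanics described above, a minimal A* implementation on an invented toy graph with an admissible heuristic might look as follows. This is the generic search scheme only, not the paper's algorithm.

```python
# Minimal A*: expand the state with smallest f(s) = g(s) + h(s).
# The graph, edge costs, and h values are invented; h is admissible
# (it never overestimates the true remaining cost to the goal D).
import heapq

def a_star(graph, h, start, goal):
    # frontier entries: (f, state, g, path so far)
    frontier = [(h[start], start, 0, [start])]
    best_g = {start: 0}
    while frontier:
        f, s, g, path = heapq.heappop(frontier)
        if s == goal:
            return path, g
        for nxt, cost in graph[s]:
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):   # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], nxt, g2, path + [nxt]))
    return None, float("inf")

graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
h = {"A": 3, "B": 2, "C": 1, "D": 0}   # h(s) <= h*(s) for every state
path, cost = a_star(graph, h, "A", "D")  # optimal path A-B-C-D, cost 3
```

Because h is admissible, the first time the goal is popped from the frontier the returned cost is optimal.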
The Markov decision process (MDP) is used for problems of planning under uncertainty
(Boutilier et al., 1999). An MDP models problems of sequential decision making that include
actions that transform a state into one of several possible successor states, with each possible
state transition occurring with some probability. The MDP guarantees an optimal solution,
but does not scale to large problems because of its high time and space complexity:
it requires calculating all feasible states. This paper proposes a faster approach based on
a further-reduced state space.
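To make the scalability point concrete, a tiny value-iteration sketch is shown below: every state's value must be computed, which is the source of the time and space complexity mentioned above. The two-state problem and its rewards are invented for illustration.

```python
# Value iteration on an invented 2-state MDP: 'working' reaches the
# absorbing state 'done' with probability 0.5 per step and earns
# reward 1 on arrival. Every state is swept on every iteration.
def value_iteration(states, actions, P, R, gamma=0.95, eps=1e-6):
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: best expected value over available actions
            best = max(sum(p * (R(s, a, s2) + gamma * V[s2])
                           for s2, p in P(s, a).items())
                       for a in actions(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

states = ["working", "done"]
actions = lambda s: ["act"]
P = lambda s, a: {"done": 1.0} if s == "done" else {"done": 0.5, "working": 0.5}
R = lambda s, a, s2: 1.0 if (s == "working" and s2 == "done") else 0.0
V = value_iteration(states, actions, P, R)
# fixed point: V(working) = 0.5*1 + 0.5*0.95*V(working) = 0.5/0.525
```

With only two states this is instant; in a SAR-sized state space the same full sweep becomes infeasible, which motivates the reduced-state-space search of this paper.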
In mixed-initiative planning systems, humans and machines collaborate in the
development and management of plans, each contributing the capabilities at which it is
best (Burstein and McDermott, 1996). Human-system interaction generates and refines
plans by adding and removing activities during execution, while minimising changes to a
reference plan or schedule (Ai-Chang et al., 2004). In strategic decision making, the
human provides high-level strategy guidance that the software system uses during real-
time execution to produce concrete plans and satisfy particular goals, which are not revealed
a priori but only during execution.
5 Methodology
We formulate the stated problem as a search problem. This paper presents a state-space
search algorithm that calculates an optimal macro action plan minimising
the total time required by the team to achieve the goal.
The calculated plan is composed of sequential strategic decisions, of which the first is
the alternative the commander should select for optimal execution of his strategy at that
time. In addition, a time window is calculated and associated with each strategic decision to
show when that strategic decision starts and when it finishes.
Designing the algorithm requires two essential steps: problem statement and problem
formulation. See Nourjou et al. (2014a) for more detailed information.
5.1 Problem statement
The problem addressed by this paper is stated as follows:
argmin_P f(P) = Δt                                                (1)
subject to
P = {d1, d2, . . . , dm}
T = {0, t2, t3, . . . , Δt}
d1 ∈ A
A = {a1, a2, . . . , an}
where
Δt: the total time to reach the goal
P: an optimal macro action plan that minimises Δt
f: the goal is achieved under the plan P
ti: the time at which decision di is made
d1: the best choice at time 0
A: the set of available alternatives.
5.2 Problem modelling
A complete data model is required to formulate and model the problem; moreover, this data
model enables us to design and implement the algorithm. This paper uses the SAP data model
to develop the algorithm. Figure 6 shows the part of the SAP data model that
presents the classes of strategy, thread, node, thread assignment, and alternative. The SAP
data model is discussed fully in Nourjou et al. (2014a).
6 Algorithm
The state-space search algorithm presented in Algorithm 1 is proposed by this
paper for solving the optimal execution problem of incident commander guidance. This
algorithm combines three key algorithms:
1 the A* search algorithm
2 the human guidance execution algorithm (Nourjou et al., 2014d)
3 the LoTeM task assignment algorithm (Nourjou et al., 2014c).
Owing to limited space, Algorithms 2 and 3 are only summarised here.
The algorithm generates new nodes and calculates the attributes associated with each node.
The computation starts from the initial node, which models the world's state at time 0, the
time of strategy execution. A sub-algorithm generates a new node for
each alternative available at that time. For a newly generated node, LoTeM task scheduling
and task execution are carried out under the strategic decision encoded by this node until this
strategic decision needs to be adapted. The adaptation time defines the node's finish time,
and the resulting time window is the interval during which the node's strategic decision is valid.
Then the attribute h is calculated. Finally, all these nodes are added to the state space in
order to extend the search tree.
Figure 6 A part of the SAP data model which is used in this paper for problem modelling
(see online version for colours)
Source: Nourjou et al. (2014a)
A sub-algorithm searches the state space to select the node that contains the minimum f. If
the goal is not met by this node, a new set of alternatives, the feasible choices for revising the
node's strategic decision, is calculated, and the algorithm treats these alternatives
as it treated the initial ones. If the node reaches the goal, the search tree is traced
from this leaf back to the root node to extract a macro action plan as the optimal one.
The proposed algorithm integrates a number of sub-algorithms to solve the main
problem. The following subsections describe the methods/algorithms used in the main
algorithm.
6.1 Subalgorithm: generate a new node
The objective of this algorithm is to generate a new node for each alternative, using the parent
node. The alternative plays the role of the node's strategic decision and remains fixed
during the node's lifetime. The attributes of the stateNode class are described as
follows:
• g0 is the time at which the node is generated. The g2 attribute of the parent node is
assigned to this attribute.
• g2 is the time at which this node ends (finishes). This parameter indicates when the
node's strategic decision needs to be adapted; it is calculated in the
LoTeM task assignment algorithm.
• f is the overall time from the root node to the goal node.
• h is an estimate of the cost (amount of time) required by the team to reach the
goal state from the current node.
• ThreadAssignments_node comprises a set of instances of the threadAssignment class.
An instance of the ‘threadAssignment’ class states which field units are assigned to
which thread. This property presents the strategic decision associated with this node
and is valid during the lifetime of this node.
• Segments_node encodes the LoTeM tasks.
If we execute this algorithm on the problem stated in Sections 1 and 2, ten new nodes
will be generated. The g0 property of all these nodes is set to 0, the initial
time.
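The stateNode attributes listed above can be sketched as a data structure. The paper's implementation is in C# (Section 7), so the Python dataclass below is only an illustrative rendering with field names taken from the text.

```python
# Illustrative sketch of the stateNode class described above.
# Field names follow the text; the implementation details are assumed.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ThreadAssignment:
    thread_no: int
    field_units: list                    # units assigned to this thread

@dataclass
class StateNode:
    g0: float                            # time the node is generated (parent's g2)
    g2: Optional[float] = None           # time the node's decision must be adapted
    h: float = 0.0                       # estimated remaining time to the goal
    thread_assignments: list = field(default_factory=list)  # the strategic decision
    segments: list = field(default_factory=list)            # LoTeM tasks
    parent: Optional["StateNode"] = None

    @property
    def f(self) -> float:
        # overall time estimate: elapsed time (g2) plus estimated remainder (h)
        return (self.g2 if self.g2 is not None else self.g0) + self.h

root = StateNode(g0=0.0, g2=96.0, h=800.0)   # f = 896.0
child = StateNode(g0=root.g2, parent=root)   # child starts where parent ends
```

The f property mirrors Section 6.3, where f is defined as the aggregation of the node's h and g2 properties.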
Algorithm 1 The heuristic search algorithm to make the best choice among a set of alternatives
available for optimal execution of human strategy in macro action planning
Data: A: a set of available alternatives.
Data: n0: the initial node, an instance of the "stateNode" class, that models the world's state at time 0.
Result: d1: the best alternative as the best strategic decision.
Result: t: the minimum overall time.
Result: p: the optimal macro action plan.
StateSpace ← ∅;
while true do
  for a ∈ A do
    n ← Generate_aNew_Node(a, n0);
    Assign_LoTeMTasks_toFieldUnits(n);
    n.h ← Calculate_h(n);
    StateSpace ← StateSpace ∪ {n};
  end
  n ← Select_a_Node(StateSpace);
  if n.g2 = null then
    d1 ← null;
    return [null, null, null];
  end
  if Is_the_goalNode(n) = true then
    p ← Extract_thePlan(n, StateSpace);
    d1 ← Select_theFirst_Member_from(p);
    t ← n.g2;
    return [d1, p, t];
  end
  n0 ← n;
  A ← Calculate_aNewSet_Alternatives(n0);
end
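The control flow of Algorithm 1 can be sketched in Python as follows. The domain-specific sub-algorithms (node generation, LoTeM assignment, the heuristic h, and alternative calculation) are replaced by toy stand-ins so the best-first loop itself can be exercised; all names and numbers are illustrative.

```python
# Runnable sketch of Algorithm 1's best-first loop: expand each alternative
# into a node, select the node with minimum f, and either extract the plan
# (goal reached) or recurse on that node's new alternatives.
def search(alternatives, n0, expand, is_goal):
    state_space = []
    while True:
        for a in alternatives:
            state_space.append(expand(a, n0))    # generate node per alternative
        i = min(range(len(state_space)), key=lambda j: state_space[j]["f"])
        n = state_space.pop(i)                   # Select_a_Node: smallest f
        if n["g2"] is None:                      # no feasible continuation
            return None, None, None
        if is_goal(n):
            plan, node = [], n                   # Extract_thePlan: leaf to root
            while node is not None:
                plan.append(node["decision"])
                node = node["parent"]
            plan.reverse()
            return plan[0], plan, n["g2"]        # d1, p, t
        n0, alternatives = n, n["new_alternatives"]

# Toy stand-in: an alternative is (name, duration, estimated_remaining);
# a node's g2 is elapsed time, h the estimate, goal when nothing remains.
def expand(a, parent):
    name, dur, rest = a
    g2 = (parent["g2"] if parent else 0) + dur
    nxt = [(name + "'", rest, 0)] if rest else []   # one follow-up alternative
    return {"decision": name, "parent": parent, "g2": g2,
            "h": rest, "f": g2 + rest, "new_alternatives": nxt}

def is_goal(n):
    return n["h"] == 0

d1, plan, total = search([("a1", 100, 50), ("a2", 60, 30)], None, expand, is_goal)
# d1 = "a2" (the better first decision), total = 90
```

The loop deliberately mirrors the structure of Algorithm 1: selected non-goal nodes become the new n0, and only their freshly calculated alternatives extend the tree.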
6.2 Subalgorithm: assign LoTeM tasks to field units (Nourjou et al., 2014c)
Algorithm 2 presents a heuristic algorithm that assigns LoTeM tasks to field units
and executes LoTeM tasks under the node's strategic decision. In addition, this algorithm
calculates the time at which the node's strategic decision needs to be revised, and
identifies the subset of field units that should be released from threads. The ‘Segments_node’
and ‘g2’ properties of the node are updated by this algorithm. This algorithm has been
discussed in detail by Nourjou et al. (2014c).
Algorithm 2 Heuristic algorithm for dynamic assignment of LoTeM tasks to field units within a
node
g2 ← g0;
L ← Create_emptyset_of_legalAssignment();
while true do
  Select_Efficient_Agents();
  T ← Select_Active_MacroTasks();
  A ← Select_Idle_Agents(g2);
  L ← L ∪ A;
  if Identify_theRelease_Time() = true then
    return;
  end
  while true do
    if |L| = 0 or |T| = 0 then
      break;
    end
    T2 ← Nominate_MacroTasks();
    U ← Calculate_Utilities();
    u ← Find_theHighest_Utilities();
    if u.benefitRatio >= 10 then
      Assign_Agents_toMacroTasks();
      continue;
    else
      T ← T − T2;
      continue;
    end
  end
  if |L| = 0 and |T| > 0 then
    g2 ← Calculate_earliestFinishTime();
    continue;
  else if |L| > 0 and |T| = 0 then
    if Identify_theRelease_Time() = true then
      return;
    end
  else
    g2 ← Update_ProblemState();
    if g2 = null then
      return;
    else
      continue;
    end
  end
end
Source: Nourjou et al. (2014c)
If this algorithm is executed on the 6th node, which contains the 6th alternative, the task
schedule presented in Figure 7 is obtained. The algorithm calculates that field
unit a0 needs to be released from its thread at time 96, because thread 1 no longer needs to
keep this field unit. As a result, the g2 attribute is updated to 96. Table 8 shows the
state of the LoTeM tasks at time 96 as calculated by this algorithm.
Algorithm 3 Automated algorithm for calculation of a new set of feasible alternatives in incident
commander guidance execution
Data: n: an entity of the "stateNode" class of the data model.
Data: S: an entity of the "strategy" class.
Data: D: the problem domain.
Data: p: the type of selection method.
Result: N: a set of entities of the "stateNode" class that present feasible alternatives.
for i ← 1 to |S.Threads| do
t ← S.Threads[i];
ta ← n.ThreadAssignments_node[i];
ta.MacroTasks_ofThread ←− f_Calculate_MacroTasks(n.Segments_node, t, D);
end
Na ←− ∅;
Nb ←− ∅;
Nb ←− Nb ∪ {n};
for i ← 1 to |S.Threads| do
t ← S.Threads[i];
for nb ∈ Nb do
ta ← nb.ThreadAssignments_node[i];
A1 ←− f_Identify_Agents_ResidentIn(ta);
A2 ←− f_Identify_Agents_ReceivedBy(ta);
for m0 ∈ ta.macroTask_ofThread do
tm0 ← m0.TemporalMacroTasks.Last();
tm0.LegalAssignments ←− f_Select_Efficient_Agents(m0, t, A1, A2);
end
M ←− ta.macroTask_ofThread;
C1 ←− f_Form_EfficientCoalitions(M, A1, A2);
C2 ←− f_Purify_Coalitions(C1, M);
C3 ←− f_Select_Coalitions(C2, p);
for j ← 1 to |C3| do
Na ←− Na ∪ {f_Generate_NewNode(C3[j], nb)};
end
end
Nb ←− Na;
Na ←− ∅;
end
N ←− Nb;
Source: Nourjou et al. (2014d)
6.3 Subalgorithm: calculate h
Figure 8 shows a heuristic algorithm that calculates the ‘h’ and ‘f’ properties of a node
at time g2. First, for each LoTeM task in the task environment formulated by the node, the
parameter TDD (total dependent duration) is calculated using two kinds of information:
1 the number of enabled tasks
2 the number of tasks dependent on this LoTeM task.
If a set of field units is assigned to a LoTeM task, this algorithm also takes the
finish time of that task into account. The ‘h’ property of the node is then the sum of all
total dependent durations of the LoTeM tasks, and the ‘f’ property of the node is
the sum of the ‘h’ and ‘g2’ properties. Table 9 is obtained by executing this
algorithm on the LoTeM tasks shown in Table 8.
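The aggregation step of this sub-algorithm can be sketched as follows, using the TDD values of Table 9 and the g2 = 96 adaptation time of node 6; the derivation of each individual TDD value (Figure 8) is not reimplemented here.

```python
# Aggregation step of the h sub-algorithm: h is the sum of the per-task
# 'total dependent duration' (TDD) values, and f = g2 + h.
# The TDD values below are taken from Table 9; how each value is derived
# from the enabled/dependent task counts (Figure 8) is not reproduced.
tdd = {1: 125, 2: 240, 3: 700, 4: 300, 5: 0, 6: 280,
       7: 150, 8: 0, 9: 440, 10: 90, 11: 390, 12: 420}

def calculate_h(tdd_by_task):
    # h for the node = aggregation of all TDDs in its task environment
    return sum(tdd_by_task.values())

g2 = 96                  # node 6 is adapted at time 96 (Section 6.2)
h = calculate_h(tdd)     # 3135
f = g2 + h               # 3231
```

Note that h here is only an estimate used to rank nodes; it is not the plan's final total time.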
Figure 7 Task/action scheduling in the 6th node (see online version for colours)
Table 8 State of LoTeM task environment forecasted at time 96 in 6th node
No. | Location (Road S.) | Task type | Not-yet-enabled amount | Enabled amount
1 s1 T0 0 25
2 s1 T2 0 4
3 s2 T1 0 5
4 s2 T2 10 5
5 s3 T0 0 0
6 s3 T1 0 8
7 s3 T2 2 5
8 s5 T0 0 0
9 s5 T1 0 7
10 s5 T2 5 3
11 s4 T1 0 9
12 s4 T2 5 7
6.4 How to expand the state space
The search space is a search tree in which each node presents a definite strategic decision.
The search tree is expanded with new nodes generated from the input alternatives; the
LoTeM task assignment algorithm itself does not produce new nodes. If the 10 newly
generated and calculated nodes are added to the search space, the result is the search
tree shown in Figure 9.
To reach a quick solution, the search tree is expanded only with the new alternatives calculated
when executing or re-executing the human strategy. The strategic decision-making
problem is prioritised above the task scheduling problem; as a result, a heuristic
task assignment algorithm that produces no nodes is used in the presented algorithm.
Figure 8 Heuristic algorithm for calculating the ‘Total Dependent Duration’ parameter which is
associated with three LoTeM tasks located in the same geographic area (see online
version for colours)
Table 9 The ‘total dependent duration’ variable calculated for LoTeM tasks of Table 8
LoTeM task no. Total dependent duration
1 125
2 240
3 700
4 300
5 0
6 280
7 150
8 0
9 440
10 90
11 390
12 420
6.5 Subalgorithm: select a node
The purpose of this algorithm is to select a node from the state space. Different
methods may be used to select a node according to some criteria. A selected node is used for two
purposes:
• to be tested as the goal state, or
• to expand the search tree.
The method used in this paper is to select the node with the smallest f from the search space,
as in the A* search algorithm. Executing this method on the calculated search
tree results in the selection of node 6.
6.6 Sub-algorithm: extract the plan
This algorithm is executed when the selected node is recognised as the goal node, i.e., the
node in which the global goal is satisfied. The objective of this algorithm is to extract the optimal
macro action plan from the search tree using this selected node (the leaf node). The leaf
node indicates the time at which the goal is met, and the root node indicates the best choice
for making the strategic decision at time 0. Therefore, the ‘g2’ property of the selected node
gives the minimum overall time (cost) in which the team can reach the objective from the
initial state.
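The leaf-to-root walk can be sketched as follows; the three-node chain and its decision labels are hypothetical, loosely modelled on the time windows of Table 11.

```python
# Sketch of plan extraction: walk from the goal (leaf) node back to the
# root, collecting each node's strategic decision with its [g0, g2) time
# window, then reverse so the root's decision (time 0) comes first.
def extract_plan(leaf):
    steps, node = [], leaf
    while node is not None:
        steps.append((node["decision"], node["g0"], node["g2"]))
        node = node["parent"]
    steps.reverse()                 # root first: the decision to make at time 0
    return steps

# Hypothetical three-node chain; times loosely follow Table 11.
root = {"decision": "alternative 6", "g0": 0, "g2": 96, "parent": None}
mid = {"decision": "decision 2", "g0": 96, "g2": 180, "parent": root}
leaf = {"decision": "decision 3", "g0": 180, "g2": 894, "parent": mid}
plan = extract_plan(leaf)
# plan[0] is the best choice at time 0; the leaf's g2 is the minimum total time
```

The leaf's g2 value plays the role of the minimum overall time described above.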
Figure 9 The search tree (state space) that includes 10 newly generated nodes
6.7 Subalgorithm: calculate a new set of alternatives
This algorithm is run when the selected node is not identified as the goal node. Algorithm 3
is used to calculate a set of new alternatives for revising the selected node's strategic decision
at time ‘g2’. The ‘g2’ property indicates the time at which a particular subset of field units, which
are assigned to a subset of threads via the strategic decision, should be released
from these threads, with each released field unit sent down into the lower thread.
At this time, we face the problem of re-assigning new field units to a
thread; these new field units have entered the thread from the higher thread. A thread
of this type can select and keep any sufficient subset of the field units and
send the unwanted ones into the lower thread, but there may be several such subsets available
to the thread. Consequently, a number of scenarios may exist, each presenting a different
distribution of field units among threads at a given time. An alternative is a candidate for
the final strategic decision. This algorithm has been discussed in detail by Nourjou
et al. (2014d).
As seen in Figure 7, the strategic decision of the 6th node releases field unit ‘a0’
from thread 1 at time 96 (Nourjou et al., 2014d). Figure 10 presents the set of new feasible
alternatives calculated by running this algorithm on this node. The newly calculated
alternatives are used to generate new nodes; Figure 11 shows the state space
expanded with these new alternatives.
7 Implementation
The algorithm presented in the previous section was implemented in GICoordinator. The
design of GICoordinator frames the central issue of this algorithm as collaboration
between the human (incident commander) and an intelligent software agent. The human is
equipped with a computer running an instance of GICoordinator. This system is a GIS-
based intelligent assistant that collaborates with the commander and supports him in
executing his guidance (Nourjou et al., 2014b). The C# programming language was used
to implement the core of GICoordinator.
Figure 10 Adaptation of the strategic decision associated with the 6th node at time 96 (see online
version for colours)
Source: Nourjou et al. (2014d)
Figure 11 The search tree which is expanded from node 6 (see online version for colours)
The proposed algorithm was executed on the scenario stated in Section 3. To evaluate the
proposed approach, we applied the algorithm to various strategies and to teams of
different sizes. For each scenario, we recorded the computation time, the number of
generated nodes, the minimum total time to reach the goal, and the type of search method.
8 Results
Tables 10 and 11 show the results calculated by executing the presented
algorithm on the scenario stated in Section 3. Alternative 6, shown in Table 7, was selected
as the best choice among the 10 alternatives for optimal execution of the human strategy in
the strategic decision-making process. This choice is the best strategic decision the
commander can select at time 0 for macro action planning. The time of 894 minutes is the minimum
total time (cost) in which the goal state is estimated to be achieved via the optimal macro action plan.
Figure 12 presents the resulting strategic decision via the GIS-based interface for the human.
Table 10 The best alternative (choice) selected for optimally executing the human strategy stated
in Section 3
Alternative no. Minimum total time (in minutes)
6 894
Table 11 The optimal macro action plan calculated for the scenario stated in Section 3
Strategic decision no. | Field units assigned to Thread 1 | Field units assigned to Thread 2 | Field units assigned to Thread 3 | Start time (assignment time)
1 (the Alternative 6) a0, a2 a6, a7 0
2 a2 a6, a7 a0 96
3 a2 a6, a7 180
4 a6, a7 a2 393
5 a6, a7 463
6 a6, a7 576
7 894
In addition, the optimal macro action plan was calculated by this algorithm. Seven sequential
strategic decisions form this plan: the first is alternative 6, and the others are calculated by the
algorithm. Each strategic decision presents which field units are assigned to which threads
during which time window.
8.1 Evaluation
To evaluate the efficiency of the proposed algorithm, we executed it on nine scenarios,
generated from three types of strategies and three team sizes; running each scenario with
three search methods yields 27 cases.
Figure 12 The best strategic decision made for the scenario stated in Section 3
(see online version for colours)
The task environment of the SAR scenario stated in Section 3 was used to generate
the 27 evaluation cases. The three human strategies were defined as:
1 strategy type I, composed of three threads with independent field units
2 strategy type II, composed of three semi-independent threads
3 strategy type III, a complex strategy composed of three threads containing all field
units.
The three search methods were:
1 the A* search method with α = 1.0
2 the A* search method with α = 0.3
3 the breadth-first algorithm, a variant of the A* search method with α = 0.0.
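The role of α is not spelled out in the text; one plausible reading, consistent with α = 0.0 reducing to breadth-first (uniform-cost) behaviour, is a weighted evaluation function f(s) = g(s) + α·h(s), sketched below with invented node values.

```python
# Assumed interpretation of the alpha parameter: a weighted evaluation
# f(s) = g(s) + alpha * h(s). With alpha = 0 the heuristic is ignored and
# selection degenerates to uniform-cost ("breadth-first" in Table 12);
# smaller alpha trusts an inadmissible h less. Node values are invented.
def make_f(alpha):
    return lambda g, h: g + alpha * h

nodes = [{"g": 60, "h": 50},   # cheap so far, long estimated remainder
         {"g": 100, "h": 0}]   # expensive so far, nothing remaining
pick = lambda f: min(nodes, key=lambda n: f(n["g"], n["h"]))

chosen_astar = pick(make_f(1.0))  # f values 110 vs 100: second node wins
chosen_bfs = pick(make_f(0.0))    # f values 60 vs 100: first node wins
```

The example shows how the same frontier can be expanded in a different order depending on α, which is why Table 12 reports different node counts and plan qualities per method.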
Table 12 shows the complete, feasible, and optimal solutions for the simulated scenarios. An
optimal macro action plan was calculated for each simulated scenario, and three types of
information were computed for each scenario's plan:
1 the computation time (run time)
2 the number of generated nodes
3 the minimum overall time.
The results show that the breadth-first search method can guarantee the optimality of the
solution, but it must generate a large state space containing a huge number of nodes in order
to find the goal node; consequently it cannot quickly calculate a plan for a complex
SAR problem. The A* search method finds a semi-optimal solution in a reasonable computation time,
but for a complex problem an appropriate value of α must be chosen.
Table 12 Evaluation of the proposed algorithm using the simulated scenarios
Team size | Strategy type | Minimum total time (in minutes) | Computation time (in milliseconds) | Number of generated nodes | Search method
4 I 1074 5 94 A* with α = 1.0
4 I 1074 5 93 A* with α = 0.3
4 I 1074 5 91 Breadth-First
4 II 1104 7 95 A* with α = 1.0
4 II 1104 7 97 A* with α = 0.3
4 II 1074 10 100 Breadth-First
4 III 931 9 102 A* with α = 1.0
4 III 931 9 97 A* with α = 0.3
4 III 931 22 141 Breadth-First
8 I 725 12 99 A* with α = 1.0
8 I 530 12 107 A* with α = 0.3
8 I 530 25 138 Breadth-First
8 II 541 12 105 A* with α = 1.0
8 II 541 12 103 A* with α = 0.3
8 II 540 32 167 Breadth-First
8 III 1082 9 109 A* with α = 1.0
8 III 729 15 121 A* with α = 0.3
8 III 558 38 168 Breadth-First
12 I 714 12 95 A* with α = 1.0
12 I 714 12 102 A* with α = 0.3
12 I 384 27 158 Breadth-First
12 II 374 12 106 A* with α = 1.0
12 II 374 12 106 A* with α = 0.3
12 II 372 30 179 Breadth-First
12 III 1072 9 178 A* with α = 1.0
12 III 661 17 230 A* with α = 0.3
12 III 403 59 314 Breadth-First
9 Conclusion
This paper applied search algorithms, which belong to artificial intelligence, to solve
the optimal decision-making problem addressed throughout this presentation. The presented
algorithm is capable of reasoning about which choice among a set of alternatives
optimally executes the commander's strategy in the strategic decision-making problem in
the SAR domain. The algorithm calculates a complete, feasible, and optimal solution for
the problem addressed by this paper. Three results achieved by executing the algorithm on
a simulated scenario include:
1 the macro action plan made
2 the overall time calculated
3 the alternative selected.
They support the commander’s decisions and assist the human in effective control and
coordination of field units’ actions by macro action planning.
Incident commanders need a quick and feasible solution, and optimisation of the
human strategy execution problem is a difficult problem. We certainly do not claim
optimality of the calculated results in the real world. We made some assumptions and
simplifications to model a complex problem, and to design and run the problem-solving
algorithm in a simulated environment. SAR carried out in a real disaster has
more sophisticated characteristics: road blockage disrupts SAR; synchronous actions
are required to accomplish a specific task; uncertainty in task duration, task state, and
field units' capabilities disrupts an action plan; a bad strategy defined by an
unprofessional commander can result in a secondary disaster; and there are different task types
as well as different interdependencies among tasks.
To decrease the size of search space, two techniques were used in our algorithm. The
strategy adaption algorithm keeps only two alternatives for each thread. The task assignment
algorithm is a greedy and heuristic algorithm that does not produce any node.
The proposed algorithm can be considered a search-based planning algorithm that makes a
macro action plan according to some optimisation criteria and the human strategy. It calculates
a measurement (a total time) associated with each choice available at time 0.
In addition, decision-theoretic planning, as planning under uncertainty, takes a huge
amount of time. A selected strategic decision constrains field units' actions for real-time
execution, and the commander monitors the world state to adapt this decision, replacing
it with a new one at the right time. As a result, better strategic decisions should be made over time.
This paper introduced the strategy definition to encode human initiative in the decision-
making process and to involve humans in the loop in the SAR domain. This approach
needs to be refined against the real requirements that commanders of different teams face
in the real world.
Learning from past experiences enables us to develop robust algorithms. Learning how
to adjust the task type matrix, the matrix of field units' actions, the parameter h of the search
algorithm, or a strategy defined by humans is an important open issue.
This algorithm can be used to develop an autonomous software system for automated
execution of the human strategy. The aim of such a system is to execute the human strategy and
continuously monitor the execution of the resulting strategic decisions until the goal is met.
The proposed algorithm can be refined to address new requirements. A joint
objective can maximise the global utility within a definite time, e.g., maximise the number
of rescued people within 72 hours, or minimise the number of human fatalities during 72 hours.
Multiple objectives can also be considered in this algorithm. Human strategy definition and
execution can further be applied to the resource allocation problem, e.g., allocating refuges or
emergency transportation.
The presented algorithm also enables a commander to specify different strategies and
execute the algorithm on each of them; the results provide a measurement with which the human
can evaluate and assess the quality of each defined strategy.
In this paper, a simplified scenario was simulated and used to present the application
of the proposed algorithm. Crisis response to a real disaster is a complex environment that
incident commanders must handle. Our algorithm can contribute to
real SAR by addressing the commanders' demands and requirements stated in this paper.
This paper focused on Phase 3 of the strategic decision-making technique stated in
Sections 1 and 2. It is notable that other parameters are also important in the optimisation
of SAR. Specifying a good strategy is a considerable issue in itself, because our algorithm is executed
under a human-defined strategy as human guidance/supervision.
Future work includes at least two directions:
1 develop a smart algorithm that adjusts the human strategy and recommends a
better one
2 apply machine learning techniques to correctly refine the matrix of task types and the matrix
of field units' actions, and to correctly estimate the variable h in the real world.
Acknowledgement
We would like to thank Mrs. Denise Badolato for improving the language of this paper.
References
Ai-Chang, M., Bresina, J., Charest, L., Chase, A., Hsu, J.C.J., Jonsson, A., Kanefsky, B., Morris, P.,
Rajan, K., Yglesias, J. and Chafin, B.G. (2004) ‘Mapgen: mixed-initiative planning and
scheduling for the mars exploration rover mission’, Intelligent Systems, IEEE, Vol. 19, No. 1,
pp.8–12.
Bigley, G.A. and Roberts, K.H. (2001) ‘The incident command system: high-reliability organizing
for complex and volatile task environments’, Academy of Management Journal, Vol. 44, No. 6,
pp.1281–1299.
Bonet, B. and Geffner, H. (2001) ‘Planning as heuristic search’, Artificial Intelligence, Vol. 129, No. 1,
pp.5–33.
Boutilier, C., Dean, T. and Hanks, S. (1999) ‘Decision-theoretic planning: structural assumptions and
computational leverage’, Journal of Artificial Intelligence Research, Vol. 11, No. 1, p.94.
Bulitko, V. and Lee, G. (2006) ‘Learning in real-time search: a unifying framework’, Journal of
Artificial Intelligence Research (JAIR), Vol. 25, pp.119–157.
Burstein, M.H. and McDermott, D.V. (1996) ‘Issues in the development of human-computer mixed-
initiative planning’, Advances in Psychology, Vol. 113, pp.285–303.
Chen, R., Sharman, R., Raghav Rao, H. and Upadhyaya, S.J. (2008) ‘Coordination in emergency
response management’, Communications of the ACM, Vol. 51, No. 5, pp.66–73.
Cohen, B.J., Chitta, S. and Likhachev, M. (2010) ‘Search-based planning for manipulation with motion
primitives’, 2010 IEEE International Conference on Robotics and Automation (ICRA), IEEE,
May, Anchorage, Alaska, pp.2902–2908.
FEMA (2012) FEMA Incident Action Planning Guide, http://www.uscg.mil/hq/cg5/cg534/nsarc/FEMA
%20Incident%20Action%20Planning%20Guide%20(IAP).pdf (Accessed on September, 2014).
Kitano, H. and Tadokoro, S. (2001) ‘RoboCup Rescue: a grand challenge for multiagent and intelligent
systems’, AI Magazine, Vol. 22, No. 1, p.39.
Kwok, Y-K. and Ahmad, I. (2005) ‘On multiprocessor task scheduling using efficient state
space search approaches’, Journal of Parallel and Distributed Computing, Vol. 65, No. 12,
pp.1515–1532.
Maheswaran, R.T., Szekely, P. and Sanchez, R. (2011) ‘Automated adaptation of strategic guidance in
multiagent coordination’, Agents in Principle, Agents in Practice, Springer Berlin Heidelberg,
pp.247–262.
Myers, K.L., Jarvis, P., Tyson, M. and Wolverton, M. (2003) ‘A mixed-initiative framework for robust
plan sketching’, ICAPS, Trento, Italy, pp.256–266.
Nanjanath, M., Erlandson, A.J., Andrist, S., Ragipindi, A., Mohammed, A.A., Sharma, A.S. and
Gini, M. (2010) ‘Decision and coordination strategies for robocup rescue agents’, Simulation,
Modeling, and Programming for Autonomous Robots, Springer Berlin Heidelberg, pp.473–484.
Nourjou, R., Szekely, P., Hatayama, M., Ghafory-Ashtiany, M. and Smith, S.F. (2014a) ‘Data model
of the strategic action planning and scheduling problem in a disaster response team’, Journal of
Disaster Research, Vol. 9, No. 3, pp.381–399.
Nourjou, R., Hatayama, M., Smith, S.F., Sadeghi, A. and Szekely, P. (2014b) Design of a GIS-based
Assistant Software Agent for the Incident Commander to Coordinate Emergency Response
Operations, arXiv preprint arXiv:1401.0282.
Nourjou, R., Smith, S.F., Hatayama, M., Okada, N. and Szekely, P. (2014c) ‘Dynamic assignment
of geospatial-temporal macro tasks to agents under human strategic decisions for centralized
scheduling in multi-agent systems’, International Journal of Machine Learning and Computing
(IJMLC), Vol. 4, No. 1, pp.39–46.
Nourjou, R., Smith, S.F., Hatayama, M. and Szekely, P. (2014d) ‘Intelligent algorithm for assignment
of agents to human strategy in centralized multi-agent coordination’, Journal of Software, Vol. 9,
No. 10, pp.2586–2597.
Nourjou, R. and Gelernter, J. (2015) ‘Distributed autonomous GIS to form teams for public safety’,
MobiGIS ’15 Proceedings of the 4th ACM SIGSPATIAL International Workshop on Mobile
Geographic Information Systems, ACM, Bellevue, WA, USA.
Paquet, S., Bernier, N. and Chaib-draa, B. (2004) ‘Comparison of different coordination strategies
for the RoboCup Rescue simulation’, Innovations in Applied Artificial Intelligence, Springer Berlin
Heidelberg, pp.987–996.
Russell, S. and Norvig, P. (2009) Artificial Intelligence: A Modern Approach, 3rd ed., Pearson, p.1152.
Schurr, N. and Tambe, M. (2008) ‘Using multi-agent teams to improve the training of incident
commanders’, Defence Industry Applications of Autonomous Agents and Multi-Agent Systems,
Birkhäuser Basel, pp.151–166.
Shahul, A., Semar, Z. and Sinnen, O. (2010) ‘Scheduling task graphs optimally with A*’, The Journal
of Supercomputing, Vol. 51, No. 3, pp.310–332.
Zeng, W. and Church, R.L. (2009) ‘Finding shortest paths on real road networks: the case for A*’,
International Journal of Geographical Information Science, Vol. 23, No. 4, pp.531–543.