Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting. Updated 4/16/15
Introduction
! The world is seldom predictable
● As a result, the corresponding deliberation models are always going to be incomplete
! Results in:
● Action failures
● Unexpected side effects of actions
● Exogenous events
! So far, we have been working with deterministic action models
● Each action, when applied in a particular state, results in exactly one state
● Formally: γ(s,a) returns a single state
● This doesn’t adequately capture the inherent uncertainty in domains
! Nondeterministic models provide more flexibility:
● An action, when applied in a state, may result in one of several possible states
● γ(s,a) returns a set of states
! Nondeterministic models allow modeling uncertainty in planning domains
Figure 5.1: A simple nondeterministic planning domain model
Definition 5.1. (Planning Domain) A nondeterministic planning domain Σ is the tuple (S, A, γ), where S is the finite set of states, A is the finite set of actions, and γ : S × A → 2^S is the state transition function.
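As a minimal sketch of Definition 5.1, the transition function γ can be stored as a table from (state, action) pairs to sets of successor states. The transitions below are taken from the harbor walkthrough later in these slides; the dict encoding and the helper names `gamma` and `applicable` are assumptions of this sketch, not part of the slides.

```python
from typing import Dict, List, Set, Tuple

State = str
Action = str

# gamma maps a (state, action) pair to the SET of states that may result.
# The transitions follow the harbor walkthrough in these slides; storing
# them in a dict is an assumption of this sketch.
GAMMA: Dict[Tuple[State, Action], Set[State]] = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}


def gamma(s: State, a: Action) -> Set[State]:
    """Return the possible successors of s under a (empty if a is not applicable)."""
    return GAMMA.get((s, a), set())


def applicable(s: State) -> List[Action]:
    """Return the actions that have at least one outcome in s, sorted by name."""
    return sorted(a for (state, a) in GAMMA if state == s)


print(sorted(gamma("hbr", "park")))  # several possible outcomes
print(applicable("par2"))
```

A deterministic domain is the special case in which every set in the table has exactly one element.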
Search Spaces in Nondeterministic Planning
! The search space of deterministic planning is modeled as a graph
● Nodes are states, edges are actions
! For planning with nondeterministic domains, the search space is no longer an ordinary graph
● Instead, it is an AND/OR graph
! An AND/OR graph has the following elements:
● OR branches: which action to apply in a state?
● AND branches: which state does the action lead to?
! We have control over which action to apply (OR branches)
! We don’t have control over the resulting state (AND branches)
A simple nondeterministic model of a harbor management facility
Plans in Nondeterministic Domains
! We need the notion of a conditional plan
● A plan that accounts for the various possible outcomes in a given state
! The actor can sense the actual action outcome among the possible ones, and act according to the conditional structure of the plan
! A possible representation:
● A policy: a partial function that maps states to actions
! If a policy π maps a state s to an action a
● that means we should perform a whenever we are in state s
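The sense-then-act loop described above can be sketched as follows. The policy is a plain dict (a partial function), and the environment is simulated by a random choice among the outcomes. The function name `execute`, the `max_steps` cap, and the use of `random.Random` to stand in for the environment are all assumptions of this sketch.

```python
import random

# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}

# A policy is a partial function from states to actions: here, a dict.
PI1 = {"ship": "unload", "hbr": "park", "par1": "deliver"}


def execute(policy, gamma, s, rng, max_steps=20):
    """Follow the policy from s, sensing each outcome; return the visited states."""
    trace = [s]
    for _ in range(max_steps):  # cap is a safety net for cyclic policies
        if s not in policy:     # the policy is partial: stop where it is undefined
            break
        outcomes = sorted(gamma[(s, policy[s])])
        s = rng.choice(outcomes)  # the environment, not the actor, picks the outcome
        trace.append(s)
    return trace


trace = execute(PI1, GAMMA, "ship", random.Random(0))
print(" -> ".join(trace))
```

Different seeds give different traces, which is exactly why a single action sequence is not enough and a state-indexed policy is needed.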
Definitions Over Policies
! In deterministic planning, we can compute the states reachable by a sequence of actions using γ
● s ∪ γ(s, a1) ∪ γ(γ(s, a1), a2) ∪ ...
! We need a few extra definitions to do similar checks in nondeterministic planning
! Reachable states: γ̂(s, π)
● All states that can be produced by starting at s and executing π
! Example: γ̂(pos(item)=on_ship, π1), where
● π1(pos(item)=on_ship) = unload
● π1(pos(item)=at_harbor) = park
● π1(pos(item)=parking1) = deliver
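The reachable set γ̂(s, π) can be computed with an ordinary graph traversal that follows, from each state, all outcomes of the action the policy prescribes. This is a sketch under two assumptions: the short state names (ship, hbr, par1, ...) stand for the pos(item) values of the harbor example, and the starting state s is counted as reachable from itself.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
PI1 = {"ship": "unload", "hbr": "park", "par1": "deliver"}


def reachable(s, policy, gamma):
    """gamma-hat(s, pi): every state that executing pi from s may produce (s included)."""
    seen = {s}
    stack = [s]
    while stack:
        cur = stack.pop()
        if cur not in policy:  # pi is undefined here, so execution stops
            continue
        for nxt in gamma[(cur, policy[cur])]:  # follow ALL outcomes
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(sorted(reachable("ship", PI1, GAMMA)))
```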
Definitions Over Policies
! We also need to check whether a plan reaches the goal
● This requires calculating the final states of a policy
! leaves(s, π): the set of final states reached by policy π starting from state s
! leaves(s, π) = {s′ | s′ ∈ γ̂(s, π) and s′ ∉ Dom(π)}
! Example: leaves(pos(item)=on_ship, π1), where
● π1(pos(item)=on_ship) = unload
● π1(pos(item)=at_harbor) = park
● π1(pos(item)=parking1) = deliver
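The definition of leaves translates almost literally into a set comprehension over the reachable states. The sketch below assumes the same dict encoding of the harbor transitions (short state names standing for the pos(item) values) and counts the start state as reachable.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
PI1 = {"ship": "unload", "hbr": "park", "par1": "deliver"}


def reachable(s, policy, gamma):
    """States that executing the policy from s may produce (s included)."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        if cur in policy:
            for nxt in gamma[(cur, policy[cur])]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


def leaves(s, policy, gamma):
    """Reachable states in which the policy prescribes no action."""
    return {x for x in reachable(s, policy, gamma) if x not in policy}


print(sorted(leaves("ship", PI1, GAMMA)))
```

Under this encoding, π1 has five leaves, of which only g1 and g2 are goal states; the others are states where π1 simply stops.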
Planning Problems and Solutions
! Let Σ = (S,A,γ) be a planning domain
! A planning problem P is a 3-tuple P = (Σ,s0,Sg)
● s0 ∈ S is the initial state
● Sg ⊆ S is the set of goal states
! Note: the previous book had a set of initial states S0
● This allowed uncertainty about the initial state
● The current definition is equivalent
▸ One can easily be translated into the other
• How?
▸ Introduce a new start action such that γ(s0, start) = S0
! Solutions are not as straightforward to define as in deterministic planning
● Depending on the actual action outcomes, a plan might or might not achieve the goal
● We can define different criteria of success, and hence many types of solutions
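The translation from a set of initial states S0 to a single initial state can be sketched in a few lines. The names `s_init` and `start`, and the function `add_start_action`, are hypothetical; the only thing taken from the slide is the idea of adding an action with γ(s0, start) = S0.

```python
def add_start_action(gamma, initial_states, s0="s_init", start="start"):
    """Return (new_gamma, s0) where a fresh state s0 nondeterministically
    leads to any of the original initial states via a single start action."""
    if any(state == s0 for (state, _action) in gamma):
        raise ValueError("the fresh initial state must not already occur in the domain")
    new_gamma = dict(gamma)  # leave the caller's table untouched
    new_gamma[(s0, start)] = set(initial_states)
    return new_gamma, s0


# Tiny hypothetical domain with two possible initial states, "a" and "c".
toy = {("a", "go"): {"b"}, ("c", "go"): {"b"}}
extended, s0 = add_start_action(toy, {"a", "c"})
print(s0, sorted(extended[(s0, "start")]))
```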
Type 1: Solution
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a solution iff
leaves(s0,π) ∩ Sg ≠ ∅
! A policy that may lead to a goal
● In other words: at least one sequence of nondeterministic outcomes leads to a goal state
! Example:
● s0 = {pos(item) = on_ship}
● Sg = {pos(item) = gate1, pos(item) = gate2}
! Policy π1 is a solution
● π1(pos(item)=on_ship) = unload
● π1(pos(item)=at_harbor) = park
● π1(pos(item)=parking1) = deliver
! Reason: at least one of the paths in the reachability graph of π1 leads to a state in Sg
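The solution test is a one-line check once leaves is available. The sketch below assumes the dict encoding of the harbor transitions used in the walkthrough (short state names, with g1 and g2 standing for gate1 and gate2); `is_solution` is a name chosen for this sketch.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
GOALS = {"g1", "g2"}
PI1 = {"ship": "unload", "hbr": "park", "par1": "deliver"}


def leaves(s, policy, gamma):
    """Reachable states (s included) in which the policy prescribes no action."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        if cur in policy:
            for nxt in gamma[(cur, policy[cur])]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return {x for x in seen if x not in policy}


def is_solution(policy, gamma, s0, goals):
    """Type 1: at least one leaf of the policy is a goal state."""
    return bool(leaves(s0, policy, gamma) & goals)


print(is_solution(PI1, GAMMA, "ship", GOALS))
print(is_solution({"ship": "unload"}, GAMMA, "ship", GOALS))
```

The second call uses a policy that stops at hbr, so no goal is among its leaves and it is not a solution.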
Type 2a: Cyclic Safe Solutions
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a cyclic safe solution iff
(1) leaves(s0, π) ⊆ Sg ∧
(2) (∀s ∈ γ̂(s0, π) (leaves(s, π) ∩ Sg ≠ ∅)) ∧
(3) Graph(s0, π) is cyclic
Meaning of conditions:
(1) No non-solution leaves
(2) Safe solution
(3) Reachability graph is cyclic
Cyclic safe solution: a safe solution with cycles
! π2 is a cyclic safe solution
How does having cycles affect the level of safety?
! Execution could go through the cycle infinitely many times
! If execution eventually gets out of the loop, it is guaranteed to reach a goal state
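Conditions (1) to (3) can be checked mechanically. The sketch below classifies a policy as not a solution, unsafe, cyclic safe, or acyclic safe. `PI_FULL` is the six-entry policy that the Find-Safe-Solution walkthrough in these slides ends with; whether it coincides with the slide’s π2 is an assumption, since π2 is only shown in a figure. All function names are choices of this sketch.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
GOALS = {"g1", "g2"}
PI1 = {"ship": "unload", "hbr": "park", "par1": "deliver"}
PI_FULL = {**PI1, "tr2": "move", "par2": "back", "tr1": "move"}


def reachable(s, policy, gamma):
    """States that executing the policy from s may produce (s included)."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        if cur in policy:
            for nxt in gamma[(cur, policy[cur])]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


def leaves(s, policy, gamma):
    return {x for x in reachable(s, policy, gamma) if x not in policy}


def has_cycle(s0, policy, gamma):
    """Depth-first search for a back edge in the reachability graph."""
    on_path, done = set(), set()

    def visit(s):
        on_path.add(s)
        for nxt in (gamma[(s, policy[s])] if s in policy else ()):
            if nxt in on_path:
                return True
            if nxt not in done and visit(nxt):
                return True
        on_path.discard(s)
        done.add(s)
        return False

    return visit(s0)


def classify(s0, policy, gamma, goals):
    if not leaves(s0, policy, gamma) & goals:
        return "not a solution"
    safe = leaves(s0, policy, gamma) <= goals and all(      # condition (1)
        leaves(s, policy, gamma) & goals                    # condition (2)
        for s in reachable(s0, policy, gamma)
    )
    if not safe:
        return "unsafe solution"
    cyclic = has_cycle(s0, policy, gamma)                   # condition (3)
    return "cyclic safe solution" if cyclic else "acyclic safe solution"


print(classify("ship", PI1, GAMMA, GOALS))
print(classify("ship", PI_FULL, GAMMA, GOALS))
```

Under this encoding the cycle in `PI_FULL` comes from par2, whose action back returns to hbr.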
Summary of Solution Types
Figure 5.6: Different Kinds of Solutions: A Class Diagram
this chapter             nondeterminism            probabilistic
solutions                weak solutions            -
unsafe solutions         -                         improper solutions
safe solutions           strong cyclic solutions   proper solutions
cyclic safe solutions    -                         -
acyclic safe solutions   strong solutions          -
! Unsafe solutions aren’t of much interest to us
● They do not guarantee achievement of the goal
! Acyclic safe solutions are the best: complete assurance that we’ll get to the goal
! Cyclic safe solutions are also good, but provide a weaker degree of assurance
● Execution can get into loops
● However, assuming that we don’t stay in a loop forever, we are guaranteed to achieve the goal
AND/OR Graph Search Algorithms
! Nondeterministic planning search spaces are represented as AND/OR graphs
● Nodes: states
● OR branches: actions applicable in a state (consider one)
● AND branches: successor states from a state-action pair (consider ALL)
! The reachability graph of a solution policy includes one action at each OR branch and all of the action’s outcomes at each AND branch
! The first set of planning algorithms will do AND/OR graph search
● Simple extensions of Forward-search from Chapter 2
[Figure: AND/OR graph of the harbor domain, with states ship, hbr, par1, par2, tr1, tr2, tr3, g1, g2 and actions unload, park, del, move, back]
Find-Solution: Algorithm to Find Solutions
Deterministic planning algorithm from Chapter 2:
Forward-search (Σ, s0, g)
  s ← s0; π ← ⟨⟩
  loop
    if s satisfies g then return π
    A′ ← {a ∈ A | a is applicable in s}
    if A′ = ∅ then return failure
    nondeterministically choose a ∈ A′
    s ← γ(s, a); π ← π.a
A nondeterministic forward-search planning algorithm.
Nondeterministic planning algorithm:
Find-Solution (Σ, s0, Sg)
  π ← ∅; s ← s0; Visited ← {s0}                // initialization
  loop
    if s ∈ Sg then return π                     // goal test
    A′ ← Applicable(s)
    if A′ = ∅ then return failure               // dead-end test
    nondeterministically choose a ∈ A′          // branching
    nondeterministically choose s′ ∈ γ(s, a)    // progression
    if s′ ∈ Visited then return failure         // loop check
    π(s) ← a; Visited ← Visited ∪ {s′}; s ← s′
The two algorithms are identical except that Find-Solution has:
● An additional nondeterministic choice to decide which action outcome to plan for next
● Cycle checking
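A runnable approximation of Find-Solution has to replace the two nondeterministic choices with something deterministic. The sketch below uses depth-first backtracking over actions and outcomes in sorted order; that substitution, the dict encoding of the harbor transitions, and the name `find_solution` are assumptions of this sketch rather than the slides’ algorithm verbatim.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
GOALS = {"g1", "g2"}


def find_solution(gamma, s0, goals):
    """Return a policy with at least one path from s0 to a goal, or None."""

    def applicable(s):
        return sorted(a for (state, a) in gamma if state == s)

    def search(s, policy, visited):
        if s in goals:                          # goal test
            return policy
        for a in applicable(s):                 # branching (no actions: dead end)
            for s2 in sorted(gamma[(s, a)]):    # progression: pick ONE outcome
                if s2 in visited:               # loop check
                    continue
                result = search(s2, {**policy, s: a}, visited | {s2})
                if result is not None:
                    return result
        return None                             # every choice failed: backtrack

    return search(s0, {}, frozenset([s0]))


policy = find_solution(GAMMA, "ship", GOALS)
print(policy)
```

Because outcomes are tried in sorted order, this run plans for g1 directly after deliver. Other choices of outcome give other, equally valid, solutions.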
Find-Solution: Example Run on the Harbor Domain
● Algorithm: Find-Solution (Figure 5.7: Planning for Solutions by Forward-search)
! Start: s = ship
● Policy: (empty)
! Choose unload in ship; its only outcome is hbr
● Policy: ship: unload
! Choose park in hbr; the possible outcomes are par1, tr1, par2
● Assume the outcome par1 is chosen
● Policy: ship: unload; hbr: park
! Choose deliver in par1; the possible outcomes are g1, g2, tr2
● Assume the outcome tr2 is chosen
● Policy: ship: unload; hbr: park; par1: deliver
! Choose move in tr2; the possible outcomes are g1, g2
● Assume one of these outcomes is chosen
● Policy: ship: unload; hbr: park; par1: deliver; tr2: move
! Reached a goal state. Terminate here.
! This policy is returned
Find-Safe-Solution
Find-Safe-Solution (Σ, s0, Sg)
  π ← ∅
  Frontier ← {s0}
  while Frontier ≠ ∅ do
    if Frontier ⊆ Sg then return π                  // goal reached by all leaves
    for every s ∈ Frontier do
      remove s from Frontier
      if Find-Solution(Σ, s, Sg) = failure          // nonterminating loop
        then return failure
      nondeterministically choose a ∈ Applicable(s) // select an action
      π ← π ∪ (s, a)
      Frontier ← Frontier ∪ (γ(s, a) \ Dom(π))      // expand
  return failure
Figure 5.8: Planning for Safe Solutions by Forward-search.
! Frontier keeps track of unexpanded states, much like A*
! Uses Find-Solution to see if a solution exists. If there is no solution, then there is no safe solution.
! The only nondeterministic choice is the action
● ALL possible successor states are added to Frontier
● This is not a choice, since a safe solution needs to guard against all eventualities
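The frontier-based search can be sketched in Python with two deliberate deviations, both assumptions of this sketch. First, backtracking over the action choice replaces the nondeterministic choice. Second, instead of the Find-Solution pre-check, the finished policy is verified directly against the safe-solution definition (every reachable state can still reach a goal). Goal states on the frontier are left unexpanded.

```python
# Harbor transitions as they appear in the walkthrough of these slides.
GAMMA = {
    ("ship", "unload"): {"hbr"},
    ("hbr", "park"): {"par1", "tr1", "par2"},
    ("par1", "deliver"): {"g1", "g2", "tr2"},
    ("tr2", "move"): {"g1", "g2"},
    ("par2", "back"): {"hbr"},
    ("tr1", "move"): {"par1", "par2"},
}
GOALS = {"g1", "g2"}


def reachable(s, policy, gamma):
    """States that executing the policy from s may produce (s included)."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        if cur in policy:
            for nxt in gamma[(cur, policy[cur])]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen


def is_safe(policy, gamma, s0, goals):
    """Every state reachable from s0 must still be able to reach a goal leaf."""
    def can_reach_goal(s):
        return any(x in goals and x not in policy for x in reachable(s, policy, gamma))
    return all(can_reach_goal(s) for s in reachable(s0, policy, gamma))


def find_safe_solution(gamma, s0, goals):
    """Return a safe policy for (gamma, s0, goals), or None if the search fails."""

    def applicable(s):
        return sorted(a for (state, a) in gamma if state == s)

    def search(policy, frontier):
        pending = sorted(frontier - goals)
        if not pending:                      # Frontier is a subset of Sg
            return policy if is_safe(policy, gamma, s0, goals) else None
        s = pending[0]
        for a in applicable(s):              # backtrack over the action choice
            new_policy = {**policy, s: a}
            # expand: add ALL outcomes that do not yet have an action
            new_frontier = (frontier - {s}) | (gamma[(s, a)] - set(new_policy))
            result = search(new_policy, new_frontier)
            if result is not None:
                return result
        return None                          # dead end or all choices unsafe

    return search({}, frozenset([s0]))


safe_policy = find_safe_solution(GAMMA, "ship", GOALS)
print(safe_policy)
```

In the encoding above each state has a single applicable action, so the only safe policy assigns an action to all six non-goal states. The final `is_safe` check matters in domains where an action can trap execution in a loop with no exit.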
Find-Safe-Solution: Example Run on the Harbor Domain
! Start
● Policy: (empty)
● Frontier: ship
! Expand ship with unload
● Policy: ship: unload
● Frontier: hbr
! Expand hbr with park
● Policy: ship: unload; hbr: park
● Frontier: par2, tr1, par1
● Unlike Find-Solution, we need to solve for all successor states, so all are added to Frontier
! Expand par1 with deliver
● Policy: ship: unload; hbr: park; par1: deliver
● Frontier: par2, tr1, g1, g2, tr2
● g1 and g2 are goal states, so Find-Safe-Solution doesn’t solve for them further
! Expand tr2 with move (outcomes g1, g2)
● Policy: ship: unload; hbr: park; par1: deliver; tr2: move
● Frontier: par2, tr1, g1, g2
! Expand par2 with back (outcome hbr)
● Policy: ship: unload; hbr: park; par1: deliver; tr2: move; par2: back
● Frontier: tr1, g1, g2
! Expand tr1 with move (outcomes par1, par2)
● Policy: ship: unload; hbr: park; par1: deliver; tr2: move; par2: back; tr1: move
● Frontier: g1, g2
! The test Frontier ⊆ Sg is now satisfied
● This policy is returned
Find-Acyclic-Solution
Find-Acyclic-Solution (Σ, s0, Sg)
    π ← ∅
    Frontier ← {s0}
    while Frontier ≠ ∅ do
        if Frontier ⊆ Sg then return π    // goal reached by all leaves
        for every s ∈ Frontier do
            remove s from Frontier
            if Frontier ∩ Dom(π) ≠ ∅    // loop checking
                then return failure
            choose nondeterministically a ∈ Applicable(s)    // select an action
            π ← π ∪ (s, a)
            Frontier ← Frontier ∪ γ(s, a)    // expand
    return failure
Figure 5.9: Planning for Safe Acyclic Solutions by Forward-search.
Similar to Find-Safe-Solution except:
● Cycle check: makes sure that the action applied in the previous iteration didn't lead to a state already considered by π
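The search for an acyclic safe solution can be sketched in runnable form. This is an illustrative variant rather than a transcription of Figure 5.9: the nondeterministic choice is replaced by depth-first backtracking, and the loop check is made against the states on the current path (an assumption, so that a state such as tr1 may rejoin already-solved states without being reported as a cycle). `GAMMA` and `find_acyclic_solution` are hypothetical names, and the transitions are read off the harbor figure.

```python
# gamma: state -> action -> list of possible successor states (assumed from the figure).
GAMMA = {
    "ship": {"unload": ["hbr"]},
    "hbr":  {"park": ["par1", "par2", "tr1"]},
    "par1": {"deliver": ["g1", "g2", "tr2"]},
    "par2": {"back": ["hbr"], "deliver": ["g1", "tr3"]},
    "tr1":  {"move": ["par1", "par2"]},
    "tr2":  {"move": ["g1", "g2"]},
    "tr3":  {"move": ["g1", "g2"]},
}
GOALS = {"g1", "g2"}


def find_acyclic_solution(gamma, s0, goals):
    """Depth-first AND/OR search; returns {state: action}, or None on failure."""

    def solve(s, path, policy):
        if s in goals or s in policy:
            return policy        # a goal, or a state already fully solved
        if s in path:
            return None          # s is its own ancestor: a cycle
        for action, outcomes in gamma.get(s, {}).items():   # OR branch
            candidate = policy
            for nxt in outcomes:                              # AND branch
                candidate = solve(nxt, path | {s}, candidate)
                if candidate is None:
                    break        # one outcome is unsolvable: try the next action
            else:
                # Every outcome was solved, so s is recorded only now;
                # the policy therefore never contains half-solved states.
                return {**candidate, s: action}
        return None

    return solve(s0, frozenset(), {})


policy = find_acyclic_solution(GAMMA, "ship", GOALS)
```

Because a state is added to the policy only after all outcomes of its action are solved, the insertion order is a topological order of the policy graph, which is what rules out cycles. In the assumed domain the action back is rejected for par2 because it leads to hbr, which is still on the current path.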
Find-Acyclic-Soln
Policy: (empty)
Frontier: ship
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
Find-Acyclic-Soln
Frontier: hbr
Policy:
ship: unload
Find-Acyclic-Soln
Frontier: par2, tr1, par1
Policy:
ship: unload
hbr: park
Unlike Find-Solution, we need to solve for all successor states, so all of them are added to Frontier.
Find-Acyclic-Soln
Frontier: par2, tr1, g1, g2, tr2
Policy:
ship: unload
hbr: park
par1: deliver
Find-Acyclic-Soln
Frontier: par2, tr1, g1, g2, tr2
Policy:
ship: unload
hbr: park
par1: deliver
g1 and g2 are goal states, so the algorithm does not solve for them further.
Find-Acyclic-Soln
Frontier: par2, tr1, g1, g2
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
Find-Acyclic-Soln
Frontier: tr1, g1, g2, tr3
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
par2: deliver
Note: the algorithm doesn't consider back for par2 because it creates a cycle.
Find-Acyclic-Soln
Frontier: tr1, g1, g2
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
par2: deliver
tr3: move
Find-Acyclic-Soln
Frontier: g1, g2
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
par2: deliver
tr3: move
tr1: move
Find-Acyclic-Soln
Frontier: g1, g2
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
par2: deliver
tr3: move
tr1: move
Frontier ⊆ Sg is satisfied.
Find-Acyclic-Soln
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
par2: deliver
tr3: move
tr1: move
This policy is returned.
Guided-Find-Safe-Solution
Guided-Find-Safe-Solution (Σ, s0, Sg)
    if s0 ∈ Sg then return(∅)
    if Applicable(s0) = ∅ then return(failure)
    π ← ∅
    loop
        Q ← leaves(s0, π) \ Sg
        if Q = ∅ then return(π)
        select arbitrarily s ∈ Q
        π′ ← Find-Solution(Σ, s, Sg)
        if π′ ≠ failure then do
            π ← π ∪ {(s, a) ∈ π′ | s ∉ Dom(π)}
        else for every s′ and a such that s ∈ γ(s′, a) do
            π ← π \ {(s′, a)}
            make a not applicable in s′
Figure 5.17: Guided Planning for a Safe Solution
● Look at all the leaves of π. A safe solution requires a goal state to be reachable from every node, so plan from each non-solution leaf.
● Incorporate the solution π′ found into the overall policy π.
● If no solution is found from s, the goals are unreachable from s. Remove all elements of π that could result in s.
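The loop of Figure 5.17 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Find-Solution is stood in for by a breadth-first search, "select arbitrarily" is replaced by `min` so that runs are repeatable, and a guard that returns failure when no solution exists from s0 itself is added (an assumption; it is not in the figure as shown). `GAMMA` is read off the harbor figure, and all function names are hypothetical.

```python
from collections import deque

GAMMA = {  # state -> action -> possible successors (assumed from the figure)
    "ship": {"unload": ["hbr"]},
    "hbr":  {"park": ["par1", "par2", "tr1"]},
    "par1": {"deliver": ["g1", "g2", "tr2"]},
    "par2": {"back": ["hbr"], "deliver": ["g1", "tr3"]},
    "tr1":  {"move": ["par1", "par2"]},
    "tr2":  {"move": ["g1", "g2"]},
    "tr3":  {"move": ["g1", "g2"]},
}
GOALS = {"g1", "g2"}


def find_solution(gamma, s, goals):
    """Breadth-first search for ONE path from s to a goal, as {state: action}."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u in goals:
            policy = {}
            while parent[u] is not None:   # walk back to s along the path
                u, a = parent[u]
                policy[u] = a
            return policy
        for a, outcomes in gamma.get(u, {}).items():
            for v in outcomes:
                if v not in parent:
                    parent[v] = (u, a)
                    queue.append(v)
    return None


def leaves(gamma, policy, s0):
    """States reachable from s0 under policy for which policy defines no action."""
    seen, stack, found = {s0}, [s0], set()
    while stack:
        s = stack.pop()
        if s not in policy:
            found.add(s)
            continue
        for v in gamma[s][policy[s]]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return found


def guided_find_safe_solution(gamma, s0, goals):
    gamma = {s: dict(acts) for s, acts in gamma.items()}  # private copy: actions get pruned
    if s0 in goals:
        return {}
    if not gamma.get(s0):
        return None
    policy = {}
    while True:
        open_leaves = leaves(gamma, policy, s0) - goals
        if not open_leaves:
            return policy
        s = min(open_leaves)                 # repeatable stand-in for "select arbitrarily"
        sub = find_solution(gamma, s, goals)
        if sub is not None:
            for state, action in sub.items():
                policy.setdefault(state, action)   # keep entries already in the policy
        elif s == s0:
            return None                      # assumed guard, not in the figure
        else:
            for pred, acts in gamma.items():
                for a in [a for a, outs in acts.items() if s in outs]:
                    del acts[a]              # make a not applicable in pred
                    if policy.get(pred) == a:
                        del policy[pred]


policy = guided_find_safe_solution(GAMMA, "ship", GOALS)
```

The sketch only mirrors the control flow of the figure; it does not by itself establish that the returned policy is safe in every domain.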
Determinization Techniques
! High-Level Approach:
● Transform the nondeterministic model into a deterministic one
▸ Each nondeterministic action translates to several deterministic actions, one for each possible successor state
● Use CSV planners to solve these problems
● Stitch the solutions together into a policy
! Advantages:
● Deterministic planning problems are efficiently solvable
● Allows us to leverage the features that CSV planners provide
▸ Heuristics, landmarks, etc.
[Figure: the nondeterministic action park in hbr, with possible outcomes par1, par2, and tr1, is replaced by the deterministic actions park1, park2, and park3, one per outcome.]
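The transformation described above, one deterministic copy of an action per possible successor state, can be sketched in a few lines. This is an illustrative sketch: `mk_deterministic` borrows its name from the later pseudocode, but its signature, the numbering scheme for the copies, and the `origin` map are assumptions, as is the small transition table.

```python
def mk_deterministic(gamma):
    """Split every nondeterministic action into single-outcome copies.

    Returns (gamma_d, origin): gamma_d[s][copy] is the unique successor of a
    copy, and origin[copy] names the action the copy was derived from.
    """
    gamma_d, origin = {}, {}
    for s, actions in gamma.items():
        gamma_d[s] = {}
        for a, outcomes in actions.items():
            for i, nxt in enumerate(outcomes, start=1):
                copy = f"{a}{i}"          # park -> park1, park2, park3
                gamma_d[s][copy] = nxt
                origin[copy] = a
    return gamma_d, origin


# Assumed fragment of the harbor domain: state -> action -> possible successors.
gamma = {
    "ship": {"unload": ["hbr"]},
    "hbr":  {"park": ["par1", "par2", "tr1"]},
}
gamma_d, origin = mk_deterministic(gamma)
```

Keeping the `origin` map makes it possible to translate a plan over the copies back into the original nondeterministic actions when the plan is turned into a policy.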
Find-Safe-Solution-by-Determinization
Find-Safe-Solution-by-Determinization (Σ, s0, Sg)
    if s0 ∈ Sg then return(∅)
    if Applicable(s0) = ∅ then return(failure)
    π ← ∅
    Σd ← mk-deterministic(Σ)    // determinization
    loop
        Q ← leaves(s0, π) \ Sg
        if Q = ∅ then do
            π ← π \ {(s, a) ∈ π | s ∉ γ̂(s0, π)}    // clean policy
            return(π)
        select s ∈ Q
        p′ ← Forward-search(Σd, s, Sg)    // classical planner
        if p′ ≠ fail then do
            π′ ← Plan2policy(p′, s)    // plan2policy transformation
            π ← π ∪ {(s, a) ∈ π′ | s ∉ Dom(π)}
        else for every s′ and a such that s ∈ γ(s′, a) do
            π ← π \ {(s′, a)}
            make the actions in the determinization of a not applicable in s′    // action elimination
● Compute the determinization of the domain.
● If there are no non-solution leaf states, we're done. The policy needs to be cleaned up to remove unreachable states.
● Invoke a CSV planner on the deterministic model.
● Transform the deterministic plan into a policy.
● Action elimination.
Plan2Policy
Plan2policy(p = ⟨a1, …, an⟩, s)
    π ← ∅
    loop for i from 1 to n do
        π ← π ∪ (s, ai)
        s ← γd(s, ai)
    return π
Figure 5.19: Transformation of a sequential plan into a corresponding policy
● Relatively straightforward: transforms the solution into a policy representation.
● Note: p needs to be an acyclic plan. To ensure this, Forward-Search (see previous slide) needs to return an acyclic plan.
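As a concrete illustration of Plan2policy, the sketch below walks a sequential plan through a deterministic transition table and records one (state, action) pair per step. The table `GAMMA_D`, the copy names such as `park1`, and the explicit error on a repeated state are assumptions made for the example; the error simply makes the acyclicity requirement visible.

```python
# Deterministic transitions: state -> action -> single successor (assumed data).
GAMMA_D = {
    "ship": {"unload1": "hbr"},
    "hbr":  {"park1": "par1", "park2": "par2", "park3": "tr1"},
    "par1": {"deliver1": "g1", "deliver2": "g2", "deliver3": "tr2"},
}


def plan2policy(plan, s, gamma_d):
    """Turn the sequential plan <a1, ..., an>, executed from s, into {state: action}."""
    policy = {}
    for action in plan:
        if s in policy:
            # A repeated state would overwrite an earlier entry: the plan is cyclic.
            raise ValueError(f"plan revisits state {s!r}; it must be acyclic")
        policy[s] = action
        s = gamma_d[s][action]     # deterministic successor
    return policy


policy = plan2policy(["unload1", "park1", "deliver1"], "ship", GAMMA_D)
```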
Action Elimination
        if p′ ≠ fail then do
            π′ ← Plan2policy(p′, s)    // plan2policy transformation
            π ← π ∪ {(s, a) ∈ π′ | s ∉ Dom(π)}
        else for every s′ and a such that s ∈ γ(s′, a) do
            π ← π \ {(s′, a)}
            make the actions in the determinization of a not applicable in s′    // action elimination
Figure 5.18: Planning for Safe Solutions by Determinization
Fragment of Find-Safe-Solution-by-Determinization that has to do with action elimination
Triggered if there is no deterministic solution from s
Informally it does the following:
• Update π to ensure s is never reached
• Ensure that no future call to Forward-Search returns a solution going through s
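The two effects of action elimination listed above can be sketched on a toy example. This is an assumption-laden sketch: determinized copies are represented as `(action, index)` tuples, the policy is stored over those copies, and the state `pit` and the action `tow` are hypothetical additions used only to create a dead end and an alternative.

```python
def eliminate_actions(s, gamma, gamma_d, policy):
    """Remove every one-step way of reaching the dead-end state s.

    For each (pred, a) with s among the outcomes of a in pred: drop pred from
    the policy if it uses a copy of a, and delete all copies of a from gamma_d.
    """
    for pred, actions in gamma.items():
        for a, outcomes in actions.items():
            if s in outcomes:
                if pred in policy and policy[pred][0] == a:
                    del policy[pred]                       # s is never reached via pi
                gamma_d[pred] = {copy: nxt
                                 for copy, nxt in gamma_d[pred].items()
                                 if copy[0] != a}          # planner can no longer use a


# Toy data (hypothetical): park may end in the dead end "pit", tow cannot.
gamma = {"hbr": {"park": ["par1", "pit"], "tow": ["par1"]}}
gamma_d = {"hbr": {("park", 1): "par1", ("park", 2): "pit", ("tow", 1): "par1"}}
policy = {"hbr": ("park", 1)}

eliminate_actions("pit", gamma, gamma_d, policy)
```

Note that every copy of park is removed, including the one leading to par1: the nondeterministic action as a whole is unsafe in hbr, since the actor cannot choose which outcome occurs.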
Online Approaches with Nondeterministic Models
! Interleaving planning and acting is important
● Planning models are approximate, so execution seldom works out as planned
● Large problems mean long planning times, so the two need to be interleaved
! This motivation is even stronger in nondeterministic domains
● Generating safe solutions takes a long time when there are many state variables, actions, etc.
! Therefore interleaving planning and acting helps reduce complexity
● Instead of coming up with a complete policy, generate a partial policy that tells us the next few actions to perform
Figure 5.20: Off-line vs. Run Time Search Spaces
Issues With Interleaving Planning and Acting
! Need to identify good actions without exploring the entire search space
● Can be done using heuristic estimates
! Handling dead ends:
● When the lookahead is not enough, the actor can get trapped in a dead end
▸ Planning fully would have revealed the dead end
▸ E.g., a robot goes down a steep incline that it cannot climb back up
● Not a problem in safely explorable domains
▸ Goal states are reachable from all situations
! Despite these issues, interleaving planning and acting is an essential alternative to purely offline planning
Acting Procedure: Run-Lookahead
Run-Lookahead(Σ, s0, Sg)
    s ← s0
    while s ∉ Sg and Applicable(s) ≠ ∅ do
        π ← Lookahead(s, θ)
        apply partial plan π
        s ← observe current state
Figure 5.21: Interleaving planning and execution by look-ahead
There are different ways in which the generated plan can be partial and different ways in which planning and acting can be interleaved. Indeed the procedure Run-Lookahead is parametric along two dimensions:
The first parametric dimension is in the call to the look-ahead planning step, i.e., Lookahead(s, θ). The parameter θ determines the way in which the generated plan π is partial. For instance, it can be partial since the lookahead is bounded, i.e., the forward search is performed for a bounded number of …
The call to Lookahead is where the planner is invoked. θ is a context-dependent parameter that restricts the search for a solution and hence determines how π is partial:
• θ could be a bound on the search depth
• θ could be a limitation on planning time
• θ could also limit the number of action outcomes considered
• Special case: only ONE outcome == Find-Solution
! Two ways to perform lookahead:
● Lookahead with a bounded number of steps: handle all action outcomes, but only up to a certain depth
● Lookahead by determinization: solve the problem fully, but possibly unsafely due to determinization
FF-Replan: Lookahead by Determinization
FF-Replan (Σ, s, Sg)
    while s ∉ Sg and Applicable(s) ≠ ∅ do
        if πd undefined for s then do
            πd ← Forward-search(Σd, s, Sg)
        apply action πd(s)
        s ← observe resulting state
Figure 5.22: Online determinization planning and acting algorithm.
… lookahead and partial number of outcomes, in any arbitrary way.
The second parametric dimension is in the application of the partial plan that has been generated, i.e., apply the partial plan π. Independently of the lookahead, we can still execute π in a partial way. Suppose for instance that we have generated a sequential plan of length n; we can decide to apply m ≤ n steps.
Run Forward-Search on a determinized version of the problem. Then start executing the (possibly unsafe) policy until we cannot execute it anymore.
Properties:
• If the domain is safely explorable, then FF-Replan will get to a goal state.
• If the domain has dead ends, then there are no guarantees.
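The replanning loop can be illustrated with a small simulation. The sketch below is an assumption-based illustration, not the FF-Replan implementation: the classical planner is stood in for by a breadth-first search over the all-outcomes determinization, the environment is simulated by a caller-supplied `observe` function that picks one outcome, and the transition table is read off the harbor figure. All names are hypothetical.

```python
from collections import deque

GAMMA = {  # state -> action -> possible successors (assumed from the figure)
    "ship": {"unload": ["hbr"]},
    "hbr":  {"park": ["par1", "par2", "tr1"]},
    "par1": {"deliver": ["g1", "g2", "tr2"]},
    "par2": {"back": ["hbr"], "deliver": ["g1", "tr3"]},
    "tr1":  {"move": ["par1", "par2"]},
    "tr2":  {"move": ["g1", "g2"]},
    "tr3":  {"move": ["g1", "g2"]},
}
GOALS = {"g1", "g2"}


def forward_search(gamma, s, goals):
    """Search as if each outcome could be chosen; returns {state: action} or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u in goals:
            policy = {}
            while parent[u] is not None:   # walk back to s along the path
                u, a = parent[u]
                policy[u] = a
            return policy
        for a, outcomes in gamma.get(u, {}).items():
            for v in outcomes:
                if v not in parent:
                    parent[v] = (u, a)
                    queue.append(v)
    return None


def ff_replan(gamma, s, goals, observe):
    """Act until a goal or a dead end; returns the sequence of visited states."""
    policy, trace = {}, [s]
    while s not in goals and gamma.get(s):
        if s not in policy:                  # off the planned path: replan from s
            plan = forward_search(gamma, s, goals)
            if plan is None:
                break                        # dead end: no guarantee
            policy = plan
        s = observe(gamma[s][policy[s]])     # the environment picks the outcome
        trace.append(s)
    return trace


# Adversarial stand-in for the environment: always the last listed outcome.
trace = ff_replan(GAMMA, "ship", GOALS, observe=lambda outcomes: outcomes[-1])
```

With this environment the actor is pushed off its plan three times (at tr1, par2, and tr3) and replans each time, yet it still ends in a goal state, which matches the property stated for safely explorable domains.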