1"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
This"work"is"licensed"under"a"CreaBve"Commons"AEribuBonGNonCommercialGShareAlike"4.0"InternaBonal"License."
Chapter(5((
Delibera.on(with(Nondeterminis.c(Domain(
Models(
Dana S. Nau and Vikas Shivashankar
University of Maryland
2"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Introduc.on(
!  World seldom predictable
●  corresponding deliberation models as a result always going to be incomplete
!  Results in:
●  Action failures
●  Unexpected side effects of actions
●  Exogenous events
!  So far, been working with deterministic action models
●  Each action, when applied in a particular state, results in only one state
●  Formally: γ(s,a) returns a single state
●  Doesn’t adequately support inherent uncertainty in domains
!  Nondeterministic models provide more flexibility:
●  An action, when applied in a state, may result in one among several possible
states
●  γ(s,a) returns a set of states
!  Nondeterministic models allow modeling uncertainty in planning domains
3"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Why(Model(Uncertainty?((
!  We’ve seen ways to handle these situations using deterministic models
●  Generate plans for the nominal case
●  Execute, and monitor
●  Detect failure, and recover
4"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Why(Model(Uncertainty?(
Answer: nondeterministic models have several advantages
!  More accurate modeling
!  Plan for uncertainty ahead of time, instead of during execution
!  No nominal case in certain environments:
●  Think of throwing a dice/tossing a coin
●  Online payments where choice of payment left to user
!  However, comes at a cost:
●  More complicated, both conceptually and computationally
●  Since you need to take all different possibilities into account
5"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Figure 5.1: A simple nondeterministic planning domain model
Definition 5.1. (Planning Domain) A nondeterministic planning do-
main ⌃ is the tuple (S, A, ), where S is the finite set of states, A is the
finite set of actions, and : S ⇥ A ! 2S is the state transition function.
Search(Spaces(in(Nondeterminis.c(Planning(
!  Search space of deterministic planning
modeled as a graph
●  Nodes are states, edges are actions
!  For planning with nondeterministic domains,
search space no longer a graph
●  Instead its now an AND/OR graph
!  AND/OR graph has following elements:
●  OR branches: which action
to apply in a state?
●  AND branches: which state does the
action lead to?
!  Have control over which action to apply (OR
branches)
!  Don’t have control over resulting state (AND
branches)
A simple nondeterministic model of a
harbor management facility
6"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Nondeterminis.c(Planning(Domains(
!  3-tuple (S, A, γ)
●  S – finite set of states
●  A – finite set of actions
●  γ: S × A → 2S
!  Search space of a simple harbor
management domain
●  Only one state variable:
▸  pos(item)
●  Nodes
represent
possible values
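As a concrete illustration, here is a minimal Python sketch of how such a domain could be encoded, with a dictionary for γ. The state names (on_ship, at_harbor, parking1, ...) and the transitions listed are an approximation of the harbor example read off the figures, not necessarily the book's exact domain.

# A minimal sketch of a nondeterministic domain Sigma = (S, A, gamma).
# gamma maps a (state, action) pair to the SET of possible successor states.
GAMMA = {
    ("on_ship",   "unload"):  {"at_harbor"},                         # deterministic
    ("at_harbor", "park"):    {"parking1", "parking2", "transit1"},  # nondeterministic
    ("parking1",  "deliver"): {"gate1", "gate2", "transit2"},
    ("parking2",  "back"):    {"at_harbor"},
    ("transit2",  "move"):    {"gate1", "gate2"},
}

S = {s for (s, _) in GAMMA} | {s2 for succ in GAMMA.values() for s2 in succ}
A = {a for (_, a) in GAMMA}

def gamma(s, a):
    """gamma(s, a): set of possible successor states (empty if a is inapplicable in s)."""
    return GAMMA.get((s, a), set())

def applicable(s):
    """Applicable(s) = {a in A | gamma(s, a) is nonempty}."""
    return {a for a in A if gamma(s, a)}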
7"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Ac.ons(in(Nondeterminis.c(Planning(Domains(
!  An action a applicable in state s iff γ(s,a) ≠ ∅
!  Applicable(s) is set of all actions applicable in s
●  Applicable(s) = {a ∈ A | γ(s, a) ≠ ∅}
!  Five actions in example
●  Two deterministic:
▸  unload, back
●  Three nondeterministic:
▸  park move, deliver
8"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Ac.ons(in(Nondeterminis.c(Planning(Domains(
!  park stores items in storage areas parking1 or
parking2
●  Nondeterminism used to model possibility of
▸  storing item in parking1
▸  storing item in parking2
▸  having to temporarily
move item in transit1
if space is unavailable
●  Once space is available: move action
9"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Plans(in(Nondeterminis.c(Domains(
!  Structure of plans must be different from
the deterministic case
●  Previously, sequence of actions
!  Doesn’t work here
●  Why?
10"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Plans(in(Nondeterminis.c(Domains(
!  Need the notion of a conditional plan
●  plans that account for various
possibilities in a given state
!  Can sense the actual action outcome
among the possible ones, and act
according to the conditional
structure of plan
!  A possible representation:
●  a policy:
partial function
that maps
states to actions
!  If a policy π maps a state s to an action a
●  that means we should perform a
whenever we are in state s
11"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Policies:(A(Representa.on(of(Plans(in(
Nondeterminis.c(Planning(
!  Example policy π1 for the harbor management
problem:
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
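Since a policy is just a partial function from states to actions, one simple way to represent it is a dictionary. A small sketch, using the state names from the domain sketch above:

# Policy pi1, represented as a partial function (dict) from states to actions.
pi1 = {
    "on_ship":   "unload",
    "at_harbor": "park",
    "parking1":  "deliver",
}

def next_action(pi, s):
    """Return pi(s), or None if the policy is undefined at s (s is a leaf)."""
    return pi.get(s)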
12"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Policies:(A(Representa.on(of(Plans(in(
Nondeterminis.c(Planning(
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
13"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5!  In deterministic planning, can compute states
reachable by sequence of actions using γ
●  s ∪ γ (s, a1)∪ γ (γ (s,a1), a2) ∪ ...
!  Need few extra definitions to do similar
checks in nondeterministic planning
!  Reachable States: (s,π)
●  All states that can be produced by
starting at s and executing π
!  Example: (pos(item)=on"ship,π1)
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
Defini.ons(Over(Policies(
14"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Defini.ons(Over(Policies(
!  In deterministic planning, can compute states
reachable by sequence of actions using γ
●  s ∪ γ (s, a1)∪ γ (γ (s,a1), a2) ∪ ...
!  Need few extra definitions to do similar
checks in nondeterministic planning
!  Reachable States: (s,π)
●  All states that can be produced by
starting at s and executing π
!  Example: (pos(item)=on"ship,π1)
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
15"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Defini.ons(Over(Policies(
!  Need to also check whether plan reaches goal
●  Requires calculating final states of policy
!  leaves (s,π): set of final states reached by
policy π starting from state s
!  leaves(s, π) = {s′ | s′ ∈ ︎ (s, π) and
s′ not in Dom(π)}
!  Example:
●  leaves (pos(item)=on"ship,"π1)
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
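A straightforward way to compute γ̂(s,π) and leaves(s,π) is a breadth-first traversal that only follows the actions prescribed by the policy. A minimal sketch, reusing the gamma() helper and the dict policy representation assumed above:

from collections import deque

def reachable_states(s, pi):
    """gamma-hat(s, pi): all states reachable from s when acting according to pi."""
    seen, queue = {s}, deque([s])
    while queue:
        cur = queue.popleft()
        if cur not in pi:                # policy undefined here: cur is a leaf
            continue
        for nxt in gamma(cur, pi[cur]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def leaves(s, pi):
    """leaves(s, pi) = {s' in gamma-hat(s, pi) | s' not in Dom(pi)}."""
    return {s2 for s2 in reachable_states(s, pi) if s2 not in pi}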
16"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Policies:(A(Representa.on(of(Plans(in(
Nondeterminis.c(Planning(
!  Reachability graph, Graph(s,π)
●  Graph of all possible state transitions if we
execute π starting at s
●  Graph(s,π) = { γ︎(s,π), E |
s′ ∈ γ︎(s, π), s′′ ∈ π(s′), and (s′,s′′) ∈ E}
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
17"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Policies:(A(Representa.on(of(Plans(in(
Nondeterminis.c(Planning(
●  π2"(pos(item)=on"ship)"="unload"
●  π2(pos(item)=at"harbor)"="park"
●  π2(pos(item)=parking1)"="deliver"
●  π2(pos(item)=parking2)"="back"
●  π2(pos(item)=transit1)"="move"
●  π2(pos(item)=transit2)"="move;""
18"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Policies:(A(Representa.on(of(Plans(in(
Nondeterminis.c(Planning(
●  π3"(pos(item)=on"ship)"="unload"
●  π3(pos(item)=at"harbor)"="park"
●  π3(pos(item)=parking1)"="deliver"
●  π3(pos(item)=parking2)"="deliver"
●  π3(pos(item)=transit1)"="move"
●  π3(pos(item)=transit2)"="move"
●  π3(pos(item)=transit3)"="move""
19"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Planning(Problems(and(Solu.ons(
!  Let Σ = (S,A,γ) be a planning domain
!  A planning problem P is a 3-tuple P = (Σ,s0,Sg)
●  s0 ∈ S is the initial state
●  Sg ⊆ S is set of goal states
!  Note: previous book had set of initial states S0
●  Allowed uncertainty about initial state
●  Current definition is equivalent
▸  Can easily translate one to the other
•  How?
20"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Planning(Problems(and(Solu.ons(
!  Let Σ = (S,A,γ) be a planning domain
!  A planning problem P is a 3-tuple P = (Σ,s0,Sg)
●  s0 ∈ S is the initial state
●  Sg ⊆ S is set of goal states
!  Note: previous book had set of initial states S0
●  Allowed uncertainty about initial state
●  Current definition is equivalent
▸  Can easily translate one to the other
•  How?
▸  Introduce a new start action such that γ (s0, start) = S0
!  Solutions: not as straightforward to define as Deterministic Planning
●  Based on actual action outcomes, might or might not achieve goal
●  Can define different criteria of success – many types of solutions
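The translation mentioned above is easy to sketch: add one artificial initial state and one artificial action whose possible outcomes are exactly the original initial states. A minimal illustration (the names dummy_s0 and start are only for illustration):

def add_start_action(gamma_table, S0):
    """Turn a problem with a set of initial states S0 into one with a single
    initial state, by adding gamma(dummy_s0, start) = S0."""
    new_gamma = dict(gamma_table)
    new_gamma[("dummy_s0", "start")] = set(S0)
    return "dummy_s0", new_gamma

# Usage: s0, GAMMA2 = add_start_action(GAMMA, {"on_ship", "at_harbor"})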
21"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(1:(Solu.on(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a solution iff
leaves (s0,π) ∩ Sg ≠ ∅
!  A policy that may lead to a goal
●  In other words: at least one sequence of
nondeterministic outcomes leads to a goal state
!  Example:
●  s0 = {pos(item)"="on_ship}
●  Sg = {pos(item)"="gate1,"pos(item)"="gate2}
!  Policy π1 is a solution
●  π1 (pos(item)=on"ship) = unload"
●  π1(pos(item)=at"harbor) = park"
●  π1(pos(item)=parking1) = deliver"
!  Reason: At least one of the paths in
reachability graph of π1 leads to a state in Sg
22"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2:(Safe(Solu.on(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution
iff
∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)
Safe solution: a solution
in which a goal state is
reachable from every state
in the reachability graph
!  Is π1 a safe solution?
Condition for solutionNeeds to hold for
all reachable
states
23"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2:(Safe(Solu.on(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution
iff
∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)
Safe solution: a solution
in which a goal state is
reachable from every state
in the reachability graph
!  Is π1 a safe solution?
●  No
Condition for solutionNeeds to hold for
all reachable
states
24"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2:(Safe(Solu.on(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution
iff
∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)
Safe solution: a solution
in which a goal state is
reachable from every state
in the reachability graph
!  Is π2 a safe solution?
Condition for solutionNeeds to hold for
all reachable
states
25"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2:(Safe(Solu.on(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution
iff
∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)
Safe solution: a solution
in which a goal state is
reachable from every state
in the reachability graph
!  Is π2 a safe solution?
●  Yes
Condition for solutionNeeds to hold for
all reachable
states
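Given the reachable_states() and leaves() helpers sketched earlier, both criteria can be checked directly. A minimal sketch:

def is_solution(pi, s0, goals):
    """Type 1: at least one leaf of pi from s0 is a goal state."""
    return bool(leaves(s0, pi) & goals)

def is_safe_solution(pi, s0, goals):
    """Type 2: from every state reachable under pi, some leaf is a goal state."""
    return all(leaves(s, pi) & goals for s in reachable_states(s0, pi))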
26"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2a:(Cyclic(Safe(Solu.ons(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a cyclic safe
solution iff
(1)  leaves(s0, π) ⊆ Sg ∧
(2)  (∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅))
(3)  Graph(s0, π) is cyclic
Meaning of Conditions:
(1)  No non-solution leaves
(2)  Safe solution
(3)  Reachability graph is cyclic
Cyclic Safe solution: a
safe solution with cycles
!  π2 is a cyclic safe solution
How does having cycles affect level of safety?
27"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2a:(Cyclic(Safe(Solu.ons(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a cyclic safe
solution iff
(1)  leaves(s0, π) ⊆ Sg ∧
(2)  (∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅))
(3)  Graph(s0, π) is cyclic
Meaning of Conditions:
(1)  No non-solution leaves
(2)  Safe solution
(3)  Reachability graph is cyclic
Cyclic Safe solution: a
safe solution with cycles
!  π2 is a cyclic safe solution
How does having cycles affect level of safety?
!  could go though cycle infinitely many times
!  If execution gets out of loop eventually,
guaranteed to reach goal state
28"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Type(2b:(Acyclic(Safe(Solu.ons(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a acyclic safe
solution iff
(1) leaves(s0, π) ⊆ Sg ∧
(2) Graph(s0, π) is cyclic
Meaning of Conditions:
(1)  No non-solution leaves
(2)  Reachability graph is acyclic
Acyclic Safe Solution: a
safe solution without cycles
!  π3 is an acyclic safe solution
!  Acyclic policy completely safe
●  No matter what happens, guaranteed to
eventually reach the goal
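Checking condition (2) amounts to cycle detection on the reachability graph. A minimal sketch using a depth-first search over the edges induced by π and γ, reusing gamma() and leaves() from above:

def has_cycle(pi, s0):
    """True iff Graph(s0, pi) contains a cycle (DFS; gray = on the current path)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def dfs(s):
        color[s] = GRAY
        for nxt in (gamma(s, pi[s]) if s in pi else set()):
            c = color.get(nxt, WHITE)
            if c == GRAY or (c == WHITE and dfs(nxt)):   # back edge, or cycle found deeper
                return True
        color[s] = BLACK
        return False

    return dfs(s0)

def is_acyclic_safe_solution(pi, s0, goals):
    """Conditions (1) and (2): all leaves are goals, and Graph(s0, pi) is acyclic."""
    return leaves(s0, pi) <= goals and not has_cycle(pi, s0)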
29"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Unsafe(Solu.ons(
Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is an unsafe
solution iff
(1)  (leaves(s0, π) ∩ Sg ≠ ∅)
(2)  ((∃s ∈ leaves(s0, π) | s is not in Sg) ∨ (∃s ∈ γ︎(s0,π) | leaves(s,π)=∅))
Either there is a non-solution
leaf state
Or you get caught in

an infinite loop
Both of these are bad events
30"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Summary(of(Solu.on(Types(
Section 5.3 173
Figure 5.6: Di↵erent Kinds of Solutions: A Class Diagram
nondeterminism probabilistic
solutions weak solutions -
unsafe solutions - improper solutions
safe solutions strong cyclic solutions proper solutions
cyclic safe solutions - -
acyclic safe solutions strong solutions -
!  Unsafe Solutions aren’t of much interest to us
●  Do not guarantee achievement of goal
!  Acyclic Safe Solutions are the best – complete assurance that we’ll get to the goal
!  Cyclic Safe Solutions also good, but provide a weaker degree of assurance
●  We can get into loops
●  However, assuming that we don’t stay in the loop forever, guaranteed to
achieve the goal
31"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
SOLVING(NONDETERMINISTIC(
PLANNING(PROBLEMS(
32"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
AND/OR(Graph(Search(Algorithms(
!  Nondeterministic planning search
spaces represented as AND/OR
graphs
●  nodes: states
●  OR branches: actions applicable
in a state (consider 1)
●  AND branches: successor states
from an state-action pair
(consider ALL)
!  Reachability graph of a solution
policy includes one action at each OR
branch and all of the action’s
outcomes at each AND branch
!  First set of planning algorithms will
do AND/OR graph search
●  Simple extensions of ForwardG
Search"from Chapter 2
ship
hbr
par1
tr1
par2
park
tr2
g2
g1
del
tr3
g1
hbr
del
back
par1
par2
move
unload
33"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
Chapter 2
Forward-search (⌃, s0, g)
s s0; ⇡ hi
loop
if s0 satisfies g then return ⇡
A0 {a 2 A | a is applicable in s}
if A0 = ? then return failure
nondeterministically choose a 2 A0
s (s, a); ⇡ ⇡.a
A nondeterministic forward-search planning algorithm.
iscuss properties that are shared by all algorithms that do a
of the same search space, even though those algorithms may
es of that tree in di↵erent orders. The rest of this section
of those algorithms.
olution to a planning problem may require a huge computa-
r an arbitrary CSV planning problem the task is PSPACE-
]. To reduce the computational e↵ort, several of the search
his section incorporate heuristic techniques for selecting which
174 Chapter 5
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initialization
loop
if s 2 Sg then return ⇡ // goal test
A0 Applicable(s)
if A0 = ? then return failure // dead-end test
nondeterministically choose a 2 A0 // branching
nondeterministically choose s0 2 (s, a)// progression
if s0 2 Visited then return failure // loop check
⇡(s) a; Visited Visited [ {s0}; s s0
174 Chapter 5
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initialization
loop
if s 2 Sg then return ⇡ // goal test
A0 Applicable(s)
if A0 = ? then return failure // dead-end test
nondeterministically choose a 2 A0 // branching
nondeterministically choose s0 2 (s, a)// progression
if s0 2 Visited then return failure // loop check
⇡(s) a; Visited Visited [ {s0}; s s0
Additional nondeterministic
choice to decide which action

outcome to plan for next
Cycle-checking
Identical Algorithms except:
Deterministic Planning algorithm

from Chapter 2
Nondeterministic Planning

algorithm
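In the pseudocode, "nondeterministically choose" stands for a choice point that is explored by backtracking (as noted in the properties slide below). A minimal Python sketch of Find-Solution with explicit backtracking over both choice points, reusing the gamma() and applicable() helpers assumed earlier:

def find_solution(s0, goals):
    """Find-Solution with the two nondeterministic choices implemented by
    depth-first backtracking. Returns a policy (dict) with at least one
    execution trace from s0 to a goal, or None if there is none."""
    def dfs(s, visited, pi):
        if s in goals:                          # goal test
            return pi
        for a in applicable(s):                 # branching: choose an action
            for s2 in gamma(s, a):              # progression: choose an outcome
                if s2 in visited:               # loop check
                    continue
                result = dfs(s2, visited | {s2}, {**pi, s: a})
                if result is not None:
                    return result
        return None                             # dead end: backtrack
    return dfs(s0, {s0}, {})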
34"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
164 Chapter 5
Policy:
35"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
unload
164 Chapter 5
Policy:
ship: unload
36"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
par1
tr1
par2
park
unload
164 Chapter 5
Policy:
ship: unload
hbr: park
Assume this

outcome is
chosen
37"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
par1
tr1
par2
park
unload
164 Chapter 5
Policy:
ship: unload
hbr: park
par1: deliver
g1
g2
del
tr2
Assume this

outcome is
chosen
38"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
par1
tr1
par2
park
unload
164 Chapter 5
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
g1
g2
del
tr2
g1 g2
move
Assume this

outcome is
chosen
39"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
par1
tr1
par2
park
unload
164 Chapter 5
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
g1
g2
del
tr2
g1 g2
move
Reached a

goal state.

Terminate here.
40"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Algorithm(to(find(Solu.ons(
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initia
loop
if s 2 Sg then return ⇡ // goal
A0 Applicable(s)
if A0 = ? then return failure // dead-
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progr
if s0 2 Visited then return failure // loop
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the foll
the di↵erence in algorithms from deterministic dom
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
174
Find-Solution (⌃, s0, Sg)
⇡ ?; s s0; Visited {s0} // initial
loop
if s 2 Sg then return ⇡ // goal t
A0 Applicable(s)
if A0 = ? then return failure // dead-e
nondeterministically choose a 2 A0 // branc
nondeterministically choose s0 2 (s, a)// progre
if s0 2 Visited then return failure // loop c
⇡(s) a; Visited Visited [ {s0}; s s0
Figure 5.7: Planning for Solutions by For
graphs to find solutions. The main goal of the follo
the di↵erence in algorithms from deterministic doma
mainly a didactic rather than practical objective.
5.3.1 Planning for Solutions by Forward
We first present a very simple algorithm that finds
ship
hbr
par1
tr1
par2
park
unload
164 Chapter 5
Policy:
ship: unload
hbr: park
par1: deliver
tr2: move
g1
g2
del
tr2
g1 g2
move
This policy

is returned
41"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSolu.on:(Proper.es(
!  Finds a solution if one exists
!  However, in most cases it will find unsafe solutions
●  Because it only considers one outcome for each action
!  Nondeterministic choice implemented using backtracking
●  Two levels of backtracking
▸  Choosing an action
▸  Choosing an effect of that action
●  Each sequence of choices corresponds to an execution trace of FindGSoluBon"
42"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSafeRSolu.on(
Section 5.3 175
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal reached by all leaves
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure // nonterminating loop
then return failure
nondeterministically choose a 2 Applicable(s) // select an action
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡)) // expand
return failure
Figure 5.8: Planning for Safe Solutions by Forward-search.
Keeps track of

unexpanded states,

much like A*
Uses FindGSoluBon"to see

if a Solution exists. If no

Solution, then no

Safe-Solution.
Only nondeterministic choice is action.

Adds ALL possible successor states to

Frontier. Not a choice since Safe-Solution

needs to guard against all eventualities.
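A minimal Python sketch of this control flow. Note that the pseudocode's "nondeterministically choose a" is a backtracking choice point; for brevity this sketch resolves it greedily (first applicable action, no backtracking), so unlike the real algorithm it may fail even when a safe solution exists. It reuses gamma(), applicable() and find_solution() from the earlier sketches.

def find_safe_solution_greedy(s0, goals):
    """Greedy (non-backtracking) sketch of Find-Safe-Solution's control flow."""
    pi = {}
    frontier = {s0}
    while frontier:
        if frontier <= goals:                   # goal reached by all leaves
            return pi
        for s in list(frontier):
            if s in goals:                      # goal leaves stay in the frontier
                continue
            frontier.discard(s)
            if find_solution(s, goals) is None: # no path to a goal from s
                return None
            acts = sorted(applicable(s))
            if not acts:
                return None                     # dead end
            a = acts[0]                         # greedy stand-in for "nondeterministically choose"
            pi[s] = a
            frontier |= gamma(s, a) - set(pi)   # expand with ALL outcomes not already in Dom(pi)
    return None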
43"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
ship
Policy:
Frontier: ship
44"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
ship
hbr
unload
Policy:
ship: unload
Frontier: hbr
45"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier: par2,

tr1,par1
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park
Unlike FindGSoluBon, need to

solve for all successor states.

All are added to Frontier.
46"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier: par2,

tr1,g1,g2,tr2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver
g1
g2
del
tr2
47"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier: par2,

tr1,g1,g2,tr2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver
g1
g2
del
tr2
g1 and g2 are goal states.So 

FSS doesn’t solve for it further.
48"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier: par2,

tr1,g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move
g1
g2
del
tr2
g1 g2
move
49"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier: par2,

tr1,g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move
g1
g2
del
tr2
g1 g2
move
50"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier:tr1,

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: back
g1
g2
del
tr2
g1 g2
move
hbr
back
51"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier:tr1,

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: back
g1
g2
del
tr2
g1 g2
move
hbr
back
52"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier:

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: back

tr1: move
g1
g2
del
tr2
g1 g2
move
par1 par2
hbr
back
53"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
Frontier:

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: back

tr1: move
g1
g2
del
tr2
g1 g2
move
par1 par2
satisfies
hbr
back
54"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRSafeRSolu.on(
Find-Safe-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Find-Solution(⌃, s, Sg) = failure //
then return failure
nondeterministically choose a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ ( (s, a)  Dom(⇡))
return failure
Figure 5.8: Planning for Safe Solutions by For
resulting from applying a to s. The interpretation of
choice of the state among the elements of the frontie
creates several copies of a, one for each applicable act
these copies has been made, the algorithm makes ano
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: back

tr1: move
g1
g2
del
tr2
g1 g2
move
par1 par2
This policy

is returned
hbr
back
55"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Proper.es(of(FindRSafeRSolu.on(
!  Guaranteed to find safe solution, if one exists
!  Uses FindGSoluBon"as a subroutine to detect nonterminating loops
56"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRAcyclicRSolu.on(
176 Chapter
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal reached by all leave
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? // loop checking
then return failure
choose nondeterministically a 2 Applicable(s) // select an action
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a) // expand
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by Forward-search.
Cycle check: makes sure

that action applied in previous

iteration didn’t lead to a state

already considered by π
Similar to

FindRSafeRSolu.on except:
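The corresponding greedy sketch differs from find_safe_solution_greedy only in the loop check and in adding all successors to the frontier. The same caveat applies: the real algorithm backtracks over the action choice, this sketch does not, and its loop check is made slightly stronger than the pseudocode's so that the sketch always terminates.

def find_acyclic_solution_greedy(s0, goals):
    """Greedy (non-backtracking) sketch of Find-Acyclic-Solution's control flow."""
    pi = {}
    frontier = {s0}
    while frontier:
        if frontier <= goals:                   # goal reached by all leaves
            return pi
        for s in list(frontier):
            if s in goals:
                continue
            frontier.discard(s)
            # Loop check: fail if this state, or any other pending state, is already in Dom(pi).
            if s in pi or (frontier & set(pi)):
                return None
            acts = sorted(applicable(s))
            if not acts:
                return None                     # dead end
            a = acts[0]                         # greedy stand-in for "nondeterministically choose"
            pi[s] = a
            frontier |= gamma(s, a)             # expand with ALL outcomes
    return None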
57"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
ship
Policy:
Frontier: ship
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
58"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
ship
hbr
unload
Policy:
ship: unload
Frontier: hbr
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
59"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier: par2,

tr1,par1
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park
Unlike FindGSoluBon, need to

solve for all successor states.

All are added to Frontier.
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
60"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier: par2,

tr1,g1,g2,tr2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver
g1
g2
del
tr2
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
61"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier: par2,

tr1,g1,g2,tr2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver
g1
g2
del
tr2
g1 and g2 are goal states.So 

FSS doesn’t solve for it further.
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
62"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier: par2,

tr1,g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move
g1
g2
del
tr2
g1 g2
move
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
63"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier: par2,

tr1,g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move
g1
g2
del
tr2
g1 g2
move
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
64"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier:tr1,

g1,g2,tr3
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: deliver
g1
g2
del
tr2
g1 g2
move
tr3
del
g1
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
Note: doesn’t

consider back(
because it 

creates

a cycle
65"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier:tr1,

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: deliver
tr3: move
g1
g2
del
tr2
g1 g2
move
tr3
del
g1
g2
move
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
66"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier:

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: deliver
tr3: move

par1: move
g1
g2
del
tr2
g1 g2
move
tr3
del
g1
g2
move
par1 par2
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
67"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
Frontier:

g1,g2
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: deliver
tr3: move

tr1: move
g1
g2
del
tr2
g1 g2
move
tr3
del
g1
g2
move
par1 par2
satisfies
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
68"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
FindRAcyclicRSoln(
ship
hbr
par1
tr1
par2
park
unload
Policy:
ship: unload
hbr: park

par1: deliver

tr2: move

par2: deliver
tr3: move

tr1: move
g1
g2
del
tr2
g1 g2
move
tr3
del
g1
g2
move
par1 par2
This policy

is returned
Find-Acyclic-Solution (⌃, s0, Sg)
⇡ ?
Frontier {s0}
while Frontier 6= ? do
if Frontier ✓ Sg then return ⇡ // goal
for every s 2 Frontier do
remove s from Frontier
if Frontier  Dom(⇡) 6= ? //
then return failure
choose nondeterministically a 2 Applicable(s)
⇡ ⇡ [ (s, a)
Frontier Frontier [ (s, a)
return failure
Figure 5.9: Planning for Safe Acyclic Solutions by
While exploring the frontier, it calls Find-Soluti
whether the current policy contains cycles without p
tion, i.e., whether it gets in a state where no action i
there is no path to the goal. Also Find-Safe-Solution
terministic selection among the applicable actions.
69"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Proper.es(of(FindRAcyclicRSolu.on(
!  Guarantees finding Acyclic Safe Solutions, if one exists
!  Checks for cycles by seeing if any node in FronBer"is already in the domain of π
70"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Guided(Planning(For(Safe(Solu.ons(
!  Main motivation: finding possibly unsafe solutions much easier than finding safe
solutions
●  FindGSoluBon"ignores AND/OR graph structure and just looks for a policy that
might achieve the goal
●  FindGSafeGSoluBon needs to plan for all possible outcomes of actions
!  We’ll now see an algorithm that computes safe solutions by starting from possibly
unsafe solutions
71"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
GuidedRFindRSafeRSolu.on(
192 Chapter 5
Guided-Find-Safe-Solution (⌃,s0,Sg)
if s0 2 Sg then return(?)
if Applicable(s0) = ? then return(failure)
⇡ ?
loop
Q leaves(s0, ⇡)  Sg
if Q = ? then return(⇡)
select arbitrarily s 2 Q
⇡0 Find-Solution(⌃, s, Sg)
if ⇡0 6= failure then do
⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)}
else for every s0 and a such that s 2 (s0, a) do
⇡ ⇡  {(s0, a)}
make a not applicable in s0
Figure 5.17: Guided Planning for a Safe Solution
Look at all the leaves of π. 

Safe solution requires a goal state

to be reachable from every node.

So plan from each non-solution leaf.
Incorporate solution π’ found

into overall policy π
If solution not found from
s, goals unreachable from
s. Remove all elements of
π that could result in s.
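A Python sketch of this loop, under some simplifying assumptions: a banned set of (state, action) pairs emulates "make a not applicable in s′"; a local copy of Find-Solution respects that set; and one extra guard (not in the pseudocode) makes the sketch give up when even s0 has no remaining path to the goal. It reuses A, gamma() and leaves() from the earlier sketches.

def guided_find_safe_solution(s0, goals):
    """Sketch of Guided-Find-Safe-Solution with action elimination via a banned set."""
    banned = set()

    def gamma_b(s, a):                      # gamma with banned (state, action) pairs removed
        return set() if (s, a) in banned else gamma(s, a)

    def applicable_b(s):
        return {a for a in A if gamma_b(s, a)}

    def find_solution_b(start):             # Find-Solution restricted to non-banned actions
        def dfs(cur, visited, sub):
            if cur in goals:
                return sub
            for a in applicable_b(cur):
                for nxt in gamma_b(cur, a):
                    if nxt in visited:
                        continue
                    r = dfs(nxt, visited | {nxt}, {**sub, cur: a})
                    if r is not None:
                        return r
            return None
        return dfs(start, {start}, {})

    if s0 in goals:
        return {}
    if not applicable_b(s0):
        return None
    pi = {}
    while True:
        q = leaves(s0, pi) - goals          # non-goal leaves of the current policy
        if not q:
            return pi
        s = next(iter(q))                   # select an arbitrary bad leaf
        sub = find_solution_b(s)
        if sub is not None:
            pi.update({st: a for st, a in sub.items() if st not in pi})
        elif s == s0:
            return None                     # guard: no remaining way to reach the goal from s0
        else:                               # goals unreachable from s: cut every way of reaching s
            for s1, a in [(s1, a) for s1, a in pi.items() if s in gamma_b(s1, a)]:
                del pi[s1]
                banned.add((s1, a))         # make a not applicable in s1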
72"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
164 Chapter 5
Figure 5.1: A simple nondeterministic planning domain model
EXAMPLE(
73"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Finding(Safe(Solu.ons(by(Determiniza.on(
!  Main idea underlying GuidedGFindGSafeGSoluBon:"
●  Can use (possibly) unsafe solutions (using FindGSoluBon) to guide the search
towards a safe solution
!  Advantageous because we can temporarily focus on only one of the action’s
outcomes
●  Searching for paths rather than trees
!  Determinization carries same idea even further
!  I’ll explain how determinization works, and then how it compares with FindG
SoluBon"
74"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Determiniza.on(Techniques(
!  High-Level Approach:
●  Transform nondeterministic model to a
deterministic one
▸  Each nondeterministic action translates to
several deterministic actions, one for each
possible successor state
●  Use CSV planners to solve these problems
●  Stitch solutions together into a policy
!  Advantages:
●  Deterministic planning problems efficiently
solvable
●  Allows us to leverage all of the nice features
CSV planners bring in
▸  Heuristics, landmarks, etc
hbr
par1
tr1
par2
park
hbr
par1
tr1
par2
park1
park2
park3
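A minimal sketch of the transformation: each nondeterministic (state, action) entry with k possible outcomes becomes k deterministic entries. The action_1, action_2, ... naming scheme is just for illustration.

def determinize(gamma_table):
    """mk-deterministic(Sigma): split every nondeterministic outcome set into
    separate deterministic actions. Returns a table mapping (state, action_i)
    to a single successor state."""
    det = {}
    for (s, a), outcomes in gamma_table.items():
        if len(outcomes) == 1:
            det[(s, a)] = next(iter(outcomes))
        else:
            for i, s2 in enumerate(sorted(outcomes), start=1):
                det[(s, f"{a}_{i}")] = s2     # e.g. park -> park_1, park_2, park_3
    return det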
75"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FindRSafeRSolu.onRbyRDeterminiza.on(
Find-Safe-Solution-by-Determinization (⌃,s0,Sg)
if s0 2 Sg then return(?)
if Applicable(s0) = ? then return(failure)
⇡ ?
⌃d mk-deterministic(⌃) // determinization
loop
Q leaves(s0, ⇡)  Sg
if Q = ? then do
⇡ ⇡  {(s, a) 2 ⇡ | s 62 b(s0, ⇡)} // clean policy
return(⇡)
select s 2 Q
p0 Forward-search (⌃d, s, Sg) // classical planner
if p0 6= fail then do
⇡0 Plan2policy(p0, s) // plan2policy transformatio
⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)}
else for every s0 and a such that s 2 (s0, a) do
⇡ ⇡  {(s0, a)}
make the actions in the determinization of a // action elimination
not applicable in s0
Compute determinization of domain
If no non-solution leaf
states, we’re done. Need to
clean up policy to remove
unreachable states
Invoke CSV planner on
deterministic model
Transform deterministic

plan into policy
Action elimination
76"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Plan2Policy(
⌃d rather than the nondeterministic domain ⌃.
Plan2policy(p = ha1, . . . , ani,s)
⇡ ?
loop for i from 1 to n do
⇡ ⇡ [ (s, ai)
s d(s, ai)
return ⇡
Figure 5.19: Transformation of a sequential plan into a corresponding pol
5.6 Online approaches with nondeterminist
models
In Chapter 1 (see Section 1.2, and specifically Section 1.6.2) we introdu
the idea of interleaving planning and acting. One motivation is that, giv
a complete plan that is generated o↵-line, its execution seldom works
Relatively straightforward: transforms a sequential solution into a policy representation.
Note: p needs to be an acyclic plan. To ensure this, Forward-Search (see the previous slide) needs to return an acyclic plan.
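A direct Python transcription of Plan2policy, for illustration only, assuming the determinized transition function is a dict from (state, action) to the single successor state (as in the earlier determinization sketch):

def plan2policy(plan, s, gamma_d):
    """Turn a sequential plan (a list of determinized actions applied from
    state s) into a policy mapping each visited state to an action."""
    policy = {}
    for a in plan:
        policy[s] = a
        s = gamma_d[(s, a)]        # deterministic: exactly one successor
    return policy

# Example with hypothetical determinized actions of the harbor domain:
gamma_d = {("ship", "unload"): "hbr", ("hbr", "park1"): "par1"}
print(plan2policy(["unload", "park1"], "ship", gamma_d))
# {'ship': 'unload', 'hbr': 'park1'}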
77"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Ac.on(Elimina.on(
if p0 6= fail then do
⇡0 Plan2policy(p0, s) // plan2poli
⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)}
else for every s0 and a such that s 2 (s0, a) do
⇡ ⇡  {(s0, a)}
make the actions in the determinization of a // action eli
not applicable in s0
Figure 5.18: Planning for Safe Solutions by Determinization
Fragment of Find-Safe-Solution-by-Determinization that has to do with action elimination.
Triggered if there is no deterministic solution from s.
Informally it does the following:
•  Update π to ensure s is never reached
•  Ensure that no future call to Forward-Search returns a solution going through s
78"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Proper.es(of(FindRSafeRSolu.onRbyR
Determiniza.on(
!  Finds safe solutions
!  Any CSV planner can be plugged in
!  Determinization needs to be done carefully
●  Could potentially lead to an exponential blowup in the number of actions
79"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Online(Approaches(with(Nondeterminis.c(Models(
!  Interleaving planning and acting is
important
●  Planning models are approximate –
execution seldom works out as planned
●  Large problems mean long planning
time – need to interleave the two
!  This motivation even more stronger in
nondeterministic domains
●  Long time needed to generate safe
solutions when there are lots of state
variables, actions etc
!  Therefore interleaving planning and acting
helps reduce complexity
●  Instead of coming up with complete
policy, generate partial policy that tells
us the next few actions to perform
[Figure 5.20: Off-line vs. Run-Time Search Spaces]
80"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Issues(With(Interleaving(Planning(and(Ac.ng(
!  Need to identify good actions without exploring entire search space
●  Can be done using heuristic estimates
!  Handling Dead-ends:
●  When lookahead is not enough, can get trapped in dead ends
▸  By planning fully, we would have found out about the dead-end
▸  E.g. if robot goes down a steep incline out of which it cannot come back
up
●  Not a problem in safely explorable domains
▸  Goal states reachable from all situations
!  Despite these issues, interleaving planning and acting an essential alternative to
purely offline planning
81"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
Ac.ng(Procedure:(RunRLookahead(
198 Chapter
Run-Lookahead(⌃, s0, Sg)
s s0
while s /2 Sg and Applicable(s) 6= ? do
⇡ Lookahead(s, ✓)
apply partial plan ⇡
s observe current state
Figure 5.21: Interleaving planning and execution by look-ahead
There are different ways in which the generated plan can be partial, and different ways in which planning and acting can be interleaved. Indeed, the procedure Run-Lookahead is parametric along two dimensions.
The first parametric dimension is in the call to the look-ahead planning step, i.e., Lookahead(s, θ). The parameter θ determines the way in which the generated plan π is partial. For instance, it can be partial because the lookahead is bounded, i.e., the forward search is performed for a bounded number of steps.
This is where the planner is invoked. θ is a context-dependent parameter that restricts the search for a solution and hence determines how π is partial:
•  θ could be a bound on the search depth
•  θ could be a limitation on planning time
•  θ could also limit the number of action outcomes considered
•  Special case: considering only ONE outcome == Find-Solution
!  Two ways to perform lookahead (see the Run-Lookahead sketch below):
●  Lookahead with a bounded number of steps: handle all action outcomes, but only up to a certain depth
●  Lookahead by determinization: solve the problem fully, but the result is possibly unsafe due to determinization
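A minimal Python rendering of Run-Lookahead, for illustration only: lookahead, execute, observe, and applicable are assumed callbacks (the θ-bounded planner, the execution platform, state observation, and the applicability test), and the partial plan is replanned as soon as it stops matching the observed state.

def run_lookahead(s0, goals, applicable, lookahead, execute, observe, theta):
    """Interleave planning and acting: repeatedly generate a partial plan
    with Lookahead(s, theta), apply it, observe the outcome, and replan."""
    s = s0
    while s not in goals and applicable(s):
        partial_plan = lookahead(s, theta)      # theta: e.g. a bound on depth or time
        if not partial_plan:
            return s                            # planner found nothing useful
        for a in partial_plan:
            if s in goals or a not in applicable(s):
                break                           # plan no longer fits reality: replan
            execute(a)
            s = observe()                       # nondeterminism: see what actually happened
    return s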
82"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15"
FFRReplan:(Lookahead(by(Determiniza.on(
Section 5.6
FF-Replan (⌃, s, Sg)
while s /2 Sg and Applicable(s) 6= ? do
if ⇡d undefined for s then do
⇡d Forward-search (⌃d, s, Sg)
apply action ⇡d(s)
s observe resulting state
Figure 5.22: Online determinization planning and acting algorithm.
… lookahead and a partial number of outcomes, in any arbitrary way.
The second parametric dimension is in the application of the partial plan that has been generated, i.e., apply the partial plan π. Independently of the lookahead, we can still execute π in a partial way. Suppose, for instance, that we have generated a sequential plan of length n; we can decide to apply only m ≤ n steps.
Run Forward-Search on a determinized version of the problem.
Then start executing the (possibly unsafe) policy until we cannot execute it anymore.
Properties:
•  If the domain is safely explorable, then FF-Replan will get to a goal state.
•  If the domain has dead ends, then there are no guarantees.
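In the same illustrative style, here is a Python sketch of FF-Replan, assuming a determinized transition dict gamma_d, a classical planner callback, and execute/observe callbacks for acting. The partial policy pi_d is filled in lazily and the agent replans whenever it observes a state the policy does not cover; this is a sketch of the idea, not the original FF-Replan implementation.

def ff_replan(gamma_d, s0, goals, classical_plan, applicable, execute, observe):
    """Online determinization planning and acting, in the spirit of Figure 5.22."""
    pi_d = {}                                  # partial policy over the determinized domain
    s = s0
    while s not in goals and applicable(s):
        if s not in pi_d:                      # policy undefined here: (re)plan
            plan = classical_plan(gamma_d, s, goals)
            if plan is None:
                return s                       # dead end: no guarantee without safe explorability
            state = s
            for a_d in plan:                   # record the plan as a policy fragment
                pi_d[state] = a_d
                state = gamma_d[(state, a_d)]
        execute(pi_d[s])                       # executing a determinized copy means
        s = observe()                          #   executing its underlying action
    return s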

Chapter05

  • 2. 2"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Introduc.on( !  World seldom predictable ●  corresponding deliberation models as a result always going to be incomplete !  Results in: ●  Action failures ●  Unexpected side effects of actions ●  Exogenous events !  So far, been working with deterministic action models ●  Each action, when applied in a particular state, results in only one state ●  Formally: γ(s,a) returns a single state ●  Doesn’t adequately support inherent uncertainty in domains !  Nondeterministic models provide more flexibility: ●  An action, when applied in a state, may result in one among several possible states ●  γ(s,a) returns a set of states !  Nondeterministic models allow modeling uncertainty in planning domains
  • 3. 3"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Why(Model(Uncertainty?(( !  We’ve seen ways to handle these situations using deterministic models ●  Generate plans for the nominal case ●  Execute, and monitor ●  Detect failure, and recover
  • 4. 4"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Why(Model(Uncertainty?( Answer: nondeterministic models have several advantages !  More accurate modeling !  Plan for uncertainty ahead of time, instead of during execution !  No nominal case in certain environments: ●  Think of throwing a dice/tossing a coin ●  Online payments where choice of payment left to user !  However, comes at a cost: ●  More complicated, both conceptually and computationally ●  Since you need to take all different possibilities into account
  • 5. 5"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Figure 5.1: A simple nondeterministic planning domain model Definition 5.1. (Planning Domain) A nondeterministic planning do- main ⌃ is the tuple (S, A, ), where S is the finite set of states, A is the finite set of actions, and : S ⇥ A ! 2S is the state transition function. Search(Spaces(in(Nondeterminis.c(Planning( !  Search space of deterministic planning modeled as a graph ●  Nodes are states, edges are actions !  For planning with nondeterministic domains, search space no longer a graph ●  Instead its now an AND/OR graph !  AND/OR graph has following elements: ●  OR branches: which action to apply in a state? ●  AND branches: which state does the action lead to? !  Have control over which action to apply (OR branches) !  Don’t have control over resulting state (AND branches) A simple nondeterministic model of a harbor management facility
  • 6. 6"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Nondeterminis.c(Planning(Domains( !  3-tuple (S, A, γ) ●  S – finite set of states ●  A – finite set of actions ●  γ: S × A → 2S !  Search space of a simple harbor management domain ●  Only one state variable: ▸  pos(item) ●  Nodes represent possible values
  • 7. 7"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Ac.ons(in(Nondeterminis.c(Planning(Domains( !  An action a applicable in state s iff γ(s,a) ≠ ∅ !  Applicable(s) is set of all actions applicable in s ●  Applicable(s) = {a ∈ A | γ(s, a) ≠ ∅} !  Five actions in example ●  Two deterministic: ▸  unload, back ●  Three nondeterministic: ▸  park move, deliver
  • 8. 8"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Ac.ons(in(Nondeterminis.c(Planning(Domains( !  park stores items in storage areas parking1 or parking2 ●  Nondeterminism used to model possibility of ▸  storing item in parking1 ▸  storing item in parking2 ▸  having to temporarily move item in transit1 if space is unavailable ●  Once space is available: move action
  • 9. 9"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Plans(in(Nondeterminis.c(Domains( !  Structure of plans must be different from the deterministic case ●  Previously, sequence of actions !  Doesn’t work here ●  Why?
  • 10. 10"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Plans(in(Nondeterminis.c(Domains( !  Need the notion of a conditional plan ●  plans that account for various possibilities in a given state !  Can sense the actual action outcome among the possible ones, and act according to the conditional structure of plan !  A possible representation: ●  a policy: partial function that maps states to actions !  If a policy π maps a state s to an action a ●  that means we should perform a whenever we are in state s
  • 11. 11"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Policies:(A(Representa.on(of(Plans(in( Nondeterminis.c(Planning( !  Example policy π1 for the harbor management problem: ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver"
  • 12. 12"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Policies:(A(Representa.on(of(Plans(in( Nondeterminis.c(Planning( ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver"
  • 13. 13"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5!  In deterministic planning, can compute states reachable by sequence of actions using γ ●  s ∪ γ (s, a1)∪ γ (γ (s,a1), a2) ∪ ... !  Need few extra definitions to do similar checks in nondeterministic planning !  Reachable States: (s,π) ●  All states that can be produced by starting at s and executing π !  Example: (pos(item)=on"ship,π1) ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver" Defini.ons(Over(Policies(
  • 14. 14"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Defini.ons(Over(Policies( !  In deterministic planning, can compute states reachable by sequence of actions using γ ●  s ∪ γ (s, a1)∪ γ (γ (s,a1), a2) ∪ ... !  Need few extra definitions to do similar checks in nondeterministic planning !  Reachable States: (s,π) ●  All states that can be produced by starting at s and executing π !  Example: (pos(item)=on"ship,π1) ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver"
  • 15. 15"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 Defini.ons(Over(Policies( !  Need to also check whether plan reaches goal ●  Requires calculating final states of policy !  leaves (s,π): set of final states reached by policy π starting from state s !  leaves(s, π) = {s′ | s′ ∈ ︎ (s, π) and s′ not in Dom(π)} !  Example: ●  leaves (pos(item)=on"ship,"π1) ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver"
  • 16. 16"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Policies:(A(Representa.on(of(Plans(in( Nondeterminis.c(Planning( !  Reachability graph, Graph(s,π) ●  Graph of all possible state transitions if we execute π starting at s ●  Graph(s,π) = { γ︎(s,π), E | s′ ∈ γ︎(s, π), s′′ ∈ π(s′), and (s′,s′′) ∈ E} ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver"
  • 17. 17"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Policies:(A(Representa.on(of(Plans(in( Nondeterminis.c(Planning( ●  π2"(pos(item)=on"ship)"="unload" ●  π2(pos(item)=at"harbor)"="park" ●  π2(pos(item)=parking1)"="deliver" ●  π2(pos(item)=parking2)"="back" ●  π2(pos(item)=transit1)"="move" ●  π2(pos(item)=transit2)"="move;""
  • 18. 18"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Policies:(A(Representa.on(of(Plans(in( Nondeterminis.c(Planning( ●  π3"(pos(item)=on"ship)"="unload" ●  π3(pos(item)=at"harbor)"="park" ●  π3(pos(item)=parking1)"="deliver" ●  π3(pos(item)=parking2)"="deliver" ●  π3(pos(item)=transit1)"="move" ●  π3(pos(item)=transit2)"="move" ●  π3(pos(item)=transit3)"="move""
  • 19. 19"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Planning(Problems(and(Solu.ons( !  Let Σ = (S,A,γ) be a planning domain !  A planning problem P is a 3-tuple P = (Σ,s0,Sg) ●  s0 ∈ S is the initial state ●  Sg ⊆ S is set of goal states !  Note: previous book had set of initial states S0 ●  Allowed uncertainty about initial state ●  Current definition is equivalent ▸  Can easily translate one to the other •  How?
  • 20. 20"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Planning(Problems(and(Solu.ons( !  Let Σ = (S,A,γ) be a planning domain !  A planning problem P is a 3-tuple P = (Σ,s0,Sg) ●  s0 ∈ S is the initial state ●  Sg ⊆ S is set of goal states !  Note: previous book had set of initial states S0 ●  Allowed uncertainty about initial state ●  Current definition is equivalent ▸  Can easily translate one to the other •  How? ▸  Introduce a new start action such that γ (s0, start) = S0 !  Solutions: not as straightforward to define as Deterministic Planning ●  Based on actual action outcomes, might or might not achieve goal ●  Can define different criteria of success – many types of solutions
  • 21. 21"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(1:(Solu.on( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a solution iff leaves (s0,π) ∩ Sg ≠ ∅ !  A policy that may lead to a goal ●  In other words: at least one sequence of nondeterministic outcomes leads to a goal state !  Example: ●  s0 = {pos(item)"="on_ship} ●  Sg = {pos(item)"="gate1,"pos(item)"="gate2} !  Policy π1 is a solution ●  π1 (pos(item)=on"ship) = unload" ●  π1(pos(item)=at"harbor) = park" ●  π1(pos(item)=parking1) = deliver" !  Reason: At least one of the paths in reachability graph of π1 leads to a state in Sg
  • 22. 22"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2:(Safe(Solu.on( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution iff ∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅) Safe solution: a solution in which a goal state is reachable from every state in the reachability graph !  Is π1 a safe solution? Condition for solutionNeeds to hold for all reachable states
  • 23. 23"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2:(Safe(Solu.on( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution iff ∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅) Safe solution: a solution in which a goal state is reachable from every state in the reachability graph !  Is π1 a safe solution? ●  No Condition for solutionNeeds to hold for all reachable states
  • 24. 24"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2:(Safe(Solu.on( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution iff ∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅) Safe solution: a solution in which a goal state is reachable from every state in the reachability graph !  Is π2 a safe solution? Condition for solutionNeeds to hold for all reachable states
  • 25. 25"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2:(Safe(Solu.on( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a safe solution iff ∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅) Safe solution: a solution in which a goal state is reachable from every state in the reachability graph !  Is π2 a safe solution? ●  Yes Condition for solutionNeeds to hold for all reachable states
  • 26. 26"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2a:(Cyclic(Safe(Solu.ons( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a cyclic safe solution iff (1)  leaves(s0, π) ⊆ Sg ∧ (2)  (∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)) (3)  Graph(s0, π) is cyclic Meaning of Conditions: (1)  No non-solution leaves (2)  Safe solution (3)  Reachability graph is cyclic Cyclic Safe solution: a safe solution with cycles !  π2 is a cyclic safe solution How does having cycles affect level of safety?
  • 27. 27"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2a:(Cyclic(Safe(Solu.ons( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a cyclic safe solution iff (1)  leaves(s0, π) ⊆ Sg ∧ (2)  (∀s ∈ γ ︎(s0, π)(leaves(s, π) ∩ Sg ≠ ∅)) (3)  Graph(s0, π) is cyclic Meaning of Conditions: (1)  No non-solution leaves (2)  Safe solution (3)  Reachability graph is cyclic Cyclic Safe solution: a safe solution with cycles !  π2 is a cyclic safe solution How does having cycles affect level of safety? !  could go though cycle infinitely many times !  If execution gets out of loop eventually, guaranteed to reach goal state
  • 28. 28"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Type(2b:(Acyclic(Safe(Solu.ons( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is a acyclic safe solution iff (1) leaves(s0, π) ⊆ Sg ∧ (2) Graph(s0, π) is cyclic Meaning of Conditions: (1)  No non-solution leaves (2)  Reachability graph is acyclic Acyclic Safe Solution: a safe solution without cycles !  π3 is an acyclic safe solution !  Acyclic policy completely safe ●  No matter what happens, guaranteed to eventually reach the goal
  • 29. 29"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Unsafe(Solu.ons( Let P = (Σ,s0,Sg) be a planning problem. Let π be a policy for Σ. π is an unsafe solution iff (1)  (leaves(s0, π) ∩ Sg ≠ ∅) (2)  ((∃s ∈ leaves(s0, π) | s is not in Sg) ∨ (∃s ∈ γ︎(s0,π) | leaves(s,π)=∅)) Either there is a non-solution leaf state Or you get caught in
 an infinite loop Both of these are bad events
  • 30. 30"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Summary(of(Solu.on(Types( Section 5.3 173 Figure 5.6: Di↵erent Kinds of Solutions: A Class Diagram nondeterminism probabilistic solutions weak solutions - unsafe solutions - improper solutions safe solutions strong cyclic solutions proper solutions cyclic safe solutions - - acyclic safe solutions strong solutions - !  Unsafe Solutions aren’t of much interest to us ●  Do not guarantee achievement of goal !  Acyclic Safe Solutions are the best – complete assurance that we’ll get to the goal !  Cyclic Safe Solutions also good, but provide a weaker degree of assurance ●  We can get into loops ●  However, assuming that we don’t stay in the loop forever, guaranteed to achieve the goal
  • 32. 32"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" AND/OR(Graph(Search(Algorithms( !  Nondeterministic planning search spaces represented as AND/OR graphs ●  nodes: states ●  OR branches: actions applicable in a state (consider 1) ●  AND branches: successor states from an state-action pair (consider ALL) !  Reachability graph of a solution policy includes one action at each OR branch and all of the action’s outcomes at each AND branch !  First set of planning algorithms will do AND/OR graph search ●  Simple extensions of ForwardG Search"from Chapter 2 ship hbr par1 tr1 par2 park tr2 g2 g1 del tr3 g1 hbr del back par1 par2 move unload
  • 33. 33"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( Chapter 2 Forward-search (⌃, s0, g) s s0; ⇡ hi loop if s0 satisfies g then return ⇡ A0 {a 2 A | a is applicable in s} if A0 = ? then return failure nondeterministically choose a 2 A0 s (s, a); ⇡ ⇡.a A nondeterministic forward-search planning algorithm. iscuss properties that are shared by all algorithms that do a of the same search space, even though those algorithms may es of that tree in di↵erent orders. The rest of this section of those algorithms. olution to a planning problem may require a huge computa- r an arbitrary CSV planning problem the task is PSPACE- ]. To reduce the computational e↵ort, several of the search his section incorporate heuristic techniques for selecting which 174 Chapter 5 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initialization loop if s 2 Sg then return ⇡ // goal test A0 Applicable(s) if A0 = ? then return failure // dead-end test nondeterministically choose a 2 A0 // branching nondeterministically choose s0 2 (s, a)// progression if s0 2 Visited then return failure // loop check ⇡(s) a; Visited Visited [ {s0}; s s0 174 Chapter 5 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initialization loop if s 2 Sg then return ⇡ // goal test A0 Applicable(s) if A0 = ? then return failure // dead-end test nondeterministically choose a 2 A0 // branching nondeterministically choose s0 2 (s, a)// progression if s0 2 Visited then return failure // loop check ⇡(s) a; Visited Visited [ {s0}; s s0 Additional nondeterministic choice to decide which action
 outcome to plan for next Cycle-checking Identical Algorithms except: Deterministic Planning algorithm
 from Chapter 2 Nondeterministic Planning
 algorithm
  • 34. 34"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship 164 Chapter 5 Policy:
  • 35. 35"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr unload 164 Chapter 5 Policy: ship: unload
  • 36. 36"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr par1 tr1 par2 park unload 164 Chapter 5 Policy: ship: unload hbr: park Assume this
 outcome is chosen
  • 37. 37"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr par1 tr1 par2 park unload 164 Chapter 5 Policy: ship: unload hbr: park par1: deliver g1 g2 del tr2 Assume this
 outcome is chosen
  • 38. 38"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr par1 tr1 par2 park unload 164 Chapter 5 Policy: ship: unload hbr: park par1: deliver tr2: move g1 g2 del tr2 g1 g2 move Assume this
 outcome is chosen
  • 39. 39"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr par1 tr1 par2 park unload 164 Chapter 5 Policy: ship: unload hbr: park par1: deliver tr2: move g1 g2 del tr2 g1 g2 move Reached a
 goal state.
 Terminate here.
  • 40. 40"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Algorithm(to(find(Solu.ons( 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initia loop if s 2 Sg then return ⇡ // goal A0 Applicable(s) if A0 = ? then return failure // dead- nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progr if s0 2 Visited then return failure // loop ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the foll the di↵erence in algorithms from deterministic dom mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward 174 Find-Solution (⌃, s0, Sg) ⇡ ?; s s0; Visited {s0} // initial loop if s 2 Sg then return ⇡ // goal t A0 Applicable(s) if A0 = ? then return failure // dead-e nondeterministically choose a 2 A0 // branc nondeterministically choose s0 2 (s, a)// progre if s0 2 Visited then return failure // loop c ⇡(s) a; Visited Visited [ {s0}; s s0 Figure 5.7: Planning for Solutions by For graphs to find solutions. The main goal of the follo the di↵erence in algorithms from deterministic doma mainly a didactic rather than practical objective. 5.3.1 Planning for Solutions by Forward We first present a very simple algorithm that finds ship hbr par1 tr1 par2 park unload 164 Chapter 5 Policy: ship: unload hbr: park par1: deliver tr2: move g1 g2 del tr2 g1 g2 move This policy
 is returned
  • 41. 41"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSolu.on:(Proper.es( !  Finds a solution if one exists !  However, in most cases it will find unsafe solutions ●  Because it only considers one outcome for each action !  Nondeterministic choice implemented using backtracking ●  Two levels of backtracking ▸  Choosing an action ▸  Choosing an effect of that action ●  Each sequence of choices corresponds to an execution trace of FindGSoluBon"
  • 42. 42"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSafeRSolu.on( Section 5.3 175 Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal reached by all leaves for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // nonterminating loop then return failure nondeterministically choose a 2 Applicable(s) // select an action ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) // expand return failure Figure 5.8: Planning for Safe Solutions by Forward-search. Keeps track of
 unexpanded states,
 much like A* Uses FindGSoluBon"to see
 if a Solution exists. If no
 Solution, then no
 Safe-Solution. Only nondeterministic choice is action.
 Adds ALL possible successor states to
 Frontier. Not a choice since Safe-Solution
 needs to guard against all eventualities.
  • 43. 43"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano ship Policy: Frontier: ship
  • 44. 44"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano ship hbr unload Policy: ship: unload Frontier: hbr
  • 45. 45"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier: par2,
 tr1,par1 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park Unlike FindGSoluBon, need to
 solve for all successor states.
 All are added to Frontier.
  • 46. 46"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier: par2,
 tr1,g1,g2,tr2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver g1 g2 del tr2
  • 47. 47"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier: par2,
 tr1,g1,g2,tr2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver g1 g2 del tr2 g1 and g2 are goal states.So 
 FSS doesn’t solve for it further.
  • 48. 48"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier: par2,
 tr1,g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move g1 g2 del tr2 g1 g2 move
  • 49. 49"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier: par2,
 tr1,g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move g1 g2 del tr2 g1 g2 move
  • 50. 50"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier:tr1,
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: back g1 g2 del tr2 g1 g2 move hbr back
  • 51. 51"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier:tr1,
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: back g1 g2 del tr2 g1 g2 move hbr back
  • 52. 52"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier:
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: back
 tr1: move g1 g2 del tr2 g1 g2 move par1 par2 hbr back
  • 53. 53"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano Frontier:
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: back
 tr1: move g1 g2 del tr2 g1 g2 move par1 par2 satisfies hbr back
  • 54. 54"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRSafeRSolu.on( Find-Safe-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Find-Solution(⌃, s, Sg) = failure // then return failure nondeterministically choose a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ ( (s, a) Dom(⇡)) return failure Figure 5.8: Planning for Safe Solutions by For resulting from applying a to s. The interpretation of choice of the state among the elements of the frontie creates several copies of a, one for each applicable act these copies has been made, the algorithm makes ano ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: back
 tr1: move g1 g2 del tr2 g1 g2 move par1 par2 This policy
 is returned hbr back
  • 55. 55"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Proper.es(of(FindRSafeRSolu.on( !  Guaranteed to find safe solution, if one exists !  Uses FindGSoluBon"as a subroutine to detect nonterminating loops
  • 56. 56"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRAcyclicRSolu.on( 176 Chapter Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal reached by all leave for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // loop checking then return failure choose nondeterministically a 2 Applicable(s) // select an action ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) // expand return failure Figure 5.9: Planning for Safe Acyclic Solutions by Forward-search. Cycle check: makes sure
 that action applied in previous
 iteration didn’t lead to a state
 already considered by π Similar to
 FindRSafeRSolu.on except:
  • 57. 57"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( ship Policy: Frontier: ship Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 58. 58"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( ship hbr unload Policy: ship: unload Frontier: hbr Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 59. 59"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier: par2,
 tr1,par1 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park Unlike FindGSoluBon, need to
 solve for all successor states.
 All are added to Frontier. Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 60. 60"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier: par2,
 tr1,g1,g2,tr2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver g1 g2 del tr2 Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 61. 61"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier: par2,
 tr1,g1,g2,tr2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver g1 g2 del tr2 g1 and g2 are goal states.So 
 FSS doesn’t solve for it further. Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 62. 62"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier: par2,
 tr1,g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move g1 g2 del tr2 g1 g2 move Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 63. 63"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier: par2,
 tr1,g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move g1 g2 del tr2 g1 g2 move Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 64. 64"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier:tr1,
 g1,g2,tr3 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: deliver g1 g2 del tr2 g1 g2 move tr3 del g1 Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions. Note: doesn’t
 consider back( because it 
 creates
 a cycle
  • 65. 65"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier:tr1,
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: deliver tr3: move g1 g2 del tr2 g1 g2 move tr3 del g1 g2 move Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 66. 66"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier:
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: deliver tr3: move
 par1: move g1 g2 del tr2 g1 g2 move tr3 del g1 g2 move par1 par2 Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 67. 67"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( Frontier:
 g1,g2 ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: deliver tr3: move
 tr1: move g1 g2 del tr2 g1 g2 move tr3 del g1 g2 move par1 par2 satisfies Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 68. 68"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" 164 Chapter 5 FindRAcyclicRSoln( ship hbr par1 tr1 par2 park unload Policy: ship: unload hbr: park
 par1: deliver
 tr2: move
 par2: deliver tr3: move
 tr1: move g1 g2 del tr2 g1 g2 move tr3 del g1 g2 move par1 par2 This policy
 is returned Find-Acyclic-Solution (⌃, s0, Sg) ⇡ ? Frontier {s0} while Frontier 6= ? do if Frontier ✓ Sg then return ⇡ // goal for every s 2 Frontier do remove s from Frontier if Frontier Dom(⇡) 6= ? // then return failure choose nondeterministically a 2 Applicable(s) ⇡ ⇡ [ (s, a) Frontier Frontier [ (s, a) return failure Figure 5.9: Planning for Safe Acyclic Solutions by While exploring the frontier, it calls Find-Soluti whether the current policy contains cycles without p tion, i.e., whether it gets in a state where no action i there is no path to the goal. Also Find-Safe-Solution terministic selection among the applicable actions.
  • 69. 69"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Proper.es(of(FindRAcyclicRSolu.on( !  Guarantees finding Acyclic Safe Solutions, if one exists !  Checks for cycles by seeing if any node in FronBer"is already in the domain of π
  • 70. 70"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Guided(Planning(For(Safe(Solu.ons( !  Main motivation: finding possibly unsafe solutions much easier than finding safe solutions ●  FindGSoluBon"ignores AND/OR graph structure and just looks for a policy that might achieve the goal ●  FindGSafeGSoluBon needs to plan for all possible outcomes of actions !  We’ll now see an algorithm that computes safe solutions by starting from possibly unsafe solutions
  • 71. 71"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" GuidedRFindRSafeRSolu.on( 192 Chapter 5 Guided-Find-Safe-Solution (⌃,s0,Sg) if s0 2 Sg then return(?) if Applicable(s0) = ? then return(failure) ⇡ ? loop Q leaves(s0, ⇡) Sg if Q = ? then return(⇡) select arbitrarily s 2 Q ⇡0 Find-Solution(⌃, s, Sg) if ⇡0 6= failure then do ⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)} else for every s0 and a such that s 2 (s0, a) do ⇡ ⇡ {(s0, a)} make a not applicable in s0 Figure 5.17: Guided Planning for a Safe Solution Look at all the leaves of π. 
 Safe solution requires a goal state
 to be reachable from every node.
 So plan from each non-solution leaf. Incorporate solution π’ found
 into overall policy π If solution not found from s, goals unreachable from s. Remove all elements of π that could result in s.
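In Python, the same loop might look like the sketch below. It is an illustration, not the book's code: find_solution stands for any possibly-unsafe planner with this calling convention, predecessors(s) is an assumed helper enumerating every (state, action) pair that can produce s, and "make a not applicable in s′" is modeled by a banned set consulted through usable().

  def policy_leaves(gamma, s0, policy):
      """States reachable from s0 under `policy` for which the policy has no action."""
      seen, stack, leaves = set(), [s0], set()
      while stack:
          x = stack.pop()
          if x in seen:
              continue
          seen.add(x)
          if x in policy:
              stack.extend(gamma(x, policy[x]))
          else:
              leaves.add(x)
      return leaves

  def guided_find_safe_solution(applicable, gamma, predecessors, find_solution, s0, goals):
      """Repeatedly run a cheap (possibly unsafe) planner from each open leaf of the
      current policy; when it fails from a state, eliminate the actions leading there."""
      if s0 in goals:
          return {}
      banned = set()                                       # (state, action) pairs ruled out
      usable = lambda s: [a for a in applicable(s) if (s, a) not in banned]
      if not usable(s0):
          return None
      policy = {}
      while True:
          open_leaves = policy_leaves(gamma, s0, policy) - goals
          if not open_leaves:
              return policy                                # every leaf is a goal: safe
          s = next(iter(open_leaves))
          sub = find_solution(usable, gamma, s, goals)     # possibly unsafe sub-policy, or None
          if sub is not None:
              for x, a in sub.items():
                  policy.setdefault(x, a)                  # keep choices already made
          elif s == s0:
              return None                                  # nothing reaches the goal from s0
          else:                                            # goals unreachable from s:
              for x, a in predecessors(s):                 # every (x, a) with s in gamma(x, a)
                  if policy.get(x) == a:
                      del policy[x]                        # retract entries that could reach s
                  banned.add((x, a))                       # and never use that action there again

Passing usable rather than applicable into find_solution is what makes the "make a not applicable" step stick: later searches simply never see the eliminated actions.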
  • 73. 73"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Finding(Safe(Solu.ons(by(Determiniza.on( !  Main idea underlying GuidedGFindGSafeGSoluBon:" ●  Can use (possibly) unsafe solutions (using FindGSoluBon) to guide the search towards a safe solution !  Advantageous because we can temporarily focus on only one of the action’s outcomes ●  Searching for paths rather than trees !  Determinization carries same idea even further !  I’ll explain how determinization works, and then how it compares with FindG SoluBon"
  • 74. 74"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Determiniza.on(Techniques( !  High-Level Approach: ●  Transform nondeterministic model to a deterministic one ▸  Each nondeterministic action translates to several deterministic actions, one for each possible successor state ●  Use CSV planners to solve these problems ●  Stitch solutions together into a policy !  Advantages: ●  Deterministic planning problems efficiently solvable ●  Allows us to leverage all of the nice features CSV planners bring in ▸  Heuristics, landmarks, etc hbr par1 tr1 par2 park hbr par1 tr1 par2 park1 park2 park3
  • 75. 75"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FindRSafeRSolu.onRbyRDeterminiza.on( Find-Safe-Solution-by-Determinization (⌃,s0,Sg) if s0 2 Sg then return(?) if Applicable(s0) = ? then return(failure) ⇡ ? ⌃d mk-deterministic(⌃) // determinization loop Q leaves(s0, ⇡) Sg if Q = ? then do ⇡ ⇡ {(s, a) 2 ⇡ | s 62 b(s0, ⇡)} // clean policy return(⇡) select s 2 Q p0 Forward-search (⌃d, s, Sg) // classical planner if p0 6= fail then do ⇡0 Plan2policy(p0, s) // plan2policy transformatio ⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)} else for every s0 and a such that s 2 (s0, a) do ⇡ ⇡ {(s0, a)} make the actions in the determinization of a // action elimination not applicable in s0 Compute determinization of domain If no non-solution leaf states, we’re done. Need to clean up policy to remove unreachable states Invoke CSV planner on deterministic model Transform deterministic
 plan into policy Action elimination
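The "clean policy" line prunes entries that can no longer be reached from s0 when following π (they accumulate when earlier sub-plans are invalidated by action elimination). A minimal sketch of that step, with the same assumed interface as the earlier sketches:

  def clean_policy(gamma, s0, policy, goals):
      """Keep only policy entries for states reachable from s0 when following the policy."""
      reachable, stack = set(), [s0]
      while stack:
          s = stack.pop()
          if s in reachable or s in goals:
              continue
          reachable.add(s)
          if s in policy:
              stack.extend(gamma(s, policy[s]))
      return {s: a for s, a in policy.items() if s in reachable}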
  • 76. 76"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Plan2Policy( ⌃d rather than the nondeterministic domain ⌃. Plan2policy(p = ha1, . . . , ani,s) ⇡ ? loop for i from 1 to n do ⇡ ⇡ [ (s, ai) s d(s, ai) return ⇡ Figure 5.19: Transformation of a sequential plan into a corresponding pol 5.6 Online approaches with nondeterminist models In Chapter 1 (see Section 1.2, and specifically Section 1.6.2) we introdu the idea of interleaving planning and acting. One motivation is that, giv a complete plan that is generated o↵-line, its execution seldom works Relatively straightforward: transforms solution into a policy representation Note: p needs to be an acyclic plan To ensure this, Forward-Search (see previous slide) needs to return an acyclic plan
  • 77. 77"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Ac.on(Elimina.on( if p0 6= fail then do ⇡0 Plan2policy(p0, s) // plan2poli ⇡ ⇡ [ {(s, a) 2 ⇡0 | s 62 Dom(⇡)} else for every s0 and a such that s 2 (s0, a) do ⇡ ⇡ {(s0, a)} make the actions in the determinization of a // action eli not applicable in s0 Figure 5.18: Planning for Safe Solutions by Determinization Fragment of FindGSafeGSoluBonGbyGDeterminizaBon
 that has to do with action elimination Triggered if no deterministic solution from s
 Informally it does the following: •  Update π to ensure s is never reached •  Ensure that no deterministic solution found in a future call to ForwardG Search"returns a solution going through s
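In code, action elimination is just bookkeeping over the policy and the determinized action set. A sketch, reusing the hypothetical predecessors(s) helper and the (a, i) naming from the determinization sketch:

  def eliminate_actions(policy, banned_d, predecessors, gamma, s):
      """No plan exists from s in the determinized model: make s unreachable for good."""
      for s_prev, a in predecessors(s):            # every (state, action) with s in gamma(s_prev, a)
          if policy.get(s_prev) == a:
              del policy[s_prev]                   # update pi so s is no longer reached
          for i in range(len(gamma(s_prev, a))):   # ban every determinized copy of a in s_prev so
              banned_d.add((s_prev, (a, i)))       # future Forward-search calls cannot go through s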
  • 78. 78"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Proper.es(of(FindRSafeRSolu.onRbyR Determiniza.on( !  Finds safe solutions !  Any CSV planner can be plugged in !  Determinization needs to be done carefully ●  Could potentially lead to an exponential blowup in the number of actions
  • 79. 79"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Online(Approaches(with(Nondeterminis.c(Models( !  Interleaving planning and acting is important ●  Planning models are approximate – execution seldom works out as planned ●  Large problems mean long planning time – need to interleave the two !  This motivation even more stronger in nondeterministic domains ●  Long time needed to generate safe solutions when there are lots of state variables, actions etc !  Therefore interleaving planning and acting helps reduce complexity ●  Instead of coming up with complete policy, generate partial policy that tells us the next few actions to perform 196 Figure 5.20: O↵-line vs. Run Time Search Spaces acting and planning then we reduce significantly the sear indeed to find a partial policy, e.g., the next few ”good” or some of them, and repeat these two interleaved plannin Offline vs Runtime
 Search Spaces
  • 80. 80"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Issues(With(Interleaving(Planning(and(Ac.ng( !  Need to identify good actions without exploring entire search space ●  Can be done using heuristic estimates !  Handling Dead-ends: ●  When lookahead is not enough, can get trapped in dead ends ▸  By planning fully, we would have found out about the dead-end ▸  E.g. if robot goes down a steep incline out of which it cannot come back up ●  Not a problem in safely explorable domains ▸  Goal states reachable from all situations !  Despite these issues, interleaving planning and acting an essential alternative to purely offline planning
  • 81. 81"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" Ac.ng(Procedure:(RunRLookahead( 198 Chapter Run-Lookahead(⌃, s0, Sg) s s0 while s /2 Sg and Applicable(s) 6= ? do ⇡ Lookahead(s, ✓) apply partial plan ⇡ s observe current state Figure 5.21: Interleaving planning and execution by look-ahead There are di↵erent ways in which the generated plan can be partia and di↵erent ways in planning and acting can be interleaved. Indeed th procedure Run-Lookahead is parametric along two dimensions: The first parametric dimension is in the call to the look-ahead plannin step, i.e., Lookahead(s, ✓). The parameter ✓ determines the way in which th generated plan ⇡ is partial. For instance, it can be partial since the lookahea is bounded, i.e., the forward search is performed for a bounded number o This is where the planner is invoked. θ is a context-dependent parameter that restricts the search for a solution and hence determines how π is partial •  θ could be a bound on the search depth •  θ could be limitation on planning time •  θ could also limit the number of action outcomes considered •  Special case: only ONE outcome == FindGSoluBon( !  Two ways to perform lookahead: ●  Lookahead with a bounded number of steps: handle all action outcomes, but only upto a certain depth ●  Lookahead by determinization: solve the problem fully, but possibly unsafe due to determinization
  • 82. 82"Dana"Nau"and"Vikas"Shivashankar:"Lecture"slides"for!Automated!Planning!and!Ac0ng" Updated"4/16/15" FFRReplan:(Lookahead(by(Determiniza.on( Section 5.6 FF-Replan (⌃, s, Sg) while s /2 Sg and Applicable(s) 6= ? do if ⇡d undefined for s then do ⇡d Forward-search (⌃d, s, Sg) apply action ⇡d(s) s observe resulting state Figure 5.22: Online determinization planning and acting algorithm. lookahead and partial numebr of outcomes, in any arbitrary way. The second parametric dimension is in the application of the partial p that has been generated, i.e., apply the partial plan ⇡. Independently of lookahead, we can still execute ⇡ in a partial way. Suppose for instance t we have generated a sequential plan of length n, we can decide to ap m  n steps. Run Forward-Search on
 a determinized version of
 the problem. Then start executing
 the (possibly unsafe) policy
 until we cannot execute 
 it anymore Properties: •  If the domain is safely-explorable,
 then FFGReplan will get to a goal state. •  If the domain has dead-ends, then
 no guarantees.
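A sketch of FF-Replan under the same assumptions as the earlier snippets: forward_search stands in for any CSV planner run on the determinized model (sigma_d, gamma_d), and perform/observe stand in for the execution platform.

  def ff_replan(forward_search, sigma_d, gamma_d, perform, observe, applicable, s, goals):
      """FF-Replan: plan on the determinized model, execute, and replan from scratch
      whenever the observed state falls outside the current plan."""
      policy_d = {}
      while s not in goals and applicable(s):
          if s not in policy_d:                          # pi_d undefined for s: replan
              plan = forward_search(sigma_d, s, goals)   # classical planner on the determinization
              if not plan:
                  return False                           # stuck; with dead ends there is no guarantee
              policy_d, x = {}, s
              for a in plan:                             # Plan2policy inline (slide 76)
                  policy_d[x] = a
                  x = gamma_d(x, a)
          perform(policy_d[s])                           # the environment picks the actual outcome
          s = observe()
      return s in goals

In practice the executed action would be the underlying nondeterministic action that the determinized name (a, i) was split from; the environment, not the index i, decides which outcome actually occurs, which is exactly why replanning is needed.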