Chapter02b

Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting. Updated 2/5/15

Backward Search

●  Forward search starts at the initial state
Ø  Chooses an action that’s applicable
Ø  Computes state transition s′ = γ (s,a)
●  Backward search starts at the goal
Ø  Chooses an action that’s relevant
•  A possible “last action” before the goal
Ø  Computes inverse state transition g′ = γ⁻¹(g,a)
•  g′ = properties a state s′ should satisfy in order for γ (s′,a) to satisfy g
●  Why would we want to do this?
●  One possibility: backward search sometimes has a lower branching factor
Ø  Forward: 10 applicable actions
•  for each robot, two move actions and three load actions
Ø  Backward: g = {loc(r1)=d3}
•  2 relevant actions: move(r1,d1,d3), move(r1,d2,d3)
•  Can eliminate move(r1,d2,d3); it requires a rigid condition that’s false

[Figure: docks d1, d2, d3; robots r1, r2; containers c1 to c6]


Relevance

●  Idea: a is relevant for g if a could be the last action of a plan that achieves g
●  Definition:
Ø  Let g = {g1, …, gk} be a goal. An action a is relevant for g if
1. eff(a) makes at least one gi true, i.e., eff(a) ∩ g ≠ ∅
2. eff(a) doesn’t make any gi false
▸  ∀ x, c, c′, if (x,c) ∈ eff(a) and (x,c′) ∈ g then c = c′
3. pre(a) doesn’t require any gi to be false unless eff(a) makes gi true
▸  ∀ x, c, c′, if (x,c) ∈ pre(a) and (x,c′) ∈ g – eff(a) then c = c′
●  What actions are relevant for loc(c1)=r2?
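The three relevance conditions can be checked mechanically. The following is a minimal Python sketch, assuming that goals, preconditions, and effects are dictionaries from state-variable names to values. The `Action` class and the preconditions and effects given to `load` are assumptions made for illustration; the slides in this section do not define the operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: dict  # state variable -> required value
    eff: dict  # state variable -> assigned value


def is_relevant(a, g):
    """Check the three relevance conditions for action a and goal g."""
    # 1. eff(a) makes at least one goal assignment true
    makes_true = any(x in a.eff and a.eff[x] == c for x, c in g.items())
    # 2. eff(a) doesn't make any goal assignment false
    no_conflict = all(a.eff[x] == c for x, c in g.items() if x in a.eff)
    # 3. pre(a) agrees with every goal assignment that eff(a) leaves untouched
    pre_ok = all(a.pre[x] == c for x, c in g.items()
                 if x in a.pre and x not in a.eff)
    return makes_true and no_conflict and pre_ok


def load(r, c, l):
    # Hypothetical operator definition, assumed for this example only.
    return Action(f"load({r},{c},{l})",
                  pre={f"cargo({r})": "nil", f"loc({c})": l, f"loc({r})": l},
                  eff={f"cargo({r})": c, f"loc({c})": r})


g = {"loc(c1)": "r2"}
print(is_relevant(load("r2", "c1", "d3"), g))  # True
print(is_relevant(load("r1", "c1", "d1"), g))  # False
```

Under these assumptions, loading c1 onto r2 is relevant, while loading c1 onto r1 is not, because its effect loc(c1)=r1 contradicts the goal.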

[Figure: docks d1, d2, d3; robots r1, r2; containers c1 to c6]


Inverse State Transitions

●  If a is relevant for achieving g, then γ⁻¹(g,a) = pre(a) ∪ (g – eff(a))
●  If a isn’t relevant for g, then γ⁻¹(g,a) is undefined
●  Example:
Ø  g = {loc(c1)=r2}
Ø  What is γ⁻¹(g, load(r2,c1,d3))?
Ø  What is γ⁻¹(g, load(r1,c1,d1))?
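The definition translates almost directly into code. This is a small Python sketch, assuming goals, preconditions, and effects are dictionaries from state variables to values, and using `None` to stand for "undefined". The preconditions and effects used for `load` below are assumed for illustration; they are not given on these slides.

```python
def gamma_inverse(g, pre, eff):
    """Return pre(a) ∪ (g − eff(a)), or None when a is not relevant for g."""
    achieves = any(x in eff and eff[x] == c for x, c in g.items())
    conflicts = any(x in eff and eff[x] != c for x, c in g.items())
    # Goal assignments that the action's effects do not touch
    remaining = {x: c for x, c in g.items() if x not in eff}
    clash = any(x in pre and pre[x] != c for x, c in remaining.items())
    if not achieves or conflicts or clash:
        return None  # undefined: the action is not relevant
    return {**remaining, **pre}


# Assumed operator instance load(r2,c1,d3)
pre_r2 = {"cargo(r2)": "nil", "loc(c1)": "d3", "loc(r2)": "d3"}
eff_r2 = {"cargo(r2)": "c1", "loc(c1)": "r2"}
# Assumed operator instance load(r1,c1,d1)
pre_r1 = {"cargo(r1)": "nil", "loc(c1)": "d1", "loc(r1)": "d1"}
eff_r1 = {"cargo(r1)": "c1", "loc(c1)": "r1"}

g = {"loc(c1)": "r2"}
print(gamma_inverse(g, pre_r2, eff_r2))  # the preconditions of load(r2,c1,d3)
print(gamma_inverse(g, pre_r1, eff_r1))  # None
```

With these assumed operator definitions, the first answer is just pre(a), because the only goal assignment is produced by the action's effects; the second is undefined because the action is not relevant.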

[Figure: docks d1, d2, d3; robots r1, r2; containers c1 to c6]


Backward Search

For cycle checking:
●  After line 1, put
Solved = {g}
●  After line 6, put
if g′ ∈ Solved then return failure

Solved ← Solved ∪ {g′}
●  More powerful:
if ∃g ∈ Solved s.t. g ⊆ g′ then return failure
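The stronger test is a subset check: if some goal g already on the current path is contained in the new goal g′, then g′ is at least as hard as g and the path can be cut. A minimal Python sketch, assuming goals are represented as frozensets of (variable, value) pairs; the function name is hypothetical:

```python
def subsumed_by_earlier_goal(g_new, solved):
    """True if some earlier goal g satisfies g ⊆ g_new (covers g == g_new too)."""
    return any(g <= g_new for g in solved)


solved = [frozenset({("loc(r1)", "d3")})]
harder = frozenset({("loc(r1)", "d3"), ("cargo(r1)", "nil")})
different = frozenset({("loc(r1)", "d1")})

print(subsumed_by_earlier_goal(harder, solved))     # True: prune
print(subsumed_by_earlier_goal(different, solved))  # False: keep searching
```

Because every set is a subset of itself, this check also catches the exact repeats detected by the simpler membership test.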

●  Sound and complete
Ø  If a planning problem is solvable, then at least one of Backward-search’s nondeterministic execution traces will find a solution
[Figure: backward search from the goal g through subgoals g1 to g5 via actions a1 to a5, toward s0]


Branching Factor

●  Our motivation for Backward-search was to focus the search
Ø  But as written, it doesn’t really accomplish that
●  Solve this by lifting
Ø  Leave y uninstantiated
[Figure: for g = {loc(r1)=d3}, the ground actions move(r1,d1,d3), move(r1,d2,d3), move(r1,d4,d3), ..., move(r1,d20,d3) compared with the single lifted action move(r1,y,d3)]

[Figure: docks d1, d2, d3; robots r1, r2; containers c1 to c6]


Lifted Backward Search

●  Like Backward-search but more complicated
Ø  Have to keep track of what values were substituted for which parameters
Ø  But it has a much smaller branching factor
●  I won’t discuss the details


Plan-Space Planning

●  Another approach
Ø  formulate planning as a constraint satisfaction problem
Ø  use constraint-satisfaction techniques to produce solutions that are more flexible than ordinary plans
•  E.g., plans in which the actions are partially ordered
•  Postpone ordering decisions until the plan is being executed
▸  the actor may have a better idea about which ordering is best
●  First step toward planning concurrent execution of actions (Chapter 4)
Outline:
•  Basic idea
•  Open goals
•  Threats
•  The PSP algorithm
•  Long example
•  Comments


Plan-Space Planning - Basic Idea

●  Backward search from the goal
●  Each node of the search space is a partial plan, π
•  A set of partially-instantiated actions
•  Constraints on the actions
Ø  Keep making refinements, until we have a solution
●  Types of constraints:
Ø  precedence constraints
•  indicated by solid arcs
Ø  binding constraints
•  inequality constraints, e.g., z ≠ x or w ≠ p1
Ø  causal links:
•  indicated by dashed arcs
•  use effect e of action a to establish precondition p of action b
●  How to tell we have a solution: no more flaws in the plan
Ø  Two kinds of flaws …
[Figure: partial plan with foo(x) (Pre: …; Eff: loc(x)=p1), bar(x) (Pre: loc(x)=p1; Eff: …), baz(z) (Pre: loc(z)=p2; Eff: …), and the binding constraint z ≠ x]


Flaws: 1. Open Goals

●  A precondition p of an action b is an open goal if there is no causal link for p
●  Resolve the flaw by creating a causal link
Ø  Find an action a (either already in π, or can add it to π) that can establish p
•  can precede b
•  can have p as an effect
Ø  Do substitutions on variables to make a assert p
•  e.g., replace x with y
Ø  Add an ordering constraint a ≺ b
Ø  Create a causal link from a to p
[Figure: bar(y) has Pre: loc(y)=p1 and foo(x) has Eff: loc(x)=p1; after substituting y for x, foo(y) has Eff: loc(y)=p1 and a causal link to the precondition of bar(y)]


Flaws: 2. Threats

●  Suppose we have a causal link from action a to precondition p of action b
●  Action c threatens the link if c may affect p and may come between a and b
Ø  c is a threat even if it makes p true rather than false
•  Causal link means a, not c, is supposed to establish p for b
•  The plan in which c establishes p will be generated on another path in the search space
●  Three possible ways to resolve the flaw:
Ø  Require c ≺ a
Ø  Require b ≺ c
Ø  Constrain variable(s) to prevent c from affecting p
[Figure: causal link from foo(y) to bar(y), threatened by clobber(z)]
Ø  State variables: foo(y) Eff: loc(y)=p1; bar(y) Pre: loc(y)=p1; clobber(z) Eff: loc(z)=p2
Ø  Classical: foo(y) Eff: loc(y,p1); bar(y) Pre: loc(y,p1); clobber(z) Eff: ¬loc(z,p1)


PSP Algorithm

●  Initial plan is always {Start, Finish} with Start ≺ Finish
Ø  Start has no preconditions; effects are the initial state s0
Ø  Finish has no effects; its precondition is the goal g
●  PSP is sound and complete
Ø  It returns a partially ordered solution π such that any total ordering of π will achieve g
Ø  In some environments, could execute actions in parallel
[Figure: initial plan with Start (Eff: s0) ≺ Finish (Pre: g)]


Example

move(c, y, z)
pre: pos(c)=y, clear(c)=T, clear(z)=T
eff: pos(c)←z, clear(y)←T, clear(z)←F
Range(c) = Containers; Range(y) = Range(z) = Containers ∪ Pallets

●  Finish has two open goals: pos(a)=d, pos(b)=c
Ø  Start, Eff: clear(p1)=T, clear(p2)=T, clear(p3)=F, clear(p4)=F, clear(a)=F, clear(b)=F, clear(c)=T, clear(d)=T, pos(a)=p3, pos(b)=p4, pos(c)=b, pos(d)=a
Ø  Finish, Pre: pos(a)=d, pos(b)=c
[Figure: pallets p1 to p4 and containers a, b, c, d; initially d is on a on p3 and c is on b on p4; in the goal a is on d and b is on c]


Example

●  For each open goal, add a new action
Ø  Every new action a must have Start ≺ a, a ≺ Finish
[Figure: partial plan with Start, move(a,y1,d) (Pre: pos(a)=y1, clear(a)=T, clear(d)=T), move(b,y2,c) (Pre: pos(b)=y2, clear(b)=T, clear(c)=T), and Finish (Pre: pos(a)=d, pos(b)=c)]


Example

●  Resolve four more open goals: bind y1=p3, y2=p4
[Figure: partial plan with Start, move(a,p3,d) (Pre: pos(a)=p3, clear(a)=T, clear(d)=T), move(b,p4,c) (Pre: pos(b)=p4, clear(b)=T, clear(c)=T), and Finish (Pre: pos(a)=d, pos(b)=c)]


Example

●  1st threat requires z3≠d
●  2nd threat has two resolvers:
Ø  move(b,p4,c) ≺ move(x3,a,z3)
Ø  z3≠c
[Figure: partial plan extended with move(x3,a,z3) (Pre: pos(x3)=a, clear(x3)=T, clear(z3)=T), alongside move(a,p3,d) and move(b,p4,c)]


Example

●  Threats resolved
[Figure: partial plan with move(x3,a,z3), move(a,p3,d), and move(b,p4,c); binding constraints z3≠c, z3≠d]


Example

●  1st threat has two resolvers:
Ø  An ordering constraint, and z4≠d
●  2nd threat has three resolvers:
Ø  Two ordering constraints, and z4≠a
●  3rd threat has one: z4≠c
[Figure: partial plan extended with move(x4,b,z4) (Pre: pos(x4)=b, clear(x4)=T, clear(z4)=T); binding constraints z3≠c, z3≠d]


Example

●  Resolve the three threats using the binding constraints
[Figure: partial plan with move(x3,a,z3) and move(x4,b,z4); binding constraints z3≠c, z3≠d, z4≠a, z4≠c, z4≠d]


Example

●  Resolve five open goals
Ø  Bind x3=d, x4=c, z3=p1
[Figure: partial plan with move(d,a,p1) (Pre: pos(d)=a, clear(d)=T, clear(p1)=T) and move(c,b,z4) (Pre: pos(c)=b, clear(c)=T, clear(z4)=T); binding constraints p1≠c, p1≠d, z4≠a, z4≠c, z4≠d]


Example

●  Threatened causal link
●  Resolvers:
Ø  move(d,a,p1) ≺ move(c,b,z4)
Ø  z4≠p1
[Figure: partial plan with move(d,a,p1) and move(c,b,z4); binding constraints z4≠a, z4≠c, z4≠d]


Example

●  Threat resolved
[Figure: partial plan with move(d,a,p1) and move(c,b,z4); binding constraints z4≠a, z4≠c, z4≠d, z4≠p1]


Example

●  Resolve open goal
Ø  bind z4=p2
●  No more flaws, so we’re done!
[Figure: partial plan with move(d,a,p1) (Pre: pos(d)=a, clear(d)=T, clear(p1)=T) and move(c,b,p2) (Pre: pos(c)=b, clear(c)=T, clear(p2)=T); binding constraints p2≠a, p2≠c, p2≠d, p2≠p1]


Example

●  PSP returns this solution:
Ø  Actions: move(d,a,p1), move(c,b,p2), move(a,p3,d), move(b,p4,c), partially ordered between Start and Finish


Example

●  Go back to the last threat
●  Resolvers:
Ø  move(d,a,p1) ≺ move(c,b,z4)
Ø  z4≠p1
[Figure: partial plan with move(d,a,p1) and move(c,b,z4); binding constraints z4≠a, z4≠c, z4≠d]


Example

●  Threat resolved
Ø  This time with the ordering constraint move(d,a,p1) ≺ move(c,b,z4)
[Figure: partial plan with move(d,a,p1) and move(c,b,z4); binding constraints z4≠a, z4≠c, z4≠d]


Example

●  Resolve open goal
Ø  bind z4=p2
●  No more flaws, so we’re done
[Figure: partial plan with move(d,a,p1) (Pre: pos(d)=a, clear(d)=T, clear(p1)=T) and move(c,b,p2) (Pre: pos(c)=b, clear(c)=T, clear(p2)=T)]


Example

●  Same solution as before, but with another ordering constraint
Ø  Actions: move(d,a,p1), move(c,b,p2), move(a,p3,d), move(b,p4,c)


Node-Selection Heuristics

●  Analogy to constraint-satisfaction problems
Ø  Resolving a flaw in PSP ≈ assigning a value to a variable in a CSP
●  What flaw to work on next?
Ø  Fewest Alternatives First (FAF): the flaw with the fewest resolvers
•  ≈ Minimum Remaining Values (MRV) heuristic for CSPs
●  To resolve the flaw, which resolver to try first?
Ø  Least Constraining Resolver (LCR): the resolver that rules out the fewest resolvers for the other flaws
•  ≈ Least Constraining Value (LCV) heuristic for CSPs
●  In PSP, introducing a new action introduces new flaws to resolve
Ø  The plan can get arbitrarily large; want it to be as small as possible
•  Not like CSPs, where the search tree always has a fixed depth
●  Avoid introducing new actions unless necessary
●  To choose between actions a and b, estimate distance from s0 to Pre(a) and Pre(b)
Ø  We’ll discuss some heuristics for that later
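Fewest Alternatives First amounts to taking a minimum over the current flaws, keyed by the number of resolvers. A minimal Python sketch, assuming the flaws are held in a dictionary from a flaw label to its list of resolvers (a representation chosen here for illustration, not one prescribed by PSP):

```python
def fewest_alternatives_first(flaws):
    """Pick the flaw with the fewest resolvers (ties go to the first one found)."""
    return min(flaws, key=lambda name: len(flaws[name]))


# Three threats with two, three, and one resolvers respectively
flaws = {
    "threat 1": ["ordering constraint", "z4≠d"],
    "threat 2": ["ordering constraint 1", "ordering constraint 2", "z4≠a"],
    "threat 3": ["z4≠c"],
}
print(fewest_alternatives_first(flaws))  # threat 3
```

A flaw with a single resolver involves no real choice, so handling it first avoids branching that would otherwise have to be repeated under every alternative of the other flaws.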


Example

●  Example of Fewest Alternatives First:
Ø  1st threat has two resolvers: an ordering constraint, and z4≠d
Ø  2nd threat has three resolvers: 2 ordering constraints, and z4≠a
Ø  3rd threat has one resolver: z4≠c
●  So resolve the 3rd threat first
[Figure: partial plan with move(x3,a,z3) and move(x4,b,z4); binding constraints z3≠c, z3≠d]


Discussion

●  Problem: how to prune infinitely long paths in the search space?
Ø  Loop detection is based on recognizing states or goals we’ve seen before
Ø  In a partially ordered plan, we don’t know the states
●  Can we prune a path if we see the same action more than once?
Ø  No. Sometimes we might need the same action several times in different states of the world
Ø  Example on next slide
[Figure: s → act1 → s′ → act2 → s → act1 → …]


Example

●  3-digit binary counter d3 d2 d1
Ø  s0 = {d3=0, d2=0, d1=0}, i.e., 0 0 0
Ø  g = {d3=1, d2=1, d1=1}, i.e., 1 1 1
●  Actions to increment the counter
•  incr-xx0-to-xx1
Pre: d1=0
Eff: d1=1
•  incr-x01-to-x10
Pre: d2=0, d1=1
Eff: d2=1, d1=0
•  incr-011-to-100
Pre: d3=0, d2=1, d1=1
Eff: d3=1, d2=0, d1=0
●  Plan:
                          d3  d2  d1
s0:                        0   0   0
incr-xx0-to-xx1   →        0   0   1
incr-x01-to-x10   →        0   1   0
incr-xx0-to-xx1   →        0   1   1
incr-011-to-100   →        1   0   0
incr-xx0-to-xx1   →        1   0   1
incr-x01-to-x10   →        1   1   0
incr-xx0-to-xx1   →        1   1   1
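The counter can be simulated to confirm that the same action has to appear several times in the plan. The sketch below encodes the three actions as (precondition, effect) dictionaries and follows the applicable action from each state; it relies on the observation that in this domain exactly one action is applicable in every non-goal state. The function names are chosen here for illustration.

```python
from collections import Counter

ACTIONS = {
    "incr-xx0-to-xx1": ({"d1": 0}, {"d1": 1}),
    "incr-x01-to-x10": ({"d2": 0, "d1": 1}, {"d2": 1, "d1": 0}),
    "incr-011-to-100": ({"d3": 0, "d2": 1, "d1": 1}, {"d3": 1, "d2": 0, "d1": 0}),
}


def applicable(s, name):
    pre, _ = ACTIONS[name]
    return all(s[x] == c for x, c in pre.items())


def run_counter(s0, g):
    """Apply the single applicable action repeatedly until g holds."""
    s, plan = dict(s0), []
    while any(s[x] != c for x, c in g.items()):
        choices = [n for n in ACTIONS if applicable(s, n)]
        if len(choices) != 1:
            raise ValueError(f"expected exactly one applicable action in {s}")
        s.update(ACTIONS[choices[0]][1])  # apply the action's effects
        plan.append(choices[0])
    return plan


plan = run_counter({"d3": 0, "d2": 0, "d1": 0}, {"d3": 1, "d2": 1, "d1": 1})
print(len(plan))                          # 7
print(Counter(plan)["incr-xx0-to-xx1"])   # 4
```

The seven-step plan uses incr-xx0-to-xx1 four times, each time in a different state, so pruning on repeated actions would discard the only solution.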


A Weak Pruning Technique

●  Can prune all partial plans of n or more actions, where n = |{all possible states}|
Ø  This doesn’t help very much
●  I’m not sure whether there’s a good pruning technique for plan-space planning


Planning with Control Rules

Motivation:
●  Given a state s and an action a
●  Sometimes domain-specific tests can tell us we don’t want to use a, e.g.,
Ø  a doesn’t lead to a solution
Ø  or a is dominated
•  there’s a better solution along some other path
Ø  or a doesn’t lead to a solution that’s acceptable according to domain-specific criteria
●  In such cases we can prune a (remove it from Act)
●  Approach:
Ø  Write logical formulas giving conditions that states must satisfy
Ø  Prune states that don’t satisfy the formulas


Quick Review of First Order Logic

First Order Logic (FOL):
●  Syntax:
Ø  atomic formulas (or atoms)
•  predicate symbol with arguments, e.g., clear(c)
•  include ‘=’ as a binary predicate symbol, e.g., loc(r1)=d1
Ø  logical connectives (∨, ∧, ¬, ⇒, ⇔), quantifiers (∀, ∃), punctuation
•  e.g., (loc(r1)=d1 ∧ ∀c clear(c)) ⇒ ¬∃c loc(c)=r1

●  First Order Theory T:
Ø  “Logical” axioms and inference rules – encode logical reasoning in general
Ø  Additional “nonlogical” axioms – talk about a particular domain
Ø  Theorems: produced by applying the axioms and rules of inference
●  Model: a set of objects, functions, relations that the symbols refer to
Ø  For our purposes, a model is a state of the world s
Ø  In order for s to be a model, all theorems of T must be true in s
Ø  s ⊨ loc(r1)=d1 read “s satisfies loc(r1)=d1” or “s entails loc(r1)=d1”
•  means that r1 is at d1 in the state s


Linear Temporal Logic

●  Modal logic: FOL plus modal operators to express concepts that would be difficult to express within FOL
●  Linear Temporal Logic (LTL):
Ø  Purpose: to express a limited notion of time
•  Infinite sequence 〈0, 1, 2, …〉 of time instants
•  Infinite sequence M = 〈s0, s1, …〉 of states of the world
Ø  Modal operators to refer to states in M:
•  X f “next f ” - f is true in the next state, e.g., X loc(a)=b
•  F f “future f ” - f either is true now or in some future state
•  G f “globally f ” - f is true now and in all future states
•  f1 U f2 “f1 until f2” - f2 is true now or in a future state, and f1 is true until then
Ø  Propositional constant symbols True and False
•  Instead of T and F, to avoid confusion with the F operator


Linear Temporal Logic (continued)

●  Quantifiers cause problems with computability
Ø  Suppose f(x) is true for infinitely many values of x
Ø  Problem evaluating truth of ∀x f(x) and ∃x f(x)
●  Bounded quantifiers
Ø  Let g(x) be such that {x | g(x) is true} is finite and easily computed
•  ∀[x: g(x)] f(x)
▸  means ∀x (g(x) ⇒ f(x))
▸  expands into f(x1) ∧ f(x2) ∧ … ∧ f(xn)
•  ∃[x: g(x)] f(x)
▸  means ∃x (g(x) ∧ f(x))
▸  expands into f(x1) ∨ f(x2) ∨ … ∨ f(xn)
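Bounded quantifiers reduce to finite conjunctions and disjunctions, so they can be evaluated by plain iteration. A minimal Python sketch, assuming g and f are ordinary Boolean functions and that a finite candidate domain is available; the example state is an assumption made for illustration:

```python
def forall_bounded(domain, g, f):
    """∀[x: g(x)] f(x): f must hold for every x in the domain that satisfies g."""
    return all(f(x) for x in domain if g(x))


def exists_bounded(domain, g, f):
    """∃[x: g(x)] f(x): f must hold for some x in the domain that satisfies g."""
    return any(f(x) for x in domain if g(x))


# Assumed example state: a and b are on the floor, c is on b
loc = {"a": "floor", "b": "floor", "c": "b"}
clear_set = {"a", "c"}
containers = ["a", "b", "c"]


def clear(x):
    return x in clear_set


def on_floor(x):
    return loc[x] == "floor"


print(forall_bounded(containers, clear, on_floor))  # False: c is clear but sits on b
print(exists_bounded(containers, clear, on_floor))  # True: a is clear and on the floor
```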


Notation

●  We can use state-variable assignments as logical propositions in LTL formulas
Ø  G (∀[x: clear(x)=T] final(x)=T ⇒ X(clear(x)=T ∨ ∃[y: loc(y)=x] final(y)=T))
●  For Boolean state variables, simpler to write them as logical propositions
•  Instead of clear(x)=T, just write clear(x)
•  Instead of clear(x)=F, write ¬clear(x)
Ø  G (∀[x: clear(x)] final(x) ⇒ X(clear(x) ∨ ∃[y: loc(y)=x] final(y)))


Example

●  The “container stacking” domain
Ø  Based on a classical planning domain called the “blocks world”

pickup(x)
pre: loc(x)=floor, clear(x), holding=nil
eff: loc(x)=crane, ¬clear(x), holding=x

putdown(x)
pre: holding=x
eff: holding=nil, loc(x)=floor, clear(x)

stack(x,y)
pre: holding=x, clear(y)
eff: holding=nil, ¬clear(y), loc(x)=y, clear(x)

unstack(x,y)
pre: loc(x)=y, clear(x), holding=nil
eff: loc(x)=crane, ¬clear(x), holding=x, clear(y)

[Figure: two states of containers a, b, c, d, e]
Ø  First state: clear(e), loc(e)=d, loc(d)=floor, clear(c), loc(c)=a, loc(a)=floor, clear(b), loc(b)=floor, holding=nil
Ø  Second state: clear(e), loc(e)=d, loc(d)=floor, clear(c), loc(c)=a, loc(a)=floor, ¬clear(b), loc(b)=crane, holding=b


Models for Planning with LTL

●  A model is a pair M = (M, si)
Ø  M = 〈s0, s1, …〉 is a sequence of states
Ø  si is the i’th state in M
●  For planning, we also have a goal g = {g1, …, gn}
Ø  To reason about it, add a modal operator called “Goal”
•  Not part of ordinary LTL, but I’ll call it LTL anyway
Ø  In an LTL formula, use “Goal(gi)” to refer to part of g
•  ((M,si), g) ⊨ Goal(gi) iff g ⊨ gi
●  Planning problem:
Ø  Initial state s0, a goal g, control formula f
Ø  Find a plan π = 〈a1, …, an〉 that generates a sequence of states M = 〈s0, s1, …, sn〉 such that M ⊨ f and sn ⊨ g
•  That’s not quite correct
•  Do you know why?


Models for Planning with LTL

●  M needs to be an infinite sequence
●  Kluge: assume that the final state repeats infinitely after the plan ends
●  Planning problem:
Ø  Initial state s0, a goal g, control formula f
Ø  Find a plan π = 〈a1, …, an〉 that generates a sequence of states M = 〈s0, s1, …, sn, sn, sn, …〉 such that M ⊨ f and sn ⊨ g
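Repeating the final state forever makes it possible to evaluate LTL formulas on a finite list of states: every suffix starting at or after the last index is the same sequence, so positions past the end can be clamped to the last index. The sketch below assumes formulas are nested tuples such as `("G", ("atom", "loc(a)", "b"))` and states are dictionaries; this encoding is an assumption for illustration and it omits quantifiers and the Goal operator.

```python
def holds(f, states, i=0):
    """Does (M, s_i) satisfy f, where M is `states` with its last state repeated?"""
    n = len(states) - 1
    i = min(i, n)  # every position past the end behaves like the final state
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "atom":                      # ("atom", variable, value)
        return states[i].get(f[1]) == f[2]
    if op == "not":
        return not holds(f[1], states, i)
    if op == "and":
        return holds(f[1], states, i) and holds(f[2], states, i)
    if op == "or":
        return holds(f[1], states, i) or holds(f[2], states, i)
    if op == "X":
        return holds(f[1], states, i + 1)
    rest = range(i, n + 1)                # positions i..n cover all distinct suffixes
    if op == "F":
        return any(holds(f[1], states, j) for j in rest)
    if op == "G":
        return all(holds(f[1], states, j) for j in rest)
    if op == "U":
        return any(holds(f[2], states, j)
                   and all(holds(f[1], states, k) for k in range(i, j))
                   for j in rest)
    raise ValueError(f"unknown operator {op!r}")


M = [{"loc(a)": "floor"}, {"loc(a)": "crane"}, {"loc(a)": "b"}]
on_b = ("atom", "loc(a)", "b")
print(holds(("X", ("X", on_b)), M))   # True: a is on b two states from s0
print(holds(("G", on_b), M))          # False: a is not on b in s0
print(holds(("F", ("G", on_b)), M))   # True: from s2 onward a stays on b
```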


Examples

●  Suppose M = 〈s0, s1, …〉
●  (M,s0) ⊨ XX loc(a)=b
Ø  a is on b in state s2
●  Abbreviation: can omit the state, it defaults to s0
Ø  M ⊨ XX loc(a)=b means (M,s0) ⊨ XX loc(a)=b
●  Since loc(a)=b has no modal operators
Ø  (M,s2) ⊨ loc(a)=b is equivalent to s2 ⊨ loc(a)=b
●  M ⊨ G holding≠c
Ø  in every state in M, we aren’t holding c
●  M ⊨ G (clear(b) ⇒ (clear(b) U loc(a)=b))
Ø  whenever we enter a state in which b is clear, b remains clear until a is on b


TLPlan

●  Nondeterministic forward search
Ø  s = current state, f = control formula, g = goal
●  If s satisfies g then we’re done
●  Otherwise, think about what kind of plan we need
Ø  It must generate a sequence of states M = 〈s, s+, s++, …〉 that satisfies f
●  Compute a formula f + such that
(M,s) ⊨ f iff (M,s+) ⊨ f +
●  Fail if f + = False
Ø  No matter what s+ is, (M,s+) can’t satisfy f +
●  Fail if no applicable actions
●  Otherwise, nondeterministically choose one, compute s+, and call TLPlan with s+ and f +

TLPlan(s, f, g)
    if s satisfies g then return ⟨ ⟩
    f + ← Progress(f, s)
    if f + = False then return failure
    A ← {actions applicable to s}
    if A is empty then return failure
    nondeterministically choose a ∈ A
    π+ ← TLPlan(γ (s,a), f +, g)
    if π+ ≠ failure then return a.π+
    return failure
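A deterministic version of this procedure replaces the nondeterministic choice with backtracking over the applicable actions. The sketch below is one such rendering under stated assumptions: `progress` is supplied by the caller, actions are (applicability test, transition function) pairs, and a `depth` bound is added to keep the depth-first search finite, which is not part of the pseudocode. The toy domain and its control formula are made up for illustration.

```python
def tlplan(s, f, g, actions, progress, depth):
    """Depth-first TLPlan sketch; returns a list of action names or None."""
    if g(s):
        return []
    if depth == 0:                 # assumed cutoff, not in the original procedure
        return None
    f_next = progress(f, s)
    if f_next is False:            # the control formula can no longer be satisfied
        return None
    for name, (applicable, apply) in actions.items():
        if applicable(s):
            rest = tlplan(apply(s), f_next, g, actions, progress, depth - 1)
            if rest is not None:
                return [name] + rest
    return None                    # no applicable action led to a solution


# Toy domain: the state is an integer; the control rule says "never go negative".
actions = {
    "dec": (lambda s: True, lambda s: s - 1),
    "inc": (lambda s: True, lambda s: s + 1),
}
NEVER_NEGATIVE = "G (s >= 0)"      # opaque label; its progression is hard-coded below


def progress(f, s):
    return f if s >= 0 else False


plan = tlplan(0, NEVER_NEGATIVE, lambda s: s == 2, actions, progress, depth=2)
print(plan)  # ['inc', 'inc']
```

The branch that starts with dec reaches the state -1, where progression yields False, so that branch is abandoned before any further actions are tried.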


Progression

●  Procedure Progress(f,s)
Ø  Compute the formula f + that M + must satisfy
◆  Case:
1. f contains no temporal ops : f + ← True if s ⊨ f, False otherwise
2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s)
3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s)
4. f = ¬ f1 : f + ← ¬Progress(f1, s)
5. f = X f1 : f + ← f1
6. f = F f1 : f + ← Progress(f1, s) ∨ f
7. f = G f1 : f + ← Progress(f1, s) ∧ f
8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f)
9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)
10. f = ∃[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)
◆  simplify f + and return it
•  False ∧ h = False, True ∧ h = h, ¬False = True, etc.
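The case analysis maps onto a short recursive function. The following Python sketch covers cases 1 through 8, assuming formulas are nested tuples such as `("U", f1, f2)` with atoms of the form `("atom", variable, value)` and states given as dictionaries. The encoding and helper names are assumptions for illustration, and the bounded quantifiers of cases 9 and 10 are left out.

```python
def conj(a, b):
    """a ∧ b with the simplifications False ∧ h = False and True ∧ h = h."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)


def disj(a, b):
    """a ∨ b with the simplifications True ∨ h = True and False ∨ h = h."""
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)


def neg(a):
    return (not a) if isinstance(a, bool) else ("not", a)


def progress(f, s):
    """Return f+, the formula the rest of the state sequence must satisfy."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "atom":                       # case 1, applied to a single atom
        return s.get(f[1]) == f[2]
    if op == "and":
        return conj(progress(f[1], s), progress(f[2], s))
    if op == "or":
        return disj(progress(f[1], s), progress(f[2], s))
    if op == "not":
        return neg(progress(f[1], s))
    if op == "X":
        return f[1]
    if op == "F":
        return disj(progress(f[1], s), f)
    if op == "G":
        return conj(progress(f[1], s), f)
    if op == "U":
        return disj(progress(f[2], s), conj(progress(f[1], s), f))
    raise ValueError(f"unknown operator {op!r}")


# Assumed state: a and b are on the floor, c is on b
s = {"loc(a)": "floor", "loc(b)": "floor", "loc(c)": "b",
     "clear(a)": True, "clear(c)": True}
a_on_c = ("atom", "loc(a)", "c")
print(progress(("and", ("atom", "clear(c)", True), ("X", a_on_c)), s))  # ('atom', 'loc(a)', 'c')
print(progress(("G", a_on_c), s))                                       # False
```

Representing formulas as plain tuples keeps progressed formulas comparable with `==`, which makes the simplification step easy to check.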


Progressing ordinary formulas

◆  Case:
1. f contains no temporal ops : f + ← True if s ⊨ f, False otherwise
Example:
●  f = loc(a)=b
◆  if a is currently on b, then True (every possible M + is OK)
◆  otherwise False (there is no M + that’s OK)


Progressing X

◆  Case:
5. f = X f1 : f + ← f1
Examples:
●  f = X loc(a)=b
Ø  in the next state, a must be on b
Ø  f + = loc(a)=b
●  f = XX loc(a)=b
Ø  two states from now, a must be on b
Ø  f + = X loc(a)=b


Progressing ∧

◆  Case:
2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s)
Example:
●  f = clear(c) ∧ X loc(a)=c
Ø  c must be clear now, and a must be on c in the next state
●  f + = Progress(clear(c), s) ∧ Progress(X loc(a)=c, s)
= True ∧ loc(a)=c
= loc(a)=c
[Figure: the current state s of containers a, b, c]


Progressing G

◆  Case:
7. f = G f1 : f + ← Progress(f1, s) ∧ f
Example:
●  f = G loc(a)=c
Ø  a must be on c now and must stay there in the future
●  f + = Progress(loc(a)=c, s) ∧ f
= False ∧ G loc(a)=c
= False
[Figure: the current state s of containers a, b, c]


Progressing U

◆  Case:
8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f)
Example:
●  f = loc(a)=b U clear(c)
Ø  c must be clear, or a must be on b and stay there until c is clear
●  f + = Progress(clear(c), s) ∨ [Progress(loc(a)=b, s) ∧ f ]
= True ∨ [False ∧ (loc(a)=b U clear(c))]
= True
[Figure: the current state s of containers a, b, c]


Progressing ∀

◆  Case:
9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)
•  xi is the i’th element of {x | s ⊨ g(x)}
Example:
●  f = ∀[x: clear(x)] X loc(x)=floor
Ø  {x | clear(x)} = {a, c}
●  f + = Progress(X loc(a)=floor, s) ∧ Progress(X loc(c)=floor, s)
= loc(a)=floor ∧ loc(c)=floor
[Figure: the current state s of containers a, b, c]


Progressing ∃

◆  Case:
10. f = ∃[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)
•  xi is the i’th element of {x | s ⊨ g(x)}
Example:
●  f = ∃[x: clear(x)] X loc(x)=floor
Ø  {x | clear(x)} = {a, c}
●  f + = Progress(X loc(a)=floor, s) ∨ Progress(X loc(c)=floor, s)
= loc(a)=floor ∨ loc(c)=floor
[Figure: the current state s of containers a, b, c]




Example Planning Problem

●  s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b}
●  g = {loc(b)=a}
●  f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)
Ø  never pick up a clear container from the floor unless it needs to be elsewhere
●  Run the TLPlan algorithm
●  Compute f +
Ø  Return failure if f + = False
●  Two applicable actions: pickup(a) and unstack(c,b)
Ø  Which one to use?
●  Try using pickup(a)
Ø  Call TLPlan recursively with γ (s, pickup(a)) and f +
●  If TLPlan returns failure, then try unstack(c,b)
[Figure: s0, in which a and b are on the floor and c is on b; g, in which b is on a]


Example Planning Problem

●  s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b}
●  g = {loc(b)=a}
●  f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)
●  f + = Progress(G f1, s) = Progress(f1, s) ∧ f = Progress(∀[x: clear(x)] h(x), s) ∧ f
= Progress(h(a) ∧ h(c), s) ∧ f
= Progress(h(a), s) ∧ Progress(h(c), s) ∧ f
Ø  where f1 = ∀[x: clear(x)] h(x) and h(x) = loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x
•  Progress(h(a), s) = Progress(loc(a)≠floor ∨ ∃[y: Goal(loc(a)=y)] ∨ X holding≠a, s)
= False ∨ False ∨ holding≠a
= holding≠a
•  Progress(h(c), s) = Progress(loc(c)≠floor ∨ ∃[y: Goal(loc(c)=y)] ∨ X holding≠c, s)
= True ∨ False ∨ holding≠c
= True
●  f + = holding≠a ∧ True ∧ f = holding≠a ∧ f
●  Two applicable actions: pickup(a) and unstack(c,b)
Ø  s1 = γ (s, pickup(a)): Progress(f +, s1) = False ⇒ backtrack
Ø  s2 = γ (s, unstack(c,b)): Progress(f +, s2) = holding≠a ∧ f ⇒ keep going
[Figure: s0, in which a and b are on the floor and c is on b; g, in which b is on a]


Container-Stacking Problems

●  Define an inferred state variable final(x) ∈ Booleans, where x is a container
•  Never directly changed by any planning operator
•  Produced by logical inference from the other state variables
●  Want final(x) to mean x is at the top of a stack that we’re finished moving
Ø  Neither x nor the containers below x will ever need to be moved
●  Axioms to support this:
Ø  final(x) ⇔ clear(x) ∧ ¬Goal(holding=x) ∧ finalbelow(x)
Ø  finalbelow(x) ⇔
(loc(x)=floor ∧ ¬∃[y: Goal(loc(x)=y)])
∨ ∃[y: loc(x)=y] [
¬Goal(loc(x)=floor) ∧ ¬Goal(holding=y) ∧ ¬Goal(clear(y))
∧ ∀[z : Goal(loc(x)=z)] (z=y) ∧ ∀[z: Goal(loc(z)=y)] (z=x)
∧ finalbelow(y)]
Ø  nonfinal(x) ⇔ clear(x) ∧ ¬final(x)


Control Rules

Try TLPlan with three different control formulas:
(1) If x is final, only put a container y onto x if it will make y final:
Ø  G ∀[x: clear(x)] (final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
(2) Like (1), but also says never to put anything onto a container that isn’t final:
Ø  G ∀[x: clear(x)] [
(final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x])]
(3) Like (2), but also says never to pick up a nonfinal container from the floor
unless you can put it where it will be final:
Ø  G ∀[x: clear(x)] [
(final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x])
∧ (onfloor(x) ∧ ∃[y: Goal(loc(x)=y)] ¬final(y) ⇒ X¬holding(x))]


Container Stacking




Domain-Specific Planning Algorithms

●  Sometimes we can write highly efficient planning algorithms for a
specific class of problems
Ø  Use special properties of that class
●  For container-stacking problems with n containers, we can easily get a
solution of length O(n)
Ø  Move all containers to the floor, then build up stacks from the bottom
●  With additional domain-specific knowledge, can do even better …
[Figure: initial state s0 and goal g for containers a, b, c, d, e]
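
The "move everything to the floor, then build up" strategy can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: a state is a dict mapping each container to what it rests on, the goal is a partial dict of the same shape, floor space is unlimited, and the function name `unstack_then_build` is hypothetical. Each container moves at most twice, so the plan has at most 2n actions.

```python
FLOOR = "floor"

def unstack_then_build(loc, goal):
    """Return a list of (container, from, to) moves, or None if the goal is inconsistent."""
    loc = dict(loc)
    plan = []

    def clear(x):
        return all(p != x for p in loc.values())

    # Phase 1: repeatedly move a clear, stacked container to the floor.
    while True:
        tops = [c for c in sorted(loc) if loc[c] != FLOOR and clear(c)]
        if not tops:
            break
        c = tops[0]
        plan.append((c, loc[c], FLOOR))
        loc[c] = FLOOR

    # Phase 2: place containers whose goal support is already in place.
    placed = {c for c in loc if goal.get(c, FLOOR) == FLOOR}
    pending = [c for c in sorted(loc) if c not in placed]
    while pending:
        ready = [c for c in pending if goal[c] in placed]
        if not ready:
            return None  # goal stacks are cyclic or otherwise inconsistent
        c = ready[0]
        plan.append((c, FLOOR, goal[c]))
        loc[c] = goal[c]
        placed.add(c)
        pending.remove(c)
    return plan

s0 = {"a": FLOOR, "b": FLOOR, "c": "b", "d": "c", "e": "a"}
g = {"a": "b", "b": "c", "d": "e"}
naive_plan = unstack_then_build(s0, g)
print(len(naive_plan), naive_plan)
```

On this five-container instance the sketch needs six moves, which leaves room for the improvement that domain-specific knowledge provides.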


Container-Stacking Algorithm

●  c needs moving if s contains loc(c)=d and any of the following holds:
◆  g contains loc(c)=e, where e≠d
◆  g contains loc(b)=d, where b≠c
◆  d needs moving

loop
    if ∃ a clear container c that needs moving
          & we can move c to a position d where c won’t need moving
       then move c to d
    else if ∃ a clear container c that needs moving
       then move c to any clear pallet
    else if the goal is satisfied
       then return success
    else return failure
repeat

●  The algorithm generates the following sequence of actions:
Ø  ⟨move(e,a,floor), move(d,c,e), move(c,b,floor), move(b,floor,c), move(a,floor,b)⟩

[Figure: initial state s0 and goal g for containers a, b, c, d, e]
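
The loop and the "needs moving" test can be sketched in Python as follows. This is a hedged illustration rather than the authors' code. It assumes a state is a dict from container to supporting position, the goal is a partial dict of the same shape, "any clear pallet" is modeled as a floor with unlimited space, and ties are broken by container name. The names `needs_moving`, `is_clear`, and `stack_containers` are hypothetical.

```python
FLOOR = "floor"

def needs_moving(c, loc, goal):
    d = loc[c]
    if c in goal and goal[c] != d:           # g contains loc(c)=e, e≠d
        return True
    if d != FLOOR:
        if any(p == d and b != c for b, p in goal.items()):
            return True                      # g contains loc(b)=d, b≠c
        return needs_moving(d, loc, goal)    # d needs moving
    return False

def is_clear(c, loc):
    return all(p != c for p in loc.values())

def stack_containers(loc, goal):
    """Return a list of (container, from, to) moves, or None on failure."""
    loc = dict(loc)
    plan = []
    while True:
        movers = [c for c in sorted(loc)
                  if is_clear(c, loc) and needs_moving(c, loc, goal)]
        step = None
        # Branch 1: move c somewhere it will not need moving again.
        for c in movers:
            targets = [FLOOR] + [d for d in sorted(loc)
                                 if d != c and is_clear(d, loc)]
            for d in targets:
                if d != loc[c] and not needs_moving(c, {**loc, c: d}, goal):
                    step = (c, loc[c], d)
                    break
            if step:
                break
        # Branch 2: otherwise get some stacked mover out of the way.
        if step is None:
            for c in movers:
                if loc[c] != FLOOR:
                    step = (c, loc[c], FLOOR)
                    break
        # Branches 3 and 4: nothing left to do, so succeed or fail.
        if step is None:
            satisfied = all(loc[c] == p for c, p in goal.items())
            return plan if satisfied else None
        c, _, dst = step
        loc[c] = dst
        plan.append(step)

s0 = {"a": FLOOR, "b": FLOOR, "c": "b", "d": "c", "e": "a"}
g = {"a": "b", "b": "c", "d": "e"}
plan = stack_containers(s0, g)
print(plan)
```

With this tie-breaking, the sketch reproduces the five-move sequence listed on the slide. The initial state and goal used here are inferred from that sequence, so treat them as an assumption about the figure.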


Properties of the Algorithm

●  Sound, complete, guaranteed to terminate on all container-stacking problems
●  Runs in time O(n³)
Ø  Can be modified (Slaney & Thiébaux) to run in time O(n)
●  Often finds optimal (shortest) solutions
●  But sometimes only near-optimal
Ø  For container-stacking problems, PLAN-LENGTH is NP-complete
●  I think what TLPlan does (with its 3rd control rule) is roughly similar to this
algorithm


Using Deterministic Domain Models

●  For planning with deterministic domain models, we made some assumptions that
aren’t necessarily true
   Assumption               Problem
Ø  Static world             The world may change dynamically
Ø  Perfect information      We almost never have all of the information
Ø  Instantaneous actions    Actions take time; there may be time constraints
Ø  Correct predictions      Action models usually are just approximations
Ø  Determinism              Action model may just be the “nominal case”
Ø  Flat search space        There may be further lower-level refinements
●  If enough of the assumptions are approximately true, the plans may still be useful
Ø  But can’t just take a plan π and start executing it
Ø  Need to monitor π’s execution, detect problems as they occur, recover from
them


Acting and Planning

●  Interaction is roughly as follows
Ø  loop
•  from the planner, get the latest
plan or partial plan
•  perform one or more actions,
monitoring the current state
▸  if problems occur, replan
while performing some
preplanned recovery actions
●  Performance could involve lower-level
refinement rather than direct execution
Ø  The next chapter contains lots of
details
[Figure: Acting, Planning, and Performance components]
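
The interaction loop above can be sketched as a simple plan-act-monitor cycle. The Python below is a minimal sketch under stated assumptions: the planner, the domain model, and the execution platform are passed in as the hypothetical callables `plan_fn`, `predict`, and `execute`; monitoring consists of comparing the observed state with the predicted one; and the preplanned recovery actions mentioned on the slide are omitted.

```python
def run_with_replanning(state, goal, plan_fn, predict, execute, max_steps=50):
    """Execute plans from plan_fn, replanning whenever a prediction fails."""
    plan = plan_fn(state, goal)
    replans = 0
    for _ in range(max_steps):
        if state == goal:
            break
        if not plan:
            plan = plan_fn(state, goal)
            if not plan:
                break                         # planner has nothing to offer
        action = plan.pop(0)
        expected = predict(state, action)     # deterministic domain model
        state = execute(state, action)        # what actually happened
        if state != expected:                 # problem detected: replan
            plan = plan_fn(state, goal)
            replans += 1
    return state, replans

# Toy domain: a robot on an integer line; actions are +1 or -1.
def plan_fn(s, g):
    return [1] * (g - s) if g >= s else [-1] * (s - g)

def predict(s, a):
    return s + a

calls = {"n": 0}

def execute(s, a):
    # The third executed action slips and leaves the robot where it was.
    calls["n"] += 1
    return s if calls["n"] == 3 else s + a

result = run_with_replanning(0, 4, plan_fn, predict, execute)
print(result)
```

In the toy run the robot still reaches the goal, after one replanning episode triggered by the slipped action.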


Acting and Planning

[Figure: planning stage (Search, Predict s → s′) and acting stage]

●  What kind of information should the
planner provide?
Ø  Depends on the planning domain
and the actor
●  Some possibilities
Ø  Complete plan, as in the
algorithms we’ve discussed
•  But usually for a subproblem
▸  example on next slide
Ø  Partial plan
•  e.g., receding horizon
Ø  Several partial plans, with
relative evaluations of each
•  e.g., game-tree search
[Figure: overall problem decomposed into subproblems sub1, sub2, sub3]


Example

●  Killzone 2
Ø  “First-person shooter” game
●  Special-purpose AI planner
Ø  Plans enemy actions at the
squad level
•  Subproblems; solution
plans are maybe 4–6
actions long
Ø  Different planning algorithm than what we’ve discussed so far,
but it uses a deterministic domain model
Ø  Quickly generates a plan that would work if nothing interferes
Ø  Replans several times per second as the world changes
●  Why it worked:
Ø  Don’t want to get the best possible plan
Ø  Need actions that appear believable and consistent to human users
Ø  Need them very quickly
