AI Applications and Planning Techniques

Integrated Influence:
The Six Million Dollar Man
of AI

L Dicken

What is AI?
• Any time a computer makes any sort of decision
between a number of options, it can be thought of
as acting “intelligently”.
• Whether or not those decisions are the right ones
is how “good” the intelligence is.

AI Applications
• Automatic Translation
• Statistical Analysis
• Optimising Resource Usage
• Scheduling Problems
• Automated Planning
• Image/Facial Recognition
• And many more...

Basics
• Broadly, two conceptual paradigms in AI
‣ Reaction
‣ Deliberation
• Reaction aims to program “instinctive” reactions to
minimal subsets of stimuli.
• Deliberation describes reasoning-based approaches,
using all the information available.

Automated Planning
• AP is a deliberative technique
• Given a description of:
‣ Current state of the world
‣ Actions that can be applied and the way they affect the
world
‣ A set of goals to be achieved
• AP automatically determines a sequence of actions that
will complete the task.
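
The three inputs listed above (state, actions, goals) can be sketched as a tiny forward-search planner. This is only an illustration under stated assumptions: the fact names, the `Action` tuple and the breadth-first strategy are hypothetical choices, not taken from the talk, and real planners use far more informed search.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold before the action
    add: frozenset     # facts the action makes true
    delete: frozenset  # facts the action makes false


def plan(initial, goals, actions):
    """Breadth-first forward search; returns a shortest list of action names, or None."""
    start, goals = frozenset(initial), frozenset(goals)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goals <= state:
            return steps
        for a in actions:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [a.name]))
    return None


def fs(*facts):
    return frozenset(facts)


# Hypothetical one-truck, one-package instance.
actions = [
    Action("load", fs("truck@L1", "pkg@L1"), fs("pkg-in-truck"), fs("pkg@L1")),
    Action("move-L1-L2", fs("truck@L1"), fs("truck@L2"), fs("truck@L1")),
    Action("unload-L2", fs("truck@L2", "pkg-in-truck"), fs("pkg@L2"), fs("pkg-in-truck")),
]

print(plan({"truck@L1", "pkg@L1"}, {"pkg@L2"}, actions))
```

Breadth-first search is used here only because it is short and complete on small instances; it does not scale to realistic domains.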

PDDL
• Planning Domain Definition Language
• Propositional representation of world
• All things not asserted true are false (closed-world assumption)
• All things true now will remain true unless negated
• Extensions deal with a variety of extras
‣ e.g. Numerical values, Temporal actions, Continuous
effects etc.
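
The two semantic rules above (unasserted facts are false, and facts persist unless negated) can be sketched with a state held as a set of facts. The helper names `holds` and `apply` are illustrative assumptions, not part of PDDL or of any planner's API.

```python
def holds(state, fact):
    # Closed world: a fact is true only if it is asserted in the state.
    return fact in state


def apply(state, add, delete):
    # Persistence: every fact carries over unless the action deletes it.
    return (state - delete) | add


state = frozenset({("at", "T1", "L1"), ("at", "P1", "L1")})

# Effects of loading P1 into T1 at L1.
after_load = apply(
    state,
    add={("in", "P1", "T1")},
    delete={("at", "P1", "L1")},
)

print(holds(after_load, ("at", "T1", "L1")))  # the truck fact was never negated
print(holds(after_load, ("at", "P1", "L1")))  # the package fact was negated
```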

Example Domain
(define (domain logistics)
  (:requirements :strips :typing)
  ;; Type hierarchy; city is required by in-city and MOVE.
  (:types truck - vehicle
          package vehicle - physobj
          location - place
          city place physobj - object)
  (:predicates (in-city ?loc - place ?city - city)
               (at ?obj - physobj ?loc - place)
               (in ?pkg - package ?veh - vehicle))
  ;; Put a package into a truck standing at the same place.
  (:action LOAD
    :parameters (?pkg - package ?truck - truck ?loc - place)
    :precondition (and (at ?truck ?loc) (at ?pkg ?loc))
    :effect (and (not (at ?pkg ?loc)) (in ?pkg ?truck)))
  ;; Take a package out of a truck at the truck's current place.
  (:action UNLOAD
    :parameters (?pkg - package ?truck - truck ?loc - place)
    :precondition (and (at ?truck ?loc) (in ?pkg ?truck))
    :effect (and (not (in ?pkg ?truck)) (at ?pkg ?loc)))
  ;; Drive a truck between two places in the same city.
  (:action MOVE
    :parameters (?truck - truck ?loc-from - place ?loc-to - place ?city - city)
    :precondition
      (and (at ?truck ?loc-from) (in-city ?loc-from ?city) (in-city ?loc-to ?city))
    :effect
      (and (not (at ?truck ?loc-from)) (at ?truck ?loc-to))))

Planning Graphs
Fact layers (f) and action layers (a) from the final build of the figure:
f0: At(T1, L1), At(P1, L1)
a1: Move(T1, L2), Move(T1, L3), Load(P1, T1)
f1: At(T1, L1), In(P1, T1)
a2: Move(T1, L2), Move(T1, L3), Unload(P1, T1)
f2: At(T1, L2), In(P1, T1)
a3: Move(T1, L1), Move(T1, L3), Unload(P1, T1)
f3: At(T1, L2), At(P1, L2)
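
The alternation of fact and action layers can be sketched in code. Note the assumption: this is the delete-relaxed form of a planning graph (facts only accumulate, no mutex reasoning), which is simpler than the trajectory shown in the figure; the `expand` function and the action names are hypothetical.

```python
def expand(facts, actions, layers):
    """Return [f0, a1, f1, a2, f2, ...] for a delete-relaxed planning graph.

    actions maps an action name to (preconditions, add effects)."""
    graph = [frozenset(facts)]
    for _ in range(layers):
        current = graph[-1]
        # An action enters the layer once all its preconditions are present.
        applicable = frozenset(
            name for name, (pre, _add) in actions.items() if pre <= current
        )
        # Relaxation: keep every old fact and add all new effects.
        new_facts = current.union(*(actions[name][1] for name in applicable))
        graph += [applicable, new_facts]
    return graph


actions = {
    "Load(P1,T1)": ({"At(T1,L1)", "At(P1,L1)"}, {"In(P1,T1)"}),
    "Move(T1,L2)": ({"At(T1,L1)"}, {"At(T1,L2)"}),
    "Unload(P1,T1)": ({"At(T1,L2)", "In(P1,T1)"}, {"At(P1,L2)"}),
}

graph = expand({"At(T1,L1)", "At(P1,L1)"}, actions, layers=2)
for index, layer in enumerate(graph):
    print("f" if index % 2 == 0 else "a", index // 2 + index % 2, sorted(layer))
```

In this sketch the goal fact At(P1,L2) first appears in the second fact layer, which is the kind of reachability information heuristics are commonly derived from.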

The Core of Planning
• Previous example made it look so easy.
• Trivial example, worked out in advance.
• At every action layer, many choices don’t help
‣ Easy to disappear down a rabbit hole
• Planning is all about guiding search across the action/
fact layer space.
‣ Different heuristics, search strategies, pruning techniques

Problems
• Search space is massive.
‣ Computational complexity high
‣ Processing time required also high
• Not only that but models used are “abstractions”
‣ Typically removes chance to fail an action
‣ Typically removes other agents and the consequences of
their actions
‣ Typically removes a lot of the detail e.g. Driver for truck

Time Constraints
• International Planning Competition entrants get
around 30 minutes to generate a plan for a single problem.
• AAAI General Game Playing entrants get 5-10s to
decide on their next move.
• Games Industry aims for 60fps execution - roughly 16.7ms
per frame.
‣ Most of that is spent on graphics, physics etc.
‣ AI gets maybe 1ms to work out everything it needs to

Reactive AI
• Reactive AI makes snap decisions based on current
state of the world.
• More tolerant to action failure - one action isn’t
part of a long chain of actions that depend on it.
• Typically gives a very good response time to input
received from the environment.

Subsumption Architecture
• Quintessential reactive approach.
• Library of behaviours ordered by priority.
• Each behaviour maps detected input to relevant
response.
• Higher priority behaviours are able to “subsume” or
override the output of the lower priority ones.
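
The priority ordering described above can be sketched as a simple arbitration loop. This is a simplification under stated assumptions: a full subsumption architecture runs its layers concurrently with suppression wiring, and the behaviour names and percept keys below are hypothetical.

```python
def subsumption_step(behaviours, percept):
    """behaviours is ordered highest priority first.

    Each behaviour returns an action, or None when its trigger is not met."""
    for behaviour in behaviours:
        action = behaviour(percept)
        if action is not None:
            return action  # this output overrides every lower priority behaviour
    return "idle"


def avoid_obstacle(percept):
    return "turn-away" if percept.get("obstacle_near") else None


def seek_goal(percept):
    return "move-to-goal" if percept.get("goal_visible") else None


def wander(percept):
    return "wander"  # default behaviour, always fires


layers = [avoid_obstacle, seek_goal, wander]

print(subsumption_step(layers, {"obstacle_near": True, "goal_visible": True}))
print(subsumption_step(layers, {"goal_visible": True}))
```

Each behaviour only inspects the current percept, so nothing is remembered between steps.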

Subsumption
[Figure: Input feeding Behaviours 1-4, whose outputs pass through a Mux to the Output]

Influence Maps
• Much more simplistic approach
• Influence radiates from objects similarly to magnetic
fields.
• Good things attract the agent, bad things repel the
agent.
• Interaction of influence is typically (but not
necessarily) additive.
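
A minimal sketch of an additive influence map on a grid. The exponential falloff with Manhattan distance, the `decay` parameter and both function names are assumptions chosen for brevity; the slides do not specify a falloff function.

```python
def influence_map(width, height, sources, decay=0.5):
    """sources is a list of ((x, y), strength).

    Positive strength attracts, negative repels; contributions are summed."""
    grid = [[0.0] * width for _ in range(height)]
    for (sx, sy), strength in sources:
        for y in range(height):
            for x in range(width):
                distance = abs(x - sx) + abs(y - sy)
                grid[y][x] += strength * decay ** distance
    return grid


def best_neighbour(grid, x, y):
    """Return the current or adjacent cell with the highest combined influence."""
    candidates = [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    inside = [
        (cx, cy)
        for cx, cy in candidates
        if 0 <= cy < len(grid) and 0 <= cx < len(grid[0])
    ]
    return max(inside, key=lambda cell: grid[cell[1]][cell[0]])


# A reward at the right end of a corridor and a threat at the left end.
grid = influence_map(5, 1, [((4, 0), 8.0), ((0, 0), -4.0)])
print(grid[0])
print(best_neighbour(grid, 2, 0))
```

The agent's decision is a lookup over a handful of cells, which is what keeps this style of reasoning cheap at run time.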

Stateful vs. Stateless
• Deliberative reasoning is by its nature stateful
• Reactive systems typically are stateless
• Trying to retrofit them to include states typically
adds the type of complexity they were designed to
avoid.
‣ E.g. Trying to capture state in a NN involves a separate
NN designed to have a delayed feedback into the input
• Reactive Systems struggle to make long term plans

Limitations of AI
• Contemporary AI is capable of very sophisticated,
insightful decision making.
‣ ...eventually
• Also capable of very rapid decision making.
‣ ...at the expense of long-term decision quality
• A range of problems requires both high quality and
short time frame decisions.

Six Million Dollar Man
• We can rebuild it!
‣ Better
‣ Stronger
‣ Faster
• But, we don’t yet have
the technology...

Integrated Influence
• My work focuses on trying to bridge the gap
between reaction and deliberation in novel ways
• Previous approaches have typically either:
‣ Created an agent that deliberates about certain aspects
of the world and reacts to others
‣ Created an agent that reacts within the parameters of a
deliberatively generated trajectory
• Neither approach has proven particularly robust

Concept
• We take the view that many aspects of the world
can’t be tackled by one or other paradigm, but
require both.
• To this end, our architecture aims to constantly use
all information available, both deliberative and
reactive, to make decisions

Search vs. Evaluation
• Searching spaces is a complex task.
‣ Typically at least NP-hard; planning in PDDL domains can be
as hard as PSPACE-complete
• What if, instead of performing search, we could
reformulate the problem into something closer to
function evaluation?

Propositions
• PDDL’s propositional representation gives a state
representation of very high dimension, with each
dimension having exactly two possible values.
• Can we do better with another representation
format?


SAS+
• SAS+ groups mutually exclusive PDDL propositions
together.
‣ Propositional - at(P1, L1), at(P1, L2), at(P1, L3), in(P1, T1)
‣ SAS+ - locationP1 ∈ {L1, L2, L3, T1}
• Also captures the ordering that the values take
‣ E.g. From any Lx to Ly, locationP1 takes value T1 in between
• Identifies the dependencies between different types
of object
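
The grouping above can be sketched by comparing the two encodings and writing the value ordering down as a small transition graph. The variable names, the dictionary layout and `value_path` are illustrative assumptions rather than the format of any particular SAS+ translator.

```python
from collections import deque

# Propositional encoding: four booleans, of which exactly one is true.
propositional = {
    "at(P1,L1)": True,
    "at(P1,L2)": False,
    "at(P1,L3)": False,
    "in(P1,T1)": False,
}
boolean_assignments = 2 ** len(propositional)  # most of these are invalid states

# SAS+ encoding: one variable with four values.
location_p1_values = ["L1", "L2", "L3", "T1"]

# Allowed value changes for locationP1: every location goes via the truck.
dtg_location_p1 = {
    "L1": {"T1"},
    "L2": {"T1"},
    "L3": {"T1"},
    "T1": {"L1", "L2", "L3"},
}


def value_path(dtg, start, goal):
    """Shortest sequence of values from start to goal, found breadth-first."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(dtg[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None


print(boolean_assignments, len(location_p1_values))
print(value_path(dtg_location_p1, "L1", "L3"))
```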

DTGs
[Figure: domain transition graph (DTG) for locationP1, with node T1 connected to L1, L2 and L3]

Influence Landscapes
• Introduced the concept of Influence Maps earlier
• Influence Landscapes extend the idea away from a
purely spatial representation and into a conceptual
representation.
• Allows for the same function-based approach to be
applied to reasoning.

Influence Landscape
[Figure: influence landscape over the DTG nodes T1, L1, L2 and L3]

Caveats
• It isn’t quite this easy
• Need to consider ‘Causal Links’
• To load the package at L1, the truck needs to be at
L1 too.
• Interlinked set of DTGs allows this to be captured
and represented.

DTG with CG
[Figure: the package DTG (T1, L1, L2, L3) shown together with the truck DTG (L1, L2, L3), linked through the causal graph (CG)]

Landscape Generators
• Previous example generated using a simple critical
path analysis.
• Illustrates the concept though - also useful for
providing the “naive” view of the world.
• Critical path analysis gives some info about the
structure of the world, but is not fully informed.
• Need to get a much more informed view of the
world around the agent.

Stacks
• Stacks are the name given to Landscape Generators
• Each stack is tasked with assigning a numerical value
to every node within the DTG/CG space.
‣ Remember that this is a smaller space than propositional
• Each stack deals with a specific aspect of the world
or a specific approach
‣ E.g. Reactive, Deliberative (or some other information)

Deliberative - Plan
• We can guide the search by bringing in the
information a deliberative reasoner provides.
‣ E.g. Automated Planning
• We can implement this as a stack generating a
landscape reflecting the influence the plan exerts on
our agent.
• Typically this will be a best-case assumption.

Conformity
• How heavily do we want to reward conforming to
the plan?


Tight Conformity
• Reward every node the plan requires the agent to
visit in the graph, Royal Road style.
• Visualise landscape as a ridge to the summit
‣ Excellent in best-cases
‣ Poor when flexibility required
• After deviating from the plan, best approach seems
to be to rejoin the ridge - this may not be the case.

Loose Conformity
• Use the plan to mark out a general path without
strictly defining each node of the plan.
• Much more flexible approach, guides the agent
rather than dictating to it.
• But how do you determine which nodes to mark
and which to ignore?

Focal Nodes
• Focal Nodes are these waypoints in the plan.
• Previous work with SAS+ has shown that a DTG
can be deformed to be laid out in any way.
‣ Logistics-style domain overlaid on a map of Europe.
• Highlights by inspection clumps of nodes and
connections between them
‣ E.g. Channel Tunnel, Dover ferry etc.

Clustering
• We can pick out the FNs by hand, but that’s no fun.
• Instead, use the structure of the graph to find
them automagically.
• Clustering the nodes of the graph allows us to
group nodes together by proximity
• Fuzzy Clustering allows us to identify nodes that lie
between groups
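
The slides do not say which fuzzy clustering algorithm is used. As one possible reading, the sketch below applies the membership formula from fuzzy c-means to a node's distances from each cluster centre, then flags nodes that no cluster clearly owns. The `threshold` value and both function names are assumptions.

```python
def fuzzy_memberships(distances, m=2.0):
    """distances holds one node's distance to each cluster centre.

    Returns fuzzy c-means style memberships, which sum to 1."""
    if 0 in distances:
        # The node sits exactly on a centre, so it belongs there outright.
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


def is_between_clusters(memberships, threshold=0.7):
    """A node is a focal-node candidate when no single cluster dominates it."""
    return max(memberships) < threshold


bridge = fuzzy_memberships([1, 1])   # equally far from both cluster centres
interior = fuzzy_memberships([1, 3])  # clearly inside the first cluster

print(bridge, is_between_clusters(bridge))
print(interior, is_between_clusters(interior))
```

Here the distances would typically be hop counts in the DTG, so a node like a tunnel or ferry link ends up with split membership.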

Using Focal Nodes
• FNs can be identified offline for every DTG in the
domain.
• FNs that the plan indicates should be passed
through then become “Activated”.
• These nodes are given influence in the landscape
and this is propagated out across the graph to guide
the agent to these key nodes.
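
Propagating influence out from activated nodes can be sketched as a breadth-first sweep with a per-hop decay. The decay factor, the summing of contributions and the `propagate` name are assumptions; the slides do not give the propagation rule.

```python
from collections import deque


def propagate(graph, activated, strength=1.0, decay=0.5):
    """graph maps each node to its neighbours.

    Every activated node receives `strength`, which shrinks by `decay` per hop."""
    landscape = {node: 0.0 for node in graph}
    for source in activated:
        hops = {source: 0}
        frontier = deque([source])
        while frontier:
            node = frontier.popleft()
            landscape[node] += strength * decay ** hops[node]
            for nxt in graph[node]:
                if nxt not in hops:
                    hops[nxt] = hops[node] + 1
                    frontier.append(nxt)
    return landscape


# A four-node chain with the last node activated by the plan.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(propagate(chain, activated=["D"]))
```

The result rises steadily towards the activated node, giving the agent a gradient to follow from anywhere in the graph.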

Environmental Data
• We can overcome the deficiencies in the deliberative
landscape by bringing in data about the environment
• Gives the kind of insight that a reactive system
would use to make decisions.
• Allows us to inform the agent about things that may
require it to deviate from the planned trajectory

Preferences
• Preferences allow the agent to bias the influence of
nodes in the graph at execution time based on data
being sensed.
• Can be either positive or negative.
• Applies an influence of appropriate strength to the
target node and then propagates that out

Road Blocks
• A Road Block is an edge that should be in the domain
but for some reason is not traversable at this time.
• Two conceptual models
‣ Cancelled flight - this edge will never be traversable
‣ Blocked road - this edge may be traversable later
• Should the edge be removed?
‣ Opted to implement the ‘Road Block’ model as this
allows resensing later to check the state of the edge.
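
The choice above, keeping the edge and marking it blocked rather than deleting it, can be sketched as follows. The `Edge` class, the `blocked` flag and the `sensor` callable are hypothetical names used only for illustration.

```python
class Edge:
    def __init__(self, src, dst):
        self.src = src
        self.dst = dst
        self.blocked = False  # the edge stays in the graph even when blocked


def traversable_neighbours(edges, node):
    """Destinations reachable from node over edges that are currently open."""
    return [e.dst for e in edges if e.src == node and not e.blocked]


def resense(edge, sensor):
    """Re-check an edge; sensor is a callable returning True when the edge is clear."""
    edge.blocked = not sensor(edge)


edges = [Edge("L1", "L2"), Edge("L1", "L3")]
edges[0].blocked = True
print(traversable_neighbours(edges, "L1"))

# Later the road is sensed again and found to be clear.
resense(edges[0], sensor=lambda edge: True)
print(traversable_neighbours(edges, "L1"))
```

Had the edge been removed outright, as in the cancelled-flight model, there would be nothing left to re-sense.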

Integrated Landscape
• By combining the landscape from each of the
individual stacks, we get the “Integrated Influence
Landscape” from which the architecture draws its
name.
• Currently using an unweighted additive model for
combining
‣ This may prove to be sub-optimal in further testing
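
The unweighted additive model can be sketched as a per-node sum over the landscapes each stack produces. The optional `weights` argument is an assumed extension point, included because the slide notes that the unweighted form may prove sub-optimal; the node values are made up.

```python
def integrate(landscapes, weights=None):
    """Combine per-stack landscapes (node -> influence) by summing per node.

    With weights=None every stack counts equally (the unweighted model)."""
    weights = weights or [1.0] * len(landscapes)
    nodes = set().union(*landscapes)
    return {
        node: sum(w * ls.get(node, 0.0) for w, ls in zip(weights, landscapes))
        for node in nodes
    }


plan_landscape = {"L1": 0.25, "T1": 0.5, "L2": 1.0}
env_landscape = {"L2": -0.5, "L3": 0.25}

print(integrate([plan_landscape, env_landscape]))
```

Nodes that a stack says nothing about contribute zero, so stacks do not need to cover the whole graph.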

Using the IIL
• Given an IIL, the agent can then climb the gradient
towards the goal, changing between DTGs as
required by the Causal Graph.
• Hill Climbing algorithm obvious choice, but gets
stuck at local maxima
‣ Experimented with Forced-movement Hill Climbing
‣ Also Neighbourhood-bounded A*
‣ Currently using Forced HC with ties broken randomly
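
The slides do not define Forced-movement Hill Climbing. The sketch below takes it to mean that the agent must always step to its best neighbour, even when that neighbour is no better than the current node, so it cannot sit at a local maximum. That reading, the `max_steps` bound and the example values are all assumptions.

```python
import random


def forced_hill_climb(graph, landscape, start, goal, max_steps=50, rng=random):
    """Always move to the highest-valued neighbour, breaking ties at random.

    Returns the path taken, which ends at goal if it was reached in time."""
    path = [start]
    node = start
    for _ in range(max_steps):
        if node == goal:
            break
        neighbours = list(graph[node])
        if not neighbours:
            break
        best = max(landscape[n] for n in neighbours)
        node = rng.choice([n for n in neighbours if landscape[n] == best])
        path.append(node)
    return path


dtg = {"L1": ["T1"], "L2": ["T1"], "L3": ["T1"], "T1": ["L1", "L2", "L3"]}
iil = {"L1": 0.25, "T1": 0.5, "L2": 1.0, "L3": 0.1}

print(forced_hill_climb(dtg, iil, start="L1", goal="L2"))
```

Forcing a move can make the agent oscillate between two nodes, which is why the sketch bounds the number of steps.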


[Figure: Plan Landscape, Env Landscape and Structure Landscape]

[Figure: Integrated Influence Landscape]

Bonus Feature!
• Earlier we brought up the time constraints of the Games
Industry - around 1ms of AI time per frame.
• Games give us a great context for testing “real
world” style applications, and often already have the
fast/smart requirements
• Fully controllable simulation environment
• Very pretty demos

Parallelisation
• By its nature, the stack paradigm is very flexible
• Designed for each stack to be updated
asynchronously
• Very suitable to parallel execution
‣ Increasingly a big factor in modern computing
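
Updating each stack independently can be sketched with a thread pool. The stack functions and the `world` dictionary are hypothetical; note also that CPython threads do not run CPU-bound work in parallel, so a process pool or native code would be closer to the intent for heavy stacks.

```python
from concurrent.futures import ThreadPoolExecutor


def update_all(stacks, world):
    """stacks maps a name to a function(world) returning node -> influence.

    Each stack is submitted separately, so none waits on another to start."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, world) for name, fn in stacks.items()}
        return {name: future.result() for name, future in futures.items()}


def plan_stack(world):
    return {"L2": 1.0}  # the plan pulls the agent towards L2


def env_stack(world):
    return {node: -1.0 for node in world["blocked"]}  # sensed blockages repel


landscapes = update_all(
    {"plan": plan_stack, "env": env_stack},
    world={"blocked": {"L3"}},
)
print(landscapes)
```

Because each stack only reads the world and returns its own landscape, no locking is needed in this sketch.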

Vector Operations
• The vast majority of the maths mentioned can be
stated as vector and matrix operations.
• Makes the whole architecture very suitable to
execution on Cell SPUs.
‣ Synergistic Processing Units are vector-based
coprocessors in Cell-based systems such as PS3.
• SPUs are typically not used efficiently or fully.
‣ Effectively “free” processing power.

Experiments
• Majority of work to date has been conceptual
• Initial prototype has been developed based on a
number of assumptions
• Work currently ongoing to make sure all of these
assumptions are valid.
• Experiments being run on small problem instances
as translation from SAS+ encoding to the internal
architecture representation is currently done by hand.

Results
• Early tests have shown some promising results
‣ Noticeable decrease in time taken to make decisions
over a purely deliberative method.
- 1-2ms for a 12 decision point execution with 6ms of processing
in advance.
‣ Increased robustness to changes detected in the domain
- Rapid discovery of alternative paths through the space
• Much more rigorous testing required.

Future Work
• A lot of work remains to develop this into a
polished technique that will revolutionise AI.
‣ Further testing on wider range of problems
‣ Full testing of all assumptions made and techniques
chosen without substantiation
‣ Development into a working system, rather than proof-
of-concept prototype
‣ Proof of extensibility of system by adding further
Stacks representing other sources of information.
