What is GOAP, and Why Is It Not Already Mainstream?
Aakash Chotrani, Donghwi Jin
1 Introduction
It’s hard to ignore the potential of planning techniques for game AI, and over the years they
have drawn much attention from game developers and researchers. Goal Oriented Action
Planning, or GOAP for short, is an artificial intelligence technique that allows agents to
plan a sequence of actions to satisfy a particular goal without having to maintain large and
complex state machines. GOAP offers dynamic problem solving, which gives the illusion that
the AI understands the environment and reaches the goal in a logical manner. Fundamentally,
the technique involves formulating a list of actions whose outcomes satisfy the goal.
2 Motivation behind GOAP
Most of us are familiar with finite state machines (FSMs), which are used in popular game
engines such as Unity3D. Although they are easy to implement and design due to their cyclic
nature, they also have several disadvantages. FSMs scale poorly: they become very complex
as the number of states increases, and at some point adding a new state to an existing FSM
becomes very difficult. FSMs also handle concurrency poorly; when running multiple states in
parallel, they often end up in a deadlock, forcing the programmer to make changes until the
states become compatible with each other. Finally, FSMs are labor intensive: determining the
conditions for each transition from one state to another takes a lot of time and resources,
and those transitions eventually become a major source of behavioral bugs.
FSMs, like any other technology, have their own place in the game development
process. However, scripts and co-routines are becoming more popular these days, and
hierarchical planners are increasingly making their way into games and middleware. Planning
technology has made notable progress in the past ten years since being introduced to the game
industry. There are certainly many problems left to solve, but much progress has been made
already.
There aren’t many games that use planners in comparison to other techniques, but the AI
in those games has been well received by players. The most successful games in which
planning has been implemented are mostly open-world games with emergent gameplay; games
with more linear, story-driven gameplay have received poorer reviews.
3 What is GOAP?
In GOAP, agents are supplied with a goal state of the world and a set of actions which
change the state of the world when performed by the agent. Agents follow a particular
sequence of actions to achieve the goal state of the world. That sequence depends not only
on the goal state of the world, but also on its current state. Hence, GOAP is a realistic
and dynamic architecture which takes into account both the current state of the world and
the goal state.
For example, suppose we have an agent, a robber, whose goal is to collect all the coins
and exit the world. In order to collect the coins, the agent has to choose which coin to
collect next, which depends on the distance between the agent and the coin: the closer a
coin is to the agent, the higher its priority for collection. After collecting all the coins,
the agent has to get a key in order to open the door and exit the world.
Hence the agent is supplied with:
Goal:
ExitTheWorld
Actions:
CollectAllCoinsAction
CollectKeyAction
OpenDoorAction
ExitAction
A GOAP planner chooses the optimal sequence of actions to achieve the goal state of the
world by using each action's preconditions and effects to determine which actions the agent
can perform in a given state of the world.
3.1 Actions
Anything the agent performs is considered an action: picking up a coin, opening a door,
moving to a target, playing an animation or sound, and so on. Actions are independent of
one another, and we do not need to worry about an action's connection to other actions
while it is being performed.
Each action has a cost associated with it. Actions with lower costs are preferred over
actions with higher costs, since we want the final sequence of actions toward the goal state
of the world to be optimal. In a given world state there may be more than one sequence of
actions that achieves the goal state; the planner decides which sequence to choose based on
the total cost of the actions. For example, assume a world where the agent can exit by
collecting one diamond instead of collecting all the coins:
CollectDiamondAction cost: 10
CollectCoinsAction cost: 3
CollectKeyAction cost: 2
OpenDoorAction cost: 1
We have more than one sequence of actions available to achieve the goal state of the world:

CollectDiamondAction -> CollectKeyAction -> OpenDoorAction = ExitWorld (total: 13)
CollectCoinsAction -> CollectKeyAction -> OpenDoorAction = ExitWorld (total: 6)
Each action is composed of preconditions and effects. Preconditions are conditions required
to execute the action, and effects are the result the action produces after it has successfully
been executed. An example set of actions and their preconditions and effects may be:
CollectCoinAction
Precondition: hasCoin = false
Effect: hasCoin = true
CollectKeyAction
Precondition: hasCoin = true, hasKey = false
Effect: hasKey = true
OpenDoorAction
Precondition: hasKey = true
Effect: exitWorld = true
Depending on the action, there can be multiple preconditions or effects associated with a
single action.
In summary, a goal is described by relevant world state properties. An action is
described in terms of its effect on the world and its world state preconditions. A plan is a
sequence of actions which is formulated by a component called the planner.
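To make this representation concrete, below is a minimal sketch (not the implementation described in this paper) of how such world states and actions could be encoded, assuming a simple map of named boolean properties. The WorldState alias, the Action struct, and the Satisfies helper are illustrative names introduced here, not part of any existing library.

#include <iostream>
#include <map>
#include <string>
#include <vector>

// A world state is a set of named boolean properties (e.g. hasKey = true).
using WorldState = std::map<std::string, bool>;

// An action is described purely by its preconditions, its effects, and a cost.
struct Action {
    std::string name;
    WorldState preconditions;  // properties that must hold before the action runs
    WorldState effects;        // properties the action sets once it has run
    int cost = 1;
};

// True if every property required by 'conditions' holds in 'state'.
bool Satisfies(const WorldState& state, const WorldState& conditions) {
    for (const auto& [key, value] : conditions) {
        auto it = state.find(key);
        if (it == state.end() || it->second != value) return false;
    }
    return true;
}

int main() {
    WorldState current{{"hasCoin", false}, {"hasKey", false}, {"exitWorld", false}};

    std::vector<Action> actions = {
        {"CollectCoinAction", {{"hasCoin", false}}, {{"hasCoin", true}}, 1},
        {"CollectKeyAction",  {{"hasCoin", true}, {"hasKey", false}}, {{"hasKey", true}}, 1},
        {"OpenDoorAction",    {{"hasKey", true}}, {{"exitWorld", true}}, 1},
    };

    for (const Action& a : actions)
        std::cout << a.name << (Satisfies(current, a.preconditions)
                                ? " is applicable now\n" : " is not applicable yet\n");
}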
Figure 1 The relationship between the AI agent and the planner.
The input to the planner includes the current world state, a previously selected goal,
and a set of actions. The goal's properties are combined with the current state of the world
to produce the goal state of the world. A plan is then formulated by searching the state-space
graph for an optimally ordered sequence of actions that produces the goal state of the world
from the current state of the world.
4 Our Attempt at GOAP
We created a simple demo application of GOAP, built around a world consisting of a player,
several coins, a key shop, and an exit door. The player's ultimate goal was to leave the
world through the exit door, which was initially locked. In order to open the door, the
player had to purchase a key from the key shop for five coins. The player initially did not
own any coins; he had to collect coins from around the world by moving to each coin and
picking it up. Hence, the player's general plan was to pick up enough coins to buy a key,
and then use the purchased key to leave the world through the exit door.
Figure 2 This is a screenshot of our demo application, which displays the player’s
current action and plan as they are carried out.
A planner was responsible for formulating a specific plan, or sequence of actions,
that the player would follow in order to achieve the goal. To create a plan for the player,
the planner goes through three steps: recognize all available actions, search for the
optimal sequence of actions that achieves the goal, and relay that sequence to the player.

Before any of this can begin, however, the programmer has to supply the entire set
of actions available to the player, each with its own preconditions and effects. In our
demo, the entire set of actions was:
PickCoinAction
Precondition: coinsOwned < 5
Effect: coinsOwned += 1
PickKeyAction
Precondition: coinsOwned >= 5, hasKey = false
Effect: hasKey = true
OpenDoorAction
Precondition: hasKey = true, exitWorld = false
Effect: exitWorld = true
It is from this entire set of actions that the planner chooses which actions are available
to the player at the current state of the world. For instance, at the initial state of the
world, the player's parameters are given as coinsOwned = 0, hasKey = false, and
exitWorld = false. Of the three given actions, the player's parameters satisfy the
preconditions of PickCoinAction only. Hence, at this state of the world, the only action
available to the player is PickCoinAction.
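As a rough illustration of this filtering step, the sketch below checks the demo's three actions against the initial player parameters. The Player struct and the per-action precondition lambdas are hypothetical stand-ins, not the demo's actual code.

#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical player parameters matching the demo's initial state.
struct Player {
    int coinsOwned = 0;
    bool hasKey = false;
    bool exitWorld = false;
};

// Each action carries a precondition predicate over the player's parameters.
struct DemoAction {
    std::string name;
    std::function<bool(const Player&)> precondition;
};

int main() {
    Player player;  // coinsOwned = 0, hasKey = false, exitWorld = false

    std::vector<DemoAction> actions = {
        {"PickCoinAction", [](const Player& p) { return p.coinsOwned < 5; }},
        {"PickKeyAction",  [](const Player& p) { return p.coinsOwned >= 5 && !p.hasKey; }},
        {"OpenDoorAction", [](const Player& p) { return p.hasKey && !p.exitWorld; }},
    };

    // The planner starts a plan only from actions whose preconditions hold right now.
    std::cout << "Available at the initial state:\n";
    for (const DemoAction& a : actions)
        if (a.precondition(player))
            std::cout << "  " << a.name << "\n";  // prints PickCoinAction only
}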
Figure 3 This figure illustrates an example of a tree structure of actions. A path from
the root node to one of the terminal leaf nodes would represent a sequence of actions for the
player to take.
After the planner recognizes all actions currently available to the player, it creates a
tree structure, much like Figure 3 above, where each node contains an action to be performed
by the player. At each node, the player's parameters are updated according to the action
taken at that node, and every action that could be performed with the newly updated
parameters becomes a child of the node. Once a node's action allows the player to exit the
world, the node becomes terminal and creates no more children. This way, a possible action
sequence for the player is represented by a path between the topmost parent node and one of
the terminal leaf nodes of the constructed tree. Each terminal leaf node also stores the
total cost of the path from the topmost parent node to itself, which the planner uses to
pick the optimal sequence of actions that achieves the player's goal. Pseudocode for the two
functions that handle this part of the algorithm is given below:
Listing 1 Function to build a tree structure of actions
bool BuildTree(Node parent, List<Node> pathList,
               Set<Action> availableActions, State goal)
{
    // Recursively expands the action tree from 'parent'. Every node whose
    // accumulated state satisfies the goal is added to pathList as a terminal leaf.
    foundPath = false
    foreach action in availableActions:
        // An action is applicable only if its preconditions hold in the
        // state accumulated along the path so far.
        if parent.mState satisfies action.mPrecondition:
            currentState = ApplyEffects(parent.mState, action)
            Node node = new Node(parent,
                                 parent.mCost + action.mCost,
                                 currentState, action)
            if currentState satisfies goal:
                // Terminal leaf: a complete path to the goal.
                pathList.Add(node)
                foundPath = true
            else:
                // Recurse with the remaining actions, so each action is
                // used at most once along any path.
                Set<Action> subset = availableActions without action
                bool found = BuildTree(node, pathList, subset, goal)
                if found:
                    foundPath = true
    return foundPath
}
Listing 2 Function to choose the optimal sequence of actions and return it as a plan
List<Action> CreatePlan(GameObject agent, List<Action> availableActions,
                        State worldState, State goal)
{
    // Keep only the actions this agent is able to perform at all.
    Set<Action> usableActions = new Set<Action>()
    foreach action in availableActions:
        if action.IsUsableBy(agent):
            usableActions.Add(action)

    // Build the tree of all action sequences that reach the goal.
    List<Node> pathList = new List<Node>()
    Node start = new Node(null, 0, worldState, null)
    bool success = BuildTree(start, pathList, usableActions, goal)

    // If no plan was found, return null so the agent can go idle.
    if not success:
        return null

    // Pick the terminal leaf with the lowest accumulated cost.
    Node optimal = null
    foreach terminalNode in pathList:
        if optimal == null or terminalNode.mCost < optimal.mCost:
            optimal = terminalNode

    // Walk back up to the root to recover the ordered list of actions.
    List<Action> actionList = new List<Action>()
    while optimal != null and optimal.mAction != null:
        actionList.push_front(optimal.mAction)
        optimal = optimal.mParent
    return actionList
}
After a plan has been formulated by the planner, it is sent to the player, and the player
begins to execute the sequence of actions in the plan. Each execution of an action by the
player can either succeed or fail. Upon success, the player updates its parameters and
proceeds to the next action in the sequence; upon failure, the player relays the information
to the planner, which re-evaluates the world state, makes a new plan, and returns it to the
player. If at any time the planner fails to find a plan that achieves the player's goal, the
player does not receive a plan and becomes idle until a change in the state of the world
allows the planner to come up with a working plan again.
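The control flow just described might be sketched as follows. The Planner and Agent interfaces and their stub methods are assumptions made for illustration, not the demo's actual classes.

#include <optional>
#include <vector>

struct Action { /* preconditions, effects, cost, ... */ };
struct WorldState { /* named properties of the world ... */ };

// Hypothetical interfaces; a real planner and agent would replace these stubs.
struct Planner {
    // Returns a plan, or std::nullopt when no sequence of actions reaches the goal.
    std::optional<std::vector<Action>> CreatePlan(const WorldState&, const WorldState&) {
        return std::nullopt;  // stub
    }
};

struct Agent {
    WorldState world;
    WorldState goal;
    bool Execute(const Action&) { return true; }  // stub: perform one action, report success
    void ApplyEffects(const Action&) {}           // stub: update parameters on success
    void Idle() {}                                // stub: wait for the world to change
};

// One pass of the execute-or-replan loop described above.
void RunAgent(Agent& agent, Planner& planner) {
    auto plan = planner.CreatePlan(agent.world, agent.goal);
    while (plan && !plan->empty()) {
        Action next = plan->front();
        plan->erase(plan->begin());
        if (agent.Execute(next)) {
            agent.ApplyEffects(next);                            // success: next action
        } else {
            plan = planner.CreatePlan(agent.world, agent.goal);  // failure: replan
        }
    }
    if (!plan) agent.Idle();  // no valid plan: idle until the world changes
}

int main() {
    Agent agent;
    Planner planner;
    RunAgent(agent, planner);
}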
4.1 Possible Improvements
The planner in our demo used a tree data structure to search for all possible paths to the
goal state of the world, and then chose the path with the lowest cost. In effect this was an
uninformed, exhaustive search that did not stop upon finding a path to the goal, but went on
to enumerate every possible path so that their costs could be compared and the optimal one
returned. This is inefficient, since we care about the total cost of a path rather than the
number of actions in it; uninformed searches such as BFS are only efficient at minimizing the
latter. The search can be improved with an informed algorithm that still guarantees an
optimal path, such as A* with an admissible heuristic.
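As a reference point, below is a compact sketch of what such an informed search could look like: forward A* over boolean world states, using the number of unsatisfied goal properties as the heuristic, which is admissible as long as every action costs at least one and fixes at most one missing goal property. The state representation, costs, and helper names are illustrative assumptions, not code from our demo.

#include <iostream>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

using WorldState = std::map<std::string, bool>;

struct Action {
    std::string name;
    WorldState preconditions, effects;
    int cost = 1;
};

bool Satisfies(const WorldState& state, const WorldState& conditions) {
    for (const auto& [k, v] : conditions) {
        auto it = state.find(k);
        if (it == state.end() || it->second != v) return false;
    }
    return true;
}

// Heuristic: count of goal properties not yet satisfied.
int Heuristic(const WorldState& state, const WorldState& goal) {
    int missing = 0;
    for (const auto& [k, v] : goal) {
        auto it = state.find(k);
        if (it == state.end() || it->second != v) ++missing;
    }
    return missing;
}

std::vector<std::string> PlanAStar(const WorldState& start, const WorldState& goal,
                                   const std::vector<Action>& actions) {
    struct SearchNode { int f, g; WorldState state; std::vector<std::string> plan; };
    auto worse = [](const SearchNode& a, const SearchNode& b) { return a.f > b.f; };
    std::priority_queue<SearchNode, std::vector<SearchNode>, decltype(worse)> open(worse);
    std::set<WorldState> closed;

    open.push({Heuristic(start, goal), 0, start, {}});
    while (!open.empty()) {
        SearchNode node = open.top();
        open.pop();
        if (Satisfies(node.state, goal)) return node.plan;  // cheapest plan found
        if (!closed.insert(node.state).second) continue;    // state already expanded
        for (const Action& a : actions) {
            if (!Satisfies(node.state, a.preconditions)) continue;
            WorldState next = node.state;
            for (const auto& [k, v] : a.effects) next[k] = v;  // apply the effects
            std::vector<std::string> plan = node.plan;
            plan.push_back(a.name);
            int g = node.g + a.cost;
            open.push({g + Heuristic(next, goal), g, next, plan});
        }
    }
    return {};  // no plan exists
}

int main() {
    WorldState start{{"hasCoin", false}, {"hasKey", false}, {"exitWorld", false}};
    WorldState goal{{"exitWorld", true}};
    std::vector<Action> actions = {
        {"CollectCoinAction", {{"hasCoin", false}}, {{"hasCoin", true}}, 3},
        {"CollectKeyAction",  {{"hasCoin", true}, {"hasKey", false}}, {{"hasKey", true}}, 2},
        {"OpenDoorAction",    {{"hasKey", true}}, {{"exitWorld", true}}, 1},
    };
    for (const std::string& step : PlanAStar(start, goal, actions))
        std::cout << step << "\n";  // CollectCoinAction, CollectKeyAction, OpenDoorAction
}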
5 Advantages and Disadvantages of GOAP
From what we have discussed so far, GOAP, and planners in general for that matter, seems
very promising. However, larger games that base their AI architecture primarily on planning
systems are rare, even after Jeff Orkin's GDC 2006 presentation on the game F.E.A.R. In this
section, we assess the advantages and disadvantages of GOAP, and analyze why it has not
overtaken variations of FSMs and behavior trees as the dominant AI architecture in the game
industry.
The biggest advantage of GOAP over other architectures is that it creates a dynamic
problem-solving environment, which goes a long way toward making the in-game AI behavior
look more realistic. Because GOAP actions carry preconditions and effects that alter the
world state, agents working from the planner's instructions can respond to the same problem
in different ways, depending on the resources available to them and the conditions imposed
on them. In particular, GOAP responds very well to unexpected changes in the world, since
all it needs to do is come up with a new plan; this is a clear advantage over behavior
trees, which are notorious for handling interruptions poorly.
Another advantage of GOAP is that the designer can provide agents with all the actions they
would like to see performed without having to couple specific actions to goals. All the
designer needs to specify is which preconditions, if any, must be met for an action to be
performed, and what effects the action has on the world state once it has been performed.
Figure 4 illustrates how this open relationship between actions and goals results in far
lower complexity than other designs would need to achieve the same end result.
Figure 4 In GOAP, the designer can provide actions and goals without having to
associate certain actions with certain goals, since the agent will figure out by itself which
actions are necessary to achieve certain goals.
However, GOAP does have its share of problems. The biggest problem with using GOAP as the
main architecture of a game's AI system is that it is extremely hard to design correctly.
Planning systems work best when designers want the AI to achieve something and do not
particularly care how it is done, as long as it looks realistic. When the designer is also
interested in the process by which the agent achieves its goal, planning systems suddenly
become very difficult to work with. For instance, when a designer wants an agent to display
a certain behavior, they cannot simply tell the agent to demonstrate it. Instead, the
designer must first come up with a set of actions the agent can perform, along with
preconditions and effects for each action. Then the designer must tune the parameters of the
world and supply a goal such that the planner concludes that the desired sequence of
behaviors is the optimal way to achieve that goal.
Another problem of GOAP, which follows from the one above, is that when it is not done well,
it is simply better to use behavior trees to achieve the same behavior. Although one of the
biggest advantages of GOAP is the realistic behavior its agents display through dynamic
problem-solving, if the GOAP design itself is not very good, the resulting behavior reaches
a level of realism that could be matched by simply using behavior trees. Behavior trees also
have a large performance advantage over planners, since they do not rely on dynamic
problem-solving at runtime. While a planner needs to re-assess the world state each time it
creates a plan, or whenever the world changes significantly enough to matter, a behavior
tree is authored ahead of time, is ready to go at runtime, and can output similarly
realistic behaviors without spending many resources at runtime. Hence, if a GOAP system is
not designed well, it loses the advantages it has and is left only with its performance
downsides.
6 Conclusion
GOAP is a type of planning architecture which assigns a goal to an agent, provides it with
all available actions that the agent can perform, and then lets the agent figure out which
sequence of actions optimally achieves the goal given to it. When designed correctly, GOAP
can be a powerful AI architecture for a game, outputting realistic behavior across any setting
the designer might throw at the agents. However, if a designer is more interested in having
agents display specific types of behaviors, it is extremely difficult to design a GOAP
system that perfectly suits those needs; most of the time the advantages of an imperfect
GOAP system are negligible compared to its disadvantages, which is why games tend to prefer
behavior trees and various types of FSMs for their AI architecture. If your game values
realistic agent behavior over specific, designer-authored behaviors, a GOAP system might be
the right choice for your game, and you should give it a try.
7 References
7.1 Online references
[AIGameDev] Alex J. Champandard. Planning in Games: An Overview of the Lessons Learned.
http://aigamedev.com/open/review/planning-in-games/
[MIT Media Lab] Jeff Orkin. Goal-Oriented Action Planning.
http://alumni.media.mit.edu/~jorkin/goap.html