Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Deliberately Planning and Acting for Angry Birds with
Refinement Methods
Ruofei Du, Zebao Gao, Zheng Xu
CMSC 722 Technical...
Introduction
Angry Birds
Introduction
Angry Birds
200 Million
Introduction
Angry Birds
Introduction
Angry Birds
Introduction
Angry Birds
Birds
Introduction
Angry Birds
Slingshot
Introduction
Angry Birds
Pig
s
Introduction
Angry Birds
Introduction
Angry Birds
Introduction
Angry Birds
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Related Work
Planning & Machine Learning
Most work focuses on
modeling the physical properties of the game or
utilize mach...
Contribution
Sample log of our system
0 [System] Program starts with configuration -d
32 [System] Server waiting on port: ...
Design and Implementation of
Greedy, DFFS, A* specifically
for solving Angry Birds puzzles
Design a Refinement Acting Engine (RAE)
Based on ARP-interleave
With Sequential Refinement Planning Engine (SeRPE)
Evaluation and Comparison amongst
3 algorithms in 2 conditions
Greedy / DFFS / A* && Planning Only / SeRPE
Discussion, illustration and insights from
Integrating Planning and Acting
Using refinement methods for Angry Birds
Angry Bird Platform
Rebuilt upon Ge et al., 2014
Figure by our team
Formalism
State Variable Representation
Formalism
State Variable Representation
The angry birds platform could
execute the following commands
Formalism
Commands and Tasks
findMBR()
perceive all the minimum bounding
rectangles (MBR) of Objects in the current
state
estimateLauchPoint(sl, target)
estimate the Start Points
used to shoot bird from the Sling and the target Point
shot(b, start)
shot a bird by pulling it to the start point
isReachable(sl, start, target)
estimate if the Target Point is reachable based on the
Start Point, or the Target is blocke...
getSupporters(obj)
get all the objects that suppot the object
We define the following commands
that are executable in the program
getMBR(obj)
getHitPoint(obj)
isBlocked(obj)
getAllPigs()
getAllBirds()
getCurrentBird()
getTNTs()
getObstacles(obj)
estima...
We define the following tasks
destroyAllPigs()
chooseTarget()
estimateScore(p)
destroy(obj)
destroy(obj, b)
hit(b, obj)
Refinement method
m-destroyAllPigs()
m1-chooseTarget()
m2-chooseTarget()
m1-selectTarget()
m2-selectTarget()
m1-destroyTarget(t)
m2-destroyT...
Refinement Tree
Challenge
World is seldom predictable
(Ghallab, Nau and Traverso, 2015)
Actions failures
A shot may not accomplish the goal of destroying a
targeted pig
Challenge
World is seldom predictable
(Gh...
Unexpected side effects of
actions
A shot may changing the future planning state by
destroying other pigs
Challenge
World ...
Exogenous events
A shot may put the stage into an unsolvable state like
surrounding a pig with irons
Challenge
World is se...
ARP-interleave
(Ghallab, Nau and Traverso, 2015)
1. Before each step:
Generate a plan based on the SeRPE
refinement tree.
ARP-interleave
(Ghallab, Nau and Traverso, 2015)
2. After each step:
Update information of the world
(outcomes, objects)
ARP-interleave
(Ghallab, Nau and Traverso, 2015)
3. If the current step fails:
try remaining methods
It models the problem more
accurately
Advantages
ARP-interleave Framework
Plans generated by this system
are more practical
the actual status of the world are always taken into
consideration
Advan...
Better decision
can be made about choosing the next plan..
Advantages
ARP-interleave Framework
Search algorithms for
nondeterministic choice in SeRPE
SeRPE-Greedy
Two heuristics:
1. always choose an existing pig
as target
2. choose the pig that is closest to
the sling
SeRPE-DFFS
1 Heuristic:
always choose an existing pig as target
Impl:
maintain visited states in a HashSet
SeRPE-A*
Experiment
Nvidia Quadro K6000
Experiment
Intel Xeon CPU E5-2667 2.90GHz
32GB RAM
Experiment
Chrome Version of Angry Birds
chrome.angrybirds.com
Experiment
Chrome Version of Angry Birds
chrome.angrybirds.com
Dataset
Statistics
Dataset
Highest Score for Each Level
Dataset
Average Time Cost for Each Level
(ms)
Results
SeRPE-A* outperforms both in
highest score and time cost
Why deterministic
models are worse? /
Why SeRPE with A* is
better?
Inaccurate Simulation
It is a challenging task to derive an exact model of a physical world.
The knowledge base of the phy...
Changes in Time
In practical scenarios like Angry Birds, the global states of any
object may be subject to change from tim...
Erroneous Actions
As for human beings, it is hard to perform an action such as
textit{move left hand 5 centimeters forward...
1. present a deliberately planning and acting system
with refinement methods for the popular Angry
Birds video game.
2. de...
Deliberately Planning and Acting for Angry Birds with
Refinement Methods
Ruofei Du, Zebao Gao, Zheng Xu
CMSC 722 Technical...
Introduction
Angry Birds
Deliberately Planning and Acting for Angry Birds with Refinement Methods
Deliberately Planning and Acting for Angry Birds with Refinement Methods
Deliberately Planning and Acting for Angry Birds with Refinement Methods
Deliberately Planning and Acting for Angry Birds with Refinement Methods
Upcoming SlideShare
Loading in …5
×

Deliberately Planning and Acting for Angry Birds with Refinement Methods

1,636 views

Published on

Authors: Ruofei Du, Zebao Gao, Zheng Xu; Advisors: Prof. Dana S. Nau and Dr. Vikas Shivashankar; This is a supplementary video for CMSC 722 AI Planning.

Check out papers, slides and more here: http://duruofei.com/Research/angrybirds

Angry Birds has been a popular game throughout the world since 2009. The goal of the game is to destroy all the pigs and as many obstacles as possible using a limited number of birds. Since the game environment is subject to change tremendously after each shot, a deterministic planning model is very likely to fail. In this paper, we integrate deliberately planning and acting for Angry Birds with refinement methods. Specifically, we design a refinement acting engine (RAE) based on ARP-interleave with Sequential Refinement Planning Engine (SeRPE). In addition, we implement greedy algorithm, Depth First Forward Search (DFFS) and $A^*$ algorithm to perform the actor's deliberation functions. Eventually, we evaluate our agent to solve the web version of Angry Birds in Chrome using the client-server platform provided by the IJCAI 2015 AI Birds Competition. In our experiments, we find out that our agent using SeRPE with $A^*$ algorithm greatly outperforms the agent using greedy algorithm or forward search without SeRPE. In this way, we prove the significance of refinement methods for planning in practice. Please see the supplementary video \url{https://youtu.be/u7XJ0g6d9po} for more results.

Published in: Technology
  • Be the first to comment

Deliberately Planning and Acting for Angry Birds with Refinement Methods

  1. 1. Deliberately Planning and Acting for Angry Birds with Refinement Methods Ruofei Du, Zebao Gao, Zheng Xu CMSC 722 Technical Report advised by Prof. Nau & Dr. Shivashankar Computer Science Dept., Univ. of Maryland, College Park
  2. 2. Introduction Angry Birds
  3. 3. Introduction Angry Birds 200 Million
  4. 4. Introduction Angry Birds
  5. 5. Introduction Angry Birds
  6. 6. Introduction Angry Birds Birds
  7. 7. Introduction Angry Birds Slingshot
  8. 8. Introduction Angry Birds Pig s
  9. 9. Introduction Angry Birds
  10. 10. Introduction Angry Birds
  11. 11. Introduction Angry Birds
  12. 12. Related Work Planning & Machine Learning
  13. 13. Related Work Planning & Machine Learning
  14. 14. Related Work Planning & Machine Learning
  15. 15. Related Work Planning & Machine Learning
  16. 16. Related Work Planning & Machine Learning
  17. 17. Related Work Planning & Machine Learning
  18. 18. Related Work Planning & Machine Learning Most work focuses on modeling the physical properties of the game or utilize machines learning as empirical knowledge base. None of the related work compares the methods using or not using refinements methods in an ARP-interleave framework.
  19. 19. Contribution Sample log of our system 0 [System] Program starts with configuration -d 32 [System] Server waiting on port: 9000 1051 [System] Client connected 1934 [System] Running A* agent without IRPE 8309 [Vision] Detecting objects in the Angry Birds using vision module 10068 [Vision] Detect TNTs 0 10069 [Vision] Detect pigs and rolling 1, glass 5, stone 1, wood 27, birds 8 10069 [Vision] Statistics: 1, 34, 0 10213 [Planner] m1-destroyAllPigs: 1 pigs 10216 [Planner] m2-destroyPig ab.vision.ABObject[x=536,y=290,width=14,height=10] 10216 [Planner] Estimate launch point from 2 10217 [Planner] launch point java.awt.Point[x=9,y=961], angle 73.79194556203768 17742 [Actor] Shooting starts 193, 328, -184, 633 27765 [Actor] Shooting completes 28096 [Vision] Scale factor = 1.0006048459391437 28267 [Vision] Detecting objects in the Angry Birds using vision module 28943 [Vision] Detect TNTs 0 28944 [Vision] Detect pigs and rolling 0, glass 2, stone 1, wood 10, birds2 28944 [Vision] Statistics: 0, 13, 0 29116 [Planner] m1-destroyAllPigs: 1 pigs 29116 [Planner] m2-destroyPig ab.vision.ABObject[x=536,y=290,width=14,height=10] 29117 [Planner] Estimate launch point from 2 29117 [Planner] launch point java.awt.Point[x=0,y=959], angle 72.99302990476194 36778 [Actor] no sling detected, can not execute the shot, will re-segement the image 37049 [Vision] Detecting objects in the Angry Birds using vision module 38198 [Vision] Detect TNTs 0
  20. 20. Design and Implementation of Greedy, DFFS, A* specifically for solving Angry Birds puzzles
  21. 21. Design a Refinement Acting Engine (RAE) Based on ARP-interleave With Sequential Refinement Planning Engine (SeRPE)
  22. 22. Evaluation and Comparison amongst 3 algorithms in 2 conditions Greedy / DFFS / A* && Planning Only / SeRPE
  23. 23. Discussion, illustration and insights from Integrating Planning and Acting Using refinement methods for Angry Birds
  24. 24. Angry Bird Platform Rebuilt upon Ge et al., 2014 Figure by our team
  25. 25. Formalism State Variable Representation
  26. 26. Formalism State Variable Representation
  27. 27. The angry birds platform could execute the following commands Formalism Commands and Tasks
  28. 28. findMBR() perceive all the minimum bounding rectangles (MBR) of Objects in the current state
  29. 29. estimateLauchPoint(sl, target) estimate the Start Points used to shoot bird from the Sling and the target Point
  30. 30. shot(b, start) shot a bird by pulling it to the start point
  31. 31. isReachable(sl, start, target) estimate if the Target Point is reachable based on the Start Point, or the Target is blocked by Obstacles
  32. 32. getSupporters(obj) get all the objects that suppot the object
  33. 33. We define the following commands that are executable in the program
  34. 34. getMBR(obj) getHitPoint(obj) isBlocked(obj) getAllPigs() getAllBirds() getCurrentBird() getTNTs() getObstacles(obj) estimateScoreOfObjs(objs)
  35. 35. We define the following tasks
  36. 36. destroyAllPigs() chooseTarget() estimateScore(p) destroy(obj) destroy(obj, b) hit(b, obj)
  37. 37. Refinement method
  38. 38. m-destroyAllPigs() m1-chooseTarget() m2-chooseTarget() m1-selectTarget() m2-selectTarget() m1-destroyTarget(t) m2-destroyTarget(p) m1-destroyPig(p, b) m1-hit(a, obj) m2-hit(a, obj) m3-hit(a, obj) m4-hit(a, obj)
  39. 39. Refinement Tree
  40. 40. Challenge World is seldom predictable (Ghallab, Nau and Traverso, 2015)
  41. 41. Actions failures A shot may not accomplish the goal of destroying a targeted pig Challenge World is seldom predictable (Ghallab, Nau and Traverso, 2015)
  42. 42. Unexpected side effects of actions A shot may changing the future planning state by destroying other pigs Challenge World is seldom predictable (Ghallab, Nau and Traverso, 2015)
  43. 43. Exogenous events A shot may put the stage into an unsolvable state like surrounding a pig with irons Challenge World is seldom predictable (Ghallab, Nau and Traverso, 2015)
  44. 44. ARP-interleave (Ghallab, Nau and Traverso, 2015) 1. Before each step: Generate a plan based on the SeRPE refinement tree.
  45. 45. ARP-interleave (Ghallab, Nau and Traverso, 2015) 2. After each step: Update information of the world (outcomes, objects)
  46. 46. ARP-interleave (Ghallab, Nau and Traverso, 2015) 3. If the current step fails: try remaining methods
  47. 47. It models the problem more accurately Advantages ARP-interleave Framework
  48. 48. Plans generated by this system are more practical the actual status of the world are always taken into consideration Advantages ARP-interleave Framework
  49. 49. Better decision can be made about choosing the next plan.. Advantages ARP-interleave Framework
  50. 50. Search algorithms for nondeterministic choice in SeRPE
  51. 51. SeRPE-Greedy Two heuristics: 1. always choose an existing pig as target 2. choose the pig that is closest to the sling
  52. 52. SeRPE-DFFS 1 Heuristic: always choose an existing pig as target Impl: maintain visited states in a HashSet
  53. 53. SeRPE-A*
  54. 54. Experiment Nvidia Quadro K6000
  55. 55. Experiment Intel Xeon CPU E5-2667 2.90GHz 32GB RAM
  56. 56. Experiment Chrome Version of Angry Birds chrome.angrybirds.com
  57. 57. Experiment Chrome Version of Angry Birds chrome.angrybirds.com
  58. 58. Dataset Statistics
  59. 59. Dataset Highest Score for Each Level
  60. 60. Dataset Average Time Cost for Each Level (ms)
  61. 61. Results SeRPE-A* outperforms both in highest score and time cost
  62. 62. Why deterministic models are worse? / Why SeRPE with A* is better?
  63. 63. Inaccurate Simulation It is a challenging task to derive an exact model of a physical world. The knowledge base of the physical world is built upon the prior exploration with trials and errors. Even in a virtual world like Angry Birds, it is almost impossible to model the world without reading its source code. Discussion Why deterministic model fails (Ghallab, Nau and Traverso, 2015)
  64. 64. Changes in Time In practical scenarios like Angry Birds, the global states of any object may be subject to change from time to time. For example, it is very hard to know whether the scene is static or not after a shot. The structure may be unstable and moving in a very slow pace. As a result, the subtle changes in the global states may lead to failure of the current plan. Discussion Why deterministic model fails (Ghallab, Nau and Traverso, 2015)
  65. 65. Erroneous Actions As for human beings, it is hard to perform an action such as textit{move left hand 5 centimeters forward exactly}. Similarly, for a video game like Angry Birds that involves human interactions, the acting engine hardly points to the perfect position in pixel-level. Consequently, the actor may produce different states than the planner predicted. Discussion Why deterministic model fails (Ghallab, Nau and Traverso, 2015)
  66. 66. 1. present a deliberately planning and acting system with refinement methods for the popular Angry Birds video game. 2. design a RAE based on ARP-interleave with SeRPE. Greedy, DFFS and A* are incorporated to search for planning 3. our experimental results show that ARP-interleave outperforms their counterparts which purely rely on the search algorithms Conclusion Summary
  67. 67. Deliberately Planning and Acting for Angry Birds with Refinement Methods Ruofei Du, Zebao Gao, Zheng Xu CMSC 722 Technical Report advised by Prof. Nau & Dr. Shivashankar Computer Science Dept., Univ. of Maryland, College Park Thank you!
  68. 68. Introduction Angry Birds

×