
Dissertation defense


  1. expressiveintelligencestudio Integrating Learning in a Multi-Scale Agent, Ben Weber, Dissertation Defense, May 18, 2012
  2. expressiveintelligencestudio UC Santa Cruz Introduction
     - AI has a long history of using games to advance the state of the field [Shannon 1950]
  3. Real-Time Strategy Games
     - Building human-level AI for RTS games remains an open research challenge
     - Image: StarCraft II, Blizzard Entertainment
  4. Task Environment Properties [Russell & Norvig 2009]
     Property                         Chess          StarCraft       Taxi Driving
     Fully vs. partially observable   Fully          Partially       Partially
     Deterministic vs. stochastic     Deterministic  Deterministic*  Stochastic
     Episodic vs. sequential          Sequential     Sequential      Sequential
     Static vs. dynamic               Static         Dynamic         Dynamic
     Discrete vs. continuous          Discrete       Continuous      Continuous
     Single vs. multiagent            Multi          Multi           Multi
  5. Motivation [Langley 2011, Mateas 2002]
     - RTS games present complex environments and complex tasks
     - Professional players demonstrate a broad range of reasoning capabilities
     - Human behavior can be observed, emulated, and evaluated
  6. Hypothesis
     - Reproducing expert-level StarCraft gameplay involves integrating heterogeneous reasoning capabilities
  7. Research Questions
     - What competencies are necessary for expert StarCraft gameplay?
     - Which competencies can be learned from demonstrations?
     - How can these competencies be integrated in a real-time agent?
  8. Overview
     - StarCraft
     - Multi-Scale AI
     - Learning from Demonstration
     - Integrating Learning
     - Evaluation
  9. StarCraft [Flash, Pro-gamer]
     - Expert gameplay
       - 300+ APM
       - Evolving meta-game
     - Exhibited capabilities
       - Estimation
       - Anticipation
       - Adaptation
  10. StarCraft Gameplay
     - Diagram: Expand Tech Tree, Manage Economy, Produce Units, Attack Opponent
  11. Gameplay Scales in StarCraft
     - Individual
     - Squad
     - Global
     - Pictured examples: Support siege line, Worker harassment, Aggressive mine placement
  12. State Space
     - Number of possible states, considering only unit type and location: (Type * X * Y)^Units
     - States on a 256x256 tile map: (100 * 256 * 256)^1700 > 10^11,500
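The bound on slide 12 can be checked with logarithms, since the number itself is far too large to print usefully. This is a minimal sketch; the function and parameter names are illustrative and not from the slides.

```python
import math


def log10_state_count(unit_types: int, width: int, height: int, units: int) -> float:
    """Return log10 of (unit_types * width * height) ** units."""
    # log10(a ** n) == n * log10(a), which avoids building the huge integer.
    return units * math.log10(unit_types * width * height)


# Values taken from slide 12: 100 unit types, a 256x256 tile map, 1700 units.
exponent = log10_state_count(unit_types=100, width=256, height=256, units=1700)
print(f"state count is roughly 10^{exponent:.0f}")
```

The resulting exponent sits above 11,500, which agrees with the inequality on the slide.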
  13. Decision Complexity [Aha et al. 2005]
     - The set of possible actions that can be executed at a particular moment: O(2^W(A * P) + 2^T(D + S) + B(R + C))
       - W: number of workers
       - A: number of types of worker assignments
       - P: average number of workspaces
       - T: number of troops
       - D: number of movement directions
  14. Decision Complexity
     - The set of possible actions that can be executed at a particular moment: O(W * A * P + T * D * S + B(R + C))
     - Assumption
       - Unit actions can be selected independently
     - Resulting complexity
       - Assuming 50 worker units on a 256x256 tile map results in more than 1,000,000 possible actions
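The "more than 1,000,000 possible actions" figure on slide 14 can be reproduced from the worker term alone. The sketch below makes two assumptions that the slides do not state: a single assignment type (A = 1), and every tile of the 256x256 map counted as a candidate workspace (P = 256 * 256). The slides do not define S, B, R, or C, so the parameter names for those symbols are placeholders.

```python
def independent_action_count(workers, assignments, workspaces,
                             troops=0, directions=0, s=0,
                             buildings=0, r=0, c=0):
    """Evaluate W*A*P + T*D*S + B*(R + C) from slide 14.

    s, r, and c stand in for the symbols S, R, and C, which the slides
    leave undefined; they default to zero here.
    """
    return (workers * assignments * workspaces
            + troops * directions * s
            + buildings * (r + c))


# Assumption: one assignment type, and each map tile is a possible workspace.
worker_term = independent_action_count(workers=50, assignments=1,
                                       workspaces=256 * 256)
print(worker_term)
```

Even with the troop and building terms set to zero, the worker term under these assumptions is 50 * 65,536 = 3,276,800, which is already above one million.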
  15. StarCraft
     - Complex gameplay
     - Real-world properties
     - Highly competitive
     - Sources of expert gameplay
  16. Research Question #1
     - What competencies are necessary for expert StarCraft gameplay?
  17. Multi-Scale AI
     - Multiple scales
       - Actions are performed across multiple levels of coordination
     - Interrelated tasks
       - Performance in each task impacts other tasks
     - Real-time
       - Actions are performed in real time
  18. Reactive Planning [Loyall 1997, Mateas 2002]
     - Provides useful mechanisms for building multi-scale agents
     - Advantages
       - Efficient behavior selection
       - Interleaved plan expansion and execution
     - Disadvantages
       - Lacks deliberative capabilities
  19. Agent Design [McCoy & Mateas 2008]
     - Implemented in the ABL reactive planning language
     - Architecture
       - Extension of McCoy & Mateas integrated agent framework
       - Partitions gameplay into distinct competencies
       - Uses a blackboard for coordination
  20. EISBot Managers
     - Managers: Strategy Manager, Income Manager, Production Manager, Tactics Manager, Recon Manager
     - Tasks shown: Gather Resources, Construct Buildings, Attack Opponent, Scout Opponent
  21. Multi-Scale Idioms
     - Design patterns for authoring multi-scale AI
     - Idioms
       - Message passing
       - Daemon behaviors
       - Managers
       - Unit subtasks
       - Behavior locking
  22. Idioms in EISBot
     - Diagram nodes: Initial_tree, Tactics Manager, Strategy Manager, Income Manager, Form Squad, Squad Monitor, Squad Attack, Squad Retreat, Attack Enemy, Pump Probes, Dragoon Dance, Timing Attack WME, Probe Stop WME
     - Legend: Subgoal, Daemon behavior, Message passing
  23. Multi-Scale AI
     - StarCraft gameplay is multi-scale
     - Reactive planning provides mechanisms for multi-scale reasoning
     - Idioms are applied in EISBot to support StarCraft gameplay
  24. Research Question #2
     - Which competencies can be learned from demonstrations?
  25. Learning from Demonstration
     - Objective
       - Emulate capabilities exhibited by expert players by harnessing gameplay demonstrations
     - Methods
       - Classification and regression model training
       - Case-based goal formulation
       - Parameter selection for model optimization
  26. Strategy Prediction [Hsieh & Sun 2008]
     - Tasks
       - Identify opponent build orders
       - Predict when buildings will be constructed
     - Figure: Spawning Pool Timing, plotted against Game Time (minutes)
  27. Approach
     - Feature encoding
       - Each player’s actions are encoded in a single vector
       - Vectors are labeled using a build-order rule set
       - Features describe the game cycle when a unit or building type is first produced by a player
     - f(x) = t, the time when x is first produced by P
     - f(x) = 0, if x was not (yet) produced by P
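The feature encoding on slide 27 can be sketched as a small function. The event tuple layout, the function name, and the sample unit types and timings are hypothetical; the slides only specify that each feature holds the time a type is first produced, or 0 if it has not been produced.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (game_cycle, player, unit_or_building_type).
Event = Tuple[int, str, str]


def encode_first_production(events: Iterable[Event], player: str,
                            types: List[str]) -> List[int]:
    """Return one feature per type: first production cycle, or 0 if never produced."""
    first_produced = {}
    for cycle, owner, produced_type in events:
        if owner != player:
            continue  # each player's actions go into that player's own vector
        if produced_type not in first_produced or cycle < first_produced[produced_type]:
            first_produced[produced_type] = cycle
    return [first_produced.get(t, 0) for t in types]


# Made-up sample data for illustration only.
events = [
    (1200, "P1", "Spawning Pool"),
    (1500, "P2", "Gateway"),
    (2400, "P1", "Zergling"),
    (3000, "P1", "Zergling"),
]
types = ["Spawning Pool", "Zergling", "Hydralisk Den"]
vector = encode_first_production(events, "P1", types)
print(vector)
```

One consequence of this encoding is that a type produced at cycle 0 would be indistinguishable from a type that was never produced.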
  28. Strategy Prediction Results
     - Figure: Recall Precision vs. Game Time (minutes); series: NNge, Boosting, Rule Set, State Lattice
  29. Strategy Learning [Ontañón et al. 2010]
     - Task
       - Learn build-orders from demonstration
     - Trace Algorithm
       - Converts replays to a trace representation
       - Formulates goals based on most similar situation
     - q = argmin_{c ∈ L} distance(s, c)
     - g = s + (q' - q)
  30. Trace Retrieval: Example
     - Consider a planning window of size 2
     - S = <3, 0, 1, 1>
     - T1 = <2, 0, 0.5, 1>
     - T2 = <3, 0, 0.7, 1>
     - T3 = <4, 1, 0.9, 1>
     - T4 = <4, 1, 1.1, 2>
  31. Trace Retrieval: Step 1
     - The system retrieves the most similar case, q (here T2)
  32. Trace Retrieval: Step 2
     - q' is retrieved (here T4, two steps after q)
  33. Trace Retrieval: Step 3
     - The difference is computed: T4 - T2 = <1, 1, 0.4, 1>
  34. Trace Retrieval: Step 4
     - g is computed: g = s + (T4 - T2) = <4, 1, 1.4, 2>
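The four retrieval steps on slides 31 to 34 can be reproduced in a few lines. This is a sketch under two assumptions the slides do not state: the distance function is Euclidean, and q' is clamped to the last entry when the window runs past the end of the trace. The function name is illustrative.

```python
import math


def formulate_goal(state, trace, window):
    """Compute g = s + (q' - q) for a single trace."""
    # Step 1: retrieve q, the trace entry most similar to the current state.
    # Euclidean distance is an assumption; the slides only say "distance".
    q_index = min(range(len(trace)), key=lambda i: math.dist(state, trace[i]))
    # Step 2: retrieve q', the entry `window` steps after q (clamped to the end).
    q_prime_index = min(q_index + window, len(trace) - 1)
    # Step 3: compute the difference q' - q.
    delta = [b - a for a, b in zip(trace[q_index], trace[q_prime_index])]
    # Step 4: add the difference to the current state to obtain the goal g.
    return [s + d for s, d in zip(state, delta)]


s = [3, 0, 1, 1]
trace = [
    [2, 0, 0.5, 1],  # T1
    [3, 0, 0.7, 1],  # T2
    [4, 1, 0.9, 1],  # T3
    [4, 1, 1.1, 2],  # T4
]
goal = formulate_goal(s, trace, window=2)
print([round(v, 6) for v in goal])
```

With the values from slide 30, T2 is the nearest entry, T4 is retrieved as q', and the goal matches the vector on slide 34 up to floating-point rounding.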
  35. Strategy Learning Results
     - Figure: Opponent modeling with a window size of 20; Prediction Error (RMSE) vs. Actions performed by player; series: Null, IB1, Trace, MultiTrace
  36. State Estimation [Thrun 2002, Bererton 2004]
     - Task
       - Estimate enemy positions given prior observations
     - Particle Model
       - Apply movement model
       - Remove visible particles
       - Reweight particles
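One update cycle of the particle model on slide 36 might look like the sketch below. The `Particle` fields, the straight-line movement model, the multiplicative weight decay, and the `is_visible` callback are all assumptions made for illustration; the slides name the three steps but not how each is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Particle:
    x: float
    y: float
    dx: float  # assumed per-update displacement
    dy: float
    weight: float


def update_particles(particles: List[Particle],
                     is_visible: Callable[[float, float], bool],
                     decay: float = 0.95) -> List[Particle]:
    """Apply the movement model, remove visible particles, then reweight."""
    updated = []
    for p in particles:
        # 1. Apply movement model (assumed: straight-line motion).
        x, y = p.x + p.dx, p.y + p.dy
        # 2. Remove particles in visible regions: the agent can see that tile,
        #    so an estimate there is no longer needed.
        if is_visible(x, y):
            continue
        # 3. Reweight (assumed: confidence decays by a constant factor).
        updated.append(Particle(x, y, p.dx, p.dy, p.weight * decay))
    return updated


# Hypothetical scenario: everything with x < 5 is currently visible.
particles = [Particle(6.0, 0.0, -2.0, 0.0, 1.0), Particle(9.0, 3.0, 1.0, 0.0, 1.0)]
remaining = update_particles(particles, lambda x, y: x < 5, decay=0.5)
print(remaining)
```

In this scenario the first particle moves into the visible region and is dropped, while the second moves away and keeps half its weight.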
  37. Parameter Selection
     - Free parameters
       - Trajectory weights
       - Decay rates
     - State estimation is represented as an optimization problem
       - Input: parameter weights
       - Output: particle model error
     - Replays are used to implement a particle model error function
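Treating parameter selection as an optimization problem, as slide 37 describes, can be sketched as a search over candidate parameter values that minimizes an error function. Exhaustive grid search is used here only because it is simple; the slides do not say which optimizer was used. The `toy_error` function is a stand-in for a replay-based particle model error, and its target values are invented.

```python
import itertools


def select_parameters(error_fn, candidates):
    """Return the (parameters, error) pair with the lowest error on a grid."""
    names = list(candidates)
    best_params, best_error = None, float("inf")
    for values in itertools.product(*(candidates[name] for name in names)):
        params = dict(zip(names, values))
        error = error_fn(params)  # input: parameter weights, output: model error
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


def toy_error(params):
    """Hypothetical stand-in for an error function computed from replays."""
    return ((params["trajectory_weight"] - 0.6) ** 2
            + (params["decay_rate"] - 0.9) ** 2)


grid = {
    "trajectory_weight": [0.2, 0.4, 0.6, 0.8],
    "decay_rate": [0.8, 0.9, 1.0],
}
best_params, best_error = select_parameters(toy_error, grid)
print(best_params, best_error)
```

In a real setup the error function would run the particle model over recorded games and compare its estimates with the true enemy positions stored in the replays.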
  38. State Estimation Results
     - Figure: Threat Prediction Error vs. Game Time (Minutes); series: Null Model, Perfect Tracker, Default Model, Optimized Model
  39. Learning from Demonstration
     - Anticipation
       - Classification and regression models
     - Adaptation
       - Case-based goal formulation
     - Estimation
       - Model optimization
  40. Research Question #3
     - How can these competencies be integrated in a real-time agent?
  41. Agent Architecture
  42. Integration Approaches
     - Augmenting working memory
     - External plan generation
     - External goal formulation
     - Diagram labels: Working Memory, External Components
  43. Augmenting Working Memory
     - Supplementing working memory with additional beliefs
  44. External Plan Generation
     - Generating plans outside the scope of ABL
  45. External Goal Formulation
     - Formulating goals outside the scope of ABL
  46. Goal-Driven Autonomy [Molineaux et al. 2010, Muñoz-Avila et al. 2010]
     - A framework for building self-introspective agents
     - GDA agents monitor plan execution, detect discrepancies, and explain failures
     - Implementations
       - Hand-authored rules
       - Case-based reasoning
  47. GDA Subtasks
     - Expectation generation
     - Discrepancy detection
     - Explanation generation
     - Goal formulation
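The four GDA subtasks on slide 47 can be chained into a single decision step. The sketch below uses dictionary-based, hand-authored rules; every function name, state feature, explanation, and goal in it is hypothetical and is not taken from EISBot.

```python
def generate_expectations(plan_step):
    """Expectation generation: what the plan step predicts about the game state."""
    return dict(plan_step["expected"])


def detect_discrepancies(expected, observed):
    """Discrepancy detection: features whose observed value differs from the expectation."""
    return {k: observed.get(k) for k, v in expected.items() if observed.get(k) != v}


def generate_explanation(discrepancies, explanation_rules):
    """Explanation generation: look up a hand-authored explanation for a discrepancy."""
    for feature in discrepancies:
        if feature in explanation_rules:
            return explanation_rules[feature]
    return None


def formulate_goal(explanation, goal_rules, current_goal):
    """Goal formulation: map an explanation to a new goal, else keep the current one."""
    return goal_rules.get(explanation, current_goal)


def gda_step(plan_step, observed, explanation_rules, goal_rules, current_goal):
    expected = generate_expectations(plan_step)
    discrepancies = detect_discrepancies(expected, observed)
    if not discrepancies:
        return current_goal  # the plan is proceeding as expected
    explanation = generate_explanation(discrepancies, explanation_rules)
    return formulate_goal(explanation, goal_rules, current_goal)


# Hypothetical rules and observations for illustration.
plan_step = {"expected": {"enemy_air_units": 0}}
explanation_rules = {"enemy_air_units": "opponent_built_air"}
goal_rules = {"opponent_built_air": "build_anti_air"}
new_goal = gda_step(plan_step, {"enemy_air_units": 3},
                    explanation_rules, goal_rules, "expand_economy")
print(new_goal)
```

When the observation matches the expectation, the step returns the current goal unchanged, so the agent keeps executing its existing plan.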
  48. Implementation
  49. Integrating Learning
     - ABL agents can be interfaced with external learning components
     - Applying the GDA model enabled tighter coordination across capabilities
     - EISBot incorporates ABL behaviors, a particle model, and a GDA implementation
  50. Evaluation
     - Claim
       - Reproducing expert-level StarCraft gameplay involves integrating heterogeneous reasoning capabilities
     - Experiments
       - Ablation studies
       - User study
  51. GDA Ablation Study
     - Agent configurations
       - Base
       - Formulator
       - Predictor
       - GDA
     - Free parameters
       - Planning window size
       - Look-ahead window size
       - Discrepancy period
     - Diagram components: Discrepancy Detector, Explanation Generator, Goal Formulator, Goal Manager
     - Diagram data flows: Discrepancies, Explanations, Goals
  52. GDA Results
     - Overall results from the GDA experiments
     Agent       Win Ratio
     Base        0.73
     Formulator  0.77
     Predictor   0.81
     GDA         0.92
  53. User Study [Dennis Fong, Pro-gamer]
     - Experiment setup
       - Matches hosted on ICCup
       - 3 trials
     - Testing script
       1. Launch StarCraft
       2. Connect to server
       3. Host match
       4. Announce experiment
  54. Performance on Tau Cross
     - Figure: ICCup Score vs. Number of Games Played; series: Base, Formulator, Predictor, GDA
  55. ICCup Results
     Agent       Longinus  Python  Tau Cross  Overall
     Base        942       599     669        737
     Formulator  980       718     1078       925
     Predictor   1111      555     1145       937
     GDA         952       860     1293       1035
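The Overall column on slide 55 appears to be the mean of the three per-map scores, rounded to the nearest integer. That reading is an inference from the numbers, not something the slides state; the short check below tests it against every row.

```python
# Scores copied from slide 55: (Longinus, Python, Tau Cross, Overall).
iccup_scores = {
    "Base": (942, 599, 669, 737),
    "Formulator": (980, 718, 1078, 925),
    "Predictor": (1111, 555, 1145, 937),
    "GDA": (952, 860, 1293, 1035),
}

# For each agent, compare the rounded mean of the map scores with Overall.
mean_matches = {
    agent: round(sum(row[:3]) / 3) == row[3]
    for agent, row in iccup_scores.items()
}
print(mean_matches)
```

All four rows are consistent with that reading.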
  56. EISBot Ranking
     - Rankings achieved by the complete GDA agent
     Trial      Percentile Ranking
     Longinus   32nd
     Python     8th
     Tau Cross  66th
     Average    48th
  57. Evaluation
     - Ablation Studies
       - Optimized particle model
       - Complete GDA model
     - Integrating additional capabilities into EISBot improved performance
     - EISBot performed at the level of a competitive amateur StarCraft player
  58. Conclusion
     - Objective
       - Identify and realize capabilities necessary for expert-level StarCraft gameplay in an agent
     - Approach
       - Decompose gameplay
       - Learn capabilities from demonstrations
       - Integrate learned gameplay models
       - Evaluate versus humans and agents
  59. Contributions
     - Idioms for authoring multi-scale agents
     - Methods for learning from demonstration
     - Integration approaches for ABL agents
  60. Integrating Learning in a Multi-Scale Agent
     - Ben G. Weber
       - Ph.D. Candidate
       - Expressive Intelligence Studio
       - UC Santa Cruz
       - bweber@soe.ucsc.edu
     - Funding
       - NSF Grant IIS-1018954
  61. References
     - Aha, Molineaux, & Ponsen. 2005. “Learning to Win: Case-Based Plan Selection in a Real-Time Strategy Game”, Proceedings of ICCBR.
     - Bererton. 2004. “State Estimation for Game AI using Particle Filters”, Proceedings of AAAI Workshop on Challenges in Game AI.
     - Hsieh & Sun. 2008. “Building a Player Strategy Model by Analyzing Replays of Real-Time Strategy Games”, Proceedings of IJCNN.
     - Langley. 2011. “Artificial Intelligence and Cognitive Systems”, AISB Quarterly.
     - Loyall. 1997. “Believable Agents: Building Interactive Personalities”, Ph.D. thesis, CMU.
     - Mateas. 2002. “Interactive Drama, Art and Artificial Intelligence”, Ph.D. thesis, CMU.
  62. References
     - McCoy & Mateas. 2008. “An Integrated Agent for Playing Real-Time Strategy Games”, Proceedings of AAAI.
     - Molineaux, Klenk, & Aha. 2010. “Goal-Driven Autonomy in a Navy Strategy Simulation”, Proceedings of AAAI.
     - Muñoz-Avila, Aha, Jaidee, Klenk, & Molineaux. 2010. “Applying Goal Driven Autonomy to a Team Shooter Game”, Proceedings of FLAIRS.
     - Ontañón, Mishra, Sugandh, & Ram. 2010. “On-line Case-Based Planning”, Computational Intelligence.
     - Russell & Norvig. 2009. Artificial Intelligence: A Modern Approach.
     - Shannon. 1950. “Programming a Computer for Playing Chess”, Philosophical Magazine.
     - Thrun. 2002. “Particle Filters in Robotics”, Proceedings of UAI.
