Simon Ellis from RPI presented “Aleph, A Cognitive Game-playing System for Tabletop Games” at the Cognitive Systems Institute Group Speaker Series call on November 12, 2015.
Aleph
1. ALEPH,
A COGNITIVE GAME-PLAYING SYSTEM
FOR TABLETOP GAMES
Cognitive Systems Institute Group
Speaker Series
Simon Ellis
Department of Computer Science ◇ Tetherless World Constellation
Rensselaer Polytechnic Institute, Troy, NY 12180
ELLISS5@RPI.EDU
Thursday, 12th November, 2015
2. Background
v 5th-year PhD student in Computer Science
v Supervised by Professor Jim Hendler since 2013
v Led RPI MiniDeepQA R&D team, Summer, 2013
v CS background is practical, not theoretical
v Programmer for 30+ years
v Developed games for 20+ years (including industry)
v Current research: using ‘cognitive computing’ for game AI
v “Cognitive game-playing” system, Aleph
v Future plan: build a Dungeons & Dragons-playing A.I. agent
4. Games
v Lots of games!…
v Computers play some games well and some badly, but why?
5. Game complexity
v Arises from multiple aspects of the game design
v Data structure, rule complexity, bluffing, openness…
v Game theory defines other parameters for games
v Zero-/Non zero-sum, Deterministic/Stochastic, Impartial/Partisan…
v But IBM Watson won at Jeopardy!, a “humans-only” game
v Jeopardy! has a massive search space
v How did Watson manage it – with ≤3 seconds per question?
6. IBM Watson
v Serious hardware (~2,800 IBM Power7 cores)
v More importantly…
Epstein et al., “Making Watson Fast”. IBM J Res Dev 56 (3/4), May/July 2012, p. 15:2
7. Introducing Aleph
v The first ‘cognitive computing’-based tabletop game AI system
v Sadly not the first game AI: that title has been claimed
u e.g. Dannenhauer & Muñoz-Avila (2013): “Case-based Goal Selection
Inspired by IBM’s Watson” (ICCBR 2013)
v Uses a pipeline-style approach to play games
v Iterative search is used only where necessary
v Different evaluation techniques are applied
u Absolute scoring (i.e. ‘what is my score for this move?’)
u Influence maps & stacks (similar to multi-layer analysis, MLA)
u Simple strategic analysis & reflection
8. Conceptual architecture
v Architecture…
v … was inspired by the design of the DeepQA pipeline
v … is informed by consideration of how people play games
v … uses numerous tools (“evaluators”) to judge game state
u Evaluators correspond to the sections and subsections of the pipeline
PRIMARY GENERAL ANALYSIS
Where can I play?
Where can I not play?
What can I play?
SECONDARY GENERAL ANALYSIS
What is my score?
Can I win this turn?
Do I have any valuable tiles?
What is my position like?
MOVE GENERATION
What moves exist?
Do chains of moves exist?
PRIMARY MOVE SCORING
Will this advance my position?
What would my new score(s) be?
GENERAL META-ANALYSIS
Who is winning?
What tile might come up next?
Can I disrupt a player’s game?
What happens if I play tile M?
INPUT STATE
OUTPUT STATE
TACTICS
Can I control more of the board?
How many tiles can I play now?
Can I swap hands? Should I do so?
Should I retain tile Q for later?
TILE-SPECIFIC META-ANALYSIS
How can I use tile X best?
Does tile Y give me any benefit?
Can I perform combo move Z?
FINAL SCORING AND RANKING
Which move has the highest score?
What other moves score highly?
Which move gives me the highest score?
“DEEP THOUGHT”
How well does this move fit my tactics?
Should I change my gameplay?
Is it worth playing a lesser move now?
9. System overview
v Developed using C-IMA, a partial implementation of UIMA
v Written in C++ using boost libraries
u Speed
u Programmer familiarity
u Curiosity: can it be done?
v System constructed from pipelines and evaluators
v Pipelines and evaluators are equivalent to UIMA flow models and
annotators
v Game test platform is kept largely separate
v Game logic is designed as a set of reentrant modules
v Game state data are stored in a specific container class
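The pipeline-and-evaluator arrangement described above can be sketched as follows. This is a minimal illustration only, not the C-IMA or UIMA API: the names GameState, Evaluator, Pipeline, annotateScore and annotateBestTile are all hypothetical, and the two evaluators are stand-ins for questions such as "What is my score?" and "Do I have any valuable tiles?".

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for game state plus the notes evaluators attach to it.
struct GameState {
    int myScore = 0;
    std::vector<int> hand;  // tile values held by the agent (assumed representation)
    std::vector<std::pair<std::string, double>> annotations;
};

// An evaluator inspects the state and appends its findings,
// much as an annotator would in a UIMA-style flow.
using Evaluator = std::function<void(GameState&)>;

// A pipeline runs its evaluators in the order they were added.
class Pipeline {
public:
    Pipeline& add(Evaluator e) {
        stages_.push_back(std::move(e));
        return *this;
    }
    void run(GameState& state) const {
        for (const auto& stage : stages_) stage(state);
    }

private:
    std::vector<Evaluator> stages_;
};

// "What is my score?"
void annotateScore(GameState& s) {
    s.annotations.push_back({"score", static_cast<double>(s.myScore)});
}

// "Do I have any valuable tiles?" (here: simply the highest tile in hand)
void annotateBestTile(GameState& s) {
    if (s.hand.empty()) return;
    const int best = *std::max_element(s.hand.begin(), s.hand.end());
    s.annotations.push_back({"best_tile", static_cast<double>(best)});
}
```

A caller would build a Pipeline with add(annotateScore).add(annotateBestTile), call run on a GameState, and then read the accumulated annotations. Keeping evaluators as independent callables is one way to keep game logic separate from the test platform, as the slide suggests.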
11. Move evaluation
v Two primary techniques used currently
v Absolute score
v Influence map
v Absolute score
v “What is my score if I make this move?”
v Value-based score
v Influence map
v Modified form of multi-layer analysis
v Board representations which can be stacked for analysis
v Positional evaluation
12. Influence map
v Frequently used in RTS games for
strategic reasoning
v Also used in go A.I.
v Each map represents a set of information; e.g.…
v What is my score if I play at location (x, y)?
v Number of players controlling location (p, q)
v Data are useful individually
v Data are more useful in combination
v Individual layers can be formed into a stack for analysis
13. Influence map
v Individual stack frame consists of:
v X, Y grid of cells
v Weight (importance of the set relative to the other data sets)
v Active flag (determines if this layer is used in computation)
v Each cell consists of:
v Value
v Weight (importance of an individual cell within its own frame)
v Computation over stack columns generates result frame
v Result frame is an influence map which provides combined
information about the state of the game
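The stack structure above might look like the following in C++. This is a sketch under stated assumptions: the slides do not say how stack columns are combined, so a weighted sum (frame weight × cell weight × cell value) is assumed here, and the names Cell, Frame and combine are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// One cell of a layer: a value plus its weight within its own frame.
struct Cell {
    double value = 0.0;
    double weight = 1.0;
};

// One stack frame: an X-by-Y grid, a frame-level weight and an active flag.
struct Frame {
    std::size_t width;
    std::size_t height;
    double weight;
    bool active = true;
    std::vector<Cell> cells;  // row-major storage, width * height entries

    Frame(std::size_t w, std::size_t h, double frameWeight = 1.0)
        : width(w), height(h), weight(frameWeight), cells(w * h) {}

    Cell& at(std::size_t x, std::size_t y) { return cells[y * width + x]; }
    const Cell& at(std::size_t x, std::size_t y) const { return cells[y * width + x]; }
};

// Combine each column of the stack into one result frame.
// Assumption: the combination is a weighted sum, and inactive frames are skipped.
Frame combine(const std::vector<Frame>& stack) {
    if (stack.empty()) throw std::invalid_argument("empty stack");
    const std::size_t w = stack.front().width;
    const std::size_t h = stack.front().height;
    Frame result(w, h);
    for (const Frame& f : stack) {
        if (f.width != w || f.height != h) {
            throw std::invalid_argument("frame size mismatch");
        }
        if (!f.active) continue;
        for (std::size_t i = 0; i < w * h; ++i) {
            result.cells[i].value += f.weight * f.cells[i].weight * f.cells[i].value;
        }
    }
    return result;
}
```

For example, a score layer with weight 1.0 and a control layer with weight 0.5 would yield a result frame in which each cell is the score value plus half of the (cell-weighted) control value. Clearing a frame's active flag removes that layer from the result without changing the stack.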
14. Strategy
v Needed to guide the agents’ play
v Otherwise the agents will play no better than randomly
v Traditionally provided by heuristic in search algorithm
v Minimax (or variant) generates scores for multiple plies
v Scores are propagated back up the search tree
v Branch leading to best/optimum remaining future move is chosen
u Greedy algorithm
v For games with a complex search space, this is impractical
v Cannot search deeply enough forward for meaningful analysis
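For reference, the traditional approach described above can be sketched as a depth-limited minimax over an explicit game tree. The tree layout (a vector of nodes with child indices) and the names Node, minimax and bestMove are assumptions made for this illustration; a real game would generate children from the rules rather than store them.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical game-tree node, stored in a flat vector.
// 'score' is the heuristic value, used at leaves and at the depth cutoff.
struct Node {
    int score;
    std::vector<std::size_t> children;  // indices into the tree vector
};

// Plain minimax: scores are propagated back up the tree,
// alternating between the maximizing and minimizing player.
int minimax(const std::vector<Node>& tree, std::size_t index, int depth, bool maximizing) {
    const Node& node = tree[index];
    if (depth == 0 || node.children.empty()) return node.score;
    int best = maximizing ? std::numeric_limits<int>::min()
                          : std::numeric_limits<int>::max();
    for (std::size_t child : node.children) {
        const int value = minimax(tree, child, depth - 1, !maximizing);
        best = maximizing ? std::max(best, value) : std::min(best, value);
    }
    return best;
}

// Pick the child of 'root' with the best backed-up score for the maximizer.
// Precondition: root has at least one child and depth >= 1.
std::size_t bestMove(const std::vector<Node>& tree, std::size_t root, int depth) {
    std::size_t bestChild = tree[root].children.front();
    int bestValue = std::numeric_limits<int>::min();
    for (std::size_t child : tree[root].children) {
        const int value = minimax(tree, child, depth - 1, false);
        if (value > bestValue) {
            bestValue = value;
            bestChild = child;
        }
    }
    return bestChild;
}
```

The depth parameter is where the practical limit bites: with a shallow cutoff the search relies on the heuristic score of interior nodes, which may point to a different branch than a deeper search would.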
15. Strategy
v “Deep thought” module performs analysis over evaluators
v Which evaluators work well or badly is still being determined
v Different strategies have some different inputs
v Heuristics are necessarily simple
v Can be as simple as a set of if (...) statements
v Again, a matter of research to see what works well
v Aim is to provide the agent with a degree of self-reflection
v Ability to judge its own performance using provided criteria
v Based on results, the agent may elect to change its strategy
u Strategies are pre-programmed, not deduced by the agents during play
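A heuristic of the "set of if (...) statements" kind could be as small as the sketch below. Everything here is hypothetical: the Strategy values, the Performance fields and the thresholds are invented for illustration and are not taken from Aleph. The point is only that the agent reviews its own performance against supplied criteria and picks among pre-programmed strategies.

```cpp
// Pre-programmed strategies the agent may choose between (hypothetical set).
enum class Strategy { Aggressive, Defensive, Balanced };

// Hypothetical self-assessment figures gathered by the evaluators.
struct Performance {
    int myScore;
    int bestOpponentScore;
    int turnsSinceScoreImproved;
};

// Review the current strategy against simple, fixed criteria.
// The thresholds (10 points, 3 turns) are arbitrary illustrative values.
Strategy reviewStrategy(Strategy current, const Performance& p) {
    const int lead = p.myScore - p.bestOpponentScore;
    if (lead < -10) return Strategy::Aggressive;  // far behind: take more risks
    if (lead > 10) return Strategy::Defensive;    // far ahead: protect the lead
    if (p.turnsSinceScoreImproved >= 3) return Strategy::Balanced;  // stalled: reset
    return current;  // no criterion triggered: keep playing the same way
}
```

Calling reviewStrategy once per turn with updated figures gives the turn-by-turn reflection described above, while the strategies themselves stay fixed rather than being deduced during play.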
17. Conclusion
v Watson demonstrated the efficacy of ‘cognitive computing’
v Aleph is the first in a new kind of AI for tabletop games
v “Cognitive game-playing”
v Design is capable of playing extremely complex games
v Turn-by-turn strategic analysis guides the agents’ play
v Considerable scope for future development
v Improving Aleph to make it more challenging for human players
v Other games, e.g. go, Civilization, Magic: the Gathering, chess
v IBM Watson ‘cogs’ + “cognitive game-playing” = …?
u Dungeons & Dragons, maybe?...
18. Q10
Acknowledgements
I would like to thank my supervisor, Professor Jim Hendler, for his continued support and advice, and for taking a chance on a stranger with some crazy ideas and offering me the initial
opportunity to work with Watson. I would also like to thank Dr Chris Welty and Dr Siddharth Patwardhan for their assistance and insights which led semi-directly to this work, Dr Bijan Parsia
(University of Manchester, UK) for his timely intervention in asking difficult questions which I had been avoiding, and Professor Selmer Bringsjord (RPI) for his consistently insightful comments
and observations. Additionally, sincere thanks are due to Dr Jonathan Dordick and Mr John Kolb (RPI) for their support, and to my other friends and colleagues at RPI likewise for theirs.