@inproceedings{teytaud:inria-00625794,
hal_id = {inria-00625794},
url = {http://hal.inria.fr/inria-00625794},
title = {{Lemmas on Partial Observation, with Application to Phantom Games}},
author = {Teytaud, Fabien and Teytaud, Olivier},
abstract = {{Solving games is usual in the fully observable case. The partially observable case is much more difficult; whenever the number of strategies is finite (which is not necessarily the case, even when the state space is finite), the main tool for the exact solving is the construction of the full matrix game and its solving by linear programming. We here propose tools for approximating the value of partially observable games. The lemmas are relatively general, and we apply them for deriving rigorous bounds on the Nash equilibrium of phantom-tic-tac-toe and phantom-Go.}},
language = {English},
affiliation = {Laboratoire de Recherche en Informatique - LRI , TAO - INRIA Saclay - Ile de France},
booktitle = {{Computational Intelligence and Games}},
address = {Seoul, South Korea},
audience = {international},
year = {2011},
month = Sep,
pdf = {http://hal.inria.fr/inria-00625794/PDF/phantomatari.pdf},
}
Simple Lemmas on Partially Observable Games, and Applications to Phantom tic-tac-toe, Kriegspiel and Phantom-Go
1. Phantom Games
Phantom Games
1. Phantom-games & phantom-go
2. Maths
3. Experiments
F. Teytaud, O. Teytaud
TAO, Inria-Saclay IDF, Cnrs 8623, Lri, Univ. Paris-Sud,
OASE Lab,
Korea,
Summer 2011
2. Phantom Games
What are phantom games?
phantom-X = partial-information counterpart of
(full-information) game X:
you're not informed of your opponent's moves,
so you might play illegal moves;
then you're informed they're illegal,
and you simply play again.
Extremal case: no other information (just illegal moves)
More convenient: a bit more information:
informed of ataris (in Go)
see all the locations you can reach (Dark Chess).
3. Phantom Games
Example: phantom Tic-Tac-Toe
4. Phantom Games
My opponent plays (I don't know where)
5. Phantom Games
I try this...
==> illegal move!
6. Phantom Games
I know the state...
==> good :-)
7. Example: Dark Chess
- Different from Chinese Dark Chess
- Also known as “Fog of War”
9. Phantom Games
A little bit of maths (sorry)
1. Phantom-games & phantom-go
2. Maths
3. Experiments
10. Simple things
Consider a 2-player game with:
- finite state space;
- one of the two players wins.
Then:
- Full information: one of the players has a
winning strategy. We can find out which one by
Minimax. Possibly 2EXP-complete (Go with
Japanese rules, Robson's paper).
- Partial information, finite horizon: there exists p
such that player 1 wins with probability p under
perfect play. p is computable.
- Partial information, infinite horizon: p is not
computable! (Auger et al., 2010, submitted)
11. Other simple things
Previous stuff was known, and mathematically hard.
Now, simple stuff, with concrete applications.
Goal: make approximate solving of partially observable games
more tractable,
with precise bounds.
12. Other simple thing ==> practice
Difference with full information games + applications:
- good strategies are randomized
(when playing games with hidden information)
(illustration: play rock-paper-scissors;
if you play a fixed strategy,
at least one opponent is much stronger than you)
- remark: there is an optimal strategy which is invariant w.r.t.
rotations/symmetries
==> so we can work with only one version, and then symmetrize
(uniformly)
==> no loss of optimality (Nash sense)
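Both remarks can be checked on rock-paper-scissors with a few lines of Python. This is only a minimal sketch: the helper names `payoff`, `worst_case` and `symmetrize` are made up for this illustration, and the symmetry used is the cyclic relabeling rock -> paper -> scissors -> rock.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    """+1 if move a beats move b, -1 if it loses, 0 on a tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

def worst_case(strategy):
    """Expected payoff of a mixed strategy against a best-responding opponent."""
    return min(
        sum(p * payoff(a, b) for a, p in strategy.items())
        for b in MOVES
    )

def symmetrize(strategy):
    """Average a strategy uniformly over the three cyclic relabelings."""
    out = {m: Fraction(0) for m in MOVES}
    for shift in range(3):
        for i, m in enumerate(MOVES):
            out[MOVES[(i + shift) % 3]] += strategy.get(m, Fraction(0)) / 3
    return out

fixed = {"rock": Fraction(1)}                    # a fixed (pure) strategy
uniform = {m: Fraction(1, 3) for m in MOVES}     # the symmetric mixed strategy

print(worst_case(fixed))              # -1
print(worst_case(uniform))            # 0
print(worst_case(symmetrize(fixed)))  # 0
```

The fixed strategy loses every round against the right opponent, while the uniform mix cannot be exploited. Symmetrizing never lowers the worst-case payoff here, because each relabeled copy has the same worst case and the worst case of an average is at least the average of the worst cases.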
13. Yet another simple thing ==> practice
- Change the game as follows: player 2 chooses
the hidden state when in state S.
- Then, the game is harder for player 1 (in terms of game-theoretical
value).
==> So we can lower bound the value by considering
- the worst case on opponent's strategies and
- assuming he is allowed to rebuild the hidden state (consistently
with your observations, however).
==> you get a matrix game (see example later)
==> if you have both lower and upper bounds, you can estimate
the value of a history of observations.
==> looks stupid, but simplifies
analysis (examples next slide)
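Once the opponent is allowed to choose the hidden state, what remains is a matrix game, and any pair of mixed strategies brackets its value. A minimal Python sketch of that bracketing follows; the 3x3 matrix is a hypothetical example (not taken from the slides), and `lower_bound` / `upper_bound` are names invented for this illustration.

```python
from fractions import Fraction

def lower_bound(A, x):
    """Payoff that the row mix x guarantees against every column of A."""
    return min(
        sum(x[i] * A[i][j] for i in range(len(A)))
        for j in range(len(A[0]))
    )

def upper_bound(A, y):
    """Payoff that the column mix y concedes at most against every row of A."""
    return max(
        sum(A[i][j] * y[j] for j in range(len(A[0])))
        for i in range(len(A))
    )

# Hypothetical game: the opponent picks one of three hidden states consistent
# with our observations, we pick one of three moves, and we win (payoff 1)
# only if our move matches the hidden state.
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
third = [Fraction(1, 3)] * 3

print(lower_bound(A, third), upper_bound(A, third))  # 1/3 1/3
```

When the two bounds coincide, as in this toy case, the value of the matrix game is known exactly; otherwise the pair gives an interval containing it.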
14. Examples: 4x4 Ponnuki (phantom version)
Simple case (you do it naturally)
<=== Sure win in 4x4 ponnuki
(phantom or not)
==>
==> at least 1/3 for black
15. Examples: 4x4 Ponnuki
Better case (you don't
find it without thinking
about the method)
<=== Sure win in 4x4 ponnuki
(phantom or not)
==>
==> at least 1/3 for black
16. One more simple thing ==> practice
- Specifically for phantom-games: if a move is either a win, or an
“illegal” move, then play it.
- Trivially ok (no optimality loss),
reduces (very much) the set of strategies
==> it can't hurt
==> very compact representation
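For phantom tic-tac-toe, the rule above can be sketched as a small filter over candidate squares. This is an illustration under assumptions: squares are indexed 0..8 row by row, and `free_probes` is a hypothetical helper, not something defined in the slides.

```python
# The eight winning lines of tic-tac-toe, squares indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def free_probes(mine, known_opponent):
    """Squares that would complete one of my lines and are not known to be taken.

    Playing such a square is either a win (the square is empty) or an
    illegal move (the opponent is there), in which case I learn a hidden
    stone and play again.
    """
    probes = set()
    for line in LINES:
        missing = [sq for sq in line if sq not in mine]
        if len(missing) == 1 and missing[0] not in known_opponent:
            probes.add(missing[0])
    return sorted(probes)

print(free_probes({0, 1}, set()))    # [2]
print(free_probes({0, 1, 4}, {2}))   # [7, 8]
```

A strategy only needs to be specified for positions where this list is empty, which is one way to see why the set of strategies shrinks so much.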
17. One last simple thing ==> practice
Specifically for phantom-games:
- If, in the fully observable game X, there are N possible sequences of
actions and player 1 wins surely,
- then player 1 wins with probability at least 1/N in phantom-X.
Proof: Player 1 can reach probability 1/N of winning by playing
a uniformly random sequence of actions, which equals the optimal
sequence with probability at least 1/N.
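The 1/N bound can be checked exactly on a tiny example. The game below is a hypothetical toy chosen for this sketch (it is not from the slides): player 1 plays a1, the opponent replies b, player 1 plays a2, and player 1 wins iff a2 equals a1 XOR b. With full information player 1 sees b and always wins; in the phantom version b is hidden.

```python
from fractions import Fraction
from itertools import product

# All blind action sequences (a1, a2) of player 1: N = 4.
SEQUENCES = list(product((0, 1), repeat=2))
N = len(SEQUENCES)

def wins(a1, a2, b):
    """Winning condition of the toy game."""
    return a2 == (a1 ^ b)

def blind_uniform_win_probability(reply):
    """Exact win probability when (a1, a2) is drawn uniformly, ignoring b.

    `reply` maps a1 to the opponent's hidden answer b (worst case: the
    opponent is even allowed to see a1).
    """
    hits = sum(wins(a1, a2, reply[a1]) for a1, a2 in SEQUENCES)
    return Fraction(hits, N)

# Enumerate every deterministic opponent reply function a1 -> b.
worst = min(
    blind_uniform_win_probability({0: b0, 1: b1})
    for b0, b1 in product((0, 1), repeat=2)
)

print(N, worst, worst >= Fraction(1, N))  # 4 1/2 True
```

In this toy game the blind uniform strategy actually wins with probability 1/2, so the 1/N guarantee holds with room to spare; the lemma only claims a lower bound.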
18. Good in Go, bad in phantom-Go:
tightness
Black to play.
Go: black has lost
Phantom-Go:
Black wins with proba 1-1/8!
(==> bound from previous slide is nearly tight)
19. Phantom Games
Some results on real games.
1. Phantom-games & phantom-go
2. Maths
3. Experiments (manually performed :-) )
21. Phantom-tic-tac-toe: bounds
Then, define 6 families of strategies for
white, covering all possible cases;
using the “simple facts”, we show that in
all cases 1st player wins with proba at
least 3/4
But, 2nd player can ensure a draw in TTT.
So by Lemma: value of Phantom-TTT in
22. Phantom-tic-tac-toe: bounds
384 = number of legal sequences as 2nd player
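The figure 384 can be reproduced with a short count. The counting rule used here is an assumption (the slides do not spell it out): the second player makes at most 4 moves and chooses among the 8, 6, 4 and 2 squares not yet played, ignoring early wins and illegal tries.

```python
from fractions import Fraction
from math import prod

# Assumption: squares available to the 2nd player at each of its 4 moves
# (9 squares minus the moves already played by both sides).
choices_per_move = [8, 6, 4, 2]
N = prod(choices_per_move)

print(N)               # 384
print(Fraction(1, N))  # 1/384
```

Under that reading, the 1/N lemma would give the second player a draw with probability at least 1/384, since the second player can ensure a draw in ordinary tic-tac-toe.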
23. Phantom-ponnuki
3x3 is a win for black.
4x4 is a win for black with proba:
(by conversion/inequalities with
matrix games)
24. Conclusions
Here: some simple tools, with rigorous bounds on
Phantom-tic-tac-toe
Phantom-Ponnuki in 3x3 and 4x4
The main tool is generic (opponent chooses hidden state
==> matrix game)
Main further work:
Implementation inside a search algorithm (e.g. for ranking
moves or evaluating leaves)
Other simplification ideas?
e.g. more on worst-case analysis
25. Conclusions
PO board games = great challenge
Phantom-Go (humans still stronger than computers?)
Fog of War (don't know)
MineSweeper: usual solvers are not optimal (they optimize
the short-term only: minimum proba of mine).
We got optimal play in 6x6, 4 mines.
==> better models than board games for real AI?
==> involves taste of danger; beyond IQ?
==> my feeling: many CI improvements possible here,
maths can help.
(human-level performance at Urban Rivals, a PO card game)