Simple Lemmas on Partially Observable Games, and Applications to Phantom tic-tac-toe, Kriegspiel and Phantom-Go

Phantom Games

1. Phantom-games & phantom-go
2. Maths
3. Experiments

F. Teytaud, O. Teytaud
TAO, Inria-Saclay IDF, CNRS 8623, LRI, Univ. Paris-Sud
OASE Lab, Korea, Summer 2011

What are phantom games?
phantom-X = partial-information counterpart of the
(full-information) game X:
you are not informed of your opponent's moves,
so you might play illegal moves;
then you are informed that they are illegal,
and you simply play again.

Extreme case: no other information (just illegal moves).
More convenient: a bit more information:
you are informed of ataris (in Go);
you see all the locations you can reach (Dark Chess).

Example: phantom Tic-Tac-Toe


My opponent plays (I don't know where)


I try this...
==> illegal move!


I know the state...
==> good :-)

Example: Dark Chess

- Different from Chinese Dark Chess
- Also known as “Fog of War”

Example: phantom-Go


A little bit of maths (sorry)

2. Maths
3. Experiments


Simple things

Consider a 2-player game with:
- finite state space;
- one of the two players wins.
Then:
- Full information: one of the players has a
winning strategy. We can find out which by
minimax. Possibly EXPTIME-complete (Go with
Japanese rules, Robson's paper).

- Partial information, finite horizon: there exists p
such that player 1 wins with probability p under
perfect play. p is computable.

- Partial information, infinite horizon: p is not
computable! (Auger et al., 2010, submitted)

Other simple things

The previous results were known, and mathematically hard.
Now, simple results, with concrete applications.

Goal: make approximate solving of partially observable games
more tractable, with precise bounds.


Another simple thing ==> practice

Differences with full-information games + applications:

- good strategies are randomized
(when playing games with hidden information)
(illustration: play rock-paper-scissors;
if you play a fixed strategy,
at least one opponent is much stronger than you)

- remark: there is an optimal strategy which is invariant w.r.t.
rotations/symmetries

==> so we can work with only one version, and then symmetrize
(uniformly)
==> no loss of optimality (in the Nash sense)
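
A minimal sketch of the rock-paper-scissors illustration (Python; the payoff matrix encoding and the name worst_case_value are assumptions made for this example, not taken from the slides). It compares what a fixed strategy and the uniform mix guarantee against a best-responding opponent:

```python
from fractions import Fraction

# Payoff to the row player: +1 win, 0 draw, -1 loss.
# Action order (an arbitrary choice): rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def worst_case_value(mixed):
    """Expected payoff of the mixed strategy `mixed` against the
    opponent action that is worst for us."""
    return min(
        sum(p * PAYOFF[i][j] for i, p in enumerate(mixed))
        for j in range(3)
    )

always_rock = [Fraction(1), Fraction(0), Fraction(0)]
uniform = [Fraction(1, 3)] * 3

print(worst_case_value(always_rock))  # -1
print(worst_case_value(uniform))      # 0
```

The fixed strategy loses every time against the right opponent, while the uniform (symmetrized) strategy guarantees the value 0.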

Yet another simple thing ==> practice

- Change the game as follows: player 2 chooses
the hidden state when in state S.

- Then the game is harder for player 1 (in terms of game-theoretic
value).

==> So we can lower-bound the value by considering
- the worst case over the opponent's strategies and
- assuming the opponent is allowed to rebuild the hidden state (consistently
with your observations, however).
==> you get a matrix game (see example later)

==> if you have both lower and upper bounds, you can estimate
the value of a history of observations.

==> looks stupid, but simplifies
analysis (examples on the next slides)
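
A minimal sketch of how a matrix game gives two-sided bounds (Python; the 2x2 matrix M and both mixed strategies are hypothetical and are not one of the games analysed in these slides). Any mixed strategy of player 1 yields a lower bound on the value, and any mixed strategy of player 2 yields an upper bound:

```python
from fractions import Fraction

def lower_bound(matrix, row_mix):
    """Payoff guaranteed to the row player (maximizer) by `row_mix`,
    whatever column the opponent picks."""
    n_cols = len(matrix[0])
    return min(
        sum(p * row[j] for p, row in zip(row_mix, matrix))
        for j in range(n_cols)
    )

def upper_bound(matrix, col_mix):
    """Payoff the column player (minimizer) can hold the row player to
    by playing `col_mix`."""
    return max(
        sum(q * a for q, a in zip(col_mix, row))
        for row in matrix
    )

# Hypothetical game: entry = probability that player 1 wins.
M = [
    [Fraction(1), Fraction(0)],
    [Fraction(0), Fraction(1, 2)],
]
row_mix = [Fraction(1, 3), Fraction(2, 3)]
col_mix = [Fraction(1, 3), Fraction(2, 3)]

lo = lower_bound(M, row_mix)
hi = upper_bound(M, col_mix)
print(lo, hi)  # 1/3 1/3
```

When the two bounds coincide, as in this toy matrix, the value of the matrix game is known exactly; otherwise the value lies between them.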

Examples: 4x4 Ponnuki (phantom version)

Simple case (you do it naturally)

<=== Sure win in 4x4 ponnuki
(phantom or not)

==> at least 1/3 for black

Examples: 4x4 Ponnuki

Better case (you would not do it without thinking
about the method)

<=== Sure win in 4x4 ponnuki
(phantom or not)

==> at least 1/3 for black

One more simple thing ==> practice

- Specifically for phantom games: if a move is either a win or an
“illegal” move, then play it.

- Trivially OK (no loss of optimality), and it
greatly reduces the set of strategies

==> it can't hurt

==> very compact representation
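
A minimal sketch of this rule for phantom tic-tac-toe (Python; the board encoding as a 9-character string, the belief set, and all function names are assumptions made for illustration). A move is kept if, in every hidden board consistent with the observations, it is either illegal or an immediate win:

```python
from itertools import combinations

LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def wins(board, player):
    """True if `player` owns a full line on `board`."""
    return any(all(board[i] == player for i in line) for line in LINES)

def is_dominating(cell, belief, player):
    """True if, in every board of `belief`, playing `cell` is either
    illegal (cell occupied) or an immediate win."""
    for board in belief:
        if board[cell] != ".":
            continue  # illegal here: we are told so and simply play again
        after = board[:cell] + player + board[cell + 1:]
        if not wins(after, player):
            return False
    return True

def dominating_moves(belief, player):
    """Cells not already ours that are dominating over the whole belief."""
    own = belief[0]  # our own stones are identical in every consistent board
    return [c for c in range(9)
            if own[c] != player and is_dominating(c, belief, player)]

def make_board(xs, os_):
    return "".join("X" if i in xs else "O" if i in os_ else "."
                   for i in range(9))

# Hypothetical situation: X holds cells 0 and 1; O has two stones
# somewhere among cells 2..8, and X does not know where.
belief = [make_board({0, 1}, set(pair))
          for pair in combinations(range(2, 9), 2)]
print(dominating_moves(belief, "X"))  # [2]
```

Cell 2 either completes the top row or is revealed as occupied, so trying it first costs nothing.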


One last simple thing ==> practice

Specifically for phantom games:

- If, in the fully observable game X, there are N possible sequences of
actions and player 1 surely wins,

- then player 1 wins with probability at least 1/N in phantom-X.

Proof: player 1 can reach probability 1/N of winning by playing
a random sequence of actions, which equals the optimal sequence with
probability 1/N (at least).
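
A minimal sketch of the 1/N bound (Python; the function names are hypothetical). As an assumption, N is counted as the product of the number of choices at each of the player's own turns; for the second player in tic-tac-toe (8, 6, 4 and 2 free squares) this product is 384, the figure quoted for tic-tac-toe elsewhere in these slides:

```python
from fractions import Fraction
from math import prod

def count_own_sequences(choices_per_turn):
    """Number of sequences of own actions, assuming the number of
    choices at each turn is fixed in advance."""
    return prod(choices_per_turn)

def lemma_lower_bound(n_sequences):
    """Probability that a uniformly random sequence of actions is the
    optimal one: the lower bound 1/N of the lemma."""
    return Fraction(1, n_sequences)

# Assumption: second player in tic-tac-toe, moving with 8, 6, 4, 2 free squares.
n = count_own_sequences([8, 6, 4, 2])
print(n, lemma_lower_bound(n))  # 384 1/384
```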


Good in Go, bad in phantom-Go: tightness

Black to play.
Go: black has lost.
Phantom-Go: black wins with probability 1-1/8!

(==> the bound from the previous slide is nearly tight)


Some results on real games.

2. Maths
3. Experiments (manually performed :-) )


Phantom-tic-tac-toe

Strategy for 1st player / phantom-tic-tac-toe

==> dominating moves = moves which
are either illegal or wins


Phantom-tic-tac-toe: bounds

Then, define 6 families of strategies for
white, covering all possible cases;
using the “simple facts”, we show that in
all cases the 1st player wins with probability at
least 3/4.
But the 2nd player can ensure a draw in TTT
(384 = number of legal sequences as 2nd player).
So by the lemma: value of Phantom-TTT in [interval shown as a formula on the original slide]


Phantom-ponnuki

3x3 is a win for black.
4x4 is a win for black with proba:

(by conversion/inequalities with
matrix games)


Conclusions

Here: some simple tools, with rigorous bounds on
Phantom-tic-tac-toe
Phantom-Ponnuki in 3x3 and 4x4

The main tool is generic (opponent chooses hidden state
==> matrix game)

Main further work:
Implementation inside a search algorithm (e.g. for ranking
moves or evaluating leaves)
Other simplification ideas?
e.g. more on worst-case analysis

Conclusions
PO board games = great challenge
Phantom-Go (humans still stronger than computers?)
Fog of War (don't know)
MineSweeper: usual solvers are not optimal (they optimize
the short term only: minimum probability of a mine).
We got optimal play in 6x6, 4 mines.

==> better models than board games for real AI?
==> involves taste of danger; beyond IQ?
==> my feeling: many CI improvements possible here,
maths can help.
(human-level performance at Urban Rivals, a PO card game)

Finished!

...thanks for your attention!
