Artificial intelligence for minesweeper


The best results so far for MineSweeper on small boards can be found at http://hal.inria.fr/hal-00712417.

Two BibTeX references follow:

@inproceedings{10.1109/TAAI.2011.55,
author = {Adrien Cou{\"e}toux and Mario Milone and Olivier Teytaud},
title = {Consistent Belief State Estimation, with Application to Mines},
booktitle = {International Conference on Technologies and Applications of Artificial Intelligence (TAAI)},
isbn = {978-0-7695-4601-8},
year = {2011},
pages = {280--285},
doi = {10.1109/TAAI.2011.55},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
}


And the UCT performance on MineSweeper on small boards:
@inproceedings{sebag:hal-00712417,
hal_id = {hal-00712417},
url = {http://hal.inria.fr/hal-00712417},
title = {{Combining Myopic Optimization and Tree Search: Application to MineSweeper}},
author = {Sebag, Mich{\`e}le and Teytaud, Olivier},
abstract = {{Many reactive planning tasks are tackled by optimization combined with shrinking horizon at each time step: the problem is simplified to a non-reactive (myopic) optimization problem, based on the available information at the current time step and an estimate of future behavior, then it is solved; and the simplified problem is updated at each time step thanks to new information. This is in particular suitable when fast off-the-shelf components are available for the simplified problem - optimality stricto sensu is not possible, but good results are obtained at a reasonable computational cost for highly intractable problems. As machines get more powerful, it makes sense however to go beyond the inherent limitations of this approach. Yet, a brute-force solving of the complete problem is often impossible; we here propose a methodology for embedding a solver inside a consistent reactive planning solver. Our methodology consists in embedding the solver in an Upper-Confidence-Tree algorithm, both in the nodes and as a Monte-Carlo simulator. We show the mathematical consistency of the approach, and then we apply it to a classical success of the myopic approach: the MineSweeper game.}},
language = {English},
affiliation = {Laboratoire de Recherche en Informatique - LRI , TAO - INRIA Saclay - Ile de France},
booktitle = {{LION6, Learning and Intelligent Optimization}},
pages = {in press (14 pages, long paper)},
address = {Paris, France},
audience = {international},
year = {2012},
pdf = {http://hal.inria.fr/hal-00712417/PDF/mines2.pdf},
}


Artificial intelligence for minesweeper

  1. 1. MINESWEEPER. A. Couëtoux, O. Teytaud. TAO, Inria, LRI, U-Psud, UMR CNRS 8623 + OASE, NUTN
  2. 2. Sometimes we work on (visibly) serious stuff.
  3. 3. But I think the best challenge for proving that we have good algorithms is games.
  4. 4. And a great challenge is MineSweeper! Yes, I'm serious!
  5. 5. RULES
  6. 6. I play here!
  7. 7. Good news! No mine in the neighborhood! I can “click” all the neighbours.
  8. 8. I have 3 uncovered neighbors, and I have 3 mines in the neighborhood ==> 3 flags!
  9. 9. I know it's a mine, so I put a flag!
  10. 10. No info!
  11. 11. I play here and I lose...
  12. 12. The most successful game ever! Who has never played MineSweeper?
  13. 13. Do you think it's easy? (10 mines)
  14. 14. What is the optimal move?
  15. 15. What is the optimal move? Remark: the question makes sense. You don't need the history for playing optimally. ==> (non-trivial proof!)
  16. 16. What is the optimal move? This one is easy. Both remaining locations win with probability 50%.
  17. 17. More difficult! Which move is optimal?
  18. 18. Probability of a mine? - Top: - Middle: - Bottom:
  19. 19. Probability of a mine? - Top: 33% - Middle: - Bottom:
  20. 20. Probability of a mine? - Top: 33% - Middle: 33% - Bottom:
  21. 21. Probability of a mine? - Top: 33% - Middle: 33% - Bottom: 33%
  22. 22. Probability of a mine? - Top: 33% - Middle: 33% - Bottom: 33% ==> so all moves equivalent?
  23. 23. Probability of a mine? - Top: 33% - Middle: 33% - Bottom: 33% ==> so all moves equivalent? ==> NOOOOO!!!
  24. 24. MineSweeper approaches
      - exact MDP (Markov Decision Process): very expensive; 4x4 solved.
      - CSP (Constraint Satisfaction Problem): the main approach.
          - (unknown) state: x(i) = 1 if there is a mine at location i, 0 otherwise
          - each visible location is a constraint: if location 15 shows 4, then x(04) + x(05) + x(06) + x(14) + x(16) + x(24) + x(25) + x(26) = 4
          - find all solutions X1, X2, X3, ..., XN
          - P(mine in j) = sum_i Xij / N
          - play j such that P(mine in j) is minimal
          - break ties randomly
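The CSP recipe on this slide can be sketched in a few lines of Python. This is a brute-force illustration, not the solver used in the papers: the function names (`neighbors`, `mine_probabilities`, `safest_moves`), the `(row, column)` cell encoding, and the use of a known total mine count as an extra constraint are assumptions made for the example. Enumerating every placement is only feasible on very small boards.

```python
from itertools import combinations

def neighbors(cell, rows, cols):
    """Return the up-to-8 cells adjacent to `cell` on a rows x cols board."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols]

def mine_probabilities(rows, cols, n_mines, revealed):
    """Estimate P(mine in j) = sum_i Xij / N by enumerating all solutions.

    `revealed` maps each visible cell to the number it shows.
    Assumption: the total number of mines `n_mines` is known.
    """
    unknown = [(r, c) for r in range(rows) for c in range(cols)
               if (r, c) not in revealed]
    counts = {cell: 0 for cell in unknown}
    n_solutions = 0
    for mines in combinations(unknown, n_mines):
        mine_set = set(mines)
        # Each visible location is one constraint on its neighborhood.
        consistent = all(
            sum(n in mine_set for n in neighbors(cell, rows, cols)) == shown
            for cell, shown in revealed.items())
        if consistent:
            n_solutions += 1
            for m in mines:
                counts[m] += 1
    if n_solutions == 0:
        raise ValueError("no mine placement is consistent with the board")
    return {cell: counts[cell] / n_solutions for cell in unknown}

def safest_moves(probabilities):
    """Return every cell whose mine probability is minimal (ties kept)."""
    best = min(probabilities.values())
    return [cell for cell, p in probabilities.items() if p == best]
```

On a 1x3 board with one mine where the left cell shows 1, `mine_probabilities(1, 3, 1, {(0, 0): 1})` puts all the probability on the middle cell, so `safest_moves` returns only the right cell. When several cells tie, the slide's rule is to pick one of them at random.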
  25. 25. CSP - is very fast - but it's not optimal - because of positions like this one: here CSP plays randomly! Also for the initial move: don't play the first move randomly! (sometimes an opening book is used)
  26. 26. Why not UCT? - looks like a stupid idea at first view - cannot compete with CSP in terms of speed - but at least UCT is consistent: given sufficient time, it will play optimally.
  27. 27. Should I present UCT ?
  28. 28. UCT (Upper Confidence Trees): Coulom (06); Chaslot, Saito & Bouzy (06); Kocsis & Szepesvari (06)
  29. 29. UCT
  30. 30. UCT
  31. 31. UCT
  32. 32. UCT
  33. 33. UCT Kocsis & Szepesvari (06)
  34. 34. Exploitation ...
  35. 35. Exploitation ... SCORE = 5/7 + k.sqrt( log(10)/7 )
  38. 38. ... or exploration ? SCORE = 0/2 + k.sqrt( log(10)/2 )
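The two SCORE formulas above are instances of the same upper-confidence rule: mean reward plus an exploration bonus that shrinks as a child is visited more often. A minimal sketch, assuming that `log` is the natural logarithm, that 10 is the visit count of the parent node, and that `k` is a free tuning constant (the slides do not give its value); the names `ucb_score` and `select_child` are made up for this example.

```python
import math

def ucb_score(wins, visits, parent_visits, k):
    """Mean reward plus exploration bonus, as in SCORE = w/n + k.sqrt(log(N)/n)."""
    return wins / visits + k * math.sqrt(math.log(parent_visits) / visits)

def select_child(stats, parent_visits, k):
    """Pick the child with the highest score; `stats` maps name -> (wins, visits)."""
    return max(stats, key=lambda name: ucb_score(*stats[name], parent_visits, k))

# The two children from the slides: 5 wins in 7 visits vs 0 wins in 2 visits.
children = {"exploitation": (5, 7), "exploration": (0, 2)}
```

With these numbers, `select_child(children, 10, k=1)` prefers the well-performing child, while `k=2` is already enough for the rarely visited child to be chosen, which shows how `k` trades exploitation against exploration.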
  39. 39. What do I need for implementing UCT? A complete generative model. Given a state and an action, I must be able to simulate possible transitions. State S, Action a: (S,a) ==> S'. Example: given the state below, and the action “top left”, what are the possible next states?
  44. 44. Can you please forgive me for that? I've been lazy, I have just implemented the rejection algorithm.
  45. 45. Rejection algorithm: 1- randomly draw the mines; 2- if it's ok, return the new observation; 3- otherwise, go back to 1.
  46. 46. (being lazy is good: I could write a second paper with a better algorithm :-) ) (using CSP for this!)
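The three-step rejection algorithm can be sketched as a generative model for UCT. This is an illustration under assumptions, not the authors' code: "ok" is read as "the drawn mines agree with every number currently visible", the new observation is the number shown at the clicked cell (or `None` if the click hits a mine), and the names `sample_observation` and `max_tries` are hypothetical.

```python
import random

def neighbors(cell, rows, cols):
    """Return the up-to-8 cells adjacent to `cell` on a rows x cols board."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols]

def sample_observation(rows, cols, n_mines, revealed, action,
                       rng=random, max_tries=100_000):
    """Sample the outcome of clicking `action` given the visible numbers.

    Returns the number revealed at `action`, or None if it is a mine.
    """
    unknown = [(r, c) for r in range(rows) for c in range(cols)
               if (r, c) not in revealed]
    for _ in range(max_tries):
        # 1- randomly draw the mines among the unknown cells
        mines = set(rng.sample(unknown, n_mines))
        # 2- if the draw agrees with every visible number, it is "ok"
        ok = all(
            sum(n in mines for n in neighbors(cell, rows, cols)) == shown
            for cell, shown in revealed.items())
        if ok:
            if action in mines:
                return None
            return sum(n in mines for n in neighbors(action, rows, cols))
        # 3- otherwise, go back to 1
    raise RuntimeError("no consistent mine placement found")
```

Rejection becomes slow when few placements are consistent with the board, which is presumably why the slide suggests drawing from the CSP solutions instead.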
  47. 47. An example showing that the initial move matters (and our algorithm finds it!): 3x3, 7 mines: the optimal move is anything but the center. Optimal winning rate: 25%. Winning rate with a uniformly random initial move: 17/72. (yes, we get a 1/72 improvement!)
  48. 48. 15 mines on a 5x5 board with the GnoMine rule (i.e. the initial move is a 0). Optimal success rate = 100%!!!!! Play the center, and you win (well, you have to work...)
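The 25% and 17/72 figures for the 3x3 board with 7 mines can be checked by hand. The sketch below assumes the first click is always safe (an assumption not stated on the slide, but the only reading found here that reproduces its numbers) and that the player then guesses uniformly among the cells the revealed number leaves possible. Only one other safe cell exists, so a single correct guess wins. The function name is made up for the example.

```python
from fractions import Fraction

def win_rate_after_first_click(n_neighbors, n_unknown=8):
    """Win rate when the first (safe) click has `n_neighbors` adjacent cells.

    The revealed number tells whether the one remaining safe cell is a
    neighbor of the click or not; the player then guesses uniformly
    inside that group.
    """
    n_far = n_unknown - n_neighbors
    rate = Fraction(0)
    if n_neighbors:
        rate += Fraction(n_neighbors, n_unknown) * Fraction(1, n_neighbors)
    if n_far:
        rate += Fraction(n_far, n_unknown) * Fraction(1, n_far)
    return rate

corner = win_rate_after_first_click(3)   # 3 neighbors, 5 far cells
edge = win_rate_after_first_click(5)     # 5 neighbors, 3 far cells
center = win_rate_after_first_click(8)   # every other cell is a neighbor

# Uniformly random first move: 4 corners, 4 edges, 1 center.
uniform = Fraction(4, 9) * corner + Fraction(4, 9) * edge + Fraction(1, 9) * center
```

Under these assumptions `corner` and `edge` both come out at 1/4, `center` at 1/8 (the number shown there carries no information), and `uniform` at 17/72, matching the slide.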
  49. 49. UCT vs CSP + opening book (play corners) in the Windows mode
  50. 50. Probability of a mine? - Top: 33% - Middle: 33% - Bottom: 33%. Top or bottom: 66% of win! Middle: 33%!
  51. 51. CONCLUSIONS - MineSweeper is not dead! ==> still a challenge - When you have a myopic solver (i.e. one which neglects long-term effects, as often in industry!) ==> combine it with UCT - More to come: big boards are far from optimal
  52. 52. Thanks for your attention! 9 mines. What is the optimal move?