Meta Monte-Carlo Tree Search


  1. Go is not solved beyond 6x6. We build an opening book for 7x7, for approximately solving 7x7 Go. Widely believed: perfect play is a draw with komi 9.
  2. Our tools: 1) Monte-Carlo Tree Search; 2) Meta-Monte-Carlo Tree Search; 3) Sensei's partial solution (using in particular Davies' work).
  3. Monte-Carlo Tree Search (Coulom 2006; Kocsis & Szepesvari 2006) = combining tree search and Monte-Carlo evaluation.
  4. UCT (Upper Confidence Trees): Coulom (06); Chaslot, Saito & Bouzy (06); Kocsis & Szepesvari (06).
  5-9. UCT, Kocsis & Szepesvari (06) [figure-only slides]
  10. Exploitation ...
  11-13. Exploitation ... SCORE = 5/7 + k * sqrt( log(10) / 7 )
  14. ... or exploration? SCORE = 0/2 + k * sqrt( log(10) / 2 )
  15. Our tools: 1) Monte-Carlo Tree Search; 2) Meta-Monte-Carlo Tree Search; 3) Sensei's partial solution (using in particular Davies' work).
  16. Meta-Monte-Carlo Tree Search = Monte-Carlo Tree Search with Monte-Carlo replaced by MCTS.
  17-18. Meta-Monte-Carlo Tree Search, i.e.: MCTS = MC play-outs + tree search; Meta-MCTS = MCTS play-outs + tree search; Meta-Meta-MCTS = Meta-MCTS play-outs + tree search; ...
  19. Our tools: 1) Monte-Carlo Tree Search; 2) Meta-Monte-Carlo Tree Search; 3) Sensei's partial solution (using in particular Davies' work).
  20. A variation which is not in Sensei's file. Left: black E3 should be black E5. Right: corrected version.
  21. EXPERIMENTAL RESULTS
  22. Meta-MCTS learns against an MCTS sparring partner. We introduce Sensei's variations into this sparring partner during the Meta-MCTS run.
  23. Learning curve of Black by Meta-MCTS. X-axis = log2(number of playouts). Y-axis = moving average (window size 55) of the winning rate in playouts. Playouts = MCTS games (this is Meta-MCTS). Komi = 8.5 (winning with komi 8.5 ensures at least a draw with komi 9).
  24. Learning curve of White by Meta-MCTS. X-axis = log2(number of playouts). Y-axis = moving average (window size 55) of the winning rate in playouts. Playouts = MCTS games (this is Meta-MCTS). Komi = 9.5 (winning with komi 9.5 ensures at least a draw with komi 9).
  25. Decreasing points in the curve = introduction of Sensei's variations in the opponent. Conclusion: the algorithm did not find all these variations on its own ==> human needed.
  26. Games against pros.
  27. MoGoTW is Black.
  28. MoGoTW is White.
  29-30. With komi 9.5, MoGoTW won every game as White. With komi 8.5, MoGoTW won every game as Black. Exciting! Were all of MoGoTW's moves perfect? No :-( In (at least) one game the human might have won.
  31. Left: this game was won by MoGoTW as Black; Chun-Yen Lin (2P) made a mistake. Right: how Chun-Yen Lin (2P) might have won the game.
  32. So there is still at least one variation on which the bot does not play correctly. We did not manually introduce a correction, but we introduced the variation played by the pro into the sparring partner, to see whether the bot can find a solution by itself.
  33. Learning curve as Black, after introducing the dangerous variation in the sparring partner. Time axis still logarithmic.
  34. CONCLUSIONS: We used Meta-MCTS to build an opening book for 7x7 Go. We are not aware of any remaining bad moves, which does not mean there are none.
  35. Meta-MCTS did a good job by itself, but human inputs (= Sensei's + games against pros) have been helpful. Towards exact solving? = collecting all leaves of the opening book + solving all of them ... = huge work.
  36. Other conclusions: 7x7 can be very hard, even for pros (pros made mistakes). MCTS alone is not enough for very strong play in 7x7.
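
The SCORE formulas on the exploitation/exploration slides can be checked numerically. The sketch below is only an illustration: the function name `uct_score` is hypothetical, the natural logarithm is an assumption (the slides write just "log"), and the slides do not give a value for k, so the values tried here are arbitrary.

```python
import math

def uct_score(wins, visits, parent_visits, k=1.0):
    """Win rate (exploitation) plus k * sqrt(log(parent_visits) / visits) (exploration)."""
    return wins / visits + k * math.sqrt(math.log(parent_visits) / visits)

# Slide values: a move with 5 wins in 7 simulations versus a move with
# 0 wins in 2 simulations, under a parent simulated 10 times.
# k is not specified on the slides; 1.0 and 2.0 are assumptions.
for k in (1.0, 2.0):
    strong = uct_score(5, 7, 10, k)
    rare = uct_score(0, 2, 10, k)
    choice = "exploitation (5/7 move)" if strong > rare else "exploration (0/2 move)"
    print(f"k={k}: {strong:.3f} vs {rare:.3f} -> {choice}")
```

Under these assumptions the well-tested move is preferred for k = 1, while the rarely tried move is preferred for k = 2: a larger k shifts the balance from exploitation to exploration.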
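
The recursive definition "Meta-MCTS = MCTS play-outs + tree search" can be sketched on a toy game. This is not the MoGoTW implementation: the subtraction game (take 1 to 3 stones, whoever takes the last stone wins), all function names, the budgets and k = 1 are assumptions chosen to keep the example small, and the sparring partner with Sensei's variations is not modelled.

```python
import math
import random

def legal_moves(stones):
    # Toy game (assumption): remove 1, 2 or 3 stones; taking the last stone wins.
    return [m for m in (1, 2, 3) if m <= stones]

def random_playout(stones, player, rng):
    """Level 0 (plain Monte-Carlo): random moves to the end; returns the winner."""
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        player = 1 - player
    return 1 - player  # the player who took the last stone

def uct_search(stones, player, playout, budget, rng, k=1.0):
    """UCT from (stones, player), using `playout` to evaluate new leaves.
    Nodes are keyed by the move sequence from the root.
    Returns the most-visited root move."""
    visits, wins = {(): 0}, {(): 0.0}
    for _ in range(budget):
        node, s, p = (), stones, player
        path = [node]
        while s > 0:
            children = [node + (m,) for m in legal_moves(s)]
            fresh = [c for c in children if c not in visits]
            if fresh:  # expansion: add one new child
                child = rng.choice(fresh)
                visits[child], wins[child] = 0, 0.0
            else:      # selection: child with the best UCT score
                log_n = math.log(visits[node])
                child = max(children, key=lambda c: wins[c] / visits[c]
                            + k * math.sqrt(log_n / visits[c]))
            s, p, node = s - child[-1], 1 - p, child
            path.append(node)
            if fresh:
                break
        winner = playout(s, p, rng)
        # Backpropagation: wins[n] counts wins for the player who moved into n.
        for depth, n in enumerate(path):
            visits[n] += 1
            mover = player if depth % 2 == 1 else 1 - player
            if winner == mover:
                wins[n] += 1
    return max(legal_moves(stones), key=lambda m: visits.get((m,), 0))

def make_playout(level, budget):
    """Level 0: random play-out. Level n: a whole game in which both sides
    pick each move by UCT whose own play-outs are level n - 1."""
    if level == 0:
        return random_playout
    inner = make_playout(level - 1, budget)
    def playout(stones, player, rng):
        while stones > 0:
            stones -= uct_search(stones, player, inner, budget, rng)
            player = 1 - player
        return 1 - player
    return playout

rng = random.Random(0)
mcts_move = uct_search(6, 0, make_playout(0, 20), 200, rng)  # MCTS
meta_move = uct_search(6, 0, make_playout(1, 20), 60, rng)   # Meta-MCTS
```

Calling `make_playout(2, ...)` would give the Meta-Meta-MCTS level of the slides; the cost grows multiplicatively with each level, which is why the budgets here are kept tiny.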
