Bandit-based Monte-Carlo planning: the game of Go and beyondGames Olivier.Teytaud@inria.fr + too many people for being all...
GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
Introduction to gamesPartially or fully observable    (“phantom” games)Randomized or notIterated or not1,2,3,... playersDe...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
Introduction to gamesPartially or fully observableRandomized or notIterated or not (reputation)1,2,3,... playersDecentrali...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notCon...
GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityCom...
Complexity measures(not always well defined)State-space complexity = number        of possible statesGame-tree sizeDecisio...
Complexity measures(not always well defined)State-space complexityGame-tree size = number of leafsDecision complexityGame-...
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexity = min # ofleafs of tre...
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexity = ...
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityCom...
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityCom...
Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityCom...
State of the art levelVery weak solvingMeans that we know who should winTypically proved by strategy-stealingE.g.: hex (fi...
State of the artlevelVery weak solvingWeak solvingPerfect play reached with reasonnable computationtimeBiggest success: dr...
State of the artlevelVery weak solvingWeak solvingStrong solving       Perfect play from any situation in       reasonable...
State of the art levelVery weak solvingWeak solvingStrong solvingBest results so farShi-Fu-Mi: humans looseEnglish draught...
GamesIntroductionComplexity measuresComputational    complexityPartial observabilityZoology
Computational complexity: Main reasons for this measure ?Good feeling of understanding                             (disagr...
Computational complexity:  Drawbacks Not clearly related to human/computercomparisonsTrivial games can be very complex (th...
Computational complexityKnown:Conjectured: strict inclusions everywhere.Higher classes include undecidable cases.
Computational complexityGiven a class X, a problem q can bein Xor harder than pbs in X (X-hard)or both (X-complete)or neit...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are The existence of exponential problems is kn...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time ...
Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?        YES ! Undecidable problem.Are there pro...
Computational complexityFor evaluating the complexity of your game:1. Generalize your game to any size                    ...
How to show X-completenessThe problem is in X: show that you cansolve it with resources allowed in class X.The problem is ...
Computational complexity==> cast into a decision problem (binary question)==> can be used for choosing optimal move       ...
A PSPACE-complete pb: planar generalized geography- A graph (oriented, planar) is given.- Each player follows an edge (in ...
Another PSPACE-complete pb:quantified boolean formula          True or false ?
A EXPTIME-complete pb: does aTuring machine halts in n steps ?- A program is given.- A number n is given.- Will the progra...
GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
Partial observabilityHere discussed in the compact case        (i.e. representation by formula)Usually, compact ==> bigger...
Partial observability        (structured)(more difficult than an opponent)P(success)>c is undecidable         (proba+oppon...
Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
Phantom-games & POMDPwith infinite horizonMadani et al: infinite time POMDP are   undecidable.Auger, Teytaud: finite time ...
GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
PSPACE vs EXPTIME==> many important games are either PSPACE or EXPTIMETheorem: If playing = filling a locationfor eternity...
PSPACE vs EXPTIME
Appendix 1: the game of Go
Go: from 29 to 6 stones1998: loss against amateur (6d) 19x19 H292008: win against a pro (8p) 19x19, H9        MoGo2008: wi...
Game of Go (9x9 here)
Game of Go
Game of Go
Game of Go
Game of Go - capture
Game of Go
Game of Go
Game of Go: counting territories(white has 7.5 “bonus” as black starts)
Game of Go: the rules        Black plays at the blue circle: the        white group dies (it is removed)Its impossible to ...
NP / PSPACE / EXPTIME in GoTsumegos with no ko, forced moves only for W, 2 moves for B, polynomial length: NP- completeAta...
NP / PSPACE / EXPTIME in GoEncodingthe formulain a ladder:
Appendix 2: what is difficult forcomputers ? Visual things ?                 70
Easy for computers ... becausehuman knowledge easy to encode.               71
Difficult for computersMuy difícil para las ordenadores.                  72
A trivial semeai           Plenty of equivalent                     situations!            They are randomly              ...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
Semeai     Plenty of equivalent               situations!         They are randomly             sampled, with           no...
A trivial semeai           Plenty of equivalent                     situations!            They are randomly              ...
A trivial semeai           Plenty of equivalent                     situations!            They are randomly              ...
A trivial semeai           Plenty of equivalent                     situations!            They are randomly              ...
It does not work. Why ?                                             50% of estimated                                      ...
And the humans ?                                 50% of estimated                                   win probability!In the...
SemeaisShouldwhiteplay inthesemeai(G1)or capture(J15) ?             86
SemeaisShould blackplay thesemeai ?               87
SemeaisShould blackplay thesemeai ?               88
SemeaisShould blackplay thesemeai ?Useless!               89
Difficult games: Havannah                      Very difficult                     for computers.
Conclusions + other               elements Go complexity:superko ?Ishi-no-shita (captures / recaptures) ? (more generally:...
BiblioComplexity: Robson, Tromp, Taylor, Crasmaru, ...Bandits: Lai, Robbins, Auer, Cesa-Bianchi...UCT: Kocsis, Szepesvari,...
Paul Veyssière                                                              Hassen DoghmenAmine BourkiMatthieu Coulm   Con...
Upcoming SlideShare
Loading in...5
×

Theory of games

204

Published on

Theory of games, with a short reminder of computational complexity and an independent appendix on human complexity and the game of Go


@article{david:hal-00710073,
hal_id = {hal-00710073},
url = {http://hal.inria.fr/hal-00710073},
title = {{The Frontier of Decidability in Partially Observable Recursive Games}},
author = {David, Auger and Teytaud, Olivier},
abstract = {{The classical decision problem associated with a game is whether a given player has a winning strategy, i.e. some strategy that leads almost surely to a victory, regardless of the other players' strategies. While this problem is relevant for deterministic fully observable games, for a partially observable game the requirement of winning with probability 1 is too strong. In fact, as shown in this paper, a game might be decidable for the simple criterion of almost sure victory, whereas optimal play (even in an approximate sense) is not computable. We therefore propose another criterion, the decidability of which is equivalent to the computability of approximately optimal play. Then, we show that (i) this criterion is undecidable in the general case, even with deterministic games (no random part in the game), (ii) that it is in the jump 0', and that, even in the stochastic case, (iii) it becomes decidable if we add the requirement that the game halts almost surely whatever maybe the strategies of the players.}},
language = {Anglais},
affiliation = {Laboratoire de Recherche en Informatique - LRI , TAO - INRIA Saclay - Ile de France},
booktitle = {{Special Issue on "Frontier between Decidability and Undecidability"}},
publisher = {World Scinet},
journal = {International Journal on Foundations of Computer Science (IJFCS)},
volume = {Accepted},
note = {revised 2011, accepted 2011, in press },
audience = {internationale },
year = {2012},
}

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
204
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
8
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Transcript of "Theory of games"

  1. 1. Bandit-based Monte-Carlo planning: the game of Go and beyondGames Olivier.Teytaud@inria.fr + too many people for being all cited. Includes Inria, Cnrs, Univ.Paris-Sud, LRI, CMAP, Univ. Amsterdam, Taiwan universities (including NUTN)TAO, Inria-Saclay IDF, Cnrs 8623, Lri, Univ. Paris-Sud,Digiteo Labs, Pascal Network of Excellence.Tao,January 2010+updated 2012.
  2. 2. GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
  3. 3. Introduction to gamesPartially or fully observable (“phantom” games)Randomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  4. 4. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  5. 5. Introduction to gamesPartially or fully observableRandomized or notIterated or not (reputation)1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  6. 6. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  7. 7. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not (rengo)
  8. 8. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  9. 9. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  10. 10. Introduction to gamesPartially or fully observableRandomized or notIterated or not1,2,3,... playersDecentralized or notContinuous or notInfinite time or not
  11. 11. GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
  12. 12. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityComputational complexityPerfect-play complexityState of the art level
  13. 13. Complexity measures(not always well defined)State-space complexity = number of possible statesGame-tree sizeDecision complexityGame-tree complexityComputational complexityPerfect-play complexityState of the art level
  14. 14. Complexity measures(not always well defined)State-space complexityGame-tree size = number of leafsDecision complexityGame-tree complexityComputational complexityPerfect-play complexityState of the art level
  15. 15. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexity = min # ofleafs of tree showing perfect playGame-tree complexityComputational complexityPerfect-play complexityState of the art level
  16. 16. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexity = # of leafsfor perfect play with constant depthComputational complexityPerfect-play complexityState of the art level
  17. 17. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityComputational complexity (=complexity classes, later)Perfect-play complexityState of the art level
  18. 18. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityComputational complexityPerfect-play complexity (complexityof perfect algorithm)State of the art level
  19. 19. Complexity measures(not always well defined)State-space complexityGame-tree sizeDecision complexityGame-tree complexityComputational complexityPerfect-play complexityState of the art level
  20. 20. State of the art levelVery weak solvingMeans that we know who should winTypically proved by strategy-stealingE.g.: hex (first player wins), hex + swap (second player wins)Weak solvingStrong solvingBest results so far
  21. 21. State of the artlevelVery weak solvingWeak solvingPerfect play reached with reasonnable computationtimeBiggest success: draughts (tenths of years ofcomputation on tenths of machines)Strong solvingBest results so far
  22. 22. State of the artlevelVery weak solvingWeak solvingStrong solving Perfect play from any situation in reasonable time (variants of Tic-Tac-Toe)Best results so far
  23. 23. State of the art levelVery weak solvingWeak solvingStrong solvingBest results so farShi-Fu-Mi: humans looseEnglish draughts: humans + machines reach perfectplayChess: nobody can compete with machines9x9 Go: MoGoTW won with the disadvantageous sidewith a top player
  24. 24. GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
  25. 25. Computational complexity: Main reasons for this measure ?Good feeling of understanding (disagree if you want :-) )Explicit families of problems (extracted by reduction)FunConnections with classical complexity measuresMuch better for looking clever (when you speak about NP-complete problems you look clever)
  26. 26. Computational complexity: Drawbacks Not clearly related to human/computercomparisonsTrivial games can be very complex (thismeasure if a worst case on situations that might neveroccur from the start of the game - many solvings arebased on openings restricting the game)Often based on incredibly long games
  27. 27. Computational complexityKnown:Conjectured: strict inclusions everywhere.Higher classes include undecidable cases.
  28. 28. Computational complexityGiven a class X, a problem q can bein Xor harder than pbs in X (X-hard)or both (X-complete)or neither NP NP -difficile NP -complete
  29. 29. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  30. 30. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are The existence of exponential problems is known.No. there problems solvable inexponential time but not in polynomialIt means (roughly) “polynomial with a machine whichtime ?can run several branches simultaneously”.Are there problems solvable inIt means (very roughly) “polynomial in linear time ?quadratic time but not with a machine whichjust has to verify a proof”.Are there problems which cant be solved, even with infinite time andMaybe P = NP is not so interesting as a question :-) space ?
  31. 31. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  32. 32. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomial No.time ? are intermediate problems (if P≠ NP). ThereAre there problems solvable inquadratic time butNP-problems Yet, many important not in linear time ?Are are eitherproblems which cant be there P or NP-complete. solved, even with infinite time and space ?
  33. 33. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  34. 34. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be YES, infinite time and solved, even with YES, and YES. space ?
  35. 35. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  36. 36. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be YES ! solved, even with infinite time and All P problems do not have space ? the same complexity.
  37. 37. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ?Are there problems solvable inexponential time but not in polynomialtime ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  38. 38. Complexity quizzNP means non-polynomial ?Assume P≠ NP. NP=NP-complete U P ? YES ! Undecidable problem.Are there problems solvable in E.g.: time but not in polynomialexponentialtime ? Is there a seg-fault ?Are there problems solvable inquadratic time but not in linear time ?Are there problems which cant be solved, even with infinite time and space ?
  39. 39. Computational complexityFor evaluating the complexity of your game:1. Generalize your game to any size (non trivial for chess)2. Consider the problem: - here is a board - is the situation a win in perfect play ? NP NP NP -complete -difficile
  40. 40. How to show X-completenessThe problem is in X: show that you cansolve it with resources allowed in class X.The problem is complete: show that youcan encode a X-complete problem in yourproblem. NP NP NP -complete -difficile
  41. 41. Computational complexity==> cast into a decision problem (binary question)==> can be used for choosing optimal move (but not necessary)==> trivial games can be EXPTIME-hard==> no clear correlation with the fact that a game is difficult for a computer (when compared to humans) NP NP NP -complete -difficile
  42. 42. A PSPACE-complete pb: planar generalized geography- A graph (oriented, planar) is given.- Each player follows an edge (in turn).- Repetition is not allowed.- The first player who cant play looses.==> A winning strategy for first player ?
  43. 43. Another PSPACE-complete pb:quantified boolean formula True or false ?
  44. 44. A EXPTIME-complete pb: does aTuring machine halts in n steps ?- A program is given.- A number n is given.- Will the program halt in n time steps ?Best solution: simulate.Cost: n (which is exponential in log(n)!)
  45. 45. GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
  46. 46. Partial observabilityHere discussed in the compact case (i.e. representation by formula)Usually, compact ==> bigger cost (e.g. P ==> PSPACE)
  47. 47. Partial observability (structured)(more difficult than an opponent)P(success)>c is undecidable (proba+opponent) (if no time limit.)==> analyzing P(success)=1 (no proba).
  48. 48. Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
  49. 49. Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
  50. 50. Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
  51. 51. Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
  52. 52. Phantom-games >>> POMDPSee Rintanen 03 (case with formulae).
  53. 53. Phantom-games & POMDPwith infinite horizonMadani et al: infinite time POMDP are undecidable.Auger, Teytaud: finite time deterministic games are undecidable.Undecidability of phantom-Go ?
  54. 54. GamesIntroductionComplexity measuresComputational complexityPartial observabilityZoology
  55. 55. PSPACE vs EXPTIME==> many important games are either PSPACE or EXPTIMETheorem: If playing = filling a locationfor eternity, then it is PSPACE. (not necessarily PSPACE-complete!)Proof: Depth-first search.Applis: Hex, Havannah, Tic-Tac-Toe, Atari-Go...
  56. 56. PSPACE vs EXPTIME
  57. 57. Appendix 1: the game of Go
  58. 58. Go: from 29 to 6 stones1998: loss against amateur (6d) 19x19 H292008: win against a pro (8p) 19x19, H9 MoGo2008: win against a pro (4p) 19x19, H8 CrazyStone2008: win against a pro (4p) 19x19, H7 CrazyStone2009: win against a pro (9p) 19x19, H7 MoGo2009: win against a pro (1p) 19x19, H6 MoGo2007: win against a pro (5p) 9x9 (blitz) MoGo2008: win against a pro (5p) 9x9 white MoGo2009: win against a pro (5p) 9x9 black MoGo2009: win against a pro (9p) 9x9 white Fuego2009: win against a pro (9p) 9x9 black MoGoTW==> still 6 stones at least!
  59. 59. Game of Go (9x9 here)
  60. 60. Game of Go
  61. 61. Game of Go
  62. 62. Game of Go
  63. 63. Game of Go - capture
  64. 64. Game of Go
  65. 65. Game of Go
  66. 66. Game of Go: counting territories(white has 7.5 “bonus” as black starts)
  67. 67. Game of Go: the rules Black plays at the blue circle: the white group dies (it is removed)Its impossible to kill white (two “eyes”). “Ko” rules: we dont come back to the same situation. (without ko: “PSPACE hard” with ko: “EXPTIME-complete”) At the end, we count territories ==> black starts, so +7.5 for white.
  68. 68. NP / PSPACE / EXPTIME in GoTsumegos with no ko, forced moves only for W, 2 moves for B, polynomial length: NP- completeAtari Go : PSPACEGo without ko: PSPACE-hardGo with ko + japanese rules: EXPTIME-completeGo with ko + superko: unknown (EXPSPACE?)Some phantom-rengo undecidable ?If Go with ko > Go without ko, then PSPACE EXPTIME
  69. 69. NP / PSPACE / EXPTIME in GoEncodingthe formulain a ladder:
  70. 70. Appendix 2: what is difficult forcomputers ? Visual things ? 70
  71. 71. Easy for computers ... becausehuman knowledge easy to encode. 71
  72. 72. Difficult for computersMuy difícil para las ordenadores. 72
  73. 73. A trivial semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  74. 74. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  75. 75. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  76. 76. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  77. 77. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  78. 78. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  79. 79. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  80. 80. Semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  81. 81. A trivial semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  82. 82. A trivial semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  83. 83. A trivial semeai Plenty of equivalent situations! They are randomly sampled, with  no generalization. 50% of estimated win probability!
  84. 84. It does not work. Why ? 50% of estimated win probability!In the first node: The first simulations give ~ 50% The next simulations go to 100% or 0% (depending on the chosen move) But, then, we switch to another node                                                (~ 8! x 8! such nodes)
  85. 85. And the humans ? 50% of estimated win probability!In the first node: The first simulations give ~ 50% The next simulations go to 100% or 0% (depending on the chosen move) But, then, we DONT switch to another node  
  86. 86. SemeaisShouldwhiteplay inthesemeai(G1)or capture(J15) ? 86
  87. 87. SemeaisShould blackplay thesemeai ? 87
  88. 88. SemeaisShould blackplay thesemeai ? 88
  89. 89. SemeaisShould blackplay thesemeai ?Useless! 89
  90. 90. Difficult games: Havannah Very difficult for computers.
  91. 91. Conclusions + other elements Go complexity:superko ?Ishi-no-shita (captures / recaptures) ? (more generally: characterizing strength /weakness of programs ?) Huge complexity classes forstructured gamespartially observable games (what about phantom-games ?)decentralized gamesGreat results for MCTS in GGP + difficult games. Next MCTS-challenges:Partially observable cases & large horizon : cf Cazenave, RoletSolve main weaknesses of MCTS (learning the MC ? Meta-actions ? Nested MC ? Mixing with value-function as in amazon ?)
  92. 92. BiblioComplexity: Robson, Tromp, Taylor, Crasmaru, ...Bandits: Lai, Robbins, Auer, Cesa-Bianchi...UCT: Kocsis, Szepesvari, Coquelin, Munos...MCTS (Go): Coulom, Chaslot, Fiter, Gelly, Hoock, Silver, Muller, Pérez, Rimmel, Wang...Tree + DP for industrial applicationl: Péret, Garcia...Bandits with infinitely many arms: Audibert, Coulom, Munos, Wang...Applications far from Go: Rolet, Teytaud (F), Rimmel, De Mesmay ...Links with “macro-actions” ?Parallelization, mixing with offline learning, bias...
  93. 93. Paul Veyssière Hassen DoghmenAmine BourkiMatthieu Coulm Contributors Colleagues from NUTN and CJCU Bandits: Lai, Robbins, Auer, Cesa-Bianchi... UCT: Kocsis, Szepesvari, Coquelin, Munos... MCTS (Go): Coulom, Chaslot, Fiter, Gelly, Hoock, Silver, Muller, Pérez, Rimmel, Wang... Tree + DP for industrial applicationl: Péret, Garcia... Bandits with infinitely many arms: Audibert, Coulom, Munos, Wang... Applications far from Go: Rolet, Teytaud (F), Rimmel, De Mesmay ... Links with “macro-actions” ? Parallelization, mixing with offline learning, bias...
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×