Disappointing results & open problems in Monte-Carlo Tree Search

Talk given at ECML / PKDD 2012, Silver Workshop


  1. 1. Disappointing and Unexpected Results in Monte-Carlo Tree Search. O. Teytaud & colleagues. Silver Workshop, ECML 2012. In a nutshell:
     - the game of Go, a great AI-complete challenge
     - MCTS, a great recent tool for MDP-solving
     - negative results on MCTS are the most important stuff
     - considerations on academic publications (pros and cons)
  2. 2. Disappointing and Unexpected Results in Monte-Carlo Tree Search. O. Teytaud & colleagues. Silver Workshop, ECML 2012. Solving these weaknesses is worth doing, even if it takes all your research time for 30 years. In a nutshell:
     - the game of Go, a great AI-complete challenge
     - MCTS, a great recent tool for MDP-solving
     - negative results on MCTS are the most important stuff
     - considerations on academic publications (pros and cons)
  3. 3. Part I. A success story on Computer Games. Part II. Two unsolved problems in Computer Games. Part III. Some algorithms which do not solve them. Part IV. Conclusion (technical). Part V. Meta-conclusion (non-technical).
  4. 4. Part I: The Success Story (less showing off in Part II :-) ). The game of Go is a beautiful challenge.
  5. 5. Part I: The Success Story (less showing off in Part II :-) ). The game of Go is a beautiful challenge. We achieved the first wins against professional players in the game of Go.
  6. 6. Game of Go (9x9 here)
  7. 7. Game of Go
  8. 8. Game of Go
  9. 9. Game of Go
  10. 10. Game of Go
  11. 11. Game of Go
  12. 12. Game of Go
  13. 13. Game of Go: counting territories (white has a 7.5-point “bonus” as black starts)
  14. 14. Game of Go: the rules. Black plays at the blue circle: the white group dies (it is removed). It's impossible to kill white (two “eyes”). “Superko” rule: we don't come back to the same situation (without superko: “PSPACE-hard”; with superko: “EXPTIME-hard”). At the end, we count territories ==> black starts, so +7.5 for white.
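The capture rule on this slide (a group with no empty neighbouring point is removed) can be sketched with a flood fill. This is only an illustration under assumptions: the board encoding (`"."` empty, `"X"` black, `"O"` white) and the function name `group_and_liberties` are hypothetical, not taken from the talk.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]

def group_and_liberties(board: List[str], row: int, col: int) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing (row, col); return (stones, liberties)."""
    color = board[row][col]
    if color == ".":
        raise ValueError("no stone at this point")
    n_rows, n_cols = len(board), len(board[0])
    stones: Set[Point] = set()
    liberties: Set[Point] = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))   # empty neighbour = liberty
                elif board[nr][nc] == color:
                    stack.append((nr, nc))    # same colour = same group

    return stones, liberties

# Hypothetical 5x5 position: the white group in the corner has one liberty left.
board = [
    "OX...",
    "O.X..",
    "X....",
    ".....",
    ".....",
]
stones, liberties = group_and_liberties(board, 0, 0)
print(sorted(stones), sorted(liberties))
```

In this toy position a black stone on the last liberty would leave the white group with none, so it would be removed.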
  15. 15. UCT (Upper Confidence Trees) (a variant of MCTS). Coulom (06); Chaslot, Saito & Bouzy (06); Kocsis & Szepesvari (06).
  16. 16. UCT
  17. 17. UCT
  18. 18. UCT
  19. 19. UCT
  20. 20. UCT Kocsis & Szepesvari (06)
  21. 21. Exploitation ...
  22. 22. Exploitation ... SCORE = 5/7 + k.sqrt( log(10)/7 )
  23. 23. Exploitation ... SCORE = 5/7 + k.sqrt( log(10)/7 )
  24. 24. Exploitation ... SCORE = 5/7 + k.sqrt( log(10)/7 )
  25. 25. ... or exploration ? SCORE = 0/2 + k.sqrt( log(10)/2 )
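The two scores on the slides above can be evaluated directly. A minimal sketch, assuming the natural logarithm and treating `k` as a free exploration constant (the slides do not give its value); the function name `ucb_score` is illustrative.

```python
import math

def ucb_score(wins: float, visits: int, parent_visits: int, k: float = 1.0) -> float:
    """Empirical win rate plus an exploration bonus that shrinks with visits."""
    return wins / visits + k * math.sqrt(math.log(parent_visits) / visits)

for k in (1.0, 2.0):  # hypothetical values of the exploration constant
    exploit = ucb_score(5, 7, 10, k)   # move with 5 wins out of 7 simulations
    explore = ucb_score(0, 2, 10, k)   # move with 0 wins out of 2 simulations
    choice = "exploitation" if exploit > explore else "exploration"
    print(f"k={k}: {exploit:.3f} vs {explore:.3f} -> {choice}")
```

With a small `k` the well-performing move keeps being simulated; with a larger `k` the rarely tried move gets its turn, which is the trade-off the two slides illustrate.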
  26. 26. “UCB”?
      - I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT.
  27. 27. “UCB”?
      - I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT.
      - The UCB formula has deep mathematical principles.
  28. 28. “UCB”?
      - I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT.
      - The UCB formula has deep mathematical principles.
      - But very far from the MCTS context.
  29. 29. “UCB”?
      - I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT.
      - The UCB formula has deep mathematical principles.
      - But very far from the MCTS context.
      - Contrary to what has often been claimed, UCB is not central to MCTS.
  30. 30. “UCB”?
      - I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT.
      - The UCB formula has deep mathematical principles.
      - But very far from the MCTS context.
      - Contrary to what has often been claimed, UCB is not central to MCTS.
      - But for publishing papers, relating MCTS to UCB is so beautiful, with plenty of maths papers in the bibliography :-)
  31. 31. The great news:
      - Not related to classical algorithms (no alpha-beta)
      - Recent tools (Rémi Coulom's paper in 2006)
      - Not at all specific to Go (now widely used in games, and beyond)
  32. 32. The great news:
      - Not related to classical algorithms (no alpha-beta)
      - Recent tools (Rémi Coulom's paper in 2006)
      - Not at all specific to Go (now widely used in games, and beyond)
      But great performance in Go needs adaptations (of the MC part)...
  33. 33. We all have to write reports:
      - showing that we are very strong
      - showing that our research has “breakthroughs”, which destroy “bottlenecks”
      So OK, the previous slide is perfect for that.
  34. 34. Part II: challenges. Two main challenges:
      - Situations which require abstract thinking (cf. Cazenave)
      - Situations which involve divide & conquer (cf. Müller)
  35. 35. Part I. A success story on Computer Games. Part II. Two unsolved problems in Computer Games. Part III. Some algorithms which do not solve them. Part IV. Conclusion (technical). Part V. Meta-conclusion (non-technical).
  36. 36. A trivial semeai (= “liberty” race). Plenty of equivalent situations! They are randomly sampled, with no generalization. Estimated win probability: 50%!
  47. 47. This is very easy. Children can solve that. But it is too abstract for computers. Computers play “semeais” very badly.
  48. 48. It does not work. Why? Estimated win probability: 50%! In the first node:
      - The first simulations give ~50%.
      - The next simulations go to 100% or 0% (depending on the chosen move).
      - But then we switch to another node (~ 8! x 8! such nodes).
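To give a sense of scale for the “8! x 8!” figure, the count can be computed directly. The simulation budget used here is a hypothetical number chosen for illustration, not a figure from the talk.

```python
import math

# Number of equivalent nodes quoted on the slide: 8! x 8!.
equivalent_nodes = math.factorial(8) ** 2

def max_fraction_visited(simulation_budget: int) -> float:
    """Upper bound on the share of those nodes that receive even one simulation."""
    return min(1.0, simulation_budget / equivalent_nodes)

print(equivalent_nodes)              # 1625702400
print(max_fraction_visited(10**6))   # hypothetical budget of one million simulations
```

Without generalization across equivalent situations, each of these nodes has to be learned separately, and even a large budget touches only a tiny share of them.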
  49. 49. And the humans? Estimated win probability: 50%! In the first node:
      - The first simulations give ~50%.
      - The next simulations go to 100% or 0% (depending on the chosen move).
      - But then we DON'T switch to another node.
  50. 50. Requires more than local fighting. Requires combining several local fights. Children are usually not so good at this, but strong adults are really good, and computers are very childish. Looks like a bad move, “locally”. Lee Sedol (black) vs Hang Jansik (white).
  51. 51. Requires more than local fighting. Requires combining several local fights. Children are usually not so good at this, but strong adults are really good, and computers are very childish. Looks like a bad move, “locally”.
  52. 52. Part I. A success story on Computer Games. Part II. Two unsolved problems in Computer Games. Part III. Some algorithms which do not solve them (negative results show that the important stuff is really in Part II...). Part IV. Conclusion (technical). Part V. Meta-conclusion (non-technical).
  53. 53. Part III: techniques for addressing these challenges 1. Parallelization 2. Machine Learning 3. Genetic Programming 4. Nested MCTS
  54. 54. Parallelizing MCTS
      - On a parallel machine with shared memory: just run many simulations in parallel, with the same memory for all.
      - On a parallel machine with no shared memory: one MCTS per compute node, and 3 times per second:
        - select nodes with at least 5% of total simulations (depth at most 3);
        - average all statistics on these nodes
        ==> comp. cost = log(number of compute nodes)
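The message-passing scheme described above can be sketched as follows. Everything about the data layout is an assumption made for illustration: each compute node's tree is a dictionary from a move path to `[wins, visits]`, and the function name `share_statistics` is hypothetical. The sketch only shows the selection and averaging steps; it does not model the communication itself.

```python
from typing import Dict, List, Set, Tuple

Path = Tuple[str, ...]            # sequence of moves from the root
Stats = Dict[Path, List[float]]   # path -> [wins, visits]

def share_statistics(trees: List[Stats], min_share: float = 0.05, max_depth: int = 3) -> None:
    """Average the statistics of heavily visited shallow nodes across all trees."""
    selected: Set[Path] = set()
    for tree in trees:
        total = tree[()][1]  # visits at the root = simulations run by this compute node
        for path, (wins, visits) in tree.items():
            if len(path) <= max_depth and visits >= min_share * total:
                selected.add(path)
    for path in selected:
        wins = sum(tree.get(path, [0.0, 0.0])[0] for tree in trees) / len(trees)
        visits = sum(tree.get(path, [0.0, 0.0])[1] for tree in trees) / len(trees)
        for tree in trees:
            tree[path] = [wins, visits]  # every compute node gets the averaged values

# Two hypothetical compute nodes after 100 simulations each.
node_a: Stats = {(): [50, 100], ("A",): [40, 80], ("B",): [1, 2]}
node_b: Stats = {(): [60, 100], ("A",): [20, 40], ("B",): [3, 4]}
share_statistics([node_a, node_b])
print(node_a)
print(node_b)
```

Move `B` stays below the 5% threshold on both nodes, so its statistics are left local; only the root and move `A` are shared.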
  59. 59. Good news: it works. Such misleading numbers...
  60. 60. Much better than voting schemes, but little difference from T. Cazenave's approach (depth 0).
  61. 61. Every month, someone tells us: “Try with a bigger machine! And win against top pros!” (I believed that, at some point...)
  62. 62. In fact, “32” and “1” have almost the same level... (against humans...)
  63. 63. Being faster is not the solution
  64. 64. The same in Havannah (F. Teytaud)
  65. 65. More deeply, 1 (R. Coulom): improvement in terms of performance against humans << improvement in terms of performance against computers << improvement in terms of self-play.
  66. 66. More deeply, 2: no improvement in divide and conquer; no improvement on situations which require abstraction.
  67. 67. Part III: techniques for addressing these challenges 1. Parallelization 2. Machine Learning 3. Genetic Programming 4. Nested MCTS
  68. 68. Machine learning. A lot of tuning of the MC part is central. This is a bit disappointing for the genericity of the method. Can we make this tuning automatic?
  69. 69. A classical machine learning trick in MCTS: RAVE (= rapid action value estimates).
      score(move) = alpha UCB(move) + (1-alpha) StatisticsInSubtree(move)
      Alpha2 = nbSimulations / (K + nbSimulations)
      Usually works well, but performs weakly on some situations. Weaknesses:
      - brings information only from bottom to top of the tree
      - does not solve main problems
      - sometimes very harmful
      ==> extensions?
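The RAVE blend on this slide can be sketched numerically. Two assumptions are made here and are not confirmed by the source: “Alpha2” is read as alpha squared (a superscript lost in extraction), and `K = 1000` is an arbitrary illustrative constant. The function names are hypothetical.

```python
import math

def rave_weight(n_simulations: int, k: float) -> float:
    """alpha, assuming 'Alpha2' on the slide means alpha**2 = n / (K + n)."""
    return math.sqrt(n_simulations / (k + n_simulations))

def blended_score(ucb_value: float, subtree_value: float,
                  n_simulations: int, k: float = 1000.0) -> float:
    """score(move) = alpha * UCB(move) + (1 - alpha) * StatisticsInSubtree(move)."""
    alpha = rave_weight(n_simulations, k)
    return alpha * ucb_value + (1.0 - alpha) * subtree_value

# Hypothetical move: direct statistics say 0.2, subtree (RAVE) statistics say 0.8.
for n in (0, 100, 1000, 100000):
    print(n, round(blended_score(0.2, 0.8, n), 3))
```

With few simulations the subtree statistics dominate; as simulations accumulate, the score drifts toward the move's own UCB value.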
  70. 70. Here B2 is the only good move for white. But B2 makes sense only as a first move, and nowhere else in subtrees ==> RAVE rejects B2. ==> extensions?
  71. 71. A classical machine learning trick in MCTS: RAVE (= rapid action value estimates).
      score(move) = alpha UCB(move) + (1-alpha) StatisticsInSubtree(move)
      Alpha2 = nbSimulations / (K + nbSimulations)
      Usually works well, but performs weakly on some situations [Müller]. Generic rules proposed recently:
      - Drake [ICGA 2009]: Last Good Reply
      - Silver and others: simulation balancing
      - poolRave [Rimmel et al, ACG 2011]
      - Contextual Monte-Carlo [Rimmel et al, EvoGames 2010]
      - Decisive moves and anti-decisive moves [Teytaud et al, CIG 2010]
      ==> statistically significant improvements, but far less efficient than human expertise
  72. 72. Part III: techniques for addressing these challenges 1. Parallelization 2. Machine Learning 3. Genetic Programming 4. Nested MCTS
  73. 73. We don't want to use expert knowledge. We want automated solutions. Developing biases by Genetic Programming?
  74. 74. We don't want to use expert knowledge. We want automated solutions. Developing a MC by Genetic Programming? Looks like a good idea. But importantly: a strong MC part (in terms of playing strength of the MC part) does not imply (by far!) a stronger MCTS (except in 1P cases...).
  75. 75. We don't want to use expert knowledge. We want automated solutions. Developing a MC by Genetic Programming? Hoock et al.; Cazenave et al.
  76. 76. Part III: techniques for addressing these challenges 1. Parallelization 2. Machine Learning 3. Genetic Programming 4. Nested MCTS
  77. 77. Nested MCTS in one slide (Cazenave, F. Teytaud, etc.). 1) To a strategy π, you can associate a value function: π-Value(s) = expected reward when simulating with strategy π from state s.
  78. 78. Nested MCTS in one slide (Cazenave, F. Teytaud, etc.). 1) To a strategy π, you can associate a value function: π-Value(s) = expected reward when simulating with strategy π from state s. 2) Then define:
      NestedMC0(state) = MC(state)
      NestedMC1(state) = decision maximizing NestedMC0-value(state+decision)
      ...
      NestedMC42(state) = decision maximizing NestedMC41-value(state+decision)
  79. 79. Nested MCTS in one slide (Cazenave, F. Teytaud, etc.). 1) To a strategy π, you can associate a value function: π-Value(s) = expected reward when simulating with strategy π from state s. 2) Then define:
      NestedMC0(state) = MC(state)
      NestedMC1(state) = decision maximizing NestedMC0-value(state+decision)
      ...
      NestedMC42(state) = decision maximizing NestedMC41-value(state+decision)
      ==> looks like a great idea
      ==> not good in Go
      ==> good on some less widely known testbeds (“morpion solitaire”, some hard scheduling problems)
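The recursive definition above can be sketched on a toy one-player problem. Several things here are assumptions for illustration only: the problem (choose 8 bits, reward = number of leading ones), the function names, and the use of a single playout as the estimate of the lower-level value. It follows the definition on the slide rather than any specific published implementation.

```python
import random
from typing import Tuple

HORIZON = 8                      # hypothetical toy problem: choose 8 bits in sequence
State = Tuple[int, ...]

def legal_decisions(state: State) -> Tuple[int, ...]:
    return (0, 1) if len(state) < HORIZON else ()

def reward(state: State) -> int:
    """Number of leading ones in the finished sequence."""
    count = 0
    for bit in state:
        if bit != 1:
            break
        count += 1
    return count

def nested_value(state: State, level: int, rng: random.Random) -> int:
    """Reward of one episode played from `state` with the level-`level` strategy."""
    while len(state) < HORIZON:
        state = state + (nested_decision(state, level, rng),)
    return reward(state)

def nested_decision(state: State, level: int, rng: random.Random) -> int:
    """Level 0 is plain Monte-Carlo (random); level n maximizes the level n-1 value."""
    if level == 0:
        return rng.choice(legal_decisions(state))
    return max(legal_decisions(state),
               key=lambda d: nested_value(state + (d,), level - 1, rng))

rng = random.Random(0)
for level in (0, 1, 2):
    print(level, nested_value((), level, rng))
```

On this deliberately easy problem a single level of nesting is already enough, which matches the slide's remark that the idea pays off on some one-player testbeds; it says nothing about two-player games such as Go.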
  80. 80. Part I. A success story on Computer Games. Part II. Two unsolved problems in Computer Games. Part III. Some algorithms which do not solve them. Part IV. Conclusion (technical). Part V. Meta-conclusion (non-technical).
  81. 81. Part IV: Conclusions. Game of Go:
      1- disappointingly, most recent progress = human expertise
      ==> we understood a lot from methods which do not work or work only a little
      ==> we understood a lot from counter-examples, not from impressive performance
  82. 82. Part IV: Conclusions. Game of Go:
      1- disappointingly, most recent progress = human expertise
      2- UCB is not that much involved in MCTS (simple rules perform similarly) ==> “publication bias”
  83. 83. Part IV: Conclusions. Recent “generic” progress in MCTS:
      1- application to GGP (general game playing): the program learns the rules of the game just before the competition, with no last-minute development (fully automated) ==> not so well known, but really interesting
  84. 84. Part IV: Conclusions. Recent “generic” progress in MCTS:
      1- application to GGP (general game playing): the program learns the rules of the game just before the competition, with no last-minute development (fully automated)
      2- one-player games: great ideas which do not work in 2P games sometimes work in 1P games (e.g. optimizing the MC in a DPS sense)
  85. 85. Part IV: Conclusions. Techniques which outperformed the state of the art in Minesweeper were tested (negatively) on Go, and (positively) on industrial problems.
  86. 86. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs.
  87. 87. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs. But when you discuss with them, they tell you that there is publication and there is reality.
  88. 88. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs. But when you discuss with them, they tell you that there is publication and there is reality. In the end, we trust our friends, or published theorems, but we don't trust experiments. The most interesting MCTS results are negative results:
  89. 89. Understanding this “combination of local stuff” is impossible for computers. Abstract thinking (looks like theorem proving): current main ML techniques for MCTS do not work on this.
  90. 90. There are several examples of MCTS papers in which problems were swept under the carpet for the sake of publication, whereas the dust was the interesting stuff. Results are often difficult to reproduce, or unstable w.r.t. experimental conditions.
  91. 91. Examples:
      - “I have truncated results to ..... because it was unstable otherwise.” (cheat by using the new version only for openings)
      ==> for any method, with enough tuning, you get positive results
  92. 92. Examples:
      - “I have truncated results to ..... because it was unstable otherwise.” (cheat by using the new version only for openings)
      - “I could make it work after a lot of tuning in 9x9, but I could not get positive results in 19x19.” (cheat by heavy tuning)
      ==> for any method, with enough tuning, you get positive results
      ==> you are more likely to publish “I used sophisticated method XXX and got positive results” than “I used plenty of dirty tuning and got positive results”
      ==> if method XXX has plenty of free parameters, it's OK: at some point you will validate it
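The claim that enough tuning produces positive results for any method can be illustrated with a small simulation. All numbers here are hypothetical: every configuration is given a true win rate of exactly 50% (no real improvement), each is evaluated on 200 games, and only the best one is kept.

```python
import random

def best_of_tuning(n_configs: int, n_games: int, rng: random.Random) -> float:
    """Best observed win rate among configurations that all have a true rate of 50%."""
    rates = []
    for _ in range(n_configs):
        wins = sum(rng.random() < 0.5 for _ in range(n_games))
        rates.append(wins / n_games)
    return max(rates)

rng = random.Random(0)
print("one configuration:      ", best_of_tuning(1, 200, rng))
print("best of 100 tuned ones: ", best_of_tuning(100, 200, rng))
```

Reporting only the best configuration, without a fresh evaluation, tends to show a win rate several points above 50% even though no configuration is actually better than the baseline.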
  93. 93. For mathematical works, people sometimes lie about motivations, trying to justify that there is a real-world application.
  94. 94. For mathematical works, people sometimes lie about motivations, trying to justify that there is a real-world application. Sometimes it's true, but it's also often a lie. A memory from a long time ago: I was working on pure theory stuff and I asked, “I have read in the abstract that this can be applied to biological problems. Can you explain?”
  95. 95. For mathematical works, people sometimes lie about motivations, trying to justify that there is a real-world application. Sometimes it's true, but it's also often a lie. A memory from a long time ago: I was working on pure theory stuff and I asked, “I have read in the abstract that this can be applied to biological problems.” Answer: “Wahaha, he believed it!”
  96. 96. For mathematical works, people sometimes lie about motivations, trying to justify that there is a real-world application. Sometimes it's true, but it's also often a lie. In experiments, it's different: people often use experimental setups to hide the problems under the carpet. Mathematicians cannot do that.
  97. 97. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs. But when you discuss with them, they tell you that there is publication and there is reality. My conclusions:
      - don't trust publications too much,
      - I want to publish less,
      - I want to publish (try to publish...) failures and disappointing results.
  98. 98. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs. But when you discuss with them, they tell you that there is publication and there is reality. We could apply in MineSweeper (1P) ideas which do not work in Go (2P). My conclusions:
      - don't trust publications too much,
      - I want to publish less,
      - I want to publish (try to publish...) failures and disappointing results.
  99. 99. Part V: Meta-Conclusion. Huge publication bias. People report only experiments which are sooooo great breakthroughs. But when you discuss with them, they tell you that there is publication and there is reality. We could apply in Energy Management (1P) and in MineSweeper (1P) ideas which do not work in Go (2P). My conclusions:
      - don't trust publications too much,
      - I want to publish less,
      - I want to publish (try to publish...) failures and disappointing results.
  100. 100. Part V: Meta-Conclusion. People in computer games look much more clever since they have been working on Go. Much easier to write reports :-) Lucky: right place, right moment. The progress in the game of Go does not cure cancer. The important challenges are still in front of us (don't trust published solutions too much...). Failed experiments on Go provide more insights than the success story (in which the tuning part, which is not so generic, is not visible...).
  101. 101. Yet games are great challenges. When you play Go, you look clever & wise. When you play StarCraft, you look like a geeky teenager. Yet StarCraft, Doom, Table Tennis, MineSweeper are great challenges.
  102. 102. Difficult games: Havannah. Very difficult for computers.
  103. 103. What else? First Person Shooting (UCT for partially observable MDP)
  104. 104. What else? Real Time Strategy Game (multiple actors, partially observable). Frédéric Lemoine, MIG, 11/07/2008.
  105. 105. What else? Sports (continuous control). Frédéric Lemoine, MIG, 11/07/2008.
  106. 106. “Real” games. Assumption: if a computer understands and guesses spins, then this robot will be efficient for something else than just games (holds true for Go).
  107. 107. “Real” games. Assumption: if a computer understands and guesses spins, then this robot will be efficient for something else than just games.
  108. 108. What else ? Collaborative sports
  109. 109. Funding based on publication records: for me, this is the source of all evil.
      - Experimental works: difficult to reproduce (except games...)
      - Statistical conscious or unconscious cheating
      - Dust swept under the carpet / aesthetic bias
      - Negative results unpublished
      ==> moderately reliable publication
      Academics are, and should remain, the most independent and reliable people. They should be referees for all important industrial contracts. Yet academic papers are, I think, more reliable than reports for billion-$ contracts ==> pressure by money does not work :-(