Monte-Carlo Tree Search
O. Teytaud & colleagues
ENSL / ski 2014
In a nutshell:
- the game of Go, a great AI-complete challenge
- MCTS, a great recent tool for MDP-solving
- UCT & other maths
- unsolved stuff
(If someone solves these problems, it justifies a whole life of academic salary :-) )
Monte Carlo
Classical Monte Carlo first.
● We want to know E f(x)
● We generate x1,...,xn (independent draws of x)
● E f(x) ≈ average of the f(xi)
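As an illustration of the three bullets above, here is a minimal sketch in Python. The toy choice of x uniform on [0, 1] and f(x) = x^2 is an assumption made only for this example (it gives E f(x) = 1/3); the helper name `monte_carlo_mean` is hypothetical.

```python
import random

def monte_carlo_mean(f, sample, n, seed=0):
    """Estimate E f(x) by averaging f over n independent draws of x."""
    rng = random.Random(seed)
    return sum(f(sample(rng)) for _ in range(n)) / n

# Toy example (assumption): x uniform on [0, 1], f(x) = x^2, so E f(x) = 1/3.
estimate = monte_carlo_mean(lambda x: x * x, lambda rng: rng.random(), n=100_000)
print(estimate)
```

The estimate gets closer to 1/3 as n grows, which is the consistency property referred to later in the slides.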
Monte Carlo with Decisions
Classical Monte Carlo with multiple time steps, an example.
● x = one year of weather data
● f(x) = electricity production during this year
● Ill-defined: f(x) depends on my decisions (switch on / switch off).
● So f(x,d) with d = argmin f(x,d) (assuming I make optimal decisions)
● Still incorrect:
● x is 365-dimensional.
● d is 365-dimensional.
● I cannot know d360 when I decide d1.
So the quantity of interest is $\mathbb{E}_{x_1} \min_{d_1} \mathbb{E}_{x_2} \min_{d_2} \cdots \mathbb{E}_{x_{365}} \min_{d_{365}} f(x,d)$
How to compute that ?
Define an approximate di = π(i, xi) (possibly randomized).
==> Randomly draw both x and the di.
Problem:
● Classical MC is consistent. 
● Decisional MC is not consistent
==>  we would like the di to be optimal.
“Adaptive” Monte Carlo
$\mathbb{E}_{x_1} \min_{d_1} \mathbb{E}_{x_2} \min_{d_2} \cdots \mathbb{E}_{x_{365}} \min_{d_{365}} f(x,d)$
Ok we generate the di heuristically.
But we keep statistics.
And we “update” the heuristic with these statistics.
==> consistency !
MCTS is something like that. 
     (and there might be several “decision makers”)
Part I. A success story
in Computer Games
Part II. Two unsolved problems in Computer Games
Part III. Bandits, UCT & other math. stuff
Part IV. Conclusion
Part I : The Success Story
(less showing off in part II :-) )
The game of Go is a beautiful challenge.
We achieved the first wins against professional players in the game of Go, but with handicap!
Game of Go (9x9 here)
Game of Go: counting territories
(white has a 7.5 “bonus” as black starts)
Game of Go: the rules
Black plays at the blue circle:
the white group dies (it is
removed)
It's impossible to kill white (two “eyes”).
“Superko” rule: we don't come back to the same
situation.
(without superko: “PSPACE hard”
with superko: “EXPTIME-hard”)
At the end, we count territories
==> black starts, so +7.5 for white.
The rank of MCTS and classical programs in Go
(Source: Peter Shotwell+computer Go mailing list )
[Figure: rank of Go programs over time. Annotations: alpha-beta; MCTS; RAVE; MPI-parallelization; ML + expertise, ...; stagnation around 5D ?; quasi-solving of 7x7; not over in 9x9...]
Coulom (06)
Chaslot, Saito & Bouzy (06)
Kocsis & Szepesvari (06)
UCT (Upper Confidence Trees)
(a variant of MCTS)
Kocsis & Szepesvari (06)
Exploitation ...
SCORE = 5/7 + k.sqrt( log(10)/7 )
... or exploration ?
SCORE = 0/2 + k.sqrt( log(10)/2 )
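The two scores above can be compared numerically. The sketch below assumes a natural logarithm and a few arbitrary values of the constant k (neither is specified on the slide); `ucb_score` is a hypothetical helper name.

```python
import math

def ucb_score(wins, sims, total_sims, k):
    """Mean reward plus exploration bonus k * sqrt(log(total_sims) / sims)."""
    return wins / sims + k * math.sqrt(math.log(total_sims) / sims)

for k in (0.5, 1.0, 2.0):
    exploit = ucb_score(5, 7, 10, k)   # node with 5 wins out of 7 simulations
    explore = ucb_score(0, 2, 10, k)   # node with 0 wins out of 2 simulations
    choice = "exploit" if exploit >= explore else "explore"
    print(f"k={k}: exploit={exploit:.3f} explore={explore:.3f} -> {choice}")
```

With a small k the well-performing node keeps being selected; with a larger k the rarely simulated node wins the comparison, which is the exploration/exploitation trade-off shown on the slide.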
MCTS in one slide
Summary of MCTS
• While (we have time)
– S = state at which we need a decision
– Simulate randomly from S until end
– Update statistics
• Decision = most simulated move in S
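The loop above can be sketched on a toy game. Everything specific here is an assumption for illustration: the game (one heap, each player removes 1 to 3 stones, whoever takes the last stone wins), the constant c = 1.4, the iteration budget, and the names `Node`, `uct_child`, `rollout` and `mcts`. In-tree selection uses a UCB-style score, i.e. the UCT variant.

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones, self.player = stones, player   # player to move: 0 or 1
        self.parent, self.move = parent, move
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0   # wins counted for the player who played `move`

def uct_child(node, c):
    """Child maximizing mean reward + c * sqrt(log(parent visits) / child visits)."""
    return max(node.children, key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, player, rng):
    """Play uniformly random moves; return the player who takes the last stone."""
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = 1 - player

def mcts(stones, iterations=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(stones, player=0)
    for _ in range(iterations):                      # "while we have time"
        node = root
        # Descend in the tree while the current node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node, c)
        # Add one new child if some move has never been tried here.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, 1 - node.player, node, m)
            node.children.append(child)
            node = child
        # Simulate randomly until the end of the game.
        if node.stones == 0:
            winner = 1 - node.player                 # previous player took the last stone
        else:
            winner = rollout(node.stones, node.player, rng)
        # Update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Decision = most simulated move at the root.
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(3))    # with 3 stones left, taking all 3 wins immediately
print(mcts(10))
```

Returning the most simulated move, rather than the move with the best mean, follows the last bullet of the slide.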
Using UCB
UCB and its variants
• We have seen the MCTS principle
• The most classical MCTS is UCT (i.e. MCTS with UCB)
• Let us see the UCB formula and its properties
Upper Confidence Bound
Problem specified by:
- K arms
- Probability distributions $R_1,\dots,R_K$
- A budget T (# time steps)
During T time steps t=1,...,T (and t=T+1 for the simple regret):
- we choose $a_t \in \{1,\dots,K\}$
- we get a reward $r_t$, independently drawn with distribution $R_{a_t}$
We minimize a regret:
- Cumulative regret: $R = T \max_i \mathbb{E} R_i - \sum_{t=1}^{T} r_t$
- Simple regret: $\max_i \mathbb{E} R_i - \mathbb{E} R_{a_{T+1}}$
UCB: a_t = argmax_a averageReward(a) + sqrt( C log(t) / nb(a) )
==> reasonably good both for Simple & Cumulative
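A minimal simulation of the UCB rule on this bandit problem, as a sketch only. The Bernoulli arms, their means, C = 2 and the natural logarithm are assumptions for the example, and the function reports the expected (pseudo-)regret rather than the realized one.

```python
import math
import random

def ucb_bandit(means, T, C=2.0, seed=0):
    """Run UCB on Bernoulli arms; return (pull counts, cumulative pseudo-regret)."""
    rng = random.Random(seed)
    K = len(means)
    counts = [0] * K      # nb(a): number of pulls of each arm
    sums = [0.0] * K      # total reward collected on each arm
    regret = 0.0
    for t in range(1, T + 1):
        if t <= K:
            a = t - 1     # pull each arm once so that nb(a) > 0
        else:
            a = max(range(K), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(C * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        sums[a] += r
        regret += max(means) - means[a]   # expected loss of this pull
    return counts, regret

counts, regret = ucb_bandit([0.2, 0.5, 0.8], T=5000)
print(counts, regret)
```

Most pulls should end up on the best arm, and the cumulative regret should stay far below what uniform sampling would give.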
Stochastic bandit
Two main assumptions:
● Stationary
● Cumulative regret
Not true in
MCTS
[Formula annotations: average reward for arm k; variance for arm k]
“UCB” ?
• I have shown the “UCB” formula (Lai, Robbins), which is the difference between MCTS and UCT ( +sqrt(log t / nbSims) )
• The UCB formula has deep mathematical principles.
• But very far from the MCTS context (independence, regret).
• Contrary to what has often been claimed, UCB is not central in MCTS (but ok for proving convergence).
• But for publishing papers, relating MCTS to UCB is so beautiful, with plenty of maths papers in the bibliography :-)
Non stationary case: UCT
• Kocsis + Szepesvari 2006: UCB in non-stationary case
• Application to UCT:
Huge problem-dependent constant.
Only for finite MDP.
B^(D/2) iterations (Branching & Depth)
Experiments (~ αβ)
Now variants
• ((( f(x) = noisy function, finding x such that E f(x) is minimum for x in [0,1]^d )))
• Problem with infinite action space / state
space
• And algorithms which work better than
UCT in the discrete case
Infinite action space
• E.g. actions are continuous
• Infinite branching factor
• UCB meaningless in such a case
==> progressive widening: argmax UCBscore over the n^0.2 first options
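A small sketch of progressive widening as described above. The rounding (ceiling of n^0.2), the constant c, the treatment of never-simulated options and the function name are assumptions; the slide only says that the argmax is taken over the first n^0.2 options.

```python
import math

def progressive_widening_select(stats, n, c=1.0, exponent=0.2):
    """stats: list of (mean_reward, nb_sims) per option, in a fixed order.
    n: number of simulations of the parent node so far (n >= 1).
    Only the first ceil(n ** exponent) options are candidates."""
    width = min(len(stats), max(1, math.ceil(n ** exponent)))

    def score(i):
        mean, sims = stats[i]
        if sims == 0:
            return math.inf            # a newly opened option is tried first
        return mean + c * math.sqrt(math.log(n) / sims)

    return max(range(width), key=score)

options = [(0.1, 10), (0.2, 10), (0.0, 0), (0.9, 5)]
print(progressive_widening_select(options, n=1))     # only option 0 is open
print(progressive_widening_select(options, n=100))   # 3 options open
```

The fourth option has the best mean here but is not yet a candidate at n = 100, because only a sublinear number of options is open at any time.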
Infinite MDP
• Variant of UCT (Auger et al, 2013)
• Progressive widening: consider only a
sublinear number of children nodes
• Exploration log(t) ==> t^e for some e>0
Error = O( 1/n^(1/(10D)) )
exponentially surely in n
Explicit rate, but it
will take time...
Without exploration
UcbScore(move) = meanReward(move)
(the exploration term + sqrt( log(t) / nbSims(move) ) is dropped)
Works very well in Go. Why ?
Binary rewards, without exploration
(Berthier et al, 2009)
UcbScore(move) = meanReward(move), again without + sqrt( log(t) / nbSims(move) )
mean = (numerator+K) / (denominator + 2K)
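The regularized mean can be sketched as follows. The value K = 1 and the helper names are assumptions; the point is only that a move with no data starts at 1/2, so a purely greedy choice still gives new moves a chance.

```python
def regularized_mean(wins, sims, K=1.0):
    """(numerator + K) / (denominator + 2K): equals 1/2 when there is no data."""
    return (wins + K) / (sims + 2 * K)

def greedy_move(stats, K=1.0):
    """stats: dict move -> (wins, sims). No exploration term, best mean wins."""
    return max(stats, key=lambda m: regularized_mean(*stats[m], K))

print(greedy_move({"a": (5, 7), "b": (0, 2), "c": (0, 0)}))
print(greedy_move({"a": (1, 4), "c": (0, 0)}))
```

In the second call the untried move "c" is preferred, because the regularized mean of "a" has dropped below 1/2.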
Adversarial bandit
Different framework:
the reward is M(k,k') where
k' is chosen by an adversary
(not aware of your choice).
Criteria are a bit different,
algorithms are stochastic.
==> not for today.
==> extends UCT to
simultaneous actions
The great news about the MCTS field:
● Not related to classical algorithms
(no alpha-beta)
● Recent tools
(Rémi Coulom's paper in 2006)
● Not at all specific to Go
(now widely used in games,
and beyond)
But great performance in Go
needs adaptations
(of the MC part)...
Part II: challenges
Two main challenges:
● Situations which require abstract thinking
(cf. Cazenave)
● Situations which involve divide & conquer
(cf Müller)
Part I. A success story on Computer Games
Part II. Two unsolved problems in
Computer Games
Part III. Some algorithms which do not solve them
Part IV. Conclusion
A trivial semeai
(= “liberty” race)
Plenty of equivalent
situations!
They are randomly
sampled, with 
no generalization.
50% of estimated
win probability!
This is very easy.
Children can solve that.
But it is too abstract
for computers.
Computers play
“semeais” very badly.
It does not work. Why ?
50% of estimated
win probability!
(~ 8! x 8! such nodes)
And the humans ?
Humans consider just one variation!
This was the first deceptive 
situation: plenty of symmetries
Another different context:
problems that humans solve with
divide and conquer.
Requires more than local fighting.
Requires combining several local fights.
Children usually
not so good
at this.
But strong adults
really good.
And computers
very childish.
Looks like a
bad move,
“locally”.
Lee Sedol (black)
Vs
Hang Jansik (white)
Alive!
Part I. A success story on Computer Games
Part II. Two unsolved problems in Computer Games
Part III. Some algorithms which
do not solve them
(negative results show that the important stuff is
really in Part II...)
Part IV. Conclusion
Part III: techniques for addressing these challenges
1. Parallelization
2. Machine Learning
3. Genetic Programming
4. Nested MCTS
Parallelizing MCTS
• On a parallel machine with shared memory: just many
simulations in parallel, the same memory for all.
• On a parallel machine with no shared memory: one
MCTS per comp. node, and 3 times per second:
– Select nodes with at least 5% of total sims (depth at
most 3)
– Average all statistics on these nodes
==> comp cost = log(nb comp nodes)
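The message-passing scheme above can be sketched on plain dictionaries. This is an illustration under assumptions, not the authors' implementation: each compute node's tree is represented as a dict from a path (tuple of moves from the root) to [wins, sims], "total sims" is taken to be the sum of the root counts, and the name `synchronize` is hypothetical. The real scheme would perform this exchange over the network, about 3 times per second.

```python
def synchronize(trees, min_share=0.05, max_depth=3):
    """trees: one dict per compute node, mapping a path (tuple of moves from
    the root) to [wins, sims]. Averages the selected statistics in place."""
    total = sum(t.get((), [0, 0])[1] for t in trees)
    paths = set().union(*(t.keys() for t in trees))
    for path in paths:
        if len(path) > max_depth:
            continue                       # only nodes of depth at most 3
        wins = sum(t.get(path, [0, 0])[0] for t in trees)
        sims = sum(t.get(path, [0, 0])[1] for t in trees)
        if sims < min_share * total:
            continue                       # only nodes with at least 5% of the simulations
        for t in trees:
            t[path] = [wins / len(trees), sims / len(trees)]

t1 = {(): [60, 100], ("a",): [50, 80], ("b",): [1, 4], ("a", "x", "y", "z"): [10, 20]}
t2 = {(): [40, 100], ("a",): [30, 60], ("b",): [0, 2]}
synchronize([t1, t2])
print(t1[("a",)], t2[("a",)], t1[("b",)])
```

Only the few heavily simulated shallow nodes are exchanged, which keeps the amount of shared data small.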
Good news: it works
So misleading numbers...
Much better than voting schemes
But little difference with T. Cazenave
(depth 0).
Every month, someone says:
Try with a bigger
machine !
And win against
top pros !
(I have believed that,
at some point...)
In fact, “32” and “1”
have almost the same level...
(against humans...)
Being faster is not the solution
The same in Havannah
(F. Teytaud)
More deeply, 1
(R. Coulom)
Improvement in terms of performance against
humans
<<
Improvement in terms of performance against
computers
<<
Improvements in terms of self-play
More deeply, 2
No improvement in divide and conquer.
No improvement on situations
which require abstraction.
Part III: techniques for addressing these challenges
1. Parallelization
2. Machine Learning
3. Genetic Programming
4. Nested MCTS
What is machine learning ?
= using plenty of data
for deriving useful knowledge.
So it's statistics ?
Closely related to statistics
Just a bit more “geek”.
Machine learning
Good simulations are crucial.
It is a bit disappointing for the
genericity of the method.
Can we make this
tuning automatic ?
MACHINE LEARNING
IN MCTS:
BIASING THE
TREE SEARCH
Rapid Action Value Estimates
ScoreUCB(m,s) = average reward when playing move m in situation s ((( + sqrt(...) )))
ScoreRAVE(m,s) = average reward when playing move m after situation s
==> asymptotically stupid (we want an estimate of m when it is played now, in s)
==> but non-asymptotically quite great
A classical machine learning trick in MCTS: RAVE
(= rapid action value estimates)
score(move) =
alpha UCB(move)
+ (1-alpha) RAVE(move)
alpha^2 = nbSimulations / ( K + nbSimulations )
Usually works well, but performs weakly on some situations.
weakness:
- brings information only from bottom to top of the tree
- does not solve main problems
- sometimes very harmful
==> extensions ?
A classical machine learning trick in MCTS: RAVE
(= rapid action value estimates)
score(move,s) =
alpha UCB(move,s)
+ (1-alpha) RAVE(move,s)
alpha^2 = nbSimulations / ( K + nbSimulations )
Or better:
● RAVE(m,s) = #cumRewardRAVE(m,s) / #simsRAVE(m,s)
● #simsRAVE(m,s) initialized at 50
● #cumRewardRAVE(m,s) initialized at 50 x expertise(m,s)
Currently, “expertise” is handcrafted.
Can we do better with a neural network ?
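The RAVE blending and the expertise-based initialization can be sketched together. The value K = 1000, the assumption that expertise(m,s) lies in [0, 1], and the function names are illustrative choices, not values from the slides; only the prior of 50 virtual simulations comes from the text above.

```python
import math

def rave_value(cum_reward, sims, expertise, prior_sims=50):
    """RAVE(m,s) with counters initialized from an expertise prior:
    #simsRAVE starts at 50 and #cumRewardRAVE at 50 x expertise(m,s)."""
    return (cum_reward + prior_sims * expertise) / (sims + prior_sims)

def blended_score(ucb, rave, nb_simulations, K=1000.0):
    """alpha * UCB + (1 - alpha) * RAVE, with alpha^2 = n / (K + n)."""
    alpha = math.sqrt(nb_simulations / (K + nb_simulations))
    return alpha * ucb + (1 - alpha) * rave

print(rave_value(0, 0, expertise=0.7))                  # pure prior
print(blended_score(0.9, 0.3, nb_simulations=0))        # RAVE only
print(blended_score(0.9, 0.3, nb_simulations=10**9))    # essentially UCB only
```

With few simulations the score is driven by RAVE and the prior; as simulations accumulate, alpha tends to 1 and the UCB term takes over.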
Here B2 is the only good move for white.
But B2 makes sense only as a first move,
and nowhere else in subtrees ==> RAVE rejects B2.
==> extensions ?
Criticality: covariance between
“succeeding at a location x”
and “global reward”
Criticality: how to use it ?
SimsCriticality = c x | Criticality |
● WinsCriticality= SimsCriticality if Criticality >0
● WinsCriticality= 0 otherwise
==> Then, use WinsRAVE + WinsCriticality
and SimsRAVE + SimsCriticality
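A minimal sketch of how criticality could be computed and turned into extra counters. The encoding of each simulation as a pair of 0/1 values, the constant c = 100 and the function names are assumptions for illustration.

```python
def criticality(samples):
    """samples: list of (succeeded_at_x, won) pairs of 0/1 values, one per
    simulation. Returns the covariance between the two indicators."""
    n = len(samples)
    p_x = sum(x for x, _ in samples) / n
    p_win = sum(w for _, w in samples) / n
    p_both = sum(x * w for x, w in samples) / n
    return p_both - p_x * p_win

def criticality_counters(crit, c=100.0):
    """SimsCriticality = c x |Criticality|; WinsCriticality = SimsCriticality
    if Criticality > 0, and 0 otherwise."""
    sims = c * abs(crit)
    wins = sims if crit > 0 else 0.0
    return wins, sims

crit = criticality([(1, 1), (1, 1), (0, 0), (0, 0)])
print(crit, criticality_counters(crit))
```

The returned pair is then added to the RAVE counters, so a strongly positive criticality acts as extra virtual wins and a strongly negative one as extra virtual losses.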
MACHINE LEARNING
IN MCTS:
BIASING THE
MONTE CARLO PART
(well, trying to...)
Other Machine Learning tricks in MCTS
Generic rules proposed recently:
- Drake [ICGA 2009]: Last Good Reply
- Silver and others: simulation balancing
- poolRave [Rimmel et al, ACG 2011]
- Contextual Monte-Carlo [Rimmel et al, E.G. 2010]
- Decisive moves and anti-decisive moves
[Teytaud et al, CIG 2010]
==> significantly positive, but far less
efficient than human expertise
Part III: techniques for addressing these challenges
1. Parallelization
2. Machine Learning
3. Genetic Programming
4. Nested MCTS
We don't want to use expert knowledge.
We want automated solutions.
Developing biases by Genetic Programming ?
Genetic programming
= optimizing programs.
E.g. optimizing the
Monte Carlo simulator.
Typically by evolutionary
algorithms.
Looks like a good idea.
But importantly:
A strong MC part
(in terms of playing strength of the MC part),
does not imply (by far!)
a stronger MCTS.
(except in 1P cases...)
Developing a MC by Genetic Programming ?
Hoock et al
Cazenave et al
Part III: techniques for addressing these challenges
1. Parallelization
2. Machine Learning
3. Genetic Programming
4. Nested MCTS
Nested MCTS in one slide
(Cazenave, F. Teytaud, etc)
1) to a strategy π, you can associate a value function
π-Value(s) = expected reward when simulating with strategy π from state s
2) Then define:
Nested-MC0(state) = MC(state)
Nested-MC1(state) = decision maximizing NestedMC0-value(next state)
...
Nested-MC42(state) = decision maximizing NestedMC41-value(next state)
==> looks like a great idea
==> not good in Go
==> good on some less widely known testbeds
(“morpion solitaire”, some hard scheduling pbs)
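The recursive definition above can be sketched on a toy one-player problem. The problem (choose 6 bits, reward = number of ones), the single-sample estimate of each lower-level value and all names are assumptions for illustration; published nested Monte Carlo search variants add refinements, such as memorizing the best sequence found, that are not shown here.

```python
import random

N = 6  # horizon of the toy problem (assumption)

def moves(state):
    return [0, 1] if len(state) < N else []

def play(state, move):
    return state + (move,)

def reward(state):
    return sum(state)   # toy reward: number of ones chosen

def nested_value(state, level, rng):
    """Reward obtained by following the level-`level` strategy from `state`."""
    while moves(state):
        if level == 0:
            move = rng.choice(moves(state))      # Nested-MC0 = plain Monte Carlo
        else:
            # Decision maximizing the value, one level below, of the next state.
            move = max(moves(state),
                       key=lambda m: nested_value(play(state, m), level - 1, rng))
        state = play(state, move)
    return reward(state)

rng = random.Random(0)
mean0 = sum(nested_value((), 0, rng) for _ in range(200)) / 200
mean1 = sum(nested_value((), 1, rng) for _ in range(200)) / 200
print(mean0, mean1)
```

On this toy problem level 1 should score clearly better than plain Monte Carlo on average, at the price of many more simulations per decision.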
Part I. A success story on Computer Games
Part II. Two unsolved problems in Computer Games
Part III. Some algorithms which do not solve them
Part IV. Conclusion
Part IV: Conclusions
MCTS = algorithm from 2006
● Born in AI for games
● Slightly related to A* and αβ-iterative-deepening
● Widely applicable.
● UCT = one variant (try it first, then test)
● RAVE & other statistics as a bias
● Parallelization + expertise.
● Some clearly identified problems:
- abstract thinking (AI complete ?)
- divide & conquer
Part IV: Conclusions
Game of Go:
1- disappointingly,
most recent progress = human expertise
2- UCB is not that much involved in MCTS
(simple rules perform similarly)
==> publication bias
Part IV: Conclusions
Recent “generic” progress in MCTS:
1- application to GGP (general game playing):
the program learns the rules of the game
just before the competition, no last-minute
development (fully automated)
==> good model for genericity
==> MCTS very good at this
2- one-player games: great ideas which do not
work in 2P-games sometimes work in 1P
games (e.g. optimizing the MC in a
DPS, i.e. direct policy search, sense)
Part IV: Conclusions
3. Applications in video games (restricted state info)
4. PO (partially observable) games (Minesweeper)
ML techniques for understanding from simulations
Abstract thinking (looks like theorem proving)
Understanding this “combination of local stuff” is impossible for computers
MCTS = versatile, somehow model-free, convenient, often great. What next ?
Can we compete with Alpha-Beta in e.g. Chess ?

More Related Content

Viewers also liked

Bias correction, and other uncertainty management techniques
Bias correction, and other uncertainty management techniquesBias correction, and other uncertainty management techniques
Bias correction, and other uncertainty management techniques
Olivier Teytaud
 
Fuzzy control - superfast survey
Fuzzy control - superfast surveyFuzzy control - superfast survey
Fuzzy control - superfast survey
Olivier Teytaud
 
Debugging
DebuggingDebugging
Debugging
Olivier Teytaud
 
Réseaux neuronaux profonds & intelligence artificielle
Réseaux neuronaux profonds & intelligence artificielleRéseaux neuronaux profonds & intelligence artificielle
Réseaux neuronaux profonds & intelligence artificielle
Olivier Teytaud
 
Artificial intelligence for power systems
Artificial intelligence for power systemsArtificial intelligence for power systems
Artificial intelligence for power systems
Olivier Teytaud
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
Olivier Teytaud
 
Functional programming
Functional programmingFunctional programming
Functional programming
Olivier Teytaud
 
Examples of operational research
Examples of operational researchExamples of operational research
Examples of operational research
Olivier Teytaud
 
Simulation-based optimization: Upper Confidence Tree and Direct Policy Search
Simulation-based optimization: Upper Confidence Tree and Direct Policy SearchSimulation-based optimization: Upper Confidence Tree and Direct Policy Search
Simulation-based optimization: Upper Confidence Tree and Direct Policy Search
Olivier Teytaud
 
Simple regret bandit algorithms for unstructured noisy optimization
Simple regret bandit algorithms for unstructured noisy optimizationSimple regret bandit algorithms for unstructured noisy optimization
Simple regret bandit algorithms for unstructured noisy optimization
Olivier Teytaud
 
Bias and Variance in Continuous EDA: massively parallel continuous optimization
Bias and Variance in Continuous EDA: massively parallel continuous optimizationBias and Variance in Continuous EDA: massively parallel continuous optimization
Bias and Variance in Continuous EDA: massively parallel continuous optimization
Olivier Teytaud
 
Keywords and examples of machine learning
Keywords and examples of machine learningKeywords and examples of machine learning
Keywords and examples of machine learningOlivier Teytaud
 
Disappointing results & open problems in Monte-Carlo Tree Search
Disappointing results & open problems in Monte-Carlo Tree SearchDisappointing results & open problems in Monte-Carlo Tree Search
Disappointing results & open problems in Monte-Carlo Tree Search
Olivier Teytaud
 
Direct policy search
Direct policy searchDirect policy search
Direct policy search
Olivier Teytaud
 
Combining games artificial intelligences & improving random seeds
Combining games artificial intelligences & improving random seedsCombining games artificial intelligences & improving random seeds
Combining games artificial intelligences & improving random seeds
Olivier Teytaud
 

Viewers also liked (16)

Bias correction, and other uncertainty management techniques
Bias correction, and other uncertainty management techniquesBias correction, and other uncertainty management techniques
Bias correction, and other uncertainty management techniques
 
Fuzzy control - superfast survey
Fuzzy control - superfast surveyFuzzy control - superfast survey
Fuzzy control - superfast survey
 
Debugging
DebuggingDebugging
Debugging
 
Réseaux neuronaux profonds & intelligence artificielle
Réseaux neuronaux profonds & intelligence artificielleRéseaux neuronaux profonds & intelligence artificielle
Réseaux neuronaux profonds & intelligence artificielle
 
Artificial intelligence for power systems
Artificial intelligence for power systemsArtificial intelligence for power systems
Artificial intelligence for power systems
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
 
Functional programming
Functional programmingFunctional programming
Functional programming
 
Examples of operational research
Examples of operational researchExamples of operational research
Examples of operational research
 
Simulation-based optimization: Upper Confidence Tree and Direct Policy Search
Simulation-based optimization: Upper Confidence Tree and Direct Policy SearchSimulation-based optimization: Upper Confidence Tree and Direct Policy Search
Simulation-based optimization: Upper Confidence Tree and Direct Policy Search
 
Simple regret bandit algorithms for unstructured noisy optimization
Simple regret bandit algorithms for unstructured noisy optimizationSimple regret bandit algorithms for unstructured noisy optimization
Simple regret bandit algorithms for unstructured noisy optimization
 
Bias and Variance in Continuous EDA: massively parallel continuous optimization
Bias and Variance in Continuous EDA: massively parallel continuous optimizationBias and Variance in Continuous EDA: massively parallel continuous optimization
Bias and Variance in Continuous EDA: massively parallel continuous optimization
 
Keywords and examples of machine learning
Keywords and examples of machine learningKeywords and examples of machine learning
Keywords and examples of machine learning
 
Power systemsilablri
Power systemsilablriPower systemsilablri
Power systemsilablri
 
Disappointing results & open problems in Monte-Carlo Tree Search
Disappointing results & open problems in Monte-Carlo Tree SearchDisappointing results & open problems in Monte-Carlo Tree Search
Disappointing results & open problems in Monte-Carlo Tree Search
 
Direct policy search
Direct policy searchDirect policy search
Direct policy search
 
Combining games artificial intelligences & improving random seeds
Combining games artificial intelligences & improving random seedsCombining games artificial intelligences & improving random seeds
Combining games artificial intelligences & improving random seeds
 

Similar to Monte Carlo Tree Search in 2014 (MCMC days in Marseille)

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Meta Monte-Carlo Tree Search
Meta Monte-Carlo Tree SearchMeta Monte-Carlo Tree Search
Meta Monte-Carlo Tree SearchOlivier Teytaud
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentary
irrrrr
 
AlphaZero and beyond: Polygames
AlphaZero and beyond: PolygamesAlphaZero and beyond: Polygames
AlphaZero and beyond: Polygames
Olivier Teytaud
 
Combining UCT and Constraint Satisfaction Problems for Minesweeper
Combining UCT and Constraint Satisfaction Problems for MinesweeperCombining UCT and Constraint Satisfaction Problems for Minesweeper
Combining UCT and Constraint Satisfaction Problems for Minesweeper
Olivier Teytaud
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
Umberto Picchini
 
Stratified Monte Carlo and bootstrapping for approximate Bayesian computation
Stratified Monte Carlo and bootstrapping for approximate Bayesian computationStratified Monte Carlo and bootstrapping for approximate Bayesian computation
Stratified Monte Carlo and bootstrapping for approximate Bayesian computation
Umberto Picchini
 
Matt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense SlidesMatt Purkeypile's Doctoral Dissertation Defense Slides
Matt Purkeypile's Doctoral Dissertation Defense Slides
mpurkeypile
 
Monte Carlo Tree Search for the Super Mario Bros
Monte Carlo Tree Search for the Super Mario BrosMonte Carlo Tree Search for the Super Mario Bros
Monte Carlo Tree Search for the Super Mario Bros
Chih-Sheng Lin
 
ACM 2013-02-25
ACM 2013-02-25ACM 2013-02-25
ACM 2013-02-25
Ted Dunning
 
Some history of quantum groups
Some history of quantum groupsSome history of quantum groups
Some history of quantum groups
Daniel Tubbenhauer
 
presentation on artificial intelligence autosaved
presentation on artificial intelligence autosavedpresentation on artificial intelligence autosaved
presentation on artificial intelligence autosaved
Divya Somashekar
 
An Introduction to Discrete Choice Modelling
An Introduction to Discrete Choice ModellingAn Introduction to Discrete Choice Modelling
An Introduction to Discrete Choice Modelling
Institute for Transport Studies (ITS)
 
modeling.ppt
modeling.pptmodeling.ppt
modeling.ppt
ssuser1d6968
 
Writing a SAT solver as a hobby project
Writing a SAT solver as a hobby projectWriting a SAT solver as a hobby project
Writing a SAT solver as a hobby project
Masahiro Sakai
 
Dc8c4f010f40.hanoi.towers
Dc8c4f010f40.hanoi.towersDc8c4f010f40.hanoi.towers
Dc8c4f010f40.hanoi.towersSumedha
 
AI_Session 17 CSP.pptx
AI_Session 17 CSP.pptxAI_Session 17 CSP.pptx
AI_Session 17 CSP.pptx
Asst.prof M.Gokilavani
 
Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28
Ted Dunning
 
Nearest Neighbor Customer Insight
Nearest Neighbor Customer InsightNearest Neighbor Customer Insight
Nearest Neighbor Customer Insight
MapR Technologies
 
Oxford 05-oct-2012
Oxford 05-oct-2012Oxford 05-oct-2012
Oxford 05-oct-2012
Ted Dunning
 

Similar to Monte Carlo Tree Search in 2014 (MCMC days in Marseille) (20)

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Meta Monte-Carlo Tree Search
Meta Monte-Carlo Tree SearchMeta Monte-Carlo Tree Search
Meta Monte-Carlo Tree Search
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentary
 
AlphaZero and beyond: Polygames
AlphaZero and beyond: PolygamesAlphaZero and beyond: Polygames
AlphaZero and beyond: Polygames
 
Monte Carlo Tree Search in 2014 (MCMC days in Marseille)