2. Chess - Not so Conventional
Chess has been around for some 1500 years
Gameplay has been rigorously studied
It has a defined start state
It has complete information
Why not employ conventional learning algorithms?
3. What makes Chess such a challenge?
Chess is incredibly complex
For example, the number of possible ways of playing the first four moves per side
is 318,979,564,000
First ten moves? 169,518,829,100,544,000,000,000,000,000 different ways
An estimate of the game-tree complexity of chess is 10^120 (the Shannon number)
About 10^46 different positions can arise (accounting for legal moves)
Exhaustive search is infeasible!
4. What does this mean?
Current evaluation functions for top chess programs rely on manual tuning.
This amounts to years of trial and error before an evaluation function can give
an adequate evaluation of the current state of the game.
They rely on a priori data collected from years of research
Current algorithms are merely a brute-force search approach with no
learning capability.
Is this the poster-child for AI that we’ve been looking for?
5. Search Space
The search space of chess is not smooth and unimodal
That makes tweaking the parameters a significant challenge
Turn three dials one way and get a drop in performance, but turn a fourth and get an
increase.
Hill climbing algorithms can’t help us here
6. Tune and Guess
Do we really know what we’re doing?
Finding the best value for an evaluation-function parameter is a matter of
guessing and intuition.
We don’t have a good grasp of the problem, making it difficult for programmers to
find heuristics for tuning parameters by means other than trial and error.
And… is there a global optimum to be found?
Current algorithms try to find a global optimum for the parameters,
which, in the case of chess, may not even exist.
7. Genetic Algorithms to the Rescue?
Are genetic algorithms the solution?
At first glance, it appears to be an optimization problem
Optimization problems + Genetic Algorithms = GOOD
i.e., encode the parameters into a bit string and watch it evolve
So… what’s the problem?
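A minimal sketch of the bit-string encoding idea, with illustrative parameter names and bit widths (none of these values come from the paper):

```python
# Hypothetical sketch: packing evaluation-function parameters into one bit string.
# Parameter names and bit widths are illustrative assumptions.

PARAMS = [               # (name, bit width); each value lies in [0, 2**width - 1]
    ("pawn_value",      8),
    ("knight_value",   10),
    ("bishop_mobility", 6),
]

def encode(values):
    """Pack integer parameter values into a single 0/1 string chromosome."""
    bits = []
    for (name, width), v in zip(PARAMS, values):
        assert 0 <= v < 2 ** width, f"{name} out of range"
        bits.append(format(v, f"0{width}b"))
    return "".join(bits)

def decode(chromosome):
    """Unpack a chromosome back into parameter values."""
    values, pos = [], 0
    for name, width in PARAMS:
        values.append(int(chromosome[pos:pos + width], 2))
        pos += width
    return values

chromosome = encode([100, 320, 17])
assert decode(chromosome) == [100, 320, 17]   # round-trips exactly
```

The GA then mutates and recombines the bit string directly; `decode` turns each chromosome back into parameters the engine can use.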
8. Fitness Function
How to evaluate the fitness of each mutation?
We let it play Chess...
Even with a population of only 100 organisms, each playing every other once at
10 seconds per game, each generation would take 825 minutes to evolve.
That means 57 days just to reach the 100th generation.
So… we can’t rely on co-evolution alone.
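The arithmetic behind the figures above is consistent with a single round-robin among 100 organisms (an assumption; the slide does not say how the games are scheduled):

```python
from math import comb

organisms = 100
seconds_per_game = 10

# A full round-robin pairs every organism with every other once.
games_per_generation = comb(organisms, 2)                       # 4950 games
minutes_per_generation = games_per_generation * seconds_per_game / 60   # 825.0
days_for_100_generations = minutes_per_generation * 100 / (60 * 24)     # ~57.3
```

So the 825 minutes per generation and the roughly 57 days for 100 generations both fall out of a plain round-robin schedule.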
9. Alternative Fitness Function
Instead of our organisms playing themselves, they play history
Each organism is given a state and told to make a move from that state
They are only allowed a shallow (1-ply) search over the first set of actions
The organism’s move is compared to the one played in a grandmaster’s game
Fitness is the number of moves that match the grandmaster’s
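The mimicry fitness above can be sketched as follows; `legal_moves`, `evaluate`, and `apply_move` are hypothetical engine hooks supplied by the caller, not the paper's API:

```python
# Sketch of the grandmaster-mimicry fitness function. All helper callables
# (legal_moves, evaluate, apply_move) are assumptions a real engine would supply.

def best_move_1ply(state, params, legal_moves, evaluate, apply_move):
    """1-ply search: pick the legal move whose resulting position scores best."""
    return max(legal_moves(state),
               key=lambda m: evaluate(apply_move(state, m), params))

def fitness(organism_params, training_set, legal_moves, evaluate, apply_move):
    """Count positions where the organism's 1-ply choice matches the
    grandmaster's recorded move."""
    matches = 0
    for state, grandmaster_move in training_set:
        chosen = best_move_1ply(state, organism_params,
                                legal_moves, evaluate, apply_move)
        if chosen == grandmaster_move:
            matches += 1
    return matches
```

Because each position needs only a 1-ply search, thousands of positions can be scored per second, which is what makes this fitness function cheap enough to use.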
10. Organism vs. Organism
Final stage, after many iterations of the previous means of evolution
After playing compared to the Grandmaster, the evolved organisms play each
other
None start from random parameters
Fitness now is the relative strength of each organism
Best single evolved organism is selected
11. Coevolution isn’t enough to create suitable parameters for chess
Small populations were used in this case
Organisms did not start off with random parameters, but were already tuned
With this method, we do not see whether the move led to a favourable outcome, only
that the grandmaster made that move
12. Coevolution in previous research
Chellapilla and Fogel - expert level checker program
used to evolve neural network board evaluators
applicability to chess not clear
No successful attempt had been made to use coevolution to evolve the parameters of
a chess program from fully randomized initial values
Chess relies on a priori knowledge; coevolution could be employed successfully only
when the initial material and positional parameters were initialized within sensible ranges
13. Coevolution for evolving computer chess program
Combine evolution and coevolution to evolve the parameters of the evaluation function
simulate the moves of human grandmasters
avoid relying on the availability of evaluation scores
assume the program already contains a highly tuned search mechanism
14. Coevolution for evolving computer chess program
Pre-condition
only relies on a widely available database of grandmaster-level games
Method in general
evolve organisms to mimic the behavior of human grandmasters
organisms are then improved by coevolution
16. What does coevolution do?
Use a single-population coevolution phase
the 10 best evolved organisms serve as the initial population
in each generation, each organism plays four games against each other organism
apply fitness function based on this relative performance
apply rank-based selection to select the organisms for breeding
After running
select the best evolved organism as the best overall organism
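One coevolution generation under these rules might be sketched as below, assuming a hypothetical `play_game(a, b)` that returns a's score (1, 0.5, or 0):

```python
import random

def coevolution_generation(population, play_game, games_per_pair=4):
    """Round-robin: each organism plays `games_per_pair` games against every
    other organism; fitness is the total score (relative performance only)."""
    scores = {i: 0.0 for i in range(len(population))}
    for i in range(len(population)):
        for j in range(i + 1, len(population)):
            for _ in range(games_per_pair):
                s = play_game(population[i], population[j])
                scores[i] += s
                scores[j] += 1 - s
    # Rank-based selection: breeding probability depends on rank, not raw score,
    # which keeps selection pressure stable in a small population.
    ranked = sorted(range(len(population)), key=lambda i: scores[i])
    weights = [ranked.index(i) + 1 for i in range(len(population))]
    parents = random.choices(population, weights=weights, k=len(population))
    return parents, scores
```

With the 10-organism population from the evolution phase, each generation is 45 pairings of four games, cheap enough to run for many generations.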
17. Why use coevolution?
To improve upon these organisms and create an enhanced best of best organism
Using a specific grandmaster for each run
does not improve over the method used
using 1-ply searches only enables mimicking a general grandmaster style
Note
in this case, the population size is small and the initial organisms are already well tuned
at this stage, the playing strength of the program is substantially limited
18. Selective Search: Methods
Selective search methods work by pruning uninteresting moves earlier and
spending more time exploring more interesting moves.
This works better than simply searching all moves up to a certain fixed depth.
Such methods include null-move pruning, futility pruning, multi-cut pruning, and
selective extensions. Each of these methods involves critical parameters that are
normally tuned using experimental data and optimized manually.
19. Selective Search: Genetic Algorithm
GA can be used to automate the tuning and optimization of the parameters for
these selective search methods.
Represent the parameters as binary chromosomes, with enough bits to cover the
expected range of each parameter.
For training, each chromosome processes a set of tactical test positions labeled
with their best moves.
20. Selective Search: Training and Evaluation
Each chromosome processes every position in the set and searches for a best
move.
Rather than counting the number of correctly solved positions, it is better to count
the number of nodes processed to reach each solution and sum this total
over the entire training set.
This method of evaluation gives more fitness information per position and
encourages organisms to not only find solutions but to find them quickly.
21. Selective Search: Optimization
Given this method of evaluating the fitness of the chromosomes, the GA
implementation for optimizing selective search parameters is standard.
Chromosomes are Gray-coded; the GA uses fitness-proportional selection, uniform
crossover, and elitism.
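The standard machinery named above might look like this (a sketch, assuming a fitness where higher is better, e.g. an inverted node count):

```python
import random

def to_gray(n):
    """Binary -> Gray code: adjacent integers differ by a single bit."""
    return n ^ (n >> 1)

def from_gray(g):
    """Gray code -> binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def uniform_crossover(a, b):
    """Each bit of the child comes from parent a or b with equal probability."""
    return [random.choice(pair) for pair in zip(a, b)]

def next_generation(population, fitness, elite=1):
    """One GA step: elitism + fitness-proportional selection + uniform crossover.
    `fitness` must return non-negative values where higher is better."""
    scored = sorted(population, key=fitness, reverse=True)
    children = scored[:elite]                      # elitism: keep the best as-is
    weights = [fitness(c) for c in population]     # fitness-proportional selection
    while len(children) < len(population):
        a, b = random.choices(population, weights=weights, k=2)
        children.append(uniform_crossover(a, b))
    return children
```

Gray coding matters here because a small change in a parameter value then corresponds to flipping few bits, so mutation explores nearby parameter settings instead of jumping wildly.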
22. Experimental results: running time
The evolution was run ten times, taking about 20 hours.
The coevolution phase ran for approximately 20 hours.
Each organism played every other organism four times in each round.
Each game was limited to ten seconds
23. Results of Evolution
Y-axis: number of correct moves found
X-axis: generation
The best organism solved 1,620 positions correctly, corresponding to 32.4% of the
total positions.
24. Results of Coevolution
The evolved organism learns its parameter values from human grandmasters
Some evolved parameter values of the best organism are shown (table)
25. Evolving Search Parameters
Y-axis: total node count over the 879 ECM positions for the best and average organism
X-axis: generation
26. Comparing with other programs
Evol*: contains all evolved parameter values for the evaluation function and the
search mechanism.
In the number of correct moves found, it
performed at levels comparable to Crafty,
Junior, Fritz, and Hiarcs
27. References
David, Omid E., H. Jaap van den Herik, Moshe Koppel, and Nathan S.
Netanyahu. "Genetic Algorithms for Evolving Computer Chess Programs." IEEE
Transactions on Evolutionary Computation 18.5 (2014): 779-89. Web.
http://www.chess-poster.com/english/notes_and_facts/did_you_know.htm
https://en.wikipedia.org/wiki/Shannon_number
Editor's Notes
We should note which question the slides are for, so that we keep everything together and know where to slot in new work.
My guess is that the current slides are for questions 1 and 2?
END QUESTION 2
Question 3 Start
We use a widely available database of grandmaster-level games to get a list of positions and their known next moves
For each position, we make the organism decide on a move using a 1-ply search
Compare the organism’s selected move with the grandmaster’s.
Fitness is the number of the organism’s moves that match
This takes very little time: about 1 ms for a single position’s 1-ply search, so roughly 1,000 positions per second
Question 3
We have already, through the previous method, obtained a population of fairly good chess players
This stage only serves to select the best of them all
Question 3 End
Coevolution isn’t enough; we need a way of starting strong from a known good base
Small populations are good for time, but maybe not as good for getting a good result
Tuned organisms will evolve to their peak more quickly, but require tuning time ahead of time
This method assumes we are training on someone who plays well (a grandmaster), but the position and move could be one they went on to lose from