Solving Sudoku Using Genetic Operations and Sub-
Blocks With Optimized Mutation Rates
Stephen Felman
sfelman@eckerd.edu
May 14, 2014
Abstract – This paper builds on the idea of treating the 3x3 sub-blocks of a Sudoku puzzle as building blocks when solving the puzzle with a genetic algorithm. It extends Yuji Sato and Hazuki Inoue's (2010) paper on genetic operations that preserve building blocks and tunes their method to solve puzzles in fewer generations by making it easier for the population to escape local maxima (pg. 23). Only the classic variant is considered: a 9x9 grid made up of nine 3x3 sub-blocks and filled with the numbers 1 through 9.
I. Introduction
Sudoku is a logic puzzle that was most likely created by the architect Howard Garns in 1979 and first published in Dell magazines under the name 'Number Place'. It did not become popular until 1986, when the Japanese company Nikoli popularized it (Wikipedia, 2014). The classic variant discussed in this paper is a 9x9 grid divided into nine 3x3 sub-blocks. In a valid solution, every sub-block, every row, and every column contains each of the numbers 1 through 9.
There is much related work, as attempts to solve Sudoku puzzles with genetic algorithms (GAs) have become popular in recent years. Many papers report some success, but the best results we found on difficult puzzles are those of Yuji Sato and Hazuki Inoue in their paper “Solving Sudoku with Genetic Operations that Preserve Building Blocks”, which we build upon here. Many other papers had limited success but contain elements that could be incorporated into the genetic algorithm to raise its success rate, such as elitism and a reset point for escaping local maxima (Das et al., pg. 1-2), and algorithms that generate only the desired permutations in each 3x3 sub-block based on the known values in the puzzle (Maji et al., pg. 393).
The problem that many GAs run into when solving Sudoku puzzles seems to be the large number of local maxima. This suggests that larger mutation rates than in typical GAs should be used, and this paper looks to take advantage of mutation rates when solving these puzzles. There is, however, a fine line between a mutation rate that is too low, so that the population takes too long to leave a local maximum, and one that is too high, so that it destroys the building blocks needed to reach a solution. The mutation operation proposed in this paper takes this into account and attempts to preserve the fitter sub-blocks while searching for a solution.
II. Constructs of Sudoku
This paper attempts to solve only the
classic 9x9 variant of Sudoku puzzles.
The rules are as follows:

1) Each of the 9 3x3 sub-blocks must contain each of the numbers 1 through 9 exactly once
2) Each of the 9 rows must contain each of the numbers 1 through 9 exactly once
3) Each of the 9 columns must contain each of the numbers 1 through 9 exactly once
4) Initial values may not be changed
The previous rules are shown visually in
figure 1 below while figure 2 shows the
solution to the puzzle in figure 1.
Fig. 1 An example of a 9x9 Sudoku puzzle.
Fig. 2 The Solution to the puzzle shown in Fig.1
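The four rules can be checked mechanically. The following is a minimal sketch in Java, not code from our experiments; the class name SudokuRules, the method names, and the convention that 0 marks a cell without an initial value are assumptions made for illustration.

```java
final class SudokuRules {
    /** Returns true if the nine values are exactly the digits 1..9, each once. */
    static boolean isPermutation(int[] unit) {
        int seen = 0;                       // bit d is set once digit d has been seen
        for (int v : unit) {
            if (v < 1 || v > 9 || (seen & (1 << v)) != 0) return false;
            seen |= 1 << v;
        }
        return unit.length == 9;
    }

    /** Checks rules 1-3 on a filled 9x9 grid and rule 4 against the givens (0 = no given). */
    static boolean isSolution(int[][] grid, int[][] givens) {
        for (int i = 0; i < 9; i++) {
            int[] row = new int[9], col = new int[9], block = new int[9];
            for (int j = 0; j < 9; j++) {
                row[j] = grid[i][j];
                col[j] = grid[j][i];
                // cell j of sub-block i; sub-blocks are numbered left to right, top to bottom
                block[j] = grid[3 * (i / 3) + j / 3][3 * (i % 3) + j % 3];
                // rule 4: an initial value must still be in place
                if (givens[i][j] != 0 && givens[i][j] != grid[i][j]) return false;
            }
            if (!isPermutation(row) || !isPermutation(col) || !isPermutation(block)) return false;
        }
        return true;
    }
}
```

A genetic solver can use such a check as its stopping condition, although in this paper the equivalent test is a fitness value of 1.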
There are also many other variants of Sudoku, including 4x4 puzzles with 2x2 sub-blocks, 16x16 and 25x25 puzzles, Greater Than Sudoku, Jigsaw Sudoku, and Hyper Sudoku, to name a few.
III. Genetic Operations
A) Initializing the Population:
We initialize the population in the same way as Sato and Inoue (2010), because this method randomly creates individuals with relatively good fitness values compared to placing random numbers from 1 through 9 in each non-initial cell. The method goes through every 3x3 sub-block and makes sure that it contains each of the numbers 1 through 9. For instance, in the upper left sub-block of Figure 3, the numbers 2, 3, 5, 6, and 9 are initial values, so the numbers 1, 4, 7, and 8 must be placed randomly in the remaining cells of this sub-block (Sato and Inoue, pg. 24). In our experiments, Java's Random class is used to randomize these placements. The other 8 sub-blocks are filled in the same way, taking their own initial values into account.
Fig. 3 A Sudoku puzzle
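The initialization described above can be sketched as follows. This is an illustrative reconstruction rather than the code used in our experiments: the class name SudokuInit, the method name randomIndividual, and the use of 0 for a non-initial cell are assumptions, and Collections.shuffle stands in for whatever placement loop an implementation might use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SudokuInit {
    /** Fills every non-initial cell (0) so that each 3x3 sub-block holds 1..9 exactly once. */
    static int[][] randomIndividual(int[][] givens, Random rng) {
        int[][] grid = new int[9][9];
        for (int br = 0; br < 9; br += 3) {
            for (int bc = 0; bc < 9; bc += 3) {
                // Copy the initial values of this sub-block and note which digits they use.
                boolean[] used = new boolean[10];
                for (int r = br; r < br + 3; r++)
                    for (int c = bc; c < bc + 3; c++) {
                        grid[r][c] = givens[r][c];
                        used[givens[r][c]] = true;   // used[0] is never read
                    }
                // Collect the missing digits and shuffle them.
                List<Integer> missing = new ArrayList<>();
                for (int d = 1; d <= 9; d++) if (!used[d]) missing.add(d);
                Collections.shuffle(missing, rng);
                // Place the shuffled digits into the free cells of the sub-block.
                int next = 0;
                for (int r = br; r < br + 3; r++)
                    for (int c = bc; c < bc + 3; c++)
                        if (grid[r][c] == 0) grid[r][c] = missing.get(next++);
            }
        }
        return grid;
    }
}
```

With the givens 2, 3, 5, 6, and 9 in the upper left sub-block, the sketch places 1, 4, 7, and 8 in random order in the four remaining cells, which matches the example in the text. It assumes the givens themselves do not repeat a digit within a sub-block.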
B) Fitness Function:
The paper “Solving Sudoku with Genetic Operations that Preserve Building Blocks” computes the fitness of a potential solution by checking for conflicts within the individual's rows, columns, and sub-blocks. In essence, fitness measures how close an individual is to a solution: the closer it is, the higher its fitness. We use fitness values between 0 and 1, with 1 denoting a solution to the puzzle, exactly as Sato and Inoue (2010) do (pg. 25). However, the way we calculate the value differs slightly from theirs. We start by giving each row and column a perfect score of 6. (The sub-blocks do not need to be checked, because the initialization process guarantees that each of them contains the numbers 1 through 9.) We then count how many times each number occurs in the row or column. For every number that occurs more than once, we subtract one point per extra copy: 1 if the number occurs twice, 2 if it occurs three times. For instance, a row containing (1, 2, 3, 3, 4, 7, 7, 7, 9) scores 6 - 1 - 2 = 3. Sato and Inoue (2010) use 9 as the perfect row or column score. This is higher than necessary: a row or column holds three cells from each of three sub-blocks, and the three cells from one sub-block are always distinct, so the worst possible row or column is one such as 123123123 in which the same three numbers repeat. With 6 as the perfect row or column score, a perfect individual has a total score of 108 (6 * 9 rows + 6 * 9 columns). To map the total to a value between 0 and 1, we add up all the row and column scores and divide by the best possible total, 108. Sato and Inoue instead divide by 162 (9 * 9 rows + 9 * 9 columns) (Sato and Inoue, pg. 25).
The second difference from Sato and Inoue's (2010) fitness function is that ours takes initial values into account. When a number is duplicated within a row or column and one of the copies is an initial value, an extra point is subtracted from that row or column score. The reasoning is that a conflict with an initial value is harder to remove than a conflict between two ordinary values, because the mutation operation, discussed later, can change only one of the two conflicting cells.
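The scoring rules above can be sketched in Java as follows. The class name SudokuFitness and its method names are hypothetical. The extra penalty is implemented under one possible reading of the text, namely one additional point per duplicated number that has at least one initial copy in the line. Under that reading a line score can drop below 0, and the text does not specify how such cases are handled, so the sketch does not clamp them.

```java
final class SudokuFitness {
    static final int PERFECT_LINE = 6;

    /** Score of one row or column: 6, minus one point per surplus copy of a digit,
     *  minus one extra point per duplicated digit that involves an initial value. */
    static int lineScore(int[] values, boolean[] isGiven) {
        int[] count = new int[10];
        boolean[] givenSeen = new boolean[10];
        for (int i = 0; i < 9; i++) {
            count[values[i]]++;
            if (isGiven[i]) givenSeen[values[i]] = true;
        }
        int score = PERFECT_LINE;
        for (int d = 1; d <= 9; d++) {
            if (count[d] > 1) {
                score -= count[d] - 1;           // 1 for a pair, 2 for a triple
                if (givenSeen[d]) score -= 1;    // assumed reading of the extra penalty
            }
        }
        return score;
    }

    /** Sum of the 9 row scores and 9 column scores, divided by the perfect total of 108. */
    static double fitness(int[][] grid, boolean[][] given) {
        int total = 0;
        for (int i = 0; i < 9; i++) {
            int[] col = new int[9];
            boolean[] colGiven = new boolean[9];
            for (int j = 0; j < 9; j++) {
                col[j] = grid[j][i];
                colGiven[j] = given[j][i];
            }
            total += lineScore(grid[i], given[i]) + lineScore(col, colGiven);
        }
        return total / 108.0;
    }
}
```

For the row (1, 2, 3, 3, 4, 7, 7, 7, 9) with no initial values, lineScore returns 6 - 1 - 2 = 3, as in the worked example. If one of the two 3s were an initial value, the assumed penalty would lower the score to 2.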
C) Crossover Operation:
We use the same crossover operation as Sato and Inoue (2010) because we felt that it did a great job of preserving the building blocks of individuals from generation to generation. The operation takes two parents and compares their row scores and column scores at the level of sub-blocks. For each parent it computes 3 row scores and 3 column scores. These are not the per-row and per-column scores of the fitness function; each one is the sum of three of them. For example, the row score of the top row of sub-blocks is the sum of the fitness-function scores of the first three rows. The maximum row or column score is therefore 18 (6 + 6 + 6), reached when none of the three rows or columns contains a conflict. Each pair of parents produces two children. The first child is built from row scores only, which takes 3 comparisons: for each of the 3 rows of sub-blocks, the child receives that row of sub-blocks from whichever parent has the higher row score, and from parent 1 if the scores are equal. The second child is built in the same way from column scores: for each of the 3 columns of sub-blocks, the child receives that column of sub-blocks from the parent with the higher column score, and from parent 2 if the scores are equal. This crossover method can be seen visually in Figure 4 below.
Fig. 4 An example of the crossover used by Sato and Inoue (2010, pg. 24)
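The crossover can be sketched as below. This is a simplified illustration, not Sato and Inoue's code: the names SudokuCrossover, bandScore, and stackScore are hypothetical ("band" meaning a row of three sub-blocks and "stack" a column of three sub-blocks), and the line score omits the extra penalty for initial values to keep the example short.

```java
final class SudokuCrossover {
    /** Plain line score: number of distinct digits minus 3, which equals
     *  6 minus one point per surplus copy for a line of nine digits. */
    static int lineScore(int[] line) {
        boolean[] seen = new boolean[10];
        int distinct = 0;
        for (int v : line) if (!seen[v]) { seen[v] = true; distinct++; }
        return distinct - 3;
    }

    /** Sum of the three row scores in band b (rows 3b..3b+2); at most 18. */
    static int bandScore(int[][] g, int b) {
        int total = 0;
        for (int r = 3 * b; r < 3 * b + 3; r++) total += lineScore(g[r]);
        return total;
    }

    /** Sum of the three column scores in stack s (columns 3s..3s+2); at most 18. */
    static int stackScore(int[][] g, int s) {
        int total = 0;
        for (int c = 3 * s; c < 3 * s + 3; c++) {
            int[] col = new int[9];
            for (int r = 0; r < 9; r++) col[r] = g[r][c];
            total += lineScore(col);
        }
        return total;
    }

    /** Returns {child1, child2}: child1 is chosen band by band, child2 stack by stack. */
    static int[][][] crossover(int[][] p1, int[][] p2) {
        int[][] c1 = new int[9][9], c2 = new int[9][9];
        for (int b = 0; b < 3; b++) {
            int[][] rowSrc = bandScore(p1, b) >= bandScore(p2, b) ? p1 : p2;   // tie: parent 1
            int[][] colSrc = stackScore(p2, b) >= stackScore(p1, b) ? p2 : p1; // tie: parent 2
            for (int i = 3 * b; i < 3 * b + 3; i++) {
                for (int j = 0; j < 9; j++) {
                    c1[i][j] = rowSrc[i][j];   // row i of the chosen band
                    c2[j][i] = colSrc[j][i];   // column i of the chosen stack
                }
            }
        }
        return new int[][][] { c1, c2 };
    }
}
```

Because whole rows or columns of sub-blocks are copied, every sub-block of a child comes intact from one parent, so each child still contains the numbers 1 through 9 in every sub-block.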
We considered potential improvements to this operation but did not attempt them due to time constraints. One crossover method with merit would produce three children instead of two. The first two children would be computed as in the crossover above. The third would be assembled sub-block by sub-block: for each of the nine sub-blocks, add its row score and its column score, compare the sums for the two parents, and take the sub-block from the parent with the higher sum, breaking ties in favor of parent 1 or parent 2 at the programmer's choice. The fitness of all three children would then be calculated and the best two kept for the next generation. If ties in fitness leave it unclear which two children to keep, the choice is again left to the programmer.
D) Mutation Operation:
We mutate individuals with a simple swap mutation, the same one used by Sato and Inoue (2010), but with a different mutation rate, which will be discussed later in the paper. Each sub-block has a mutation chance between 0% and 100%. We iterate through the 9 sub-blocks and decide independently for each one whether it mutates, so in a single pass none of the sub-blocks, all of the sub-blocks, or any combination of them may mutate. When a sub-block that contains 2 or more non-initial cells mutates, the swap mutation picks 2 of its non-initial cells at random and swaps their values. Figure 5 shows this swap mutation below.
Fig. 5 An example of swap mutation. Gray
numbers indicate an initial value which cannot
be swapped.
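A minimal sketch of the swap mutation follows. The class name SudokuMutation, the boolean mask given, and the per-block array rate are illustrative assumptions; with a constant rate every entry of rate would simply hold the same value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SudokuMutation {
    /** Swaps two random non-initial cells inside sub-block b (0..8), in place.
     *  Does nothing if the sub-block has fewer than two non-initial cells. */
    static void swapInBlock(int[][] grid, boolean[][] given, int b, Random rng) {
        List<int[]> free = new ArrayList<>();
        for (int k = 0; k < 9; k++) {
            int r = 3 * (b / 3) + k / 3, c = 3 * (b % 3) + k % 3;
            if (!given[r][c]) free.add(new int[] { r, c });
        }
        if (free.size() < 2) return;
        int i = rng.nextInt(free.size());
        int j = rng.nextInt(free.size() - 1);
        if (j >= i) j++;                       // guarantees that j differs from i
        int[] a = free.get(i), z = free.get(j);
        int tmp = grid[a[0]][a[1]];
        grid[a[0]][a[1]] = grid[z[0]][z[1]];
        grid[z[0]][z[1]] = tmp;
    }

    /** Visits all nine sub-blocks; sub-block b mutates independently with probability rate[b]. */
    static void mutate(int[][] grid, boolean[][] given, double[] rate, Random rng) {
        for (int b = 0; b < 9; b++)
            if (rng.nextDouble() < rate[b]) swapInBlock(grid, given, b, rng);
    }
}
```

Since the two cells are swapped rather than overwritten, the sub-block still contains the numbers 1 through 9 afterwards and the initial values are never touched.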
Other possibilities for a mutation
function would be a 3-swap mutation or
an insertion swap mutation.

E) Local Search Method:
Sato and Inoue's (2010) experiment uses a local search method when attempting to solve puzzles, which this experiment uses as well for comparison purposes. In essence, the local search mutates each parent a set number of times and keeps whichever resulting child has the best fitness. This seems beneficial because it moves the population towards the maximum fitness value, but it can also cause the population to get stuck in a local maximum: the local search often just climbs further into the local maximum, while it is sometimes necessary to take one step back in order to take two steps forward (pg. 25).
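The local search can be sketched generically. The class name LocalSearch and the two function parameters are hypothetical; the sketch assumes that only the mutated children compete with each other, which is how we read the description above, and that mutateCopy returns a new grid without modifying the parent.

```java
import java.util.function.ToDoubleFunction;
import java.util.function.UnaryOperator;

final class LocalSearch {
    /** Produces `children` mutated copies of the parent and returns the fittest one. */
    static int[][] bestMutant(int[][] parent, int children,
                              UnaryOperator<int[][]> mutateCopy,
                              ToDoubleFunction<int[][]> fitness) {
        if (children < 1) throw new IllegalArgumentException("children must be at least 1");
        int[][] best = null;
        double bestFitness = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < children; k++) {
            int[][] child = mutateCopy.apply(parent);   // must not modify the parent
            double f = fitness.applyAsDouble(child);
            if (f > bestFitness) {                      // on a tie the earlier child is kept
                bestFitness = f;
                best = child;
            }
        }
        return best;
    }
}
```

With 2 local search children per parent, as in our experiment inputs, each parent would be mutated twice and the better of the two results kept.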
IV. Experimentation
For our experimentation, we first tried to reproduce the results of Sato and Inoue (2010) before implementing any improvements. We achieved very similar results, confirming their work, before moving forward with our own study. Our main focus was escaping local maxima, which had already become easier with the extra penalty for conflicts with initial values and with the change of the best individual's fitness from 162/162 to 108/108. For the mutation rate, we wanted to drive the population towards a perfect solution without destroying building blocks. Our idea was to give every sub-block its own mutation rate. We ran many experiments and tested many formulas in search of the best mutation rates. One idea was to base the mutation rate on the number of initial values in each sub-block. This reduced the average number of generations for many of the easier puzzles, but for the more difficult puzzles the average became worse than Sato and Inoue's results, so we scrapped the idea. The next idea was to calculate each sub-block's mutation rate from its row and column scores: the higher the sub-block's score, the lower its mutation chance should be. The equation we found that worked the best was:
$$\text{MutChance} = 0.964^{\,\text{RowScore} + \text{ColumnScore}}$$
An example of the row scores and column scores is shown in Figure 6 below, with the calculated row score plus column score shown in each of the 9 sub-blocks.
Fig. 6 Shows the 3 row and column scores and
their calculated RowScore+ColScore used in the
formula
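As a sketch of how this formula might be evaluated cheaply, the values can be precomputed once. The class name MutationChance is hypothetical, and an array lookup is used here in place of a switch block; it assumes the summed score lies between 0 and 36 (18 for the row of sub-blocks plus 18 for the column of sub-blocks).

```java
final class MutationChance {
    static final double BASE = 0.964;
    static final int MAX_SCORE = 36;     // 18 (row score) + 18 (column score)

    // TABLE[s] holds 0.964^s, computed once when the class is loaded.
    private static final double[] TABLE = new double[MAX_SCORE + 1];
    static {
        for (int s = 0; s <= MAX_SCORE; s++) TABLE[s] = Math.pow(BASE, s);
    }

    /** Mutation chance of a sub-block from its row score and column score (sum in 0..36). */
    static double of(int rowScore, int columnScore) {
        return TABLE[rowScore + columnScore];
    }
}
```

A sub-block with a summed score of 0 gets a mutation chance of 1, while a sub-block with the perfect sum of 36 gets a chance of about 0.27, so a sub-block with no row or column conflicts is still mutated occasionally.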
Evaluating this function for every sub-block is computationally intensive, which causes some slowdown when going from generation to generation. We saw some speed-up in Java by replacing the Math.pow function of Java's Math class with a switch block whose case statements hold the precalculated values of the function. The speed could be increased much more drastically by using parallel processing, which we did not attempt due to our inexperience with CUDA and OpenCL. These per-sub-block mutation chances differ from Sato and Inoue's experiment, which uses a constant mutation rate of 0.3 (30%) for each sub-block. For the experiments that we ran, we used the same inputs as Sato and Inoue's experiment for comparison purposes.
Our experiment inputs were:
Population Size: 150
Generations: 100000
Local Search Children Per Parent: 2
Crossover Chance: 0.3
Tournament Type: Stochastic
Tournament Size: 3
The following puzzles were used in the experiment. They are the same ones used in Sato
and Inoue’s experiment for further comparisons.

V. Experimental Results
Table 1. Replication of Sato and Inoue's experiment results
Givens                          38      34       30       29        28        24        23
Solved (Out of 100 trials)     100     100      100      100       100        96        83
Average Generations Taken    67.62  129.83  1132.15  1861.66  12689.82  26367.33  43518.19
Fastest Solve                   22      36       80       65       187       314      1199
Table 2. Our experiment's results
Givens                          38      34       30       29        28        24        23
Solved (Out of 100 trials)     100     100      100      100       100       100       100
Average Generations Taken    39.62   96.98   627.39    933.4   7404.66  11257.79  24352.75
Fastest Solve                   18      29       78       64        99       273       408
Graph 1. Comparison between our results and those of Sato and Inoue's results
[Graph not reproduced. Horizontal axis: givens (38, 34, 30, 29, 28, 24, 23). Vertical axis: 0 to 50000. Series: Sato and Inoue Avg. Gens, Our Avg. Gens.]

Table 3. Comparison between our results and those of Sato and Inoue's results
Givens                                  38      34       30       29        28        24        23
Sato and Inoue Average Generations   67.62  129.83  1132.15  1861.66  12689.82  26367.33  43518.19
Our Average Generations              39.62   96.98   627.39    933.4   7404.66  11257.79  24352.75
VI. Discussion
Our results show that difficult Sudoku puzzles can be solved consistently with genetic operations, and in a small amount of time compared with the fastest human speeds. We found that optimized mutation rates that differ from sub-block to sub-block and from generation to generation have significant advantages over a constant mutation rate, both in escaping local maxima and in solving the puzzles. Tables 1, 2, and 3 show that on every puzzle the proposed operations needed fewer generations on average than the replicated constant-rate method, with reductions of roughly 25% to 57%. Graph 1 shows that as the number of givens decreases and the puzzles become more difficult and resistant to solutions, the difference between the average generations of our proposal and those of Sato and Inoue's method grows.
In our replication, Sato and Inoue's method could not solve the two most difficult puzzles in every trial, most likely because the population got stuck in a local maximum for thousands of generations. Our mutation rates are able to break out of these local maxima while still finding a solution relatively quickly once the population has been sent down a correct path towards the solution. For the puzzles with 23 and 24 givens, our method had a 100% success rate, while theirs had 83% and 96% respectively. Those rates are still very good compared with many other experiments in other papers, which had very small success rates (0-10%) on difficult puzzles. This shows that the choice of mutation and crossover methods can have a great impact on how quickly a particular puzzle is solved, and that a mutation rate that is off by just a little can have a great impact on whether or not a puzzle is solved at all.
VII. Conclusion
We proposed a mutation rate function that preserves building blocks while still escaping local maxima, and we improved a fitness function that is fairly standard among Sudoku solvers based on genetic operations. Together with the genetic operations proposed by Sato and Inoue, these functions make a very functional Sudoku solver that competes with the best genetic Sudoku solvers. It could potentially compete with logical Sudoku solvers if parallel processing were implemented along with algorithms that produce only viable solutions to each sub-block (Maji et al., pg. 393).
This work shows that the Darwinian principles of evolution can readily be applied to real-world problems. The process is very similar to the evolution of a population of a species in nature: the conditions a species faces, such as weather, climate, and other organisms, are like a puzzle to which the species is trying to find the optimal ‘solution’. These principles can be used to solve problems ranging from mathematical equations to the best way to design a machine. The ability of this type of computation to arrive at general solutions without necessarily knowing how to go about finding them is a very powerful problem-solving tool that is useful in many scientific fields.
This experiment also shows that mutation rates make a big difference to how quickly a population can improve, and that rates that are too high can quickly ruin a population. In nature, mutation rates are very low, so small changes in a population occur slowly over many generations. If mutation rates in nature were much higher, I think there is a very good chance that we would not exist: there would be many more genetic mutations causing many different problems, and organisms would not be able to reach the optimal solution or even come very close to it. I found this to be true in my experiment when adjusting the mutation rates. When the mutation rates were too high, the fitness scores of the entire population were much lower. The same was true when they were too low, but for a very different reason: it took too long for individuals to mutate into something better. This leads me to believe that the mutation rate in nature may be, in some ways, at a sort of Goldilocks point: not too high and not too low, but just right.

Works Cited
Sudoku. (2014, May 13). In Wikipedia, The Free Encyclopedia. Retrieved May 14, 2014, from http://en.wikipedia.org/w/index.php?title=Sudoku&oldid=627213310
Das, K. N., Bhatia, S., Puri, S., Deep, K. (2012). A Retrievable GA for Solving Sudoku Puzzles. Technical Report. Retrieved May 14, 2014, from http://www.cse.psu.edu/~sub194/papers/sudokuTechReport.pdf
Maji, A. K., Jana, S., Pal, R. K. (2013). An Algorithm for Generating only Desired Permutations for Solving Sudoku Puzzle. Procedia Technology, 10, 392-399.
Sato, Y., Inoue, H. (2010). Solving Sudoku with Genetic Operations that Preserve Building Blocks. Proceedings of the 2010 IEEE Conference on Computational Intelligence and Games, 23-29.
