Calculating Mine Probability in Minesweeper
Luke Videckis
July 2020
Contents
1 Introduction
2 How Mine Probability is Defined
3 Implementing Mine Probability
  3.1 Splitting by Components
  3.2 Matrices and Local Deductions
  3.3 Further Splitting of Components
    3.3.1 Finding Split Squares
    3.3.2 Combining Results
    3.3.3 Discussion on this Recursive Algorithm
4 References
1 Introduction
In Minesweeper, determining whether a given move is safe (assuming the board is consistent) was proved to be DP-complete. This problem generalizes to finding the probability that a move is safe. I offer an exponential-time, heuristically improved approach to finding such probabilities. Everything discussed in this paper is already implemented and tested in my Android app.
2 How Mine Probability is Defined
Mine probability for a square X is defined as

$$\frac{\text{number of mine configurations where } X \text{ is a mine}}{\text{total number of mine configurations}},$$

as each mine configuration has an equal chance of occurring.
Let’s look at an example board with 20 mines:
The above board has 10 equally likely configurations of mines:
All the configurations have the following in common:
The red squares are the squares where every configuration has a mine, and the green square is the square where no configuration has a mine. In other words, it is possible to deduce that the red squares are mines and the green square isn't a mine. For example, the top 3 red squares in the first row are all mines, since the 3 clue in the middle of that row has only 3 surrounding squares; thus all three of those squares must contain a mine.
Now what about calculating mine probabilities? For each square, let’s count
up how many out of the above 10 configurations have a mine:
It seems like the top-left square has probability 4/10 of being a mine, since 4 out of the 10 configurations have a mine there. As pointed out in this comment, this is incorrect; it turns out to be more complicated. Let's zoom out to see the entire board again.
There are 75 squares which weren't accounted for previously (surrounded in red). Not all configurations of mines have the same number of mines, and thus the configurations aren't weighted equally. For example, one of the configurations above has 8 mines, while another has 11 mines. Recall there are 20 total mines. We have to choose where to place the remaining mines among the 75 remaining squares. For example, for the configuration with 8 mines, we need to choose 20 − 8 squares out of the remaining 75 squares to place the remaining mines. The number of ways to do this is $\binom{75}{20-8} = 26{,}123{,}889{,}412{,}400$. You can think of $\binom{75}{20-8}$ as that configuration's "weight". Then, the ratio of the "weighted" sums gives the correct mine probability for each square.
Number of Mines | Configs. with top-left a mine | Total Configs.
----------------|-------------------------------|---------------
8               | 0                             | 1
9               | 1                             | 4
10              | 2                             | 4
11              | 1                             | 1
For example, the top-left square has probability

$$\frac{0\binom{75}{20-8} + 1\binom{75}{20-9} + 2\binom{75}{20-10} + 1\binom{75}{20-11}}{1\binom{75}{20-8} + 4\binom{75}{20-9} + 4\binom{75}{20-10} + 1\binom{75}{20-11}} = \frac{14}{103} \neq \frac{4}{10}$$

of being a mine.
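As a sanity check, the weighted ratio above can be computed directly with Python's exact integer arithmetic. This is a minimal sketch; the per-mine-count configuration counts are taken from the example above:

```python
from fractions import Fraction
from math import comb

# rows: (mines in border configuration, configs with top-left a mine, total configs)
table = [(8, 0, 1), (9, 1, 4), (10, 2, 4), (11, 1, 1)]

M, Y = 20, 75  # total mines, squares not next to any clue

numerator = sum(x * comb(Y, M - m) for m, x, _ in table)
denominator = sum(t * comb(Y, M - m) for m, _, t in table)

print(Fraction(numerator, denominator))  # -> 14/103
```

Computing with `Fraction` avoids any floating-point rounding in the huge intermediate sums.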
Let's now define mine probability in general.

• Let C denote the set of all mine configurations.
• Define f : C → {true, false} as follows: f(c) = true if c has a mine in square X, and f(c) = false if c has no mine in square X.
• Let Y denote the number of squares not next to any number (clue). (In the above example, Y = 75.)
• Let M denote the total number of mines. (In the above example, M = 20.)
• Define #m : C → $\mathbb{Z}_{\ge 0}$ as follows: #m(c) = the number of mines in configuration c. (In the above example, 8 ≤ #m(c) ≤ 11.)

Then square X has a mine with probability

$$\frac{\sum_{c \in C \mid f(c)} \binom{Y}{M - \#m(c)}}{\sum_{d \in C} \binom{Y}{M - \#m(d)}}.$$
How about squares not next to any number? For these squares, mine probability is

$$\frac{\sum_{c \in C} \frac{M - \#m(c)}{Y} \binom{Y}{M - \#m(c)}}{\sum_{d \in C} \binom{Y}{M - \#m(d)}}.$$

For a given configuration c, (M − #m(c))/Y is the probability that a particular one of the Y squares is a mine. The final probability is then a weighted average, where the weight is $\binom{Y}{M - \#m(c)}$.
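Sticking with the example board (M = 20, Y = 75, and border configuration counts 1/4/4/1 for 8 through 11 mines), this weighted average can be evaluated exactly; the function name is my own:

```python
from fractions import Fraction
from math import comb

M, Y = 20, 75
# total_configs[m] = number of border configurations with m mines (from the example)
total_configs = {8: 1, 9: 4, 10: 4, 11: 1}

def off_border_mine_probability():
    """Weighted average of (M - m)/Y over configurations, weighted by C(Y, M - m)."""
    num = sum(Fraction(M - m, Y) * t * comb(Y, M - m) for m, t in total_configs.items())
    den = sum(t * comb(Y, M - m) for m, t in total_configs.items())
    return num / den

p = off_border_mine_probability()
print(p, float(p))
```

As expected, the result is somewhat above (M − 11)/Y and below (M − 8)/Y, since it interpolates between the per-configuration densities.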
Notice the numerator and denominator are huge (in magnitude). For the above example,

$$\text{numerator} = 0\binom{75}{20-8} + 1\binom{75}{20-9} + 2\binom{75}{20-10} + 1\binom{75}{20-11} = 6{,}681{,}687{,}099{,}710,$$

but the end result is relatively small (in magnitude): 14/103. The reason why most factors cancel is because
$$\frac{\sum_{c \in C \mid f(c)} \binom{Y}{M-\#m(c)}}{\sum_{d \in C} \binom{Y}{M-\#m(d)}} = \sum_{c \in C \mid f(c)} \frac{1}{\sum_{d \in C} \binom{Y}{M-\#m(d)} \big/ \binom{Y}{M-\#m(c)}},$$

and

$$\frac{\binom{Y}{M-\#m(d)}}{\binom{Y}{M-\#m(c)}} = \frac{Y!}{(M-\#m(d))! \, (Y-M+\#m(d))!} \cdot \frac{(M-\#m(c))! \, (Y-M+\#m(c))!}{Y!} = \frac{(M-\#m(c))! \, (Y-M+\#m(c))!}{(M-\#m(d))! \, (Y-M+\#m(d))!}.$$

Notice |#m(c) − #m(d)| is small (≤ 3 in most cases), thus most factors in $\frac{(M-\#m(c))!}{(M-\#m(d))!}$ and $\frac{(Y-M+\#m(c))!}{(Y-M+\#m(d))!}$ cancel.
We can employ one of the following formulas (or their reciprocals):

• $\binom{n}{k} \big/ \binom{n}{k} = 1$
• $\binom{n}{k} \big/ \binom{n}{k+1} = \frac{k+1}{n-k}$
• $\binom{n}{k} \big/ \binom{n}{k+2} = \frac{(k+1)(k+2)}{(n-k)(n-k-1)}$
• $\binom{n}{k} \big/ \binom{n}{k+m} = \frac{(k+1)(k+2)\cdots(k+m)}{(n-k)(n-k-1)\cdots(n-k-(m-1))}$
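These identities let us evaluate a ratio of binomial coefficients as a short telescoping product instead of forming the huge coefficients themselves. A small sketch with exact rationals (`binom_ratio` is my own helper name):

```python
from fractions import Fraction
from math import comb

def binom_ratio(n: int, k: int, m: int) -> Fraction:
    """Compute C(n, k) / C(n, k + m) via the telescoping product
    (k+1)...(k+m) / ((n-k)(n-k-1)...(n-k-(m-1)))."""
    ratio = Fraction(1)
    for i in range(m):
        ratio *= Fraction(k + 1 + i, n - k - i)
    return ratio

# agrees with the direct (and much more expensive) computation
assert binom_ratio(75, 9, 3) == Fraction(comb(75, 9), comb(75, 12))
```

Only m multiplications are needed, matching the observation that |#m(c) − #m(d)| is usually ≤ 3.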
3 Implementing Mine Probability
In order to calculate the probability that square X is a mine, we need to calculate 2 things:

• for each square X, the number of mine configurations which have m mines and where X is a mine, (0 ≤ m ≤ M)
• the total number of mine configurations with m mines, (0 ≤ m ≤ M)
To calculate these, we employ backtracking:

Algorithm 1 Recursive Backtracking
procedure backtracking(i, squares)
    if i = |squares| then
        if mine configuration is correct then
            ▷ execution will reach here once for every mine configuration
            process mine configuration
        end if
        return
    end if
    squares[i] ← mine
    if pruning conditions then
        backtracking(i + 1, squares)
    end if
    squares[i] ← free
    if pruning conditions then
        backtracking(i + 1, squares)
    end if
end procedure
...
squares ← [square 1, ..., square n]    ▷ only squares next to a number (clue)
backtracking(0, squares)    ▷ initial call to backtracking function

Even with pruning, this algorithm runs in $O(|squares| \cdot 2^{|squares|})$. The $|squares|$ factor comes from handling the base case, and there are at most $2^{|squares|}$ base cases (mine configurations). This section will discuss ways to make $|squares|$ smaller.
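As a sketch, the backtracking might be realized in Python as below. The tiny three-square board with clues A and B is a made-up illustration, and `consistent` stands in for the unspecified pruning conditions:

```python
# clues maps a clue to (the unknown squares it touches, its required mine count)
clues = {
    "A": (["s1", "s2"], 1),   # clue A: exactly 1 mine among s1, s2
    "B": (["s2", "s3"], 2),   # clue B: exactly 2 mines among s2, s3
}
squares = ["s1", "s2", "s3"]

configs_per_mine_count = {}  # m -> number of valid configurations with m mines

def consistent(assignment, partial):
    """Check clues; in partial mode only reject clues that are already violated."""
    for neighbors, need in clues.values():
        mines = sum(assignment.get(s, 0) for s in neighbors)
        unknown = sum(1 for s in neighbors if s not in assignment)
        if mines > need or mines + unknown < need:
            return False
        if not partial and mines != need:
            return False
    return True

def backtracking(i, assignment):
    if i == len(squares):
        if consistent(assignment, partial=False):
            # reached once for every valid mine configuration
            m = sum(assignment.values())
            configs_per_mine_count[m] = configs_per_mine_count.get(m, 0) + 1
        return
    for value in (1, 0):  # try mine, then free
        assignment[squares[i]] = value
        if consistent(assignment, partial=True):  # pruning condition
            backtracking(i + 1, assignment)
        del assignment[squares[i]]

backtracking(0, {})
print(configs_per_mine_count)  # -> {2: 1}
```

Here clue B forces s2 and s3 to be mines, which in turn forces s1 free, so the only configuration has 2 mines.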
3.1 Splitting by Components
The first optimization, splitting by components, is discussed in other blogs, for example this blog. Consider the following board:

Running the backtracking will take roughly $(3 + 4 + 11) \cdot 2^{3+4+11} = 4{,}718{,}592$ steps. Instead, how about running the backtracking independently on the red, blue, and green components? This will take roughly $3 \cdot 2^3 + 4 \cdot 2^4 + 11 \cdot 2^{11} = 22{,}616$ steps (much smaller).
But splitting the backtracking into multiple components makes things difficult.
Let's look at another example (there are still 20 total mines):

Both components are the same as earlier, and both can have either 8, 9, 10, or 11 mines. But since there are 20 total mines, if one component has 11 mines, then the other component can only have 8 or 9 mines. In general, the total number of mines placed in the components has to be in the range [M − Y, M], with M, Y defined earlier.
This leads to the following problem.
You are given n, the number of components (labeled 1, 2, ..., n), and an array #configs[][], where

#configs[i][j] = the number of configurations of mines for the i-th component which have j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M).

Find #totalConfigs[] defined as

#totalConfigs[i] = the number of configurations of mines which have i mines when including all components, (0 ≤ i ≤ M).

This problem is similar to 0-1 knapsack. It's solved by dynamic programming. Let

dp[i][j] = the number of configurations including only components 1, 2, ..., i which have j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M).

Then the array dp is calculated as:

$$dp[i][j] = \sum_{k=0}^{j} dp[i-1][k] \cdot \#configs[i][j-k].$$

The answer, #totalConfigs, is stored in dp[n].
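The recurrence is a repeated convolution of the per-component counts. A minimal sketch (the function name is mine; the 1/4/4/1 counts per component come from the earlier example):

```python
def total_configs(configs, M):
    """configs[i][j] = number of mine configurations of component i with j mines.
    Returns totalConfigs[j] = configurations with j mines over all components."""
    dp = [1] + [0] * M  # zero components: one configuration with 0 mines
    for comp in configs:
        new_dp = [0] * (M + 1)
        for j in range(M + 1):
            for k in range(j + 1):
                new_dp[j] += dp[k] * comp[j - k]
        dp = new_dp
    return dp

# two identical components, each with 1/4/4/1 configurations of 8/9/10/11 mines
comp = [0] * 8 + [1, 4, 4, 1] + [0] * 9  # indices 0..20 for M = 20
total = total_configs([comp, comp], 20)
print(total[16:21])  # -> [1, 8, 24, 34, 24]
```

Totals above M = 20 mines are silently dropped by the size-(M + 1) array, which matches the constraint that at most M mines exist.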
This works great for calculating mine probability for squares not next to any number,

$$\frac{\sum_{c \in C} \frac{M - \#m(c)}{Y} \binom{Y}{M - \#m(c)}}{\sum_{d \in C} \binom{Y}{M - \#m(d)}},$$

because we just need to know the number of components and the number of mines for each component. We have

$$\sum_{c \in C} \frac{M - \#m(c)}{Y} \binom{Y}{M - \#m(c)} = \sum_{i=0}^{M} \frac{M - i}{Y} \cdot \#totalConfigs[i] \cdot \binom{Y}{M - i},$$

and a similar formula for the denominator. Here, i represents the number of mines.
But calculating #totalConfigs[] isn't enough to determine mine probability for a square X next to a number,

$$\frac{\sum_{c \in C \mid f(c)} \binom{Y}{M - \#m(c)}}{\sum_{d \in C} \binom{Y}{M - \#m(d)}},$$

because the numerator requires us to calculate the number of configurations where square X is a mine. To do this, we define a new problem.

Again you are given n, the number of components (labeled 1, 2, ..., n), and an array #configs[][], where

#configs[i][j] = the number of configurations of mines for the i-th component which have j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M).

This time, find an array dp[][][] defined as:

dp[i][j][k] = the total number of mine configurations which have k mines, considering all components (1, 2, ..., n), with the restriction that component i has exactly 1 configuration which has j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M, 0 ≤ k ≤ M).

To solve, we first calculate 2 arrays:

• prefix[i][j] = the number of configurations considering components 1, 2, ..., i which have j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M)
  – calculated as: $prefix[i][j] = \sum_{k=0}^{j} prefix[i-1][k] \cdot \#configs[i][j-k]$
• suffix[i][j] = the number of configurations considering components i, i + 1, ..., n which have j mines, (1 ≤ i ≤ n, 0 ≤ j ≤ M)
  – calculated as: $suffix[i][j] = \sum_{k=0}^{j} suffix[i+1][k] \cdot \#configs[i][j-k]$

Finally, dp[][][] is calculated as:

$$dp[i][j][k] = \sum_{l=0}^{k-j} prefix[i-1][l] \cdot suffix[i+1][k-j-l].$$
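The prefix/suffix construction might be sketched like this (0-indexed components, with sentinel rows standing in for the empty prefix and suffix; function names are mine):

```python
def convolve(a, b, M):
    """out[j] = sum_k a[k] * b[j - k], truncated at M mines."""
    out = [0] * (M + 1)
    for j in range(M + 1):
        for k in range(j + 1):
            out[j] += a[k] * b[j - k]
    return out

def prefix_suffix(configs, M):
    """configs[i][j] (0-indexed components) -> prefix, suffix tables.
    prefix[i] covers components 0..i-1; suffix[i] covers components i..n-1."""
    n = len(configs)
    empty = [1] + [0] * M  # one way to place 0 mines in no components
    prefix = [empty]
    for i in range(n):
        prefix.append(convolve(prefix[-1], configs[i], M))
    suffix = [empty]
    for i in reversed(range(n)):
        suffix.append(convolve(suffix[-1], configs[i], M))
    suffix.reverse()
    return prefix, suffix

def dp_ijk(prefix, suffix, i, j, k):
    """Number of k-mine configurations over all components, assuming
    component i (0-indexed) contributes exactly one j-mine configuration."""
    return sum(prefix[i][l] * suffix[i + 1][k - j - l] for l in range(k - j + 1))

# with the two identical 1/4/4/1 example components:
comp = [0] * 8 + [1, 4, 4, 1] + [0] * 9
prefix, suffix = prefix_suffix([comp, comp], 20)
print(dp_ijk(prefix, suffix, 0, 11, 20))  # -> 4
```

With two components, dp[0][11][20] simply counts the other component's 9-mine configurations, which is 4 in the example.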
How can we use dp[][][] to calculate

$$\frac{\sum_{c \in C \mid f(c)} \binom{Y}{M - \#m(c)}}{\sum_{d \in C} \binom{Y}{M - \#m(d)}}?$$

Consider a square X next to at least one number. Assume X is in component i. When we run backtracking on component i, the base case is hit once for every mine configuration. So we can calculate an array countXIsMine[] defined as

countXIsMine[j] = the number of mine configurations for component i with j mines where X is a mine, (0 ≤ j ≤ M).

We have

$$\sum_{c \in C \mid f(c)} \binom{Y}{M - \#m(c)} = \sum_{k=0}^{M} \left( \sum_{j=0}^{k} dp[i][j][k] \cdot countXIsMine[j] \right) \binom{Y}{M - k}.$$

Here, $\sum_{j=0}^{k} dp[i][j][k] \cdot countXIsMine[j]$ is the number of mine configurations (considering all components) which have k mines and where X is a mine. Recall that when calculating dp[i][j][k], we assumed component i has exactly 1 mine configuration which has j mines. After multiplying by countXIsMine[j], component i now has countXIsMine[j] configurations with j mines (and X is a mine).
3.2 Matrices and Local Deductions
Minesweeper can be reduced to solving a system of linear equations. Combining this with local deductions is a good approach to determining whether certain moves are safe. What if we ran these 2 approaches first, before splitting the board by components? Any deducible squares which these 2 approaches give can be used to split components, reducing their size.

Algorithm 2 Backtracking and Local Deductions
while found any local deductions do
    save deductions    ▷ these can be used to find further deductions
end while
components ← [comp. 1, ..., comp. n]    ▷ split by deduced squares
for i ← 1 to n do
    backtracking(components[i])
end for
An example:

The green and red squares are the squares where running local deductions determined they respectively didn't and did have mines. Notice local deductions have failed to find all possible deducible squares (there's a 2-1 pattern on the top row).

Before running local deductions, there is 1 component with 28 squares. The backtracking would have taken $\sim 28 \cdot 2^{28} = 7{,}516{,}192{,}768$ operations. After running local deductions, there's 1 component with 9 non-deduced squares ($9 \cdot 2^9 = 4{,}608$).

Although this optimization seems like it would help a lot, there are many cases where it doesn't help out much (for example, consider if there are no deducible squares).
3.3 Further Splitting of Components
Let's define the problem: you are given a single component of n squares (labeled 1, 2, ..., n). Find:

• #configs[] defined as: #configs[i] = the number of mine configurations with i mines, (0 ≤ i ≤ M)
• #configsWithJMine[][] defined as: #configsWithJMine[i][j] = the number of mine configurations which have i mines and where square j is a mine, (0 ≤ i ≤ M, 1 ≤ j ≤ n)

This of course can be solved with recursive backtracking in $O(n \cdot 2^n)$. Let's try to solve this problem faster, as it is usually the bottleneck of the entire mine probability calculation (every other step runs in polynomial time).

Consider the following board.
Imagine if we somehow "removed" the undiscovered square next to the 1. Then, the board would contain 2 components.
Now, what if we ran the backtracking independently on both the blue and
yellow components, and then somehow combined the results? In this way, we
can take a single component, and split it further into ≥ 2 smaller components.
But why limit ourselves to splitting a single component once? Notice, we
have to solve the exact same problem for both the blue and yellow components.
Thus, why solve it with backtracking, when we can again use the same trick of
“removing” squares? This leads to the following recursive algorithm.
Algorithm 3 Solve Component Recursively
procedure solveComponent(component[])
    if |component| ≤ ∼5 then
        backtracking(component)
        return
    end if
    S ← findSplitSquares(component[])
    newComponents[][] ← findComponents(component[] without S)
    for i ← 1 to |newComponents| do
        solveComponent(newComponents[i])
    end for
    combine results
end procedure
...
solveComponent(component)    ▷ initial call to recursive function
Here, the “split” squares are the squares which we remove to split up the
original component. Before diving into the reasons why this algorithm is still
exponential, I’ll explain the following parts:
• the findSplitSquares function
• combining results
3.3.1 Finding Split Squares
First, we don't have to limit ourselves to removing just 1 square to split components. Although, assuming we remove k squares, it turns out the combine step will take $O(2^k)$ time. So in my implementation, I arbitrarily chose 6 as an upper bound on the number of split squares.

Finding split squares requires representing the minesweeper board as a graph:

• squares are nodes
• 2 squares share an edge if there exists a number next to both squares
For example, this board produces the following graph:

Notice that if node 4 is removed, then the graph splits into 2 connected components. Thus node 4 is an articulation node. A first attempt would be to choose ≤ 6 articulation nodes from the graph to remove. However, this approach fails on grids whose graphs have no articulation nodes. So let's find a more general solution.

Ideally, we want to remove ≤ 6 nodes such that the size of the largest remaining component is minimized. Unfortunately, the general problem of finding k nodes to remove to minimize the size of the largest remaining component is NP-complete. But this isn't as relevant as you'd think. There are at least 2 options:

• employ an approximation algorithm such as the 2-exchange method
• employ a brute force approach similar to this comment; brute force here is okay since we're only removing ≤ 6 nodes (so it is exponential only in 6)
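A brute-force split-square search along these lines can be sketched as follows (the graph representation, function names, and path-graph example are my own illustrations):

```python
from itertools import combinations

def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first flood fill
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def find_split_squares(adj, k=6):
    """Try every subset of <= k nodes; keep the one minimizing the
    largest remaining component. Only the subset size k is exponential."""
    nodes = list(adj)
    best = ((), largest_component(adj, ()))
    for r in range(1, k + 1):
        for subset in combinations(nodes, r):
            size = largest_component(adj, subset)
            if size < best[1]:
                best = (subset, size)
    return best[0]

# path graph 1-2-3-4-5-6-7: removing node 4 splits it into two components of 3
path = {i: [j for j in (i - 1, i + 1) if 1 <= j <= 7] for i in range(1, 8)}
print(find_split_squares(path, k=1))  # -> (4,)
```

On the path graph the best single removal is the middle node, matching the articulation-node intuition above.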
3.3.2 Combining Results
Recall the problem: you are given a single component of n squares (labeled 1, 2, ..., n). Find:

• #configs[] defined as: #configs[i] = the number of mine configurations with i mines, (0 ≤ i ≤ M)
• #configsWithJMine[][] defined as: #configsWithJMine[i][j] = the number of mine configurations which have i mines and where square j is a mine, (0 ≤ i ≤ M, 1 ≤ j ≤ n)

I'll only explain how to calculate #configs[], as #configsWithJMine[][] is calculated very similarly.

To understand this section, it'll help to know about dynamic programming with bitmasks.
Consider this example board again:
Our goal is to calculate some sub-result (not necessarily #configs[]) for both
the blue and yellow components, and then calculate the same sub-result for the
original component. Also, we have to be able to calculate #configs[] from this
sub-result.
The sub-result is an array dp[][] defined as: dp[mask][i] = the number of mine configurations with i mines, under the restriction that there is a mine in the j-th removed square if and only if the j-th bit in mask is 1.

Since the above example has only 1 removed square, we consider 2 cases:

• the removed square is a mine (mask = 1)
• the removed square is not a mine (mask = 0)

If the removed square is a mine, we have

$$dpOriginal[1][i] = \sum_{j=0}^{i+1} dpBlue[1][j] \cdot dpYellow[1][i + 1 - j]$$
where:

• dpOriginal[][] is the dp[][] array corresponding to the original component
• dpBlue[][] is the dp[][] array corresponding to the blue component
• dpYellow[][] is the dp[][] array corresponding to the yellow component

Notice the upper bound on the sum is curiously (i + 1). Both the blue and yellow components have a mine in the removed square, thus we end up double counting this mine. The (i + 1) upper limit cancels out the double counting to give the correct result.

If the removed square is not a mine, we have

$$dpOriginal[0][i] = \sum_{j=0}^{i} dpBlue[0][j] \cdot dpYellow[0][i - j].$$

Assuming there are k removed squares (k ≤ 6), we have

$$\#configs[i] = \sum_{mask=0}^{2^k - 1} dp[mask][i].$$

Here, there are $2^k$ masks, as there are $2^k$ possible ways to place mines among the k removed squares.

Finally, the original recursive call has 0 removed squares, so $2^0 = 1$: there's 1 possible mask (mask = 0), and the #configs array is stored in dp[0].
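In code, the single-removed-square combine step might look like the sketch below. The toy inputs model two components that each consist of one "1" clue adjacent to the shared removed square r and one private square, so the combined component has exactly one 1-mine configuration (r is the mine) and one 2-mine configuration (both private squares are mines). Function and variable names are my own:

```python
def combine_two(dp_blue, dp_yellow, M):
    """Combine two sub-components sharing one removed square.
    dp_*[mask][i] = configurations of that sub-component with i mines,
    where mask = 1 iff the removed square holds a mine."""
    dp = [[0] * (M + 1) for _ in range(2)]
    for i in range(M + 1):
        # removed square is a mine: both sides count it, so the
        # sub-component mine counts must sum to i + 1
        dp[1][i] = sum(dp_blue[1][j] * dp_yellow[1][i + 1 - j]
                       for j in range(i + 2) if j <= M and i + 1 - j <= M)
        # removed square is free: mine counts simply add up
        dp[0][i] = sum(dp_blue[0][j] * dp_yellow[0][i - j]
                       for j in range(i + 1))
    # configs[i] sums over both masks
    return [dp[0][i] + dp[1][i] for i in range(M + 1)]

# each side: one "1" clue touching {private square, removed square r}
dp_side = [
    [0, 1, 0, 0, 0],  # mask 0: r free -> the private square is the mine
    [0, 1, 0, 0, 0],  # mask 1: r is the mine -> the private square is free
]
print(combine_two(dp_side, dp_side, 4))  # -> [0, 1, 1, 0, 0]
```

The mask-1 case illustrates the (i + 1) upper limit: the shared mine is counted by both sides, and the shifted index undoes the double count.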
In this example, the original component was split into 2 sub-components, but
in general, the original component can be split into k sub-components. It is
possible to generalize this knapsack-like dynamic programming to work for k
sub-components.
There is a tricky case where a clue is surrounded by only “removed” squares.
Make sure to skip the masks which have the incorrect number of mines for these
clues.
3.3.3 Discussion on this Recursive Algorithm
Why is this algorithm still exponential? Consider the case where none of the $\binom{n}{\le 6}$ subsets of nodes to remove decrease the size of the component (there's still 1 component after removing nodes). In this case, we're forced to solve the problem with normal recursive backtracking in $O(n \cdot 2^n)$. Although in practice, this case is super rare.
Where does the speedup come from with this recursive approach? For recursive backtracking, even with the best possible pruning, you'll still hit the base case once for every mine configuration. Thus, recursive backtracking has a lower bound of Ω(number of mine configurations). This new recursive approach doesn't have the same lower bound, and thus can potentially be faster (and is faster in practice).

Finally, here's a picture of the recursion tree for the board. Here, the squares marked with an "R" are articulation nodes in the corresponding graph. If you remove any of the "R" squares, the board will split into 2 components (not considering the flagged squares).

Each node in this tree represents a single recursive call. Squares surrounded in black are in the sub-component of the current recursive call. Squares surrounded in both black and red are the "removed" squares which are used to further split the current component. Leaf node components are solved with recursive backtracking.