This document discusses artificial-intelligence approaches to game playing. It covers minimax search and α-β pruning for deterministic, perfect-information games, handling resource limits through depth limiting and evaluation functions, and modelling games with chance elements and imperfect information. The key point is that game playing poses challenges (an unpredictable opponent, huge search spaces, hidden information) that force us to approximate good play rather than compute perfect play.
3. Games vs. Search Problems
Game playing is a search problem
Defined by
– Initial state
– Successor function
– Goal test
– Path cost / utility / payoff function
Characteristics of game playing
“Unpredictable” opponent:
Solution is a strategy specifying a move for every feasible opponent reply
4. Game Playing
Plan of attack
Computer considers the possible lines of play [Babbage, 1846]
Algorithm used for perfect play [Zermelo, 1912; Von Neumann, 1944]
Finite horizon and approximate evaluation
[Zuse, 1945; Wiener, 1948; Shannon, 1950]
First chess program [Turing, 1951]
Machine learning to improve evaluation accuracy [Samuel, 1952–57]
Pruning to allow deeper search [McCarthy, 1956]
5. Types of Games
                        deterministic              chance
perfect information     chess, checkers,           backgammon,
                        go, othello                monopoly
imperfect information                              bridge, poker, scrabble,
                                                   nuclear war
6. Game Tree: 2-Player / Deterministic / Turns
[Figure: partial tic-tac-toe game tree. MAX (X) moves at the root, MIN (O) at the next level, and the players alternate down to TERMINAL positions with utility −1, 0, or +1.]
7. Minimax
Perfect play for deterministic, perfect-information games
Idea
Select the move leading to the position with the highest minimax value, i.e., the best achievable payoff against best play
9. Minimax Algorithm
function MINIMAX-DECISION(game) returns an operator
for each op in OPERATORS[game] do
VALUE[op] ← MINIMAX-VALUE(APPLY(op, game), game)
end
return the op with the highest VALUE[op]
function MINIMAX-VALUE(state, game) returns a utility value
if TERMINAL-TEST[game](state) then
return UTILITY[game](state)
else if MAX is to move in state then
return the highest MINIMAX-VALUE of SUCCESSORS(state)
else
return the lowest MINIMAX-VALUE of SUCCESSORS(state)
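The MINIMAX-VALUE recursion above can be sketched in Python. The `Game` bundle of TERMINAL-TEST, UTILITY, SUCCESSORS, and a to-move function is an assumption for illustration, as is the tiny demo tree:

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Game:
    terminal_test: Callable[[Any], bool]    # TERMINAL-TEST[game]
    utility: Callable[[Any], float]         # UTILITY[game], from MAX's viewpoint
    successors: Callable[[Any], List[Any]]  # SUCCESSORS(state)
    to_move: Callable[[Any], str]           # "MAX" or "MIN"

def minimax_value(state: Any, game: Game) -> float:
    """Exact utility of state under perfect play by both sides."""
    if game.terminal_test(state):
        return game.utility(state)
    values = [minimax_value(s, game) for s in game.successors(state)]
    return max(values) if game.to_move(state) == "MAX" else min(values)

# Hypothetical demo tree: MAX root "A" over three MIN nodes of leaf utilities.
succ = {"A": ["B", "C", "D"], "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2]}
demo = Game(
    terminal_test=lambda s: isinstance(s, int),
    utility=lambda s: s,
    successors=lambda s: succ[s],
    to_move=lambda s: "MAX" if s == "A" else "MIN",
)
root_value = minimax_value("A", demo)  # max(min(3,12,8), min(2,4,6), min(14,5,2)) = 3
```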
10. Properties of Minimax
Complete: Yes, if the tree is finite (chess has specific rules for this)
Optimal: Yes, against an optimal opponent. Otherwise?
Time: O(b^m) (depth-first exploration)
Space: O(bm) (depth-first exploration)
Note
A finite strategy can exist even in an infinite tree
11. Resource Limits
Complexity of chess
b ≈ 35, m ≈ 100 for “reasonable” games
Exact solution completely impossible
Standard approach
Cutoff test
e.g., depth limit (perhaps add quiescence search)
Evaluation function
Estimates desirability of position
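The standard approach (cutoff test plus evaluation function) can be sketched as a depth-limited minimax. The dictionary-based game interface and the crude EVAL stand-in below are assumptions for illustration:

```python
# Depth-limited minimax: search to a fixed depth, then trust EVAL.
def h_minimax(state, game, depth, limit):
    if game["terminal_test"](state):
        return game["utility"](state)
    if depth >= limit:                   # CUTOFF-TEST: depth limit reached
        return game["eval"](state)       # estimated desirability of the position
    values = [h_minimax(s, game, depth + 1, limit)
              for s in game["successors"](state)]
    return max(values) if game["to_move"](state) == "MAX" else min(values)

# Hypothetical demo: a MAX root over two MIN nodes of leaf utilities.
succ = {"A": ["B", "C"], "B": [3, 12], "C": [2, 8]}
game = {
    "terminal_test": lambda s: isinstance(s, int),
    "utility": lambda s: s,
    "eval": lambda s: 7.0,               # crude stand-in estimate
    "successors": lambda s: succ[s],
    "to_move": lambda s: "MAX" if s == "A" else "MIN",
}
shallow = h_minimax("A", game, 0, 0)     # 7.0: cutoff at the root, EVAL used
deep = h_minimax("A", game, 0, 2)        # 3: full search, max(min(3,12), min(2,8))
```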
13. Evaluation Functions
Typical evaluation function for chess
Weighted sum of features
EVAL(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)
Example
w1 = 9
f1(s) = (number of white queens) − (number of black queens)
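The weighted sum EVAL(s) = w1·f1(s) + … + wn·fn(s) is a one-liner in code. The slide gives only w1 = 9 with the queen-difference feature; the rook and pawn features and weights below are illustrative assumptions (standard material values):

```python
# Weighted linear evaluation: EVAL(s) = sum of w_i * f_i(s).
def eval_position(weights, features):
    return sum(w * f for w, f in zip(weights, features))

weights  = [9, 5, 1]              # queen, rook, pawn weights (assumed material values)
features = [1 - 1, 2 - 1, 8 - 6]  # (white - black) piece counts per type
score = eval_position(weights, features)  # 9*0 + 5*1 + 1*2 = 7
```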
14. Cutting Off Search
Does it work in practice?
b^m ≈ 10^6, b = 35 ⇒ m ≈ 4
Not really, because …
4-ply ≈ human novice (hopeless chess player)
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
20. Properties of α-β
Effects of pruning:
Reduces the search space
Does not affect the final result
Effectiveness:
Good move ordering improves effectiveness
Time complexity with “perfect ordering”: O(b^(m/2)) ⇒ doubles the depth of search
For chess:
Can easily reach depth 8 and play good chess.
21. The Idea of α-β
[Figure: alternating MAX/MIN/MAX/MIN levels, with node V below the current path.]
α is the best value (to MAX) found so far off the current path.
If the value x of some node below V is known to be less than α, then the value of V is known to be at most x, i.e., less than α; therefore MAX will avoid node V.
Consequence
No need to expand further nodes below V.
22. The α-β Algorithm
function MAX-VALUE(state, game, α, β) returns the minimax value of state
inputs: state /* current state in game */
game /* game description */
α /* the best score for MAX along the path to state */
β /* the best score for MIN along the path to state */
if CUTOFF-TEST(state) then return EVAL(state)
for each s in SUCCESSORS(state) do
α ← MAX(α, MIN-VALUE(s, game, α, β))
if α ≥ β then return β
end
return α
23. The α-β Algorithm
function MIN-VALUE(state, game, α, β) returns the minimax value of state
inputs: state /* current state in game */
game /* game description */
α /* the best score for MAX along the path to state */
β /* the best score for MIN along the path to state */
if CUTOFF-TEST(state) then return EVAL(state)
for each s in SUCCESSORS(state) do
β ← MIN(β, MAX-VALUE(s, game, α, β))
if β ≤ α then return α
end
return β
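The two α-β functions translate directly to Python, with `math.inf` standing in for ±∞ and the cutoff test specialised to terminal positions. The dictionary-based game interface and demo tree are assumptions for illustration:

```python
import math

def max_value(state, game, alpha, beta):
    """alpha/beta are the best scores so far for MAX/MIN along this path."""
    if game["terminal_test"](state):          # CUTOFF-TEST specialised to terminal
        return game["utility"](state)
    for s in game["successors"](state):
        alpha = max(alpha, min_value(s, game, alpha, beta))
        if alpha >= beta:                     # MIN will never let play reach here
            return beta
    return alpha

def min_value(state, game, alpha, beta):
    if game["terminal_test"](state):
        return game["utility"](state)
    for s in game["successors"](state):
        beta = min(beta, max_value(s, game, alpha, beta))
        if beta <= alpha:                     # MAX will never let play reach here
            return alpha
    return beta

# Hypothetical demo tree: MAX root over three MIN nodes of leaf utilities.
succ = {"A": ["B", "C", "D"], "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2]}
game = {"terminal_test": lambda s: isinstance(s, int),
        "utility": lambda s: s,
        "successors": lambda s: succ[s]}
best = max_value("A", game, -math.inf, math.inf)  # 3; under C, leaves 4 and 6 are never examined
```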
24. Deterministic Games in Practice
Checkers:
Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994.
It used an endgame database defining perfect play for all positions involving
eight or fewer pieces on the board, a total of 443,748,401,247 positions.
Chess:
Deep Blue defeated human world champion Garry Kasparov in a six-game match in
1997. Deep Blue searched 200 million positions per second, used a very
sophisticated evaluation, and used undisclosed methods for extending some lines of
search up to 40 ply.
Go:
Human champions refuse to compete against computers, which are too
bad. In Go, b > 300, so most programs use pattern knowledge bases to
suggest plausible moves.
26. Nondeterministic Games in General
Chance introduced by dice, card-shuffling, etc.
Simplified example with coin-flipping:
[Figure: a MAX node over two CHANCE nodes (probability 0.5 per branch), each over MIN nodes with leaf utilities 2, 4, 7, 4 and 6, 0, 5, −2. The MIN values are 2, 4, 0, −2; the chance values are 3 and −1.]
27. Algorithm for Nondeterministic Games
EXPECTIMINIMAX gives perfect play
if state is a terminal node then
return the utility of state
if state is a MAX node then
return the highest EXPECTIMINIMAX value of SUCCESSORS(state)
if state is a MIN node then
return the lowest EXPECTIMINIMAX value of SUCCESSORS(state)
if state is a chance node then
return the probability-weighted average of the EXPECTIMINIMAX values of SUCCESSORS(state)
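The EXPECTIMINIMAX recursion can be sketched over an explicit tree. The tagged-tuple node encoding is an assumption for illustration; the leaf values reproduce the coin-flipping example figure (MIN values 2, 4, 0, −2; chance values 3 and −1):

```python
# EXPECTIMINIMAX over a tree with MAX, MIN, and CHANCE nodes.
def expectiminimax(node):
    kind = node[0]
    if kind == "LEAF":
        return node[1]                                    # utility value
    if kind == "MAX":
        return max(expectiminimax(c) for c in node[1])
    if kind == "MIN":
        return min(expectiminimax(c) for c in node[1])
    if kind == "CHANCE":                                  # node[1] = [(prob, child), ...]
        return sum(p * expectiminimax(c) for p, c in node[1])
    raise ValueError(kind)

# Coin-flip example: each chance branch has probability 0.5.
tree = ("MAX", [
    ("CHANCE", [(0.5, ("MIN", [("LEAF", 2), ("LEAF", 4)])),
                (0.5, ("MIN", [("LEAF", 7), ("LEAF", 4)]))]),   # 0.5*2 + 0.5*4 = 3
    ("CHANCE", [(0.5, ("MIN", [("LEAF", 6), ("LEAF", 0)])),
                (0.5, ("MIN", [("LEAF", 5), ("LEAF", -2)]))]),  # 0.5*0 + 0.5*(-2) = -1
])
value = expectiminimax(tree)  # max(3, -1) = 3
```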
28. Pruning in Nondeterministic Game Trees (slides 28–35)
A version of α-β pruning is possible
[Figures: a sequence of slides stepping through α-β-style pruning on a small chance tree with branch probabilities 0.5. Every node starts with the interval [−∞, +∞]; as leaves (2, 2, 1, 1, 0, 1, …) are evaluated, the intervals tighten (e.g. to [2, 2], [1, 1], [1.5, 1.5], [0, 0]), and running bounds such as [−∞, 2], [−∞, 1], and [−∞, 0.5] allow subtrees to be cut off.]
36. Pruning Continued (slides 36–41)
More pruning occurs if we can bound the leaf values
[Figures: the same chance tree with every leaf known to lie in [−2, 2]. The bound lets a chance node be assigned an interval such as [0, 2] or [−2, 1] before all of its children are seen, so subtrees are cut off earlier than with unbounded leaves.]
42. Nondeterministic Games in Practice
Problem
α-β pruning is much less effective
Dice rolls increase b: 21 possible rolls with 2 dice
Backgammon: ≈ 20 legal moves; depth 4 already gives 20 × (21 × 20)^3 ≈ 1.2 × 10^9 nodes
TD-GAMMON
Uses depth-2 search + a very good EVAL ≈ world-champion level
43. Digression: Exact Values DO Matter
[Figure: two MAX trees identical up to an order-preserving rescaling of the leaves. With chance probabilities 0.9/0.1 and MIN values 2, 3, 1, 4, MAX's options are worth 2.1 and 1.3, so MAX moves left; with the rescaled values 20, 30, 1, 400 the options are worth 21 and 40.9, so MAX moves right.]
Behaviour is preserved only by positive linear transformations of EVAL
Hence EVAL should be proportional to the expected payoff
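The digression can be checked numerically with the figure's own values: rescaling the leaves from 2, 3, 1, 4 to 20, 30, 1, 400 preserves their ordering but flips MAX's decision at the chance node, which is why only positive linear transformations of EVAL are safe:

```python
# Expected value of a chance node with the figure's 0.9/0.1 probabilities.
def expected(values, probs=(0.9, 0.1)):
    return sum(p * v for p, v in zip(probs, values))

left, right = expected((2, 3)), expected((1, 4))        # 2.1 vs 1.3
left2, right2 = expected((20, 30)), expected((1, 400))  # 21.0 vs 40.9
prefers_left_before = left > right    # True: MAX moves left
prefers_left_after = left2 > right2   # False: MAX now moves right
```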
44. Games of Imperfect Information
Typical examples
Card games: bridge, poker, skat, etc.
Note
Like having one big dice roll at the start of the game.
45. Games of Imperfect Information
IDEA FOR COMPUTING THE BEST ACTION:
• estimate the minimax value of each action in each deal,
• then choose the action with the highest expected value over all deals.
• Requires knowing the probabilities of the possible deals.
SPECIAL CASE:
If an action is optimal for all deals, it is optimal.
BRIDGE:
The current best bridge program approximates this idea by
– generating 100 deals consistent with the bidding information,
– picking the action that wins the most tricks on average.
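The deal-averaging approximation above can be sketched as: score each candidate action in each sampled deal, then pick the action with the best average. The `sample_deal` and `score` callables below are hypothetical stand-ins for a real deal generator and a per-deal minimax evaluation:

```python
# Monte Carlo averaging over deals for an imperfect-information game.
def best_action(actions, sample_deal, score, n_deals=100):
    totals = {a: 0.0 for a in actions}
    for _ in range(n_deals):
        deal = sample_deal()             # a deal consistent with what we know
        for a in actions:
            totals[a] += score(a, deal)  # e.g. tricks won playing a in this deal
    return max(actions, key=lambda a: totals[a])

# Toy usage: "finesse" always scores one more trick than "cash" here.
choice = best_action(["cash", "finesse"],
                     sample_deal=lambda: None,
                     score=lambda a, d: 2.0 if a == "finesse" else 1.0)
```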
46. Commonsense Example
Day 1
Road A leads to a small heap of gold pieces (10 points)
Road B leads to a fork:
– take the left fork and you will find a mound of jewels (100 points)
– take the right fork and you will be run over by a bus (−1000 points)
Best action: Take road B (100 points)
Day 2
Road A leads to a small heap of gold pieces (10 points)
Road B leads to a fork:
– take the left fork and you will be run over by a bus (−1000 points)
– take the right fork and you will find a mound of jewels (100 points)
Best action: Take road B (100 points)
47. Commonsense Example
Day 3
Road A leads to a small heap of gold pieces (10 points)
Road B leads to a fork:
– guess correctly and you will find a mound of jewels (100 points)
– guess incorrectly and you will be run over by a bus (−1000 points)
Best action: Take road A (10 points)
NOT: Take road B (½ × (−1000) + ½ × 100 = −450 points)
48. Proper Analysis
Note
• The value of an action is not the average of the values for the actual states computed with perfect information.
• Rather, the value of an action depends on the information state the agent is in.
This leads to reasonable behaviours such as:
– acting to obtain information
– signalling to one’s partner
– acting randomly to minimize information disclosure
49. Summary
Games are to AI as grand prix racing is to automobile design.
Games are fun to work on (and dangerous).
They illustrate several important points about AI:
– perfection is unattainable ⇒ must approximate
– it is a good idea to think about what to think about
– uncertainty constrains the assignment of values to states