ARTIFICIAL INTELLIGENCE:
ADVERSARIAL SEARCH
ENGR DR E. M. MARTEY
LECTURE THREE
Adversarial Search
• Adversarial search in AI is a type of search in which one can trace the movements
of an enemy or opponent. Such searches are useful in chess, business strategy tools,
trading platforms, and war-based games that use AI agents.
• In an adversarial search, the agent can change the current state but has no control
over future states.
• The opponent controls the next state, which makes it unpredictable.
• More than one agent may be searching for a solution in the same search space,
which is common in gameplay.
• Treating the game as a search problem and defining a heuristic evaluation function
are the two key components that help in modeling and solving games in AI.
Game Playing
Why Study Game Playing?
Games allow us to experiment with simplified versions of real-world situations
Hostile agents act against our goals
Games have a finite set of moves
Games are fairly easy to represent
Games force us to decide what is worth thinking about
Perfection is unrealistic; we must settle for good
One of the earliest areas of AI
• Claude Shannon and Alan Turing wrote chess programs in the 1950s
The opponent introduces uncertainty
The environment may contain uncertainty (backgammon)
The search space is too large to consider exhaustively
• Chess has about 10^40 legal positions
• Efficient and effective search strategies are even more critical
Games are fun to solve with adversarial search
Different Game Scenarios using Adversarial Search
Perfect Information: A perfect information game is one where agents can see the
entire board. Agents are able to view each other's moves and possess all of the
game's information. Go, Checkers, and Chess are a few examples.
Imperfect Information: A "game with imperfect information" is one, such as blind
tic tac toe, battleship, or bridge, in which agents are not fully informed about
the game or aware of everything that is happening.
Deterministic Game: Games classified as deterministic involve no element of
randomness; instead, they adhere to a rigid structure and set of rules. A few
examples are tic tac toe, go, checkers, and chess.
Different Game Scenarios using Adversarial Search
Non-deterministic Games: Games that involve an element of chance or luck and a
variety of unpredictable outcomes are said to be non-deterministic. Dice or cards
are used to introduce this element of chance or luck. Each action-reaction pair is
not fixed; outcomes are random. These games are also called stochastic games. For
instance, Monopoly, Poker, Backgammon, etc.
Zero-Sum Games: These are strictly competitive games where the success of one
player is offset by the failure of another. Every player in these games has a
different set of opposing strategies, and the net value of gain and loss is zero.
Every player aims to maximize gain or minimize loss, depending on the conditions
and surroundings of the game.
Zero-Sum Games
We focus primarily on "adversarial games": two-player, zero-sum games. As Player 1
gains strength, Player 2 loses strength, and vice versa; the sum of the two
strengths is always 0.
Zero-Sum Games
• 2-person game
• Players alternate moves
• Zero-sum: one player’s loss is the other’s gain
• Perfect information: both players have access to complete
information about the state of the game. No information is
hidden from either player.
• No chance (e.g., using dice) involved
• Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
Using Search
• Search could be used to find a perfect sequence of
moves except the following problems arise:
– There exists an adversary who is trying to
minimize your chance of winning every other
move
• You cannot control his/her move
– Search trees can be very large, but you have
finite time to move
• Chess has about 10^40 nodes in the search space
• With single-agent search, can afford to wait
• Some two-player games have time limits
• Solution?
– Search to n levels in the tree (n ply)
– Evaluate the nodes at the nth level
– Head for the best looking node
Search Applied to Adversarial Games
• Initial state
– Current board position (description of current
game state)
• Operators
– Legal moves a player can make
• Terminal nodes
– Leaf nodes in the tree
– Indicate the game is over
• Utility function
– Objective function or payoff function
– Value of the outcome of a game
– Example: in tic tac toe, utility is -1, 0, or 1 (see the sketch below)
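As a hedged illustration, these components map onto a small Python interface; the class and method names below are our own invention, not from any particular library.

    class Game:
        """Minimal, hypothetical game formulation mirroring the components above."""
        def initial_state(self):        # initial state: current board position
            raise NotImplementedError
        def moves(self, state):         # operators: legal moves a player can make
            raise NotImplementedError
        def result(self, state, move):  # state reached by applying a legal move
            raise NotImplementedError
        def is_terminal(self, state):   # terminal nodes: the game is over
            raise NotImplementedError
        def utility(self, state):       # payoff, e.g. -1, 0, or 1 in tic tac toe
            raise NotImplementedError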
Game Tree
Optimal Decisions in Games
A normal search problem:
- The optimal solution is a sequence of actions leading to a goal state.
An adversarial search problem:
- MIN interferes with the sequence of actions.
- A strategy for MAX must:
Specify a move for the initial state,
Consider every possible response by MIN,
Specify moves in response to each, and so on.
The optimal strategy can be determined from the minimax
value of each node.
The minimax value is the function that tells you how good or
bad a state is.
Terminology
• The initial state is the first layer; it
defines that the board is blank and that
it is MAX's turn to play.
• The successor function lists all the
possible successor moves. It is defined for
all the layers in the tree.
• A terminal state is in the last layer of
the tree and shows a final state, i.e.,
whether the player MAX wins, loses, or ties
with the opponent.
• The utilities for the terminal states are
1, 0, and -1 as discussed earlier, and they
can be used to determine the utilities of
the other nodes as well.
Game Trees
• Tic tac toe
• Two players, MAX and MIN
• Moves (and levels) alternate between the two players
Minimax Properties
• Complete if the tree is finite
• Optimal if playing against an opponent with
the same strategy (utility function)
• Time complexity is O(b^m)
• Space complexity is O(bm) (depth-first
exploration)
• If we have 100 seconds to make a move
– Can explore 10^4 nodes/second
– Can consider 10^6 nodes/move
• The standard approach is:
– Apply a cutoff test (depth limit,
quiescence)
– Evaluate nodes at the cutoff (an evaluation
function estimates the desirability of a
position), as sketched below
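A minimal sketch of this standard approach, assuming the hypothetical Game interface from earlier; the evaluate argument stands in for any heuristic evaluation function.

    def minimax_cutoff(game, state, depth, maximizing, evaluate):
        # Depth-limited minimax: cut off at a fixed depth and fall back
        # on a heuristic evaluation of the position.
        if game.is_terminal(state):
            return game.utility(state)
        if depth == 0:                  # cutoff test (depth limit)
            return evaluate(state)      # estimate desirability of position
        values = [minimax_cutoff(game, game.result(state, m),
                                 depth - 1, not maximizing, evaluate)
                  for m in game.moves(state)]
        return max(values) if maximizing else min(values)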
How does the minimax algorithm work?
Step 1: First, generate the entire game tree, starting from the current position of
the game all the way to the terminal states.
Step 2: Apply the utility function to get the utility values for all the terminal
states.
Step 3: Determine the utilities of the higher nodes with the help of the utilities
of the terminal nodes.
Step 4: Calculate the utility values with the help of the leaves, considering one
layer at a time, until the root of the tree is reached.
Step 5: Eventually, all the backed-up values reach the root of the tree, i.e., the
topmost point. At that point, MAX chooses the highest value.
• Let us calculate the utility for the left node (red) of the layer above the
terminal states. We have to evaluate MIN{3, 5, 10}, which we know is certainly 3.
So the utility for the red node is 3.
• Similarly, for the green node in the same layer, we have to evaluate MIN{2, 2},
which is 2.
• To summarize,
• Minimax Decision = MAX{MIN{3,5,10}, MIN{2,2}}
= MAX{3,2}
= 3
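This computation is small enough to run directly. Below is a minimal minimax sketch over a nested-list tree of our own devising: a leaf is a number, an internal node is a list of children.

    def minimax(node, maximizing):
        # Leaf: return its utility value.
        if not isinstance(node, list):
            return node
        values = [minimax(child, not maximizing) for child in node]
        return max(values) if maximizing else min(values)

    # The example above: MAX{ MIN{3, 5, 10}, MIN{2, 2} }
    tree = [[3, 5, 10], [2, 2]]
    print(minimax(tree, maximizing=True))   # prints 3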
Minimax Algorithm
• Search the tree to the end
• Assign utility values to terminal nodes
• Find the best move for MAX (on MAX’s turn),
assuming:
– MAX will make the move that maximizes
MAX’s utility
– MIN will make the move that minimizes
MAX’s utility
• Here, MAX should make the leftmost move
Optimization
Game trees are, in general, very time consuming to build, and only for simple
games can they be generated in a short time.
If there are b legal moves, i.e., b nodes at each point, and the maximum depth of
the tree is m, the time complexity of the minimax algorithm is of the order b^m
(O(b^m)).
To curb this, a few optimizations can be added to the algorithm.
Fortunately, it is viable to find the actual minimax decision without looking at
every node of the game tree. We can eliminate nodes from the tree without
analyzing them, a process called pruning.
Alpha-beta pruning
• It is a modified version of the
standard minimax algorithm: it
returns the same move as the
standard algorithm, but it
removes (prunes) all the nodes
that cannot possibly affect the
final decision.
Suppose we have the following game tree. In this case,
Minimax Decision = MAX{MIN{3,5,10}, MIN{2,a,b}, MIN{2,7,3}}
= MAX{3,c,2}
= 3
Here a and b never need to be examined: MIN{2,a,b} can be at most 2, so its value c
cannot exceed 2 and therefore cannot beat the 3 from the first branch.
Working of Alpha-Beta Pruning:
• Step 1: In the first step, the MAX
player starts the first move from
node A, where α = -∞ and β = +∞.
These values of alpha and beta are
passed down to node B, where again
α = -∞ and β = +∞, and node B
passes the same values to its
child D.
Working of Alpha-Beta Pruning:
Step 2: At node D it is MAX's turn, so
the value of α will be calculated. The
value of α is compared first with 2 and
then with 3, and max(2, 3) = 3 becomes
the value of α at node D; the node
value will also be 3.
Working of Alpha-Beta Pruning:
Step 3: Now the algorithm backtracks to
node B, where the value of β will
change, as it is MIN's turn. β = +∞ is
compared with the node value just
returned: min(∞, 3) = 3, hence at node
B we now have α = -∞ and β = 3.
Working of Alpha-Beta Pruning:
In the next step, the algorithm
traverses the next successor of node B,
which is node E, and the values α = -∞
and β = 3 are passed down.
Step 4: At node E, MAX takes its turn,
and the value of alpha changes. The
current value of alpha is compared with
5, so max(-∞, 5) = 5; hence at node E,
α = 5 and β = 3, where α >= β. The
right successor of E is therefore
pruned, and the algorithm will not
traverse it; the value at node E will
be 5.
Working of Alpha-Beta Pruning:
Step 5: Next, the algorithm backtracks
the tree again, from node B to node A.
At node A, the value of alpha is
updated: the maximum available value is
3, as max(-∞, 3) = 3, and β = +∞. These
two values are now passed to the right
successor of A, which is node C.
At node C, α = 3 and β = +∞, and the
same values are passed on to node F.
Working of Alpha-Beta Pruning:
Step 6: At node F, the value of α is
again compared, first with the left
child, which is 0: max(3, 0) = 3. It is
then compared with the right child,
which is 1: max(3, 1) = 3. α remains 3,
but the node value of F becomes 1.
Working of Alpha-Beta Pruning:
Step 7: Node F returns the node value 1
to node C. At C, α = 3 and β = +∞; here
the value of beta is updated by
comparing it with 1, so min(∞, 1) = 1.
Now at C, α = 3 and β = 1, which again
satisfies the condition α >= β. The
next child of C, which is G, is
therefore pruned, and the algorithm
will not compute the entire subtree
under G.
Working of Alpha-Beta Pruning:
Step 8: C now returns the value 1 to A,
where the best value for A is
max(3, 1) = 3. The final game tree
shows which nodes were computed and
which were never computed. Hence, the
optimal value for the maximizer is 3 in
this example.
Pseudocode
evaluate (node, alpha, beta)
    if node is a leaf
        return the utility value of node
    if node is a minimizing node
        for each child of node
            beta = min (beta, evaluate (child, alpha, beta))
            if beta <= alpha
                return beta
        return beta
    if node is a maximizing node
        for each child of node
            alpha = max (alpha, evaluate (child, alpha, beta))
            if beta <= alpha
                return alpha
        return alpha
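A direct, runnable Python translation of this pseudocode, using the same nested-list tree representation as the earlier minimax sketch; the maximizing flag plays the role of the "maximizing node" / "minimizing node" test.

    import math

    def evaluate(node, alpha, beta, maximizing):
        if not isinstance(node, list):       # leaf: return its utility value
            return node
        if maximizing:
            for child in node:
                alpha = max(alpha, evaluate(child, alpha, beta, False))
                if beta <= alpha:            # prune the remaining children
                    break
            return alpha
        else:
            for child in node:
                beta = min(beta, evaluate(child, alpha, beta, True))
                if beta <= alpha:            # prune the remaining children
                    break
            return beta

    # The earlier example: the second branch is cut off after its first 2.
    tree = [[3, 5, 10], [2, 2]]
    print(evaluate(tree, -math.inf, math.inf, maximizing=True))   # prints 3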
Bad and Good Cases for Alpha-Beta Pruning
Bad: Worst moves encountered first
Good: Good moves ordered first
If we can order moves, we can get more benefit from
alpha-beta pruning, as the two trees and the sketch below show.
4 MAX
+----------------+----------------+
2 3 4 MIN
+----+----+ +----+----+ +----+----+
6 4 2 7 5 3 8 6 4 MAX
+--+ +--+ +--+ +-+-+ +--+ +--+ +--+ +--+ +--+--+
6 5 4 3 2 1 1 3 7 4 5 2 3 8 2 1 6 1 2 4
4 MAX
+----------------+----------------+
4 3 2 MIN
+----+----+ +----+----+ +----+----+
4 6 8 3 x x 2 x x MAX
+--+ +--+ +--+ +--+ +-+-+
4 2 6 x 8 x 3 2 1 2 1
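One common way to exploit this, sketched under our own assumptions: sort children with a cheap static heuristic before recursing, so likely-good moves are searched first and cutoffs happen earlier. quick_score is a hypothetical stand-in for such a heuristic.

    def ordered(children, maximizing, quick_score):
        # Best-looking moves first for the player to move; inside the
        # alpha-beta evaluate above, iterate over ordered(...) instead
        # of the raw child list.
        return sorted(children, key=quick_score, reverse=maximizing)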
Alpha-Beta Properties
Pruning does not affect the final result
Good move ordering improves the effectiveness of pruning
With perfect ordering, time complexity is O(b^(m/2))
Problems with a Fixed Ply: The Horizon Effect
• Inevitable losses are postponed
• Unachievable goals appear achievable
• Short-term gains mask unavoidable
consequences (traps)
(Figure: a chess position where losing a pawn now merely hides the inevitable loss of the queen just beyond the "look-ahead horizon".)
Solutions
• How to counter the horizon effect
– Feedover
• Do not cut off search at non-quiescent
board positions (dynamic positions)
• Example, king in danger
• Keep searching down that path until
reach quiescent (stable) nodes
– Secondary Search
• Search further down selected path to
ensure this is the best move
– Progressive Deepening
• Search one ply, then two ply, etc., until
time runs out
• Similar to IDS; see the sketch below
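A minimal sketch of progressive deepening under a time budget, assuming the depth-limited minimax_cutoff sketched earlier; the time handling is our own illustration.

    import time

    def progressive_deepening(game, state, evaluate, seconds):
        # Search 1 ply, then 2, ... keeping the value from the deepest
        # completed search, until the time budget is exhausted.
        deadline = time.monotonic() + seconds
        best, depth = None, 1
        while time.monotonic() < deadline:
            best = minimax_cutoff(game, state, depth, True, evaluate)
            depth += 1
        return best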
Non-Deterministic Games
Uncertain outcomes controlled by
chance, not an adversary
Game trees with chance nodes
•Chance nodes (shown as circles) represent
random events
•For a random event with N outcomes, each
chance node has N distinct children; a
probability is associated with each
•(For 2 dice, there are 21 distinct outcomes)
•Use minimax to compute values for MAX
and MIN nodes
•Use expected values for chance nodes
•For chance nodes over a MAX node, as in C:
expectimax(C) = Σ_i P(d_i) · maxvalue(i)
•For chance nodes over a MIN node:
expectimin(C) = Σ_i P(d_i) · minvalue(i)
(Figure: game tree with chance nodes, labeled "Max Rolls" and "Min Rolls".)
Expectimax Search
• The Expectimax search algorithm is a
game-theory algorithm used to maximize
the expected utility.
• It is a variation of the Minimax
algorithm. While Minimax assumes that
the adversary (the minimizer) plays
optimally, Expectimax does not.
• This is useful for modelling
environments where adversary agents are
not optimal, or where their actions are
based on chance.
Algorithm of Expectimax Search
Step 1: If the current call is a maximizer node, return the maximum of the state
values of the node's successors.
Step 2: If the current call is a chance node, return the average of the state
values of the node's successors (assuming all successors have equal probability).
If the successors have different probabilities p_i, the expected utility is given
by Σ_i x_i p_i.
Step 3: Call the function recursively until reaching a terminal node (a state with
no successors), then return the utility for that state. A minimal sketch follows.
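A minimal sketch under our own toy encoding: a leaf is a number; an internal node is a tuple (kind, children) with kind "max" or "chance".

    def expectimax(node):
        if not isinstance(node, tuple):        # terminal node: return utility
            return node
        kind, children = node
        if kind == "max":                      # maximizer: best successor value
            return max(expectimax(c) for c in children)
        # chance node: average, assuming equally likely outcomes here
        return sum(expectimax(c) for c in children) / len(children)

    # MAX chooses between two chance nodes with equally likely outcomes.
    tree = ("max", [("chance", [3, 5]), ("chance", [10, 0])])
    print(expectimax(tree))                    # prints 5.0 = max(4.0, 5.0)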
Example of Expectimax Search
Nondeterministic Game Algorithm
• Just like Minimax, except it also handles chance nodes
• Compute the ExpectMinimaxValue of successors:
– If n is a terminal node, then
ExpectMinimaxValue(n) = Utility(n)
– If n is a MAX node, then
ExpectMinimaxValue(n) = max_{s ∈ Successors(n)} ExpectMinimaxValue(s)
– If n is a MIN node, then
ExpectMinimaxValue(n) = min_{s ∈ Successors(n)} ExpectMinimaxValue(s)
– If n is a chance node, then
ExpectMinimaxValue(n) = Σ_{s ∈ Successors(n)} P(s) · ExpectMinimaxValue(s)
Summary
• Adversarial games can be classified as having perfect
or imperfect information. Every player in a game with
perfect information is fully aware of the game's present
state, whereas in a game with imperfect information
certain information is kept hidden from the players
• Search strategies such as minimax and alpha-beta
pruning are used in adversarial games to discover the
best move. These algorithms aid in choosing the best
move for a player by weighing the possible outcomes of
each move
• Because of the size of the game tree, it may not always
be possible to search it exhaustively. In these cases,
heuristics are used to approximate the optimal move.
Heuristics are shortcuts that allow the algorithm to
quickly evaluate the promise of a move without having to
look through the entire game tree
Status of AI Game Players
• Tic Tac Toe
– Tied for best player in world
• Othello
– Computer better than any human
– Human champions now refuse to
play computer
• Scrabble
– Maven beat world champions Joel
Sherman and Matt Graham
• Backgammon
– 1992, Tesauro combines 3-ply
search & neural networks (with
160 hidden units) yielding top-3
player
• Bridge
– GIB ranked among the top players in
the world
• Poker
– Poki plays at a strong intermediate
level
• Checkers
– 1994, Chinook ended 40-year reign of
human champion Marion Tinsley
• Chess
– 1997, Deep Blue beat human
champion Garry Kasparov in a six-game
match
– Deep Blue searches 200M
positions/second, up to 40 ply
– Now looking at other applications
(molecular dynamics, drug synthesis)
• Go
– 2008, MoGo running on 25 nodes
(800 cores) beat Myungwan Kim
– $2M prize available for first computer
program to defeat a top player
Question
