ARTIFICIAL INTELLIGENCE:
ADVERSARIAL SEARCH
ENGR DR E. M. MARTEY
LECTURE THREE
Adversarial Search
• Adversarial search in AI is a type of search in which one can trace the movements
of an enemy or opponent. Such searches are useful in chess, business strategy tools,
trading platforms, and war-based games that use AI agents.
• In an adversarial search, the agent can change the current state but has no control
over future states.
• The opponent controls the next state, which makes it unpredictable.
• More than one agent may be searching for a solution in the same search space,
which is common in gameplay.
• Treating the game as a search problem and defining a heuristic evaluation function
are the two key components that help in modeling and solving games in AI.
Game Playing
Why Study Game Playing?
Games allow us to experiment with simplified versions of real-world situations
Hostile agents act against our goals
Games have a finite set of moves
Games are fairly easy to represent
Games force us to decide what is worth thinking about
Perfection is unrealistic; we must settle for good
One of the earliest areas of AI
• Claude Shannon and Alan Turing wrote chess programs in the 1950s
The opponent introduces uncertainty
The environment may contain uncertainty (backgammon)
The search space is too large to consider exhaustively
• Chess has about 10^40 legal positions
• Efficient and effective search strategies are even more critical
Games are fun to solve with adversarial search
Different Game Scenarios using Adversarial Search
Perfect Information: A perfect information game is one where agents can see the
entire board. Agents are able to view each other's moves and possess all of the
game's information. Go, Checkers, and Chess are a few examples.
Imperfect Information: A "game with imperfect information" is one, such as blind
tic tac toe, battleship, or bridge, in which agents are not fully informed about
the game or aware of everything that is happening.
Deterministic Game: Games classified as deterministic involve no element of
randomness; instead, they adhere to a rigid structure and set of rules. A few
examples are tic tac toe, go, checkers, and chess.
Different Game Scenarios using Adversarial Search
Non-deterministic Games: Games that involve an element of chance or luck and a
variety of unpredictable outcomes are said to be non-deterministic. Dice or cards
are used to introduce this element of chance or luck. Each action-reaction pair is
not fixed; outcomes are random. These games are also called stochastic games. For
instance, Monopoly, Poker, Backgammon, etc.
Zero-Sum Games: These are strictly competitive games where the success of one
player is offset by the failure of another. Every player in these games has a
different set of opposing strategies, and the net value of gain and loss is zero.
Every player aims to maximize gain or minimize loss, depending on the conditions
and surroundings of the game.
Zero-Sum Games
We focus primarily on "adversarial games": two-player, zero-sum games. As Player 1
gains strength, Player 2 loses strength, and vice versa; the sum of the two
strengths is always 0.
Zero-Sum Games
• 2-person game
• Players alternate moves
• Zero-sum: one player’s loss is the other’s gain
• Perfect information: both players have access to complete
information about the state of the game. No information is
hidden from either player.
• No chance (e.g., using dice) involved
• Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
Using Search
• Search could be used to find a perfect sequence of
moves except the following problems arise:
– There exists an adversary who is trying to
minimize your chance of winning every other
move
• You cannot control his/her move
– Search trees can be very large, but you have
finite time to move
• Chess has about 10^40 nodes in the search space
• With single-agent search, can afford to wait
• Some two-player games have time limits
• Solution?
– Search to n levels in the tree (n ply)
– Evaluate the nodes at the nth level
– Head for the best looking node
Search Applied to Adversarial Games
• Initial state
– Current board position (description of current
game state)
• Operators
– Legal moves a player can make
• Terminal nodes
– Leaf nodes in the tree
– Indicate the game is over
• Utility function
– Objective function or payoff function
– Value of the outcome of a game
– Example: in tic tac toe, utility is -1, 0, or 1 (see the sketch below)
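As a hedged illustration, these components map onto a small Python interface; the class and method names below are our own invention, not from any particular library.

    class Game:
        """Minimal, hypothetical game formulation mirroring the components above."""
        def initial_state(self):        # initial state: current board position
            raise NotImplementedError
        def moves(self, state):         # operators: legal moves a player can make
            raise NotImplementedError
        def result(self, state, move):  # state reached by applying a legal move
            raise NotImplementedError
        def is_terminal(self, state):   # terminal nodes: the game is over
            raise NotImplementedError
        def utility(self, state):       # payoff, e.g. -1, 0, or 1 in tic tac toe
            raise NotImplementedError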
Game Tree
Optimal Decisions in Games
A normal search problem:
- The optimal solution is a sequence of actions leading to a goal state.
An adversarial search problem:
- MIN interferes with the sequence of actions.
- A strategy for MAX must:
Specify a move for the initial state,
Consider every possible response by MIN,
Specify moves in response to each, and so on.
The optimal strategy can be determined from the minimax
value of each node.
The minimax value is the function that tells you how good or
bad a state is.
Terminology
• The initial state is the first layer; it
defines that the board is blank and that
it is MAX's turn to play.
• The successor function lists all the
possible successor moves. It is defined for
all the layers in the tree.
• A terminal state is in the last layer of
the tree and shows a final state, i.e.,
whether the player MAX wins, loses, or ties
with the opponent.
• The utilities for the terminal states are
1, 0, and -1 as discussed earlier, and they
can be used to determine the utilities of
the other nodes as well.
Game Trees
• Tic tac toe
• Two players, MAX and MIN
• Moves (and levels) alternate between the two players
Minimax Properties
• Complete if the tree is finite
• Optimal if playing against an opponent with
the same strategy (utility function)
• Time complexity is O(b^m)
• Space complexity is O(bm) (depth-first
exploration)
• If we have 100 seconds to make a move
– Can explore 10^4 nodes/second
– Can consider 10^6 nodes/move
• The standard approach is:
– Apply a cutoff test (depth limit,
quiescence)
– Evaluate nodes at the cutoff (an evaluation
function estimates the desirability of a
position), as sketched below
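A minimal sketch of this standard approach, assuming the hypothetical Game interface from earlier; the evaluate argument stands in for any heuristic evaluation function.

    def minimax_cutoff(game, state, depth, maximizing, evaluate):
        # Depth-limited minimax: cut off at a fixed depth and fall back
        # on a heuristic evaluation of the position.
        if game.is_terminal(state):
            return game.utility(state)
        if depth == 0:                  # cutoff test (depth limit)
            return evaluate(state)      # estimate desirability of position
        values = [minimax_cutoff(game, game.result(state, m),
                                 depth - 1, not maximizing, evaluate)
                  for m in game.moves(state)]
        return max(values) if maximizing else min(values)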
How does the minimax algorithm work?
Step 1: First, generate the entire game tree, starting from the current position of
the game all the way to the terminal states.
Step 2: Apply the utility function to get the utility values for all the terminal
states.
Step 3: Determine the utilities of the higher nodes with the help of the utilities
of the terminal nodes.
Step 4: Calculate the utility values with the help of the leaves, considering one
layer at a time, until the root of the tree is reached.
Step 5: Eventually, all the backed-up values reach the root of the tree, i.e., the
topmost point. At that point, MAX chooses the highest value.
• Let us calculate the utility for the left node (red) of the layer above the
terminal states. We have to evaluate MIN{3, 5, 10}, which we know is certainly 3.
So the utility for the red node is 3.
• Similarly, for the green node in the same layer, we have to evaluate MIN{2, 2},
which is 2.
• To summarize,
• Minimax Decision = MAX{MIN{3,5,10}, MIN{2,2}}
= MAX{3,2}
= 3
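This computation is small enough to run directly. Below is a minimal minimax sketch over a nested-list tree of our own devising: a leaf is a number, an internal node is a list of children.

    def minimax(node, maximizing):
        # Leaf: return its utility value.
        if not isinstance(node, list):
            return node
        values = [minimax(child, not maximizing) for child in node]
        return max(values) if maximizing else min(values)

    # The example above: MAX{ MIN{3, 5, 10}, MIN{2, 2} }
    tree = [[3, 5, 10], [2, 2]]
    print(minimax(tree, maximizing=True))   # prints 3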
Minimax Algorithm
• Search the tree to the end
• Assign utility values to terminal nodes
• Find the best move for MAX (on MAX’s turn),
assuming:
– MAX will make the move that maximizes
MAX’s utility
– MIN will make the move that minimizes
MAX’s utility
• Here, MAX should make the leftmost move
Optimization
Game trees are, in general, very time consuming to build, and only for simple
games can they be generated in a short time.
If there are b legal moves, i.e., b nodes at each point, and the maximum depth of
the tree is m, the time complexity of the minimax algorithm is of the order b^m
(O(b^m)).
To curb this, a few optimizations can be added to the algorithm.
Fortunately, it is viable to find the actual minimax decision without looking at
every node of the game tree. We can eliminate nodes from the tree without
analyzing them, a process called pruning.
Alpha-beta pruning
• It is a modified version of the
standard minimax algorithm: it
returns the same move as the
standard algorithm, but it
removes (prunes) all the nodes
that cannot possibly affect the
final decision.
Suppose we have the following game tree. In this case,
Minimax Decision = MAX{MIN{3,5,10}, MIN{2,a,b}, MIN{2,7,3}}
= MAX{3,c,2}
= 3
Here a and b never need to be examined: MIN{2,a,b} can be at most 2, so its value c
cannot exceed 2 and therefore cannot beat the 3 from the first branch.
Working of Alpha-Beta Pruning:
• Step 1: In the first step, the MAX
player starts the first move from
node A, where α = -∞ and β = +∞.
These values of alpha and beta are
passed down to node B, where again
α = -∞ and β = +∞, and node B
passes the same values to its
child D.
Working of Alpha-Beta Pruning:
Step 2: At node D it is MAX's turn, so
the value of α will be calculated. The
value of α is compared first with 2 and
then with 3, and max(2, 3) = 3 becomes
the value of α at node D; the node
value will also be 3.
Working of Alpha-Beta Pruning:
Step 3: Now the algorithm backtracks to
node B, where the value of β will
change, as it is MIN's turn. β = +∞ is
compared with the node value just
returned: min(∞, 3) = 3, hence at node
B we now have α = -∞ and β = 3.
Working of Alpha-Beta Pruning:
In the next step, the algorithm
traverses the next successor of node B,
which is node E, and the values α = -∞
and β = 3 are passed down.
Step 4: At node E, MAX takes its turn,
and the value of alpha changes. The
current value of alpha is compared with
5, so max(-∞, 5) = 5; hence at node E,
α = 5 and β = 3, where α >= β. The
right successor of E is therefore
pruned, and the algorithm will not
traverse it; the value at node E will
be 5.
Working of Alpha-Beta Pruning:
Step 5: Next, the algorithm backtracks
the tree again, from node B to node A.
At node A, the value of alpha is
updated: the maximum available value is
3, as max(-∞, 3) = 3, and β = +∞. These
two values are now passed to the right
successor of A, which is node C.
At node C, α = 3 and β = +∞, and the
same values are passed on to node F.
Working of Alpha-Beta Pruning:
Step 6: At node F, the value of α is
again compared, first with the left
child, which is 0: max(3, 0) = 3. It is
then compared with the right child,
which is 1: max(3, 1) = 3. α remains 3,
but the node value of F becomes 1.
Working of Alpha-Beta Pruning:
Step 7: Node F returns the node value 1
to node C. At C, α = 3 and β = +∞; here
the value of beta is updated by
comparing it with 1, so min(∞, 1) = 1.
Now at C, α = 3 and β = 1, which again
satisfies the condition α >= β. The
next child of C, which is G, is
therefore pruned, and the algorithm
will not compute the entire subtree
under G.
Working of Alpha-Beta Pruning:
Step 8: C now returns the value 1 to A,
where the best value for A is
max(3, 1) = 3. The final game tree
shows which nodes were computed and
which were never computed. Hence, the
optimal value for the maximizer is 3 in
this example.
Pseudocode
evaluate (node, alpha, beta)
    if node is a leaf
        return the utility value of node
    if node is a minimizing node
        for each child of node
            beta = min (beta, evaluate (child, alpha, beta))
            if beta <= alpha
                return beta
        return beta
    if node is a maximizing node
        for each child of node
            alpha = max (alpha, evaluate (child, alpha, beta))
            if beta <= alpha
                return alpha
        return alpha
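A direct, runnable Python translation of this pseudocode, using the same nested-list tree representation as the earlier minimax sketch; the maximizing flag plays the role of the "maximizing node" / "minimizing node" test.

    import math

    def evaluate(node, alpha, beta, maximizing):
        if not isinstance(node, list):       # leaf: return its utility value
            return node
        if maximizing:
            for child in node:
                alpha = max(alpha, evaluate(child, alpha, beta, False))
                if beta <= alpha:            # prune the remaining children
                    break
            return alpha
        else:
            for child in node:
                beta = min(beta, evaluate(child, alpha, beta, True))
                if beta <= alpha:            # prune the remaining children
                    break
            return beta

    # The earlier example: the second branch is cut off after its first 2.
    tree = [[3, 5, 10], [2, 2]]
    print(evaluate(tree, -math.inf, math.inf, maximizing=True))   # prints 3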
Bad and Good Cases for Alpha-Beta Pruning
Bad: Worst moves encountered first
Good: Good moves ordered first
If we can order moves, we can get more benefit from
alpha-beta pruning, as the two trees and the sketch below show.
4 MAX
+----------------+----------------+
2 3 4 MIN
+----+----+ +----+----+ +----+----+
6 4 2 7 5 3 8 6 4 MAX
+--+ +--+ +--+ +-+-+ +--+ +--+ +--+ +--+ +--+--+
6 5 4 3 2 1 1 3 7 4 5 2 3 8 2 1 6 1 2 4
4 MAX
+----------------+----------------+
4 3 2 MIN
+----+----+ +----+----+ +----+----+
4 6 8 3 x x 2 x x MAX
+--+ +--+ +--+ +--+ +-+-+
4 2 6 x 8 x 3 2 1 2 1
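One common way to exploit this, sketched under our own assumptions: sort children with a cheap static heuristic before recursing, so likely-good moves are searched first and cutoffs happen earlier. quick_score is a hypothetical stand-in for such a heuristic.

    def ordered(children, maximizing, quick_score):
        # Best-looking moves first for the player to move; inside the
        # alpha-beta evaluate above, iterate over ordered(...) instead
        # of the raw child list.
        return sorted(children, key=quick_score, reverse=maximizing)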
Alpha-Beta Properties
Pruning does not affect the final result
Good move ordering improves the effectiveness of pruning
With perfect ordering, time complexity is O(b^(m/2))
Problems with a Fixed Ply: The Horizon Effect
• Inevitable losses are postponed
• Unachievable goals appear achievable
• Short-term gains mask unavoidable
consequences (traps)
(Figure: a chess position where losing a pawn now merely hides the inevitable loss of the queen just beyond the "look-ahead horizon".)
Solutions
• How to counter the horizon effect
– Feedover
• Do not cut off search at non-quiescent
board positions (dynamic positions)
• Example, king in danger
• Keep searching down that path until
reach quiescent (stable) nodes
– Secondary Search
• Search further down selected path to
ensure this is the best move
– Progressive Deepening
• Search one ply, then two ply, etc., until
time runs out
• Similar to IDS; see the sketch below
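A minimal sketch of progressive deepening under a time budget, assuming the depth-limited minimax_cutoff sketched earlier; the time handling is our own illustration.

    import time

    def progressive_deepening(game, state, evaluate, seconds):
        # Search 1 ply, then 2, ... keeping the value from the deepest
        # completed search, until the time budget is exhausted.
        deadline = time.monotonic() + seconds
        best, depth = None, 1
        while time.monotonic() < deadline:
            best = minimax_cutoff(game, state, depth, True, evaluate)
            depth += 1
        return best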
Non-Deterministic Games
Uncertain outcomes controlled by
chance, not an adversary
Game trees with chance nodes
•Chance nodes (shown as circles) represent
random events
•For a random event with N outcomes, each
chance node has N distinct children; a
probability is associated with each
•(For 2 dice, there are 21 distinct outcomes)
•Use minimax to compute values for MAX
and MIN nodes
•Use expected values for chance nodes
•For chance nodes over a MAX node, as in C:
expectimax(C) = Σ_i P(d_i) · maxvalue(i)
•For chance nodes over a MIN node:
expectimin(C) = Σ_i P(d_i) · minvalue(i)
(Figure: game tree with chance nodes, labeled "Max Rolls" and "Min Rolls".)
Expectimax Search
• The Expectimax search algorithm is a
game-theory algorithm used to maximize
the expected utility.
• It is a variation of the Minimax
algorithm. While Minimax assumes that
the adversary (the minimizer) plays
optimally, Expectimax does not.
• This is useful for modelling
environments where adversary agents are
not optimal, or where their actions are
based on chance.
Algorithm of Expectimax Search
Step 1: If the current call is a maximizer node, return the maximum of the state
values of the node's successors.
Step 2: If the current call is a chance node, return the average of the state
values of the node's successors (assuming all successors have equal probability).
If the successors have different probabilities p_i, the expected utility is given
by Σ_i x_i p_i.
Step 3: Call the function recursively until reaching a terminal node (a state with
no successors), then return the utility for that state. A minimal sketch follows.
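A minimal sketch under our own toy encoding: a leaf is a number; an internal node is a tuple (kind, children) with kind "max" or "chance".

    def expectimax(node):
        if not isinstance(node, tuple):        # terminal node: return utility
            return node
        kind, children = node
        if kind == "max":                      # maximizer: best successor value
            return max(expectimax(c) for c in children)
        # chance node: average, assuming equally likely outcomes here
        return sum(expectimax(c) for c in children) / len(children)

    # MAX chooses between two chance nodes with equally likely outcomes.
    tree = ("max", [("chance", [3, 5]), ("chance", [10, 0])])
    print(expectimax(tree))                    # prints 5.0 = max(4.0, 5.0)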
Example of Expectimax Search
Nondeterministic Game Algorithm
• Just like Minimax, except it also handles chance nodes
• Compute the ExpectMinimaxValue of successors:
– If n is a terminal node, then
ExpectMinimaxValue(n) = Utility(n)
– If n is a MAX node, then
ExpectMinimaxValue(n) = max_{s ∈ Successors(n)} ExpectMinimaxValue(s)
– If n is a MIN node, then
ExpectMinimaxValue(n) = min_{s ∈ Successors(n)} ExpectMinimaxValue(s)
– If n is a chance node, then
ExpectMinimaxValue(n) = Σ_{s ∈ Successors(n)} P(s) · ExpectMinimaxValue(s)
Summary
• Adversarial games can be classified as having perfect
or imperfect information. Every player in a game with
perfect information is fully aware of the game's present
state, whereas in a game with imperfect information
certain information is kept hidden from the players
• Search strategies such as minimax and alpha-beta
pruning are used in adversarial games to discover the
best move. These algorithms aid in choosing the best
move for a player by weighing the possible outcomes of
each move
• Because of the size of the game tree, it may not always
be possible to search it exhaustively. In these cases,
heuristics are used to approximate the optimal move.
Heuristics are shortcuts that allow the algorithm to
quickly evaluate the promise of a move without having to
look through the entire game tree
Status of AI Game Players
• Tic Tac Toe
– Tied for best player in world
• Othello
– Computer better than any human
– Human champions now refuse to
play computer
• Scrabble
– Maven beat world champions Joel
Sherman and Matt Graham
• Backgammon
– 1992, Tesauro combines 3-ply
search & neural networks (with
160 hidden units) yielding top-3
player
• Bridge
– GIB ranked among the top players in
the world
• Poker
– Poki plays at a strong intermediate
level
• Checkers
– 1994, Chinook ended 40-year reign of
human champion Marion Tinsley
• Chess
– 1997, Deep Blue beat human
champion Garry Kasparov in a six-game
match
– Deep Blue searches 200M
positions/second, up to 40 ply
– Now looking at other applications
(molecular dynamics, drug synthesis)
• Go
– 2008, MoGo running on 25 nodes
(800 cores) beat Myungwan Kim
– $2M prize available for first computer
program to defeat a top player
Question
