Class Assignment




                  CLASS ASSIGNMENT-01
                    Parallel Searching Algorithms




INTRODUCTION:
Parallel Search, also known as Multithreaded Search or SMP Search, is a way to increase
search speed by using additional processors. The topic has been gaining popularity
recently as multiprocessor computers have become widely available.

A parallel algorithm is an algorithm that can be executed a piece at a time on
many different processing devices, with the partial results combined at the end to give the
correct result.

The cost or complexity of serial algorithms is estimated in terms of the space (memory)
and time (processor cycles) that they take. Parallel algorithms need to optimize one more
resource: the communication between different processors. Parallel processors communicate
in one of two ways, through shared memory or by message passing.
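The two communication styles can be illustrated with a small sketch. It is only an illustration: the function names and the toy task (counting even numbers) are assumptions of this example, and Python threads stand in for processors. In the first function the workers update one shared total under a lock; in the second each worker sends its partial result as a message through a queue.

```python
import queue
import threading

def count_even(chunk):
    """Toy unit of work: count the even numbers in one chunk."""
    return sum(1 for x in chunk if x % 2 == 0)

def count_shared_memory(chunks):
    """Workers add their partial counts to one shared total, guarded by a lock."""
    total = {"n": 0}
    lock = threading.Lock()

    def work(chunk):
        local = count_even(chunk)
        with lock:                      # only one worker updates the total at a time
            total["n"] += local

    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["n"]

def count_message_passing(chunks):
    """Workers share nothing; each sends its partial count as a message."""
    results = queue.Queue()

    def work(chunk):
        results.put(count_even(chunk))  # send the result to the collector

    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results.get() for _ in chunks)

CHUNKS = [range(0, 100), range(100, 200), range(200, 300)]
shared_total = count_shared_memory(CHUNKS)
message_total = count_message_passing(CHUNKS)
```

Both versions give the same count; what differs is where the coordination cost appears, in the lock for shared memory and in the messages for message passing.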

This document gives a brief summary of four types of SMP algorithms, which are classified by
their scalability (the trend in search speed as the number of processors becomes large) and
their speedup (the change in time to complete a search). Typically, programmers use scaling
to mean the change in nodes per second (NPS) rate, and speedup to mean the change in time to
reach a given depth. The algorithms are described briefly below:





ALPHA – BETA SEARCH:

The Alpha-Beta algorithm (Alpha-Beta Pruning, Alpha-Beta Heuristic) is a significant
enhancement to the minimax search algorithm that eliminates the need to search large
portions of the game tree by applying a branch-and-bound technique. Remarkably, it does
this without any risk of overlooking a better move. If one has already found a fairly
good move and is searching for alternatives, a single refutation of an alternative is
enough to reject it. There is no need to look for even stronger refutations.

The algorithm maintains two values, alpha and beta. They represent, respectively, the minimum
score that the maximizing player is assured of and the maximum score that the minimizing
player is assured of.




IMPLEMENTATION:

int alphaBetaMin( int alpha, int beta, int depthleft ); // forward declaration

int alphaBetaMax( int alpha, int beta, int depthleft )
{
    int score;
    if ( depthleft == 0 ) return evaluate();
    for ( all moves )   // pseudocode: make each legal move, search it, then unmake it
    {
        score = alphaBetaMin( alpha, beta, depthleft - 1 );
        if ( score >= beta )
            return beta;   // fail hard beta-cutoff
        if ( score > alpha )
            alpha = score; // alpha acts like max in MiniMax
    }
    return alpha;
}

int alphaBetaMin( int alpha, int beta, int depthleft )
{
    int score;
    // negated on the assumption that evaluate() scores the position for the side to move
    if ( depthleft == 0 ) return -evaluate();
    for ( all moves )   // pseudocode: make each legal move, search it, then unmake it
    {
        score = alphaBetaMax( alpha, beta, depthleft - 1 );
        if ( score <= alpha )
            return alpha;  // fail hard alpha-cutoff
        if ( score < beta )
            beta = score;  // beta acts like min in MiniMax
    }
    return beta;
}
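The pruning effect is easier to see on a concrete tree. What follows is a minimal runnable sketch, not the listing above translated line for line: the nested-list tree, the `visited` list and the function names are assumptions made for this example, and leaf values are taken to be scores from the maximizing player's point of view, so no negation is needed.

```python
import math

def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; a leaf is an int score."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha, beta, maximizing, visited):
    """Fail-hard alpha-beta; `visited` records the leaves actually evaluated."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        for child in node:
            score = alphabeta(child, alpha, beta, False, visited)
            if score >= beta:
                return beta       # beta cutoff: the minimizer will avoid this node
            if score > alpha:
                alpha = score
        return alpha
    for child in node:
        score = alphabeta(child, alpha, beta, True, visited)
        if score <= alpha:
            return alpha          # alpha cutoff: one refutation is enough
        if score < beta:
            beta = score
    return beta

TREE = [[3, 5], [2, 9], [0, 1]]   # hypothetical two-ply game tree
visited = []
best = alphabeta(TREE, -math.inf, math.inf, True, visited)
```

On this tree the search returns the same value as plain minimax while evaluating four of the six leaves: once the first move is known to be worth 3, the replies 2 and 0 each refute their move at once, so the leaves 9 and 1 are never examined.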




JAMBOREE SEARCH:

Jamboree Search was introduced by Bradley Kuszmaul in his 1994 thesis, Synchronized
MIMD Computing. The algorithm is a parallelized version of the Scout search
algorithm. The idea is that the first child is searched on its own, the tests of all the
other children are then done in parallel, and any child whose test fails is searched for
its value sequentially.



Jamboree was used in the massively parallel chess programs StarTech and Socrates. It
sequentializes full-window searches for values because, while the authors are willing to
take a chance that an empty-window search will be squandered work, they are not willing
to take the chance that a full-window search (which does not prune very much) will be
squandered work.



IMPLEMENTATION:

int jamboree(CNode n, int α, int β)
{
    if (n is leaf) return static_eval(n);
    c[ ] = the children of n;
    b = -jamboree(c[0], -β, -α);          // first child: full-window search
    if (b >= β) return b;
    if (b > α) α = b;
    In Parallel: for (i=1; i < |c[ ]|; i++)
    {
        s = -jamboree(c[i], -α - 1, -α);  // empty-window test
        if (s > b) b = s;
        if (s >= β) abort_and_return s;
        if (s > α)                        // test failed: the child needs a value
        {
            s = -jamboree(c[i], -β, -α);  // full-window search, done sequentially
            if (s >= β) abort_and_return s;
            if (s > α) α = s;
            if (s > b) b = s;
        }
    }
    return b;
}
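The structure of the pseudocode can be tried out with a small runnable sketch. Several things here are assumptions of the example rather than part of Jamboree as published: the tree is a nested list whose integer leaves are scores for the side to move, only the tests at the root are handed to a thread pool, there is no abort of running tests, and a test counts as failed when its score exceeds the bound the test was run against. Python threads show the shape of the algorithm but will not give a real speedup on this CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

INF = 10**9  # stands in for an infinite score

def negamax(node):
    """Reference value by exhaustive search."""
    if isinstance(node, int):
        return node
    return max(-negamax(child) for child in node)

def jamboree(node, alpha, beta, pool=None):
    """Scout-style search; with a pool, the tests at this node run concurrently."""
    if isinstance(node, int):
        return node
    best = -jamboree(node[0], -beta, -alpha)      # first child: full window
    if best >= beta:
        return best
    alpha = max(alpha, best)
    test_alpha = alpha                            # bound every test is run against

    def test(child):
        return -jamboree(child, -test_alpha - 1, -test_alpha)  # empty window

    rest = node[1:]
    if pool is not None:
        scores = list(pool.map(test, rest))       # all tests in parallel
    else:
        scores = [test(child) for child in rest]

    for child, s in zip(rest, scores):            # failed tests are valued in order
        best = max(best, s)
        if s >= beta:
            return s
        if s > test_alpha:                        # test failed: get a real value
            s = -jamboree(child, -beta, -alpha)   # sequential full-window search
            if s >= beta:
                return s
            alpha = max(alpha, s)
            best = max(best, s)
    return best

TREE = [[3, [5, -2]], [[-4, 1], 7], [0, [6, -6]]]  # hypothetical game tree

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_value = jamboree(TREE, -INF, INF, pool)
sequential_value = jamboree(TREE, -INF, INF)
```

Running the tests concurrently does not change the result: both calls agree with the exhaustive `negamax` value, because a child is re-searched with the full window whenever its test fails.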



DEPTH – FIRST SEARCH:

We start the graph traversal at an arbitrary vertex and go down a particular branch until
we reach a dead end. Then we back up to the most recent vertex that still has an unexplored
edge and again go as deep as possible. In this way we visit all vertices and all edges
reachable from the starting vertex.




The search is similar to exploring a maze of hallways (edges) and rooms (vertices) with a
string and paint. We fix the string in the starting room and mark the room with the
paint as visited. We then go down an incident hallway into the next room, mark that
room, and go on to the next room, always marking the rooms as visited with the paint. When
we get to a dead end or a room we have already visited, we follow the string back to a room
that has a hallway we have not gone through yet.

This graph traversal is very similar to a preorder or postorder tree traversal; in fact,
if the graph is a tree then the traversal is the same. The algorithm is naturally recursive, just as
the tree traversal is. The algorithm is given here:

IMPLEMENTATION:

Algorithm DFS (graph G, Vertex v)

// Recursive algorithm

label v as explored
for all edges e in G.incidentEdges(v) do
    if edge e is unexplored then
        w = G.opposite(v, e)
        if vertex w is unexplored then
            label e as discovery edge
            recursively call DFS(G, w)
        else
            label e as back edge
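The pseudocode can be made concrete with a short sketch. The adjacency-list dictionary, the `frozenset` edge keys and the sample graph are assumptions of this example; the labels follow the discovery edge and back edge terminology used above.

```python
def dfs(graph, v, visited=None, labels=None):
    """Depth-first search of an undirected graph given as {vertex: [neighbours]}.

    Returns the set of visited vertices and a dict that labels every explored
    edge as "discovery" or "back".
    """
    if visited is None:
        visited, labels = set(), {}
    visited.add(v)                      # paint the room
    for w in graph[v]:
        edge = frozenset((v, w))        # undirected edge, same key from both ends
        if edge in labels:              # hallway already walked
            continue
        if w not in visited:
            labels[edge] = "discovery"
            dfs(graph, w, visited, labels)
        else:
            labels[edge] = "back"       # leads to a room that is already painted
    return visited, labels

# Hypothetical graph: a triangle A-B-C with an extra vertex D attached to B.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}
visited, labels = dfs(GRAPH, "A")
```

Starting from A, the three discovery edges form a spanning tree of the four vertices, and the edge between A and C, which closes the triangle, is labelled as a back edge.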




PVS SEARCH:
The best-known early attempt at searching game trees in parallel was the Principal
Variation Splitting (PVS) algorithm. It was both simple to understand and easy to
implement.




When starting an N-ply search, one processor generates the moves at the root position,
makes the first move (leading to what is often referred to as the left-most descendant
position), then generates the moves at ply 2, makes the first move again, and continues
in this way until reaching ply N.

At this point, the processor pool searches all of the moves at this ply (N) in parallel, and the
best value is backed up to ply N-1. Now that the lower bound for ply N-1 is known, the rest
of the moves at ply N-1 are searched in parallel, and the best value is again backed up, to
ply N-2. This continues until the first root move has been searched and its value is known. The
remaining root moves are then searched in parallel until none are left. The next iteration is
then started and the process repeats for depth N+1.
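The procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the tree is a nested list whose integer leaves are scores for the side to move, the names `pv_split` and `alphabeta` are chosen for this example, the siblings at each ply are searched with the bound known when they are handed out, and Python threads only model the processor pool without giving a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

INF = 10**9  # stands in for an infinite score

def alphabeta(node, alpha, beta):
    """Sequential fail-soft negamax alpha-beta, run by one worker on one subtree."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node:
        score = -alphabeta(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                       # cutoff: the rest cannot matter
    return best

def pv_split(node, alpha, beta, pool):
    """Follow the first move down to the leaves, then split the siblings."""
    if isinstance(node, int):
        return node
    # One thread walks the left-most path first.
    best = -pv_split(node[0], -beta, -alpha, pool)
    alpha = max(alpha, best)
    if alpha >= beta:
        return best
    # A bound is now known, so the remaining moves at this ply are shared out.
    scores = pool.map(lambda child: -alphabeta(child, -beta, -alpha), node[1:])
    return max([best, *scores])         # every worker must finish before backing up

TREE = [[3, [5, -2]], [[-4, 1], 7], [0, [6, -6]]]  # hypothetical game tree

with ThreadPoolExecutor(max_workers=4) as pool:
    split_value = pv_split(TREE, -INF, INF, pool)
sequential_value = alphabeta(TREE, -INF, INF)
```

The final `max` in `pv_split` is the synchronization point: no value is backed up to the previous ply until the slowest sibling search has returned.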

Performance analysis of this algorithm (PVS) produced the speedups given in Table 1 below.

            +--------------+-----+-----+-----+-----+-----+
            | # processors |   1 |   2 |   4 |   8 |  16 |
            +--------------+-----+-----+-----+-----+-----+
            | speedup      | 1.0 | 1.8 | 3.0 | 4.1 | 4.6 |
            +--------------+-----+-----+-----+-----+-----+
                  Table 1: PVS performance results
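One way to read Table 1 is through parallel efficiency, taken here as speedup divided by the number of processors. That definition is a common convention, not something stated in the table's source, and the short calculation below only uses the figures from Table 1.

```python
processors = [1, 2, 4, 8, 16]
speedup = [1.0, 1.8, 3.0, 4.1, 4.6]      # figures from Table 1

# Efficiency: the fraction of each processor that does useful work.
efficiency = [s / p for s, p in zip(speedup, processors)]

for p, s, e in zip(processors, speedup, efficiency):
    print(f"{p:>2} processors: speedup {s:.1f}, efficiency {e:.0%}")
```

Efficiency falls from 90% with two processors to under 30% with sixteen, and doubling from eight to sixteen processors raises the speedup only from 4.1 to 4.6.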







DRAWBACKS:

Firstly,

All of the processors work together at a single node, searching descendant positions in
parallel. If the number of possible moves is small, or the number of processors is large,
some processors have nothing to do.

Secondly,

Not every branch from a given position produces a tree of equal size. Some branches
grow into complicated positions with lots of checks and search extensions that make the
tree very large, while other branches grow into simple positions that are searched quickly.
This leads to a load-balancing problem in which one processor begins searching a very large
tree while the others finish the easy moves and then have to wait for the remaining
processor to slowly traverse its tree.

Thirdly,

Even with a reasonable number of processors, the speedup can look very bad if, most of the
time, many of the processors are waiting on one last node to be completed before they can
back up to ply N-1 and start to work there.
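The load-balancing effect can be put into numbers with a toy model. Everything in it is an assumption made for illustration: each move at the split node has a cost (say, the number of nodes in its subtree), an idle processor takes the next unsearched move, and nobody can back up until the last processor has finished. The cost lists are made up, not measured.

```python
import heapq

def split_speedup(costs, processors):
    """Speedup at one split node: total work divided by the time of the slowest processor."""
    finish = [0] * processors           # time at which each processor becomes free
    heapq.heapify(finish)
    for cost in costs:
        start = heapq.heappop(finish)   # the processor that is idle first takes the move
        heapq.heappush(finish, start + cost)
    makespan = max(finish)              # everyone waits for the last one
    return sum(costs) / makespan

balanced = [10] * 8                     # eight moves with equal subtrees
skewed = [66, 2, 2, 2, 2, 2, 2, 2]      # same total work, one very large subtree

balanced_speedup = split_speedup(balanced, 4)
skewed_speedup = split_speedup(skewed, 4)
```

With four processors the balanced moves give a speedup of 4, while the skewed moves give about 1.2, because three processors sit idle for most of the time the large subtree is being searched.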





