MPI

Speaker notes
  • Can work with shared memory architectures also
  • Why MPI is still being used
  • Many vendors can compete for providing better implementation
  • Boring topic. But fundamental for understanding the basics
  • Safe – different libraries can work together
  • Different return codes for different functions
  • To start coding we need to use these functions.
  • Mention the case where the buffer space might not be available
  • Buffer used only during Buffered mode communication.
  • Ready call indicates to the system that a receive has already been posted.
  • Built in collective operations. Reduce, Bcast, Datatypes
  • Transcript

    • 1. MPI Rohit Banga Prakher Anand K Swagat Manoj Gupta Advanced Computer Architecture Spring, 2010
    • 2. ORGANIZATION
      • Basics of MPI
      • Point to Point Communication
      • Collective Communication
      • Demo
    • 3. GOALS
      • Explain basics of MPI
      • Start coding today!
      • Keep It Short and Simple
    • 4. MESSAGE PASSING INTERFACE
      • A message passing library specification
        • Extended message-passing model
        • Not a language or compiler specification
        • Not a specific implementation; several implementations exist (like pthreads)
      • standard for distributed memory, message passing, parallel computing
      • Distributed Memory – Shared Nothing approach!
      • Some interconnection technology – TCP, INFINIBAND (on our cluster)
    • 5. GOALS OF MPI SPECIFICATION
      • Provide source code portability
      • Allow efficient implementations
      • Flexible to port different algorithms on different hardware environments
      • Support for heterogeneous architectures – processors not identical
    • 6. REASONS FOR USING MPI
      • Standardization – virtually all HPC platforms
      • Portability – same code runs on another platform
      • Performance – vendor implementations should exploit native hardware features
      • Functionality – 115 routines
      • Availability – a variety of implementations available
    • 7. BASIC MODEL
      • Communicators and Groups
      • Group
        • ordered set of processes
        • each process is associated with a unique integer rank
        • rank from 0 to (N-1) for N processes
        • an object in system memory accessed by handle
        • MPI_GROUP_EMPTY
        • MPI_GROUP_NULL
    • 8. BASIC MODEL (CONTD.)
      • Communicator
        • Group of processes that may communicate with each other
        • MPI messages must specify a communicator
        • An object in memory
        • Handle to access the object
      • There is a default communicator (automatically defined):
      • MPI_COMM_WORLD
      • identifies the group of all processes
    • 9. COMMUNICATORS
      • Intra-Communicator – All processes from the same group
      • Inter-Communicator – Processes picked up from several groups
    • 10. COMMUNICATOR AND GROUPS
      • For a programmer, group and communicator are one
      • Allow you to organize tasks, based upon function, into task groups
      • Enable Collective Communications (later) operations across a subset of related tasks
      • safe communications
      • Many Communicators at the same time
      • Dynamic – can be created and destroyed at run time
      • Process may be in more than one group/communicator – unique rank in every group/communicator
      • implementing user defined virtual topologies
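      As an illustration (not from the deck), sub-communicators like the task groups above can be created with the standard MPI_Comm_split call; a minimal C sketch splitting MPI_COMM_WORLD into an even-rank and an odd-rank communicator:

      /* Minimal sketch (not in the deck): split MPI_COMM_WORLD into an
         even-rank and an odd-rank sub-communicator. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int world_rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

          int color = world_rank % 2;      /* same color => same new communicator */
          MPI_Comm subcomm;
          MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &subcomm);

          int sub_rank;
          MPI_Comm_rank(subcomm, &sub_rank);  /* a different rank in each communicator */
          printf("world rank %d has rank %d in sub-communicator %d\n",
                 world_rank, sub_rank, color);

          MPI_Comm_free(&subcomm);
          MPI_Finalize();
          return 0;
      }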
    • 11. VIRTUAL TOPOLOGIES
      • coord (0,0): rank 0
      • coord (0,1): rank 1
      • coord (1,0): rank 2
      • coord (1,1): rank 3
      • Attach graph topology information to an existing communicator
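      A minimal sketch (not from the deck) of building the 2x2 Cartesian grid shown above with the standard MPI_Cart_create / MPI_Cart_coords calls; it assumes the program is started with exactly 4 processes:

      /* Minimal sketch (not in the deck): the 2x2 Cartesian grid from the slide.
         Assumes the program is started with exactly 4 processes. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int dims[2]    = {2, 2};   /* 2 x 2 grid */
          int periods[2] = {0, 0};   /* no wrap-around in either dimension */
          MPI_Comm cart;
          MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

          if (cart != MPI_COMM_NULL) {
              int rank, coords[2];
              MPI_Comm_rank(cart, &rank);
              MPI_Cart_coords(cart, rank, 2, coords);
              printf("coord (%d,%d): rank %d\n", coords[0], coords[1], rank);
              MPI_Comm_free(&cart);
          }

          MPI_Finalize();
          return 0;
      }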
    • 12. SEMANTICS
      • Header file
        • #include <mpi.h> (C)
        • include 'mpif.h' (Fortran)
        • bindings also exist for other languages (Java, Python, etc.)
      Format: rc = MPI_Xxxxx(parameter, ...)
      Example: rc = MPI_Bsend(&buf, count, type, dest, tag, comm)
      Error code: returned as "rc"; MPI_SUCCESS if successful
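      A minimal sketch (not from the deck) of checking a return code against MPI_SUCCESS; note that with the default error handler (MPI_ERRORS_ARE_FATAL) MPI usually aborts on failure before the check runs, so this is purely illustrative:

      /* Minimal sketch (not in the deck): checking an MPI return code. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          int rc = MPI_Init(&argc, &argv);
          if (rc != MPI_SUCCESS) {
              fprintf(stderr, "MPI_Init failed with code %d\n", rc);
              MPI_Abort(MPI_COMM_WORLD, rc);
          }
          MPI_Finalize();
          return 0;
      }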
    • 13. MPI PROGRAM STRUCTURE
    • 14. MPI FUNCTIONS – MINIMAL SUBSET
      • MPI_Init – Initialize MPI
      • MPI_Comm_size – size of group associated with the communicator
      • MPI_Comm_rank – identify the process
      • MPI_Send
      • MPI_Recv
      • MPI_Finalize
        • We will discuss simple ones first
    • 15. CLASSIFICATION OF MPI ROUTINES
      • Environment Management
        • MPI_Init, MPI_Finalize
      • Point-to-Point Communication
        • MPI_Send, MPI_Recv
      • Collective Communication
        • MPI_Reduce, MPI_Bcast
      • Information on the Processes
        • MPI_Comm_rank, MPI_Get_processor_name
    • 16. MPI_INIT
      • All MPI programs call this before using other MPI functions
        • int MPI_Init(int *pargc, char ***pargv);
      • Must be called in every MPI program
      • Must be called only once and before any other MPI functions are called
      • Pass command line arguments to all processes
      int main(int argc, char **argv) { MPI_Init(&argc, &argv); … }
    • 17. MPI_COMM_SIZE
      • Number of processes in the group associated with a communicator
        • int MPI_Comm_size(MPI_Comm comm, int *psize);
      • Find out number of processes being used by your application
      int main(int argc, char **argv) { MPI_Init(&argc, &argv); int p; MPI_Comm_size(MPI_COMM_WORLD, &p); … }
    • 18. MPI_COMM_RANK
      • Rank of the calling process within the communicator
      • Unique Rank between 0 and (p-1)
      • Can be called task ID
        • int MPI_Comm_rank(MPI_Comm comm, int *rank);
      • Unique rank for a process in each communicator it belongs to
      • Used to identify work for the processor
      int main(int argc, char **argv) { MPI_Init(&argc, &argv); int p; MPI_Comm_size(MPI_COMM_WORLD, &p); int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); … }
    • 19. MPI_FINALIZE
      • Terminates the MPI execution environment
      • Last MPI routine to be called in any MPI program
        • int MPI_Finalize(void);
      int main(int argc, char **argv) { MPI_Init(&argc, &argv); int p; MPI_Comm_size(MPI_COMM_WORLD, &p); int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); printf("no. of processors: %d\n rank: %d\n", p, rank); MPI_Finalize(); }
    • 20.  
    • 21. HOW TO COMPILE THIS
      • Open MPI implementation on our Cluster
      • mpicc -o test_1 test_1.c
      • Like gcc only
      • mpicc not a special compiler
        • $mpicc: gcc: no input files
        • MPI is implemented just like any other library
        • mpicc is just a wrapper around gcc that adds the required compile and link flags
    • 22. HOW TO RUN THIS
      • mpirun -np X test_1
      • Will run X copies of program in your current run time environment
      • The -np option specifies the number of copies of the program
    • 23. MPIRUN
      • Only rank 0 process can receive standard input.
        • mpirun redirects standard input of all others to /dev/null
        • Open MPI redirects standard input of mpirun to standard input of rank 0 process
      • Node which invoked mpirun need not be the same as the node for the MPI_COMM_WORLD rank 0 process
      • mpirun directs standard output and error of remote nodes to the node that invoked mpirun
      • SIGTERM, SIGKILL kill all processes in the communicator
      • SIGUSR1, SIGUSR2 propagated to all processes
      • All other signals ignored
    • 24. A NOTE ON IMPLEMENTATION
      • I want to implement my own version of MPI
      • Evidence
      (diagram: each process calls MPI_Init, which starts an MPI thread)
    • 25. SOME MORE FUNCTIONS
      • int MPI_Initialized(int *flag)
        • Checks whether MPI_Init has already been called
        • Why?
      • double MPI_Wtime(void)
        • Returns elapsed wall clock time in seconds (double precision) on the calling processor
      • double MPI_Wtick(void)
        • Returns the resolution in seconds (double precision) of MPI_Wtime()
      • Message Passing Functionality
        • That is what MPI is meant for!
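      A minimal sketch (not from the deck) of timing a region with MPI_Wtime and reporting the clock resolution with MPI_Wtick:

      /* Minimal sketch (not in the deck): timing a region with MPI_Wtime. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          double t0 = MPI_Wtime();      /* wall-clock time in seconds */
          /* ... work to be timed goes here ... */
          double t1 = MPI_Wtime();

          if (rank == 0)
              printf("elapsed: %f s (clock resolution: %f s)\n", t1 - t0, MPI_Wtick());

          MPI_Finalize();
          return 0;
      }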
    • 26. POINT TO POINT COMMUNICATION
    • 27. POINT-TO-POINT COMMUNICATION
      • Communication between 2 and only 2 processes
      • One sending and one receiving
      • Types
        • Synchronous send
        • Blocking send / blocking receive
        • Non-blocking send / non-blocking receive
        • Buffered send
        • Combined send/receive
        • "Ready" send
    • 28. POINT-TO-POINT COMMUNICATION
      • Processes can be collected into groups
      • Each message is sent in a context, and
      • must be received in the same context
      • A group and context together form a Communicator
      • A process is identified by its rank in the group
      • associated with a communicator
      • Messages are sent with an accompanying user defined integer tag, to assist the receiving process in identifying the message
      • MPI_ANY_TAG
    • 29. POINT-TO-POINT COMMUNICATION
      • How is “data” described?
      • How are processes identified?
      • How does the receiver recognize messages?
      • What does it mean for these operations to complete?
    • 30. BLOCKING SEND/RECEIVE
      • int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm communicator)
      • buf: pointer to the data to send
      • count: number of elements in the buffer
      • datatype: the type of the elements in the buffer
    • 33.  
    • 34. BLOCKING SEND/RECEIVE
      • int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm communicator)
      • buf: pointer to the data to send
      • count: number of elements in the buffer
      • datatype: the type of the elements in the buffer
      • dest: rank of the receiver
      • tag: the label of the message
      • communicator: the set of processes involved (e.g. MPI_COMM_WORLD)
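      A minimal sketch (not from the deck) pairing the MPI_Send arguments above with a matching MPI_Recv; run with at least 2 processes:

      /* Minimal sketch (not in the deck): rank 0 sends one int to rank 1.
         Run with at least 2 processes. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int value = 42, tag = 0;

          if (rank == 0) {
              MPI_Send(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
          } else if (rank == 1) {
              MPI_Status status;
              MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
              printf("rank 1 received %d\n", value);
          }

          MPI_Finalize();
          return 0;
      }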
    • 37. BLOCKING SEND/RECEIVE (CONTD.) (diagram: data travels from the sender's application buffer through the system buffers on both processors to the receiver's application buffer)
    • 38. A WORD ABOUT SPECIFICATION
      • The user does not know if MPI implementation:
        • copies the buffer into an internal buffer, starts the communication, and returns control before all the data are transferred (buffering)
        • creates a link to the destination, sends the data, and returns control when all the data are sent (but NOT received)
        • uses a combination of the above methods
    • 39. BLOCKING SEND/RECEIVE (CONTD.)
      • "return" after it is safe to modify the application buffer
      • Safe
        • modifications will not affect the data intended for the receive task
        • does not imply that the data was actually received
      • Blocking send can be synchronous which means there is handshaking occurring with the receive task to confirm a safe send
      • A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receive
      • A blocking receive only "returns" after the data has arrived and is ready for use by the program
    • 40. NON-BLOCKING SEND/RECEIVE
      • return almost immediately
      • simply "request" the MPI library to perform the operation when it is able
      • Cannot predict when that will happen
      • request a send/receive and start doing other work!
      • unsafe to modify the application buffer (your variable space) until you know that the non-blocking operation has been completed
      • MPI_Isend (&buf,count,datatype,dest,tag,comm,&request)
      • MPI_Irecv (&buf,count,datatype,source,tag,comm,&request)
    • 41. NON-BLOCKING SEND/RECEIVE (CONTD.) (diagram: data travels from the sender's application buffer through the system buffers on both processors to the receiver's application buffer)
    • 42.
      • To check whether non-blocking send/receive operations have completed
      • int MPI_Irecv(void *buf, int count, MPI_Datatype type, int source, int tag, MPI_Comm comm, MPI_Request *req);
      • int MPI_Wait(MPI_Request *req, MPI_Status *status);
        • A call to this subroutine causes the code to wait until the communication pointed to by req is complete
        • req: input/output, identifier associated with a communication event (initiated by MPI_ISEND or MPI_IRECV)
      NON-BLOCKING SEND/RECEIVE (CONTD.)
    • 43.
      • int MPI_Test(MPI_Request *req, int *flag, MPI_Status *status);
        • A call to this subroutine sets flag to true if the communication pointed to by req is complete, and to false otherwise (see the sketch below).
      NON-BLOCKING SEND/RECEIVE (CONTD.)
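      A minimal sketch (not from the deck) combining MPI_Isend/MPI_Irecv with MPI_Wait on the sender and MPI_Test polling on the receiver; run with at least 2 processes:

      /* Minimal sketch (not in the deck): MPI_Isend/MPI_Irecv completed with
         MPI_Wait on the sender and MPI_Test polling on the receiver.
         Run with at least 2 processes. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int value = 0, tag = 0;
          MPI_Request req;
          MPI_Status status;

          if (rank == 0) {
              value = 99;
              MPI_Isend(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &req);
              /* ... useful work can overlap the transfer here ... */
              MPI_Wait(&req, &status);             /* now it is safe to reuse value */
          } else if (rank == 1) {
              MPI_Irecv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &req);
              int done = 0;
              while (!done)
                  MPI_Test(&req, &done, &status);  /* poll; could do work in between */
              printf("rank 1 received %d\n", value);
          }

          MPI_Finalize();
          return 0;
      }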
    • 44. STANDARD MODE
      • Returns when Sender is free to access and overwrite the send buffer.
      • Might be copied directly into the matching receive buffer, or might be copied into a temporary system buffer.
      • Message buffering decouples the send and receive operations.
      • Message buffering can be expensive.
      • It is up to MPI to decide whether outgoing messages will be buffered
      • The standard mode send is non-local.
    • 45. SYNCHRONOUS MODE
      • Send can be started whether or not a matching receive was posted.
      • Send completes successfully only if a corresponding receive was already posted and has already started to receive the message sent.
      • Blocking send & Blocking receive in synchronous mode.
      • Simulate a synchronous communication.
      • Synchronous send is non-local.
    • 46. BUFFERED MODE
      • Send operation can be started whether or not a matching receive has been posted.
      • It may complete before a matching receive is posted.
      • Operation is local.
      • MPI must buffer the outgoing message.
      • Error will occur if there is insufficient buffer space.
      • The amount of available buffer space is controlled by the user.
    • 47. BUFFER MANAGEMENT
      • int MPI_Buffer_attach( void* buffer, int size)
      • Provides to MPI a buffer in the user's memory to be used for buffering outgoing messages.
      • int MPI_Buffer_detach( void* buffer_addr, int* size)
      • Detach the buffer currently associated with MPI.
      MPI_Buffer_attach(malloc(BUFFSIZE), BUFFSIZE);
      /* a buffer of BUFFSIZE bytes can now be used by MPI_Bsend */
      MPI_Buffer_detach(&buff, &size);  /* buffer size reduced to zero */
      MPI_Buffer_attach(buff, size);    /* buffer of BUFFSIZE bytes available again */
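      A minimal sketch (not from the deck) of a complete buffered send using MPI_Buffer_attach, MPI_Bsend and MPI_Buffer_detach; run with at least 2 processes:

      /* Minimal sketch (not in the deck): a buffered send with a user-attached
         buffer. Run with at least 2 processes. */
      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int value = 7, tag = 0;

          if (rank == 0) {
              /* buffer must hold the message plus MPI's bookkeeping overhead */
              int bufsize = sizeof(int) + MPI_BSEND_OVERHEAD;
              void *buffer = malloc(bufsize);
              MPI_Buffer_attach(buffer, bufsize);

              MPI_Bsend(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);

              MPI_Buffer_detach(&buffer, &bufsize);  /* waits for buffered data to drain */
              free(buffer);
          } else if (rank == 1) {
              MPI_Status status;
              MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
              printf("rank 1 received %d\n", value);
          }

          MPI_Finalize();
          return 0;
      }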
    • 48. READY MODE
      • A send may be started only if the matching receive is already posted.
      • The user must be sure of this.
      • If the receive is not already posted, the operation is erroneous and its outcome is undefined.
      • Completion of the send operation does not depend on the status of a matching receive.
      • Merely indicates that the send buffer can be reused.
      • Ready-send could be replaced by a standard-send with no effect on the behavior of the program other than performance.
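      A minimal sketch (not from the deck) of a ready-mode send; here a barrier is used as one simple way to guarantee the receive is posted before MPI_Rsend starts. Run with at least 2 processes:

      /* Minimal sketch (not in the deck): ready-mode send.
         Run with at least 2 processes. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int value = (rank == 0) ? 5 : 0;
          int tag = 0;
          MPI_Request req;
          MPI_Status status;

          if (rank == 1)
              MPI_Irecv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &req);

          MPI_Barrier(MPI_COMM_WORLD);   /* after this, rank 1's receive is posted */

          if (rank == 0) {
              MPI_Rsend(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
          } else if (rank == 1) {
              MPI_Wait(&req, &status);
              printf("rank 1 received %d\n", value);
          }

          MPI_Finalize();
          return 0;
      }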
    • 49. ORDER AND FAIRNESS
      • Order:
        • MPI messages are non-overtaking.
        • When a receive matches 2 messages.
        • When a sent message matches 2 receive statements.
        • Message-passing code is deterministic, unless the processes are multi-threaded or the wild-card MPI_ANY_SOURCE is used in a receive statement.
      • Fairness:
        • MPI does not guarantee fairness
        • Example: task 0 sends a message to task 2. However, task 1 sends a competing message that matches task 2's receive. Only one of the sends will complete.
    • 50. EXAMPLE OF NON-OVERTAKING MESSAGES.
      CALL MPI_COMM_RANK(comm, rank, ierr)
      IF (rank.EQ.0) THEN
          CALL MPI_BSEND(buf1, count, MPI_REAL, 1, tag, comm, ierr)
          CALL MPI_BSEND(buf2, count, MPI_REAL, 1, tag, comm, ierr)
      ELSE    ! rank.EQ.1
          CALL MPI_RECV(buf1, count, MPI_REAL, 0, MPI_ANY_TAG, comm, status, ierr)
          CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag, comm, status, ierr)
      END IF
    • 51. EXAMPLE OF INTERTWINED MESSAGES.
      CALL MPI_COMM_RANK(comm, rank, ierr)
      IF (rank.EQ.0) THEN
          CALL MPI_BSEND(buf1, count, MPI_REAL, 1, tag1, comm, ierr)
          CALL MPI_SSEND(buf2, count, MPI_REAL, 1, tag2, comm, ierr)
      ELSE    ! rank.EQ.1
          CALL MPI_RECV(buf1, count, MPI_REAL, 0, tag2, comm, status, ierr)
          CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag1, comm, status, ierr)
      END IF
    • 52. DEADLOCK EXAMPLE
      CALL MPI_COMM_RANK(comm, rank, ierr)
      IF (rank.EQ.0) THEN
          CALL MPI_RECV(recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr)
          CALL MPI_SEND(sendbuf, count, MPI_REAL, 1, tag, comm, ierr)
      ELSE    ! rank.EQ.1
          CALL MPI_RECV(recvbuf, count, MPI_REAL, 0, tag, comm, status, ierr)
          CALL MPI_SEND(sendbuf, count, MPI_REAL, 0, tag, comm, ierr)
      END IF
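      The deck's example is Fortran; as an illustration (not from the deck), the same deadlock can be avoided in C by letting MPI pair the two operations with MPI_Sendrecv:

      /* Minimal sketch (not in the deck): ranks 0 and 1 exchange a value with
         MPI_Sendrecv, so neither blocks waiting for the other's receive. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank < 2) {
              int other = (rank == 0) ? 1 : 0;
              int sendbuf = rank, recvbuf = -1, tag = 0;
              MPI_Status status;

              MPI_Sendrecv(&sendbuf, 1, MPI_INT, other, tag,
                           &recvbuf, 1, MPI_INT, other, tag,
                           MPI_COMM_WORLD, &status);
              printf("rank %d received %d\n", rank, recvbuf);
          }

          MPI_Finalize();
          return 0;
      }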
    • 53. EXAMPLE OF BUFFERING
      CALL MPI_COMM_RANK(comm, rank, ierr)
      IF (rank.EQ.0) THEN
          CALL MPI_SEND(buf1, count, MPI_REAL, 1, tag, comm, ierr)
          CALL MPI_RECV(recvbuf, count, MPI_REAL, 1, tag, comm, status, ierr)
      ELSE    ! rank.EQ.1
          CALL MPI_SEND(sendbuf, count, MPI_REAL, 0, tag, comm, ierr)
          CALL MPI_RECV(buf2, count, MPI_REAL, 0, tag, comm, status, ierr)
      END IF
    • 54. COLLECTIVE COMMUNICATIONS
    • 55. COLLECTIVE ROUTINES
      • Collective routines provide a higher-level way to organize a parallel program.
      • Each process executes the same communication operations.
      • Communications involve the group of processes in a communicator.
      • Groups and communicators can be constructed “by hand” or using topology routines.
      • Tags are not used; different communicators deliver similar functionality.
      • No non-blocking collective operations.
      • Three classes of operations: synchronization, data movement, collective computation.
    • 56. COLLECTIVE ROUTINES (CONTD.)
      • int MPI_Barrier(MPI_Comm comm)
      • Stop processes until all processes within a communicator reach the barrier
      • Occasionally useful in measuring performance
    • 57. COLLECTIVE ROUTINES (CONTD.)
      • int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
      • Broadcast
      • One-to-all communication: same data sent from root process to all others in the communicator
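      A minimal sketch (not from the deck) of MPI_Bcast: the root's value of n reaches every process:

      /* Minimal sketch (not in the deck): the root broadcasts n to every rank. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          int n = 0;
          if (rank == 0)
              n = 1024;                  /* only the root has the value initially */

          MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
          printf("rank %d now has n = %d\n", rank, n);   /* every rank prints 1024 */

          MPI_Finalize();
          return 0;
      }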
    • 58. COLLECTIVE ROUTINES (CONTD.)
      • Reduction
      • The reduction operation allows you to:
        • Collect data from each process
        • Reduce the data to a single value
        • Store the result on the root process (MPI_Reduce)
        • Store the result on all processes (MPI_Allreduce)
      • Reduction functions work with arrays
      • Other operations: product, min, max, logical and, …
      • Internally it is usually implemented with a binary tree
    • 59. COLLECTIVE ROUTINES (CONTD.)
      • int MPI_Reduce / MPI_Allreduce(void *snd_buf, void *rcv_buf, int count, MPI_Datatype type, MPI_Op op, int root, MPI_Comm comm)
      • snd_buf: input array
      • rcv_buf: output array
      • count: number of elements in snd_buf and rcv_buf
      • type: MPI type of snd_buf and rcv_buf
      • op: parallel operation to be performed
      • root: MPI rank of the process storing the result (MPI_Reduce only; MPI_Allreduce takes no root)
      • comm: communicator of the processes involved in the operation
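      A minimal sketch (not from the deck) of MPI_Reduce summing one value per process onto rank 0:

      /* Minimal sketch (not in the deck): sum one value per process onto rank 0. */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);

          int rank, p;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &p);

          int local = rank;              /* each process contributes its own rank */
          int sum = 0;
          MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

          if (rank == 0)
              printf("sum of ranks 0..%d = %d\n", p - 1, sum);   /* p*(p-1)/2 */

          MPI_Finalize();
          return 0;
      }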
    • 60. MPI OPERATIONS
      MPI_Op        operation
      MPI_MIN       minimum
      MPI_SUM       sum
      MPI_PROD      product
      MPI_MAX       maximum
      MPI_LAND      logical and
      MPI_BAND      bitwise and
      MPI_LOR       logical or
      MPI_BOR       bitwise or
      MPI_LXOR      logical xor
      MPI_BXOR      bitwise xor
      MPI_MAXLOC    max value and location
      MPI_MINLOC    min value and location
    • 61. COLLECTIVE ROUTINES (CONTD.)
    • 62. Learn by Examples
    • 63. Parallel Trapezoidal Rule
      Output: estimate of the integral from a to b of f(x) using the trapezoidal rule and n trapezoids.
      Algorithm:
        1. Each process calculates "its" interval of integration.
        2. Each process estimates the integral of f(x) over its interval using the trapezoidal rule.
        3a. Each process != 0 sends its integral to 0.
        3b. Process 0 sums the calculations received from the individual processes and prints the result.
      Notes:
        1. f(x), a, b, and n are all hardwired.
        2. The number of processes (p) should evenly divide the number of trapezoids (n = 1024).
    • 64. Parallelizing the Trapezoidal Rule
      #include <stdio.h>
      #include "mpi.h"
      main(int argc, char** argv) {
          int my_rank;        /* My process rank */
          int p;              /* The number of processes */
          double a = 0.0;     /* Left endpoint */
          double b = 1.0;     /* Right endpoint */
          int n = 1024;       /* Number of trapezoids */
          double h;           /* Trapezoid base length */
          double local_a;     /* Left endpoint for my process */
          double local_b;     /* Right endpoint for my process */
          int local_n;        /* Number of trapezoids for my calculation */
          double integral;    /* Integral over my interval */
          double total;       /* Total integral */
          int source;         /* Process sending its integral */
          int dest = 0;       /* All messages go to 0 */
          int tag = 0;
          MPI_Status status;
    • 65. Continued…
          double Trap(double local_a, double local_b, int local_n, double h);  /* Calculate local integral */

          MPI_Init(&argc, &argv);
          MPI_Barrier(MPI_COMM_WORLD);
          double elapsed_time = -MPI_Wtime();
          MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
          MPI_Comm_size(MPI_COMM_WORLD, &p);

          h = (b-a)/n;        /* h is the same for all processes */
          local_n = n/p;      /* So is the number of trapezoids */
          /* Length of each process' interval of integration = local_n*h.
             So my interval starts at: */
          local_a = a + my_rank*local_n*h;
          local_b = local_a + local_n*h;
          integral = Trap(local_a, local_b, local_n, h);
    • 66. Continued…
          /* Add up the integrals calculated by each process */
          if (my_rank == 0) {
              total = integral;
              for (source = 1; source < p; source++) {
                  MPI_Recv(&integral, 1, MPI_DOUBLE, source, tag, MPI_COMM_WORLD, &status);
                  total = total + integral;
              } /* End for */
          } else
              MPI_Send(&integral, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);

          MPI_Barrier(MPI_COMM_WORLD);
          elapsed_time += MPI_Wtime();

          /* Print the result */
          if (my_rank == 0) {
              printf("With n = %d trapezoids, our estimate\n", n);
              printf("of the integral from %lf to %lf = %lf\n", a, b, total);
              printf("time taken: %lf\n", elapsed_time);
          }
    • 67. Continued…
          /* Shut down MPI */
          MPI_Finalize();
      } /* main */

      double Trap(double local_a, double local_b, int local_n, double h) {
          double integral;    /* Store result in integral */
          double x;
          int i;
          double f(double x); /* function we're integrating */

          integral = (f(local_a) + f(local_b))/2.0;
          x = local_a;
          for (i = 1; i <= local_n-1; i++) {
              x = x + h;
              integral = integral + f(x);
          }
          integral = integral*h;
          return integral;
      } /* Trap */
    • 68. Continued…
      double f(double x) {
          double return_val;
          /* Calculate f(x). Store calculation in return_val. */
          return_val = 4 / (1+x*x);
          return return_val;
      } /* f */
    • 69. Program 2: Each process other than the root generates a random value less than 1 and sends it to the root. The root sums the values up and displays the sum.
    • 70.
      #include <stdio.h>
      #include <mpi.h>
      #include <stdlib.h>
      #include <string.h>
      #include <time.h>

      int main(int argc, char **argv) {
          int myrank, p;
          int tag = 0, dest = 0;
          int i;
          double randIn, randOut;
          int source;
          MPI_Status status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    • 71.
          MPI_Comm_size(MPI_COMM_WORLD, &p);
          if (myrank == 0) {    /* I am the root */
              double total = 0, average = 0;
              for (source = 1; source < p; source++) {
                  MPI_Recv(&randIn, 1, MPI_DOUBLE, source, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
                  printf("Message from root: From %d received number %f\n", source, randIn);
                  total += randIn;
              } /* End for */
              average = total/(p-1);
          } /* End if */
    • 72.
          else {                /* I am other than root */
              srand48((long int) myrank);
              randOut = drand48();
              printf("randout=%f, myrank=%d\n", randOut, myrank);
              MPI_Send(&randOut, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
          } /* End if-else */

          MPI_Finalize();
          return 0;
      }
    • 73. MPI References
      • The Standard itself:
        • at http://www.mpi-forum.org
        • All MPI official releases, in both postscript and HTML
      • Books:
        • Using MPI: Portable Parallel Programming with the Message-Passing Interface, 2nd Edition, by Gropp, Lusk, and Skjellum, MIT Press, 1999. Also Using MPI-2, with R. Thakur
        • MPI: The Complete Reference, 2 vols., MIT Press, 1999.
        • Designing and Building Parallel Programs, by Ian Foster, Addison-Wesley, 1995.
        • Parallel Programming with MPI, by Peter Pacheco, Morgan Kaufmann, 1997.
      • Other information on Web:
        • at http://www.mcs.anl.gov/mpi
        • For man pages of open MPI on the web : http://www.open-mpi.org/doc/v1.4/
        • apropos mpi
    • 74. THANK YOU