Open MPI 2

1. Message-Passing Computing
   More MPI routines:
   • Collective routines
   • Synchronous routines
   • Non-blocking routines
   ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, 2009.
2. Collective message-passing routines
   Routines that send message(s) to a group of processes or receive message(s) from a group of processes.
   Higher efficiency than separate point-to-point routines, although the routines are not absolutely necessary.
3. Collective Communication
   Involves a set of processes, defined by an intra-communicator. Message tags are not present. Principal collective operations:
   • MPI_Bcast() - Broadcast from root to all other processes
   • MPI_Gather() - Gather values for a group of processes
   • MPI_Scatter() - Scatter a buffer in parts to a group of processes
   • MPI_Alltoall() - Send data from all processes to all processes
   • MPI_Reduce() - Combine values on all processes into a single value
   • MPI_Reduce_scatter() - Combine values and scatter the results
   • MPI_Scan() - Compute prefix reductions of data on processes
   • MPI_Barrier() - A means of synchronizing processes by stopping each one until they all have reached a specific “barrier” call
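   Most of these operations are illustrated on the slides that follow; MPI_Scan() is not, so a brief hedged sketch is given here (the contributed value and variable names are assumptions, with mpi.h included and myrank already obtained from MPI_Comm_rank()). MPI_Scan() computes an inclusive prefix reduction, so each process receives the combined result of the contributions from ranks 0 up to its own rank:

      int mycontrib = myrank + 1;   /* illustrative value contributed by this process      */
      int prefix;                   /* result: sum of contributions from ranks 0..myrank   */

      MPI_Scan(&mycontrib, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);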
4. MPI broadcast operation
   Sending the same message to all processes in a communicator.
   Multicast - sending the same message to a defined group of processes.
5. MPI_Bcast parameters
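   The parameters figure is not reproduced here. As a minimal, hedged sketch (not from the slides; the buffer contents and sizes are illustrative only), MPI_Bcast() takes the buffer address, element count, datatype, root rank, and communicator, and after the call every process holds the root's copy of the buffer:

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char *argv[]) {
         int data[4] = {0, 0, 0, 0};
         int myrank, i;

         MPI_Init(&argc, &argv);
         MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

         if (myrank == 0)                       /* only the root fills the buffer */
            for (i = 0; i < 4; i++) data[i] = i * i;

         MPI_Bcast(data, 4, MPI_INT, 0, MPI_COMM_WORLD);  /* buffer, count, type, root, comm */

         printf("Process %d now has data[3] = %d\n", myrank, data[3]);  /* prints 9 everywhere */

         MPI_Finalize();
         return 0;
      }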
6. Basic MPI scatter operation
   Sending each element of an array in the root process to a separate process. The contents of the ith location of the array are sent to the ith process.
7. MPI scatter parameters
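   As with MPI_Bcast, the parameters figure is missing, so the call form below is an annotated sketch (the parameter names are the conventional ones, not taken from the slide); a complete example appears on slide 10. The send-side arguments are significant only at the root, while the receive-side arguments are used by every process:

      MPI_Scatter(sendbuf,   /* address of send buffer (significant only at root)   */
                  sendcount, /* number of elements sent to EACH process             */
                  sendtype,  /* datatype of send elements                           */
                  recvbuf,   /* address of receive buffer (used on every process)   */
                  recvcount, /* number of elements in receive buffer                */
                  recvtype,  /* datatype of receive elements                        */
                  root,      /* rank of sending (root) process                      */
                  comm);     /* communicator                                        */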
8. • The simplest scatter would be as illustrated, in which one element of the array is sent to each different process.
   • The extension provided in the MPI_Scatter() routine is to send a fixed number of contiguous elements to each process.
9. Scattering contiguous groups of elements to each process
10. Example
    In the following code, the size of the send buffer is given by 100 * <number of processes> and 100 contiguous elements are sent to each process:

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {
       int size, *sendbuf, recvbuf[100];                /* for each process */
       MPI_Init(&argc, &argv);                          /* initialize MPI */
       MPI_Comm_size(MPI_COMM_WORLD, &size);
       sendbuf = (int *)malloc(size*100*sizeof(int));
       .
       MPI_Scatter(sendbuf,100,MPI_INT,recvbuf,100,MPI_INT,0,
                   MPI_COMM_WORLD);
       .
       MPI_Finalize();                                  /* terminate MPI */
    }
11. Gather
    Having one process collect individual values from a set of processes.
12. Gather parameters
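    Again, in place of the missing parameters figure, an annotated sketch of the call form (parameter names assumed, not from the slide). Note that recvcount is the number of items expected from each individual process, not the total:

       MPI_Gather(sendbuf,   /* address of send buffer (every process contributes)   */
                  sendcount, /* number of elements sent by each process              */
                  sendtype,  /* datatype of send elements                            */
                  recvbuf,   /* address of receive buffer (significant only at root) */
                  recvcount, /* number of elements received from EACH process        */
                  recvtype,  /* datatype of receive elements                         */
                  root,      /* rank of gathering (root) process                     */
                  comm);     /* communicator                                         */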
13. Gather Example
    To gather items from a group of processes into process 0, using dynamically allocated memory in the root process:

    int data[10];                                   /* data to be gathered from processes */
    int myrank, grp_size, *buf;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);         /* find rank */
    if (myrank == 0) {
       MPI_Comm_size(MPI_COMM_WORLD, &grp_size);    /* find group size */
       buf = (int *)malloc(grp_size*10*sizeof(int)); /* allocate memory */
    }
    MPI_Gather(data,10,MPI_INT,buf,10,MPI_INT,0,MPI_COMM_WORLD);
    /* recvcount is the number of items received from EACH process, so 10, not grp_size*10 */
    …
    MPI_Gather() gathers from all processes, including the root.
14. Reduce
    Gather operation combined with a specified arithmetic/logical operation.
    Example: values could be gathered and then added together by the root.
15. Reduce parameters
16. Reduce - operations

    MPI_Reduce(*sendbuf, *recvbuf, count, datatype, op, root, comm)

    Parameters:
    *sendbuf   send buffer address
    *recvbuf   receive buffer address
    count      number of send buffer elements
    datatype   data type of send elements
    op         reduce operation; several operations, including:
               MPI_MAX   maximum
               MPI_MIN   minimum
               MPI_SUM   sum
               MPI_PROD  product
    root       root process rank for result
    comm       communicator
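    As a brief sketch of the op parameter in use (not from the slides; the contributed value is illustrative, and mpi.h/stdio.h are assumed included with myrank already obtained), each process contributes one value and the root receives the maximum:

       int myvalue = myrank * myrank;   /* illustrative value contributed by this process */
       int maxvalue;                    /* result is defined only on the root             */

       MPI_Reduce(&myvalue, &maxvalue, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);

       if (myrank == 0)
          printf("Maximum contributed value is %d\n", maxvalue);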
17. Sample MPI program with collective routines

    #include "mpi.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>
    #define MAXSIZE 1000

    int main(int argc, char *argv[]) {
       int myid, numprocs, data[MAXSIZE], i, x, low, high, myresult = 0, result;
       char fn[255];
       FILE *fp;

       MPI_Init(&argc, &argv);
       MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
       MPI_Comm_rank(MPI_COMM_WORLD, &myid);

       if (myid == 0) {                          /* open input file and initialize data */
          strcpy(fn, getenv("HOME"));
          strcat(fn, "/MPI/rand_data.txt");
          if ((fp = fopen(fn, "r")) == NULL) {
             printf("Can't open the input file: %s\n\n", fn);
             exit(1);
          }
          for (i = 0; i < MAXSIZE; i++) fscanf(fp, "%d", &data[i]);
       }

       MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);   /* broadcast data */

       x = MAXSIZE/numprocs;                     /* add my portion of data */
       low = myid * x;
       high = low + x;
       for (i = low; i < high; i++)
          myresult += data[i];
       printf("I got %d from %d\n", myresult, myid);

       /* compute global sum */
       MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
       if (myid == 0) printf("The sum is %d.\n", result);

       MPI_Finalize();
    }
18. Collective routines
    General features:
    • Performed on a group of processes, identified by a communicator
    • Substitute for a sequence of point-to-point calls
    • Communications are locally blocking
    • Synchronization is not guaranteed (implementation dependent)
    • Some routines use a root process to originate or receive all data
    • Data amounts must exactly match
    • Many variations to basic categories
    • No message tags are needed
19. Barrier
    Blocks a process until all processes have called it. A synchronous operation.

    MPI_Barrier(comm)
    comm   communicator
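    A common use, sketched here as an assumption rather than taken from the slides, is to line processes up before and after a timed section of code (do_work() is a hypothetical placeholder):

       double start, elapsed;

       MPI_Barrier(MPI_COMM_WORLD);    /* ensure every process has reached this point        */
       start = MPI_Wtime();

       do_work();                      /* hypothetical placeholder for the section timed     */

       MPI_Barrier(MPI_COMM_WORLD);    /* wait until all processes have finished the section */
       elapsed = MPI_Wtime() - start;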
20. Synchronous Message Passing
    Routines that return when the message transfer has completed.

    Synchronous send routine
    • Waits until the complete message can be accepted by the receiving process before sending the message.
    In MPI, the MPI_Ssend() routine.

    Synchronous receive routine
    • Waits until the message it is expecting arrives.
    In MPI, actually the regular MPI_Recv() routine.
21. Synchronous Message Passing
    Synchronous message-passing routines intrinsically perform two actions:
    • They transfer data, and
    • They synchronize processes.
22. Synchronous Ssend() and recv() using 3-way protocol
    [Timing diagrams: a request to send, the message, and an acknowledgment are exchanged between Process 1 and Process 2; whichever process arrives first suspends until the transfer completes, and then both processes continue.]
    (a) When send() occurs before recv()
    (b) When recv() occurs before send()
23. Parameters of synchronous send (same as blocking send)

    MPI_Ssend(buf, count, datatype, dest, tag, comm)

    buf        address of send buffer
    count      number of items to send
    datatype   datatype of each item
    dest       rank of destination process
    tag        message tag
    comm       communicator
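    Putting the two routines together, a minimal sketch (assumed, not from the slides; myrank is taken to have been obtained from MPI_Comm_rank() and a tag of 0 is used) of a synchronous exchange between processes 0 and 1:

       int x = 42;
       MPI_Status status;

       if (myrank == 0) {
          /* does not return until process 1 has begun receiving the message */
          MPI_Ssend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
       } else if (myrank == 1) {
          /* the regular receive: waits until the expected message arrives */
          MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
       }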
24. Asynchronous Message Passing
    • Routines that do not wait for actions to complete before returning. Usually require local storage for messages.
    • More than one version, depending upon the actual semantics for returning.
    • In general, they do not synchronize processes but allow processes to move forward sooner.
    • Must be used with care.
25. MPI Definitions of Blocking and Non-Blocking
    • Blocking - return after their local actions complete, though the message transfer may not have been completed. Sometimes called locally blocking.
    • Non-blocking - return immediately (asynchronous). Non-blocking assumes that the data storage used for the transfer is not modified by subsequent statements before it is used for the transfer; it is left to the programmer to ensure this.
    The blocking/non-blocking terms may have different interpretations in other systems.
26. MPI Nonblocking Routines
    • Non-blocking send - MPI_Isend() - will return "immediately", even before the source location is safe to be altered.
    • Non-blocking receive - MPI_Irecv() - will return even if there is no message to accept.
27. Nonblocking Routine Formats

    MPI_Isend(buf, count, datatype, dest, tag, comm, request)
    MPI_Irecv(buf, count, datatype, source, tag, comm, request)

    Completion is detected by MPI_Wait() and MPI_Test().
    MPI_Wait() waits until the operation has completed and returns then.
    MPI_Test() returns immediately, with a flag set indicating whether the operation had completed at that time.
    To know whether a particular operation has completed, it is accessed through the request parameter.
28. Example
    To send an integer x from process 0 to process 1 and allow process 0 to continue:

    MPI_Request req1;
    MPI_Status status;
    int myrank, msgtag = 0;                          /* any agreed tag value */

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);          /* find rank */
    if (myrank == 0) {
       int x;
       MPI_Isend(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD, &req1);
       compute();                                    /* do other work while the send proceeds */
       MPI_Wait(&req1, &status);                     /* now safe to reuse x */
    } else if (myrank == 1) {
       int x;
       MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
    }
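    Slide 27 also mentions MPI_Test(); as a hedged variant of the example above (do_other_work() is a hypothetical placeholder, and the declarations are the same as before), process 0 could poll for completion instead of blocking in MPI_Wait():

       int flag = 0;

       MPI_Isend(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD, &req1);
       while (!flag) {
          do_other_work();                  /* hypothetical: useful work between polls          */
          MPI_Test(&req1, &flag, &status);  /* flag becomes nonzero once the send has completed */
       }
       /* x may now safely be reused */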
29. How message-passing routines can return before the message transfer has completed
    A message buffer is needed between the source and destination to hold the message:
    [Timing diagram: Process 1 calls send(), the message is copied into a message buffer and the process continues; Process 2 later calls recv() and reads the message from the buffer.]
30. Asynchronous (blocking) routines changing to synchronous routines
    • Message buffers are only of finite length.
    • A point could be reached when a send routine is held up because all available buffer space has been exhausted.
    • Then the send routine will wait until storage becomes available again, i.e. the routine will behave as a synchronous routine.
31. Next topic
    • Parallel Algorithms
    Other MPI features will be introduced as we need them.
