2. Lecture Outline
Dr. Hanif Durad
Models for Communication
Brief introduction to MPI
Basic concepts
Learn 6 most commonly used functions
Introduce “collective” operations
IntroMPI.ppt
3. MPI Basic Send/Receive
We need to fill in the details of the basic send/receive exchange.
Things that need specifying:
How will “data” be described?
How will processes be identified?
How will the receiver recognize/screen messages?
What will it mean for these operations to complete?
Figure: Process 0 calls Send(data); Process 1 calls Receive(data).
4. Some Basic Concepts
Processes can be collected into groups
Each message is sent in a context, and must be received
in the same context
Provides necessary support for libraries
A group and context together form a communicator
A process is identified by its rank in the group associated
with a communicator
There is a default communicator whose group contains all
initial processes, called MPI_COMM_WORLD
5. MPI Datatypes
The data in a message to send or receive is described by a triple
(address, count, datatype), where
An MPI datatype is recursively defined as:
predefined, corresponding to a data type from the language (e.g., MPI_INT,
MPI_DOUBLE)
a contiguous array of MPI datatypes
a strided block of datatypes
an indexed array of blocks of datatypes
an arbitrary structure of datatypes
There are MPI functions to construct custom datatypes, in
particular ones for subarrays
May hurt performance if datatypes are complex
6. MPI Tags
Messages are sent with an accompanying user-defined
integer tag, to assist the receiving process in identifying
the message
Messages can be screened at the receiving end by
specifying a specific tag, or not screened by specifying
MPI_ANY_TAG as the tag in a receive
Some non-MPI message-passing systems have called
tags “message types”. MPI calls them tags to avoid
confusion with datatypes
7. Blocking Point-to-Point Communication (1/2)
MPI_Send()
Basic blocking send operation. Routine returns only after the
application buffer in the sending task is free for reuse.
MPI_Recv()
Receive a message and block until the requested data is
available in the application buffer in the receiving task.
MPI_Ssend()
Synchronous blocking send. Returns only after the matching receive has started.
Comm.ppt
8. Blocking Point-to-Point Communication (2/2)
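MPI_Bsend()
Buffered blocking send. Returns once the message has been copied into a user-attached buffer.
MPI_Rsend()
Blocking ready send. The matching receive must already be posted; use with great care.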
MPI_Sendrecv()
Send a message and post a receive before blocking. Will block
until the sending application buffer is free for reuse and until the
receiving application buffer contains the received message.
9. MPI Basic (Blocking) Send
MPI_SEND(start, count, datatype, dest, tag, comm)
The message buffer is described by (start, count, datatype).
The target process is specified by dest, which is the rank of the target process in
the communicator specified by comm.
When this function returns, the data has been delivered to the system and the
buffer can be reused.
Important: The message may not have been received by the target process.
Example: process 0 sends the 10-element array A with MPI_Send( A, 10, MPI_DOUBLE, 1, … ); process 1 receives it into the 20-element array B with MPI_Recv( B, 20, MPI_DOUBLE, 0, … ).
10. MPI Basic (Blocking) Receive
MPI_RECV(start, count, datatype, source, tag, comm, status)
Waits until a matching (both source and tag) message is received from the
system, and the buffer can be used
source is rank in communicator specified by comm, or MPI_ANY_SOURCE
tag is a tag to be matched on or MPI_ANY_TAG
receiving fewer than count occurrences of datatype is OK, but receiving more
is an error
status contains further information (e.g. size of message)
12. A Simple MPI Program (C)
#include "mpi.h"
#include <stdio.h>

int main( int argc, char *argv[] )
{
    int rank, buf;
    MPI_Status status;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    /* Process 0 sends and Process 1 receives (run with at least 2 processes) */
    if (rank == 0) {
        buf = 123456;
        MPI_Send( &buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD );
    }
    else if (rank == 1) {
        MPI_Recv( &buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status );
        printf( "Received %d\n", buf );
    }

    MPI_Finalize();
    return 0;
}
Program name: blocking.c
13. A Simple MPI Program (Fortran)
program main
    include 'mpif.h'
    integer rank, buf, ierr, status(MPI_STATUS_SIZE)

    call MPI_Init(ierr)
    call MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )

    ! Process 0 sends and Process 1 receives
    if (rank .eq. 0) then
        buf = 123456
        call MPI_Send( buf, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr )
    else if (rank .eq. 1) then
        call MPI_Recv( buf, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr )
        print *, "Received ", buf
    endif

    call MPI_Finalize(ierr)
end
Program name: blocking.f90
14. A Simple MPI Program (C++)
#include "mpi.h"
#include <iostream>

// Uses the MPI-2 C++ bindings (MPI:: namespace), which were removed in MPI-3.0
int main( int argc, char *argv[] )
{
    int rank, buf;

    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();

    // Process 0 sends and Process 1 receives
    if (rank == 0)
    {
        buf = 123456;
        MPI::COMM_WORLD.Send( &buf, 1, MPI::INT, 1, 0 );
    }
    else if (rank == 1)
    {
        MPI::COMM_WORLD.Recv( &buf, 1, MPI::INT, 0, 0 );
        std::cout << "Received " << buf << "\n";
    }

    MPI::Finalize();
    return 0;
}
Program name: blocking.cpp
15. Retrieving Further Information (C)
Status is a data structure allocated in the user’s
program.
In C:
int recvd_tag, recvd_from, recvd_count;
MPI_Status status;
MPI_Recv(..., MPI_ANY_SOURCE, MPI_ANY_TAG, ..., &status);
recvd_tag = status.MPI_TAG;
recvd_from = status.MPI_SOURCE;
MPI_Get_count( &status, datatype, &recvd_count );
17. Retrieving Further Information (C++)
Status is a data structure allocated in the user’s
program.
In C++:
int recvd_tag, recvd_from, recvd_count;
MPI::Status status;
Comm.Recv(..., MPI::ANY_SOURCE, MPI::ANY_TAG, ..., status);
recvd_tag = status.Get_tag();
recvd_from = status.Get_source();
recvd_count = status.Get_count( datatype );
18. Tags and Contexts
Separation of messages used to be accomplished by use of
tags, but
this requires libraries to be aware of tags used by other libraries.
this can be defeated by use of “wild card” tags.
Contexts are different from tags
no wild cards allowed
allocated dynamically by the system when a library sets up a
communicator for its own use.
User-defined tags are still provided in MPI for user
convenience in organizing applications
19. Homework 1
We have just used MPI_Send() and MPI_Recv()
Try to use the other blocking functions listed
20. Flavors of message passing
Synchronous: used for routines that return when
the message transfer has been completed
Synchronous send waits until the complete message can
be accepted by the receiving process before sending the
message (send suspends until receive)
Synchronous receive will wait until the message it is
expecting arrives (receive suspends until message sent)
Also called blocking
Figure: three-way exchange between A and B: request to send, acknowledgement, message.
lecture2.ppt
21. Synchronous send() and recv() using 3-way protocol (1/2)
Figure (a): send() occurs before recv(). Process 1 calls send(), issues a request to send, and suspends. When Process 2 reaches recv() it returns an acknowledgment, the message is transferred, and both processes continue.
slides2.ppt
22. Synchronous send() and recv() using 3-way protocol (2/2)
Figure (b): recv() occurs before send(). Process 2 calls recv() and suspends. When Process 1 reaches send() it issues a request to send, Process 2 returns an acknowledgment, the message is transferred, and both processes continue.
23. Nonblocking message passing
Nonblocking sends return whether or not the message has
been received
If the receiving process is not ready, the message may be stored in
a message buffer
Message buffer used to hold messages being sent by A prior to
being accepted by receive in B
MPI:
routines that use a message buffer and return after their local
actions complete are blocking (even though message transfer
may not be complete)
Routines that return immediately are non-blocking
Figure: process A sends to process B through a message buffer.
24. 4 Communication Modes in MPI (1/3)
Standard mode
Not assumed that corresponding receive routine has
started.
Amount of buffering not defined by MPI. If buffering
provided, send could complete before receive reached
Buffered (asynchronous) mode
Send may start and return before a matching receive.
Necessary to specify buffer space via routine
MPI_Buffer_attach().
lecture4.ppt/slides2.ppt
25. Communication Modes in MPI (2/3)
Synchronous mode
Send and receive can start before each other but can only
complete together
Ready mode
Send can only start if the matching receive has already been posted;
otherwise the send is erroneous. Use with care
26. Communication Modes in MPI (3/3)
Each of the four modes can be applied to both
blocking and nonblocking send routines.
Only the standard mode is available for the
blocking and nonblocking receive routines.
Any type of send routine can be used with any
type of receive routine.
27. A Real Blocking Program (1/3)
#include "mpi.h"
#include <iostream>

#define MSGLEN 2048

int main(int argc, char *argv[])
{
    int ITAG_A = 100, ITAG_B = 200;
    int irank, i, idest, isrc, istag, iretag;
    float rmsg1[MSGLEN];
    float rmsg2[MSGLEN];
    MPI::Status recv_status;

    MPI::Init(argc, argv);
    irank = MPI::COMM_WORLD.Get_rank();
Program name: deadlock.cpp
28. A Real Blocking Program (2/3)
    // C++ arrays are indexed from 0 to MSGLEN-1
    for (i = 0; i < MSGLEN; i++)
    {
        rmsg1[i] = 100;
        rmsg2[i] = -100;
    }

    // Each task sends to the other and expects the other's tag in return
    if ( irank == 0 )
    {
        idest = 1;
        isrc = 1;
        istag = ITAG_A;
        iretag = ITAG_B;
    }
    else if ( irank == 1 )
    {
        idest = 0;
        isrc = 0;
        istag = ITAG_B;
        iretag = ITAG_A;
    }
29. A Real Blocking Program (3/3)
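    std::cout << "Task " << irank << " is sending the message"
              << std::endl;
    // Both tasks call the synchronous Ssend first and wait for a receive
    // that the other task never reaches, so the program deadlocks here
    MPI::COMM_WORLD.Ssend(rmsg1, MSGLEN, MPI::FLOAT,
                          idest, istag);
    MPI::COMM_WORLD.Recv(rmsg2, MSGLEN, MPI::FLOAT,
                         isrc, iretag, recv_status);
    std::cout << "Task " << irank << " has received the message"
              << std::endl;
    MPI::Finalize();
    return 0;
}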
30. Nonblocking Point-to-Point Communication (1/2)
MPI_Isend(), MPI_Irecv()
Identify the send/receive buffer and return immediately, so
computation can proceed. A communication request handle is returned for
tracking the pending operation. The program must call
MPI_Wait or MPI_Test to determine when the
operation completes; the buffer must not be reused before then.
MPI_Issend(), MPI_Ibsend(), MPI_Irsend()
Non-blocking versions of the synchronous, buffered, and ready sends
31. Nonblocking Point-to-Point Communication (2/2)
MPI_Test(), MPI_Testany(), MPI_Testall(), MPI_Testsome()
Check, without blocking, whether the specified non-blocking send or receive
operation(s) have completed
MPI_Wait(), MPI_Waitany(), MPI_Waitall(),
MPI_Waitsome()
blocks until a specified non-blocking send or receive operation
has completed
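MPI_Probe(), MPI_Iprobe()
MPI_Probe() blocks until a matching message is available, without receiving it.
MPI_Iprobe() performs the same test without blocking.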