  • Transcript of "Tutorial on Parallel Computing and Message Passing Model - C2"

1. Getting Started with MPI
   Marcirio Silveira Chaves ([email_address])
   Tutorial on Parallel Computing and Message Passing Model, Part II - Day 2
   June 2009, LNEC - DHA - NTI
2. Memories
   - Domain decomposition: data are divided into pieces of approximately the same size and then mapped to different processors.
   - Functional decomposition: the problem is decomposed into a large number of smaller tasks, which are then assigned to the processors as they become available.
   - Distributed memory: each node has rapid access to its own local memory, and accesses the memory of other nodes via some sort of communications network.
   - Shared memory: multiple processor units share access to a global memory space via a high-speed memory bus.
   - Load balancing: dividing the work equally among the available processes.
   - Idle time: the time a process spends waiting for data from other processors.
   - Total execution time does not include compile time.
3. Menu
   - The MPI Standard
   - MPI Goals
   - When and When Not to Use MPI
   - Environment Management Routines
   - Types of MPI Routines
     - Point-to-Point Communications and Messages
       - General Concepts
       - MPI Message Passing Routine Arguments
       - Blocking Message Passing Routines
       - Non-Blocking Message Passing Routines
   - Compiling and Running MPI Programs
   - Nano-Self-Test
4. Introduction
   - Message Passing Interface (MPI)
     - was designed to be a standard implementation of the message-passing model of parallel computing.
     - consists of a set of C functions or Fortran subroutines inserted into source code to perform data communication between processes.
   - MPI itself is not a library.
   - MPI is an interface specification of what such a message-passing library should be, in order to provide a standard for writing message-passing parallel programs.

5. Introduction
   - An MPI program consists of two or more autonomous processes, each executing its own code, which may or may not be identical on a given pair of processes.
   - These processes communicate via calls to MPI communication routines and are identified by their relative rank within a group (0, 1, ..., groupsize-1).
   - MPI-1 does not allow processes to be created dynamically during the execution of a parallel program.
   - You specify the number of processes at the start of your program, and that number remains fixed throughout the entire run.
6. The MPI Standard
   - MPI was developed over two years of discussions led by the MPI Forum: roughly 60 people representing about 40 organizations.
   - The MPI-1 standard was defined in the spring of 1994:
     - It specifies the names, calling sequences, and results of the subroutines and functions to be called from Fortran 77 and C, respectively.
       - All implementations of MPI must conform to these rules, thus ensuring portability.
       - MPI programs should compile and run on any platform that supports the MPI standard.
     - Implementations of the MPI-1 standard are available for a wide variety of platforms.

7. MPI Goals
   - Provide source-code portability: MPI programs should compile and run as-is on any platform.
   - Allow efficient implementations across a range of architectures.

8. Reasons for Using MPI
   - Standardization: MPI is the only message-passing library that can be considered a standard. It has replaced all previous message-passing libraries.
   - Portability: there is no need to modify your source code when you port your application to a different platform that supports (and is compliant with) the MPI standard.
   - Performance opportunities: vendor implementations should be able to exploit native hardware features to optimize performance.
   - Functionality: over 115 routines are defined in MPI-1 alone.
   - Availability: a variety of implementations are available, both vendor-supplied and public domain.

9. When and When Not to Use MPI
   - Use MPI when you need:
     - parallel code that is portable across platforms;
     - higher performance, e.g. when small-scale "loop-level" parallelism does not provide enough speedup.
   - Do not use MPI when you:
     - can achieve sufficient performance and portability with the "loop-level" parallelism available in software such as High Performance Fortran or OpenMP, or in proprietary machine-dependent directives;
     - can use a pre-existing library of parallel routines, which may themselves be written using MPI (see the later section "Parallel Mathematical Libraries");
     - don't need parallelism at all.
10. Programming Model
    - MPI lends itself to virtually any distributed-memory parallel programming model.
      - MPI is also used to implement some shared-memory models, such as data parallelism, on distributed-memory architectures.
    - Hardware platforms:
      - Distributed memory: originally, MPI was targeted at distributed-memory systems.
      - Shared memory: as shared-memory systems became more popular, particularly SMP/NUMA architectures, MPI implementations for these platforms appeared.
      - Hybrid: MPI is now used on just about any common parallel architecture, including massively parallel machines, SMP clusters, workstation clusters, and heterogeneous networks.

11. Programming Model
    - All parallelism is explicit: the programmer is responsible for correctly identifying parallelism and implementing parallel algorithms using MPI constructs.
    - The number of tasks dedicated to running a parallel program is static. New tasks cannot be dynamically spawned during run time.
12. Hands on
13. Getting Started
    - Header file: required for all programs/routines that make MPI library calls.
      - Fortran include file:  include 'mpif.h'
    - Format of MPI calls (Fortran binding):
      - CALL MPI_XXXXX(parameter, ..., ierr)   or   call mpi_xxxxx(parameter, ..., ierr)
      - Example: CALL MPI_BSEND(buf, count, type, dest, tag, comm, ierr)
      - Error code: returned in the "ierr" parameter; MPI_SUCCESS if successful.

14. General MPI Program Structure
    (The original slide shows a structure diagram, not reproduced in this transcript; a minimal sketch follows.)
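    As an illustration of the same flow using only routines introduced on the surrounding slides, here is a minimal Fortran sketch; the file and variable names are illustrative, not from the tutorial.

        ! simple_hello.f90 -- minimal MPI program structure (illustrative sketch)
        program simple_hello
          implicit none
          include 'mpif.h'                                   ! MPI header file (slide 13)
          integer :: ierr, rank, nprocs

          call MPI_INIT(ierr)                                ! initialize the MPI environment
          call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)   ! how many processes?
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)     ! which one am I?

          print *, 'Hello from rank', rank, 'of', nprocs     ! the "work" section goes here

          call MPI_FINALIZE(ierr)                            ! terminate the MPI environment
        end program simple_hello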
15. Communicators and Groups
    - MPI uses objects called communicators and groups to define which collection of processes may communicate with each other.
      - Most MPI routines require a communicator as an argument.
    - For now, simply use MPI_COMM_WORLD whenever a communicator is required.
      - It is the predefined communicator that includes all of your MPI processes.

16. Rank
    - Within a communicator, every process has its own unique integer identifier, assigned by the system when the process initializes.
    - A rank is also called a "task ID".
    - Ranks are contiguous and begin at zero.
    - Ranks are used by the programmer to specify the source and destination of messages.
    - Ranks are often used conditionally by the application to control program execution, e.g. if rank = 0 do this, if rank = 1 do that.
17. Environment Management Routines
    - They are used for an assortment of purposes: initializing and terminating the MPI environment, querying the environment and identity, etc.
    - MPI_Init
      - Initializes the MPI execution environment.
      - Must be called in every MPI program, before any other MPI function, and only once in an MPI program.
      - MPI_INIT (ierr)

18. Environment Management Routines
    - MPI_Comm_size
      - Determines the number of processes in the group associated with a communicator.
      - Generally used with the communicator MPI_COMM_WORLD to determine the number of processes being used by your application.
      - MPI_COMM_SIZE (comm, size, ierr)

19. Environment Management Routines
    - MPI_Comm_rank
      - Determines the rank of the calling process within the communicator.
      - Initially, each process is assigned a unique integer rank between 0 and (number of processes - 1) within the communicator MPI_COMM_WORLD.
      - This rank is often referred to as a task ID.
      - If a process becomes associated with other communicators, it will have a unique rank within each of those as well.
      - MPI_COMM_RANK (comm, rank, ierr)

20. Environment Management Routines
    - MPI_Abort
      - Terminates all MPI processes associated with the communicator.
      - In most MPI implementations it terminates ALL processes, regardless of the communicator specified.
      - MPI_ABORT (comm, errorcode, ierr)
    - MPI_Get_processor_name
      - Returns the processor name and the length of the name.
      - The buffer for "name" must be at least MPI_MAX_PROCESSOR_NAME characters in size.
      - What is returned in "name" is implementation dependent and may not be the same as the output of the "hostname" or "host" shell commands.
      - MPI_GET_PROCESSOR_NAME (name, resultlength, ierr)

21. Environment Management Routines
    - MPI_Initialized
      - Indicates whether MPI_Init has been called; returns a flag as either logical true (1) or false (0).
      - MPI requires that MPI_Init be called once, and only once, by each process.
        - This may pose a problem for modules that want to use MPI and are prepared to call MPI_Init if necessary; MPI_Initialized solves this problem.
      - MPI_INITIALIZED (flag, ierr)
    - MPI_Wtime
      - Returns elapsed wall-clock time in seconds (double precision) on the calling processor.
      - Wall-clock time is the elapsed time between when a process starts to run and when it finishes. It is usually longer than the processor time consumed by the process, because the CPU also does other things, such as running other user and operating-system processes or waiting for disk or network I/O.
      - MPI_WTIME ()

22. Environment Management Routines
    - MPI_Wtick
      - Returns the resolution, in seconds (double precision), of MPI_Wtime.
      - MPI_WTICK ()
    - MPI_Finalize
      - Terminates the MPI execution environment.
      - Should be the last MPI routine called in every MPI program.
      - MPI_FINALIZE (ierr)
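    A small sketch of how MPI_Wtime and MPI_Wtick are typically used to time a section of code; the loop being timed is a placeholder of our own, not from the tutorial.

        ! time_section.f90 -- timing a section of code with MPI_Wtime (illustrative sketch)
        program time_section
          implicit none
          include 'mpif.h'
          integer :: ierr, rank, i
          double precision :: t0, t1, s

          call MPI_INIT(ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          t0 = MPI_WTIME()                 ! wall-clock time before the work
          s = 0.0d0
          do i = 1, 1000000                ! placeholder for real work
             s = s + dble(i)
          end do
          t1 = MPI_WTIME()                 ! wall-clock time after the work

          print *, 'rank', rank, ': elapsed =', t1 - t0, &
                   ' s, timer resolution =', MPI_WTICK(), ' s'

          call MPI_FINALIZE(ierr)
        end program time_section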
24. Any questions, so far?
25. Types of MPI Routines
    - Point-to-point communication
    - Collective communication
    - Process groups
    - Process topologies
    - Environment management and inquiry
26. Point-to-Point Communications and Messages
    - Direct communication between two processors: one sends data and the other receives that same data.
    - It is two-sided, which means that both an explicit send and an explicit receive are required.
      - Data are not transferred without the participation of both processors.

27. Point-to-Point Communications and Messages
    - In a generic send or receive, a message consisting of some block of data is transferred between processors.
    - A message consists of:
      - an envelope, which indicates the source and destination processors, and
      - a body, which contains the actual data to be sent.

28. Point-to-Point Communications and Messages
    - MPI uses three pieces of information to characterize the message body in a flexible way:
      - Buffer: the starting location in memory where outgoing data is to be found (for a send) or where incoming data is to be stored (for a receive).
        - In Fortran, it is simply the name of the array element where the data transfer begins.

29. Point-to-Point Communications and Messages
    - Buffer:
      - In a perfect world, every send operation would be perfectly synchronized with its matching receive.
      - This is rarely the case: somehow or other, the MPI implementation must be able to store data when the two tasks are out of sync.
      - Consider the following two cases:

30. Point-to-Point Communications and Messages
    - Buffer:
      - A send operation occurs 5 seconds before the receive is ready: where is the message while the receive is pending?
      - Multiple sends arrive at the same receiving task, which can only accept one send at a time: what happens to the messages that are "backing up"?
    - The MPI implementation (not the MPI standard) decides what happens to data in these cases.
      - Typically, a system buffer area is reserved to hold data in transit.

31. Buffer (diagram; not reproduced here)

32. Point-to-Point Communications and Messages
    - Buffer: system buffer space is
      - opaque to the programmer and managed entirely by the MPI library;
      - a finite resource that can be easy to exhaust;
      - often mysterious and not well documented;
      - able to exist on the sending side, the receiving side, or both;
      - something that may improve program performance, because it allows send/receive operations to be asynchronous.
    - User-managed address space (i.e. your program variables) is called the application buffer.
    - MPI also provides for a user-managed send buffer.

33. Point-to-Point Communications and Messages
    - MPI uses three pieces of information to characterize the message body in a flexible way (continued):
      - Datatype: the type of data to be sent.
        - In the simplest cases, a basic type such as float (REAL) or int (INTEGER).
        - In more advanced applications, a user-defined datatype built from the basic types.
          - Roughly analogous to C structures, and can describe data located anywhere, i.e. not necessarily in contiguous memory locations.
          - The ability to use user-defined datatypes allows complete flexibility in defining the message content.
      - Count: the number of items of type datatype to be sent.
34. Types of MPI Routines
    - Point-to-point communication
      - Communication Modes and Completion Criteria
      - Blocking and Non-blocking Communication
35. Communication Modes and Completion Criteria
    - MPI provides much flexibility in specifying how messages are to be sent, for example:
      - Synchronous send: defined to be complete when receipt of the message at its destination has been acknowledged.
      - Buffered send: complete when the outgoing data has been copied to a local buffer; nothing at all is implied about the arrival of the message at its destination.

36. Communication Modes and Completion Criteria
    - There are four communication modes available for sends:
      - Standard
      - Synchronous
      - Buffered
      - Ready
    - For receives there is only a single communication mode: a receive is complete when the incoming data has actually arrived and is available for use.

37. Blocking and Non-blocking Communication
    - Blocking communication
      - A send or receive operation that does not return until the send or receive has actually completed.
      - It is then safe to overwrite or read the variable sent or received.
    - Non-blocking communication
      - A send or receive that returns immediately, before the operation has necessarily completed.
      - It allows processes to do other useful work while waiting for communications to occur (overlapping computation and communication).
        - You should later test whether the operation has actually completed.

38. Blocking and Non-blocking Communication
    - Blocking communication
      - A blocking send routine only "returns" after it is safe to modify the application buffer (your send data) for reuse.
        - Safe means that modifications will not affect the data intended for the receive task.
        - Safe does not imply that the data was actually received; it may very well be sitting in a system buffer.

39. Blocking and Non-blocking Communication
    - Blocking communication
      - A blocking send can be synchronous: handshaking occurs with the receive task to confirm a safe send.
      - A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receiver.
      - A blocking receive only "returns" after the data has arrived and is ready for use by the program.

40. Blocking and Non-blocking Communication
    - Non-blocking communication
      - Non-blocking send and receive routines behave similarly: they return almost immediately.
        - They do not wait for any communication events to complete, such as message copying from user memory to system buffer space or the actual arrival of the message.
      - Non-blocking operations simply "request" that the MPI library perform the operation when it is able.
        - The user cannot predict when that will happen.

41. Blocking and Non-blocking Communication
    - Non-blocking communication
      - It is unsafe to modify the application buffer (your variable space) until you know for a fact that the requested non-blocking operation has actually been performed by the library.
        - "Wait" routines are used to determine this.
      - Non-blocking communications are primarily used to overlap computation with communication and exploit possible performance gains.

42. Order and Fairness
    - Order
      - MPI guarantees that messages will not overtake each other.
        - If a sender sends two messages (M1 and M2) in succession to the same destination, and both match the same receive, the receive operation will receive M1 before M2.
        - If a receiver posts two receives (R1 and R2) in succession, and both are looking for the same message, R1 will receive the message before R2.
      - Order rules do not apply if multiple threads participate in the communication operations.

43. Order and Fairness
    - Fairness
      - MPI does not guarantee fairness; it is up to the programmer to prevent "operation starvation".
      - Example:
        - Task 0 sends a message to task 2.
        - Task 1 sends a competing message that also matches task 2's receive.
        - Only one of the sends will complete.
44. MPI Message Passing Routine Arguments
    Blocking send:        MPI_Send  (buffer, count, type, dest,   tag, comm)
    Non-blocking send:    MPI_Isend (buffer, count, type, dest,   tag, comm, request)
    Blocking receive:     MPI_Recv  (buffer, count, type, source, tag, comm, status)
    Non-blocking receive: MPI_Irecv (buffer, count, type, source, tag, comm, request)
45. MPI Message Passing Routine Arguments
    - Buffer
      - Program (application) address space that references the data to be sent or received.
        - In most cases, this is simply the name of the variable that is to be sent/received.
      - For C programs, this argument is passed by reference and usually must be prepended with an ampersand: &var1.
    - Data count
      - Indicates the number of data elements of a particular type to be sent.

46. MPI Message Passing Routine Arguments
    - Data type (table of MPI predefined datatypes; not reproduced here)

47. MPI Message Passing Routine Arguments
    - Destination
      - An argument to send routines that indicates the process to which a message should be delivered.
      - Specified as the rank of the receiving process.
    - Source
      - An argument to receive routines that indicates the originating process of the message.
      - Specified as the rank of the sending process.
      - May be set to the wildcard MPI_ANY_SOURCE to receive a message from any task.

48. MPI Message Passing Routine Arguments
    - Tag
      - Arbitrary non-negative integer assigned by the programmer to uniquely identify a message.
      - Send and receive operations should match message tags.
      - For a receive operation, the wildcard MPI_ANY_TAG can be used to receive any message regardless of its tag.
      - The MPI standard guarantees that integers 0-32767 can be used as tags, but most implementations allow a much larger range.

49. MPI Message Passing Routine Arguments
    - Communicator
      - Indicates the communication context, i.e. the set of processes for which the source or destination fields are valid.
      - Unless the programmer is explicitly creating new communicators, the predefined communicator MPI_COMM_WORLD is usually used.
    - Status
      - For a receive operation, indicates the source of the message and the tag of the message.
      - In Fortran, it is an integer array of size MPI_STATUS_SIZE, e.g. stat(MPI_SOURCE) and stat(MPI_TAG).
      - Additionally, the number of elements actually received is obtainable from Status via the MPI_Get_count routine.
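    A small sketch of our own (not from the slides) showing a receive with the MPI_ANY_SOURCE and MPI_ANY_TAG wildcards and how the status array and MPI_Get_count are then interrogated; run with at least two processes.

        ! wildcard_status.f90 -- inspecting Status after a wildcard receive (illustrative sketch)
        program wildcard_status
          implicit none
          include 'mpif.h'
          integer :: ierr, rank, nprocs, nrecv
          integer :: stat(MPI_STATUS_SIZE)
          integer :: msg(10)

          call MPI_INIT(ierr)
          call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          if (rank == 0) then
             ! accept a message from any sender, with any tag
             call MPI_RECV(msg, 10, MPI_INTEGER, MPI_ANY_SOURCE, MPI_ANY_TAG, &
                           MPI_COMM_WORLD, stat, ierr)
             call MPI_GET_COUNT(stat, MPI_INTEGER, nrecv, ierr)   ! elements actually received
             print *, 'received', nrecv, 'integers from rank', stat(MPI_SOURCE), &
                      'with tag', stat(MPI_TAG)
          else if (rank == 1) then
             msg = rank
             call MPI_SEND(msg, 5, MPI_INTEGER, 0, 99, MPI_COMM_WORLD, ierr)
          end if

          call MPI_FINALIZE(ierr)
        end program wildcard_status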
50. MPI Message Passing Routine Arguments
    - Request
      - Used by non-blocking send and receive operations.
      - Since non-blocking operations may return before the requested system buffer space is obtained, the system issues a unique "request number".
      - The programmer uses this system-assigned "handle" later (in a WAIT-type routine) to determine completion of the non-blocking operation.
      - In C, this argument is a pointer to a predefined structure, MPI_Request.
      - In Fortran, it is an integer.

51. MPI Message Passing Routine Arguments (summary)
    Blocking send:        MPI_Send  (buffer, count, type, dest,   tag, comm)
    Non-blocking send:    MPI_Isend (buffer, count, type, dest,   tag, comm, request)
    Blocking receive:     MPI_Recv  (buffer, count, type, source, tag, comm, status)
    Non-blocking receive: MPI_Irecv (buffer, count, type, source, tag, comm, request)

52. Any questions, so far? Pit-stop?
53. Blocking Message Passing Routines
    - MPI_Send
      - Basic blocking send operation.
      - The routine returns only after the application buffer in the sending task is free for reuse.
      - This routine may be implemented differently on different systems: the MPI standard permits the use of a system buffer but does not require it.
      - Some implementations may actually use a synchronous send (discussed below) to implement the basic blocking send.
      - MPI_SEND (buf, count, datatype, dest, tag, comm, ierr)

54. Blocking Message Passing Routines
    - MPI_Recv
      - Receives a message and blocks until the requested data is available in the application buffer of the receiving task.
      - MPI_RECV (buf, count, datatype, source, tag, comm, status, ierr)
    - MPI_Ssend
      - Synchronous blocking send: sends a message and blocks until the application buffer in the sending task is free for reuse and the destination process has started to receive the message.
      - MPI_SSEND (buf, count, datatype, dest, tag, comm, ierr)

55. Blocking Message Passing Routines
    - MPI_Bsend
      - Buffered blocking send: permits the programmer to allocate the required amount of buffer space into which data can be copied until it is delivered.
      - Insulates against the problems associated with insufficient system buffer space.
      - The routine returns after the data has been copied from the application buffer space to the allocated send buffer.
      - Must be used together with the MPI_Buffer_attach routine.
      - MPI_BSEND (buf, count, datatype, dest, tag, comm, ierr)

56. Blocking Message Passing Routines
    - MPI_Buffer_attach, MPI_Buffer_detach
      - Used by the programmer to allocate/deallocate message buffer space for use by the MPI_Bsend routine (see the sketch after this slide).
      - The size argument is specified in actual data bytes, not as a count of data elements.
      - Only one buffer can be attached to a process at a time.
      - MPI_BUFFER_ATTACH (buffer, size, ierr)
      - MPI_BUFFER_DETACH (buffer, size, ierr)
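    A sketch of the attach/Bsend/detach sequence; the buffer sizing (assuming 4-byte default integers) and names are our own illustration, not from the tutorial. Run with two processes.

        ! bsend_sketch.f90 -- buffered blocking send (illustrative sketch; run with 2 processes)
        program bsend_sketch
          implicit none
          include 'mpif.h'
          integer, parameter :: n = 100
          integer :: ierr, rank, bufsize
          integer :: msg(n)
          integer :: stat(MPI_STATUS_SIZE)
          character, allocatable :: sendbuf(:)          ! raw bytes handed to MPI

          call MPI_INIT(ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          if (rank == 0) then
             ! buffer size is in bytes: message size (assuming 4-byte integers)
             ! plus MPI's per-message bookkeeping overhead
             bufsize = n * 4 + MPI_BSEND_OVERHEAD
             allocate(sendbuf(bufsize))
             call MPI_BUFFER_ATTACH(sendbuf, bufsize, ierr)

             msg = 1
             call MPI_BSEND(msg, n, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr)

             ! detach blocks until all buffered messages have been delivered
             call MPI_BUFFER_DETACH(sendbuf, bufsize, ierr)
          else if (rank == 1) then
             call MPI_RECV(msg, n, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, stat, ierr)
          end if

          call MPI_FINALIZE(ierr)
        end program bsend_sketch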
57. Blocking Message Passing Routines
    - MPI_Rsend
      - Blocking ready send.
      - Should be used only if the programmer is certain that the matching receive has already been posted.
      - MPI_RSEND (buf, count, datatype, dest, tag, comm, ierr)
    - MPI_Sendrecv
      - Sends a message and posts a receive before blocking.
      - Blocks until the sending application buffer is free for reuse and the receiving application buffer contains the received message.
      - MPI_SENDRECV (sendbuf, sendcount, sendtype, dest, sendtag, recvbuf, recvcount, recvtype, source, recvtag, comm, status, ierr)

58. Blocking Message Passing Routines
    - MPI_Wait, MPI_Waitany, MPI_Waitall, MPI_Waitsome
      - MPI_Wait blocks until a specified non-blocking send or receive operation has completed.
      - For multiple non-blocking operations, the programmer can wait for any, all, or some completions.
      - MPI_WAIT (request, status, ierr)
      - MPI_WAITANY (count, array_of_requests, index, status, ierr)
      - MPI_WAITALL (count, array_of_requests, array_of_statuses, ierr)
      - MPI_WAITSOME (incount, array_of_requests, outcount, array_of_offsets, array_of_statuses, ierr)

59. Blocking Message Passing Routines
    - MPI_Probe
      - Performs a blocking test for a message.
      - The wildcards MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test for a message from any source or with any tag.
      - In the Fortran routine, the actual source and tag are returned in the integer array status: status(MPI_SOURCE) and status(MPI_TAG).
      - MPI_PROBE (source, tag, comm, status, ierr)
60. Example: task 0 pings task 1 and awaits the return ping.
    (The code shown on the original slide is not in the transcript; a sketch follows.)
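    A minimal reconstruction of such a ping-pong, under names of our own choosing, using the blocking routines above; run with two processes.

        ! pingpong.f90 -- task 0 pings task 1 and awaits the return ping (run with 2 processes)
        program pingpong
          implicit none
          include 'mpif.h'
          integer :: ierr, rank, msg
          integer :: stat(MPI_STATUS_SIZE)
          integer, parameter :: tag = 1

          call MPI_INIT(ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          if (rank == 0) then
             msg = 123
             call MPI_SEND(msg, 1, MPI_INTEGER, 1, tag, MPI_COMM_WORLD, ierr)        ! ping
             call MPI_RECV(msg, 1, MPI_INTEGER, 1, tag, MPI_COMM_WORLD, stat, ierr)  ! await return ping
             print *, 'task 0 received return ping:', msg
          else if (rank == 1) then
             call MPI_RECV(msg, 1, MPI_INTEGER, 0, tag, MPI_COMM_WORLD, stat, ierr)  ! receive ping
             call MPI_SEND(msg, 1, MPI_INTEGER, 0, tag, MPI_COMM_WORLD, ierr)        ! send it back
          end if

          call MPI_FINALIZE(ierr)
        end program pingpong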
61. Non-Blocking Message Passing Routines
    - MPI_Isend
      - Identifies an area in memory to serve as a send buffer.
      - Processing continues immediately, without waiting for the message to be copied out of the application buffer.
      - A communication request handle is returned for handling the pending message status.
      - The program should not modify the application buffer until subsequent calls to MPI_Wait or MPI_Test indicate that the non-blocking send has completed.
      - MPI_ISEND (buf, count, datatype, dest, tag, comm, request, ierr)

62. Non-Blocking Message Passing Routines
    - MPI_Irecv
      - Identifies an area in memory to serve as a receive buffer.
      - Processing continues immediately, without actually waiting for the message to be received and copied into the application buffer.
      - A communication request handle is returned for handling the pending message status.
      - The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking receive operation completes and the requested message is available in the application buffer.
      - MPI_IRECV (buf, count, datatype, source, tag, comm, request, ierr)

63. Non-Blocking Message Passing Routines
    - MPI_Issend
      - Non-blocking synchronous send.
      - Similar to MPI_Isend(), except that MPI_Wait() or MPI_Test() indicates when the destination process has received the message.
      - MPI_ISSEND (buf, count, datatype, dest, tag, comm, request, ierr)
    - MPI_Ibsend
      - Non-blocking buffered send.
      - Similar to MPI_Bsend(), except that MPI_Wait() or MPI_Test() indicates when the destination process has received the message.
      - Must be used with the MPI_Buffer_attach routine.
      - MPI_IBSEND (buf, count, datatype, dest, tag, comm, request, ierr)

64. Non-Blocking Message Passing Routines
    - MPI_Irsend
      - Non-blocking ready send.
      - Similar to MPI_Rsend(), except that MPI_Wait() or MPI_Test() indicates when the destination process has received the message.
      - Should be used only if the programmer is certain that the matching receive has already been posted.
      - MPI_IRSEND (buf, count, datatype, dest, tag, comm, request, ierr)

65. Non-Blocking Message Passing Routines
    - MPI_Test, MPI_Testany, MPI_Testall, MPI_Testsome
      - MPI_Test checks the status of a specified non-blocking send or receive operation.
      - The "flag" parameter is returned as logical true (1) if the operation has completed, and logical false (0) if not.
      - For multiple non-blocking operations, the programmer can test for any, all, or some completions.
      - MPI_TEST (request, flag, status, ierr)
      - MPI_TESTANY (count, array_of_requests, index, flag, status, ierr)
      - MPI_TESTALL (count, array_of_requests, flag, array_of_statuses, ierr)
      - MPI_TESTSOME (incount, array_of_requests, outcount, array_of_offsets, array_of_statuses, ierr)

66. Non-Blocking Message Passing Routines
    - MPI_Iprobe
      - Performs a non-blocking test for a message.
      - The wildcards MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test for a message from any source or with any tag.
      - The integer "flag" parameter is returned as logical true (1) if a message has arrived, and logical false (0) if not.
      - In the Fortran routine, the actual source and tag are returned in the integer array status: status(MPI_SOURCE) and status(MPI_TAG).
      - MPI_IPROBE (source, tag, comm, flag, status, ierr)
67. Example: nearest-neighbor exchange in a ring topology.
    (The code shown on the original slide is not in the transcript; a sketch follows.)
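    One way to write such a ring exchange with the non-blocking routines above (MPI_Irecv/MPI_Isend followed by MPI_Waitall); the variable names and what is exchanged are our own illustration.

        ! ring.f90 -- each rank exchanges its rank number with its left and right neighbors
        program ring
          implicit none
          include 'mpif.h'
          integer :: ierr, rank, nprocs, left, right
          integer :: recvbuf(2), reqs(4)
          integer :: stats(MPI_STATUS_SIZE, 4)
          integer, parameter :: tag1 = 1, tag2 = 2

          call MPI_INIT(ierr)
          call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          left  = mod(rank - 1 + nprocs, nprocs)        ! neighbor ranks wrap around the ring
          right = mod(rank + 1, nprocs)

          ! post the receives and sends; none of them is necessarily complete yet
          call MPI_IRECV(recvbuf(1), 1, MPI_INTEGER, left,  tag1, MPI_COMM_WORLD, reqs(1), ierr)
          call MPI_IRECV(recvbuf(2), 1, MPI_INTEGER, right, tag2, MPI_COMM_WORLD, reqs(2), ierr)
          call MPI_ISEND(rank, 1, MPI_INTEGER, right, tag1, MPI_COMM_WORLD, reqs(3), ierr)
          call MPI_ISEND(rank, 1, MPI_INTEGER, left,  tag2, MPI_COMM_WORLD, reqs(4), ierr)

          ! ... computation could overlap with communication here ...

          call MPI_WAITALL(4, reqs, stats, ierr)        ! block until all four operations complete

          print *, 'rank', rank, 'received', recvbuf(1), 'from left and', recvbuf(2), 'from right'

          call MPI_FINALIZE(ierr)
        end program ring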
68. Any questions, so far?

69. Types of MPI Routines
    - Point-to-point communication
      - Communication Modes and Completion Criteria
      - Blocking and Non-blocking Communication
    - Collective communication
    - Process groups
    - Process topologies
    - Environment management and inquiry

70. Collective Communication
    - Programming considerations and restrictions:
      - Collective operations are blocking.
      - Collective communication routines do not take message tag arguments.
      - Collective operations within subsets of processes are accomplished by first partitioning the subsets into new groups and then attaching the new groups to new communicators.
      - Can only be used with MPI predefined datatypes, not with MPI derived data types.
71. Collective Communication
    - Broadcast operation
      - A single process sends a copy of some data to all the other processes in a group.
    - In the diagrams used on these slides:
      - Each row represents a different process.
      - Each colored block in a column represents the location of a piece of the data.
      - Blocks with the same color located on multiple processes contain copies of the same data.
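    A minimal broadcast sketch; the array contents and names are illustrative only.

        ! bcast_sketch.f90 -- the root broadcasts an array to every process in the group
        program bcast_sketch
          implicit none
          include 'mpif.h'
          integer, parameter :: n = 4, root = 0
          integer :: ierr, rank
          integer :: a(n)

          call MPI_INIT(ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          if (rank == root) a = (/ 10, 20, 30, 40 /)    ! only the root has the data initially

          ! after the call, every rank's copy of a() holds the root's values
          call MPI_BCAST(a, n, MPI_INTEGER, root, MPI_COMM_WORLD, ierr)

          print *, 'rank', rank, 'now has', a

          call MPI_FINALIZE(ierr)
        end program bcast_sketch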
72. Collective Communication
    - Scatter and gather operations
      - Distribute data from one processor across a group of processors, or vice versa.
      - Scatter: all of the data (an array of some type) is initially collected on a single processor; after the scatter operation, pieces of the data are distributed to different processors.
      - The multicolored box in the diagram reflects the possibility that the data may not be evenly divisible across the processors.
      - Gather is the inverse of scatter: it collects the pieces of data distributed across a group of processors and reassembles them, in the proper order, on a single processor.
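    A sketch of scatter followed by gather, assuming for simplicity that the data divide evenly across the processes; the chunk size and names are our own.

        ! scatter_gather.f90 -- scatter a root array, work on the pieces, gather them back
        program scatter_gather
          implicit none
          include 'mpif.h'
          integer, parameter :: chunk = 2, root = 0
          integer :: ierr, rank, nprocs, i
          integer, allocatable :: full(:)
          integer :: piece(chunk)

          call MPI_INIT(ierr)
          call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          allocate(full(chunk * nprocs))
          if (rank == root) full = (/ (i, i = 1, chunk * nprocs) /)   ! data start on the root only

          ! each rank receives its own "chunk" elements of the root's array
          call MPI_SCATTER(full, chunk, MPI_INTEGER, piece, chunk, MPI_INTEGER, &
                           root, MPI_COMM_WORLD, ierr)

          piece = piece * 10                                          ! local work on the piece

          ! reassemble the pieces, in rank order, back on the root
          call MPI_GATHER(piece, chunk, MPI_INTEGER, full, chunk, MPI_INTEGER, &
                          root, MPI_COMM_WORLD, ierr)

          if (rank == root) print *, 'gathered result:', full

          call MPI_FINALIZE(ierr)
        end program scatter_gather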
73. Collective Communication
    - Reduction operation
      - A single process (the root process) collects data from the other processes in a group and performs an operation on that data, producing a single value.
      - Typical use: computing the sum of the elements of an array that is distributed across several processors.
      - Available operations include arithmetic operations, maximum and minimum, and various logical and bitwise operations.
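    A sketch of the use case mentioned above: summing a distributed array onto the root with MPI_Reduce. The local array contents are placeholders of our own.

        ! reduce_sketch.f90 -- sum the pieces of a distributed array onto the root process
        program reduce_sketch
          implicit none
          include 'mpif.h'
          integer, parameter :: n = 1000, root = 0
          integer :: ierr, rank
          double precision :: local(n), partial, total

          call MPI_INIT(ierr)
          call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

          local = dble(rank + 1)           ! placeholder for this rank's piece of the array
          partial = sum(local)             ! each rank sums its own piece

          ! combine the partial sums with MPI_SUM; only the root receives the result
          call MPI_REDUCE(partial, total, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                          root, MPI_COMM_WORLD, ierr)

          if (rank == root) print *, 'global sum =', total

          call MPI_FINALIZE(ierr)
        end program reduce_sketch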
74. Compiling and Running MPI Programs
    - The MPI standard does not specify how MPI programs are to be started; implementations vary from machine to machine.
    - When compiling an MPI program, it may be necessary to link against the MPI library; typically, to do this, you pass the option "-lmpi" to the linker.
    - To run an MPI code, you commonly use a wrapper called "mpirun" or "mpprun":
      $ mpirun -np 4 execfile
      This command runs the executable "execfile" on four processors.
75. Nano-Self-Test - Getting Started
    http://ci-tutor.ncsa.uiuc.edu/content.php?cid=1303
