Open MPI: KYOSS presentation, 12 Jan 2011, Jeff Squyres
What is the Message Passing Interface (MPI)? The book of MPI: a standards document, www.mpi-forum.org
Using MPI: hardware and software implement the interface in the MPI standard (the book)
MPI implementations: there are many implementations of the MPI standard. Some are closed source; others are open source.
Open MPI: a free, open source implementation of the MPI standard, www.open-mpi.org
So what is MPI for? Let’s break it down… Message Passing Interface
1. Message passing: a message is explicitly passed from Process A to Process B
1. Message passing: …as opposed to data that is simply shared between Thread A and Thread B
2. Interface: C programming function calls (Fortran too!)
    MPI_Init(argv, argc)
    MPI_Send(buf, count, type, dest, tag, comm)
    MPI_Recv(buf, count, type, src, tag, comm, status)
    MPI_Wait(req, status)
    MPI_Test(req, flag, status)
    MPI_Comm_dup(in, out)
    MPI_Type_size(dtype, size)
    MPI_Finalize(void)
Fortran? Really? (What most modern developers associate with “Fortran”)
Yes, really: some of today’s most advanced simulation codes are written in Fortran
Yes, really: yes, that Intel, optimized for Nehalem, Westmere, and beyond!
Fortran is great for what it is: a simple language for mathematical expressions and computations, targeted at scientists and engineers… not computer scientists or web developers or database developers or …
Back to defining “MPI”…
Putting it back together: Message Passing Interface. “An interface for passing messages”, i.e., “C functions for passing messages” (Fortran too!)
C/Fortran functions for message passing: Process A calls MPI_Send(…); Process B calls MPI_Recv(…)
Really? Is that all MPI is? “Can’t I just do that with sockets?” Yes! (…and no)
Comparison:
    (TCP) Sockets                                | MPI
    Connections based on IP addresses and ports  | Based on peer integer “rank” (e.g., 8)
    Point-to-point communication                 | Point-to-point and collective and one-sided and …
    Stream-oriented                              | Message oriented
    Raw data (bytes / octets)                    | Typed messages
    Network-independent                          | Network independent
    “Slow”                                       | Blazing fast
Comparison: whoa! What are these?
Peer integer “rank”: the processes in an MPI job are numbered 0 through 11 (say); each process is addressed by its integer rank
“Collective”: broadcast: one rank sends the same data to every rank
“Collective”: scatter: one rank splits its data into pieces and sends a different piece to each rank
“Collective”: gather: each rank sends its piece to a single rank, which collects them all
“Collective”: reduce: a value from each rank is combined (e.g., summed) into a single result: 42
“Collective”: …and others
Messages, not bytes: the entire message is sent and received, not a stream of individual bytes
Messages, not bytes: the contents are 17 integers, 23 doubles, 98 structs, …or whatever. Not a bunch of bytes!
Network independent: MPI_Send(…) and MPI_Recv(…) sit on top of the underlying network: Ethernet, Myrinet, InfiniBand, shared memory, TCP, iWARP, RoCE
Network independent: regardless of underlying network or transport protocol, the application code stays the same
Blazing fast: one microsecond (!) …more on performance later
What is MPI? MPI is probably somewhere around here (in the middle of the network stack)
What is MPI? MPI hides all the layers underneath
What is MPI? A high-level network programming abstraction
What is MPI? A high-level network programming abstraction. IP addresses, byte streams, raw bytes: nothing to see here, please move along
So what? What’s all this message passing stuff got to do with supercomputers?
So what? Let’s define “supercomputers”
Supercomputers
Supercomputers: “Nebulae”, National Supercomputing Centre, Shenzhen, China
Supercomputers: “Mare Nostrum” (“Our Sea”), Barcelona Supercomputing Center, Spain; the building used to be a church
Supercomputers: notice anything?
Supercomputers: they’re just racks of servers!
Generally speaking… Supercomputer = lots of processors + lots of RAM + lots of disk
Generally speaking… Supercomputer = (many) racks of (commodity) high-end servers (this is one definition; there are others)
So if that’s a supercomputer… a rack of 36 1U servers
How is it different from my web farm? (It, too, is a rack of 36 1U servers)
Just a bunch of servers? The difference between supercomputers and web farms and database farms (and …): all the servers act together to solve a single computational problem
Acting together: take your computational problem (input, computation, output)…
Acting together: …and split it up!
Acting together: distribute the input data across a bunch of servers
Acting together: use the network between servers to communicate / coordinate
Acting together: MPI is used for this communication
Why go to so much trouble? A computational problem of 21 processor-hours on 1 processor = …a long time…
Why go to so much trouble? The same 21 processor-hours spread over 21 processors = ~1 hour (!) Disclaimer: scaling is rarely perfect
High Performance Computing: HPC = using supercomputers to solve real-world problems that are TOO BIG for laptops, desktops, or individual servers
Why does HPC ♥ MPI? Network abstraction: are these cores… or servers? The application code can’t tell, and doesn’t need to.
Why does HPC ♥ MPI? Message semantics: an array of 10,000 integers is sent and received as one message.
Why does HPC ♥ MPI? Ultra-low network latency: 1 microsecond (depending on your network type!)
1 microsecond = 0.000001 second: from here… to here
Holy smokes! That’s fast
Let’s get into some details…
MPI Basics: “6 function MPI”
    MPI_Init(): startup
    MPI_Comm_size(): how many peers?
    MPI_Comm_rank(): my unique (ordered) ID
    MPI_Send(): send a message
    MPI_Recv(): receive a message
    MPI_Finalize(): shutdown
A huge number of parallel applications can be implemented with just these 6 functions.
Let’s see “Hello, World” in MPI
MPI Hello, World:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);                /* Initialize MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* Who am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* Num. peers? */
        printf("Hello, world!  I am %d of %d\n", rank, size);
        MPI_Finalize();                        /* Shut down MPI */
        return 0;
    }
Compile it with Open MPI:

    shell$ mpicc hello.c -o hello
    shell$

Open MPI comes standard in many Linux and BSD distributions (and OS X). Hey, what’s that? Where’s gcc?
“Wrapper” compiler: mpicc simply fills in a bunch of compiler command line options for you

    shell$ mpicc hello.c -o hello --showme
    gcc hello.c -o hello -I/opt/openmpi/include -pthread -L/opt/openmpi/lib -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
    shell$
Now let’s run it:

    shell$ mpirun -np 4 hello

Hey, what’s that? Why don’t I just run “./hello”?
mpirun launcher: mpirun launches N copies of your program and “wires them up”

    shell$ mpirun -np 4 hello

“-np” = “number of processes”; this command launches a 4-process parallel job
mpirun launcher: four copies of “hello” are launched, then they are “wired up” on the network
Now let’s run it:

    shell$ mpirun -np 4 hello
    Hello, world!  I am 0 of 4
    Hello, world!  I am 1 of 4
    Hello, world!  I am 2 of 4
    Hello, world!  I am 3 of 4
    shell$

By default, all copies run on the local host
Run on multiple servers!

    shell$ cat my_hostfile
    host1.example.com
    host2.example.com
    host3.example.com
    host4.example.com
    shell$
Run on multiple servers!

    shell$ mpirun --hostfile my_hostfile -np 4 hello
    Hello, world!  I am 0 of 4     <- ran on host1
    Hello, world!  I am 1 of 4     <- ran on host2
    Hello, world!  I am 2 of 4     <- ran on host3
    Hello, world!  I am 3 of 4     <- ran on host4
    shell$
Run it again:

    shell$ mpirun --hostfile my_hostfile -np 4 hello
    Hello, world!  I am 2 of 4
    Hello, world!  I am 3 of 4
    Hello, world!  I am 0 of 4
    Hello, world!  I am 1 of 4
    shell$

Hey, why are the numbers out of order?
Standard output re-routing: each “hello” program’s standard output is intercepted and sent across the network to mpirun
Standard output re-routing: but the exact ordering of received printf’s is non-deterministic
Printf debugging = bad: if you can’t rely on output ordering, printf debugging is pretty lousy (!)
Parallel debuggers: fortunately, there are parallel debuggers and other tools; a parallel debugger attaches to all processes in the MPI job
Now let’s send a simple MPI message
Send a simple message (if I’m number 0, send the buffer[] array to number 1; if I’m number 1, receive the buffer[] array from number 0):

    int rank;
    double buffer[SIZE];

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (0 == rank) {
        /* …initialize buffer[]… */
        MPI_Send(buffer, SIZE, MPI_DOUBLE, 1, 123, MPI_COMM_WORLD);
    } else if (1 == rank) {
        MPI_Recv(buffer, SIZE, MPI_DOUBLE, 0, 123, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
That’s enough MPI for now…
Open MPI: project founded in 2003 after intense discussions between multiple open source MPI implementations (PACX-MPI, LAM/MPI, LA-MPI, FT-MPI, Sun CT 6)
Open_MPI_Init():

    shell$ svn log -r 1 https://svn.open-mpi.org/svn/ompi
    ------------------------------------------------------------------------
    r1 | jsquyres | 2003-11-22 11:36:58 -0500 (Sat, 22 Nov 2003) | 2 lines

    First commit
    ------------------------------------------------------------------------
    shell$
Open_MPI_Current_status():

    shell$ svn log -r HEAD https://svn.open-mpi.org/svn/ompi
    ------------------------------------------------------------------------
    r24226 | rhc | 2011-01-11 20:57:47 -0500 (Tue, 11 Jan 2011) | 25 lines

    Fixes #2683: Move ORTE DPM compiler warning squash to v1.4
    ------------------------------------------------------------------------
    shell$
Open MPI 2011 membership: 15 members, 11 contributors, 2 partners
Fun stats: ohloh.net says 517,400 lines of code; 30 developers (over time); “well-commented source code”. I rank in the top-25 ohloh stats for: C, Automake, shell script, Fortran (ouch!)
Open MPI has grown: it’s amazing (to me) that the Open MPI project works so well. New features, new releases, new members. Long live Open MPI!
Recap: defined the Message Passing Interface (MPI); defined “supercomputers”; defined High Performance Computing (HPC); showed what MPI is; showed some trivial MPI codes; discussed Open MPI
Additional resources:
    MPI Forum web site, the only site for the official MPI standards: http://www.mpi-forum.org/
    NCSA MPI basic and intermediate tutorials (requires a free account): http://ci-tutor.ncsa.uiuc.edu/login.php
    “MPI Mechanic” magazine columns: http://cw.squyres.com/
Additional resources:
    Research, Computing, and Engineering (RCE) podcast: http://www.rce-cast.com/
    My blog, MPI_BCAST: http://blogs.cisco.com/category/performance/
Questions?

The Message Passing Interface (MPI) in Layman's Terms
