Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Upcoming SlideShare
Getting started with libfabric
Next
Download to read offline and view in fullscreen.

0

Share

The State of libfabric in Open MPI

Download to read offline

Slides presented by Jeff Squyres at the 2015 OpenFabrics Software Developers' Workshop. This talk discusses Cisco's experiences implementing an ultra-low latency Ethernet plugin / provider for the Linux Verbs API and for for the Libfabric API.

Related Books

Free with a 30 day trial from Scribd

See all

Related Audiobooks

Free with a 30 day trial from Scribd

See all
  • Be the first to like this

The State of libfabric in Open MPI

  1. 1. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 1 The State of libfabric in Open MPI Jeffrey M. Squyres jsquyres@cisco.com 16 March 2015
  2. 2. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 2 What is the Message Passing Interface (MPI)? A standards document
  3. 3. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 3 Using MPI Hardware and software implement the interface in the MPI standard (book)
  4. 4. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 4 MPI implementations There are many implementations of the MPI standard Some are closed source Others are open source
  5. 5. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 5 Open MPI Open MPI is a free, open source implementation of the MPI standard www.open-mpi.org
  6. 6. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 6 How I think of MPI
  7. 7. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 7 ServerServer MPI abstracts away the underlying network MPI_Send(…) MPI_Recv(…) Network
  8. 8. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 8 ServerServer MPI abstracts away the underlying network MPI_Send(…) MPI_Recv(…) Network MAGIC
  9. 9. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 9 ServerServer Open MPI multiplexes to the underlying network stack MPI_Send(…) MPI_Recv(…) TCP Shared memory Verbs MXM Portals SCIF Loopback uGni PSM libfabric
  10. 10. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 10 Two major types of transports Byte Transport Layer (BTL) plugins Matching Transport Layer (MTL) plugins MPI_Send(…)
  11. 11. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 11 BTL •  Inherently multi-device •  Round-robin for small messages •  Striping for large messages •  Major protocol decisions and MPI message matching driven by an Open MPI engine Byte Transport Layer (BTL) plugins
  12. 12. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 12 Matching Transport Layer (MTL) plugins MTL •  Most details hidden by network API •  MXM •  Portals •  PSM •  As a side effect, must handle: •  Process loopback •  Server loopback (usually via shared memory)
  13. 13. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 13 BTL and MTL plugins Byte Transport Layer (BTL) plugins Matching Transport Layer (MTL) plugins •  IB / iWarp (verbs) •  Portals •  SCIF •  Shared memory •  TCP •  uGNI •  usNIC (verbs) •  MXM •  Portals •  PSM
  14. 14. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 14 •  IB / iWarp (verbs) •  Portals •  SCIF •  Shared memory •  TCP •  uGNI •  usNIC Now featuring 200% more libfabric Byte Transport Layer (BTL) plugins Matching Transport Layer (MTL) plugins •  MXM •  Portals •  PSM •  ofi libfabric
  15. 15. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 15 Linux linker: fun fact MPI process libmpi.so ofi MTL libfabric.so Linker auto loads dependency
  16. 16. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 16 Linux linker: fun fact MPI process libmpi.so usnic BTLofi MTL libfabric.so Linker does not re-load
  17. 17. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 17 Libfabric-based plugins libfabric usnic BTL ofi MTL •  Cisco developed •  usNIC-specific •  OFI point-to-point / UD •  Tested with usNIC •  Intel developed •  Provider neutral •  OFI tag matching •  Tested with PSM
  18. 18. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 18 First experiment usnic BTL: verbs à libfabric verbs bootstrapping verbs message passing Can loosely classify the usnic BTL into two parts
  19. 19. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 19 First experiment usnic BTL: verbs à libfabric verbs bootstrapping verbs message passing sideband bootstrapping 1.  Find the corresponding ethX device 2.  Obtain MTU 3.  Open usNIC-specific configuration options
  20. 20. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 20 First experiment usnic BTL: verbs à libfabric verbs bootstrapping verbs message passing sideband bootstrapping libfabric bootstrapping à libfabric message passing à
  21. 21. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 21 Comparison results verbs bootstrapping verbs message passing sideband bootstrapping libfabric bootstrapping à libfabric message passing à Pretty much a 1:1 swap of verbs à libfabric calls Bootstrapping sequence totally different libfabric requires no sideband bootstrapping
  22. 22. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 22 Second experiment Two different libfabric usage models •  For a specific provider §  Ask fi_getinfo() for prov_name=“usnic” •  Use usNIC extensions §  Netmask, link speed, IP device name, etc. •  usNIC-specific error messages •  For any tag-matching provider •  No extension use §  100% portable •  Generic error messages usnic BTL ofi MTL
  23. 23. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 23 Second experiment Two different libfabric usage models •  For a specific provider §  Ask fi_getinfo() for prov_name=“usnic” •  Use usNIC extensions §  Netmask, link speed, IP device name, etc. •  usNIC-specific error messages •  For any tag-matching provider •  No extension use §  100% portable •  Generic error messages usnic BTL ofi MTL
  24. 24. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 24 libfabric performance vs. Linux verbs 1.9 1.95 2 2.05 2.1 2.15 2.2 2.25 2.3 2.35 2.4 0.1 1 10 100 Time(microseconds) Buffer size Open MPI with usNIC: IMB PingPong Latency imb-pingpong-ompi-1.8-verbs.out imb-pingpong-ompi-1.8-libfabric.out
  25. 25. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 25 61000 62000 63000 64000 65000 66000 67000 68000 69000 1e+06 Bandwidth(megabits/second) Buffer size Open MPI with usNIC: IMB SendRecv Bandwidth imb-sendrecv-ompi-1.8-verbs.out imb-sendrecv-ompi-1.8-libfabric.out libfabric performance vs. Linux verbs
  26. 26. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 26 Version roadmap Git master Main development v1.8 / Stable release series
  27. 27. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 27 Version roadmap Git master Main development v1.8 / Stable release series Past v1.9 Feature series Present Futurelibfabric libfabric libfabric Dec 2014
  28. 28. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 28 Currently embedding libfabric •  openmpi-master §  opal •  mca § common •  libfabric •  include •  prov •  src … Because there is no public libfabric release (yet) Will be removed before Open MPI v1.9 release
  29. 29. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 29 Periodic refresh from libfabric Github •  openmpi-master §  opal •  mca § common •  libfabric •  include •  prov •  src …
  30. 30. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 30 Periodic refresh from libfabric Github •  openmpi-master §  opal •  mca § common •  libfabric •  include •  prov •  src …
  31. 31. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 31 Periodic refresh from libfabric Github •  openmpi-master §  opal •  mca § common •  libfabric •  include •  prov •  src … Moar new libfabric goodness!
  32. 32. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 32 Can also build against external libfabric •  openmpi-master §  opal •  mca § common •  libfabric •  include •  prov •  src … libfabric (e.g., installed under $HOME, or in /usr, or …)
  33. 33. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 33 Will be the only model in v1.9 •  openmpi-master libfabric (e.g., installed under $HOME, or in /usr, or …)
  34. 34. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 34 Feedback loop = good •  Using libfabric in its (first) intended environment was quite useful §  Resulted in libfabric pull requests, minor changes, etc. •  Biggest thing missing is the mmunotify functionality §  …will file a PR/RFC about this soon
  35. 35. Open Fabrics workshop, March 2015 State of libfabric in Open MPI 35 Questions?

Slides presented by Jeff Squyres at the 2015 OpenFabrics Software Developers' Workshop. This talk discusses Cisco's experiences implementing an ultra-low latency Ethernet plugin / provider for the Linux Verbs API and for for the Libfabric API.

Views

Total views

3,230

On Slideshare

0

From embeds

0

Number of embeds

1,030

Actions

Downloads

44

Shares

0

Comments

0

Likes

0

×