SlideShare a Scribd company logo
1 of 11
Exascale Process
Management Interface
Ralph Castain
Intel Corporation
rhc@open-mpi.org
Joshua S. Ladd
Mellanox Technologies Inc.
joshual@mellanox.com
Gary Brown
Adaptive Computing
gbrown@adaptivecomputing.com
David Bigagli
SchedMD
david@schedmd.com
Artem Y. Polyakov
Mellanox Technologies Inc.
artemp@mellanox.com
PMIx – PMI exascale
Collaborative open source effort led by
Intel, Mellanox Technologies, and Adaptive Computing.
New collaborators are most welcome!
2
Process Management Interface – PMI
 PMI is most commonly utilized to bootstrap MPI processes.
 Typically, MPI processes “put” data into the KVS data base that is intended to be
shared with all other MPI processes, a collective operation that is a logical
allgather synchronizes the database.
 PMI enables Resource Managers (RMs) to use their infrastructure to implement
advanced support for MPI application acting like RTE daemons.
 SLURM supports both PMI-1/PMI-2 (http://slurm.schedmd.com/mpi_guide.html)
RTE
daemon
MPI
process
Connect
ORTEd Hydra
ssh/
rsh
PMI
pbsdsh
tm_spawn()
blaunch
lsb_launch()
launch
control
IO forward
Remote
Execution
Service
Unix TORQUE LSF SLURM
MPD
qrsh llspawn
Load
Leveler
SGE
srun
SLURM
PMI
MPI RTE (MPICH2, MVAPICH, Open MPI ORTE)
3
PMIx – PMI exascale
(What and Why)
 What is it?
 Extended Process Management Interface.
 Why?
 MPI/OSHMEM job launch time is a hot topic!
 Extreme-scale system requirements: 30 second job launch time for
O(106) MPI processes.
 Scaling studies have illuminated many limitations of current PMI-
1/PMI-2 interfaces at extreme scale.
 Tight integration with Resource Managers can drastically reduce
the amount of data that needs to be exchanged during MPI_Init.
4
PMIx – PMI exascale
(Technical Goals)
 Reduce the memory footprint from O(N) to O(1) by leveraging
shared memory and distributed databases.
 Reduce the volume of data exchanged in collective operations
with scoping hints.
 Provide the ability to overlap communication with computation
with non-blocking collectives and get operations.
 Support both collective communication modes of data
exchange and point-to-point "direct" data retrieval.
 Reduce the amount of local messages exchanged between
application processes and RTE daemons (many-core nodes).
 Use high-speed HPC interconnects available on the system
for the data exchange.
 Extend "Application – Resource Manager" interface to support
fault-tolerance and energy-efficiency requirements. 5
PMIx implementation architecture
RTE
daemon
MPI
process
PMIx-client
PMIx-server
MPI
process
PMIx-client
PMIx
Shared mem
(store blobs)
usock
usock
RTEMPI
PMIx
PMIx
MPI
PMIx
SHMEM
RTEMPI
PMIx
PMIx
MPI
PMIx
SHMEM
RTEMPI
PMIx
PMIx
MPI
PMIx
SHMEM
6
PMIx v1.0 features
 Data scoping with 3 levels of locality: local, remote, global.
 Communication scoping: PMIx_Fence under arbitrary
subset of processes.
 Full support for point-to-point "direct" data retrieval well
suited for applications with sparse communication graphs.
 Full support for non-blocking operations.
 Support for “binary blobs”: PMIx client retrieves process data
only once as one chunk reducing intra-node exchanges and
encoding/decoding overhead.
 Basic support for MPI dynamic process management;
7
PMIx v2.0 features
Performance enhancements:
 One instance of database per node with "zero-message" data
access using shared-memory.
 Distributed database for storing Key-Values.
 Enhanced support for collective operations.
Functional enhancements:
 Extended support for dynamic allocation and process
management suitable for other HPC paradigms (not MPI-only.)
 Power management interface to RMs.
 File positioning service.
 Event notification service enabling fault tolerant-aware
applications.
 Fabric QoS and security controls. 8
SLURM PMIx plugin
PMIx support in SLURM
 Implemented as a new MPI plugin called "pmix".
 To use it:
a) either set as a command line command line parameter:
$ srun –mpi=pmix ./a.out
b) or set PMIx plugin as the default in slurm.conf file:
MpiDefault = pmix
 Development version of the plugin is available on github:
https://github.com/artpol84/slurm/tree/pmix-step2
 Beta version of PMIx plugin will be available in the next SLURM major
release (15.11.x) at SC 2015.
9
PMIx development timeline
2015 2016 2017
Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4
 PMIx 1.0:
• basic feature set;
• initial performance optimizations.
 Open MPI integration (already in
master).
 SLURM PMIx plugin (15.11.x release)
 PMIx 2.0:
• memory footprint improvements;
• distributed database storage;
• internal collectives implementation and
integration with existing collectives
libraries (Mellanox HCOLL);
• enhanced RM API.
 Update of Open MPI and SLURM
integration.
 LSF and Moab/TORQUE support. 10
Contribute or Follow Along!
 Project: https://www.open-mpi.org/projects/pmix/
 Code: https://github.com/open-mpi/pmix
Contributions are welcomed!
11

More Related Content

What's hot

DPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy HarveyDPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy Harveyharryvanhaaren
 
Row #9: An architecture overview of APNIC's RDAP deployment to the cloud
Row #9: An architecture overview of APNIC's RDAP deployment to the cloudRow #9: An architecture overview of APNIC's RDAP deployment to the cloud
Row #9: An architecture overview of APNIC's RDAP deployment to the cloudAPNIC
 
Developing Enterprise Applications for the Cloud, from Monolith to Microservice
Developing Enterprise Applications for the Cloud, from Monolith to MicroserviceDeveloping Enterprise Applications for the Cloud, from Monolith to Microservice
Developing Enterprise Applications for the Cloud, from Monolith to MicroserviceJack-Junjie Cai
 
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...Cloud Native Day Tel Aviv
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsBryan Bende
 
L4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef LaribiL4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef Laribibuildacloud
 
Architecture of OpenFlow SDNs
Architecture of OpenFlow SDNsArchitecture of OpenFlow SDNs
Architecture of OpenFlow SDNsUS-Ignite
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Brocade
 
Generic Resource Manager - László Vadkerti, András Kovács
Generic Resource Manager - László Vadkerti, András KovácsGeneric Resource Manager - László Vadkerti, András Kovács
Generic Resource Manager - László Vadkerti, András Kovácsharryvanhaaren
 
DEVNET-1175 OpenDaylight Service Function Chaining
DEVNET-1175	OpenDaylight Service Function ChainingDEVNET-1175	OpenDaylight Service Function Chaining
DEVNET-1175 OpenDaylight Service Function ChainingCisco DevNet
 
Dynamic Service Chaining
Dynamic Service Chaining Dynamic Service Chaining
Dynamic Service Chaining Tail-f Systems
 
Hotplug and Virtio - Tetsuya Mukawa
Hotplug and Virtio - Tetsuya MukawaHotplug and Virtio - Tetsuya Mukawa
Hotplug and Virtio - Tetsuya Mukawaharryvanhaaren
 
LCA14: LCA14-209: ODP Project Update
LCA14: LCA14-209: ODP Project UpdateLCA14: LCA14-209: ODP Project Update
LCA14: LCA14-209: ODP Project UpdateLinaro
 
Introduction to Beryllium release of OpenDaylight
Introduction to Beryllium release of OpenDaylightIntroduction to Beryllium release of OpenDaylight
Introduction to Beryllium release of OpenDaylightSDN Hub
 
OpenDaylight Update (June 2018)
OpenDaylight Update (June 2018)OpenDaylight Update (June 2018)
OpenDaylight Update (June 2018)Michelle Holley
 
Learn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVLearn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVGhodhbane Mohamed Amine
 

What's hot (20)

SDN Project PPT
SDN Project PPTSDN Project PPT
SDN Project PPT
 
DPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy HarveyDPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy Harvey
 
ODP Presentation LinuxCon NA 2014
ODP Presentation LinuxCon NA 2014ODP Presentation LinuxCon NA 2014
ODP Presentation LinuxCon NA 2014
 
Row #9: An architecture overview of APNIC's RDAP deployment to the cloud
Row #9: An architecture overview of APNIC's RDAP deployment to the cloudRow #9: An architecture overview of APNIC's RDAP deployment to the cloud
Row #9: An architecture overview of APNIC's RDAP deployment to the cloud
 
FD.io - The Universal Dataplane
FD.io - The Universal DataplaneFD.io - The Universal Dataplane
FD.io - The Universal Dataplane
 
Developing Enterprise Applications for the Cloud, from Monolith to Microservice
Developing Enterprise Applications for the Cloud, from Monolith to MicroserviceDeveloping Enterprise Applications for the Cloud, from Monolith to Microservice
Developing Enterprise Applications for the Cloud, from Monolith to Microservice
 
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...
Barak Perlman, ConteXtream - SFC (Service Function Chaining) Using Openstack ...
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC Improvements
 
L4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef LaribiL4-L7 services for SDN and NVF by Youcef Laribi
L4-L7 services for SDN and NVF by Youcef Laribi
 
Architecture of OpenFlow SDNs
Architecture of OpenFlow SDNsArchitecture of OpenFlow SDNs
Architecture of OpenFlow SDNs
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
 
Generic Resource Manager - László Vadkerti, András Kovács
Generic Resource Manager - László Vadkerti, András KovácsGeneric Resource Manager - László Vadkerti, András Kovács
Generic Resource Manager - László Vadkerti, András Kovács
 
DEVNET-1175 OpenDaylight Service Function Chaining
DEVNET-1175	OpenDaylight Service Function ChainingDEVNET-1175	OpenDaylight Service Function Chaining
DEVNET-1175 OpenDaylight Service Function Chaining
 
Dynamic Service Chaining
Dynamic Service Chaining Dynamic Service Chaining
Dynamic Service Chaining
 
Hotplug and Virtio - Tetsuya Mukawa
Hotplug and Virtio - Tetsuya MukawaHotplug and Virtio - Tetsuya Mukawa
Hotplug and Virtio - Tetsuya Mukawa
 
LCA14: LCA14-209: ODP Project Update
LCA14: LCA14-209: ODP Project UpdateLCA14: LCA14-209: ODP Project Update
LCA14: LCA14-209: ODP Project Update
 
Introduction to Beryllium release of OpenDaylight
Introduction to Beryllium release of OpenDaylightIntroduction to Beryllium release of OpenDaylight
Introduction to Beryllium release of OpenDaylight
 
OpenDaylight Update (June 2018)
OpenDaylight Update (June 2018)OpenDaylight Update (June 2018)
OpenDaylight Update (June 2018)
 
Learn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVLearn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFV
 
Container Service Chaining
Container Service ChainingContainer Service Chaining
Container Service Chaining
 

Similar to PMIx - A Collaborative Open Source Effort for Exascale Process Management

Advanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCAdvanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCIJSRD
 
Cwin16 tls-a micro-service deployment - v1.0
Cwin16 tls-a micro-service deployment - v1.0Cwin16 tls-a micro-service deployment - v1.0
Cwin16 tls-a micro-service deployment - v1.0Capgemini
 
Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...Ashley Carter
 
IRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing FrameworkIRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing FrameworkIRJET Journal
 
2008-01-23 Red Hat Overview to CUNY Information Managers Forum
2008-01-23 Red Hat Overview to CUNY Information Managers Forum2008-01-23 Red Hat Overview to CUNY Information Managers Forum
2008-01-23 Red Hat Overview to CUNY Information Managers ForumShawn Wells
 
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...PROIDEA
 
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...EUDAT
 
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...LEGATO project
 
186 devlin p-poster(2)
186 devlin p-poster(2)186 devlin p-poster(2)
186 devlin p-poster(2)vaidehi87
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAHAkash M Shah
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayJosef Adersberger
 
Cloud Computing for National Security Applications
Cloud Computing for National Security ApplicationsCloud Computing for National Security Applications
Cloud Computing for National Security Applicationswhite paper
 
2008-03-06 Harris Corp Security Seminar
2008-03-06 Harris Corp Security Seminar2008-03-06 Harris Corp Security Seminar
2008-03-06 Harris Corp Security SeminarShawn Wells
 
UCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondUCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondEd Dodds
 
Co-Design Architecture for Exascale
Co-Design Architecture for ExascaleCo-Design Architecture for Exascale
Co-Design Architecture for Exascaleinside-BigData.com
 
Performance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpiPerformance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpieSAT Journals
 
Spring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - BostonSpring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - BostonVMware Tanzu
 

Similar to PMIx - A Collaborative Open Source Effort for Exascale Process Management (20)

Advanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCAdvanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPC
 
Cwin16 tls-a micro-service deployment - v1.0
Cwin16 tls-a micro-service deployment - v1.0Cwin16 tls-a micro-service deployment - v1.0
Cwin16 tls-a micro-service deployment - v1.0
 
Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...
 
IRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing FrameworkIRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing Framework
 
2008-01-23 Red Hat Overview to CUNY Information Managers Forum
2008-01-23 Red Hat Overview to CUNY Information Managers Forum2008-01-23 Red Hat Overview to CUNY Information Managers Forum
2008-01-23 Red Hat Overview to CUNY Information Managers Forum
 
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...
PLNOG 17 - Andrzej Jeruzal - Dell Networking OS10: sieciowy system operacyjny...
 
Grid Presentation
Grid PresentationGrid Presentation
Grid Presentation
 
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...
Introduction to HPC Programming Models - EUDAT Summer School (Stefano Markidi...
 
Interconnect your future
Interconnect your futureInterconnect your future
Interconnect your future
 
p850-ries
p850-riesp850-ries
p850-ries
 
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
SAMOS 2018: LEGaTO: first steps towards energy-efficient toolset for heteroge...
 
186 devlin p-poster(2)
186 devlin p-poster(2)186 devlin p-poster(2)
186 devlin p-poster(2)
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAH
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice Way
 
Cloud Computing for National Security Applications
Cloud Computing for National Security ApplicationsCloud Computing for National Security Applications
Cloud Computing for National Security Applications
 
2008-03-06 Harris Corp Security Seminar
2008-03-06 Harris Corp Security Seminar2008-03-06 Harris Corp Security Seminar
2008-03-06 Harris Corp Security Seminar
 
UCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and BeyondUCX: An Open Source Framework for HPC Network APIs and Beyond
UCX: An Open Source Framework for HPC Network APIs and Beyond
 
Co-Design Architecture for Exascale
Co-Design Architecture for ExascaleCo-Design Architecture for Exascale
Co-Design Architecture for Exascale
 
Performance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpiPerformance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpi
 
Spring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - BostonSpring and Pivotal Application Service - SpringOne Tour - Boston
Spring and Pivotal Application Service - SpringOne Tour - Boston
 

More from rcastain

PMIx: Bridging the Container Boundary
PMIx: Bridging the Container BoundaryPMIx: Bridging the Container Boundary
PMIx: Bridging the Container Boundaryrcastain
 
PMIx Tiered Storage Support
PMIx Tiered Storage SupportPMIx Tiered Storage Support
PMIx Tiered Storage Supportrcastain
 
SC'16 PMIx BoF Presentation
SC'16 PMIx BoF PresentationSC'16 PMIx BoF Presentation
SC'16 PMIx BoF Presentationrcastain
 
EuroMPI 2017 PMIx presentation
EuroMPI 2017 PMIx presentationEuroMPI 2017 PMIx presentation
EuroMPI 2017 PMIx presentationrcastain
 
SC'17 BoF Presentation
SC'17 BoF PresentationSC'17 BoF Presentation
SC'17 BoF Presentationrcastain
 
PMIx: Debuggers and Fabric Support
PMIx: Debuggers and Fabric SupportPMIx: Debuggers and Fabric Support
PMIx: Debuggers and Fabric Supportrcastain
 
SC'18 BoF Presentation
SC'18 BoF PresentationSC'18 BoF Presentation
SC'18 BoF Presentationrcastain
 
HPC Resource Management: Futures
HPC Resource Management: FuturesHPC Resource Management: Futures
HPC Resource Management: Futuresrcastain
 

More from rcastain (8)

PMIx: Bridging the Container Boundary
PMIx: Bridging the Container BoundaryPMIx: Bridging the Container Boundary
PMIx: Bridging the Container Boundary
 
PMIx Tiered Storage Support
PMIx Tiered Storage SupportPMIx Tiered Storage Support
PMIx Tiered Storage Support
 
SC'16 PMIx BoF Presentation
SC'16 PMIx BoF PresentationSC'16 PMIx BoF Presentation
SC'16 PMIx BoF Presentation
 
EuroMPI 2017 PMIx presentation
EuroMPI 2017 PMIx presentationEuroMPI 2017 PMIx presentation
EuroMPI 2017 PMIx presentation
 
SC'17 BoF Presentation
SC'17 BoF PresentationSC'17 BoF Presentation
SC'17 BoF Presentation
 
PMIx: Debuggers and Fabric Support
PMIx: Debuggers and Fabric SupportPMIx: Debuggers and Fabric Support
PMIx: Debuggers and Fabric Support
 
SC'18 BoF Presentation
SC'18 BoF PresentationSC'18 BoF Presentation
SC'18 BoF Presentation
 
HPC Resource Management: Futures
HPC Resource Management: FuturesHPC Resource Management: Futures
HPC Resource Management: Futures
 

Recently uploaded

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

PMIx - A Collaborative Open Source Effort for Exascale Process Management

  • 1. Exascale Process Management Interface Ralph Castain Intel Corporation rhc@open-mpi.org Joshua S. Ladd Mellanox Technologies Inc. joshual@mellanox.com Gary Brown Adaptive Computing gbrown@adaptivecomputing.com David Bigagli SchedMD david@schedmd.com Artem Y. Polyakov Mellanox Technologies Inc. artemp@mellanox.com
  • 2. PMIx – PMI exascale Collaborative open source effort led by Intel, Mellanox Technologies, and Adaptive Computing. New collaborators are most welcome! 2
  • 3. Process Management Interface – PMI  PMI is most commonly utilized to bootstrap MPI processes.  Typically, MPI processes “put” data into the KVS data base that is intended to be shared with all other MPI processes, a collective operation that is a logical allgather synchronizes the database.  PMI enables Resource Managers (RMs) to use their infrastructure to implement advanced support for MPI application acting like RTE daemons.  SLURM supports both PMI-1/PMI-2 (http://slurm.schedmd.com/mpi_guide.html) RTE daemon MPI process Connect ORTEd Hydra ssh/ rsh PMI pbsdsh tm_spawn() blaunch lsb_launch() launch control IO forward Remote Execution Service Unix TORQUE LSF SLURM MPD qrsh llspawn Load Leveler SGE srun SLURM PMI MPI RTE (MPICH2, MVAPICH, Open MPI ORTE) 3
  • 4. PMIx – PMI exascale (What and Why)  What is it?  Extended Process Management Interface.  Why?  MPI/OSHMEM job launch time is a hot topic!  Extreme-scale system requirements: 30 second job launch time for O(106) MPI processes.  Scaling studies have illuminated many limitations of current PMI- 1/PMI-2 interfaces at extreme scale.  Tight integration with Resource Managers can drastically reduce the amount of data that needs to be exchanged during MPI_Init. 4
  • 5. PMIx – PMI exascale (Technical Goals)  Reduce the memory footprint from O(N) to O(1) by leveraging shared memory and distributed databases.  Reduce the volume of data exchanged in collective operations with scoping hints.  Provide the ability to overlap communication with computation with non-blocking collectives and get operations.  Support both collective communication modes of data exchange and point-to-point "direct" data retrieval.  Reduce the amount of local messages exchanged between application processes and RTE daemons (many-core nodes).  Use high-speed HPC interconnects available on the system for the data exchange.  Extend "Application – Resource Manager" interface to support fault-tolerance and energy-efficiency requirements. 5
  • 6. PMIx implementation architecture RTE daemon MPI process PMIx-client PMIx-server MPI process PMIx-client PMIx Shared mem (store blobs) usock usock RTEMPI PMIx PMIx MPI PMIx SHMEM RTEMPI PMIx PMIx MPI PMIx SHMEM RTEMPI PMIx PMIx MPI PMIx SHMEM 6
  • 7. PMIx v1.0 features  Data scoping with 3 levels of locality: local, remote, global.  Communication scoping: PMIx_Fence under arbitrary subset of processes.  Full support for point-to-point "direct" data retrieval well suited for applications with sparse communication graphs.  Full support for non-blocking operations.  Support for “binary blobs”: PMIx client retrieves process data only once as one chunk reducing intra-node exchanges and encoding/decoding overhead.  Basic support for MPI dynamic process management; 7
  • 8. PMIx v2.0 features Performance enhancements:  One instance of database per node with "zero-message" data access using shared-memory.  Distributed database for storing Key-Values.  Enhanced support for collective operations. Functional enhancements:  Extended support for dynamic allocation and process management suitable for other HPC paradigms (not MPI-only.)  Power management interface to RMs.  File positioning service.  Event notification service enabling fault tolerant-aware applications.  Fabric QoS and security controls. 8
  • 9. SLURM PMIx plugin PMIx support in SLURM  Implemented as a new MPI plugin called "pmix".  To use it: a) either set as a command line command line parameter: $ srun –mpi=pmix ./a.out b) or set PMIx plugin as the default in slurm.conf file: MpiDefault = pmix  Development version of the plugin is available on github: https://github.com/artpol84/slurm/tree/pmix-step2  Beta version of PMIx plugin will be available in the next SLURM major release (15.11.x) at SC 2015. 9
  • 10. PMIx development timeline 2015 2016 2017 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4 Q1 Q2 Q3 Q4  PMIx 1.0: • basic feature set; • initial performance optimizations.  Open MPI integration (already in master).  SLURM PMIx plugin (15.11.x release)  PMIx 2.0: • memory footprint improvements; • distributed database storage; • internal collectives implementation and integration with existing collectives libraries (Mellanox HCOLL); • enhanced RM API.  Update of Open MPI and SLURM integration.  LSF and Moab/TORQUE support. 10
  • 11. Contribute or Follow Along!  Project: https://www.open-mpi.org/projects/pmix/  Code: https://github.com/open-mpi/pmix Contributions are welcomed! 11