Ralph H. Castain
Intel
Joshua Hursey
IBM
PMIx: State-of-the-Union
https://pmix.org
Slack: pmix-workspace.slack.com
Agenda
• Quick status update
 Library releases
 Standards documents
• Application examples
 PMIx Basics: Querying information
 OpenSHMEM
 Interlibrary coordination
 MPI Sessions: Dynamic Process Grouping
 Dynamic Programming Models (web only)
Ralph H. Castain, Joshua Hursey, Aurelien Bouteiller, David Solt, “PMIx: Process management for exascale
environments”, Parallel Computing, 2018.
https://doi.org/10.1016/j.parco.2018.08.002
Overview Paper
• Standards Documents
 v2.0 released Sept. 27, 2018
 v2.1 (errata/clarifications, Dec.)
 v3.0 (Dec.)
 v4.0 (1Q19)
Status Snapshot
Current Library Releases
https://github.com/pmix/pmix-standard/releases
• Clarify terms
 Session, job, application
• New chapter detailing namespace registration
data organization
• Data retrieval examples (see the sketch after this slide)
 Application size in multi-app scenarios
 What rank to provide for which data attributes
PMIx Standard – In Progress
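As a rough illustration of the two retrieval questions above, here is a minimal sketch. It assumes the reserved keys PMIX_JOB_SIZE and PMIX_APP_SIZE, and the convention that job-level data is requested against PMIX_RANK_WILDCARD while application-level data is requested against the caller's own rank (exactly the kind of rule the in-progress text is pinning down):
pmix_proc_t myproc, wildcard;
pmix_value_t *val;
uint32_t job_size, app_size;
PMIx_Init(&myproc, NULL, 0);
PMIX_PROC_CONSTRUCT(&wildcard);
(void)strncpy(wildcard.nspace, myproc.nspace, PMIX_MAX_NSLEN);
wildcard.rank = PMIX_RANK_WILDCARD;
/* Job-level data: total number of processes across all applications */
PMIx_Get(&wildcard, PMIX_JOB_SIZE, NULL, 0, &val);
job_size = val->data.uint32;
/* Application-level data in a multi-app (MPMD) job: size of our own application */
PMIx_Get(&myproc, PMIX_APP_SIZE, NULL, 0, &val);
app_size = val->data.uint32;
PMIx_Finalize(NULL, 0);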
• Standards Documents
 v2.0 released Sept. 27, 2018
 v2.1 (errata/clarifications, Dec.)
 v3.0 (Dec.)
 v4.0 (1Q19)
• Reference Implementation
 v3.0 => v3.1 (Dec)
 v2.1 => v2.2 (Dec)
Status Snapshot
Current Library Releases
New dstore implementation
Full Standard Compliance
• v2 => workflow orchestration
 (Ch.7) Job Allocation Management and Reporting, (Ch.8) Event Notification
• v3 => tools support
 Security credentials, I/O management
• v4 => tools, network, dynamic programming models
 Complete debugger attach/detach/re-attach
• Direct and indirect launch
 Fabric topology information (switches, NICs, …)
• Communication cost, connectivity graphs, inventory collection
 PMIx Groups
 Bindings (Python, …)
What’s in the Standard?
• PRRTE
 Formal releases to begin in Dec. or Jan.
 Ties to external PMIx, libevent, hwloc installations
 Support up thru current head PMIx master branch
• Cross-version support
 Regularly tested, automated script
• Help Wanted: Container developers for regular regression testing
 v2.1.1 and above remain interchangeable
 https://pmix.org/support/faq/how-does-pmix-work-with-containers/
Status Snapshot
Adoption Updates
• MPI use-cases
 Re-ordering for load balance
(UTK/ECP)
 Fault management (UTK)
 On-the-fly session
formation/teardown (MPIF)
• MPI libraries
 OMPI, MPICH, Intel MPI,
HPE-MPI, Spectrum MPI,
Fujitsu MPI
• Resource Managers (RMs)
 Slurm, Fujitsu,
IBM’s Job Step Manager (JSM),
PBSPro (2019), Kubernetes(?)
• Spark, TensorFlow
 Just getting underway
• OpenSHMEM
 Multiple implementations
Artem Y. Polyakov, Joshua S. Ladd, Boris I. Karasev
Mellanox Technologies
Shared Memory Optimizations (ds21)
• Improvements to intra-node performance
 Highly optimized the intra-node communication component dstore/ds21 [1]
• Significantly improved the PMIx_Get latency using the 2N-lock algorithm on systems with a large
number of processes per node accessing many keys.
• Designed for exascale: deployed in production on the CORAL systems Summit and Sierra (POWER9-based).
• Available in PMIx v2.2, v3.1, and subsequent releases.
 Enabled by default.
• Improvements to the SLURM PMIx plug-in
 Added a ring-based PMIx_Fence collective (Slurm v18.08)
 Added support for high-performance RDMA point-to-point
via UCX (Slurm v17.11) [2]
 Verified PMIx compatibility through v3.x API (Slurm v18.08)
[1] Artem Polyakov et al. A Scalable PMIx Database (poster) EuroMPI’18: https://eurompi2018.bsc.es/sites/default/files/uploaded/dstore_empi2018.pdf
[2] Artem Polyakov PMIx Plugin with UCX Support (SC17), SchedMD Booth talk: https://slurm.schedmd.com/publications.html
Mellanox update
PMIx_Get() latency (POWER8 system)
Joshua Hursey
IBM
PMIx Basics: Querying Information
Harnessing PMIx capabilities
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <pmix.h>
int main(int argc, char **argv) {
    pmix_proc_t myproc, proc_wildcard;
    pmix_value_t value;
    pmix_value_t *val = &value;
    uint32_t job_size;
    char hostname[256];
    PMIx_Init(&myproc, NULL, 0);
    PMIX_PROC_CONSTRUCT(&proc_wildcard);
    (void)strncpy(proc_wildcard.nspace, myproc.nspace, PMIX_MAX_NSLEN);
    proc_wildcard.rank = PMIX_RANK_WILDCARD;
    PMIx_Get(&proc_wildcard, PMIX_JOB_SIZE, NULL, 0, &val);
    job_size = val->data.uint32;
    PMIx_Get(&myproc, PMIX_HOSTNAME, NULL, 0, &val);
    strncpy(hostname, val->data.string, 256);
    printf("%d/%d) Hello World from %s\n", myproc.rank, job_size, hostname);
    PMIx_Get(&proc_wildcard, PMIX_LOCALLDR, NULL, 0, &val);   // Lowest rank on this node
    printf("%d/%d) Lowest Local Rank: %d\n", myproc.rank, job_size, val->data.rank);
    PMIx_Get(&proc_wildcard, PMIX_LOCAL_SIZE, NULL, 0, &val); // Number of ranks on this node
    printf("%d/%d) Local Ranks: %d\n", myproc.rank, job_size, val->data.uint32);
    PMIx_Fence(&proc_wildcard, 1, NULL, 0); // Synchronize processes (not required, just for demo)
    PMIx_Finalize(NULL, 0); // Cleanup
    return 0;
}
• https://pmix.org/support/faq/rm-provided-information/
• https://pmix.org/support/faq/what-information-is-an-rm-supposed-to-provide/
Harnessing PMIx capabilities
#include <stdbool.h>
#include <pmix.h>
volatile bool isdone = false;
static void cbfunc(pmix_status_t status, pmix_info_t *info, size_t ninfo, void *cbdata,
                   pmix_release_cbfunc_t release_fn, void *release_cbdata)
{
    size_t n;
    if (0 < ninfo) {
        for (n = 0; n < ninfo; n++) {
            // Iterate through the info[n].value.data.darray->array of:
            /*
            typedef struct pmix_proc_info {
                pmix_proc_t proc;
                char *hostname;
                char *executable_name;
                pid_t pid;
                int exit_code;
                pmix_proc_state_t state;
            } pmix_proc_info_t;
            */
        }
    }
    if (NULL != release_fn) {
        release_fn(release_cbdata);
    }
    isdone = true;
}
• https://pmix.org/support/how-to/example-direct-launch-debugger-tool/
• https://pmix.org/support/how-to/example-indirect-launch-debugger-tool/
#include <stdlib.h>
#include <unistd.h>
#include <pmix.h>
int main(int argc, char **argv) {
    pmix_query_t *query;
    pmix_status_t rc;
    char clientspace[PMIX_MAX_NSLEN+1];
    // ... Initialize and setup – PMIx_tool_init() ...
    /* Get the proctable for this nspace */
    PMIX_QUERY_CREATE(query, 1);
    PMIX_ARGV_APPEND(rc, query[0].keys, PMIX_QUERY_PROC_TABLE);
    query[0].nqual = 1;
    PMIX_INFO_CREATE(query->qualifiers, query[0].nqual);
    PMIX_INFO_LOAD(&query->qualifiers[0], PMIX_NSPACE, clientspace, PMIX_STRING);
    if (PMIX_SUCCESS != (rc = PMIx_Query_info_nb(query, 1, cbfunc, NULL))) {
        exit(42);
    }
    /* Wait for a response */
    while (!isdone) { usleep(10); }
    // ... Finalize and cleanup
}
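For completeness, the tool setup elided above might look roughly like the following minimal sketch. It assumes the tool connects to a default local PMIx server and receives the target namespace on its command line; a production tool would typically pass directives through the info array:
pmix_proc_t mytool;
if (PMIX_SUCCESS != PMIx_tool_init(&mytool, NULL, 0)) {
    exit(1);
}
/* target namespace, e.g. taken from the tool's command line */
(void)strncpy(clientspace, argv[1], PMIX_MAX_NSLEN);
/* ... run the query ... */
PMIx_tool_finalize();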
Tony Curtis <anthony.curtis@stonybrook.edu>
Dr. Barbara Chapman
Abdullah Shahneous Bari Sayket, Wenbin Lü, Wes Suttle
Stony Brook University
OSSS OpenSHMEM-on-UCX / Fault Tolerance
https://www.iacs.stonybrook.edu/
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• OpenSHMEM
 Partitioned Global Address Space library
 Take advantage of RDMA
• put/get directly to/from remote memory
• Atomics, locks
• Collectives
• And more…
http://www.openshmem.org/
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• OpenSHMEM: our bit
 Reference Implementation
• OSSS + LANL + SBU + Rice + Duke + Rhodes
• Wireup/maintenance: PMIx
• Communication substrate: UCX
 shm, IB, uGNI, CUDA, …
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• Reference Implementation
 PMIx/UCX for OpenSHMEM ≥ 1.4
 Also for < 1.4 based on GASNet
 Open-Source @ github
• Or my dev repo if you’re brave
• Our research vehicle
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• Reference Implementation
 PMIx/PRRTE used as o-o-b startup / launch
 Exchange of wireup info e.g. heap addresses/rkeys
 Rank/locality and other “ambient” queries
 Stays out of the way until OpenSHMEM is finalized
 Could kill it once we’re up, but it is kept alive for the fault tolerance
project
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• Fault Tolerance (NSF) #1
 Project with UTK and Rutgers
 Current OpenSHMEM specification lacks general
fault tolerance (FT) features
 PMIx has basic FT building blocks already
• Event handling, process monitoring, job control
 Using these features to build FT API for
OpenSHMEM
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• Fault Tolerance (NSF) #2
 User specifies
• Desired process monitoring scheme
• Application-specific fault mitigation procedure
 Then our FT API takes care of:
• Initiating specified process monitoring
• Registering a fault mitigation routine with the PMIx server (sketched below)
 PMIx takes care of the rest
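A minimal sketch of the "register a fault mitigation routine" step, using the standard PMIx event-handler API; the status code below is illustrative, and the codes actually monitored by the FT API may differ:
static void fault_handler(size_t evhdlr_registration_id, pmix_status_t status,
                          const pmix_proc_t *source,
                          pmix_info_t info[], size_t ninfo,
                          pmix_info_t results[], size_t nresults,
                          pmix_event_notification_cbfunc_fn_t cbfunc, void *cbdata)
{
    /* application-specific mitigation for the failed process identified by *source */
    if (NULL != cbfunc) {
        cbfunc(PMIX_EVENT_ACTION_COMPLETE, NULL, 0, NULL, NULL, cbdata);
    }
}
static void register_fault_handler(void)
{
    pmix_status_t code = PMIX_ERR_PROC_ABORTED;  /* illustrative event code */
    PMIx_Register_event_handler(&code, 1, NULL, 0, fault_handler, NULL, NULL);
}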
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• Fault Tolerance (NSF) #3
 Longer-term goal: automated selection and
implementation of FT techniques
 Compiler chooses from a small set of pre-packaged
FT schemes
 Appropriate technique selected based on application
structure and data access patterns
 Compiler implements the scheme at compile-time
PMIx@SC18: OSSS OpenSHMEM-on-UCX / Fault Tolerance
• PMIx @ github
 We try to keep up on the bleeding edge of both PMIx
& PRRTE for development
 But make sure releases work too!
 We’ve opened quite a few tickets
• Always fixed or nixed quickly!
Any (easy) Questions?
Tony Curtis <anthony.curtis@stonybrook.edu>
Dr. Barbara Chapman
Abdullah Shahneous Bari Sayket, Wenbin Lü, Wes Suttle
Stony Brook University
OSSS OpenSHMEM-on-UCX / Fault Tolerance
https://www.iacs.stonybrook.edu/
http://www.openshmem.org/
http://www.openucx.org/
Geoffroy Vallee
Oak Ridge National Laboratory
Runtime Coordination Based on PMIx
Introduction
• Many pre-exascale systems show a fairly
drastic hardware architecture change
 Bigger/complex compute nodes
 Summit: 2 IBM Power-9 CPUs, 6 GPUs per node
• Requires most scientific simulations to switch
from pure MPI to an MPI+X paradigm
• MPI+OpenMP is a popular solution
Challenges
• MPI and OpenMP are independent communities
• Implementations are unaware of each other
 MPI will deploy ranks without any knowledge of the
OpenMP threads that will run under each rank
 OpenMP assumes by default that all available
resources can be used
It is very difficult for users to have fine-grained control
over application execution
Context
• U.S. DOE Exascale Computing Project (ECP)
• ECP OMPI-X project
• ECP SOLLVE project
• Oak Ridge Leadership Computing Facility
(OLCF)
Challenges (2)
• Impossible to coordinate runtimes and optimize
resource utilization
 Runtimes independently allocate resources
 No fine-grain & coordinated control over the
placement of MPI ranks and OpenMP threads
 Difficult to express affinity across runtimes
 Difficult to optimize applications
Illustration
• On Summit nodes
[Diagram: Summit node architecture (two NUMA domains, six GPUs) alongside the desired placement of MPI ranks / OpenMP master threads and OpenMP threads across those resources]
Proposed Solution
• Make all runtimes PMIx aware for data
exchange and notification through events
• Key PMIx characteristics
 Distributed key/value store for data exchange (sketched below)
 Asynchronous events for coordination
 Enable interactions with the resource manager
• Modification of LLVM OpenMP and Open MPI
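A minimal sketch of the data-exchange pattern this enables, assuming an illustrative, non-standard key name "myrt.nthreads" (the real MOC keys are project-specific) and with myproc being the caller's own identity from PMIx_Init:
pmix_value_t kv;
pmix_value_t *result;
pmix_proc_t peer;
/* this runtime posts a value into the distributed key/value store */
kv.type = PMIX_UINT32;
kv.data.uint32 = 7;                      /* e.g., threads this rank intends to create */
PMIx_Put(PMIX_GLOBAL, "myrt.nthreads", &kv);
PMIx_Commit();
/* one way to make the data globally visible */
PMIx_Fence(NULL, 0, NULL, 0);
/* another runtime (or rank) retrieves it */
PMIX_PROC_LOAD(&peer, myproc.nspace, 0);
PMIx_Get(&peer, "myrt.nthreads", NULL, 0, &result);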
Two Use-cases
• Fine-grain placement over MPI ranks and
OpenMP threads (prototyping stage)
• Inter-runtime progress (design stage)
Fine-grain Placement
• Goals
 Explicit placement of MPI ranks
 Explicit placement of OpenMP threads
 Re-configure MPI+OpenMP layout at runtime
• Propose the concept of layout
 Static layout: explicit definition of ranks and threads
placement at initialization time only
 Dynamic layouts: change placement at runtime
Layout: Example
Static Layouts
• New MPI mapper
 Get the layout specified by the user
 Publish it through PMIx
• A new MPI OpenMP Coordination (MOC) helper library gets the layout and sets
OpenMP places to specify where threads will be created
(see the sketch after this slide)
• No modification to OpenMP standard or runtime
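A hedged sketch of the "publish it through PMIx" step, assuming a hypothetical MOC key name "moc.layout" and a string-encoded layout; the actual MOC key names and encoding are not shown in these slides:
/* MPI mapper side: publish the user-specified layout */
pmix_info_t info;
PMIX_INFO_LOAD(&info, "moc.layout", "rank0:cores0-3;rank1:cores4-7", PMIX_STRING);
PMIx_Publish(&info, 1);
PMIX_INFO_DESTRUCT(&info);
/* MOC helper-library side: look the layout up and translate it into OpenMP places */
pmix_pdata_t pdata;
PMIX_PDATA_CONSTRUCT(&pdata);
(void)strncpy(pdata.key, "moc.layout", PMIX_MAX_KEYLEN);
PMIx_Lookup(&pdata, 1, NULL, 0);
/* pdata.value.data.string now holds the layout string */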
Dynamic Layouts
• New API to define phases within the application
 A static layout is deployed for each phase
 Modification of the layout between phases (w/ MOC)
 Well adapted to the nature of some applications
• Requires
 OpenMP runtime modifications (LLVM prototype)
• Get phases’ layouts and set new places internally
 OpenMP standard modifications to allow dynamicity
Architecture Overview
• MOC handles inter-runtime data exchange
(via MOC-specific PMIx keys)
• Runtimes are PMIx-aware
• Runtimes have a layout-aware
mapper/affinity manager
• Respect the separation between
standards, resource manager
and runtimes
Inter-runtime Progress
• Runtimes usually guarantee internal progress but
are not aware of other runtimes
• Runtimes can end up in a deadlock if mutually
waiting for completion
 An MPI operation is ready to complete
 The application calls into, and blocks in, another runtime
Inter-runtime Progress (2)
• Proposed solution
 Runtimes use PMIx to notify of progress (sketched below)
 When a runtime gets a progress notification, it yields
once done with all tasks that can be completed
• Currently working on prototyping
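A minimal sketch of the notification side, assuming a hypothetical application-defined event code (real codes must be chosen outside the PMIx-reserved range); the receiving runtime would register a handler for the same code, much like the fault-tolerance example earlier:
/* hypothetical app-defined event code */
#define RT_EVT_MPI_PROGRESS_READY (-4242)
void signal_progress(const pmix_proc_t *me)
{
    /* tell co-located runtimes that MPI has an operation ready to complete */
    PMIx_Notify_event(RT_EVT_MPI_PROGRESS_READY, me,
                      PMIX_RANGE_LOCAL,   /* processes on this node only */
                      NULL, 0, NULL, NULL);
}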
Conclusion
• There is a real need for making runtimes aware
of each other
• PMIx is a privileged option
 Asynchronous / event based
 Data exchange
 Enable interaction with the resource manager
• Opens new opportunities for both research and
development (e.g. new project focusing on QoS)
Dan Holmes
EPCC University of Edinburgh
Howard Pritchard, Nathan Hjelm
Los Alamos National Laboratory
MPI Sessions: Dynamic Proc. Groups
Using PMIx to Help
Replace MPI_Init
14th November 2018
Dan Holmes, Howard Pritchard, Nathan Hjelm
LA-UR-18-30830
Problems with MPI_Init
● All MPI processes must initialize MPI exactly once
● MPI cannot be initialized within an MPI process from different
application components without coordination
● MPI cannot be re-initialized after MPI is finalized
● Error handling for MPI initialization cannot be specified
Sessions – a new way to start MPI
● General scheme:
● Query the underlying
run-time system
● Get a “set” of
processes
● Determine the processes
you want
● Create an MPI_Group
● Create a communicator
with just those processes
● Create an MPI_Comm
[Diagram: MPI_Session → query runtime for a set of processes → MPI_Group → MPI_Comm]
MPI Sessions proposed API
● Create (or destroy) a session:
● MPI_SESSION_INIT (and MPI_SESSION_FINALIZE)
● Get names of sets of processes:
● MPI_SESSION_GET_NUM_PSETS,
MPI_SESSION_GET_NTH_PSET
● Create an MPI_GROUP from a process set name:
● MPI_GROUP_CREATE_FROM_SESSION
● Create an MPI_COMM from an MPI_GROUP:
● MPI_COMM_CREATE_FROM_GROUP
PMIx Groups helps here
MPI_COMM_CREATE_FROM_GROUP
The ‘uri’ is supplied by the application.
Implementation challenge: ‘group’ is a local object.
Need some way to synchronize with other “joiners” to the
communicator. The ‘uri’ is different than a process set
name.
MPI_Create_comm_from_group(IN MPI_Group group,
IN const char *uri,
IN MPI_Info info,
IN MPI_Errhandler hndl,
OUT MPI_Comm *comm);
Using PMIx Groups
● PMIx Groups - a collection of processes desiring a unified
identifier for purposes such as passing events or participating in
PMIx fence operations
● Invite/join/leave semantics
● Sessions prototype implementation currently uses
PMIX_Group_construct/PMIX_Group_destruct
● Can be used to generate a “unique” 64-bit identifier for the group.
Used by the sessions prototype to generate a communicator ID.
● Useful options for future work
● Timeout for processes joining the group
● Asynchronous notification when a process leaves the group
Using PMIx_Group_Construct
● ‘id’ maps to/from the ‘uri’ in MPI_Comm_create_from_group
(plus additional Open MPI internal info)
● ‘procs’ array comes from information previously supplied by PMIx
○ “mpi://world” and “mpi://self” already available
○ mpiexec -np 2 --pset user://ocean ocean.x : -np 2 --pset user://atmosphere atmosphere.x
PMIx_Group_Construct(const char id[],
const pmix_proc_t procs[],
const pmix_info_t info[],
size_t ninfo);
MPI Sessions Prototype Status
● All “MPI_*_from_group” functions have been implemented
● Only pml/ob1 supported at this time
● Working on MPI_Group creation (from PSET) now
● Will start with mpi://world and mpi://self
● User-defined process sets need additional support from PMIx
● Up next: break apart MPI initialization
● Goal is to reduce startup time and memory footprint
Summary
● PMIx Groups provides an OOB mechanism for MPI processes to
bootstrap the formation of a communication context (MPI
Communicator) from a group of MPI processes
● Functionality for future work
● Handling (unexpected) process exit
● User-defined process sets
● Group expansion
Funding Acknowledgments
Useful Links:
General Information: https://pmix.org/
PMIx Library: https://github.com/pmix/pmix
PMIx Reference RunTime Environment (PRRTE): https://github.com/pmix/prrte
PMIx Standard: https://github.com/pmix/pmix-standard
Slack: pmix-workspace.slack.com
Q&A