SlideShare a Scribd company logo
1 of 9
SR-IOV: The Key Enabling Technology for
Fully Virtualized HPC Clusters!
Glenn K. Lockwood!
Christopher Irving!
Philip M. Papadopoulos!
Mahidhar Tatineni!
Rick Wagner!
SAN DIEGO SUPERCOMPUTER CENTER
Single Root I/O Virtualization in HPC!
•  Problem: complex workflows demand increasing
flexibility from HPC platforms"
•  Virtualization = flexibility"
•  Virtualization = IO performance loss (e.g.,
excessive DMA interrupts)"
•  Solution: SR-IOV and Mellanox ConnectX-3
InfiniBand HCAs "
•  One physical function (PF) à multiple virtual
functions (VF), each with own DMA streams,
memory space, interrupts"
•  Allows DMA to bypass hypervisor to VMs!
SAN DIEGO SUPERCOMPUTER CENTER
High-Performance Virtualization
on Comet !
•  Mellanox FDR InfiniBand HCAs with SR-IOV"
•  Rocks and OpenStack Nova to manage VMs"
•  Flexibility to support complex science
gateways and web-based workflow engines"
•  custom compute appliances and virtual clusters
developed with FutureGrid and their existing
expertise"
•  backed by virtualized Lustre running over virtualized
InfiniBand"
SAN DIEGO SUPERCOMPUTER CENTER
Hardware/Software Configuration of Test Cluster !
Native, SR-IOV!

Amazon EC2!

Platform"

•  Rocks 6.1 (EL6)"
•  Virtualization via kvm	
  

•  Amazon Linux 2013.03 (EL6)"
•  cc2.8xlarge Instances"

CPUs"

•  2x Xeon E5-2660 (2.2GHz)"
•  16 cores per node"

•  2x Xeon E5-2670 (2.6GHz)"
•  16 cores per node"

RAM"

•  64 GB DDR3 DRAM"

•  60.5 DDR3 DRAM"

Interconnect" •  QDR4X InfiniBand"
•  10 GbE"
•  Mellanox ConnectX-3 (MT27500)" •  common placement group"
•  Intel VT-d, SR-IOV enabled in
firmware, kernel, drivers"
•  mlx4_core	
  1.1"
•  Mellanox OFED 2.0"
•  HCA firmware 2.11.1192"
SAN DIEGO SUPERCOMPUTER CENTER
50x less latency than Amazon EC2!
•  SR-IOV!
•  < 30% overhead for M <
128 bytes"
•  < 10% overhead for
eager send/recv"
•  Overhead à 0% for
bandwidth-limited regime"
•  Amazon EC2!
•  > 5000% worse latency"
•  Time dependent (noisy)"
5
	

SAN DIEGO SUPERCOMPUTER CENTER

OSU Microbenchmarks (3.9, osu_latency)"
10x more bandwidth than Amazon EC2!
•  SR-IOV!
•  < 2% bandwidth loss
over entire range"
•  > 95% peak bandwidth"
•  Amazon EC2!
•  < 35% peak bandwidth"
•  900% to 2500% worse
bandwidth than
virtualized InfiniBand"
6
	

SAN DIEGO SUPERCOMPUTER CENTER

OSU Microbenchmarks (3.9, osu_bw)"
Weather Modeling – 15% Overhead!
•  96-core (6-node)
calculation"
•  Nearest-neighbor
communication"
•  Scalable algorithms"
•  SR-IOV incurs modest
(15%) performance hit"
•  ...but still still 20%
faster*** than Amazon"
SAN DIEGO SUPERCOMPUTER CENTER

WRF 3.4.1 – 3hr forecast"
*** 20% faster despite SR-IOV cluster having 20% slower CPUs"
Quantum ESPRESSO: 5x Faster than EC2!
•  48-core, 3 node calc"
•  CG matrix inversion
(irregular comm.)"
•  3D FFT matrix
transposes (All-to-all
communication)"
•  28% slower w/ SR-IOV"
•  SR-IOV still > 500%
faster*** than EC2"
SAN DIEGO SUPERCOMPUTER CENTER

Quantum Espresso 5.0.2 – DEISA AUSURF112 benchmark"
*** 20% faster despite SR-IOV cluster having 20% slower CPUs"
Conclusions!
•  SR-IOV: huge step forward in high-performance virtualization"
•  Shows substantial improvement in latency over Amazon EC2, and
it provides nearly 0 bandwidth overhead!
•  Benchmark application performance confirms this: significant
improvement over EC2"

•  SR-IOV: lowers performance barrier to virtualizing the
interconnect and makes fully virtualized HPC clusters
viable!
•  Comet will deliver virtualized HPC to new/non-traditional
communities that need flexibility without major loss of performance!
SAN DIEGO SUPERCOMPUTER CENTER

More Related Content

What's hot

Routed networks sydney
Routed networks sydneyRouted networks sydney
Routed networks sydneyMiguel Lavalle
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
Kvm and libvirt
Kvm and libvirtKvm and libvirt
Kvm and libvirtplarsen67
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringShapeBlue
 
VMware vSphere 6.0 - Troubleshooting Training - Day 1
VMware vSphere 6.0 - Troubleshooting Training - Day 1VMware vSphere 6.0 - Troubleshooting Training - Day 1
VMware vSphere 6.0 - Troubleshooting Training - Day 1Sanjeev Kumar
 
VMware Advance Troubleshooting Workshop - Day 2
VMware Advance Troubleshooting Workshop - Day 2VMware Advance Troubleshooting Workshop - Day 2
VMware Advance Troubleshooting Workshop - Day 2Vepsun Technologies
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101Weaveworks
 
OpenvSwitch Deep Dive
OpenvSwitch Deep DiveOpenvSwitch Deep Dive
OpenvSwitch Deep Diverajdeep
 
how to install VMware
how to install VMwarehow to install VMware
how to install VMwarertchandu
 
05.2 virtio introduction
05.2 virtio introduction05.2 virtio introduction
05.2 virtio introductionzenixls2
 
Virtualized network with openvswitch
Virtualized network with openvswitchVirtualized network with openvswitch
Virtualized network with openvswitchSim Janghoon
 
VMworld 2017 - Top 10 things to know about vSAN
VMworld 2017 - Top 10 things to know about vSANVMworld 2017 - Top 10 things to know about vSAN
VMworld 2017 - Top 10 things to know about vSANDuncan Epping
 
VMware HCI solutions - 2020-01-16
VMware HCI solutions - 2020-01-16VMware HCI solutions - 2020-01-16
VMware HCI solutions - 2020-01-16David Pasek
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Load Balancing-as-a-Service (LBaaS) with octavia in openstack
Load Balancing-as-a-Service (LBaaS) with octavia in openstackLoad Balancing-as-a-Service (LBaaS) with octavia in openstack
Load Balancing-as-a-Service (LBaaS) with octavia in openstackYashar Esmaildokht
 
Introduction to BTRFS and ZFS
Introduction to BTRFS and ZFSIntroduction to BTRFS and ZFS
Introduction to BTRFS and ZFSTsung-en Hsiao
 
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...OpenStack Korea Community
 

What's hot (20)

Routed networks sydney
Routed networks sydneyRouted networks sydney
Routed networks sydney
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
Kvm and libvirt
Kvm and libvirtKvm and libvirt
Kvm and libvirt
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uring
 
VMware vSphere 6.0 - Troubleshooting Training - Day 1
VMware vSphere 6.0 - Troubleshooting Training - Day 1VMware vSphere 6.0 - Troubleshooting Training - Day 1
VMware vSphere 6.0 - Troubleshooting Training - Day 1
 
VMware Advance Troubleshooting Workshop - Day 2
VMware Advance Troubleshooting Workshop - Day 2VMware Advance Troubleshooting Workshop - Day 2
VMware Advance Troubleshooting Workshop - Day 2
 
The kvm virtualization way
The kvm virtualization wayThe kvm virtualization way
The kvm virtualization way
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101
 
Xen Hypervisor
Xen HypervisorXen Hypervisor
Xen Hypervisor
 
OpenvSwitch Deep Dive
OpenvSwitch Deep DiveOpenvSwitch Deep Dive
OpenvSwitch Deep Dive
 
how to install VMware
how to install VMwarehow to install VMware
how to install VMware
 
05.2 virtio introduction
05.2 virtio introduction05.2 virtio introduction
05.2 virtio introduction
 
Virtualized network with openvswitch
Virtualized network with openvswitchVirtualized network with openvswitch
Virtualized network with openvswitch
 
VMworld 2017 - Top 10 things to know about vSAN
VMworld 2017 - Top 10 things to know about vSANVMworld 2017 - Top 10 things to know about vSAN
VMworld 2017 - Top 10 things to know about vSAN
 
VMware HCI solutions - 2020-01-16
VMware HCI solutions - 2020-01-16VMware HCI solutions - 2020-01-16
VMware HCI solutions - 2020-01-16
 
Virtualization
VirtualizationVirtualization
Virtualization
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Load Balancing-as-a-Service (LBaaS) with octavia in openstack
Load Balancing-as-a-Service (LBaaS) with octavia in openstackLoad Balancing-as-a-Service (LBaaS) with octavia in openstack
Load Balancing-as-a-Service (LBaaS) with octavia in openstack
 
Introduction to BTRFS and ZFS
Introduction to BTRFS and ZFSIntroduction to BTRFS and ZFS
Introduction to BTRFS and ZFS
 
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
 

Viewers also liked

Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Arm
 
A practical approach to securing embedded and io t platforms
A practical approach to securing embedded and io t platformsA practical approach to securing embedded and io t platforms
A practical approach to securing embedded and io t platformsArm
 
Developing functional safety systems with arm architecture solutions stroud
Developing functional safety systems with arm architecture solutions   stroudDeveloping functional safety systems with arm architecture solutions   stroud
Developing functional safety systems with arm architecture solutions stroudArm
 
Efficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesEfficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesArm
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
Software development in ar mv8 m architecture - yiu
Software development in ar mv8 m architecture - yiuSoftware development in ar mv8 m architecture - yiu
Software development in ar mv8 m architecture - yiuArm
 
So you think developing an SoC needs to be complex or expensive?
So you think developing an SoC needs to be complex or expensive?So you think developing an SoC needs to be complex or expensive?
So you think developing an SoC needs to be complex or expensive?Arm
 
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)inside-BigData.com
 
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUG
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUGMicroservices Tracing With Spring Cloud and Zipkin @Szczecin JUG
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUGMarcin Grzejszczak
 

Viewers also liked (9)

Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
Optimizing ARM cortex a and cortex-m based heterogeneous multiprocessor syste...
 
A practical approach to securing embedded and io t platforms
A practical approach to securing embedded and io t platformsA practical approach to securing embedded and io t platforms
A practical approach to securing embedded and io t platforms
 
Developing functional safety systems with arm architecture solutions stroud
Developing functional safety systems with arm architecture solutions   stroudDeveloping functional safety systems with arm architecture solutions   stroud
Developing functional safety systems with arm architecture solutions stroud
 
Efficient software development with heterogeneous devices
Efficient software development with heterogeneous devicesEfficient software development with heterogeneous devices
Efficient software development with heterogeneous devices
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Software development in ar mv8 m architecture - yiu
Software development in ar mv8 m architecture - yiuSoftware development in ar mv8 m architecture - yiu
Software development in ar mv8 m architecture - yiu
 
So you think developing an SoC needs to be complex or expensive?
So you think developing an SoC needs to be complex or expensive?So you think developing an SoC needs to be complex or expensive?
So you think developing an SoC needs to be complex or expensive?
 
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
Microsoft Project Olympus AI Accelerator Chassis (HGX-1)
 
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUG
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUGMicroservices Tracing With Spring Cloud and Zipkin @Szczecin JUG
Microservices Tracing With Spring Cloud and Zipkin @Szczecin JUG
 

Similar to SR-IOV: The Key Enabling Technology for Fully Virtualized HPC Clusters

AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...Ryousei Takano
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machinesinside-BigData.com
 
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...inside-BigData.com
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Fwdays
 
Platforms for Accelerating the Software Defined and Virtual Infrastructure
Platforms for Accelerating the Software Defined and Virtual InfrastructurePlatforms for Accelerating the Software Defined and Virtual Infrastructure
Platforms for Accelerating the Software Defined and Virtual Infrastructure6WIND
 
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSArquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSAmazon Web Services LATAM
 
The von Neumann Memory Barrier and Computer Architectures for the 21st Century
The von Neumann Memory Barrier and Computer Architectures for the 21st CenturyThe von Neumann Memory Barrier and Computer Architectures for the 21st Century
The von Neumann Memory Barrier and Computer Architectures for the 21st CenturyPerry Lea
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...DataWorks Summit/Hadoop Summit
 
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...rhatr
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud ComputingAmazon Web Services
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud ComputingAmazon Web Services
 
#VMUGMTL - Xsigo Breakout
#VMUGMTL - Xsigo Breakout#VMUGMTL - Xsigo Breakout
#VMUGMTL - Xsigo Breakout1CloudRoad.com
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivAmazon Web Services
 
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...Paul Brebner
 
Real-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataReal-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataRomeo Kienzler
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesDataStax Academy
 
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case StudiesOpenNebula Project
 
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case StudiesIgnacio M. Llorente
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxBiHongPhc
 

Similar to SR-IOV: The Key Enabling Technology for Fully Virtualized HPC Clusters (20)

AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machines
 
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...
 
Platforms for Accelerating the Software Defined and Virtual Infrastructure
Platforms for Accelerating the Software Defined and Virtual InfrastructurePlatforms for Accelerating the Software Defined and Virtual Infrastructure
Platforms for Accelerating the Software Defined and Virtual Infrastructure
 
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSArquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
 
The von Neumann Memory Barrier and Computer Architectures for the 21st Century
The von Neumann Memory Barrier and Computer Architectures for the 21st CenturyThe von Neumann Memory Barrier and Computer Architectures for the 21st Century
The von Neumann Memory Barrier and Computer Architectures for the 21st Century
 
Big Data Management at CERN: The CMS Example
Big Data Management at CERN: The CMS ExampleBig Data Management at CERN: The CMS Example
Big Data Management at CERN: The CMS Example
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...
You Call that Micro, Mr. Docker? How OSv and Unikernels Help Micro-services S...
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud Computing
 
High Performance Cloud Computing
High Performance Cloud ComputingHigh Performance Cloud Computing
High Performance Cloud Computing
 
#VMUGMTL - Xsigo Breakout
#VMUGMTL - Xsigo Breakout#VMUGMTL - Xsigo Breakout
#VMUGMTL - Xsigo Breakout
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel Aviv
 
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
The Impact of Hardware and Software Version Changes on Apache Kafka Performan...
 
Real-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor DataReal-time DeepLearning on IoT Sensor Data
Real-time DeepLearning on IoT Sensor Data
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
 
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
ISC Cloud 2013 - Cloud Architectures for HPC – Industry Case Studies
 
ClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptxClickOS_EE80777777777777777777777777777.pptx
ClickOS_EE80777777777777777777777777777.pptx
 

More from Glenn K. Lockwood

Understanding and Measuring I/O Performance
Understanding and Measuring I/O PerformanceUnderstanding and Measuring I/O Performance
Understanding and Measuring I/O PerformanceGlenn K. Lockwood
 
ASCI Terascale Simulation Requirements and Deployments
ASCI Terascale Simulation Requirements and DeploymentsASCI Terascale Simulation Requirements and Deployments
ASCI Terascale Simulation Requirements and DeploymentsGlenn K. Lockwood
 
Large-scale Genomic Analysis Enabled by Gordon
Large-scale Genomic Analysis Enabled by GordonLarge-scale Genomic Analysis Enabled by Gordon
Large-scale Genomic Analysis Enabled by GordonGlenn K. Lockwood
 
Hadoop Streaming: Programming Hadoop without Java
Hadoop Streaming: Programming Hadoop without JavaHadoop Streaming: Programming Hadoop without Java
Hadoop Streaming: Programming Hadoop without JavaGlenn K. Lockwood
 

More from Glenn K. Lockwood (7)

Understanding and Measuring I/O Performance
Understanding and Measuring I/O PerformanceUnderstanding and Measuring I/O Performance
Understanding and Measuring I/O Performance
 
Parallel R and Hadoop
Parallel R and HadoopParallel R and Hadoop
Parallel R and Hadoop
 
ASCI Terascale Simulation Requirements and Deployments
ASCI Terascale Simulation Requirements and DeploymentsASCI Terascale Simulation Requirements and Deployments
ASCI Terascale Simulation Requirements and Deployments
 
Overview of Spark for HPC
Overview of Spark for HPCOverview of Spark for HPC
Overview of Spark for HPC
 
myHadoop 0.30
myHadoop 0.30myHadoop 0.30
myHadoop 0.30
 
Large-scale Genomic Analysis Enabled by Gordon
Large-scale Genomic Analysis Enabled by GordonLarge-scale Genomic Analysis Enabled by Gordon
Large-scale Genomic Analysis Enabled by Gordon
 
Hadoop Streaming: Programming Hadoop without Java
Hadoop Streaming: Programming Hadoop without JavaHadoop Streaming: Programming Hadoop without Java
Hadoop Streaming: Programming Hadoop without Java
 

Recently uploaded

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

SR-IOV: The Key Enabling Technology for Fully Virtualized HPC Clusters

  • 1. SR-IOV: The Key Enabling Technology for Fully Virtualized HPC Clusters! Glenn K. Lockwood! Christopher Irving! Philip M. Papadopoulos! Mahidhar Tatineni! Rick Wagner! SAN DIEGO SUPERCOMPUTER CENTER
  • 2. Single Root I/O Virtualization in HPC! •  Problem: complex workflows demand increasing flexibility from HPC platforms" •  Virtualization = flexibility" •  Virtualization = IO performance loss (e.g., excessive DMA interrupts)" •  Solution: SR-IOV and Mellanox ConnectX-3 InfiniBand HCAs " •  One physical function (PF) à multiple virtual functions (VF), each with own DMA streams, memory space, interrupts" •  Allows DMA to bypass hypervisor to VMs! SAN DIEGO SUPERCOMPUTER CENTER
  • 3. High-Performance Virtualization on Comet ! •  Mellanox FDR InfiniBand HCAs with SR-IOV" •  Rocks and OpenStack Nova to manage VMs" •  Flexibility to support complex science gateways and web-based workflow engines" •  custom compute appliances and virtual clusters developed with FutureGrid and their existing expertise" •  backed by virtualized Lustre running over virtualized InfiniBand" SAN DIEGO SUPERCOMPUTER CENTER
  • 4. Hardware/Software Configuration of Test Cluster ! Native, SR-IOV! Amazon EC2! Platform" •  Rocks 6.1 (EL6)" •  Virtualization via kvm   •  Amazon Linux 2013.03 (EL6)" •  cc2.8xlarge Instances" CPUs" •  2x Xeon E5-2660 (2.2GHz)" •  16 cores per node" •  2x Xeon E5-2670 (2.6GHz)" •  16 cores per node" RAM" •  64 GB DDR3 DRAM" •  60.5 DDR3 DRAM" Interconnect" •  QDR4X InfiniBand" •  10 GbE" •  Mellanox ConnectX-3 (MT27500)" •  common placement group" •  Intel VT-d, SR-IOV enabled in firmware, kernel, drivers" •  mlx4_core  1.1" •  Mellanox OFED 2.0" •  HCA firmware 2.11.1192" SAN DIEGO SUPERCOMPUTER CENTER
  • 5. 50x less latency than Amazon EC2! •  SR-IOV! •  < 30% overhead for M < 128 bytes" •  < 10% overhead for eager send/recv" •  Overhead à 0% for bandwidth-limited regime" •  Amazon EC2! •  > 5000% worse latency" •  Time dependent (noisy)" 5 SAN DIEGO SUPERCOMPUTER CENTER OSU Microbenchmarks (3.9, osu_latency)"
  • 6. 10x more bandwidth than Amazon EC2! •  SR-IOV! •  < 2% bandwidth loss over entire range" •  > 95% peak bandwidth" •  Amazon EC2! •  < 35% peak bandwidth" •  900% to 2500% worse bandwidth than virtualized InfiniBand" 6 SAN DIEGO SUPERCOMPUTER CENTER OSU Microbenchmarks (3.9, osu_bw)"
  • 7. Weather Modeling – 15% Overhead! •  96-core (6-node) calculation" •  Nearest-neighbor communication" •  Scalable algorithms" •  SR-IOV incurs modest (15%) performance hit" •  ...but still still 20% faster*** than Amazon" SAN DIEGO SUPERCOMPUTER CENTER WRF 3.4.1 – 3hr forecast" *** 20% faster despite SR-IOV cluster having 20% slower CPUs"
  • 8. Quantum ESPRESSO: 5x Faster than EC2! •  48-core, 3 node calc" •  CG matrix inversion (irregular comm.)" •  3D FFT matrix transposes (All-to-all communication)" •  28% slower w/ SR-IOV" •  SR-IOV still > 500% faster*** than EC2" SAN DIEGO SUPERCOMPUTER CENTER Quantum Espresso 5.0.2 – DEISA AUSURF112 benchmark" *** 20% faster despite SR-IOV cluster having 20% slower CPUs"
  • 9. Conclusions! •  SR-IOV: huge step forward in high-performance virtualization" •  Shows substantial improvement in latency over Amazon EC2, and it provides nearly 0 bandwidth overhead! •  Benchmark application performance confirms this: significant improvement over EC2" •  SR-IOV: lowers performance barrier to virtualizing the interconnect and makes fully virtualized HPC clusters viable! •  Comet will deliver virtualized HPC to new/non-traditional communities that need flexibility without major loss of performance! SAN DIEGO SUPERCOMPUTER CENTER