
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro HPC Workshop 2018


Speakers: Gilad Shainer and Scot Schultz
Company: Mellanox Technologies
Talk Title: Intelligent Interconnect Architecture to Enable Next
Generation HPC
Talk Abstract:
The latest revolution in HPC interconnect architecture is the development of In-Network Computing, a technology that handles and accelerates application workloads at the network level. By placing data-related algorithms on an intelligent network, we can overcome emerging performance bottlenecks and improve data center and application performance. The combination of In-Network Computing and Arm-based processors offers a rich set of capabilities and opportunities for building the next generation of HPC platforms.

Gilad Shainer Bio:
Gilad Shainer has served as Mellanox's vice president of marketing since March 2013. Previously, Mr. Shainer was Mellanox's vice president of marketing development from March 2012 to March 2013. He joined Mellanox in 2001 as a design engineer and later served in senior marketing management roles between July 2005 and February 2012. Mr. Shainer holds several patents in the field of high-speed networking and contributed to the PCI-SIG PCI-X and PCIe specifications. He holds an MSc degree (2001, cum laude) and a BSc degree (1998, cum laude) in Electrical Engineering from the Technion, Israel Institute of Technology.

Scot Schultz Bio:
Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects, and processor technologies. A 30-year veteran of the computing industry, Schultz joined Mellanox in 2013; before that, he spent 17 years at AMD in various engineering and leadership roles in high performance computing. He has also been instrumental in the growth and development of various industry organizations, including the OpenFabrics Alliance, and continues to serve as a founding board member of the OpenPOWER Foundation and as Director of Educational Outreach and founding member of the HPC-AI Advisory Council.



  1. Interconnect Your Future: Paving the Road to Exascale, July 2018 (© 2018 Mellanox Technologies | Confidential)
  2. HPC and AI Need the Most Intelligent Interconnect: higher data speeds, faster data processing, better data security. Product portfolio: adapters, switches, cables & transceivers, SmartNIC, system on a chip.
  3. Mellanox Accelerates Leading HPC and AI Systems: 'Summit' CORAL system, world's fastest HPC/AI system, 9.2K InfiniBand nodes; 'Sierra' CORAL system, #2 USA supercomputer, 8.6K InfiniBand nodes; Wuxi Supercomputing Center, fastest supercomputer in China, 41K InfiniBand nodes.
  4. Mellanox Accelerates Leading HPC and AI Systems (cont.): fastest supercomputer in Canada, Dragonfly+ topology, 1.5K InfiniBand nodes; 'Astra' Arm-based supercomputer, NNSA Vanguard program, 2.6K InfiniBand nodes; fastest HPC/AI system in Japan, 1.1K InfiniBand nodes.
  5. Data-Centric Architecture to Overcome Latency Bottlenecks: CPU-centric (onload) designs see communication latencies of 30-40us; data-centric (offload) designs reduce this to 3-4us. Intelligent interconnect paves the road to exascale performance.
  6. In-Network Computing to Enable Data-Centric Data Centers: GPUDirect RDMA, Scalable Hierarchical Aggregation and Reduction Protocol (SHARP), NVMe over Fabrics.
  7. SHARP AllReduce Performance Advantages (128 nodes): SHARP enables a 75% reduction in latency, providing scalable, flat latency.
  8. SHARP AllReduce Performance Advantages (1500 nodes, 60K MPI ranks, Dragonfly+ topology): SHARP enables the highest performance.
  9. UCX: Unified Communication X framework (© 2018 UCF Consortium).
  10. Highest-Performance 200Gb/s Interconnect Solutions:
     ▪ Adapters: 200Gb/s, 0.6us latency, 200 million messages per second
     ▪ InfiniBand switches: 40 HDR (200Gb/s) ports or 80 HDR100 ports; 16Tb/s throughput, <90ns latency
     ▪ Ethernet switches: 16 400GbE, 32 200GbE, or 128 25/50GbE ports (10/25/40/50/100/200 GbE); 6.4Tb/s throughput
     ▪ Transceivers, active optical and copper cables (10/25/40/50/56/100/200 Gb/s)
     ▪ Software: MPI, SHMEM/PGAS, UPC, for commercial and open source applications; leverages hardware accelerations
     ▪ System on Chip and SmartNIC: programmable adapter, smart offloads
  11. BlueField System on a Chip:
     ▪ Tile architecture: 16 Arm® A72 CPUs across 8 tiles
     ▪ SkyMesh™ fully coherent low-latency interconnect
     ▪ 8MB L2 cache, 12MB L3 last-level cache
     ▪ Integrated ConnectX-5 subsystem: dual 100Gb/s Ethernet/InfiniBand, compatible with ConnectX-5
     ▪ High-end networking offloads: RDMA, NVMe-oF, erasure coding, T10-DIF
     ▪ Fully integrated PCIe switch: 32 bifurcated PCIe Gen4 lanes (up to 200Gb/s); Root Complex or Endpoint modes; 2x16, 4x8, 8x4, or 16x2 configurations
     ▪ Crypto engines: AES, SHA-1/2, public key acceleration, true RNG
     ▪ Memory controllers: 2x channels DDR4 with ECC
     ▪ Dual VPI ports (Ethernet/InfiniBand: 1, 10, 25, 40, 50, 100G); 32-lane PCIe Gen3/4
  12. BlueField Product Line:
     ▪ System on Chip: different SKUs by core count, speed, and performance point
     ▪ Controllers: BF1600 and BF1700, dual-port 100Gb/s; SKUs with PCIe x16 or x32
     ▪ BlueField™ Platform (BF1100 & BF1200): different SKUs with GPU and SSD options, 1U and 2U, up to 16 SSDs
     ▪ SmartNIC Ethernet: different SKUs, 16/8/4 cores, dual 25Gb/s, storage card
     ▪ SmartNIC VPI (BF1660): VPI 2 ports, 100Gb/s / EDR, 16 cores
     ▪ Full software enablement: SDK and development tools; networking and security features; full NVMe storage capability
  13. Thank You