Published in: Technology

  • 1. InfiniBand Present and Future
    • March 2004
    • Yaron Haviv, CTO, Voltaire [email_address]
  • 2. Agenda
    • Cluster Scalability Challenges
    • InfiniBand Technology
    • InfiniBand Target Applications
    • About Voltaire
    • Voltaire InfiniBand Technology
  • 3. Scaling Out Using Clusters
    • Much lower cost
    • But lower reliability/MTBF, underutilization, higher complexity, and a storage bottleneck
    • Paradigm shift: from supercomputers and mainframes to a bunch of interconnected Linux machines
    • Our challenge: minimize the scale-out overhead
  • 4. What Does It Take to Build a Computing Cluster?
    • Physical Distribution
      • High Bandwidth for Growing Data Requirements
      • Low Latency, RDMA for Linear Scalability
    • Logical Consolidation
      • Virtualization
      • Built-in RAS Capabilities
    [Diagram labels: High Performance Interconnect; Intelligent Interconnect]
  • 5. InfiniBand value proposition
    • Interconnect designed from the ground up for high-performance system interconnect and RDMA, with no redundant features
    • Significantly lower cost per unit of performance than any other technology, owing to its architecture
    • Lowest latency: 140 ns per hop and 5.8 µs end to end
    • Enables high-speed file, block, network, and IPC traffic using a single technology with RDMA
    • Built-in HA, QoS, partitioning, etc.
    • Already supports 30 Gb/s links today
      • New standard for 120 Gb/s defined
  • 6. InfiniBand Enabling high-end clusters
    • Similar shared memory model at much lower costs!
    • RDMA operations allow fast access of CPUs to remote memory
    [Diagram: an expensive SMP server with a proprietary interconnect (CPUs, memory, I/O) next to an InfiniBand cluster based on 2-way servers, each with CPUs, memory, and an HCA, connected through InfiniBand switches]
  • 7. InfiniBand™ Link Protocol
    • Up to 2 GB message size
    • Support for 256, 512, 1024, 2048, and 4096 B MTU
    • Automatic segmentation and reassembly
    • Slide taken from the IBTA InfiniBand Overview presentation
    [Diagram: a transaction made up of messages, each message made up of packets]
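The segmentation and reassembly mentioned on this slide can be pictured with a small sketch. This is only an illustration in plain Python, not how an HCA implements it: the function names `segment` and `reassemble` are hypothetical, and real packets also carry headers and sequence numbers that are omitted here.

```python
# Toy model of message segmentation: a message is cut into MTU-sized
# packet payloads and joined back together at the receiver.
# Function names are hypothetical; packet headers are not modeled.

VALID_MTUS = (256, 512, 1024, 2048, 4096)  # MTU sizes listed on the slide, in bytes


def segment(message: bytes, mtu: int = 2048) -> list[bytes]:
    """Split a message into payloads of at most `mtu` bytes each."""
    if mtu not in VALID_MTUS:
        raise ValueError(f"unsupported MTU: {mtu}")
    return [message[i:i + mtu] for i in range(0, len(message), mtu)]


def reassemble(packets: list[bytes]) -> bytes:
    """Join payloads back into the original message (in-order delivery assumed)."""
    return b"".join(packets)


if __name__ == "__main__":
    msg = bytes(5120)                      # a 5120-byte message
    packets = segment(msg, mtu=2048)
    print([len(p) for p in packets])       # payload sizes per packet
    assert reassemble(packets) == msg
```

A larger MTU means fewer packets per message, which is why the choice of MTU matters for large transfers.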
  • 8. InfiniBand™ Link Attributes
    • Each packet is sent with a specified Service Level
    • A packet’s Service Level is mapped to a Virtual Lane (VL)
    • Each VL has dedicated buffers
    • Each link in a fabric may support a different number of VLs
    • Packets are sent one at a time, byte-striped across the link width and bit-serial on each conductor
    [Diagram: messages split into packets, multiplexed onto the physical link and demultiplexed at the far end]
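The Service Level to Virtual Lane mapping described on this slide can be sketched as a per-link lookup table feeding per-VL buffers. The class below is a toy model under stated assumptions: the `Link` class and its methods are hypothetical names, and the round-robin choice of the next VL is a stand-in for the real VL arbitration scheme, which is not described in the slides.

```python
from collections import deque


class Link:
    """Toy link: maps Service Levels to Virtual Lanes with dedicated buffers."""

    def __init__(self, sl_to_vl: dict[int, int], num_vls: int):
        if any(not 0 <= vl < num_vls for vl in sl_to_vl.values()):
            raise ValueError("mapping refers to a VL this link does not have")
        self.sl_to_vl = sl_to_vl
        self.buffers = [deque() for _ in range(num_vls)]  # one buffer per VL
        self._next_vl = 0  # round-robin cursor (assumed arbitration policy)

    def enqueue(self, packet: str, service_level: int) -> int:
        """Place a packet in the buffer of the VL its Service Level maps to."""
        vl = self.sl_to_vl[service_level]
        self.buffers[vl].append(packet)
        return vl

    def transmit(self):
        """Send one packet at a time; returns (vl, packet) or None if idle."""
        n = len(self.buffers)
        for offset in range(n):
            vl = (self._next_vl + offset) % n
            if self.buffers[vl]:
                self._next_vl = (vl + 1) % n
                return vl, self.buffers[vl].popleft()
        return None


if __name__ == "__main__":
    # A link with two VLs: SL 0 and 1 share VL 0, SL 2 gets VL 1.
    link = Link({0: 0, 1: 0, 2: 1}, num_vls=2)
    link.enqueue("bulk-a", 0)
    link.enqueue("bulk-b", 1)
    link.enqueue("urgent", 2)
    while (sent := link.transmit()) is not None:
        print(sent)
```

Because each link carries its own table, the same Service Level can land on different VLs on links that support different numbers of lanes.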
  • 9. Partitions
    • The InfiniBand Architecture specifies a mechanism for creating isolated access domains
      • Each port/node belongs to at least one partition
      • Can only communicate with nodes in the same partition
      • Defines Limited and Full membership for shared resources
    • SM manages partitions by assigning P_Keys
      • CAs (channel adapters) have primary responsibility for enforcing P_Keys
      • Switches can optionally do P_Key enforcement
    [Diagram: Host A, Host B, and Host C attached to an InfiniBand™ fabric divided into Partition 1 and Partition 2]
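The P_Key check behind partitioning can be sketched in a few lines. The bit layout used here comes from the InfiniBand specification rather than from the slides: a P_Key is 16 bits, the top bit marks full membership, and two limited members of the same partition may not talk to each other. The function names and the host memberships in the example are hypothetical.

```python
# Sketch of P_Key matching between two ports.
# Bit layout per the InfiniBand specification (not stated on the slide):
# bit 15 = membership (1 = full, 0 = limited), bits 0-14 = partition number.

FULL_MEMBER_BIT = 0x8000
PARTITION_MASK = 0x7FFF


def pkeys_match(a: int, b: int) -> bool:
    """True if two P_Keys allow communication."""
    same_partition = (a & PARTITION_MASK) == (b & PARTITION_MASK)
    both_limited = not (a & FULL_MEMBER_BIT) and not (b & FULL_MEMBER_BIT)
    return same_partition and not both_limited


def can_communicate(table_a: list[int], table_b: list[int]) -> bool:
    """True if any pair of P_Keys from the two ports' tables matches."""
    return any(pkeys_match(a, b) for a in table_a for b in table_b)


if __name__ == "__main__":
    # Hypothetical memberships for the three hosts in the diagram.
    host_a = [0x8001]            # full member of partition 1
    host_b = [0x8001, 0x8002]    # full member of partitions 1 and 2
    host_c = [0x0002]            # limited member of partition 2
    print(can_communicate(host_a, host_b))
    print(can_communicate(host_a, host_c))
    print(can_communicate(host_b, host_c))
```

Limited membership is what lets many clients share a resource, such as a storage target that is a full member, without being able to reach one another.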
  • 10. InfiniBand cables and connectors
    • Current support for 2.5 GHz signaling at 1X/4X/12X widths, up to 30 Gb/s
    • Future 5/10 GHz signaling (DDR/QDR) at 1X/4X/12X widths, up to 120 Gb/s
    • Link rates (DDR/QDR in parentheses):
      • 1X: 2.5 (5/10) Gb/s
      • 4X: 10 (20/40) Gb/s
      • 12X: 30 (60/120) Gb/s
    [Pictures: parallel optics, IB 4X connector, optical-to-copper smart module, 4X copper connector]
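The link rates on this slide are simply lane count times per-lane signaling rate, which the short sketch below reproduces. The name "SDR" for the 2.5 Gb/s base rate and the 8b/10b line encoding used to derive the usable data rate are standard InfiniBand details that the slide itself does not mention, so treat them as added assumptions; the function names are hypothetical.

```python
# Raw signaling rate = link width (lanes) x per-lane signaling rate.
# "SDR" and the 8b/10b encoding factor are not from the slide (assumptions
# based on the InfiniBand specification for these speed grades).

LANE_RATE_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}
WIDTHS = (1, 4, 12)


def signaling_rate(width: int, speed: str) -> float:
    """Raw signaling rate in Gb/s for a link of the given width and speed."""
    return width * LANE_RATE_GBPS[speed]


def data_rate(width: int, speed: str) -> float:
    """Usable data rate in Gb/s after 8b/10b encoding (8 data bits per 10 line bits)."""
    return signaling_rate(width, speed) * 8 / 10


if __name__ == "__main__":
    for width in WIDTHS:
        rates = ", ".join(
            f"{speed} {signaling_rate(width, speed):g} Gb/s" for speed in LANE_RATE_GBPS
        )
        print(f"{width}X: {rates}")
```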
  • 11. Transport Service Types
    • Send queue operations allowed for each service type:
      • Reliable Connection: Send, RDMA Read, RDMA Write, Atomic Op
      • Reliable Datagram: Send, RDMA Read, RDMA Write, Atomic Op
      • Unreliable Connection: Send, RDMA Write
      • Unreliable Datagram: Send, Multi-Cast
      • Raw Datagram: Send, Multi-Cast
    • Slide taken from the IBTA InfiniBand Overview presentation
  • 12. Channel (QP) architecture
    [Diagram: on each of two hosts, applications (consumers) have direct access to the Host Channel Adapter through QPs, posting WQEs and receiving CQEs; each HCA has a transport layer with send and receive sides and ports; a switching fabric relays packets between the hosts. Layers shown: consumer transactions and operations (DAPL/SDP/iSER/IPoIB ...), IBA operations (Send/RDMA/Atomic), IBA packets/messages]
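The queue pair model in this diagram (work requests posted as WQEs, completions reported as CQEs) can be mimicked with a toy simulation. This is a conceptual sketch only: it is not the verbs API, every class, function, and status name is hypothetical, and only the Send operation is modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WQE:
    """Work queue element: one posted Send request."""
    wr_id: int
    payload: bytes


@dataclass
class CQE:
    """Completion queue element: the outcome of one work request."""
    wr_id: int
    status: str


@dataclass
class QueuePair:
    send_queue: deque = field(default_factory=deque)
    recv_queue: deque = field(default_factory=deque)   # ids of posted receives
    completions: deque = field(default_factory=deque)  # completion queue
    received: list = field(default_factory=list)       # delivered payloads

    def post_send(self, wqe: WQE) -> None:
        self.send_queue.append(wqe)

    def post_recv(self, wr_id: int) -> None:
        self.recv_queue.append(wr_id)


def process(src: QueuePair, dst: QueuePair) -> None:
    """Toy 'HCA': drain src's send queue into dst's posted receives."""
    while src.send_queue:
        wqe = src.send_queue.popleft()
        if not dst.recv_queue:
            # Hypothetical status: the receiver had no receive posted.
            src.completions.append(CQE(wqe.wr_id, "NO_RECEIVE_POSTED"))
            continue
        recv_id = dst.recv_queue.popleft()
        dst.received.append(wqe.payload)
        dst.completions.append(CQE(recv_id, "OK"))
        src.completions.append(CQE(wqe.wr_id, "OK"))


if __name__ == "__main__":
    a, b = QueuePair(), QueuePair()
    b.post_recv(100)
    a.post_send(WQE(1, b"hello"))
    process(a, b)
    print(b.received, list(a.completions), list(b.completions))
```

The point of the model is that the application only touches queues; moving the data and reporting completion is the adapter's job, which is what keeps the operating system out of the data path.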
  • 13. Using InfiniBand™ API’s in user space
    • Applications use direct access to hardware avoiding OS overhead
    • Mapped to standard MPI, MPICH libraries or other API’s (Sockets, DAPL)
    [Diagram labels: Application; Standard Interface (e.g. MPI); Upper-Layer Protocol (e.g. MPI); IB Verb API; HCA Driver; InfiniBand HCA (Hardware); CM Interface; Access Layer (naming, routing, security, CM, fail-over, P&P, ...); user space and kernel; data and control paths]
  • 14. Voltaire’s Protocol Stack
    • Multiple apps over one fabric
    • Traffic types: file I/O; low-latency MPI applications; low-latency clustering apps and clustered file systems; high-performance block storage traffic; high-performance file storage traffic; any sockets-based apps; UDP-based apps and management traffic
    • Upper-layer protocols: Score / LAM MPI, clustering middleware, DAPL, NFS RDMA, iSCSI RDMA, SDP, TCP, IPoIB
    • InfiniBand services: GSI, SMA, CM, IBARP, Access
    • HCA drivers
  • 15. InfiniBand vs. Myrinet
    • Much better performance and multipurpose at the same prices
    • Comparison (Myrinet / InfiniBand):
      • Data rate: 2 Gb/s / 10 Gb/s
      • MPI latency (µs): 6.3 / 5.9
      • Bandwidth (MB/s): 248 / 879 (2500 with PCI Express)
      • Proprietary: yes / no
      • Dynamic reconfiguration: no / yes
      • Managed fabric: no / yes
      • Multiple protocols: no / yes
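As a quick arithmetic check on the comparison above, the sketch below computes the bandwidth and latency ratios from the slide's figures. It assumes that in each pair of values the first belongs to Myrinet and the second to InfiniBand, which is how the column headings on the slide read; the variable names are illustrative.

```python
# Figures from the comparison slide; the Myrinet-first column order is assumed.
myrinet = {"mpi_latency_us": 6.3, "bandwidth_mb_s": 248}
infiniband = {"mpi_latency_us": 5.9, "bandwidth_mb_s": 879}

# How many times more MPI bandwidth InfiniBand delivers.
bandwidth_ratio = infiniband["bandwidth_mb_s"] / myrinet["bandwidth_mb_s"]

# Relative reduction in MPI latency.
latency_reduction = 1 - infiniband["mpi_latency_us"] / myrinet["mpi_latency_us"]

if __name__ == "__main__":
    print(f"Bandwidth ratio: {bandwidth_ratio:.2f}x")
    print(f"Latency reduction: {latency_reduction:.1%}")
```

On these numbers the advantage is mostly in bandwidth; the latency gap between the two is comparatively small.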
  • 16. InfiniBand & HPC
    • Value
      • First industry standard to enable Linux Server Clusters – Fastest growing segment in HPC
      • Significant performance advantage over proprietary technologies
        • More than 3X MPI bandwidth improvement over Myrinet
    • Market Dynamics
      • Large labs and universities leading the way
        • Specifically demanding InfiniBand
        • Clusters of hundreds of nodes are being deployed
      • Resellers team up with InfiniBand vendors, target the lower end
        • IB pricing and maturity are ready for the mass market
    Generating Significant Revenue Today!
  • 17. InfiniBand Value Proposition in Data Centers
    • Only technology to enable Grid Computing solutions that scale
    • Enabling the “Lintel-based data center”
    • Immediate performance benefits for database clusters
      • 1.5X to 2X application level improvement for Oracle and DB2
    • 40% TCO savings vs. GbE clusters
    • Co-existence with legacy systems
    • InfiniBand in the data center: do more with less
    • LinuxWorld 2003 Best of Show with IBM
  • 18. Scaling out the storage
    • Providing high-bandwidth storage using clustered file systems, SAN, multipathing, and storage virtualization
    [Diagram: a cluster running clustered file systems reaches FC storage, IB storage, and remote iSCSI storage through storage virtualization and management, over NFS/RDMA or Lustre and iSCSI/RDMA]
  • 19. Planned InfiniBand progress in 2004
    • Performance:
      • PCI-Express HCA (2.5 GB/s), demoed at IDF
      • DDR support (switch: 60 Gb/s, HCA: 20 Gb/s)
    • Scalability
      • Moving to clusters of thousands of nodes
    • Integration
      • Adding storage and network connectivity
      • Diskless (Boot over InfiniBand)
      • Support from all major server vendors
    • Open Software
      • Native support in Red Hat and SUSE (in Q3)
  • 20. Voltaire Products
  • 21. Voltaire providing complete grid solutions
    • Providing a fully integrated intelligent connectivity solution
      • High-performance interconnect, storage, and networking
      • Central management software
      • Automated deployment and recovery at different levels
    [Diagram: servers and server blades connected over InfiniBand; clients over TCP/IP and GbE; IP SAN/NAS; FC SAN over FC; administration; remote backup site]
  • 22. Voltaire: Fast Facts
    • Locations
      • Business HQ: Boston
      • Sales Offices: Japan, France
      • R&D: Herzeliya, Israel
    • Headcount : 70
    • Financing
      • Strategic Investors: Hitachi and Quantum
      • Top US and Israeli VCs
    • Recent Partnerships
      • Hitachi
      • HP
      • SGI
      • Apple
  • 23. Recent Voltaire Successes in HPC
    • Delivering solutions to prestigious labs and research centers
    • Signed 10 resellers, global coverage
    • Voltaire’s competitive edge:
      • Most scalable switch family, largest IBTA-certified switches
      • Scalable HPC-focused software: stacks and fabric management
      • Highly scalable, integrated GbE and storage connectivity
      • Open source strategy
  • 24. Joint HPC Deployments with IBM
    • "We have been working closely with Voltaire to deliver superior InfiniBand clustering solutions … reached a level of maturity that is encouraging rapid adoption within the industry." Dave Turek, vice president of Deep Computing, IBM
    • 192 dual-Xeon IBM x335 servers
    • Voltaire provided: switches, adapters, software, and advanced fabric management
    • Storage is next
  • 25. Voltaire InfiniBand Switch Router Family
    • 12-18/24/96 port non-blocking, multi-protocol connectivity
    • Integrated wire-speed network and storage virtualization
    • No single point of failure – Hot-swappable FRUs
    • Non-disruptive software update, fail-over
    • Modular elements for investment protection
    • Advanced integrated management
    [Pictures: ISR6000, ISR9600, ISR9024]
  • 26. Voltaire ISR 9288 and The New Switch Family
    • ISR 9288 is part of a new family of switches built for enterprise availability, usability and scalability
    • Very cost effective switches for HPC
    • Single non-blocking switch supporting clusters of up to 288 ports
      • Enables scaling to XX,000 nodes with minimal hops and cost
    • Integrated Intelligent GbE and FC I/O modules
    • No single point of failure
      • Redundant and hot swappable FRUs
    • Shared components across family for investment protection
    • Advanced embedded management
    • HPC solution in a single box
    [Picture: ISR9288]
  • 27. Voltaire FC/GbE Router Blades
    • Using router blade drawers, multiple modules can be pooled to create a large virtual router for bandwidth aggregation and high availability, all managed as a single entity
    • Voltaire GER 200/400 – TCP/IP to InfiniBand Router Blades
      • 2 or 4 × 1 GbE ports, Layer 4 + TOE blade
      • Transparent like a GbE switch
      • Custom high-speed ASIC for Layer 4-7, and TCP Termination
    • Voltaire FCR 400 – InfiniBand to Fibre Channel Router Blades
      • 4 × 200 MB/s FC ports
      • LUN Mapping/Masking, Zoning
      • Logical Volume Management, Striping
      • Optional Mirroring, Snapshots, Scripting
  • 28. One Switch, Many Storage Options
    • High-performance, low-cost native InfiniBand storage
    • Emulate virtual local storage for hosts, or switch between media
    • Complete storage virtualization, centrally managed through VoltaireVision and 3rd-party tools
    • Providing maximal performance, availability, and scalability using dynamic multipathing
    • Authenticated access
    [Diagram labels: server with IB HCA (SCSI, iSCSI RDMA); native InfiniBand storage; Fibre Channel SAN over FC; IP SAN over GbE (remote); iSCSI (iSNS) / 3rd-party management (optional); "Coming Soon"]
  • 29. Complete grid infrastructure management
    [Diagram labels: Voltaire Fabric Manager (VFM); Voltaire Device Manager (VDM); TCP/IP Provisioning; Load Balancing / NAT, Filtering, VLANs, QoS; Storage Provisioning; Logical Volume Management; LUN Masking/Mapping; High-Availability, Security]
  • 30. Integrated Fabric Management and Virtualization
    • Eliminate Fabric Congestion
      • Intelligent multi-path and adaptive routing
      • Proactive monitoring
      • Central control over IB QoS, Virtual Lanes
    • Policy-based Management, and Grouping
      • Manage pools of resources using group operations and policies
      • Consolidate and aggregate status information and alerts
    • Partitioning
      • Create isolated domains within a single fabric
    • Name services to match nodes with logical names and IP addresses
    • Automatic tools for fault detection and recovery
    • Scripting and external APIs, integration with 3rd-party tools
    • Enabling scale-out through intelligent management and virtualization of grid resources
  • 31. Automated Cluster Status/Verification Tools
    • VFM showing hierarchical status
    • Periodic port/node status scanning
    • CLOS Topology verification
    • Automated host software verification
    • Trace Route and route utilization
    • Integration with open statistics tools
    • Name services (ATS)
    • Ganglia integration
    • Exportable CSV (Excel) with real-time fabric status/performance
    • Providing simple, automated views to the user, exportable to files
    [Screenshots: Fabric Verification Log; VFM Inspect; Voltaire Fabric Manager; IB Host; Automated Cluster Verification]
  • 32. Summary
    • Large Linux clusters are being deployed today
    • They present some performance and maintenance challenges
    • InfiniBand is the right choice for Data Grids
    • Voltaire products maximize InfiniBand value, and provide fully integrated solutions