This presentation is intended for the education of IBM and Business Partner sales personnel. It should not be distributed to customers.
Introduction to Intelligent Clusters
XTW01
Topic 7
Course Overview
The objectives of this course of study are:
>Describe a high-performance computing cluster
>List the business goals that Intelligent Clusters addresses
>Identify three core Intelligent Clusters components
>List the high-speed networking options available in Intelligent Clusters
>List three software tools used in Clusters
>Describe Cluster benchmarking
Topic Agenda
>*Commodity Clusters*
>Overview of Intelligent Clusters
>Cluster Hardware
>Cluster Networking
>Cluster Management, Software Stack, and Benchmarking
What is a Commodity Cluster?
A multi-server system, composed of interconnected computers and associated networking and storage devices, that is unified via systems management and networking software to accomplish a specific purpose.
>Clusters are built from standard, commodity components that could be used separately in other types of computing configurations:
 Compute servers – a.k.a. nodes
 High-speed networking adapters and switches
 Local and/or external storage
 A commodity operating system such as Linux
 Systems management software
 Middleware libraries and application software
>Clusters enable “commodity-based supercomputing”
Conceptual View of a Cluster
[Diagram: compute node racks and a storage rack. An Ethernet switch carries the management, SOL, storage, and cluster VLANs; a Fibre Channel SAN switch connects the storage nodes over the fiber network. A high-speed switch carries the message-passing network on the cluster VLAN. The management node and user/login nodes sit on the LAN, with user access to the management network over a public VLAN.]
Application of Clusters in Industry
>Energy: seismic analysis, reservoir analysis
>Finance: derivative analysis, actuarial analysis, asset liability management, portfolio risk analysis, statistical analysis
>Manufacturing: mechanical/electric design, process simulation, finite element analysis, failure analysis
>Life Sciences: drug discovery, protein folding, medical imaging
>Media: digital rendering, bandwidth consumption, gaming
>Public/Government: collaborative research, numerical weather forecasting, high energy physics
Technology Innovation in HPC
>Multi-core enabled systems create new opportunities to advance
applications and solutions
 Dual and Quad core along with increased density memory designs
 An “8-way” x86 system, 128 GB capable, that begins at less than $10K
>Virtualization is a hot topic for architectures
 Possible workload consolidation for cost savings
 Power consumption reduced by optimizing system level utilization
>Manageability is key to addressing complexity
 Effective power/thermal management through SW tools
 Virtualization management tools must be integrated into the overall
management scheme
Topic Agenda
>Commodity Clusters
>*Overview of Intelligent Clusters*
>Cluster Hardware
>Cluster Networking
>Cluster Management, Software Stack and Benchmarking
Approaches to Clustering
IBM delivers across the spectrum, from piece parts to integrated solution.

Roll Your Own (piece parts)
• Client orders individual components from a variety of vendors, including IBM
• Client tests and integrates components or contracts with an integrator
• Client must address warranty issues with each vendor

BP Integrated
• BP orders servers & storage from IBM and networking from 3rd-party vendors
• BP builds and integrates components and delivers to customer
• Client must address warranty issues with each vendor

IBM Racked and Stacked
• Client orders servers & storage in standard rack configurations from IBM
• Client integrates IBM racks with 3rd-party components or contracts with IGS or other integrator
• Client must address warranty issues with each vendor

Intelligent Clusters (integrated solution)
• Client orders integrated cluster solution from IBM, including servers, storage and networking components
• IBM delivers factory-built and tested cluster ready to “plug in”
• Client has single point of contact for all warranty issues

Piece parts: client bears all risk for sizing, design, integration, deployment and warranty issues. Integrated solution: single vendor responsible for sizing, design, integration, deployment, and all warranty issues.
What is an IBM System Intelligent Cluster?
An IBM portfolio of components that have been cluster configured, tested, and work with a defined supporting software stack.
• Factory assembled
• Onsite installation
• One phone number for support
• Selection of options to customize your configuration, including Linux operating system (RHEL or SUSE), xCAT, & GPFS

The degree to which a multi-server system exhibits these characteristics determines if it is a cluster:
- Dedicated private VLAN
- All nodes running the same suite of apps
- Single point-of-control for:
  - Software/application distribution
  - Hardware management
- Inter-node communication
- Node interdependence
- Linux operating system

Core technologies:
 IBM servers – compute, storage, and management nodes on Intel® processors: rack-mount servers (x3550 M3, x3650 M3), blade servers (HS21-XM, HS22, HX5), and scale-out servers (iDataPlex dx360 M3)
 Networks – Ethernet (1 GbE, 10 GbE); InfiniBand (4X SDR, DDR, QDR); out-of-band management and terminal servers
 Storage – IBM TotalStorage® disk storage and ServeRAID storage software, with Fibre Channel, SAS, and iSCSI attachment and Fibre Channel, iSCSI, or FCoE storage networking
IBM HPC Cluster Solution (Intelligent Clusters)
System x servers (rack-mount, blades or iDataPlex) + switches & storage + cluster software (GPFS, xCAT, Linux or Windows) = HPC Cluster Solution, to which IBM or a Business Partner adds the technical application (or “workload”).
Course Agenda
>Commodity Clusters
>Overview of Intelligent Clusters
>*Cluster Hardware*
>Cluster Networking
>Cluster Management, Software Stack, and Benchmarking
Intelligent Clusters Overview - Servers
IBM System x™ 3550 M3 (1U) – high-performance compute nodes
• Dual socket Intel
• Integrated system management
IBM System x™ 3650 M3 (2U) – mission-critical performance
• Dual socket Intel
• Integrated system management
Active Energy Manager™: power management at your control
IBM BladeCenter® with HS21-XM, HS22, and HX5 – industry-leading performance, reliability and control
• HS21-XM/HS22/HX5 Intel processor-based blades
• Chassis options: BladeCenter E (best energy efficiency, best density), BladeCenter H (high performance), BladeCenter S (distributed, small office, easy to configure)
• Blade options: HS21 XM (extended memory), HS22 (general-purpose enterprise), HX5 (scalable enterprise)
IBM System x iDataPlex
[Diagram: iDataPlex building blocks – 2U and 3U chassis, PDUs, switches, the iDataPlex Rear Door Heat Exchanger, HPC and Web server nodes, storage drives & options, and I/O and storage trays.]
Current iDataPlex Server Offerings

iDataPlex dx360 M2 – high-performance dual-socket
>Processor: Quad Core Intel Xeon 5500
>QuickPath Interconnect up to 6.4 GT/s
>Memory: 16 DIMM DDR3 - 128 GB max
>Memory speed: up to 1333 MHz
>PCIe: x16 electrical / x16 mechanical
>Chipset: Tylersburg-36D
>Last order date: December 31, 2010

iDataPlex 3U Storage Rich – file-intense dual-socket
>Storage: 12 3.5” HDD, up to 24 TB per node / 672 TB per rack
>Processor: 6 or 4 core Intel Xeon 5600
>Memory: 16 DIMM / 128 GB max
>Chipset: Westmere

iDataPlex dx360 M3 – high-performance dual-socket
>Processor: 6 & 4 core Intel Xeon 5600
>QuickPath Interconnect up to 6.4 GT/s
>Memory: 16 DIMM DDR3 - 128 GB max
>Memory speed: up to 1333 MHz
>PCIe: x16 electrical / x16 mechanical
>Chipset: Westmere, 12 MB cache
>Ship support: March 26, 2010

iDataPlex dx360 M3 Refresh – exa-scale hybrid CPU + GPU
>Processor: 6 & 4 core Intel Xeon 5600
>2 NVIDIA M1060 or M2050 GPUs
>QuickPath Interconnect up to 6.4 GT/s
>Memory: 16 DIMM DDR3 - 128 GB max
>Memory speed: up to 1333 MHz
>PCIe: x16 electrical / x16 mechanical
>Chipset: Westmere, 12 MB cache
>Ship support: August 12, 2010
System x iDataPlex dx360 M3
iDataPlex flexibility with better performance, efficiency and more options – tailored for your business needs.

Compute Intensive – maximum processing
• 2U chassis, 2 compute nodes
• 900W power supply

Compute + Storage – balanced storage and processing
• 2U chassis, 1 node slot & drive tray
• HDD: up to 5 (3.5”)

Maximize Storage Density
• 3U storage chassis, 1 node slot & triple drive tray (1U compute node + 1U drive trays)
• HDD: 12 (3.5” drives), up to 24 TB
• I/O: PCIe for networking + PCIe for RAID

Acceleration Compute + I/O – maximum component flexibility
• 2U chassis, 1 node slot, 1U dual-GPU I/O tray
• I/O: up to 2 PCIe; HDD: up to 8 (2.5”)

Power supply options: 550W, 900W, or 750W N+N redundant.
iDataPlex dx360 M3 Refresh
>Increased Server efficiency & Westmere enablement
 Intel Westmere-EP 4 and 6 core processor support (up to 95 watts)
 2 DIMMs / channel @ 1333 MHz with Westmere 95-watt CPUs
 Lower Power (1.35V) DIMM (2GB, 4GB, 8GB)
>Expanded I/O performance capabilities
 New I/O tray and 3-slot “butterfly” PCIe riser to support 2 GPU + network adapter
 Support for NVIDIA Tesla M1060 or “Fermi” M2050 in a 2U Chassis + 4 HDD
>Expanded Power Supply Offerings
 Optional Redundant 2U Power Supply for Line Feed (AC) and Chassis (DC) protection
 High Efficiency power supplies fitted to workload power demands
>Storage Performance, Capacity and Flexibility
 Simple-Swap SAS, SATA & SSD, 2.5” & 3.5” in any 2U configuration
 Increased capacities of 2.5” & 3.5” SAS, SATA and SSD
 Increased capacities in 3U Storage Dense to 24TB (with 2TB 3.5” SATA/SAS drives)
 6Gbps backplane for performance
 Rear PCIe slot enablement in 2U chassis for RAID controller flexibility
 Higher capacity/higher performance Solid State Drive controller
>Next-Generation Converged Networking
 FCoE via 10G Converged Network Adapters, Dual Port 10Gb Ethernet
dx360 M3 Refresh - Power Supply Offerings
>Maximum Efficiency for lower power requirements
 New High Efficiency 550W Power Supply for optimum efficiency in low power
configurations
 More efficiency by running higher on the power curve
>Flexibility to optimize power supply to workload appropriately
 550W (non-redundant) for lower power demands
 900W (non-redundant) for higher power demands
 750W N+N for node and line feed redundancy
>Redundant Power Supply option for the iDataPlex chassis
 Node-level power protection for smaller clusters, head node, 3U storage-rich, VM &
Enterprise
 Rack-level line feed redundancy with discrete feeds
 Tailor rack-level solutions that require redundant power in some or all nodes
 Maintains maximum floor space density with the iDataPlex rack
 Graceful shutdown on power supply failure for virtualized environments
>Flexibility per chassis to optimize rack power
 Power supply is per 2U or 3U chassis
 Mix across the rack to maximize flexibility, minimize stranded power
[Redundant supply block diagram: two AC feeds (AC 1, AC 2) into PS 1 and PS 2 (750W max each); 750W total in redundant mode; 200-240V only. Power supply options shown: 900W HE, 550W HE, 750W N+N.]
dx360 M3 Refresh - Rack GPU Configuration
>42 high-performance GPU servers per rack
>iDataPlex efficiency drives more density on the floor
>In-rack networking will not reduce rack density, regardless of topology required by the customer
>Rear Door Heat Exchanger provides further TCO value
>Rack-level value
 Greater density, easier to cool
 Flexibility of network topology without compromising density
 More density reduces the number of racks and power feeds in the data center
 Rear Door Heat Exchanger provides the ultimate value in cooling and density
dx360 M3 Refresh - Server GPU Configuration
Server-level value
> Each server is individually serviceable
> Balanced performance for demanding GPU workloads
> 6Gbps SAS drives and controller for maximum performance
> Service and support for server and GPU from IBM
Configuration shown:
 4x 2.5” simple-swap SAS 300 or 600GB/10K 6Gbps (or SATA, or 3.5”, or SSD)
 InfiniBand DDR (or QDR, or 10GbE)
 NVIDIA M2050 “Fermi” #1 (or M1060, or FX3800, or Fusion IO)
 NVIDIA M2050 “Fermi” #2 (or M1060, or FX3800, or Fusion IO)
Intelligent Clusters Storage Portfolio Summary
>Intelligent Clusters BOM consists of the following Storage components
 Entry-level DS3000 series disk storage systems
 Mid-range DS4000 series disk storage systems
 High-end DS5000 series disk storage systems
 All standard hard disk drives (SAS/SATA/FC)
 Entry-level SAN fabric switches
>Majority of the HPC solutions use DS3000/DS4000 series disk storage with IBM
GPFS parallel file system software
>A small percentage of HPC clusters use entry-level storage
(DS3200/DS3300/DS3400/DS3500)
>Integrated business solutions (SAP-BWA, Smart Analytics, SoFS) use DS3500
storage (mostly)
>Smaller-size custom solutions use DS3000 entry-level storage
>A small percentage of special HPC bids use DDN DCS9550 storage
Intelligent Clusters Storage Portfolio (Dec 2008)
>DS5020 (FC-SAN), DS5000 (FC-SAN)
>DS3400 (FC-SAN, SAS/SATA)
>DS3500 (SAS)
>DS3300 (iSCSI/SAS)
>EXP3000 storage expansion (JBOD)
Topic Agenda
>Commodity Clusters
>Overview of Intelligent Clusters
>Cluster Hardware
>*Cluster Networking*
>Cluster Management, Software Stack and Benchmarking
Cluster Networking
>Networking is an integral part of any cluster system, providing communication across devices, including servers and storage, as well as cluster management
>All servers in the cluster, including login, management,
compute, and storage nodes communicate using one or
more network fabrics connecting them
>Typically clusters have one or more of the following
networks
 A cluster-wide Management network
 A user/campus network through which users login to the
cluster and launch jobs
 A low-latency, high-bandwidth network such as InfiniBand
used for inter-process communication
 A Storage network used for communication across the
storage nodes (optional)
 A Fibre-channel or Ethernet network (in case of iSCSI traffic)
used as the Storage network fabric
InfiniBand Portfolio - Intelligent Cluster

QDR InfiniBand HCAs:
 ConnectX-2 Dual Port
 ConnectX-2 Single Port
 QLE 7340 Single Port

QDR InfiniBand Switches:
 4036: 1U, 36 ports
 12200-36: 1U, 36 ports
 12300-36: 1U managed, 36 ports
 InfiniScale IV: 1U, 36 ports
 Director class (InfiniScale IV): 6U / 108 ports, 10U / 216 ports, 17U / 324 ports, 29U / 648 ports
 Grid Director 4200: 11U, 110-160 ports
 Grid Director 4700: 18U, 324 ports
 12800-180: 14U, 432 ports
 12800-360: 29U, 864 ports

(Selected items are new for the 10B release.)
Intelligent Cluster Ethernet Portfolio
Entry / leaf / top-of-rack switches

10G switches:
 Cisco 4900: 2U, 24 x 10Gb ports
 Blade G8124: 1U, 24-port 10Gb SFP+

1G 48-port switches with 10G uplinks:
 SMC 8848M: 1U, 48 x 1Gb ports, 2 x 10Gb uplinks
 Cisco 4948: 1U, 48 x 1Gb ports, 2 x 10Gb optional uplinks
 Blade G8000-48: 1U, 48 x 1Gb ports, 4 x 10Gb uplinks
 IBM FCX-48 (Foxhound) 48X: 48 x 1Gb ports, 10Gb uplinks, iDataPlex (added in Oct '10 BOM)
 Force 10 S60: 1U, 48 x 1Gb ports, up to 4 x 10Gb optional uplinks

1G 48-port switches:
 SMC 8150L2: 1U, 50 x 1Gb ports
 Cisco 2960G-48: 1U, 48 x 1Gb ports
 Cisco 3750G-48: 1U, 48 x 1Gb ports with stacking

1G 24-port switches:
 SMC 8126L2: 1U, 26 x 1Gb ports

(The original chart positions these broadly as industry low cost [SMC], premium brand [Cisco], low cost 48-port [Blade, IBM], and alternative premium / stackable options.)
Ethernet Switch Portfolio - iDataPlex
Entry / leaf / top-of-rack switches

10G switches:
 Blade G8124: 1U, 24-port 10Gb SFP+
 IBM B24X (TurboIron) 24X: 24 x 10Gb ports, iDataPlex
 IBM DCN: 24-port 10Gb
 IBM DCN: 48-port 10Gb

1G 24/48-port switches with 10G uplinks:
 SMC 8848M: 1U, 48 x 1Gb ports, 2 x 10Gb uplinks (industry low cost)
 Cisco 4948E: 1U, 48 x 1Gb ports, 4 x 10Gb optional uplinks (premium brand; added in Oct '10 BOM)
 IBM B50C (NetIron 48): 1U, 48 x 1Gb ports with 2 x 10GbE optional (low cost 48 port)
 IBM FCX-48 (Foxhound) 48X: 48 x 1Gb ports, 10Gb uplinks, iDataPlex
 Blade G8000-48: 1U, 48 x 1Gb ports, 4 x 10Gb uplinks (low cost 48 port)
 IBM J48 / Juniper EX4200-48: 48 x 1Gb ports, 10Gb uplinks, 2 VC ports, iDataPlex (premium brand alternative, stackable)
 Force 10 S60: 1U, 48 x 1Gb ports, 4 x 10Gb uplinks
Ethernet Switch Portfolio - Intelligent Cluster
Core switches & adapters (all core switches and 10GbE adapters tested for compatibility with iDataPlex)

Core & aggregate switches:
 Cisco 6509-E: 15U, 9 slots, 384 x 1Gb ports, 32 x 10Gb ports
 IBM B16R (BigIron): 16 slots, 768 x 1Gb ports, 256 x 10Gb ports
 IBM B08R (BigIron): 8 slots, 384 x 1Gb ports, 32 x 10Gb ports
 Voltaire 8500: 15U, 12 slots, 288 x 10Gb ports
 Force 10 E600i: 16U, 7 slots, 633 x 1Gb ports, 112 x 10Gb ports
 Force 10 E1200i: 21U, 14 slots, 1260 x 1Gb ports, 224 x 10Gb ports
 Juniper 8216: 21U, 16 slots, 768 x 1Gb ports, 128 x 10Gb ports
 Juniper 8208: 14U, 8 slots, 384 x 1Gb ports, 64 x 10Gb ports

10GbE HPC adapters (added in Oct '10 BOM):
 Chelsio dual-port T3 SFP+ 10GbE PCI-E x8 line-rate adapter
 Chelsio dual-port T3 CX4 10GbE PCI-E x8 line-rate adapter
 Chelsio dual-port T3 10GbE CFFh high-performance daughter card for blades
 Mellanox ConnectX-2 EN 10GbE PCI-E x8 line-rate adapter
High-speed Networking
>Many HPC applications are sensitive to network bandwidth and latency for
performance
>Primary choices for high-speed networking for Clusters
 InfiniBand
 10 Gigabit Ethernet (emerging)
>InfiniBand
 InfiniBand is an industry standard low-latency, high-bandwidth server interconnect, ideal
to carry multiple traffic types (clustering, communications, storage, management) over a
single connection
>10Gigabit Ethernet
 10GbE or 10GigE is the IEEE Ethernet standard 802.3ae, which defines Ethernet technology with a data rate of 10 Gbit/s
 Follow-on to 1 Gigabit Ethernet technology
InfiniBand
>An industry standard low-latency, high-bandwidth server interconnect
>Ideal to carry multiple traffic types (clustering, communications, storage,
management) over a single physical connection
>Serial I/O interconnect architecture operating at a base speed of 5Gb/s in each
direction with DDR and 10Gb/s in each direction with QDR
>Provides the highest node-to-node bandwidth available today, 40 Gb/s in each direction, with 4X Quadruple Data Rate (QDR) technology
>Lowest end-to-end messaging latency, on the order of microseconds (1.2-1.5 µsec)
>Wide-industry adoption and multiple vendors (Mellanox, Voltaire, QLogic, etc.)
>Open source drivers and libraries are available for users (OFED)
InfiniBand Peak Bidirectional Bandwidth Table (per-lane signaling rates: SDR 2.5 Gb/s, DDR 5 Gb/s, QDR 10 Gb/s, EDR 20 Gb/s)
Lanes | SDR | DDR | QDR | EDR
1x | (2.5 + 2.5) Gb/s | (5 + 5) Gb/s | (10 + 10) Gb/s | (20 + 20) Gb/s
4x | (10 + 10) Gb/s | (20 + 20) Gb/s | (40 + 40) Gb/s | (80 + 80) Gb/s
8x | (20 + 20) Gb/s | (40 + 40) Gb/s | (80 + 80) Gb/s | (160 + 160) Gb/s
12x | (30 + 30) Gb/s | (60 + 60) Gb/s | (120 + 120) Gb/s | (240 + 240) Gb/s
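As a worked example of how the table entries follow from the per-lane rates, consider a 4X QDR link. The 8b/10b line-encoding overhead used below for the effective data rate is background knowledge about SDR/DDR/QDR links, not a figure from this slide:

$$
\begin{aligned}
\text{4X QDR signaling rate} &= 4 \text{ lanes} \times 10\ \text{Gb/s} = 40\ \text{Gb/s per direction} \;\;(40 + 40\ \text{Gb/s bidirectional}),\\
\text{effective data rate} &\approx 40\ \text{Gb/s} \times \tfrac{8}{10} = 32\ \text{Gb/s per direction (8b/10b encoding)}.
\end{aligned}
$$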
10 Gigabit Ethernet
>10GbE or 10GigE is an IEEE Ethernet standard 802.3ae, which defines Ethernet
technology with data rates of 10 Gbits/sec
>Enables applications to take advantage of 10Gbps Ethernet
>Requires no changes to the application code
>High-speed interconnect choice for “loosely-coupled” HPC applications
>Wide industry support for 10GbE technology
>Growing user adoption for Data Center Ethernet (DCE) and Fibre Channel Over
Ethernet (FCoE) technologies
>Intelligent Clusters supports 10GbE technologies for both node-level and switch-
level, providing multiple vendor choices for adapters and switches (BNT, SMC,
Force10, Brocade, Cisco, Chelsio, etc.)
Topic Agenda
>Commodity Clusters
>Overview of Intelligent Clusters
>Cluster Hardware
>Cluster Networking
>*Cluster Management, Software Stack and Benchmarking*
Cluster Management - xCAT
>xCAT - Extreme Cluster (Cloud) Administration Toolkit
 Open Source Linux/AIX/Windows Scale-out Cluster Management Solution
 Leverage best practices for deploying and managing Clusters at scale
 Scripts only (no compiled code)
 Community requirements driven
>xCAT Capabilities
 Remote Hardware Control
- Power, Reset, Vitals, Inventory, Event Logs, SNMP alert processing
 Remote Console Management
- Serial Console, SOL, Logging / Video Console (no logging)
 Remote OS Boot Target Control
- Local/SAN Boot, Network Boot, iSCSI Boot
 Remote Automated Unattended Network Installation
 For more information on xCAT go to http://xcat.sf.net
Cluster Software Stack
IBM GPFS - General Parallel File System: a high-performance, scalable file management solution
>Provides fast and reliable access to a common set of file data from a single computer to hundreds of systems
>Brings together multiple systems to create a truly scalable cloud storage infrastructure
>GPFS-managed storage improves disk utilization and reduces footprint, energy consumption and management effort
>GPFS removes client-server and SAN file system access bottlenecks
>All applications and users share all disks, with dynamic re-provisioning capability
[Diagram: GPFS clients on the LAN sharing disks over the SAN.]
Technology:
>OS support
 Linux (on POWER and x86)
 AIX
 Windows
>Interconnect support (with TCP/IP)
 1 GbE and 10 GbE
 InfiniBand (RDMA in addition to IPoIB)
 Myrinet
 IBM HPS
What is GPFS ?
>IBM’s shared disk, parallel cluster file system.
>Product available on pSeries/xSeries clusters with AIX/Linux
>Used on many of the largest supercomputers in the world
>Cluster: 2400+ nodes, fast reliable
communication, common admin
domain.
>Shared disk: all data and metadata
on disk accessible from any node
through disk I/O interface.
>Parallel: data and metadata flows
from all of the nodes to all of the
disks in parallel.
[Diagram: GPFS file system nodes connected over a switching fabric (system or storage area network) to shared disks (SAN-attached or network block device).]
For more information on IBM GPFS, go to http://www-03.ibm.com/systems/clusters/software/gpfs/index.html
Resource Managers/Schedulers
>Resource managers/schedulers queue, validate, manage, load balance, and launch user programs/jobs
>Torque - Portable Batch System (free)
 Works with the Maui Scheduler (free)
>LSF - Load Sharing Facility (commercial)
>Sun Grid Engine (free)
>Condor (free)
>MOAB Cluster Suite (commercial)
>LoadLeveler (commercial scheduler from IBM)
[Diagram: users 1..N submit jobs to the job scheduler and job queue; the resource manager dispatches them to nodes 1..N.]
Message Passing Libraries
>Enable inter-process communication among the processes of an application running across multiple nodes in the cluster (or on a symmetric multi-processing system)
>“Mask” the underlying interconnect from the user application
 Allows the application programmer to use a “virtual” communication environment as the reference for programming cluster applications (a minimal MPI example follows below)

Message Passing Interface (MPI) implementations:
>MPICH2 (free)
 IP (Ethernet)
 MX (Myrinet)
 InfiniBand
>LAM-MPI (free)
 IP (Ethernet)
>Scali (commercial)
 IP (Ethernet)
 MX (Myrinet)
 InfiniBand
>OpenMPI (free)
 IP (Ethernet)
 InfiniBand

Parallel Virtual Machine (PVM):
>Included with most Linux distributions (open source)
 IP (Ethernet)
 GM (Myrinet)
>Linda (commercial)
 IP (Ethernet)
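To make the “masking” point concrete, here is a minimal MPI sketch (not taken from the deck): rank 0 sends a message to rank 1 using the same code whether the ranks are connected over Ethernet or InfiniBand; the MPI library selects the transport underneath. Build and launch commands vary by MPI distribution (typically something like mpicc and mpirun).

```c
/* Minimal MPI sketch: rank 0 sends a message to rank 1.
 * The same source runs unchanged over IP (Ethernet) or InfiniBand;
 * the MPI library hides the interconnect from the application.
 */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, size;
    char msg[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) fprintf(stderr, "Run with at least 2 ranks.\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {
        strcpy(msg, "hello from rank 0");
        MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received: %s\n", msg);
    }

    MPI_Finalize();
    return 0;
}
```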
Compilers & Other tools
>Compilers are critical in creating an optimized binary code that takes advantage of
the specific processor architectural features such that the application can exploit
the full power of the system and runs most efficiently
>Respective processor vendors typically have the best compilers for their
processors – e.g. Intel, AMD, IBM, SGI, Sun, etc.
>Compilers are important to produce the best code for HPC applications as
individual node performance is a critical factor for the overall cluster performance
>Open source and commercial compilers are available such as the GNU GCC
compiler suite (C/C++, Fortran 77/90) (Free), and PathScale (owned by QLogic)
compilers
>Support libraries and debugger tools are also packaged and made available with the compilers, such as math libraries (e.g. the Intel Math Kernel Library and the AMD Core Math Library; see the sketch below) and debuggers such as gdb (the GNU debugger) and the TotalView debugger, used for debugging parallel applications on clusters
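As a small illustration of the math-library point, the sketch below multiplies two matrices through the standard CBLAS dgemm interface, which vendor libraries such as Intel MKL and ACML implement. The header name and link flags differ by vendor, so the include line is an assumption for illustration, not a statement about any particular product.

```c
/* Minimal CBLAS sketch: C = alpha*A*B + beta*C via dgemm.
 * Header/link details vary by vendor (e.g., MKL ships its own headers),
 * so the include below is illustrative only.
 */
#include <cblas.h>
#include <stdio.h>

int main(void)
{
    /* 2x2 matrices stored in row-major order */
    double A[4] = {1.0, 2.0,
                   3.0, 4.0};
    double B[4] = {5.0, 6.0,
                   7.0, 8.0};
    double C[4] = {0.0, 0.0,
                   0.0, 0.0};

    /* C = 1.0 * A * B + 0.0 * C */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2,        /* m, n, k        */
                1.0, A, 2,      /* alpha, A, lda  */
                B, 2,           /* B, ldb         */
                0.0, C, 2);     /* beta, C, ldc   */

    printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
    return 0;
}
```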
HPC Software Stack
Intelligent Clusters supports a broad range of HPC software from industry-leading suppliers. Software is available directly from IBM or from the respective solution providers.

Cluster systems management:
 xCAT2 - IBM (IBM CSM functionality now merged into xCAT2)
 IBM Director - IBM

File systems:
 General Parallel File System (GPFS) for Linux; GPFS for Linux on POWER - IBM
 PolyServe Matrix Server File System - HP
 NFS - open source
 Lustre - open source

Workload management:
 Open PBS - open source
 PBS Pro - Altair
 LoadLeveler - IBM
 LSF - Platform Computing
 MOAB - Cluster Resources (commercial version of the Maui scheduler)
 GridServer - DataSynapse
 Maui Scheduler - open source (interfaces to many schedulers)

Message Passing Interface solutions:
 Scali MPI Connect™ - Scali

Compilers:
 PGI Fortran 77/90, C/C++ - STMicroelectronics/The Portland Group (32/64-bit support)
 Intel Fortran/C/C++ - Intel
 NAG Fortran/C/C++ - NAG (32/64-bit)
 Absoft® compilers - Absoft
 PathScale™ compilers - PathScale (AMD Opteron)
 GCC - open source
HPC Software Stack (cont.)

Debugger/tracer:
 TotalView - Etnus
 CodeAnalyst - AMD (timer/event profiling, pipeline simulations)
 Fx2 Debugger™ - Absoft
 Distributed Debugging Tool (DDT) - Allinea

Math libraries:
 ACML (AMD Core Math Libraries) - AMD/NAG (BLAS, FFT, LAPACK)
 Intel Integrated Performance Primitives - Intel
 Intel Math Kernel Library - Intel
 Intel Cluster Math Kernel Library - Intel
 IMSL™, PV-WAVE® - Visual Numerics

Message passing libraries:
 MPICH - open source (TCP/IP networks)
 MPICH-GM - Myricom (Myrinet networks)
 SCA TCP Linda™ - SCA
 WMPI II™ - Critical Software

Parallelization tools:
 TCP Linda® - SCA

Interconnect management:
 Scali MPI Connect - Scali

Performance tuning:
 Intel VTune™ Performance Analyzer - Intel
 Optimization and Profiling Tool (OPT) - Allinea
 High Performance Computing Toolkit - IBM (http://www.research.ibm.com/actc)

Threading tool:
 Intel Thread Checker - Intel

Trace tool:
 Intel Trace Analyzer and Collector - Intel
Cluster Benchmarking
Benchmarking – the technique of running well-known reference applications on a cluster to exercise various system components and measure the cluster’s performance characteristics (e.g. network bandwidth, latency, FLOPs)
>STREAM (memory access latency and bandwidth)
 http://www.cs.virginia.edu/stream/ref.html
>Linpack - the TOP500 benchmark
 Solves a dense system of linear equations
 You are allowed to tune the problem size and benchmark to optimize for your system
 http://www.netlib.org/benchmark/hpl/index.html
>HPC Challenge
 A set of HPC benchmarks to test various subsystems of a cluster system
 http://icl.cs.utk.edu/hpcc/
>SPEC
 A set of commercial benchmarks to measure performance of various subsystems of the servers
 http://www.spec.org/
>NAS Parallel Benchmarks 2.3
 http://www.nas.nasa.gov/Resources/Software/npb.html
>Intel MPI Benchmarks (previously the Pallas benchmarks)
 http://software.intel.com/en-us/articles/intel-mpi-benchmarks/
>Ping-Pong (common MPI benchmark to measure point-to-point latency and bandwidth; see the sketch after this list)
>Customer's own code
 Provides a good representation of the system performance specific to the application code
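For reference, the sketch below is a stripped-down ping-pong measurement written against standard MPI calls. It assumes two ranks (ideally on separate nodes) and is an illustration of the technique, not the code of any of the benchmark suites listed above.

```c
/* Minimal MPI ping-pong sketch: ranks 0 and 1 bounce a message back and
 * forth. The average round-trip time approximates latency for small
 * messages; large messages give an estimate of point-to-point bandwidth.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 1000;
    const int nbytes = 1 << 20;            /* 1 MiB message */
    char *buf = malloc(nbytes);
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "Run with at least 2 ranks.\n");
        MPI_Finalize();
        return 1;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t = MPI_Wtime() - t0;

    if (rank == 0) {
        double rtt = t / iters;                 /* seconds per round trip     */
        double bw  = 2.0 * nbytes / rtt / 1e9;  /* GB/s, counting both legs   */
        printf("avg round trip: %.2f us, bandwidth: %.2f GB/s\n",
               rtt * 1e6, bw);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```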
Summary
> A Cluster system is created out of commodity server hardware, high-speed networking,
storage and software technologies
> High-performance computing (HPC) takes advantage of cluster systems to solve complex
problems in various industries
> IBM Intelligent Clusters provides a one-stop-shop for creating and deploying HPC solutions
using IBM servers and third party Networking, Storage and Software
> InfiniBand, Myrinet (MX and Myri-10G), and 10 Gigabit Ethernet are the most commonly used high-speed interconnect technologies for clusters
> IBM GPFS parallel file system provides a highly-scalable, and robust parallel file system and
storage virtualization solution for Clusters and other general-purpose computing systems
> xCAT is an open-source, scalable cluster deployment and Cloud hardware management
solution
> Cluster benchmarking enables performance analysis, debugging and tuning capabilities for
extracting optimal performance from Clusters by isolating and fixing critical bottlenecks
> Message-passing middleware enables developing HPC applications for Clusters
> Several commercial software tools are available for Cluster computing
Glossary of Terms
>Commodity Cluster
>InfiniBand
>Message Passing Interface (MPI)
>Extreme Cluster (Cloud)
Administration Toolkit (xCAT)
>Network-attached storage (NAS)
>Cluster VLAN
>Message-Passing Libraries
>Management Node
>High Performance Computing (HPC)
>Roll Your Own (RYO)
>BP Integrated
>Distributed Network Topology
>Intelligent Clusters
>General Parallel File System (GPFS)
>Direct-attached storage (DAS).
>iDataPlex
>Inter-node communication
>Compute Network
>Centralized Network Topology
>IBM Racked and Stacked
>Leaf Switch
>Core/aggregate Switch
>Quadruple Data Rate
>Storage Area Network (SAN)
>Parallel Virtual Machine (PVM)
>Benchmarking
Additional Resources
>IBM STG SMART Zone for more education:
 Internal: http://lt.be.ibm.com
 BP: http://lt2.portsmouth.uk.ibm.com/
>IBM System x
 http://www-03.ibm.com/systems/x/
>IBM ServerProven
 http://www-03.ibm.com/servers/eserver/serverproven/compat/us/
>IBM System x Support
 http://www-947.ibm.com/support/entry/portal/
>IBM System Intelligent Clusters
 http://www-03.ibm.com/systems/x/hardware/cluster/index.html
Trademarks
•The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.
>Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the
product is not actively marketed or is not significant within its relevant market.
>Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.
For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml:
•The following are trademarks or registered trademarks of other companies.
>Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or
other countries.
>Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefore.
>Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
>Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
>Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered
trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
>UNIX is a registered trademark of The Open Group in the United States and other countries.
>Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
>ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark
Office.
>IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.
•All other products may be trademarks or registered trademarks of their respective companies
>Notes:
>Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The
actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput
improvements equivalent to the performance ratios stated here.
>IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
>All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the
results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
>This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the
information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
>All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
>Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and
cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed
to the suppliers of those products

2024.03.12 Cost drivers of cultivated meat production.pdf
 
Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projects
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.
 
UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development Companies
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)
 

Xtw01t7v021711 cluster

  • 1. © 2006 IBM Corporation This presentation is intended for the education of IBM and Business Partner sales personnel. It should not be distributed to customers. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation Introduction to Intelligent Clusters XTW01 Topic 7
  • 2. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 2 Course Overview The objectives of this course of study are: >Describe a high-performance computing cluster >List the business goals that Intelligent Clusters addresses >Identify three core Intelligent Clusters components >List the high-speed networking options available in Intelligent Clusters >List three software tools used in Clusters >Describe Cluster benchmarking
  • 3. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 3 Topic Agenda >*Commodity Clusters* >Overview of Intelligent Clusters >Cluster Hardware >Cluster Networking >Cluster Management, Software Stack, and Benchmarking
  • 4. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 4 >Clusters are comprised of standard, commodity components that could be used separately in other types of computing configurations  Compute servers – a.k.a. nodes  High-speed networking adapters and switches  Local and/or external storage  A commodity operating system such as Linux  Systems management software  Middleware libraries and Application software >Clusters enable “Commodity-based supercomputing” What is a Commodity Cluster? A multi-server system, comprised of interconnected computers and associated networking and storage devices, that are unified via systems management and networking software to accomplish a specific purpose.
  • 5. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 5 Storage Rack Fiber Network Fibre SAN Switch Storage Nodes Ethernet Switch Management, Storage, SOL and Cluster VLANs Storage VLAN Management, SOL, Cluster VLANs Management Node User/Login Nodes LAN Cluster VLAN High-speed network Switch Message-passing Network User access To management network Compute Node Rack Public VLAN Conceptual View of a Cluster
  • 6. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 6 Energy Finance Mfg Life Sciences Media Public / Gov’t Seismic Analysis Reservoir Analysis Derivative Analysis Actuarial Analysis Asset Liability Management Portfolio Risk Analysis Statistical Analysis Mechanical/ Electric Design Process Simulation Finite Element Analysis Failure Analysis Drug Discovery Protein Folding Medical Imaging Digital Rendering Collaborative Research Numerical Weather Forecasting High Energy Physics Bandwidth Consumption Gaming Application of Clusters in Industry
  • 7. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 7 Technology Innovation in HPC >Multi-core enabled systems create new opportunities to advance applications and solutions  Dual and Quad core along with increased density memory designs  “8 way” x86 128GB capable system that begins at less than $10k. >Virtualization is a hot topic for architectures  Possible workload consolidation for cost savings  Power consumption reduced by optimizing system level utilization >Manageability is key to addressing complexity  Effective power/thermal management through SW tools  Virtualization management tools must be integrated into the overall management scheme Multi Core Virtualization Manageability
  • 8. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 8 Topic Agenda >Commodity Clusters >*Overview of Intelligent Clusters* >Cluster Hardware >Cluster Networking >Cluster Management, Software Stack and Benchmarking
  • 9. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 9 Approaches to Clustering Roll Your Own • Client orders individual components from a variety of vendors, including IBM • Client tests and integrates components or contracts with an integrator • Client must address warranty issues with each vendor BP Integrated • BP orders servers & storage from IBM and networking from 3rd Party vendors • BP builds and integrates components and delivers to customer • Client must address warranty issues with each vendor IBM Racked and Stacked • Client orders servers & storage in standard rack configurations from IBM • Client integrates IBM racks with 3rd Party components or contracts with IGS or other integrator • Client must address warranty issues with each vendor Intelligent Clusters • Client orders integrated cluster solution from IBM, including servers, storage and networking components • IBM delivers factory-built and tested cluster ready to “plug-in” • Client has Single Point of Contact for all warranty issues. Piece Parts Integrated Solution Client bears all risk for sizing, design, integration, deployment and warranty issues Single Vendor responsible for sizing, design, integration, deployment, and all warranty issues IBM Delivers Across the Spectrum
  • 10. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 10 Blade Servers Disk Storage Storage Networking Fiber Channel iSCSI FcOE Rack-mount Servers Compute Nodes Storage Software ServeRAID IBM TotalStorage® IBM Servers Core Technologies An IBM portfolio of components that have been cluster configured, tested, and work with a defined supporting software stack. •Factory assembled •Onsite Installation •One phone number for support. •Selection of options to customize your configuration including Linux operating system (RHEL or SUSE), xCAT, & GPFS The degree to which a multi-server system exhibits these characteristics determines if it is a cluster: - Dedicated private VLAN - All nodes running same suite of apps - Single point-of-control for: - Software/application distribution - Hardware management - Inter-node communication - Node interdependence - Linux operating system Storage Nodes Management Nodes Processors -Intel® Fiber SAS iSCSI HS21-XM Ethernet 10 GbE 1 GbE InfiniBand 4X – SDR 4X – DDR 4X - QDR Networks What is an IBM System Intelligent Cluster? Out of band Management Terminal Serv. x3550 M3 x3650 M3 HS22 Scale-out Servers iDataPlex dx360 M3 HX5
  • 11. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 11 IBM HPC Cluster Solution (Intelligent Clusters): System x Servers (Rack mount, Blades or iDataPlex) + Switches & Storage + Cluster Software (GPFS, xCAT, Linux or Windows) + the Technical Application (or “Workload”) that IBM or a Business Partner adds = HPC Cluster Solution
  • 12. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 12 Course Agenda >Commodity Clusters >Overview of Intelligent Clusters >*Cluster Hardware* >Cluster Networking >Cluster Management, Software Stack, and Benchmarking
  • 13. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 13 Intelligent Clusters Overview - Servers IBM System x™ 3550 M3 High performance compute nodes • Dual Socket – 3550 M3 Intel • Integrated System Management IBM System x™ 3650 M3 Mission critical performance • Dual Socket – 3650 M3 Intel • Integrated System Management 2U 1U Active Energy ManagerTM : Power Management at Your Control • HS21-XM/HS22/HX5 Intel Processor-based Blades IBM BladeCenter® with HS21-XM, HS22, and HX5 IBM BladeCenter S Distributed, small office, easy to configure IBM BladeCenter H High performance IBM BladeCenter E Best energy efficiency, best density HS21 XM Extended-memory HS22 General-purpose enterprise Industry-leading performance, reliability and control HX5 Scalable enterprise
  • 14. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 14 IBM System x iDataPlex PDUs 3U Chassis 2U Chassis Switches iDataPlex Rear Door Heat Exchanger HPC Server Web Server Storage Drives & Options I/O Tray Storage Tray
  • 15. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation Current iDataPlex Server Offerings >Processor: Quad Core Intel Xeon 5500 >Quick Path Interconnect up to 6.4 GT/s >Memory:16 DIMM DDR3 - 128 GB max >Memory Speed: up to 1333 MHz >PCIe: x16 electrical/ x16 mechanical >Chipset: Tylersburg-36D >Last Order Date: December 31, 2010 iDataPlex dx360 M2 High-performance Dual-Socket >Storage: 12 3.5” HDD up to 24 TB per node / 672TB per rack >Proc: 6 or 4 Core Intel Xeon 5600 >Memory: 16 DIMM / 128 GB max >Chipset: Westmere iDataPlex 3U Storage Rich File Intense Dual-Socket >Processor: 6 & 4 Core Intel Xeon 5600 >Quick Path Interconnect up to 6.4 GT/s >Memory:16 DIMM DDR3 - 128 GB max >Memory Speed: up to 1333 MHz >PCIe: x16 electrical/ x16 mechanical >Chipset: Westmere 12 MB cache >Ship Support March 26, 2010 iDataPlex dx360 M3 High-performance Dual-Socket >Processor: 6 & 4 Core Intel Xeon 5600 >2 NVIDIA M1060 or M2050 >Quick Path Interconnect up to 6.4 GT/s >Memory:16 DIMM DDR3 - 128 GB max >Memory Speed: up to 1333 MHz >PCIe: x16 electrical/ x16 mechanical >Chipset: Westmere 12 MB cache >Ship Support August 12, 2010 iDataPlex dx360 M3 Refresh Exa-scale Hybrid CPU + GPU
  • 16. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation System x iDataPlex dx360 M3 iDataPlex flexibility with better performance, efficiency and more options! 1U Drive Tray 1U Compute Node 3U Storage Chassis Maximize Storage Density 3U, 1 Node Slot & Triple Drive Tray HDD: 12 (3.5” Drives) up to 24TB I/O: PCIe for networking + PCIe for RAID Compute + Storage Balanced Storage and Processing 2U, 1 Node Slot & Drive Tray HDD: up to 5 (3.5”) Compute Intensive Maximum Processing 2U, 2 Compute Nodes 750W N+N Redundant Power Supply 900W Power Supply 1U Dual GPU I/O Tray 550W Power Supply Acceleration Compute + I/O Maximum Component Flexibility 2U, 1 Node Slot I/O: up to 2 PCIe, HDD up to 8 (2.5”) Tailored for Your Business Needs
  • 17. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation iDataPlex dx360 M3 Refresh >Increased Server efficiency & Westmere enablement  Intel Westmere-EP 4 and 6 core processor support (up to 95 watts)  2 DIMM / Channel @1333MHz with Westmere 95 watt CPU’s  Lower Power (1.35V) DIMM (2GB, 4GB, 8GB) >Expanded I/O performance capabilities  New I/O tray and 3-slot “butterfly” PCIe riser to support 2 GPU + network adapter  Support for NVIDIA Tesla M1060 or “Fermi” M2050 in a 2U Chassis + 4 HDD >Expanded Power Supply Offerings  Optional Redundant 2U Power Supply for Line Feed (AC) and Chassis (DC) protection  High Efficiency power supplies fitted to workload power demands >Storage Performance, Capacity and Flexibility  Simple-Swap SAS, SATA & SSD, 2.5” & 3.5” in any 2U configuration  Increased capacities of 2.5” & 3.5” SAS, SATA and SSD  Increased capacities in 3U Storage Dense to 24TB (with 2TB 3.5” SATA/SAS drives)  6Gbps backplane for performance  Rear PCIe slot enablement in 2U chassis for RAID controller flexibility  Higher capacity/higher performance Solid State Drive controller >Next-Generation Converged Networking  FCoE via 10G Converged Network Adapters, Dual Port 10Gb Ethernet
  • 18. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation dx360 M3 Refresh - Power Supply Offerings >Maximum Efficiency for lower power requirements  New High Efficiency 550W Power Supply for optimum efficiency in low power configurations  More efficiency by running higher on the power curve >Flexibility to optimize power supply to workload appropriately  550W (non-redundant) for lower power demands  900W (non-redundant) for higher power demands  750W N+N for node and line feed redundancy >Redundant Power Supply option for the iDataPlex chassis  Node-level power protection for smaller clusters, head node, 3U storage-rich, VM & Enterprise  Rack-level line feed redundancy with discrete feeds  Tailor rack-level solutions that require redundant power in some or all nodes  Maintains maximum floor space density with the iDataPlex rack  Graceful shutdown on power supply failure for virtualized environments >Flexibility per chassis to optimize rack power  Power supply is per 2U or 3U chassis  Mix across the rack to maximize flexibility, minimize stranded power 900W HE 550W HE 750W N+N AC 1 AC 2 PS 1 750W Max PS 2 750W Max C A B 750W Total in redundant mode 200-240V only Redundant supply block diagram
  • 19. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation >Rack level value  Greater density, easier to cool  Flexibility of network topology without compromising density  More density reduces number of racks and power feeds in the data center  Rear Door Heat Exchanger provides the ultimate value in cooling and density dx360 M3 Refresh - Rack GPU Configuration >42 High Performance GPU servers / rack >iDataPlex efficiency drives more density on the floor >In-rack networking will not reduce rack density, regardless of topology required by the customer >Rear Door Heat Exchanger provides further TCO value
  • 20. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 4x 2.5” SS SAS 300 or 600GB/10K 6Gbps (or SATA, or 3.5”, or SSD…) InfiniBand DDR (or QDR, or 10GbE…) NVIDIA M2050 #1 “Fermi” (or M1060, or FX3800, or Fusion IO, …) NVIDIA M2050 #2 “Fermi” (or M1060, or FX3800, or Fusion IO, …) Server level value > Each server is individually serviceable > Balanced performance for demanding GPU workloads > 6Gbps SAS drives and controller for maximum performance > Service and support for server and GPU from IBM dx360 M3 Refresh - Server GPU Configuration
  • 21. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 21 Intelligent Clusters Storage Portfolio Summary >Intelligent Clusters BOM consists of the following Storage components  Entry-level DS3000 series disk storage systems  Mid-range DS4000 series disk storage systems  High-end DS5000 series disk storage systems  All standard hard disk drives (SAS/SATA/FC)  Entry-level SAN fabric switches >Majority of the HPC solutions use DS3000/DS4000 series disk storage with IBM GPFS parallel file system software >A small percentage of HPC clusters use entry-level storage (DS3200/DS3300/DS3400/DS3500) >Integrated business solutions (SAP-BWA, Smart Analytics, SoFS) use DS3500 storage (mostly) >Smaller-size custom solutions use DS3000 entry-level storage >A small percentage of special HPC bids use DDN DCS9550 storage
  • 22. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 22 DS5020 (FC-SAN) DS5000 (FC-SAN) DS3400 (FC-SAN, SAS/SATA) DS3500 (SAS) DS3300 (iSCSI/SAS) EXP3000 Storage Expansion (JBOD) Intelligent Clusters Storage Portfolio (Dec 2008)
  • 23. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 23 Topic Agenda >Commodity Clusters >Overview of Intelligent Clusters >Cluster Hardware >*Cluster Networking* >Cluster Management, Software Stack and Benchmarking
  • 24. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 24 Cluster Networking >Networking is an integral part of any cluster system, providing communication across the various devices, including servers and storage, and supporting cluster management >All servers in the cluster, including login, management, compute, and storage nodes, communicate using one or more network fabrics connecting them >Typically clusters have one or more of the following networks  A cluster-wide Management network  A user/campus network through which users log in to the cluster and launch jobs  A low-latency, high-bandwidth network such as InfiniBand used for inter-process communication  A Storage network used for communication across the storage nodes (optional)  A Fibre-channel or Ethernet network (in case of iSCSI traffic) used as the Storage network fabric Cluster Network
  • 25. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation QDR InfiniBand HCA’s QDR InfiniBand Switches 4036 1U 36 ports 12200-36 1U 36 ports InfiniScale IV 1U 36 ports Director Class  InfiniScale IV 10 U 216 ports Director Class InfiniScale IV 29 U 648 ports 12800-180 14 U 432 ports 12800-360 29 U 864 ports ConnectX-2 Dual Port ConnectX-2 Single Port QLE 7340 Single Port 12300-36 1U Managed 36 ports Director Class  InfiniScale IV 6 U 108 ports Director Class  InfiniScale IV 17 U 324 ports  Grid Director4700 18U 324 ports  Grid Director 4200 11U 110-160 ports = New for 10B Release InfiniBand Portfolio - Intelligent Cluster
  • 26. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation SMC 8848M 1U 48 x 1Gb ports 2 x 10Gb uplink SMC 8126L2 1U 26 1Gb ports Cisco 4948 1U 48 x 1Gb ports 2x 10Gb optional uplink Cisco 2960G-48 48 1Gb ports 1U Low Cost 48 Port Industry Low Cost Premium Brand Alternative Premium Brand (Stackable) Premium Low Cost Industry Low Cost Blade G8000-48 48-1Gb ports 4x10Gb Up 1U Cisco 4900 2U Cisco 10GbE 24 10Gb Ports IBM FCX-48 (Foxhound) 48X 48 1Gb ports 10Gb Up - I-DPX Low Cost 48 Port Added in Oct 10 BOM SMC 8150L2 1U 50 1Gb ports Industry Low Cost Force 10 S60 1U 48 x 1Gb ports Up to 4 x 10Gb opt. uplink Blade G8124 1U 24x SFP+ 10Gb 24-port 10Gb SFP+ Cisco 3750G-48 48 1Gb ports with Stacking 1U Intelligent Cluster Ethernet Portfolio 10G Switches 1G 48 Port with 10G Up 1G 48 Port Switches 1G 24 Port Switches Entry / Leaf / Top of Rack Switches
  • 27. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation Ethernet Switch Portfolio - iDataPlex SMC 8848M 1U 48 x 1Gb ports 2 x 10Gb uplink Industry Low Cost Cisco 4948E 1U 48 x 1Gb ports 4 x 10Gb optional uplink Premium Brand Added in Oct 10 BOM Blade G8124 1U 24x SFP+ 10Gb 24-port 10Gb SFP+ IBM B24X (TurboIron) 24X 24 10Gb ports I-DPX IBM DCN -24port 10Gb IBM DCN -48port 10Gb IBM B50C 1U (NetIron 48) 48 1Gb ports w/2 10GbE (opt) Low Cost 48 Port IBM FCX-48 (Foxhound) 48X 48 1Gb ports 10Gb Up - I-DPX Blade G8000-48 48-1Gb ports 4x10Gb Up 1U Low Cost 48 Port IBM J48 Juniper EX4200-48 48-1Gb ports 10Gb Up, 2 VC ports I-DPX Premium Brand Alternative Premium Brand (Stackable) Force 10 S60 1U 48 x 1Gb ports 4-10Gb Uplinks 10G Switches 1G 48 Port with 10G Uplinks 1G 24/48 Port Switches Entry / Leaf / Top of Rack Switches
  • 28. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation Ethernet Switch Portfolio - Intelligent Cluster Core & Aggregate Switches . Cisco 6509-E 15U 9 Slots 384-1Gb ports 32-10Gb ports Chelsio Dual Port T3 SFP+ 10Gbe PCI-E x8 line rate adapter Chelsio Dual port T3 CX4 10Gbe PCI-E x8 line rate adapter Chelsio Dual port T3 10Gbe CFFh High Performance Daughter Card for Blades Mellanox ConnectX 2 EN 10GbE PCI-E x8 line rate adapter Added in Oct 10 BOM 10GbE HPC Adapters IBM B16R (BigIron) 16 Slots 768 -1Gb 256 -10Gb ports IBM B08R (BigIron) 8 Slots 384 -1Gb 32 -10Gb ports Voltaire 8500 12 Slots 15U 288 -10Gb ports All Core Switches & 10GbE Adapters Tested for compatibility with iDataPlex Force 10 E600i 16U 7 slots 633-1Gb ports 112-10Gb ports Force 10 E1200i 21U 14 slots 1260-1Gb ports 224-10Gb ports Juniper 8216 21U 16 Slots 768-1Gb ports 128 -10Gb ports Juniper 8208 14U 8 Slots 384-1Gb ports 64-10Gb ports Core Switches & Adapters
  • 29. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 29 High-speed Networking >Many HPC applications are sensitive to network bandwidth and latency for performance >Primary choices for high-speed networking for Clusters  InfiniBand  10 Gigabit Ethernet (emerging) >InfiniBand  InfiniBand is an industry standard low-latency, high-bandwidth server interconnect, ideal to carry multiple traffic types (clustering, communications, storage, management) over a single connection >10Gigabit Ethernet  10GbE or 10GigE is an IEEE Ethernet standard 802.3ae, which defines Ethernet technology with data rate of 10 Gbits/sec  Follow-on to 1Gigabit Ethernet technology
  • 30. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 30 InfiniBand >An industry standard low-latency, high-bandwidth server interconnect >Ideal to carry multiple traffic types (clustering, communications, storage, management) over a single physical connection >Serial I/O interconnect architecture operating at a base speed of 5Gb/s in each direction with DDR and 10Gb/s in each direction with QDR >Provides the highest node-to-node bandwidth available today of 40Gb/s bidirectional with Quadruple Data Rate (QDR) technology >Lowest end-to-end messaging latency in microseconds (1.2-1.5 µsec) >Wide industry adoption and multiple vendors (Mellanox, Voltaire, QLogic, etc.) >Open source drivers and libraries are available for users (OFED)
    InfiniBand Peak Bi-directional Bandwidth Table (per-lane rates: SDR 2.5Gb/s, DDR 5Gb/s, QDR 10Gb/s, EDR 20Gb/s)
    1x:  SDR (2.5 + 2.5) Gb/s | DDR (5 + 5) Gb/s | QDR (10 + 10) Gb/s | EDR (20 + 20) Gb/s
    4x:  SDR (10 + 10) Gb/s | DDR (20 + 20) Gb/s | QDR (40 + 40) Gb/s | EDR (80 + 80) Gb/s
    8x:  SDR (20 + 20) Gb/s | DDR (40 + 40) Gb/s | QDR (80 + 80) Gb/s | EDR (160 + 160) Gb/s
    12x: SDR (30 + 30) Gb/s | DDR (60 + 60) Gb/s | QDR (120 + 120) Gb/s | EDR (240 + 240) Gb/s
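Reading the table: the per-lane signaling rate is multiplied by the link width, and the two directions are counted separately for the bidirectional figure. As a worked example for the common 4x QDR links quoted above:

\[ 4 \times 10\,\mathrm{Gb/s} = 40\,\mathrm{Gb/s\ per\ direction} \;\Rightarrow\; (40 + 40)\,\mathrm{Gb/s\ bidirectional\ peak} \]

Note that these are raw signaling rates; with the 8b/10b encoding used at SDR/DDR/QDR speeds the usable data rate of a 4x QDR link works out to roughly 32 Gb/s per direction, which is one reason measured bandwidths come in below the peak figures.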
  • 31. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation QDR InfiniBand HCA’s QDR InfiniBand Switches 4036 1U 36 ports 12200-36 1U 36 ports InfiniScale IV 1U 36 ports Director Class InfiniScale IV 10 U 216 ports Director Class InfiniScale IV 29 U 648 ports 12800-180 14 U 432 ports 12800-360 29 U 864 ports ConnectX-2 Dual Port ConnectX-2 Single Port QLE 7340 Single Port 12300-36 1U Managed 36 ports Director Class InfiniScale IV 6 U 108 ports Director Class InfiniScale IV 17 U 324 ports Grid Director4700 18U 324 ports Grid Director 4200 11U 110-160 ports New for 10B Release InfiniBand Portfolio - Intelligent Cluster
  • 32. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 32 10 Gigabit Ethernet >10GbE or 10GigE is an IEEE Ethernet standard 802.3ae, which defines Ethernet technology with data rates of 10 Gbits/sec >Enables applications to take advantage of 10Gbps Ethernet >Requires no changes to the application code >High-speed interconnect choice for “loosely-coupled” HPC applications >Wide industry support for 10GbE technology >Growing user adoption for Data Center Ethernet (DCE) and Fibre Channel Over Ethernet (FCoE) technologies >Intelligent Clusters supports 10GbE technologies for both node-level and switch- level, providing multiple vendor choices for adapters and switches (BNT, SMC, Force10, Brocade, Cisco, Chelsio, etc.)
  • 33. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 33 Topic Agenda >Commodity Clusters >Overview of Intelligent Clusters >Cluster Hardware >Cluster Networking >*Cluster Management, Software Stack and Benchmarking*
  • 34. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 34 Cluster Management - xCAT >xCAT - Extreme Cluster (Cloud) Administration Toolkit  Open Source Linux/AIX/Windows Scale-out Cluster Management Solution  Leverage best practices for deploying and managing Clusters at scale  Scripts only (no compiled code)  Community requirements driven >xCAT Capabilities  Remote Hardware Control - Power, Reset, Vitals, Inventory, Event Logs, SNMP alert processing  Remote Console Management - Serial Console, SOL, Logging / Video Console (no logging)  Remote OS Boot Target Control - Local/SAN Boot, Network Boot, iSCSI Boot  Remote Automated Unattended Network Installation  For more information on xCAT go to http://xcat.sf.net
  • 35. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 35 Cluster Software Stack >Provides fast and reliable access to a common set of file data from a single computer to hundreds of systems >Brings together multiple systems to create a truly scalable cloud storage infrastructure >GPFS-managed storage improves disk utilization and reduces footprint, energy consumption, and management effort >GPFS removes client-server and SAN file system access bottlenecks >All applications and users share all disks with dynamic re-provisioning capability SAN GPFS SAN LAN TECHNOLOGY: >OS Support  Linux (on POWER and x86)  AIX  Windows >Interconnect Support (w/ TCP/IP)  1GbE and 10 GbE  InfiniBand (RDMA in addition to IPoIB)  Myrinet  IBM HPS High performance scalable file management solution IBM GPFS - General Parallel File System
  • 36. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 36 What is GPFS ? >IBM’s shared disk, parallel cluster file system. >Product available on pSeries/xSeries clusters with AIX/Linux >Used on many of the largest supercomputers in the world >Cluster: 2400+ nodes, fast reliable communication, common admin domain. >Shared disk: all data and metadata on disk accessible from any node through disk I/O interface. >Parallel: data and metadata flows from all of the nodes to all of the disks in parallel. GPFS File System Nodes Switching fabric (System or storage area network) Shared disks (SAN-attached or network block device) For more information on IBM GPFS, go to http://www-03.ibm.com/systems/clusters/software/gpfs/index.html
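Because GPFS presents a normal POSIX file system to every node, a parallel application needs no special API to share data: each rank can simply write its own region of a common file on the GPFS mount. The sketch below illustrates that idea; it is not course material, the path /gpfs/scratch/demo.dat is a hypothetical mount point, and production codes would more typically layer MPI-IO or a similar library on top of GPFS.

```c
/* gpfs_write.c - each MPI rank writes its own block of a shared file on a
 * parallel file system such as GPFS. Illustrative sketch only; the path
 * /gpfs/scratch/demo.dat is a hypothetical mount point.
 */
#include <mpi.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

#define BLOCK (4 * 1024 * 1024)   /* 4 MiB per rank */

int main(int argc, char **argv)
{
    int rank;
    static char buf[BLOCK];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, 'A' + (rank % 26), BLOCK);

    /* Every node opens the same file through its shared file system mount. */
    int fd = open("/gpfs/scratch/demo.dat", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) { perror("open"); MPI_Abort(MPI_COMM_WORLD, 1); }

    /* Non-overlapping offsets, so ranks can write concurrently. */
    off_t offset = (off_t)rank * BLOCK;
    if (pwrite(fd, buf, BLOCK, offset) != BLOCK)
        perror("pwrite");

    close(fd);
    MPI_Finalize();
    return 0;
}
```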
  • 37. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 37 >Resource Managers/Schedulers queue, validate, manage, load balance, and launch user programs/jobs. >Torque - Portable Batch System (free)  Works with Maui Scheduler (free) >LSF - Load Sharing Facility (commercial) >Sun Grid Engine (free) >Condor (free) >MOAB Cluster Suite (commercial) >LoadLeveler (commercial scheduler from IBM) Resource Managers/Schedulers Job Scheduler Job Queue Resource Manager User 1 User 2 User N Node 1 Node 2 Node 3 Node N
  • 38. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 38 Message Passing Libraries >Enable inter-process communication among processes of an application running across multiple nodes in the cluster (or on a symmetric multi-processing system) >“Masks” the underlying interconnect from the user application  Allows application programmer to use a “virtual” communication environment as reference for programming applications for clusters Message Passing Interface (MPI) Parallel Virtual Machine (PVM) >Included with most Linux distributions (open source)  IP (Ethernet)  GM (Myrinet) >Linda (commercial)  IP (Ethernet) >MPICH2 (free)  IP (Ethernet)  MX (Myrinet)  InfiniBand >LAM-MPI (free)  IP (Ethernet) >Scali (commercial)  IP (Ethernet)  MX (Myrinet)  InfiniBand >OpenMPI (free)  IP (Ethernet)  InfiniBand
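To make the message-passing model concrete, here is a minimal MPI program in C; it is an illustrative sketch rather than course material. Each process discovers its rank and the total number of processes, which is the usual starting point for distributing work across cluster nodes, and it can be built with any of the MPI implementations named above (for example MPICH2 or OpenMPI).

```c
/* hello_mpi.c - minimal MPI program: each rank reports itself.
 * Sketch only. Typical build:  mpicc hello_mpi.c -o hello_mpi
 * Typical run (4 processes):   mpirun -np 4 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                    /* start the MPI runtime      */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* this process's rank        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* total number of processes  */
    MPI_Get_processor_name(host, &namelen);    /* node the rank is placed on */

    printf("Hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();                            /* shut the MPI runtime down  */
    return 0;
}
```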
  • 39. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 39 Compilers & Other tools >Compilers are critical in creating optimized binary code that takes advantage of specific processor architectural features, so that the application can exploit the full power of the system and run most efficiently >Respective processor vendors typically have the best compilers for their processors – e.g. Intel, AMD, IBM, SGI, Sun, etc. >Compilers are important to produce the best code for HPC applications as individual node performance is a critical factor for the overall cluster performance >Open source and commercial compilers are available, such as the free GNU GCC compiler suite (C/C++, Fortran 77/90) and the PathScale compilers (owned by QLogic) >Support libraries and debugger tools are also packaged and made available with the compilers, such as math libraries (e.g. Intel Math Kernel Library, AMD Core Math Library) and debuggers such as gdb (GNU debugger) and the TotalView debugger used for debugging parallel applications on clusters
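As an illustration of why tuned math libraries matter, the fragment below calls the standard CBLAS dgemm routine for a double-precision matrix multiply; linking it against an optimized BLAS that provides the CBLAS interface (Intel MKL, for example) rather than a reference implementation is typically all that is needed to pick up the vendor-tuned kernels. This is a generic sketch, not code from the course, and the matrix size is arbitrary.

```c
/* dgemm_demo.c - C = alpha*A*B + beta*C through the CBLAS interface.
 * Sketch only; link against an optimized BLAS that supplies cblas.h.
 */
#include <cblas.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const int n = 512;                 /* square matrices for simplicity */
    double *A = malloc(sizeof(double) * n * n);
    double *B = malloc(sizeof(double) * n * n);
    double *C = calloc(n * n, sizeof(double));

    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }

    /* Row-major storage, no transposes: C = 1.0 * A*B + 0.0 * C */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);

    printf("C[0][0] = %f (expected %f)\n", C[0], 2.0 * n);
    free(A); free(B); free(C);
    return 0;
}
```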
  • 40. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 40 HPC Software Stack The Intelligent Clusters solution supports a broad range of HPC software from industry leading suppliers. Software is available directly from IBM or the respective solution providers. Software is listed by functional area, with the source in parentheses and comments after a dash:
>Cluster Systems Management  xCAT2 (IBM)  IBM Director (IBM) - CSM functionality now merged into xCAT2
>File Systems  General Parallel File System (GPFS) for Linux; GPFS for Linux on POWER (IBM)  PolyServe Matrix Server File System (HP)  NFS (Open Source)  Lustre (Open Source)
>Workload Management  Open PBS (Open Source)  PBS Pro (Altaire)  LoadLeveler (IBM)  LSF (Platform Computing)  MOAB (Cluster Resources) - commercial version of Maui scheduler  Gridserver (Datasynapse)  Maui Scheduler (open source) - interfaces to many schedulers
>Message Passing Interface Solutions  Scali MPI Connect™ (Scali)
>Compilers  PGI Fortran 77/90; C/C++ (STM Portland Group) - 32/64-bit support  Intel Fortran/C/C++ (Intel)  NAG Fortran/C/C++ (NAG) - 32/64-bit  Absoft® Compilers (Absoft)  PathScale™ Compilers (PathScale) - AMD Opteron  GCC (open source)
  • 41. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 41 HPC Software Stack (continued) Software is listed by functional area, with the source in parentheses and comments after a dash:
>Debugger/Tracer  TotalView (Etnus)  CodeAnalyst (AMD) - timer/event profiling, pipeline simulations  Fx2 Debugger™ (Absoft)  Distributed Debugging Tool (DDT) (Allinea)
>Math Libraries  ACML - AMD Core Math Libraries (AMD/NAG) - BLAS, FFT, LAPACK  Intel Integrated Performance Primitives (Intel)  Intel Math Kernel Library (Intel)  Intel Cluster Math Kernel Library (Intel)  IMSL™, PV-WAVE® (Visual Numerics)
>Message Passing Libraries  MPICH (Open Source) - TCP/IP networks  MPICH-GM (Myricom) - Myrinet networks  SCA TCP Linda™ (SCA)  WMPI II™ (Critical Software)
>Parallelization Tools  TCP Linda® (SCA)
>Interconnect Management  Scali MPI Connect (Scali)
>Performance Tuning  Intel VTune™ Performance Analyzer (Intel)  Optimization and Profiling Tool (OPT) (Allinea)  High Performance Computing Toolkit (IBM) - http://www.research.ibm.com/actc
>Threading Tool  Intel Thread Checker (Intel)
>Trace Tool  Intel Trace Analyzer and Collector (Intel)
  • 42. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 42 Cluster Benchmarking Benchmarking – the technique of running well-known reference applications on a cluster in order to exercise various system components and measure the performance characteristics of the cluster (e.g. network bandwidth, latency, FLOPS, etc.) >STREAM (sustainable memory bandwidth)  http://www.cs.virginia.edu/stream/ref.html >Linpack - the TOP500 benchmark  Solves a dense system of linear equations  You are allowed to tune the problem size and benchmark to optimize for your system  http://www.netlib.org/benchmark/hpl/index.html >HPC Challenge  A set of HPC benchmarks to test various subsystems of a cluster system  http://icl.cs.utk.edu/hpcc/ >SPEC  A set of commercial benchmarks to measure performance of various subsystems of the servers  http://www.spec.org/ >NAS 2.3 Parallel Benchmarks  http://www.nas.nasa.gov/Resources/Software/npb.html >Intel MPI Benchmarks (previously Pallas benchmarks)  http://software.intel.com/en-us/articles/intel-mpi-benchmarks/ >Ping-Pong (common MPI benchmark to measure point-to-point latency and bandwidth) >Customer's own code  Provides a good representation of the system performance specific to the application code
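The Ping-Pong measurement mentioned above is simple enough to sketch directly. The following C/MPI fragment (an illustrative sketch, not the Intel MPI Benchmarks code) bounces a fixed-size message between rank 0 and rank 1 and reports a rough average one-way latency and bandwidth; real studies would sweep the message size and use an established benchmark suite.

```c
/* pingpong.c - rough point-to-point latency/bandwidth probe between ranks 0 and 1.
 * Illustrative sketch only.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 1000;
    const int bytes = 1 << 20;              /* 1 MiB message; vary to map the curve */
    char *buf = malloc(bytes);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double elapsed = MPI_Wtime() - t0;

    if (rank == 0) {
        double one_way = elapsed / (2.0 * iters);   /* seconds per one-way trip */
        double bw = bytes / one_way / 1e9;          /* GB/s */
        printf("msg %d bytes: latency %.2f us, bandwidth %.2f GB/s\n",
               bytes, one_way * 1e6, bw);
    }

    MPI_Finalize();
    free(buf);
    return 0;
}
```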
  • 43. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 43 Summary > A Cluster system is created out of commodity server hardware, high-speed networking, storage and software technologies > High-performance computing (HPC) takes advantage of cluster systems to solve complex problems in various industries > IBM Intelligent Clusters provides a one-stop shop for creating and deploying HPC solutions using IBM servers and third-party networking, storage and software > InfiniBand, Myrinet (MX and Myri-10G), and 10 Gigabit Ethernet technologies are the most commonly used high-speed interconnect solutions for clusters > IBM GPFS provides a highly scalable and robust parallel file system and storage virtualization solution for clusters and other general-purpose computing systems > xCAT is an open-source, scalable cluster deployment and Cloud hardware management solution > Cluster benchmarking enables performance analysis, debugging and tuning capabilities for extracting optimal performance from clusters by isolating and fixing critical bottlenecks > Message-passing middleware enables developing HPC applications for clusters > Several commercial software tools are available for cluster computing
  • 44. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 44 Glossary of Terms >Commodity Cluster >InfiniBand >Message Passing Interface (MPI) >Extreme Cluster (Cloud) Administration Toolkit (xCAT) >Network-attached storage (NAS) >Cluster VLAN >Message-Passing Libraries >Management Node >High Performance Computing (HPC) >Roll Your Own (RYO) >BP Integrated >Distributed Network Topology >Intelligent Clusters >General Parallel File System (GPFS) >Direct-attached storage (DAS). >iDataPlex >Inter-node communication >Compute Network >Centralized Network Topology >IBM Racked and Stacked >Leaf Switch >Core/aggregate Switch >Quadruple Data Rate >Storage Area Network (SAN) >Parallel Virtual Machine (PVM) >Benchmarking
  • 45. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 45 Additional Resources >IBM STG SMART Zone for more education:  Internal: http://lt.be.ibm.com  BP: http://lt2.portsmouth.uk.ibm.com/ >IBM System x  http://www-03.ibm.com/systems/x/ >IBM ServerProven  http://www-03.ibm.com/servers/eserver/serverproven/compat/us/ >IBM System x Support  http://www-947.ibm.com/support/entry/portal/ >IBM System Intelligent Clusters  http://www-03.ibm.com/systems/x/hardware/cluster/index.html
  • 46. IBM Systems & Technology Group Education & Sales Enablement © 2011 IBM Corporation 46 Trademarks •The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both. >Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not actively marketed or is not significant within its relevant market. >Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: •The following are trademarks or registered trademarks of other companies. >Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. >Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefore. >Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. >Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. >Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. >UNIX is a registered trademark of The Open Group in the United States and other countries. >Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. >ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. >IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce. •All other products may be trademarks or registered trademarks of their respective companies >Notes: >Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. >IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. >All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. >This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. 
Consult your local IBM business contact for information on the product or services available in your area. >All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. >Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products

Editor's Notes

  1. {DESCRIPTION} This screen displays an image of a group of IBM System x eX5 servers. {TRANSCRIPT} Welcome to IBM System x ™ Technical Principles - Introduction to Intelligent Clusters . This is Topic 7 of the System x Technical Principles Course series - XTW01.
  2. {DESCRIPTION} This slide presents a bullet list of the course objectives. {TRANSCRIPT} The objectives of this course are: Describe a high-performance computing cluster List the business goals that Intelligent Clusters addresses Identify three core Intelligent Clusters components List the high-speed networking options available in Intelligent Clusters List three software tools used in Clusters Describe Cluster benchmarking
  3. {DESCRIPTION} This slide presents a bullet list of the topics discussed. {TRANSCRIPT} In this topic we will explore: Commodity Clusters Overview of Intelligent Clusters Cluster Hardware Cluster Networking Cluster Management, Software Stack, and Benchmarking The next section will discuss Commodity Clusters
  4. {DESCRIPTION} This slide presents a bullet list of Commodity Cluster features. There is an image of a server rack in the lower right corner. {TRANSCRIPT} What is a Commodity Cluster? Clusters have many definitions in the industry. The most common definition of a commodity cluster is a group of interconnected computer systems which operate as a single logical entity to create a big, powerful system capable of solving complex computational problems. Clusters consist of commodity components, including server systems with up to two CPU sockets (up to eight total CPU cores) each based on commodity x86 or POWER architecture, a high-speed interconnect network such as InfiniBand and the associated switches for inter-process communication, and a storage subsystem, either directly attached to the individual nodes or a network-attached storage system such as NAS or SAN. The servers in the cluster run a commodity operating system such as Linux or Microsoft Windows, and a common cluster-wide hardware and software management system such as xCAT. In addition, clusters run a set of software applications that solve various industry problems in science, engineering, finance, and several others. As opposed to traditional techniques of creating powerful machines called supercomputers, clusters use commodity technologies to create systems that are more cost-effective yet scalable, performing as well as (and in some cases exceeding) traditional supercomputers. Hence, clusters based on commodity technologies today enable computing at the speeds of traditional supercomputers.
  5. {DESCRIPTION} This slide presents an exploded view of a clusters components. There is an image of each. {TRANSCRIPT} The Picture in this chart depicts the conceptual architecture of a Cluster system built out of commodity components (nodes, networking, switches, storage, etc.). The mid-portion of the picture shows a rack full of IBM server systems, which act as the core compute nodes of the cluster, and are connected in the backend using two separate network interfaces – Ethernet and InfiniBand. The Ethernet network is used mainly as the fabric for the management traffic (OS provisioning, remote hardware management, IPMI, Serial-over-LAN, running administrative commands on nodes, job scheduling, etc.). The InfiniBand network is used as the fabric for inter-process communication and message-passing across the nodes and provides high-bandwidth and low-latency interconnect across the nodes. On the right side of the picture is the Fibre-channel Storage Area Network (SAN) storage, which is directly attached via Fibre-optic network and switches to the Storage nodes. The storage nodes manage the SAN storage directly and they provide the shared Cluster file system, which is exported to the rest of the Cluster (i.e. the compute nodes). Other kinds of storage systems are also commonly used in Clusters, including network-attached storage (NAS) and Direct-attached storage (DAS). As shown in the picture, the different network traffic types are carefully isolated by configuring corresponding VLANs on the network switches. For example, all the management traffic is isolated into a separate VLAN (designated as the Management VLAN), which is only accessible to the main management node in the Cluster for security reasons. On the other hand, the Cluster VLAN connects the management node to the compute nodes and in some cases it is used as the VLAN for message-passing traffic. User access to the Cluster is provided via the “Login” nodes in the cluster. These nodes are designated as the main entry point for users into the cluster. Various user applications are installed on the Login nodes, such as compilers, message-passing libraries (e.g. MPICH), debuggers, job scheduler interface tools, etc. Using Login nodes in the cluster provides a secure and cleaner interface to users into the cluster. There are usually one or more management nodes in the cluster. The management node is where all the hardware management and other administrative tools necessary to deploy and manage the cluster are installed. Cluster administrators login to the management node to perform hardware management as well as for maintaining various software and user-related configuration activities on the cluster nodes.
  6. {DESCRIPTION} This slide presents a graphic similar to a bar graph. Each bar represents an application type. {TRANSCRIPT} Commodity clusters are applied widely in the industry for solving complex computational problems with speed, accuracy and efficiency. As shown in the picture, Clusters are used in various industry vertical segments in Energy, Finance, Manufacturing, Life Sciences, Media and Entertainment, Public sector and Government. Several common applications of clusters in each of these industry segments are shown in the picture, including Seismic analysis (e.g. oil exploration), Portfolio risk analysis in Finance, finite element analysis (FEA) and engineering design in the manufacturing industry, weather forecasting, oceanography, and so on. As is evident, Clusters are not only used in traditional research and academic computing, but are also used in commercial industry segments today.
  7. {DESCRIPTION} {TRANSCRIPT} Rapid advances in processor technologies and accelerated development of new techniques for improving computing efficiency are making cluster computing an increasingly attractive and easily deployable solution for various industries. One of the key advances that contributes to the success of HPC is multi-core processor technology. Multi-core processors enable more dense computing and parallelism within the individual nodes in the cluster so that you can do more computation with fewer nodes. In addition, faster and bigger memory chips enable applications requiring lots of physical RAM as well as the ability to run multiple applications simultaneously without any performance penalties. Virtualization is a hot topic in the industry today, although the concept has been around for a while. Virtualization enables consolidation of physical resources and provides various other advantages. The application of virtualization to HPC is not yet attractive given the performance and scalability concerns for HPC applications. However, future advances in both hardware and software technologies might make virtualization interesting for a subset of HPC applications that don't have stringent performance requirements and could potentially benefit from the reliability and fault-tolerance aspects of virtualization. As clusters scale beyond tens or hundreds of computers, the complexity of managing and efficiently utilizing expensive resources becomes a concern for system administrators. Power consumption and equipment cooling in data centers are two key concerns in today's large-scale computing environments. Several "Green" technologies and strategies are emerging to make computing resources more power-efficient and easy to cool.
  8. {DESCRIPTION} {TRANSCRIPT} The next section will present an overview of the Intelligent Clusters
  9. {DESCRIPTION} {TRANSCRIPT} There are multiple approaches for deploying clusters. As shown in the picture, customers have several choices when it comes to purchasing and installing clusters. Roll Your Own (RYO) In this approach, the customer orders all the required individual hardware components such as servers and switches from vendors like IBM. The customer then does the integration of these components on their own, or contracts a third party integrator. The disadvantage with this approach is that the customer doesn’t get a full solution from a single vendor, and they have to deal with warranty related issues through each hardware vendor directly – in other words, there is no single point of contact for support and warranty issues. IBM Racked and Stacked In this approach the client procures servers and storage components in standard racks from IBM and then integrates other third party components such as switches into these racks on their own or contracts IBM Global Services or some other integrator to do this work. The disadvantage with this approach is that the client needs to address warranty and support issues with each vendor directly. BP Integrated In this approach, a qualified IBM business partner works with the customer in ordering servers and storage components from IBM and networking components from third party vendors. The BP at that point will build the cluster by integrating the components and then delivers the cluster to the customer. The disadvantage with this approach is again the fact that the customer needs to address warranty and support issues directly with each vendor. Intelligent Clusters In this approach, the customer orders a fully integrated cluster solution from IBM directly, which includes all the server, storage as well as third party network switches The advantages with this approach are: IBM delivers the factory-built and tested cluster solution, ready to be deployed in the customer data center and easy to plug into their environment The customer can contact IBM for all warranty, service and support issues – in other words, a single point of contact
  10. {DESCRIPTION} {TRANSCRIPT} IBM System Intelligent Clusters is an “integrated cluster” solution offering from IBM. The Intelligent Clusters offers a fully-integrated, turn-key cluster solution to customers by combining various IBM as well as OEM hardware components, IBM/third-party software components, implementation/management services, and provides a single-point of support for all hardware and software. As shown in the picture, the Intelligent Clusters consists of the following core components: IBM System x Servers: Rack mount servers Intel-based x3550 M3, x3650 M3 Blade servers: Intel-based HS22, HS22V and HX5 Blade servers High-density servers: IBM System x iDataPlex 2U and 3U FlexChassis technology Intel-based dx360 M3 servers IBM Systems Storage Fibre-channel, SAS and iSCSI based storage systems, switches and adapters Network switches 1Gbps Ethernet, 10Gbps Ethernet IBM Intelligent Clusters will integrate all these core components into a single cluster-optimized solution and deliver it to the customers as a fully-bundled and ready-for-deployment solution. Intelligent Clusters also offers professional implementation services for custom deployments via the System x Lab-based Services organization.
  11. {DESCRIPTION} {TRANSCRIPT} As shown in the picture, the IBM HPC Cluster Solution is created by combining IBM Server hardware, third-party switches and storage, Cluster software applications such as GPFS, xCAT and Linux or Windows, and the applications and tools necessary to run customer’s own HPC codes. Hence, and HPC cluster solution consists of all the hardware and software components end-to-end, ready to execute high-performance parallel and cluster applications.
  12. {DESCRIPTION} {TRANSCRIPT} In the following section we will discuss Cluster hardware.
  13. {DESCRIPTION} Overview of server choices for intelligent clusters {TRANSCRIPT} This chart shows some of the key IBM System x server offerings that are used in Intelligent Clusters. On the left are the rack-optimized servers based on Intel processors – the 1U x3550 M3 and the 2U x3650 M3. On the right are the IBM Blade Server offerings, with the chassis optimized for various businesses, and the variety of Blade servers with the Intel processors. Together, these servers offer a wide range of capabilities to address needs of the specific industries and applications such as large memory, high-performance, and scalability.
  14. {DESCRIPTION} {TRANSCRIPT} What challenges does iDataPlex address? iDataPlex is an innovative solution from IBM designed to better address data center challenges around compute density, not just rack density but to provide more servers into the client’s data center within their limited floor space, power and cooling infrastructure. These data centers need a solution that can be deployed quickly (scalability), and is easy to service and manage. iDataPlex is designed to address TCO (Total Cost of Ownership), not just acquisition costs but operational costs throughout the lifecycle of the deployment. Finally, iDataPlex is designed to be flexible, since every customer’s workload requirements are unique. IBM System x iDataPlex is the newest set of System x server offerings, targeted for extremely large-scale server deployments such as data centers running Web 2.0 and HPC style workloads. Customers having such a need for computational capacities at extreme scale can deploy iDataPlex, which has been optimized for such environments. iDataPlex is a custom rack design that allows up to 84 standard dual-CPU servers to be placed in the single rack that has the same footprint as a standard enterprise rack, which only allows a maximum of 42 servers. In addition, the iDataPlex rack design has been optimized for power and cooling so that the iDPX rack is more power-efficient and easy to cool than the traditional racks. iDataPlex supports the standard network switches and adapters for Ethernet and InfiniBand. A special chassis design called the FlexChassis allows various configurations of iDataPlex to be created by combining servers, storage and I/O options, to address the specific customer requirements. iDataPlex is one of the server choices offered for a Intelligent Clusters solution.
15. {DESCRIPTION} {TRANSCRIPT} The iDataPlex portfolio continues to evolve to meet the computing requirements of the data center of today and tomorrow. IBM introduced the dx360 M2 in March 2009, based on Intel Nehalem processors, which provides maximum performance while maintaining outstanding performance per watt with the highly efficient iDataPlex design. In March 2010, IBM introduced the dx360 M3, increasing performance and efficiency with the new Intel Westmere processors and new server capabilities, which we will cover in more detail in the next few charts. A 3U chassis is also available with the dx360 M3 server, which supports up to twelve 3.5” SAS or SATA hard disk drives, up to 24TB per server, for large-capacity local storage. Again, within the iDataPlex rack these offerings can be mixed to provide the specific rack-level solution the client is looking for.
16. {DESCRIPTION} {TRANSCRIPT} This chart shows various iDataPlex chassis and server configuration options. The compute-intensive configuration places two 1U compute servers in the 2U chassis for maximum server density. The compute + storage configuration combines one server tray with a 1U drive tray, allowing up to five 3.5” form-factor drives to be installed in the 2U chassis in addition to the compute server. The compute + I/O (acceleration) configuration is the newest addition to iDataPlex. It is intended for customers that require GPU acceleration for high-performance applications that are intensive in floating-point and vector calculations and hence benefit from GPUs such as NVIDIA’s Tesla cards, which are qualified for iDataPlex. This configuration allows two GPU cards to be installed in the chassis, attached to a single 1U compute server; a special GPU I/O tray is required. The 3U storage chassis is intended for storage-rich applications requiring vast amounts of storage; it holds up to twelve 3.5” form-factor SAS/SATA drives of varying capacities. As of this presentation, the largest disks available are the 2TB SATA NL drives, which enable up to 24TB of disk space in a 3U iDataPlex chassis. In addition to the chassis configuration options, iDataPlex supports various power supply options, including the traditional 900W, the new 550W, and the 750W redundant power supply. Customers can choose the power supply option best suited to their environments and requirements.
17. {DESCRIPTION} {TRANSCRIPT} What is new with the dx360 M3? The dx360 M3 provides more performance and better efficiency. The new Intel Westmere-EP processors provide up to 50% more cores with the 6-core design, and new lower-power CPUs are available to reduce power consumption. The dx360 M3 supports lower-power DDR3 memory, allowing customers to further increase efficiency without affecting performance. Where cost is a concern, clients can take advantage of dx360 M3 support for 2 DIMMs per channel at the full 1333MHz bandwidth with 95W processors and 12 DIMMs. This allows the server to maintain maximum performance using 12 lower-capacity DIMMs instead of 6 higher-capacity DIMMs, reducing acquisition cost. The dx360 M3 also offers an additional power supply option that provides power redundancy for server and line-feed protection. With the dx360 M3 we have also introduced new hard drive capacities in 2.5”, 3.5”, SAS, SATA, and SSD; for example, the new 2TB 3.5” drives provide 24TB of local storage in the 3U chassis. We have also introduced new Converged Network Adapters with the M3, allowing convergence of Ethernet and Fibre Channel at the server on a single interface. Finally, the Trusted Platform Module is standard on the dx360 M3, providing secure key storage for applications such as Microsoft BitLocker.
18. {DESCRIPTION} {TRANSCRIPT} With the redundant power option, customers can still take advantage of all the optimization for software-resilient workloads, and can now also apply iDataPlex efficiency to non-grid applications where they desire. The new supply has the same form factor as the 900W non-redundant supply, with two discrete supplies inside the enclosure bussed together and two discrete line feeds to split power across separate PDUs. Deploying a full rack with redundant power requires doubling the PDU count, but the vertical slots in the iDataPlex rack can easily accommodate these. Whether the customer’s requirement is line-feed maintenance, node protection, or simply increased reliability, iDataPlex can now deliver a solution.
19. {DESCRIPTION} {TRANSCRIPT} The iDataPlex GPU solution takes advantage of iDataPlex efficiency and thermal design to maximize density for GPU compute clusters. GPU cards have a peak power of 225W each, but the iDataPlex server can easily accommodate them within the current design. iDataPlex gives customers a much more efficient solution, allowing more GPU-based servers to be deployed per rack within the customer’s power and cooling envelopes. The result is more GPUs per rack, fewer racks, fewer power feeds, and ultimately lower operating cost. And, with the Rear Door Heat Exchanger, iDataPlex provides the ultimate solution for GPU computing!
20. {DESCRIPTION} {TRANSCRIPT} This chart depicts the new I/O capabilities of the dx360 M3. A new I/O tray and 3-slot riser are being introduced for the iDataPlex chassis, allowing two full-height, full-length, 1.5-slot-wide cards (such as the NVIDIA M2050 “Fermi”) in the top of the chassis with x16 connectivity. In addition, there is an open x8 slot designed to accommodate a high-bandwidth adapter such as InfiniBand, 10Gb Ethernet, or a Converged Network Adapter. The dx360 M3 also has an internal slot for a RAID adapter, providing full 6Gbps performance for up to four 2.5” drives. Compared to outboard solutions, each iDataPlex GPU server is individually serviceable: clients no longer have to take down two servers when there is a problem with a GPU card, and sparing of GPUs becomes much simpler, since each card can be replaced individually rather than as part of an outboard unit that contains four cards. The significant I/O capabilities also provide maximum local storage performance with RAID. And GPUs are provided as part of the Intelligent Cluster integrated solution from IBM, so when there is an issue there is one number to call for resolution.
21. {DESCRIPTION} {TRANSCRIPT} Storage is an integral piece of every cluster solution. Typical cluster applications use storage for various purposes, including storing large amounts of application data, temporary (scratch) files, cluster databases, cluster OS images, and the parallel file system (GPFS). Hence, having an efficient storage subsystem attached to a cluster is important. There are hundreds of storage solutions available in the market today, ranging from simple and inexpensive to complex and expensive. The most commonly deployed storage solutions in clusters are disk storage, such as direct-attached storage (DAS) or network-attached storage (NAS), and Storage Area Networks (SAN). The complexity and cost of a particular storage solution depend on various factors, including the type of storage, vendor, protocol support (TCP/IP, iSCSI, Fibre Channel), life-cycle management features, management software, performance characteristics, and so on. IBM System Storage offers several choices for storage solutions; the IBM storage portfolio spans entry-level and mid-range systems all the way up to enterprise storage solutions. The Intelligent Clusters storage portfolio is restricted to entry-level and mid-range disk systems and SAN storage, which gives customers the option to use inexpensive direct-attached storage such as SAS storage (DS3000) or a more complex Fibre Channel SAN for higher performance (e.g. DS5100/DS5300). In addition, the Intelligent Clusters portfolio includes various third-party storage components such as switches and adapters (e.g. Brocade, QLogic, Emulex, and LSI).
22. {DESCRIPTION} {TRANSCRIPT} The picture shows some of the IBM disk storage systems supported in the Intelligent Clusters portfolio. As described in the previous charts, the portfolio supports entry-level disk systems such as the DS3200 (SAS/SATA drives) and the EXP3000 storage expansion unit; mid-range disk systems such as the DS3400 (Fibre Channel interface with SAS and SATA disks), DS3300 (iSCSI interface with SAS disks), and DS3500 (SAS/SATA drives); and mid-range disk systems such as the DS5020 and DS5100/DS5300 with Fibre Channel interfaces.
  23. {DESCRIPTION} {TRANSCRIPT} Cluster Networking will be examined in the following section.
24. {DESCRIPTION} {TRANSCRIPT} Clusters require one or more network fabrics for inter-node communication and management. Typically, clusters use at least one network for management and one network for inter-node communication (the compute network). Optionally, additional networks may be used based on customer- and application-specific requirements for performance, security, fault tolerance, and so on.
The management network is used for managing the various cluster elements, including servers, switches, and storage. A separate, dedicated management network is essential to reliably manage the cluster elements using either in-band or out-of-band communication. The management fabric is also used for deploying OS images to cluster nodes or network-booting the OS on the cluster nodes using tools such as xCAT, as well as for monitoring the cluster and gathering performance and utilization information. Gigabit Ethernet is typically used as the management network fabric.
The compute network is used for inter-node communication, allowing message-passing applications to send and receive messages across the cluster nodes. It is a dedicated network and typically carries only message-passing traffic, to avoid introducing extra overhead and congestion. Often a high-speed network such as InfiniBand or Myrinet is used for the compute network to provide a high-bandwidth, low-latency fabric. In cases where bandwidth and latency are not a major concern, Gigabit Ethernet or 10 Gigabit Ethernet (10GbE) is also employed for the compute network.
The user or campus network is the external network to which the cluster is attached so that users can log in to the cluster to run their jobs. The user network is not part of the cluster, but administrators need to provide a secure and reliable interface to the cluster from the user/campus network. In addition to the management, compute, and user fabrics, other networks may optionally be used in clusters, such as a storage network, which interconnects the servers facing the storage subsystem (referred to as storage nodes).
25. {DESCRIPTION} This picture shows the Intelligent Clusters InfiniBand portfolio. {TRANSCRIPT} This picture shows the Intelligent Clusters InfiniBand portfolio.
26. {DESCRIPTION} {TRANSCRIPT} The Intelligent Clusters portfolio offers a wide range of networking options. 1350 partners with several OEM vendors for network switches, adapters, and cables. As shown in the picture, the 1350 Ethernet switch portfolio consists of 1 Gigabit Ethernet switches from vendors such as SMC Networks, Cisco, Blade Network Technologies, and Force10. Customers have a range of choices when it comes to the Ethernet networking used in Intelligent Clusters. The 1350 pre-sales cluster architect can pick and choose which particular switches to use in the cluster solution, depending on customer preferences and application needs. Special care must be taken to address the performance, scalability, and availability requirements of the applications and users when architecting the cluster network fabric.
  27. {DESCRIPTION} {TRANSCRIPT} This chart shows the Ethernet entry/leaf/top of rack switches qualified for the iDataPlex solution.
28. {DESCRIPTION} {TRANSCRIPT} When discussing cluster networks, one often comes across the concepts of centralized vs. distributed networks. Clusters typically use one of these two architectures for their network fabrics. In a centralized network topology, there are one or more central switches, and all cluster elements, including servers and storage, connect directly to them; there are no intermediate hops, and all elements communicate with each other via the central switches. With a distributed network topology, the network architecture has multiple tiers, typically two: the core/aggregation tier and the access/leaf tier. The core/aggregation tier consists of core switches that connect to the access/leaf tier switches via inter-switch links (ISLs). The leaf switches are smaller switches placed inside the individual racks of the cluster, and they connect directly to the cluster nodes. All nodes communicate via the leaf switches, which are aggregated at the core/aggregation point via the core switches. The right networking approach, centralized or distributed, depends on various factors, including cluster size, performance requirements, and cost. Typically, a distributed network topology is used when the cluster is large, for example hundreds or thousands of nodes, because a distributed network scales well and is easy to expand in the future. A centralized network is a good choice for small clusters of tens of nodes. As shown in the picture, the Intelligent Clusters portfolio provides several core switch choices with high port counts from vendors such as Force10 and Cisco. In addition to the switches, the 1350 BOM also contains several high-speed network adapter choices, such as Chelsio 10GbE PCI-E adapters and Blade daughter cards.
29. {DESCRIPTION} {TRANSCRIPT} High-speed networking is important for cluster applications where bandwidth and latency are critical for performance. A traditional Gigabit Ethernet network cannot deliver on these requirements, due to its relatively low bandwidth and high latency. Hence, HPC clusters typically employ a high-bandwidth, low-latency network fabric to meet application performance requirements. Today, the primary choices for high-speed cluster networking are InfiniBand and 10 Gigabit Ethernet.
InfiniBand is an industry-standard, low-latency, high-bandwidth server interconnect, ideal for carrying multiple traffic types (clustering, communications, storage, management) over a single connection:
>A switch-based serial I/O interconnect architecture operating at a base speed of 2.5 Gb/s, 5 Gb/s, or 10 Gb/s in each direction (per port)
>Provides the highest node-to-node bandwidth available today, 40 Gb/s bidirectional, with Quad Data Rate (QDR) technology
>Lowest end-to-end messaging latency, on the order of microseconds (1.2-1.5 µsec)
>Wide industry adoption and multiple vendors (Mellanox, Voltaire, QLogic, etc.)
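To make the bandwidth and latency discussion concrete, the following is a minimal sketch, assuming an installed MPI implementation (for example MPICH2 or Open MPI) and a hypothetical source file name of pingpong.c, of the classic ping-pong pattern used to estimate point-to-point latency between two nodes. It is an illustrative example, not part of the course material.

    /* Minimal MPI ping-pong sketch (hypothetical example): two ranks bounce a
       small message back and forth and report the approximate one-way latency. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int iters = 1000;
        const int bytes = 8;              /* small message => latency-dominated */
        char *buf = malloc(bytes);
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)  /* one-way latency = half the average round-trip time */
            printf("approx one-way latency: %.2f usec\n",
                   (t1 - t0) / iters / 2.0 * 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Compiled with the MPI wrapper compiler (for example mpicc) and run with exactly two ranks placed on different nodes, the reported time approximates the one-way message latency of whichever fabric carries the MPI traffic (Gigabit Ethernet, 10GbE, or InfiniBand).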
30. {DESCRIPTION} {TRANSCRIPT} InfiniBand (IB) is an industry-standard server interconnect technology developed by a consortium of companies as part of the InfiniBand Trade Association (IBTA). InfiniBand defines the standard for a low-latency, high-bandwidth, point-to-point server interconnect. Low latency, on the order of 1.2 microseconds at the application level, can be achieved using the Remote Direct Memory Access (RDMA) protocol for communication across servers, which bypasses the standard kernel protocol layers in the operating system and gives direct access to memory on the remote system. The high bandwidth of the InfiniBand fabric, on the order of 40 Gb/s with QDR technology, is achieved via a serial link interface, with each lane supporting up to 10 Gb/s in each direction. The InfiniBand specification defines various link widths, depending on the purpose of the link. For example, the most commonly used link width is 4x, which corresponds to four lanes of the IB serial link; this width is used for connectivity between servers and switches. Links of 12x width are typically used as inter-switch links. As IB technology advanced over the years, the serial link speed kept doubling every two to three years; correspondingly, the technology was termed SDR (single data rate), DDR (double data rate), QDR (quad data rate), and so on. Currently, the QDR lane speed is 10 Gb/s in each direction, so a 4x QDR link gives 40 Gb/s of bidirectional bandwidth. High-performance computing applications using a parallel middleware library such as the Message Passing Interface (MPI) typically use the native RDMA protocol enabled by InfiniBand fabric adapters and switches to achieve low-latency, high-bandwidth communication across processes running on different servers in the cluster. Multiple vendors make adapters, switches, and associated gear, as well as software, for InfiniBand. The Intelligent Clusters portfolio carries several vendor options so that IB can be sold as an integrated high-speed interconnect option for clusters.
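As a worked example of the link arithmetic (a sketch assuming the 8b/10b line encoding used on SDR, DDR, and QDR InfiniBand links):

    \[ 4\ \text{lanes} \times 10\ \mathrm{Gb/s\ per\ lane} = 40\ \mathrm{Gb/s\ raw\ signaling\ rate\ per\ direction} \]
    \[ 40\ \mathrm{Gb/s} \times \tfrac{8}{10} = 32\ \mathrm{Gb/s\ effective\ data\ rate\ per\ direction} \]

The commonly quoted 40 Gb/s figure is therefore the raw signaling rate; the usable data rate is somewhat lower because of the line encoding.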
31. {DESCRIPTION} {TRANSCRIPT} This chart shows the various InfiniBand switch and adapter options from Voltaire, QLogic, and Mellanox that are supported in the Intelligent Cluster bill of materials. The newly introduced QLogic QDR core switches support up to 864 ports in a single chassis, which is typically used in large-scale IB clusters with hundreds of nodes. The QDR leaf switches support up to 36 non-blocking IB ports. Dual-port QDR IB adapters are available from Mellanox and QLogic, for both rack-mount servers and blades.
32. {DESCRIPTION} {TRANSCRIPT} 10GbE, or 10GigE, is the IEEE 802.3ae Ethernet standard, which defines Ethernet technology with a data rate of 10 Gbit/s. 10GbE enables applications to take advantage of interconnect speeds ten times faster than traditional 1 Gigabit Ethernet. The main advantage of 10GbE is that it requires no changes to application code originally written for 1 Gigabit Ethernet (provided the underlying OS and hardware support the 10GbE fabric). 10GbE is gaining momentum as the high-speed interconnect choice for loosely coupled message-passing HPC applications that traditionally used 1GbE as the interconnect. 10GbE technology is less expensive and easier to deploy than other high-speed networking options such as InfiniBand or Myrinet, and there is wide industry support for 10GbE adapters and switches with growing user adoption. 10GbE is also fast becoming the choice for Data Center Ethernet (DCE) and the emerging Fibre Channel over Ethernet (FCoE) technologies with the Converged Enhanced Ethernet (CEE) standard. Intelligent Clusters supports 10GbE at both the node level and the switch level, providing multiple vendor choices for adapters and switches (BNT, SMC, Force10, Brocade, Cisco, Chelsio, etc.).
33. {DESCRIPTION} {TRANSCRIPT} Cluster management, the software stack, and benchmarking will be discussed next.
34. {DESCRIPTION} {TRANSCRIPT} xCAT stands for Extreme Cluster (Cloud) Administration Toolkit. xCAT is an open source Linux/AIX/Windows scale-out cluster management solution, primarily developed and tested by IBM. xCAT’s key design principles are to:
>Build upon existing technologies
>Leverage best practices for provisioning and managing large-scale clusters and cloud-type infrastructure
>Implement everything as scripts, without any compiled code, to keep it portable, and make the source code available
xCAT’s core capabilities are:
>Remote hardware control: power on/off/reset, vitals, inventory, event logs, and SNMP alert processing
>Remote console management: serial console, SOL, logging / video console (no logging)
>Remote boot control: local/SAN boot, network boot, and iSCSI boot
>Remote automated unattended network installation: auto-discovery of nodes through intelligent switch integration, MAC address collection, service processor programming, remote BIOS/firmware flashing, Kickstart, AutoYaST, imaging, stateless/diskless, and iSCSI
With all these features, xCAT provides a comprehensive, flexible, and powerful cluster management solution that has been developed and tested on some of the biggest IBM clusters to date.
35. {DESCRIPTION} {TRANSCRIPT} IBM GPFS stands for General Parallel File System. GPFS is a cluster file system developed by IBM, originally targeted at high-performance computing environments to eliminate some of the core performance and scalability limitations customers faced with traditional file systems such as NFS. GPFS provides significant performance and scalability advantages over traditional file systems and other cluster file systems in the market, due to its architecture and its evolution over the years. Today, GPFS is the premium choice when designing storage for clusters as well as for emerging cloud computing environments. Some of the important features of GPFS are:
>Fast and reliable access to a common set of file data, from a single computer up to hundreds of systems
>Brings together multiple systems to create a truly scalable cloud storage infrastructure
>GPFS-managed storage improves disk utilization and reduces footprint, energy consumption, and management effort
>Removes client-server and SAN file system access bottlenecks
>All applications and users share all disks, with dynamic re-provisioning capability
GPFS is developed and sold by IBM as a commercially licensed product.
36. {DESCRIPTION} {TRANSCRIPT} GPFS provides shared storage to the cluster nodes and a common, cluster-wide parallel file system. The parallelism in GPFS comes from its ability to provide concurrent shared access to the same files from multiple nodes in the cluster, which improves file access performance significantly over traditional techniques. GPFS is available on a wide range of platforms and operating systems, including IBM pSeries and xSeries servers, and AIX, Linux, and Windows. GPFS is currently used as the parallel file system on some of the largest supercomputers in the world, consisting of hundreds of nodes, and has been demonstrated to scale beyond 2,400 nodes without performance degradation or loss of data. GPFS provides a single administrative control point, and most GPFS commands can be executed from any node in the cluster, which simplifies administration and provides flexibility. Shared disk: all data and metadata on disk is accessible from any node through a single, consistent “disk I/O” interface. Parallel access: data and metadata are accessible from all nodes in the cluster at any time, and in parallel, to improve performance.
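To illustrate the kind of concurrent, shared-file access a parallel file system such as GPFS is built to serve, here is a minimal MPI-IO sketch (a hypothetical example, not from the course material; it assumes an MPI installation and that the working directory is on the shared file system) in which every rank writes its own block of a single shared file at the same time.

    /* Minimal MPI-IO sketch (hypothetical example): all ranks open one shared
       file and each writes its own block at a rank-specific offset. */
    #include <mpi.h>
    #include <stdlib.h>

    #define BLOCK 1024   /* doubles written by each rank */

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;
        double *buf = malloc(BLOCK * sizeof(double));

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < BLOCK; i++) buf[i] = (double)rank;

        /* All ranks open the same file on the shared (e.g. GPFS-mounted) path. */
        MPI_File_open(MPI_COMM_WORLD, "shared_output.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes at its own offset, so the writes proceed in parallel. */
        MPI_Offset offset = (MPI_Offset)rank * BLOCK * sizeof(double);
        MPI_File_write_at(fh, offset, buf, BLOCK, MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }

On a parallel file system the per-rank writes can be serviced concurrently by multiple storage servers, which is exactly the access pattern that distinguishes GPFS from a traditional client-server file system such as NFS.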
37. {DESCRIPTION} {TRANSCRIPT} A cluster resource manager manages the nodes and other hardware resources in the cluster. The resource manager streamlines resource requests from users by reserving resources and executing jobs on the cluster nodes. A resource manager is usually combined with a job scheduler, which interfaces with the resource manager to allocate resources to user jobs based on the job requirements. The job scheduler makes complex decisions when picking the next job to run from the queue, based on job attributes such as priorities, fair-share policies, type of resources requested, resource availability, reservations, and so on. Several cluster resource managers and job schedulers are available, both open source (public domain) and commercial. Torque is an open source, portable resource manager and batch job scheduler. Although the base scheduler comes with only a few standard scheduling algorithms such as FIFO, Torque works in conjunction with a more advanced job scheduler such as Maui (also open source) or Moab (the commercial version of Maui). When used with such a scheduler, Torque acts as the resource manager that controls the cluster resources and executes and manages jobs on the nodes. Other job schedulers and resource managers commonly used in cluster environments are Load Sharing Facility (from Platform Computing), Sun Grid Engine (from Sun Microsystems), Condor (from the University of Wisconsin), Moab Cluster Suite (from Cluster Resources), and LoadLeveler (from IBM).
38. {DESCRIPTION} {TRANSCRIPT} Message-passing libraries are used as the programming API for developing applications that run on clusters. Typically, HPC applications are written using the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM) libraries, which provide an abstraction layer for writing parallel code that runs across multiple nodes in the cluster. MPI is a portable parallel programming interface specification, developed by a consortium of academic, government, and commercial organizations, that enables parallel and cluster applications to be ported easily across hardware and operating system platforms. Various open source and commercial implementations of the MPI library are available, including MPICH2, LAM, Scali, and Open MPI. Many of these implementations support several different networks as the underlying communication fabric, including Ethernet, InfiniBand, and Myrinet. Code written using MPI is portable across different networks, often requiring no changes to the source code (although the application may need to be recompiled and linked against the appropriate network support library). Parallel Virtual Machine (PVM) is an open source library that provides a virtual view of the cluster, so that programmers can write code against this virtual “single system” model and do not have to be concerned with a particular cluster architecture.
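As a small illustration of the MPI programming model described above, here is a minimal sketch (a hypothetical example assuming an installed MPI implementation such as MPICH2 or Open MPI) in which every process reports its rank and rank 0 sends a value to rank 1 with point-to-point calls.

    /* Minimal MPI sketch (hypothetical example): report rank, then do one
       point-to-point transfer from rank 0 to rank 1. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
        printf("Hello from rank %d of %d\n", rank, size);

        if (size >= 2) {
            int value = 42;
            if (rank == 0) {
                MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("Rank 1 received %d from rank 0\n", value);
            }
        }
        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with the implementation’s wrapper compiler (for example mpicc) and started across the cluster nodes with mpirun or mpiexec under the resource manager.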
39. {DESCRIPTION} {TRANSCRIPT} Compilers are critical for creating optimized binary code that takes full advantage of specific processor architectural features such as the CPU and memory architecture, execution units, pipelining, co-processors, registers, and shared memory, so that the application can exploit the full power of the system and run most efficiently on the specific hardware platform. Typically, the vendors of the respective processors, such as Intel Xeon, AMD Opteron, IBM POWER, and Sun SPARC, have the best compilers for their processors: the Intel Compiler Suite, the AMD Open64 compilers, and the IBM XL C/C++ and Fortran compilers. Compilers are important for producing the best code for HPC applications because individual node performance is a critical factor in overall cluster performance; optimizing code for the specific processor used in the cluster nodes ensures optimal performance on individual systems, which in turn helps overall application performance when running across multiple systems in the cluster. In addition to the vendor-specific compilers, open source and other commercial compilers are commonly used for compiling HPC applications, for example the GNU GCC compiler suite (C/C++, Fortran 77/90), which is part of standard Linux distributions, and the PathScale compiler suite (currently owned by QLogic), which is sold for a fee. Other support libraries and debugging tools are commonly packaged with the compilers, such as math libraries (e.g. the Intel Math Kernel Library and the AMD Core Math Library) and debuggers such as GDB (the GNU debugger) and the TotalView debugger from TotalView Technologies, which is used for debugging parallel applications on clusters.
40. {DESCRIPTION} {TRANSCRIPT} This table summarizes various software tools, compilers, and libraries available for use on clusters.
41. {DESCRIPTION} {TRANSCRIPT} This table summarizes additional software tools, compilers, and libraries available for use on clusters. As is evident from the table, a vast range of software tools is available for developing cluster applications.
42. {DESCRIPTION} {TRANSCRIPT} Benchmarking is the technique of running well-known reference applications on a cluster in order to exercise various system components and measure the cluster’s performance characteristics (e.g. network bandwidth, latency, FLOPS). Benchmarking allows cluster users and administrators to measure the performance and scalability of a cluster and to address critical bottlenecks by isolating bad hardware and tuning applications to take optimal advantage of the hardware. Several public domain and commercial benchmarking tools are available for clusters. STREAM is a micro-benchmark used to measure memory throughput on individual cluster nodes; it is useful for finding “skew” in the cluster by exposing nodes whose memory performance is inferior to the expected values. Linpack is an open source cluster benchmark used to measure the sustained aggregate floating-point operations per second (FLOPS) of all the cluster nodes by solving a dense system of linear equations in parallel on the cluster. Linpack uses double-precision floating-point arithmetic and the Basic Linear Algebra Subprograms (BLAS) library, so it is a good exerciser of the CPUs, memory, and network subsystem of the cluster. Linpack results are the basis for the list of the fastest supercomputers in the world maintained on the top500.org website. Other commonly used cluster benchmarks include the HPC Challenge benchmark, the SPEC suite of benchmarks from the Standard Performance Evaluation Corporation (commercial), the NAS Parallel Benchmarks developed by NASA, and the Intel MPI Benchmarks (IMB). One key recommendation for cluster users is to benchmark clusters with their own codes and applications: ultimately, the users’ codes are what run on the cluster every day, and any necessary performance tuning or improvement is best judged by running those codes on the cluster and then addressing the potential bottlenecks to improve application performance.
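For illustration, here is a minimal STREAM-style “triad” sketch (a hypothetical example, not the official STREAM benchmark) that estimates per-node memory bandwidth by streaming three large arrays through memory:

    /* STREAM-style triad sketch (hypothetical example): time a = b + s*c over
       arrays large enough to exceed the CPU caches and report GB/s. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (20 * 1000 * 1000)   /* ~160 MB per array, well beyond cache */

    int main(void)
    {
        double *a = malloc(N * sizeof(double));
        double *b = malloc(N * sizeof(double));
        double *c = malloc(N * sizeof(double));
        const double scalar = 3.0;

        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < N; i++)
            a[i] = b[i] + scalar * c[i];          /* the triad kernel */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        double gbytes = 3.0 * N * sizeof(double) / 1e9;   /* 2 reads + 1 write */
        printf("triad bandwidth: %.2f GB/s\n", gbytes / secs);

        free(a); free(b); free(c);
        return 0;
    }

Running the same small kernel on every node and comparing the reported numbers is one simple way to spot the node-to-node “skew” mentioned above.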
43. {DESCRIPTION} {TRANSCRIPT} In this course we presented the following topics:
>A cluster system is created from commodity server hardware, high-speed networking, storage, and software technologies.
>High-performance computing (HPC) takes advantage of cluster systems to solve complex problems in various industries that require significant compute capacity and fast compute resources.
>IBM Intelligent Clusters provides a one-stop shop for creating and deploying HPC solutions using IBM servers and third-party networking, storage, and software technologies.
>InfiniBand, Myrinet (MX and Myri-10G), and 10 Gigabit Ethernet are the most commonly used high-speed interconnect solutions for clusters.
>The IBM GPFS parallel file system provides a highly scalable, robust parallel file system and storage virtualization solution for clusters and other general-purpose computing systems.
>xCAT is an open source, scalable cluster deployment and cloud hardware management solution.
>Cluster benchmarking enables performance analysis, debugging, and tuning for extracting optimal performance from clusters by isolating and fixing critical bottlenecks.
>Message-passing middleware enables the development of HPC applications for clusters.
>Several commercial software tools are available for cluster computing.
  44. {DESCRIPTION} {TRANSCRIPT} This slide presents a glossary of acronyms and terms used in this topic.
  45. {DESCRIPTION} {TRANSCRIPT} To learn more about Intelligent Clusters, please visit any of the resources presented in the slide.
  46. {DESCRIPTION} {TRANSCRIPT} The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.