HP sx2000 chipset white paper
Executive summary
  Performance
  Price/performance ratio
  Reliability and availability
  Virtualization and manageability
  Investment protection
sx2000 chipset and system topologies
  Cell subsystem
  I/O subsystem
  System crossbar chip
  Links
Capabilities of the sx2000 chipset
  Reliability and availability
  Serviceability
  Manageability
  Partitioning
  Performance
  Scalability
Benefits of the sx2000 chipset for applications
  Online transaction processing
  Business intelligence and decision support
  Video-on-demand
  Science and government
  Server consolidation
Conclusion
Glossary
For more information
Executive summary
Hewlett-Packard’s new enterprise systems chipset, the sx2000, is designed to provide the scalability,
reliability, manageability and performance to meet customers’ most demanding server needs. The
chipset, used in the latest HP Integrity and HP 9000 high-end and midrange servers, supports the new
dual-core Intel Itanium 2 processor and the Intel Itanium 2 9M processor, while providing a
foundation for the future Itanium 2 “Montvale” processor. In addition, because the sx2000 chipset
also supports HP PA-8900 RISC processors, Itanium 2 processors and PA-8900 RISC processors can
operate concurrently in an HP Integrity Superdome in separate hard partitions. This provides the
flexibility and longevity to maximize customers’ return on investment (ROI).
In today’s business environment, IT departments are driven to consolidate and simplify their
environments to free up staff, meet the high service-level expectations of end users, and bring
high-value projects online more quickly. As a result, they are looking for solutions that provide
flexible capacity, enhanced availability and security, and simplified management of their
environment. The new sx2000 chipset for the HP Integrity Superdome, rx8640 and rx7640 servers and
the HP 9000 Superdome, rp8440 and rp7640 servers provides enhancements in all of these areas. Larger
bandwidths, lower latencies, and support for future processors provide greater performance and
scalability, which enhance the flexible capacity of the IT environment. Enhanced error-correcting
and self-healing technologies optimize the reliability and availability of the server. Redundant
components, the ability to mix processor types within an Integrity Superdome, and the ability to
upgrade partitions individually make it easier for IT departments to provision new applications and
to manage and maintain the systems with greater uptime. The end result is a better return on IT
investment with HP Integrity and HP 9000 servers.
The sx2000 chipset advancements deliver key benefits with HP Integrity and HP 9000 high-end and
midrange servers in the following areas:
Performance
Price/performance
Reliability and availability
Flexibility and simplicity with virtualization and management
Investment protection
Performance
The sx2000 chipset has a well-balanced architecture designed to maximize the performance of the
Itanium 2 and PA-RISC processors across a wide variety of commercial and technical applications. The
sx2000 provides more than twice the memory bandwidth and more than four times the I/O and fabric
bandwidth of the HP sx1000 chipset, along with significant reductions in system latencies. This
enables a significant performance boost with existing Itanium 2 9M processors (for example, up to a
30 percent increase for applications running on Integrity Superdome servers) and PA-8900 processors.
Additionally, the sx2000 supports the bandwidth demands of the hyperthreaded dual-core Itanium 2
processors and future Itanium 2 “Montvale” processors. The sx2000 Superdome provides 122 GB/s of I/O
bandwidth and an unprecedented 218 GB/s of system fabric bandwidth. This results in near-linear
performance scaling of up to 64 processor cores (128 processors in the future) and 192 I/O adapters,
enabling customers to add capacity as their business needs grow. The sx2000 also incorporates Write
Coalescing, which enables more efficient transfer of data from processors to network adapters,
thereby enhancing bandwidth and reducing latency of inter-system communications in a cluster.
Price/performance ratio
The sx2000 chipset uses leading-edge, enterprise-ready technologies to maximize performance while
following industry-standard, commodity cost curves. This provides the best price/performance ratio to
our customers. In addition to using industry-standard Itanium 2 processors, the sx2000 uses double
data rate (DDR-2) DRAMs, as well as PCI-X I/O adapters. (PCI Express I/O adapters will be supported
in the future.)
Reliability and availability
HP sx2000 systems use error-correcting and self-healing technologies throughout, thereby maximizing
availability. Unique technologies include:
Double chip spare, which immediately restores chip-spare protection after a DRAM has failed and
protects against the vast majority of memory-buffer and data-bus failures
Link self-healing, which recovers from most connector, backplane or cable failures without loss of
performance. In the unlikely event of a backplane failure that exceeds the capability of the
self-healing technology, the affected partitions can be rebooted with reduced fabric bandwidth but
without losing any processors, memory, or I/O resources.
Other key technologies include:
ECC on all PCI-X slots
Non-shared PCI-X buses to isolate failures to a single I/O card
N+1 hot-swappable power supplies, voltage regulators, and fans
Redundant power cords for dual power grid support
Redundant master system clock with failover
Virtualization and manageability
The sx2000 supports a wide array of partitioning and virtualization options to meet customers’ needs.
Hard partitions (nPars) provide full hardware isolation between partitions. Virtual partitions (vPars)
and HP Integrity Virtual Machines (Integrity VM) allow hard partitions to be further subdivided to
provide fine-grained control over the assignment of resources. In addition, sx2000 systems enable
Integrity server customers to choose from any of the industry-leading operating environments,
including HP-UX 11i, Microsoft® Windows®, Linux or OpenVMS.
Investment protection
All sx2000 systems are designed to be upgradeable to keep up with customers’ business needs. New
cells and/or I/O subsystems can be added online to the system to expand current partitions or to
create new partitions. Unlike IBM p5 590/595 systems, cells can be quickly and easily removed to add
memory or upgrade CPUs.
Current users of HP Integrity and HP 9000 high-end and midrange servers can take advantage of the
added performance, availability and manageability with an in-box upgrade. New Integrity high-end and
midrange servers will include this technology as part of the system, with current and future support
of multiple generations of Intel Itanium 2 processors. In both cases, the servers based on the new
sx2000 chipset provide a foundation for dual-core Intel Itanium 2 processors, Itanium 2 9M processors
and future Intel Itanium 2 “Montvale” processors. In addition, the dual-core Intel Itanium 2
processors, Itanium 2 9M processors, and future Intel Itanium 2 “Montvale” processors, along with the
PA-8900 processors, can co-exist in the same sx2000-based Superdome server in separate hard
partitions. This helps smooth the migration to Integrity-based servers.
sx2000 chipset and system topologies
In any high-performance computer system, the core electronics complex (also known as the CEC or
simply the “chipset”) plays a vital role in delivering the performance, availability and
manageability necessary to support the most demanding workloads. The chipset provides the
interconnectivity between processors, memory and I/O cards, turning this group of components into a
high-performance computer system. HP’s new sx2000 chipset plays this critical role in the HP
Integrity Superdome and midrange rx8640 and rx7640 servers, as well as the HP 9000 Superdome, rp8440
and rp7640 systems.
HP Integrity and HP 9000 servers that use the sx2000 chipset are called “cellular” systems. These
systems are based on building blocks called cells. Each cell contains a set of processors, memory
DIMMs, and connectivity to an I/O subsystem. Cells enable the deployment of hard partitions (also
called nPars), which are independent, hardware-isolated operating system instances. A hard partition
consists of one or more cells within a server system. Each hard partition has its own I/O components.
Multiple hard partitions can be deployed on a single server to support server consolidation.
Operating environments can vary from partition to partition, enabling a single server to run HP-UX
11i, Linux, Microsoft® Windows Server™ 2003 and OpenVMS, all at the same time (Linux, Microsoft
Windows Server 2003 and OpenVMS require Intel Itanium 2 processors). In addition, hard partitions
running HP-UX can be further subdivided into virtual partitions (also called vPars) to provide even
finer-grained partitions for server consolidation and virtualization. The foundation is also set for
future operating system enhancements to take advantage of efficient and secure inter-partition
communications.
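The partitioning rules described above can be sketched as a small validity check. This is a minimal illustration, not an HP tool: the `Cell` and `HardPartition` classes, their field names, and the operating-system strings are hypothetical. The rules come from the text (at least one cell per hard partition, Itanium 2 and PA-8900 processors kept in separate hard partitions, and Linux, Windows Server 2003 and OpenVMS requiring Itanium 2); the rule that a cell belongs to only one hard partition is an assumption implied by hardware isolation.

```python
from dataclasses import dataclass, field

# Operating environments that, per the text, require Intel Itanium 2 processors.
ITANIUM_ONLY_OS = {"Linux", "Windows Server 2003", "OpenVMS"}


@dataclass(frozen=True)
class Cell:
    """Hypothetical model of one cell board."""
    cell_id: int
    cpu_family: str  # "Itanium 2" or "PA-8900"


@dataclass
class HardPartition:
    """Hypothetical model of an nPar: a named group of cells running one OS."""
    name: str
    os: str
    cells: list = field(default_factory=list)


def validate(partitions):
    """Return a list of rule violations (empty if the layout is valid)."""
    problems = []
    owner = {}  # cell_id -> name of the partition that already owns the cell
    for p in partitions:
        families = {c.cpu_family for c in p.cells}
        if not p.cells:
            problems.append(f"{p.name}: a hard partition needs at least one cell")
        if len(families) > 1:
            problems.append(f"{p.name}: processor types must be in separate hard partitions")
        if families and p.os in ITANIUM_ONLY_OS and families != {"Itanium 2"}:
            problems.append(f"{p.name}: {p.os} requires Itanium 2 processors")
        for c in p.cells:
            if c.cell_id in owner:
                problems.append(
                    f"cell {c.cell_id}: assigned to both {owner[c.cell_id]} and {p.name}"
                )
            owner[c.cell_id] = p.name
    return problems


layout = [
    HardPartition("npar0", "HP-UX 11i", [Cell(0, "PA-8900"), Cell(1, "PA-8900")]),
    HardPartition("npar1", "Linux", [Cell(2, "Itanium 2")]),
]
print(validate(layout))  # prints []
```

Returning a list of messages rather than raising on the first problem lets a planning script report every conflict in a proposed layout at once.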
The sx2000 chipset consists of five very-large-scale integration (VLSI) component types: a cell
controller, a memory buffer, a crossbar switch, a PCI-X system bus adapter, and a PCI-X host bridge.
One cell controller and eight memory buffer components reside on each cell, as shown in Figure 1. One
PCI-X system bus adapter and 8 or 12 PCI-X host bridges reside within each I/O system.
Figure 1. HP sx2000 cell board architecture
[Diagram labels: cell board with CPUs and cell controller; four memory buffer pairs (2x Mem Buffer);
16 or 32 DDR-2 DIMMs; sx2000 fabric links to crossbar switches 0, 1 and 2 on the system backplane;
I/O system with PCI-X system bus adapter, 16 links, PCI-X host bridges, and PCI/PCI-X buses with 8 or
12 slots.]
The crossbar switches reside on the system backplane and determine the overall topology of the
system. As shown in Figure 2, the Integrity rx7640 and HP 9000 rp7640 systems have two directly
connected cells, and therefore do not use any crossbar switches. The Integrity rx8640 and HP 9000
rp8440 systems have four cells interconnected by two crossbar switches that provide three system
fabrics. The Integrity and HP 9000 Superdome servers have eight cells and six crossbar switches in
each of two cabinets. Crossbar-based topologies allow a cell to be taken offline for upgrade or
service without any impact on the rest of the system. This topology is superior to ring-based
topologies, where removal of a single cell severs communication within the entire system, thus
requiring a complete system shutdown. The rx7640/rp7640 and rx8640/rp8440 systems support 16 DIMMs
per cell, while Superdome systems support 32 DIMMs per cell.
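As a rough illustration of how these topologies translate into part counts, the sketch below tallies the sx2000 VLSI components in a fully populated system. The cell, crossbar and DIMM counts are taken from the text; the `Topology` class and `chip_counts` function are hypothetical names, the mapping of 8 host bridges to midrange systems and 12 to Superdome follows the I/O subsystem description, and the tally assumes every cell is installed with its own I/O system.

```python
from dataclasses import dataclass

MEMORY_BUFFERS_PER_CELL = 8  # eight memory buffers reside on each cell


@dataclass(frozen=True)
class Topology:
    """Cell and crossbar counts as stated in the text."""
    name: str
    cells: int
    crossbars: int
    host_bridges_per_cell: int  # 8 on midrange systems, 12 on Superdome
    dimms_per_cell: int


TOPOLOGIES = [
    Topology("rx7640/rp7640", cells=2, crossbars=0,
             host_bridges_per_cell=8, dimms_per_cell=16),
    Topology("rx8640/rp8440", cells=4, crossbars=2,
             host_bridges_per_cell=8, dimms_per_cell=16),
    Topology("Superdome (two cabinets)", cells=16, crossbars=12,
             host_bridges_per_cell=12, dimms_per_cell=32),
]


def chip_counts(t):
    """Count sx2000 VLSI parts, assuming all cells and I/O systems are present."""
    return {
        "cell controllers": t.cells,
        "memory buffers": t.cells * MEMORY_BUFFERS_PER_CELL,
        "crossbar switches": t.crossbars,
        "system bus adapters": t.cells,
        "host bridges": t.cells * t.host_bridges_per_cell,
    }


for t in TOPOLOGIES:
    counts = chip_counts(t)
    print(f"{t.name}: {sum(counts.values())} chipset parts, "
          f"{t.cells * t.dimms_per_cell} DIMM slots, "
          f"{counts['host bridges']} I/O slots")
```

For the two-cabinet Superdome the host bridge count works out to 192, which agrees with the 192 I/O adapters quoted in the executive summary.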
Figure 2. System topology
[Diagram: three topologies, each cell shown with its own I/O system. rx7640/rp7640: cells 0 and 1,
directly connected. rx8640/rp8440: cells 0 to 3, connected through two crossbars. Superdome: cells 0
to 15, connected through crossbars.]
Figure 3. The sx2000 chipset provides increased bandwidth compared to the sx1000 chipset
[Diagram: side-by-side block diagrams of the sx1000 and sx2000 chipsets (CPUs, cell controller,
memory, crossbars, PCI-X system bus adapter, PCI-X host bridges). The sx2000 side is labeled 1.33x
CPU bandwidth, 2.1x memory bandwidth, 4.4x I/O bandwidth and 4.2x fabric bandwidth.]
In addition to supporting the new hyperthreaded dual-core Intel Itanium 2 processor, the Itanium 2 9M
processor and the PA-8900 processor (separately or mixed within an Integrity Superdome), the sx2000
will support the future Itanium 2 “Montvale” processor. The sx2000 enhances the performance of the
server by increasing the bandwidths between components compared to the sx1000 chipset. The sx2000
chipset provides 1.33 times the CPU bus bandwidth, more than twice the memory bandwidth and more
than four times the I/O and fabric bandwidth of the sx1000. In the unlikely event of a crossbar
component failure, the three independent crossbar fabrics on the sx2000 enable the affected
partitions to be rebooted with reduced fabric bandwidth but without losing any processors, memory,
or I/O resources.
Cell subsystem
A cell board is composed of processors, memory and chipset VLSI (the cell controller and memory
buffers), as well as paths to I/O and the system fabric. The cell controller (CC) interconnects all
the parts of the cell board. The CC provides high-bandwidth, low-latency, error-correcting paths
between the cell’s processors, memory, I/O and fabric. Each CC supports up to four processor sockets
on two buses that can each transport data at up to 8.5 GB/sec. Supported processors include Intel’s
Itanium 2 processors as well as HP’s PA-8900 RISC processors (see Table 1).
Table 1: Processors supported with the sx2000 chipset
                   Intel Itanium 2 9M    Dual-core Intel        HP PA-8900      Intel “Montvale”
                   processor             Itanium 2 processor    processor       processor
Timeframe          Available             Available              Available       Planned
Architecture       Itanium 2             Itanium 2              PA-8900         Itanium 2
Cores              Single                Dual-core with         Dual            Dual-core with
                                         hyperthreading                         hyperthreading
Cache (maximum)    9 MB                  12 MB per core         64 MB shared    12 MB per core
FSB rate: 400 MT/s and 533 MT/s
The new sx2000 chipset provides greater performance and availability with memory enhancements. The CC
maintains a cache-coherent memory system using a directory-based memory controller. The CC’s memory
controller is combined with memory buffers and DIMMs to create four independent memory systems
(quadrants). The memory buffers enable streaming of data between the CC and the DIMMs at 533 MT/s.
These memory buffers accelerate changes to the coherency state in memory, enable ECC of memory, and
can buffer several cache lines going both to and from memory.
Up to 32 DIMMs can be controlled by the CC’s memory controller. The DIMMs use DDR-2 SDRAMs running at
267 MHz, or 533 MT/s. With 4-GB DIMMs (using 1-Gb DRAMs), a total memory capacity of 64 GB per cell
board for the 8- and 16-socket Integrity servers, and 128 GB per cell board for the Integrity
Superdome, is possible. Later releases will support 128 GB per cell board for the 8- and 16-socket
Integrity servers and 256 GB per cell board for the Integrity Superdome when larger DIMMs become
available for these servers. Accesses to consecutive cache lines can be interleaved across the
quadrants on one cell (cell local memory) or across quadrants on all cells within a partition (cell
interleaved). A single cell can achieve sustained memory bandwidth of 16 GB/sec.
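The capacity figures above follow directly from DIMM count and DIMM size, and the interleaving idea can be sketched with simple modular arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 128-byte cache line size and the 8-GB "larger DIMM" size are assumptions not stated in the text, and the round-robin mapping shown is a generic interleaving scheme rather than the actual address mapping used by the cell controller.

```python
QUADRANTS_PER_CELL = 4    # four independent memory systems per cell (from the text)
CACHE_LINE_BYTES = 128    # assumption: the line size is not stated in the text


def cell_capacity_gb(dimm_slots, dimm_size_gb):
    """Capacity of one fully populated cell board, in GB."""
    return dimm_slots * dimm_size_gb


def interleave(address, cells_in_partition=1):
    """Map a physical address to (cell index, quadrant) by round-robin interleaving.

    Consecutive cache lines rotate across the quadrants of a cell first, then
    across the cells of the partition. cells_in_partition=1 models cell local
    memory; larger values model cell-interleaved memory.
    """
    line = address // CACHE_LINE_BYTES
    quadrant = line % QUADRANTS_PER_CELL
    cell = (line // QUADRANTS_PER_CELL) % cells_in_partition
    return cell, quadrant


# 16 DIMM slots on the 8- and 16-socket servers, 32 on Superdome.
for slots in (16, 32):
    for dimm_gb in (4, 8):  # 8 GB stands in for the "larger DIMMs" (assumed size)
        print(f"{slots} x {dimm_gb} GB DIMMs = {cell_capacity_gb(slots, dimm_gb)} GB per cell")

# The first eight cache lines of a two-cell partition.
print([interleave(n * CACHE_LINE_BYTES, cells_in_partition=2) for n in range(8)])
```

Spreading consecutive lines across quadrants keeps all four memory systems busy during streaming access, which is the motivation the text gives for interleaving.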
The sx2000 has four times the memory capacity, twice the memory bandwidth, and more than 23 percent
lower memory-access latency compared to the sx1000. The CC also supports an increased level of memory
availability, called double chip-sparing, as described in the section on availability.
The CC has one high-speed link that connects it to an I/O subsystem, and three high-speed links that
connect it to the system backplane fabric. Traffic is distributed equally across the three backplane
fabrics using a unique algorithm to maximize performance.
Advanced VLSI technology and HP’s architectural enhancements have enabled additional system features
in the CC to boost system performance. One such enhancement is processor cache-to-cache transfers,
which are described in the section on performance.
Table 2 lists specifications for the cell controller VLSI chip.
Table 2: Specifications of an individual sx2000 cell controller
Die size                    18.5 x 18.5 mm
Processor bandwidth         17.1 GB/sec peak (13.6 GB/sec sustained)
Memory bandwidth            17.1 GB/sec peak (16.0 GB/sec sustained)
I/O bandwidth               11.5 GB/sec peak (8.2 GB/sec sustained)
Fabric bandwidth            34.6 GB/sec peak (27.3 GB/sec sustained)
Maximum memory supported    256 GB (using 2 Gb DRAM technology)
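The peak and sustained figures in Table 2 can be compared with a few lines of arithmetic. The sketch below copies the bandwidth values from the table and computes the sustained-to-peak ratio for each path. It also cross-checks the processor peak figure against the bus parameters given earlier (two buses at 533 MT/s), assuming a 128-bit (16-byte) Itanium 2 front-side bus data path; that width is an assumption, as it is not stated in this paper.

```python
# (peak, sustained) bandwidth in GB/sec, copied from Table 2.
CELL_CONTROLLER_BANDWIDTH = {
    "processor": (17.1, 13.6),
    "memory": (17.1, 16.0),
    "I/O": (11.5, 8.2),
    "fabric": (34.6, 27.3),
}


def sustained_fraction(peak, sustained):
    """Fraction of peak bandwidth that is quoted as sustainable."""
    return sustained / peak


def bus_peak_gb_per_sec(transfers_mt_per_sec, width_bytes):
    """Peak bandwidth of a parallel bus: transfers per second times bytes per transfer."""
    return transfers_mt_per_sec * width_bytes / 1000


for path, (peak, sustained) in CELL_CONTROLLER_BANDWIDTH.items():
    print(f"{path:9s} {sustained_fraction(peak, sustained):.0%} of peak")

# Two processor buses at 533 MT/s, assuming a 16-byte data path per bus.
two_buses = 2 * bus_peak_gb_per_sec(533, 16)
print(f"two processor buses: {two_buses:.1f} GB/sec peak")
```

Under that assumed bus width, each bus comes to roughly 8.5 GB/sec and the pair to roughly 17.1 GB/sec, which is consistent with the per-bus and per-cell figures quoted in the text.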
I/O subsystem
The I/O subsystem connects the processors and memory to the I/O cards. The I/O subsystem contains two
types of chips: the PCI-X system bus adapter and the PCI-X host bridge (see Figure 1). The cell
controller connects to a single PCI-X system bus adapter over a high-speed link. The PCI-X system bus
adapter connects to 8 or 12 PCI-X host bridges, each of which connects to a single PCI/PCI-X bus.
HP’s high-end systems support 12 PCI-X host bridges (and 12 I/O slots) per cell, and HP’s midrange
systems support eight PCI-X host bridges (and eight I/O slots) per cell.
The specifications of the PCI-X system bus adapter and the PCI-X host bridge are listed in Table 3.
Table 3: Specifications of the I/O subsystem
Slot bandwidth                                        0.5 GB/s to 2.0 GB/s per I/O slot
Aggregate sustainable bandwidth from the I/O
subsystem on one cell                                 11.5 GB/s peak (8.2 GB/s sustained)
The new sx2000 chipset enhances system performance and availability with I/O advancements and support
for a wide variety of I/O cards. The PCI-X system bus adapter has a cache to buffer inbound and
outbound data. The cache size has been increased by 50 percent for the sx2000 to ensure that the
increased bandwidth requirements of the I/O cards can be satisfied. In addition, the cache hides the
non-uniform memory access time for different lines that the I/O card has requested from memory for
DMA reads or writes. This allows data to be streamed to and from the I/O card without added delays.
The PCI-X host bridge architecture connects each PCI-X host bridge to just one PCI/PCI-X card; no
shared slots are used. This architecture provides greater system availability by isolating failures
that occur on one PCI card to a single PCI bus and PCI-X host bridge, leaving the other I/O cards and
the rest of the system undisturbed. The PCI-X host bridge in the sx2000 supports PCI-X 2.0 at 266 MHz
as its high-bandwidth I/O bus, as well as all previous versions of PCI/PCI-X. This new I/O bus has
two primary benefits:
1. Higher bandwidth (about 2 GB/s)
2. Higher reliability with the addition of ECC
The sx2000 will enable ECC for any PCI-X card that supports ECC, whether it is running at 133 MHz or
266 MHz.
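The slot bandwidth range in Table 3 and the "about 2 GB/s" figure for PCI-X 2.0 can be reproduced with a back-of-the-envelope calculation. The sketch below assumes a 64-bit (8-byte) PCI/PCI-X data path, which is not stated in the text, and treats the quoted clock figures as effective transfer rates. The function name and the list of bus modes are illustrative.

```python
PCI_BUS_WIDTH_BYTES = 8  # assumption: 64-bit PCI/PCI-X data path


def pci_peak_gb_per_sec(rate_mhz, width_bytes=PCI_BUS_WIDTH_BYTES):
    """Peak slot bandwidth: effective transfer rate (MHz) times bytes per transfer."""
    return rate_mhz * width_bytes / 1000


BUS_MODES = [
    ("PCI 66 MHz", 66),
    ("PCI-X 133 MHz", 133),
    ("PCI-X 2.0 266 MHz", 266),
]

for name, rate in BUS_MODES:
    print(f"{name:18s} {pci_peak_gb_per_sec(rate):.2f} GB/s peak")
```

Under these assumptions the slowest and fastest modes land near the 0.5 GB/s and 2.0 GB/s endpoints given in Table 3.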
The sx2000 can meet the most demanding I/O needs of today’s computing workloads. For network
connectivity, the sx2000 supports leading-edge 10-Gb Ethernet cards. To connect to a Fibre Channel
array, the sx2000 supports 4-Gb Fibre Channel cards. For InfiniBand networks, the sx2000 supports 4x
InfiniBand cards. A large number of additional cards are also supported, including PCI-Express, SCSI,
Gigabit Ethernet, Fibre Channel, VGA, and more. In addition, the sx2000 supports a long list of
legacy I/O cards that will continue to perform just as well (or better) on the sx2000. A list of
supported I/O cards can be found at
http://www.hp.com/products1/serverconnectivity/index.html.
System crossbar chip
The crossbar chip is the heart of the sx2000 system fabric. It connects all of the cells in the
system, providing a high-bandwidth, low-latency, coherent path among processors, memory, and I/O.
This chip is an eight-port non-blocking crossbar, meaning it has eight connections to and from cells
and/or other crossbar chips. There are 12 of these chips in a 64-socket system (see Figure 2). The
specifications of the crossbar chip are listed in Table 4.
Table 4: Specifications of the crossbar
Bandwidth of each port       11.5 GB/sec peak (9.1 GB/sec sustained)
Fabric bandwidth per cell    34.6 GB/sec peak (27.3 GB/sec sustained)
Each cell has three connections to the fabric. Each of these connections is to a different crossbar
chip. With this much fabric bandwidth, the sx2000 architecture provides room to grow so the full
benefit will be seen with new applications and processors.
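The per-cell fabric bandwidth in Table 4 is simply the per-port bandwidth multiplied by the three fabric connections on each cell. The short sketch below checks that arithmetic and also works out a port budget for a 64-socket system. The variable and function names are illustrative, and the port budget only shows how many crossbar ports remain after the cell connections; how many of those are actually wired between crossbars is not stated in the text.

```python
FABRIC_LINKS_PER_CELL = 3        # each cell connects to three different crossbar chips
PORT_PEAK_GB_PER_SEC = 11.5      # Table 4
PORT_SUSTAINED_GB_PER_SEC = 9.1  # Table 4


def per_cell_fabric_bandwidth(port_bandwidth):
    """Fabric bandwidth per cell, rounded to one decimal place like the table."""
    return round(FABRIC_LINKS_PER_CELL * port_bandwidth, 1)


print(per_cell_fabric_bandwidth(PORT_PEAK_GB_PER_SEC), "GB/sec peak per cell")
print(per_cell_fabric_bandwidth(PORT_SUSTAINED_GB_PER_SEC), "GB/sec sustained per cell")

# Port budget for a 64-socket system: 16 cells, 12 eight-port crossbar chips.
CELLS, CROSSBARS, PORTS_PER_CROSSBAR = 16, 12, 8
cell_facing_ports = CELLS * FABRIC_LINKS_PER_CELL
ports_left_for_crossbar_links = CROSSBARS * PORTS_PER_CROSSBAR - cell_facing_ports
print(cell_facing_ports, "ports face cells;",
      ports_left_for_crossbar_links, "remain for crossbar-to-crossbar links")
```

The sustained product matches the table exactly; the peak product differs from the table's 34.6 GB/sec in the last digit, presumably because the per-port figure is itself rounded.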
Links
In addition to providing higher bandwidth, the high-speed links have been redesigned for the sx2000
to provide greater availability and reliability. For example, greater reliability is achieved by
tolerating transient errors. When an error is detected on a high-speed link, any transactions in
flight are retransmitted with the correct data. Greater availability is achieved by tolerating a hard
error in one of the channels that make up a link. If the chip determines that one of the channels of
the link has gone “bad,” it swaps in a spare channel and continues communicating at full bandwidth.
Both of these “self-healing” features are handled entirely in hardware with no software intervention.
This results in a higher degree of reliability and availability for business-critical computing
needs.
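The two self-healing behaviors described above, retransmission on a transient error and channel sparing on a hard error, can be modeled in a few lines. This is a toy software model for illustration only: in the sx2000 both mechanisms are implemented in hardware, and the channel count, the single spare, the use of CRC-32 for error detection, and the retry limit are all assumptions rather than details from the text.

```python
import zlib


class Link:
    """Toy model of a multi-channel link with spare channels (counts are assumed)."""

    def __init__(self, data_channels=8, spare_channels=1):
        self.required = data_channels
        self.active = set(range(data_channels))
        self.spares = list(range(data_channels, data_channels + spare_channels))

    def fail_channel(self, channel):
        """Hard error: retire the bad channel and swap in a spare if one is left."""
        if channel not in self.active:
            return
        self.active.remove(channel)
        if self.spares:
            self.active.add(self.spares.pop(0))

    @property
    def at_full_bandwidth(self):
        return len(self.active) == self.required


def send_with_retry(payload, transmit, max_attempts=4):
    """Transient error: resend until the receiver's checksum matches.

    transmit is a hypothetical callable that returns the bytes as received.
    """
    expected = zlib.crc32(payload)
    for attempt in range(1, max_attempts + 1):
        received = transmit(payload)
        if zlib.crc32(received) == expected:
            return received, attempt
    raise RuntimeError("error persisted across retries; treat as a hard failure")


def flaky_channel():
    """Build a transmit function that corrupts only the first transfer."""
    calls = {"count": 0}

    def transmit(data):
        calls["count"] += 1
        return data[:-1] + b"\x00" if calls["count"] == 1 else data

    return transmit


link = Link()
link.fail_channel(3)
print(link.at_full_bandwidth)                          # prints True
print(send_with_retry(b"cache line", flaky_channel())) # prints (b'cache line', 2)
```

In this model a second hard failure on the same link leaves it below full width, which corresponds to the case covered next in the text, where the affected partitions are rebooted on the remaining fabrics.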
In the unlikely event that a link or crossbar chip fails completely, the sx2000 minimizes downtime by
allowing the partitions affected by the failure to be rebooted immediately using the unaffected
fabrics. All CPUs, memory and I/O are then available again, with one-third to two-thirds of the
original fabric bandwidth. Partitions that were not using the failed components are unaffected. For
those systems that require maximum uptime, the cells can be partitioned in such a way that no
crossbar or fabric link is used by two or more partitions. This guarantees that a fabric failure will
affect, at most, one partition.
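Both points in this paragraph lend themselves to a quick check: how much fabric bandwidth remains after a failure, and whether a proposed partition layout shares any crossbar between partitions. The sketch below is a minimal illustration under stated assumptions. The function names are hypothetical, and the mapping from partitions to the crossbars they use is made-up input data, since the actual cell-to-crossbar wiring is not given in the text.

```python
from fractions import Fraction

FABRICS = 3  # three independent crossbar fabrics per system


def remaining_fabric_fraction(failed_fabrics):
    """Fraction of the original fabric bandwidth left after some fabrics fail."""
    if not 0 <= failed_fabrics < FABRICS:
        raise ValueError("at least one fabric must survive")
    return Fraction(FABRICS - failed_fabrics, FABRICS)


def shared_crossbars(partition_crossbars):
    """Return crossbars used by more than one partition.

    partition_crossbars maps a partition name to the set of crossbar ids its
    cells use (hypothetical input; real wiring is not described in the text).
    """
    users = {}
    for name, crossbars in partition_crossbars.items():
        for xbar in crossbars:
            users.setdefault(xbar, set()).add(name)
    return {xbar: sorted(names) for xbar, names in users.items() if len(names) > 1}


print(remaining_fabric_fraction(1), remaining_fabric_fraction(2))  # prints 2/3 1/3

isolated = {"npar0": {0, 1, 2}, "npar1": {3, 4, 5}}
overlapping = {"npar0": {0, 1, 2}, "npar1": {2, 3, 4}}
print(shared_crossbars(isolated))     # prints {}
print(shared_crossbars(overlapping))  # prints {2: ['npar0', 'npar1']}
```

An empty result from `shared_crossbars` corresponds to the layout the text recommends for maximum uptime, where a single fabric failure can affect at most one partition.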
The sx2000 links take advantage of high-speed SERDES technology, using 8b/10b encoding with
clock/data recovery. As a result, there is no critical clock signal in any link that could become a
single point of failure for that link. The sx2000 chips have been designed to support links
transmitted both through cables and through in-board traces. The same link technology is used for
both cell-to-cell interconnection and cell-to-I/O subsystem interconnection.
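With 8b/10b encoding, every 8 bits of payload are sent as a 10-bit symbol, so the raw line rate must exceed the payload rate by a factor of 10/8. The sketch below shows that relationship. The function names are illustrative, and the paper does not say whether its quoted link bandwidths are measured before or after encoding, so the example treats the 11.5 GB/sec figure as payload purely as an assumption.

```python
PAYLOAD_BITS = 8   # bits of data carried per symbol
SYMBOL_BITS = 10   # bits transmitted per symbol with 8b/10b encoding


def encoding_efficiency():
    """Fraction of transmitted bits that carry payload."""
    return PAYLOAD_BITS / SYMBOL_BITS


def raw_line_rate_gbit_per_sec(payload_gbyte_per_sec):
    """Raw serial bit rate needed to carry a given payload bandwidth."""
    payload_gbit = payload_gbyte_per_sec * 8
    return payload_gbit * SYMBOL_BITS / PAYLOAD_BITS


print(f"efficiency: {encoding_efficiency():.0%}")
print(f"raw rate for 11.5 GB/sec of payload: {raw_line_rate_gbit_per_sec(11.5):.1f} Gb/sec")
```

The 20 percent overhead buys frequent signal transitions and DC balance, which is what allows the receiver to recover the clock from the data stream instead of relying on a separate clock signal.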
Capabilities of the sx2000 chipset
The sx2000 was designed to provide exceptional performance and reliable operation even in the
presence of errors. This section describes how the sx2000 delivers these benefits.
Reliability and availability
Systems with the new sx2000 chipset have numerous innovative self-healing, error-detection and
error-correction features to provide the highest levels of reliability and availability.
Servers built with the sx2000 chipset support extensive state-of-the-art error detection and error
correction (ECC) features. This protection is used on paths within the VLSI components, and between
the chipset components and processors, memory and I/O. This includes processor buses, memory buses,
I/O and fabric links, as well as I/O slots.
An enhanced feature in the sx2000 chipset compared to the sx1000 chipset is double chip-sparing
technology. This is a capability of the memory error-correcting logic to re-establish chip-spare
correction after a DRAM in the ECC codeword has failed. This occurs when firmware
recognizes that the first DRAM has failed and “erases” its bits from the ECC correction
calculations. This allows the ECC logic to correct for a second DRAM failure in the same ECC
codeword. This erasure can be applied to a single DRAM, to all DRAMs sharing a bit of a bus, or to all
busses of a memory subsystem, which maximizes the coverage of this unique protection
mechanism. The DRAMs will not need to be replaced until two DRAM failures occur, which
reduces the number of times the system has to be taken down for memory service. This enhanced
feature is a result of HP's advanced research in memory technology and does not require more
DRAMs per MB than was required for the sx1000’s single chip-sparing support. Double chip sparing is more
cost effective than memory-mirroring methods for protecting memory. It is also superior to the
approach of HP competitors, who use a spare DRAM and must perform a
read/correct/write “walk” through all memory locations to recover from a chip kill (this walk
increases the likelihood of encountering a single-bit error that is now uncorrectable). The
primary benefit of this high-availability feature is that it eliminates unscheduled downtime to
replace failed DIMMs.
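The service policy that follows from double chip sparing can be summarized as a small state model. The sketch below is a bookkeeping illustration only and does not implement the ECC mathematics. The class name `CodewordSparing`, the DRAM labels, and the treatment of a third failure as uncorrectable are assumptions made for illustration; the text above only describes the first two failures.

```python
class CodewordSparing:
    """Bookkeeping model of double chip sparing for one ECC codeword.

    Tracks policy only; the actual erasure and correction are done by
    firmware and the memory ECC logic.
    """

    def __init__(self):
        self.erased = None      # DRAM whose bits firmware has erased
        self.failed = set()     # distinct DRAMs seen to fail so far

    def report_failure(self, dram):
        """Record a DRAM failure and return the resulting status string."""
        self.failed.add(dram)
        if len(self.failed) == 1:
            # First failure: firmware erases this DRAM from the ECC
            # calculation, re-establishing chip-spare correction.
            self.erased = dram
            return "corrected; no service needed"
        if len(self.failed) == 2:
            # Second failure in the same codeword is still correctable.
            return "corrected; schedule DIMM replacement"
        # Assumption: beyond two failed DRAMs the codeword is not protected.
        return "uncorrectable"


codeword = CodewordSparing()
print(codeword.report_failure("dram3"))   # corrected; no service needed
print(codeword.report_failure("dram7"))   # corrected; schedule DIMM replacement
```

In this model, service is requested only after the second distinct DRAM failure, which matches the statement that DRAMs do not need to be replaced until two failures occur.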
A list of sx2000 chipset system reliability and availability features can be found in Table 5.
Table 5: sx2000-based system reliability and availability features
Location                  Feature
Memory system             Error detection / correction
                          DIMM address / control-signal parity protection
                          Redundant address / control-signal contacts
                          Double chip sparing
Processor                 Cache error detection / correction
Processor bus             Error detection / correction
I/O and fabric links      Link-level retry
                          Link spare channel
Crossbar / system fabric  Multiple fabrics enable reboot of affected partitions after a
                          fabric failure
I/O slots                 Error detection / correction
                          PCI failure isolation to a single slot
Chipset                   Internal data path error detection / correction
Partition                 nPars: hardware isolation between partitions
                          GSM communication without loss of isolation (requires OS
                          support)
                          Cell OLRAD (requires OS support)
                          PCI card OLRAD (requires OS support)
As Table 6 details, HP Integrity servers with the sx2000 chipset offer added reliability,
availability and serviceability features compared to the IBM p5 590/595.
Table 6: Integrity Superdome and IBM’s p5 590/595 RASM feature comparison
RASM feature              HP Integrity Superdome with the sx2000 chipset
System fabric             HP’s system fabric is self-healing. It uses error detection,
                          retries, and an extra channel for automatic failover on every
                          link connection.
                          A cell failure is isolated to the partitions using that cell.
Memory                    Double chip sparing supported with the same amount of
                          memory as previous system generation.
                          Redundant contacts on DIMM address and control signals.
                          Address parity is provided in the memory system.
Memory upgrades           Upgrades and servicing can be done without taking the
and servicing             entire system down. Unaffected partitions can
                          continue to run.
Self-healing cache        The sx2000 supports self-healing cache features.
I/O card upgrades         HP does not require special encapsulating hardware to
and ease of use           install PCI cards.
System clocks             Redundant clock sources prevent a single clock failure from
                          crashing the system. A reboot is not required for failover.
Hot-swappable fans        All of the N+1 fans are hot-swappable.
Service processor         Superdome service processor failures do not disable a
                          running system, and the service processor can be hot-swapped.
                          Partition information is distributed to redundant locations
                          and can be recovered in the event the service processor
                          fails.
Serviceability
Another set of features in the sx2000 provides serviceability: the ability to add, replace, or delete
components while the system is running. This type of serviceability is termed “On-Line Replacement,
Addition, and Deletion” (OLRAD). The sx2000 supports OLRAD for PCI cards and cells. Cell
OLRAD supports workload balancing, capacity on demand, and field service. Cell OLRAD requires
OS support, and benefits are maximized by pre-defining floating cells, which do not participate in
cross-cell memory interleave. PCI OLRAD allows a failed PCI card to be replaced while the partition is
running. Additional PCI card capacity can also be added to a running partition. OLRAD prevents
unnecessary system downtime, an essential feature for mission-critical applications.
Manageability
The sx2000 chipset is designed with system manageability in mind. It supports multiple partitioning
options, multiple types of processors, and multiple operating environments. These features are
complemented by server-level manageability hardware and suites of manageability tools at the
operating system and higher levels (for example, HP System Insight Manager and HP OpenView).
Partitioning
sx2000-based systems support a wide array of partitioning options. Each
partitioning strategy splits the resources in the server (CPUs, memory, I/O) into virtual
machines that can each run an operating system instance. Three partitioning strategies are
available: nPars, vPars, and Integrity VM. nPars are electrically isolated hard partitions with security
provided in the hardware. Secure firmware configures the sx2000 fabric to isolate resources in an
nPars partition from the rest of the system. This creates a hardware firewall to prevent other
operating system instances from disrupting that partition. The firewall also minimizes the chance that
a single failure (in hardware or software) can take down multiple partitions. The size of an nPar
partition can range from a single cell board to the entire system.
The other two partitioning strategies, vPars and Integrity VM, run within an nPar partition. vPars are
available on HP-UX 11i and offer partitioning granularity down to the CPU-core level. They also
offer the performance necessary for key applications because there is minimal overhead to coordinate
among the guest operating environments. Integrity VM is also available on HP-UX 11i and
offers sub-CPU-core partitioning granularity. Integrity VM offers good security in guest
operating environments that are not trusted.
Table 7 highlights the different virtualization offerings available from HP.
Table 7: Partition and virtualization comparison
nPars (hard partitions)
  sx1000 chipset: Offers electrically isolated hard partitions. Minimizes failures that can crash
    multiple partitions.
  sx2000 chipset: Offers electrically isolated hard partitions. Minimizes failures that can crash
    multiple partitions. Improved hardware firewalls, including GSM hardware support.
  IBM P5 595: Not supported.
vPars (soft partitions)
  sx1000 chipset: Offers CPU-core-level partitioning granularity. Minimal overhead to coordinate
    among the guest operating environments. Runs within a hard partition. Offered on HP-UX 11i.
  sx2000 chipset: Offers CPU-core-level partitioning granularity. Minimal overhead to coordinate
    among the guest operating environments. Runs within a hard partition. Offered on HP-UX 11i.
  IBM P5 595: LPars allow sub-CPU-core partitioning granularity. Lower performance because of the
    additional overhead of trapping privileged instructions. Hard partitions are not supported.
    Offered on AIX (some features offered on Linux).
Integrity VM (sub-CPU partitioning)
  sx1000 chipset: Offers sub-CPU-core partitioning granularity. Offers good security. Runs within
    a hard partition. Offered on HP-UX 11i.
  sx2000 chipset: Offers sub-CPU-core partitioning granularity. Offers good security. Runs within
    a hard partition. Offered on HP-UX 11i.
  IBM P5 595: LPars allow sub-CPU-core partitioning granularity. Offers good security. Offers good
    I/O virtualization. Hard partitions are not supported. Offered on AIX (some features offered
    on Linux).
The sx2000 cells in Integrity Superdomes provide superior investment protection
by supporting both PA-8900 RISC and Itanium 2 processors. Similar processors must be
loaded on all sockets of a partition. However, each partition in a system can host a different type of
processor. As a result, both dual-core Itanium 2 and Itanium 2 9M processors can now
co-exist in different partitions. PA-8900 RISC processors can co-exist with Intel Itanium processors in
different partitions. This feature enables a migration path from the PA-RISC to Itanium 2 technology
within the same system.
Customers can choose which operating environment to run with the sx2000 system.
The supported environments are HP-UX 11i, Windows, Linux, and OpenVMS (Windows, Linux and
OpenVMS require Intel Itanium 2 processors). The sx2000 performs well no matter which operating
environment is running. In addition, sx2000 servers support running multiple operating environments
simultaneously in different partitions on the same system. Systems with the sx2000 chipset
also support Service Guard technologies, which enable full-system failover from one system to
another.
Table 8: Options for CPUs and operating systems
CPU support
  sx1000 chipset: Supports the industry-standard Intel Itanium 2 processors (Itanium 2 6M and
    Itanium 2 9M CPUs). Supports the PA-8800 and PA-8900 (PA-RISC) processors.
  sx2000 chipset: Supports the industry-standard Intel Itanium 2 processors (new dual-core Intel
    Itanium 2 processors, Itanium 2 9M processors, and future Itanium 2 “Montvale” CPUs).
    Supports the PA-8900 (PA-RISC) processor.
  IBM P5 595: Supports the proprietary P5 from the Power line of processors in the IBM
    pSeries servers.
Operating system support
  sx1000 chipset: HP-UX 11i, Windows, Linux and OpenVMS
  sx2000 chipset: HP-UX 11i, Windows, Linux and OpenVMS
  IBM P5 595: AIX, i5/OS, and Linux
Performance
The sx2000 chipset improves system-level performance in a number of ways, beginning with support
for powerful next-generation Itanium 2 processors. However, without
an optimized system architecture, those processors would not deliver much more performance than
their predecessors. The sx2000 can keep the next-generation processors fully
utilized. The sx2000 delivers increased bandwidth on every interface, allowing the processors to do
the maximum amount of work. These bandwidths are shown in Table 9. The sx2000 also
supports write coalescing to speed data to I/O devices.
Table 9: sx2000 versus sx1000 system bandwidths
                          sx1000 chipset            sx2000 chipset            Improvement
                          peak (sustained)          peak (sustained)
Processor buses / cell    12.8 (10.2) GB/sec        17.1 (13.6) GB/sec        1.33x
Memory subsystem / cell   16.0 (7.5) GB/sec         17.1 (16.0) GB/sec        2.1x
Fabric link / cell        8.0 (6.4) GB/sec          34.6 (27.3) GB/sec        4.2x
I/O link / cell           2.5 (1.8 duplex, 1.0      11.5 (8.2 duplex, 5.0     4.4x
                          outbound, 0.9 inbound)    outbound, 4.5 inbound)
                          GB/sec                    GB/sec
I/O single slot           1.0 GB/sec                2.0 GB/sec                2x
The increased bandwidth of the sx2000 allows communication-intensive or large data-set
applications to run faster, whether the application is transferring data to/from memory, communicating
among the processors running the application, or transferring data to/from an I/O device.
In some systems, higher bandwidth means longer latency; however, the sx2000 has decreased
system latencies. This means processors and I/O devices have faster access to memory, improving
application performance. In addition, the sx2000 adds the ability to transfer data directly between
processors, rather than going from the origin processor, to memory, and then to the destination
processor. When a processor requests data that a second processor has just created/modified, that
data can be delivered to the requesting processor with a maximum of three crossbar hops, no matter
where those two processors are in the system. The sx2000 system latencies are shown in Table 10.
Table 10: sx2000 system latencies
The sx2000 supports PCI-X 2.0, 266-MHz slots, which can take advantage of the additional
bandwidth of the next-generation Fibre Channel cards, 4Gb FC.
Table 11 below compares the sx2000 Superdome to the IBM p5 595 using several key industry
metrics that highlight memory and I/O latency and bandwidth.
Table 11: Integrity Superdome and IBM’s p5 595 bandwidth and latency comparison
                               HP Integrity Superdome with sx2000     IBM P5 595 1.9 GHz
                               chipset and dual-core Intel Itanium 2
                               processor
Cache latency                  8 ns to access 24 MB of L2 cache       48 ns to access 36 MB of
                                                                      shared cache
Memory latency, measured by    185 ns                                 210 ns
LMbench
Memory bandwidth, measured by  170,448 MB/sec                         173,564 MB/sec
Streams Triad
Another new feature for the sx2000 that enhances performance is referred
to as “write coalescing.” Write coalescing allows a processor to write data directly to an I/O
device in cache-line-sized blocks (128 bytes). This allows 16 8-byte transactions to the I/O device
to be coalesced into one 128-byte transaction. This method of “pushing” data to an I/O device can
speed up short data transfers in applications such as clustering.
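The effect of write coalescing can be illustrated with a small software model. The sketch below groups 8-byte writes by 128-byte cache line and emits one transaction for each line that has been written completely. It is an illustration of the idea only: the function name `coalesce`, the example address, and the handling of partially written lines are assumptions, and the hardware's actual ordering and flush rules are not described in this paper.

```python
CACHE_LINE = 128   # bytes per cache line (from the text)
WORD = 8           # bytes per individual write (from the text)

def coalesce(writes):
    """Group (address, data) 8-byte writes by 128-byte cache line.

    Returns (full, partial): `full` lists (line_address, 128 bytes) for
    lines that were written completely; `partial` lists the remaining
    writes unchanged.
    """
    lines = {}
    for addr, data in writes:
        if len(data) != WORD or addr % WORD != 0:
            raise ValueError("expected aligned 8-byte writes")
        base = addr - addr % CACHE_LINE
        lines.setdefault(base, {})[addr - base] = data
    full, partial = [], []
    for base in sorted(lines):
        words = lines[base]
        if len(words) == CACHE_LINE // WORD:
            full.append((base, b"".join(words[o] for o in sorted(words))))
        else:
            partial.extend((base + o, words[o]) for o in sorted(words))
    return full, partial

# Sixteen consecutive 8-byte writes starting at a line-aligned address.
writes = [(0x1000 + i * WORD, bytes([i]) * WORD) for i in range(16)]
full, partial = coalesce(writes)
print(len(writes), "writes ->", len(full), "transaction of", len(full[0][1]), "bytes")
```

With the sample input, the sixteen individual writes collapse into a single 128-byte transaction, which is the 16-to-1 reduction described above.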
Scalability
Systems based on the sx2000 chipset are inherently scalable. These systems support up to
64 processor sockets, all accessible to each other through the low-latency, high-bandwidth
system fabric. Each of these processor sockets supports the dual-core Itanium 2 processor and the
Intel Itanium 2 9M processor. In the Integrity Superdome, sockets in a separate hard partition can be
used to support the dual-core PA-8900 processor. The sx2000 chipset provides superior investment
protection by setting the foundation for the next generation of Itanium 2 dual-core processors,
including the new dual-core Itanium 2 processor and the future Itanium “Montvale” processor.
Dual-core processors double the scalability of each of the cell-based servers. At the high end, the HP
Integrity Superdome and HP 9000 Superdome, with 128 processor cores, provide enough
computing power to meet the demands of nearly any of today’s applications. These systems also
support 128 GB of memory per cell, or a total of 2 TB in a 16-cell system, using 4-GB DIMMs. HP
plans for sx2000-based systems to support up to 4 TB of memory when memory densities increase.
Each cell board also supports 12 or 8 I/O cards, depending on whether the system is
an Integrity Superdome or a midrange server, respectively. In a 16-cell system, that is a total of 192 I/O
cards. The amount of memory-mapped I/O (MMIO) has also been increased over what the sx1000 supported. The
sx2000 supports from 1.75 to 3.75 GB of MMIO per I/O card.
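The system totals quoted above follow directly from the per-socket and per-cell figures. The short calculation below reproduces them; the variable names are illustrative, 1 TB is assumed to equal 1024 GB, and the DIMM count per cell is an inference from the 128-GB and 4-GB figures and is not stated in the text.

```python
SOCKETS = 64            # processor sockets (from the text)
CORES_PER_SOCKET = 2    # dual-core processors (from the text)
CELLS = 16              # cells in the largest system (from the text)
GB_PER_CELL = 128       # memory per cell with 4-GB DIMMs (from the text)
IO_CARDS_PER_CELL = 12  # Integrity Superdome figure (from the text)

cores = SOCKETS * CORES_PER_SOCKET
memory_tb = CELLS * GB_PER_CELL / 1024     # assumes 1 TB = 1024 GB
io_cards = CELLS * IO_CARDS_PER_CELL
dimms_per_cell = GB_PER_CELL // 4          # inferred DIMM count per cell

print(cores, memory_tb, io_cards, dimms_per_cell)   # 128 2.0 192 32
```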
Benefits of the sx2000 chipset for applications
HP’s sx2000-based systems are ideally suited to a wide variety of applications and market segments.
This section describes several key applications in different market segments, and shows how
sx2000 systems can meet the specific needs of each of these applications.
On-line transaction processing (OLTP)
Online transaction processing (OLTP) applications handle the day-to-day activities of running a
business, such as receiving and processing orders, tracking inventory in
warehouses, and maintaining customer information. OLTP applications generally consist of a
database server running a DBMS package such as Oracle or SQL Server, along with several
application servers or clients. The database maintains its information in tables, which reside on
disk and are typically cached in main memory to improve performance. These tables are updated
frequently, which requires frequent communication and synchronization between the processors.
HP sx2000 systems handle OLTP demands with ease. The HP Integrity and HP 9000
Superdome sx2000 systems scale to 64 processors and 128 processor cores to handle even the most
demanding OLTP applications. These systems support up to 2 TB of main memory (128 GB per cell),
with the ability to grow to 4 TB as DRAM densities increase, enabling a large portion of the database
tables to be cached in memory for improved performance. Memory latencies have been reduced by
up to 37 percent, and cache-to-cache latencies have been reduced by up to 44 percent (compared
to the sx1000), to minimize the amount of time processors spend waiting on cache
misses.
Customers’ OLTP systems are often mission-critical, with outages resulting in loss of revenue and
customer goodwill. The sx2000 systems were designed with multiple self-healing technologies to
minimize both scheduled and unscheduled downtime. Memory double chip-spare eliminates DRAM
failures as a source of system failure, and allows replacement of the DIMM to be deferred until the
next regularly scheduled maintenance period. In the backplane and I/O fabrics, the spare link
channels and elimination of common link clocks allow most connector or cable failures to be handled
automatically in the hardware without any loss of performance or application availability. ECC on all
I/O busses enables correction of intermittent failures. These and other technologies make sx2000
systems an ideal choice for mission-critical OLTP databases.
Business intelligence and decision support
Business intelligence (BI) applications analyze large amounts of historical data to help answer
business planning questions. BI applications typically stream large data files from disk into memory,
while simultaneously using multiple processors to analyze the data. These applications are often
highly parallel, with performance scaling well with the number of processors.
HP’s sx2000-based systems are ideal for running BI applications. Users may use from 2 to 64
high-performance dual-core Intel Itanium 2 processors to tackle BI problems that range from modest to
mammoth. All sx2000 systems provide 4.5 GB/sec of sustainable inbound I/O bandwidth per cell,
which eliminates I/O bandwidth bottlenecks and enables the processors to run at maximum
performance.
Video on demand
Video on demand (VoD) is a fast-growing segment for high-performance servers. A typical VoD
provider must manage thousands of video assets to ensure assets are stored, distributed, replicated
and delivered to meet customers’ needs. A VoD server may provide streaming video
content for a few hundred to several thousand customers, or may handle asset storage and
distribution to smaller customer-facing servers. Popular assets can often account for a high percentage
of video streams, so asset sharing is an important way to reduce VoD server costs.
HP sx2000 systems provide an excellent, flexible, commercial off-the-shelf (COTS) choice for VoD
providers. The sx2000 cellular architecture enables each VoD server to scale over time to meet
growing customer demand. VoD solution providers may independently specify the processing,
memory and I/O resources of each sx2000 cell to best meet the needs of their VoD software. Each
sx2000 cell provides more than 5 GB/s of sustainable outbound I/O bandwidth, which can
support roughly 8,000 simultaneous MPEG-2 video streams at 3.75 Mb/s per stream. Each cell can
also be configured with up to 128 GB of memory, enabling the VoD server to hold hundreds of titles
in memory.
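The stream count quoted above can be sanity-checked with simple arithmetic. The calculation below assumes decimal units (1 GB = 1000 MB) and 8 bits per byte. It shows the outbound bandwidth that 8,000 streams require and the raw ceiling implied by 5 GB/s; the paper does not say how the remaining headroom is used, so no overhead figure is assumed here.

```python
STREAM_MBPS = 3.75          # megabits per second per MPEG-2 stream (from the text)
STREAMS = 8000              # "roughly 8,000" streams (from the text)
CELL_OUT_GBYTES = 5.0       # sustainable outbound I/O per cell, GB/s (from the text)

# Assumes decimal units: 1 GB = 1000 MB, and 8 bits per byte.
needed_gbytes = STREAMS * STREAM_MBPS / 8 / 1000
raw_ceiling = int(CELL_OUT_GBYTES * 1000 * 8 / STREAM_MBPS)

print(needed_gbytes)   # 3.75
print(raw_ceiling)     # 10666
```

Under these assumptions, 8,000 streams consume 3.75 GB/s, which is comfortably within the 5 GB/s of outbound bandwidth that each cell sustains.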
Science and government
The science and government market segment contains the most diverse set of applications
available. Many of these applications require high floating-point throughput, along with high
memory, backplane and I/O bandwidths. HP sx2000 systems are well suited to these applications.
With dual-core 1.6-GHz Itanium 2 processors, the sx2000-based Integrity Superdome system can
sustain in excess of 819 GFlops, as measured by the industry-standard Linpack
benchmark. Each cell also provides more than 16 GB/sec of sustainable memory bandwidth,
allowing a 16-cell Integrity Superdome to report more than 170 GB/sec using the
industry-standard Streams Triad benchmark. This same system has a backplane bisection bandwidth of
more than 109 GB/sec, which enables it to achieve more than 150 GB/sec on Streams Triad
using cell-interleaved memory. This makes sx2000 systems ideally suited to non-partitionable
scatter/gather applications used in the defense sector. Each sx2000 cell can sustain more than 8
GB/sec of duplex I/O bandwidth. For real-time signal processing applications, this enables a 16-cell
sx2000 system throughput of more than 11,000 TB per day.
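The daily throughput figure follows from the per-cell duplex I/O bandwidth. The calculation below is a rough check that assumes decimal units (1 TB = 1000 GB) and uses the lower bound of 8 GB/s per cell sustained around the clock; the variable names are illustrative.

```python
CELLS = 16                      # cells in the system (from the text)
DUPLEX_GB_PER_SEC = 8.0         # "more than 8" GB/s per cell (from the text)
SECONDS_PER_DAY = 24 * 60 * 60

# Assumes decimal units: 1 TB = 1000 GB.
tb_per_day = CELLS * DUPLEX_GB_PER_SEC * SECONDS_PER_DAY / 1000
print(tb_per_day)   # 11059.2
```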
HP sx2000 systems also work well in high-performance technical computing clusters, either as the
primary node type or as a “fat” node running SMP portions of the application. SMP systems like
the sx2000 provide a good approach for achieving huge CPU counts while keeping the interconnect
fabric costs under control. For example, using an interconnect fabric such as Quadrics QsNetII or
InfiniBand, a cluster of only 64 Itanium 2 9M-based rx8640 nodes provides 1024 processors with
more than 6.5 TFlops of peak compute, and can support up to 28 fabric rails with 1 GB/s per
node, per rail, for a total fabric bisection bandwidth of more than 1.6 TB/s.
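The cluster figures in this example can be cross-checked with a short calculation. The clock rate of 1.6 GHz and the rate of 4 floating-point operations per cycle (two fused multiply-adds) are assumptions about the Itanium 2 9M parts, not values stated in this paragraph. The last quantity is the sum of all per-node rail bandwidth, which is a related but different measure from the bisection bandwidth quoted above.

```python
NODES = 64                   # rx8640 nodes (from the text)
PROCESSORS = 1024            # total processors (from the text)
CLOCK_GHZ = 1.6              # assumption: 1.6-GHz Itanium 2 9M parts
FLOPS_PER_CYCLE = 4          # assumption: two fused multiply-adds per cycle
RAILS = 28                   # fabric rails (from the text)
GB_PER_RAIL = 1.0            # GB/s per node, per rail (from the text)

per_node = PROCESSORS // NODES
peak_tflops = PROCESSORS * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000
aggregate_rail_gb = NODES * RAILS * GB_PER_RAIL

print(per_node, round(peak_tflops, 2), aggregate_rail_gb)   # 16 6.55 1792.0
```

Under these assumptions the peak rate comes to about 6.55 TFlops, consistent with the figure of more than 6.5 TFlops.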
Server consolidation
Server consolidation is not an actual market segment, but is a practice that can be applied to market
segments like those described above to reduce IT costs. This method enables a user to
replace a large number of discrete systems with a smaller number of partitioned servers. Benefits of
this include reduced hardware costs, reduced application and OS licensing fees, fewer OS instances
to manage, and reduced facilities costs. It is also easier and faster to implement
a new application on a partitionable server than on a stand-alone server. This is because the provisioning of a
new stand-alone server usually requires making a purchase, or at least allocating a resource, whereas
adding a new virtual partition to an existing system is very fast.
The sx2000 servers are the ideal target for server consolidation initiatives. Each cell, which is the
building block of nPars, is a fully isolated computing unit. In HP competitors’ systems,
cache coherency traffic continues to flow between partitions, thereby creating potential single
points of failure and unwanted performance interactions. By contrast, sx2000 systems keep partition-local
coherency traffic entirely within the cells of the nPar. In addition to significantly reducing the risk of a
hardware failure affecting more than one partition, this maximizes the performance of a partitioned
system. To handle a user’s changing computing needs, cells can easily be added to a running
system, and can be dynamically re-allocated between partitions. This enables partitions to meet the
peak computing demands without requiring hardware to sit unused during normal hours, as is the
case with discrete systems.
Conclusion
HP Integrity and HP 9000 high-end and midrange servers running key business and research
applications such as OLTP, BI, VoD, and science and government applications, or systems being utilized to consolidate the
IT environment, will benefit from the advancements in the new sx2000 chipset. It provides greater
performance and scalability, enhanced availability and better manageability to support these
mission-critical workloads. In addition, the sx2000 chipset provides superior investment protection by
providing a foundation for taking advantage of future processor technologies. By providing more