Value Proposition for IBM POWER7 Based Blade Servers
Published in: Technology
  1. January 2011
     MANAGEMENT BRIEF
     Value Proposition for IBM POWER7 Based Blade Servers
     Analysis Based on User Experiences

     International Technology Group (ITG)
     4546 El Camino Real, Suite 230
     Los Altos, California 94022-1069
     Telephone: (650) 949-8410
     Facsimile: (650) 949-8415
     Email: info-itg@pacbell.net
  2. Copyright © 2011 by the International Technology Group. All rights reserved. Material, in whole or part, contained in this document may not be reproduced or distributed by any means or in any form, including original, without the prior written permission of the International Technology Group (ITG). Information has been obtained from sources assumed to be reliable and reflects conclusions at the time. This document was developed with International Business Machines Corporation (IBM) funding. Although the document may utilize publicly available material from various sources, including IBM, it does not necessarily reflect the positions of such sources on the issues addressed in this document. Material contained and conclusions presented in this document are subject to change without notice. All warranties as to the accuracy, completeness or adequacy of such material are disclaimed. There shall be no liability for errors, omissions or inadequacies in the material contained in this document or for interpretations thereof. Trademarks included in this document are the property of their respective owners.
  3. TABLE OF CONTENTS

     EXECUTIVE SUMMARY
       Differentiation
       Customers
       Futures
     TELCORDIA: GETTING IT RIGHT FIRST TIME
     UPMC: SUCCESS IN PARTITIONING
     DANCERACE: BUILDING A BUSINESS ON TECHNOLOGY
     UNIX SERVER BLADES
       Overview
       Platforms and Products
       Transitions
       HP Integrity
       IBM Power
       Sun T-Series
       Comparative Performance
       Virtualization Capabilities
       Partitioning
       Workload Management
       Availability Optimization
     X86 SERVER BLADES
       Differentiators
       Comparative Performance
       Virtualization Capabilities
       Virtualization Enablers
       Size and Scale
       Complexity
       Availability and Security
       Comparing Availability
       Security and Malware Resistance

     LIST OF FIGURES

     1. Latest-generation HP, IBM and Sun UNIX Blades
     2. HP Integrity Itanium 2- and 9300-based Models
     3. Comparative Performance: IBM POWER6 and POWER7 Based Blades
     4. Comparative Performance per Socket: POWER6 and POWER7 Based Blades
     5. Software-based Minimum Partition Sizes: HP, IBM and Sun Blades
     6. Hewlett-Packard Workload Manager Services
     7. POWER7 Based Systems Virtualization Capabilities
     8. Key POWER7 Availability Optimization Technologies
     9. Key AIX 7.1 Availability Optimization Features
     10. System Environment Layers: Example
     11. Major Components of VMware vSphere 4 Environment
  4. EXECUTIVE SUMMARY

     DIFFERENTIATION

     For almost a decade, blades have been one of the fastest-growing segments of the server market. In 2010, blades will probably account for more than 25 percent of x86 server sales. Demand for UNIX blades has also shown more rapid growth over the last few years.

     The appeal of both types of blade has been driven by common factors. Server consolidation has enabled organizations to realize space, energy and other savings. Capacity upgrades and provisioning have been facilitated. Network complexity has been reduced.

     However, although hardware packaging may be similar, blades from different vendors are far from the same. Variations in performance, virtualization, availability and other capabilities reflect differences in system architectures, processors and operating systems.

     These differences are clearly apparent when comparing UNIX blades from Hewlett-Packard (HP), IBM and Oracle’s Sun. The leadership position that IBM’s Power Systems have gained in the overall UNIX server market extends to blades built around the company’s POWER7 Architecture.

     POWER7 based blades are also significantly differentiated from their x86 counterparts. They deliver higher performance, more granular partitioning and more effective workload management than Windows and x86 Linux blades equipped with VMware and equivalents.

     The value proposition for POWER7 based blades is materially reinforced by higher availability and better security than competitive platforms. These affect not only the quality of service, but also the cost-effectiveness experienced by users.

     POWER7 based blades may not be appropriate for all applications. But it is important to understand how they differ from competitive platforms, and how the distinctive strengths of POWER7 Architecture may provide unique customer value.

     CUSTOMERS

     This report presents three case studies of such value.

     Telcordia, a leading player in the highly competitive market for mobile communications solutions, employs POWER7 based blades and the AIX operating system to deliver the real-time performance and 24/7 availability required by its customers.

     University of Pittsburgh Medical Center (UPMC) exploits POWER7 based blades, highly granular PowerVM partitioning and AIX to consolidate application and Web serving across its full range of business-critical systems.

     Dancerace, a 20-person UK company that offers online invoice discounting, factoring and trade financing, employs POWER7 based blades and the IBM i operating system to deliver services to a worldwide customer base that can afford neither delays nor downtime.

     All three organizations are recognized leaders in their respective industries. All selected POWER7 based blades and IBM BladeCenter chassis to handle demanding workloads, support growth and maintain continuous availability for applications that run their businesses.
  5. FUTURES

     For Telcordia, UPMC, Dancerace and many other organizations, the selection of POWER7 based blades was influenced as much by future as present needs. Use of blades is evolving in ways that play to the strengths of Power Architecture.

     When blades first came into widespread use in the early 2000s, they were employed primarily for “scale-out” applications requiring light-duty servers. Over time, however, blades have moved into broader roles. Databases, transactional and – increasingly – mixed workloads must be handled with the same efficiency and reliability as conventional platforms.

     This trend has intersected with another: growing use of virtualization. Deployment of blades, as well as the adoption of VMware, Xen and equivalents, has been driven by server consolidation. Support for multiple partitioned instances, and execution of the often-diverse workloads that these generate, have become key new requirements.

     Power Systems are industry leaders in these areas of capability. Among UNIX blade vendors, HP’s Integrity platform comes closest to Power strengths. Latest-generation Itanium 9300 Series-based Integrity blades, however, are outclassed by POWER7 in performance terms, and lack key reliability, availability and serviceability (RAS) features offered on larger Integrity models.

     Sun’s T-Series blades implement an architecture that was designed to handle high-volume, low-impact Internet workloads. It performs less well in other roles. Performance, virtualization, workload management, availability optimization and other capabilities are significantly weaker than those of HP Integrity and IBM Power Systems.

     Compared to x86 blades, key POWER7 differentiators include higher performance as well as the integration, stability and resilience of IBM AIX and i; the ability of PowerVM to support higher concentrations of diverse guest workloads; and industry-leading automation features that enable greater operating efficiency and significantly reduce administrative overhead.

     There are also a number of differences in blade chassis design that tend to favor POWER7 based BladeCenter systems over HP BladeSystem, Sun Blade 6000 and other competitive equivalents.

     This does not mean that POWER7 based blades are appropriate in all roles. There are numerous applications, particularly in the x86 space, where this may not be the case. HP, Sun and x86 blades can provide a great deal of value to organizations consolidating servers. But for the most demanding and business-critical applications, POWER7 based systems are strong candidates.
  6. Telcordia: Getting It Right First Time

     This case study is based on an interview with Richard Goldberg, Product Manager, Service Delivery Solutions for Telcordia. Richard is responsible for the Telcordia Converged Application Server, which is built upon IBM BladeCenter systems with POWER blades.

     Telcordia provides software, turnkey systems and services to communications services suppliers worldwide. Based in Piscataway, New Jersey, it was formed in 1984 and was originally known as Bell Communications Research or Bellcore. Currently, it is a privately held company employing more than 2,800 people and operating in 15 countries. Approximately half of its business is outside the U.S.

     Telcordia provides service delivery and charging, operational support system (OSS) and interconnection solutions, along with research and consulting. Customers include landline, mobile and converged operators, Internet service providers (ISPs), government entities and companies offering multiple types of service.

     Telcordia has expanded rapidly during the 2000s, particularly in fast-growth, highly competitive communications markets in Asia and Europe. The company has won numerous industry awards for innovation, product quality and network design. It is widely regarded as an industry leader in several areas including real-time operational, policy management and billing solutions for converged services.

     Since 2005 Telcordia has employed IBM POWER based BladeCenter servers to host the core platform of its Service Delivery Suite, which includes the Telcordia Converged Application Server.

     The Telcordia Converged Application Server incorporates a base set of applications for real-time policy, charging, service creation and converged and interactive services. These may be configured with additional Telcordia modules, or employed to develop customized service solutions for individual operators. Systems employ clusters of multiple blades to form a highly available solution.

     Telcordia’s next-generation version of the Converged Application Server, which will be delivered in the first quarter of 2011, will be built around POWER7 based blades. The company evaluated, but decided not to use, late-model POWER6+ based BladeCenter JS23 and JS43 blades. Price/performance for POWER7 based models equipped with DDR3 RAM and a higher-speed memory bus was found to be superior. A number of competitive platforms were similarly rejected.
  7. Use of BladeCenter systems, according to the company, provides the flexibility to meet a wide range of customer needs, and allows for rapid, non-disruptive scaling. Many of the company’s customers experience high levels of business growth over long periods. The Telcordia solution is designed to scale from entry-level configurations to systems supporting more than 100 million customers.

     Other Telcordia requirements that POWER based BladeCenter servers have met include high levels of performance to support real-time operations, including an in-memory database.

     Availability is also critical. The company advertises “better than five nines” uptime for the solution. The applications it supports typically operate on a 24/7 basis, and outages can translate into serious customer losses.

     Telcordia also cites the stability and predictability of AIX as a key factor in its commitment to POWER BladeCenter servers. Telcordia bundles not only application offerings, but also complex custom middleware into its solutions. Close integration and optimization, and exhaustive testing of overall system packages are mandated.

     Operating system problems would not be welcome at any time during the lifecycle of the next-generation solution. For this reason, Telcordia closely reviewed the IBM AIX Roadmap. It was determined that this roadmap would meet its requirements for the foreseeable future.

     In marketing to communications services providers, the company claims that:

       “As the global leader in the development of mobile, broadband and enterprise software and services, Telcordia is known for getting it right the first time.”

     No argument there.
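To put the “better than five nines” figure in concrete terms, the short Python sketch below converts an availability level expressed in nines into the maximum downtime it allows per year. It is an illustration only: the function name and the choice of a 365.25-day average year are assumptions made here, not part of the Telcordia material.

```python
# Illustrative arithmetic only; names and the 365.25-day year are assumptions.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap days


def allowed_downtime_minutes(nines: int) -> float:
    """Maximum yearly downtime (in minutes) for an availability of `nines` nines.

    Five nines means 99.999 percent availability, i.e. an
    unavailability fraction of 10 ** -5.
    """
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

At five nines the budget works out to a little over five minutes of downtime per year, so “better than five nines” implies less than that across all planned and unplanned outages.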
  8. UPMC: Success in Partitioning

     This case study is based on an interview with Iftekhar Kazi (Senior Enterprise Architect) at the University of Pittsburgh Medical Center (UPMC) based in Pittsburgh, Pennsylvania. Additional information was supplied by Bill Hirsch (Manager of Systems Support for AIX).

     UPMC and Power

     The University of Pittsburgh Medical Center (UPMC) is one of the largest and most respected health care organizations in the United States.

     UPMC is a not-for-profit organization affiliated with the University of Pittsburgh Schools of the Health Sciences. It runs industry-leading programs in transplantation, oncology, neurosurgery, psychiatry, orthopedics, sports medicine and other areas, and has been nationally recognized by business and industry groups for innovation and service excellence.

     UPMC currently operates 20 hospitals with 4,000+ inpatient beds, along with more than 400 clinics and outpatient locations in Western Pennsylvania, and health care facilities in Europe. It employs more than 2,700 doctors. Its health insurance arm covers more than 1.4 million members.

     UPMC has grown during the 2000s through expansion of existing programs and facilities, along with acquisitions and joint ventures that have expanded its local and global presence. Between 2005 and 2009, operating revenues grew from $5 billion to almost $8 billion, while numbers of employees increased from 40,000 to 50,000.

     UPMC standardized on IBM POWER based systems and AIX in 2005, and now uses these for all systems. Patient care, scheduling, billing, Oracle’s PeopleSoft and other applications run on a variety of IBM POWER based platforms, ranging from blades to a high-end Power 595 server.

     Growth posed challenges for the UPMC IT infrastructure. After researching a number of options, UPMC decided upon an aggressive program to deploy Power and AIX partitioning.

     By using LPARs and micro-partitions to improve capacity utilization, the organization was able to reclaim around 50 percent of its processor capacity. Workload growth could thus be supported while limiting new hardware and software investments.

     A further benefit was that capacity could be brought online more rapidly, and at lower cost. Time to provision a new server was reduced from a day or more to a few minutes.

     By the time this exercise had been completed, UPMC had become an industry leader in the use of Power partitioning and workload management. A key lesson learned is that workloads must be understood in detail, and monitored and managed with high levels of granularity.
  9. Blade Deployments

     In 2008, the Center decided to deploy blade servers to consolidate application and Web serving workloads that required low levels of disk and I/O throughput. The strategy was to employ AIX Micro-Partitioning to consolidate large numbers of instances onto fewer physical machines.

     [Photo: UPMC Data Center: Racks and Blades]

     After initially deploying POWER6 based blades, UPMC moved to POWER7 technology as soon as this became available. The Center has installed six 16-core PS702s with 256GB of memory, hosting up to 40 partitions each (compared to 24 for POWER6 based blades).

     The higher performance and scalability of POWER7 based blades allows the Center to achieve higher levels of concentration than with POWER6 based equivalents.

     UPMC also employs pools of shared processors, along with Power system mechanisms that allow these to use spare cycles in dedicated partitions. System resources can be re-allocated in a rapid and flexible manner as needs change.

     According to UPMC, consolidation has been facilitated by the IBM BladeCenter Open Fabric Manager (BOFM). This enables high-speed switching across multiple BladeCenter chassis, allowing blades to be located, and new capacity activated, at any point within the Center’s blade infrastructure. BOFM also provides automated failover between blades in different chassis in the event that a failure occurs.

     Other benefits of employing POWER based blades include lower operating costs. Footprints and energy consumption have been reduced not only for servers, but also for network connections. A BladeCenter, according to UPMC, requires eight to ten network cables rather than the 60 to 80 that would have been required if the same workloads had been deployed on conventional servers.

     [Photo: UPMC Data Center: Control Center]

     UPMC plans to build upon its successes by implementing more granular service-level management, extending automated provisioning across all hardware and AIX components, and more closely integrating server and storage virtualization. The Center has been an early adopter of IBM’s SAN Volume Controller (SVC), and sees important synergies between this and its POWER and AIX based environment.

     UPMC prides itself on being an innovator in health care. But that, clearly, is not the only area in which it is on the industry’s leading edge.
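The consolidation figures quoted for UPMC's blade deployment can be worked through with simple arithmetic. The Python sketch below is a rough illustration: it treats the quoted "up to 40" and 24 partitions per blade as uniform upper bounds and assumes all partitions are interchangeable, neither of which the case study states. All function and constant names are hypothetical.

```python
# Rough arithmetic on figures quoted in the UPMC case study.
# Assumption: per-blade partition counts are uniform upper bounds.
BLADES = 6               # PS702 blades installed
PARTITIONS_POWER7 = 40   # "up to 40 partitions each" on POWER7 based blades
PARTITIONS_POWER6 = 24   # figure cited for POWER6 based blades


def max_partitions(blades: int, per_blade: int) -> int:
    """Upper bound on partitions hosted by a set of identical blades."""
    return blades * per_blade


def blades_needed(partitions: int, per_blade: int) -> int:
    """Blades required to host `partitions`, rounding up (ceiling division)."""
    return -(-partitions // per_blade)


def cable_reduction(blade_cables: int, conventional_cables: int) -> float:
    """Fraction of network cables eliminated by moving to a BladeCenter."""
    return 1 - blade_cables / conventional_cables


if __name__ == "__main__":
    total = max_partitions(BLADES, PARTITIONS_POWER7)
    print("Partitions on six PS702s (upper bound):", total)
    print("POWER6 based blades for the same count:",
          blades_needed(total, PARTITIONS_POWER6))
    # Quoted ranges: 8 to 10 cables per BladeCenter versus 60 to 80.
    print(f"Cable reduction, best case: {cable_reduction(8, 80):.0%}")
    print(f"Cable reduction, worst case: {cable_reduction(10, 60):.0%}")
```

Under these assumptions the six blades host at most 240 partitions, which would take ten blades at the POWER6 density, and the cable count falls by roughly 83 to 90 percent.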
  10. Dancerace: Building a Business on Technology

     This case study is based on an interview with Anthony Avison, Chief Executive of Dancerace plc. The company is based in the city of Bath in the United Kingdom. Avison founded the company in 1992 and has personally led its technology strategy since that time.

     Dancerace is one of a new breed of companies: a global business that processes £ billions with around 20 employees.

     Dancerace specializes in software and services for invoice discounting, factoring and trade financing; i.e., raising funds from receivables. In a tight credit market, this approach has proved popular among businesses of all sizes.

     Facing larger players in a near-saturated market, for almost 20 years Dancerace has sought to differentiate itself through the way in which it uses IT. The company was the first in its industry to offer products over the Internet, and has used a combination of leading-edge proprietary applications and latest-generation IBM technology to gain and retain customers.

     Continuing this tradition, Dancerace was the first IBM customer in the UK to deploy POWER7 based blade servers.

     The company’s business is entirely online. Customers are located in Europe, Australia and Asia, as well as in developing countries in Africa, where Dancerace is involved in local micro-financing initiatives.

     Dancerace’s core system must meet exacting requirements. Workloads for individual customers vary widely. The largest runs to millions of transactions, but there are also customers for whom Dancerace services comparatively small portfolios. The system must be capable of handling diverse, often fluctuating workloads and of growing rapidly as the business expands.

     In 2008, Dancerace replaced its earlier IBM System i server with a BladeCenter chassis and three POWER6 based blades running IBM i. Two years later, the company redeployed these as standby failover and recovery systems and substituted a new BladeCenter H with two POWER7 based PS700 blades for production.

     The two new blades were able to handle the same workloads as the three POWER6 based models. A third PS700 was later added to support growth.

     Key benefits of deploying IBM PS700s, according to Dancerace, included not only higher performance, but also partitioning capabilities, and support for the IBM i operating system – which is valued by Dancerace for its reliability, stability and ease of management.

     Lower electricity consumption is also seen as important. Dancerace describes itself as a “green” business.

     Fourteen micro-partitions of varying sizes run on production, and seven on standby blades. Workloads for individual customers are hosted in micro-partitions, which can be modified automatically if their needs change. It is also easy to add new partitions and/or blades as business expands.
  11. The ability to maintain continuous uptime also played a key role in Dancerace’s choice of PS700s. Twenty-four/seven availability is critical for the company.

     Because customers depend upon Dancerace for short-term financing needs, even brief outages could have a significant impact on their bottom lines. Dancerace could lose customers, and its reputation could be undermined.

     This has not occurred. As the company’s Website notes:

       “We’re easily in the top two suppliers to this market in the world but only we can boast that none of our clients have lost a business day due to a fault of ours since the first Dancerace systems were launched in 1994.”

     Which has proved to be a useful marketing message.

     Management also put a more sophisticated disaster recovery infrastructure in place. The objective was to provide a further level of protection against the effects of network and power outages, fires, extreme weather conditions and other events that might disable the company’s main data center.

     [Diagram: Dancerace Data Center]
       Primary site: DS4800; BladeCenter H; PS700 servers; 14 virtual servers/blade; IBM i operating system; BOFM
       Standby site: DS3200; BladeCenter S; JS12 servers; 7 virtual servers/blade; IBM i operating system; BOFM
       Link between sites: duplexed, fault-tolerant fiber optics; 10 miles (16 kilometers)

     The POWER6 based failover and recovery configuration is located at a secure site approximately 10 miles away. An in-house solution built on the remote journaling feature of the IBM i operating system has been put in place, enabling service to be resumed in, at most, a few minutes. Duplexed, fault-tolerant fiber optic networks link the main and standby sites.

     Customers have been impressed by these arrangements, which have also proved useful in attracting new accounts.

     Dancerace has also invested in advanced data center infrastructures, SAN-connected IBM disk systems and other state-of-the-art capabilities. Quality of technology, according to management, has played an important role in the company’s success.

     Not bad for a 20-person company. As the British say: brilliant.

     IBM Business Partner Imtech ICT and IBM personnel provided valued assistance in deploying and supporting Dancerace’s blade-based systems.
  12. UNIX SERVER BLADES

     OVERVIEW

     Over the last decade, three players – HP, IBM and Sun – have dominated the UNIX server market. These companies are also the major players in UNIX server blades.

     All three companies market UNIX and x86 server blades for use in common BladeSystem (HP), BladeCenter (IBM) and Blade 6000 (Sun) chassis. IBM and Sun began to offer UNIX server blades in 2003, and HP in 2005.

     IBM and HP have offered blade models of their Power and Integrity platforms since that time. Sun has offered a broader mix of products built around Sun UltraSPARC (now withdrawn) and T-Series processors, along with x86 models running the Solaris operating system.

     Sun x86 blades have enjoyed steady demand among users seeking to replace the company’s older SPARC-based servers. Sales volume for T-Series blades has been significantly lower, for reasons detailed later in this section. Market share calculations are often confused by the fact that the x86 version of Solaris is also supported on HP, IBM and other vendors’ x86 blades.

     This section deals with HP Integrity, IBM Power and Sun T-Series blades. Vendor platforms and products are first outlined, then capabilities in three key areas – performance, virtualization and availability optimization – are compared.

     The following section compares x86 and UNIX server blades, focusing primarily on differences between x86 and POWER7 based systems.

     PLATFORMS AND PRODUCTS

     Transitions

     During 2010, the UNIX server product lines of HP, IBM and Sun have undergone significant changes.

     HP Integrity systems have moved from Intel Itanium 2 to next-generation Itanium 9300 Series processors, while IBM Power Systems have transitioned from POWER6 to the POWER7 generation of architecture. These shifts, which were announced in April, have been reflected in blade product lines.

     HP now markets Integrity blades based on quad-core Itanium 9300 Series processors with rated frequencies of 1.33 to 1.73 GHz, while IBM Power blades employ quad-core 3.0 GHz POWER7 processors. IBM also offers POWER7 six- and eight-core processors with frequencies of up to 4.0 GHz in other server forms.

     In September 2010, Sun announced a third generation of T-Series systems (the first generation was introduced in 2005) built around 1.65 GHz T3 processors. These systems included a single-socket blade model, the T3-1B, which may be configured with an 8- or 16-core T3 processor.

     In addition to their principal chassis, the c7000 and BladeCenter H respectively, HP and IBM offer the smaller-format c3000 and BladeCenter S. IBM offers a compact version of the BladeCenter H chassis designated BladeCenter E.

     HP, IBM and Sun also offer modified chassis that comply with U.S. Network Equipment – Building System (NEBS) standards for communications carrier applications.
  13. HP and IBM offer blade fabric management tools, Virtual Connect and BladeCenter Open Fabric Manager (BOFM) respectively. These handle switching and failover processes across multiple chassis. HP offers its own line of switches, while IBM supports Blade Network Technologies, Brocade, Cisco Systems and other third-party offerings. There is no Sun equivalent.

     These products are summarized in figure 1. All three vendors also continue to market older blades.

     Figure 1: Latest-generation HP, IBM and Sun UNIX Blades

     Chassis (number of blades, size)
       HP: BladeSystem c7000 (16x, 10U); BladeSystem c3000 (8x, 6U)
       IBM: BladeCenter H (14x, 9U); BladeCenter E (14x, 7U); BladeCenter S (6x, 7U)
       Sun: Blade 6000 (10x, 10U); Blade 6048 (48x, 42U)

     Model(s)
       HP: BL860c i2: 1-socket 1.6 GHz or 2-socket 1.33, 1.6 or 1.73 GHz; up to 192GB main memory
           BL870c i2: 2- or 4-socket 1.33, 1.6 or 1.73 GHz; up to 384GB main memory
           BL890c i2: 4- or 8-socket 1.33, 1.6 or 1.73 GHz; up to 768GB main memory
       IBM: PS700 Express: 1-socket 3.0 GHz; up to 64GB main memory
            PS701 Express: 2-socket 3.0 GHz; up to 128GB main memory
            PS702 Express: 4-socket 3.0 GHz; up to 256GB main memory
       Sun: T3-1B: 1-socket 8- or 16-core T3 1.65 GHz; up to 128GB main memory

     Operating systems
       HP: HP-UX, Windows*, RHEL*, SLES
       IBM: AIX, IBM i, RHEL, SLES
       Sun: Solaris

     Threads per core
       HP: 2
       IBM: 2–4
       Sun: 16

     Fabric management
       HP: Virtual Connect: 1-4 chassis; Virtual Connect Enterprise Manager: 5-100 chassis
       IBM: BladeCenter Open Fabric Manager (BOFM): 1-100 chassis
       Sun: No equivalent

     *Current versions

     HP Integrity blades support the HP-UX operating system as well as Windows Server, Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES). IBM POWER7 based models support IBM AIX and i, along with RHEL and SLES. An optional facility, Lx86, also supports older 32-bit Linux applications. Sun T-Series blades support only Solaris.

     IBM i is the latest version of an operating system that has been employed by over 200,000 organizations worldwide on IBM AS/400, iSeries and System i platforms, in some cases for more than 20 years. Since 2008, IBM i has run on the same POWER based hardware platforms that support AIX.

     The IBM i operating system has proved popular among small, midsize and some large corporate users, and is generally recognized as one of the most closely integrated, automated and reliable system environments in existence. It is employed at Dancerace, one of the case studies presented in this report.

     HP Integrity

     HP’s UNIX server strategy has, during the 2000s, been dominated by its commitment to Intel Itanium processors for its Integrity platform.

     The original Itanium design, which was developed by HP and Intel in the late 1990s, was intended to address the high-end RISC and volume microprocessor markets with a single chip. A series of performance shortfalls and technical problems resulted, however, in weak early market penetration. Intel shifted its volume focus to the Xeon family.
By the mid-2000s, Integrity systems were outperformed by IBM Power equivalents. The Integrity market position was undermined by delays in upgrading Itanium performance. The most recent “Tukwila” or Itanium 9300 Series generation of processors, in particular, was originally scheduled for release in 2007, but appeared only in February 2010.

One result is that Itanium processors have not matched progress in POWER technology. Current Itanium 9300 Series processors are four-core chips with rated frequencies of 1.33 GHz to 1.73 GHz. IBM POWER7 processors range from four-core 3.0 GHz (employed in Power blades) to eight-core 4.0 GHz chips.

Integrity market momentum has also been undermined by relatively weak appeal for Windows and Linux deployment. Although Windows, RHEL and SLES have been supported on Integrity systems since they were first introduced, the majority of installations – more than 80 percent, according to most industry estimates – run HP-UX. Windows deployments are believed to account for 5 to 10 percent.

In moving to Itanium 9300 Series processors, HP has replaced earlier Integrity rack-mount and tower models. Apart from an entry-level rx2800 designed for remote office and small business use, the entire Integrity product line is now blade based. The company appears to have abandoned the market space previously occupied by its midrange rack-mounted rx6600, rx7640 and rx8640. Figure 2 illustrates this transition.
Figure 2
HP Integrity Itanium 2- and 9300-based Models

ITANIUM 2-BASED
Model         BL860c/BL870c   rx2660/rx3600   rx6600        rx7640/rx8640         Superdome
Cores         2-4             1-4             2-8           2-32                  2-128
Partitioning  HPVMs           HPVMs           HPVMs         nPars, vPars, HPVMs   nPars, vPars, HPVMs
Form          Blade           Rack, Tower     Rack, Tower   Rack                  Rack

ITANIUM 9300 SERIES-BASED
Model         rx2800 i2       BL860c i2       BL870c i2     BL890c i2             Superdome 2
Cores         2-8             2-4             4             4-32                  16-128
Partitioning  HPVMs           HPVMs           HPVMs         HPVMs                 nPars, vPars, HPVMs
Form          Rack, Tower     Blade           Blade         Blade                 Blade

HPVMs = Integrity Virtual Machines

At the high end of the line, the Superdome 2 consists of Itanium 9300-based blades grouped in 8-, 16- and 32-socket configurations. These implement HP nPar hard partitioning technology. However, nPars are not supported on other Itanium 9300 Series-based models, including blades. In the past, HP has characterized nPars as essential to maintaining “business-critical” availability.

According to HP, the company remains committed to the Integrity platform, and plans to use future Intel Itanium processors. These are expected to include “Poulson” and “Kittson” processors said by Intel to be scheduled for 2012 and 2014 respectively. No details of these have been released.
IBM Power

Compared to the experiences of HP and Sun, the evolution of IBM Power Systems resembles the story of “the tortoise and the hare.” The Power platform maintained a steady pace of price/performance gains and functional improvement through its POWER5 (2004), POWER6 (2007) and POWER7 (2010) generations. Over this period, Power Systems emerged as the overall market share leader in UNIX servers.

In contrast to the Intel Itanium 9300 Series and Sun T3 introductions, the POWER7 generation represents a significant architectural transition. At the chip level, the POWER7 design combines higher core densities with features that deliver high levels of performance from a comparatively small chip.

POWER7 processors embed 1.2 billion transistors, compared to 2 billion for the Intel Itanium 9300 and Sun T3, and 2.3 billion for the eight-core Intel 7500 Series (Nehalem EX). The POWER7 outperforms both. Reduced transistor counts contribute to both faster speeds and lower energy consumption.

Other new POWER7 capabilities include:

• Intelligent Threads. The maximum number of threads supported per core increases from two for POWER6 processors to four for POWER7. The IBM implementation allows workloads to be executed using one, two or four threads per core. The approach is highly automated. Systems can automatically determine which to use for optimum performance, or system administrators may select the number of threads employed. In automatic mode, the system provides continuous optimization of performance, which materially facilitates execution of heterogeneous workloads.

• Intelligent Cache. The POWER7 cache structure provides 256KB of on-chip Level 2 (L2) cache per core, and 32 MB of shared Level 3 (L3) on-chip cache per processor. As with numbers of threads, the amount of cache employed for specific workloads may be determined automatically by the system, or set by system administrators.

• Active Memory Expansion.
This enables system-managed compression and decompression of data in memory. POWER7 is the first major commercial processor to offer this capability. Compression rates of up to 50 percent are supported; i.e., useable main memory may be up to double physical memory. Exploitation of Active Memory Expansion is again highly automated. Compression and decompression may be turned on and off by the system to optimize performance for partition based workloads. The amount of memory made available in this manner may be subject to system-level priorities.

According to IBM, a next generation of “POWER8” architecture is under development, although no details have been released. It is expected that this will appear in the 2013 to 2014 timeframe.

Sun T-Series

Although there are significant differences between them, both Integrity and Power systems draw upon Reduced Instruction Set Computing (RISC) design concepts intended to yield high levels of performance for multiple types of workload.
In contrast, T-Series systems employ a distinctive architecture originally developed by Afara Websystems, which was acquired by Sun Microsystems in 2002. This employs a combination of low-frequency processors and large numbers of threads – up to 16 per core on latest-generation T3 systems. It was designed to support high-volume, low-latency Internet workloads.

Since the first T-Series models were introduced in 2005, they have been deployed primarily by Internet Services Providers (ISPs), communications services providers and others for this type of workload. T-Series systems have proved less effective for database- and transaction-intensive workloads, which tend to run more efficiently using single threads on higher-frequency cores.

This limitation is recognized by Oracle, which recommends that T-Series systems should not be employed for database workloads sensitive to response time, or for batch or “heavyweight” single-threaded applications. Most commercial applications operate in single-threaded mode.

Oracle has committed to maintain and enhance the T-Series until 2015, after which the company plans to introduce a new SPARC processor platform. No further details have been released.

COMPARATIVE PERFORMANCE

In comparing performance of Integrity, POWER and T-Series blades, certain caveats are in order. At the time of writing, there was limited user experience with Itanium 9300- and POWER7 based blades, which were introduced in April 2010, and with Sun SPARC T3 systems, which were introduced in September 2010.

Comparative performance of previous generations of systems, however, may be taken as a general baseline. Prior to 2010 introductions, IBM POWER6 based systems outperformed competitive platforms with the same number of cores by margins of two to three times.

In transitioning to the POWER7 generation of systems, Power Systems performance has increased substantially.
Within the IBM BladeCenter product line, performance per core for POWER7 based models has increased by 49 to 53 percent compared to POWER6 based and 19 to 24 percent compared to POWER6+ based models. Figure 3 shows comparisons.

Figure 3
Comparative Performance: IBM POWER6 and POWER7 Based Blades

Model   Sockets   Processor   Cores           rPerf     rPerf/Core
POWER6 based
JS12    1         POWER6      2 x 3.8 GHz     14.71     7.36
JS22    2         POWER6      4 x 3.8 GHz     30.26     7.57
JS23    2         POWER6+     4 x 4.2 GHz     36.28     9.07
JS43    4         POWER6+     8 x 4.2 GHz     68.20     8.53
POWER7 based
PS700   1         POWER7      4 x 3.0 GHz     45.13     11.28
PS701   2         POWER7      8 x 3.0 GHz     81.24     10.16
PS702   4         POWER7      16 x 3.0 GHz    154.36    9.65

Across the Power product line as a whole, performance typically increased by 40 to 60 percent compared to POWER6 based systems introduced in 2007, and by 20 to 30 percent compared to systems based on POWER6+ processors introduced in 2008 and 2009.
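As a cross-check on the rPerf/Core column of figure 3, the following Python sketch divides each published rPerf rating by the number of cores. The rPerf values and core counts are copied from figure 3; the names BLADES and rperf_per_core are illustrative only, and the half-up rounding to two decimals is an assumption about how the column was derived.

```python
from decimal import Decimal, ROUND_HALF_UP

# (cores, rPerf) per blade model, copied from figure 3.
# BLADES and rperf_per_core are illustrative names, not IBM terminology.
BLADES = {
    "JS12": (2, "14.71"),
    "JS22": (4, "30.26"),
    "JS23": (4, "36.28"),
    "JS43": (8, "68.20"),
    "PS700": (4, "45.13"),
    "PS701": (8, "81.24"),
    "PS702": (16, "154.36"),
}


def rperf_per_core(cores, rperf):
    """Return rPerf divided by core count, rounded half-up to two decimals."""
    value = Decimal(rperf) / Decimal(cores)
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model, (cores, rperf) in BLADES.items():
    print(f"{model}: {rperf_per_core(cores, rperf)} rPerf per core")
```

Decimal arithmetic is used rather than floats so that values falling exactly on a rounding boundary, such as 14.71 / 2, round the same way on every run.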
These comparisons are based on IBM rPerf metrics, which measure comparative performance of POWER based systems for commercial workloads. Users have found them to be generally reliable. There are no HP or Sun equivalents.

Performance increases appear to be greater than for latest-generation HP Integrity and Sun T-Series systems. A TPC-H 1TB decision support benchmark published by HP indicates, for example, an average per core performance increase of 14 percent for Superdome 2 systems based on Itanium 9300 Series 1.73 GHz quad-core processors compared to previous-generation Superdome systems based on Itanium 2 1.6 GHz dual-core processors.

The probability is that the Sun T-Series transition from 8-core 1.6 GHz to 16-core 1.65 GHz processors also represents, at best, a minor improvement in per core performance. The SPARC T3 processor consists of two T2 processors embedded on a single chip. It does not incorporate significant functional changes that might further accelerate performance.

It may thus be reasonably concluded that POWER7 based systems have retained an approximately two to three times advantage against latest-generation competitive systems with the same number of cores.

In practice, as the density of cores per processor is significantly higher for latest-generation systems, performance per socket has increased more dramatically. Within the BladeCenter product line, for example, the evolution has been as shown in figure 4. Comparisons are again based on IBM rPerf ratings.
Figure 4
Comparative Performance per Socket: POWER6 and POWER7 Based Blades

                                              rPerf
SINGLE-SOCKET SERVERS
JS12 – 2 x POWER6 3.8 GHz cores               14.71
PS700 – 4 x POWER7 3.0 GHz cores              45.13
DUAL-SOCKET SERVERS
JS23 – 4 x POWER6+ 4.2 GHz cores              36.28
PS701 – 8 x POWER7 3.0 GHz cores              81.24
FOUR-SOCKET SERVERS
JS43 – 8 x POWER6+ 4.2 GHz cores              68.2
PS702 – 16 x POWER7 3.0 GHz cores             154.36

In these comparisons, single-socket performance has increased by more than three times, and both dual-socket and four-socket performance more than doubled. Price/performance ratios also, as is noted in the Telcordia case study, appear to have improved significantly.
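The multiples quoted for figure 4 can be reproduced with a few lines of arithmetic. The Python sketch below computes the ratio of POWER7 to POWER6 or POWER6+ rPerf for each pairing in the figure. The ratings are copied from figure 4; the names PAIRS and speedup are illustrative only.

```python
# (class, older blade, older rPerf, POWER7 blade, POWER7 rPerf), from figure 4.
# PAIRS and speedup are illustrative names introduced for this sketch.
PAIRS = [
    ("single-socket", "JS12", 14.71, "PS700", 45.13),
    ("dual-socket", "JS23", 36.28, "PS701", 81.24),
    ("four-socket", "JS43", 68.20, "PS702", 154.36),
]


def speedup(old_rperf, new_rperf):
    """Return how many times higher the newer rPerf rating is."""
    return new_rperf / old_rperf


for label, old_model, old_rperf, new_model, new_rperf in PAIRS:
    ratio = speedup(old_rperf, new_rperf)
    print(f"{label}: {new_model} vs {old_model} = {ratio:.2f}x")
```

Because each pairing holds the socket count constant, the ratio of whole-blade ratings is the same as the ratio of per-socket ratings.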
VIRTUALIZATION CAPABILITIES

Partitioning

Although virtualization and partitioning are often equated, in practice they are different. Virtualization is a broader concept that extends to management of system resources in partitioned environments. These capabilities are discussed separately below.

In comparing the capabilities of HP Integrity, IBM Power and Sun T-Series blades, three types of partitioning should be addressed:

1. Hard partitioning refers to hardware- or microcode-based methods that allow better isolation of workloads than software-based equivalents. Advantages include improved manageability – workloads are less likely to interfere with each other – and reduced security exposure.

HP, IBM and Sun offer hard partitioning on at least some of their UNIX server platforms. Only IBM offers this capability on blades.

Hewlett-Packard’s strategic hard partitioning technology, nPars, is supported only on Superdome 2 systems. It requires specialized, comparatively expensive “cell blades,” which duplicate the functions of cell boards employed in earlier Integrity Superdome systems.

Sun offers a hard partitioning capability, Dynamic Domains, on its M-Series servers, which are not available in blade form. There is, however, no T-Series equivalent.

IBM logical partitions (LPARs), which are implemented in PowerVM microcode, may be configured in increments as small as 1/10th of a core. Up to 10 LPARs are supported per core, for totals of between 40 and 160 per blade. LPARs may host AIX, IBM i, RHEL and SLES instances, or combinations of these.

2. Software-based partitioning, in this context, refers to software-based techniques employed to host multiple operating system instances. All three vendors offer such techniques. There are variations in granularity that are shown in figure 5.
Figure 5
Software-based Minimum Partition Sizes: HP, IBM and Sun Blades

HP INTEGRITY
    HPVMs – 1/20th core
IBM POWER7
    Micro-partitions – 1/10th core initial increment
                     – 1/100th core subsequent increment
SUN T-Series
    Oracle VM Server for SPARC – 1/8th core

IBM micro-partitions must initially be configured in increments of 1/10th, but may be expanded in later increments of 1/100th of a core. HPVMs and Oracle VM Server for SPARC were formerly known as Integrity Virtual Machines (IVMs), and Sun Logical Domains (LDOMs) respectively.

The capabilities of software-based techniques overlap, to some extent, those of hard partitioning equivalents. Hard partitions continue, however, to be preferred by many organizations for their most performance- and availability-sensitive applications, and for database serving.

Among the case studies presented in this report, for example, UPMC employs micro-partitions to consolidate application and Web serving. Consolidation of database servers, however, is handled by LPARs.
One reason for this preference is that, in a multitier architecture, database servers represent the main point of vulnerability. If an application or Web server fails, an organization can switch to an alternate. Loss of a database server, however, will disable the entire system. Loss of database contents may have even more serious repercussions.

3. Application partitioning, in this context, means techniques that allow system resources to be allocated to specific applications sharing a common operating system instance in a partition. Application partitioning is typically employed for development, test and other non-production instances as well as for light-duty production applications.

The best-known approach, Oracle Containers and Zones (Containers provide system resource controls, while Zones define partitions), is supported on Sun T-Series as well as other Solaris platforms. IBM Workload Partitions (WPARs) and HP Secure Resource Partitions provide functionally similar capabilities.

A fourth type of partitioning, Virtual I/O Server (VIOS), is offered only by IBM. VIOS partitions, which are supported on all Power based systems, including blades, allow operating system instances in multiple LPARs to share a common pool of LAN adapters as well as Fibre Channel, SCSI and RAID devices; i.e., it is not necessary to dedicate adapters to individual partitions. Numbers of physical adapters required may be significantly reduced.

It can be expected that this capability will become increasingly significant as processors become more powerful, and numbers of partitions on individual blades increase. Apart from cost savings, management complexity and energy consumption may be significantly reduced. Redundant VIOS may be employed.

IBM is the only vendor that supports its full range of partitioning capabilities on blades.

Workload Management

Partitioning creates the potential for high levels of concentration and capacity utilization.
The extent to which these are realized in practice, however, depends upon the mechanisms that (1) allocate and re-allocate system resources and (2) monitor and control execution processes across and within partitions.

If these mechanisms are ineffective, several risks arise. Low-priority processes may draw resources away from business-critical applications. Contention for resources may degrade system-level performance and cause outages. Utilization goals may not be realized because IT organizations leave a great deal of spare capacity to allow for these effects.

Risks are compounded when workloads are subject to sustained growth, or fluctuate, and they are compounded further when both occur.

In order to deal with these challenges, HP, IBM and Sun all offer such capabilities as workload prioritization, dynamic partitioning, shared processor pools, allocation of processor and memory resources based on performance, service level or other targets, and capping (i.e., setting limits on the resources that may be consumed by a particular partition or workload).

HP and IBM approaches are, however, more effective at handling dynamic resources, and in the extent of their integration with mechanisms that manage processor, memory, I/O and other system resources. The HP Workload Manager (WLM), for example, offers the range of services summarized in figure 6.
Figure 6
Hewlett-Packard Workload Manager Services

• Adjust processor resources based on workload priorities
• Migrate cores between virtual partitions
• Adjust number of cores in processor sets
• Manage resources inside virtual machines
• Adjust resource allocations by time of day, system events or application metrics
• Ensure critical workloads have sufficient resources to perform at required levels
• Set minimum & maximum amounts of CPU & memory available to workloads
• Grant workloads dedicated processor sets
• Grant workload CPU resources according to a metric, such as number of processes
• Optimize performance for multiple workloads on a single system
• Monitor resource consumption by applications or users

Capabilities for POWER7 based systems are similar but broader, and are implemented using a combination of workload management features built into IBM AIX and i, and into the PowerVM hypervisor. These capabilities are illustrated in figure 7.

Figure 7
POWER7 Based Systems Virtualization Capabilities
(Diagram labels: AIX 7.1; Workload Manager; Intelligent Threads; Intelligent Cache; Active Memory Sharing; Shared Dedicated Capacity; Shared Processor Pools; Multiple Shared Pools; Active Memory Expansion; PowerVM Hypervisor; LPARs; Micro-partitions; WPARs; Virtual LAN; Virtual I/O Servers.)

In POWER7 based systems, processors and main memory, along with cache and threads, may be allocated and re-allocated. Resources may be dedicated to LPARs (Static LPARs), or shared according to application priorities (Dynamic LPARs). Static LPARs are typically employed for applications with high levels of business criticality.
Physical as well as logical (thread-based) processors may be grouped in shared pools. A key IBM differentiator is that, in POWER7 based systems, physical and logical processors may be assigned separately to specific pools, individual workloads or both. This provides a great deal more flexibility than offered by HP or Sun.

Other mechanisms include Active Memory Sharing, which allows memory to be shared between LPARs; Shared Dedicated Capacity, which allows shared processor pools to use idle CPU cycles in dedicated LPARs; and Multiple Shared Pools, which allows resources used by groups of LPARs to be capped. These are integrated with new POWER7 capabilities such as Intelligent Threads and Intelligent Cache.

The result is that POWER7 based systems can manipulate a wider range of variables – including threads, cache, main memory and I/O, multiple types of partition, multiple threads, and dedicated or pooled processors – to optimize performance for heterogeneous applications and workloads.

The IBM design emphasis on automation should be highlighted. Although key parameters may be set by system administrators, systems can automatically determine, for example, how much cache, how many threads, or how many virtual or physical processors to allocate to a specific application task based on workload characteristics and priorities. Systems evaluate resource utilization every 10 milliseconds, and may change resource allocations as rapidly.

Automation yields multiple benefits. One is that, by reducing the complexities to which system administrators are exposed, full-time equivalent (FTE) staffing may be reduced. Automation may also improve overall capacity utilization, and improve quality of service by reducing the potential for performance bottlenecks and outages caused by human error.

Integrity systems and HP-UX offer many of the same features.
But they do not match the overall set of capabilities offered by POWER7 based systems, or the manner in which these are integrated and optimized in a mutually reinforcing manner.

Sun offers some comparable functions through the Resource Manager component of Solaris. These functions, however, are more rudimentary and less granular than HP and IBM equivalents, and dynamic resource allocation and automation capabilities are significantly weaker. Manageability has not been a major Sun focus in the past.

AVAILABILITY OPTIMIZATION

For several decades, availability expectations have typically been higher for UNIX than for x86 servers, and this is also proving to be the case for blades.

The level of availability maintained by any system depends on a number of factors. These include capabilities that minimize the frequency and effects of component or software failures, along with monitoring, diagnostic, fault masking and resolution, and other tools.

In virtualized system environments, workload management effectiveness also plays an important role. If workloads interfere with each other or exceed their capacity limits, outages may occur.

Organizations must, moreover, seek not only to deal with unplanned (i.e. accidental) outages, but also to minimize the frequency and duration of planned downtime for such functions as hardware and software upgrades and scheduled maintenance.

Maintaining high availability thus presents multiple, overlapping challenges. The extent to which platforms are capable of meeting these challenges is determined not only by individual hardware and software, but also by overall system-level design and optimization.
The extent to which Integrity, Power and T-Series systems meet these challenges varies. All three employ basic techniques such as component redundancy and hot-swapping (i.e., allowing a device to be replaced without taking systems offline), and provide facilities for monitoring, diagnostics, predictive failure analysis and related functions.

HP and IBM have generally been leaders in availability optimization. Their approaches have been built around multiple levels of capability across all hardware and software components that represent potential sources of downtime.

For example, IBM employs the hardware- and microcode-based technologies summarized in figure 8 in POWER7 based blades.

Figure 8
Key POWER7 Availability Optimization Technologies

BASIC CAPABILITIES

Redundancy, hot-swap & internal failover
    Redundant/hot-swap disks, PCI adapters, GX buses, fans & blowers, power supplies, power regulators & other components. Redundant disk controllers & I/O paths. Concurrent system clock repair. Redundant oscillators/dynamic oscillator failover.
Concurrent firmware update
    Server firmware may be updated without taking systems offline.
Concurrent maintenance
    Allows processors, memory cards & adapters to be replaced, upgraded or serviced without taking systems offline.

MONITORING, DIAGNOSTICS & FAULT ISOLATION/RESOLUTION

Hardware-assisted memory scrubbing
    Automatic daily test of all system memory. Detects & reports developing memory errors before they cause problems.
Chipkill error checking
    Technology capable of detecting & correcting single-bit as well as 2-, 3- & 4-bit errors in memory devices, including cache & memory interfaces. Employs RAID-like striping of data across memory devices to provide redundancy & enable reinstatement of original data. Significantly more reliable than conventional error correction code (ECC) technology.
First Failure Data Capture (FFDC)
    Employs 1,000+ embedded sensors that identify errors in any system component. Root causes of errors are determined without the need to recreate problems or run tracing or diagnostics programs.

FAULT MASKING

Processor instruction retry, Alternate processor recovery, Processor-contained checkstop
    If an instruction fails to execute due to a hardware or software fault, the system automatically retries the operation. If the failure persists, the operation is repeated on a different processor &, if this does not succeed, the failed processor is taken out of service (checkstopped). Only LPARs supported by the failed processor are affected.
Dynamic processor sparing
    Allows idle Capacity Upgrade on Demand (CUoD) processors to be automatically activated as replacements for failed processors.
Partition availability priority
    In the event of a processor failure, maintains LPAR-based workloads based on assigned priorities; i.e., remaining processor capacity is assigned to the highest-priority workloads.
Memory sparing
    Enables redundant memory modules to be activated in the event of memory failures.
Enhanced memory subsystem
    Enables memory controller & cache sparing.
Enhanced cache recovery
    Detects & purges processor, L2 & L3 cache errors. Recovers & reinstates original data.
Dynamic I/O line bit repair (eRepair)
    Detects & bypasses failed memory pins.
PCI bus parity error retry
    Retries an I/O operation if an error occurs.

POWER7 based systems benefit from a number of technology transfers from IBM mainframe systems, which enjoy the highest levels of availability of any major platform. According to IBM, the availability optimization features of POWER7 based systems were developed jointly by the company’s Power and System z (mainframe) design teams.
Key mainframe-derived features include First Failure Data Capture (FFDC), which employs thousands of embedded sensors to identify and determine the cause of errors in any system component; and Alternate Processor Retry, which engages a series of diagnostic and remedial actions if an instruction fails to execute.

A further level of capability is provided by AIX. The latest version 7.1 includes the features shown in figure 9.

Figure 9
Key AIX 7.1 Availability Optimization Features

Second Failure Data Capture (SFDC)*
    Supports First Failure Data Capture technology with additional diagnostic & data capture features built into the operating system.
Multisystem First Failure Data Capture*
    Consolidates FFDC information, & provides single point to launch data collection, debug & monitoring actions across multiple systems.
Run-time error checking
    System-wide framework for FFDC & SFDC capabilities.
Concurrent Kernel Updates
    Enables some kernel fixes to be installed without rebooting. Allows patches to be applied without interruption of service. Can be employed for approximately 80 percent of required single module kernel updates.
Kernel exploitation of POWER Storage Keys*
    Exploits a POWER7 hardware feature that separates memory spaces for the kernel, file system & drivers to prevent software errors affecting one of these from spreading to the others.
Functional Recovery Routines*
    Suite of diagnostic & recovery routines that can enable recovery from errors that would otherwise cause the operating system to crash.
Kernel no-execute protection
    Establishes kernel data areas that should not be treated as executable code. Enables immediate detection if erroneous device driver or kernel code strays into these areas. Avoids potential system crashes.
Kernel stack overflow detection
    Detects stack overflows & enables recovery from some of these.
Tracing facilities
    System trace – main AIX trace facility.
    Lightweight memory trace – allows tracing of key kernel events only. Lightweight structure results in minimal performance impact.
    Component trace – enables tracing with per-component granularity.
Dynamic Tracing with probevue
    Allows developers or system administrators to dynamically place probes in existing application or kernel code, without requiring special source code or even recompilation. Simplifies debugging of complex system or application code.
POSIX trace
    Implements POSIX Trace Standard for application tracing.
Live Dump*
    Allows key subsystems to dump diagnostic information for service analysis, without requiring a full system dump.
Firmware-assisted dump*
    Allows firmware to incorporate FFDC information in system dumps.
MiniDump
    Small compressed dump of system data for diagnostic analysis. Enables quick snapshot of crash without full system dump.
Parallel Dump
    Compressed format enabling multiple processors to dump in parallel sub-areas. Greatly reduces time to produce dump.
Netmalloc debug
    Memory subsystem monitoring tool that enables isolation of memory leaks.
Live Partition Mobility
    Allows movement of active LPARs between Power Systems. Brief interruptions – no more than one or two seconds – may occur due to network latency.
Live Application Mobility
    Allows movement of WPARs between systems. Service interruptions are longer than for Live Partition Mobility – typically around 20 seconds.
Cluster Aware AIX
    Provides kernel-based heartbeat, messaging, file sharing, commands & APIs, data collection & event management services supporting clustered HA solutions.

*Mainframe-derived feature

Key AIX mainframe-derived features include Second Failure Data Capture (SFDC), Kernel Exploitation of Storage Keys and Functional Recovery Routines, which are drawn from the z/OS operating system.
HP and (to a lesser extent) Sun offer a number of comparable capabilities. Generally, however, the quality of IBM microelectronics technology is superior – unlike HP and Sun, the company is a major semiconductor designer and manufacturer – and neither competitor has been able to draw upon mainframe hardware and software technology in the same manner as IBM.

It is unclear whether the Integrity transition to blade-based hardware structures will have availability implications. HP nPar partitioning will be supported only on high-end Superdome 2 models. According to the company, this will also be the case for other Integrity RAS features.

Other Power and AIX capabilities include Live Partition Mobility, which allows movement of active LPARs between Power Systems, and Live Application Mobility, which allows WPARs to be moved in the same manner.

Live Partition Mobility users may experience service interruptions of one or two seconds due to network latency. For Live Application Mobility, interruptions are typically around 20 seconds.

HP offers a similar capability for HPVMs. Oracle VM Server for SPARC allows for domain migration, but this is a labor-intensive and protracted process. The company has committed to improved automation.

Clustered failover solutions are available for all three platforms. HP Serviceguard and IBM PowerHA SystemMirror, the latter offered for IBM AIX and i, are among the industry’s most stable and mature high availability clustering solutions. PowerHA SystemMirror for AIX was formerly known as IBM High Availability Cluster Multi-Processing (HACMP).
X86 SERVER BLADES

DIFFERENTIATORS

There are some obvious resemblances between UNIX and x86 server blades. Hardware formats are similar, and the leading vendors, HP and IBM, offer the same BladeSystem and BladeCenter chassis as for their UNIX server blades. This is also the case for Sun, although the company is a minor player outside the x86 Solaris space.

In comparing POWER7 based blades to x86 equivalents, however, differences are greater than resemblances. POWER7 based blades yield higher levels of performance; system architectures are significantly different; and the capabilities of operating systems (AIX compared to Windows and Linux) and virtualization enablers (PowerVM compared to VMware and others) vary widely.

The extent to which these compete directly is less than is generally realized. At the risk of stating the obvious, POWER7 based systems do not support Microsoft Windows. POWER7 based systems are not candidates for deployment of popular Microsoft solutions such as Exchange, SharePoint and SQL Server, for the company’s infrastructure products, or for third-party add-ons to and extensions of these.

Equally, x86 Linux deployments tend to involve applications and workloads that differ from, and are typically less challenging than those for which POWER based systems are employed. Comparative market share statistics may not reflect these demographics.

Power Systems have evolved over more than 20 years to handle applications that are more demanding, and to deliver scalability, concentration and quality of service that are – by wide margins – greater than those experienced in most Windows and x86 Linux server environments.

COMPARATIVE PERFORMANCE

In terms of “raw” performance, x86 servers equipped with latest-generation Intel Nehalem EX processors and Advanced Micro Devices (AMD) Opteron equivalents have begun to approach RISC levels.
This is less the case for IBM Power than for HP Integrity and Sun servers, although there is some overlap with lower-density POWER7 processors.

Raw performance is not, however, the only determinant of the actual performance levels that will be experienced by users. Most industry benchmarks, for example, are run using standardized workloads in stable operating conditions. Actual production environments tend to be very different, particularly when they involve virtualized servers hosting diverse, fluctuating workloads.

Where this is the case, the effectiveness of Power partitioning and workload management mechanisms, and the impact of such capabilities as intelligent threads and cache may significantly increase the amount of work that may be performed by a POWER7 based system.

This may not be visible in capacity utilization statistics. Knowing that a server is operating at, say, 65 or 85 percent of capacity does not provide insight into how efficiently that capacity is used. Rapid allocation and re-allocation of system resources, and fine-grained concurrent workload execution may mean that a great deal more work is performed by a POWER7 based system even if capacity utilization is the same.

Performance measurement in virtualized environments is not a simple process.
VIRTUALIZATION CAPABILITIES

Virtualization Enablers

In comparing the capabilities of POWER7 based systems, and Windows and x86 Linux servers in virtualized environments, a key role is played by the relative strengths and weaknesses of PowerVM and x86 virtualization enablers.

These include Microsoft Hyper-V, which is implemented as an extension of Windows Server 2008. Linux equivalents include Xen and Kernel-based Virtual Machine (KVM), which originated as open source tools but have been upgraded and enhanced by Citrix Systems (XenServer) and Red Hat respectively. Oracle VM is also Xen-based, and is supported by the company for Oracle Enterprise Linux, RHEL and Windows servers.

The dominant player for Windows as well as x86 Linux virtualization is, however, VMware. Its market share is generally estimated at over 75 percent.

VMware solutions are technically sophisticated and address a wide range of functions for servers as well as storage and networks supporting these. However, user experiences have highlighted drawbacks in several areas – including difficulties in supporting large-scale virtualized environments, and in levels of complexity that are generated.

Although the following comments deal with VMware, they may also be considered to apply to the other x86 enablers mentioned above.

Size and Scale

Although there are some large VMware based systems, the vast majority of installations involve consolidation of comparatively small applications. The company itself estimates that more than 95 percent of its base consists of applications running on single- or dual-processor servers with less than four GB peak memory utilization, and fewer than 100 I/Os per second (IOPS).

The largest areas of VMware consolidation have involved test and development instances – there are routinely at least two, and often as many as six of these for each production instance – and comparatively light-duty Web, infrastructure and end-user applications.
The majority of VMware servers run five or fewer instances, and most industry estimates are that between 75 and 85 percent of all VMware servers support fewer than 15 VMs. This is the industry norm even for latest-generation two-socket servers. Larger consolidated environments – typically in the range of 15 to 30 instances per server – tend to run on newer four-socket servers with multicore processors.

These statistics do not necessarily mean that VMware systems cannot support large applications or workloads. But the reality is that, in terms of usage patterns, VMware is predominantly a small-scale server virtualization solution.

There are also distinct nuances in applications that are typically consolidated using VMware. Database servers and transaction processing systems, for example, have seen considerably less activity than other types. This is particularly the case for production systems.

Even when VMware has been employed to virtualize “heavyweight” solutions, such as enterprise resource planning (ERP) systems, typically only the more lightweight components of suites have been deployed in virtual machines (VMs).
This pattern is striking in that VMware, in principle, can deliver extremely high levels of performance and scalability. The Virtual Infrastructure 3 product set, introduced in 2006, supported up to 32 physical cores, 256GB of main memory, and 128 powered-on VMs per host, and up to four virtual CPUs and 64GB of main memory per VM.

The latest vSphere 4 version extended support to 64 physical cores, 1TB of main memory, and 320 powered-on VMs per host, and up to eight virtual CPUs and 256GB of main memory per VM. Only a small fraction of the potential of either version, however, has been exploited in practice.

One reason for this is that VMware is a software-based partitioning technology, with comparatively weak server-level workload management capabilities. This is particularly the case where mixed workloads must be dealt with.

Although VMware (the company) has invested heavily in management solutions within its vSphere environment, these are primarily designed to manage resources across server farms, and supporting storage and networks, rather than within individual servers.

As a result, users are often reluctant to “push the envelope” in configuring numbers of VMs on individual servers. Unless application installation instances generate very small, homogeneous and/or stable workloads, risks of performance bottlenecks and disruption might escalate to unacceptable levels.

Power Systems capabilities are significantly more effective. Multiple forms of partitioning are tightly integrated with industry-leading mixed workload management capabilities. Even using earlier generations of technology, Power Systems users have routinely been able to support 30 to 50 LPARs and even larger numbers of micro-partitions on single systems.

Use of Power blades is moving in the same direction. Among the case studies presented in this report, for example, UPMC anticipated that it could support up to 40 micro-partitions on new PS702 blades.
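As a rough illustration of how small that exploited fraction is, the short Python sketch below sets the VM densities cited in this section (5, 15 and 30 VMs per server) against the per-host limits quoted above (128 and 320 powered-on VMs). The function and variable names are illustrative only, and VM count is just one dimension of capacity: the calculation says nothing about the size or workload of each VM.

```python
# Per-host limits on powered-on VMs, as quoted in the text.
MAX_VMS_PER_HOST = {
    "Virtual Infrastructure 3": 128,
    "vSphere 4": 320,
}

# VM densities cited in the text: five or fewer on most servers,
# fewer than 15 on 75 to 85 percent of servers, and up to 30 on
# larger four-socket servers.
CITED_VMS_PER_SERVER = [5, 15, 30]


def share_of_limit(vms: int, limit: int) -> float:
    """Return the VM count as a percentage of the supported per-host maximum."""
    return 100.0 * vms / limit


if __name__ == "__main__":
    for product, limit in MAX_VMS_PER_HOST.items():
        for vms in CITED_VMS_PER_SERVER:
            pct = share_of_limit(vms, limit)
            print(f"{product}: {vms} VMs is {pct:.1f}% of the {limit}-VM limit")
```

On these figures, a server running 15 VMs uses under 5 percent of the vSphere 4 per-host VM limit, and even a 30-VM server remains below 10 percent of it.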
Complexity

After almost a decade of rapid growth, market researchers have reported a recent slowdown in x86 virtualization initiatives. The primary reason is that organizations face growing complexity challenges. Implementation has often proved to be a longer and more difficult process than anticipated, and skill requirements and staffing levels have tended to escalate.

Virtualization inevitably increases complexity by introducing a new layer of architecture into system environments. Figure 10 illustrates this effect.

Figure 10
System Environment Layers: Example

APPLICATIONS
DATABASES/MIDDLEWARE
OPERATING SYSTEMS
VIRTUALIZATION
HARDWARE
The organizational impact of adding this layer depends upon a number of factors. These include the number of hardware and software components, the degree of integration between these and the extent to which processes are automated.

A VMware environment will typically include components from at least four vendors – Intel or AMD, the server hardware manufacturer, the operating system supplier and VMware itself. The number may be significantly larger if environments also extend to storage and networks, or if third-party tools are added.

Complexity is also materially affected by concentration. If system processes must be coordinated, and resources managed across large numbers of separate hypervisors, challenges are greater than if the virtualization layer is built around more compact structures.

In most organizations, it would be necessary to implement the VMware solution set – which now includes the major components shown in figure 11 – across x86 infrastructures that may be less fragmented than would have been the case in the past, but which nevertheless fall well short of the degree of concentration that may be realized with POWER7 based systems.

Figure 11
Major Components of VMware vSphere 4 Environment

vCenter Converter, Host Profiles, Lab Manager, Lifecycle Manager, Orchestrator, Site Recovery Manager, Tools & Utilities
Availability & Recovery: HA, Data Recovery, Fault Tolerance, vMotion, Storage vMotion
Security: vShield Zones, VMsafe
System Base: ESX, ESXi, Distributed Resource Scheduler, Distributed Power Management, Memory Overcommit
Storage: VM File System, Thin Provisioning, Storage I/O Control
Network: Distributed Switch, Network I/O Control

There are other Power Systems advantages. Hardware, microcode and operating systems are developed and supported by one vendor, and are integrated and optimized to a degree that far exceeds what has been achieved in the x86 world.
Automation is also more advanced, and more pervasive than is the case for any x86 server platform, operating system or virtualization enabler.

AVAILABILITY AND SECURITY

Comparing Availability

Differences in the levels of availability that are typically realized with Windows and Linux compared to UNIX servers have been widely documented. Certain misperceptions nevertheless remain. In availability, for example, it is commonly argued that latest-generation x86 processors can deliver “mainframe-class” uptime.
Prevention of outages, however, requires more than hardware reliability. The availability optimization capabilities of POWER7 based systems described earlier, for example, include multiple levels of redundancy and concurrency, as well as mechanisms for monitoring, diagnostics, fault isolation and resolution, and fault masking (enabling systems to continue functioning even if a fault occurs) that are more sophisticated than those of most x86 platforms.

Such mechanisms must also extend to operating systems. More unplanned outages are caused by software than by hardware failures, and this is also the case for most planned outages. In POWER7 based systems, availability optimization features are also embedded in the microcode components of the underlying system, and of PowerVM virtualization technology.

In an x86 environment, availability optimization needs to occur not only across processors and server platforms, but also across Windows or Linux and virtualization enablers. Typically, this again means dealing with the offerings of at least four vendors. Integration and optimization of availability features are inevitably more problematic than is the case for Power Systems.

Further points should be made. Maintenance of, say, 99.5 percent availability is a comparatively simple process. Difficulties increase substantially, however, if higher levels must be sustained. The challenges of maintaining near-continuous availability may be orders of magnitude greater.

Equally, maintenance of high levels of availability for low-impact workloads is a great deal simpler than for high-volume database- and transaction-intensive systems. Challenges are compounded if it is necessary to support virtualized environments characterized by diverse, volatile workloads.

Use of high availability clusters does not necessarily change this picture.
Windows and Linux clusters typically experience more downtime than UNIX equivalents, and failover and recovery processes tend to be both slower and less reliable. Clusters also tend to generate complexity, and the effects are magnified if they are multiplied across farms of small servers. Experience has shown that x86 clusters are often necessary to achieve availability levels that may be realized by standalone Power Systems.

Security and Malware Resistance

It is a truism that security and malicious code (malware) exposure for Windows and x86 Linux is greater than for UNIX. Windows, in particular, is the world’s most targeted operating system, and there are believed to be over a million Windows malware variants. An unprotected Windows server connected to the Internet will typically become infected in a matter of minutes.

The comparatively loose design, diversity of components, and open source origins of Linux environments also pose security and malware challenges that are greater than for major versions of UNIX. Since the mid-2000s, VMware and other x86 virtualization enablers have also emerged as major hacker and malware targets.

The business impacts of security violations, data loss and malware damage may be significant. Experience has shown, however, that there are also major cost implications. Even if high levels of security can be realized for Windows and Linux systems, the process tends to be more labor-intensive than for UNIX systems, and security administration costs tend to be higher.

At a time when security budgets are under pressure in many organizations, the central challenge is not simply to maintain security. It is to do so in a cost-effective manner. POWER7 based systems can materially assist in achieving that goal.
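To make the availability percentages discussed earlier in this section more concrete, the following Python sketch converts an availability level into the downtime it allows per year. Only the 99.5 percent figure comes from the text. The higher levels are illustrative assumptions, as are the function name and the use of a 365-day year.

```python
# Hours in a non-leap year (an assumption made for simplicity).
HOURS_PER_YEAR = 24 * 365


def annual_downtime_hours(availability_pct: float) -> float:
    """Return the hours of downtime per year implied by an availability percentage."""
    return HOURS_PER_YEAR * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # 99.5 is the level cited in the text; the others are illustrative only.
    for pct in (99.5, 99.9, 99.99, 99.999):
        hours = annual_downtime_hours(pct)
        print(f"{pct}% availability allows {hours:.2f} hours "
              f"({hours * 60:.1f} minutes) of downtime per year")
```

Under these assumptions, 99.5 percent availability allows roughly 44 hours of downtime per year, while each additional "nine" cuts the allowance by a factor of ten. This is one way to picture why difficulties increase substantially as availability targets rise.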