INFORMATION TECHNOLOGY
                    INTELLIGENCE CORP.




ITIC 2009 Global Server Hardware and
     Server OS Reliability Survey




                                                               July 2009




© Copyright 2009, Information Technology Intelligence Corp. (ITIC) All rights reserved.
Other products and companies referred to herein are trademarks or registered trademarks of their respective companies or mark holders.
Executive Summary
“Time is money”
For the second year in a row, IBM AIX UNIX running on the Power or “P” series servers scored
the highest reliability ratings among 15 different server operating system platforms – including
Linux, Mac OS X, UNIX and Windows.

Those are the results of the ITIC 2009 Global Server Hardware and Server OS Reliability Survey
which polled C-level executives and IT managers at 400 corporations from 20 countries
worldwide. The results indicate that the IBM AIX operating system running on Big Blue’s
Power servers (System p5s) is the clear winner; it offers rock solid reliability, besting all
competing operating systems, including those running on Intel-based x86 machines. The IBM
servers running AIX consistently score at least 99.99% uptime, or just 15 minutes of unplanned
downtime per server, per annum (See Exhibit 1).

Overall, the results showed improvements in reliability, patch management procedures and an
across-the-board reduction in per server, per annum Tier 1, Tier 2 and the most severe Tier 3
outages.

          IBM AIX on the Power series System p5 and System p6 servers leads all vendors for
           both server hardware and server OS reliability. The IBM UNIX distribution recorded the
           fewest number of Tier 1, Tier 2 and Tier 3 unplanned server outages per year. IBM AIX
           running on the System p5s and newer p6s had less than one unplanned outage incident
           per server in a 12 month period. More impressively, the IBM servers experience no
           severe Tier 3 outages.
          Hewlett-Packard’s HP UX 11i running on the HP 9000 and Integrity servers also
           performed very well, though HP servers notch approximately 21 to 25 minutes more
           downtime than IBM servers, depending on model and configuration. HP UX 11i v.3
           Update 4 on the HP 9000s averages 36 minutes of per server, per annum downtime, while
           HP UX 11i v.3 Update 4 on HP Integrity servers recorded 39 minutes of per server,
           per annum downtime.
          Faster Patch Management. IT managers spend approximately 11 minutes to apply
           patches to IBM servers running the AIX operating system, which is, again, the least
           amount of time spent patching any server or operating system. The open source Ubuntu
           distribution is a close second with IT managers spending 12 minutes to apply patches,
           while IT managers in the Novell SUSE Enterprise, customized Linux distribution and
           Apple Mac OS X 10.x environments each spend a very economical 15 to 19 minutes
           applying patches.
          Unplanned severe Tier 2 and Tier 3 Outages Decline. IBM also took top honors in
           another important category: IBM Power Series System p5 and p6 servers running AIX
           experience the lowest amount of the more severe Tier 2 and Tier 3 outages combined of
           any server hardware or server operating system. The combined total of Tier 2 and Tier 3
           outages accounted for just 19% of all per server, per annum failures in IBM network
           environments. HP UX on the 9000 and Integrity servers, Novell SUSE Linux Enterprise
           11 and “other” Linux distributions were close behind with combined Tier 2 + Tier 3
           outages accounting for 24% to 25% of unplanned yearly downtime.
          Novell SUSE Superiority. Among the Linux and Open Source server operating system
           distributions, both Novell SUSE Linux Enterprise 10 and 11 versions consistently
           achieved superior reliability ratings. In fact, Novell SUSE in a customized
           implementation had the lowest downtime, approximately 16 minutes per server/server
           OS, per annum, of any distribution with the exception of IBM’s AIX on the
           Power Series. In their anecdotal responses and first person customer interviews, many
           IT managers specifically extolled the high level of integration and interoperability
           between their Novell SUSE Linux Enterprise and Microsoft Windows Server 2003 and
           Windows Server 2008 systems in heterogeneous networks.

          Most Improved. Microsoft Windows Server 2003 and Windows Server 2008 showed the
           biggest improvements of any of the vendors. The Windows Server 2003 and 2008
           operating systems running on Intel-based platforms saw a 35% reduction in the amount
           of unplanned per server, per annum downtime from 3.77 hours in 2008 to 2.42 hours in
           2009. The number of annual Windows Server Tier 3 outages also decreased by 31% year
           over year and the time spent applying patches similarly declined by 35% from last year to
           32 minutes in 2009.

          Apple Mac and OS X 10.x Competitive Enterprise Reliability. This year’s survey for
           the first time also incorporated reliability results for the Apple Mac and OS X 10.x OS
           platform. Over the past two to three years, the Apple Mac platform has made a comeback
           in corporate enterprises. The numbers of Mac G4 servers are modest in comparison to the
           more entrenched Windows, Linux and UNIX distributions. Nonetheless, they are making
           their presence known. IT managers report the reliability has been generally very good.
           The survey respondents indicated that the Apple Mac G4 servers are extremely
           competitive in an enterprise setting. IT managers spend approximately 15 minutes per
           server to apply patches and report an average downtime of about 40 minutes per
           server, per annum. It is important to note that at this point, the workloads of the G4 Macs
           are not comparable to those of the high end IBM, HP and Sun (now Oracle) UNIX
           systems or the customized Linux and open source distributions.




The intent of this Report is to quantify and qualify the reliability of 15 different server operating
system platforms running on a variety of proprietary UNIX and Intel-based hardware platforms.
This will allow organizations to more easily identify baseline reliability metrics associated with
individual platforms in order to better determine and optimize their total cost of ownership
(TCO), accelerate return on investment (ROI) and more efficiently manage risk.




Table of Contents

Executive Summary
Introduction
   Survey Methodology
   Survey Demographics
Data & Analysis
Conclusions
Recommendations
   Recommendations for Corporate Customers
   Recommendations for Vendors




Introduction

Server hardware and server operating system reliability is the foundation and bedrock upon
which crucial applications, storage, security and third party utilities and management rest. The
stability and health of the entire network infrastructure depend heavily on the server hardware
and the operating systems that run on it. Server hardware and server operating system
reliability are inextricably linked to the corporation’s ability to lower its TCO, accelerate ROI
and reduce the risk factors that negatively impact performance.

Information on specific reliability metrics allows businesses to calculate the real-time resources
and monies needed to manage and maintain their various server hardware platforms and
operating systems. It also enables them to determine whether or not their mission critical server
hardware and operating system software are assisting or impeding the business from meeting key
service level agreements (SLAs) to their customers, business partners and suppliers as well as
internally to the company’s own end users.

The ITIC self-selecting reliability survey polled IT managers at 400 corporations worldwide on
the annual amount and percentage of unplanned per server, per annum downtime experienced in the
following 15 hardware and server OS environments:


          IBM AIX on Power series System p5 and p6 servers
          HP UX on the 9000
          HP UX on Integrity servers
          Sun Solaris UNIX on the SPARC Servers
          Apple Mac OS X 10.5, 10.6 on G4 Macs
          Novell SUSE Linux Enterprise on Intel x86 servers
          Novell SUSE Linux Enterprise with customization
          Red Hat Enterprise Linux on Intel x86 servers
          Red Hat Enterprise Linux with customization
          Windows Server 2003 on Intel x86 servers
          Windows Server 2008 on Intel x86 servers
          Ubuntu open source
          Debian open source
          Other Linux distributions (e.g. Mandriva, Turbo Linux)
          Other Linux distributions with customization


The survey data gives a detailed comparison breakdown of the percentage of Tier 1, Tier 2 and
highest severity Tier 3 outages.

ITIC’s definition of server outages is as follows:
    Tier 1: These are typically minor, common, albeit annoying occurrences. A network
     administrator can usually resolve such incidents with less than 30 minutes of downtime for
     dependent users. Tier 1 incidents can usually be resolved by rebooting the server and rarely
     involve any data loss.
    Tier 2: These are moderate issues in which the server may be offline from one hour to four
     hours or about a half-day. Tier 2 problems may require the intervention of more than one
     network administrator to troubleshoot and they frequently affect the corporation’s end users
     and possibly business partners, customers and suppliers in the event they are attempting to
     access data on an affected corporate extranet.
    Tier 3: This is the most severe type of incident. Tier 3 outages last longer than four hours
     for network administrators and the company’s associated dependent users. Tier 3
     outages almost always require a team of multiple network administrators to resolve. Data loss
     or damage to systems and applications may or may not occur. Another real threat associated
     with a protracted Tier 3 outage is potential lost business and the potential damage to the
     company’s reputation.
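
The duration thresholds in these tier definitions can be sketched as a small classifier. This is an illustrative Python sketch, not part of ITIC's methodology. The function names `classify_outage` and `severe_share` are hypothetical, the sketch looks at outage duration only (it ignores the qualitative criteria such as staffing and data loss), and it assumes that incidents lasting between 30 minutes and one hour, which the definitions leave unassigned, count as Tier 2.

```python
def classify_outage(duration_minutes):
    """Map an outage duration in minutes to a tier number (1, 2 or 3)."""
    if duration_minutes < 0:
        raise ValueError("duration must be non-negative")
    if duration_minutes < 30:
        return 1  # Tier 1: resolved in under 30 minutes
    if duration_minutes <= 240:
        # Tier 2: one to four hours. ASSUMPTION: the 30-to-60-minute gap
        # left open by the definitions is also treated as Tier 2.
        return 2
    return 3  # Tier 3: longer than four hours


def severe_share(durations_minutes):
    """Return the fraction of outages (by count) that are Tier 2 or Tier 3."""
    if not durations_minutes:
        return 0.0
    severe = sum(1 for d in durations_minutes if classify_outage(d) >= 2)
    return severe / len(durations_minutes)


if __name__ == "__main__":
    # Hypothetical outage log for one server over a year, in minutes.
    sample = [10, 20, 90, 300]
    for minutes in sample:
        print(minutes, "minutes -> Tier", classify_outage(minutes))
    print("Tier 2 + Tier 3 share:", severe_share(sample))
```

A share computed this way corresponds to the "combined Tier 2 and Tier 3" percentages quoted in the Executive Summary, under the assumption that those percentages count incidents rather than minutes.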
The length and severity of each of these outages correspond to specific line item capital
expenditure and operational expenditure costs for the business. Reliability, measured by
downtime, can positively or negatively impact TCO and accelerate or delay the time it takes to
realize ROI.

Improvements or declines in reliability also mitigate or increase technical and business risks to
the organization’s end users and external customers. The ability to meet service-level
agreements (SLAs) hinges on server reliability, uptime and manageability. These are key
indicators that enable organizations to determine which server operating system platform or
combination thereof is most suitable.
The survey data detailed the disparity in the number and severity of unplanned server outages
and the amount of time in minutes and hours that businesses experience on the various Linux,
Windows and UNIX platforms.

The survey closely examined both the actual quantitative reliability statistics as well as the
qualitative issues that positively or negatively impacted outage time. The ITIC survey queried
corporate IT managers and C-level executives on myriad reliability-related functions including:
    The amount of downtime (minutes/hours) experienced per server, per annum
    The amount of time spent patching each server
    Whether the IT administrators apply updates via an automated group policy procedure or
     manually apply the patches to individual servers



On average, individual corporate Linux, Windows and UNIX servers experience from zero to
approximately two failures per server per year. In a best case scenario, this results in 20 minutes
(IBM AIX running on p5 and p6 Power servers) to 4.3 hours (Debian open source) of
annual downtime for each server. Windows Server 2008 servers experienced a total of just under
three unplanned yearly Tier 1, Tier 2 and Tier 3 outages. However, the necessity of having to
take many of the Windows Servers offline to apply monthly patches and then do a system reboot
resulted in Windows Server 2008 machines being offline for just under two and a half hours each
year. Still, this is a 35% reduction from the 3.77 hours of downtime experienced by Windows
Server 2008 machines in last year’s ITIC reliability survey.
Among the Linux distributions, Novell SUSE Enterprise exhibited consistent reliability
reminiscent of the late 1980s and 1990s when Novell NetWare was famous for running several
years – in some cases as long as nine years – without experiencing a failure or the need to reboot.
This can be attributed to the stability of the Novell distribution, the experience of the SUSE
engineers and the length of experience of many IT managers who came from the NetWare
environment. Novell also inked an interoperability and technical service and support agreement
with Microsoft two and a half years ago, which also served to improve reliability.
The open source Ubuntu distribution also scored some impressive reliability gains as it continues
to gain in popularity and deployments.

Overall, these survey responses provide crucial, comparative reliability metrics to enable
customers to make informed choices on which server hardware and server operating system, or
combination thereof, best suits their specific business and budget needs.

Survey Methodology
ITIC conducted the 2009 Global Server Hardware and Server OS Survey, an independent Web-based
survey that included multiple-choice questions and essay responses, from March through
July 2009. ITIC polled C-level executives and IT managers at 400 corporations worldwide.
ITIC analysts supplemented the Web survey by conducting two dozen first-person customer
interviews. ITIC conducted additional interviews with customers in October 2009 and updated
the Report with specific information on server downtime statistics. The anecdotal data obtained
from these interviews validates the survey responses and provides deeper insight into the
challenges confronting businesses in both the immediate and long term.

To deliver the most unbiased, accurate information, ITIC did not accept any vendor
sponsorship money for the online poll or the subsequent first-person interviews conducted in
connection with this project. ITIC employed authentication and tracking mechanisms to prevent
tampering and to prohibit multiple responses by the same parties.




Survey Demographics
Companies of all sizes and all vertical markets were represented in the survey. Respondents
came from companies ranging from small and medium businesses (SMBs) with fewer than 50
workers, to large enterprises with more than 100,000 employees.
Roughly 33% of the survey respondents came from the SMB segment with 1 to 100 employees;
12% of those polled were from midsize companies with 100 to 500 employees; 14% were drawn
from corporations employing 500 to 1,000 employees; and 41% of respondents worked in large
enterprises with 1,000 to more than 100,000 workers.
The survey was truly global. Approximately 85% of respondents came from North America.
The remaining 15% hailed from more than 20 countries across Europe, Asia, Australia, New
Zealand, South America and Africa.



Data & Analysis
Server hardware and server operating system reliability has improved immeasurably in the last
five years.

When ITIC began conducting reliability research and surveys, our original definition of
unplanned downtime was an unexpected external or internal incident that caused the server
hardware and/or the server operating system software to spontaneously fail or freeze, thereby
disrupting network operations and requiring remediation efforts and a reboot. Depending on the
seriousness of the incident, the downtime may also have resulted in lost or damaged data.

However, it quickly became apparent from the anecdotal survey comments and during our first
person customer interviews, that IT managers and network administrators had a broader
definition of what constituted downtime.

As far as IT departments are concerned, anything that causes them to take the server offline,
regardless of the cause, is unplanned downtime. Included in this category are instances of
vendors releasing an unanticipated patch to fix a technical bug or security vulnerability. Such an
occurrence does not qualify as unplanned downtime in the narrowest definition of the term, but
network administrators oftentimes do not make that distinction. To them downtime is downtime
because it disrupts their routine and may also impact daily operations because it means the IT
department must devote time to remedial issues that would otherwise have been spent performing
other IT chores. And in some network environments like Windows, it’s still necessary to take the
servers down, apply the patch and perform a hard reboot.

Time very literally equates to money. The economic downturn has forced companies to cut staff
and put network and software upgrades on hold; it has decimated IT departments and severely
reduced the training and recertification available to network administrators.

A recent ITIC survey that polled 250 corporations worldwide in October 2009 found that 47% of
businesses had budget cuts within the past 12 months. That number was even greater for
companies with over 500 end users; 64% of large enterprises experienced budget cuts.
Consequently, 84% of the respondents reported that their IT departments simply pick up the
slack and work longer and harder.



Downtime by the Numbers
In the early days of networks, corporate enterprises considered 99% uptime to be an adequate
reliability standard. Not so in 2009. An ITIC survey of 250 enterprises conducted in October
found that only 14% of survey respondents consider 99% uptime adequate for their most mission
critical, line of business (LOB) applications. Another 14% said that 99.9% or three nines met
their reliability needs. A two-thirds majority – 66% – of those polled, however, said their
network environments require 99.95%, 99.999% or greater reliability for their most mission
critical LOBs.

It’s easy to see why when you correlate the uptime percentages to actual downtime:

99%            = average unplanned downtime of one hour and 40 minutes per week
99.9%          = average unplanned downtime of 45 minutes per month
99.95%         = average unplanned downtime of 22 minutes per month
99.999%        = average unplanned downtime of 5 1/2 minutes per year
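
The conversions in this list are straightforward arithmetic: allowed downtime is the unavailable fraction multiplied by the length of the period. A minimal Python sketch follows. The function name `downtime_minutes` and the period constants are illustrative, not taken from the survey, and the sketch assumes a 365-day year with a month taken as one twelfth of it, so the results differ slightly from the rounded figures above.

```python
# Period lengths in minutes. ASSUMPTION: a 365-day year, and a "month"
# defined as one twelfth of that year.
MINUTES_PER_WEEK = 7 * 24 * 60
MINUTES_PER_YEAR = 365 * 24 * 60
MINUTES_PER_MONTH = MINUTES_PER_YEAR / 12


def downtime_minutes(uptime_percent, period_minutes):
    """Return the downtime (in minutes) implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_minutes


if __name__ == "__main__":
    rows = [
        (99.0, "week", MINUTES_PER_WEEK),
        (99.9, "month", MINUTES_PER_MONTH),
        (99.95, "month", MINUTES_PER_MONTH),
        (99.999, "year", MINUTES_PER_YEAR),
    ]
    for pct, label, period in rows:
        minutes = downtime_minutes(pct, period)
        print(f"{pct}% uptime = {minutes:.1f} minutes of downtime per {label}")
```

Keeping the period as a parameter makes it easy to restate any of the uptime levels on a per-year basis, which is the unit the rest of the survey uses.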


Taken in this context, it’s easy to understand how the ongoing economic crisis has cast renewed
emphasis on server and server operating system reliability. Businesses of all sizes and across all
vertical markets are extremely risk averse. IT departments grapple daily with the reality of
keeping networks up and running in the face of cost cuts, layoffs and fewer resources.
Businesses and their IT departments are under pressure to maximize server hardware and server
operating system uptime in order to realize the greatest economies of scale and ensure that their
server hardware, server operating systems and the crucial business applications and services that
run on them are available to end users, corporate clients, business partners and suppliers. A server
outage of even a few minutes’ duration can disrupt network operations, result in lost data and
steep monetary losses, and damage a company’s reputation.


Reliability Then and Now
The first generations of server hardware and server operating system software platforms
introduced in the mid-to-late 1980s, were proprietary. Network administrators typically became
experts in a particular vendor’s platform. The 1.0 versions of new hardware and software products
from 10 to 20 years ago were also rife with bugs. It typically took from six months to a year for
the vendors to work the kinks out and achieve an acceptable level of stability, and for IT managers
to gain sufficient expertise and knowledge, resulting in higher levels of uptime. It is also worth
noting that two decades ago, businesses were not as wholly dependent on their networks as they
are today.

In the 1990s, 99% reliability was considered an acceptable industry standard. That is no longer
the case; 99% uptime is the equivalent of over 80 hours of annual per server downtime. ITIC’s
separate 2009 Global Application Availability Survey conducted in April found that eight out
of 10 of the 300 businesses polled said that their major business applications require higher
availability rates than they did two or three years ago. However, nearly three-quarters of
companies – 72% -- are unable to quantify the cost of downtime or the impact that unplanned
reliability outages have on the business. Among the other 2009 Global Application Availability
survey findings:

    Nearly two-thirds -- 61% -- of organizations are unsure of how to estimate the impact of
     downtime on the business or do not even attempt to track the losses associated with
     application downtime and reliability.
    Two out of five firms -- 41% -- said they require conventional 99% to 99.9% application
     availability; 29% said they needed 99.95% or 99.99% uptime; while 7% of respondents
     indicated they need continuous availability of 99.999% or 99.9999%.
    Just under half – 49% of companies – lack the budget to purchase additional third party
     software or hardware availability technology. This places more of an onus on the underlying
     server hardware and server OS to deliver high reliability.

The responses from the ITIC 2009 Global Application Availability Survey underscore the crucial
importance of highly reliable server hardware and server operating systems. If the servers,
server OS and related applications are unavailable for any reason, business and daily operations
grind to a halt – with sometimes catastrophic results.

The demand for server hardware, server OS and application availability has grown, particularly
with the emergence of new technologies like cloud computing and virtualization. Corporations
need to ensure that reliability keeps pace. To quantify the reliability statistics: 99.99% uptime
equates to approximately 53 minutes of per server, per annum downtime.
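
The reverse conversion, from measured annual downtime to an uptime percentage, can be sketched the same way. The Python below is an illustration only: the function name `uptime_percent` is hypothetical and a 365-day year is assumed. The two inputs in the usage section are downtime figures quoted elsewhere in this report.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ASSUMPTION: leap years ignored


def uptime_percent(downtime_minutes_per_year):
    """Return the uptime percentage implied by annual downtime in minutes."""
    if not 0 <= downtime_minutes_per_year <= MINUTES_PER_YEAR:
        raise ValueError("downtime must be between 0 and one year of minutes")
    return 100.0 * (1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Per server, per annum downtime figures quoted in this report.
    figures = [
        ("IBM AIX on Power servers, 15 minutes", 15),
        ("Windows Server, 2.42 hours", 2.42 * 60),
    ]
    for label, minutes in figures:
        print(f"{label}: {uptime_percent(minutes):.4f}% uptime")
```

Expressing downtime as a percentage this way lets a measured figure be compared directly against a service level agreement target.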

Today’s networks demand near perfect reliability; corporations deem any downtime
anathema to their business operations. This is particularly true for those companies in vertical
markets such as banking and finance, stock exchanges, insurance, healthcare and legal, whose
businesses are based on intensive data transactions. A server crash of even 15 to 30 minutes’
duration can cost a company from tens of thousands to tens of millions in lost business and
remediation efforts. Zero downtime, or as close to it as is humanly and technologically possible,
is the obvious goal and Holy Grail of reliability.




While system flaws will always be present in some fashion, the survey found that at present,
server hardware and server OS reliability is also inextricably linked with several other crucial
factors and components. They are:

          Integration and interoperability are crucial. Over 85% of businesses with 300+ end
           users have myriad types of server hardware and three different operating systems present
           in their environment. Heterogeneity and openness are essential to the reliability of
           today’s networks. The 2007 wide ranging, non-exclusive interoperability pact between
           Microsoft and Novell was extremely well received and a huge boon for the respective
           customer bases of both firms. As part of the deal, Microsoft and Novell teamed up to
           provide joint sales, technical service and support to deliver plug and play interoperability
           between the Windows and SUSE Linux Enterprise environments.
          Workloads. The applications themselves are growing in size and complexity. It is
           therefore imperative that the server hardware be robust enough to handle the increased
           demands of new classes of applications such as streaming audio and digital and highly
           complex processes. It is a fact that a robust server configuration that includes new multi-
           core and multi-threading technologies, maximum memory, hard drive and the fastest
           processors will perform better than old, outmoded and inadequate equipment. The survey
           showed for example that the high reliability ratings for IBM and HP were no fluke: the
           powerful IBM System p5 and System p6 Power Series servers and the HP 9000 and
           Integrity Servers achieved very high reliability – 99.99% and 99.999% uptime – while
           carrying workloads that were 30% to 40% greater than comparable x86-based machines.
          Experience of the IT managers. Errors by neophyte, inexperienced network
           administrators and IT managers who have not been able to get trained and re-certified on
           the latest technologies are another major factor that contributes to extended downtime and
           adversely impacts system reliability.
          Patch management. The amount of time spent applying patches is one of the biggest
           contributors to system downtime; this is especially true of security patches, as we see in
           Exhibit 2 below.




IBM AIX administrators spent the least amount of time – 11 minutes – applying patches. They
were followed closely by the Ubuntu open source distribution, Apple Mac, niche market “other”
customized Linux distributions and Novell SUSE; administrators in each of these environments
spent on average from 12 to 15 minutes applying patches.

This speaks to the underlying stability of these environments as well as the experience of the
administrative staff. Typically, UNIX installations, notably IBM’s AIX, as well as Novell
SUSE Enterprise and Apple Mac, tend to be stable, static environments with experienced, hands-on
network administrators who are familiar with the most minute details of the bits and bytes of
their systems. Fast patch management positively impacts reliability.

The feedback from the survey respondents reinforced the importance of being able to receive and
download patches quickly once a bug has been identified. Corporate IT managers noted the
significant strides that had been made by all of the vendors across the board in recent years,
though they still voiced some concerns. Among the anecdotal comments:




     •     “IBM has done a wonderful job of keeping our AIX systems up and ready. We rarely if
           ever have reliability issues,” said an IT manager at a Midwest financial institution.
     •     “Patch management automation has significantly reduced both the manpower required to
           apply patches and the downtime associated with patch management over the last three
           years,” noted an IT administrator at a large health care facility in the Northeast.
     •     “Novell SUSE Linux Enterprise is always very up-to-date on patches; Zenworks is nice
           and we never have a problem,” said a longtime Novell user at a large healthcare provider
           in the Southwest.
     •     “The amount of time it takes to identify vulnerability and when the vendors release the
           patch, has decreased significantly, but if the bug is a dangerous one, we still worry,”
           according to a chief technology officer at a midsized retailer.
     •     “Our patches are tested at our corporate headquarters location and then distributed as
           needed to the various remote locations, downloaded to a local Microsoft Systems
           Management Server (SMS) and automatically downloaded via group policy to each
           workstation and server. The process is accelerated and it’s relatively painless for the IT
           department,” said an administrator at a large West Coast enterprise.
     •     “Our patch management dramatically improved with SUSE 10.2 and SUSE 11,” noted
           another veteran Novell administrator. “We have no problems now to speak of.”
     •     “We currently use Group Policy to download patches on each server, but we manually
           apply them. So it takes us about 15 minutes to patch each Windows server. This means
           that each server takes less than 15 min to patch. On a whole, other than hardware issues,
           we've averaged less than two failures per server, per year on our Windows Server 2003
           systems,” said an IT manager at a large East Coast insurance firm.




Serious Tier 2 + Tier 3 Incidents Decline
The survey results also showed a discernible decline in the number and percentage of the more
serious Tier 2, Tier 3 and combined Tier 2 + Tier 3 incidents, according to Exhibit 3 below.




Once again, IBM AIX on the Power Series System p5 and p6 servers recorded the smallest
percentage of combined Tier 2 + Tier 3 incidents at 19%. The other UNIX and Linux distributions,
including HP-UX 11i v3 on the HP 9000 and HP Integrity, Novell SUSE Linux Enterprise and Sun
Solaris, also scored well, with the more serious aggregate Tier 2 + Tier 3 outages accounting
for 24% to 25% of total outages. All of the aforementioned distributions lowered their
percentages from the comparable survey in 2008.

Microsoft’s Windows Server 2003 on x86-based servers came in with a very respectable 30% of
reliability outages being in the Tier 2 + Tier 3 categories; this was a reduction of 11
percentage points from the 41% reported by respondents to the 2008 ITIC Global Reliability
Survey.
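
The year-over-year change quoted above for Windows Server 2003 (41% of outages in 2008, 30% in 2009) can be read either as a drop in percentage points or as a relative reduction. The short Python sketch below works through both readings; the function name is hypothetical and is used only for illustration.

```python
def share_change(previous_pct: float, current_pct: float) -> tuple[float, float]:
    """Return (drop in percentage points, relative reduction in percent)."""
    point_drop = previous_pct - current_pct
    relative_drop = point_drop / previous_pct * 100
    return point_drop, relative_drop


# Tier 2 + Tier 3 share of outages for Windows Server 2003, per the survey text.
points, relative = share_change(previous_pct=41.0, current_pct=30.0)
print(f"{points:.0f} percentage points, {relative:.1f}% relative reduction")
# prints "11 percentage points, 26.8% relative reduction"
```

In other words, the share of serious outages fell by 11 points, which is a little more than a quarter of the 2008 figure.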

One of the most impressive statistics was that IBM AIX Power Series System p5 and System p6
servers notched no severe Tier 3 incidents whatsoever. Again, this achievement is even more
impressive when one considers that these systems typically run higher workloads than their x86-
based counterparts as shown in Exhibit 4.




HP-UX 11i v3 Update 4 on the HP 9000 and Integrity servers, Sun Solaris on SPARC
servers (now owned by Oracle), Novell SUSE, Red Hat Enterprise Linux and Apple Mac OS X
10.5.6 on the G4 Macs also recorded very few Tier 3 outages – fewer than one each, per server,
per annum.

Tier 1 incidents, the most common type, which usually last between 10 and 30 minutes, also
showed across-the-board reductions among all server hardware and server operating system
platforms, as we see in Exhibit 5.




In the Tier 1 category, IBM also came out on top with less than one-half of one Tier 1 incident
per AIX Power Series System p5 and System p6 server, per annum. This equates to about four to
seven minutes of downtime per server, per year.

In fact, every server hardware and server OS environment racked up fewer than one Tier 1
outage per server, per annum.

The results were similarly encouraging for the average number of Tier 2 outages as we see in
Exhibit 6 below.




Conclusions
In summary, the ITIC 2009 Global Server Hardware and Server OS Reliability Survey findings
indicate that all of the server operating system platforms have achieved a high degree of
reliability. However, IBM AIX running on the p5 and p6 Power servers, which leads the UNIX
distributions, is the clear winner, followed closely by HP, Novell SUSE Enterprise Linux and the
Ubuntu open source distribution.

These results are especially noteworthy in light of the ongoing economic crunch, which has
caused companies to cut their budgets and reduce IT staff. As they strive to accomplish more
with fewer resources, IT departments must rely even more heavily on their vendors to deliver
more reliable servers and server operating system software.

To reiterate, time is literally money. Even a few minutes of downtime can cost companies
thousands or millions of dollars and cause business operations to grind to a halt. Downtime can
also adversely impact a company’s relationship with its customers, business suppliers, partners
and internal end users. A lack of reliability can damage a company’s reputation and result in
lost business.

Hence, corporations must have confidence in the reliability and stability of the underlying server
hardware and server OS platforms.

The advances in technology are encouraging. Now companies must tackle other equally
important and challenging issues to ensure the highest level of uptime and reliability. Close
attention must be paid to integration and interoperability, patch management, documentation and
getting the necessary training and certification for the appropriate IT managers. The most
bulletproof hardware and software platforms can be undone by human error. It’s equally
important that companies find the funds to stay as current as possible on their server hardware
and server OS software. Performance will suffer if the server configuration is old and
inadequate.




Recommendations
Server hardware and server operating system reliability has improved vastly since the 1980s,
1990s and even in just the last two to three years. While technical bugs still exist, the number,
frequency and severity have declined significantly.

With few exceptions, common human error poses a bigger threat to server hardware and server
operating system reliability than technical glitches.


Crucial TCO metrics such as reliability, performance, security and management ultimately
depend as much on each firm’s specific implementation as they do on the properties of the
server and server OS technology itself. There are inherent dependencies between the underlying
capabilities of a particular server operating system and an individual corporation’s ability to
adhere to best deployment practices with respect to training, testing and configuration. The
reliability, security and manageability of even the most hardened server and server operating
system are easily compromised by human error.

A company that does not restrict physical access to the server is asking for trouble. Similarly,
any firm that does not enact and enforce strong usage and security policies risks
compromising the reliability and integrity of its server hardware and server OS environment. The
reliability of the server environment can also be undone or seriously compromised by such
actions as: a bad configuration; the use of incompatible or unapproved memory and logic chips,
hardware, peripherals and software drivers; overclocking machines; failing to apply necessary
patches; failing to upgrade or retrofit inadequate or obsolete servers and operating systems; and
taxing server and software resources beyond their capabilities.



Recommendations for Corporate Customers
To optimize uptime and reliability, ITIC advises corporations to:

          Regularly analyze and review configurations, usage and performance levels. This
           will enable companies to determine whether or not their current server and server OS
           environment allows them to achieve optimal reliability.

          Adopt formal SLAs. Service level agreements enable organizations to define acceptable
           performance metrics. Companies should meet with their vendors and customers on at
           least an annual basis to ensure the terms are met.

          Define, measure and monitor reliability and performance metrics. It is imperative that
           companies measure component, system, server hardware, desktop and server OS, security,
           network infrastructure, storage and application performance. Keep a continuous log of
           the planned and unplanned downtime throughout the enterprise.

          Regularly track server and server OS reliability and downtime. Keep accurate
           records of outages and their causes. Segment the outages according to their severity and
           length – e.g. Tier 1, Tier 2 and Tier 3. The appropriate IT managers should also keep
           detailed logs of remediation efforts in the event of an outage. These logs should include
           a full account of remediation activities, specifying how the problem was solved, how
           long it took and which staff members participated. They should also list the
           monetary costs as well as any material impact on the business, its operations and its end
           users. This will prove an invaluable resource should the problem recur. It may also make
           the difference in containing or curtailing the reliability-related incident, saving
           precious time for the IT department, the end users and corporate customers.

          Calculate the cost of unplanned downtime. Companies should determine the average
           cost of minor Tier 1 outages. They should also keep more detailed cost assessments of the
           more serious unplanned Tier 2 and Tier 3 incidents. It’s essential for businesses to know
           the monetary amount of each outage – including IT and end user salaries due to
           troubleshooting and any lost productivity – as well as the impact on the business. C-level
           executives and IT managers should also pay close attention to whether the
           company’s reputation suffered as a result of a reliability incident, whether any
           litigation ensued and whether customers, business partners and suppliers were impacted
           (and at what cost), and at least try to gauge whether the company lost business or
           potential business.

          Ensure that your organization has robust server hardware that can adequately
           handle the OS and application workloads. The server hardware (standalone, blade,
           cluster, etc.) and the server operating system are inextricably linked. To achieve optimal
           performance from both components, corporations must ensure that the server hardware is
           robust enough to carry both the current and anticipated workloads for the lifecycle of
           both.
          Compile a list of best practices and adhere to them. This is absolutely essential. Chief
           technology officers (CTOs), software developers, engineers, network administrators and
           managers should have extensive familiarity with the products they currently use and are
           considering. Check and adhere to your vendors’ list of approved, compatible hardware,
           software and applications. Software developers and network administrators must obey the
           rules. That means avoiding ill-advised practices such as overclocking server
           and desktop hardware or allowing unskilled or neophyte administrators to make changes to
           the registry. All of these actions can lead to serious reliability problems.
          Don’t skimp on training and recertification for IT administrators, software
           developers and engineers. In these days of budget cuts, it’s common practice to
           eliminate monies that were formerly earmarked for training. ITIC understands that money
           is tight. If you can’t afford the time or expense to re-certify your entire IT department,
           designate the most experienced or appropriate IT staffer to take the course – even if it’s
           only an online course – and allow that person to train additional appropriate managers.
          Perform regular asset management testing. Schedule asset management reviews on a
           yearly, bi-annual or quarterly basis, as needed. This will assist your company in remaining
           current on hardware and software and help you to adhere to the terms and conditions of
           licensing contracts. All of these issues influence network reliability. It also allows
           organizations to be better equipped to meet their SLA requirements and maintain peak
           performance and reliability.
          Manual vs. Automated Group Policy Patch Management. IT managers, particularly in
           high end UNIX environments and in corporations whose environments feature a high degree
           of customization, will continue to perform manual patch management.


     Keep your software updated with the latest necessary patches and upgrades. You don’t
           have to apply every patch, but it’s wise to keep track of which patches are crucial to the
           network’s health. Construct and adhere to a regular schedule to apply patches, preferably on a
           monthly basis. This will help the company avoid potentially nasty surprises.
          Standardize legacy and future hardware, server OS and application environments
           as much as possible. ITIC survey data indicates that standardization (that is, following a
           prescribed configuration and version for the company’s hardware, software and network
           infrastructure components) can lower TCO by 15%. Standardization benefits all
           users, including organizations that have custom configurations.
          Note that custom software implementations require the highest level of expertise. Any
           firm that elects to customize its Linux or open source server operating system distribution
           should either employ guru-level administrators or contract with a systems integrator or
           outsourcer with the appropriate expertise.
          Automated patch management applied via Group Policy vs. manual patching.
           Companies should also regularly review whether it is feasible for the firm to migrate away
           from manual patch management. Collecting this information may seem to be a chore at first,
           but it will prove invaluable in guiding the company to lower its TCO and improve its
           ROI.
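
The record-keeping and cost recommendations above (log each outage, segment by tier, and attach a monetary cost) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier boundaries of 30 and 240 minutes, the hourly rates, the sample outages and every name in the code are hypothetical, and each organization should substitute its own tier definitions and salary figures.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical tier boundaries in minutes; replace with your own definitions.
TIER1_MAX_MINUTES = 30
TIER2_MAX_MINUTES = 240


@dataclass
class Outage:
    server: str
    minutes: float            # length of the unplanned outage
    cause: str
    remediation: str          # how the problem was solved
    staff: tuple[str, ...]    # who took part in the remediation
    users_affected: int


def tier(outage: Outage) -> int:
    """Classify an outage by length using the assumed boundaries above."""
    if outage.minutes <= TIER1_MAX_MINUTES:
        return 1
    if outage.minutes <= TIER2_MAX_MINUTES:
        return 2
    return 3


def outage_cost(outage: Outage, it_hourly_rate: float, user_hourly_rate: float) -> float:
    """Salary cost of troubleshooting plus lost end-user productivity."""
    hours = outage.minutes / 60
    it_cost = len(outage.staff) * it_hourly_rate * hours
    user_cost = outage.users_affected * user_hourly_rate * hours
    return it_cost + user_cost


def summarize(log, it_hourly_rate: float, user_hourly_rate: float) -> dict:
    """Total the count, minutes and cost of outages for each tier."""
    summary = defaultdict(lambda: {"count": 0, "minutes": 0.0, "cost": 0.0})
    for outage in log:
        entry = summary[tier(outage)]
        entry["count"] += 1
        entry["minutes"] += outage.minutes
        entry["cost"] += outage_cost(outage, it_hourly_rate, user_hourly_rate)
    return dict(summary)


# Made-up sample data for illustration only.
log = [
    Outage("app01", 12, "failed patch", "rolled back patch", ("alice",), 40),
    Outage("db02", 90, "disk failure", "replaced disk and restored", ("alice", "bob"), 200),
]
report = summarize(log, it_hourly_rate=60.0, user_hourly_rate=30.0)
for level in sorted(report):
    print(level, report[level])
```

The sketch covers only salary and productivity costs. Harder-to-quantify impacts such as reputational damage, litigation or lost business would need to be recorded separately.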


Recommendations for Vendors
It is a buyer’s market and is likely to remain so for the foreseeable future. Competition among
vendors is intense because businesses have a wide array of server hardware and server operating
system platforms from which to choose. In order to retain the current customer base and attract
new corporate customers, all of the vendors must strive to improve the features, performance,
reliability and security of their respective server hardware and server OS software. Additionally,
ITIC advises vendors to:


    Embrace Interoperability and Integration. The survey data indicates that problems with
     backwards compatibility and integration with other hardware, server OS, applications and
     third party tools and utilities pose a significant potential threat to the underlying
     stability of the network environment.
    Provide Explicit Guidance around Patches and Patch Management. Patches vary
     according to the importance, severity of the fix or update and by the number of patches in a
     formal release as well. Data ITIC obtained from anecdotal essay comments and first person
     customer interviews underscore the need for vendors to issue patches in an efficient,
     expeditious manner and to provide full transparency on the nature and severity of all bugs.
     Many IT managers expressed frustration and confusion with the patch management process,
     which was sometimes cumbersome. IT managers also noted that oftentimes they were unsure
     of which patches were crucial versus optional. ITIC advises vendors to deliver specific
     recommendations and instructions on the download process, since patch management is a
     crucial element of IT management that can positively or negatively impact reliability.


    Provide the latest technical documentation. Ready access to clear, concise technical
     guidelines and detailed documentation has never been more important. The economic
     downturn forced many companies to cut staff. Time and money are scarce or non-existent for
     training and re-certification of IT administrators. It is therefore crucial that vendors pick up
     the slack and publicize and disseminate technical “how to” guidelines via their respective
     websites, emails and webinars.
    Vendors should also actively work with third party ISVs to assist in resolving driver
     and application compatibility issues. As we noted above, integration and interoperability
     issues are a top priority for IT departments that wish to maintain a high level of reliability.
     While many of the largest third party ISVs do an exemplary job of ensuring that their
     applications and drivers are certified to work with new server hardware and server OS
     releases, many smaller and niche ISVs, particularly in specific verticals like finance, legal
     and healthcare, lack the necessary resources and funds to support new
     releases. Vendors should poll their customers on which third party applications, drivers and
     utilities are crucial and when necessary assist ISVs in providing the necessary compatibility.
    Work with partners to provide expanded access to discounted certification and online
     training courses. One of the biggest challenges confronting IT departments today is finding
     the money and sparing the time to get the appropriate administrators re-trained and certified
     on the latest server hardware and server OS software.




© Copyright 2009, Information Technology Intelligence Corp. (ITIC) All rights reserved.
Other products and companies referred to herein are trademarks or registered trademarks of their respective companies or mark holders.

                                                                                                                                         Page 23

More Related Content

What's hot

A brief look at ibm mainframe history
A brief look at ibm mainframe historyA brief look at ibm mainframe history
A brief look at ibm mainframe historysivaprasanth rentala
 
Ims keeping current for phoenix
Ims keeping current for phoenixIms keeping current for phoenix
Ims keeping current for phoenixJeff Pearce
 
Integrated Intrusion Detection Services for z/OS Communications Server
Integrated Intrusion Detection Services for z/OS Communications Server Integrated Intrusion Detection Services for z/OS Communications Server
Integrated Intrusion Detection Services for z/OS Communications Server zOSCommserver
 
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CSTCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CSzOSCommserver
 
IBM Z/OS support for z15 - oct 2021
IBM Z/OS support for z15 -  oct 2021IBM Z/OS support for z15 -  oct 2021
IBM Z/OS support for z15 - oct 2021Marna Walle
 
Upgrade to IBM z/OS V2.4 planning
Upgrade to IBM z/OS V2.4 planningUpgrade to IBM z/OS V2.4 planning
Upgrade to IBM z/OS V2.4 planningMarna Walle
 
z/OS Small Enhancements - Edition 2020A
z/OS Small Enhancements - Edition 2020Az/OS Small Enhancements - Edition 2020A
z/OS Small Enhancements - Edition 2020AMarna Walle
 
Upgrade to IBM z/OS V2.4 technical actions
Upgrade to IBM z/OS V2.4 technical actionsUpgrade to IBM z/OS V2.4 technical actions
Upgrade to IBM z/OS V2.4 technical actionsMarna Walle
 
Upgrade to IBM z/OS V2.5 technical actions
Upgrade to IBM z/OS V2.5 technical actionsUpgrade to IBM z/OS V2.5 technical actions
Upgrade to IBM z/OS V2.5 technical actionsMarna Walle
 
System Z Mainframe Security For An Enterprise
System Z Mainframe Security For An EnterpriseSystem Z Mainframe Security For An Enterprise
System Z Mainframe Security For An EnterpriseJim Porell
 
Ugif 04 2011 informix fug-paris
Ugif 04 2011   informix fug-parisUgif 04 2011   informix fug-paris
Ugif 04 2011 informix fug-parisUGIF
 
Systemz Security Overview (for non-Mainframe folks)
Systemz Security Overview (for non-Mainframe folks)Systemz Security Overview (for non-Mainframe folks)
Systemz Security Overview (for non-Mainframe folks)Mike Smith
 
Open source-options-v1
Open source-options-v1Open source-options-v1
Open source-options-v1Hieu Le Trung
 
Public Training Power System for AIX : AIX Implementation & Administration (A...
Public Training Power System for AIX : AIX Implementation & Administration (A...Public Training Power System for AIX : AIX Implementation & Administration (A...
Public Training Power System for AIX : AIX Implementation & Administration (A...Hany Paulina
 
Optimized License Management for the Datacenter
Optimized License Management for the DatacenterOptimized License Management for the Datacenter
Optimized License Management for the DatacenterFlexera
 

What's hot (16)

A brief look at ibm mainframe history
A brief look at ibm mainframe historyA brief look at ibm mainframe history
A brief look at ibm mainframe history
 
Ims keeping current for phoenix
Ims keeping current for phoenixIms keeping current for phoenix
Ims keeping current for phoenix
 
Integrated Intrusion Detection Services for z/OS Communications Server
Integrated Intrusion Detection Services for z/OS Communications Server Integrated Intrusion Detection Services for z/OS Communications Server
Integrated Intrusion Detection Services for z/OS Communications Server
 
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CSTCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
 
IBM Z/OS support for z15 - oct 2021
IBM Z/OS support for z15 -  oct 2021IBM Z/OS support for z15 -  oct 2021
IBM Z/OS support for z15 - oct 2021
 
Upgrade to IBM z/OS V2.4 planning
Upgrade to IBM z/OS V2.4 planningUpgrade to IBM z/OS V2.4 planning
Upgrade to IBM z/OS V2.4 planning
 
z/OS Small Enhancements - Edition 2020A
z/OS Small Enhancements - Edition 2020Az/OS Small Enhancements - Edition 2020A
z/OS Small Enhancements - Edition 2020A
 
Upgrade to IBM z/OS V2.4 technical actions
Upgrade to IBM z/OS V2.4 technical actionsUpgrade to IBM z/OS V2.4 technical actions
Upgrade to IBM z/OS V2.4 technical actions
 
Upgrade to IBM z/OS V2.5 technical actions
Upgrade to IBM z/OS V2.5 technical actionsUpgrade to IBM z/OS V2.5 technical actions
Upgrade to IBM z/OS V2.5 technical actions
 
DeltaV Virtualization
DeltaV VirtualizationDeltaV Virtualization
DeltaV Virtualization
 
System Z Mainframe Security For An Enterprise
System Z Mainframe Security For An EnterpriseSystem Z Mainframe Security For An Enterprise
System Z Mainframe Security For An Enterprise
 
Ugif 04 2011 informix fug-paris
Ugif 04 2011   informix fug-parisUgif 04 2011   informix fug-paris
Ugif 04 2011 informix fug-paris
 
Systemz Security Overview (for non-Mainframe folks)
Systemz Security Overview (for non-Mainframe folks)Systemz Security Overview (for non-Mainframe folks)
Systemz Security Overview (for non-Mainframe folks)
 
Open source-options-v1
Open source-options-v1Open source-options-v1
Open source-options-v1
 
Public Training Power System for AIX : AIX Implementation & Administration (A...
Public Training Power System for AIX : AIX Implementation & Administration (A...Public Training Power System for AIX : AIX Implementation & Administration (A...
Public Training Power System for AIX : AIX Implementation & Administration (A...
 
Optimized License Management for the Datacenter
Optimized License Management for the DatacenterOptimized License Management for the Datacenter
Optimized License Management for the Datacenter
 

Similar to ITIC 2009 Global Server Hardware and Server OS Reliability Survey

Presentazione IBM Flex System e System x Evento Venaria 14 ottobre
Presentazione IBM Flex System e System x Evento Venaria 14 ottobrePresentazione IBM Flex System e System x Evento Venaria 14 ottobre
Presentazione IBM Flex System e System x Evento Venaria 14 ottobrePRAGMA PROGETTI
 
Focus Group Open Source 09.05.2011 Massimiliano Belardi
Focus Group Open Source 09.05.2011 Massimiliano BelardiFocus Group Open Source 09.05.2011 Massimiliano Belardi
Focus Group Open Source 09.05.2011 Massimiliano BelardiRoberto Galoppini
 
Choosing IBM Flex System for Your Private Cloud Infrastructure
Choosing IBM Flex System for Your Private Cloud InfrastructureChoosing IBM Flex System for Your Private Cloud Infrastructure
Choosing IBM Flex System for Your Private Cloud InfrastructureIBM India Smarter Computing
 
Virtualization Performance on the IBM PureFlex System
Virtualization Performance on the IBM PureFlex SystemVirtualization Performance on the IBM PureFlex System
Virtualization Performance on the IBM PureFlex SystemIBM India Smarter Computing
 
MongoDB Linux Porting, Performance Measurements and and Scaling Advantage usi...
MongoDB Linux Porting, Performance Measurements and and Scaling Advantage usi...MongoDB Linux Porting, Performance Measurements and and Scaling Advantage usi...
MongoDB Linux Porting, Performance Measurements and and Scaling Advantage usi...MongoDB
 
ITIC 2009 Global Server Hardware and Server OS Reliability Survey

  • 1. INFORMATION TECHNOLOGY INTELLIGENCE CORP. ITIC 2009 Global Server Hardware and Server OS Reliability Survey July 2009 © Copyright 2009, Information Technology Intelligence Corp. (ITIC) All rights reserved. Other products and companies referred to herein are trademarks or registered trademarks of their respective companies or mark holders.
  • 2. Executive Summary
    “Time is money”
    For the second year in a row, IBM AIX UNIX running on the Power or “P” series servers scored the highest reliability ratings among 15 different server operating system platforms, including Linux, Mac OS X, UNIX and Windows.
    Those are the results of the ITIC 2009 Global Server Hardware and Server OS Reliability Survey, which polled C-level executives and IT managers at 400 corporations from 20 countries worldwide. The results indicate that the IBM AIX operating system running on Big Blue’s Power servers (System p5s) is the clear winner; it offers rock-solid reliability, besting all competing operating systems, including those running on Intel-based x86 machines. The IBM servers running AIX consistently score at least 99.99% reliability, or just 15 minutes of unplanned downtime per server, per annum (See Exhibit 1).
    Overall, the results showed improvements in reliability and patch management procedures, and an across-the-board reduction in per server, per annum Tier 1, Tier 2 and the most severe Tier 3 outages.
      - IBM AIX on the Power series System p5 and System p6 servers leads all vendors for both server hardware and server OS reliability. The IBM UNIX distribution recorded the fewest Tier 1, Tier 2 and Tier 3 unplanned server outages per year. IBM AIX running on the System p5s and newer p6s had less than one unplanned outage incident per server in a 12-month period. More impressively, the IBM servers experienced no severe Tier 3 outages.
      - Hewlett-Packard’s HP UX 11i running on the HP 9000 and Integrity servers also performed very well, though HP servers notch approximately 21 to 25 minutes more downtime than IBM servers, depending on model and configuration. HP UX 11i v.3 Update 4 on the HP 9000s averaged 36 minutes of per server, per annum downtime, while HP UX 11i v.3 Update 4 on HP Integrity servers recorded 39 minutes of per server, per annum downtime.
      - Faster Patch Management. IT managers spend approximately 11 minutes applying patches to IBM servers running the AIX operating system, which is again the least amount of time spent patching any server or operating system. The open source Ubuntu distribution is a close second, with IT managers spending 12 minutes to apply patches, while IT managers in the Novell SUSE Enterprise, customized Linux distribution and Apple Mac OS X 10.x environments each spend a very economical 15 to 19 minutes applying patches.
      - Unplanned Severe Tier 2 and Tier 3 Outages Decline. IBM also took top honors in another important category: IBM Power Series System p5 and p6 servers running AIX experience the lowest number of the more severe Tier 2 and Tier 3 outages combined of any server hardware or server operating system. The combined total of Tier 2 and Tier 3
  • 3. outages accounted for just 19% of all per server, per annum failures in IBM network environments. HP UX on the 9000 and Integrity servers, Novell SUSE Linux Enterprise 11 and “other” Linux distributions were close behind, with combined Tier 2 + Tier 3 outages accounting for 24% to 25% of unplanned yearly downtime.
      - Novell SUSE Superiority. Among the Linux and open source server operating system distributions, both Novell SUSE Linux Enterprise 10 and 11 consistently achieved superior reliability ratings. In fact, Novell SUSE in a customized implementation had the lowest downtime of any distribution, approximately 16 minutes per server/server OS, per annum, with the exception of IBM’s AIX on the Power Series. In their anecdotal responses and first-person customer interviews, many IT managers specifically extolled the high level of integration and interoperability between Novell SUSE Linux Enterprise and Microsoft Windows Server 2003 and Windows Server 2008 in heterogeneous networks.
      - Most Improved. Microsoft Windows Server 2003 and Windows Server 2008 showed the biggest improvements of any of the vendors. The Windows Server 2003 and 2008 operating systems running on Intel-based platforms saw a 35% reduction in unplanned per server, per annum downtime, from 3.77 hours in 2008 to 2.42 hours in 2009. The number of annual Windows Server Tier 3 outages also decreased by 31% year over year, and the time spent applying patches similarly declined by 35% from last year, to 32 minutes in 2009.
      - Apple Mac and OS X 10.x Competitive Enterprise Reliability. This year’s survey for the first time also incorporated reliability results for the Apple Mac and OS X 10.x OS platform. Over the past two to three years, the Apple Mac platform has made a comeback in corporate enterprises. The numbers of Mac G4 servers are modest in comparison to the more entrenched Windows, Linux and UNIX distributions. Nonetheless, they are making their presence known. IT managers report that reliability has been generally very good. The survey respondents indicated that the Apple Mac G4 servers are extremely competitive in an enterprise setting. IT managers spend approximately 15 minutes per server applying patches and recorded an average downtime of about 40 minutes per server, per annum. It is important to note that at this point, the workloads of the G4 Macs are not comparable to those of the high-end IBM, HP and Sun (now Oracle) UNIX systems or the customized Linux and open source distributions.
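The availability and year-over-year figures quoted in the summary can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the ITIC report; the function names are hypothetical and a 365-day year is assumed.

```python
# Illustrative arithmetic only; names are hypothetical, not from the ITIC report.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def annual_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year implied by an availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def availability_pct(downtime_minutes: float) -> float:
    """Availability percentage implied by a given annual downtime in minutes."""
    return (1 - downtime_minutes / MINUTES_PER_YEAR) * 100


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


if __name__ == "__main__":
    # 99.99% availability allows roughly 52.56 minutes of downtime per year.
    print(f"{annual_downtime_minutes(99.99):.2f} min/year at 99.99%")
    # 15 minutes of annual downtime corresponds to about 99.997% availability.
    print(f"{availability_pct(15):.3f}% at 15 min/year")
    # Windows Server downtime: 3.77 hours (2008) to 2.42 hours (2009).
    print(f"{percent_reduction(3.77, 2.42):.1f}% reduction")
```

Under these assumptions, 15 minutes of annual downtime works out to roughly 99.997% availability, which is consistent with the "at least 99.99%" figure, and the drop from 3.77 to 2.42 hours is a reduction of about 35.8%, in line with the 35% the report cites.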
  • 4. The intent of this Report is to quantify and qualify the reliability of 15 different server operating system platforms running on a variety of proprietary UNIX and Intel-based hardware platforms. This will allow organizations to more easily identify baseline reliability metrics associated with individual platforms in order to better determine and optimize their total cost of ownership (TCO), accelerate return on investment (ROI) and more efficiently manage risk.
  • 5. Table of Contents
      - Executive Summary
      - Introduction
          - Survey Methodology
          - Survey Demographics
      - Data & Analysis
      - Conclusions
      - Recommendations
          - Recommendations for Corporate Customers
          - Recommendations for Vendors
  • 6. Introduction
    Server hardware and server operating system reliability is the foundation upon which crucial applications, storage, security, and third-party utilities and management rest. The stability and health of the entire network infrastructure depend heavily on the server hardware and the operating systems that run on it. Server hardware and server operating system reliability are inextricably linked to the corporation’s ability to lower its TCO, accelerate ROI and reduce the risk factors that negatively impact performance.
    Information on specific reliability metrics allows businesses to calculate the real-time resources and monies needed to manage and maintain their various server hardware platforms and operating systems. It also enables them to determine whether their mission-critical server hardware and operating system software are assisting or impeding the business in meeting key service level agreements (SLAs) to their customers, business partners and suppliers, as well as internally to the company’s own end users.
    The ITIC self-selecting reliability survey polled IT managers at 400 corporations worldwide on the annual amount and percentage of unplanned per server, per annum downtime experienced in the following 15 server hardware and server OS environments:
      - IBM AIX on Power series System p5 and p6 servers
      - HP UX on the 9000
      - HP UX on Integrity servers
      - Sun Solaris UNIX on the SPARC Servers
      - Apple Mac OS X 10.5, 10.6 on G4 Macs
      - Novell SUSE Linux Enterprise on Intel x86 servers
      - Novell SUSE Linux Enterprise on Intel x86 servers
      - Red Hat Enterprise Linux on Intel x86 servers
      - Red Hat Enterprise Linux with customization
      - Windows Server 2003 on Intel x86 servers
      - Windows Server 2008 on Intel x86 servers
      - Ubuntu open source
      - Debian open source
      - Other Linux distributions (e.g. Mandriva, Turbo Linux)
      - Other Linux distributions with customization
    The survey data gives a detailed comparative breakdown of the percentage of Tier 1, Tier 2 and highest-severity Tier 3 outages.
  • 7. ITIC’s definition of server outages is as follows:
      - Tier 1: These are typically minor, common, albeit annoying occurrences. A network administrator can usually resolve such incidents with less than 30 minutes of downtime for dependent users. Tier 1 incidents can usually be resolved by rebooting the server and rarely involve any data loss.
      - Tier 2: These are moderate issues in which the server may be offline from one hour to four hours, or about a half-day. Tier 2 problems may require the intervention of more than one network administrator to troubleshoot, and they frequently affect the corporation’s end users and possibly business partners, customers and suppliers in the event they are attempting to access data on an affected corporate extranet.
      - Tier 3: This is the most severe type of incident. Tier 3 outages last longer than four hours for network administrators and the company’s associated dependent users. Tier 3 outages almost always require a team of multiple network administrators to resolve. Data loss or damage to systems and applications may or may not occur. Another real threat associated with a protracted Tier 3 outage is potential lost business and the potential damage to the company’s reputation.
    The length and severity of each of these outages correspond to specific line item capital expenditure and operational expenditure costs for the business. Reliability, measured by downtime, can positively or negatively impact TCO and accelerate or delay the time it takes to realize ROI. Improvements or declines in reliability also mitigate or increase technical and business risks to the organization’s end users and external customers.
    The ability to meet service-level agreements (SLAs) hinges on server reliability, uptime and manageability. These are key indicators that enable organizations to determine which server operating system platform or combination thereof is most suitable.
    The survey data detailed the disparity in the number and severity of unplanned server outages and the amount of time, in minutes and hours, that businesses experience on the various Linux, Windows and UNIX platforms. The survey closely examined both the actual quantitative reliability statistics and the qualitative issues that positively or negatively impacted outage time. The ITIC survey queried corporate IT managers and C-level executives on myriad reliability-related functions including:
      - The amount of downtime (minutes/hours) experienced per server, per annum
      - The amount of time spent patching each server
      - Whether the IT administrators apply updates via an automated group policy procedure or manually apply the patches to individual servers
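ITIC's three outage tiers can be read as duration thresholds, and the "combined Tier 2 + Tier 3" figures cited in the report as a share of incidents. The following is a minimal sketch under stated assumptions, not ITIC's methodology: the function names are hypothetical, outages between 30 and 60 minutes (which the definitions leave unspecified) are assigned to Tier 2, and the share is computed over incident counts.

```python
from collections import Counter
from typing import Iterable


def classify_outage(duration_minutes: float) -> int:
    """Map an outage duration to an ITIC-style tier (1, 2 or 3).

    Assumption: the report gives "less than 30 minutes" for Tier 1,
    "one hour to four hours" for Tier 2 and "longer than four hours"
    for Tier 3; the 30-60 minute gap is treated as Tier 2 here.
    """
    if duration_minutes < 0:
        raise ValueError("duration must be non-negative")
    if duration_minutes < 30:
        return 1
    if duration_minutes <= 240:
        return 2
    return 3


def severe_share(durations: Iterable[float]) -> float:
    """Percentage of incidents that fall into Tier 2 or Tier 3."""
    counts = Counter(classify_outage(d) for d in durations)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100 * (counts[2] + counts[3]) / total


if __name__ == "__main__":
    # Made-up durations in minutes, purely for illustration.
    sample = [10, 15, 20, 90, 300]
    print([classify_outage(d) for d in sample])
    print(severe_share(sample))
```

The sample durations are invented for illustration and are not survey data. Counting incidents rather than minutes is one possible reading of the report's percentages; weighting by duration would need only a change to the numerator and denominator.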
  • 8. On average, individual corporate Linux, Windows and UNIX servers experience from zero to approximately two failures per server per year. This results in annual downtime for each server ranging from 20 minutes in the best case (IBM AIX running on p5 and p6 Power servers) to 4.3 hours (Debian open source). Windows Server 2008 servers experienced a total of just under three unplanned yearly Tier 1, Tier 2 and Tier 3 outages. However, the necessity of taking many of the Windows Servers offline to apply monthly patches and then do a system reboot resulted in Windows Server 2008 machines being offline for just under two and a half hours each year. Still, this is a 35% reduction from the 3.77 hours of downtime experienced by Windows Server 2008 machines in last year’s ITIC reliability survey.
    Among the Linux distributions, Novell SUSE Enterprise exhibited consistent reliability reminiscent of the late 1980s and 1990s, when Novell NetWare was famous for running several years, in some cases as long as nine years, without experiencing a failure or the need to reboot. This can be attributed to the stability of the Novell distribution, the experience of the SUSE engineers and the length of experience of many IT managers who came from the NetWare environment. Novell also inked an interoperability and technical service and support agreement with Microsoft two and a half years ago, which also served to improve reliability.
    The open source Ubuntu distribution also scored some impressive reliability gains as it continues to gain in popularity and deployments.
    Overall, these survey responses provide crucial, comparative reliability metrics to enable customers to make informed choices on which server hardware and server operating system, or combination thereof, best suits their specific business and budget needs.
Survey Methodology

ITIC conducted the 2009 Global Server Hardware and Server OS Survey, an independent Web-based survey that included multiple-choice questions and essay responses, from March through July 2009. ITIC polled C-level executives and IT managers at 400 corporations worldwide. ITIC analysts supplemented the Web survey by conducting two dozen first-person customer interviews. ITIC conducted additional interviews with customers in October 2009 and updated the Report with specific information on server downtime statistics. The anecdotal data obtained from these interviews validates the survey responses and provides deeper insight into the challenges confronting businesses in both the immediate and long term.

To deliver the most unbiased, accurate information, ITIC did not accept any vendor sponsorship money for the online poll or the subsequent first-person interviews conducted in connection with this project. ITIC employed authentication and tracking mechanisms to prevent tampering and to prohibit multiple responses by the same parties.
Survey Demographics

Companies of all sizes and all vertical markets were represented in the survey. Respondents came from companies ranging from small and medium businesses (SMBs) with fewer than 50 workers, to large enterprises with more than 100,000 employees. Roughly 33% of the survey respondents came from the SMB segment with 1 to 100 employees; 12% of those polled were from midsize companies with 100 to 500 employees; 14% were drawn from corporations employing 500 to 1,000 employees; and 41% of respondents worked in large enterprises with 1,000 to more than 100,000 workers.

The survey was global in reach. Approximately 85% of respondents came from North America. The remaining 15% hailed from more than 20 countries across Europe, Asia, Australia, New Zealand, South America and Africa.

Data & Analysis

Server hardware and server operating system reliability has improved markedly in the last five years. When ITIC began conducting reliability research and surveys, our original definition of unplanned downtime was an unexpected external or internal incident that caused the server hardware and/or the server operating system software to spontaneously fail or freeze, thereby disrupting network operations and requiring remediation efforts and a reboot. Depending on the seriousness of the incident, the downtime may also have resulted in lost or damaged data.

However, it quickly became apparent from the anecdotal survey comments and during our first-person customer interviews that IT managers and network administrators had a broader definition of what constituted downtime. As far as IT departments are concerned, anything that causes them to take the server offline, regardless of the cause, is unplanned downtime. Included in this category are instances of vendors releasing an unanticipated patch to fix a technical bug or security vulnerability.
Such an occurrence does not qualify as unplanned downtime in the narrowest definition of the term, but network administrators often do not make that distinction. To them downtime is downtime: it disrupts their routine and may also impact daily operations, because the IT department must devote time to remedial issues that would otherwise have been spent performing other IT chores. And in some network environments like Windows, it’s still necessary to take the servers down, apply the patch and perform a hard reboot.

Time very literally equates to money. The economic downturn has forced companies to cut staff and put network and software upgrades on hold; it has decimated IT departments and severely reduced training and recertification for network administrators.
A recent ITIC survey that polled 250 corporations worldwide in October 2009 found that 47% of businesses had budget cuts within the past 12 months. That number was even greater for companies with over 500 end users; 64% of large enterprises experienced budget cuts. Consequently, 84% of the respondents reported that their IT departments simply pick up the slack and work longer and harder.

Downtime by the Numbers

In the early days of networks, corporate enterprises considered 99% uptime to be an adequate reliability standard. Not so in 2009. An ITIC survey of 250 enterprises conducted in October found that only 14% of survey respondents consider 99% uptime adequate for their most mission critical, line of business (LOB) applications. Another 14% said that 99.9%, or three nines, met their reliability needs. A two-thirds majority (66%) of those polled, however, said their network environments require 99.95%, 99.999% or greater reliability for their most mission critical LOBs.

It’s easy to see why when you correlate the uptime percentages to actual downtime:

99% = average unplanned downtime of one hour and 40 minutes per week
99.9% = average unplanned downtime of 45 minutes per month
99.95% = average unplanned downtime of 22 minutes per month
99.999% = average unplanned downtime of 5 1/2 minutes per year

Taken in this context, it’s easy to understand how the ongoing economic crisis has cast renewed emphasis on server and server operating system reliability. Businesses of all sizes and across all vertical markets are extremely risk averse. IT departments grapple daily with the reality of keeping networks up and running in the face of cost cuts, layoffs and fewer resources.
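The correspondence between uptime percentages and downtime listed under "Downtime by the Numbers" is simple arithmetic, and it can be useful to reproduce it for other percentages or time periods. The following is a minimal sketch, not part of the ITIC survey; the function name `downtime_minutes` and the assumption of a 365-day year are illustrative choices.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 8,760 hours, leap years ignored


def downtime_minutes(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Minutes of downtime in a period of `period_hours` at the given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * period_hours * 60.0


# Tabulate downtime per year, per average month and per week.
for pct in (99.0, 99.9, 99.95, 99.99, 99.999):
    per_year = downtime_minutes(pct)
    per_month = downtime_minutes(pct, HOURS_PER_YEAR / 12)
    per_week = downtime_minutes(pct, 7 * 24)
    print(f"{pct}%: {per_year:8.1f} min/year  {per_month:6.1f} min/month  {per_week:6.1f} min/week")
```

Under these assumptions, 99% uptime works out to roughly 101 minutes of downtime per week and 99.999% to a little over five minutes per year, in line with the rounded figures above.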
Businesses and their IT departments are under pressure to maximize server hardware and server operating system uptime in order to realize the greatest economies of scale and ensure that their server hardware, server operating systems and the crucial business applications and services that run on them are available to end users, corporate clients, business partners and suppliers. A server outage of even a few minutes’ duration can disrupt network operations and result in lost data, steep monetary losses and damage to a company’s reputation.

Reliability Then and Now

The first generations of server hardware and server operating system software platforms, introduced in the mid-to-late 1980s, were proprietary. Network administrators typically became experts in a particular vendor’s platform. The 1.0 versions of new hardware and software products
from 10 to 20 years ago were also rife with bugs. It typically took from six months to a year for the vendors to work the kinks out and achieve an acceptable level of stability, and for IT managers to gain sufficient expertise and knowledge, resulting in higher levels of uptime.

It is also worth noting that two decades ago, businesses were not as wholly dependent on their networks as they are today. In the 1990s, 99% reliability was considered an acceptable industry standard. That is no longer the case; 99% uptime is the equivalent of over 80 hours of annual per server downtime.

ITIC’s separate 2009 Global Application Availability Survey, conducted in April, found that eight out of 10 of the 300 businesses polled said that their major business applications require higher availability rates than they did two or three years ago. However, nearly three-quarters of companies (72%) are unable to quantify the cost of downtime or the impact that unplanned reliability outages have on the business.

Among the other 2009 Global Application Availability survey findings:

• Roughly three out of five organizations (61%) are unsure of how to estimate the impact of downtime on the business or do not even attempt to track the losses associated with application downtime and reliability.
• Two out of five firms (41%) said they require conventional 99% to 99.9% application availability; 29% said they needed 99.95% or 99.99% uptime; while 7% of respondents indicated they need continuous availability of 99.999% or 99.9999%.
• Just under half of companies (49%) lack the budget to purchase additional third party software or hardware availability technology. This places more of an onus on the underlying server hardware and server OS to deliver high reliability.

The responses from the ITIC 2009 Global Application Availability Survey underscore the crucial importance of having highly reliable server hardware and server operating systems.
If the servers, server OS and related applications are unavailable for any reason, business and daily operations grind to a halt, with sometimes catastrophic results.

The demand for server hardware, server OS and application availability has grown, particularly with the emergence of new technologies like cloud computing and virtualization. Corporations need to ensure that reliability keeps pace. To quantify the reliability statistics: 99.99% uptime equates to approximately 53 minutes of per server, per annum downtime, while 99.95% uptime equates to roughly 4.4 hours.

Today’s networks demand near perfect reliability; corporations deem any downtime anathema to their business operations. This is particularly true for those companies in vertical markets such as banking and finance, stock exchanges, insurance, healthcare and legal, whose businesses are based on intensive data transactions. A server crash of even 15 to 30 minutes’ duration can cost a company from tens of thousands to tens of millions of dollars in lost business and remediation efforts.

Zero downtime, or as close to it as is humanly and technologically possible, is the obvious goal and Holy Grail of reliability.
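To make the cost of an outage concrete, the sketch below tallies a small outage log by tier and estimates the cost of each tier. It is illustrative only and not drawn from the survey data: the log entries, the hourly IT rate, the per-minute productivity figure and the names `outage_cost` and `summarize` are all hypothetical assumptions to be replaced with an organization’s own numbers.

```python
from collections import defaultdict

# Hypothetical outage log: (tier, duration in minutes, IT staff involved in remediation).
outage_log = [
    (1, 12, 1),
    (1, 25, 2),
    (2, 90, 3),
]

# Hypothetical rates; substitute figures from your own organization.
IT_HOURLY_RATE = 60.0                 # loaded cost per IT staffer, per hour
LOST_PRODUCTIVITY_PER_MINUTE = 150.0  # business impact per minute offline


def outage_cost(minutes, staff):
    """Estimate one outage's cost: remediation salaries plus lost productivity."""
    salary_cost = staff * IT_HOURLY_RATE * (minutes / 60.0)
    productivity_cost = LOST_PRODUCTIVITY_PER_MINUTE * minutes
    return salary_cost + productivity_cost


def summarize(log):
    """Group outages by tier, accumulating count, total minutes and estimated cost."""
    summary = defaultdict(lambda: {"count": 0, "minutes": 0, "cost": 0.0})
    for tier, minutes, staff in log:
        entry = summary[tier]
        entry["count"] += 1
        entry["minutes"] += minutes
        entry["cost"] += outage_cost(minutes, staff)
    return dict(summary)


for tier, stats in sorted(summarize(outage_log).items()):
    print(f"Tier {tier}: {stats['count']} outages, {stats['minutes']} min, ${stats['cost']:,.2f}")
```

A real assessment would also need to account for harder-to-quantify factors such as lost business and reputational damage, which this simple model omits.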
While system flaws will always be present in some fashion, the survey found that at present, server hardware and server OS reliability is also inextricably linked with several other crucial factors and components. They are:

• Integration and interoperability are crucial. Over 85% of businesses with 300+ end users have myriad types of server hardware and three different operating systems present in their environment. Heterogeneity and openness are essential to the reliability of today’s networks. The 2007 wide ranging, non-exclusive interoperability pact between Microsoft and Novell was extremely well received and a huge boon for the respective customer bases of both firms. As part of the deal, Microsoft and Novell teamed up to provide joint sales, technical service and support to deliver plug and play interoperability between the Windows and SUSE Linux Enterprise environments.
• Workloads. The applications themselves are growing in size and complexity. It is therefore imperative that the server hardware be robust enough to handle the increased demands of new classes of applications such as streaming audio, digital content and highly complex processes. A robust server configuration that includes new multi-core and multi-threading technologies, maximum memory and hard drive capacity and the fastest processors will perform better than old, outmoded and inadequate equipment. The survey showed, for example, that the high reliability ratings for IBM and HP were no fluke: the powerful IBM System p5 and System p6 Power Series servers and the HP 9000 and Integrity Servers achieved very high reliability (99.99% and 99.999% uptime) while carrying workloads that were 30% to 40% greater than comparable x86-based machines.
• Experience of the IT managers.
Errors by neophyte, inexperienced network administrators and IT managers who have not been able to get trained and re-certified on the latest technologies are another major factor that contributes to extended downtime and adversely impacts system reliability.
• Patch management. The amount of time spent applying patches is one of the biggest contributors to system downtime; this is especially true of security patches, as we see in Exhibit 2 below.
IBM AIX administrators spent the least amount of time, 11 minutes, applying patches. They were followed closely by the Ubuntu open source distribution, Apple Mac, niche market “other” customized Linux distributions and Novell SUSE; administrators in each of these environments spent on average from 12 to 15 minutes applying patches. This speaks to the underlying stability of these environments as well as the experience of the administrative staff. Typically, UNIX installations (notably IBM’s AIX), as well as Novell SUSE Enterprise and Apple Mac, tend to be stable, static environments with experienced, hands-on network administrators who are familiar with the most minute details of their systems.

Fast patch management positively impacts reliability. The feedback from the survey respondents reinforced the importance of being able to receive and download patches quickly once a bug has been identified. Corporate IT managers noted the significant strides that had been made by all of the vendors across the board in recent years, though they still voiced some concerns. Among the anecdotal comments:
• “IBM has done a wonderful job of keeping our AIX systems up and ready. We rarely if ever have reliability issues,” said an IT manager at a Midwest financial institution.
• “Patch management automation has significantly reduced both the manpower required to apply patches and the downtime associated with patch management over the last three years,” noted an IT administrator at a large health care facility in the Northeast.
• “Novell SUSE Linux Enterprise is always very up-to-date on patches; Zenworks is nice and we never have a problem,” said a longtime Novell user at a large healthcare provider in the Southwest.
• “The amount of time it takes to identify vulnerability and when the vendors release the patch, has decreased significantly, but if the bug is a dangerous one, we still worry,” according to a chief technology officer at a midsized retailer.
• “Our patches are tested at our corporate headquarters location and then distributed as needed to the various remote locations, downloaded to a local Microsoft Systems Management Server (SMS) and automatically downloaded via group policy to each workstation and server. The process is accelerated and it’s relatively painless for the IT department,” said an administrator at a large West Coast enterprise.
• “Our patch management dramatically improved with SUSE 10.2 and SUSE 11,” noted another veteran Novell administrator. “We have no problems now to speak of.”
• “We currently use Group Policy to download patches on each server, but we manually apply them. So it takes us about 15 minutes to patch each Windows server. This means that each server takes less than 15 min to patch. On a whole, other than hardware issues, we've averaged less than two failures per server, per year on our Windows Server 2003 systems,” said an IT manager at a large East Coast insurance firm.
Serious Tier 2 + Tier 3 Incidents Decline

The survey results also showed a discernible decline in the number and percentage of the more serious Tier 2, Tier 3 and combined Tier 2 + Tier 3 incidents, as shown in Exhibit 3 below.
Once again, IBM AIX on the Power Series System p5 and p6 servers recorded the smallest percentage of combined Tier 2 + Tier 3 incidents at 19%. The other UNIX and Linux distributions, including HP-UX 11i v3 on the HP 9000 and HP Integrity, Novell SUSE Linux Enterprise and Sun Solaris, also scored well, with the more serious aggregate Tier 2 + Tier 3 outages accounting for 24% to 25% of total outages. All of the aforementioned distributions managed to lower their scores from the similar survey in 2008.

Microsoft’s Windows Server 2003 on x86-based servers came in with a very respectable 30% of reliability outages being in the Tier 2 + Tier 3 categories; this was a reduction of 11 percentage points from the 41% reported by respondents to the 2008 ITIC Global Reliability Survey.

One of the most impressive statistics was that IBM AIX Power Series System p5 and System p6 servers notched no severe Tier 3 incidents whatsoever. Again, this achievement is even more impressive when one considers that these systems typically run higher workloads than their x86-based counterparts, as shown in Exhibit 4.
HP-UX 11i v3 Update 4 on the HP 9000 and Integrity servers, Sun Solaris on SPARC servers (now owned by Oracle), Novell SUSE, Red Hat Enterprise Linux and Apple Mac OS X 10.5.6 on the G4 Macs also recorded very few Tier 3 outages: fewer than one each, per server per annum.

Tier 1 incidents, the most common type and usually between 10 and 30 minutes in duration, also showed across-the-board reductions among all server hardware and server operating system platforms, as we see from Exhibit 5.
In the Tier 1 category, IBM also came out on top with less than one-half of one Tier 1 incident per AIX Power Series System p5 and System p6 server per annum. This equates to about four to seven minutes of downtime per server, per year. In fact, all of the server hardware and server OS environments racked up fewer than one Tier 1 outage per server, per annum.

The results were similarly encouraging for the average number of Tier 2 outages, as we see in Exhibit 6 below.
Conclusions

In summary, the ITIC 2009 Global Server Hardware and Server OS Reliability Survey findings indicate that all of the server operating system platforms have achieved a high degree of reliability. However, IBM AIX running on the p5 and p6 Power Servers leads the UNIX distributions and is the clear winner, followed closely by HP, Novell SUSE Enterprise Linux and the Ubuntu open source distribution.

These results are especially noteworthy in light of the ongoing economic crunch, which has caused companies to cut their budgets and reduce IT staff. As they strive to accomplish more with fewer resources, IT departments must rely even more heavily on their vendors to deliver more reliable servers and server operating system software.

To reiterate, time is literally money. Even a few minutes of downtime can cost companies thousands or millions of dollars and cause business operations to grind to a halt. Downtime can also adversely impact a company’s relationship with its customers, business suppliers, partners and internal end users. Reliability or lack thereof can potentially damage a company’s reputation and result in lost business. Hence, corporations must have confidence in the reliability and stability of the underlying server hardware and server OS platforms.

The advances in technology are encouraging. Now companies must tackle other equally important and challenging issues to ensure the highest level of uptime and reliability. Close attention must be paid to integration and interoperability, patch management, documentation and getting the necessary training and certification for the appropriate IT managers. The most bulletproof hardware and software platforms can be undone by human error.

It’s equally important that companies find the funds to stay as current as possible on their server hardware and server OS software. Performance will suffer if the server configuration is old and inadequate.
Recommendations

Server hardware and server operating system reliability has improved vastly since the 1980s and 1990s, and even in just the last two to three years. While technical bugs still exist, their number, frequency and severity have declined significantly. With few exceptions, common human error poses a bigger threat to server hardware and server operating system reliability than technical glitches.
Crucial TCO metrics such as reliability, performance, security and management ultimately depend as much on each firm’s specific implementation as they do on the properties of the server and server OS technology itself. There are inherent dependencies between the underlying capabilities of a particular server operating system and an individual corporation’s ability to adhere to best deployment practices with respect to training, testing and configuration.

The reliability, security and manageability of even the most hardened server and server operating system are easily compromised by human error. A company that does not restrict physical access to the server is asking for trouble. Similarly, any firm which does not enact and enforce strong usage and security policies risks compromising the reliability and integrity of its server hardware and server OS environment.

The reliability of the server environment can also be undone easily or seriously compromised by such actions as: a bad configuration; the use of incompatible or unapproved memory and logic chips, hardware, peripherals and software drivers; overclocking machines; failing to apply necessary patches; failing to upgrade or retrofit inadequate or obsolete servers and operating systems; and taxing server and software resources beyond their capabilities.

Recommendations for Corporate Customers

To optimize uptime and reliability, ITIC advises corporations to:

• Regularly analyze and review configurations, usage and performance levels. This will enable companies to determine whether or not their current server and server OS environment allows them to achieve optimal reliability.
• Adopt formal SLAs. Service level agreements enable organizations to define acceptable performance metrics. Companies should meet with their vendors and customers on at least an annual basis to ensure the terms are met.
• Define, measure and monitor reliability and performance metrics.
It is imperative that companies measure component, system, server hardware, desktop and server OS, security, network infrastructure, storage and application performance. Keep a continuous log of the planned and unplanned downtime throughout the enterprise.
• Regularly track server and server OS reliability and downtime. Keep accurate records of outages and their causes. Segment the outages according to their severity and length, e.g. Tier 1, Tier 2 and Tier 3. The appropriate IT managers should also keep detailed logs of remediation efforts in the event of an outage. These logs should include a full account of remediation activities, specifying how the problem was solved, how long it took and which staff members participated. They should also list the monetary costs as well as any material impact on the business, its operations and its end users. This will prove an invaluable resource should the problem recur. It may also make the
difference in containing or curtailing the reliability-related incident, saving precious time for the IT department, the end users and corporate customers.
• Calculate the cost of unplanned downtime. Companies should determine the average cost of minor Tier 1 outages. They should also keep more detailed cost assessments of the more serious unplanned Tier 2 and Tier 3 incidents. It’s essential for businesses to know the monetary cost of each outage, including IT and end user salaries due to troubleshooting and any lost productivity, as well as the impact on the business. C-level executives and IT managers should also pay close attention to whether the company’s reputation suffered as a result of a reliability incident; whether any litigation ensued; whether customers, business partners and suppliers were impacted (and at what cost); and should at least try to gauge whether the company lost business or potential business.
• Ensure that your organization has robust server hardware that can adequately handle the OS and application workloads. The server hardware (standalone, blade, cluster, etc.) and the server operating system are inextricably linked. To achieve optimal performance from both components, corporations must ensure that the server hardware is robust enough to carry both the current and anticipated workloads for the lifecycle of both.
• Compile a list of best practices and adhere to them. This is absolutely essential. Chief technology officers (CTOs), software developers, engineers, network administrators and managers should have extensive familiarity with the products they currently use and are considering. Check and adhere to your vendors’ list of approved, compatible hardware, software and applications. Software developers and network administrators must obey the rules. That means avoiding such ill-advised practices as overclocking server and desktop hardware or allowing unskilled or neophyte administrators to make changes to the registry.
All of these actions can lead to serious reliability problems.
• Don’t skimp on training and recertification for IT administrators, software developers and engineers. In these days of budget cuts, it’s common practice to eliminate monies that were formerly earmarked for training. ITIC understands that money is tight. If you can’t afford the time or expense to re-certify your entire IT department, designate the most experienced or appropriate IT staffer to take the course, even if it’s only an online course, and allow that person to train additional appropriate managers.
• Perform regular asset management testing. Schedule asset management reviews on a yearly, bi-annual or quarterly basis, as needed. This will assist your company in remaining current on hardware and software and help you to adhere to the terms and conditions of licensing contracts. All of these issues influence network reliability. Regular reviews also leave organizations better equipped to meet their SLA requirements and maintain peak performance and reliability.
• Manual vs. Automated Group Policy Patch Management. IT managers, particularly in high end UNIX environments and in corporations whose environments feature a high degree of customization, will continue to perform manual patch management.
• Keep your software updated with the latest necessary patches and upgrades. You don’t have to apply every patch, but it’s wise to keep track of which patches are crucial to the network’s health. Construct and adhere to a regular schedule to apply patches, preferably on a monthly basis. This will help the company avoid potentially nasty surprises.
• Standardize legacy and future hardware, server OS and application environments as much as possible. ITIC survey data indicates that standardization (that is, following a prescribed configuration and version for the company’s hardware, software and network infrastructure components) can lower TCO by 15%. Standardization benefits all users, including organizations that have custom configurations.
• Note that custom software implementations require the highest level of expertise. Any firm that elects to customize its Linux or open source server operating system distribution should either employ guru-level administrators or contract with a systems integrator or outsourcer with the appropriate expertise.
• Automated patch management applied via Group Policy vs. manual patching. Companies should also regularly review whether it is feasible for the firm to migrate away from manual patch management.

Collecting this information may seem to be a chore at first, but it will be an invaluable source of information that can guide the company to lower its TCO and improve its ROI.

Recommendations for Vendors

It is a buyer’s market and is likely to remain so for the foreseeable future. Competition among vendors is intense because businesses have a wide array of server hardware and server operating system platforms from which to choose. In order to retain the current customer base and attract new corporate customers, all of the vendors must strive to improve the features, performance, reliability and security of their respective server hardware and server OS software.
Additionally, ITIC advises vendors to:

• Embrace Interoperability and Integration. The survey data indicates that problems with backwards compatibility and integration with other hardware, server OS, applications and third party tools and utilities pose a significant potential threat to the underlying stability of the network environment.
• Provide Explicit Guidance around Patches and Patch Management. Patches vary by importance, by the severity of the fix or update and by the number of patches in a formal release. Data ITIC obtained from anecdotal essay comments and first-person customer interviews underscore the need for vendors to issue patches in an efficient, expeditious manner and to provide full transparency on the nature and severity of all bugs. Many IT managers expressed frustration and confusion with the patch management process, which was sometimes cumbersome. IT managers also noted that they were often unsure which patches were crucial and which optional. ITIC advises vendors to deliver specific recommendations and instructions on the download process, since patch management is a crucial element of IT management that can positively or negatively impact reliability.
• Provide the latest technical documentation. Ready access to clear, concise technical guidelines and detailed documentation has never been more important. The economic downturn forced many companies to cut staff. Time and money are scarce or non-existent for training and re-certification of IT administrators. It is therefore crucial that vendors pick up the slack and publicize and disseminate technical “how to” guidelines via their respective Websites, Emails and Webinars.
• Work with third party ISVs to assist in resolving driver and application compatibility issues. As we noted above, integration and interoperability issues are a top priority for IT departments that wish to maintain a high level of reliability. While many of the largest third party ISVs do an exemplary job of ensuring that their applications and drivers are certified to work with new server hardware and server OS releases, many smaller and niche ISVs, particularly in specific verticals like finance, legal and healthcare, often lack the necessary resources and funds to support new releases. Vendors should poll their customers on which third party applications, drivers and utilities are crucial and, when necessary, assist ISVs in providing the necessary compatibility.
• Work with partners to provide expanded access to discounted certification and online training courses. One of the biggest challenges confronting IT departments today is finding the money and sparing the time to get the appropriate administrators re-trained and certified on the latest server hardware and server OS software.