IBM Global Technology Services
May 2011
Thought Leadership White Paper

Proactive technical support: The overlooked essential in business resilience

Combine hardware and software technical support with business continuity and disaster recovery plans
Contents

Introduction
Addressing risk driven by business, events and data
Redefining the requisite elements of business resilience
Foundational business continuity for today’s infrastructures
Beyond foundational resiliency: transferring risk
Essential characteristics of a business resilience vendor
IBM’s business resilience capabilities
For more information

Introduction

Today’s business environments are at their most complex. Enterprises continue to undergo rapid change while facing the heightened impact of business disruptions and coping with a volatile financial and regulatory landscape. The possibility for business disruptions has increased exponentially, and so have the ramifications. Yet when planning for business continuity, many organizations overlook the importance of technical support. Hardware and software failures are often perceived as routine occurrences, but without a proactive management approach, they can quickly evolve into full-blown disasters with significant financial and reputational consequences. For this reason, successful organizations reinforce their business resilience strategy with proactive hardware and software technical support. As shown in responses to the 2010 IBM Global IT Risk Study, many others still do not.

In fact, in light of the risks associated with business disruptions in today’s climate, responses to the IBM Global IT Risk Study—a comprehensive online survey of 556 IT managers and others primarily involved in their business’s IT functions—uncovered a surprising paradox. Fifty-four percent of respondents said their overall approach to mitigating IT risk was good. However, only 23 percent of respondents believed their organizations were well prepared for hardware and software system failures. And when asked to list the kinds of risk issues they had dealt with in the previous 12 months, respondents ranked systems failures second only to security breaches.¹

As the respondents’ experiences would indicate, hardware and software support are critical to an effective business resilience strategy. Yet organizations often spend their time focusing on other, albeit equally important, elements of business continuity, including a range of provisions to avert interruptions—from normal business events to unforeseen catastrophes. These provisions include backup servers, data mirroring, data backups, offsite data storage and disaster recovery. While all of these elements are critical, they can fall short of meeting daily needs for system availability. Without hardware and software viability, organizations are only as strong as their weakest link.

For example, one of the basic tenets of business resilience is maintaining data and service availability, both of which are directly threatened when software and hardware failures occur. Unplanned outages can create far-reaching consequences that impact long-term revenue streams, brand reputation and survival. The potential impact may be more prevalent than previously thought. In fact, a March 2011 report on UK businesses revealed that more than 80 percent of business interruptions in 2010 were due to hardware failures. The report also revealed that more than 50 percent of hardware failures were a result of inadequate maintenance in place to protect equipment against failures.²
Software errors can be equally damaging. In late 2009, a major online reseller saw its site crash, just in time for holiday shopping, due to a software error that collided with a surge in web site traffic.³ In January 2009, Google’s search engine erroneously notified users that every web site worldwide was potentially malicious, including its own. A programmer’s misplaced forward slash (/) was to blame.⁴ In another case, a software problem contributed to a rail car fire in a major underground metro system in 2007 when software designed to monitor performance did not detect overheating.⁵ The 2003 North America blackout was triggered by a local outage that went undetected due to a flaw in the monitoring software.⁶

Even with these numerous occurrences, effective hardware and software support—while often addressed within the organization—are often not included as part of a comprehensive business resiliency plan; in many cases business continuity is managed in one silo, and hardware and software support in another, hindering cross-infrastructure visibility and leaving the business exposed to risk.

Organizations must be able to answer questions like:

What would happen if a Microsoft Exchange server on which corporate emails are stored were to crash? Are there provisions for server technical support and maintenance, in addition to backup servers?

How long, if at all, would business operations continue if an email server became unavailable?

What are the provisions for releasing fixes and updates for microcode and business-critical applications, and for ensuring that these are applied at every endpoint where the applications are in use? Security patches can require distribution in near real time. Is the organization prepared to do so?

What is the plan to prevent servers that house customer-facing web services from crashing?

This white paper will discuss the foundational elements of a strong business resilience plan, which must holistically include not only the generally established capabilities for business continuity and disaster recovery but also hardware and software support.

In a recent survey of 200 disaster recovery decision makers and influencers conducted jointly by Forrester and the Disaster Recovery Journal (see Figure 1), when asked, “What was the cause(s) of your most recent disaster declaration(s) or major business disruption?”, twenty-four percent of respondents cited IT hardware failure, second only behind power failure. Eleven percent of respondents cited software failure as the culprit.⁷
“What is the cause(s) of your most significant disaster declaration(s) or major business disruption?”

Power failure: 44%
IT hardware failure: 24%
Network failure: 15%
Winter storm: 14%
Human error: 13%
Flood: 13%
IT software failure: 11%
Fire: 6%
Other: 5%
Hurricane: 4%
Tornado: 2%
Earthquake: 1%
Terrorism: 1%
We have not declared a disaster or had a major business disruption: 36%

Base: 200 disaster recovery decision makers and influencers at business globally (multiple responses accepted)
Source: Forrester/Disaster Recovery Journal November 2010 Global Disaster Recovery Preparedness Online

Figure 1: A 2010 Forrester/Disaster Recovery Journal survey revealed hardware failures were ranked second only to power failures as the most common source of a recent disaster or business disruption.

Addressing risk driven by business, events and data

A comprehensive, successful business resilience strategy includes provisions for ensuring continuous business operations and around-the-clock global service delivery, providing the ability to act with speed and agility when an unforeseen event strikes. To accomplish this, organizations must address all potential challenges to business resilience, which can be divided into three categories: business-, event- and data-driven events.

Most business resilience strategies address business-driven events, such as audits, new product rollouts, future marketing promotions, or failure to meet industry standards. They also address event-driven risks, such as natural disasters, regional power outages, acts of war or economic downturns. In fact, organizations today may be the best prepared to manage the latter, understanding the need to quickly react to revenue shortfalls and drastic shifts in business goals.
The missing piece for many business resilience strategies is in effectively and proactively addressing the possibility of data-driven events, such as data corruption, disk failure, application outages and network problems—which can sometimes be traced to software or hardware system failures. The reality is that if organizations have a strong business continuity program that includes network, application and data security but inadequate levels of technical support, they are putting the business at risk on a day-to-day basis. An operating system without the latest patches and fixes installed can spread a virus to backup servers, for example. Likewise, an older version of microcode on a server can cause failures when new applications are installed.

Indian bank speeds time to market by combining disaster recovery with IT infrastructure maintenance

A regional bank in India wanted to provide “anywhere banking” services to customers to secure competitive advantage, yet its small IT team was unable to manage the undertaking alone. The bank engaged IBM to build a comprehensive business resilience plan that combined disaster recovery—including procedure documentation and drill support—and IT infrastructure maintenance, including 24×7 comprehensive monitoring and management services for servers, storage, networks, security, backup and databases. The combined disaster recovery and infrastructure maintenance strategy reduced the bank’s upfront capital expenditure for web-facing systems. It also eliminated the need for multivendor coordination, enabling the bank to better use existing IT staff and ultimately increase time to market for new service delivery. Total cost of ownership went down while ROI increased.

Redefining the requisite elements of business resilience

Most business resilience plans have mainly focused on business continuity and disaster recovery, including provisions for backup servers, data mirroring and failover to protect data after failures occur, but in order to prevent failures, organizations must take a more holistic approach by incorporating hardware and software maintenance and support into the overall business resilience plan. Both prevention and remediation are critical, yet often the two are addressed by separate IT teams. This makes it difficult to provide a cohesive approach to business resilience that can cover enterprise and work area risk, IT stability, recoverability of the IT infrastructure, data backup and disaster recovery—as well as availability of critical data and business applications. It is these last two, data and application availability, that require a comprehensive hardware and software support strategy that can meet the demands of enterprises facing rapid change and a volatile, highly regulated marketplace—a marketplace that depends on the secure and continuous flow of information. Uninterrupted business operations, and even ongoing business strategy, hinge on the ability of hardware and software to enable the transformation of data into that information. Addressed separately, technology, software and hardware, and even the data they contain, become useless. Integrating the layers of business resilience, and addressing hardware and software support within those layers (see Figure 2), is essential.
[Figure 2 diagram. Risk mitigation strategies: business driven, data driven, event driven. Layers: strategy and vision; organization; processes; applications and data; technology; facilities. Additional labels: technical support; business resilience.]

Figure 2: To maintain the continuous flow of information required to support uninterrupted business operations, each layer shown above must be integrated within a comprehensive business resilience framework that provides for business-, event- and data-driven events. Hardware and software support should be addressed within the technology, applications and data, and processes layers.

Freight railroad increases efficiency and flexibility with new business resilience plan

A North American freight railroad that services 23 states and two Canadian provinces relies heavily on 24×7 application availability but lacked a disaster recovery facility. The company worked with IBM to establish data mirroring capabilities that would help them achieve near real-time recovery point objectives, as well as hardware and software maintenance and support across multiple vendors. Afterward, the railroad incorporated all its equipment and applications, from IBM and other vendors, into a comprehensive business resilience plan. The transition was so seamless that customers and supply chain partners requested that additional applications be included in the plan. By adding new technologies like virtualization into their data center, the railroad established data center portability, self-healing and greater flexibility during business recovery.

Foundational business continuity for today’s infrastructures

Best-in-class companies take a holistic approach that combines planning, prevention and remediation to help solidify ongoing business operations. Yet this may be easier to accomplish in theory than in practice.
The challenge is that today’s data center is a mixture of vendor products and platforms, and is more difficult to support in a coherent way. Faced with staffing and budget limitations, few organizations can maintain a stable of experts on every platform and product on which they do business. This adds complexity to maintaining not only data availability, but also security and consistent resilience policy enforcement.
With this in mind, a comprehensive business resilience strategy built for the needs of today’s IT infrastructures should include:

● A tested and proven business resilience program and plan that addresses business continuity, disaster recovery and hardware and software support
● Support for regulatory compliance and governance requirements
● Consistent policy enforcement across departments, corporate divisions and global locations
● Offsite data center and work area recovery capabilities, including alternate employee workspaces
● A tested and proven data backup and archiving system with offsite storage
● Appropriate system redundancy, ranging from onsite backup servers and storage to offsite mirrored systems with automatic failover
● Security, privacy and data protection to guard against internal threats (such as data theft) and external threats (such as viruses)
● Technical processes, including change management and application testing, that help prevent system failures
● Hardware maintenance and support for multivendor environments
● Software maintenance and support that provides fast access to deeply skilled expertise
● Remote support capabilities (including execution of fixes and patches) and web self-service to provide access to technical expertise, help prevent failures and speed problem determination and resolution.
[Figure 3 diagram: “IBM’s Modular Support Services Are Designed to Optimize Your Support Today and Tomorrow.” Axes: Client value; Support Continuum (Proactive, Preventive). Labels: Support Foundation (hardware support, software support, remote support, web self-service); Enhanced support (customized support, preventative hardware and software support, integrated hardware and software support); Managed support (system monitoring and automated services, contract and inventory management, multi-vendor support, adaptive service levels, single point of accountability, integrated infrastructure availability management).]

Figure 3: Hardware and software support is a foundational component of business resilience. Organizations can further enhance business continuity by adding capabilities such as system monitoring and integrated infrastructure availability management.
IT services provider leverages single contact for disaster recovery

A company that provides a passenger service system and data center services for one of the largest airlines in Indonesia had a local backup system in place but no disaster recovery solution, which the company needed to implement prior to their initial public offering. The company engaged IBM to create a comprehensive disaster recovery solution with ongoing disaster recovery support, including a data center and disaster recovery center facility, 24×7 managed services, and mainframe maintenance. Today, the company has a single point of contact for hardware and a majority of its software and operating systems, services and communications; a comprehensive disaster recovery solution; and a secure environment for its IT equipment.

Japanese business achieves one-day disaster recovery and critical hardware support

Previously, the largest publication distributor in Japan managed the recovery of its bookstores’ operating systems and applications in-house—a time-consuming, resource-intensive process. The company needed to accelerate the recovery of mission-critical files and applications during frequent unplanned outages. Today, IBM provides file recovery support and disaster recovery services for the company’s applications, operating systems and hardware within one business day. IBM also provides ongoing maintenance and software troubleshooting services. The company has improved service levels, and outsourcing enables the distributor’s IT team to focus on core competencies while reducing recovery costs.

Beyond foundational resiliency: transferring risk

No matter the approach to developing a business resilience strategy, every organization must first understand its risks and then take action to mitigate those risks. For risk mitigation, it is sometimes more cost-effective and strategic to transfer the risk to a skilled vendor rather than invest in the skills training and capital outlays that may be required to maintain consistent and high-functioning business operations in the face of a business-, event- or data-driven event. In this case, managed support (see Figure 3) can help organizations achieve enhanced IT availability to support the continuous availability of business operations.

Organizations seeking advanced capabilities for business continuity and disaster recovery may require:

Event-driven managed services to meet recovery time and recovery point objectives after a business disruption, including fast backups, restores and critical data archiving, to help mitigate business impact. Other capabilities that may be required include workload balancing, limiting data loss via email recovery and supporting workforce resiliency through emergency notifications.

Outsourced recovery of IT and business infrastructures, including work areas, from disruptions to help reduce costs and timeframes related to lost employee productivity. Capabilities for maintaining continuity of critical systems, protecting against loss of key business data and employee productivity, and meeting regulatory compliance mandates can help prevent business disruptions.
Security-rich, automated protection of critical business data through onsite and offsite cloud-based services to reduce the need for infrastructure build-out to protect and store massive amounts of data.

Auditable compliance with general and industry-specific regulatory requirements to help track data access and prove data protection to avoid violations. Unproven security policies and procedures for widely distributed data can open the door to public disclosure requirements.

Holistic support focused on application availability for the end user rather than isolated server and storage availability. A view into how hardware availability affects software availability, which, in turn, greatly impacts business continuity, is critical.

Deep technology expertise, including an understanding of the inter-relation between applications and infrastructure components, especially in complex environments such as virtualization. Today, services and resources are distributed across what had been isolated domains—which means older management tools may be ineffective. Deep expertise and insight can prevent hardware and software failures from impacting business-critical services.

Integrated hardware and software capabilities, including cross-platform and multivendor technical support and proactive microcode and release management, which can help prevent failures and remove complexity. As new solutions are added to the infrastructure, they create additional points of failure. Proper change orchestration across hardware and software updates and maintenance is crucial.

A single point of accountability across vendors based on consistently agreed upon and continuously monitored service levels, which can significantly improve availability management and speed time to resolution by avoiding finger pointing during problem isolation.

Proactive monitoring to minimize or eliminate the business impact of technical problems by continually tracking the health of the IT infrastructure via real-time dashboards, which can help anticipate future problems and notify IT to take preventive action.

Contract and inventory management capabilities, which are essential to maintaining compliance and controlling IT asset costs. License compliance and inventories can be tracked automatically, removing manual, inaccurate oversight while avoiding over-purchasing and under-licensing.

Customized support, which is especially relevant for multivendor, multidomain environments where complex dependencies can wreak havoc on administrators who do not have the insight or time to manage this type of infrastructure.
Data storage company transfers risk to improve service delivery

A data storage company that provides hardware devices and data management software to customers in virtually every industry had previously provided ongoing product support via internal resources and a network of seven individual third-party maintainers (TPMs) in the United States and dozens of TPMs worldwide. Product support was poor, and the business suffered. The company integrated its customer relationship management system with IBM’s, allowing IBM service representatives and engineers to respond to technical support calls and requests for on-site support. Providing customers with a single point-of-contact and a single workflow for all calls reduces the number of interactions while improving service quality. The cost savings for the first year was approximately US$8 million, with a possible cost savings over five years of more than US$40 million.

Essential characteristics of a business resilience vendor

Organizations looking to solidify business resilience should carefully consider the capabilities of the vendor(s) with whom they choose to work. A business resilience vendor should be able to provide:

● Global leadership in technical expertise, including broad infrastructure competence and deep, multiplatform technology skills
● Highly responsive, international support for the full scope of IT infrastructure operations, both current operations and operations scoped for near-term and long-term expansion
● Globally consistent services, from simple break-fix to comprehensive, end-to-end support capabilities, so that every location can access immediate expertise to resolve events
● The ability to anticipate problems and provide proactive guidance to meet required service level agreements
● Proof of potential cost savings via outsourced support that may include remote support tools and staff as well as management of contracts, warranties and asset inventories
● Understanding of global business needs and IT strategies, an essential characteristic for organizations focused on maintaining competitive advantage
● Remote and automated tools, which can help control the labor costs of maintaining the IT infrastructure and also speed problem resolution
● Proactive monitoring and event notifications to prevent business disruptions
● Parts replacement capabilities, on time, at the right locations

Assessing risk

An effective business resilience plan must begin with a thorough risk analysis:

● Rank threats based on past occurrences, the amount of potential revenue loss, damage to the brand, compliance risks and single points of failure
● Prioritize safeguards, including provisions for hardware and software support, business continuity and disaster recovery
● Conduct a cost-benefit analysis for a quantitative risk assessment
● Determine next steps based on threat severity, selected safeguards, and the cost and ease of implementation.

Take the IBM risk self-assessment
IBM’s business resilience capabilities

Through IBM, organizations can obtain all the foundational capabilities needed for a solid business resilience strategy, including business continuity, disaster recovery and hardware and software technical support, that can fully address business-, event- and data-driven risks to the business. In addition, IBM can provide proof of potential cost savings of between 5 and 40 percent, depending on the current state of the support environment and how much support is out-tasked.⁸ IBM’s depth of expertise—across multiple vendor platforms—and breadth of resilience capabilities are designed to support today’s complex IT infrastructures.

For technical support, IBM provides nearly 15,000 technicians and support personnel, 400 parts distribution centers and more than 80 support centers located worldwide, so that organizations can expect responsiveness no matter where they are located, including 24×7 service logistics network availability. For business continuity support, organizations can look to IBM’s extensive infrastructure investments; more than 1,800 dedicated business continuity professionals; more than 150 business resilience centers located worldwide; and the depth of knowledge drawn from more than 50 years of business resilience and disaster recovery experience with more than 12,000 disaster recovery clients—all of which has led respected industry analysts to recognize IBM as a leader in business continuity and resilience.