SlideShare a Scribd company logo
1 of 60
Information Storage and
Management
Storing, Managing, and Protecting Digital Information
Ali Broumandnia, Broumandnia@gmail.com 1
Section III Business Continuity
In This Section:
• Chapter 11: Introduction to Business Continuity
• Chapter 12: Backup and Recovery
• Chapter 13: Local Replication
• Chapter 14: Remote Replication
Ali Broumandnia, Broumandnia@gmail.com 2
Chapter 11
Introduction to
Business Continuity
• 11.1 Information Availability
• 11.2 BC Terminology
• 11.3 BC Planning Lifecycle
• 11.4 Failure Analysis
• 11.5 Business Impact Analysis
• 11.6 BC Technology Solutions
• 11.7 Concept in Practice: EMC PowerPath
• Summary
Ali Broumandnia, Broumandnia@gmail.com 3
Key Concepts:
• Business Continuity
• Information Availability
• Disaster Recovery
• Disaster Restart
• BC Planning
• Business Impact Analysis
Ali Broumandnia, Broumandnia@gmail.com 4
This chapter describes the factors that affect information
availability. It also explains how to create an effective
BC plan and design fault-tolerant mechanisms to protect
against single points of failure.
Ali Broumandnia, Broumandnia@gmail.com 5
11.1 Information Availability
Information availability (IA) refers to the ability of the
infrastructure to function according to business
expectations during its specified time of operation.
Information availability ensures that people (employees,
customers, suppliers, and partners) can access
information whenever they need it. Information
availability can be defined with the help of reliability,
accessibility and timeliness.
Ali Broumandnia, Broumandnia@gmail.com 6
• Reliability: This reflects a component’s ability to
function without failure, under stated conditions, for a
specified amount of time.
• Accessibility: This is the state within which the
required information is accessible at the right place, to
the right user. The period of time during which the
system is in an accessible state is termed system
uptime; when it is not accessible it is termed system
downtime.
Ali Broumandnia, Broumandnia@gmail.com 7
• Timeliness: Defines the exact moment or the time
window (a particular time of the day, week, month,
and/or year as specified) during which information
must be accessible. For example, if online access to an
application is required between 8:00 am and 10:00 pm
each day, any disruptions to data availability outside of
this time slot are not considered to affect timeliness.
Ali Broumandnia, Broumandnia@gmail.com 8
11.1.1 Causes of Information Unavailability
Various planned and unplanned incidents result in data
unavailability. Planned outages include
installation/integration/maintenance of new hardware,
software upgrades or patches, taking backups,
application and data restores, facility operations
(renovation and construction), and refresh/migration of
the testing to the production environment. Unplanned
outages include failure caused by database corruption,
component failure, and human errors.Ali Broumandnia, Broumandnia@gmail.com 9
Another type of incident that may cause data
unavailability is natural or man-made disasters such as
flood, fire, earthquake, and contamination. As illustrated
in Figure 11-1, the majority of outages are planned.
Planned outages are expected and scheduled, but still
cause data to be unavailable. Statistically, less than 1
percent is likely to be the result of an unforeseen disaster.
Ali Broumandnia, Broumandnia@gmail.com 10
Ali Broumandnia, Broumandnia@gmail.com 11
11.1.2 Measuring Information
Availability
Information availability relies on the availability of the
hardware and software components of a data center.
Failure of these components might disrupt information
availability. A failure is the termination of a component’s
ability to perform a required function. The component’s
ability can be restored by performing an external
corrective action, such as a manual reboot, a repair, or
replacement of the failed component(s).
Ali Broumandnia, Broumandnia@gmail.com 12
• Mean Time Between Failure (MTBF): It is the
average time available for a system or component to
perform its normal operations between failures.
• Mean Time To Repair (MTTR): It is the average time
required to repair a failed component. While
calculating MTTR, it is assumed that the fault
responsible for the failure is correctly identified and
that the required spares and personnel are available.
Ali Broumandnia, Broumandnia@gmail.com 13
Uptime per year is based on the exact timeliness
requirements of the service, this calculation leads to the
number of “9s” representation for availability metrics.
Table 11-1 lists the approximate amount of downtime
allowed for a service to achieve certain levels of 9s
availability.
Ali Broumandnia, Broumandnia@gmail.com 14
Ali Broumandnia, Broumandnia@gmail.com 15
11.1.3 Consequences of Downtime
Data unavailability, or downtime, results in loss of
productivity, loss of revenue, poor financial performance,
and damages to reputation. Loss of productivity reduces
the output per unit of labor, equipment, and capital. Loss
of revenue includes direct loss, compensatory payments,
future revenue losses, billing losses, and investment
losses.
Ali Broumandnia, Broumandnia@gmail.com 16
11.2 BC Terminology
This section introduces and defines common
terms related to BC operations and are used in
the next few chapters to explain advanced
concepts:
• Disaster recovery
• Disaster restart
• Recovery-Point Objective (RPO)
Ali Broumandnia, Broumandnia@gmail.com 17
• RPO of 24 hours
• RPO of 1 hour
• RPO of zero
 Recovery-Time Objective (RTO)
A. RTO of 72 hours
B. RTO of 12 hours
C. RTO of 4 hours
D. RTO of 1 hour
E. RTO of a few seconds
Ali Broumandnia, Broumandnia@gmail.com 18
Ali Broumandnia, Broumandnia@gmail.com 19
11.3 BC Planning Lifecycle
BC planning must follow a disciplined approach like any
other planning process. Organizations today dedicate
specialized resources to develop and maintain BC plans.
From the conceptualization to the realization of the BC
plan, a lifecycle of activities can be defined for the BC
process. The BC planning lifecycle includes five stages
(see Figure 11-3):
Ali Broumandnia, Broumandnia@gmail.com 20
1. Establishing objectives
2. Analyzing
3. Designing and developing
4. Implementing
5. Training, testing, assessing, and maintaining
Several activities are performed at each stage of the BC
planning lifecycle, including the following key activities:
Ali Broumandnia, Broumandnia@gmail.com 21
Ali Broumandnia, Broumandnia@gmail.com 22
1. Establishing objectives
 Determine BC requirements.
 Estimate the scope and budget to achieve requirements.
 Select a BC team by considering subject matter experts
from all areas of the business, whether internal or
external.
 Create BC policies.
Ali Broumandnia, Broumandnia@gmail.com 23
2. Analyzing
 Collect information on data profiles, business
processes, infrastructure support, dependencies, and
frequency of using business infrastructure.
 Identify critical business needs and assign recovery
priorities.
 Create a risk analysis for critical areas and mitigation
strategies.
 Conduct a Business Impact Analysis (BIA).
 Create a cost and benefit analysis based on the
consequences of data unavailability.Ali Broumandnia, Broumandnia@gmail.com 24
 Evaluate options.
3. Designing and developing
 Define the team structure and assign individual roles
and responsibilities.
 Design data protection strategies and develop
infrastructure.
 Develop contingency scenarios.
 Develop emergency response procedures.
 Detail recovery and restart procedures.
Ali Broumandnia, Broumandnia@gmail.com 25
4. Implementing
 Implement risk management and mitigation procedures
that include backup, replication, and management of
resources.
 Prepare the disaster recovery sites that can be utilized if
a disaster affects the primary data center.
 Implement redundancy for every resource in a data
center to avoid single points of failure.
Ali Broumandnia, Broumandnia@gmail.com 26
5. Training, testing, assessing, and maintaining
 Train the employees who are responsible for backup
and replication of business-critical data on a regular
basis or whenever there is a modification in the BC
plan.
 Train employees on emergency response procedures
when disasters are declared.
 Train the recovery team on recovery procedures based
on contingency scenarios.
 Perform damage assessment processes and review
recovery plans.
Ali Broumandnia, Broumandnia@gmail.com 27
 Test the BC plan regularly to evaluate its performance
and identify its limitations.
 Assess the performance reports and identify
limitations.
 Update the BC plans and recovery/restart procedures to
reflect regular changes within the data center.
Ali Broumandnia, Broumandnia@gmail.com 28
11.4 Failure Analysis
Failure analysis involves analyzing the data center to
identify systems that are susceptible to a single point of
failure and implementing fault-tolerance mechanisms
such as redundancy.
Ali Broumandnia, Broumandnia@gmail.com 29
11.4.1 Single Point of Failure
A single point of failure refers to the failure of a
component that can terminate the availability of the
entire system or IT service. Figure 11-4 illustrates the
possibility of a single point of failure in a system with
various components: server, network, switch, and storage
array. The figure depicts a system setup in which an
application running on the server provides an interface to
the client and performs I/O operations.
Ali Broumandnia, Broumandnia@gmail.com 30
Ali Broumandnia, Broumandnia@gmail.com 31
11.4.2 Fault Tolerance
To mitigate a single point of failure, systems are
designed with redundancy, such that the system will fail
only if all the components in the redundancy group fail.
This ensures that the failure of a single component does
not affect data availability. Figure 11-5 illustrates the
fault-tolerant implementation of the system just
described (and shown in Figure 11-4).
Ali Broumandnia, Broumandnia@gmail.com 32
Ali Broumandnia, Broumandnia@gmail.com 33
11.4.3 Multipathing Software
Configuration of multiple paths increases the data
availability through path failover. If servers are
configured with one I/O path to the data there will be no
access to the data if that path fails. Redundant paths
eliminate the path to become single points of failure.
Multiple paths to data also improve I/O performance
through load sharing and maximize server, storage, and
data path utilization.
Ali Broumandnia, Broumandnia@gmail.com 34
11.5 Business Impact Analysis
A business impact analysis (BIA) identifies and evaluates
financial, operational, and service impacts of a disruption
to essential business processes. Selected functional areas
are evaluated to determine resilience of the infrastructure
to support information availability.
Ali Broumandnia, Broumandnia@gmail.com 35
The BIA process leads to a report detailing the incidents
and their impact over business functions. The impact
may be specified in terms of money or in terms of time.
Based on the potential impacts associated with
downtime, businesses can prioritize and implement
countermeasures to mitigate the likelihood of such
disruptions. These are detailed in the BC plan. A BIA
includes the following set of tasks:
Ali Broumandnia, Broumandnia@gmail.com 36
 Identify the key business processes critical to its
operation.
 Determine the attributes of the business process in
terms of applications, databases, and hardware and
software requirements.
 Estimate the costs of failure for each business process.
 Calculate the maximum tolerable outage and define
RTO and RPO for each business process.
Ali Broumandnia, Broumandnia@gmail.com 37
 Establish the minimum resources required for the
operation of business processes.
 Determine recovery strategies and the cost for
implementing them.
 Optimize the backup and business recovery strategy
based on business priorities.
 Analyze the current state of BC readiness and optimize
future BC planning.
Ali Broumandnia, Broumandnia@gmail.com 38
11.6 BC Technology Solutions
After analyzing the business impact of an outage,
designing appropriate solutions to recover from a failure
is the next important activity. One or more copies of the
original data are maintained using any of the following
strategies, so that data can be recovered and business
operations can be restarted using an alternate copy:
Ali Broumandnia, Broumandnia@gmail.com 39
 Backup and recovery
 Storage array-based replication (local)
 Storage array-based replication (remote)
 Host-based replication
Ali Broumandnia, Broumandnia@gmail.com 40
11.7 Concept in Practice: EMC
PowerPath
PowerPath is a host-based multipathing software that
provides path failover and load balancing functionality.
PowerPath operates between operating systems and
device drivers and supports SCSI, iSCSI, and Fibre
Channel environment. It prioritizes I/O bandwidth
utilization by using sophisticated load balancing
algorithms to ensure optimal application performance.
Ali Broumandnia, Broumandnia@gmail.com 41
11.7.1 PowerPath Features
PowerPath provides the following features:
• Online path configuration and management
• Dynamic load balancing across multiple paths
• Automatic path failover
• Proactive path testing
• Cluster support
• InteroperabilityAli Broumandnia, Broumandnia@gmail.com 42
11.7.2 Dynamic Load Balancing
For every I/O, the PowerPath filter driver selects the path
based on the load balancing policy and failover setting
for the logical device. The driver identifies all available
paths that read and write to a device and builds a routing
table called a volume path set for the devices. PowerPath
follows any one of the following user specified load-
balancing policies:
Ali Broumandnia, Broumandnia@gmail.com 43
 Round-Robin policy
 Least I/O s policy
 Least Blocks policy
 Priority-Based policy
Ali Broumandnia, Broumandnia@gmail.com 44
I/O Operation without PowerPath
Figure 11-6 illustrates I/O operations in a storage system
environment in the absence of PowerPath. The
applications running on a host have four paths to the
storage array. However, the applications can use only one
of the paths because the LVM that is native to the host
operating system allows only one path for application I/O
operations.
Ali Broumandnia, Broumandnia@gmail.com 45
This example illustrates how I/O throughput is
unbalanced without PowerPath. Two applications are
generating high I/O traffic, which overloads both paths,
but the other two paths are less loaded. In this scenario,
some paths may be idle or unused while other paths have
multiple I/O operations queued. As a result, the
applications cannot achieve optimal performance.
Ali Broumandnia, Broumandnia@gmail.com 46
Ali Broumandnia, Broumandnia@gmail.com 47
I/O Operation with PowerPath
Figure 11-7 shows I/O operations in a storage system
environment that has PowerPath. PowerPath ensures that
I/O requests are balanced across the four paths to
storage, based on the load-balancing algorithm chosen.
As a result, the applications can effectively utilize their
resources, thereby improving their performance.
Ali Broumandnia, Broumandnia@gmail.com 48
Ali Broumandnia, Broumandnia@gmail.com 49
11.7.3 Automatic Path Failover
The next two examples demonstrate how PowerPath
performs path failover operations in the event of a path
failure for active-active and active-passive array
configurations.
Ali Broumandnia, Broumandnia@gmail.com 50
Path Failure without PowerPath
Figure 11-8 shows a scenario in which applications use
only one of the four paths defined by the operating
system. Without PowerPath, the loss of paths (the path
failure is marked by a cross “X”) due to single points of
failure, such as the loss of an HBA, storage array front-
end connectivity, switch port, or a failed cable, can result
in an outage for one or more applications.
Ali Broumandnia, Broumandnia@gmail.com 51
Ali Broumandnia, Broumandnia@gmail.com 52
Path Failover with PowerPath: Active-Active Array
Figure 11-9 shows a storage system environment in
which an application uses PowerPath with an active-
active array configuration to perform I/O operations.
PowerPath redirects the application I/O s through an
alternate active path.
Ali Broumandnia, Broumandnia@gmail.com 53
Ali Broumandnia, Broumandnia@gmail.com 54
Path Failover with PowerPath: Active-
Passive Array
Figure 11-10 shows a scenario in which a logical device
is assigned to a storage processor B (SP B).
Ali Broumandnia, Broumandnia@gmail.com 55
Ali Broumandnia, Broumandnia@gmail.com 56
Path failure can occur due to a failure of the link, HBA,
or storage processor (SP). In the event of a path failure,
PowerPath with an active-passive configuration performs
the path failover operation in the following way:
Ali Broumandnia, Broumandnia@gmail.com 57
If an active I/O path to SP B through HBA 2 fails,
PowerPath uses a passive path to SP B through HBA 1.
If HBA 2 fails, the application uses HBA 1 to access
the logical device.
If SP B fails, PowerPath stops all I/O to SP B and
trespasses the device over to SP A. All I/O will be sent
down the paths to SP A, this process is referred as LUN
trespassing. When SP B is brought back online,
PowerPath recognizes that it is available and resumes
sending I/O down to SP B.
Ali Broumandnia, Broumandnia@gmail.com 58
Summary
Technology innovations have led to a rich set of options
in terms of storage devices and solutions to meet the
needs of businesses for high availability and business
continuity. The goal of any business continuity (BC) plan
is to identify and implement the most appropriate risk
management and mitigation procedures to protect against
possible failures. The process of analyzing the hardware
and software configuration to identify any single points
of failure and their impact on business operations is
critical.
Ali Broumandnia, Broumandnia@gmail.com 59
A business impact analysis (BIA) helps a company
develop an appropriate BC plan to ensure that the storage
infrastructure and services are designed to meet business
requirements. BC provides the framework for
organizations to implement effective and cost-efficient
disaster recovery and restart procedures. In a constantly
changing business environment, BC can become a
demanding endeavor. The next three chapters discuss
specific BC technology solutions, backup and recovery,
local replication, and remote replication.
Ali Broumandnia, Broumandnia@gmail.com 60

More Related Content

What's hot

How Balabit helps to comply with iso 27001 (infographics)
How Balabit helps to comply with iso 27001  (infographics)How Balabit helps to comply with iso 27001  (infographics)
How Balabit helps to comply with iso 27001 (infographics)Sectricity
 
Wp data center standards (final)
Wp   data center standards (final)Wp   data center standards (final)
Wp data center standards (final)Sri Chalasani
 
Chap5 2007 C I S A Review Course
Chap5 2007 C I S A Review CourseChap5 2007 C I S A Review Course
Chap5 2007 C I S A Review CourseDesmond Devendran
 
Carl Binder Resume Myrtle Beach address 1-24-17
Carl Binder Resume Myrtle Beach address 1-24-17Carl Binder Resume Myrtle Beach address 1-24-17
Carl Binder Resume Myrtle Beach address 1-24-17Carl Binder
 
InfoRelay National Data Centers Overview
InfoRelay National Data Centers OverviewInfoRelay National Data Centers Overview
InfoRelay National Data Centers OverviewJonathan Maxim
 
Get Mainframe Visibility to Enhance SIEM Efforts in Splunk
Get Mainframe Visibility to Enhance SIEM Efforts in SplunkGet Mainframe Visibility to Enhance SIEM Efforts in Splunk
Get Mainframe Visibility to Enhance SIEM Efforts in SplunkPrecisely
 
The Database Environment Chapter 2
The Database Environment Chapter 2The Database Environment Chapter 2
The Database Environment Chapter 2Jeanie Arnoco
 
Security Issues for Cloud Applications
Security Issues for Cloud ApplicationsSecurity Issues for Cloud Applications
Security Issues for Cloud ApplicationsGuillermo Remache
 
The enhancement of logging system accuracy for infrastructure as a service cloud
The enhancement of logging system accuracy for infrastructure as a service cloudThe enhancement of logging system accuracy for infrastructure as a service cloud
The enhancement of logging system accuracy for infrastructure as a service cloudjournalBEEI
 
Nuevo estándar para diseño de data center ANSI/BICSI-002
Nuevo estándar para diseño de data center ANSI/BICSI-002Nuevo estándar para diseño de data center ANSI/BICSI-002
Nuevo estándar para diseño de data center ANSI/BICSI-002Data Center Consultores
 

What's hot (20)

How Balabit helps to comply with iso 27001 (infographics)
How Balabit helps to comply with iso 27001  (infographics)How Balabit helps to comply with iso 27001  (infographics)
How Balabit helps to comply with iso 27001 (infographics)
 
Ch12
Ch12Ch12
Ch12
 
Wp data center standards (final)
Wp   data center standards (final)Wp   data center standards (final)
Wp data center standards (final)
 
Cloud resilience, provisioning
Cloud resilience, provisioning Cloud resilience, provisioning
Cloud resilience, provisioning
 
Chap5 2007 C I S A Review Course
Chap5 2007 C I S A Review CourseChap5 2007 C I S A Review Course
Chap5 2007 C I S A Review Course
 
ppt ch10
ppt ch10ppt ch10
ppt ch10
 
Carl Binder Resume Myrtle Beach address 1-24-17
Carl Binder Resume Myrtle Beach address 1-24-17Carl Binder Resume Myrtle Beach address 1-24-17
Carl Binder Resume Myrtle Beach address 1-24-17
 
U nit 4
U nit 4U nit 4
U nit 4
 
InfoRelay National Data Centers Overview
InfoRelay National Data Centers OverviewInfoRelay National Data Centers Overview
InfoRelay National Data Centers Overview
 
Managerof managerarchitecture
Managerof managerarchitectureManagerof managerarchitecture
Managerof managerarchitecture
 
unit1.pdf
unit1.pdfunit1.pdf
unit1.pdf
 
Get Mainframe Visibility to Enhance SIEM Efforts in Splunk
Get Mainframe Visibility to Enhance SIEM Efforts in SplunkGet Mainframe Visibility to Enhance SIEM Efforts in Splunk
Get Mainframe Visibility to Enhance SIEM Efforts in Splunk
 
Network management
Network management Network management
Network management
 
The Database Environment Chapter 2
The Database Environment Chapter 2The Database Environment Chapter 2
The Database Environment Chapter 2
 
Security Issues for Cloud Applications
Security Issues for Cloud ApplicationsSecurity Issues for Cloud Applications
Security Issues for Cloud Applications
 
The enhancement of logging system accuracy for infrastructure as a service cloud
The enhancement of logging system accuracy for infrastructure as a service cloudThe enhancement of logging system accuracy for infrastructure as a service cloud
The enhancement of logging system accuracy for infrastructure as a service cloud
 
Nuevo estándar para diseño de data center ANSI/BICSI-002
Nuevo estándar para diseño de data center ANSI/BICSI-002Nuevo estándar para diseño de data center ANSI/BICSI-002
Nuevo estándar para diseño de data center ANSI/BICSI-002
 
ITIL # Lecture 9
ITIL # Lecture 9ITIL # Lecture 9
ITIL # Lecture 9
 
Executive DCIM
Executive DCIMExecutive DCIM
Executive DCIM
 
Cloud Auditing
Cloud AuditingCloud Auditing
Cloud Auditing
 

Similar to Chapter 11

Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docx
Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docxBusiness Continuity Plan TemplateCIO Maria Sosa has asked you to p.docx
Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docxfelicidaddinwoodie
 
Business Continuity Detailed Plan
Business Continuity Detailed PlanBusiness Continuity Detailed Plan
Business Continuity Detailed PlanWissam Abdel Baki
 
Integra: Attack of the Business Killing Monster (Infographic)
Integra: Attack of the Business Killing Monster (Infographic)Integra: Attack of the Business Killing Monster (Infographic)
Integra: Attack of the Business Killing Monster (Infographic)Jessica Legg
 
Ultan kinahan dr - minasi 2010
Ultan kinahan   dr - minasi 2010Ultan kinahan   dr - minasi 2010
Ultan kinahan dr - minasi 2010Nathan Winters
 
10 Ways to Better Application-Centric Service Management
10 Ways to Better Application-Centric Service Management10 Ways to Better Application-Centric Service Management
10 Ways to Better Application-Centric Service ManagementLinh Nguyen
 
Network Strategy and Design Final assignment disaster rec
Network Strategy and Design Final assignment disaster recNetwork Strategy and Design Final assignment disaster rec
Network Strategy and Design Final assignment disaster recrosu555
 
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...IBM Rational
 
Analyzing data, performance and impacts in construction
Analyzing data, performance and impacts in constructionAnalyzing data, performance and impacts in construction
Analyzing data, performance and impacts in constructionMichael Pink
 
Disaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an EmergencyDisaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an Emergencysco813f8ko
 
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docx
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docxRUNNING HEADER Disaster Recovery Plan Information and Documentat.docx
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docxanhlodge
 
Business Continuity Workshop Final
Business Continuity Workshop   FinalBusiness Continuity Workshop   Final
Business Continuity Workshop FinalBill Lisse
 
Stop flirting with disaster! Prepare your business for unforeseen disasters.
Stop flirting with disaster! Prepare your business for unforeseen disasters.Stop flirting with disaster! Prepare your business for unforeseen disasters.
Stop flirting with disaster! Prepare your business for unforeseen disasters.Sameer Sule
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdfmykovalenko1
 
Business Continuity Awareness Week 2009
Business Continuity Awareness Week 2009Business Continuity Awareness Week 2009
Business Continuity Awareness Week 2009Brigitte Theuma
 
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docx
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docxRunning Head EXECUTIVE SUMMARY6Executive SummaryS.docx
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docxcowinhelen
 
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMS
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMS
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSsarankamalanathan
 

Similar to Chapter 11 (20)

Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docx
Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docxBusiness Continuity Plan TemplateCIO Maria Sosa has asked you to p.docx
Business Continuity Plan TemplateCIO Maria Sosa has asked you to p.docx
 
Business Continuity Detailed Plan
Business Continuity Detailed PlanBusiness Continuity Detailed Plan
Business Continuity Detailed Plan
 
Integra: Attack of the Business Killing Monster (Infographic)
Integra: Attack of the Business Killing Monster (Infographic)Integra: Attack of the Business Killing Monster (Infographic)
Integra: Attack of the Business Killing Monster (Infographic)
 
Business continuity
Business continuityBusiness continuity
Business continuity
 
Dcca study guide
Dcca study guideDcca study guide
Dcca study guide
 
BCP Awareness
BCP Awareness BCP Awareness
BCP Awareness
 
Ultan kinahan dr - minasi 2010
Ultan kinahan   dr - minasi 2010Ultan kinahan   dr - minasi 2010
Ultan kinahan dr - minasi 2010
 
10 Ways to Better Application-Centric Service Management
10 Ways to Better Application-Centric Service Management10 Ways to Better Application-Centric Service Management
10 Ways to Better Application-Centric Service Management
 
Network Strategy and Design Final assignment disaster rec
Network Strategy and Design Final assignment disaster recNetwork Strategy and Design Final assignment disaster rec
Network Strategy and Design Final assignment disaster rec
 
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...
4.3.2013 Tool Demo - Traditional Development RTC Planning In Context Collabor...
 
Analyzing data, performance and impacts in construction
Analyzing data, performance and impacts in constructionAnalyzing data, performance and impacts in construction
Analyzing data, performance and impacts in construction
 
Disaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an EmergencyDisaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an Emergency
 
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docx
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docxRUNNING HEADER Disaster Recovery Plan Information and Documentat.docx
RUNNING HEADER Disaster Recovery Plan Information and Documentat.docx
 
Business Continuity Workshop Final
Business Continuity Workshop   FinalBusiness Continuity Workshop   Final
Business Continuity Workshop Final
 
Stop flirting with disaster! Prepare your business for unforeseen disasters.
Stop flirting with disaster! Prepare your business for unforeseen disasters.Stop flirting with disaster! Prepare your business for unforeseen disasters.
Stop flirting with disaster! Prepare your business for unforeseen disasters.
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdf
 
Ijcet 06 07_005
Ijcet 06 07_005Ijcet 06 07_005
Ijcet 06 07_005
 
Business Continuity Awareness Week 2009
Business Continuity Awareness Week 2009Business Continuity Awareness Week 2009
Business Continuity Awareness Week 2009
 
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docx
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docxRunning Head EXECUTIVE SUMMARY6Executive SummaryS.docx
Running Head EXECUTIVE SUMMARY6Executive SummaryS.docx
 
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMS
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMS
BCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMSBCMS
 

More from Ali Broumandnia (14)

Chapter 06 eng
Chapter 06 engChapter 06 eng
Chapter 06 eng
 
Chapter 16
Chapter 16Chapter 16
Chapter 16
 
Chapter 14
Chapter 14Chapter 14
Chapter 14
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Chapter 9
Chapter 9Chapter 9
Chapter 9
 
Chapter 8
Chapter 8Chapter 8
Chapter 8
 
Chapter 7
Chapter 7Chapter 7
Chapter 7
 
Chapter 6
Chapter 6Chapter 6
Chapter 6
 
Chapter 5
Chapter 5Chapter 5
Chapter 5
 
Chapter 4
Chapter 4Chapter 4
Chapter 4
 
Chapter 3
Chapter 3Chapter 3
Chapter 3
 
Chapter 1
Chapter 1Chapter 1
Chapter 1
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 

Recently uploaded

Intro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdfIntro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdfrs7054576148
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptNANDHAKUMARA10
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 

Recently uploaded (20)

Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Intro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdfIntro To Electric Vehicles PDF Notes.pdf
Intro To Electric Vehicles PDF Notes.pdf
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 

Chapter 11

  • 1. Information Storage and Management Storing, Managing, and Protecting Digital Information Ali Broumandnia, Broumandnia@gmail.com 1
  • 2. Section III Business Continuity In This Section: • Chapter 11: Introduction to Business Continuity • Chapter 12: Backup and Recovery • Chapter 13: Local Replication • Chapter 14: Remote Replication Ali Broumandnia, Broumandnia@gmail.com 2
  • 3. Chapter 11 Introduction to Business Continuity • 11.1 Information Availability • 11.2 BC Terminology • 11.3 BC Planning Lifecycle • 11.4 Failure Analysis • 11.5 Business Impact Analysis • 11.6 BC Technology Solutions • 11.7 Concept in Practice: EMC PowerPath • Summary Ali Broumandnia, Broumandnia@gmail.com 3
  • 4. Key Concepts: • Business Continuity • Information Availability • Disaster Recovery • Disaster Restart • BC Planning • Business Impact Analysis Ali Broumandnia, Broumandnia@gmail.com 4
  • 5. This chapter describes the factors that affect information availability. It also explains how to create an effective BC plan and design fault-tolerant mechanisms to protect against single points of failure. Ali Broumandnia, Broumandnia@gmail.com 5
  • 6. 11.1 Information Availability Information availability (IA) refers to the ability of the infrastructure to function according to business expectations during its specified time of operation. Information availability ensures that people (employees, customers, suppliers, and partners) can access information whenever they need it. Information availability can be defined with the help of reliability, accessibility and timeliness. Ali Broumandnia, Broumandnia@gmail.com 6
  • 7. • Reliability: This reflects a component’s ability to function without failure, under stated conditions, for a specified amount of time. • Accessibility: This is the state within which the required information is accessible at the right place, to the right user. The period of time during which the system is in an accessible state is termed system uptime; when it is not accessible it is termed system downtime. Ali Broumandnia, Broumandnia@gmail.com 7
  • 8. • Timeliness: Defines the exact moment or the time window (a particular time of the day, week, month, and/or year as specified) during which information must be accessible. For example, if online access to an application is required between 8:00 am and 10:00 pm each day, any disruptions to data availability outside of this time slot are not considered to affect timeliness. Ali Broumandnia, Broumandnia@gmail.com 8
  • 9. 11.1.1 Causes of Information Unavailability Various planned and unplanned incidents result in data unavailability. Planned outages include installation/integration/maintenance of new hardware, software upgrades or patches, taking backups, application and data restores, facility operations (renovation and construction), and refresh/migration of the testing to the production environment. Unplanned outages include failure caused by database corruption, component failure, and human errors.Ali Broumandnia, Broumandnia@gmail.com 9
  • 10. Another type of incident that may cause data unavailability is natural or man-made disasters such as flood, fire, earthquake, and contamination. As illustrated in Figure 11-1, the majority of outages are planned. Planned outages are expected and scheduled, but still cause data to be unavailable. Statistically, less than 1 percent is likely to be the result of an unforeseen disaster. Ali Broumandnia, Broumandnia@gmail.com 10
  • 12. 11.1.2 Measuring Information Availability Information availability relies on the availability of the hardware and software components of a data center. Failure of these components might disrupt information availability. A failure is the termination of a component’s ability to perform a required function. The component’s ability can be restored by performing an external corrective action, such as a manual reboot, a repair, or replacement of the failed component(s). Ali Broumandnia, Broumandnia@gmail.com 12
  • 13. • Mean Time Between Failure (MTBF): It is the average time available for a system or component to perform its normal operations between failures. • Mean Time To Repair (MTTR): It is the average time required to repair a failed component. While calculating MTTR, it is assumed that the fault responsible for the failure is correctly identified and that the required spares and personnel are available. Ali Broumandnia, Broumandnia@gmail.com 13
  • 14. Uptime per year is based on the exact timeliness requirements of the service, this calculation leads to the number of “9s” representation for availability metrics. Table 11-1 lists the approximate amount of downtime allowed for a service to achieve certain levels of 9s availability. Ali Broumandnia, Broumandnia@gmail.com 14
  • 16. 11.1.3 Consequences of Downtime Data unavailability, or downtime, results in loss of productivity, loss of revenue, poor financial performance, and damages to reputation. Loss of productivity reduces the output per unit of labor, equipment, and capital. Loss of revenue includes direct loss, compensatory payments, future revenue losses, billing losses, and investment losses. Ali Broumandnia, Broumandnia@gmail.com 16
  • 17. 11.2 BC Terminology This section introduces and defines common terms related to BC operations and are used in the next few chapters to explain advanced concepts: • Disaster recovery • Disaster restart • Recovery-Point Objective (RPO) Ali Broumandnia, Broumandnia@gmail.com 17
  • 18. • RPO of 24 hours • RPO of 1 hour • RPO of zero  Recovery-Time Objective (RTO) A. RTO of 72 hours B. RTO of 12 hours C. RTO of 4 hours D. RTO of 1 hour E. RTO of a few seconds Ali Broumandnia, Broumandnia@gmail.com 18
  • 20. 11.3 BC Planning Lifecycle BC planning must follow a disciplined approach like any other planning process. Organizations today dedicate specialized resources to develop and maintain BC plans. From the conceptualization to the realization of the BC plan, a lifecycle of activities can be defined for the BC process. The BC planning lifecycle includes five stages (see Figure 11-3): Ali Broumandnia, Broumandnia@gmail.com 20
  • 21. 1. Establishing objectives 2. Analyzing 3. Designing and developing 4. Implementing 5. Training, testing, assessing, and maintaining Several activities are performed at each stage of the BC planning lifecycle, including the following key activities: Ali Broumandnia, Broumandnia@gmail.com 21
  • 23. 1. Establishing objectives  Determine BC requirements.  Estimate the scope and budget to achieve requirements.  Select a BC team by considering subject matter experts from all areas of the business, whether internal or external.  Create BC policies. Ali Broumandnia, Broumandnia@gmail.com 23
  • 24. 2. Analyzing  Collect information on data profiles, business processes, infrastructure support, dependencies, and frequency of using business infrastructure.  Identify critical business needs and assign recovery priorities.  Create a risk analysis for critical areas and mitigation strategies.  Conduct a Business Impact Analysis (BIA).  Create a cost and benefit analysis based on the consequences of data unavailability.Ali Broumandnia, Broumandnia@gmail.com 24
  • 25.  Evaluate options. 3. Designing and developing  Define the team structure and assign individual roles and responsibilities.  Design data protection strategies and develop infrastructure.  Develop contingency scenarios.  Develop emergency response procedures.  Detail recovery and restart procedures. Ali Broumandnia, Broumandnia@gmail.com 25
  • 26. 4. Implementing  Implement risk management and mitigation procedures that include backup, replication, and management of resources.  Prepare the disaster recovery sites that can be utilized if a disaster affects the primary data center.  Implement redundancy for every resource in a data center to avoid single points of failure. Ali Broumandnia, Broumandnia@gmail.com 26
  • 27. 5. Training, testing, assessing, and maintaining  Train the employees who are responsible for backup and replication of business-critical data on a regular basis or whenever there is a modification in the BC plan.  Train employees on emergency response procedures when disasters are declared.  Train the recovery team on recovery procedures based on contingency scenarios.  Perform damage assessment processes and review recovery plans. Ali Broumandnia, Broumandnia@gmail.com 27
  • 28.  Test the BC plan regularly to evaluate its performance and identify its limitations.  Assess the performance reports and identify limitations.  Update the BC plans and recovery/restart procedures to reflect regular changes within the data center. Ali Broumandnia, Broumandnia@gmail.com 28
  • 29. 11.4 Failure Analysis Failure analysis involves analyzing the data center to identify systems that are susceptible to a single point of failure and implementing fault-tolerance mechanisms such as redundancy. Ali Broumandnia, Broumandnia@gmail.com 29
  • 30. 11.4.1 Single Point of Failure A single point of failure refers to the failure of a component that can terminate the availability of the entire system or IT service. Figure 11-4 illustrates the possibility of a single point of failure in a system with various components: server, network, switch, and storage array. The figure depicts a system setup in which an application running on the server provides an interface to the client and performs I/O operations. Ali Broumandnia, Broumandnia@gmail.com 30
  • 32. 11.4.2 Fault Tolerance To mitigate a single point of failure, systems are designed with redundancy, such that the system will fail only if all the components in the redundancy group fail. This ensures that the failure of a single component does not affect data availability. Figure 11-5 illustrates the fault-tolerant implementation of the system just described (and shown in Figure 11-4). Ali Broumandnia, Broumandnia@gmail.com 32
  • 34. 11.4.3 Multipathing Software Configuration of multiple paths increases the data availability through path failover. If servers are configured with one I/O path to the data there will be no access to the data if that path fails. Redundant paths eliminate the path to become single points of failure. Multiple paths to data also improve I/O performance through load sharing and maximize server, storage, and data path utilization. Ali Broumandnia, Broumandnia@gmail.com 34
  • 35. 11.5 Business Impact Analysis A business impact analysis (BIA) identifies and evaluates financial, operational, and service impacts of a disruption to essential business processes. Selected functional areas are evaluated to determine resilience of the infrastructure to support information availability. Ali Broumandnia, Broumandnia@gmail.com 35
  • 36. The BIA process leads to a report detailing the incidents and their impact over business functions. The impact may be specified in terms of money or in terms of time. Based on the potential impacts associated with downtime, businesses can prioritize and implement countermeasures to mitigate the likelihood of such disruptions. These are detailed in the BC plan. A BIA includes the following set of tasks: Ali Broumandnia, Broumandnia@gmail.com 36
  • 37.  Identify the key business processes critical to its operation.  Determine the attributes of the business process in terms of applications, databases, and hardware and software requirements.  Estimate the costs of failure for each business process.  Calculate the maximum tolerable outage and define RTO and RPO for each business process. Ali Broumandnia, Broumandnia@gmail.com 37
  • 38.  Establish the minimum resources required for the operation of business processes.  Determine recovery strategies and the cost for implementing them.  Optimize the backup and business recovery strategy based on business priorities.  Analyze the current state of BC readiness and optimize future BC planning. Ali Broumandnia, Broumandnia@gmail.com 38
  • 39. 11.6 BC Technology Solutions After analyzing the business impact of an outage, designing appropriate solutions to recover from a failure is the next important activity. One or more copies of the original data are maintained using any of the following strategies, so that data can be recovered and business operations can be restarted using an alternate copy: Ali Broumandnia, Broumandnia@gmail.com 39
  • 40.  Backup and recovery  Storage array-based replication (local)  Storage array-based replication (remote)  Host-based replication Ali Broumandnia, Broumandnia@gmail.com 40
  • 41. 11.7 Concept in Practice: EMC PowerPath PowerPath is a host-based multipathing software that provides path failover and load balancing functionality. PowerPath operates between operating systems and device drivers and supports SCSI, iSCSI, and Fibre Channel environment. It prioritizes I/O bandwidth utilization by using sophisticated load balancing algorithms to ensure optimal application performance. Ali Broumandnia, Broumandnia@gmail.com 41
  • 42. 11.7.1 PowerPath Features PowerPath provides the following features: • Online path configuration and management • Dynamic load balancing across multiple paths • Automatic path failover • Proactive path testing • Cluster support • InteroperabilityAli Broumandnia, Broumandnia@gmail.com 42
  • 43. 11.7.2 Dynamic Load Balancing For every I/O, the PowerPath filter driver selects the path based on the load balancing policy and failover setting for the logical device. The driver identifies all available paths that read and write to a device and builds a routing table called a volume path set for the devices. PowerPath follows any one of the following user specified load- balancing policies: Ali Broumandnia, Broumandnia@gmail.com 43
  • 44.  Round-Robin policy  Least I/O s policy  Least Blocks policy  Priority-Based policy Ali Broumandnia, Broumandnia@gmail.com 44
  • 45. I/O Operation without PowerPath Figure 11-6 illustrates I/O operations in a storage system environment in the absence of PowerPath. The applications running on a host have four paths to the storage array. However, the applications can use only one of the paths because the LVM that is native to the host operating system allows only one path for application I/O operations. Ali Broumandnia, Broumandnia@gmail.com 45
  • 46. This example illustrates how I/O throughput is unbalanced without PowerPath. Two applications are generating high I/O traffic, which overloads both paths, but the other two paths are less loaded. In this scenario, some paths may be idle or unused while other paths have multiple I/O operations queued. As a result, the applications cannot achieve optimal performance. Ali Broumandnia, Broumandnia@gmail.com 46
  • 48. I/O Operation with PowerPath Figure 11-7 shows I/O operations in a storage system environment that has PowerPath. PowerPath ensures that I/O requests are balanced across the four paths to storage, based on the load-balancing algorithm chosen. As a result, the applications can effectively utilize their resources, thereby improving their performance. Ali Broumandnia, Broumandnia@gmail.com 48
  • 50. 11.7.3 Automatic Path Failover The next two examples demonstrate how PowerPath performs path failover operations in the event of a path failure for active-active and active-passive array configurations. Ali Broumandnia, Broumandnia@gmail.com 50
  • 51. Path Failure without PowerPath Figure 11-8 shows a scenario in which applications use only one of the four paths defined by the operating system. Without PowerPath, the loss of paths (the path failure is marked by a cross “X”) due to single points of failure, such as the loss of an HBA, storage array front- end connectivity, switch port, or a failed cable, can result in an outage for one or more applications. Ali Broumandnia, Broumandnia@gmail.com 51
  • 53. Path Failover with PowerPath: Active-Active Array Figure 11-9 shows a storage system environment in which an application uses PowerPath with an active- active array configuration to perform I/O operations. PowerPath redirects the application I/O s through an alternate active path. Ali Broumandnia, Broumandnia@gmail.com 53
  • 55. Path Failover with PowerPath: Active- Passive Array Figure 11-10 shows a scenario in which a logical device is assigned to a storage processor B (SP B). Ali Broumandnia, Broumandnia@gmail.com 55
  • 57. Path failure can occur due to a failure of the link, HBA, or storage processor (SP). In the event of a path failure, PowerPath with an active-passive configuration performs the path failover operation in the following way: Ali Broumandnia, Broumandnia@gmail.com 57
  • 58. If an active I/O path to SP B through HBA 2 fails, PowerPath uses a passive path to SP B through HBA 1. If HBA 2 fails, the application uses HBA 1 to access the logical device. If SP B fails, PowerPath stops all I/O to SP B and trespasses the device over to SP A. All I/O will be sent down the paths to SP A, this process is referred as LUN trespassing. When SP B is brought back online, PowerPath recognizes that it is available and resumes sending I/O down to SP B. Ali Broumandnia, Broumandnia@gmail.com 58
  • 59. Summary Technology innovations have led to a rich set of options in terms of storage devices and solutions to meet the needs of businesses for high availability and business continuity. The goal of any business continuity (BC) plan is to identify and implement the most appropriate risk management and mitigation procedures to protect against possible failures. The process of analyzing the hardware and software configuration to identify any single points of failure and their impact on business operations is critical. Ali Broumandnia, Broumandnia@gmail.com 59
  • 60. A business impact analysis (BIA) helps a company develop an appropriate BC plan to ensure that the storage infrastructure and services are designed to meet business requirements. BC provides the framework for organizations to implement effective and cost-efficient disaster recovery and restart procedures. In a constantly changing business environment, BC can become a demanding endeavor. The next three chapters discuss specific BC technology solutions, backup and recovery, local replication, and remote replication. Ali Broumandnia, Broumandnia@gmail.com 60