Chapter 32
Disaster Recovery, Business Continuity,
Backups, and High Availability
Copyright © 2014 by McGraw-Hill Education.
Introduction
Disaster recovery and business continuity planning are separate
but related concepts. In fact, disaster recovery is part of
business continuity.
Disaster recovery (DR) concerns the recovery of the technical
components of your business, such as computers, software, the
network, data, and so on.
Business continuity planning (BCP) includes disaster recovery
along with procedures to restore business operations and the
underlying functionality of the business infrastructure needed to
support the business, along with the resumption of the daily
work of the people in your workplace. Business continuity
planning is vital to keeping your business running and to
providing a return to “business as usual” during a disaster.
What Constitutes a Disaster?
A disaster is defined as a “sudden, unplanned calamitous event
causing great damage or loss” or “any event that creates an
inability on an organization’s part to provide critical business
functions for some predetermined period of time.”
With this general definition in mind, the disaster recovery
planner or business continuity professional would sit down with
all the principals in the organization and map out what would
constitute a disaster for that organization. This is the initial
stage of creating a business impact analysis (BIA).
Service Assurance Methods
DR and BCP professionals work together to ensure the
recoverability and continuity of all aspects of an organization
that are affected by an outage or security event. This chapter
analyzes the best practices and methodologies for DR and BCP.
We also give close consideration to backups, which are
necessary for disaster recovery as well as recovery from less
severe incidents. Tape backups, which have traditionally been a
key component of DR strategies to move data from the primary
data center to the backup site, are giving way to online,
real-time data replication strategies to keep data synchronized.
We consider high availability in the final section of this
chapter. All three of these components (DR/BCP, backups, and
HA) form the core of a resiliency strategy for services and data.
Disaster Recovery
When you put together a disaster recovery plan, you need to
understand how your organization’s information technology (IT)
infrastructure, applications, and network support the business
functions of the enterprise you are recovering.
For example, a particular business unit may claim not to need a
certain application or function until day three of a disaster, but the
technology process may dictate that the application should be
available on day one, due to technological interdependencies. In
this example, the DR planner should work with (and educate)
the business unit to help them understand why they need to pay
for a day-one recovery as opposed to a day-three recovery. The
business unit’s budget will typically include a sizeable expense
for the IT department, and this may cause the business unit to
think that any disaster recovery or business continuity efforts
will be cost prohibitive. In working with the IT subject matter
experts (SMEs), you can sometimes figure out a way to bypass a
particular electronic feed or file dependency that may be needed
to continue the recovery of your system.
Determining What to Recover
All of this will work well if you know what you are recovering
and who to consult with. The responsible business continuity or
disaster recovery professional should work with the IT group
and the business unit to achieve one purpose—to operate a fine,
productive, and lucrative organization.
You can come to know what you are recovering and who is
involved by gathering experts, such as the programmer, business
analyst, system architect, or any other necessary SME. These
experts will prove to be invaluable when it comes to creating
your DR plan. They know what it takes to technically run the
business systems in question and can explain why a certain
disaster recovery process will cost a certain amount. This
information is important for the manager of the business unit, so
that she can make informed decisions.
Business Continuity Planning
The business continuity professional is more concerned with the
business functions that the employees perform than with the
underlying technologies. To figure out how the business can
resume normal operations during a disaster, the business
continuity professional needs to work with each business unit as
closely as possible. This means they need to meet with the
people who make the decisions, the people who carry out the
decisions in the management team, and finally the “worker
bees” who actually do the work.
You can think of the “worker bees” as power users who know an
application intimately. They know the nuances and
idiosyncrasies of the business function—they are looking at the
trees as opposed to the forest. This is important when it comes
to preparing the business unit’s business continuity plan. The
power users should participate in your disaster recovery
rehearsals and business continuity tabletop exercises.
Management Team
The business unit management team is vital because its
members see the business unit from a business perspective—at a
higher level—and will help in determining the importance of the
application, as they are acquainted with the mission of the
business unit. The business unit also needs to keep in mind the
need for a disaster recovery plan as it introduces new or
upgraded program applications. The disaster recovery and/or
business continuity professional should be kept informed about
such changes.
For example, a member of management in a business unit might
talk to a vendor about a product that could make a current
business function quicker, smarter, and better. Being the
diligent manager, he would bring the vendor in to meet with
upper management, and the decision would be made to buy the
product, all without informing the IT department or the disaster
recovery or business continuity professional.
As you can see, the business continuity professional needs to
have a relationship with every principal within the business unit
so that, should a new product be brought into the organization,
the knowledge and ability to recover the product will be taken
into consideration.
The Four Components of Business Continuity Planning
There are four main components of business continuity
planning, each of which is essential to the whole BCP initiative:
Plan initiation
Business impact analysis or assessment
Development of the recovery strategies
Rehearsal or exercise of the disaster recovery and business
continuity plans
Each business unit should have its own plan. The organization
as a whole needs to have a global plan, encompassing all the
business units. There should be two plans that work in tandem:
a business continuity plan (recovery of the people and business
function) and a disaster recovery plan (technological and
application recovery).
Initiating a Plan
Plan initiation puts everyone on the same page at the beginning
of the creation of the plan. A disaster or event is defined from
the perspective of the specific business unit or entire
organization. What one business unit or organization considers
a disaster may not be considered a disaster by another business
unit or organization, and vice versa.
A BIA is important for several reasons. It provides an
organization or business unit with a dollar value impact
for an unexpected event. This indicates how long an
organization can have its business interrupted before it will go
out of business completely.
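The dollar-value reasoning can be illustrated with a small calculation. This is only a sketch: the figures are hypothetical, the function names are invented for illustration, and a real BIA would not assume a flat loss per day.

```python
def cumulative_loss(loss_per_day: float, days: int) -> float:
    """Estimated total loss after `days` of interruption (flat daily loss assumed)."""
    return loss_per_day * days


def max_tolerable_days(cash_reserve: float, loss_per_day: float) -> int:
    """Whole days of interruption the reserve can absorb before it runs out."""
    if loss_per_day <= 0:
        raise ValueError("loss_per_day must be positive")
    return int(cash_reserve // loss_per_day)


if __name__ == "__main__":
    # Hypothetical figures for a single business unit.
    reserve = 500_000.0
    daily_loss = 40_000.0
    print("Days the unit can absorb:", max_tolerable_days(reserve, daily_loss))
    print("Loss after a three-day outage:", cumulative_loss(daily_loss, 3))
```

Even a rough estimate like this gives the business unit a concrete number to weigh against the cost of each recovery option.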
Events
Here are three examples of possible events that could impact
your business and compel you to implement your disaster
recovery or business continuity plan, along with some possible
responses:
Hurricane: Because a hurricane can be predicted a reasonable
amount of time before it strikes, you have time to inform
employees to prepare their homes and other personal effects.
You also have the time to alert your technology group so that
they can initiate their preparation strategy procedures.
Blackout: You can ensure that your enterprise is attached to a
backup generator or an uninterruptible power supply (UPS).
You can conduct awareness programs and perhaps give away
small flashlights that employees can keep in their desks.
Illness outbreak: You can provide an offsite facility where your
employees can relocate during the outbreak and investigation.
Analyzing the Business Impact
With a BIA, you must first establish what the critical business
function is. This can be determined only by the critical members
of the business unit.
The BIA should be completed and reviewed by the business
unit, including upper management, since the financing of the
business continuity plan and disaster recovery project will
ultimately come from the business unit’s coffers.
Developing Recovery Strategies
The next step is to develop your recovery strategy. The business
unit will be paying for the recovery, so they need to know what
their options are for different types of recoveries.
You can provide anything from a no-frills recovery to an
instantaneous recovery. It all depends on the business functions
that have to be recovered and on how long the business unit can
go without the function.
The question is essentially how much insurance the business
unit wants to buy. If it is your business, you are the only one
who can make that decision. Someone who does not have as
large a stake in the growth of the business cannot look at the
business from the same perspective.
Procedures and Contacts
In a business recovery situation, there must be written
procedures that all employees in your business unit can quickly
access, understand, and follow. Information needs to be readily
available about the business function that has to be performed.
The procedures should be stored in multiple, accessible
locations to ensure they are available in a disaster scenario.
You also need to make readily available a list of people to
contact, along with their contact information. This list must be
of the current employees to contact, and it should include
members of the Human Resources, Facilities, Risk Management,
and Legal departments. The list of contacts should also include
the local fire and rescue department, police department, and
emergency operations center.
Rehearsing Disaster Recovery and Business Continuity Plans
The fourth BCP component, and the most crucial, is to rehearse,
exercise, or test the plan. This is “where the rubber meets the
road.”
Having the other three components in place is important, but the
plan is inadequate if you’re not sure whether it will work. It is
vital to test your plan. If the plan has not been tested and it fails
during a disaster, all the work you put into developing it is for
naught. If the plan fails during a test, though, you can improve
on it and test again.
Third-Party Vendor Issues
Most organizations make use of various third-party vendors
(Enterprise Resource Planning [ERP], Application Service
Provider [ASP], etc.) in their recovery efforts. In such cases,
the information about the third-party vendor is just as critical in
your business or technology recovery. When you need to make
use of such resources, it is beneficial, if not crucial, to make
inquiries into the third party’s operations prior to the
implementation of its product or services.
In the real world, the disaster recovery and/or business
continuity professional has to integrate the vendor’s information
into the business unit’s continuity plan. If a critical path in your
DR plan depends on the involvement of a third-party vendor,
you can’t get your operation up and running if that third-party
vendor isn’t prepared to assist you. For example, suppose that
processing loans is the bread and butter of your business, and
your business relies on credit bureau reports to process loans. In
this scenario, you need to ensure that if your organization
experiences an outage, you will still receive these reports so
that your company can continue to conduct business.
The vendor’s ability to recover from a failure will also affect
how robust your recovery is. Although your recovery may be
technically sound, you must be sure that you can conduct
business. The same standards you apply to your own
organization should apply to third-party vendors you do
business with. They should be available to you to conduct
business. The disaster recovery or business continuity
coordinator should make the appropriate inquiries with vendors
to ensure that they can support a DR scenario.
Awareness and Training Programs
Another important element of disaster recovery and business
continuity planning is an awareness program. The business
continuity or disaster recovery professional can meet with each
business unit for tabletop exercises. These exercises are important,
because they actually get the members of the business unit to sit
down and think about a particular event and how first to prevent
or mitigate it and then how to recover from it.
The event can be anything from a category 3 hurricane to
workplace violence. Any work stoppage can potentially impede
the progress of an organization’s recovery or resumption of
services, and it is up to the management team to design or
develop a plan of action or a business continuity plan. The
business continuity or disaster recovery professional must
facilitate this process and make the business unit aware that
there are events that can bring the business to a grinding halt.
Backups
Backups may be used for complete system restoration, but they
can also allow you to recover the contents of a mailbox, for
example, or an “accidentally” deleted document. Backups can
be extended to saving more than just digital data. Backup
processes can include the backup of specifications and
configurations, policies and procedures, equipment, and data
centers.
However, if the backup is not good or is too old, or the backup
media is damaged, it will not fix the problem. Just having a
backup procedure in place does not always offer adequate
protection.
Many organizations can no longer depend on traditional backup
processes—doing an offline backup is unacceptable, doing an
online backup would unacceptably degrade system performance,
and restoring from a backup would take so much time that the
organization could not recover. Such organizations are using
alternatives to traditional backups, such as redundant systems
and cloud services.
Backup systems and processes, therefore, reflect the availability
needs of an organization as well as its recovery needs.
Traditional Backup Methods
In the traditional backup process, data is copied to backup
media, primarily tape, in a predictable and orderly fashion for
secure storage both onsite and offsite.
Backup media can thus be made available to restore data to new
or repaired systems after failure. In addition to data, modern
operating systems and application configurations are also
backed up.
This provides faster restore capabilities and occasionally may
be the only way to restore systems where applications that
support data are intimately integrated with a specific system.
Backup Types
There are several standard types of backups:
Full
Copy
Incremental
Differential
Full Backups
Backs up all data selected, whether or not it has changed since
the last backup. The definition of a full backup varies on
different systems. On some systems it includes critical
operating system files needed to rebuild a system completely;
on other systems it backs up only the user data.
Copy Backups
Data is copied from one disk to another.
Incremental Backups
When data is backed up, the archive bit on a file is turned off.
When changes are made to the file, the archive bit is set again.
An incremental backup uses this information to back up only
files that have changed since the last backup. This backup turns
the archive bit off again, and the next incremental backup backs
up only the files that have changed since the last incremental
backup. This backup type saves time, but it means that the
restore process will involve restoring the last full backup and
every incremental backup made after it.
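The archive-bit behavior described above can be modeled in a few lines. This is a toy simulation, not how any particular backup product works: the file system is a plain dictionary mapping a file name to its archive bit, and all function names are invented for illustration.

```python
def full_backup(files: dict[str, bool]) -> set[str]:
    """Back up every file and turn every archive bit off."""
    backed_up = set(files)
    for name in files:
        files[name] = False
    return backed_up


def incremental_backup(files: dict[str, bool]) -> set[str]:
    """Back up only files whose archive bit is set, then turn those bits off."""
    backed_up = {name for name, bit in files.items() if bit}
    for name in backed_up:
        files[name] = False
    return backed_up


def modify(files: dict[str, bool], name: str) -> None:
    """Changing (or creating) a file sets its archive bit."""
    files[name] = True


if __name__ == "__main__":
    files = {"a.txt": True, "b.txt": True, "c.txt": True}
    chain = [full_backup(files)]            # Sunday: everything
    modify(files, "a.txt")
    chain.append(incremental_backup(files))  # Monday: only a.txt
    modify(files, "b.txt")
    chain.append(incremental_backup(files))  # Tuesday: only b.txt
    # A restore needs every entry in `chain`, applied in order.
    for number, contents in enumerate(chain):
        print(number, sorted(contents))
```

Each incremental set is small, which is why the backups are fast, but the restore has to replay the whole chain.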
Figure: Restoring from an incremental backup requires that all backups
be applied (the full backup and every incremental backup made after it).
Differential Backups
Like an incremental backup, a differential backup only backs up
files with the archive bit set—files that have changed since the
last backup. Unlike an incremental backup, however, a
differential backup does not reset the archive bit.
Each differential backup backs up all files that have changed
since the last backup that reset the bits. Using this strategy, a
full backup is followed by differential backups.
A restore consists of restoring the full backup and then only the
last differential backup made. This saves time during the
restore, but, depending on your system, creating differential
backups takes longer than creating incremental backups.
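The difference in restore effort can be sketched as a function that picks which backups a restore needs. The function name and the string labels are assumptions made for this example, and it deliberately handles only the two pure schemes described here (full plus incrementals, or full plus differentials).

```python
def restore_chain(history: list[str]) -> list[int]:
    """Return the positions of the backups needed to restore the latest state.

    `history` lists backup types in the order they were taken, using the
    labels "full", "incremental", and "differential".
    """
    if "full" not in history:
        raise ValueError("a restore needs at least one full backup")
    # Position of the most recent full backup.
    last_full = len(history) - 1 - history[::-1].index("full")
    later = list(range(last_full + 1, len(history)))
    kinds = {history[i] for i in later}
    if kinds <= {"incremental"}:
        return [last_full] + later        # full plus every incremental
    if kinds <= {"differential"}:
        return [last_full] + later[-1:]   # full plus only the last differential
    raise ValueError("mixed schemes are outside this sketch")


if __name__ == "__main__":
    print(restore_chain(["full", "incremental", "incremental", "incremental"]))
    print(restore_chain(["full", "differential", "differential", "differential"]))
```

With three backups after the full one, the incremental scheme restores four sets while the differential scheme restores two.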
Figure: Restoring from a differential backup requires applying only the
full backup and the last differential backup.
Backup Rotation Strategies
In the traditional backup process, old backups are usually not
immediately replaced by the new backup. Instead, multiple
previous copies of backups are kept. This ensures recovery
should one backup tape set be damaged or otherwise be found
not to be good. Two traditional backup rotation strategies are
Grandfather-Father-Son (GFS) and Tower of Hanoi.
GFS Backup Strategy
In the GFS rotation strategy, a backup is made to separate
media each day.
Each Sunday a full backup is made, and on each other day of the week an
incremental backup is made.
The Sunday backups are kept for a month, and the current
week’s incremental backups are also kept.
On the first Sunday of the month, a new tape or disk is used to
make a full backup. The previous full backup becomes the last
full backup of the prior month and is re-labeled as a monthly
backup.
Weekly and daily tapes are rotated as needed, with the oldest
being used for the current backup.
Thus, on any given day of the month, that week’s backup is
available, as well as the previous four or five weeks’ full
backups, along with the incremental backups taken each day of
the preceding week. If the backup scheme has been in use for a
while, prior months’ backups are also available.
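One way to picture the GFS schedule is a function that names the role of the backup taken on a given date. This is a minimal sketch: the function name and role labels are invented here, and it assumes the Sunday-based schedule described above, in which the last Sunday full backup of a month is the one later re-labeled as the monthly backup.

```python
from datetime import date, timedelta


def gfs_role(day: date) -> str:
    """Classify the backup taken on `day` under a Sunday-based GFS rotation."""
    if day.weekday() != 6:  # date.weekday(): Monday is 0, Sunday is 6
        return "daily incremental"
    # If the following Sunday falls in a new month, this is the last full
    # backup of the current month and will be kept as the monthly backup.
    if (day + timedelta(days=7)).month != day.month:
        return "weekly full (kept as monthly)"
    return "weekly full"


if __name__ == "__main__":
    for d in (date(2014, 6, 2), date(2014, 6, 22), date(2014, 6, 29)):
        print(d.isoformat(), gfs_role(d))
```

Backup software normally tracks this for you; the point of the sketch is only that the rotation is a simple function of the calendar.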
Note:
No backup strategy is complete without plans to test backup
media and backups by doing a restore. If a backup is unusable,
it’s worse than having no backup at all, because it has lured
users into a sense of security. Be sure to add the testing of
backups to your backup strategy, and do this on a test system.
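A restore test can be partly automated by comparing checksums of the restored files against a reference copy. The sketch below is one possible approach, not a procedure from the chapter: the function names are invented, and it assumes the reference tree has not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(source_dir: Path, restored_dir: Path) -> list[str]:
    """List relative paths that are missing from, or differ in, the restored tree."""
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every reference file was restored intact; anything else names the files to investigate before the backup is trusted.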
The Tower of Hanoi Backup Strategy
The Tower of Hanoi strategy is based on a game played with
three poles and a number of rings. The object is to move the
rings from their starting point on one pole to another pole.
However, the rings are of different sizes, and you are not
allowed to have a ring on top of one that is smaller than itself.
To accomplish the task, a certain order must be followed.
Consider a simple version of the Tower of Hanoi, in which you
are given three pegs, one of which has three rings stacked on it
from largest at the bottom to smallest at the top. Call these
rings A (small), B (medium), and C (large). You need to move
the rings to the right-hand peg. How do you solve this puzzle?
Tower of Hanoi Solution
The solution is to move
A to the right-hand peg,
then B to the middle peg,
A on top of B on the middle peg,
then C to the right-hand peg,
then A to the now-empty left-hand peg,
B on top of C on the right-hand peg,
and finally A on top of B to complete the stack on the right-
hand peg.
The rings were moved in this order: A B A C A B A. If you
solve this puzzle with four rings labeled A through D, your
moves would be A B A C A B A D A B A C A B A.
Five rings are solved with the sequence A B A C A B A D A B
A C A B A E A B A C A B A D A B A C A B A.
As you can see, there is a recursive pattern here that looks
complicated but is actually very repetitive. Small children solve
this puzzle all the time.
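The recursive pattern is easy to see in code: the move order for n rings is the order for n - 1 rings, then the largest ring, then the order for n - 1 rings again. The function below is an illustrative sketch (its name is invented here) that reproduces the sequences listed above.

```python
from string import ascii_uppercase


def hanoi_moves(rings: int) -> str:
    """Return the order in which rings move, where A is the smallest ring."""
    if not 0 <= rings <= 26:
        raise ValueError("rings must be between 0 and 26")
    if rings == 0:
        return ""
    smaller = hanoi_moves(rings - 1)
    # Clear the smaller rings, move the largest ring once, restack the rest.
    return smaller + ascii_uppercase[rings - 1] + smaller


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, " ".join(hanoi_moves(n)))
```

The sequence has 2^n - 1 moves, so each added ring roughly doubles its length.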
Tower of Hanoi for Backups
To use the same strategy with backup tapes requires the use of
multiple tapes in this same complicated order. Each backup is a
full backup, and multiple backups are made to each tape. Since
each tape’s backups are not sequential, the chance that the loss
of one tape or damage to one tape will destroy backups for the
current period is nil. A fairly current backup is always available
on another tape. This backup method gives you as many
different restore options as you have tapes.
Consider a three-tape Tower of Hanoi backup scheme and its
similarity to the sequence of the game. On day one, you perform
a full backup to tape A. On day two, your full backup goes to
tape B. On day three, you back up to tape A again, and on day
four you introduce tape C, which hasn’t been used yet. At this
point, you now have three tapes containing full backups for the
last three days. That’s pretty good coverage. On days 5, 6, and
7, you use tapes A, B, and A again, respectively. This gives you
three tapes containing full backups that you can rely on, even if
one tape is damaged.
Use More Tapes
For additional coverage, you can use a four-tape or five-tape
Tower of Hanoi scheme.
You would perform the same rotation as in the game, either A B
A C A B A D A B A C A B A in a four-tape system or A B A C
A B A D A B A C A B A E A B A C A B A D A B A C A B A
in a five-tape system.
Higher numbers of tapes can be used as well, but the system is
complicated enough that human error can become a concern.
Backup software can assist by prompting the backup operator
for the correct tape if it is configured for a Tower of Hanoi
scheme.
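The tape to use on a given day can be computed directly, which is roughly what backup software does when it prompts the operator. The sketch below is an assumption-laden illustration: the function name is invented, days are numbered from 1, and once the listed sequence runs out the highest-lettered tape is reused (so day 8 of a three-tape scheme goes to tape C), a common convention that the text above does not spell out.

```python
from string import ascii_uppercase


def tape_for_day(day: int, tapes: int) -> str:
    """Return the tape label for a 1-based day number in a Tower of Hanoi rotation."""
    if day < 1:
        raise ValueError("day numbers start at 1")
    if not 1 <= tapes <= 26:
        raise ValueError("tapes must be between 1 and 26")
    # The number of trailing zero bits in the day number selects the tape:
    # odd days use A, days divisible by 2 but not 4 use B, and so on.
    trailing_zeros = (day & -day).bit_length() - 1
    return ascii_uppercase[min(trailing_zeros, tapes - 1)]


if __name__ == "__main__":
    print(" ".join(tape_for_day(d, 3) for d in range(1, 8)))
    print(" ".join(tape_for_day(d, 4) for d in range(1, 16)))
```

For days 1 through 7 with three tapes this yields A B A C A B A, matching the rotation described above.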
Backup Alternatives and Newer Methodologies
Many backup strategies are available for use today as
alternatives to traditional tape backups:
Hierarchical Storage Management (HSM)
Windows shadow copy
Online backup or data vaulting
Dedicated backup networks
Disk-to-disk (D2D) technology
Hierarchical Storage Management (HSM)
HSM is more of an archiving system than a strict “backup”
strategy, but it is a valid way of preserving data that can be
considered as part of a data retention strategy. Long available
for mainframe systems, it is also available on Windows.
HSM is an automated process that moves the least-used files to
progressively more remote data storage. In other words,
frequently used and changed data is stored online on high-speed
local disks. As data ages (as it is not accessed and is not
changed), it is moved to more remote storage locations, such as
disk appliances or even tape systems.
However, the data is still cataloged and appears readily
available to the user. If accessed, it can be automatically made
available—it can be moved to local disks, it can be returned via
network access, or, in the case of offline storage, operators can
be prompted to load the data. Online services or cloud storage
can be used for the more remote data storage, and this approach
is commonly found in e-mail archiving solutions.
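The aging rule at the heart of HSM can be sketched as a simple tiering function. The thresholds, tier names, and function name below are hypothetical, chosen only to illustrate the idea; real HSM products apply their own configurable policies.

```python
def storage_tier(days_since_access: int,
                 local_days: int = 30,
                 nearline_days: int = 365) -> str:
    """Pick a storage tier from the time since a file was last accessed or changed."""
    if days_since_access < 0:
        raise ValueError("days_since_access cannot be negative")
    if days_since_access <= local_days:
        return "local disk"       # frequently used data stays on fast local disks
    if days_since_access <= nearline_days:
        return "disk appliance"   # aging data moves to more remote storage
    return "tape"                 # the least-used data moves furthest away


if __name__ == "__main__":
    for age in (3, 120, 900):
        print(age, storage_tier(age))
```

In a real system the catalog entry stays in place after the move, so the file still appears available to the user.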
Windows Shadow Copy
This Windows service takes a snapshot of a working volume,
and then a normal data backup can be made that includes open
files. The shadow copy service doesn’t make a copy; it just
fixes a point in time and then places subsequent changes in a
hidden volume.
When a backup is made, closed files and disk copies of open
files are stored along with the changes. When files are stored on
a Windows system, the service runs in the background,
constantly recording file changes.
If a special client is loaded, previous versions of a file can be
accessed and restored by any user who has authorization to read
the file. Imagine that Alice deletes a file on Monday, or Bob
makes a mistake in a complex spreadsheet design on Friday. On
the following Tuesday, each can obtain their old versions of the
file on their own, without a call to the help desk, and without IT
getting involved.
Online Backup or Data Vaulting
An individual or business can contract with an online service
that automatically and regularly connects to a host or hosts and
copies identified data to an online server.
Typically, arrangements can be made to back up everything,
data only, or specific data sets.
Payment plans are based both on volume of data backed up and
on the number of hosts, ranging up to complete data backups of
entire data centers.
Dedicated Backup Networks
An Ethernet LAN can become a backup bottleneck if disk and
tape systems are provided in parallel and exceed the LAN’s
throughput capacity. Backups also consume bandwidth and thus
degrade performance for other network operations.
Dedicated backup networks are often implemented using a Fibre
Channel storage area network (SAN) or Gigabit Ethernet
network and Internet Small Computer Systems Interface
(iSCSI). iSCSI and Gigabit Ethernet can provide wire-speed
data transfer. Backup is to servers or disk appliances on the
SAN.
Disk-to-Disk (D2D) Technology
A slow tape backup system may be a bottleneck, as servers may
be able to provide data faster than the tape system can record it.
D2D servers don’t wait for a tape drive, and disks can be
provided over high-speed dedicated backup networks, so both
backups and restores can be faster.
D2D can use traditional network-attached storage (NAS)
systems supported by Ethernet connectivity and either the
Network File System (NFS on Unix) protocol or Common
Internet File System (CIFS on Windows) protocol, or dedicated
backup networks can be provided for D2D.
Backup Benefits
Many benefits can be obtained from backing up as a regular part
of IT operations:
Cost savings: It takes many people-hours to reproduce digitally
stored data. The cost of backup software and hardware is a
fraction of this cost.
Productivity: Users cannot work without data. When data can be
restored quickly, productivity is maintained.
Increased security: When backups are available, the impact of
an attack that destroys or corrupts data is lessened. Data can be
replaced or compared to ensure its integrity.
Simplicity: When centralized backups are used, no user needs to
make a decision about what to back up.
Backup Policy
The way to ensure that backups are made and protected is to
have an enforceable and enforced backup policy.
The policy should identify the goals of the process, such as
frequency, the necessity of onsite and offsite storage, and
requirements for formal processes, authority, and
documentation.
Procedures can then be developed, approved, and used that
interpret policy in light of current applications, data sets,
equipment, and the availability of technologies. Several topics
should be specifically detailed in the policy.
Administrative Authority
Designate who has the authority to physically start the backup,
transport and check out backup media, perform restores, sign
off on activity, and approve changes in procedures. This should
also include guidelines for how individuals are chosen.
Recommendations should include separating duties between
backing up and restoring, between approval and activity, and
even between systems. (For example, those authorized to back
up directory services and password databases should be
different from those given authority to back up databases.) This
allows for role separation, a critical security requirement, and
the delegation of many routine duties to junior IT employees.
What to Back Up
Designate which information should be backed up.
Should system data or only application data be backed up?
What about configuration information, patch levels, and version
levels?
How will applications and operating systems be replaced?
Are original and backup copies of their installation disks
provided for?
These details should be specified.
Scheduling
Identify how often backups should be performed.
Monitoring
Specify how to ensure the completion and retention of backups.
Storage for Backup Media
Specify which of the many ways to store backup media are
appropriate.
Is media stored both onsite and offsite?
What are the requirements for each type of storage? For
example, are fireproof vaults or cabinets available? Are they
kept closed? Where are they located?
Onsite backup media needs to be available, but storing backups
near the original systems may be counterproductive. A disaster
that damages the original system might take out the backup
media as well.
Type of Media and Process Used
Specify how backups are made.
How many backups are made, and of what type?
How often are they made, and how long are they kept?
How often is backup media replaced?
High Availability
Not too long ago, most businesses closed at 5 p.m. Many were
not open on the weekends, holidays were observed by closings
or shortened hours, and few of us worried when we couldn’t
read the latest news at midnight or shop for bath towels at 3
a.m. That’s not true anymore. Even ordinary businesses
maintain computer systems around the clock, and their
customers expect instant gratification at any hour. Somehow,
since computers and networks are devices and not people, we
expect them just to keep working without breaks or sleep.
Of course, they do break. Procedures, processes, software, and
hardware that enable system and network redundancy are a
necessary part of operations. However, they serve another
purpose as well. Redundancy ensures the integrity and
availability of information.
Redundancy
What effect does system redundancy have? Calculations
including the mean time to repair (how long it takes to replace a
failed component) and uptime (the percentage of time a system
is operational) can show the results of having versus not having
redundancy built into a computer system or a network.
However, the importance of these figures depends on the needs
and requirements of the system.
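As a rough illustration of such a calculation, the sketch below uses the
standard steady-state formula, availability = MTBF / (MTBF + MTTR). MTBF
(mean time between failures) is a term not introduced above, the figures
are hypothetical, and the redundant case assumes two independent systems
of which only one needs to be working.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time a system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent systems when any one suffices."""
    return 1.0 - (1.0 - single) ** copies


HOURS_PER_YEAR = 24 * 365

# Hypothetical figures: one failure every 1,000 hours, 10 hours to repair.
single = availability(mtbf_hours=1000, mttr_hours=10)
pair = redundant_availability(single, copies=2)

for label, value in (("single system", single), ("redundant pair", pair)):
    downtime = (1.0 - value) * HOURS_PER_YEAR
    print(f"{label}: {value:.4%} uptime, about {downtime:.1f} hours down per year")
```

The independence assumption is optimistic: a shared power feed or a
common software bug can take out both systems at once.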
Most desktop systems, for example, do not require built-in
redundancy; if one fails and our work is critical, we simply
obtain another desktop system, so the need for redundancy is met
by a second system. More often, we do something else while the
system is fixed. Other systems, however, are critical to the
survival of a business or perhaps even of a life. These systems
need built-in hardware redundancy, support alternatives that can
keep their functions intact, or both.
Note:
Critical systems are those systems a business must have, and
without which it would be critically damaged, or whose failure
might be life-threatening. Which systems are critical to a
business must be determined by the business. For some it will
be their e-commerce site, for others the billing system, and for
others their customer information databases. Everyone
recognizes the critical nature of air traffic control systems and
life support systems used in hospitals.
Two methods can be used to evaluate where and how much
redundancy is needed. The first, more traditional method is to
weigh the cost of providing redundancy against the cost of
downtime without redundancy. These costs can be calculated
and compared directly. (Is the cost of downtime greater or less
than the cost of redundancy?) The second method, which is
harder to calculate but increasingly easy to justify, is to
decide based on the likelihood that customers will gravitate to
the organization that can provide the best availability of
service. This, in turn, is based on the increasing demand that
online services, unlike traditional services, be available
24×7, every day of the year. High availability can be a selling
point that directly leads to more business. Indeed, some
customers will demand it.
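The first method can be sketched as a direct comparison of annual costs.
The function names and all dollar and hour figures below are hypothetical
and serve only to show the shape of the calculation.

```python
def annual_downtime_cost(outage_hours_per_year: float, cost_per_hour: float) -> float:
    """Expected yearly cost of outages at a given hourly cost of downtime."""
    return outage_hours_per_year * cost_per_hour


def redundancy_is_justified(cost_without: float, cost_with: float,
                            redundancy_cost_per_year: float) -> bool:
    """True when the downtime cost avoided exceeds what the redundancy costs."""
    return (cost_without - cost_with) > redundancy_cost_per_year


# Hypothetical figures: 87 hours of outage a year without redundancy,
# 1 hour with it, and downtime that costs 5,000 per hour.
without = annual_downtime_cost(outage_hours_per_year=87, cost_per_hour=5_000)
with_redundancy = annual_downtime_cost(outage_hours_per_year=1, cost_per_hour=5_000)

print(redundancy_is_justified(without, with_redundancy,
                              redundancy_cost_per_year=100_000))
```

The second method has no comparably simple formula, because it depends
on estimating how many customers would leave for a more available
competitor.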
There are automated methods for providing system redundancy,
such as hardware fault tolerance, clustering, and network
routing, and there are operational methods, such as component
hot-swapping and standby systems.
Automated Redundancy Methods
It has become commonplace to expect significant hardware
redundancy and fault tolerance in server systems. A wide range
of components are either duplicated within the systems or
effectively duplicated by linking systems into a cluster. Typical
components and techniques include the following:
Clustering
Fault tolerance
Redundant System Slot (RSS)
Cluster in a box
High-availability design
Internet network routing
Clustering
Entire computers or systems are duplicated. If a system fails,
operation automatically transfers to the other systems.
Clusters may be set up as active-standby, in which case one
system is live and the other is idle, or active-active, in which
case multiple systems are kept in sync and dynamic load sharing
is possible.
Active-active makes the fullest use of the hardware, as no
system stands idle and the total capacity of all systems can
always be utilized. If there is a system failure, fewer systems
carry the load. When the failed system is replaced, load
balancing readjusts.
Clustering does have its downside. When active-standby is used,
duplication of systems is expensive, because the standby system
does no useful work until it is needed. Failover in an
active-standby cluster may also take seconds, which is a long
time when systems are under heavy load. Active-active systems,
on the other hand, may require specialized hardware and
additional, specialized administrative knowledge and
maintenance.
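The remark that fewer systems carry the load after a failure implies a
sizing check for active-active clusters. The sketch below assumes the
load is spread evenly across healthy nodes; the node counts, loads, and
capacities are hypothetical.

```python
def load_per_node(total_load: float, healthy_nodes: int) -> float:
    """Even share of the total load carried by each healthy node."""
    if healthy_nodes < 1:
        raise ValueError("no healthy nodes left to carry the load")
    return total_load / healthy_nodes


def survives_one_failure(total_load: float, nodes: int, node_capacity: float) -> bool:
    """True if the remaining nodes can absorb the load when one node fails."""
    return nodes > 1 and load_per_node(total_load, nodes - 1) <= node_capacity


# Hypothetical cluster: 240 requests per second, each node can handle 100.
print(load_per_node(240, 4))               # share per node in normal operation
print(survives_one_failure(240, 4, 100))   # three survivors carry 80 each
print(survives_one_failure(240, 3, 100))   # two survivors would need 120 each
```

A cluster that runs every node near full capacity in normal operation
has no headroom left to absorb a failure.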
Fault Tolerance
Components may have backup systems or parts of systems that
allow them to recover from errors or to survive in spite of them.
For example, fault-tolerant CPUs use multiple CPUs running in
lockstep, each using the same processing logic. In the typical
case, three CPUs are used and the results from all CPUs are
compared. If one CPU produces results that don’t match those
of the other two, it is considered to have failed and is no longer
consulted until it is replaced.
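The comparison step can be sketched as a majority vote. This is a
software illustration only; real lockstep CPUs do the comparison in
hardware, and the function name vote is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def vote(results: Sequence[Hashable]) -> tuple[Hashable, list[int]]:
    """Return the majority result and the indexes of units that disagreed."""
    winner, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority: results cannot be trusted")
    failed = [i for i, result in enumerate(results) if result != winner]
    return winner, failed


# Three units compute the same operation; unit 2 returns a bad value.
value, failed_units = vote([42, 42, 41])
print(value, failed_units)
```

With three units, one faulty result is outvoted and identified. If all
three disagree there is no majority, and the sketch raises an error
rather than guess.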
Another example is the fault tolerance built into Microsoft’s
NTFS file system. If the system detects a bad spot on a disk
during a write, it automatically marks it as bad and writes the
data elsewhere. The logic behind both of these strategies is to
isolate the failure and continue. Meanwhile, the system can raise
alerts and record error messages to prompt maintenance.
Redundant System Slot (RSS)
Entire hot-swappable computer systems are provided in a single
unit.
Each system has its own operating system and bus, but all
systems are connected and share other components.
Like clustered systems, RSS systems can be either
active-standby or active-active. RSS systems exist only as part
of the unit: a system cannot be removed from its unit and
continue to operate.
Cluster in a Box
Two or more systems are combined in a single unit.
The difference between these systems and RSS systems is that
each unit has its own CPU, bus, peripherals, operating system,
and applications.
Components can be hot-swapped, and therein lies the advantage
over a traditional cluster.
High Availability Design
Two or more complete components are placed on the network. The
additional component serves either as a standby system, to which
traffic is routed if the primary fails, or as an active node, in
which case load balancing spreads traffic across all systems
sharing the load and, if one fails, routes traffic only to the
remaining functional systems.
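Both routing behaviors can be sketched in a few lines. The Router class
and its method names are hypothetical; a production load balancer would
also run health checks to decide when a node has failed.

```python
from itertools import cycle


class Router:
    """Chooses a destination node; the first node listed is the primary."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._ring = cycle(self.nodes)

    def mark_failed(self, node):
        self.healthy.discard(node)

    def route_active_standby(self):
        """Use the primary; fall through to a standby only if it has failed."""
        for node in self.nodes:
            if node in self.healthy:
                return node
        raise RuntimeError("no functional systems available")

    def route_active_active(self):
        """Round-robin across all nodes, skipping any that have failed."""
        if not self.healthy:
            raise RuntimeError("no functional systems available")
        for node in self._ring:
            if node in self.healthy:
                return node


router = Router(["web1", "web2", "web3"])
print([router.route_active_active() for _ in range(3)])
router.mark_failed("web2")
print([router.route_active_active() for _ in range(4)])
```

In the second batch of requests, web2 no longer appears and the
remaining two nodes share the traffic.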
A High Availability Network Design Supporting a Web Site
Multiple ISP backbones are available, and duplicate firewalls,
load-balancing systems, application servers, and database
servers support a single web site.
Internet Network Routing
In an attempt to achieve redundancy for Internet-based systems
similar to that of the Public Switched Telephone Network
(PSTN), new architectures for Internet routing are adding or
proposing a variety of techniques, such as these:
Reserve capacity
System and geographic diversity
Size limits
Dynamic restoration switching
Self-healing protection switching
Fast rerouting (which reverses traffic at the point of failure so
that it can be directed to an alternative route)
RSVP-based backup tunnels (where a node adjacent to a failed
link signals failure to upstream nodes, and traffic is thus
rerouted around the failure)
Two-path protection (in which sophisticated engineering
algorithms develop alternative paths between every node)
Two examples of such architectures are Multiprotocol Label
Switching (MPLS), which integrates IP and data-link layer
technologies to introduce sophisticated routing control, and
Automatic Protection Switching (APS), which provides the fast
restoration times that modern applications, such as voice and
streaming media, require.
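The general idea of rerouting around a failed link can be illustrated
with a toy topology. The sketch below is not MPLS or APS: it simply
recomputes a path with breadth-first search once a link is removed,
whereas the techniques listed above precompute or signal alternatives
so that restoration is much faster. The node names are hypothetical.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; None if dst is unreachable."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


links = {("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")}
print(shortest_path(links, "A", "D"))                 # primary route via B
print(shortest_path(links - {("B", "D")}, "A", "D"))  # link B-D has failed
```

The alternative route exists only because the topology has more than
one path between the endpoints, which is the point of system and
geographic diversity.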
Operational Redundancy Methods
In addition to technologies that provide automated redundancy,
there are many processes that help you get your systems up and
running quickly if a problem occurs. These include the following:
Standby systems
Hot-swappable components
Standby Systems
Complete or partial systems are kept ready. Should a system, or
one of its subsystems, fail, the standby system can be put into
service. There are many variations on this technique.
Some clusters are deployed in active-standby state, so the
clustered system is ready to go but idle. To recover from a CPU
or other major system failure quickly, a hard drive might be
moved to another, duplicate, online system.
To recover quickly from the failure of a database system, a
duplicate system complete with database software may be kept
ready. The database is periodically updated by replication or by
export and import functions. If the main system fails, the
standby system can be placed online, though it may be lacking
some recent transactions.
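Which transactions the standby lacks depends on when it was last
updated. A minimal sketch, assuming each transaction has a commit time
and the time of the last successful replication is known (the function
name and timestamps are hypothetical):

```python
from datetime import datetime


def transactions_at_risk(commit_times, last_replication):
    """Commits made after the last replication exist only on the failed primary."""
    return [t for t in commit_times if t > last_replication]


# Hypothetical timeline: the standby was last updated at 12:00.
last_replication = datetime(2014, 1, 1, 12, 0)
commits = [
    datetime(2014, 1, 1, 11, 50),
    datetime(2014, 1, 1, 12, 10),
    datetime(2014, 1, 1, 12, 35),
]

lost = transactions_at_risk(commits, last_replication)
print(f"{len(lost)} transaction(s) missing from the standby")
```

Shortening the interval between updates shrinks this window, at the
cost of more replication traffic.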
Hot-Swappable Components
Many hardware components can now be replaced without
shutting down systems. Hard drives, network cards, and memory
are examples of hardware components that can be added or
replaced while the system is running.
Modern operating systems detect the addition of these devices
on the fly, and operations continue with minor, if any, service
outages.
In a RAID array, for example, drive failure may be compensated
for by the built-in redundancy of the array. If the failed drive
can be replaced without shutting down the system, the array can
rebuild onto the new drive and return to its prefailure state.
Service is not interrupted, though performance may suffer
depending on the current load.
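One common form of this built-in redundancy is XOR parity, as used by
RAID 5. The sketch below assumes that scheme (mirroring, as in RAID 1,
works differently) and uses short, equal-length byte strings in place
of disk blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data drives plus one parity block computed from them.
data = [b"disk", b"fail", b"safe"]
parity = xor_blocks(data)

# Drive 1 fails: its contents are recomputed from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Until the replacement drive is rebuilt, every read of the missing data
requires this recomputation, which is one reason performance may suffer
under load.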
Summary
In this chapter, we covered the four related business resumption
strategies that are all necessary for recovery from incidents,
outages, and disasters that result in service or data loss: disaster
recovery, business continuity planning, backups, and high
availability. Together, these form the core of a strategy to keep
the organization’s information infrastructure operational.
Here in summary are the principal points, roles, and
responsibilities of a good disaster recovery and business
continuity program:
Develop and maintain disaster recovery and business continuity
plans for all your organization’s enterprise technologies.
Schedule and oversee disaster recovery rehearsals for all
enterprise systems.
Ensure disaster awareness by planning and conducting
awareness programs, hazard fairs, lunch-and-learn sessions, and
other informative events and materials.
Activate the plan.
Ensure community involvement by participating in local
community disaster mitigation and planning initiatives and
professional groups.
The disaster recovery and business continuity process is
cyclical and must be maintained for it to stay current with the
needs of the organization and the technologies in the
environment. Your plans must be updated and rehearsed
regularly. Disaster recovery is vital to everyone.
Backups can be an important part of a recovery strategy. They
play a role in the disaster recovery process by moving data from
the primary site to the DR site, although real-time data
replication approaches are replacing traditional tape shipments
in modern DR plans. Backups are also necessary for recovering
data in a traditional data center.
High availability architectures are the fourth leg of the table
supporting service resiliency, to ensure that failure of one
system or component of a service doesn’t cause that service to
fail.
Pharmacognosy Flower 3. Compositae 2023.pdf
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 

DR/BCP Guide for Service Assurance

  • 1. Chapter 32 Disaster Recovery, Business Continuity, Backups, and High Availability Copyright © 2014 by McGraw-Hill Education. Introduction Disaster recovery and business continuity planning are separate but related concepts. In fact, disaster recovery is part of business continuity. Disaster recovery (DR) concerns the recovery of the technical components of your business, such as computers, software, the network, data, and so on. Business continuity planning (BCP) includes disaster recovery along with procedures to restore business operations and the underlying functionality of the business infrastructure needed to support the business, along with the resumption of the daily work of the people in your workplace. Business continuity planning is vital to keeping your business running and to providing a return to “business as usual” during a disaster.
  • 2. What Constitutes a Disaster? A disaster is defined as a “sudden, unplanned calamitous event causing great damage or loss” or “any event that creates an inability on an organization’s part to provide critical business functions for some predetermined period of time.” With this general definition in mind, the disaster recovery planner or business continuity professional would sit down with all the principals in the organization and map out what would constitute a disaster for that organization. This is the initial stage of creating a business impact analysis (BIA). Service Assurance Methods DR and BCP professionals work together to ensure the recoverability and continuity of all aspects of an organization that are affected by an outage or security event. This chapter analyzes the best practices and methodologies for DR and BCP. We also give close consideration to backups, which are necessary for disaster recovery as well as recovery from less severe incidents. Tape backups, which have traditionally been a key component of DR strategies to move data from the primary data center to the backup site, are giving way to online, real-time data replication strategies to keep data synchronized. We consider high availability in the final section of this chapter. All three of these components (DR/BCP, backups, and HA) form the core of a resiliency strategy for services and data.
  • 3. Disaster Recovery When you put together a disaster recovery plan, you need to understand how your organization’s information technology (IT) infrastructure, applications, and network support the business functions of the enterprise you are recovering. For example, a particular business unit may claim not to need a certain application or function until day three of a disaster, but the technology process may dictate that the application should be available on day one, due to technological interdependencies. In this example, the DR planner should work with (and educate) the business unit to help them understand why they need to pay for a day-one recovery as opposed to a day-three recovery. The business unit’s budget will typically include a sizeable expense for the IT department, and this may cause the business unit to think that any disaster recovery or business continuity efforts will be cost prohibitive. In working with the IT subject matter experts (SMEs), you can sometimes figure out a way to bypass a particular electronic feed or file dependency that may be needed to continue the recovery of your system. Determining What to Recover All of this will work well if you know what you are recovering and who to consult with. The responsible business continuity or disaster recovery professional should work with the IT group and the business unit to achieve one purpose: to operate a fine, productive, and lucrative organization. You can come to know what you are recovering and who is involved by gathering experts, such as the programmer, business analyst, system architect, or any other necessary SME. These experts will prove to be invaluable when it comes to creating
  • 4. your DR plan. They know what it takes to technically run the business systems in question and can explain why a certain disaster recovery process will cost a certain amount. This information is important for the manager of the business unit, so that she can make informed decisions. Business Continuity Planning The business continuity professional is more concerned with the business functions that the employees perform than with the underlying technologies. To figure out how the business can resume normal operations during a disaster, the business continuity professional needs to work with each business unit as closely as possible. This means they need to meet with the people who make the decisions, the people who carry out the decisions in the management team, and finally the “worker bees” who actually do the work. You can think of the “worker bees” as power users who know an application intimately. They know the nuances and idiosyncrasies of the business function; they are looking at the trees as opposed to the forest. This is important when it comes to preparing the business unit’s business continuity plan. The power users should participate in your disaster recovery rehearsals and business continuity tabletop exercises. Management Team The business unit management team is vital because its
  • 5. members see the business unit from a business perspective, at a higher level, and will help in determining the importance of the application, as they are acquainted with the mission of the business unit. The business unit also needs to keep in mind the need for a disaster recovery plan as it introduces new or upgraded program applications. The disaster recovery and/or business continuity professional should be kept informed about such changes. For example, a member of management in a business unit might talk to a vendor about a product that could make a current business function quicker, smarter, and better. Being the diligent manager, he would bring the vendor in to meet with upper management, and the decision would be made to buy the product, all without informing the IT department or the disaster recovery or business continuity professional. As you can see, the business continuity professional needs to have a relationship with every principal within the business unit so that, should a new product be brought into the organization, the knowledge and ability to recover the product will be taken into consideration. The Four Components of Business Continuity Planning There are four main components of business continuity planning, each of which is essential to the whole BCP initiative: plan initiation; business impact analysis or assessment; development of the recovery strategies; and rehearsal or exercise of the disaster recovery and business continuity plans. Each business unit should have its own plan. The organization
  • 6. as a whole needs to have a global plan, encompassing all the business units. There should be two plans that work in tandem: a business continuity plan (recovery of the people and business function) and a disaster recovery plan (technological and application recovery). Initiating a Plan Plan initiation puts everyone on the same page at the beginning of the creation of the plan. A disaster or event is defined from the perspective of the specific business unit or entire organization. What one business unit or organization considers a disaster may not be considered a disaster by another business unit or organization, and vice versa. A BIA is important for several reasons. It provides an organization or business unit with a dollar value impact for an unexpected event. This indicates how long an organization can have its business interrupted before it will go out of business completely. Events Here are three examples of possible events that could impact your business and compel you to implement your disaster recovery or business continuity plan, along with some possible responses: Hurricane: Because a hurricane can be predicted a reasonable amount of time before it strikes, you have time to inform
  • 7. employees to prepare their homes and other personal effects. You also have the time to alert your technology group so that they can initiate their preparation strategy procedures. Blackout: You can ensure that your enterprise is attached to a backup generator or an uninterruptible power supply (UPS). You can conduct awareness programs and perhaps give away small flashlights that employees can keep in their desks. Illness outbreak: You can provide an offsite facility where your employees can relocate during the outbreak and investigation. Analyzing the Business Impact With a BIA, you must first establish what the critical business function is. This can be determined only by the critical members of the business unit. The BIA should be completed and reviewed by the business unit, including upper management, since the financing of the business continuity plan and disaster recovery project will ultimately come from the business unit’s coffers. Developing Recovery Strategies The next step is to develop your recovery strategy. The business unit will be paying for the recovery, so they need to know what their options are for different types of recoveries. You can provide anything from a no-frills recovery to an instantaneous recovery. It all depends on the business functions that have to be recovered and on how long the business unit can
  • 8. go without the function. The question is essentially how much insurance the business unit wants to buy. If it is your business, you are the only one who can make that decision. Someone who does not have as large a stake in the growth of the business cannot look at the business from the same perspective. Procedures and Contacts In a business recovery situation, there must be written procedures that all employees in your business unit can quickly access, understand, and follow. Information needs to be readily available about the business function that has to be performed. The procedures should be stored in multiple, accessible locations to ensure they are available in a disaster scenario. You also need to make readily available a list of people to contact, along with their contact information. This list must be of the current employees to contact, and it should include members of the Human Resources, Facilities, Risk Management, and Legal departments. The list of contacts should also include the local fire and rescue department, police department, and emergency operations center. Rehearsing Disaster Recovery and Business Continuity Plans The fourth BCP component, and the most crucial, is to rehearse, exercise, or test the plan. This is “where the rubber meets the road.”
  • 9. Having the other three components in place is important, but the plan is inadequate if you’re not sure whether it will work. It is vital to test your plan. If the plan has not been tested and it fails during a disaster, all the work you put into developing it is for naught. If the plan fails during a test, though, you can improve on it and test again. Third-Party Vendor Issues Most organizations make use of various third-party vendors (Enterprise Resource Planning [ERP], Application Service Provider [ASP], etc.) in their recovery efforts. In such cases, the information about the third-party vendor is just as critical in your business or technology recovery. When you need to make use of such resources, it is beneficial, if not crucial, to make inquiries into the third-party’s operations prior to the implementation of its product or services. In the real world, the disaster recovery and/or business continuity professional has to integrate the vendor’s information into the business unit’s continuity plan. If a critical path in your DR plan depends on the involvement of a third-party vendor, you can’t get your operation up and running if that third-party vendor isn’t prepared to assist you. For example, suppose that processing loans is the bread and butter of your business, and your business relies on credit bureau reports to process loans. In this scenario, you need to ensure that if your organization experiences an outage, you will still receive these reports so that your company can continue to conduct business. The vendor’s ability to recover from a failure will also affect how robust your recovery is. Although your recovery may be technically sound, you must be sure that you can conduct business. The same standards you apply to your own
  • 10. organization should apply to third-party vendors you do business with. They should be available to you to conduct business. The disaster recovery or business continuity coordinator should make the appropriate inquiries with vendors to ensure that they can support a DR scenario. Awareness and Training Programs Another important element of disaster recovery and business continuity planning is an awareness program. The business continuity or disaster recovery professional can meet with each business for tabletop exercises. These exercises are important, because they actually get the members of the business unit to sit down and think about a particular event and how first to prevent or mitigate it and then how to recover from it. The event can be anything from a category 3 hurricane to workplace violence. Any work stoppage can potentially impede the progress of an organization’s recovery or resumption of services, and it is up to the management team to design or develop a plan of action or a business continuity plan. The business continuity or disaster recovery professional must facilitate this process and make the business unit aware that there are events that can bring the business to a grinding halt. Backups Backups may be used for complete system restoration, but they can also allow you to recover the contents of a mailbox, for
  • 11. example, or an “accidentally” deleted document. Backups can be extended to saving more than just digital data. Backup processes can include the backup of specifications and configurations, policies and procedures, equipment, and data centers. However, if the backup is not good or is too old, or the backup media is damaged, it will not fix the problem. Just having a backup procedure in place does not always offer adequate protection. Many organizations can no longer depend on traditional backup processes: doing an offline backup is unacceptable, doing an online backup would unacceptably degrade system performance, and restoring from a backup would take so much time that the organization could not recover. Such organizations are using alternatives to traditional backups, such as redundant systems and cloud services. Backup systems and processes, therefore, reflect the availability needs of an organization as well as its recovery needs. Traditional Backup Methods In the traditional backup process, data is copied to backup media, primarily tape, in a predictable and orderly fashion for secure storage both onsite and offsite. Backup media can thus be made available to restore data to new or repaired systems after failure. In addition to data, modern operating systems and application configurations are also backed up. This provides faster restore capabilities and occasionally may be the only way to restore systems where applications that support data are intimately integrated with a specific system.
  • 12. Backup Types There are several standard types of backups: full, copy, incremental, and differential. Full Backups A full backup backs up all data selected, whether or not it has changed since the last backup. The definition of a full backup varies on different systems. On some systems it includes critical operating system files needed to rebuild a system completely; on other systems it backs up only the user data. Copy Backups Data is copied from one disk to another.
  • 13. Incremental Backups When data is backed up, the archive bit on a file is turned off. When changes are made to the file, the archive bit is set again. An incremental backup uses this information to back up only files that have changed since the last backup. This backup turns the archive bit off again, and the next incremental backup backs up only the files that have changed since the last incremental backup. This backup type saves time, but it means that the restore process will involve restoring the last full backup and every incremental backup made after it. Restoring from an incremental backup requires that all backups be applied. The circle encloses all the backups that must be restored. Differential Backups Like an incremental backup, a differential backup only backs up files with the archive bit set, that is, files that have changed since the last backup. Unlike an incremental backup, however, a differential backup does not reset the archive bit. Each differential backup backs up all files that have changed since the last backup that reset the bits. Using this strategy, a full backup is followed by differential backups.
  • 14. A restore consists of restoring the full backup and then only the last differential backup made. This saves time during the restore, but, depending on your system, creating differential backups takes longer than creating incremental backups. Restoring from a differential backup requires applying only the full backup and the last differential backup. The circle encloses all of the backups that must be restored. Backup Rotation Strategies In the traditional backup process, old backups are usually not immediately replaced by the new backup. Instead, multiple previous copies of backups are kept. This ensures recovery should one backup tape set be damaged or otherwise be found not to be good. Two traditional backup rotation strategies are Grandfather-Father-Son (GFS) and Tower of Hanoi. GFS Backup Strategy In the GFS rotation strategy, a backup is made to separate media each day.
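The incremental and differential restore rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something taken from the chapter: the `Backup` record, its `day` counter, and the `restore_chain` helper are hypothetical names, and the logic assumes the archive-bit behavior described in the text (full and incremental backups reset the bit, differential backups do not).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # hypothetical day counter; larger means more recent
    kind: str  # "full", "incremental", or "differential"


def restore_chain(history):
    """Return the backups to apply, oldest first, to reach the latest state."""
    ordered = sorted(history, key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup to restore from")

    # Start from the most recent full backup; older sets are not needed.
    start = full_positions[-1]
    later = ordered[start + 1:]
    chain = [ordered[start]]

    # Every incremental made after the full backup must be applied in order.
    chain += [b for b in later if b.kind == "incremental"]

    # Only the newest differential is needed, and only if it is more recent
    # than the last backup that reset the archive bit.
    differentials = [b for b in later if b.kind == "differential"]
    if differentials and differentials[-1].day > chain[-1].day:
        chain.append(differentials[-1])
    return chain


incremental_week = [Backup(0, "full")] + [Backup(d, "incremental") for d in (1, 2, 3)]
differential_week = [Backup(0, "full")] + [Backup(d, "differential") for d in (1, 2, 3)]

print([b.day for b in restore_chain(incremental_week)])   # [0, 1, 2, 3]
print([b.day for b in restore_chain(differential_week)])  # [0, 3]
```

The two printed chains mirror the trade-off in the text: the incremental schedule needs every set since the full backup, while the differential schedule needs only two.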
  • 15. Each Sunday a full backup is made, and each day of the week an incremental backup is made. The Sunday backups are kept for a month, and the current week’s incremental backups are also kept. On the first Sunday of the month, a new tape or disk is used to make a full backup. The previous full backup becomes the last full backup of the prior month and is re-labeled as a monthly backup. Weekly and daily tapes are rotated as needed, with the oldest being used for the current backup. Thus, on any given day of the month, that week’s backup is available, as well as the previous four or five weeks’ full backups, along with the incremental backups taken each day of the preceding week. If the backup scheme has been in use for a while, prior months’ backups are also available. Note: No backup strategy is complete without plans to test backup media and backups by doing a restore. If a backup is unusable, it’s worse than having no backup at all, because it has lured users into a sense of security. Be sure to add the testing of backups to your backup strategy, and do this on a test system. The Tower of Hanoi Backup Strategy The Tower of Hanoi strategy is based on a game played with three poles and a number of rings. The object is to move the
  • 16. rings from their starting point on one pole to the other pole. However, the rings are of different sizes, and you are not allowed to have a ring on top of one that is smaller than itself. To accomplish the task, a certain order must be followed. Consider a simple version of the Tower of Hanoi, in which you are given three pegs, one of which has three rings stacked on it from largest at the bottom to smallest at the top. Call these rings A (small), B (medium), and C (large). You need to move the rings to the right-hand peg. How do you solve this puzzle? Tower of Hanoi Solution The solution is to move A to the right-hand peg, then B to the middle peg, A on top of B on the middle peg, then C to the right-hand peg, then A to the now-empty left-hand peg, B on top of C on the right-hand peg, and finally A on top of B to complete the stack on the right-hand peg.
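The three-ring solution above is an instance of the standard recursive approach: move the smaller rings out of the way, move the largest ring, then move the smaller rings back on top. The Python sketch below is an illustration rather than part of the chapter; the `solve` function and the peg names `left`, `middle`, and `right` are assumptions chosen to match the wording of the slide.

```python
from string import ascii_uppercase


def solve(rings, source="left", target="right", spare="middle"):
    """Return the moves as (ring, from_peg, to_peg); ring A is the smallest."""
    if rings == 0:
        return []
    label = ascii_uppercase[rings - 1]  # the largest ring in this sub-problem
    return (
        solve(rings - 1, source, spare, target)    # clear the smaller rings
        + [(label, source, target)]                # move the largest ring
        + solve(rings - 1, spare, target, source)  # restack the smaller rings
    )


moves = solve(3)
for ring, src, dst in moves:
    print(f"move {ring} from {src} to {dst}")

# The ring labels alone give the order of moves.
print(" ".join(ring for ring, _, _ in moves))  # A B A C A B A
```

Calling `solve(4)` or `solve(5)` extends the same pattern, with the newly added largest ring appearing exactly once in the middle of the sequence.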
  • 17. The rings were moved in this order: A B A C A B A. If you solve this puzzle with four rings labeled A through D, your moves would be A B A C A B A D A B A C A B A. Five rings are solved with the sequence A B A C A B A D A B A C A B A E A B A C A B A D A B A C A B A. As you can see, there is a recursive pattern here that looks complicated but is actually very repetitive. Small children solve this puzzle all the time. Tower of Hanoi for Backups To use the same strategy with backup tapes requires the use of multiple tapes in this same complicated order. Each backup is a full backup, and multiple backups are made to each tape. Since each tape’s backups are not sequential, the chance that the loss of one tape or damage to one tape will destroy backups for the current period is nil. A fairly current backup is always available on another tape. This backup method gives you as many different restore options as you have tapes. Consider a three-tape Tower of Hanoi backup scheme and its similarity to the sequence of the game. On day one, you perform
  • 18. a full backup to tape A. On day two, your full backup goes to tape B. On day three, you back up to tape A again, and on day four you introduce tape C, which hasn’t been used yet. At this point, you now have three tapes containing full backups for the last three days. That’s pretty good coverage. On days 5, 6, and 7, you use tapes A, B, and A again, respectively. This gives you three tapes containing full backups that you can rely on, even if one tape is damaged. Use More Tapes For additional coverage, you can use a four-tape or five-tape Tower of Hanoi scheme. You would perform the same rotation as in the game, either A B A C A B A D A B A C A B A in a four-tape system or A B A C A B A D A B A C A B A E A B A C A B A D A B A C A B A in a five-tape system. Higher numbers of tapes can be used as well, but the system is complicated enough that human error can become a concern. Backup software can assist by prompting the backup operator for the correct tape if it is configured for a Tower of Hanoi
  • 19. scheme. Backup Alternatives and Newer Methodologies Many backup strategies are available for use today as alternatives to traditional tape backups: Hierarchical Storage Management (HSM), Windows shadow copy, online backup or data vaulting, dedicated backup networks, and disk-to-disk (D2D) technology. Hierarchical Storage Management (HSM) HSM is more of an archiving system than a strict “backup” strategy, but it is a valid way of preserving data that can be considered as part of a data retention strategy. Long available
  • 20. for mainframe systems, it is also available on Windows. HSM is an automated process that moves the least-used files to progressively more remote data storage. In other words, frequently used and changed data is stored online on high-speed, local disks. As data ages (as it is not accessed and is not changed), it is moved to more remote storage locations, such as disk appliances or even tape systems. However, the data is still cataloged and appears readily available to the user. If accessed, it can be automatically made available: it can be moved to local disks, it can be returned via network access, or, in the case of offline storage, operators can be prompted to load the data. Online services or cloud storage can be used for the more remote data storage, and this approach is commonly found in e-mail archiving solutions. Windows Shadow Copy This Windows service takes a snapshot of a working volume, and then a normal data backup can be made that includes open files. The shadow copy service doesn’t make a copy; it just fixes a point in time and then places subsequent changes in a
  • 21. hidden volume. When a backup is made, closed files and disk copies of open files are stored along with the changes. When files are stored on a Windows system, the service runs in the background, constantly recording file changes. If a special client is loaded, previous versions of a file can be accessed and restored by any user who has authorization to read the file. Imagine that Alice deletes a file on Monday, or Bob makes a mistake in a complex spreadsheet design on Friday. On the following Tuesday, each can obtain their old versions of the file on their own, without a call to the help desk, and without IT getting involved. Online Backup or Data Vaulting An individual or business can contract with an online service that automatically and regularly connects to a host or hosts and copies identified data to an online server. Typically, arrangements can be made to back up everything, data only, or specific data sets. Payment plans are based both on volume of data backed up and
  • 22. on the number of hosts, ranging up to complete data backups of entire data centers. Dedicated Backup Networks An Ethernet LAN can become a backup bottleneck if disk and tape systems are provided in parallel and exceed the LAN’s throughput capacity. Backups also consume bandwidth and thus degrade performance for other network operations. Dedicated backup networks are often implemented using a Fibre Channel storage area network (SAN) or Gigabit Ethernet network and Internet Small Computer Systems Interface (iSCSI). iSCSI and Gigabit Ethernet can provide wire-speed data transfer. Backup is to servers or disk appliances on the SAN.
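The bottleneck argument above comes down to simple arithmetic: a backup runs no faster than the slowest stage between the source disks, the network, and the backup target. The Python sketch below is a rough illustration under stated assumptions, not a sizing method from the chapter. The function names and all example figures are hypothetical, and the only fixed number is the nominal Gigabit Ethernet line rate of 1,000 Mbit/s, which is 125 MB/s before protocol overhead.

```python
GIGABIT_ETHERNET_MB_S = 1000 / 8  # nominal line rate: 125 MB/s, overhead ignored


def effective_rate_mb_s(source_mb_s, network_mb_s, target_mb_s):
    """The slowest stage in the path sets the overall backup rate."""
    return min(source_mb_s, network_mb_s, target_mb_s)


def backup_window_hours(data_gb, source_mb_s, network_mb_s, target_mb_s):
    """Estimate how long a backup takes, using decimal units (1 GB = 1000 MB)."""
    rate = effective_rate_mb_s(source_mb_s, network_mb_s, target_mb_s)
    return (data_gb * 1000) / rate / 3600


# Hypothetical example: 2,000 GB of data, disks that can supply 400 MB/s,
# a shared Gigabit Ethernet LAN, and a backup target that accepts 160 MB/s.
hours = backup_window_hours(2000, 400, GIGABIT_ETHERNET_MB_S, 160)
print(f"estimated backup window: {hours:.1f} hours")
```

In this made-up case the LAN is the limiting stage, so a faster target would not shorten the window, which is the situation a dedicated backup network is meant to address.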
  • 23. Disk-to-Disk (D2D) Technology A slow tape backup system may be a bottleneck, as servers may be able to provide data faster than the tape system can record it. D2D servers don’t wait for a tape drive, and disks can be provided over high-speed dedicated backup networks, so both backups and restores can be faster. D2D can use traditional network-attached storage (NAS) systems supported by Ethernet connectivity and either the Network File System (NFS on Unix) protocol or Common Internet File System (CIFS on Windows) protocol, or dedicated backup networks can be provided for D2D. Backup Benefits Many benefits can be obtained from backing up as a regular part of IT operations: Cost savings: It takes many people-hours to reproduce digitally stored data. The cost of backup software and hardware is a fraction of this cost. Productivity: Users cannot work without data. When data can be restored quickly, productivity is maintained.
  • 24. Increased security: When backups are available, the impact of an attack that destroys or corrupts data is lessened. Data can be replaced or compared to ensure its integrity. Simplicity: When centralized backups are used, no user needs to make a decision about what to back up. Backup Policy The way to ensure that backups are made and protected is to have an enforceable and enforced backup policy. The policy should identify the goals of the process, such as frequency, the necessity of onsite and offsite storage, and requirements for formal processes, authority, and documentation. Procedures can then be developed, approved, and used that interpret policy in light of current applications, data sets, equipment, and the availability of technologies. Several topics should be specifically detailed in the policy.
  • 25. Administrative Authority Designate who has the authority to physically start the backup, transport and check out backup media, perform restores, sign off on activity, and approve changes in procedures. This should also include guidelines for how individuals are chosen. Recommendations should include separating duties between backing up and restoring, between approval and activity, and even between systems. (For example, those authorized to back up directory services and password databases should be different from those given authority to back up databases.) This allows for role separation, a critical security requirement, and the delegation of many routine duties to junior IT employees. What to Back Up Designate which information should be backed up. Should system data or only application data be backed up? What about configuration information, patch levels, and version
  • 26. levels? How will applications and operating systems be replaced? Are original and backup copies of their installation disks provided for? These details should be specified. Scheduling Identify how often backups should be performed. Monitoring Specify how to ensure the completion and retention of backups.
  • 27. Storage for Backup Media Specify which of the many ways to store backup media are appropriate. Is media stored both onsite and offsite? What are the requirements for each type of storage? For example, are fireproof vaults or cabinets available? Are they kept closed? Where are they located? Onsite backup media needs to be available, but storing backups near the original systems may be counterproductive. A disaster that damages the original system might take out the backup media as well. Type of Media and Process Used Specify how backups are made. How many backups are made, and of what type? How often are they made, and how long are they kept? How often is backup media replaced?
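The policy topics listed above (authority, what to back up, scheduling, retention, and storage) can also be captured as structured data so that obvious gaps are caught automatically. The following is a minimal sketch under stated assumptions: the `BackupPolicy` class, its field names, and the specific checks are hypothetical illustrations, not part of any product or of the chapter's own method.

```python
from dataclasses import dataclass


@dataclass
class BackupPolicy:
    """Hypothetical machine-readable summary of a backup policy."""
    backup_operators: list[str]    # who may start backups
    restore_operators: list[str]   # who may perform restores
    data_sets: list[str]           # what is backed up
    frequency_hours: int           # how often backups run
    retention_days: int            # how long backups are kept
    onsite_copies: int
    offsite_copies: int

    def problems(self) -> list[str]:
        """Return a list of policy gaps; an empty list means none were found."""
        issues = []
        overlap = set(self.backup_operators) & set(self.restore_operators)
        if overlap:
            issues.append(f"duties not separated: {sorted(overlap)}")
        if not self.data_sets:
            issues.append("no data sets designated")
        if self.frequency_hours <= 0:
            issues.append("no backup schedule")
        if self.retention_days * 24 < self.frequency_hours:
            issues.append("backups expire before the next one is made")
        if self.offsite_copies < 1:
            issues.append("no offsite copy")
        return issues


if __name__ == "__main__":
    policy = BackupPolicy(["alice"], ["alice", "bob"], ["hr-db"], 24, 30, 1, 0)
    for issue in policy.problems():
        print(issue)
```

A check of this kind does not replace the written policy; it only flags cases such as an operator who holds both backup and restore authority, or a policy with no offsite storage.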
  • 28. High Availability Not too long ago, most businesses closed at 5 p.m. Many were not open on the weekends, holidays were observed by closings or shortened hours, and few of us worried when we couldn’t read the latest news at midnight or shop for bath towels at 3 a.m. That’s not true anymore. Even ordinary businesses maintain computer systems around the clock, and their customers expect instant gratification at any hour. Somehow, since computers and networks are devices and not people, we expect them just to keep working without breaks, or sleep. Of course, they do break. Procedures, processes, software, and hardware that enable system and network redundancy are a necessary part of operations. However, they serve another purpose as well. Redundancy ensures the integrity and availability of information.
  • 29. Redundancy What effect does system redundancy have? Calculations including the mean time to repair (how long it takes to replace a failed component) and uptime (the percentage of time a system is operational) can show the results of having versus not having redundancy built into a computer system or a network. However, the importance of these figures depends on the needs and requirements of the system. Most desktop systems, for example, do not require built-in redundancy; if one fails and our work is critical, we simply obtain another desktop system. The need for redundancy is met by another system. In most cases, however, we do something else while the system is fixed. Other systems, however, are critical to the survival of a business or perhaps even of a life. These systems need either built-in hardware redundancy, support alternatives that can keep their functions intact, or both. Note: Critical systems are those systems a business must have, and
  • 30. without which it would be critically damaged, or whose failure might be life-threatening. Which systems are critical to a business must be determined by the business. For some it will be their e-commerce site, for others the billing system, and for others their customer information databases. Everyone recognizes the critical nature of air traffic control systems and life support systems used in hospitals. Two methods can be used to evaluate where and how much redundancy is needed. The first, more traditional method is to weigh the cost of providing redundancy against the cost of downtime without redundancy. These costs can be calculated and compared directly. (Is the cost of downtime greater or less than the cost of redundancy?) The second method, which is harder to calculate but is increasingly easy to justify, is to decide based on the likelihood that customers will gravitate to the organization that can provide the best availability of service. This, in turn, is based on the increasing demands that online services, unlike traditional services, be available 24×7×365. High availability can be a selling point that directly leads to more business. Indeed, some customers will demand it. There are automated methods for providing system redundancy, such as hardware fault tolerance, clustering, and network routing, and there are operational methods, such as component hot-swapping and standby systems.
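The first evaluation method, weighing the cost of redundancy against the cost of downtime, can be sketched numerically. The example below is a simplified illustration under several assumptions that do not come from the chapter: uptime is modeled as MTBF / (MTBF + MTTR), where MTBF (mean time between failures) is an added input; redundant systems are assumed to fail independently; and all function names and cost figures are hypothetical.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a single system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent systems when any one suffices."""
    return 1 - (1 - single) ** copies


def annual_downtime_cost(avail: float, cost_per_hour: float) -> float:
    """Expected yearly cost of downtime at the given availability."""
    return (1 - avail) * HOURS_PER_YEAR * cost_per_hour


def redundancy_pays_off(single: float, copies: int,
                        cost_per_hour: float, redundancy_cost: float) -> bool:
    """True if the downtime cost avoided exceeds the yearly redundancy cost."""
    before = annual_downtime_cost(single, cost_per_hour)
    after = annual_downtime_cost(redundant_availability(single, copies),
                                 cost_per_hour)
    return before - after > redundancy_cost


if __name__ == "__main__":
    a = availability(mtbf_hours=990, mttr_hours=10)
    print(f"single system: {a:.4f}")
    print(f"two systems:   {redundant_availability(a, 2):.6f}")
    print(redundancy_pays_off(a, 2, cost_per_hour=1000, redundancy_cost=50000))
```

The second method, customer preference for the more available service, does not reduce to a formula this simple, which is why the text calls it harder to calculate.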
  • 31. Automated Redundancy Methods It has become commonplace to expect significant hardware redundancy and fault tolerance in server systems. A wide range of components are either duplicated within the systems or effectively duplicated by linking systems into a cluster. Typical components and techniques include clustering, fault tolerance, Redundant System Slot (RSS), cluster in a box, high-availability design, and Internet network routing. Clustering Entire computers or systems are duplicated. If a system fails,
  • 32. operation automatically transfers to the other systems. Clusters may be set up as active-standby, in which case one system is live and the other is idle, or active-active, in which case multiple systems are kept perfectly in sync, and even dynamic load sharing is possible. Active-active is ideal, as no system stands idle and the total capacity of all systems can always be utilized. If there is a system failure, fewer systems carry the load. When the failed system is replaced, load balancing readjusts. Clustering does have its downside. When active-standby is used, duplication of systems is expensive. These active-standby systems may also take seconds for the failover to occur, which is a long time when systems are under heavy loads. Active-active systems, however, may require specialized hardware and additional, specialized administrative knowledge and maintenance. Fault Tolerance Components may have backup systems or parts of systems that allow them to recover from errors or to survive in spite of them.
  • 33. For example, fault-tolerant CPUs use multiple CPUs running in lockstep, each using the same processing logic. In the typical case, three CPUs are used and the results from all CPUs are compared. If one CPU produces results that don’t match those of the other two, it is considered to have failed and is no longer consulted until it is replaced. Another example is the fault tolerance built into Microsoft’s NTFS file system. If the system detects a bad spot on a disk during a write, it automatically marks it as bad and writes the data elsewhere. The logic behind both strategies is to isolate failure and continue on. Meanwhile, the system can raise alerts and record error messages to prompt maintenance. Redundant System Slot (RSS) Entire hot-swappable computer units are provided in a single unit. Each system has its own operating system and bus, but all systems are connected and share other components. Like clustered systems, RSS systems can be either active-standby or active-active. RSS systems exist as a unit, and
  • 34. systems cannot be removed from their unit and continue to operate. Cluster in a Box Two or more systems are combined in a single unit. The difference between these systems and RSS systems is that each unit has its own CPU, bus, peripherals, operating system, and applications. Components can be hot-swapped, and therein lies its advantage over a traditional cluster. High Availability Design Two or more complete components are placed on the network, with one component serving either as a standby system (with traffic being routed to the standby system if the primary fails)
  • 35. or as an active node (with load balancing being used to route traffic to multiple systems sharing the load, and if one fails, traffic is routed only to the other functional systems). A High Availability Network Design Supporting a Web Site Multiple ISP backbones are available, and duplicate firewalls, load-balancing systems, application servers, and database servers support a single web site. Internet Network Routing In an attempt to achieve redundancy for Internet-based systems similar to that of the Public Switched Telephone Network (PSTN), new architectures for Internet routing are adding or proposing a variety of techniques, such as these:
  • 36. Reserve capacity; system and geographic diversity; size limits; dynamic restoration switching; self-healing protection switching; fast rerouting (which reverses traffic at the point of failure so that it can be directed to an alternative route); RSVP-based backup tunnels (where a node adjacent to a failed link signals failure to upstream nodes, and traffic is thus rerouted around the failure); and two-path protection (in which sophisticated engineering algorithms develop alternative paths between every node). Two examples of such architectures are Multiprotocol Label Switching (MPLS), which integrates IP and data-link layer technologies to introduce sophisticated routing control, and Automatic Protection Switching (APS), which provides the fast restoration times that modern technologies, such as voice and streaming media, require. Operational Redundancy Methods
  • 37. In addition to technologies that provide automated redundancy, there are many processes that help you to quickly get your systems up and running if a problem occurs. These include standby systems and hot-swappable components. Standby Systems Complete or partial systems are kept ready. Should a system, or one of its subsystems, fail, the standby system can be put into service. There are many variations on this technique. Some clusters are deployed in active-standby state, so the clustered system is ready to go but idle. To recover from a CPU or other major system failure quickly, a hard drive might be moved to another, duplicate, online system. To recover quickly from the failure of a database system, a duplicate system complete with database software may be kept ready. The database is periodically updated by replication or by export and import functions. If the main system fails, the standby system can be placed online, though it may be lacking some recent transactions.
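The trade-off in the standby database case, fast recovery at the price of possibly missing recent transactions, can be shown with a small simulation. This is a toy model under stated assumptions: the `StandbyPair` class and its methods are hypothetical, transactions are plain strings, and replication is a full copy rather than any real database's replication mechanism.

```python
class StandbyPair:
    """Toy model of a primary database with a periodically updated standby."""

    def __init__(self) -> None:
        self.primary: list[str] = []
        self.standby: list[str] = []
        self.active = "primary"

    def write(self, txn: str) -> None:
        """Record a transaction on whichever system is currently in service."""
        target = self.primary if self.active == "primary" else self.standby
        target.append(txn)

    def replicate(self) -> None:
        """Periodic update: copy the primary's transactions to the standby."""
        if self.active == "primary":
            self.standby = list(self.primary)

    def fail_over(self) -> list[str]:
        """Place the standby online; return transactions it never received."""
        lost = self.primary[len(self.standby):]
        self.active = "standby"
        return lost


if __name__ == "__main__":
    pair = StandbyPair()
    pair.write("t1")
    pair.write("t2")
    pair.replicate()
    pair.write("t3")             # written after the last replication
    print(pair.fail_over())      # transactions missing from the standby
```

Shortening the interval between `replicate` calls reduces the number of transactions at risk, which is the same reasoning that drives the choice of replication schedule in a real deployment.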
  • 38. Hot-Swappable Components Many hardware components can now be replaced without shutting down systems. Hard drives, network cards, and memory are examples of current hardware components that can be added. Modern operating systems detect the addition of these devices on the fly, and operations continue with minor, if any, service outages. In a RAID array, for example, drive failure may be compensated for by the built-in redundancy of the array. If the failed drive can be replaced without shutting down the system, the array will return to its prefailure state. Interruptions in service will be nil, though performance may suffer depending on the current load. Summary
  • 39. In this chapter, we covered the four related business resumption strategies that are all necessary for recovery from incidents, outages, and disasters that result in service or data loss: disaster recovery, business continuity planning, backups, and high availability. Together, these form the core of a strategy to keep the organization’s information infrastructure operational. Here in summary are the principal points, roles, and responsibilities of a good disaster recovery and business continuity program: Develop and maintain disaster recovery and business continuity plans for all your organization’s enterprise technologies. Schedule and oversee disaster recovery rehearsals for all enterprise systems. Ensure disaster awareness by planning and conducting awareness programs, hazard fairs, lunch-and-learn sessions, and other informative events and materials. Activate the plan. Ensure community involvement by participating in local community disaster mitigation and planning initiatives and professional groups. The disaster recovery and business continuity process is cyclical and must be maintained for it to stay current with the needs of the organization and the technologies in the environment. Your plans must be updated and rehearsed regularly. Disaster recovery is vital to everyone.
  • 40. Backups can be an important part of a recovery strategy. They play a role in the disaster recovery process, moving data from the primary site to the DR site, although real-time data replication approaches are replacing traditional tape shipments in modern DR plans. Backups are also necessary for recovering data in a traditional data center. High availability architectures are the fourth leg of the table supporting service resiliency, to ensure that failure of one system or component of a service doesn’t cause that service to fail.