Disaster Recovery
Table of Contents
1.0. Executive Summary
2.0. Business Continuity and Disaster Recovery
2.1. Business Drivers
2.2. Compliance Concerns
2.2.1. PCI DSS
2.2.2. HIPAA
3.0. Business Continuity
4.0. Disaster Recovery
4.1. Recovery Point and Time Objectives
4.2. Designing for Recovery
5.0. Technical Implementation Considerations
5.1. Virtualization/Cloud Computing Disaster Recovery
5.1.1. Traditional Disaster Recovery
5.1.2. Active-Passive
5.1.3. Active-Active
5.1.4. Cloud Case Study
5.2. Location for Disaster Recovery
5.2.1. Micro-Sufficiency vs. Macro-Efficiency
5.2.2. Geography Matters
5.2.3. Selection of Second Sites
5.3. Hardware Protection vs. Data Center Protection
5.3.1. Offsite Backup Options
5.4. SAN-to-SAN Replication
5.5. Best Practices
5.5.1. Encryption
5.5.2. Network Replication
5.5.3. Testing
5.5.4. Communication Plan Testing
6.0. Conclusion
7.0. References
7.1. Questions to Ask Your Disaster Recovery Provider
8.0. Contact Us

1.0. Executive Summary
Investing in risk management means investing in business sustainability – designing a
comprehensive business continuity and disaster recovery plan is about analyzing the impact of
a business interruption on revenue.
Mapping out your business model, identifying key components essential to operations,
developing and testing a strategy to efficiently recover and restore data and systems is an
involved, long-term project that may take 12-18 months depending on the complexity of your
organization.
Addressing high-level business drivers for designing, implementing and testing a business
continuity and disaster recovery plan, this white paper makes a case for the investment while
discussing the innate challenges, benefits and detriments of different solutions from the
perspective of experienced IT and data security professionals.
Speaking directly to different compliance requirements, this paper addresses how to protect
sensitive backup data within the parameters of standards set for the healthcare and e-
commerce industries.
From there, this paper delves into different disaster recovery and offsite backup technical
solutions, from traditional to virtualized (cloud-based) disaster recovery, as well as
considerations in seeking a disaster recovery as a service (DRaaS) provider. A case
study of the switch from physical servers and traditional disaster recovery to a private cloud
environment details the differences in cost, uptime, performance and more.
This white paper is ideal for executives and IT decision-makers seeking a primer as well as up-
to-date information regarding disaster recovery best practices and specific technology
recommendations.
2.0. Business Continuity and Disaster Recovery
Business continuity is the process of analyzing the mission-critical components required to keep
your business running in the event of a disaster. It is an overarching plan involving a few
steps (see section 3.0 Business Continuity for a detailed description of what each step entails):
• Business Impact Analysis (BIA)
• Recovery Strategies
• Plan Development
• Testing and Exercises
Creating an IT disaster recovery plan is part of the Plan Development step. As can be seen from
the multiple steps within business continuity planning, disaster recovery is only a subset within a
larger overarching plan to keep a business running. Disaster recovery requires creating a plan
to recover and restore IT infrastructure, including servers, networks, devices, data and
connectivity.
2.1. Business Drivers
Why allocate budget toward a business continuity and IT disaster recovery plan? According to a
Forrester/Disaster Recovery Journal Business Continuity Preparedness Survey, the top reason
is an increased reliance on technology.[1]
Increased Reliance on Technology
Reliance on technology is increasing across industries, from retail, which must upgrade to
digital transactions and mobile payments, to healthcare, which relies on electronic patient data
entry, information exchange and processing as it shifts from paper records to electronic health
record systems (EHRs).[2] Ensuring network and power connectivity is essential to support the
availability of websites, data and applications critical to business operations and profitability,
and this is where investing in an IT disaster recovery plan delivers the greatest benefit.
Increased Business Complexity
Another business driver is increasing business complexity, which is most relevant for larger
businesses that juggle many vendors, processes and components, all of which are necessary to
keep business operations running.
With so many different factors in play as well as individuals, a business continuity and IT
disaster recovery plan tackles the challenge of coordinating efforts and navigating a complex
communication and workflow model in the event of a disaster. The plan must identify and
support the complex interdependencies typically found in a larger organization that all work to
keep the business running.
Increasing Frequency and Intensity of Natural Disasters
The increasing frequency and intensity of natural disasters is another motivation for
establishing a plan. Hurricane Sandy, for example, was a largely unanticipated and devastating
natural disaster that caused delays, power outages and downed businesses/websites. Ideally, your
disaster recovery data center should be located in a region with low risk of natural disasters.
However, Gigaom.com reports that most data centers are located in states that also
experienced the greatest number of FEMA (Federal Emergency Management Agency)
disaster declarations, suggesting a change in disaster recovery strategy is in order. Which
states were hit the hardest, with the highest concentration of existing data centers? The top
three are Texas with 332 disasters and 120 data centers; California with 211 disasters and
more than 160 data centers; and New York with 91 disasters and more than 120 data
centers.[3]
Source: Gigaom, FEMA, Data Center Map
For more on geography, data centers and disaster recovery, see section 5.2.2. Geography
Matters.
[1] Forrester Research and Disaster Recovery Journal, The State of Business Continuity Preparedness;
http://www.drj.com/images/surveys_pdf/forrester/2011_Forrester_SOBC.pdf
[2] Online Tech, Risks on the Rise: Making a Case for IT Disaster Recovery;
http://resource.onlinetech.com/risks-on-the-rise-making-a-case-for-it-disaster-recovery/
Increased Reliance on Third-Parties
Another business driver is the increased business reliance on third-parties (i.e., outsourcing,
suppliers, etc.). As one factor in the business complexity of an organization, vendors can also
introduce potential new or increased risks, depending on their internal security policies and
practices, as well as general security awareness. Read more about Administrative Security to
find out what to look for in a security-conscious third-party vendor, from audits, reports and
policies to staff training.
Increased Regulatory Requirements
Increased regulatory requirements have also shifted attention to the need for disaster recovery.
For the e-commerce, retail and franchise industries, the Payment Card Industry Data Security
Standard (PCI DSS) requires offsite backup and verification of the physical security of the
facility in which cardholder data is stored. Another requirement explicitly mandates the
establishment and testing of an incident response plan in the event of a system breach.[4]
(See section 2.2.1 PCI DSS for more.)
The healthcare industry is regulated by the Health Insurance Portability and Accountability Act
(HIPAA) and more specifically the Health Information Technology for Economic and Clinical
Health (HITECH) Act that addresses privacy and security concerns related to the electronic
transmission of health information. Within the Administrative Safeguards of the HIPAA Security
Rule standards, a contingency plan is required, comprising a data backup plan, disaster
recovery plan, emergency mode operation plan, testing and revision procedures, and
applications and data criticality analysis.[5]
[3] Gigaom, The States with the Most Data Centers Are Also the Most Disaster-Prone [Maps];
http://gigaom.com/2013/01/10/the-states-with-the-most-data-centers-are-also-the-most-disaster-prone-maps/
[4] PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version
2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
Accordingly, failure to meet regulatory requirements can result in federal fines, legal fees, loss
of business credibility, and other significant consequences, motivating businesses of all sizes to
implement a compliant disaster recovery and backup plan.
Increased Threat of Cyber Attacks
The last risk factor making a case for disaster recovery is the increased threat of cyber attacks.
From attacks on federal agencies to corporate franchises to mobile malware, hackers are
frequently developing new methods to gain unauthorized access to systems – or to take down
entire systems.
A denial-of-service (DoS) attack sends an abnormally high volume of requests/traffic in an
attempt to overload servers and bring down networks. While many technical security tools can
be used to prevent, detect and mitigate potential cyber attacks, a comprehensive disaster
recovery plan is essential in order to properly recover and restore critical data and
applications after an attack.
2.2. Compliance Concerns
As mentioned in the previous section, failure to meet industry compliance/regulatory
requirements can result in federal fines, legal fees, loss of business credibility, and other
significant consequences – with disaster recovery and backup as an integral part of the
requirements, it’s important to review what’s at stake and why for each industry.
2.2.1. PCI DSS
For companies that deal with credit cardholder data, including e-commerce, retail, franchise,
etc., the Payment Card Industry Data Security Standard (PCI DSS) is the official set of security
guidelines established by the major credit card brands.
Of the 12 PCI DSS requirements and their sub-requirements, 12.9.1 dictates:[6]
Create the incident response plan to be implemented in the event of
system breach. Ensure the plan addresses the following, at a
minimum:
• Roles, responsibilities, and communication and contact
strategies in the event of a compromise including notification of
the payment brands, at a minimum
• Specific incident response procedures
• Business recovery and continuity procedures
• Data back-up processes
• Analysis of legal requirements for reporting compromises
• Coverage and responses of all critical system components
• Reference or inclusion of incident response procedures from
the payment brands
In addition, PCI DSS requirement 9.5 covers the secure storage of backup media:[7]
Store media back-ups in a secure location, preferably an off-site
facility, such as an alternate or back-up site, or a commercial storage
facility. Review the location’s security at least annually.
[5] U.S. Dept. of Health and Human Services (HHS), HIPAA Security Series: Security Standards:
Organizational, Policies and Procedures and Documentation Requirements;
http://www.hhs.gov/ocr/privacy/hipaa/administrative/securityrule/pprequirements.pdf (PDF)
[6]
The auditor testing procedures call for observation of the storage location’s physical security. A
PCI compliant data center should have proper physical security including limited access
authorization, dual-identification control access to the facility and servers, and complete
environmental control with monitoring, logged surveillance, alarm systems and an alert system.
Ideally, if outsourcing your disaster recovery solution, partner only with a disaster recovery
provider that allows physical tours and walkthroughs of their facilities. What else should you look
for in a PCI disaster recovery provider?
• Policies and procedures, process documents, training records, incident response/data
breach plans, etc.
• Proof that all PCI requirements are in place and sufficiently compliant within the scope of
their contracts
Read more about the required network and technical security, and high availability
infrastructure in PCI Compliant Data Centers. For a complete guide to outsourcing data hosting
and disaster recovery solutions, read our PCI Compliant Hosting white paper.
[7]
2.2.2. HIPAA
For companies that deal with protected health information (PHI), including healthcare providers,
hospitals, physicians, hospital systems, etc., the Health Insurance Portability and Accountability
Act (HIPAA) is the governing legislation, enforced by the U.S. Dept. of Health and Human Services
(HHS).
Its security standards work to protect the availability, confidentiality and integrity of PHI.
The availability aspect becomes all the more dependent on the reliability of your IT
infrastructure as hospitals and healthcare practices increase reliance on the use of electronic
health record systems (EHRs). Healthcare applications and Software as a Service (SaaS)
companies need offsite backup for their data in the event that a production data center
experiences a disaster.
The Contingency Plan standard (§ 164.308(a)(7)) of the Administrative Safeguards of the
HIPAA Security Rule requires covered entities to:
Establish (and implement as needed) policies and procedures for
responding to an emergency or other occurrence (for example, fire,
vandalism, system failure, and natural disaster) that damages systems
that contain electronic protected health information.[8]
The specifications of the standard include a data backup plan, disaster recovery plan,
emergency mode operation plan, testing and revision procedures, and applications and data
criticality analysis.
Read Components of a HIPAA Compliant IT Contingency Plan for a detailed overview and a
customizable IT Contingency Plan template provided by the Dept. of Health and Human
Services.
Read more about the required physical, network and technical security, and high availability
infrastructure in HIPAA Compliant Data Centers. For a complete guide to outsourcing data
hosting and HIPAA disaster recovery solutions, read our HIPAA Compliant Hosting white paper.
[8] U.S. Dept. of Health and Human Services, Administrative Safeguards;
http://www.gpo.gov/fdsys/pkg/CFR-2009-title45-vol1/pdf/CFR-2009-title45-vol1-sec164-308.pdf (PDF)

3.0. Business Continuity
A business continuity plan consists of a few steps:[9]
Business Impact Analysis (BIA)
This involves determining the operational and financial impact of a potential disaster or
disruption, including loss of sales, credibility, compliance fines, legal fees, PR management, etc.
It also includes measuring how the amount of financial/operational damage varies with the time
of the year. A risk assessment should be conducted as part of the BIA to determine what kinds of
assets are actually at risk (people, property, critical infrastructure, IT systems, etc.), as
well as the probability and significance of possible hazards (natural disasters, fires,
mechanical problems, supply failure, cyber attacks, etc.).[10]
Mapping out your business model and determining where the interdependencies lie between the
different departments and vendors within your company is also part of the BIA. The larger the
organization, the more challenging it will be to develop a successful business continuity and
disaster recovery plan. Sometimes organizational restructuring and business process or
workflow realignment is necessary not only to create a business continuity/disaster recovery
plan, but also to maximize and drive operational efficiency.[11]
Ready.gov/business has a BIA worksheet available[12] (seen below) to help you document and
calculate the operational and financial impact of a potential disaster by matching the timing and
duration of an interruption with the loss of sales/income, on a per-department, per-service
and per-process basis.
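The kind of calculation such a worksheet supports can be sketched in a few lines of Python. This is a minimal illustration, not the Ready.gov worksheet itself: the `ProcessImpact` fields, the process names and all dollar figures are hypothetical, and the model assumes losses grow linearly with outage duration.

```python
from dataclasses import dataclass


@dataclass
class ProcessImpact:
    """One row of a simplified impact worksheet (field names are hypothetical)."""
    name: str
    revenue_per_hour: float      # income lost for each hour the process is down
    fixed_penalty: float = 0.0   # one-time costs such as fines or legal fees


def estimated_loss(process: ProcessImpact, outage_hours: float) -> float:
    """Estimate the financial impact of an outage, assuming linear hourly loss."""
    if outage_hours < 0:
        raise ValueError("outage_hours must be non-negative")
    if outage_hours == 0:
        return 0.0
    return process.revenue_per_hour * outage_hours + process.fixed_penalty


def rank_by_impact(processes, outage_hours):
    """Return (name, loss) pairs sorted from highest to lowest estimated loss."""
    losses = [(p.name, estimated_loss(p, outage_hours)) for p in processes]
    return sorted(losses, key=lambda pair: pair[1], reverse=True)


# Illustrative figures only; real values come from your own BIA.
processes = [
    ProcessImpact("online store", revenue_per_hour=2000.0),
    ProcessImpact("email", revenue_per_hour=150.0),
    ProcessImpact("billing", revenue_per_hour=500.0, fixed_penalty=1000.0),
]
ranking = rank_by_impact(processes, outage_hours=4)
```

Ranking processes this way gives a first indication of which systems deserve the most aggressive recovery objectives.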
[9] FEMA (Federal Emergency Management Agency), Business Continuity Plan;
http://www.ready.gov/business/implementation/continuity
[10] FEMA (Federal Emergency Management Agency), Risk Assessment;
http://www.ready.gov/risk-assessment
[11] Online Tech, Business Continuity in Lean Times (Webinar);
http://www.onlinetech.com/events/business-continuity-in-lean-times
[12] Ready.gov, Business Impact Analysis Worksheet;
http://www.ready.gov/sites/default/files/documents/files/BusinessImpactAnalysis_Worksheet.pdf

Recovery Strategies
Analyzing your company’s most valuable data, that is, data that directly leads to revenue, is key
when determining what you need to back up and restore as part of your information technology
(IT) disaster recovery plan.
Create an inventory of documents, databases and systems that are used on a day-to-day basis
to generate revenue, and then quantify and match income with those processes as part of your
recovery strategy/business impact analysis.[13]
Aside from IT, a recovery strategy also involves personnel, equipment, facilities, a
communication strategy and more in order to effectively recover and restore business
operations.
Plan Development
Using information derived from the business impact analysis in conjunction with the recovery
strategies, establish a plan framework. Documenting an IT disaster recovery plan is part of this
stage.
[13] Online Tech, Business Continuity in Lean Times (Webinar);
http://www.onlinetech.com/events/business-continuity-in-lean-times

As can be seen from the multiple steps within business continuity planning, disaster recovery is
a subset within a larger overarching plan to keep a business running. It involves restoring and
recovering IT infrastructure, including servers, networks, devices, data and connectivity (see
section 4.0 Disaster Recovery for more).
A data backup plan involves choosing the right hardware and software backup procedures for
your company, scheduling and implementing backups as well as checking/testing for accuracy
(see section 5.3.1. Offsite Backup Options for more).
Testing & Exercises
Develop a testing process to measure the efficiency and effectiveness of your plans, as well as
how often to conduct tests. Part of this step involves establishing a training program and
conducting training for your company/business continuity team.
Testing allows you to clearly define roles and responsibilities and improve communication within
the team, as well as identify any weaknesses in the plans that require attention. This allows you
to allocate resources as needed to fill the gaps and build up a stronger, more resilient plan.
Read section 5.5.3 Testing for more information.

4.0. Disaster Recovery
As an integral part of business continuity plan development, creating an IT disaster recovery
plan is essential to keep businesses running as they increasingly rely on IT infrastructure
(networks, servers, systems, databases, devices, connectivity, power, etc.) to collect, process
and store mission-critical data.
A disaster recovery plan is designed to restore IT operations at an alternate site after a major
system disruption with long-term effects. After successfully transferring systems, the goal is to
restore, recover and test affected systems and put them back in operation.
Your IT infrastructure is, in most cases, the lifeblood of your organization. When websites are
down or patient data is unavailable due to hacking, natural disasters, hardware failure or human
error, business operations cannot continue.
According to FEMA, a recovery strategy should be developed for each component:
• Physical environment in which data/servers are stored – data centers equipped with
climate control, fire suppression systems, alarm systems, authorization and access
security, etc.
• Hardware – Networks, servers, devices and peripherals.
• Connectivity – Fiber, cable, wireless, etc.
• Software applications – Email, data exchange, project management, electronic
healthcare record systems, etc.
• Data and restoration
Identify the critical software applications and data, as well as the hardware required to run them.
Additionally, determining your company’s custom recovery point and time objectives can
prepare you for recovery success by creating guidelines around when data must be recovered.
4.1. Recovery Point and Time Objectives
Recovery Point Objective (RPO)
A recovery point objective (RPO) specifies the point in time to which data must be recoverable
from backups in order for business operations to resume. In other words, it is the maximum
amount of recent data, measured in time, that the business can afford to lose. The RPO
determines the minimum frequency at which backups need to occur, from every hour to every
5 minutes.[14]
Recovery Time Objective (RTO)
The recovery time objective (RTO) refers to the maximum length of time a system (or computer,
network or application) can be down after a failure or disaster before the company is negatively
impacted by the downtime. Determining the amount of lost revenue per amount of lost time can
help determine which applications and systems are critical to business sustainability.
[14] Online Tech, Seeking a Disaster Recovery Solution? Five Questions to Ask Your DR Provider;
http://resource.onlinetech.com/five-questions-to-ask-your-disaster-recovery-provider/

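For example, if your email server was down for only an hour, yet a large portion of your
database was wiped out and you lost 12 hours’ worth of email, how would that impact your
business? In this scenario the recovery time was one hour, but the recovery point was 12 hours
in the past, so the two objectives need to be set independently.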
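The difference between the two objectives can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the function name `check_objectives`, the timestamps, and the one-hour RPO and four-hour RTO targets are hypothetical and only mirror the email example above.

```python
from datetime import datetime, timedelta


def check_objectives(last_backup, failure, restored, rpo, rto):
    """Compare an incident against recovery point and recovery time objectives.

    data_loss is the span between the last usable backup and the failure;
    downtime is the span between the failure and the restoration of service.
    """
    if failure < last_backup or restored < failure:
        raise ValueError("expected last_backup <= failure <= restored")
    data_loss = failure - last_backup
    downtime = restored - failure
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Email example: service back after 1 hour, but 12 hours of mail lost.
report = check_objectives(
    last_backup=datetime(2024, 1, 1, 0, 0),
    failure=datetime(2024, 1, 1, 12, 0),
    restored=datetime(2024, 1, 1, 13, 0),
    rpo=timedelta(hours=1),   # hypothetical target
    rto=timedelta(hours=4),   # hypothetical target
)
```

Here the RTO is met while the RPO is missed by a wide margin, which is why backup frequency has to be planned separately from recovery speed.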
4.2. Designing for Recovery
High Availability Infrastructure
Strategic data center design involving high availability and redundancy can help support larger
companies that rely on mission-critical (high-impact) applications. High availability is a design
approach that takes into account the sum of all the parts including the application, all the
hardware it is running on, power infrastructure, and the networking behind the hardware.15
Using high availability architecture can reduce the risks of lost revenue and customers in the
event of Internet connectivity or power loss. With high availability, you can perform
maintenance without downtime, and the failure of a single firewall, switch, or PDU will not
affect your availability. With this type of IT design, you can achieve 99.999% availability,
meaning less than 5.26 minutes of downtime per year.
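The downtime figure follows directly from the availability percentage. The sketch below shows the arithmetic in Python; the function name is illustrative, and it assumes an average year of 365.25 days (525,960 minutes).

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in an average year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(99.999)   # roughly 5.26 minutes
three_nines = downtime_minutes_per_year(99.9)    # roughly 526 minutes
```

Each additional "nine" cuts the downtime allowance by a factor of ten, which is why the step from 99.9% to 99.999% demands redundancy at every layer.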
High availability power means the primary power circuit should be provided by the primary UPS
(Uninterruptible Power Supply) and be backed up by the primary generator. A secondary circuit
should be provided by the secondary UPS, which is backed up by the secondary generator.
This redundant design means that the failure of a single UPS or generator will not interrupt
power in your environment.
For a high availability data center, you should seek not only a primary and secondary power
feed, but also a primary and secondary Internet uplink if purchasing Internet from the provider.
Additionally, ensure that any hardware provided, such as firewalls or switches, is redundant.
If using managed services and purchasing a server from a data center, ensure all of the
hardware is configured for high availability, including dual power supplies and dual NICs (network
interface controllers). Ensure the server is also wired back to different switches, and the
switches are dual homed to different access layer routing so there is no single point of failure
anywhere in the environment.
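The benefit of removing single points of failure can be estimated with the standard reliability formulas for components in parallel and in series. The Python sketch below is a simplification: it assumes component failures are independent, and the availability figures are illustrative rather than measured values.

```python
from math import prod


def parallel_availability(availabilities):
    """Availability of redundant components where any single one is sufficient.

    Assumes independent failures: the group is down only if all are down.
    """
    return 1 - prod(1 - a for a in availabilities)


def series_availability(availabilities):
    """Availability of a chain where every component must be up."""
    return prod(availabilities)


# Two power circuits, each assumed to be 99% available on its own.
redundant_power = parallel_availability([0.99, 0.99])      # about 0.9999

# Redundant power feeding a single, non-redundant switch (assumed 99.9%).
with_single_switch = series_availability([redundant_power, 0.999])
```

The second result is lower than the redundant power figure alone, which illustrates why one non-redundant device can undo the gains made elsewhere in the design.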
Offsite backup and disaster recovery are still important, as high availability cannot help you
recover from a natural disaster such as a flood or hurricane. Disaster recovery comes into play
after high availability has completely failed and you must recover to a different
geographical location.
[15] Online Tech, Online Tech Expert Interview: What is High Availability?;
http://resource.onlinetech.com/michigan-data-center-operator-online-tech-expert-interview-what-is-high-availability/

Redundant Infrastructure
Redundancy is another factor to consider when it comes to disaster recovery data center
design. With a fully redundant data center design, automatic failover can ensure server uptime
in the event that one provider experiences any connectivity issues.
This includes multiple Internet Service Providers (ISPs) and fully redundant Cisco networks with
automatic failover. Pooled UPS (Uninterruptible Power Supply) units, batteries and generators
can provide a backup source of power in the event one source fails. View an example of Online
Tech’s redundant network and data centers below:
Cold Site Disaster Recovery
A cold site is little more than an appropriately configured space in a building. Everything
required to restore service to your users must be retrieved and delivered to the site before the
process of recovery can begin. As you can imagine, the delay going from a cold backup site to
full operation can be substantial.
Warm Site Disaster Recovery
A warm site is space leased from a data center provider or disaster recovery provider that
already has the power, cooling and network installed. It is also already stocked with hardware
similar to that found in your data center, or primary site. To restore service, the last backups
from an offsite storage facility are required.
Hot Site Disaster Recovery
A hot site is the most expensive yet fastest way to get your servers back online in the event of
an interruption. Hardware and operating systems are kept in sync and in place at a data center
provider's facility in order to quickly restore operations. Real time synchronization between the
two sites may be used to completely mirror the data environment of the original site using wide
area network links and specialized software. Following a disruption to the original site, the hot
site exists so that the organization can relocate with minimal losses to normal operations.
Ideally, a hot site will be up and running within a matter of hours or even less.
When you partner with a data center/disaster recovery provider, you're sharing the cost of the
infrastructure, so it's not as expensive as it would be to operate an entirely separate
secondary data center of your own.

5.0. Technical Implementation Considerations
5.1. Virtualization/Cloud Computing Disaster Recovery
With virtualization, the entire server, including the operating system, applications, patches and
data, is encapsulated into a single software bundle, or virtual server. This virtual server can be
copied or backed up to an offsite data center, and spun up on a virtual host in minutes in the
event of a disaster.
Since the virtual server is hardware independent, the operating system, applications, patches
and data can be safely and accurately transferred from one data center to a second site without
reloading each component of the server.
This can reduce recovery times compared to traditional disaster recovery approaches where
servers need to be loaded with the OS and application software, as well as patched to the last
configuration used in production before the data can be restored.
Virtual machines (VMs) can be mirrored, or kept running in sync, at a remote site to support
failover in the event that the original site should fail, which helps preserve data accuracy when
recovering and restoring after an interruption.
Another aspect of cloud-based disaster recovery that can improve recovery times drastically is
full network replication. Replicating the entire network and security configuration between the
production and disaster recovery site as configuration changes are made saves you the time
and trouble of configuring VLANs, firewall rules and VPNs before the disaster recovery site can
go live.
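One way to verify that the two sites stay in step is to compare their configuration inventories for drift. The Python sketch below is a hypothetical illustration: the category names, the dictionary layout and the sample values are assumptions, and a real environment would pull this data from the network devices or the provider's management tooling.

```python
def config_drift(production: dict, dr_site: dict) -> dict:
    """Return items present at one site but not the other, per category."""
    drift = {}
    for category in sorted(set(production) | set(dr_site)):
        prod_items = set(production.get(category, ()))
        dr_items = set(dr_site.get(category, ()))
        missing = sorted(prod_items - dr_items)   # must be added at the DR site
        extra = sorted(dr_items - prod_items)     # stale entries at the DR site
        if missing or extra:
            drift[category] = {"missing_at_dr": missing, "extra_at_dr": extra}
    return drift


# Hypothetical inventories: VLAN 30 was added in production only.
production = {
    "vlans": [10, 20, 30],
    "firewall_rules": ["allow tcp/443", "allow tcp/22 from vpn"],
    "vpns": ["branch-office"],
}
dr_site = {
    "vlans": [10, 20],
    "firewall_rules": ["allow tcp/443", "allow tcp/22 from vpn"],
    "vpns": ["branch-office"],
}
drift = config_drift(production, dr_site)
```

An empty result means the disaster recovery site mirrors production for the categories checked; anything else is work that would otherwise have to be done during a failover.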
In order to achieve full replication, your cloud-based disaster recovery provider should manage
both the production cloud servers and disaster recovery cloud servers at both sites.
For warm site disaster recovery, backups of critical servers can be spun up on a shared or
private cloud host platform.
With SAN-to-SAN replication, hot site disaster recovery becomes more affordable. SAN replication
allows not only rapid failover to the secondary site, but also the ability to return to the
production site when the disaster is over.
For a case study of a real physical-to-cloud switch scenario from a business enterprise
perspective, read section 5.1.4. Cloud Case Study for a detailed comparison of managing
physical servers vs. a private cloud environment, including differences in costs, energy use,
uptime, performance and development.
5.1.1. Traditional Disaster Recovery

With traditional disaster recovery outsourced to a vendor with a shared infrastructure, after a
disaster is declared, the hardware, software and operating system must be configured to match
the original affected site.
Data is stored on offsite tape backups. After a disaster, the data must be retrieved and
restored at the remote site, which has been configured to match the original. Complete recovery
and restoration can take hours or a few days. If not outsourcing, the traditional disaster
recovery method of using a cold site can be very time-consuming and very costly.
If you have a disaster recovery infrastructure with preconfigured hardware and software ready at
a secondary site (a warm site), this can cut down on the time it takes to recover. However, even
with a secondary site, your organization is still dependent on retrieving physical backup tapes
for complete restoration. There is no data synchronization and no failback option available with
traditional disaster recovery.
The missing step in many traditional disaster recovery plans is how to return to the production
site once it has been re-established. Traditional disaster recovery plans are often not fully tested
through a full failover disaster scenario due to the time-consuming design of the plan.
5.1.2. Active-Passive
In an active-passive disaster recovery setup, the original or primary site is designed so that the
network fails over to an alternative or secondary site with delayed resiliency. Applications and
configurations are replicated with a delay of anywhere from five minutes to 24 hours. The
secondary site has reduced-capacity hardware, and failback requires a maintenance
window.
5.1.3. Active-Active
In an active-active disaster recovery setup, there is synchronous data replication between the
primary and secondary sites, with no delayed resiliency. The database spans the two data
centers, and the application layer multi-writes. There is equivalent capacity hardware at a
secondary data center to ensure full capacity redundancy.
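One way to compare the two setups is by worst-case data loss. The following is a minimal Python sketch rather than part of any vendor tooling: the function names and the 15-minute recovery point objective (RPO) are assumptions chosen for illustration. It treats the replication delay as the upper bound on the data that could be lost at failover.

```python
from datetime import timedelta


def worst_case_data_loss(replication_delay: timedelta) -> timedelta:
    """Data written since the last replication cycle is at risk at failover."""
    return replication_delay


def meets_rpo(replication_delay: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss stays within the recovery point objective."""
    return worst_case_data_loss(replication_delay) <= rpo


# Hypothetical RPO, for illustration only.
rpo = timedelta(minutes=15)

# Delays taken from the ranges described for each setup.
setups = {
    "active-passive (5 minute delay)": timedelta(minutes=5),
    "active-passive (24 hour delay)": timedelta(hours=24),
    "active-active (synchronous)": timedelta(0),
}

for name, delay in setups.items():
    status = "meets" if meets_rpo(delay, rpo) else "misses"
    print(f"{name}: {status} the {rpo} RPO")
```

An active-passive setup at the slow end of the range misses a tight RPO, while synchronous replication meets any RPO because no replication delay accumulates.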
5.1.4. Cloud Case Study
Online Tech is one example of making the switch from traditional physical servers to a cloud
environment that resulted in savings in hardware, disaster recovery and more. Back in 2011, we
found our growth was beginning to become difficult to manage internally.
Mission Critical Hardware, Facilities and Employees
We had two data centers, hundreds of circuits, network devices, racks, cages and private suites
to manage and maintain. We also had thousands of servers and support tickets due to a rapidly
growing client-base, as well as certification and auditing processes to keep up annually (SSAE
16, SOC 2, HIPAA, PCI DSS, SOX) in order to maintain compliance and data security for our
clients.

With employees at five different locations and in two different countries, we needed a scalable
and efficient solution to support our mission critical business components.
Mission Critical Systems
Within our administrative department, Exchange, SharePoint, a file server and a domain
controller support everyday processes. Our marketing department uses a production
and development website to test and implement updates, as well as a load-balanced website to
optimize resources.
For OTPortal, our client and intranet portal, we use Microsoft .NET applications and an MS SQL
database. For OTMobile, which provides mobile access for our engineers, we use a PHP application.
Within our operations department, we use a custom CentOS program to manage the data and
create a MySQL database for our bandwidth management and billing processes.
Operations has thousands of patches to apply each month, as well as firewall, IDS management
consoles, antivirus management, server and cloud backup managers, SAN and NAS
management, and uptime/performance monitoring to maintain. We also have a sandbox for
testing in our lab.
From Physical Servers to a Private Cloud
We consolidated from 23 physical servers (18 Windows, 5 CentOS servers, 4 database servers;
each with 10 percent utilization) to private cloud. The private cloud consisted of 2 redundant
hardware servers (N+1) and an 8 terabyte SAN. Our high availability (HA) configuration includes
automatic load-balancing across hosts, and automatic failover to a single server.
The private cloud also includes continuous offsite backup, allowing for real-time data
synchronization. We employ a disaster recovery warm site located in Ann Arbor, Michigan that
allows us a four hour recovery time that has been fully tested.
Leveraging Our Cloud
When we switched over, we realized several benefits: faster client-support development,
lower total cost of ownership, improved uptime and performance, and significantly
lower energy usage and carbon footprint.
Pace of Development
With the switch to our private cloud, we’ve increased the pace of development. A project that
would typically take two weeks can now be completed within an hour, as we can create new
servers and test concepts using production data.
As a result, our development team can update the client portal, OTPortal, with new releases
every two weeks, implementing new time-saving features much sooner than before.
Total Cost of Ownership (TCO)

We also reduced the total cost of managing our infrastructure. Our old TCO required
management of 26 physical Dell servers with a variety of specifications, versions, BIOS, CPU
and memory configurations, and the need for several different spares. In addition, we had
26 machines to back up, protect with antivirus, network and patch.
We also had four Cisco network switches, two racks in the data center, more than a hundred
network cables and half a dozen power strips. It took hours to upgrade disks, and downtime
also contributed to costs, as it was required to upgrade memory.
The cloud TCO consolidated everything into two servers, one SAN, two network switches, two
power strips and a quarter of a rack, down from two racks. Overall, we saved 50 percent on
hardware and 90 percent on management costs.
Improved Uptime
Another benefit is improved uptime – always a major benefit when it comes to hosting critical
data for our clients. With N+1 (redundant) hosts, every virtual server we create is protected
from a failed hardware server. Equivalent redundancy with physical servers would have required an
additional 26 servers, adding to cost, time management and energy expenditure.
To guard against SAN failure, we have redundant controllers in our SAN, with RAID array drives
and spare drives on hand. With our high availability power configuration, we are further
protected against downtime.
Initially we considered using a separate server for the database, resulting in a hybrid cloud
configuration, which would have required a cluster of database servers for the same protection.
Instead, we upgraded our entire cloud for less than a new single database server, resulting in
protection against server failure for significantly lower cost than a cluster.
Improved Performance
We also improved our ability to respond to performance issues. Previously with our physical
server setup, it took a few days to get the right RAM/disk/CPU, and we had to schedule
downtime with anywhere from two days to one week of notice.
The actual process included shutting down and removing the server from the rack, opening the
server, installing additional resources and then booting up the server. Then we would have to
test the performance, turn the server back off in order to re-rack it, and then restart the server,
resulting in about two hours of downtime.

When we switched to the cloud, the steps were reduced to: schedule downtime; click to add
more RAM/disk/CPU; reboot server and test performance – a total of five minutes of downtime.
We upgraded the hardware for the entire cloud one host at a time with nearly zero downtime.
Decreasing Energy, Carbon Footprint & Costs
We significantly reduced our energy use and the resulting carbon footprint. When it came to power
consumption, at 100 percent uptime, 300 watts/server and a PUE of 1.8, we went through 1.58
lbs. of CO2/kWh. For 26 physical servers and a network, that amounts to about 200,000 lbs. of
CO2 per year, and twice as much annually for redundancy.
The cloud required two physical servers, network and SAN. With a 35 server capacity, we are
emitting 31,000 lbs. of CO2 per year – a savings of roughly 169,000 lbs. of CO2 annually.16
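The arithmetic behind these figures can be reproduced with a short calculation. This is a rough Python sketch under stated assumptions: the function name is ours, the 26-server estimate covers servers only (the network equipment included in the reported total is not modeled), and the two annual totals are the rounded figures reported in the text.

```python
HOURS_PER_YEAR = 24 * 365


def annual_co2_lbs(servers, watts_per_server, pue, lbs_co2_per_kwh):
    """Estimate yearly CO2 for servers running at 100 percent uptime."""
    it_load_kw = servers * watts_per_server / 1000
    # PUE scales the IT load up to total facility power (cooling, losses).
    facility_kwh = it_load_kw * pue * HOURS_PER_YEAR
    return facility_kwh * lbs_co2_per_kwh


servers_only = annual_co2_lbs(26, 300, 1.8, 1.58)
print(f"26 physical servers: {servers_only:,.0f} lbs. of CO2 per year")

# Rounded annual totals as reported in the text.
physical_total = 200_000
cloud_total = 31_000
print(f"Difference: {physical_total - cloud_total:,} lbs. of CO2 per year")
```

The servers-only estimate comes to roughly 194,000 lbs., which is consistent with the reported total of about 200,000 lbs. once network equipment is added.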
Faster Disaster Recovery
With every server we create, we’re reassured that it is automatically protected, as we have a
single backup process for each host and backups for every virtual server on all hosts. We have
a 4 hour RTO from catastrophic failure – we can fail over to a secondary data center site in
less than 4 hours. With the cloud, we are able to test twice a year to ensure the process runs
smoothly.
In a virtualized environment, the entire server, including the operating system (OS), apps,
patches and data, is captured on a virtual server that can be backed up to another one of our
data centers and spun up in a matter of minutes.17 This makes both testing and full recovery
and failover much faster and more efficient than if we still used physical servers.
16
Online Tech, How the Cloud is Changing the Data Center’s Bad Reputation for Energy Inefficiency;
http://resource.onlinetech.com/how-the-cloud-is-changing-the-data-centers-bad-reputation-for-energy-inefficiency/

5.2. Location for Disaster Recovery
Strategic distance between the primary and secondary sites for disaster recovery is important in
order to avoid natural disasters, ensure data synchronicity, allow for business scalability and
maximize operational efficiency.
5.2.1. Micro-Sufficiency vs. Macro-Efficiency
Micro-sufficiency is the concept in which your core functions or critical pieces of your
infrastructure are centrally located, as well as replicated regionally. In an example of business
model scaling, core departments may be human resources, IT and/or legal located centrally at a
headquarters.
However, each branch of the business in different regions has its own local core departments.
The idea behind this model is that risk is mitigated by dispersing core functions close to each
region and their local customers, and not solely in one location (headquarters). Each branch
also has different strategies to better serve their unique customers as their needs vary from
region to region.
Similarly, with disaster recovery planning, once you identify your critical business processes,
you can distribute those core functions into your various operational units in the event of a
disaster or interruption. Designing with this concept of redundancy and resiliency in your
infrastructure can result in a graceful and safe failover with efficient recovery.
Partnering with a disaster recovery data center provider allows your organization to take
advantage of the risk-mitigating benefits of the micro-sufficiency concept, while avoiding the
costs of building and maintaining your own data center. Instead of installing your own redundant
set of equipment in an alternate facility, colocation or disaster recovery with a partner allows you
to pay for space in a fully staffed, redundant environment.
With macro-efficiency, the concept is economy of scale – the bigger your company, the more
buying power you have and the larger equipment you can buy. In an example of business model
scaling, the core departments make blanket corporate decisions across their regional branches,
regardless of differences in customers and needs. Without recognizing differences and
identifying the workflow of interdependencies between different departments, the model suffers
from lack of organization and inability to identify and recover critical functions.18
17
Data Center Knowledge/Online Tech, How the Cloud Changes Disaster Recovery;
http://www.datacenterknowledge.com/archives/2011/07/26/how-the-cloud-changes-disaster-recovery/
18
Online Tech, Disaster Recovery in Depth; http://www.onlinetech.com/events/disaster-recovery-in-depth/

Micro-sufficiency is the ideal model for disaster recovery and business continuity planning, as it
effectively mitigates risk and presents a better strategy for protecting data through redundant
and resilient design.
5.2.2. Geography Matters
The geographic selection of low natural disaster zones is essential for lowering the risk of critical
IT infrastructure destruction. A large enough distance between your primary and secondary
sites ensures that your secondary site isn’t affected by a potential natural disaster. Read on for
more about specific parameters of secondary disaster recovery sites.
5.2.3. Selection of Second Sites
If your organization or primary site is located in a disaster-prone zone, consider a secondary site
in a landlocked and more temperate region. Compared to coastal regions, the Midwest has low
national averages for significant natural disasters such as floods, tornadoes, hurricanes and
fires that cause mass destruction and may be a threat to your business.

If your organization or primary site is located on the coast or in an earthquake zone, your
secondary site should be located at least 100 miles away.19 Ideally, your secondary site should
be located far enough away to mitigate the risk of it being affected by the same disaster
affecting your primary site.
The design of your secondary site should also be strategic – never locate generators in a
basement or other location that may be difficult to service, or prone to destruction.
Additionally, ensure your secondary data center is located close enough to your primary for
optimal bandwidth and response time, as well as the ability to mirror data in real time. Facilities
should also be easily reached by your IT team in the event of a disaster for faster service and
recovery.
However, your disaster recovery data center should always be located on a different utility
power grid from your primary data center. In the event of a power outage at your primary,
separate power grids help ensure that your secondary site will still be up and running.
5.3. Hardware Protection vs. Data Center Protection
5.3.1. Offsite Backup Options
Sending data offsite ensures a copy of your critical data is available in the event of a disaster at
your primary site, and it is considered a best practice in disaster recovery planning. There are
several offsite data backup media options available, including the traditional tape backup
method that involves periodic copying of data to tape drives that can be done manually or with
software.
However, physical tape backup has its drawbacks, including read or write errors, slow data
retrieval times, and required maintenance windows. With critical business data from medical
records to customer credit card data, your organization can’t afford to risk losing archives or the
ability to completely recover after a disaster.
According to NIST, the different types of data backups include:20
• Full backup – All files on the disk or within the folder are backed up. This can be
time-consuming due to the sheer size of files. According to NIST, maintaining duplicates of
files that don’t change very often, such as system files, can lead to excessive and costly
storage requirements.
19
CIOUpdate.com, Disaster Recovery Planning;
http://www.cioupdate.com/trends/article.php/3872926/Disaster-Recovery-Planning---How-Far-is-Far-Enough.htm
20
NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency
Planning Guide for Federal Information Systems;
http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)

• Incremental – Files that were created or changed since the last backup are captured in
an incremental backup. Backup times are shorter and more efficient, but a restore might require
compiling backups from multiple days and media, depending on when files were
changed.
• Differential – All files that were created or modified since the last full backup are captured. If a
file is changed after the last full backup, the file will be saved each time until the next full backup
is completed. Backup times are shorter than a full backup, and a restore requires less media than
incremental, since only the last full backup and the most recent differential are needed.
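The difference between the three backup types comes down to which reference point a file's modification time is compared against. The sketch below illustrates that selection logic in Python; the `FileInfo` class, the `select_for_backup` function and the plain numeric timestamps are assumptions made for illustration, not part of the NIST guidance or any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified: float  # last-modified time, e.g. seconds since the epoch


def select_for_backup(files, kind, last_full, last_backup):
    """Return the files a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return list(files)
    if kind == "incremental":
        # Only changes since the previous backup, whatever its type.
        return [f for f in files if f.modified > last_backup]
    if kind == "differential":
        # Every change since the last full backup, re-copied each run.
        return [f for f in files if f.modified > last_full]
    raise ValueError(f"unknown backup kind: {kind}")


files = [FileInfo("system.cfg", 1), FileInfo("orders.db", 5), FileInfo("report.doc", 9)]
for kind in ("full", "incremental", "differential"):
    chosen = select_for_backup(files, kind, last_full=3, last_backup=7)
    print(kind, [f.path for f in chosen])
```

With a full backup at time 3 and an incremental at time 7, the next incremental copies only the file changed at time 9, while a differential copies both files changed after time 3.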
For more about specific offsite backup technology, read section 5.4 SAN-to-SAN Replication
and SAN Snapshots.
Outsource vs. In-Source
Outsourcing your offsite backup to a managed services provider can provide your organization
with continuous data protection and full file-level restoration, and offload the burden of installing,
managing and monitoring backups, as well as performing complete restoration after a disaster.
With a vendor, your encrypted server files are sent to an onsite backup manager (primary site)
and then on to a secondary, offsite backup manager, ideally far enough away to reduce
the chances of the secondary site being affected by the same disaster or interruption.
While offsite backup managed in-house can be costly due to building out, maintaining and
upgrading both primary and secondary sites, outsourcing your offsite backup to professionals
means you can take advantage of their investments in capital, technology and expertise.

As NIST (National Institute of Standards and Technology) states, backup media should be stored
offsite or at an alternate site in a secure, environmentally controlled facility.21 An offsite backup
data center should have physical, network and environmental controls to maintain a high level of
security and to protect backups from damage.
Physical security at a data center means access to client servers is limited to authorized
personnel, and the facility itself should require dual-identification access control (through the use
of a secondary identification device, such as biometric authentication that requires a fingerprint
scan). Environmental controls should include 24x7 monitoring, logged surveillance cameras and
multiple alarm systems.
Any sensitive infrastructure should be protected by restricted access, and redundancy in
routers, switches and paired unified threat management devices should provide network
security for your offsite backup data.
Vendor Selection Criteria
When vetting offsite backup and disaster recovery vendors (also known as disaster recovery as
a service, or DRaaS), check specific criteria to ensure your data is protected. Compare
providers on their security certifications, compliance, communication styles and technology,
as well as the basic disaster recovery criteria of geographic area, accessibility, security,
environment and costs discussed in section 5.2 Location for Disaster Recovery.
Compliance
One way to gain assurance of an offsite backup/data center provider’s security practices is to
inquire about their industry security and compliance reports.
Vendors that have invested the significant time and resources toward building out and meeting
regulatory requirements for operating excellence and security practices will have undergone
independent audits. They should also be able to provide a copy of their audit report under a
non-disclosure agreement (NDA).
Look for these data center audit compliance reports:
• SSAE 16 (Statement on Standards for Attestation Engagements), which replaced SAS
70 (Statement on Auditing Standards), measures controls and processes related to
financial recordkeeping and reporting. A SOC 1 (Service Organization Controls) report
measures and reports on the same controls as an SSAE 16 report.
• A SOC 2 audit is most closely related to reporting on the security, availability
and privacy of the data in your offsite backup and data hosting environment. A SOC 2
report is highly recommended for companies that host or store large amounts of data,
particularly data centers. A SOC 3 report measures the same controls as a SOC 2, yet
has less technical detail, and can be used publicly.
• For specific industries that deal with certain types of data, there exist more stringent sets
of compliance regulations. For the healthcare industry, or any company that touches
protected health information (PHI), HIPAA compliance (Health Insurance Portability and
Accountability Act) is federally mandated to protect health data. If your disaster
recovery/offsite backup data center provider has undergone an independent HIPAA audit
of its facilities and processes, you gain added assurance about how your data is protected.
• For e-commerce, retail, franchise and any other company that touches credit cardholder
data (CHD), PCI DSS compliance (Payment Card Industry Data Security Standard) is
the set of requirements designed to protect CHD.
21
Communication
When there’s an interruption in your service or issue at the data center, you should be able to
count on your disaster recovery provider to promptly communicate with you in order to give your
IT staff or clients proper notification. An updated contact list and tested communication plan
should be key aspects of your disaster recovery and business continuity plan.
A lack of communication can put a company out of business and leave coworkers and
customers in the dark. Designate a primary contact and backup contacts from your company to
be the first to know in the event of a disaster, and assemble a technical team that can
work with your provider if you are outsourcing your disaster recovery solution.
When searching for an offsite backup/data center provider, ask about their communication
policies and processes. Good communication can also give you insight into how transparent
they are about their business operations. See section 5.5.4. Communication Plan Testing for
more about establishing a realistic and effective communication plan between your company
and vendors.
Fully Reserved or First-Come, First-Served?
Does your provider offer fully reserved servers for disaster recovery? Or do they lease a number
of physical servers and resources to be used on a first-come, first-served basis, shared with
other companies?
Providers that offer first-come, first-served resources allow companies to load applications and
attempt to recover operations on “cold” servers. These are bare metal servers with no
operating system (OS), applications, patches or data. Recovery takes longer due to the
time spent retrieving tape backups and traveling to the secondary site.
Ask your provider if they offer fully reserved servers for complete assurance that your company
will be able to recover your data as quickly as possible, without the chance of being second in
line. In addition, virtualization can eliminate the need of restoring from tape or disk, thus
reducing recovery times compared to traditional disaster recovery in which physical servers
need to first be loaded with the OS and application software, as well as patched to the last
configuration used in production before data restoration.
5.4. SAN-to-SAN Replication
SAN (Storage Area Network)
Due to compliance reasons or due diligence, many companies not only want a backup locally
that they can recover to very quickly, but they also need to get that data offsite in the event that
they experience a site failure. SAN can help with these backup and recovery needs.
SAN Snapshots
A snapshot is a point-in-time reference of data that you can schedule after your database
dumps and transaction logs have finished running. A SAN snapshot gives you a virtual
copy/image of what your database volumes, devices or systems look like at a given time. If you
have an entire server failure, you can very quickly spin up a server, install SQL or do a bare
metal restore, then import all of your data and get your database server back online.
SAN-to-SAN Replication
The counterpart to SAN snapshots is SAN-to-SAN replication (or synchronization). With
replication, if you have a SAN in one data center, you can send data to another SAN in a different
data center location. You can back up very large volumes of data very quickly using SAN
technology, and you can also transfer that data to a secondary location independently of your
snapshot schedule.
This is more efficient because traditional backup windows can take a very long time and impact
the performance of your system. Keeping it all on the SAN allows backups to be done very
fast, and the data copy can be done in the background so it does not impact the performance of
your systems.
You can configure and maintain snapshots on both your primary and disaster recovery sites.
For example, you can keep seven days’ worth of snapshots on your primary site and seven
days of replication on your disaster recovery site.
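A retention policy like this is typically enforced by comparing each snapshot's timestamp against a cutoff. The following Python sketch shows the idea under stated assumptions: the function name, the nightly schedule and the dates are hypothetical, and a real SAN would apply retention through its own management tools rather than a script like this.

```python
from datetime import datetime, timedelta


def split_by_retention(snapshots, now, keep_days=7):
    """Separate snapshot timestamps into those to keep and those that expired."""
    cutoff = now - timedelta(days=keep_days)
    keep = sorted(s for s in snapshots if s > cutoff)
    expired = sorted(s for s in snapshots if s <= cutoff)
    return keep, expired


# Hypothetical schedule: one snapshot per night for the last ten nights.
now = datetime(2013, 6, 15, 2, 0)
nightly = [now - timedelta(days=d) for d in range(10)]

keep, expired = split_by_retention(nightly, now)
print(f"{len(keep)} snapshots kept, {len(expired)} eligible for deletion")
```

The same function could be run with a different `keep_days` value for each site if the primary and disaster recovery sites follow different retention periods.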
However, SANs are fairly expensive, and snapshots and replication can use a lot of space. You
will also need specialized staff to configure and manage SAN operations.
SAN-based recovery focuses on large volumes of data, and it is more difficult to recover
individual files. Traditional recovery focuses on critical business files for more granular recovery,
but that comes at the cost of speed. With a large volume of data, traditional recovery can be
much slower than SAN-based snapshots.
SAN-to-SAN replication can support a private cloud environment and provide fast recovery
times (RTO of 1 hour and RPO of minutes). After a disaster is mitigated, SAN-to-SAN
replication provides a smooth failback from the secondary site to the production site by
reversing the replication process.
SAN vs. Traditional Backup and Disaster Recovery
Traditionally, 10 or 15 years ago, people had email servers, FTP/document servers,
unstructured data and database servers. The backup and recovery of these systems must be
viewed differently, as they each present their own unique challenges.
Email servers are mission critical, highly transactional and essential to a business.
They may have SQL or custom databases, and they can take a long time to rebuild after a
disaster. The install and configuration of the application that sits on top of the database
can be very intensive, and rebuilding that system may put you over your recovery time
objective (RTO).
For a smaller company, an Exchange server may be 100 to 200 GB in size. FTP/file servers can
be terabytes in size, and contain large volumes of unstructured data. They are less transactional
than email servers, and server configuration could be minimal. Each individual file must be
backed up. For systems of that size, consider moving beyond traditional backups and
leveraging SAN (Storage Area Network) technology, which pools a large group of disks into
shared storage.
Instead of having a backup window that runs for an entire day that can slow operations, you can
use a SAN snapshot technology which allows you to back up more efficiently. If you need a
backup of your FTP/file servers every night, you can leverage a snapshot during off-hours very
quickly, from a matter of seconds to a minute. SAN snapshots can back up a large amount of
data with very little impact on your production environment.
The tradeoff is that it can be slightly harder to restore the data, because you would need to
bring your file drive online and present it to the server. However, it can be faster than having to
restore terabytes of data from a tape backup.
For standalone database servers with a large volume of structured data that are highly
transactional, consider using SAN snapshot technologies with specified volumes for database
dumps and transaction logs.
5.5. Best Practices
5.5.1. Encryption
What is encryption? Encryption takes plaintext (your data) and uses an algorithm to encode it
into scrambled text that is unreadable unless a cryptographic key is used to convert it back.
Encryption protects the confidentiality of data even if it is accessed by an unauthorized
user.

According to NIST (National Institute of Standards and Technology), encryption is most effective
when applied to both the primary data storage device and backup media going to an
offsite location, in case data is lost or stolen on its way or at the site. This covers data in
transit and at rest.22 NIST also recommends maintaining a solid cryptographic key management
process so that encrypted data can be decrypted and made available as needed.
According to data security expert Chris Heuman, Certified Information Systems Security
Professional (CISSP), performing a disaster recovery test of encrypted data should be an
important part of your business continuity strategy. Forcing recovery from an encrypted backup
source and forcing a recovery of the encryption key to the recovery device allows organizations
to find out if encryption is effective before a real disaster or breach occurs.
Encryption for HIPAA and PCI Compliance
Encryption is considered a best practice for data security and is recommended for organizations
with sensitive data, such as healthcare or credit card data. It is highly recommended for
healthcare organizations, which must report to the federal Dept. of Health and Human Services
(HHS) if unencrypted data is exposed, lost, stolen or misused.
The federally mandated HIPAA Security Rule for healthcare organizations handling electronic
protected health information (ePHI) dictates that organizations must:
In accordance with §164.306… Implement a mechanism to encrypt and
decrypt electronic protected health information. (45 CFR §
164.312(a)(2)(iv))
HIPAA also mandates that organizations must:
§164.312(e)(2)(ii): Implement a mechanism to encrypt electronic
protected health information whenever deemed appropriate.
Protecting ePHI at rest and in transit means encrypting not only data collected or processed, but
also data stored or archived as backups.
Organizations that deal with credit cardholder data must adhere to PCI DSS standards,
which require encryption only if cardholder data is stored.23
PCI explicitly states:24
22
23
Chris Heuman CHP, CHSS, CSCS, CISSP, Practice Leader for RISC Management and Consulting,
Encryption – Perspective on Privacy, Security & Compliance;
http://www.onlinetech.com/events/encryption-perspective-on-privacy-security-a-compliance (Webinar)
24
PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures,
Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)

3.4 Render PAN (Primary Account Number) unreadable anywhere it is
stored (including on portable digital media, backup media, and in logs)
by using any of the following approaches:
• One-way hashes based on strong cryptography (hash must be
of the entire PAN)
• Truncation (hashing cannot be used to replace the truncated
segment of PAN)
• Index tokens and pads (pads must be securely stored)
• Strong cryptography with associated key-management
processes and procedures
3.4.1.c Verify that cardholder data on removable media is encrypted
wherever stored.
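Two of the approaches listed in Requirement 3.4, truncation and one-way hashing, can be illustrated with the Python standard library. This is a sketch for illustration only and not a compliant implementation: the function names are ours, keeping the first six and last four digits is an assumed truncation format, and the keyed HMAC-SHA-256 hash is our choice of a strong one-way function rather than something the requirement prescribes. The sample number is a widely used test PAN.

```python
import hashlib
import hmac


def _digits(pan: str) -> str:
    """Strip common separators and validate that only digits remain."""
    digits = pan.replace(" ", "").replace("-", "")
    if not (digits.isascii() and digits.isdigit() and 13 <= len(digits) <= 19):
        raise ValueError("PAN must contain 13 to 19 digits")
    return digits


def truncate_pan(pan: str) -> str:
    """Keep the first six and last four digits; mask everything between."""
    digits = _digits(pan)
    return digits[:6] + "*" * (len(digits) - 10) + digits[-4:]


def hash_pan(pan: str, secret_key: bytes) -> str:
    """One-way keyed hash of the entire PAN (HMAC-SHA-256, hex encoded)."""
    return hmac.new(secret_key, _digits(pan).encode("ascii"), hashlib.sha256).hexdigest()


test_pan = "4111 1111 1111 1111"  # test card number, not a real account
print(truncate_pan(test_pan))
print(hash_pan(test_pan, secret_key=b"example-key-store-securely"))
```

As the requirement notes, the hashed and truncated forms of the same PAN should not be stored together, since the pair makes the original number easier to reconstruct. The key in this example is a placeholder; a real key would be held under a managed key process.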
Encryption is addressable under HIPAA and required under PCI DSS when cardholder data is
stored. It is also considered an industry best practice: no longer just an option, but necessary
to protect backup data at rest and in transit to your disaster recovery/offsite backup site.
For more on encryption from both a technical and compliance perspective, check back to our
White Paper section for our Encryption white paper to be released Fall 2013. Or, watch our
recorded encryption webinar series with IT and data security professional guest speakers as
well as experts from Online Tech in:
• Encryption – Perspective on Privacy, Security & Compliance
• Encryption at the Software Level: Linux and Windows
• Encryption at the Hardware and Storage Level
5.5.2. Network Replication
With a single stand-alone server, cloud-based disaster recovery allows you to ship a copy of
your virtual server image offsite to run on a cloud server in the event of a disaster. However, for
enterprise or more complex server configurations, more than just a server image is required for
recovery. Firewall rules, VLANs, VPNs and the rest of the network configuration must be fully
replicated at the disaster recovery site before the site can go live.
In order to achieve rapid recovery time objectives (RTOs), the server and network must be fully
replicated at the secondary site in synchronicity with the production site as changes are made.
Ideally, a cloud-based disaster recovery provider should have control of both the production and
disaster recovery sites to ensure network replication.
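One way to picture keeping the network in sync is a periodic comparison of the two sites'
configurations. The Python sketch below is only an illustration: the SiteConfig structure, the
find_drift function and the sample rules are hypothetical, and a real environment would pull
this data from its firewall and switch management tools.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SiteConfig:
    """Hypothetical snapshot of one site's network configuration."""
    firewall_rules: frozenset = field(default_factory=frozenset)
    vlans: frozenset = field(default_factory=frozenset)
    vpns: frozenset = field(default_factory=frozenset)


def find_drift(production: SiteConfig, recovery: SiteConfig) -> dict:
    """Report items that exist at only one of the two sites."""
    drift = {}
    for name in ("firewall_rules", "vlans", "vpns"):
        prod_items = getattr(production, name)
        dr_items = getattr(recovery, name)
        missing = sorted(prod_items - dr_items)  # at production, not at DR
        extra = sorted(dr_items - prod_items)    # at DR, not at production
        if missing or extra:
            drift[name] = {"missing_at_dr": missing, "extra_at_dr": extra}
    return drift


if __name__ == "__main__":
    # Sample values are invented for illustration.
    production = SiteConfig(
        firewall_rules=frozenset({"allow tcp 443 from any",
                                  "allow tcp 22 from 10.0.0.0/8"}),
        vlans=frozenset({10, 20}),
        vpns=frozenset({"hq-to-datacenter"}),
    )
    recovery = SiteConfig(
        firewall_rules=frozenset({"allow tcp 443 from any"}),
        vlans=frozenset({10}),
        vpns=frozenset({"hq-to-datacenter"}),
    )
    for section, detail in find_drift(production, recovery).items():
        print(section, detail)
```

An empty result means the recovery site mirrors production for the three categories checked;
anything else is a change that has not yet been replicated.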

5.5.3. Testing
Testing your disaster recovery plan at least annually is a best practice for numerous reasons,
including verifying that the plan actually works and training your team in the process. Testing
also reveals weaknesses and gaps in the process that need to be addressed. According to NIST,
the following areas should be tested:
• Notification procedures
• System recovery on secondary site
• Internal and external connectivity
• System performance with secondary equipment
• Restoration of normal operations
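The five areas above can be tracked as a simple checklist after each test. The Python sketch
below is a minimal illustration; the find_gaps function and the sample results are hypothetical,
and only the area names come from the list above.

```python
# Area names are taken from the list above; everything else is illustrative.
TEST_AREAS = (
    "Notification procedures",
    "System recovery on secondary site",
    "Internal and external connectivity",
    "System performance with secondary equipment",
    "Restoration of normal operations",
)


def find_gaps(results: dict) -> list:
    """Return the areas that failed or were never tested, in checklist order."""
    return [area for area in TEST_AREAS if results.get(area) is not True]


if __name__ == "__main__":
    # Hypothetical outcome of one annual test: True = passed, False = failed.
    annual_test = {
        "Notification procedures": True,
        "System recovery on secondary site": True,
        "Internal and external connectivity": False,
        "System performance with secondary equipment": True,
        # "Restoration of normal operations" was not exercised in this run.
    }
    for area in find_gaps(annual_test):
        print("Needs attention:", area)
```

Treating an untested area the same as a failed one keeps skipped steps from being mistaken for
successes.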
Testing with a traditional disaster recovery plan can be time-consuming and costly due to the
retrieval, restoration and system re-configuration required, so conventional plans are rarely
tested through a full failover scenario. With cloud-based disaster recovery, testing is
easier, faster and less disruptive to your production environment and business operations than
with traditional disaster recovery.
Since the cloud offers offsite backup of the entire virtual server in sync with the production site,
there is no need to retrieve tapes to test full recovery.
5.5.4. Communication Plan Testing
Part of your overall disaster recovery and business continuity planning should involve a well-
documented communication plan based on your BIA (Business Impact Analysis).
Mapping out the interdependencies and complexity of your organization can help you identify
who is the proper point of contact for any given critical function. Testing your communication
plan is key to getting everyone on board and working together to achieve a smooth and realistic
recovery.
Determine who is responsible for officially declaring a disaster. From IT to executives, a
communication plan should be in place for business interruption or disaster notification,
followed by a formal declaration. After declaration, a process should be established for
notifying shareholders, employees, customers, vendors and the general public, if necessary.
Aside from notification, a trained disaster recovery IT team should be identified for the
secondary site, as well as for production. If working with a disaster recovery provider, ensure
your contracts and agreements reflect notification and communication policies that clarify the
provider's roles and responsibilities in facilitating recovery.
Someone should be tasked with keeping a well-organized and up-to-date contact list for those
involved in the communication plan, with cell phone and home phone numbers as well as an
alternative email address in the event that corporate email/phone systems are down during a
disaster.
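A contact list like this can be checked automatically for entries that would be unreachable if
corporate systems went down. The Python sketch below is illustrative only: the Contact fields,
the function name and the rule applied (at least one phone number plus an email address outside
the corporate domain) are assumptions, and the names and domains are invented.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical contact-list entry."""
    name: str
    role: str
    cell_phone: str = ""
    home_phone: str = ""
    alternative_email: str = ""


def unreachable_if_corporate_down(contacts, corporate_domain):
    """Names of contacts lacking a phone number or a non-corporate email."""
    flagged = []
    suffix = "@" + corporate_domain.lower()
    for contact in contacts:
        has_phone = bool(contact.cell_phone or contact.home_phone)
        email = contact.alternative_email.lower()
        has_outside_email = bool(email) and not email.endswith(suffix)
        if not (has_phone and has_outside_email):
            flagged.append(contact.name)
    return flagged


if __name__ == "__main__":
    # Invented sample entries.
    contacts = [
        Contact("Alex", "IT lead", cell_phone="555-0100",
                alternative_email="alex@mail.example"),
        Contact("Pat", "Executive sponsor", cell_phone="555-0101",
                alternative_email="pat@corp.example"),
        Contact("Sam", "DR provider liaison",
                alternative_email="sam@mail.example"),
    ]
    print(unreachable_if_corporate_down(contacts, "corp.example"))
```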

6.0. Conclusion
Disaster recovery technology advancements have streamlined the process to offer a faster,
more accurate and complete recovery solution. Leveraging the capabilities of a disaster
recovery as a service (DRaaS) provider allows organizations to realize these benefits, including
cost-effective and efficient testing to ensure plan viability.
The time- and resource-intensive challenge of managing a secondary disaster recovery site that
both meets stringent industry compliance requirements and protects mission-critical data and
applications can be relieved with the right disaster recovery partner.
Here is a high-level overview of what to look for in an offsite backup and disaster recovery
provider and plan (see section 7.1, Questions to Ask Your Disaster Recovery Provider, for more
details):
• Strategic location
• Risk of natural disaster
• Recovery time objective (RTO)
• Recovery point objective (RPO)
• Cloud-based disaster recovery
• High availability/redundancy
• Annual testing
• Compliance audits and reports
Contact our disaster recovery and offsite backup experts at Online Tech for more information if
you still have questions about IT disaster recovery planning or our disaster recovery data
centers.
Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020

7.0. References
7.1. Questions to Ask Your Disaster Recovery Provider
When you look to a third-party disaster recovery provider, what kind of questions should you ask
to ensure your critical data and applications are safe? Read on for tips on what to look for in a
disaster recovery as a service (DRaaS) solution from your hosting provider.
1. Do you have the following data center certifications: SSAE 16, SOC 1, 2 and 3?
Data center certifications should be up-to-date, backed up by an auditor’s report, and
comprehensive of all security-related controls. Here’s a brief summary of what each one
measures:
• SSAE 16
The Statement on Standards for Attestation Engagements (SSAE) No. 16 replaced SAS
70 in June 2011. If your current disaster recovery provider only has a SAS 70
certification, keep looking! SSAE 16 has made SAS 70 extinct.
An SSAE 16 audit measures the controls, design and operating effectiveness of data
centers, as relevant to financial reporting. (Note: SSAE 16 does not provide assurance
of controls directly related to data centers/disaster recovery providers).
• SOC 1
The first of three new Service Organization Controls reports developed by the AICPA,
this report measures the controls of a data center as relevant to financial reporting. SOC
1 is essentially the same as SSAE 16: the purpose of the report is to meet financial
reporting needs of companies that use data hosting services, including disaster
recovery.
• SOC 2
SOC 2 measures controls specifically related to IT and data center service providers,
unlike SOC 1 or SSAE 16. The five control areas are security, availability, processing
integrity (ensuring system accuracy, completeness and authorization), confidentiality
and privacy.
• SOC 3
SOC 3 delivers an auditor’s opinion of SOC 2 components with the additional seal of
approval needed to ensure you are hosting with an audited and compliant data center. A
SOC 3 report is less detailed and technical than a SOC 2 report.
2. What is your recovery time objective and recovery point objective SLA?
Recovery Time Objective (RTO): This refers to the maximum length of time a system can be
down after a failure or disaster before the company is negatively impacted by the downtime.

Recovery Point Objective (RPO): This specifies the point in time to which data must be
recoverable, which sets the maximum amount of recent data the company can afford to lose. The
RPO determines the minimum frequency at which backups need to occur, for example every hour
or every 5 minutes.
Clarifying the time objectives with your disaster recovery provider can help your organization
plan for the worst and know what to expect, when.
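The two objectives can be checked against the timeline of an actual or simulated failure. The
Python sketch below is a minimal illustration; the evaluate_recovery function, the timestamps
and the 4-hour RTO and 1-hour RPO are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta


def evaluate_recovery(failure_at, last_backup_at, restored_at, rto, rpo):
    """Compare one recovery timeline against the agreed RTO and RPO."""
    downtime = restored_at - failure_at             # how long the system was down
    data_loss_window = failure_at - last_backup_at  # work done since the last backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


if __name__ == "__main__":
    # Hypothetical timeline for illustration.
    result = evaluate_recovery(
        failure_at=datetime(2013, 6, 1, 14, 0),
        last_backup_at=datetime(2013, 6, 1, 13, 30),
        restored_at=datetime(2013, 6, 1, 17, 0),
        rto=timedelta(hours=4),
        rpo=timedelta(hours=1),
    )
    for key, value in result.items():
        print(key, value)
```

In this timeline the system is down for three hours and thirty minutes of data is lost, so both
objectives are met. Backups taken less often than the RPO interval could not guarantee the
second check.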
3. Where are your disaster recovery data centers located?
Natural disasters can happen at any time, almost anywhere, but you can decrease your odds of
experiencing them by partnering with a disaster recovery provider whose data center facilities
are located in a low-risk zone. The Midwest is one region that is relatively free from
major disasters. Read more in High Density of Data Centers Correlate with Disaster Zones;
Michigan Provides Safe Haven.
4. Do you offer cloud-based disaster recovery?
As VMware.com states, “traditional disaster recovery solutions are complex to set up. They
require a secondary site, dedicated infrastructure, and hardware-based replication to move data
to the secondary site.”
With cloud-based disaster recovery, you could achieve a 4-hour RTO and a 24-hour RPO.
Cloud-based disaster recovery replicates the entire hosted cloud (servers, software, network and
security) to an offsite data center, allowing for far faster recovery times than traditional
disaster recovery solutions can offer.
5. How often do you test your disaster recovery systems?
Disaster recovery providers should test at least annually to ensure systems are prepared for an
emergency response whenever a disaster is declared. Testing also allows for a valuable
learning experience – if anything goes wrong, professionals can investigate and remediate
before an actual disaster occurs. It’s also a test run for the personnel involved in managing the
event to ensure the documented communication plan actually works as anticipated.
8.0. Contact Us
Contact our disaster recovery and offsite backup experts at Online Tech for more information if
you still have questions about IT disaster recovery planning or our disaster recovery data
centers.
Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020
