Copyright © Online Tech 2013. All Rights Reserved page 1 of 36
Disaster Recovery
Table of Contents
1.0. Executive Summary
2.0. Business Continuity and Disaster Recovery
2.1. Business Drivers
2.2. Compliance Concerns
2.2.1. PCI DSS
2.2.2. HIPAA
3.0. Business Continuity
4.0. Disaster Recovery
4.1. Recovery Point and Time Objectives
4.2. Designing for Recovery
5.0. Technical Implementation Considerations
5.1. Virtualization/Cloud Computing Disaster Recovery
5.1.1. Traditional Disaster Recovery
5.1.2. Active-Passive
5.1.3. Active-Active
5.1.4. Cloud Case Study
5.2. Location for Disaster Recovery
5.2.1. Micro-Sufficiency vs. Macro-Efficiency
5.2.2. Geography Matters
5.2.3. Selection of Second Sites
5.3. Hardware Protection vs. Data Center Protection
5.3.1. Offsite Backup Options
5.4. SAN-to-SAN Replication
5.5. Best Practices
5.5.1. Encryption
5.5.2. Network Replication
5.5.3. Testing
5.5.4. Communication Plan Testing
6.0. Conclusion
7.0. References
7.1. Questions to Ask Your Disaster Recovery Provider
8.0. Contact Us
1.0. Executive Summary
Investing in risk management means investing in business sustainability – designing a
comprehensive business continuity and disaster recovery plan is about analyzing the impact of
a business interruption on revenue.
Mapping out your business model, identifying the key components essential to operations, and
developing and testing a strategy to efficiently recover and restore data and systems together
form an involved, long-term project that may take 12-18 months depending on the complexity of
your organization.
Addressing high-level business drivers for designing, implementing and testing a business
continuity and disaster recovery plan, this white paper makes a case for the investment while
discussing the innate challenges, benefits and detriments of different solutions from the
perspective of experienced IT and data security professionals.
Speaking directly to different compliance requirements, this paper addresses how to protect
sensitive backup data within the parameters of standards set for the healthcare and
e-commerce industries.
From there, this paper delves into different disaster recovery and offsite backup technical
solutions, from traditional to virtualization (cloud-based disaster recovery), as well as
considerations in seeking a disaster recovery as a service solution (DRaaS) provider. A case
study of the switch from physical servers and traditional disaster recovery to a private cloud
environment details the differences in cost, uptime, performance and more.
This white paper is ideal for executives and IT decision-makers seeking a primer as well as
up-to-date information regarding disaster recovery best practices and specific technology
recommendations.
2.0. Business Continuity and Disaster Recovery
Business continuity is the process of analyzing the mission critical components required to keep
your business running in the event of a disaster – business continuity is an overarching plan
involving a few steps (see section 3.0 Business Continuity for a detailed description of what
each step entails):
• Business Impact Analysis (BIA)
• Recovery Strategies
• Plan Development
• Testing and Exercises
Creating an IT disaster recovery plan is part of the Plan Development step. As can be seen from
the multiple steps within business continuity planning, disaster recovery is only a subset within a
larger overarching plan to keep a business running. Disaster recovery requires creating a plan
to recover and restore IT infrastructure, including servers, networks, devices, data and
connectivity.
2.1. Business Drivers
Why allocate budget toward a business continuity and IT disaster recovery plan? According to a
Forrester/Disaster Recovery Journal Business Continuity Preparedness Survey, the top reason
is an increased reliance on technology.1
Increased Reliance on Technology
An increased reliance on technology can be seen across industries, from retailers that must
upgrade to digital transactions and mobile payments to healthcare organizations that rely on
electronic patient data entry, information exchange and processing as they shift from paper
records to electronic health record systems (EHRs).2 Ensuring network and power connectivity
is essential to support the availability of websites, data and applications critical to business
operations and profitability. This is where the greatest benefit can be seen in investing in an IT
disaster recovery plan.
Increased Business Complexity
Another business driver is the increasing complexity of the organization itself. This is more
relevant for larger businesses that might juggle many vendors, processes and components that
are all necessary to keep business operations running.
With so many different factors in play as well as individuals, a business continuity and IT
disaster recovery plan tackles the challenge of coordinating efforts and navigating a complex
communication and workflow model in the event of a disaster. The plan must identify and
support the complex interdependencies typically found in a larger organization that all work to
keep the business running.
Increasing Frequency and Intensity of Natural Disasters
The increasing frequency and intensity of natural disasters is also motivation for establishing a
plan to deal with the effects of events such as Hurricane Sandy, a largely unanticipated and
devastating natural disaster that caused delays, power outages and downed
businesses/websites. Ideally, your disaster recovery data center should be located in a region
with low risk of natural disasters.
However, Gigaom.com reports that the greatest number of data centers are located in states
that also experienced the greatest number of FEMA (Federal Emergency Management Agency)
disaster declarations, suggesting a change in disaster recovery strategy is in order. Which
states were hit the hardest, with the highest concentration of existing data centers? The top
three include Texas with 332 disasters and 120 data centers; California with 211 disasters and
more than 160 data centers; and New York with 91 disasters and more than 120 data
centers.3
1
Forrester Research and Disaster Recovery Journal, The State of Business Continuity Preparedness;
http://www.drj.com/images/surveys_pdf/forrester/2011_Forrester_SOBC.pdf
2
Online Tech, Risks on the Rise: Making a Case for IT Disaster Recovery;
http://resource.onlinetech.com/risks-on-the-rise-making-a-case-for-it-disaster-recovery/
Source: Gigaom, FEMA, Data Center Map
For more on geography, data centers and disaster recovery, see section 5.2.2. Geography
Matters.
Increased Reliance on Third-Parties
Another business driver is the increased business reliance on third-parties (i.e., outsourcing,
suppliers, etc.). As one factor in the business complexity of an organization, vendors can also
introduce potential new or increased risks, depending on their internal security policies and
practices, as well as general security awareness. Read more about Administrative Security to
find out what to look for in a security-conscious third-party vendor, from audits, reports and
policies to staff training.
Increased Regulatory Requirements
Increased regulatory requirements have also shifted attention to the need for disaster recovery.
For the e-commerce, retail and franchise industries, the Payment Card Industry Data Security
Standards (PCI DSS) require the offsite backup and verification of the physical security of the
facility in which cardholder data is found. Another requirement explicitly mandates the
establishment and testing of an incident response plan in the event of a system breach.4 (See
section 2.2.1 PCI DSS for more).
The healthcare industry is regulated by the Health Insurance Portability and Accountability Act
(HIPAA) and, more specifically, the Health Information Technology for Economic and Clinical
Health (HITECH) Act, which addresses privacy and security concerns related to the electronic
transmission of health information. Within the Administrative Safeguards of the HIPAA Security
Rule standards, a contingency plan is required, comprising a data backup plan, disaster
recovery plan, emergency mode operation plan, testing and revision procedures, and
applications and data criticality analysis.5
3
Gigaom, The States with the Most Data Centers Are Also the Most Disaster-Prone [Maps];
http://gigaom.com/2013/01/10/the-states-with-the-most-data-centers-are-also-the-most-disaster-prone-maps/
4
PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version
2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
Accordingly, failure to meet regulatory requirements can result in federal fines, legal fees, loss
of business credibility, and other significant consequences, motivating businesses of all sizes to
implement a compliant disaster recovery and backup plan.
Increased Threat of Cyber Attacks
The last risk factor making a case for disaster recovery is the increased threat of cyber attacks.
From attacks on federal agencies to corporate franchises to mobile malware, hackers are
frequently developing new methods to gain unauthorized access to systems – or to take down
entire systems.
A denial-of-service attack (DoS attack) is one method of sending an abnormally high volume of
requests/traffic in an attempt to overload servers and bring down networks. While many other
technical security tools can be used to prevent, detect and mitigate potential cyber attacks, a
comprehensive disaster recovery plan is essential in order to properly recover and restore
critical data and applications after an attack.
2.2. Compliance Concerns
As mentioned in the previous section, failure to meet industry compliance/regulatory
requirements can result in federal fines, legal fees, loss of business credibility, and other
significant consequences – with disaster recovery and backup as an integral part of the
requirements, it’s important to review what’s at stake and why for each industry.
2.2.1. PCI DSS
For companies that deal with credit cardholder data, including e-commerce, retail, franchise,
etc., the Payment Card Industry Data Security Standards (PCI DSS) are the official security
guidelines set by the major credit card brands.
Of the 12 PCI DSS requirements and sub-requirements, 12.9.1 dictates:6
Create the incident response plan to be implemented in the event of
system breach. Ensure the plan addresses the following, at a
minimum:
• Roles, responsibilities, and communication and contact
strategies in the event of a compromise including notification of
the payment brands, at a minimum
• Specific incident response procedures
• Business recovery and continuity procedures
• Data back-up processes
• Analysis of legal requirements for reporting compromises
• Coverage and responses of all critical system components
• Reference or inclusion of incident response procedures from
the payment brands
5
U.S. Dept. of Health and Human Services (HHS), HIPAA Security Series: Security Standards:
Organizational, Policies and Procedures and Documentation Requirements;
http://www.hhs.gov/ocr/privacy/hipaa/administrative/securityrule/pprequirements.pdf (PDF)
6
PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version
2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
In addition, PCI DSS requirement 9.5 addresses the offsite storage of backup media:7
Store media back-ups in a secure location, preferably an off-site
facility, such as an alternate or back-up site, or a commercial storage
facility. Review the location’s security at least annually.
The auditor testing procedures call for observation of the storage location’s physical security. A
PCI compliant data center should have proper physical security including limited access
authorization, dual-identification control access to the facility and servers, and complete
environmental control with monitoring, logged surveillance, alarm systems and an alert system.
Ideally, if outsourcing your disaster recovery solution, partner only with a disaster recovery
provider that allows physical tours and walkthroughs of their facilities. What else should you look
for in a PCI disaster recovery provider?
• Policies and procedures, process documents, training records, incident response/data
breach plans, etc.
• Proof that all PCI requirements are in place and sufficiently compliant within the scope of
their contracts
Read more about the required network and technical security, and high availability
infrastructure in PCI Compliant Data Centers. For a complete guide to outsourcing data hosting
and disaster recovery solutions, read our PCI Compliant Hosting white paper.
7
PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version
2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
2.2.2. HIPAA
For companies that deal with protected health information (PHI), including healthcare providers,
hospitals, physicians, hospital systems, etc., the Health Insurance Portability and Accountability
Act (HIPAA) is the federal legislation enforced by the U.S. Dept. of Health and Human Services
(HHS).
This set of security standards works to protect the availability, confidentiality and integrity of PHI
– the availability aspect becomes all the more dependent on the reliability of your IT
infrastructure as hospitals and healthcare practices increase reliance on the use of electronic
health record systems (EHRs). Healthcare applications and Software as a Service (SaaS)
companies need offsite backup for their data in the event that a production data center
experiences a disaster.
The Contingency Plan standard (§ 164.308(a)(7)) of the Administrative Safeguards of the
HIPAA Security Rule requires covered entities to:
Establish (and implement as needed) policies and procedures for
responding to an emergency or other occurrence (for example, fire,
vandalism, system failure, and natural disaster) that damages systems
that contain electronic protected health information.8
The specifications of the standard include a data backup plan, disaster recovery plan,
emergency mode operation plan, testing and revision procedures, and applications and data
criticality analysis.
Read Components of a HIPAA Compliant IT Contingency Plan for a detailed overview and a
customizable IT Contingency Plan template provided by the Dept. of Health and Human
Services.
Read more about the required physical, network and
technical security, and high availability infrastructure in
HIPAA Compliant Data Centers. For a complete guide
to outsourcing data hosting and HIPAA disaster
recovery solutions, read our HIPAA Compliant hosting
white paper.
8
U.S. Dept. of Health and Human Services, Administrative Safeguards;
http://www.gpo.gov/fdsys/pkg/CFR-2009-title45-vol1/pdf/CFR-2009-title45-vol1-sec164-308.pdf (PDF)
3.0. Business Continuity
A business continuity plan consists of a few steps:9
Business Impact Analysis (BIA)
This involves determining the operational and financial impact of a potential disaster or
disruption, including loss of sales, credibility, compliance fines, legal fees, PR management, etc.
It also includes measuring the amount of financial/operational damage depending on the time of
the year. A risk assessment should be conducted as part of the BIA to determine which assets
are actually at risk (people, property, critical infrastructure, IT systems, etc.) as well as the
probability and significance of possible hazards (natural disasters, fires, mechanical problems,
supply failure, cyber attacks, etc.).10
Mapping out your business model and determining where the interdependencies lie between the
different departments and vendors within your company is also part of the BIA. The larger the
organization, the more challenging it will be to develop a successful business continuity and
disaster recovery plan. Sometimes organizational restructuring and business process or
workflow realignment is necessary not only to create a business continuity/disaster recovery
plan, but also to maximize and drive operational efficiency.11
Ready.gov/business has a BIA worksheet available12 (seen below) to help you document and
calculate the operational and financial impact of a potential disaster by matching the timing and
duration of an interruption with the loss of sales/income, as well as on a per department, service
and process basis.
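The calculation behind such a worksheet can be sketched in a few lines of Python. This is only an illustration: the `Process` class, the process names, the hourly revenue figures and the peak multiplier are all hypothetical placeholders, not values taken from the Ready.gov worksheet.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """One revenue-generating process, department or service (hypothetical fields)."""
    name: str
    revenue_per_hour: float        # assumed average income tied to this process
    peak_multiplier: float = 1.0   # assumed uplift if the outage hits a busy period


def interruption_loss(process: Process, hours_down: float, during_peak: bool = False) -> float:
    """Estimate lost income for an interruption of a given duration and timing."""
    rate = process.revenue_per_hour
    if during_peak:
        rate *= process.peak_multiplier
    return rate * hours_down


# Illustrative figures only; replace with numbers from your own analysis.
processes = [
    Process("online orders", revenue_per_hour=1200.0, peak_multiplier=3.0),
    Process("invoicing", revenue_per_hour=300.0),
]

for p in processes:
    normal = interruption_loss(p, hours_down=8)
    peak = interruption_loss(p, hours_down=8, during_peak=True)
    print(f"{p.name}: 8-hour outage costs {normal:,.0f} normally, {peak:,.0f} at peak")
```

Running the same estimate for several durations and times of year gives a rough per-process view of where an interruption would hurt most.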
9
FEMA (Federal Emergency Management Agency), Business Continuity Plan;
http://www.ready.gov/business/implementation/continuity
10
FEMA (Federal Emergency Management Agency), Risk Assessment;
http://www.ready.gov/risk-assessment
11
Online Tech, Business Continuity in Lean Times (Webinar);
http://www.onlinetech.com/events/business-continuity-in-lean-times
12
Ready.gov, Business Impact Analysis Worksheet;
http://www.ready.gov/sites/default/files/documents/files/BusinessImpactAnalysis_Worksheet.pdf
Recovery Strategies
Analyzing your company’s most valuable data, that is, the data that directly leads to revenue, is
key when determining what you need to back up and restore as part of your information
technology (IT) disaster recovery plan.
Create an inventory of documents, databases and systems that are used on a day-to-day basis
to generate revenue, and then quantify and match income with those processes as part of your
recovery strategy/business impact analysis.13
Aside from IT, a recovery strategy also involves personnel, equipment, facilities, a
communication strategy and more in order to effectively recover and restore business
operations.
Plan Development
Using information derived from the business impact analysis in conjunction with the recovery
strategies, establish a plan framework. Documenting an IT disaster recovery plan is part of this
stage.
13
Online Tech, Business Continuity in Lean Times (Webinar);
http://www.onlinetech.com/events/business-continuity-in-lean-times
As can be seen from the multiple steps within business continuity planning, disaster recovery is
a subset within a larger overarching plan to keep a business running. It involves restoring and
recovering IT infrastructure, including servers, networks, devices, data and connectivity (see
section 4.0 Disaster Recovery for more).
A data backup plan involves choosing the right hardware and software backup procedures for
your company, scheduling and implementing backups as well as checking/testing for accuracy
(see section 5.3.1. Offsite Backup Options for more).
Testing & Exercises
Develop a testing process to measure the efficiency and effectiveness of your plans, as well as
how often to conduct tests. Part of this step involves establishing a training program and
conducting training for your company/business continuity team.
Testing allows you to clearly define roles and responsibilities and improve communication within
the team, as well as identify any weaknesses in the plans that require attention. This allows you
to allocate resources as needed to fill the gaps and build up a stronger, more resilient plan.
Read section 5.5.3 Testing for more information.
4.0. Disaster Recovery
As an integral part of business continuity plan development, creating an IT disaster recovery
plan is essential to keep businesses running as they increasingly rely on IT infrastructure
(networks, servers, systems, databases, devices, connectivity, power, etc.) to collect, process
and store mission-critical data.
A disaster recovery plan is designed to restore IT operations at an alternate site after a major
system disruption with long-term effects. After successfully transferring systems, the goal is to
restore, recover, test affected systems and put them back in operation.
Your IT infrastructure is, in most cases, the lifeblood of your organization. When websites are
down or patient data is unavailable due to hacking, natural disasters, hardware failure or human
error, businesses cannot survive.
According to FEMA, a recovery strategy should be developed for each component:
• Physical environment in which data/servers are stored – data centers equipped with
climate control, fire suppression systems, alarm systems, authorization and access
security, etc.
• Hardware – Networks, servers, devices and peripherals.
• Connectivity – Fiber, cable, wireless, etc.
• Software applications – Email, data exchange, project management, electronic
healthcare record systems, etc.
• Data and restoration
Identify the critical software applications and data, as well as the hardware required to run them.
Additionally, determining your company’s custom recovery point and time objectives can
prepare you for recovery success by creating guidelines around when data must be recovered.
4.1. Recovery Point and Time Objectives
Recovery Point Objective (RPO)
A recovery point objective (RPO) specifies the point in time to which data must be recoverable in
order for business operations to resume. In other words, it is the maximum amount of data,
measured in time, that the business can afford to lose. The RPO determines the minimum
frequency at which backups need to occur, from every hour to every 5 minutes.14
Recovery Time Objective (RTO)
The recovery time objective (RTO) refers to the maximum length of time a system (or computer,
network or application) can be down after a failure or disaster before the company is negatively
impacted by the downtime. Determining the amount of lost revenue per amount of lost time can
help determine which applications and systems are critical to business sustainability.
14
Online Tech, Seeking a Disaster Recovery Solution? Five Questions to Ask Your DR Provider;
http://resource.onlinetech.com/five-questions-to-ask-your-disaster-recovery-provider/
For example, if your email server was down for only an hour, yet a large portion of your
database was wiped out and you lost 12 hours’ worth of email, how would that impact your
business?
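The two objectives can be checked independently, which is the point of the email example above. The short Python sketch below assumes hypothetical targets (a 4-hour RTO and a 1-hour RPO) purely for illustration; the function names are not from any particular tool.

```python
from datetime import timedelta


def meets_rto(downtime: timedelta, rto: timedelta) -> bool:
    """True if the system was restored within the recovery time objective."""
    return downtime <= rto


def meets_rpo(worst_case_data_loss: timedelta, rpo: timedelta) -> bool:
    """True if the amount of lost data (measured in time) is within the recovery point objective.

    With periodic backups, the worst-case loss is roughly the interval between backups.
    """
    return worst_case_data_loss <= rpo


# Hypothetical objectives for an email system.
rto = timedelta(hours=4)
rpo = timedelta(hours=1)

# The scenario from the text: back online in one hour, but 12 hours of email lost.
downtime = timedelta(hours=1)
data_lost = timedelta(hours=12)

print("RTO met:", meets_rto(downtime, rto))
print("RPO met:", meets_rpo(data_lost, rpo))
```

Under these assumed targets the outage satisfies the RTO but badly misses the RPO, which is why both objectives need to be defined for each critical system.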
4.2. Designing for Recovery
High Availability Infrastructure
Strategic data center design involving high availability and redundancy can help support larger
companies that rely on mission-critical (high-impact) applications. High availability is a design
approach that takes into account the sum of all the parts including the application, all the
hardware it is running on, power infrastructure, and the networking behind the hardware.15
Using high availability architecture can reduce the risks of lost revenue and customers in the
event of Internet connectivity or power loss – with high availability, you can perform
maintenance without downtime and the failure of a single firewall, switch, or PDU will not affect
your availability. With this type of IT design, you can achieve 99.999% availability, meaning you
have less than 5.26 minutes of downtime per year.
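The downtime figure follows directly from the availability percentage. As a quick check, the Python sketch below converts an availability target into allowed downtime per year, assuming a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% availability allows about {minutes:.2f} minutes of downtime per year")
```

For 99.999% this gives roughly 5.26 minutes, matching the figure above, while 99.9% allows close to nine hours.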
High availability power means the primary power circuit should be provided by the primary UPS
(Uninterruptible Power Supply) and be backed up by the primary generator. A secondary circuit
should be provided by the secondary UPS, which is backed up by the secondary generator.
This redundant design ensures that the failure of a single UPS or generator will not interrupt
power in your environment.
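The benefit of a second, independent power path can be estimated with basic probability. The sketch below is a simplification that assumes each path fails independently of the others, which real facilities only approximate, and the 99.9% per-path availability is an illustrative number, not a measured one.

```python
def parallel_availability(path_availability: float, paths: int) -> float:
    """Availability of N redundant paths, assuming failures are independent.

    Service is lost only if every path is down at the same time.
    """
    all_paths_down = (1 - path_availability) ** paths
    return 1 - all_paths_down


single = 0.999  # assumed availability of one UPS-plus-generator circuit
print(f"one circuit:  {parallel_availability(single, 1):.6f}")
print(f"two circuits: {parallel_availability(single, 2):.6f}")
```

The independence assumption is the weak point: a shared fuel supply, a common utility feed or a site-wide flood affects both paths at once, which is one reason redundancy within a single site does not replace offsite recovery.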
For a high availability data center, you should seek not only a primary and secondary power
feed, but also a primary and secondary Internet uplink if purchasing Internet from the provider.
Additionally, ensure that any hardware the provider supplies, such as firewalls or switches, is
deployed redundantly.
If using managed services and purchasing a server from a data center, ensure all of the
hardware is configured for high availability, including dual power supplies and dual NICs (network
interface controllers). Ensure the server is also wired back to different switches, and that the
switches are dual-homed to different access layer routing so there is no single point of failure
anywhere in the environment.
Offsite backup and disaster recovery are still important, as high availability cannot help you
recover from a natural disaster such as a flood or hurricane. Disaster recovery comes into play
after high availability has completely failed and you must recover to a different
geographical location.
15
Online Tech, Online Tech Expert Interview: What is High Availability?;
http://resource.onlinetech.com/michigan-data-center-operator-online-tech-expert-interview-what-is-high-availability/
Redundant Infrastructure
Redundancy is another factor to consider when it comes to disaster recovery data center
design. With a fully redundant data center design, automatic failover can ensure server uptime
in the event that one provider experiences any connectivity issues.
This includes multiple Internet Service Providers (ISPs) and fully redundant Cisco networks with
automatic failover. Pooled UPS (Uninterruptible Power Supply), battery and generators can
ensure a backup source of power in the event one provider fails. View an example of Online
Tech’s redundant network and data centers below:
Cold Site Disaster Recovery
A cold site is little more than an appropriately configured space in a building. Everything
required to restore service to your users must be retrieved and delivered to the site before the
process of recovery can begin. As you can imagine, the delay going from a cold backup site to
full operation can be substantial.
Warm Site Disaster Recovery
A warm site is space leased from a data center provider or disaster recovery provider that
already has the power, cooling and network installed. It is also already stocked with hardware
similar to that found in your data center, or primary site. To restore service, the last backups
from an offsite storage facility are required.
Hot Site Disaster Recovery
A hot site is the most expensive yet fastest way to get your servers back online in the event of
an interruption. Hardware and operating systems are kept in sync and in place at a data center
provider's facility in order to quickly restore operations. Real time synchronization between the
two sites may be used to completely mirror the data environment of the original site using wide
area network links and specialized software. Following a disruption to the original site, the hot
site exists so that the organization can relocate with minimal losses to normal operations.
Ideally, a hot site will be up and running within a matter of hours or even less.
When you partner with a data center/disaster recovery provider, you're sharing the cost of the
infrastructure, so it's not as expensive as maintaining an entirely separate secondary data center.
5.0. Technical Implementation Considerations
5.1. Virtualization/Cloud Computing Disaster Recovery
With virtualization, the entire server, including the operating system, applications, patches and
data, is encapsulated into a single software bundle or virtual server. This virtual server can be
copied or backed up to an offsite data center, and spun up on a virtual host in minutes in the
event of a disaster.
Since the virtual server is hardware independent, the operating system, applications, patches
and data can be safely and accurately transferred from one data center to a second site without
reloading each component of the server.
This can reduce recovery times compared to traditional disaster recovery approaches where
servers need to be loaded with the OS and application software, as well as patched to the last
configuration used in production before the data can be restored.
Virtual machines (VMs) can be mirrored, or kept running in sync, at a remote site to ensure
failover in the event that the original site should fail, which helps preserve data accuracy when
recovering and restoring after an interruption.
Another aspect of cloud-based disaster recovery that improves recovery times drastically is full
network replication. Replicating the entire network and security configuration between the
production and disaster recovery site as configuration changes are made saves you the time
and trouble of configuring VLANs, firewall rules and VPNs before the disaster recovery site can
go live.
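As a rough illustration of why keeping the two sites' network configuration in step matters, the sketch below compares a production configuration with a recovery-site configuration and reports any drift. The setting names and values are hypothetical, and a real provider would read them from its own network and firewall management tools.

```python
def config_drift(production: dict, recovery: dict) -> dict:
    """Return settings whose values differ between the two sites.

    Each entry maps a setting name to a (production, recovery) pair;
    None marks a setting that is missing from that site.
    """
    keys = production.keys() | recovery.keys()
    return {
        key: (production.get(key), recovery.get(key))
        for key in sorted(keys)
        if production.get(key) != recovery.get(key)
    }


# Hypothetical settings, for illustration only.
production = {
    "vlan_10": "10.0.10.0/24",
    "fw_allow_443": True,
    "vpn_peer": "198.51.100.7",
}
recovery = {
    "vlan_10": "10.0.10.0/24",
    "fw_allow_443": False,
}

drift = config_drift(production, recovery)
for name, (prod_value, dr_value) in drift.items():
    print(f"{name}: production={prod_value!r}, recovery={dr_value!r}")
```

An empty result means the recovery site can go live without manual network changes; anything else is work that would otherwise have to be done during the disaster.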
In order to achieve full replication, your cloud-based disaster recovery provider should manage
the cloud servers at both the production and disaster recovery sites.
For warm site disaster recovery, backups of critical servers can be spun up on a shared or
private cloud host platform.
With SAN-to-SAN replication, hot site disaster recovery becomes more affordable. SAN replication
allows not only rapid failover to the secondary site, but also the ability to return to the
production site when the disaster is over.
For a real physical-to-cloud switch scenario from a business enterprise perspective, see section
5.1.4. Cloud Case Study, which compares managing physical servers with managing a private cloud
environment, including differences in costs, energy use, uptime, performance and development.
5.1.1. Traditional Disaster Recovery
With traditional disaster recovery outsourced to a vendor with a shared infrastructure, after a
disaster is declared, the hardware, software and operating system must be configured to match
the original affected site.
Data is stored on offsite tape backups. After a disaster, the data must be retrieved and
restored at the remote site, which has been configured to match the original. Complete recovery
and restoration can take anywhere from hours to a few days. If not outsourcing, the traditional
disaster recovery method of using a cold site can be very time-consuming and very costly.
If you have a disaster recovery infrastructure with preconfigured hardware and software ready at
a secondary site (a warm site), this can cut down on the time it takes to recover. However, even
with a secondary site, your organization is still dependent on retrieving physical backup tapes
for complete restoration. There is no data synchronization and no failback option available with
traditional disaster recovery.
The missing step in many traditional disaster recovery plans is how to return to the production
site once it has been re-established. Traditional disaster recovery plans are often not fully tested
through a full failover disaster scenario due to the time-consuming design of the plan.
5.1.2. Active-Passive
In an active-passive disaster recovery setup, the original or primary site is designed so that the
network fails over to an alternative or secondary site with delayed resiliency. Applications and
configurations are replicated with a delay of anywhere from five minutes to 24 hours. The
secondary site has reduced capacity hardware, and failback requires a maintenance window.
5.1.3. Active-Active
In an active-active disaster recovery setup, there is synchronous data replication between the
primary and secondary sites, with no delayed resiliency. The database spans the two data
centers, and the application layer multi-writes. There is equivalent capacity hardware at a
secondary data center to ensure full capacity redundancy.
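To make the difference between the two setups concrete, the sketch below estimates the worst-case window of lost writes for each one and checks it against a recovery point objective (RPO). The class and function names are hypothetical, and the 15-minute RPO is an assumed target. The five-minute and 24-hour delays come from the active-passive range above, and a zero delay models synchronous replication.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ReplicationSetup:
    """Hypothetical description of a disaster recovery replication setup."""
    name: str
    replication_delay: timedelta  # zero models synchronous replication


def worst_case_data_loss(setup: ReplicationSetup) -> timedelta:
    """Writes made since the last replication cycle can be lost on failover,
    so the worst-case loss window equals the replication delay."""
    return setup.replication_delay


def meets_rpo(setup: ReplicationSetup, rpo: timedelta) -> bool:
    """True when the worst-case loss window fits within the RPO."""
    return worst_case_data_loss(setup) <= rpo


setups = [
    ReplicationSetup("active-passive, 5-minute delay", timedelta(minutes=5)),
    ReplicationSetup("active-passive, 24-hour delay", timedelta(hours=24)),
    ReplicationSetup("active-active, synchronous", timedelta(0)),
]

rpo = timedelta(minutes=15)  # assumed target, for illustration only
for setup in setups:
    print(f"{setup.name}: meets 15-minute RPO = {meets_rpo(setup, rpo)}")
```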
5.1.4. Cloud Case Study
Online Tech is one example of a company that made the switch from traditional physical servers to
a cloud environment, a move that resulted in savings in hardware, disaster recovery and more.
Back in 2011, we found our growth was becoming difficult to manage internally.
Mission Critical Hardware, Facilities and Employees
We had two data centers, hundreds of circuits, network devices, racks, cages and private suites
to manage and maintain. We also had thousands of servers and support tickets due to a rapidly
growing client-base, as well as certification and auditing processes to keep up annually (SSAE
16, SOC 2, HIPAA, PCI DSS, SOX) in order to maintain compliance and data security for our
clients.
With employees at five different locations and in two different countries, we needed a scalable
and efficient solution to support our mission critical business components.
Mission Critical Systems
Within our administrative department, Exchange, SharePoint, a file server and a domain
controller support everyday processes. Our marketing department uses production and
development websites to test and implement updates, as well as a load-balanced website to
optimize resources.
For OTPortal, our client and intranet portal, we use Microsoft .NET applications and an MS SQL
database. For OTMobile, which provides mobile access for our engineers, we use a PHP application.
Within our operations department, we use a custom CentOS program to manage the data and
create a MySQL database for our bandwidth management and billing processes.
Operations has thousands of patches to apply each month, as well as firewall and IDS management
consoles, antivirus management, server and cloud backup managers, SAN and NAS
management, and uptime/performance monitoring to maintain. We also have a sandbox for
testing in our lab.
From Physical Servers to a Private Cloud
We consolidated from 23 physical servers (18 Windows, 5 CentOS servers, 4 database servers;
each with 10 percent utilization) to a private cloud. The private cloud consisted of 2 redundant
hardware servers (N+1) and an 8 terabyte SAN. Our high availability (HA) configuration includes
automatic load-balancing across hosts, and automatic failover to a single server.
The private cloud also includes continuous offsite backup, allowing for real-time data
synchronization. We employ a disaster recovery warm site located in Ann Arbor, Michigan, which
gives us a fully tested four-hour recovery time.
Leveraging Our Cloud
When we switched over, we realized several benefits: faster client-support development,
lower total cost of ownership, improved uptime and performance, and significantly
lower energy usage and carbon footprint.
Pace of Development
With the switch to our private cloud, we’ve increased the pace of development. A project that
would typically take two weeks can now be completed within an hour, as we can create new
servers and test concepts using production data.
As a result, our development team can update the client portal, OTPortal, with new releases
every two weeks, implementing new time-saving features much sooner than before.
Total Cost of Ownership (TCO)
We also reduced the total cost of managing our infrastructure. Our old TCO required
management of 26 physical Dell servers with a variety of specifications, BIOS versions, CPU and
memory configurations, and the need for several different spares. In addition, we had to manage
backups and antivirus for 26 machines, each of which also had to be networked and patched.
We also had four Cisco network switches, two racks in the data center, more than a hundred
network cables and half a dozen power strips. It took hours to upgrade disks, and downtime
also contributed to costs, as it was required for memory upgrades.
The cloud TCO consolidated everything into two servers, one SAN, two network switches and two
power strips, and took us from two racks down to a quarter of a rack. Overall, we saved 50
percent on hardware and 90 percent on management costs.
Improved Uptime
Another benefit is improved uptime – always a major benefit when it comes to hosting critical
data for our clients. With N+1 (redundant) hosts, every virtual server we create is protected
from a failed hardware server. The same redundancy with physical servers would have required an
additional 26 servers, adding to cost, time management and energy expenditure.
To guard against SAN failure, we have redundant controllers in our SAN, with RAID array drives
and spare drives on hand. With our high availability power configuration, we are further
protected against downtime.
Initially we considered using a separate server for the database, resulting in a hybrid cloud
configuration, which would have required a cluster of database servers for the same protection.
Instead, we upgraded our entire cloud for less than a new single database server, resulting in
protection against server failure for significantly lower cost than a cluster.
Improved Performance
We also improved our ability to respond to performance issues. Previously with our physical
server setup, it took a few days to get the right RAM/disk/CPU, and we had to schedule
downtime with anywhere from two days to one week of notice.
The actual process included shutting down and removing the server from the rack, opening the
server, installing additional resources and then booting up the server. Then we would have to
test the performance, turn the server back off in order to re-rack it, and then restart the
server, resulting in about two hours of downtime.
When we switched to the cloud, the steps were reduced to: schedule downtime; click to add
more RAM/disk/CPU; reboot server and test performance – a total of five minutes of downtime.
The entire cloud upgraded hardware one host at a time with nearly zero downtime.
Decreasing Energy, Carbon Footprint & Costs
We significantly reduced our energy use and subsequent carbon footprint. When it came to power
consumption, our calculation uses 100 percent uptime at 300 watts/server, a PUE (power usage
effectiveness) of 1.8 and 1.58 lbs. of CO2 per kWh. For 26 physical servers and a network, that
amounts to about 200,000 lbs. of CO2 per year, and twice as much annually for redundancy.
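The figure of about 200,000 lbs. can be sanity-checked from the stated inputs. The sketch below is a back-of-the-envelope calculation under those inputs, not the model actually used for the case study. It covers the 26 servers only and leaves out network equipment, which is why it lands slightly under the quoted total (roughly 194,000 lbs.). The function and constant names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365      # 8,760 hours at 100 percent uptime
WATTS_PER_SERVER = 300
PUE = 1.8                      # power usage effectiveness of the facility
LBS_CO2_PER_KWH = 1.58


def annual_co2_lbs(servers: int) -> float:
    """Annual CO2 in pounds for the given number of servers (no network gear)."""
    it_load_kw = servers * WATTS_PER_SERVER / 1000
    facility_kwh = it_load_kw * PUE * HOURS_PER_YEAR
    return facility_kwh * LBS_CO2_PER_KWH


physical = annual_co2_lbs(26)
print(f"26 physical servers: {physical:,.0f} lbs. of CO2 per year")
print(f"With full redundancy: {2 * physical:,.0f} lbs. of CO2 per year")
```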
The cloud required two physical servers, network and SAN. With a 35 server capacity, we are
producing 31,000 lbs. of CO2 per year – a savings of nearly 170,000 lbs. of CO2 annually.16
Faster Disaster Recovery
With every server we create, we’re assured it is automatically protected, as we have a single
backup process for each host and backups for every virtual server on all hosts. We have a 4-hour
RTO from catastrophic failure: we can fail over to a secondary data center site in less than
4 hours. With the cloud, we are able to test twice a year to ensure the process runs smoothly.
In a virtualized environment, the entire server, including the operating system (OS), apps,
patches and data, is captured on a virtual server that can be backed up to another one of our
data centers and spun up in a matter of minutes.17 This makes testing, full recovery and
failover much faster and more efficient than if we still used physical servers.
16
Online Tech, How the Cloud is Changing the Data Center’s Bad Reputation for Energy Inefficiency;
http://resource.onlinetech.com/how-the-cloud-is-changing-the-data-centers-bad-reputation-for-energy-inefficiency/
5.2. Location for Disaster Recovery
Strategic distance between the primary and secondary sites for disaster recovery is important in
order to avoid natural disasters, ensure data synchronicity, allow for business scalability and
maximize operational efficiency.
5.2.1. Micro-Sufficiency vs. Macro-Efficiency
Micro-sufficiency is the concept in which your core functions or critical pieces of your
infrastructure are centrally located, as well as replicated regionally. In an example of business
model scaling, core departments may be human resources, IT and/or legal located centrally at a
headquarters.
However, each branch of the business in different regions has its own local core departments.
The idea behind this model is that risk is mitigated by dispersing core functions close to each
region and their local customers, and not solely in one location (headquarters). Each branch
also has different strategies to better serve their unique customers as their needs vary from
region to region.
Similarly, with disaster recovery planning, once you identify your critical business processes,
you can distribute those core functions into your various operational units in the event of a
disaster or interruption. Designing with this concept of redundancy and resiliency in your
infrastructure can result in a graceful and safe failover with efficient recovery.
Partnering with a disaster recovery data center provider allows your organization to take
advantage of the risk-mitigating benefits of the micro-sufficiency concept, while avoiding the
costs of building and maintaining your own data center. Instead of installing your own redundant
set of equipment in an alternate facility, colocation or disaster recovery with a partner allows you
to pay for space in a fully staffed, redundant environment.
With macro-efficiency, the concept is economy of scale – the bigger your company, the more
buying power you have and the larger equipment you can buy. In an example of business model
scaling, the core departments make blanket corporate decisions across their regional branches,
regardless of differences in customers and needs. Without recognizing differences and
identifying the workflow of interdependencies between different departments, the model suffers
from lack of organization and inability to identify and recover critical functions.18
17
Data Center Knowledge/Online Tech, How the Cloud Changes Disaster Recovery;
http://www.datacenterknowledge.com/archives/2011/07/26/how-the-cloud-changes-disaster-recovery/
18
Online Tech, Disaster Recovery in Depth; http://www.onlinetech.com/events/disaster-recovery-in-depth/
Micro-sufficiency is the ideal model for disaster recovery and business continuity planning, as it
effectively mitigates risk and presents a better strategy for protecting data through redundant
and resilient design.
5.2.2. Geography Matters
The geographic selection of low natural disaster zones is essential for lowering the risk of critical
IT infrastructure destruction. A large enough distance between your primary and secondary
sites ensures that your secondary site isn’t affected by a potential natural disaster. Read on for
more about specific parameters of secondary disaster recovery sites.
5.2.3. Selection of Second Sites
If your organization or primary site is located in a disaster-prone zone, consider a secondary site
in a landlocked and more temperate region. Compared to coastal regions, the Midwest has low
national averages for significant natural disasters such as floods, tornadoes, hurricanes and
fires that cause mass destruction and may be a threat to your business.
If your organization or primary site is located on the coast or in an earthquake zone, your
secondary site should be located at least 100 miles away.19 Ideally, your secondary site should
be located far enough away to mitigate the risk of it being affected by the same disaster
affecting your primary site.
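A quick way to check a candidate site against the 100-mile guideline is to compute the great-circle distance between the two facilities. The sketch below uses the haversine formula; the coordinates are approximate and purely illustrative (a hypothetical coastal primary site near Miami and a secondary site near Ann Arbor), and the function names are not from any particular tool.

```python
import math

EARTH_RADIUS_MILES = 3958.8
MIN_SEPARATION_MILES = 100  # guideline for coastal or earthquake-zone primaries


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def far_enough(primary, secondary, minimum=MIN_SEPARATION_MILES):
    """True when the two (latitude, longitude) sites meet the minimum separation."""
    return distance_miles(*primary, *secondary) >= minimum


primary_site = (25.76, -80.19)    # approximate, illustrative coordinates
secondary_site = (42.28, -83.74)  # approximate, illustrative coordinates

miles = distance_miles(*primary_site, *secondary_site)
print(f"Separation: {miles:.0f} miles; far enough: {far_enough(primary_site, secondary_site)}")
```

Distance is only one input. Latency for real-time mirroring and travel time for your IT team pull in the opposite direction, so the minimum separation should be weighed against both.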
The design of your secondary site should also be strategic – never locate generators in a
basement or other location that may be difficult to service, or prone to destruction.
Additionally, ensure your secondary data center is located close enough to your primary for
optimal bandwidth and response time, as well as the ability to mirror data in real time. Facilities
should also be easily reached by your IT team in the event of a disaster for faster service and
recovery.
However, your disaster recovery data center should always be located on a different utility
power grid from your primary data center. In the event of a power outage at your primary,
separate power grids help ensure that your secondary site will still be up and running.
5.3. Hardware Protection vs. Data Center Protection
5.3.1. Offsite Backup Options
Sending data offsite ensures a copy of your critical data is available in the event of a disaster at
your primary site, and it is considered a best practice in disaster recovery planning. There are
several offsite data backup media options available, including the traditional tape backup
method that involves periodic copying of data to tape drives that can be done manually or with
software.
However, physical tape backup has its drawbacks, including read or write errors, slow data
retrieval times, and required maintenance windows. With critical business data from medical
records to customer credit card data, your organization can’t afford to risk losing archives or the
ability to completely recover after a disaster.
According to NIST, the different types of data backups include:20
• Full backup – All files on the disk or within the folder are backed up. This can be
time-consuming due to the sheer size of files. According to NIST, maintaining duplicates of
files that don’t change very often, such as system files, can lead to excessive and costly
storage requirements.
• Incremental – Files that were created or changed since the last backup are captured in
an incremental backup. Backup times are shorter and more efficient, but restoring might require
compiling backups from multiple days and media, depending on when files were changed.
• Differential – All files that were created or modified since the last full backup are
captured. If a file is changed after the last full backup, the file will be saved each time
until the next full backup is completed. Backup times are shorter than a full backup, and
restoring requires less media than incremental.
19
CIOUpdate.com, Disaster Recovery Planning;
http://www.cioupdate.com/trends/article.php/3872926/Disaster-Recovery-Planning---How-Far-is-Far-Enough.htm
20
NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency
Planning Guide for Federal Information Systems;
http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)
For more about specific offsite backup technology, read section 5.4 SAN-to-SAN Replication
and SAN Snapshots.
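The difference between the three backup types comes down to which reference point a file's modification time is compared against. The sketch below illustrates that selection logic only; the file names, timestamps and function name are made up, and real backup software tracks changes in its own way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified: float  # modification time, e.g. seconds since the epoch


def select_files(files, kind, last_full, last_backup):
    """Return the paths a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return [f.path for f in files]
    if kind == "incremental":
        return [f.path for f in files if f.modified > last_backup]
    if kind == "differential":
        return [f.path for f in files if f.modified > last_full]
    raise ValueError(f"unknown backup kind: {kind}")


files = [
    FileInfo("system.dll", modified=1),   # unchanged since before the full backup
    FileInfo("orders.db", modified=12),   # changed after the full backup
    FileInfo("report.doc", modified=25),  # changed after the latest incremental
]
last_full, last_backup = 10, 20

for kind in ("full", "incremental", "differential"):
    print(kind, select_files(files, kind, last_full, last_backup))
```

Here the incremental run copies only report.doc, while the differential run copies both orders.db and report.doc. That is why a differential restore needs just the last full backup plus the latest differential.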
Outsource vs. In-Source
Outsourcing your offsite backup to a managed services provider can provide your organization
with continuous data protection and full file-level restoration, and offload the burden of
installing, managing and monitoring backups, as well as performing a complete restoration after
a disaster.
With a vendor, your encrypted server files are sent to an onsite backup manager (primary site)
and then forwarded to a secondary, offsite backup manager, ideally far enough away to reduce
the chances of the secondary site being affected by the same disaster or interruption.
While offsite backup managed in-house can be costly due to building out, maintaining and
upgrading both primary and secondary sites, outsourcing your offsite backup to professionals
means you can take advantage of their investments in capital, technology and expertise.
As NIST (National Institute of Standards and Technology) states, backup media should be stored
offsite or at an alternate site in a secure, environmentally controlled facility.21 An offsite
backup data center should have physical, network and environmental controls to maintain a high
level of security and safety from possible backup damage.
Physical security at a data center means access to client servers is limited to authorized
personnel, and the facility itself should require dual-identification control access (through
the use of a secondary identification device, such as biometric authentication that requires a
fingerprint scan). Environmental controls should include 24x7 monitoring, logged surveillance
cameras and multiple alarm systems.
Any sensitive infrastructure should be protected by restricted access, and redundancy in
routers, switches and paired unified threat management devices should provide network
security for your offsite backup data.
Vendor Selection Criteria
When vetting offsite backup and disaster recovery vendors (also known as disaster recovery as
a service, or DRaaS), check certain criteria to ensure your data is protected. Look for certain
security certifications, compliance, communication styles and technology when comparing
offsite backup providers, as well as the basic disaster recovery criteria of geographic area,
accessibility, security, environment and costs discussed in section 5.2 Location for Disaster
Recovery.
Compliance
One way to gain assurance of an offsite backup/data center provider’s security practices is to
inquire about their industry security and compliance reports.
Vendors that have invested significant time and resources toward building out and meeting
regulatory requirements for operating excellence and security practices will have undergone
independent audits. They should also be able to provide a copy of their audit report under a
non-disclosure agreement (NDA).
Look for these data center audit compliance reports:
• SSAE 16 (Statement on Standards for Attestation Engagements), which replaced SAS
70 (Statement on Auditing Standards), measures controls and processes related to
financial recordkeeping and reporting. A SOC 1 (service organization controls) report
measures and reports on the same controls as an SSAE 16 report.
• A SOC 2 audit is most closely related to reporting on the security, availability
and privacy of the data in your offsite backup and data hosting environment. A SOC 2
report is highly recommended for companies that host or store large amounts of data,
particularly data centers. A SOC 3 report measures the same controls as a SOC 2, yet
has less technical detail, and can be used publicly.
• For specific industries that deal with certain types of data, there exist more stringent sets
of compliance regulations. For the healthcare industry, or any company that touches
protected health information (PHI), HIPAA compliance (Health Insurance Portability and
Accountability Act) is federally mandated to protect health data. If your disaster
recovery/offsite backup data center provider has undergone an independent HIPAA audit
of its facilities and processes, you can be assured your data is secure.
• For e-commerce, retail, franchise and any other company that touches credit cardholder
data (CHD), PCI DSS compliance (Payment Card Industry Data Security Standard) is
the set of requirements designed to protect CHD.
21
NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency
Planning Guide for Federal Information Systems;
http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)
Communication
When there’s an interruption in your service or issue at the data center, you should be able to
count on your disaster recovery provider to promptly communicate with you in order to give your
IT staff or clients proper notification. An updated contact list and tested communication plan
should be key aspects of your disaster recovery and business continuity plan.
A lack of communication can put a company out of business and leave coworkers and
customers in the dark. Designate a primary contact and backup contacts from your company to
be the first to know in the event of a disaster, and assemble a technical team that can
work with your provider if you are outsourcing your disaster recovery solution.
When searching for an offsite backup/data center provider, ask about their communication
policies and processes. Good communication can also give you insight into their level of
transparency into their business operations. See section 5.5.4. Communication Plan Testing for
more about establishing a realistic and effective communication plan between your company
and vendors.
Fully Reserved or First-Come, First-Served?
Does your provider offer fully reserved servers for disaster recovery? Or do they lease a number
of physical servers and resources to be used on a first-come, first-served basis, shared with
other companies?
Providers that offer first-come, first-served resources allow companies to load applications and
attempt to recover operations on “cold” servers – these servers are considered bare metal
servers with no operating system (OS), applications, patches or data. Recovery would take longer
due to the time spent retrieving tape backups and traveling to the secondary site.
Ask your provider if they offer fully reserved servers for complete assurance that your company
will be able to recover your data as quickly as possible, without the chance of being second in
line. In addition, virtualization can eliminate the need to restore from tape or disk, thus
reducing recovery times compared to traditional disaster recovery, in which physical servers
first need to be loaded with the OS and application software, and then patched to the last
configuration used in production, before data restoration.
5.4. SAN-to-SAN Replication
SAN (Storage Area Network)
Due to compliance reasons or due diligence, many companies not only want a backup locally
that they can recover to very quickly, but they also need to get that data offsite in the event that
they experience a site failure. SAN can help with these backup and recovery needs.
SAN Snapshots
A snapshot is a point-in-time reference of data that you can schedule after your database
dumps and transaction logs have finished running. A SAN snapshot gives you a virtual
copy/image of what your database volumes, devices or systems look like at a given time. If you
have an entire server failure, you can very quickly spin up a server, install SQL or do a bare
metal restore, then import all of your data and get your database server back online.
SAN-to-SAN Replication
The counterpart to SAN snapshots is SAN-to-SAN replication (or synchronization). With
replication, if you have a SAN in one data center, you can send data to another SAN in a different
data center location. You can back up very large volumes of data very quickly using SAN
technology, and you can also transfer that data to a secondary location independently of your
snapshot schedule.
This is more efficient because traditional backup windows can take a very long time and impact
the performance of your system. Keeping it all on the SAN allows backups to be done very
fast, and the data copy can run in the background so it doesn’t impact the performance of
your systems.
You can configure and maintain snapshots on both your primary and disaster recovery sites.
For example, you can keep seven days’ worth of snapshots on your primary site and seven
days’ worth of replicated snapshots on your disaster recovery site.
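A seven-day retention window like the one described here amounts to a simple pruning rule. The sketch below is a minimal illustration with made-up nightly timestamps and a hypothetical function name. SAN vendors expose retention through their own management tools, so this is not any product's API.

```python
from datetime import datetime, timedelta


def prune_snapshots(snapshot_times, now, retention=timedelta(days=7)):
    """Split snapshot timestamps into (kept, expired) using the retention window."""
    cutoff = now - retention
    kept = sorted(t for t in snapshot_times if t > cutoff)
    expired = sorted(t for t in snapshot_times if t <= cutoff)
    return kept, expired


now = datetime(2013, 6, 15, 2, 0)
nightly = [now - timedelta(days=d) for d in range(10)]  # ten nightly snapshots

kept, expired = prune_snapshots(nightly, now)
print(len(kept), "kept;", len(expired), "expired")
```

The same rule can be applied independently at the primary and disaster recovery sites, which is what allows each site to carry its own retention period.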
However, SANs are fairly expensive, and snapshots and replication can use a lot of space. You
will also need specialized staff to configure and manage SAN operations.
SAN-based recovery focuses on large volumes of data, and it is more difficult to recover
individual files. Traditional recovery focuses on critical business files for more granular recovery,
but that comes at the cost of speed. With a large volume of data, traditional recovery can be
much slower than SAN-based snapshots.
SAN-to-SAN replication can support a private cloud environment and provide fast recovery
times (RTO of 1 hour and RPO of minutes). After a disaster is mitigated, SAN-to-SAN
replication provides a smooth failback from the secondary site to the production site by
reversing the replication process.
SAN vs. Traditional Backup and Disaster Recovery
Ten or 15 years ago, a typical environment consisted of email servers, FTP/document servers
holding unstructured data, and database servers. The backup and recovery of these systems must
be viewed differently, as they each present their own unique challenges.
Email servers are mission critical, highly transactional and essential to a business.
They may have SQL or custom databases, and they can take a long time to rebuild after a
disaster. The actual install and configuration of the application that sits on top of the database
itself can be very intensive, and rebuilding that system may put you over your recovery time
objective (RTO).
For a smaller company, an Exchange server may be 100 to 200 GB in size. FTP/file servers can
be terabytes in size and contain large volumes of unstructured data. They are less transactional
than email servers, and server configuration could be minimal, but each individual file must be
backed up. For systems of that size, consider moving beyond traditional backups and leveraging
SAN (Storage Area Network) technology, which is a large group of disks.
Instead of having a backup window that runs for an entire day and can slow operations, you can
use SAN snapshot technology, which allows you to back up more efficiently. If you need a
backup of your FTP/file servers every night, you can take a snapshot during off-hours very
quickly, in anywhere from a few seconds to a minute. SAN snapshots can back up a large amount of
data with very little impact on your production environment.
The tradeoff is that it can be slightly harder to restore the data, because you would need to
bring your file drive online and present it to the server. However, it can be faster than having
to restore terabytes of data from a tape backup.
For standalone database servers with a large volume of structured data that are highly
transactional, consider using SAN snapshot technologies with specified volumes for database
dumps and transaction logs.
5.5. Best Practices
5.5.1. Encryption
What is encryption? Encryption takes plaintext (your data) and uses an algorithm to encode it
into scrambled text that is unreadable unless a cryptographic key is used to convert it back.
Encryption helps keep data confidential even if it is accessed by an unauthorized user.
According to NIST (National Institute of Standards and Technology), encryption is most effective
when applied to both the primary data storage device and backup media going to an offsite
location, in case data is lost or stolen on its way or at the site, meaning data in transit and
at rest.22 NIST also recommends keeping a solid cryptographic key management process in order to
allow encrypted data to be read and available as needed (decryption).
According to data security expert Chris Heuman, Certified Information Systems Security
Professional (CISSP), performing a disaster recovery test of encrypted data should be an
important part of your business continuity strategy. Forcing recovery from an encrypted backup
source and forcing a recovery of the encryption key to the recovery device allows organizations
to find out if encryption is effective before a real disaster or breach occurs.
Encryption for HIPAA and PCI Compliance
Encryption is considered a best practice for data security and is recommended for organizations
with sensitive data, such as healthcare or credit card data. It is highly recommended for the
healthcare industry, which must report to the federal Department of Health and Human Services
(HHS) if unencrypted data is exposed, lost, stolen or misused.
The federally mandated HIPAA Security Rule for healthcare organizations handling electronic
protected health information (ePHI) dictates that organizations must:
In accordance with §164.306… Implement a mechanism to encrypt and
decrypt electronic protected health information. (45 CFR §
164.312(a)(2)(iv))
HIPAA also mandates that organizations must:
§164.312(e)(2)(ii): Implement a mechanism to encrypt electronic
protected health information whenever deemed appropriate.
Protecting ePHI at rest and in transit means encrypting not only data collected or processed, but
also data stored or archived as backups.
Organizations that handle credit cardholder data must adhere to PCI DSS standards,
which require encryption only if cardholder data is stored.23
PCI explicitly states:24
22 NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency
Planning Guide for Federal Information Systems;
http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)
23 Chris Heuman CHP, CHSS, CSCS, CISSP, Practice Leader for RISC Management and Consulting,
Encryption – Perspective on Privacy, Security & Compliance;
http://www.onlinetech.com/events/encryption-perspective-on-privacy-security-a-compliance (Webinar)
24 PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures,
Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
3.4 Render PAN (Primary Account Number) unreadable anywhere it is
stored (including on portable digital media, backup media, and in logs)
by using any of the following approaches:
• One-way hashes based on strong cryptography (hash must be
of the entire PAN)
• Truncation (hashing cannot be used to replace the truncated
segment of PAN)
• Index tokens and pads (pads must be securely stored)
• Strong cryptography with associated key-management
processes and procedures
3.4.1.c Verify that cardholder data on removable media is encrypted
wherever stored.
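Two of the approaches quoted above, truncation and one-way hashing of the entire PAN, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names are hypothetical, keeping the first six and last four digits is a commonly used limit rather than something stated in the quoted text, and the use of a keyed HMAC-SHA-256 is one possible choice of "strong cryptography", not the method PCI prescribes. The PAN shown is a well-known test number, not real cardholder data.

```python
import hashlib
import hmac


def _digits_only(pan: str) -> str:
    """Strip spaces and dashes and make sure only digits remain."""
    digits = pan.replace(" ", "").replace("-", "")
    if not digits.isdigit():
        raise ValueError("PAN must contain only digits, spaces or dashes")
    return digits


def truncate_pan(pan: str, keep_first: int = 6, keep_last: int = 4) -> str:
    """Discard the middle digits permanently; '*' marks the removed segment."""
    digits = _digits_only(pan)
    if len(digits) <= keep_first + keep_last:
        raise ValueError("PAN is too short to truncate")
    removed = len(digits) - keep_first - keep_last
    return digits[:keep_first] + "*" * removed + digits[-keep_last:]


def hash_pan(pan: str, secret_key: bytes) -> str:
    """One-way keyed hash of the ENTIRE PAN (assumed choice: HMAC-SHA-256)."""
    digits = _digits_only(pan)
    return hmac.new(secret_key, digits.encode("ascii"), hashlib.sha256).hexdigest()


test_pan = "4111 1111 1111 1111"           # widely used test number
stored_truncated = truncate_pan(test_pan)   # "411111******1111"
stored_hash = hash_pan(test_pan, b"hypothetical-key-from-a-key-manager")
```

The two functions are shown as alternatives. Storing a truncated and a hashed version of the same PAN side by side makes the original easier to reconstruct, so a real system would normally pick one approach per data store and keep any hashing key under the same key-management process as its encryption keys.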
While encryption is treated as addressable under HIPAA and as a requirement under PCI DSS, it is
also considered an industry best practice: no longer just an option, but necessary to protect
backup data at rest and in transit to your disaster recovery/offsite backup site.
For more on encryption from both a technical and compliance perspective, check back to our
White Paper section for our Encryption white paper to be released Fall 2013. Or, watch our
recorded encryption webinar series with IT and data security professional guest speakers as
well as experts from Online Tech in:
• Encryption – Perspective on Privacy, Security & Compliance
• Encryption at the Software Level: Linux and Windows
• Encryption at the Hardware and Storage Level
5.5.2. Network Replication
With a single stand-alone server, cloud-based disaster recovery allows you to ship a copy of
your virtual server image offsite to run on a cloud server in the event of a disaster. However, for
enterprise or more complex server configurations, more than just a server image is required for
recovery. Firewall rules, VLANs, VPNs and the rest of the network configuration must be fully
replicated at the disaster recovery site before the site can go live.
To achieve rapid recovery time objectives (RTOs), the server and network must be fully
replicated at the secondary site, in sync with the production site as changes are made.
Ideally, a cloud-based disaster recovery provider should have control of both the production and
disaster recovery sites to ensure network replication.
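One way to check that the two sites stay in sync is to compare configuration snapshots and report any drift. The sketch below assumes each site's firewall rules, VLANs and VPNs can be exported as simple sets of strings. The `SiteConfig` structure, the function name and the sample values are hypothetical and do not correspond to any particular vendor's tooling.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SiteConfig:
    """Hypothetical snapshot of the network settings named in the text."""
    firewall_rules: frozenset = field(default_factory=frozenset)
    vlans: frozenset = field(default_factory=frozenset)
    vpns: frozenset = field(default_factory=frozenset)


def config_drift(production: SiteConfig, recovery: SiteConfig) -> dict:
    """Return, per category, what is missing from or extra at the recovery site."""
    drift = {}
    for name in ("firewall_rules", "vlans", "vpns"):
        prod_items = getattr(production, name)
        dr_items = getattr(recovery, name)
        missing = prod_items - dr_items   # present in production only
        extra = dr_items - prod_items     # present at the recovery site only
        if missing or extra:
            drift[name] = {"missing": sorted(missing), "extra": sorted(extra)}
    return drift


production = SiteConfig(
    firewall_rules=frozenset({"allow tcp/443 from any", "allow tcp/22 from 10.0.0.0/8"}),
    vlans=frozenset({"vlan10-web", "vlan20-db"}),
    vpns=frozenset({"hq-to-dc"}),
)
recovery = SiteConfig(
    firewall_rules=frozenset({"allow tcp/443 from any"}),
    vlans=frozenset({"vlan10-web", "vlan20-db"}),
    vpns=frozenset({"hq-to-dc"}),
)

drift = config_drift(production, recovery)
```

An empty result means the recovery site mirrors production for the categories checked. Here the SSH rule was added in production but never replicated, which is exactly the kind of gap that would delay a failover.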
5.5.3. Testing
Testing your disaster recovery plan at least annually is a best practice for numerous reasons,
including verifying that the plan actually works and training your team in the process. Testing
also reveals weaknesses or gaps in the process that need to be addressed. According to NIST,
the following areas should be tested:
• Notification procedures
• System recovery on secondary site
• Internal and external connectivity
• System performance with secondary equipment
• Restoration of normal operations
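The five areas above lend themselves to a simple checklist that flags anything that failed or was skipped during a test run. The following is only a sketch: the area names come from the list above, while the function name and the sample results are hypothetical.

```python
# The five test areas listed above.
NIST_TEST_AREAS = (
    "Notification procedures",
    "System recovery on secondary site",
    "Internal and external connectivity",
    "System performance with secondary equipment",
    "Restoration of normal operations",
)


def areas_needing_attention(results: dict) -> list:
    """Return the areas that failed or were never exercised in this test run."""
    return [area for area in NIST_TEST_AREAS if results.get(area) is not True]


# Hypothetical results from an annual test: one failure, one area not tested.
annual_test = {
    "Notification procedures": True,
    "System recovery on secondary site": True,
    "Internal and external connectivity": False,
    "System performance with secondary equipment": True,
}

gaps = areas_needing_attention(annual_test)
```

Treating an untested area the same as a failed one keeps a partial test from being reported as a pass.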
Testing a traditional disaster recovery plan can be time-consuming and costly due to the
retrieval, restoration and system re-configuration required, and conventional plans are
rarely tested through a full failover scenario. With cloud-based disaster recovery, testing is
easier, faster and less disruptive to your production environment and business operations.
Since the cloud offers offsite backup of the entire virtual server in sync with the production site,
there is no need to retrieve tapes to test full recovery.
5.5.4. Communication Plan Testing
Part of your overall disaster recovery and business continuity planning should involve a
well-documented communication plan based on your BIA (Business Impact Analysis).
Mapping out the interdependencies and complexity of your organization can help you identify
who is the proper point of contact for any given critical function. Testing your communication
plan is key to getting everyone on board and working together to achieve a smooth and realistic
recovery.
Determine who is responsible for officially declaring a disaster. From IT to executives, a
communication plan should be in place for business interruption or disaster notification,
followed by a formal declaration. After declaration, a process should be established for notifying
shareholders, employees, customers, vendors and the general public, if necessary.
Aside from notification, a trained disaster recovery IT team should be identified for the
secondary site, as well as for production. If working with a disaster recovery provider, ensure
your contracts and agreements reflect notification and communication policies that clarify the
provider's roles and responsibilities in facilitating recovery.
Someone should be tasked with keeping a well-organized and up-to-date contact list for those
involved in the communication plan, with cell phone and home phone numbers as well as an
alternative email address in the event that corporate email/phone systems are down during a
disaster.
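A contact list like this is easy to audit automatically. The sketch below flags anyone who could not be reached if corporate email and phone systems were down. The `Contact` fields, the function name, the sample entries and the rule applied (at least one personal phone number plus an email address outside the corporate domain) are illustrative assumptions, not requirements taken from any standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    """Hypothetical contact record for the communication plan."""
    name: str
    role: str
    cell_phone: Optional[str] = None
    home_phone: Optional[str] = None
    alternate_email: Optional[str] = None


def unreachable_without_corporate_systems(
    contacts: List[Contact], corporate_domain: str
) -> List[str]:
    """Return names lacking a personal phone or a non-corporate email address."""
    flagged = []
    suffix = "@" + corporate_domain.lower()
    for contact in contacts:
        has_phone = bool(contact.cell_phone or contact.home_phone)
        email = (contact.alternate_email or "").lower()
        has_outside_email = bool(email) and not email.endswith(suffix)
        if not (has_phone and has_outside_email):
            flagged.append(contact.name)
    return flagged


contacts = [
    Contact("A. Admin", "DR team lead", cell_phone="555-0100",
            alternate_email="a.admin@personal.example"),
    # The "alternate" address below is still on the corporate domain.
    Contact("B. Exec", "Declares disaster", cell_phone="555-0101",
            alternate_email="b.exec@corp.example"),
]

flagged = unreachable_without_corporate_systems(contacts, "corp.example")
```

Running a check like this whenever the list is updated helps keep it usable when it is actually needed.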
6.0. Conclusion
Disaster recovery technology advancements have streamlined the process to offer a faster,
more accurate and complete recovery solution. Leveraging the capabilities of a disaster
recovery as a service (DRaaS) provider allows organizations to realize these benefits, including
cost-effective and efficient testing to ensure plan viability.
The time and resource-intensive challenge of managing a secondary disaster recovery site that
both meets stringent industry compliance requirements and protects mission critical data and
applications can be relieved with the right disaster recovery partner.
Here is a high-level overview of what to look for in an offsite backup and disaster recovery
provider and plan (see section 7.1 Questions to Ask Your Disaster Recovery Provider for more
details):
• Strategic location
• Risk of natural disaster
• Recovery time objective (RTO)
• Recovery point objective (RPO)
• Cloud-based disaster recovery
• High availability/redundancy
• Annual testing
• Compliance audits and reports
Contact our disaster recovery and offsite backup experts at Online Tech for more information if
you still have questions about IT disaster recovery planning or our disaster recovery data
centers.
Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020
7.0. References
7.1. Questions to Ask Your Disaster Recovery Provider
When you look to a third party disaster recovery provider, what kind of questions should you ask
to ensure your critical data and applications are safe? Read on for tips on what to look for in a
disaster recovery as a service (DRaaS) solution from your hosting provider.
1. Do you have the following data center certifications: SSAE 16, SOC 1, 2 and 3?
Data center certifications should be up-to-date, backed by an auditor’s report, and
cover all security-related controls. Here is a brief summary of what each one
measures:
• SSAE 16
The Statement on Standards for Attestation Engagements (SSAE) No. 16 replaced SAS
70 in June 2011 – if your current disaster recovery provider only has a SAS 70
certification, keep looking! SSAE 16 has made SAS 70 extinct.
An SSAE 16 audit measures the controls, design and operating effectiveness of data
centers, as relevant to financial reporting. (Note: SSAE 16 does not provide assurance
of controls directly related to data centers/disaster recovery providers).
• SOC 1
The first of three new Service Organization Controls reports developed by the AICPA,
this report measures the controls of a data center as relevant to financial reporting. SOC
1 is essentially the same as SSAE 16 – the purpose of the report is to meet financial
reporting needs of companies that use data hosting services, including disaster
recovery.
• SOC 2
SOC 2 measures controls specifically related to IT and data center service providers,
unlike SOC 1 or SSAE 16. The five areas it covers (the AICPA Trust Services Principles)
are security, availability, processing integrity (ensuring system processing is complete,
accurate and authorized), confidentiality and privacy.
• SOC 3
SOC 3 delivers an auditor’s opinion of SOC 2 components with the additional seal of
approval needed to ensure you are hosting with an audited and compliant data center. A
SOC 3 report is less detailed and technical than a SOC 2 report.
2. What is your recovery time objective and recovery point objective SLA?
Recovery Time Objective (RTO): This refers to the maximum length of time a system can be
down after a failure or disaster before the company is negatively impacted by the downtime.
Recovery Point Objective (RPO): This specifies the point in time to which data must be
recoverable, that is, the maximum age of the most recent usable backup. The RPO determines
the minimum frequency at which backups need to occur, for example every hour or every 5 minutes.
Clarifying the time objectives with your disaster recovery provider can help your organization
plan for the worst and know what to expect, when.
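The arithmetic behind these two objectives can be sketched in a few lines. The function names are hypothetical, and the model is deliberately simplified: it assumes backups complete instantly and ignores transfer time to the offsite location, so real schedules would need some margin.

```python
import math
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a disaster strikes just before the next backup, so the
    data lost equals one full backup interval."""
    return backup_interval <= rpo


def meets_rto(estimated_recovery_time: timedelta, rto: timedelta) -> bool:
    """The estimated time to restore service must fit within the RTO."""
    return estimated_recovery_time <= rto


def min_backups_per_day(rpo: timedelta) -> int:
    """Fewest evenly spaced backups per day that keep data loss within the RPO."""
    return math.ceil(timedelta(days=1) / rpo)


hourly = min_backups_per_day(timedelta(hours=1))         # 24 backups per day
five_minute = min_backups_per_day(timedelta(minutes=5))  # 288 backups per day

# A nightly backup satisfies a 24-hour RPO but not a 1-hour RPO.
nightly_ok_for_24h = meets_rpo(timedelta(hours=24), timedelta(hours=24))
nightly_ok_for_1h = meets_rpo(timedelta(hours=24), timedelta(hours=1))
```

Checks like these make it easier to compare a provider's SLA figures against what the business can actually tolerate.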
3. Where are your disaster recovery data centers located?
Natural disasters can happen at any time, almost anywhere, but you can decrease your odds of
experiencing them by partnering with a disaster recovery provider that has data center
facilities located in a low-risk zone. The Midwest is one region that is relatively free from
major disasters. Read more in High Density of Data Centers Correlate with Disaster Zones;
Michigan Provides Safe Haven.
4. Do you offer cloud-based disaster recovery?
As VMware.com states, “traditional disaster recovery solutions are complex to set up. They
require a secondary site, dedicated infrastructure, and hardware-based replication to move data
to the secondary site.”
With cloud-based disaster recovery, you could achieve a 4-hour RTO and a 24-hour RPO.
Cloud-based disaster recovery replicates the entire hosted cloud (servers, software, network and
security) to an offsite data center, allowing for far faster recovery times than traditional disaster
recovery solutions can offer.
5. How often do you test your disaster recovery systems?
Disaster recovery providers should test at least annually to ensure systems are prepared for an
emergency response whenever a disaster is declared. Testing also allows for a valuable
learning experience – if anything goes wrong, professionals can investigate and remediate
before an actual disaster occurs. It’s also a test run for the personnel involved in managing the
event to ensure the documented communication plan actually works as anticipated.
8.0. Contact Us
Contact our disaster recovery and offsite backup experts at Online Tech for more information if
you still have questions about IT disaster recovery planning or our disaster recovery data
centers.
Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020

High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 

Recently uploaded (20)

办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 

Disaster recovery white_paper

  • 1. Copyright © Online Tech 2013. All Rights Reserved page 1 of 36
Disaster Recovery

Table of Contents

1.0. Executive Summary
2.0. Business Continuity and Disaster Recovery
2.1. Business Drivers
2.2. Compliance Concerns
2.2.1. PCI DSS
2.2.2. HIPAA
3.0. Business Continuity
4.0. Disaster Recovery
4.1. Recovery Point and Time Objectives
4.2. Designing for Recovery
5.0. Technical Implementation Considerations
5.1. Virtualization/Cloud Computing Disaster Recovery
5.1.1. Traditional Disaster Recovery
5.1.2. Active-Passive
5.1.3. Active-Active
5.1.4. Cloud Case Study
5.2. Location for Disaster Recovery
5.2.1. Micro-Sufficiency vs. Macro-Efficiency
5.2.2. Geography Matters
5.2.3. Selection of Second Sites
5.3. Hardware Protection vs. Data Center Protection
5.3.1. Offsite Backup Options
5.4. SAN-to-SAN Replication
5.5. Best Practices
5.5.1. Encryption
5.5.2. Network Replication
5.5.3. Testing
5.5.4. Communication Plan Testing
6.0. Conclusion
7.0. References
7.1. Questions to Ask Your Disaster Recovery Provider
8.0. Contact Us
1.0. Executive Summary

Investing in risk management means investing in business sustainability. Designing a comprehensive business continuity and disaster recovery plan starts with analyzing the impact of a business interruption on revenue.

Mapping out your business model, identifying the key components essential to operations, and developing and testing a strategy to efficiently recover and restore data and systems is an involved, long-term project that may take 12-18 months, depending on the complexity of your organization.

This white paper addresses the high-level business drivers for designing, implementing and testing a business continuity and disaster recovery plan. It makes a case for the investment while discussing the innate challenges, benefits and detriments of different solutions from the perspective of experienced IT and data security professionals.

Speaking directly to different compliance requirements, this paper addresses how to protect sensitive backup data within the parameters of standards set for the healthcare and e-commerce industries.

From there, this paper delves into different disaster recovery and offsite backup technical solutions, from traditional to virtualization (cloud-based disaster recovery), as well as considerations in seeking a disaster recovery as a service (DRaaS) provider. A case study of the switch from physical servers and traditional disaster recovery to a private cloud environment details the differences in cost, uptime, performance and more.

This white paper is ideal for executives and IT decision-makers seeking a primer as well as up-to-date information regarding disaster recovery best practices and specific technology recommendations.

2.0. Business Continuity and Disaster Recovery

Business continuity is the process of analyzing the mission-critical components required to keep your business running in the event of a disaster. It is an overarching plan involving a few steps (see section 3.0 Business Continuity for a detailed description of what each step entails):

  • Business Impact Analysis (BIA)
  • Recovery Strategies
  • Plan Development
  • Testing and Exercises

Creating an IT disaster recovery plan is part of the Plan Development step. As can be seen from the multiple steps within business continuity planning, disaster recovery is only a subset within a larger overarching plan to keep a business running. Disaster recovery requires creating a plan to recover and restore IT infrastructure, including servers, networks, devices, data and connectivity.

2.1. Business Drivers

Why allocate budget toward a business continuity and IT disaster recovery plan? According to a Forrester/Disaster Recovery Journal Business Continuity Preparedness Survey, the top reason is an increased reliance on technology. [1]

Increased Reliance on Technology

An increased reliance on technology can be seen across industries, from retail, which must upgrade to digital transactions and mobile payments, to healthcare, which relies on electronic patient data entry, information exchange and processing as it shifts from paper records to electronic health record systems (EHRs). [2] Ensuring network and power connectivity is essential to support the availability of websites, data and applications critical to business operations and profitability. This is where investing in an IT disaster recovery plan delivers the greatest benefit.

Increased Business Complexity

Another business driver is the increasing complexity of the organization itself. This is most relevant for larger businesses that juggle many vendors, processes and components, all of which are necessary to keep business operations running. With so many factors and individuals in play, a business continuity and IT disaster recovery plan tackles the challenge of coordinating efforts and navigating a complex communication and workflow model in the event of a disaster. The plan must identify and support the complex interdependencies, typically found in a larger organization, that all work to keep the business running.

Increasing Frequency and Intensity of Natural Disasters

The increasing frequency and intensity of natural disasters is also motivation for establishing a plan. Hurricane Sandy, for example, was a largely unanticipated and devastating natural disaster that caused delays, power outages and downed businesses and websites.

Ideally, your disaster recovery data center should be located in a region with low risk of natural disasters. However, Gigaom.com reports that the states with the greatest number of data centers are also the states that experienced the greatest number of FEMA (Federal Emergency Management Agency) disaster declarations, suggesting a change in disaster recovery strategy is in order. Which states were hit the hardest, with the highest concentration of existing data centers? The top three include Texas with 332 disasters and 120 data centers; California with 211 disasters and more than 160 data centers; and New York with 91 disasters and more than 120 data centers. [3]

Source: Gigaom, FEMA, Data Center Map

For more on geography, data centers and disaster recovery, see section 5.2.2. Geography Matters.

[1] Forrester Research and Disaster Recovery Journal, The State of Business Continuity Preparedness; http://www.drj.com/images/surveys_pdf/forrester/2011_Forrester_SOBC.pdf
[2] Online Tech, Risks on the Rise: Making a Case for IT Disaster Recovery; http://resource.onlinetech.com/risks-on-the-rise-making-a-case-for-it-disaster-recovery/

Increased Reliance on Third-Parties

Another business driver is the increased business reliance on third parties (i.e., outsourcing, suppliers, etc.). As one factor in the business complexity of an organization, vendors can also introduce new or increased risks, depending on their internal security policies and practices, as well as their general security awareness. Read more about Administrative Security to find out what to look for in a security-conscious third-party vendor, from audits, reports and policies to staff training.

Increased Regulatory Requirements

Increased regulatory requirements have also shifted attention to the need for disaster recovery. For the e-commerce, retail and franchise industries, the Payment Card Industry Data Security Standards (PCI DSS) require offsite backup and verification of the physical security of the facility in which cardholder data is found. Another requirement explicitly mandates the establishment and testing of an incident response plan in the event of a system breach. [4] (See section 2.2.1 PCI DSS for more.)

The healthcare industry is regulated by the Health Insurance Portability and Accountability Act (HIPAA) and, more specifically, the Health Information Technology for Economic and Clinical Health (HITECH) Act, which addresses privacy and security concerns related to the electronic transmission of health information. Within the Administrative Safeguards of the HIPAA Security Rule standards, a contingency plan is required, comprising a data backup plan, disaster recovery plan, emergency mode operation plan, testing and revision procedures, and applications and data criticality analysis. [5]

Accordingly, failure to meet regulatory requirements can result in federal fines, legal fees, loss of business credibility and other significant consequences, motivating businesses of all sizes to implement a compliant disaster recovery and backup plan.

[3] Gigaom, The States with the Most Data Centers Are Also the Most Disaster-Prone [Maps]; http://gigaom.com/2013/01/10/the-states-with-the-most-data-centers-are-also-the-most-disaster-prone-maps/
[4] PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)

Increased Threat of Cyber Attacks

The last risk factor making a case for disaster recovery is the increased threat of cyber attacks. From attacks on federal agencies to corporate franchises to mobile malware, hackers are frequently developing new methods to gain unauthorized access to systems, or to take down entire systems. A denial-of-service (DoS) attack is one such method: it sends an abnormally high volume of requests or traffic in an attempt to overload servers and bring down networks. While many other technical security tools can be used to prevent, detect and mitigate potential cyber attacks, a comprehensive disaster recovery plan is essential in order to properly recover and restore critical data and applications after an attack.

2.2. Compliance Concerns

As mentioned in the previous section, failure to meet industry compliance and regulatory requirements can result in federal fines, legal fees, loss of business credibility and other significant consequences. With disaster recovery and backup as an integral part of the requirements, it’s important to review what’s at stake, and why, for each industry.

2.2.1. PCI DSS

For companies that deal with credit cardholder data, including e-commerce, retail and franchise businesses, the Payment Card Industry Data Security Standards (PCI DSS) are the official security guidelines set by the major credit card brands.

Among the 12 PCI DSS requirements and their sub-requirements, requirement 12.9.1 dictates: [6]

Create the incident response plan to be implemented in the event of system breach. Ensure the plan addresses the following, at a minimum:

  • Roles, responsibilities, and communication and contact strategies in the event of a compromise including notification of the payment brands, at a minimum
  • Specific incident response procedures
  • Business recovery and continuity procedures
  • Data back-up processes
  • Analysis of legal requirements for reporting compromises
  • Coverage and responses of all critical system components
  • Reference or inclusion of incident response procedures from the payment brands

In addition, PCI DSS requirement 9.5 addresses where backups are stored: [7]

Store media back-ups in a secure location, preferably an off-site facility, such as an alternate or back-up site, or a commercial storage facility. Review the location’s security at least annually.

The auditor testing procedures call for observation of the storage location’s physical security. A PCI compliant data center should have proper physical security, including limited access authorization, dual-identification control access to the facility and servers, and complete environmental control with monitoring, logged surveillance, alarm systems and an alert system. Ideally, if outsourcing your disaster recovery solution, partner only with a disaster recovery provider that allows physical tours and walkthroughs of their facilities.

What else should you look for in a PCI disaster recovery provider?

  • Policies and procedures, process documents, training records, incident response/data breach plans, etc.
  • Proof that all PCI requirements are in place and sufficiently compliant within the scope of their contracts

Read more about the required network and technical security, and high availability infrastructure in PCI Compliant Data Centers. For a complete guide to outsourcing data hosting and disaster recovery solutions, read our PCI Compliant Hosting white paper.

[5] U.S. Dept. of Health and Human Services (HHS), HIPAA Security Series: Security Standards: Organizational, Policies and Procedures and Documentation Requirements; http://www.hhs.gov/ocr/privacy/hipaa/administrative/securityrule/pprequirements.pdf (PDF)
[6] PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
[7] PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)

2.2.2. HIPAA

For companies that deal with protected health information (PHI), including healthcare providers, hospitals, physicians and hospital systems, the Health Insurance Portability and Accountability Act (HIPAA) is the governing legislation, enforced by the U.S. Dept. of Health and Human Services (HHS).

This set of security standards works to protect the availability, confidentiality and integrity of PHI. The availability aspect becomes all the more dependent on the reliability of your IT infrastructure as hospitals and healthcare practices increase their reliance on electronic health record systems (EHRs). Healthcare applications and Software as a Service (SaaS) companies need offsite backup for their data in the event that a production data center experiences a disaster.

The Contingency Plan standard (§ 164.308(a)(7)) of the Administrative Safeguards of the HIPAA Security Rule requires covered entities to:

Establish (and implement as needed) policies and procedures for responding to an emergency or other occurrence (for example, fire, vandalism, system failure, and natural disaster) that damages systems that contain electronic protected health information. [8]

The specifications of the standard include a data backup plan, disaster recovery plan, emergency mode operation plan, testing and revision procedures, and applications and data criticality analysis. Read Components of a HIPAA Compliant IT Contingency Plan for a detailed overview and a customizable IT Contingency Plan template provided by the Dept. of Health and Human Services.

Read more about the required physical, network and technical security, and high availability infrastructure in HIPAA Compliant Data Centers. For a complete guide to outsourcing data hosting and HIPAA disaster recovery solutions, read our HIPAA Compliant Hosting white paper.

[8] U.S. Dept. of Health and Human Services, Administrative Safeguards; http://www.gpo.gov/fdsys/pkg/CFR-2009-title45-vol1/pdf/CFR-2009-title45-vol1-sec164-308.pdf (PDF)

3.0. Business Continuity

A business continuity plan consists of a few steps: [9]

Business Impact Analysis (BIA)

This involves determining the operational and financial impact of a potential disaster or disruption, including loss of sales, credibility, compliance fines, legal fees, PR management, etc. It also includes measuring how the amount of financial and operational damage varies with the time of year.

A risk assessment should be conducted as part of the BIA to determine what kinds of assets are actually at risk, including people, property, critical infrastructure and IT systems, as well as the probability and significance of possible hazards, including natural disasters, fires, mechanical problems, supply failure and cyber attacks. [10]

Mapping out your business model and determining where the interdependencies lie between the different departments and vendors within your company is also part of the BIA. The larger the organization, the more challenging it will be to develop a successful business continuity and disaster recovery plan. Sometimes organizational restructuring and business process or workflow realignment is necessary, not only to create a business continuity/disaster recovery plan, but also to maximize and drive operational efficiency. [11]

Ready.gov/business has a BIA worksheet available [12] to help you document and calculate the operational and financial impact of a potential disaster by matching the timing and duration of an interruption with the loss of sales or income, as well as on a per department, service and process basis.
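
As a rough illustration of the kind of calculation such a worksheet supports, the sketch below multiplies an hourly income figure by the duration of an interruption, with an optional uplift for the busiest time of year. The process names, dollar amounts and the `peak_multiplier` field are hypothetical and are not taken from the Ready.gov worksheet.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    """One revenue-generating process or department (hypothetical fields)."""
    name: str
    income_per_hour: float        # assumed average income tied to this process
    peak_multiplier: float = 1.0  # assumed uplift during the busiest season


def interruption_loss(process: BusinessProcess, hours_down: float,
                      peak_season: bool = False) -> float:
    """Estimate lost income for an interruption of the given duration."""
    rate = process.income_per_hour
    if peak_season:
        rate *= process.peak_multiplier
    return rate * hours_down


if __name__ == "__main__":
    # Illustrative numbers only.
    processes = [
        BusinessProcess("online orders", 1200.0, peak_multiplier=2.5),
        BusinessProcess("phone support", 150.0),
    ]
    for p in processes:
        for hours in (1, 8, 24):
            normal = interruption_loss(p, hours)
            peak = interruption_loss(p, hours, peak_season=True)
            print(f"{p.name}: {hours}h down -> ${normal:,.0f} "
                  f"(peak season ${peak:,.0f})")
```

Running the same estimate per department, service and process makes it easier to see which interruptions would be most expensive and therefore which systems deserve the most recovery budget.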

[9] FEMA (Federal Emergency Management Agency), Business Continuity Plan; http://www.ready.gov/business/implementation/continuity
[10] FEMA (Federal Emergency Management Agency), Risk Assessment; http://www.ready.gov/risk-assessment
[11] Online Tech, Business Continuity in Lean Times (Webinar); http://www.onlinetech.com/events/business-continuity-in-lean-times
[12] Ready.gov, Business Impact Analysis Worksheet; http://www.ready.gov/sites/default/files/documents/files/BusinessImpactAnalysis_Worksheet.pdf

Recovery Strategies

Analyzing your company’s most valuable data, that is, data that directly leads to revenue, is key when determining what you need to back up and restore as part of your information technology (IT) disaster recovery plan. Create an inventory of the documents, databases and systems that are used on a day-to-day basis to generate revenue, and then quantify and match income with those processes as part of your recovery strategy and business impact analysis. [13]

Aside from IT, a recovery strategy also involves personnel, equipment, facilities, a communication strategy and more in order to effectively recover and restore business operations.

Plan Development

Using information derived from the business impact analysis in conjunction with the recovery strategies, establish a plan framework. Documenting an IT disaster recovery plan is part of this stage.

[13] Online Tech, Business Continuity in Lean Times (Webinar); http://www.onlinetech.com/events/business-continuity-in-lean-times

As can be seen from the multiple steps within business continuity planning, disaster recovery is a subset within a larger overarching plan to keep a business running. It involves restoring and recovering IT infrastructure, including servers, networks, devices, data and connectivity (see section 4.0 Disaster Recovery for more).

A data backup plan involves choosing the right hardware and software backup procedures for your company, and scheduling and implementing backups as well as checking and testing them for accuracy (see section 5.3.1. Offsite Backup Options for more).

Testing & Exercises

Develop a testing process to measure the efficiency and effectiveness of your plans, and decide how often to conduct tests. Part of this step involves establishing a training program and conducting training for your company and business continuity team.

Testing allows you to clearly define roles and responsibilities and improve communication within the team, as well as identify any weaknesses in the plans that require attention. This allows you to allocate resources as needed to fill the gaps and build a stronger, more resilient plan. Read section 5.5.3 Testing for more information.
4.0. Disaster Recovery

As an integral part of business continuity plan development, an IT disaster recovery plan is essential to keep businesses running as they increasingly rely on IT infrastructure (networks, servers, systems, databases, devices, connectivity, power, etc.) to collect, process and store mission-critical data.

A disaster recovery plan is designed to restore IT operations at an alternate site after a major system disruption with long-term effects. After successfully transferring systems, the goal is to restore, recover and test the affected systems and put them back in operation.

Your IT infrastructure is, in most cases, the lifeblood of your organization. When websites are down or patient data is unavailable due to hacking, natural disasters, hardware failure or human error, businesses cannot survive. According to FEMA, a recovery strategy should be developed for each component:

• Physical environment in which data/servers are stored – data centers equipped with climate control, fire suppression systems, alarm systems, authorization and access security, etc.
• Hardware – Networks, servers, devices and peripherals.
• Connectivity – Fiber, cable, wireless, etc.
• Software applications – Email, data exchange, project management, electronic healthcare record systems, etc.
• Data and restoration

Identify the critical software applications and data, as well as the hardware required to run them. Additionally, determining your company’s own recovery point and time objectives can prepare you for recovery success by creating guidelines around when data must be recovered.

4.1. Recovery Point and Time Objectives

Recovery Point Objective (RPO)

A recovery point objective (RPO) specifies the point in time to which data must be recoverable in order for business operations to resume; in other words, how much recent data the business can afford to lose. The RPO determines the minimum frequency at which backups need to occur, from every hour to every 5 minutes.[14]

Recovery Time Objective (RTO)

The recovery time objective (RTO) refers to the maximum length of time a system (or computer, network or application) can be down after a failure or disaster before the company is negatively impacted by the downtime. Determining the amount of lost revenue per amount of lost time can help determine which applications and systems are critical to business sustainability.

[14] Online Tech, Seeking a Disaster Recovery Solution? Five Questions to Ask Your DR Provider; http://resource.onlinetech.com/five-questions-to-ask-your-disaster-recovery-provider/
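As a rough illustration of how the two objectives translate into numbers, the sketch below derives the minimum number of backups per day from an RPO, estimates the revenue at risk for an outage, and checks an outage against an RTO. The function names and any sample figures you pass in (for instance a 15-minute RPO or $2,000 of revenue per hour) are hypothetical and are not taken from this paper.

```python
from datetime import timedelta


def backups_per_day(rpo: timedelta) -> int:
    """Minimum number of backups per day needed to honor the RPO."""
    day = timedelta(days=1)
    # Ceiling division: a partial interval still needs its own backup.
    return -(-day // rpo)


def downtime_cost(outage: timedelta, revenue_per_hour: float) -> float:
    """Estimated revenue lost during an outage of the given length."""
    return outage.total_seconds() / 3600 * revenue_per_hour


def meets_rto(outage: timedelta, rto: timedelta) -> bool:
    """True when the outage stayed within the recovery time objective."""
    return outage <= rto
```

Ranking applications by `downtime_cost` for a realistic outage length is one simple way to decide which systems deserve the tightest RPO and RTO.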
For example, if your email server was down for only an hour, yet a large portion of your database was wiped out and you lost 12 hours’ worth of email, how would that impact your business?

4.2. Designing for Recovery

High Availability Infrastructure

Strategic data center design involving high availability and redundancy can help support larger companies that rely on mission-critical (high-impact) applications. High availability is a design approach that takes into account the sum of all the parts, including the application, all the hardware it is running on, the power infrastructure, and the networking behind the hardware.[15]

Using a high availability architecture can reduce the risk of lost revenue and customers in the event of Internet connectivity or power loss. With high availability, you can perform maintenance without downtime, and the failure of a single firewall, switch, or PDU will not affect your availability. With this type of IT design, you can achieve 99.999% availability, meaning you have less than 5.26 minutes of downtime per year.

High availability power means the primary power circuit should be provided by the primary UPS (Uninterruptible Power Supply) and be backed up by the primary generator. A secondary circuit should be provided by the secondary UPS, which is backed up by the secondary generator. This redundant design ensures that the failure of a single UPS or generator will not interrupt power in your environment.

For a high availability data center, you should seek not only a primary and secondary power feed, but also a primary and secondary Internet uplink if you purchase Internet access from the data center provider. Additionally, ensure that any hardware you rely on, such as firewalls or switches, is deployed redundantly. If using managed services and purchasing a server from a data center, ensure all of the hardware is configured for high availability, including dual power supplies and dual NICs (network interface controllers). Ensure the server is also wired back to different switches, and that the switches are dual homed to different access layer routing, so there is no single point of failure anywhere in the environment.

Offsite backup and disaster recovery are still important, as high availability cannot help you recover from a natural disaster such as a flood or hurricane. Disaster recovery comes into play after high availability has completely failed and you must recover to a different geographical location.

[15] Online Tech, Online Tech Expert Interview: What is High Availability?; http://resource.onlinetech.com/michigan-data-center-operator-online-tech-expert-interview-what-is-high-availability/
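The downtime figure quoted for 99.999% availability follows directly from the percentage. A minimal sketch of that arithmetic, assuming a 365-day year (the function and constant names are illustrative only):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def annual_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year permitted by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare a few common availability targets.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {annual_downtime_minutes(target):.2f} minutes/year")
```

Each additional "nine" cuts the allowable downtime by a factor of ten, which is why the redundancy requirements described above grow so quickly.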
Redundant Infrastructure

Redundancy is another factor to consider when it comes to disaster recovery data center design. With a fully redundant data center design, automatic failover can ensure server uptime in the event that one provider experiences connectivity issues. This includes multiple Internet Service Providers (ISPs) and fully redundant Cisco networks with automatic failover. Pooled UPS (Uninterruptible Power Supply), battery and generators can ensure a backup source of power in the event one provider fails.

[Figure: example of Online Tech’s redundant network and data centers]

Cold Site Disaster Recovery

A cold site is little more than an appropriately configured space in a building. Everything required to restore service to your users must be retrieved and delivered to the site before the process of recovery can begin. As you can imagine, the delay going from a cold backup site to full operation can be substantial.

Warm Site Disaster Recovery

A warm site is space leased from a data center provider or disaster recovery provider that already has the power, cooling and network installed. It is also already stocked with hardware similar to that found in your data center, or primary site. To restore service, the last backups from an offsite storage facility are required.

Hot Site Disaster Recovery

A hot site is the most expensive yet fastest way to get your servers back online in the event of an interruption. Hardware and operating systems are kept in sync and in place at a data center provider’s facility in order to quickly restore operations. Real time synchronization between the two sites may be used to completely mirror the data environment of the original site using wide area network links and specialized software. Following a disruption to the original site, the hot site exists so that the organization can relocate with minimal losses to normal operations. Ideally, a hot site will be up and running within a matter of hours or even less.

When you partner with a data center/disaster recovery provider, you’re sharing the cost of the infrastructure, so it’s not as expensive as running an entirely separate secondary data center of your own.
5.0. Technical Implementation Considerations

5.1. Virtualization/Cloud Computing Disaster Recovery

With virtualization, the entire server, including the operating system, applications, patches and data, is encapsulated into a single software bundle or virtual server. This virtual server can be copied or backed up to an offsite data center, and spun up on a virtual host in minutes in the event of a disaster.

Since the virtual server is hardware independent, the operating system, applications, patches and data can be safely and accurately transferred from one data center to a second site without reloading each component of the server. This can reduce recovery times compared to traditional disaster recovery approaches, where servers need to be loaded with the OS and application software, as well as patched to the last configuration used in production, before the data can be restored.

Virtual machines (VMs) can be mirrored, or kept running in sync, at a remote site to ensure failover in the event that the original site should fail. This helps preserve data accuracy when recovering and restoring after an interruption.

Another aspect of cloud-based disaster recovery that improves recovery times drastically is full network replication. Replicating the entire network and security configuration between the production and disaster recovery sites as configuration changes are made saves you the time and trouble of configuring VLANs, firewall rules and VPNs before the disaster recovery site can go live. In order to achieve full replication, your cloud-based disaster recovery provider should manage both the production cloud servers and the disaster recovery cloud servers at both sites.

For warm site disaster recovery, backups of critical servers can be spun up on a shared or private cloud host platform. With SAN-to-SAN replication, hot site disaster recovery becomes more affordable – SAN replication allows not only rapid failover to the secondary site, but also the ability to return to the production site when the disaster is over.

For a case study of a real physical-to-cloud switch from a business enterprise perspective, read section 5.1.4. Cloud Case Study for a detailed comparison of managing physical servers vs. a private cloud environment, including differences in costs, energy use, uptime, performance and development.

5.1.1. Traditional Disaster Recovery
With traditional disaster recovery outsourced to a vendor with a shared infrastructure, after a disaster is declared, the hardware, software and operating system must be configured to match the original affected site. Data is stored on offsite tape backups – after a disaster, the data must be retrieved and restored at the remote site, which has been configured to match the original. It can take anywhere from hours to a few days to recover and restore completely.

If not outsourcing, the traditional disaster recovery method of using a cold site can be very time-consuming and very costly. If you have a disaster recovery infrastructure with preconfigured hardware and software ready at a secondary site (a warm site), this can cut down on the time it takes to recover. However, even with a secondary site, your organization is still dependent on retrieving physical backup tapes for complete restoration.

There is no data synchronization and no failback option available with traditional disaster recovery. The missing step in many traditional disaster recovery plans is how to return to the production site once it has been re-established. Traditional disaster recovery plans are often not fully tested through a full failover disaster scenario due to the time-consuming design of the plan.

5.1.2. Active-Passive

In an active-passive disaster recovery setup, the primary site is designed so that the network fails over to an alternative or secondary site with delayed resiliency. Applications and configurations are replicated with a delay of anywhere from five minutes to 24 hours. The secondary site has reduced capacity hardware, and failback requires a maintenance window.

5.1.3. Active-Active

In an active-active disaster recovery setup, there is synchronous data replication between the primary and secondary sites, with no delayed resiliency. The database spans the two data centers, and the application layer multi-writes. There is equivalent capacity hardware at a secondary data center to ensure full capacity redundancy.

5.1.4. Cloud Case Study

Online Tech is one example of a company making the switch from traditional physical servers to a cloud environment, which resulted in savings in hardware, disaster recovery and more. Back in 2011, we found our growth was becoming difficult to manage internally.

Mission Critical Hardware, Facilities and Employees

We had two data centers, hundreds of circuits, network devices, racks, cages and private suites to manage and maintain. We also had thousands of servers and support tickets due to a rapidly growing client base, as well as certification and auditing processes to keep up with annually (SSAE 16, SOC 2, HIPAA, PCI DSS, SOX) in order to maintain compliance and data security for our clients.
With employees at five different locations and in two different countries, we needed a scalable and efficient solution to support our mission critical business components.

Mission Critical Systems

Within our administrative department, Exchange, SharePoint, a file server, and a domain controller support everyday processes. Our marketing department uses a production and development website to test and implement updates, as well as a load-balanced website to optimize resources.

For OTPortal, our client and intranet portal, we use Microsoft .NET applications and an MS SQL database. For OTMobile (which provides mobile access for our engineers), we use a PHP application. Within our operations department, we use a custom CentOS program to manage the data and create a MySQL database for our bandwidth management and billing processes.

Operations has thousands of patches to apply each month, as well as firewall and IDS management consoles, antivirus management, server and cloud backup managers, SAN and NAS management, and uptime/performance monitoring to maintain. We also have a sandbox for testing in our lab.

From Physical Servers to a Private Cloud

We consolidated from 23 physical servers (18 Windows, 5 CentOS servers, 4 database servers; each with 10 percent utilization) to a private cloud. The private cloud consisted of 2 redundant hardware servers (N+1) and an 8 terabyte SAN. Our high availability (HA) configuration includes automatic load-balancing across hosts, and automatic failover to a single server.

The private cloud also includes continuous offsite backup, allowing for real-time data synchronization. We employ a disaster recovery warm site located in Ann Arbor, Michigan, which gives us a fully tested four-hour recovery time.

Leveraging Our Cloud

When we switched over, we realized several benefits: faster client-support development, lower total cost of ownership, improved uptime and performance, and significantly decreased energy usage and carbon footprint.

Pace of Development

With the switch to our private cloud, we’ve increased the pace of development. A project that would typically take two weeks can now be completed within an hour, as we can create new servers and test concepts using production data. As a result, our development team can update the client portal, OTPortal, with new releases every two weeks, implementing new time-saving features much sooner than before.

Total Cost of Ownership (TCO)
We also reduced the total cost of managing our infrastructure. Our old TCO required management of 26 physical Dell servers with a variety of specifications, versions, BIOS, CPU and memory configurations, and the need for several different spares. In addition, we had 26 machines to back up, protect with antivirus, network and patch. We also had four Cisco network switches, two racks in the data center, more than a hundred network cables and half a dozen power strips. It took hours to upgrade disks, and downtime also contributed to costs, as it was required to upgrade memory.

The cloud TCO consolidated everything into two servers, one SAN, two network switches and two power strips, and took us from two racks down to a quarter of a rack. Overall, we saved 50 percent on hardware and 90 percent on management costs.

Improved Uptime

Another benefit is improved uptime – always a major benefit when it comes to hosting critical data for our clients. With N+1 (redundant) hosts, every virtual server we create is protected from a failed hardware server. Equivalent redundancy with physical servers would have required an additional 26 servers, adding to cost, time management and energy expenditure.

To guard against SAN failure, we have redundant controllers in our SAN, with RAID array drives and spare drives on hand. With our high availability power configuration, we are further protected against downtime.

Initially we considered using a separate server for the database, resulting in a hybrid cloud configuration, which would have required a cluster of database servers for the same protection. Instead, we upgraded our entire cloud for less than the cost of a single new database server, resulting in protection against server failure for significantly lower cost than a cluster.

Improved Performance

We also improved our ability to respond to performance issues. Previously, with our physical server setup, it took a few days to get the right RAM/disk/CPU, and we had to schedule downtime with anywhere from two days to one week of notice. The actual process included shutting down and removing the server from the rack, opening the server, installing additional resources and then booting up the server. Then we would have to test the performance, turn the server back off in order to re-rack it, and then restart the server, resulting in about two hours of downtime.
When we switched to the cloud, the steps were reduced to: schedule downtime; click to add more RAM/disk/CPU; reboot server and test performance – a total of five minutes of downtime. The entire cloud can have its hardware upgraded one host at a time with nearly zero downtime.

Decreasing Energy, Carbon Footprint & Costs

We significantly reduced our energy use and subsequent carbon footprint. When it came to power consumption, for 100 percent uptime at 300 watts/server and a PUE of 1.8, we generated 1.58 lbs. of CO2 per kWh. For 26 physical servers and a network, that amounts to about 200,000 lbs. of CO2 per year, and twice as much annually for redundancy.

The cloud required two physical servers, network and SAN. With a 35 server capacity, we are burning 31,000 lbs. of CO2 per year – a savings of nearly 170,000 lbs. of CO2 annually.[16]

[16] Online Tech, How the Cloud is Changing the Data Center’s Bad Reputation for Energy Inefficiency; http://resource.onlinetech.com/how-the-cloud-is-changing-the-data-centers-bad-reputation-for-energy-inefficiency/

Faster Disaster Recovery

With every server we create, we’re reassured that it is automatically protected, as we have a single backup process for each host and backups for every virtual server on all hosts. We have a 4 hour RTO from catastrophic failure – we can fail over to a secondary data center site in less than 4 hours. With the cloud, we are able to test twice a year to ensure the process runs smoothly.

In a virtualized environment, the entire server, including the operating system (OS), apps, patches and data, is captured on a virtual server that can be backed up to another one of our data centers and spun up in a matter of minutes.[17] This makes testing, full recovery and failover much faster and more efficient than if we still used physical servers.

5.2. Location for Disaster Recovery

Strategic distance between the primary and secondary sites for disaster recovery is important in order to avoid natural disasters, ensure data synchronicity, allow for business scalability and maximize operational efficiency.

5.2.1. Micro-Sufficiency vs. Macro-Efficiency

Micro-sufficiency is the concept in which your core functions or critical pieces of your infrastructure are centrally located, as well as replicated regionally.

In an example of business model scaling, core departments such as human resources, IT and/or legal may be located centrally at a headquarters. However, each branch of the business in different regions has its own local core departments. The idea behind this model is that risk is mitigated by dispersing core functions close to each region and their local customers, and not solely in one location (headquarters). Each branch also has different strategies to better serve its unique customers, as their needs vary from region to region.

Similarly, with disaster recovery planning, once you identify your critical business processes, you can distribute those core functions across your various operational units in preparation for a disaster or interruption. Designing with this concept of redundancy and resiliency in your infrastructure can result in a graceful and safe failover with efficient recovery.

Partnering with a disaster recovery data center provider allows your organization to take advantage of the risk-mitigating benefits of the micro-sufficiency concept, while avoiding the costs of building and maintaining your own data center.
Instead of installing your own redundant set of equipment in an alternate facility, colocation or disaster recovery with a partner allows you to pay for space in a fully staffed, redundant environment.

With macro-efficiency, the concept is economy of scale – the bigger your company, the more buying power you have and the larger the equipment you can buy. In an example of business model scaling, the core departments make blanket corporate decisions across their regional branches, regardless of differences in customers and needs. Without recognizing differences and identifying the workflow of interdependencies between different departments, the model suffers from lack of organization and inability to identify and recover critical functions.[18]

[17] Data Center Knowledge/Online Tech, How the Cloud Changes Disaster Recovery; http://www.datacenterknowledge.com/archives/2011/07/26/how-the-cloud-changes-disaster-recovery/
[18] Online Tech, Disaster Recovery in Depth; http://www.onlinetech.com/events/disaster-recovery-in-depth/
Micro-sufficiency is the ideal model for disaster recovery and business continuity planning, as it effectively mitigates risk and presents a better strategy for protecting data through redundant and resilient design.

5.2.2. Geography Matters

Selecting a geographic location with a low risk of natural disasters is essential for lowering the risk of critical IT infrastructure destruction. A large enough distance between your primary and secondary sites helps ensure that your secondary site isn’t affected by the same natural disaster. Read on for more about specific parameters of secondary disaster recovery sites.

5.2.3. Selection of Second Sites

If your organization or primary site is located in a disaster-prone zone, consider a secondary site in a landlocked and more temperate region. Compared to coastal regions, the Midwest has low national averages for significant natural disasters such as floods, tornadoes, hurricanes and fires that cause mass destruction and may be a threat to your business.
If your organization or primary site is located on the coast or in an earthquake zone, your secondary site should be located at least 100 miles away.[19] Ideally, your secondary site should be located far enough away to mitigate the risk of it being affected by the same disaster affecting your primary site.

The design of your secondary site should also be strategic – never locate generators in a basement or other location that may be difficult to service, or prone to destruction.

Additionally, ensure your secondary data center is located close enough to your primary for optimal bandwidth and response time, as well as the ability to mirror data in real time. Facilities should also be easily reached by your IT team in the event of a disaster for faster service and recovery.

However, your disaster recovery data center should always be located on a separate utility power grid from your primary data center. In the event of a power outage at your primary, separate power grids ensure that your secondary site will still be up and running.

5.3. Hardware Protection vs. Data Center Protection

5.3.1. Offsite Backup Options

Sending data offsite ensures a copy of your critical data is available in the event of a disaster at your primary site, and it is considered a best practice in disaster recovery planning.

There are several offsite data backup media options available, including the traditional tape backup method, which involves periodically copying data to tape drives, either manually or with software. However, physical tape backup has its drawbacks, including read or write errors, slow data retrieval times, and required maintenance windows. With critical business data ranging from medical records to customer credit card data, your organization can’t afford to risk losing archives or the ability to completely recover after a disaster.
According to NIST, the different types of data backups include:[20]

• Full backup – All files on the disk or within the folder are backed up. This can be time-consuming due to the sheer size of files. According to NIST, maintaining duplicates of files that don’t change very often, such as system files, can lead to excessive and costly storage requirements.
• Incremental – Files that were created or changed since the last backup are captured in an incremental backup. Backup times are shorter and more efficient, but a restore might require compiling backups from multiple days and media, depending on when files were changed.
• Differential – All files that were created or modified since the last full backup are captured. If a file is changed after the last full backup, it will be saved in each differential backup until the next full backup is completed. Backup times are shorter than a full backup, and a restore requires less media than with incremental backups.

For more about specific offsite backup technology, read section 5.4 SAN-to-SAN Replication and SAN Snapshots.

[19] CIOUpdate.com, Disaster Recovery Planning; http://www.cioupdate.com/trends/article.php/3872926/Disaster-Recovery-Planning---How-Far-is-Far-Enough.htm
[20] NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency Planning Guide for Federal Information Systems; http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)

Outsource vs. In-Source

Outsourcing your offsite backup to a managed services provider can provide your organization with continuous data protection and full file-level restoration, and offload the burden of installing, managing and monitoring backups, as well as performing a complete restoration after a disaster.

With a vendor, your encrypted server files are sent to an onsite backup manager (primary site) and then forwarded to a secondary, offsite backup manager, ideally far enough away to reduce the chances of the secondary site being affected by the same disaster or interruption.

While offsite backup managed in-house can be costly due to building out, maintaining and upgrading both primary and secondary sites, outsourcing your offsite backup to professionals means you can take advantage of their investments in capital, technology and expertise.
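To make the difference between the full, incremental and differential backup types from section 5.3.1 concrete, the following sketch selects which files each type would copy. It is a simplified model under stated assumptions: the `FileInfo` class, the `select_files` function and the sample file names and times are all hypothetical, and real backup software tracks changes in more sophisticated ways (archive bits, block-level change tracking and so on).

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified: int  # time of last change, in arbitrary units (e.g. hours)


def select_files(files, kind, last_full, last_backup):
    """Return the paths a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        # Everything is copied, changed or not.
        return [f.path for f in files]
    if kind == "incremental":
        # Only files changed since the last backup of any kind.
        return [f.path for f in files if f.modified > last_backup]
    if kind == "differential":
        # Every file changed since the last full backup.
        return [f.path for f in files if f.modified > last_full]
    raise ValueError(f"unknown backup kind: {kind}")


# Hypothetical timeline: full backup at t=10, incremental at t=20.
files = [
    FileInfo("system.cfg", modified=5),   # unchanged since the full backup
    FileInfo("orders.db", modified=15),   # changed between the two backups
    FileInfo("notes.txt", modified=25),   # changed after the incremental
]

for kind in ("full", "incremental", "differential"):
    print(kind, select_files(files, kind, last_full=10, last_backup=20))
```

In this timeline the differential backup re-copies `orders.db` even though the earlier incremental already captured it, which is the storage cost that buys a simpler restore: the last full backup plus one differential, rather than the full backup plus every incremental since.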
As NIST (National Institute of Standards and Technology) states, backup media should be stored offsite or at an alternate site in a secure, environmentally controlled facility.[21] An offsite backup data center should have physical, network and environmental controls to maintain a high level of security and safety from possible backup damage.

Physical security at a data center means access to client servers is limited to authorized personnel, and the facility itself should require dual-identification control access (through the use of a secondary identification device, such as biometric authentication that requires a fingerprint scan). Environmental controls should include 24x7 monitoring, logged surveillance cameras and multiple alarm systems.

Any sensitive infrastructure should be protected by restricted access, and redundancy in routers, switches and paired unified threat management devices should provide network security for your offsite backup data.

Vendor Selection Criteria

When vetting offsite backup and disaster recovery vendors (also known as disaster recovery as a service, or DRaaS), check certain criteria to ensure your data is protected. Look for security certifications, compliance, communication styles and technology when comparing offsite backup providers, as well as the basic disaster recovery criteria of geographic area, accessibility, security, environment and costs discussed in section 5.2 Location for Disaster Recovery.

Compliance

One way to gain assurance of an offsite backup/data center provider’s security practices is to inquire about their industry security and compliance reports. Vendors that have invested significant time and resources toward building out and meeting regulatory requirements for operating excellence and security practices will have undergone independent audits.
They should also be able to provide a copy of their audit report under an NDA (non-disclosure agreement). Look for these data center audit compliance reports:

• SSAE 16 (Statement on Standards for Attestation Engagements No. 16), which replaced SAS 70 (Statement on Auditing Standards No. 70), measures controls and processes related to financial recordkeeping and reporting. A SOC 1 (Service Organization Controls) report measures and reports on the same controls as an SSAE 16 report.
• A SOC 2 audit is most closely related to reporting on the security, availability and privacy of the data in your offsite backup and data hosting environment. A SOC 2 report is highly recommended for companies that host or store large amounts of data, particularly data centers. A SOC 3 report measures the same controls as a SOC 2, yet has less technical detail, and can be used publicly.
• For specific industries that deal with certain types of data, there exist more stringent sets of compliance regulations. For the healthcare industry, or any company that touches protected health information (PHI), HIPAA compliance (Health Insurance Portability and Accountability Act) is federally mandated to protect health data. If your disaster recovery/offsite backup data center provider has undergone an independent HIPAA audit of its facilities and processes, you have greater assurance that your data is protected.
• For e-commerce, retail, franchise and any other company that touches credit cardholder data (CHD), PCI DSS (Payment Card Industry Data Security Standard) is the set of requirements designed to protect CHD.

[21] NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency Planning Guide for Federal Information Systems; http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)

Communication

When there’s an interruption in your service or an issue at the data center, you should be able to count on your disaster recovery provider to promptly communicate with you so that you can give your IT staff or clients proper notification.

An updated contact list and tested communication plan should be key aspects of your disaster recovery and business continuity plan. A lack of communication can put a company out of business and leave coworkers and customers in the dark. Designate a primary contact and backup contacts from your company to be the first to know in the event of a disaster, and assemble a technical team that can work with your provider if you are outsourcing your disaster recovery solution.

When searching for an offsite backup/data center provider, ask about their communication policies and processes. Good communication can also give you insight into how transparent they are about their business operations. See section 5.5.4. Communication Plan Testing for more about establishing a realistic and effective communication plan between your company and vendors.

Fully Reserved or First-Come, First-Served?

Does your provider offer fully reserved servers for disaster recovery? Or do they lease a number of physical servers and resources to be used on a first-come, first-served basis, shared with other companies?

Providers that use the first-come, first-served model allow companies to load applications and attempt to recover operations on “cold” servers – these servers are considered bare metal servers with no operating system (OS), applications, patches or data. Recovery would take longer due to the time spent retrieving tape backups and traveling to the secondary site.

Ask your provider if they offer fully reserved servers for assurance that your company will be able to recover your data as quickly as possible, without the chance of being second in line. In addition, virtualization can eliminate the need to restore from tape or disk, thus reducing recovery times compared to traditional disaster recovery, in which physical servers first need to be loaded with the OS and application software, as well as patched to the last configuration used in production, before data restoration.

5.4. SAN-to-SAN Replication

SAN (Storage Area Network)

Due to compliance reasons or due diligence, many companies not only want a local backup that they can recover from very quickly, but they also need to get that data offsite in the event that they experience a site failure. A SAN can help with these backup and recovery needs.

SAN Snapshots

A snapshot is a point-in-time reference of data that you can schedule after your database dumps and transaction logs have finished running. A SAN snapshot gives you a virtual copy/image of what your database volumes, devices or systems look like at a given time. If you have an entire server failure, you can very quickly spin up a server, install SQL or do a bare metal restore, then import all of your data and get your database server back online.

SAN-to-SAN Replication

The counterpart to SAN snapshots is SAN-to-SAN replication (or synchronization). With replication, if you have a SAN in one data center, you can send data to another SAN in a different data center location. You can back up very large volumes of data very quickly using SAN technology, and you can also transfer that data to a secondary location independently of your snapshot schedule.

This is more efficient because traditional backup windows can take a very long time and impact the performance of your system. Keeping it all on the SAN allows backups to be done very fast, and the data copy can be done in the background so it’s not impacting the performance of your systems.
You can configure and maintain snapshots on both your primary and disaster recovery sites. For example, you can keep seven days’ worth of snapshots on your primary site and seven days of replicated data on your disaster recovery site. However, SANs are fairly expensive, and snapshots and replication can use a lot of space. You will also need specialized staff to configure and manage SAN operations.

SAN-based recovery focuses on large volumes of data, and it is more difficult to recover individual files. Traditional recovery focuses on critical business files for more granular recovery, but that comes at the cost of speed. With a large volume of data, traditional recovery can be much slower than SAN-based snapshots.

SAN-to-SAN replication can support a private cloud environment and provide fast recovery times (RTO of 1 hour and RPO of minutes). After a disaster is mitigated, SAN-to-SAN
replication provides a smooth failback from the secondary site to the production site by reversing the replication process.

SAN vs. Traditional Backup and Disaster Recovery

Traditionally, 10 or 15 years ago, organizations had email servers, FTP/document servers, unstructured data and database servers. The backup and recovery of these systems must be viewed differently, as each presents its own unique challenges.

Email servers are mission critical, highly transactional and essential to a business. They may have SQL or custom databases, and they can take a long time to rebuild after a disaster. Installing and configuring the application that sits on top of the database can be very intensive, and rebuilding that system may put you over your recovery time objective (RTO). For a smaller company, an Exchange server may be 100 to 200 GB in size.

FTP/file servers can be terabytes in size and contain large volumes of unstructured data. They are less transactional than email servers, and server configuration could be minimal. Each individual file must be backed up. When looking at systems of that size, you should stop looking at traditional backups and start leveraging SAN (Storage Area Network) technology, which is a large group of disks. Instead of having a backup window that runs for an entire day and can slow operations, you can use SAN snapshot technology, which allows you to back up more efficiently.

If you need a backup of your FTP/file servers every night, you can take a snapshot during off-hours very quickly, in a matter of seconds to a minute. SAN snapshots can back up a large amount of data with very little impact on your production environment. The tradeoff is that it can be slightly harder to restore the data, because you would need to bring your file drive online and present it to the server.
However, it can be faster than having to restore terabytes of data from a tape backup.

For standalone database servers that are highly transactional and hold a large volume of structured data, consider using SAN snapshot technologies with specified volumes for database dumps and transaction logs.

5.5. Best Practices

5.5.1. Encryption

What is encryption? Encryption takes plaintext (your data) and uses an algorithm to encode it into scrambled text that cannot be read unless a cryptographic key is used to convert it back. Encryption helps keep data confidential even if it is accessed by an unauthorized user.
According to NIST (National Institute of Standards and Technology), encryption is most effective when applied both to the primary data storage device and to backup media going to an offsite location, in case data is lost or stolen on its way or at the site. In other words, it should cover data in transit and at rest.[22] NIST also recommends maintaining a solid cryptographic key management process so that encrypted data can be decrypted and made available as needed.

According to data security expert Chris Heuman, Certified Information Systems Security Professional (CISSP), performing a disaster recovery test of encrypted data should be an important part of your business continuity strategy. Forcing recovery from an encrypted backup source, and forcing recovery of the encryption key to the recovery device, allows organizations to find out whether encryption is effective before a real disaster or breach occurs.

Encryption for HIPAA and PCI Compliance

Encryption is considered a best practice for data security and is recommended for organizations with sensitive data, such as healthcare or credit card data. It is highly recommended for the healthcare industry, which must report to the federal Dept. of Health and Human Services (HHS) if unencrypted data is exposed, lost, stolen or misused.

The federally mandated HIPAA Security Rule for healthcare organizations handling electronic protected health information (ePHI) dictates that organizations must:

In accordance with §164.306… Implement a mechanism to encrypt and decrypt electronic protected health information. (45 CFR § 164.312(a)(2)(iv))

HIPAA also mandates that organizations must:

§164.312(e)(2)(ii): Implement a mechanism to encrypt electronic protected health information whenever deemed appropriate.

Protecting ePHI at rest and in transit means encrypting not only data collected or processed, but also data stored or archived as backups.
Organizations that deal with credit cardholder data must adhere to PCI DSS standards, which require encryption only if cardholder data is stored.[23] PCI explicitly states:[24]

22 NIST (National Institute of Standards and Technology), Special Publication 800-34 Rev. 1 – Contingency Planning Guide for Federal Information Systems; http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf (PDF)
23 Chris Heuman CHP, CHSS, CSCS, CISSP, Practice Leader for RISC Management and Consulting, Encryption – Perspective on Privacy, Security & Compliance; http://www.onlinetech.com/events/encryption-perspective-on-privacy-security-a-compliance (Webinar)
24 PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, Version 2.0; https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf (PDF)
3.4 Render PAN (Primary Account Number) unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches:
• One-way hashes based on strong cryptography (hash must be of the entire PAN)
• Truncation (hashing cannot be used to replace the truncated segment of PAN)
• Index tokens and pads (pads must be securely stored)
• Strong cryptography with associated key-management processes and procedures

3.4.1.c Verify that cardholder data on removable media is encrypted wherever stored.

Encryption is addressable under HIPAA and required under PCI DSS, and it is also considered an industry best practice: no longer just an option, but necessary to protect backup data at rest and in transit to your disaster recovery/offsite backup site.

For more on encryption from both a technical and compliance perspective, check back to our White Paper section for our Encryption white paper to be released Fall 2013. Or, watch our recorded encryption webinar series with IT and data security professional guest speakers as well as experts from Online Tech in:
• Encryption – Perspective on Privacy, Security & Compliance
• Encryption at the Software Level: Linux and Windows
• Encryption at the Hardware and Storage Level

5.5.2. Network Replication

With a single stand-alone server, cloud-based disaster recovery allows you to ship a copy of your virtual server image offsite to run on a cloud server in the event of a disaster. However, for enterprise or more complex server configurations, more than just a server image is required for recovery. Firewall rules, VLANs, VPNs and the rest of the network configuration must be fully replicated at the disaster recovery site before the site can go live. To achieve rapid recovery time objectives (RTOs), the server and network must be kept fully replicated at the secondary site, in sync with the production site as changes are made.
Ideally, a cloud-based disaster recovery provider should have control of both the production and disaster recovery sites to ensure network replication.
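One way to confirm that the secondary site's network still matches production is to compare exported settings from both sites and flag any differences. The sketch below is illustrative only: the dictionary layout, the setting names (`firewall_rules`, `vlans`, `vpn_peers`) and the `find_config_drift` helper are hypothetical, and a real environment would pull this data from its own firewall and switch management tools.

```python
MISSING = "<missing>"  # marker for a setting that one site does not define


def find_config_drift(production: dict, recovery: dict) -> dict:
    """Return settings whose values differ between the two sites.

    Each result maps a setting name to a (production_value, recovery_value)
    pair. An empty result means the two configurations match.
    """
    drift = {}
    for name in sorted(set(production) | set(recovery)):
        prod_value = production.get(name, MISSING)
        dr_value = recovery.get(name, MISSING)
        if prod_value != dr_value:
            drift[name] = (prod_value, dr_value)
    return drift


if __name__ == "__main__":
    # Hypothetical exports from each site; the values are made up.
    production_site = {
        "firewall_rules": ["allow tcp 443", "allow tcp 22"],
        "vlans": [10, 20],
        "vpn_peers": ["198.51.100.7"],
    }
    recovery_site = {
        "firewall_rules": ["allow tcp 443"],
        "vlans": [10, 20],
    }
    for name, (prod, dr) in find_config_drift(production_site, recovery_site).items():
        print(f"{name}: production={prod} recovery={dr}")
```

Running a comparison like this after every production change, rather than only before a test, keeps drift from accumulating unnoticed between disaster recovery exercises.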
5.5.3. Testing

Testing your disaster recovery plan at least annually is a best practice for numerous reasons, including verifying that the plan actually works and training your team in the process. Testing also reveals weaknesses or gaps in the process that need to be addressed. According to NIST, the following areas should be tested:
• Notification procedures
• System recovery on secondary site
• Internal and external connectivity
• System performance with secondary equipment
• Restoration of normal operations

Testing a traditional disaster recovery plan can be time-consuming and costly due to the retrieval, restoration and system re-configuration required, and conventional plans are rarely tested through a full failover scenario.

With cloud-based disaster recovery, testing is easier, faster and less disruptive to your production environment and business operations than with traditional disaster recovery. Since the cloud offers offsite backup of the entire virtual server in sync with the production site, there is no need to retrieve tapes to test full recovery.

5.5.4. Communication Plan Testing

Part of your overall disaster recovery and business continuity planning should involve a well-documented communication plan based on your BIA (Business Impact Analysis). Mapping out the interdependencies and complexity of your organization can help you identify the proper point of contact for any given critical function.

Testing your communication plan is key to getting everyone on board and working together to achieve a smooth and realistic recovery. Determine who is responsible for officially declaring a disaster. From IT to executives, a communication plan should be in place for business interruption or disaster notification, and then a formal declaration.
After declaration, a process should be established for notifying shareholders, employees, customers, vendors and the general public, if necessary. Aside from notification, a trained disaster recovery IT team should be identified for the secondary site, as well as for production. If working with a disaster recovery provider, ensure your contracts and agreements reflect notification and communication policies to clarify their roles and responsibilities involved in facilitating recovery. Someone should be tasked with keeping a well-organized and up-to-date contact list for those involved in the communication plan, with cell phone and home phone numbers as well as an
alternative email address in the event that corporate email/phone systems are down during a disaster.
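A contact list is only useful if every entry can actually be reached when corporate systems are down. As a rough sketch of how that could be checked automatically, the code below flags entries that lack a cell phone, home phone or non-corporate email address. The `Contact` fields and the `contact_list_gaps` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """One entry in the disaster communication contact list (hypothetical layout)."""
    name: str
    role: str
    cell_phone: str
    home_phone: str
    alternate_email: str


def contact_list_gaps(contacts: list, corporate_domain: str) -> dict:
    """Map each contact's name to a list of problems; complete entries are omitted."""
    gaps = {}
    corporate_suffix = "@" + corporate_domain.lower()
    for contact in contacts:
        problems = []
        if not contact.cell_phone.strip():
            problems.append("missing cell phone")
        if not contact.home_phone.strip():
            problems.append("missing home phone")
        email = contact.alternate_email.strip().lower()
        if not email:
            problems.append("missing alternate email")
        elif email.endswith(corporate_suffix):
            # An address on the corporate mail system is unreachable if it is down.
            problems.append("alternate email uses corporate domain")
        if problems:
            gaps[contact.name] = problems
    return gaps
```

A check like this could run whenever the list is updated, so that gaps surface during routine maintenance rather than during a declared disaster.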
6.0. Conclusion

Disaster recovery technology advancements have streamlined the process to offer a faster, more accurate and complete recovery solution. Leveraging the capabilities of a disaster recovery as a service (DRaaS) provider allows organizations to realize these benefits, including cost-effective and efficient testing to ensure plan viability.

The time- and resource-intensive challenge of managing a secondary disaster recovery site that both meets stringent industry compliance requirements and protects mission critical data and applications can be relieved with the right disaster recovery partner.

Here is a high-level overview of what to look for in an offsite backup and disaster recovery provider and plan (see section 7.1 Questions to Ask Your Disaster Recovery Provider for more details):
• Strategic location
• Risk of natural disaster
• Recovery time objective (RTO)
• Recovery point objective (RPO)
• Cloud-based disaster recovery
• High availability/redundancy
• Annual testing
• Compliance audits and reports

Contact our disaster recovery and offsite backup experts at Online Tech for more information if you still have questions about IT disaster recovery planning or our disaster recovery data centers.

Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020
7.0. References

7.1. Questions to Ask Your Disaster Recovery Provider

When you look to a third party disaster recovery provider, what kind of questions should you ask to ensure your critical data and applications are safe? Read on for tips on what to look for in a disaster recovery as a service (DRaaS) solution from your hosting provider.

1. Do you have the following data center certifications: SSAE 16, SOC 1, 2 and 3?

Data center certifications should be up-to-date, backed up by an auditor’s report, and comprehensive of all security-related controls. Here’s a brief summary of what each one measures:

• SSAE 16
The Statement on Standards for Attestation Engagements (SSAE) No. 16 replaced SAS 70 in June 2011. If your current disaster recovery provider only has a SAS 70 certification, keep looking! SSAE 16 has made SAS 70 extinct. An SSAE 16 audit measures the controls, design and operating effectiveness of data centers, as relevant to financial reporting. (Note: SSAE 16 does not provide assurance of controls directly related to data centers/disaster recovery providers).

• SOC 1
The first of three new Service Organization Controls reports developed by the AICPA, this report measures the controls of a data center as relevant to financial reporting. SOC 1 is essentially the same as SSAE 16: the purpose of the report is to meet the financial reporting needs of companies that use data hosting services, including disaster recovery.

• SOC 2
SOC 2 measures controls specifically related to IT and data center service providers, unlike SOC 1 or SSAE 16. The five control areas are security, availability, processing integrity (ensuring system accuracy, completeness and authorization), confidentiality and privacy.

• SOC 3
SOC 3 delivers an auditor’s opinion of SOC 2 components with the additional seal of approval needed to ensure you are hosting with an audited and compliant data center.
A SOC 3 report is less detailed and technical than a SOC 2 report.

2. What is your recovery time objective and recovery point objective SLA?

Recovery Time Objective (RTO): This refers to the maximum length of time a system can be down after a failure or disaster before the company is negatively impacted by the downtime.
Recovery Point Objective (RPO): This specifies the point in time to which data must be recoverable, which in turn sets how recent the latest backup must be. The RPO determines the minimum frequency at which interval backups need to occur, from every hour to every 5 minutes.

Clarifying the time objectives with your disaster recovery provider can help your organization plan for the worst and know what to expect, and when.

3. Where are your disaster recovery data centers located?

Natural disasters can happen at any time, almost anywhere, but you can decrease your odds of experiencing them by choosing to partner with a disaster recovery provider that has data center facilities located in a region with low natural disaster risk. The Midwest is one region that is relatively free from major disasters. Read more in High Density of Data Centers Correlate with Disaster Zones; Michigan Provides Safe Haven.

4. Do you offer cloud-based disaster recovery?

As VMware.com states, “traditional disaster recovery solutions are complex to set up. They require a secondary site, dedicated infrastructure, and hardware-based replication to move data to the secondary site.”

With cloud-based disaster recovery, you could achieve a 4-hour RTO and 24-hour RPO. Cloud-based disaster recovery replicates the entire hosted cloud (servers, software, network and security) to an offsite data center, allowing for far faster recovery times than traditional disaster recovery solutions can offer.

5. How often do you test your disaster recovery systems?

Disaster recovery providers should test at least annually to ensure systems are prepared for an emergency response whenever a disaster is declared. Testing also provides a valuable learning experience: if anything goes wrong, professionals can investigate and remediate before an actual disaster occurs. It is also a test run for the personnel involved in managing the event, to ensure the documented communication plan actually works as anticipated.

8.0. Contact Us

Contact our disaster recovery and offsite backup experts at Online Tech for more information if you still have questions about IT disaster recovery planning or our disaster recovery data centers.

Visit: www.onlinetech.com
Email: contactus@onlinetech.com
Call: 734.213.2020