1. Rethink Your Disaster Recovery
JOESEL BIYOYO
Consulting Engineer
Total RISC Technology Singapore Pte Ltd
Version 1.3
2. The Need for Disaster Recovery (DR)
The Traditional Approach
PlateSpin Forge Product
TRT Business Recovery as a Service
3. The Need for IT Disaster Recovery
of enterprises have declared a disaster or
experienced a major business disruption*
of enterprises have indicated that improving
disaster recovery capabilities was critical**
Power Failure 42% - Natural Disaster 33% - IT Hardware Failure 31%
Cost of Downtime 56%; Improving Mission Critical Availability 52%;
Requirement to Stay Online 24/7 48%; Increased Risk 44%
4. The Traditional Approach
• Low Cost
• Focus is on protecting data
• Need for backup software, tape
devices/library.
• Tapes are stored onsite and
sent offsite
• Poor Performance
• Prone to errors (operator, etc.)
1. Backup to Tape/Image/Disk
5. 2. Redundant Infrastructure
• High costs
• Focus is on protecting application
• In the form of clustering or high-end replication
• High Performance
• Duplicate or like-for-like
infrastructure on prod and DR
sites (hardware and software)
• Management complexity
6. Two Traditional Approaches
DR by Duplication
DR by Backup
Server/Application Protection
• Local cluster
• Mirrored hot site
Data Protection
• Tape Back-up
• Imaging
High Performance
• Near Zero RTO, RPO
Poor Performance
• Slow RTO, RPO (days)
High Cost
• Duplicate infrastructure, and
management
Low Cost (not practical)
• Days to rebuild
Solution Focus
• Highest Performance
• Lowest Risk
Solution Focus
• Cost
DR by Virtualization
Workload Protection
• Virtualization
Medium Performance
• 30min-8hrs RPO, RTO
Medium Cost
• Many-to-one
• Fewer servers
Solution Focus
• Medium Performance
• Lower Risk
7. Disaster Recovery by Virtualization
Local or
Wide Area Network
Consolidated Virtual Host
Physical or Virtual Workloads
Organizations can protect workloads across geographically dispersed sites and
rapidly recover in the event of server downtime or site disaster, without having to
invest in costly duplicate hardware or redundant operating system and software
licenses.
9. Define Your Objectives
• How quickly must business processes be
restored after declaring DR?
• What amount of data loss is acceptable?
• How do we get the system back into a
useable state?
12. Criticality Levels
Level 1: Business- and mission-critical data that impacts business operations if down. Required: RPO = 0 hr; RTO = 0-1 hr
Level 2: Critical and important data that partially impacts business operations. Required: RPO = 1-4 hr; RTO = 1-4 hr
Level 3: Less-critical but important data that does not impact main business operations but may affect internal operations. Required: RPO = 4-8 hr; RTO = 4-8 hr
Level 4: Non-critical data. Required: RPO > 8 hr; RTO > 8 hr
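As a rough illustration, the criticality levels can be expressed as a lookup from a workload's required RPO and RTO. This is only a sketch: the function names are hypothetical, the upper bound of each range is treated as inclusive, and an RPO between 0 and 1 hour (which the slide leaves unspecified) is assumed to fall into Level 2. The stricter of the two requirements decides the level.

```python
def level_for_rpo(rpo_hours: float) -> int:
    """Map a required RPO (hours) to a criticality level."""
    if rpo_hours == 0:
        return 1
    if rpo_hours <= 4:   # assumption: 0 < RPO <= 4 hr counts as Level 2
        return 2
    if rpo_hours <= 8:
        return 3
    return 4


def level_for_rto(rto_hours: float) -> int:
    """Map a required RTO (hours) to a criticality level."""
    if rto_hours <= 1:
        return 1
    if rto_hours <= 4:
        return 2
    if rto_hours <= 8:
        return 3
    return 4


def criticality_level(rpo_hours: float, rto_hours: float) -> int:
    """Return the more critical (lower-numbered) of the two levels."""
    if rpo_hours < 0 or rto_hours < 0:
        raise ValueError("RPO and RTO must be non-negative")
    return min(level_for_rpo(rpo_hours), level_for_rto(rto_hours))
```

For example, a workload that can lose up to 6 hours of data but must be back within 2 hours would be treated as Level 2 under these assumptions.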
14. PlateSpin Forge
Protects up to 40 workloads
PlateSpin Forge Includes:
• Storage
• Replication software
• Hypervisor
Plug-in and protect solution for:
• Medium enterprises
• Branch use for large enterprises
World’s first disaster recovery hardware appliance with virtualization
16. Workload Protection
• Backing up entire server workloads - the contents of a server,
including the operating system, applications and data.
• Recovering workloads during an outage
• Restoring workloads to their original production locations after the
outage.
17. Workload Portability and Consolidation
Detach workloads from their native hardware configurations
and move a server’s entire software stack to any physical or
virtual host, or image archive.
Reduced Infrastructure, Reduced Cost
PlateSpin Forge
Server
18. Current Product Expectation
• Does not protect Unix-based operating systems such as AIX,
Solaris and HP-UX.
• Supports Windows 7, 8, 8.1, Windows Server 2003 SP1 or later
• Supports Linux such as RHEL, CentOS, Debian, SUSE, etc.
• PlateSpin Protect and Forge do not support FAT32 partitions.
Only the NTFS file system is supported on any supported Windows
system.
• Prefer not to protect Active Directory (AD); use native AD replication to the DR site
22. TRT Business Continuity
Business Recovery as a Service
(BRaaS)
23. 1. Operational Expenditure (OpEx)
Backup and Disaster Recovery Solution as a Service. TRT will host your Disaster Recovery site and will have full
ownership of the solution. TRT will perform a minimum of 2 DR exercises per annum and provide monthly reports.
WAN
Production Site
TRT Data Center as Consolidated DR Site
PlateSpin
25. Services Include:
• Daily backup of data (incremental replication) from remote locations
• Restoration of business processes (server up and running at the DR site) within
0-4 hours of declaring Disaster Recovery
• Restoration of the whole workload to the original server or to another server within
12 hours of a failure (depending on current bandwidth and data capacity).
An additional fee applies if a shorter restoration time is required: burstable bandwidth billing
(pay per use).
• 2 DR tests per annum, with a minimum of 7 days' notice; otherwise a Recovery Restoration Fee (RRF) applies
• Monthly reports confirming successful backups, etc.
Optional TRT Premium Services:
• Recovery Restoration Fee (RRF)
• Additional DR Test Fee for each additional test beyond the two annual DR tests
26. 2. Capital Expenditure (CapEx)
Purchase of the PlateSpin Forge Backup and Disaster Recovery Solution. The customer will host its own DR site and will have full ownership of
the solution. TRT will implement and support the solution and provide product knowledge transfer to IT personnel.
WAN
Production Site
Customer’s DR Site
PlateSpin
Tape backups are the workhorses of most disaster recovery plans. Organizations use magnetic tape to store duplicate copies of hard disk files, not whole server workloads. They typically copy server and desktop-based files to the tapes using an automated backup utility that updates on a periodic schedule, typically overnight. Many organizations use magnetic tape in combination with magnetic disks and optical disks in a backup management program that automatically moves data from one storage medium to another. They usually store tape archives offsite for recovery purposes; a third-party provider may pick up and store these backup tapes. Because of its low cost per gigabyte, tape backup is the most economically prudent recovery alternative; however, backup utilities and processes can be difficult to administer, as can the logistics of transporting, storing and retrieving tape archives in the event of an outage. It can take hours to restore a system from a backup tape, and days if multiple systems are involved. You must manually rebuild systems (reinstall the operating systems, applications and patch levels) before you can restore the application data.
Server clustering generally refers to multiple servers that are linked together to provide fault tolerance and load balancing, which distributes workloads over multiple systems. Clustering achieves near-zero recovery time and recovery point objectives, but at a very high cost. Because it can be prohibitively expensive and complicated to implement and maintain, clustering is typically a viable disaster recovery option for only the most mission-critical server environments.
There are two common approaches available in the market today when implementing a business continuity solution: disaster recovery through duplication and disaster recovery through backup.
Let's take a closer look at these two solutions to see how they compare.
Duplicating the entire datacenter infrastructure is used primarily for application protection. Examples are server clustering technology and remote mirrored hot sites, which ensure near real-time recovery and availability.
Disaster recovery by backup is the more traditional approach, primarily focused on data protection. Magnetic tapes and imaging technologies are used to store data either locally or offsite in the event of downtime.
Duplication provides exceptional performance, with near-zero "recovery time objectives" (which measure downtime) and "recovery point objectives" (which measure data loss). These are two very important metrics when gauging the effectiveness of a high availability solution. The problem is that high performance comes at a high price, due to the redundant hardware and software it requires.
Backups deliver poor performance, with extremely poor "recovery time objectives" and "recovery point objectives"; in some cases it takes up to a week to recover and rebuild lost data onto new bare-metal servers.
Considering the three main challenge areas data centers face today (cost, performance and risk), high-cost duplication easily addresses performance and risk but falls far short on cost.
Backup technologies are a price leader but fall short on performance and risk. Backup is simply not a practical and efficient way to protect the most important pieces of your customer's business.
How long can we afford downtime?
PlateSpin coverage is Criticality 2 and 3
PlateSpin Protect and Forge do not support FAT32 partitions. PlateSpin Forge supports only the NTFS file system on any supported Windows system.
Prefer not to protect Active Directory (AD); use native AD replication to the DR site.
Windows Network Load Balancing (NLB) support requires additional manual configuration.
Windows Cluster support requires specific steps and is currently MBR only; GPT disks are not cluster-aware.
Total RTO can only be determined through a full test failover.
Requirement:
Good WAN connection
WAN bandwidth needed for disaster recovery is the amount of data that needs to be moved divided by the time available to move it
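As a rough illustration of the bandwidth rule above, the sketch below divides the data to be moved by the time available. The function name `required_bandwidth_mbps`, the use of decimal gigabytes, and the 50 GB in 8 hours figures are assumptions for illustration only; real links also lose some throughput to protocol overhead, which this ignores.

```python
def required_bandwidth_mbps(data_gb: float, window_hours: float) -> float:
    """Sustained throughput (megabits per second) needed to move
    data_gb gigabytes within window_hours hours."""
    if data_gb < 0:
        raise ValueError("data_gb must be non-negative")
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits_to_move = data_gb * 8 * 1000**3   # decimal gigabytes to bits
    seconds_available = window_hours * 3600
    return bits_to_move / seconds_available / 1_000_000


# Hypothetical example: 50 GB of daily changes replicated in an 8-hour window
print(f"{required_bandwidth_mbps(50, 8):.1f} Mbps")
```

Halving the replication window doubles the bandwidth needed, which is why the restoration time commitments depend on the link and the amount of data.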
By leveraging profiling and portability, PlateSpin is able to deliver a unique solution we call "Consolidated Recovery".
Consolidated Recovery leverages our ability to inventory and monitor a workload environment to plan out a feasible recovery plan.
We take this further by using portability to replicate these workloads into offline virtual machine archives, which offer the flexibility of one-click test restore and failover.
This approach to DR allows organizations to drastically reduce total cost of ownership and recovery time objective while achieving whole-workload protection.
We are able to offer very compelling recovery point objectives by leveraging our incremental synchronization, and to simplify testing by having easily bootable backup archives.