VMware - VMUG Montreal

VMware for Business Continuity
What’s new with SRM 5

Vadim Shvarts
Sr. Systems Engineer
VMware Canada
vshvarts@vmware.com

© 2009 VMware Inc. All rights reserved

Introduction: VMware for Business Continuity

2 Confidential

Disasters Happen. Do You Need Protection?

43% of companies experiencing disasters never
re-open, and 29% close within two years.
(McGladrey and Pullen)

93% of businesses that lost their data center for
10 days went bankrupt within one year.
(National Archives & Records Administration)

40% of all companies that experience a major
disaster will go out of business if they cannot
gain access to their data within 24 hours.
(Gartner)

Top executives say 10 hours to recovery;
IT managers say up to 30 hours.
(Harris Interactive)


Business-Critical Applications Require Business Continuity

% of Application Instances Running on VMware in Customer Base

[Bar chart: share of instances virtualized in 2010 and 2011 for MS Exchange, MS SharePoint, MS SQL, Oracle Middleware, Oracle DB, and SAP; bar labels range from 18% to 67%]

Source: VMware customer survey, Jan 2010 and April 2011 interim results,
Data: Total number of instances of that workload deployed in your organization and the percentage of those instances that are virtualized

Availability Expectations on vSphere Continue to Increase
RTOs decreasing from >24 hours to <12 hours


Drawbacks Of Traditional Business Continuity Solutions

Application-level availability silos: complex and expensive requirements
• Examples shown: Oracle RAC, MS Clustering, DB Access Groups, App Server Cluster, Oracle DataGuard, DB Mirroring, CCR / SCR, Session State Replication (Middleware / Java)

Shared availability services (backup, data replication): longer RTOs and RPOs

[Diagram columns: Local Availability, Data Protection, Disaster Recovery]

Improving Business Continuity At All Levels

[Diagram: vSphere hosts at the Local Site and the Failover Site]

Local Availability
• vSphere High Availability (improved in 2011)
• vSphere Fault Tolerance
• vMotion and Storage vMotion

Disaster Recovery
• vCenter Site Recovery Manager (improved in 2011)
• Includes vSphere Replication (new in 2011)

Data Protection
• vSphere Data Recovery (improved in 2011)
• Storage APIs for Data Protection

Transforming Cost And Complexity Of Business Continuity

[Chart: RTO / RPO (Days, Hours, Minutes, Continuous) plotted against cost ($ per app: $100, $1,000, $10,000)]

Solutions plotted:
• App-level availability (Oracle RAC, MSCS, …)
• VMware business continuity (HA, FT, vMotion, SRM, VDR)
• Shared availability services (traditional backup, replication)

VMware business continuity compared with app-level availability solutions:
• Similar RTOs
• Much lower cost / complexity

VMware business continuity compared with traditional backup and replication:
• Much better RTOs
• Similar or lower cost

Better Business Continuity Is #1 Objective For Virtualization

Top Five Objectives for Virtualization
• Use virtualization to improve Business Continuity and Disaster Recovery (BCDR): 46%
• Improve virtual machine performance: 33%
• Increase the server consolidation ratio: 32%
• Improve VM environment management: 31%
• More mission-critical applications: 24%

Source: WW VMware customer survey, January 2010
N=1083


Simple and Reliable DR with vSphere
and SRM


Challenges of Traditional Disaster Recovery

Expensive
• Software, hosts, storage, facilities
• >$10K per app

Complex Recovery Plans

Unreliable Failovers
• Apps, hosts, storage, network (each marked "?" in the diagram)

Failure to meet business requirements
• Long RTOs: days to weeks
• Too much time and resources consumed

vSphere Provides The Best Foundation For Disaster Recovery

Cost-Efficient Infrastructure (Consolidation)
• Reduced hardware requirements at recovery site
• Use recovery hardware to run low-priority apps

Flexible Infrastructure (Hardware Independence)
• Eliminate need for identical hardware across sites
• Enable waterfalling of equipment to recovery site

Simple Application Protection (Encapsulation)
• Entire system, including application, OS, and data, is stored as virtual machine files
• Entire system can be protected with data protection tools

Encapsulation Simplifies Application Protection And Recovery

Physical: 40+ Hrs.
• Steps: configure hardware, install OS, configure OS, install backup agent, start “single-step automatic recovery”

Virtual: < 4 Hrs.
• Steps: restore VM, power on VM

Simplify recovery
• No operating system re-install or bare-metal recovery
• No time spent reconfiguring hardware

Standardize recovery process
• Consistent process independent of applications, operating systems and hardware

vCenter Site Recovery Manager Ensures Simple, Reliable DR

Site Recovery Manager complements vSphere to provide the simplest and most reliable disaster protection and site migration for all applications

Provide cost-efficient replication of applications to failover site
• Built-in vSphere Replication
• Broad support for storage-based replication

Simplify management of recovery and migration plans
• Replace manual runbooks with centralized recovery plans
• From weeks to minutes to set up new plan

Automate failover and migration processes for reliable recovery
• Enable frequent non-disruptive testing
• Ensure fast, automated failover
• Automate failback processes

[Diagram: Site A (Primary) and Site B (Recovery), each with VMware vCenter Server, Site Recovery Manager, VMware vSphere, and servers]

SRM Momentum

Introduced in Q2 2008
125,000+ units sold
5,000+ customers
50% annual growth in 2010

“If your organization is already taking advantage of virtualization,
then adding Site Recovery Manager to handle disaster recovery
is a no-brainer.”
― Jerry Wilkin
Senior Systems Administrator, Dayton Superior Corp


Key Components Of SRM 5

Site Recovery Manager
• Manages recovery plans
• Automates failovers and failbacks
• Tightly integrated with vCenter and replication

Choice of Replication Options

vSphere Replication
• Bundled with SRM
• Replicates virtual machines between vSphere clusters

Storage-Based Replication (3rd party)
• Provided by replication vendor
• Integrated via replication adapters created, certified and supported by replication vendor

[Diagram: vCenter Server, Site Recovery Manager, vSphere, and storage, required at both protected and recovery sites]

Site Recovery Manager Complements vSphere For DR

[Comparison: Traditional DR vs. VMware]

vSphere Functionality
• Consolidation to reduce costs
• Hardware independence at failover site
• Encapsulation for simple recovery of entire systems

SRM Functionality
• vSphere Replication
• Simple management of recovery and migration plans
• Automated DR failover and non-disruptive testing
• Streamline planned migrations and automated failback

SRM Provides Broad Application Coverage

RTO: 30 minutes to hours
RPO: Flexible based on storage replication

[Chart: RTO (Days, Hours, Continuous) against RPO (Days, Hours, Synchronous) for Tier 1, Tier 2, and Tier 3 applications; regions labeled "App-level geo-clustering / load balancing" and "Site Recovery Manager"]

SRM Supports Flexible Topologies

Active-Passive Failover
• Most common traditional scenario
• Expensive dedicated resources

Active-Active Failover
• Leverage recovery infrastructure for test, development, training
• Utilize sunk cost of recovery site

Bi-directional Failover
• Production applications at both sites
• Each site acts as the recovery site for the other

Shared Recovery Sites
• Many-to-one failover
• Particularly useful for Remote Office / Branch Office

What’s New In Site Recovery Manager 5.0?

vSphere Replication: expand DR coverage to Tier 2 apps and smaller sites
• Bundled with SRM at no additional cost
• Provides simple, cost-efficient replication between vSphere clusters

Automated failback and planned migration: streamline planned migrations (for disaster avoidance, planned maintenance, …)

Automated failback
• Bi-directional recovery plans
• Automates failback to original site

Planned migration
• New workflow that can be applied to any recovery plan
• Ensures no data-loss, application-consistent migrations of virtual machines

Others
• More granular control over VM startup order
• Protection-side APIs
• IPv6 support

Cost-Efficient Replication To Expand
DR Coverage


DR Coverage Often Limited Due To High Protection Costs

Need to expand DR protection
• Tier 2 / 3 applications in larger datacenters
• Small and medium businesses
• Remote office / branch offices

[Diagram: in the corporate datacenter, Tier 1 apps are protected while Tier 2 / 3 apps are backup only; small sites (small business, remote office / branch office) are backup only]

SRM Provides Broad Choice of Replication Options

[Diagram: Site A (Primary) and Site B (Recovery), each with vCenter Server, Site Recovery Manager, and vSphere, linked by vSphere Replication and storage-based replication]

vSphere Replication
Simple, cost-efficient replication for Tier 2 applications and smaller sites

Storage-based Replication
High-performance replication for business-critical applications in larger sites

vSphere Replication For Cost-Efficient, Simple Replication

Cost-efficient
Reduce storage costs by 2X
• Support for heterogeneous storage across sites, including non-replicating storage
• Use lower-end or older storage at failover site
Eliminate replication software costs
• vSphere Replication included with Site Recovery Manager at no additional cost

Simple
Manage replication directly from vCenter
• Eliminate complex interactions with storage teams
Manage replication at the individual VM level
• Eliminate need for complicated VM-to-LUN mapping

Powerful
15 minute RPOs
• Set RPOs between 15 minutes and 24 hours
Efficient network utilization
• Replicate only changed disk areas
Highly scalable
• 500 virtual machines

Limitations
• No automated failback
• File-level consistency only (except planned migration)
• No FT, templates, linked clones, physical RDMs
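
The RPO range stated on this slide (15 minutes to 24 hours) can be expressed as a simple range check. This is a minimal sketch: the helper name `validate_rpo` and the idea of validating a requested value in a script are assumptions for illustration, not part of SRM or vSphere Replication.

```python
from datetime import timedelta

# Bounds taken from the slide: RPOs can be set between 15 minutes and 24 hours.
MIN_RPO = timedelta(minutes=15)
MAX_RPO = timedelta(hours=24)


def validate_rpo(requested: timedelta) -> timedelta:
    """Return the requested RPO if it lies within the stated range.

    Hypothetical helper for illustration only; not an SRM or vSphere API.
    """
    if not MIN_RPO <= requested <= MAX_RPO:
        raise ValueError(
            f"RPO {requested} is outside the supported range "
            f"{MIN_RPO} to {MAX_RPO}"
        )
    return requested
```

A request of `timedelta(hours=4)` passes through unchanged, while `timedelta(minutes=5)` raises `ValueError`.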

Expand DR Protection To Tier 2 Apps And Small Sites

[Diagram: in the corporate datacenter, Tier 1 apps use storage replication and Tier 2 / 3 apps use vSphere Replication; small sites (small business, remote office / branch office) use vSphere Replication]

Storage, Replication, and SRM Costs per Protected VM
• Large site with storage replication: $2,000/VM (SRM Enterprise, storage replication software, Tier 1 storage at failover site)
• Small site with vSphere Replication: $600/VM (SRM Standard, Tier 2 storage at failover site)
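
To make the per-VM figures concrete, here is a rough sketch that scales them to a number of protected VMs. It assumes the $2,000/VM bar corresponds to storage-based replication at a large site and the $600/VM bar to vSphere Replication at a small site, and that cost scales linearly; the function and key names are illustrative.

```python
# Per-VM figures read from the chart on this slide (storage, replication, and SRM).
# The mapping of each figure to a replication option is an assumption.
COST_PER_VM = {
    "storage_replication": 2000,  # large site: SRM Enterprise, replication software, Tier 1 storage
    "vsphere_replication": 600,   # small site: SRM Standard, Tier 2 storage
}


def protection_cost(vm_count: int, option: str) -> int:
    """Total cost in dollars to protect vm_count VMs, assuming linear scaling."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    return vm_count * COST_PER_VM[option]
```

Under these assumptions, protecting 50 VMs comes to $100,000 with storage replication and $30,000 with vSphere Replication.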

Simplify Replication Management With vSphere Replication

Overview
vSphere Replication provides simple management of replication
• Managed directly from vCenter
• Managed at the individual VM-level

Benefits
• Eliminate complex interactions between vSphere and storage teams to set up replication
• Eliminate need to shuffle VMs between datastores to map applications to replicated LUNs

[Diagram: with storage-based replication, the SharePoint VMs (Web, App, Hub, SQL) sit in a datastore group (VMFS datastore A on LUN 1, VMFS datastore B on LUN 2) involving both the vSphere admin and the storage admin; with vSphere Replication, only the vSphere admin is involved for the SharePoint VMs (Web, App, SQL)]

vSphere Replication Architecture
Tightly Integrated With SRM, vCenter and ESX
[Diagram: the Protected Site and the Recovery Site each run vCenter Server, Site Recovery Manager, and a vSphere Replication Management Server on ESX / ESXi hosts. Components labeled "VSR Agent" and "vSphere Replication Server" handle replication between the sites. Each site can use any storage supported by vSphere.]

Simple Recovery and Migration Plans


Simple Setup And Management of Recovery And Migration Plans

From Complex Runbooks…
• Weeks or months to set up
• Error-prone
• Quickly falls out of sync with apps and infrastructure changes

…to Simple Recovery Plans
• Simple recovery plan set up in minutes
• Fewer steps means far less room for errors
• Simple to keep in sync with changes

Five Simple Steps To Create Recovery And Migration Plans

Create Recovery Plans in 5 Steps…
Step 1: Map production site resources to recovery site
• Resource pools
• vSwitches
• VM folders
Step 2: Select virtual machine protection groups to include in recovery
Step 3: Select low-priority VMs to suspend at recovery site
Step 4: Specify boot sequence of recovered VMs
Step 5: Customize IP addresses of recovered VMs
Optional: Add messages and custom scripts

…And Eliminate Manual Steps of Traditional Recovery
• Reconfigure individual hosts
• Recover entire systems including OS and application binaries
• Coordinate storage and replication processes for recovery
  • Stop replication and make replicated LUNs writable
  • Present data to applications
  • Present VMs to vSphere
• Reconfigure physical switching infrastructure
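
The five setup steps can be pictured as fields of a single plan record. The sketch below is only an illustration of that idea: the `RecoveryPlan` class, its field names, and the `unfilled_steps` check are hypothetical and do not reflect SRM's actual object model or API.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryPlan:
    """Illustrative container for the five setup steps; not an SRM object."""

    # Step 1: production-site resource name -> recovery-site resource name
    resource_mappings: dict[str, str] = field(default_factory=dict)
    # Step 2: protection groups included in recovery
    protection_groups: list[str] = field(default_factory=list)
    # Step 3: low-priority VMs to suspend at the recovery site
    vms_to_suspend: list[str] = field(default_factory=list)
    # Step 4: VM names in the order they should boot
    boot_sequence: list[str] = field(default_factory=list)
    # Step 5: VM name -> IP address to apply after recovery
    ip_customizations: dict[str, str] = field(default_factory=dict)

    def unfilled_steps(self) -> list[int]:
        """Return the step numbers (1-5) that have no entries yet."""
        steps = [
            self.resource_mappings,
            self.protection_groups,
            self.vms_to_suspend,
            self.boot_sequence,
            self.ip_customizations,
        ]
        return [number for number, value in enumerate(steps, start=1) if not value]


plan = RecoveryPlan(
    resource_mappings={"prod-pool": "dr-pool"},
    protection_groups=["sharepoint"],
    boot_sequence=["sql01", "app01", "web01"],
)
print(plan.unfilled_steps())  # prints [3, 5]
```

An empty step is not necessarily an error (a site may have no low-priority VMs to suspend), so the check only reports what has not been filled in.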

Application Consistent Recovery With SRM

Storage-based replication: application consistency widely available
• Enabled by replication management software
• Typically relies on agents in the VMs to properly quiesce applications
• For both DR failover and planned migrations

vSphere Replication: application consistency for planned migrations only
• File-system consistency for DR failover via VSS requester in VMware Tools

[Diagram: Application Consistency Enabled by Replication Provider. Replication management: quiesce application, replicate app-consistent VM, app-consistent VM presented to SRM]

Fully Automated Disaster Failovers and
Planned Migrations


Beyond DR: Disaster Avoidance And Planned Migrations

3 typical use-cases for SRM

Disaster Failover
Recover from unexpected site failure
• Full or partial site failure
The most critical but least frequent use-case
• Unexpected site failures do not happen often
• When they do, fast recovery is critical to the business

Disaster Avoidance
Anticipate potential datacenter outages
• For example: an anticipated hurricane, flood, forced evacuation, etc.
Initiate preventive failover for smooth migration
• Leverage SRM “planned migration” to ensure no data-loss
• “Automated failback” enables easy return to original site

Planned Migration
Most frequent SRM use case
• Planned datacenter maintenance
• Global load balancing
Streamline routine migrations across sites
• Test to minimize risk
• Execute partial failovers
• Leverage SRM “planned migration” to ensure no data-loss
• “Automated failback” enables bi-directional migrations

SRM Reduces Recovery Risk With Frequent Testing

Traditional Disaster Recovery
[Chart: recovery risk over time, with a testing gap between DR tests and a resulting lack of confidence in the DR process]
• During the testing gap, organizations can’t be sure that they can recover the current IT environment
• A failover scenario may take days or weeks to complete, leaving the business at extreme risk

Site Recovery Manager
[Chart: recovery risk over time with frequent DR testing]

SRM provides assurance that DR objectives will be met.

SRM Enables Frequent Non-Disruptive Testing

Overview
• Automate test execution
  • Execute recovery plan
  • Customizable for testing with extra callouts and breakpoints
  • Log results of the test
• Isolated test environment
  • Snapshot replicated LUNs
  • Launch VMs in fenced network
  • Reset environment after test

Benefits
• Confidence and documentation that DR requirements are satisfied
• Quickly identify and remediate potential issues
• Reduce cost and resources required for DR testing
  • Eliminate traditional “DR testing weekends”

[Diagram: non-disruptive testing at the recovery site, with vSphere, an isolated test environment, replication, and a LUN snapshot]

Automate DR Failover Processes

Overview
Automatically detect site failures
• Require user to manually initiate failover
Automate recovery process
• Stop replication and present replicated LUNs to vSphere
• Execute user-defined recovery plan

Benefits
Ensure fast and predictable failovers and migrations
• Consistently meet business requirements
Minimize risk of user errors

DR failover steps shown in the diagram (Site A to Site B):
1. Raise alert when heartbeat lost
2. User initiates failover
3. Stop replication and present LUNs to vSphere
4. Recover VMs

Testing and Executing Recovery Plans

[Screenshot: recovery plan view showing the steps in the recovery plan, status and time stamps, when to execute, and the user confirmation message]

Planned Migrations For App Consistency & No Data Loss

Overview
Two workflows can be applied to recovery plans:
• DR failover
• Planned migration
Planned migration ensures application consistency and no data-loss during migration
• Graceful shutdown of production VMs in application consistent state
• Data sync to complete replication of VMs
• Recover fully replicated VMs

Benefits
Better support for planned migrations
• No loss of data during migration process
• Recover “application-consistent” VMs at recovery site

Planned migration steps shown in the diagram (Site A to Site B):
1. Shut down production VMs
2. Sync data, stop replication and present LUNs to vSphere
3. Recover app-consistent VMs
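
The difference between the two workflows can be sketched as two ordered step lists. The step wording below paraphrases the DR failover and planned migration slides; the dictionary layout and function name are assumptions made for illustration and say nothing about how SRM implements its workflows.

```python
# Ordered steps paraphrased from the DR failover and planned migration slides.
WORKFLOWS = {
    "dr_failover": [
        "stop replication and present replicated LUNs to vSphere",
        "recover VMs at the recovery site",
    ],
    "planned_migration": [
        "gracefully shut down production VMs",
        "sync data to complete replication",
        "stop replication and present replicated LUNs to vSphere",
        "recover VMs at the recovery site",
    ],
}


def steps_added_by_planned_migration() -> list[str]:
    """Return the planned-migration steps that DR failover does not perform."""
    failover = set(WORKFLOWS["dr_failover"])
    return [step for step in WORKFLOWS["planned_migration"] if step not in failover]
```

In this sketch the two extra steps, the graceful shutdown and the final data sync, are what the slide credits for application consistency and no data loss.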

Automated Failback To Streamline Bi-Directional Migrations

Overview
Re-protect VMs from Site B to Site A
• Reverse replication
• Apply reverse resource mapping
Automate failover from Site B to Site A
• Reverse original recovery plan

Restrictions
• Does not apply if Site A has undergone major changes / been rebuilt
• Not available with vSphere Replication

Benefits
Simplify failback process
• Automate replication management
• Eliminate need to set up new recovery plan
Streamline frequent bi-directional migrations

[Diagram: reverse replication from Site B to Site A; reverse original recovery plan]

Next Steps


Successful Business Continuity Requires Careful Planning

Business Requirements / Business Impact Analysis (BIA)
• Map service Tiers by availability requirements and cost
• For each service, identify Availability requirements, Recovery Time Objectives (RTO), Recovery Point Objectives (RPO)

Application Dependency Mapping
• Identify dependencies between application components
• Weakest link in the chain? (AD, DNS, etc)

Business Continuity Design
• App-specific solutions / virtualization for HA and DR / backup only
• Budget ahead of time
• Project planning / phasing

Use Professional Services
• VMware PSO
• VMware BCDR Competency partners (300+ highly qualified partners)

SRM 5 Editions Lineup

SRM 5 editions: Standard and Enterprise

Price per protected virtual machine (license only)
• Standard: $195
• Enterprise: $495

Scalability limits: maximum protected VMs
• Standard: 75 virtual machines (1)
• Enterprise: Unlimited (2)

Features
• Support for storage-based replication
• Centralized recovery plans
• Non-disruptive testing
• Automated DR failover
• vSphere Replication (new in SRM 5.0)
• Automated failback (new in SRM 5.0)
• Planned migration (new in SRM 5.0)

1. Maximum of 75 VMs per site and per SRM instance
2. Subject to the product’s technical scalability limits
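
As a worked example of the pricing above, the sketch below picks an edition from the protected VM count and computes the license-only cost. It assumes a single site and a single SRM instance, and that the 75-VM limit is the only factor in choosing an edition; the function name is illustrative.

```python
STANDARD_PRICE = 195    # $ per protected VM, license only
ENTERPRISE_PRICE = 495  # $ per protected VM, license only
STANDARD_MAX_VMS = 75   # per site and per SRM instance


def license_cost(protected_vms: int) -> tuple[str, int]:
    """Return (edition, total license cost in dollars) for one site.

    Assumes edition choice depends only on the Standard edition's VM limit.
    """
    if protected_vms < 0:
        raise ValueError("protected_vms must be non-negative")
    if protected_vms <= STANDARD_MAX_VMS:
        return "Standard", protected_vms * STANDARD_PRICE
    return "Enterprise", protected_vms * ENTERPRISE_PRICE
```

Under these assumptions, 75 protected VMs cost $14,625 on Standard, while 76 VMs require Enterprise at $37,620.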

VMware BC/DR Service Offerings

VMware vCenter Site Recovery Manager Jumpstart
• The VMware vCenter Site Recovery Manager Jumpstart provides you
with a proof-of-concept, on-site installation and configuration of SRM
• 3 days on-site, 5 participants max

Custom BCDR Plan and Design Service
• Comprehensive architectural design for BCDR, covering data protection, local
availability, and disaster recovery.
• Address customer-specific requirements
• Flexible engagement model and duration

