VMware - VMUG Montreal


Published in: Technology, Business
  • Traditional business continuity solutions are complex and expensive. Business continuity can be implemented at the application level with application-specific clustering solutions such as Oracle RAC, MS clustering for SQL, Exchange Database Availability Groups (DAG), etc. App-level clustering usually provides best-in-class availability, but at very high cost and complexity. There are also solutions that can be implemented at the infrastructure layer, typically backup and data replication. These tend to be more cost-efficient than app-level solutions, but can’t quite compare in terms of uptime and RTOs.
  • VMware provides a suite of business continuity solutions that offer holistic BCDR protection to all applications running on the vSphere platform. These solutions provide simple, cost-effective protection with a common solution for all your applications. The VMware BCDR solutions include: (1) Local availability products to protect applications against downtime of individual hosts: vSphere HA and FT for unplanned downtime, as well as vMotion and Storage vMotion for planned downtime. (2) Data protection solutions to back up entire VMs, including OS, application binaries, and application data, in a simple, non-disruptive manner: the vSphere Data Recovery product, designed for smaller deployments, and the Storage APIs for Data Protection, which enable third-party backup vendors to integrate directly with vCenter and vSphere. (3) vCenter Site Recovery Manager and the new vSphere Replication product to protect applications against site failures. The new Cloud Infrastructure Launch improves many of our BCDR capabilities, including vSphere HA, vSphere Data Recovery, and vCenter Site Recovery Manager, and introduces the new vSphere Replication product. VMware vSphere™ is the leading virtualization platform that forms the foundation for building cloud infrastructures. It is specifically designed to holistically manage large collections of infrastructure (CPUs, storage, networking) as seamless, flexible and dynamic resource pools that can be allocated to the right users and applications on demand. vSphere comprises infrastructure services that transform server, storage and network hardware into a shared resource, and application services that are built in and available to all applications that run on it.
  • VMware business continuity products provide the best of both worlds – similar RTOs and RPOs to app-level availability, but with the simplicity and cost-efficiency of shared availability services. VMware business continuity solutions provide protection at similar cost to traditional backup and replication – but with much better RTOs. For example, recovering an application from a traditional backup can take many hours or even days. Recovering an application from a VM backup only takes a few minutes, since the whole system – including OS, application binaries and data – is encapsulated as a single set of files. Similarly, doing a DR failover using traditional replication can take days, while recovering using SRM can take less than one hour. VMware business continuity provides similar RTOs to app-level availability, but at much lower cost and complexity. For example, recovering from a failed host with VMware HA takes a few minutes – similar to recovering from a host failure using MSCS. But VMware HA does not require dedicated standby systems, expensive software licenses, or complicated setup.
  • We all understand that virtualization helps consolidate and reduce costs – and that is typically the first thing that comes to mind when thinking about virtualization. But it’s also interesting to note that for our existing customers, the #1 objective for virtualization is consistently to improve BCDR. VMware vSphere is inherently a better platform for BCDR and offers many new BCDR options. Customers that know VMware well make it a priority to leverage these inherent capabilities to improve their BCDR configurations.
  • Let’s do a quick overview of SRM for disaster recovery
  • Key Points: Disaster recovery in the traditional physical x86 world is hard for the following reasons: (1) The easiest and most reliable recovery requires complex server and recovery site infrastructure. (2) The recovery process is also typically fairly complex due to hardware dependencies, application dependencies, etc. (3) Due to the complexity of the process, having accurate and up-to-date documentation (a disaster recovery “runbook”) is essential, as is ensuring that everyone is thoroughly trained on recovery procedures and executes them correctly when needed. As a result, recovery requirements often fail to be met: recovery takes too long, is often unsuccessful, and requires significant investment of IT budget and staff. Script: Traditional disaster recovery plans depend on a very complex set of processes and infrastructure: duplicate datacenters, duplicate server infrastructure, processes for getting data to a recovery site, processes for restarting servers, processes for reinstalling operating systems, and so on. Because disaster recovery can be so complex, organizations often find themselves unable to provide good protection to more than a privileged few of their production workloads, leaving other workloads (e.g. file/print servers, internal web servers, departmental applications) unprotected or poorly protected. Because of the complexity of disaster recovery plans and infrastructure, organizations are heavily dependent on significant amounts of personnel training, on the accuracy and completeness of thick paper “runbooks” that document the recovery process, and on perfect execution of the recovery process when an outage does occur. Because testing is disruptive and expensive, organizations have a limited ability to ensure that all of their training, documentation, and execution is practiced and can successfully recover their IT services. As a result of these challenges, tests of recovery plans often fail; basic recovery of critical workloads – if successful at all – often takes days or weeks; and a significant amount of IT time and resources is consumed by managing and maintaining recovery plans. In short, most firms fail to meet the continuity requirements set by their organizations.
  • I want to review 3 key features of virtualization that lead to better disaster recovery. Consolidation: Server consolidation means doing more with less. You’re reducing your physical footprint, which also means you can streamline your disaster recovery plans and standardize your recovery process. Hardware Independence: This means being able to recover onto any x86 hardware. So, you have the flexibility to buy different servers for your recovery site, or even fewer servers. Continuing to virtualize your production site will free up additional machines that can then be moved over to your recovery site. Encapsulation: Because virtualization captures everything about a server into just a few files on disk, the real benefit here is mobility, the ability to move your VMs wherever you want. You can back them up in the same way you currently protect your other files. You can also replicate them to your disaster recovery site, so that they will be available when you need to recover from an outage.
  • SRM provides very broad application coverage. The RPO is set by the replication product in use. Since SRM supports a very broad range of replication products, the RPO can be set anywhere from days down to zero (synchronous replication), depending on business needs, replication capabilities, and network bandwidth. The RTO will depend on the actual configuration, but is typically in the range of 30 minutes to a few hours. Factors that will influence this include the number of VMs that need to be recovered, the number of hosts available to recover those VMs, and the replication technology. SRM can be used to protect the vast majority of applications. The only applications that cannot be protected with SRM are applications that need better RTOs than 30 minutes to a few hours. Applications that need continuous or near-continuous availability will need to be protected using geo-clustering.
  • Key Points: Site Recovery Manager can be used in a number of different failover scenarios. In particular, SRM helps you to make better use of your investment in a second site: you can use that second site for other workloads when you aren’t in a disaster recovery scenario rather than just having it sit idle. Script: Site Recovery Manager can be used in a number of different failover scenarios. Active-Passive: Site Recovery Manager absolutely supports the traditional active-passive DR scenario, where a production site running applications is recovered at a second site that is idle until failover is required. Although this is the most common configuration, it also means that you are paying a lot of money for a DR site that is idle most of the time. Active-Active: To make better use of the recovery site, Site Recovery Manager also enables you to leverage your recovery site for other workloads when you aren’t using it for DR. Site Recovery Manager can be configured to automatically shut down or suspend VMs at the recovery site as part of the failover process so that you can easily free up compute capacity for the workloads being recovered. Bidirectional: Site Recovery Manager can also provide bidirectional failover protection so that you can run active production workloads at both sites and fail over to the other site in either direction. The spare capacity at the other site will be used to run the VMs that are failed over. Local Failover: Although less common, some of our customers need to be able to fail over within a given “site” or campus, for example when a storage array failure occurs or when building maintenance forces you to move workloads to a different campus building. These customers are leveraging Site Recovery Manager to perform these failovers.
  • Today, DR coverage is often limited to Tier 1 apps in larger datacenters. In many cases, Tier 2 / 3 apps and smaller sites do not have true DR protection, but are only protected with backups. This is because traditional DR protection is cost-prohibitive and too complex to be broadly applied to all applications and sites. Unfortunately, this level of DR protection leaves a substantial business risk, as day-to-day activities still rely extensively on the Tier 2 / 3 apps and smaller sites. Ideally, organizations should have a simple, cost-effective, reliable DR plan in place for all their applications and sites.
  • vSphere Replication has been designed specifically to provide very cost-efficient, simple, yet powerful replication for SRM deployments. It is more cost-efficient because it reduces both storage costs and replication costs. At the storage layer, vSphere Replication eliminates the need to have higher-end storage arrays at both sites. Customers can use lower-end, different storage across sites, including Direct Attached Storage. For example, one popular option is to have Tier 1 storage at the production site, but lower-end storage at the failover site, such as older arrays or less expensive arrays. vSphere Replication is also bundled with SRM at no additional cost, eliminating the additional cost of storage-based replication licenses. vSphere Replication is also inherently simpler than storage-based replication. Replication is managed directly from vCenter, eliminating dependencies on storage teams. And it is managed at the level of individual VMs, making the setup of SRM much simpler. Despite its simplicity and cost-efficiency, vSphere Replication is still a robust, powerful replication solution. It can provide 15-minute RPOs, and gives the flexibility to set RPOs between 15 minutes and 24 hours. It tracks changed disk areas and only replicates the latest deltas to increase network efficiency, and will scale upwards of 500 virtual machines. It does however have a few limitations in this initial release: no automated failback, file-level consistency only, and no support for FT, linked clones, templates, or physical-mode RDMs.
  • vSphere Replication is a great way to expand DR protection to Tier 2 / 3 apps and smaller sites. Storage-based DR protection is fairly expensive, with the cost mostly driven by storage capacity on Tier 1 storage arrays and additional replication licenses. The typical storage, replication and SRM cost is in the range of $2,000 per VM. While this is much better than physical DR, it is still fairly high and can be cost-prohibitive for less business-critical environments. vSphere Replication is much more cost-effective. By supporting the use of lower-end storage arrays, eliminating the need for dedicated replication licenses, and offering lower-cost SRM Standard licenses, it reduces the cost per VM roughly threefold, to approximately $600 / VM. The much lower cost per VM enables organizations to expand their DR protection to many more applications and sites.
  • Setting up an SRM recovery plan is done in 5 simple steps. 1 – The user maps resources at the production site to resources at the failover site. This includes resource pools, vSwitches, and VM folders. The VMs will automatically be mapped to the appropriate resources upon failover. 2 – The user selects which virtual machines and virtual machine protection groups to include in the plan. This enables users to set up plans for failing over only part of the protected VMs. 3 – The user selects which low-priority VMs to suspend at the recovery site – for example test and dev VMs that will be suspended to make room for the recovered VMs. 4 – The user specifies the boot sequence of recovered VMs – for example, first the database, then the application tier, and finally the web tier. 5 – Finally, the user can customize re-IPing of VMs to comply with the network at the failover site. Optionally, users can add messages, pauses for execution of manual steps, and custom scripts, for example to update DNS servers with new IP addresses. SRM eliminates many of the traditional recovery steps by automatically coordinating recovery across infrastructure layers. For example: individual hosts no longer need to be reconfigured at the recovery site; users no longer need to worry about recovering entire systems including the OS, application binaries, and application data, since the entire system is recovered with the VM; coordination of replication and storage is done automatically, so users no longer need to worry about stopping replication, presenting the LUNs to vSphere, and registering the VMs into vCenter; and there is no need to reconfigure physical switching infrastructure.
  • Customers often wonder whether SRM enables application-consistent recoveries. In other words, whether the application can be recovered in a clean state, or whether the application is recovered in a crash-consistent state. The short answer is that it is possible to ensure application-consistent recoveries. The application consistency is enabled by the underlying replication technology. Many replication vendors have management products for this purpose – such as EMC’s Replication Manager or NetApp’s SnapManager. Typically, these products place an agent in the VM that will quiesce the VM prior to executing the replication. The quiesced VM at the recovery site is then used for recovery purposes. In this initial release, vSphere Replication will only provide file-level consistency, and not application consistency. Even when application consistency is not ensured by the replication product, the ‘planned migration’ workflow, new with SRM 5, ensures application consistency for planned migrations. This is the case for both storage-based replication and vSphere Replication.
  • SRM 5 now comes in two editions: Standard and Enterprise. Standard is designed for smaller environments, and can be used to protect up to 75 VMs (per site and per SRM instance). Enterprise is designed for larger environments with more than 75 VMs to protect. Both editions are full-featured and include vSphere Replication, automated failback, and planned migration. The tiered SRM editions provide a particularly attractive price point for SMBs and remote offices, at $195 per VM. This is a steep reduction from the standard SRM 4 list price of $450 / VM. This more attractive price point will enable smaller organizations and sites to leverage vSphere Replication and significantly reduce their overall DR costs. SRM Enterprise is priced at $495 / VM, a very small increment over the SRM 4 price for a product that now includes vSphere Replication and the other new SRM 5 capabilities.
  • VMware provides a comprehensive set of BCDR services. The VMware SRM Jumpstart helps organizations get started with SRM. It’s a short 3-day service offering, designed to do an initial PoC / on-site installation of SRM, integrate SRM with the storage arrays, and set up a simple recovery plan. VMware also offers a custom BCDR Plan and Design service, intended to provide a holistic BCDR service covering local availability, data protection, and disaster recovery.
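
The five-step recovery plan setup described in the notes above (resource mappings, protection groups, VMs to suspend, boot sequence, re-IPing) can be pictured as a small data structure. The following is only a sketch under stated assumptions: the `RecoveryPlan` class, its field names, and the VM names are all hypothetical illustrations, not SRM's actual object model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecoveryPlan:
    """Hypothetical model of the five setup steps; not SRM's real schema."""
    # Step 1: production-site resource -> recovery-site resource
    resource_mappings: Dict[str, str] = field(default_factory=dict)
    # Step 2: protection groups included in the plan, each listing its VMs
    protection_groups: Dict[str, List[str]] = field(default_factory=dict)
    # Step 3: low-priority VMs to suspend at the recovery site
    suspend_at_recovery: List[str] = field(default_factory=list)
    # Step 4: boot priority per VM (lower number boots first)
    boot_priority: Dict[str, int] = field(default_factory=dict)
    # Step 5: new IP address for each recovered VM that needs one
    ip_overrides: Dict[str, str] = field(default_factory=dict)

    def protected_vms(self) -> List[str]:
        """All VMs covered by the selected protection groups."""
        return [vm for vms in self.protection_groups.values() for vm in vms]

    def boot_order(self, default_priority: int = 99) -> List[str]:
        """Protected VMs sorted by boot priority (stable for ties)."""
        return sorted(
            self.protected_vms(),
            key=lambda vm: self.boot_priority.get(vm, default_priority),
        )

    def failover_steps(self) -> List[str]:
        """Ordered, human-readable actions implied by the plan."""
        steps = [f"suspend {vm}" for vm in self.suspend_at_recovery]
        for vm in self.boot_order():
            ip = self.ip_overrides.get(vm)
            if ip:
                steps.append(f"set IP of {vm} to {ip}")
            steps.append(f"power on {vm}")
        return steps


# Illustrative plan: database first, then application tier, then web tier.
plan = RecoveryPlan(
    resource_mappings={"prod-pool": "dr-pool", "prod-vswitch": "dr-vswitch"},
    protection_groups={"sharepoint": ["web01", "app01", "sql01"]},
    suspend_at_recovery=["dev01"],
    boot_priority={"sql01": 1, "app01": 2, "web01": 3},
    ip_overrides={"web01": "10.2.0.10"},
)
print(plan.failover_steps())
```

Keeping the boot sequence as a priority per VM, rather than a fixed list, means a VM added to a protection group later still gets a place in the order without rewriting the plan.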

    1. 1. VMware for Business Continuity: What’s new with SRM 5. Vadim Shvarts, Sr. Systems Engineer, VMware Canada, vshvarts@vmware.com. © 2009 VMware Inc. All rights reserved
    2. 2. Introduction – VMware for Business Continuity
    3. 3. Disasters Happen. Do You Need Protection? 43% of companies experiencing disasters never re-open, and 29% close within two years (McGladrey and Pullen). 93% of businesses that lost their data center for 10 days went bankrupt within one year (National Archives & Records Administration). 40% of all companies that experience a major disaster will go out of business if they cannot gain access to their data within 24 hours (Gartner). Top executives say 10 hours to recovery; IT managers say up to 30 hours (Harris Interactive).
    4. 4. Business-Critical Applications Require Business Continuity. [Bar chart: % of application instances running on VMware in customer base, 2010 vs 2011, for MS Exchange, MS SharePoint, MS SQL, Oracle Middleware, Oracle DB, and SAP; plotted values range from 18% to 67%.] Source: VMware customer survey, Jan 2010 and April 2011 interim results. Data: total number of instances of that workload deployed in your organization and the percentage of those instances that are virtualized. Availability expectations on vSphere continue to increase: RTOs decreasing from >24 hours to <12 hours.
    5. 5. Drawbacks Of Traditional Business Continuity Solutions. Application-level availability silos are complex and expensive. Examples shown across the availability requirements (local availability, data protection, disaster recovery): Oracle RAC, Oracle DataGuard, MS Clustering, DB Mirroring, Exchange Database Availability Groups, CCR / SCR, and Middleware / Java app server clusters with session state replication. Shared availability services (backup, data replication) have longer RTOs and RPOs.
    6. 6. Improving Business Continuity At All Levels (local site and failover site, both on vSphere). Local Availability: vSphere High Availability (improved in 2011), vSphere Fault Tolerance, vMotion and Storage vMotion. Data Protection: vSphere Data Recovery (improved in 2011), Storage APIs for Data Protection. Disaster Recovery: vCenter Site Recovery Manager (improved in 2011), which includes vSphere Replication (new in 2011).
    7. 7. Transforming Cost And Complexity Of Business Continuity. [Chart: RTO / RPO (Continuous, Minutes, Hours, Days) against cost in $ per app ($100, $1,000, $10,000), with three regions: app-level availability (Oracle RAC, MSCS, …); VMware business continuity (HA, FT, vMotion, SRM, VDR); shared availability services (traditional backup, replication).] Compared with app-level availability solutions: similar RTOs, much lower cost / complexity. Compared with traditional backup and replication: much better RTOs, similar or lower cost.
    8. 8. Better Business Continuity Is #1 Objective For Virtualization. Top Five Objectives for Virtualization: use virtualization to improve Business Continuity and Disaster Recovery (BCDR), 46%; improve virtual machine performance, 33%; increase the server consolidation ratio, 32%; improve VM environment management, 31%; more mission-critical applications, 24%. Source: WW VMware customer survey, January 2010, N=1083
    9. 9. Simple and Reliable DR with vSphere and SRM
    10. 10. Challenges of Traditional Disaster Recovery: complex recovery plans; unreliable failovers; expensive (software, hosts, storage, facilities, network: >$10K per app). Failure to meet business requirements: long RTOs (days to weeks); too much time and resources consumed.
    11. 11. vSphere Provides The Best Foundation For Disaster Recovery. Consolidation gives a cost-efficient infrastructure: reduced hardware requirements at recovery site; use recovery hardware to run low-priority apps. Hardware independence gives a flexible infrastructure: eliminate need for identical hardware across sites; enable waterfalling of equipment to recovery site. Encapsulation gives simple application protection: entire system, including application, OS, and data, is stored as virtual machine files; entire system can be protected with data protection tools.
    12. 12. Encapsulation Simplifies Application Protection And Recovery. Physical (40+ hrs): configure hardware, install OS, configure OS, install backup agent, start “single-step automatic recovery”. Virtual (< 4 hrs): restore VM, power on VM. Simplify recovery: no operating system re-install or bare-metal recovery; no time spent reconfiguring hardware. Standardize recovery process: consistent process independent of applications, operating systems and hardware.
    13. 13. vCenter Site Recovery Manager Ensures Simple, Reliable DR. Site Recovery Manager complements vSphere to provide the simplest and most reliable disaster protection and site migration for all applications. Provide cost-efficient replication of applications to failover site: built-in vSphere Replication; broad support for storage-based replication. Simplify management of recovery and migration plans: replace manual runbooks with centralized recovery plans; from weeks to minutes to set up new plan. Automate failover and migration processes for reliable recovery: enable frequent non-disruptive testing; ensure fast, automated failover; automate failback processes. [Diagram: Site A (Primary) and Site B (Recovery), each with VMware vCenter Server, Site Recovery Manager, VMware vSphere, and servers.]
    14. 14. SRM Momentum. Introduced in Q2 2008; 125,000+ units sold; 5,000+ customers; 50% annual growth in 2010. “If your organization is already taking advantage of virtualization, then adding Site Recovery Manager to handle disaster recovery is a no-brainer.” (Jerry Wilkin, Senior Systems Administrator, Dayton Superior Corp)
    15. 15. Key Components Of SRM 5. Site Recovery Manager: manages recovery plans; automates failovers and failbacks; tightly integrated with vCenter and replication. Choice of replication options: vSphere Replication (bundled with SRM; replicates virtual machines between vSphere clusters) or storage-based replication from a 3rd party (provided by replication vendor; integrated via replication adapters created, certified and supported by replication vendor). vCenter Server and vSphere: required at both protected and recovery sites.
    16. 16. Site Recovery Manager Complements vSphere For DR (traditional DR vs VMware). vSphere functionality: consolidation to reduce costs; hardware independence at failover site; encapsulation for simple recovery of entire systems. SRM functionality: vSphere Replication; simple management of recovery and migration plans; automated DR failover and non-disruptive testing; streamlined planned migrations and automated failback.
    17. 17. SRM Provides Broad Application Coverage. RTO: 30 minutes to hours. RPO: flexible based on storage replication. [Chart: RTO (Continuous, Hours, Days) against RPO (Days, Hours, Synchronous), with regions for app-level geo-clustering / load balancing and for Site Recovery Manager, and markers for Tier 1, Tier 2 and Tier 3 applications.]
    18. 18. SRM Supports Flexible Topologies. Active-Passive Failover (production to recovery): most common traditional scenario; expensive dedicated resources. Active-Active Failover (production to recovery): leverage recovery infrastructure for test, development, training; utilize sunk cost of recovery site. Bi-directional Failover (production at both sites): production applications at both sites; each site acts as the recovery site for the other. Shared Recovery Sites: many-to-one failover; particularly useful for Remote Office / Branch Office.
    19. 19. What’s New In Site Recovery Manager 5.0? vSphere Replication (expand DR coverage to Tier 2 apps and smaller sites): bundled with SRM at no additional cost; provides simple, cost-efficient replication between vSphere clusters. Automated failback: bi-directional recovery plans; automates failback to original site. Planned migration (streamline planned migrations for disaster avoidance, planned maintenance, …): new workflow that can be applied to any recovery plan; ensures no data-loss, application-consistent migrations of virtual machines. Others: more granular control over VM startup order; protection-side APIs; IPv6 support.
    20. 20. Cost-Efficient Replication To Expand DR Coverage
    21. 21. DR Coverage Often Limited Due To High Protection Costs. Corporate datacenter: Tier 1 apps are protected; Tier 2 / 3 apps are backup only. Small sites (small business, remote office / branch office): backup only. Need to expand DR protection to: Tier 2 / 3 applications in larger datacenters; small and medium businesses; remote offices / branch offices.
    22. 22. SRM Provides Broad Choice of Replication Options. [Diagram: Site A (Primary) and Site B (Recovery), each with vCenter Server, Site Recovery Manager and vSphere, linked by vSphere Replication and by storage-based replication.] vSphere Replication: simple, cost-efficient replication for Tier 2 applications and smaller sites. Storage-based replication: high-performance replication for business-critical applications in larger sites.
    23. 23. vSphere Replication For Cost-Efficient, Simple Replication. Cost-efficient: reduce storage costs by 2X (support for heterogeneous storage across sites, including non-replicating storage; use lower-end or older storage at failover site); eliminate replication software costs (vSphere Replication included with Site Recovery Manager at no additional cost). Simple: manage replication directly from vCenter (eliminate complex interactions with storage teams); manage replication at the individual VM level (eliminate need for complicated VM-to-LUN mapping). Powerful: 15-minute RPOs (set RPOs between 15 minutes and 24 hours); efficient network utilization (replicate only changed disk areas); highly scalable (500 virtual machines). Limitations: no automated failback; file-level consistency only (except planned migration); no FT, templates, linked clones, physical RDMs.
    24. 24. Expand DR Protection To Tier 2 Apps And Small Sites. [Chart: storage, replication, and SRM costs per protected VM.] Tier 1 apps with storage-based replication: $2,000/VM (SRM Enterprise, replication software, Tier 1 storage at failover site). Tier 2 / 3 apps and small sites with vSphere Replication: $600/VM (SRM Standard, Tier 2 storage at failover site). [Diagram: corporate datacenter, small business, and remote office / branch office, with storage replication between large sites and vSphere Replication for small sites.]
    25. 25. Simplify Replication Management With vSphere Replication. [Diagram: with storage-based replication, the SharePoint application’s Web, App and SQL VMs sit on VMFS datastores A and B backed by LUN 1 and LUN 2 in a datastore group, involving both the vSphere admin and the storage admin; with vSphere Replication, the same Web, App and SQL VMs are replicated individually by the vSphere admin.] Overview: vSphere Replication provides simple management of replication; managed directly from vCenter; managed at the individual VM level. Benefits: eliminate complex interactions between vSphere and storage teams to set up replication; eliminate need to shuffle VMs between datastores to map applications to replicated LUNs.
    26. 26. vSphere Replication Architecture: Tightly Integrated With SRM, vCenter and ESX. [Diagram: the protected site and the recovery site each have a vCenter Server, Site Recovery Manager and a vSphere Replication Management Server; a VSR Agent runs on the ESX / ESXi hosts at the protected site; a vSphere Replication Server serves the ESX / ESXi hosts at the recovery site; any storage supported by vSphere can be used at either site.]
    27. 27. Simple Recovery and Migration Plans
    28. 28. Simple Setup And Management of Recovery And Migration Plans. From complex runbooks: weeks or months to set up; error-prone; quickly fall out of sync with apps and infrastructure changes. To simple recovery plans: simple recovery plan set up in minutes; fewer steps means far less room for errors; simple to keep in sync with changes.
    29. 29. Five Simple Steps To Create Recovery And Migration Plans. Create recovery plans in 5 steps: Step 1, map production site resources to recovery site (resource pools, vSwitches, VM folders); Step 2, select virtual machine protection groups to include in recovery; Step 3, select low-priority VMs to suspend at recovery site; Step 4, specify boot sequence of recovered VMs; Step 5, customize IP addresses of recovered VMs; optional, add messages and custom scripts. …And eliminate manual steps of traditional recovery: reconfigure individual hosts; recover entire systems including OS and application binaries; coordinate storage and replication processes for recovery (stop replication and make replicated LUNs writable; present data to applications; present VMs to vSphere); reconfigure physical switching infrastructure.
    30. 30. Application Consistent Recovery With SRM. Storage-based replication: application consistency widely available; enabled by replication management software; typically relies on agents in the VMs to properly quiesce applications; for both DR failover and planned migrations. vSphere Replication: application consistency for planned migrations only; file-system consistency for DR failover via VSS requester in VMware Tools. [Diagram: application consistency enabled by replication provider. Replication management quiesces the application, replicates the app-consistent VM, and the app-consistent VM is presented to SRM.]
    31. 31. Fully Automated Disaster Failovers and Planned Migrations
    32. 32. Beyond DR: Disaster Avoidance And Planned Migrations. 3 typical use-cases for SRM. Disaster Failover: recover from unexpected site failure (full or partial site failure); the most critical but least frequent use-case (unexpected site failures do not happen often; when they do, fast recovery is critical to the business). Disaster Avoidance: anticipate potential datacenter outages (for example, in case of an anticipated hurricane, floods, forced evacuation, etc.); initiate preventive failover for smooth migration (leverage SRM ‘planned migration’ to ensure no data-loss; ‘automated failback’ enables easy return to original site). Planned Migration: most frequent SRM use case (planned datacenter maintenance; global load balancing); streamline routine migrations across sites (test to minimize risk; execute partial failovers; leverage SRM ‘planned migration’ to ensure no data-loss; ‘automated failback’ enables bi-directional migrations).
    33. 33. SRM Reduces Recovery Risk With Frequent Testing. [Charts: recovery risk over time. With traditional disaster recovery, risk grows across the testing gap between DR tests; with Site Recovery Manager, frequent DR testing keeps risk low.] Lack of confidence in DR process: during the testing gap, organizations can’t be sure that they can recover the current IT environment; a failover scenario may take days or weeks to complete, leaving the business at extreme risk. SRM provides assurance that DR objectives will be met.
    34. 34. SRM Enables Frequent Non-Disruptive Testing
        Non-disruptive Testing Overview
        • Automate test execution
            • Execute recovery plan
            • Customizable for testing with extra callouts and breakpoints
            • Log results of the test
        • Isolated test environment
            • Snapshot replicated LUNs
            • Launch VMs in fenced network
            • Reset environment after test
        Benefits
        • Confidence and documentation that DR requirements are satisfied
        • Quickly identify and remediate potential issues
        • Reduce cost and resources required for DR testing
            • Eliminate traditional "DR testing weekends"
        [Diagram: isolated test environment at the recovery site, fed by vSphere Replication and a LUN snapshot.]
    35. 35. Automate DR Failover Processes
        DR Failover Overview
        • Automatically detect site failures
            • Raise alert when heartbeat lost
            • Require user to manually initiate failover
        • Automate recovery process
            • Stop replication and present replicated LUNs to vSphere
            • Execute user-defined recovery plan
        Benefits
        • Ensure fast and predictable failovers and migrations
            • Consistently meet business requirements
        • Minimize risk of user errors
        [Diagram, Site A to Site B: 1 Alert raised on lost heartbeat; 2 User initiates failover; 3 Stop replication and present LUNs to vSphere; 4 Recover VMs.]
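The four steps on this slide can be sketched as a small state flow. This is a minimal illustration only: the function name `run_dr_failover`, its parameters, and the log strings are hypothetical and are not part of any SRM API. What it shows is that a lost heartbeat only raises an alert, and recovery starts only after a user initiates it.

```python
def run_dr_failover(heartbeat_lost, user_confirmed, recovery_plan):
    """Hypothetical model of the slide's DR failover flow; not the SRM API."""
    log = []
    # Step 1: detection only raises an alert, it never fails over by itself.
    if heartbeat_lost:
        log.append("alert raised: heartbeat lost")
    # Step 2: a user must explicitly initiate the failover.
    if not user_confirmed:
        log.append("waiting for user to initiate failover")
        return log
    log.append("user initiated failover")
    # Step 3: stop replication and present the replicated LUNs to vSphere.
    log.append("replication stopped, replicated LUNs presented to vSphere")
    # Step 4: recover VMs in the order given by the user-defined plan.
    for vm in recovery_plan:
        log.append(f"recovered {vm}")
    return log


if __name__ == "__main__":
    for entry in run_dr_failover(True, True, ["db", "web"]):
        print(entry)
```

Keeping the user confirmation as an explicit gate mirrors the slide's point that detection is automatic but initiation is manual.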
    36. 36. Testing and Executing Recovery Plans
        [Screenshot callouts: steps in recovery plan; status and time stamps; when to execute; user confirmation message.]
    37. 37. Planned Migrations For App Consistency & No Data Loss
        Planned Migration Overview
        • Two workflows can be applied to recovery plans:
            • DR failover
            • Planned migration
        • Planned migration ensures application consistency and no data loss during migration
            • Graceful shutdown of production VMs in application-consistent state
            • Data sync to complete replication of VMs
            • Recover fully replicated VMs
        Benefits
        • Better support for planned migrations
            • No loss of data during migration process
            • Recover "application-consistent" VMs at recovery site
        [Diagram, Site A to Site B: 1 Shut down production VMs; 2 Sync data, stop replication and present LUNs to vSphere; 3 Recover app-consistent VMs.]
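The difference between the two workflows can be sketched as two step lists that share a common tail. This is an illustrative sketch, assuming the step wording from the slides; `build_workflow` and the workflow names are hypothetical, not SRM identifiers.

```python
def build_workflow(kind):
    """Return the ordered steps for a recovery plan run (illustrative only)."""
    # Steps shared by both workflows, as described on the slides.
    recover = [
        "stop replication",
        "present LUNs to vSphere",
        "recover VMs",
    ]
    if kind == "dr_failover":
        # The production site is assumed unreachable, so nothing runs there.
        return recover
    if kind == "planned_migration":
        # Production is still up: shut down cleanly and sync first, which is
        # what gives application consistency and no data loss.
        return [
            "shut down production VMs gracefully",
            "sync data to complete replication",
        ] + recover
    raise ValueError(f"unknown workflow: {kind}")


if __name__ == "__main__":
    for kind in ("dr_failover", "planned_migration"):
        print(kind, "->", build_workflow(kind))
```

The two extra leading steps are the whole difference: they are only possible when the production site is still reachable.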
    38. 38. Automated Failback To Streamline Bi-Directional Migrations
        Automated Failback Overview
        • Re-protect VMs from Site B to Site A
            • Reverse replication
            • Apply reverse resource mapping
        • Automate failover from Site B to Site A
            • Reverse original recovery plan
        Restrictions
        • Does not apply if Site A has undergone major changes / been rebuilt
        • Not available with vSphere Replication
        Benefits
        • Simplify failback process
            • Automate replication management
            • Eliminate need to set up new recovery plan
        • Streamline frequent bi-directional migrations
        [Diagram, Site B to Site A: reverse replication; reverse original recovery plan.]
    39. 39. Next Steps
    40. 40. Successful Business Continuity Requires Careful Planning
        Business Requirements / Business Impact Analysis (BIA)
        • Map service tiers by availability requirements and cost
        • For each service, identify availability requirements, Recovery Time Objectives (RTO), Recovery Point Objectives (RPO)
        Application Dependency Mapping
        • Identify dependencies between application components
        • Weakest link in the chain? (AD, DNS, etc.)
        Business Continuity Design
        • App-specific solutions / virtualization
        • VMware BCDR for HA and DR / backup only
        • Budget ahead of time
        • Project planning / phasing
        Use Professional Services
        • VMware PSO
        • Competency partners (300+ highly qualified partners)
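Application dependency mapping lends itself to a small worked example. The sketch below uses Python's standard `graphlib.TopologicalSorter` to derive a start-up order from a dependency map. AD and DNS come from the slide; the other component names and the map itself are made-up assumptions for illustration.

```python
from graphlib import TopologicalSorter

# Each key depends on the components in its set (hypothetical example data).
dependencies = {
    "web-frontend": {"app-server"},
    "app-server": {"database", "AD", "DNS"},
    "database": {"DNS"},
    "AD": {"DNS"},
    "DNS": set(),
}


def recovery_order(deps):
    """Return an order in which every component follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(recovery_order(dependencies))
```

In this made-up map, DNS has to come up first because everything else depends on it, which is the "weakest link" question the slide raises.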
    41. 41. SRM 5 Editions Lineup
                                                         Standard               Enterprise
        Price per protected virtual machine              $195                   $495
        (license only)
        Scalability limits (1)
        • Maximum protected VMs                          75 virtual machines    Unlimited (2)
        Features
        • Support for storage-based replication
        • Centralized recovery plans
        • Non-disruptive testing
        • Automated DR failover
        • vSphere Replication
        • Automated failback
        • Planned migration
        (1) Maximum of 75 VMs per site and per SRM instance
        (2) Subject to the product's technical scalability limits
        [Legend: New in SRM 5.0]
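As a worked example of the per-VM pricing, here is a minimal sketch that computes license-only cost from the figures on this slide. The function name `license_cost` is hypothetical, and the sketch assumes a single site and SRM instance, so the 75-VM Standard limit is applied to the whole count.

```python
# List prices per protected VM (license only), taken from the slide.
PRICE_PER_VM = {"Standard": 195, "Enterprise": 495}
STANDARD_MAX_VMS = 75  # per site and per SRM instance, per the slide footnote


def license_cost(edition, protected_vms):
    """Return the license-only cost in dollars for one site (illustrative)."""
    if edition not in PRICE_PER_VM:
        raise ValueError(f"unknown edition: {edition}")
    if protected_vms < 0:
        raise ValueError("protected_vms must be non-negative")
    if edition == "Standard" and protected_vms > STANDARD_MAX_VMS:
        raise ValueError("Standard is limited to 75 protected VMs per site")
    return PRICE_PER_VM[edition] * protected_vms


if __name__ == "__main__":
    print(license_cost("Standard", 50))     # 50 * 195 = 9750
    print(license_cost("Enterprise", 100))  # 100 * 495 = 49500
```

Support and subscription costs are not on the slide and are left out of the calculation.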
    42. 42. VMware BC/DR Service Offerings
        VMware vCenter Site Recovery Manager Jumpstart
        • Provides a proof-of-concept, on-site installation and configuration of SRM
        • 3 days on-site, 5 participants max
        Custom BCDR Plan and Design Service
        • Comprehensive architectural design for BCDR, covering data protection, local availability, and disaster recovery
        • Address customer-specific requirements
        • Flexible engagement model and duration
    43. 43. Questions? © 2009 VMware Inc. All rights reserved