DR Planning Project
Training Document
Prepared by:
Thomas Bronack, CBCP
(917) 673-6992
bronackt@dcag.com

Dcag training on VMware DR Process

A description of implementing a recovery environment with VMware, vSphere, vConnect, and RPA. It serves as an initial training document for application DR Teams going through Application Recovery Certification, with links to additional materials.

Published in: Business, Technology

Dcag training on VMware DR Process

  1. DR Planning Project: Training Document
     Prepared by: Thomas Bronack, CBCP, (917) 673-6992, bronackt@dcag.com
     Thomas Bronack © Initial Training Class. Phone: (917) 673-6992 / Email: bronackt@dcag.com
  2. What do we want to achieve?
     • Fully converted Information Technology Environment.
     • Savings through equipment, locations, and vendor contracts.
     • Savings through better controls and efficiency.
     • Continuity of Business achieved through Enterprise Resiliency.
     • World-Wide Compliance achieved through Corporate Certification.
     • Additional savings through integration with everyday functions.
     • Improved Reputation and Higher Employee Morale.
     • Better retention of staff and clients.
     • More able to recruit new personnel and close client business.
     • Costs go down and efficiency goes up.
     • Improved Savings and Profitability.
  3. Outsourcing Project Time Line of Events (Phases I through VI)
     • Inventory (Phase I): What they Have; Infrastructure; Equipment; Software; Applications; Locations; Computer Sites; Recovery Sites; Applications with Recovery Plans; and Applications that need Recovery Plans.
     • Bid: RFP; Bid; SOW; Scope; Goals and Timeframe.
     • Build Regionals: Prod 1; Prod 2; Prod 3.
     • Transition: Move Applications to Regional Data Center; Test Successful Operation; and Use Virtualization.
     • Build Recovery: Recovery Site.
     • Disaster Recovery: “Proof of Concept”; Infrastructure Readiness; Disaster Recovery; Application Recovery; Business Recovery; Workplace Safety and Violence Prevention; Emergency Management; Crisis Management; Protection, Salvage, and Restoration; Supply Chain Management; Insurance; Community Relations; Communications; and Use of Social Media.
     • Compliance: Laws and Regulations; Requirements to Comply with; Present Compliance; Gaps & Exceptions; Obstacles; Domestic; International; and Cross Border Requirements.
  4. Phase V: Perform Application Recovery Certification
     Three Regional Data Centers and One Global Recovery Site
     • Regional Data Centers: Prod 1 (Americas); Prod 2 (Europe); Prod 3 (Asia Pacific).
     • The Recovery Site is built so that existing recovery sites and vendor contracts can be eliminated.
     • Application Recovery Certification is accomplished initially for selected applications, to validate that Regional Sites can recover to the Global Recovery Site.
     • User Locations (User Sites) are connected to the Regional Data Centers and the Global Recovery Site over the Cloud or WAN.
  5. Failover / Failback DR Process
     • Use the Existing Recovery Plan to Certify Application Recovery.
     • Diagram: Users 1 through n reach the Production Site over the Production Path (Old IP Address). On Failover, Users are Switched to the Recovery Site over the Disaster Recovery Path (New IP Address) through the Cloud or WAN. Failback returns them to the Production Site.
     Process:
     • Declare Disaster;
     • Failover to Recovery Site;
     • Continue User Processing within RTO;
     • Supplies are routed to Recovery Site;
     • Original Site is Safeguarded, Salvaged, and Restored;
     • Failback to Original Site.
     Scenarios:
     1. Users stay at their site, while Production is switched to the Recovery Site.
     2. Users have to move to a secondary site because the User site is lost, then connect to the Regional Site and Test Recovery.
     3. Users move to the recovery site and production is switched to Recovery.
  6. Logical DR Environment
     DR Logical Architecture from Prod to Recovery (diagram: Production to Recovery Site, with Array Replication Over WAN)
     Virtual Machines (VMs) are maintained by VMware vSphere, which is managed through a vCenter Server that is also used for Site Recovery Management. Virtualization can be considered a Resource Manager that separates Real Equipment (Storage, Computer, Network, etc.) into Logical Equipment Sections. Each VM can represent a Real Server, but many VMs can reside in one Real Server, which frees up real servers presently in use so they can be disposed of, reducing cost. VMs save space and power and reduce environmental concerns, all of which affect the bottom line and reputation of the company. It also takes fewer people to manage a Virtual Environment than the number of people now required to manage a real environment. Servers are Rack Mounted in what is called a Blade to save floor space and infrastructure. Switches can re-route servers through the Recovery Point Application to the Recovery Site when a disaster event occurs.
  7. DR Environment Target (Recovery Site)
     This diagram describes what the DR Environment will look like when completed. Remote sites are transformed and virtualized via the Avamar Virtual Environment, which will allow for the removal of remote equipment and support personnel. Windows, UNIX, and ESX Operating Systems will be housed in an EMC VNX Unified Storage facility. Network Backup Servers will protect communications, and Data Domains will protect Remote Users. A Tape Library is provided for long-term storage and electronic transfer to the Iron Mountain Tape Vault via encrypted communications. The Storage Area Network (SAN) and EMC Unified Storage Facility are connected to the Wide Area Network (WAN) through EMC Recovery Point Applications (RPAs) that can automatically switch a failing location to the recovery site to continue processing.
  8. Lifecycle of a Disaster Event (Why we create Recovery Plans)
     “The goal of Enterprise Resiliency is to achieve ZERO DOWNTIME by implementing Application Recovery Certification for HA and Gold Standard Recovery Certification for CA Applications”
     • Continuous Availability (CA) is an immediate Switch (CA Flip / Flop Switch Over).
     • High Availability (HA) is an RTO / SLA based Switch (HA Flip / Flop Switch Over).
     • Timeline: Production; Point of Failure; Failover (Start Up of the Secondary Site); Recovery Processing; Repair Primary Site to Resume Production via Failback; Failback from Secondary Site after Restoration (Shut Down Secondary Site); Production.
     • Measures shown on the timeline: RPO (Last Snapshot), RTO, and Data Sync.
     Stages at the Primary Site:
     • Disaster Event: Event; Analyze; Declare; Failover.
     • Safeguard: Evacuate; Protect Site; First Responders.
     • Salvage: Clean Facility; Repair; Resupply.
     • Restoration: Restart; Test; Success; Failback.
     • Resume: Reload Data; Restart; Continue.
  9. Disaster Recovery Testing Process
     The DR Testing process is illustrated here and includes:
     1. Select Application for DR Testing;
     2. Define DR Testing Goals and Objectives;
     3. Define Production Site where application resides;
     4. Complete Pre-Staging form to provide the DR team with the information needed to make the recovery site ready to perform DR Testing;
     5. Complete DR Exercise Booklet for Application;
     6. Conduct the Actual DR Exercise;
     7. DR Coordinator receives Work Sheets and prepares a Report and Presentation of findings for the Post Mortem;
     8. Implement recommendations for improvement.
  10. Recovery Testing process
     A. Develop Recovery Objectives and Testing Schedule:
        1. Create a Recovery Site DR Site Testing;
        2. Validate Production Site to Recovery Site Connectivity;
        3. Disaster Recovery Plans for interruptions to Information Technology;
        4. Application Recovery Certification (CA, HA, Best Effort, Deferred);
        5. Business Recovery for loss of a location;
        6. Emergency Management for Incidents and Natural Disasters; etc.
     B. DR Testing is conducted in Five Steps, which are:
        1. DR Planning Meeting: to orient the Application DR Team;
        2. Infrastructure Readiness: to prepare the Recovery Site and Obtain Data;
        3. DR Pre-Test: to prepare the Recovery Site for the Application DR Test:
           a. Recovery Site establishes recovery environment for disaster event or test.
           b. Develop procedures for providing Recovery Site with the information they need.
        4. Actual DR Recovery / Test: to DR Test the Application:
           a. Follow the “Script of Actions” contained in the Recovery Plan.
           b. Record event times, comments, and encountered problems.
        5. Post Mortem Meeting: Review of DR Test Results:
           a. To discuss recovery events and recommend improvements.
  11. Where do we go from here
     A. Develop Application Actual DR Testing process:
        1. Application Actual DR Test Activities Sheet is completed with Estimated Times.
        2. Production Servers are brought down in Production.
        3. Recovery Servers are brought up in Recovery.
        4. Application is connected to Recovery Facility.
        5. Data is Synchronized to the point just before failure.
        6. Application resumes normal processing as in Production Mode.
        7. Application connectivity and functionality are verified.
        8. Recovery Servers are brought down.
        9. Production Servers are brought up.
        10. Application resumes processing at the Production Site and is verified.
        11. If Successful, the Application receives Application Recovery Certification; otherwise Application DR problems are repaired and the Application goes through DR Testing again until Application Recovery Certification is achieved.
     B. Develop Application Work Sheet:
        1. Same as the Activities Sheet, but used to record Actual Times, Durations, Encountered Problems, and Comments.
     C. Post Mortem Meeting is conducted to review results, go over “Lessons Learned”, and make “Recommendations for Improvement”.
  12. Disaster Recovery Dashboard and Documents
     1. DR Planning Guide
     2. DR Management Dashboard
     3. DR Exercise Booklet Template
     4. Planning Meeting Agenda
  13. What should be accomplished during the Planning Meetings
     1. Infrastructure Readiness Information
     2. Contact List
     3. EMC Disaster Recovery and Business Continuity Solutions
     4. VMware vSphere, vCenter Prep
     5. VMware Usage and Recovery
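
Slides 5 and 8 describe failover in terms of RTO and RPO (Last Snapshot). As a rough illustration only, the Python sketch below shows how a DR team might check one failover against those two objectives. The class names, field names, and sample times are hypothetical and do not come from the training material. The sketch assumes that downtime is measured from the point of failure until users resume processing at the Recovery Site, and that data loss is the gap between the last snapshot and the point of failure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical container for an application's recovery targets."""
    rto: timedelta  # Recovery Time Objective: longest tolerated outage
    rpo: timedelta  # Recovery Point Objective: longest tolerated data-loss window


@dataclass
class FailoverEvent:
    """Hypothetical record of one failover, taken from exercise timestamps."""
    last_snapshot: datetime       # last replicated copy before the failure
    point_of_failure: datetime    # when production processing stopped
    processing_resumed: datetime  # when users were processing at the Recovery Site


def evaluate(event, objectives):
    """Return (downtime, data_loss, rto_met, rpo_met) for one failover."""
    downtime = event.processing_resumed - event.point_of_failure
    data_loss = event.point_of_failure - event.last_snapshot
    return (
        downtime,
        data_loss,
        downtime <= objectives.rto,
        data_loss <= objectives.rpo,
    )


# Example with made-up values: a 4-hour RTO and a 15-minute RPO.
sample_objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(minutes=15))
sample_event = FailoverEvent(
    last_snapshot=datetime(2024, 1, 1, 9, 50),
    point_of_failure=datetime(2024, 1, 1, 10, 0),
    processing_resumed=datetime(2024, 1, 1, 13, 0),
)
sample_result = evaluate(sample_event, sample_objectives)
print(sample_result)
```

A Continuous Availability (CA) application would use objectives close to zero in this model, while a High Availability (HA) application would use the RTO agreed in its SLA.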

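Slide 11 pairs an Activities Sheet (Estimated Times) with a Work Sheet (Actual Times, Encountered Problems, and Comments). The Python sketch below is one possible way to compare the two ahead of the Post Mortem meeting. The `Activity` structure, the `summarize` helper, and the sample numbers are all hypothetical. It also assumes that an exercise counts as successful only when every recorded step succeeded, which the slides do not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One row shared by the Activities Sheet and the Work Sheet (hypothetical layout)."""
    name: str
    estimated_minutes: int   # from the Activities Sheet
    actual_minutes: int      # recorded on the Work Sheet during the exercise
    succeeded: bool = True   # whether the step completed without a blocking problem
    comment: str = ""        # encountered problems or notes for the Post Mortem


def summarize(activities):
    """Summarize an exercise for the Post Mortem meeting."""
    return {
        "estimated_total": sum(a.estimated_minutes for a in activities),
        "actual_total": sum(a.actual_minutes for a in activities),
        # Steps that took longer than planned are candidates for "Lessons Learned".
        "overruns": [a.name for a in activities if a.actual_minutes > a.estimated_minutes],
        # Assumption: certification requires every step to have succeeded.
        "certified": all(a.succeeded for a in activities),
    }


# Example Work Sheet with made-up times, following a few of the slide 11 steps.
sample_sheet = [
    Activity("Production Servers brought down", 15, 12),
    Activity("Recovery Servers brought up", 30, 45, comment="Boot order issue"),
    Activity("Data synchronized", 20, 20),
    Activity("Connectivity and functionality verified", 25, 25),
]
sample_summary = summarize(sample_sheet)
print(sample_summary)
```

If any step is recorded with `succeeded=False`, the summary reports `certified` as false, which corresponds to the slide 11 rule that the Application repeats DR Testing until certification is achieved.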