SlideShare a Scribd company logo
DOSE: Deployment and Operations
for Software Engineers
Disaster Recovery
© Len Bass 2019 2
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 3
Terminology
• Business continuity – keeping your
business going in the event of a disaster
• Involves customers, employees, protecting
people and equipment
• Disaster Recovery – the IT portion of
business continuity. Maintaining service
to customers
© Len Bass 2019 4
Key measures per system
• RTO – recovery time objective
• How long before system is in service again
• RPO – recovery point objective
• How much data can be lost in the event of
a disaster
• Will vary for each system in an
organization
© Len Bass 2019 5
Graphical representation
Time
Disaster
happens
RTORPO
Data saved System is available
with data saved
previously
© Len Bass 2019 6
Tiers
• Divide your systems into tiers based on
RTO and RPO
• Tier 1 (mission critical) – 15 minutes
• Tier 2 (important support ) - 2 hours
• Tier 3 (less important support) - 4 hours
• Tier 4 (everything else) - 24 hours
© Len Bass 2019 7
Secondary data center
• When a disaster
occurs, your data
center will be out of
operation for
days/weeks/months
• You need a
secondary (back up)
data center
© Len Bass 2019 8
Types of secondary data centers
• Warm – computing facilities in place but your
software has not been loaded
• Hot – computing facilities in place with your software
loaded and either executing or ready to execute.
Current data not in data center
• Mirrored – computing facilities in place, current
software in place and executing, data up to date.
• All secondary locations are geographically separated
from primary data center
© Len Bass 2019 9
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 10
Online or offline?
• Tiers 2-4 have recovery times in terms
of hours
• Major decision is whether to keep data
• Online – in a different region or availability
zone or
• Offline – on tape that is stored remotely
from primary data center
© Len Bass 2019 11
Considerations
• Volume of data
• Storage costs
• Encryption of data on tape
• Recovery time for tape from storage site
• Transfer time from online storage to
secondary site
© Len Bass 2019 12
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 13
Data in Tier 1
• RTO and RPO in minutes, not hours
• Data must be kept online and up to date
in secondary data center
• Secondary data center must be mirrored.
• Database system can be used to keep
transactional data consistent at both sites
© Len Bass 2019 14
Non transactional Tier 1 data
• Non replicated data.
• Session data may not be replicated
• Requires user to log in again if disaster
• Slowly changing data
• Static web pages, videos, other data
changes only slowly
• Can be kept up to date with configuration
management system
© Len Bass 2019 15
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 16
Big Data
• “Big Data” is a data set too large to back up
• Data is divided into groups – “shards”
• Each shard is replicated several times
• For performance and availability reasons
• Each shard is managed by database system
that keeps replicas up to date
© Len Bass 2019 17
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 18
Software in secondary data
center
• Must be kept in alignment with software in
primary data center
• Version inconsistency may lead to
behavioral inconsistency.
• Configuration management systems can work
across data centers
• Deployment process for modified software
should consider secondary data center
© Len Bass 2019 19
Overview
• Basic concepts
• Data Strategies for Tiers 2-4
• Tier 1 Data Management
• Big Data
• Software in the Secondary Data Centers
• Fail Over
© Len Bass 2019 20
Fail over
• Three activities to a fail over process
• Trigger switch to secondary data center
• Activate secondary data center
• Involves ensuring data and software are up to
date
• Resume operations at secondary data
center
© Len Bass 2019 21
Trigger
• Manual or automatic
• In either case, the trigger is scripted so that it is a one
button/command
• Automatic trigger should only be used with very short
RTO
• Requires data center failure detector that may
have false positives
• Any failover has business implications.
© Len Bass 2019 22
Activate secondary data center
• Data and software must be brought up to
date.
• If secondary data center is not mirrored, the
last back up must be restored
• User requests sent to secondary data center
• Software is activated based on tiers
© Len Bass 2019 23
Resume operations
• Make secondary data center be primary
• May need to locate a new secondary data
center in case of disaster at currently
operating center
© Len Bass 2019 24
Testing fail over
• If you can afford down time, test during
scheduled down time
• If no scheduled down time, then test
using staging environment where care
is taken to avoid corrupting production
data base
© Len Bass 2019 25
Summary
• RPO and RTO are basic measures to describe behavior in
case of failure. Used to divide apps into tiers
• Secondary data center must be identified
• Tiers 2-4 can use back ups, either online or offline
• Tier 1 data has database support to keep copy at
secondary data center up to date
• Software in secondary data center kept current with
configuration management system
• Fail over is scripted and must be tested

More Related Content

What's hot

GWAVACon 2013: Gain Control - ZENworks
GWAVACon 2013: Gain Control - ZENworksGWAVACon 2013: Gain Control - ZENworks
GWAVACon 2013: Gain Control - ZENworks
GWAVA
 

What's hot (20)

4 container management
4  container management4  container management
4 container management
 
03. non-functional-attributes-introduction-4-slides
03. non-functional-attributes-introduction-4-slides03. non-functional-attributes-introduction-4-slides
03. non-functional-attributes-introduction-4-slides
 
Chapter09
Chapter09Chapter09
Chapter09
 
04. availability-concepts
04. availability-concepts04. availability-concepts
04. availability-concepts
 
05. performance-concepts
05. performance-concepts05. performance-concepts
05. performance-concepts
 
08. networking-part-2
08. networking-part-208. networking-part-2
08. networking-part-2
 
Chapter08
Chapter08Chapter08
Chapter08
 
01. 02. introduction (13 slides)
01.   02. introduction (13 slides)01.   02. introduction (13 slides)
01. 02. introduction (13 slides)
 
01. 03.-introduction-to-infrastructure
01. 03.-introduction-to-infrastructure01. 03.-introduction-to-infrastructure
01. 03.-introduction-to-infrastructure
 
2.ibm flex system manager overview
2.ibm flex system manager overview2.ibm flex system manager overview
2.ibm flex system manager overview
 
GWAVACon 2013: Gain Control - ZENworks
GWAVACon 2013: Gain Control - ZENworksGWAVACon 2013: Gain Control - ZENworks
GWAVACon 2013: Gain Control - ZENworks
 
Citrix xenapp training
Citrix xenapp training Citrix xenapp training
Citrix xenapp training
 
CloudBridge and Repeater Datasheet
CloudBridge and Repeater DatasheetCloudBridge and Repeater Datasheet
CloudBridge and Repeater Datasheet
 
07. datacenters
07. datacenters07. datacenters
07. datacenters
 
05. performance-concepts-26-slides
05. performance-concepts-26-slides05. performance-concepts-26-slides
05. performance-concepts-26-slides
 
How to Balance System Speed and Risk for Multi-Platform Innovation
How to Balance System Speed and Risk for Multi-Platform InnovationHow to Balance System Speed and Risk for Multi-Platform Innovation
How to Balance System Speed and Risk for Multi-Platform Innovation
 
End-point Management
End-point ManagementEnd-point Management
End-point Management
 
Cisco Standard Network Platform (SNP) - Catholic Relief Services Case Study
Cisco Standard Network Platform (SNP) - Catholic Relief Services Case StudyCisco Standard Network Platform (SNP) - Catholic Relief Services Case Study
Cisco Standard Network Platform (SNP) - Catholic Relief Services Case Study
 
IBM Endpoint Manger for Power Management (Overview)
IBM Endpoint Manger for Power Management (Overview)IBM Endpoint Manger for Power Management (Overview)
IBM Endpoint Manger for Power Management (Overview)
 
V mware thin app 4.5 what_s new presentation
V mware thin app 4.5 what_s new presentationV mware thin app 4.5 what_s new presentation
V mware thin app 4.5 what_s new presentation
 

Similar to 10 disaster recovery

Similar to 10 disaster recovery (20)

The Impact of IoT on Data Centers
The Impact of IoT on Data CentersThe Impact of IoT on Data Centers
The Impact of IoT on Data Centers
 
Trends in Government ICT - Chasing Data, Information, and Decision Support
Trends in Government ICT - Chasing Data, Information, and Decision SupportTrends in Government ICT - Chasing Data, Information, and Decision Support
Trends in Government ICT - Chasing Data, Information, and Decision Support
 
Disaster Recover : 10 tips for disaster recovery planning
Disaster Recover : 10 tips for disaster recovery planningDisaster Recover : 10 tips for disaster recovery planning
Disaster Recover : 10 tips for disaster recovery planning
 
Webinar: How to Create a Disaster Recovery (DR) Plan that Actually Works
Webinar: How to Create a Disaster Recovery (DR) Plan that Actually WorksWebinar: How to Create a Disaster Recovery (DR) Plan that Actually Works
Webinar: How to Create a Disaster Recovery (DR) Plan that Actually Works
 
Governance webinar 09062016
Governance webinar 09062016Governance webinar 09062016
Governance webinar 09062016
 
Governance webinar 09062016
Governance webinar 09062016Governance webinar 09062016
Governance webinar 09062016
 
The Stratification of Data Center Responsibilities
The Stratification of Data Center ResponsibilitiesThe Stratification of Data Center Responsibilities
The Stratification of Data Center Responsibilities
 
Dutch Bangla Bank MIS
Dutch Bangla Bank MISDutch Bangla Bank MIS
Dutch Bangla Bank MIS
 
Database administrators (dbas) face increasing pressure to monitor databases
Database administrators (dbas) face increasing pressure to monitor databasesDatabase administrators (dbas) face increasing pressure to monitor databases
Database administrators (dbas) face increasing pressure to monitor databases
 
Just Say No to Spreadsheets: A Practical Approach to Automating Capacity Mana...
Just Say No to Spreadsheets: A Practical Approach to Automating Capacity Mana...Just Say No to Spreadsheets: A Practical Approach to Automating Capacity Mana...
Just Say No to Spreadsheets: A Practical Approach to Automating Capacity Mana...
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Chapter 2 auditing it governance controls
Chapter 2 auditing it governance controlsChapter 2 auditing it governance controls
Chapter 2 auditing it governance controls
 
Chapter 2 auditing it governance controls
Chapter 2 auditing it governance controlsChapter 2 auditing it governance controls
Chapter 2 auditing it governance controls
 
Notes Migrations Don't Have to be Hard
Notes Migrations Don't Have to be HardNotes Migrations Don't Have to be Hard
Notes Migrations Don't Have to be Hard
 
ROI for IP Address Management (IPAM) Solutions
ROI for IP Address Management (IPAM) SolutionsROI for IP Address Management (IPAM) Solutions
ROI for IP Address Management (IPAM) Solutions
 
Data Centre Strategy Summit 2015 "Are you ready to embark on your Data Cent...
Data Centre Strategy Summit 2015   "Are you ready to embark on your Data Cent...Data Centre Strategy Summit 2015   "Are you ready to embark on your Data Cent...
Data Centre Strategy Summit 2015 "Are you ready to embark on your Data Cent...
 
Case mis ch04
Case mis ch04Case mis ch04
Case mis ch04
 
Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4Network Sage™ Into To C Level V1.4
Network Sage™ Into To C Level V1.4
 
CISA Training - Chapter 4 - 2016
CISA Training - Chapter 4 - 2016CISA Training - Chapter 4 - 2016
CISA Training - Chapter 4 - 2016
 
From Disaster to Recovery: Preparing Your IT for the Unexpected
From Disaster to Recovery: Preparing Your IT for the UnexpectedFrom Disaster to Recovery: Preparing Your IT for the Unexpected
From Disaster to Recovery: Preparing Your IT for the Unexpected
 

More from Len Bass

Introduction to dev ops
Introduction to dev opsIntroduction to dev ops
Introduction to dev ops
Len Bass
 
Securing deployment pipeline
Securing deployment pipelineSecuring deployment pipeline
Securing deployment pipeline
Len Bass
 

More from Len Bass (20)

Devops syllabus
Devops syllabusDevops syllabus
Devops syllabus
 
DevOps Syllabus summer 2020
DevOps Syllabus summer 2020DevOps Syllabus summer 2020
DevOps Syllabus summer 2020
 
5 infrastructure security
5 infrastructure security5 infrastructure security
5 infrastructure security
 
Quantum talk
Quantum talkQuantum talk
Quantum talk
 
Icsa2018 blockchain tutorial
Icsa2018 blockchain tutorialIcsa2018 blockchain tutorial
Icsa2018 blockchain tutorial
 
Experience in teaching devops
Experience in teaching devopsExperience in teaching devops
Experience in teaching devops
 
Understanding blockchains
Understanding blockchainsUnderstanding blockchains
Understanding blockchains
 
What is a blockchain
What is a blockchainWhat is a blockchain
What is a blockchain
 
Dev ops and safety critical systems
Dev ops and safety critical systemsDev ops and safety critical systems
Dev ops and safety critical systems
 
My first deployment pipeline
My first deployment pipelineMy first deployment pipeline
My first deployment pipeline
 
Packaging tool options
Packaging tool optionsPackaging tool options
Packaging tool options
 
Introduction to dev ops
Introduction to dev opsIntroduction to dev ops
Introduction to dev ops
 
Securing deployment pipeline
Securing deployment pipelineSecuring deployment pipeline
Securing deployment pipeline
 
Deployability
DeployabilityDeployability
Deployability
 
Architecture for the cloud deployment case study future
Architecture for the cloud deployment case study futureArchitecture for the cloud deployment case study future
Architecture for the cloud deployment case study future
 
Architecting for the cloud cloud providers
Architecting for the cloud cloud providersArchitecting for the cloud cloud providers
Architecting for the cloud cloud providers
 
Architecting for the cloud storage build test
Architecting for the cloud storage build testArchitecting for the cloud storage build test
Architecting for the cloud storage build test
 
Architecting for the cloud map reduce creating
Architecting for the cloud   map reduce creatingArchitecting for the cloud   map reduce creating
Architecting for the cloud map reduce creating
 
Architecting for the cloud storage misc topics
Architecting for the cloud storage misc topicsArchitecting for the cloud storage misc topics
Architecting for the cloud storage misc topics
 
Architecting for the cloud elasticity security
Architecting for the cloud elasticity securityArchitecting for the cloud elasticity security
Architecting for the cloud elasticity security
 

Recently uploaded

AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
Alluxio, Inc.
 

Recently uploaded (20)

Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
 
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdfA Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
A Comprehensive Appium Guide for Hybrid App Automation Testing.pdf
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Breaking the Code : A Guide to WhatsApp Business API.pdf
Breaking the Code : A Guide to WhatsApp Business API.pdfBreaking the Code : A Guide to WhatsApp Business API.pdf
Breaking the Code : A Guide to WhatsApp Business API.pdf
 
GraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysisGraphAware - Transforming policing with graph-based intelligence analysis
GraphAware - Transforming policing with graph-based intelligence analysis
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 

10 disaster recovery

  • 1. DOSE: Deployment and Operations for Software Engineers Disaster Recovery
  • 2. © Len Bass 2019 2 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 3. © Len Bass 2019 3 Terminology • Business continuity – keeping your business going in the event of a disaster • Involves customers, employees, protecting people and equipment • Disaster Recovery – the IT portion of business continuity. Maintaining service to customers
  • 4. © Len Bass 2019 4 Key measures per system • RTO – recovery time objective • How long before system is in service again • RPO – recovery point objective • How much data can be lost in the event of a disaster • Will vary for each system in an organization
  • 5. © Len Bass 2019 5 Graphical representation Time Disaster happens RTORPO Data saved System is available with data saved previously
  • 6. © Len Bass 2019 6 Tiers • Divide your systems into tiers based on RTO and RPO • Tier 1 (mission critical) – 15 minutes • Tier 2 (important support ) - 2 hours • Tier 3 (less important support) - 4 hours • Tier 4 (everything else) - 24 hours
  • 7. © Len Bass 2019 7 Secondary data center • When a disaster occurs, your data center will be out of operation for days/weeks/months • You need a secondary (back up) data center
  • 8. © Len Bass 2019 8 Types of secondary data centers • Warm – computing facilities in place but your software has not been loaded • Hot – computing facilities in place with your software loaded and either executing or ready to execute. Current data not in data center • Mirrored – computing facilities in place, current software in place and executing, data up to date. • All secondary locations are geographically separated from primary data center
  • 9. © Len Bass 2019 9 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 10. © Len Bass 2019 10 Online or offline? • Tiers 2-4 have recovery times in terms of hours • Major decision is whether to keep data • Online – in a different region or availability zone or • Offline – on tape that is stored remotely from primary data center
  • 11. © Len Bass 2019 11 Considerations • Volume of data • Storage costs • Encryption of data on tape • Recovery time for tape from storage site • Transfer time from online storage to secondary site
  • 12. © Len Bass 2019 12 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 13. © Len Bass 2019 13 Data in Tier 1 • RTO and RPO in minutes, not hours • Data must be kept online and up to date in secondary data center • Secondary data center must be mirrored. • Database system can be used to keep transactional data consistent at both sites
  • 14. © Len Bass 2019 14 Non transactional Tier 1 data • Non replicated data. • Session data may not be replicated • Requires user to log in again if disaster • Slowly changing data • Static web pages, videos, other data changes only slowly • Can be kept up to date with configuration management system
  • 15. © Len Bass 2019 15 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 16. © Len Bass 2019 16 Big Data • “Big Data” is a data set too large to back up • Data is divided into groups – “shards” • Each shard is replicated several times • For performance and availability reasons • Each shard is managed by database system that keeps replicas up to date
  • 17. © Len Bass 2019 17 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 18. © Len Bass 2019 18 Software in secondary data center • Must be kept in alignment with software in primary data center • Version inconsistency may lead to behavioral inconsistency. • Configuration management systems can work across data centers • Deployment process for modified software should consider secondary data center
  • 19. © Len Bass 2019 19 Overview • Basic concepts • Data Strategies for Tiers 2-4 • Tier 1 Data Management • Big Data • Software in the Secondary Data Centers • Fail Over
  • 20. © Len Bass 2019 20 Fail over • Three activities to a fail over process • Trigger switch to secondary data center • Activate secondary data center • Involves ensuring data and software are up to date • Resume operations at secondary data center
  • 21. © Len Bass 2019 21 Trigger • Manual or automatic • In either case, the trigger is scripted so that it is a one button/command • Automatic trigger should only be used with very short RTO • Requires data center failure detector that may have false positives • Any failover has business implications.
  • 22. © Len Bass 2019 22 Activate secondary data center • Data and software must be brought up to date. • If secondary data center is not mirrored, the last back up must be restored • User requests sent to secondary data center • Software is activated based on tiers
  • 23. © Len Bass 2019 23 Resume operations • Make secondary data center be primary • May need to locate a new secondary data center in case of disaster at currently operating center
  • 24. © Len Bass 2019 24 Testing fail over • If you can afford down time, test during scheduled down time • If no scheduled down time, then test using staging environment where care is taken to avoid corrupting production data base
  • 25. © Len Bass 2019 25 Summary • RPO and RTO are basic measures to describe behavior in case of failure. Used to divide apps into tiers • Secondary data center must be identified • Tiers 2-4 can use back ups, either online or offline • Tier 1 data has database support to keep copy at secondary data center up to date • Software in secondary data center kept current with configuration management system • Fail over is scripted and must be tested