Principles and definitions: HA and DR
• Business continuity
Ability to recover business operations within specified parameters in case of specified disasters
• Continuous availability
Operation of a system where unplanned outages prevent the operation for at most 5 minutes
per year (“five nines” or 99.999% availability)
• High availability
Operation of a system where unplanned outages prevent the operation for at most a few
seconds or minutes while failover occurs. Often used as an umbrella term that includes
continuous availability
• Disaster recovery
Operation of a system with a plan and process for reconstructing or recovering operations in a
separate location in case of disaster.
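The availability terms above map directly to yearly downtime budgets. A quick illustration (plain Python arithmetic, not tied to any PureApplication feature):

```python
# Convert an availability percentage into a yearly unplanned-downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Maximum minutes of unplanned outage per year at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

# "Five nines" (continuous availability) allows roughly 5.26 minutes per year.
print(round(downtime_minutes_per_year(99.999), 2))  # 5.26
```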
Principles and definitions: Active, Passive, etc.
• Active–Active
A system where continuous or high availability is achieved by having active operation in multiple
locations simultaneously
• Active–Standby (or “warm standby”)
A system where high availability is achieved by having active operation in one location with
another location or locations able to become active within seconds or minutes after a
“failover” of responsibility
• Active–Passive (or “cold standby”)
A system where high availability or disaster recovery is achieved by having active operation in
one location with another location or locations able to become active within minutes or hours
after a “failover” of responsibility
Principles and definitions: RTO and RPO
• RTO: recovery time objective
How long it takes for an HA or DR procedure to bring a system back into operation
• RPO: recovery point objective
How much data (measured in elapsed time) might be lost in the event of a disaster
[Figure: RPO spectrum from zero (mirrored file systems) through seconds and minutes (replicated file systems) to hours and days (backup and restore)]
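The RTO and RPO defined above are simple time deltas, which makes them easy to reason about concretely. A minimal illustration (all timestamps are made up):

```python
from datetime import datetime

# Hypothetical disaster timeline (illustrative timestamps only).
last_replicated = datetime(2024, 1, 1, 12, 0, 0)    # last data safely at the DR site
disaster        = datetime(2024, 1, 1, 12, 0, 45)   # primary site lost
back_online     = datetime(2024, 1, 1, 12, 30, 45)  # service restored at the DR site

rpo_achieved = disaster - last_replicated   # data written in this window is lost
rto_achieved = back_online - disaster       # how long the outage lasted

print(rpo_achieved.total_seconds())       # 45.0 (seconds of lost data)
print(rto_achieved.total_seconds() / 60)  # 30.0 (minutes to recover)
```

The replication scheme you choose bounds the first number; your recovery procedures (and how well they are practiced) bound the second.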
Principles and definitions: Scenarios
• Metropolitan distance: multiple data centers within 100–300km
–High availability is achievable using Active–Active or Active–Standby solutions that involve active
mirroring of data between sites.
–Disaster recovery with zero RPO is achievable using Active–Passive solutions that involve replication of
data between sites.
• Regional to global distance: multiple data centers beyond 200–300km
–Disaster recovery with nonzero RPO is achievable using Active–Passive solutions that involve
replication of data between sites.
Principles and definitions: Personas
• Application architect
Responsible for planning the application design in such a way that high availability or disaster
recovery is achievable (e.g., separating application from data)
• Infrastructure administrator
Responsible for configuring and managing infrastructure in such a way as to achieve the ability
to implement high availability or disaster recovery (e.g., configuring and managing disk
mirroring or replication)
• Application administrator
Responsible for deploying and managing the components of an application in such a way as to
achieve high availability or disaster recovery (e.g., deploying the application in duplicate
between two sites and orchestrating the failover of the application and its disks together with
the infrastructure administrator)
Principles: Automation and repeatability
• Automate all aspects of your application’s deployment and configuration
–Using PureApplication patterns, pattern components, script packages, customized images
–Using external application lifecycle tooling such as IBM UrbanCode Deploy
• Why? This achieves rapid and confident repeatability of your application deployment, allowing:
–Quality and control: lower risk and chance of error
–Agility and simplicity
• Quickly recover application if you need to redeploy it
• Quickly deploy your application at separate sites for HA or DR purposes
• Quickly deploy new versions of the application for test or upgrade purposes
• Create a continuous integration lifecycle for faster and more frequent application deployment and testing
–Portability: deploy to other cloud environments (e.g., PureApplication Service)
Principles: Separation of application and data
• Ensure that all persistent data (transaction logs, database, etc.) is stored on separate disks from
the application or database application itself
• Why? This multiplies your recovery options because it decouples your strategy for application
and data recovery, which often must be addressed in different ways:
–Application recovery may involve backup & restore, re–deployment, or deployment in multiple locations.
Often the application cannot be replicated due to infrastructure entanglement
–Data recovery may involve backup & restore, replication, or mirroring
• This also allows additional flexibility for development and test cycles, for example:
–Deploy new versions of the application or database server and connect to original data
–Deploy test instances of the application using copies of the production data
Principles: Transaction consistency
If your application stores data in multiple locations (e.g., transaction logs on file server and transactions in
database), then you must ensure that either:
• The “lower” statements of record are replicated with total consistency together with the “higher”
statements of record, or else
• The “lower” statements of record are at all times replicated in advance of the “higher” statements of record.
This ensures that you do not replicate inconsistent data (e.g., transaction log indicates a transaction is
committed but the transaction is not present in the database). So, for example:
• Your database and fileserver disks are replicated together with strict consistency, or instead
• Your database is replicated synchronously (zero RPO) but your fileserver asynchronously (nonzero RPO).
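The rule above can be phrased as an invariant: every transaction the “higher” record (the log) claims as committed must already be present in the “lower” record (the database) at the recovery site. A sketch of that check, with invented transaction IDs and no ties to any real replication product:

```python
# Consistency invariant for replicated statements of record: the "lower" record
# (database) must be at or ahead of the "higher" record (transaction log).
# All transaction IDs below are illustrative.

def replica_is_consistent(log_committed_ids, db_replica_ids):
    """True if every transaction the log claims committed exists in the database."""
    return set(log_committed_ids) <= set(db_replica_ids)

# Database replicated synchronously (ahead), log asynchronously (behind): safe.
print(replica_is_consistent([1, 2], [1, 2, 3]))        # True

# Log ahead of the database: transaction 4 looks committed but is missing.
print(replica_is_consistent([1, 2, 3, 4], [1, 2, 3]))  # False
```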
Tools: Compute node availability
• PureApplication System offers two options for planning for failure of compute nodes:
–Cloud group HA, if enabled, will reserve 1/n CPU and memory overhead on each compute node in a
cloud group containing n compute nodes. If one compute node fails, all VMs will be recovered into this
reserved space on the remaining nodes.
–System HA allows you to designate one or more compute nodes as spares for all cloud groups that are
enabled for system HA. This allows you both to (1) allocate more than one spare, and also (2) share a
spare between multiple cloud groups.
• If neither cloud group HA nor system HA is enabled and a compute node fails, the system will
attempt to recover as many VMs as possible on the remaining nodes in the cloud group, subject to
available capacity.
• VMs being recovered will experience an outage equivalent to being rebooted.
• Recommendation: always enable cloud group HA or system HA
–This ensures your workload capacity is restored quickly after a compute node failure
–This also ensures that workload does not need to be stopped for planned compute node maintenance
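The capacity arithmetic behind cloud group HA is worth seeing explicitly: reserving 1/n of each of n nodes sets aside exactly one node's worth of capacity. A sketch (plain arithmetic; capacity units are arbitrary):

```python
# Usable capacity of a cloud group with cloud group HA enabled: with n nodes,
# 1/n of each node is reserved, leaving (n - 1) nodes' worth of usable capacity.

def usable_capacity(nodes: int, per_node_capacity: float) -> float:
    reserved_per_node = per_node_capacity / nodes  # the 1/n overhead
    return nodes * (per_node_capacity - reserved_per_node)

# A 4-node cloud group of 100-unit nodes keeps 300 units usable; the reserved
# 100 units absorb the recovered VMs of one failed node.
print(usable_capacity(4, 100))  # 300.0
```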
Tools: Block storage
Block storage volumes in PureApplication System:
• May be up to 8TB in size
• Are allocated and managed independently of VM storage, and can be attached and detached
• Are not included in VM snapshots
• Can be cloned (copied)
• Can be exported to and imported from external SCP servers
• Can be grouped for time–consistent cloning or export of multiple volumes
Tools: Shared block storage
• Block storage volumes may be shared (simultaneously attached) by virtual machines
–On the same system
Note: this is supported on Intel, and on Power beginning with V2.2.
–Between systems. Notes:
• This is supported only for external block storage that resides outside of the system (see later slide).
• This is supported on Intel. Support on Power is forthcoming.
• This allows for creation of highly available clusters (GPFS, GFS, DB2 pureScale, Windows cluster)
–A clustering protocol is necessary for sharing of the disk
–The IBM GPFS pattern (see later slide) supports GPFS clusters on a single rack using shared block
storage, but does not support cross–system clusters using shared external block storage
–Storage volumes must be specifically created as “shared” volumes
–Special placement techniques are required in the pattern to ensure anti–collocation of VMs
–IBM GPFS pattern supports clustering (see below)
Tools: Block storage replication
Two PureApplication Systems can be connected for replication of block storage
• Connectivity options
–Fibre Channel connectivity supported beginning in V2.0
–TCP/IP connectivity supported beginning in V2.2
• Volumes are selected for replication individually
–Replicate in either direction
–Replicate synchronously up to 3ms latency (~300km), asynchronously up to 80ms latency (~8000km).
RPO for asynchronous replication is up to 1 second.
• All volumes are replicated together with strict consistency
• Target volume must not be attached while replication is taking place
• Replication may be terminated (unplanned failover) or reversed in place (planned failover).
Reverse in place requires volume to be unattached on both sides.
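The latency limits quoted above imply a simple selection rule for replication mode. A sketch (the thresholds are the ones stated above; the function itself is illustrative, not a product API):

```python
# Choose a block storage replication mode from measured inter-site latency,
# using the limits quoted above (synchronous <= 3 ms, asynchronous <= 80 ms).

def replication_mode(latency_ms: float) -> str:
    if latency_ms <= 3:
        return "synchronous"   # zero RPO achievable (~300 km)
    if latency_ms <= 80:
        return "asynchronous"  # RPO up to ~1 second (~8000 km)
    return "unsupported"       # beyond the supported replication distance

print(replication_mode(1.5))  # synchronous
print(replication_mode(20))   # asynchronous
print(replication_mode(120))  # unsupported
```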
Tools: External block storage
• PureApplication System can connect to external SVC, V7000, V9000 devices:
–Allows block volumes and shared block volumes to be accessed by VMs on PureApplication System.
Base VM disks cannot reside on external storage.
–Depending on extent size, allows for volumes larger than 8TB in size
–Requires both TCP/IP and Fibre Channel connectivity to the external device
• All volume management is performed outside of system
–Volumes are allocated and deleted by admin on external device
–Alternate storage providers, RAID configurations, or combinations of HDD and SSD may be used
–Volumes may be mirrored externally (e.g., SVC–managed mirroring across multiple devices)
–Volumes may be replicated externally (e.g., SVC to SVC replication between data centers)
• Advanced scenarios: sharing access to the same SVC cluster or V7000, or to replicated volumes:
–Two systems sharing access to cluster or to replicated volumes
–PureApplication System and PureApplication Software sharing access to cluster or replicated volumes
Tools: IBM GPFS (General Parallel File System) / Spectrum Scale
• GPFS is:
–A shared filesystem (like NFS)
–Optionally: a clustered filesystem (unlike NFS) providing HA and high performance.
Note: clustering supported on Power Systems beginning with V2.2.
–Optionally: mirrored between cloud groups or systems
• A tiebreaker (on third rack or external system) is required for quorum
• Mirroring is not recommended above 1–3ms (~100–300km) latency
–Optionally: (using block storage or external storage replication) replicated between systems
[Table comparing GPFS configurations: Shared, Clustered, Mirrored, Replicated]
Tools: Multi–system deployment
• Connect systems in a “deployment subdomain” for cross–system pattern deployment
–Virtual machines for individual vsys.next or vapp deployments may be distributed across systems
–Allows for easier deployment and management of highly available applications using a single pattern
–Systems may be located in same or different data centers
• Notes and restrictions
–Up to four systems may be connected (limit is two systems prior to V2.2)
–Inter–system network latencies must be less than 3ms (~300km)
–An external 1GB iSCSI tiebreaker target must be configured for quorum purposes
–Special network configuration is required for inter–system management communications
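The external tiebreaker exists so that a network split between systems can be resolved by majority vote. The underlying quorum rule can be sketched as follows (illustrative logic only, not the product's implementation):

```python
# Quorum with a tiebreaker: a partition may stay active only if it holds a
# strict majority of votes. Each system votes once; the tiebreaker adds one
# vote, so an even split (e.g., 1 vs 1) is always resolved.

def has_quorum(systems_in_partition: int, total_systems: int,
               holds_tiebreaker: bool) -> bool:
    votes = systems_in_partition + (1 if holds_tiebreaker else 0)
    total_votes = total_systems + 1  # all systems plus the tiebreaker
    return votes > total_votes / 2

# Two systems split 1 vs 1: only the side still reaching the tiebreaker wins.
print(has_quorum(1, 2, holds_tiebreaker=True))   # True
print(has_quorum(1, 2, holds_tiebreaker=False))  # False
```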
Scenario: Test application, middleware, or schema update
Copy block storage from production application for use in testing
Scenario: Update application or middleware
When both the current and new application and middleware can share the same database
without conflict (e.g., no changes to database schema), you can run the newer version of the
application or middleware side by side for testing, and then eventually direct clients to the new
version and retire the old version.
Scenario: Backward incompatible updates to database or schema
In some cases, a new version of an application, database server, or database schema may be
unable to coexist with the existing application. In this case, you can use the “copy” strategy on a
previous slide to test the upgrade of your application. When you are ready to promote the new
version to production, you can detach the block storage from the existing deployment and attach
it to the upgraded deployment.
Scenario: HA planning for compute node failure
• Deploy multiple instances of each service so that each service continues if one instance is lost
• Enable cloud group or system HA so that failed instances can be recovered quickly
Scenario: recovery planning for VM failure or corruption
• Backup and restore of the VM itself is feasible if it can be recovered in place
• If the VM cannot be recovered:
–If the VM is part of a horizontally scalable cluster, you can scale
in to remove the failed VM and scale out to create a new VM
–If the VM is not horizontally scalable, you must plan to re–deploy it:
• You can deploy the entire pattern again and recover the data to it
• You may be able to deploy a new pattern that recreates only the failed VM,
and use manual or scripted configuration to reconnect it to your existing deployment
Scenario: recovery planning for database corruption
You may use your database’s own capabilities for backup and restore, import and export.
Alternatively, you may use block storage copies (and optionally export and import) to back up your
database. Attach the backup copy (importing it beforehand if necessary) to restore.
Scenario: HA planning for system or site failure
• As with planning for compute node failure, deploy multiple instances: now across systems.
• You may deploy separately on each system, or use multi–system deployment across systems.
• The distance at which HA is possible is limited by inter–site latency.
• GPFS clustering is optional. It can provide additional throughput and also additional availability
on a single system.
[Diagram: duplicate deployments on System A and System B]
Scenario: Two–tier HA planning for system or site failure
• Compared to the previous slide, if you desire HA both within a site and also between sites, you
must duplicate your application, database and filesystem both within and between sites.
• Native database replication between sites must be synchronous, or may be asynchronous if you
have no need of GPFS (see slide 11).
[Diagram: a load balancer or DNS directing clients to duplicated tiers at Site A and Site B]
Scenario: DR planning for rack or site failure
• You should expect nonzero RPO if the sites are too far apart to allow synchronous replication
• Applications must be quiesced at the recovery site because the replicated target disks cannot be attached while replication is taking place
• The database is here replicated using disk replication for transaction consistency. You can use
native database replication (as on slide 28) only if it is synchronous, or asynchronously only if
you have no need of GPFS (see slide 11).
[Diagram: a load balancer or DNS in front of deployments on System A and System B]
Scenario: horizontal scaling and bursting
• Use of the base scaling policy allows you to scale horizontally, manually or in some cases
automatically, by adding new instances of a virtual machine with clustered software components.
• When using multi–system deployment, horizontally scaled virtual machines will be distributed
as much as possible across systems referenced in your environment profile
• An alternate approach, especially in heterogeneous environments like PureApplication System
and PureApplication Service, is to deploy new pattern instances for scaling or bursting, and
federate them together.
Caveats: Networking considerations
• Some middleware is sensitive to IP addresses and hostnames (e.g., WAS) and for DR purposes
you may need to plan to duplicate either IP addresses or hostnames in your backup data center
• Both HA architectures and zero–RPO DR architectures are sensitive to latency. If latency is too
high you can experience poor write throughput or even mirroring or replication failure. For
these cases you should ideally plan for less than 1ms (~100km) of latency between sites.
• You must also plan for adequate network throughput between sites when mirroring or replicating data.
• HA architectures require the use of a tiebreaker to govern quorum–leader determination in case
of a network split. In a multi–site HA design, you should plan to locate the quorum at a third
location, with equally low latency.
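The ~1ms per ~100km rule of thumb used throughout follows from the speed of light in fiber. A back-of-envelope sketch (the propagation constant is an assumption, roughly two-thirds of c, and real routes add switching and routing overhead that this ignores):

```python
# Round-trip propagation latency over optical fiber. Light covers roughly
# 200 km of fiber per millisecond (about 2/3 of c); real paths are longer
# and add equipment latency, so treat this as a lower bound.

FIBER_KM_PER_MS = 200  # assumed one-way propagation speed

def round_trip_ms(distance_km: float) -> float:
    return 2 * distance_km / FIBER_KM_PER_MS

print(round_trip_ms(100))  # 1.0 -> comfortable for mirroring
print(round_trip_ms(300))  # 3.0 -> at the synchronous replication limit
```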
Caveats: Middleware–specific considerations
• Combining both mirroring and replication (Active–Active–Passive–Passive)
–The IBM GPFS pattern does not support combining both mirroring and replication
–This combination is possible for other middleware (e.g., DB2 as on slide 29), but you must manually
determine and designate which instance is Primary or Secondary at the time of recovery
• Read carefully your middleware’s recommendations for configuring HA. For example:
–IBM WebSphere recommends against cross–site cells
–The IBM DB2 HADR pattern preconfigures a reservationless IP–based tiebreaker; verify that this tiebreaker meets your quorum requirements
–IBM DB2 HADR provides a variety of synchronization modes with different RPO characteristics
• Ensure your middleware tolerates attaching existing storage if you replicate or copy volumes
–The IBM DB2 HADR pattern requires an empty disk when first deploying. You can attach a new disk or
replicate into this disk only after deployment.
–The IBM GPFS pattern does not support attaching existing GPFS disks
Caveats: Virtual machine backup and restore
The power and flexibility of PureApplication patterns means that your PureApplication VMs are
tightly integrated both within a single deployment, and with the system on which they are deployed.
Because of this tight integration, you cannot use backup and restore techniques to recover your
PureApplication VMs unless you are recovering to the exact same virtual machine that was
previously backed up.
Your cloud strategy for recovering corrupted deployments should build on the efficiency and
repeatability of patterns so that you are able to re–deploy in the event of extreme failure
scenarios such as accidental virtual machine deletion or total system failure.
Caveats: Practice, practice, practice
Because of the complexity of HA and DR implementation, and especially because of some of the
caveats we have noted and which you may encounter in your unique situation, it is vital for you to
practice all aspects of your HA or DR implementation and lifecycle before you roll it out into production.
This includes testing network bandwidth and latency to their expected limits. It also includes
simulating failures and verifying and perfecting your procedures for recovery and also for failback.
References
• Implementing High Availability and Disaster Recovery in IBM PureApplication Systems V2
• “Implement multisystem management and deployment with IBM PureApplication System”
• “Demystifying virtual machine placement in IBM PureApplication System”
• “High availability (again) versus continuous availability”
• “Can I run a WebSphere Application Server cell over multiple data centers?”
• “Increase DB2 availability”
• “HADR sync mode”