SlideShare a Scribd company logo
1 of 66
Download to read offline
InterConnect
2017
HHM-6891 Making MQ Resilient
across Your Datacenters and the
Cloud – High Availability, Disaster
Recovery etc
Mark Taylor
marke_taylor@uk.ibm.com
Please Note
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal
without notice at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction and
it should not be relied on in making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products
may not be incorporated into any contract. The development, release, and timing of any future
features or functionality described for our products remains at our sole discretion.
• Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will vary
depending upon many factors, including considerations such as the amount of multiprogramming in
the user’s job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results similar to those
stated here.
Abstract
• IBM MQ provides many capabilities that will keep your data safe and your business running in the
event of failures whether you're running on your own systems in your data centres, on-premise IaaS,
IaaS in a public cloud, or a hybrid cloud across all of these. This session introduces you to the
solutions available and how they can be effectively used together to build extremely reliable
environments providing HA and DR for messaging on-premise in in the hybrid cloud.
Introduction
• Availability is a very large subject
• You can have the best technology in the world, but you have to manage it correctly
• Technology is not a substitute for good planning and testing!
What is DR
• Getting applications running after a major (often whole-site) failure or loss
• It is not about High Availability although often the two are related and share design and
implementation choices
‒ “HA is having 2, DR is having them a long way apart”
‒ More seriously, HA is about keeping things running, while DR is about recovering when HA has failed.
• Requirements driven by business, and often by regulators
‒ Data integrity, timescales, geography …
• One major decision point: cost
‒ How much does DR cost you, even if it’s never used?
‒ How much are you prepared to lose
Disaster Recovery vs High Availability
• Designs for HA typically involve a single site for each component of the overall architecture
• Designs for DR typically involve separate sites
• Designs for HA (and CA) typically require no data loss
• Designs for DR typically can have limited data loss
• Designs for HA typically involve high-speed takeover
• Designs for DR typically can permit several hours down-time
HIGH AVAILABILITY
Single Points of Failure
• With no redundancy or fault tolerance, a failure of any component can lead to a loss of availability
• Every component is critical. The system relies on the:
‒ Power supply, system unit, CPU, memory
‒ Disk controller, disks, network adapter, network cable
‒ ...and so on
• Various techniques have been developed to tolerate failures:
‒ UPS or dual supplies for power loss
‒ RAID for disk failure
‒ Fault-tolerant architectures for CPU/memory failure
‒ ...etc
• Elimination of SPOFs is important to achieve HA
IBM MQ HA technologies
• Queue manager clusters
• Queue-sharing groups
• Support for networked storage
• Multi-instance queue managers
• MQ Appliance
• HA clusters
• Client reconnection
Queue Manager Clusters
• Sharing cluster queues on multiple queue
managers prevents a queue from being a
SPOF
• Cluster workload algorithm automatically
routes traffic away from failed queue
managers
Queue-Sharing Groups
• On z/OS, queue managers can be members of a queue-sharing
group
• Shared queues are held in a coupling facility
‒ All queue managers in the QSG can access the messages
• Benefits:
‒ Messages remain available even if a queue manager fails
‒ Pull workload balancing
‒ Apps can connect to the group Queue
manager
Private
queues
Queue
manager
Private
queues
Queue
manager
Private
queues
Shared
queues
Introduction to Failover and MQ
• Failover is the automatic switching of availability of a service
‒ For MQ, the “service” is a queue manager
• Traditionally the preserve of an HA cluster, such as HACMP
• Requires:
‒ Data accessible on all servers
‒ Equivalent or at least compatible servers
➢ Common software levels and environment
‒ Sufficient capacity to handle workload after failure
➢ Workload may be rebalanced after failover requiring spare capacity
‒ Startup processing of queue manager following the failure
• MQ offers several ways of configuring for failover:
‒ Multi-instance queue managers
‒ HA clusters
‒ MQ Appliance
Failover considerations
• Failover times are made up of three parts:
‒ Time taken to notice the failure
➢ Heartbeat missed
➢ Bad result from status query
‒ Time taken to establish the environment before activating the service
➢ Switching IP addresses and disks, and so on
‒ Time taken to activate the service
➢ This is queue manager restart
• Failover involves a queue manager restart
‒ Nonpersistent messages, nondurable subscriptions discarded
• For fastest times, ensure that queue manager restart is fast
‒ No long running transactions, for example
MULTI-INSTANCE
QUEUE MANAGERS
Multi-instance Queue Managers
• Basic failover support without HA cluster
• Two instances of a queue manager on different machines
‒ One is the “active” instance, other is the “standby” instance
‒ Active instance “owns” the queue manager’s files
➢ Accepts connections from applications
‒ Standby instance monitors the active instance
➢ Applications cannot connect to the standby instance
➢ If active instance fails, standby restarts queue manager and becomes active
• Instances are the SAME queue manager – only one set of data files
‒ Queue manager data is held in networked storage
Multi-instance Queue Managers
1. Normal
execution
Owns the queue manager data
MQ
Client
Machine A Machine B
QM1
QM1
Active
instance
QM1
Standby
instance
can fail-over
MQ
Client
network
168.0.0.2168.0.0.1
networked storage
Multi-instance Queue Managers
2. Disaster
strikes
MQ
Client
Machine A Machine B
QM1
QM1
Active
instance
QM1
Standby
instance
locks freed
MQ
Client
network
IPA
networked storage
168.0.0.2
Client
connections
broken
Multi-instance Queue Managers
3. FAILOVER
Standby
becomes
active
MQ
Client
Machine B
QM1
QM1
Active
instance
MQ
Client
network
networked storage
Owns the queue manager data
168.0.0.2
Client
connection
still broken
Multi-instance Queue Managers
4. Recovery
complete
MQ
Client
Machine B
QM1
QM1
Active
instance
MQ
Client
network
networked storage
Owns the queue manager data
168.0.0.2
Client
connections
reconnect
Dealing with multiple IP addresses
• The IP address of the queue manager changes when it moves
‒ So MQ channel configuration needs way to select address
• Connection name syntax extended to a comma-separated list
‒ CONNAME(‘168.0.0.1,168.0.0.2’)
‒ Needs 7.0.1 qmgr or client
• Unless you use external IPAT or an intelligent router or MR01
HA CLUSTERS
HA clusters
• MQ traditionally made highly available using an HA cluster
‒ IBM PowerHA for AIX (formerly HACMP), Veritas Cluster Server, Microsoft Cluster Server, HP Serviceguard,
…
• HA clusters can:
‒ Coordinate multiple resources such as application server, database
‒ Consist of more than two machines
‒ Failover more than once without operator intervention
‒ Takeover IP address as part of failover
‒ Likely to be more resilient in cases of MQ and OS defects
HA clusters
• In HA clusters, queue manager data and logs are placed on a shared disk
‒ Disk is switched between machines during failover
• The queue manager has its own “service” IP address
‒ IP address is switched between machines during failover
‒ Queue manager’s IP address remains the same after failover
• The queue manager is defined to the HA cluster as a resource dependent on the shared disk and the
IP address
‒ During failover, the HA cluster will switch the disk, take over the IP address and then start the queue manager
HA cluster
MQ in an HA cluster – Active/active
Normal
execution
MQ
Client
Machine A Machine B
QM1
Active
instance
MQ
Client
network
168.0.0.1
QM2
Active
instance
QM2
data
and logs
QM1
data
and logs
shared disk
168.0.0.2
HA cluster
MQ in an HA cluster – Active/active
FAILOVER MQ
Client
Machine A Machine B
MQ
Client
network
168.0.0.1
QM2
Active
instance
QM2
data
and logs
QM1
data
and logs
shared disk
168.0.0.2
QM1
Active
instance
Shared disk
switched
IP address
takeover
Queue
manager
restarted
Multi-instance QM or HA cluster?
• Multi-instance queue manager
‒ Integrated into the MQ product
‒ Faster failover than HA cluster
➢ Delay before queue manager restart is much shorter
‒ Runtime performance of networked storage
‒ System administrator responsible for restarting a standby instance after failover
• HA cluster
‒ Capable of handling a wider range of failures
‒ Failover historically rather slow, but some HA clusters are improving
‒ Capable of more flexible configurations (eg N+1)
‒ Extra product purchase and skills required
• Storage distinction
‒ Multi-instance queue manager typically uses NAS
‒ HA clustered queue manager typically uses SAN
Primary Secondary
High Availability for the MQ Appliance
• IBM MQ Appliances can be deployed in HA pairs
‒ Primary instance of queue manager runs on one
‒ Secondary instance on the other for HA protection
• Primary and secondary work together
‒ Operations on primary automatically replicated to secondary
‒ Appliances monitor one another and perform local restart/failover
• Easier config than other HA solutions (no shared file system/shared disks)
• Supports manual failover, e.g. for rolling upgrades
• Replication is synchronous over Ethernet, for 100% fidelity
‒ Routable but not intended for long distances
Setting up HA for the Appliance
• The following command is run from appl1:
‒ prepareha -s <some random text> -a <address of appl2>
• The following command is run from appl2:
‒ crthagrp -s <the same random text> -a <address of appl1>
• crtmqm -sx HAQM1
• Note that there is no need to run strmqm
Virtual Images
• Another mechanism being regularly used
• When MQ is in a virtual machine … simply shoot and restart the VM
• “Turning it off and back on again”
• Can be faster than any other kind of failover
MQ Virtual System Pattern for PureApplication System
• MQ Pattern for Pure supports multi-instance HA model
• Exploits GPFS storage
• Use the pattern editor to create a virtual machine containing MQ and GPFS
HA for MQ in clouds
• Fundamentally the same options
• Orchestration tools to help provision
• Different filesystems with different characteristics
‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/mq_aws_efs
‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/POC_MQ_HA_using_Ceph
‒ https://github.com/ibm-messaging/mq-aws/tree/master/drbd
Shared Queues,
HP NonStop Server continuous continuous
MQ
Clusters none continuous
continuousautomatic
automatic automatic
none none
HA Clustering,
Multi-instance
No special
support
Access to
existing messages
Access for
new messages
Comparison of Technologies
APPLICATIONS AND
AUTO-RECONNECTION
HA applications – MQ connectivity
• If an application loses connection to a queue manager, what does it do?
‒ End abnormally
‒ Handle the failure and retry the connection
‒ Reconnect automatically thanks to application container
➢ WebSphere Application Server contains logic to reconnect JMS clients
‒ Use MQ automatic client reconnection
Automatic client reconnection
• MQ client automatically reconnects when connection broken
‒ MQI C clients and standalone JMS clients
‒ JMS in app servers (EJB, MDB) does not need auto-reconnect
• Reconnection includes reopening queues, remaking subscriptions
‒ All MQI handles keep their original values
• Can reconnect to same queue manager or another, equivalent queue manager
• MQI or JMS calls block until connection is remade
‒ By default, will wait for up to 30 minutes
‒ Long enough for a queue manager failover (even a really slow one)
Automatic client reconnection
• Can register event handler to observe reconnection
• Not all MQI is seamless, but majority repaired transparently
‒ Browse cursors revert to the top of the queue
‒ Nonpersistent messages are discarded during restart
‒ Nondurable subscriptions are remade and may miss some messages
‒ In-flight transactions backed out
• Tries to keep dynamic queues with same name
‒ If queue manager doesn’t restart, reconnecting client’s TDQs are kept for a while in case it reconnects
‒ If queue manager does restart, TDQs are recreated when it reconnects
Automatic client reconnection
• Enabled in application code, ini file or CLNTCONN definition
‒ MQI: MQCNO_RECONNECT, MQCNO_RECONNECT_Q_MGR
‒ JMS: Connection factory properties
• Plenty of opportunity for configuration
‒ Reconnection timeout
‒ Frequency of reconnection attempts
• Requires:
‒ Threaded client
‒ 7.0.1 server – including z/OS
‒ Full-duplex client communications (SHARECNV >= 1)
Client Configurations for Availability
• Use wildcarded queue manager names in CCDT
‒ Gets weighted distribution of connections
‒ Selects a “random” queue manager from an equivalent set
• Use V9 CCDT ftp/http downloads to simplify distribution of configurations
• Use multiple addresses in a CONNAME
‒ Could potentially point at different queue managers
‒ More likely pointing at the same queue manager in a multi-instance setup
• Use automatic reconnection
• Pre-connect Exit from V7.0.1.4
• Use IP routers to select address from a list
‒ Based on workload or anything else known to the router
• Can use all of these in combination!
Application Patterns for availability
• Article describing examples of how to build a hub topology supporting:
‒ Continuous availability to send MQ messages, with no single point of failure
‒ Linear horizontal scale of throughput, for both MQ and the attaching applications
‒ Exactly once delivery, with high availability of individual persistent messages
‒ Three messaging styles: Request/response, fire-and-forget, and pub/sub
• http://www.ibm.com/developerworks/websphere/library/techarticles/1303_broadhurst/1303_broadhurst.html
Disaster Recovery
What makes a Queue Manager on Dist?
ini files
Registry
System
ini files
Registry
QMgr
Recovery
Logs
QFiles
/var/mqm/log/QMGR
/var/mqm/qmgrs/QMGR
Obj Definiitions
Security
Cluster
etc
SSL Store
Backups
• At minimum, backup definitions at regular intervals
‒ Include ini files and security settings
• One view is there is no point to backing up messages
‒ They will be obsolete if they ever need to be restored
‒ Distributed platforms – data backup only possible when qmgr stopped
• Use rcdmqimg on Distributed platforms to take images
‒ Channel sync information is recovered even for circular logs
• Backup everything before upgrading code levels
‒ On Distributed, you cannot go back
• Exclude queue manager data from normal system backups
‒ Some backup products interfere with MQ processing
What makes a Queue Manager on z/OS?
ACTIVE LOGSARCHIVED LOGS LOG
BSDS
VSAM
DS
HIGHEST RBA
CHKPT LIST
LOG INVENTORY
VSAM Linear
DS
Tapes
VSAM
DS
PagesetsPrivate Objects
Private Messages
RECOV RBAs
CPCPCP
CF
Shared Messages
Shared Objects
Group Objects
DB2
......
What makes up a Queue Manager?
• Queue manager started task procedure
‒ Specifies MQ libraries to use, location of BSDS and pagesets and INP1, INP2 members start up processing
• System Parameter Module – zParm
‒ Configuration settings for logging, trace and connection environments for MQ
• BSDS: Vital for Queue Manager start up
‒ Contains info about log RBAs, checkpoint information and log dataset names
• Active and Archive Logs: Vital for Queue Manager start up
‒ Contain records of all recoverable activity performed by the Queue Manager
• Pagesets
‒ Updates made “lazily” and brought “up to date” from logs during restart
‒ Start up with an old pageset (restored backup) is not really any different from start up after queue manager
failure
‒ Backup needs to copy page 0 of pageset first (don’t do volume backup!)
• DB2 Configuration information & Group Object Definitions
• Coupling Facility Structures
‒ Hold QSG control information and MQ messages
Backing Up a z/OS Queue Manager
• Keep copies of ZPARM, MSTR procedure, product datasets and INP1/INP2 members
• Use dual BSDS, dual active and dual archive logs
• Take backups of your pagesets
‒ This can be done while the queue manager is running (fuzzy backups)
‒ Make sure you backup Page 0 first, REPRO or ADRDSSU logical copy
• DB2 data should be backed up as part of the DB2 backup procedures
• CF application structures should be backed up on a regular basis
‒ These are made in the logs of the queue manager where the backup was issued
Remote Recovery
Topologies
• Sometimes a data centre is kept PURELY as the DR site
• Sometimes 2 data centres are in daily use; back each other up for disasters
‒ Normal workload distributed to the 2 sites
‒ These sites are probably geographically distant
• Another variation has 2 data centres “near” each other
‒ Often synchronous replication
‒ With a 3rd site providing a long-distance backup
• And of course further variations and combinations of these
Queue Manager Connections
• DR topologies have little difference for individual queue managers
• But they do affect overall design
‒ Where do applications connect to
‒ How are messages routed
• Clients need ClntConn definitions that reach any machine
• Will be affected by how you manage network
‒ Do DNS names move with the site?
‒ Do IP addresses move with the site?
• Some sites always put IP addresses in CONNAME; others use hostname
‒ No rule on which is better
Disk replication
• Disk replication can be used for MQ disaster recovery
• Either synchronous or asynchronous disk replication is OK
‒ Synchronous:
➢ No data loss if disaster occurs
➢ Performance is impacted by replication delay
➢ Limited by distance (eg 100km)
‒ Asynchronous:
➢ Some limited data loss if disaster occurs
➢ It is critical that queue manager data and logs are replicated in the same consistency group if replicating both
• Disk replication cannot be used between the active and standby instances of a multi-instance queue
manager
‒ Could be used to replicate to a DR site in addition though
Combining HA and DR
QM1
Machine A
Active
instance
Machine B
Standby
instance
shared storage
Machine C
Backup
instance
QM1Replication
Primary Site Backup Site
HA Pair
Combining HA and DR – “Active/Active”
QM1
Machine A
Active
instance
Machine B
Standby
instance
Machine C
Backup
instance QM1
Replication
Site 1 Site 2
HA Pair
Machine C
Active
instance
Machine D
Standby
instance
Machine A
Backup
instance
QM2 QM2
Replication
HA Pair
Use of "spare" machines
• Don't use DR queue managers for "other" work when not needed
‒ Using the machines may be OK, though not recommended
‒ Definitely do not try to use the queue managers as test/dev environments
Primary Secondary
DR for the MQ Appliance
• IBM MQ Appliances can now be deployed with DR option
• Similar to HA design
‒ But using async replication for longer-distance
DR Role
Primary
Secondary
DR Status
Normal
Sync in progress
Partitioned
Remote Appliance(s) Unavailable
Inactive
Inconsistent
Remote appliance(s) not configured
DR Synchronization progress XX% complete
DR estimated synchronization time Estimated absolute time at which the
synchronization will complete
Out of sync data Amount of data in KB written to this
instance since the Partition
DR Replication
Asynchronous
(10 Gb Ethernet)
8.0.0.4 introduced DR
but with one major
restriction – appliances
and the queue
managers they host
can participate either in
HA Groups, or DR but
not both at the same
time
The DR appliance is
asynchronously updated
from whichever HA node
is active
HA Replication
Synchronous
(10Gb Ethernet)
Note that this does still
not (yet) allow
symmetrical HA pair to
HA pair replication
Disaster Recovery for HA groups
Integration with other products
• May want to have consistency with other data resources
‒ For example, databases and app servers
• Only way for guaranteed consistency is disk replication where all logs are in same group
‒ Otherwise transactional state might be out of sync
DR in Cloud configurations
• Don't assume that using a cloud absolves you from DR considerations
‒ Recent AWS outage is good demonstration (but not the only one)
• "I think we have grown accustomed to the idea that the cloud has become a panacea for a lot of
companies. It is important for businesses to recognize that the cloud is their new legacy system, and if
the worst does occur the situation can be worse for businesses using cloud than those choosing their
own private data centers, because they have less visibility and control."
• Some published articles/code on configuring replication for MQ within clouds
‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/Using_DRBD_to_replicate_data_for_
a_queue_manager?lang=en
• Basic concepts are the same, as you would expect
‒ Selection of technologies may be restricted by the environments
Planning and Testing
Planning for Recovery
• Write a DR plan
‒ Document everything – to tedious levels of detail
‒ Include actual commands, not just a description of the operation
➢ Not “Stop MQ”, but “as mqm, run /usr/local/bin/stopmq.sh US.PROD.01”
• And test it frequently
‒ Recommend twice a year
‒ Record time taken for each task
• Remember that the person executing the plan in a real emergency might be under-skilled and over-
pressured
‒ Plan for no access to phones, email, online docs …
• Each test is likely to show something you’ve forgotten
‒ Update the plan to match
‒ You’re likely to have new applications, hardware, software …
• May have different plans for different disaster scenarios
Example Exercises from MQ Development
• Different groups have different activities that must continue
‒ Realistic scenarios can help show what might not be available
• From the MQ development lab …
• Most of the change team were told there was a virulent disease and they had to work from home
‒ Could they continue to support customers
• If Hursley machine room was taken out by a plane missing its landing at Southampton airport
‒ Could we carry on developing the MQ product
‒ Source code libraries, build machines, test machines …
‒ Could fixes be produced
• (A common one) Someone hit emergency power-off button
• Not just paper exercises
Networking Considerations
• DNS - You will probably redirect hostnames to a new site
‒ But will you also keep the same IP addresses?
‒ Including NAT when routing to external partners?
‒ Affects CONNAME
• Include external organisations in your testing
‒ 3rd parties may have firewalls that do not recognize your DR servers
• LOCLADDR configuration
‒ Not normally used by MQ, but firewalls, IPT and channel exits may inspect it
‒ May need modification if a machine changes address
• Clustering needs special consideration
‒ Easy to accidentally join the real cluster and start stealing messages
‒ Ideally keep network separated, but can help by:
➢ Not giving backup ‘live’ security certs
➢ Not starting chinit address space (z/OS); not allowing channel initiators to start (distributed)
➢ Use CHLAUTH rules
• Backup will be out of sync with the cluster
‒ REFRESH CLUSTER() resolves updates
A Real MQ Network Story
• Customer did an IP move during a DR test
• Forgot to do the IP move back when they returned to prime systems
• Didn’t have monitoring in place that picked this up until users complained about lack of response
Other Resources
• Applications may need to deal with replay or loss of data.
‒ Decide whether to clear queues down to a known state, or enough information elsewhere to manage replays
• Order of recovery may change with different product releases
‒ Every time you install a new version of a product revisit your DR plan
• What do you really need to recover
‒ DR site might be lower-power than primary site
‒ Some apps might not be critical to the business
‒ But some might be unrecognised prereqs
If a Real Disaster Hits
• Hopefully you never need it. But if the worst happens:
• Follow your tested plan
‒ Don’t try shortcuts
• But also, if possible:
‒ Get someone to take notes and keep track of the time tasks took
‒ Prepare to attend post mortem meetings on steps you took to recover
‒ Accept all offers of assistance
• And afterwards:
‒ Update your plan for the next time
Summary
• Various ways of recovering queue managers
• Plan what you need to recover for MQ
• Plan the relationship with other resources
• Test your plan
Notices and disclaimers
Copyright © 2017 by International Business Machines Corporation (IBM).
No part of this document may be reproduced or transmitted in any form
without written permission from IBM.
U.S. Government Users Restricted Rights — use, duplication or
disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to
products that have not yet been announced by IBM) has been reviewed
for accuracy as of the date of initial publication and could include
unintentional technical or typographical errors. IBM shall have no
responsibility to update this information. This document is distributed “as
is” without any warranty, either express or implied. In no event shall IBM
be liable for any damage arising from the use of this information,
including but not limited to, loss of data, business interruption, loss of
profit or loss of opportunity. IBM products and services are warranted
according to the terms and conditions of the agreements under which
they are provided.
IBM products are manufactured from new parts or new and used parts.
In some cases, a product may not be new and may have been previously
installed. Regardless, our warranty terms apply.”
Any statements regarding IBM's future direction, intent or product plans
are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a
controlled, isolated environments. Customer examples are presented
as illustrations of how those customers have used IBM products and
the results they may have achieved. Actual performance, cost, savings or
other results in other operating environments may vary.
References in this document to IBM products, programs, or services
does not imply that IBM intends to make such products, programs or
services available in all countries in which IBM operates or does
business.
Workshops, sessions and associated materials may have been prepared
by independent session speakers, and do not necessarily reflect the
views of IBM. All materials and discussions are provided for informational
purposes only, and are neither intended to, nor shall constitute legal or
other guidance or advice to any individual participant or their specific
situation.
It is the customer’s responsibility to insure its own compliance with legal
requirements and to obtain advice of competent legal counsel as to
the identification and interpretation of any relevant laws and regulatory
requirements that may affect the customer’s business and any actions
the customer may need to take to comply with such laws. IBM does not
provide legal advice or represent or warrant that its services or products
will ensure that the customer is in compliance with any law.
Notices and disclaimers continued
Information concerning non-IBM products was obtained from the
suppliers of those products, their published announcements or
other publicly available sources. IBM has not tested
those products in connection with this publication and cannot
confirm the accuracy of performance, compatibility or any other
claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the
suppliers of those products. IBM does not warrant the quality of
any third-party products, or the ability of any such third-party
products to interoperate with IBM’s products. IBM expressly
disclaims all warranties, expressed or implied, including but not
limited to, the implied warranties of merchantability and fitness
for a particular, purpose.
The provision of the information contained herein is not intended
to, and does not, grant any right or license under any IBM
patents, copyrights, trademarks or other intellectual
property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live,
CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise
Document Management System™, FASP®, FileNet®,
Global Business Services®,
Global Technology Services®, IBM ExperienceOne™, IBM
SmartCloud®, IBM Social Business®, Information on Demand,
ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®,
OMEGAMON, OpenPower, PureAnalytics™, PureApplication®,
pureCluster™, PureCoverage®, PureData®, PureExperience®,
PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®,
Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS,
Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli® Trusteer®,
Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-
Force® and System z® Z/OS, are trademarks of International
Business Machines Corporation, registered in many jurisdictions
worldwide. Other product and service names might
be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark
information" at: www.ibm.com/legal/copytrade.shtml.

More Related Content

What's hot

Deploying CloudStack with Ceph
Deploying CloudStack with CephDeploying CloudStack with Ceph
Deploying CloudStack with CephShapeBlue
 
IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)Juarez Junior
 
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...Edielson P. Frigieri
 
IBM MQ - What's new in 9.2
IBM MQ - What's new in 9.2IBM MQ - What's new in 9.2
IBM MQ - What's new in 9.2David Ware
 
VMware virtual SAN 6 overview
VMware virtual SAN 6 overviewVMware virtual SAN 6 overview
VMware virtual SAN 6 overviewsolarisyougood
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Brian Brazil
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?David Ware
 
IBM MQ on cloud and containers
IBM MQ on cloud and containersIBM MQ on cloud and containers
IBM MQ on cloud and containersRobert Parker
 
An Introduction to VMware NSX
An Introduction to VMware NSXAn Introduction to VMware NSX
An Introduction to VMware NSXScott Lowe
 
EMC Vnx master-presentation
EMC Vnx master-presentationEMC Vnx master-presentation
EMC Vnx master-presentationsolarisyougood
 
A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875Duncan Epping
 
Introduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesIntroduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesJason TC HOU (侯宗成)
 
Emc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshopEmc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshopsolarisyougood
 

What's hot (20)

Deploying CloudStack with Ceph
Deploying CloudStack with CephDeploying CloudStack with Ceph
Deploying CloudStack with Ceph
 
IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...
M2M Protocols for Constrained Environments in the Context of IoT: A Compariso...
 
IBM MQ - What's new in 9.2
IBM MQ - What's new in 9.2IBM MQ - What's new in 9.2
IBM MQ - What's new in 9.2
 
CloudStack Architecture
CloudStack ArchitectureCloudStack Architecture
CloudStack Architecture
 
VMware virtual SAN 6 overview
VMware virtual SAN 6 overviewVMware virtual SAN 6 overview
VMware virtual SAN 6 overview
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
IBM MQ on cloud and containers
IBM MQ on cloud and containersIBM MQ on cloud and containers
IBM MQ on cloud and containers
 
Vitualisation
VitualisationVitualisation
Vitualisation
 
An Introduction to VMware NSX
An Introduction to VMware NSXAn Introduction to VMware NSX
An Introduction to VMware NSX
 
VNX Overview
VNX Overview   VNX Overview
VNX Overview
 
EMC Vnx master-presentation
EMC Vnx master-presentationEMC Vnx master-presentation
EMC Vnx master-presentation
 
A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875A day in the life of a VSAN I/O - STO7875
A day in the life of a VSAN I/O - STO7875
 
Introduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesIntroduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network Issues
 
Emc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshopEmc vnx2 technical deep dive workshop
Emc vnx2 technical deep dive workshop
 
Apache CloudStack from API to UI
Apache CloudStack from API to UIApache CloudStack from API to UI
Apache CloudStack from API to UI
 

Similar to IBM MQ High Availabillity and Disaster Recovery (2017 version)

Ame 2269 ibm mq high availability
Ame 2269 ibm mq high availabilityAme 2269 ibm mq high availability
Ame 2269 ibm mq high availabilityAndrew Schofield
 
IBM MQ Disaster Recovery
IBM MQ Disaster RecoveryIBM MQ Disaster Recovery
IBM MQ Disaster RecoveryMarkTaylorIBM
 
Hhm 3474 mq messaging technologies and support for high availability and acti...
Hhm 3474 mq messaging technologies and support for high availability and acti...Hhm 3474 mq messaging technologies and support for high availability and acti...
Hhm 3474 mq messaging technologies and support for high availability and acti...Pete Siddall
 
VMware Log Insight
VMware Log Insight VMware Log Insight
VMware Log Insight Iwan Rahabok
 
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...Peter Broadhurst
 
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...wangbo626
 
Software Architecture for Cloud Infrastructure
Software Architecture for Cloud InfrastructureSoftware Architecture for Cloud Infrastructure
Software Architecture for Cloud InfrastructureTapio Rautonen
 
IBM MQ - better application performance
IBM MQ - better application performanceIBM MQ - better application performance
IBM MQ - better application performanceMarkTaylorIBM
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msApache Apex
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterContinuent
 
Next-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msNext-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msIlya Ganelin
 
Simplifying Hyper-V Management for VMware Administrators
Simplifying Hyper-V Management for VMware AdministratorsSimplifying Hyper-V Management for VMware Administrators
Simplifying Hyper-V Management for VMware Administrators5nine
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesScyllaDB
 
Oracle business continuity for virtualization and cloud infrastructure
Oracle business continuity for virtualization and cloud infrastructureOracle business continuity for virtualization and cloud infrastructure
Oracle business continuity for virtualization and cloud infrastructureOTN Systems Hub
 
Best Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBBest Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBMariaDB plc
 
Choosing the right high availability strategy
Choosing the right high availability strategyChoosing the right high availability strategy
Choosing the right high availability strategyMariaDB plc
 
What's new in informix v11.70
What's new in informix v11.70What's new in informix v11.70
What's new in informix v11.70am_prasanna
 
Virtualizing Tier One Applications - Varrow
Virtualizing Tier One Applications - VarrowVirtualizing Tier One Applications - Varrow
Virtualizing Tier One Applications - VarrowAndrew Miller
 
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemCSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemHendrik van Run
 

Similar to IBM MQ High Availabillity and Disaster Recovery (2017 version) (20)

Ame 2269 ibm mq high availability
Ame 2269 ibm mq high availabilityAme 2269 ibm mq high availability
Ame 2269 ibm mq high availability
 
IBM MQ Disaster Recovery
IBM MQ Disaster RecoveryIBM MQ Disaster Recovery
IBM MQ Disaster Recovery
 
Hhm 3474 mq messaging technologies and support for high availability and acti...
Hhm 3474 mq messaging technologies and support for high availability and acti...Hhm 3474 mq messaging technologies and support for high availability and acti...
Hhm 3474 mq messaging technologies and support for high availability and acti...
 
VMware Log Insight
VMware Log Insight VMware Log Insight
VMware Log Insight
 
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M...
 
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...
AME-1934 : Enable Active-Active Messaging Technology to Extend Workload Balan...
 
Software Architecture for Cloud Infrastructure
Software Architecture for Cloud InfrastructureSoftware Architecture for Cloud Infrastructure
Software Architecture for Cloud Infrastructure
 
IBM MQ - better application performance
IBM MQ - better application performanceIBM MQ - better application performance
IBM MQ - better application performance
 
Capital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 msCapital One's Next Generation Decision in less than 2 ms
Capital One's Next Generation Decision in less than 2 ms
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
 
Next-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2msNext-Gen Decision Making in Under 2ms
Next-Gen Decision Making in Under 2ms
 
Simplifying Hyper-V Management for VMware Administrators
Simplifying Hyper-V Management for VMware AdministratorsSimplifying Hyper-V Management for VMware Administrators
Simplifying Hyper-V Management for VMware Administrators
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million Devices
 
Oracle business continuity for virtualization and cloud infrastructure
Oracle business continuity for virtualization and cloud infrastructureOracle business continuity for virtualization and cloud infrastructure
Oracle business continuity for virtualization and cloud infrastructure
 
Best Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBBest Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDB
 
Choosing the right high availability strategy
Choosing the right high availability strategyChoosing the right high availability strategy
Choosing the right high availability strategy
 
What's new in informix v11.70
What's new in informix v11.70What's new in informix v11.70
What's new in informix v11.70
 
Virtualizing Tier One Applications - Varrow
Virtualizing Tier One Applications - VarrowVirtualizing Tier One Applications - Varrow
Virtualizing Tier One Applications - Varrow
 
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemCSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
 
Ame 4166 ibm mq appliance
Ame 4166 ibm mq applianceAme 4166 ibm mq appliance
Ame 4166 ibm mq appliance
 

More from MarkTaylorIBM

IBM MQ Clustering (2017 version)
IBM MQ Clustering (2017 version)IBM MQ Clustering (2017 version)
IBM MQ Clustering (2017 version)MarkTaylorIBM
 
IBM MQ - Monitoring and Managing Hybrid Messaging Environments
IBM MQ - Monitoring and Managing Hybrid Messaging EnvironmentsIBM MQ - Monitoring and Managing Hybrid Messaging Environments
IBM MQ - Monitoring and Managing Hybrid Messaging EnvironmentsMarkTaylorIBM
 
IBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsIBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsMarkTaylorIBM
 
MQ What's New Beyond V8 - V8003 level
MQ What's New Beyond V8 - V8003 levelMQ What's New Beyond V8 - V8003 level
MQ What's New Beyond V8 - V8003 levelMarkTaylorIBM
 
MQ Security Overview
MQ Security OverviewMQ Security Overview
MQ Security OverviewMarkTaylorIBM
 
IBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsIBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsMarkTaylorIBM
 
What's new in IBM MQ Messaging
What's new in IBM MQ MessagingWhat's new in IBM MQ Messaging
What's new in IBM MQ MessagingMarkTaylorIBM
 
What's New in IBM MQ - Version 8
What's New in IBM MQ - Version 8What's New in IBM MQ - Version 8
What's New in IBM MQ - Version 8MarkTaylorIBM
 

More from MarkTaylorIBM (10)

IBM MQ Clustering (2017 version)
IBM MQ Clustering (2017 version)IBM MQ Clustering (2017 version)
IBM MQ Clustering (2017 version)
 
IBM MQ V9 Overview
IBM MQ V9 OverviewIBM MQ V9 Overview
IBM MQ V9 Overview
 
IBM MQ - Monitoring and Managing Hybrid Messaging Environments
IBM MQ - Monitoring and Managing Hybrid Messaging EnvironmentsIBM MQ - Monitoring and Managing Hybrid Messaging Environments
IBM MQ - Monitoring and Managing Hybrid Messaging Environments
 
IBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsIBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platforms
 
MQ V8004 Summary
MQ V8004 SummaryMQ V8004 Summary
MQ V8004 Summary
 
MQ What's New Beyond V8 - V8003 level
MQ What's New Beyond V8 - V8003 levelMQ What's New Beyond V8 - V8003 level
MQ What's New Beyond V8 - V8003 level
 
MQ Security Overview
MQ Security OverviewMQ Security Overview
MQ Security Overview
 
IBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platformsIBM MQ - Comparing Distributed and z/OS platforms
IBM MQ - Comparing Distributed and z/OS platforms
 
What's new in IBM MQ Messaging
What's new in IBM MQ MessagingWhat's new in IBM MQ Messaging
What's new in IBM MQ Messaging
 
What's New in IBM MQ - Version 8
What's New in IBM MQ - Version 8What's New in IBM MQ - Version 8
What's New in IBM MQ - Version 8
 

Recently uploaded

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 

Recently uploaded (20)

Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 

IBM MQ High Availabillity and Disaster Recovery (2017 version)

  • 1. InterConnect 2017 HHM-6891 Making MQ Resilient across Your Datacenters and the Cloud – High Availability, Disaster Recovery etc Mark Taylor marke_taylor@uk.ibm.com
  • 2. Please Note • IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. • Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. • The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. • Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 3. Abstract • IBM MQ provides many capabilities that will keep your data safe and your business running in the event of failures whether you're running on your own systems in your data centres, on-premise IaaS, IaaS in a public cloud, or a hybrid cloud across all of these. This session introduces you to the solutions available and how they can be effectively used together to build extremely reliable environments providing HA and DR for messaging on-premise in in the hybrid cloud.
  • 4. Introduction • Availability is a very large subject • You can have the best technology in the world, but you have to manage it correctly • Technology is not a substitute for good planning and testing!
  • 5. What is DR • Getting applications running after a major (often whole-site) failure or loss • It is not about High Availability although often the two are related and share design and implementation choices ‒ “HA is having 2, DR is having them a long way apart” ‒ More seriously, HA is about keeping things running, while DR is about recovering when HA has failed. • Requirements driven by business, and often by regulators ‒ Data integrity, timescales, geography … • One major decision point: cost ‒ How much does DR cost you, even if it’s never used? ‒ How much are you prepared to lose
  • 6. Disaster Recovery vs High Availability • Designs for HA typically involve a single site for each component of the overall architecture • Designs for DR typically involve separate sites • Designs for HA (and CA) typically require no data loss • Designs for DR typically can have limited data loss • Designs for HA typically involve high-speed takeover • Designs for DR typically can permit several hours down-time
  • 8. Single Points of Failure • With no redundancy or fault tolerance, a failure of any component can lead to a loss of availability • Every component is critical. The system relies on the: ‒ Power supply, system unit, CPU, memory ‒ Disk controller, disks, network adapter, network cable ‒ ...and so on • Various techniques have been developed to tolerate failures: ‒ UPS or dual supplies for power loss ‒ RAID for disk failure ‒ Fault-tolerant architectures for CPU/memory failure ‒ ...etc • Elimination of SPOFs is important to achieve HA
  • 9. IBM MQ HA technologies • Queue manager clusters • Queue-sharing groups • Support for networked storage • Multi-instance queue managers • MQ Appliance • HA clusters • Client reconnection
  • 10. Queue Manager Clusters • Sharing cluster queues on multiple queue managers prevents a queue from being a SPOF • Cluster workload algorithm automatically routes traffic away from failed queue managers
  • 11. Queue-Sharing Groups • On z/OS, queue managers can be members of a queue-sharing group • Shared queues are held in a coupling facility ‒ All queue managers in the QSG can access the messages • Benefits: ‒ Messages remain available even if a queue manager fails ‒ Pull workload balancing ‒ Apps can connect to the group Queue manager Private queues Queue manager Private queues Queue manager Private queues Shared queues
  • 12. Introduction to Failover and MQ • Failover is the automatic switching of availability of a service ‒ For MQ, the “service” is a queue manager • Traditionally the preserve of an HA cluster, such as HACMP • Requires: ‒ Data accessible on all servers ‒ Equivalent or at least compatible servers ➢ Common software levels and environment ‒ Sufficient capacity to handle workload after failure ➢ Workload may be rebalanced after failover requiring spare capacity ‒ Startup processing of queue manager following the failure • MQ offers several ways of configuring for failover: ‒ Multi-instance queue managers ‒ HA clusters ‒ MQ Appliance
  • 13. Failover considerations • Failover times are made up of three parts: ‒ Time taken to notice the failure ➢ Heartbeat missed ➢ Bad result from status query ‒ Time taken to establish the environment before activating the service ➢ Switching IP addresses and disks, and so on ‒ Time taken to activate the service ➢ This is queue manager restart • Failover involves a queue manager restart ‒ Nonpersistent messages, nondurable subscriptions discarded • For fastest times, ensure that queue manager restart is fast ‒ No long running transactions, for example
  • 15. Multi-instance Queue Managers • Basic failover support without HA cluster • Two instances of a queue manager on different machines ‒ One is the “active” instance, other is the “standby” instance ‒ Active instance “owns” the queue manager’s files ➢ Accepts connections from applications ‒ Standby instance monitors the active instance ➢ Applications cannot connect to the standby instance ➢ If active instance fails, standby restarts queue manager and becomes active • Instances are the SAME queue manager – only one set of data files ‒ Queue manager data is held in networked storage
  • 16. Multi-instance Queue Managers 1. Normal execution Owns the queue manager data MQ Client Machine A Machine B QM1 QM1 Active instance QM1 Standby instance can fail-over MQ Client network 168.0.0.2168.0.0.1 networked storage
  • 17. Multi-instance Queue Managers 2. Disaster strikes MQ Client Machine A Machine B QM1 QM1 Active instance QM1 Standby instance locks freed MQ Client network IPA networked storage 168.0.0.2 Client connections broken
  • 18. Multi-instance Queue Managers 3. FAILOVER Standby becomes active MQ Client Machine B QM1 QM1 Active instance MQ Client network networked storage Owns the queue manager data 168.0.0.2 Client connection still broken
  • 19. Multi-instance Queue Managers 4. Recovery complete MQ Client Machine B QM1 QM1 Active instance MQ Client network networked storage Owns the queue manager data 168.0.0.2 Client connections reconnect
  • 20. Dealing with multiple IP addresses • The IP address of the queue manager changes when it moves ‒ So MQ channel configuration needs way to select address • Connection name syntax extended to a comma-separated list ‒ CONNAME(‘168.0.0.1,168.0.0.2’) ‒ Needs 7.0.1 qmgr or client • Unless you use external IPAT or an intelligent router or MR01
  • 22. HA clusters • MQ traditionally made highly available using an HA cluster ‒ IBM PowerHA for AIX (formerly HACMP), Veritas Cluster Server, Microsoft Cluster Server, HP Serviceguard, … • HA clusters can: ‒ Coordinate multiple resources such as application server, database ‒ Consist of more than two machines ‒ Failover more than once without operator intervention ‒ Takeover IP address as part of failover ‒ Likely to be more resilient in cases of MQ and OS defects
  • 23. HA clusters • In HA clusters, queue manager data and logs are placed on a shared disk ‒ Disk is switched between machines during failover • The queue manager has its own “service” IP address ‒ IP address is switched between machines during failover ‒ Queue manager’s IP address remains the same after failover • The queue manager is defined to the HA cluster as a resource dependent on the shared disk and the IP address ‒ During failover, the HA cluster will switch the disk, take over the IP address and then start the queue manager
  • 24. HA cluster MQ in an HA cluster – Active/active Normal execution MQ Client Machine A Machine B QM1 Active instance MQ Client network 168.0.0.1 QM2 Active instance QM2 data and logs QM1 data and logs shared disk 168.0.0.2
  • 25. HA cluster MQ in an HA cluster – Active/active FAILOVER MQ Client Machine A Machine B MQ Client network 168.0.0.1 QM2 Active instance QM2 data and logs QM1 data and logs shared disk 168.0.0.2 QM1 Active instance Shared disk switched IP address takeover Queue manager restarted
  • 26. Multi-instance QM or HA cluster? • Multi-instance queue manager ‒ Integrated into the MQ product ‒ Faster failover than HA cluster ➢ Delay before queue manager restart is much shorter ‒ Runtime performance of networked storage ‒ System administrator responsible for restarting a standby instance after failover • HA cluster ‒ Capable of handling a wider range of failures ‒ Failover historically rather slow, but some HA clusters are improving ‒ Capable of more flexible configurations (eg N+1) ‒ Extra product purchase and skills required • Storage distinction ‒ Multi-instance queue manager typically uses NAS ‒ HA clustered queue manager typically uses SAN
  • 27. Primary Secondary High Availability for the MQ Appliance • IBM MQ Appliances can be deployed in HA pairs ‒ Primary instance of queue manager runs on one ‒ Secondary instance on the other for HA protection • Primary and secondary work together ‒ Operations on primary automatically replicated to secondary ‒ Appliances monitor one another and perform local restart/failover • Easier config than other HA solutions (no shared file system/shared disks) • Supports manual failover, e.g. for rolling upgrades • Replication is synchronous over Ethernet, for 100% fidelity ‒ Routable but not intended for long distances
  • 28. Setting up HA for the Appliance • The following command is run from appl1: ‒ prepareha -s <some random text> -a <address of appl2> • The following command is run from appl2: ‒ crthagrp -s <the same random text> -a <address of appl1> • crtmqm -sx HAQM1 • Note that there is no need to run strmqm
  • 29. Virtual Images • Another mechanism being regularly used • When MQ is in a virtual machine … simply shoot and restart the VM • “Turning it off and back on again” • Can be faster than any other kind of failover
  • 30. MQ Virtual System Pattern for PureApplication System • MQ Pattern for Pure supports multi-instance HA model • Exploits GPFS storage • Use the pattern editor to create a virtual machine containing MQ and GPFS
  • 31. HA for MQ in clouds • Fundamentally the same options • Orchestration tools to help provision • Different filesystems with different characteristics ‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/mq_aws_efs ‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/POC_MQ_HA_using_Ceph ‒ https://github.com/ibm-messaging/mq-aws/tree/master/drbd
  • 32. Shared Queues, HP NonStop Server continuous continuous MQ Clusters none continuous continuousautomatic automatic automatic none none HA Clustering, Multi-instance No special support Access to existing messages Access for new messages Comparison of Technologies
  • 34. HA applications – MQ connectivity • If an application loses connection to a queue manager, what does it do? ‒ End abnormally ‒ Handle the failure and retry the connection ‒ Reconnect automatically thanks to application container ➢ WebSphere Application Server contains logic to reconnect JMS clients ‒ Use MQ automatic client reconnection
  • 35. Automatic client reconnection • MQ client automatically reconnects when connection broken ‒ MQI C clients and standalone JMS clients ‒ JMS in app servers (EJB, MDB) does not need auto-reconnect • Reconnection includes reopening queues, remaking subscriptions ‒ All MQI handles keep their original values • Can reconnect to same queue manager or another, equivalent queue manager • MQI or JMS calls block until connection is remade ‒ By default, will wait for up to 30 minutes ‒ Long enough for a queue manager failover (even a really slow one)
  • 36. Automatic client reconnection • Can register event handler to observe reconnection • Not all MQI is seamless, but majority repaired transparently ‒ Browse cursors revert to the top of the queue ‒ Nonpersistent messages are discarded during restart ‒ Nondurable subscriptions are remade and may miss some messages ‒ In-flight transactions backed out • Tries to keep dynamic queues with same name ‒ If queue manager doesn’t restart, reconnecting client’s TDQs are kept for a while in case it reconnects ‒ If queue manager does restart, TDQs are recreated when it reconnects
  • 37. Automatic client reconnection • Enabled in application code, ini file or CLNTCONN definition ‒ MQI: MQCNO_RECONNECT, MQCNO_RECONNECT_Q_MGR ‒ JMS: Connection factory properties • Plenty of opportunity for configuration ‒ Reconnection timeout ‒ Frequency of reconnection attempts • Requires: ‒ Threaded client ‒ 7.0.1 server – including z/OS ‒ Full-duplex client communications (SHARECNV >= 1)
  • 38. Client Configurations for Availability • Use wildcarded queue manager names in CCDT ‒ Gets weighted distribution of connections ‒ Selects a “random” queue manager from an equivalent set • Use V9 CCDT ftp/http downloads to simplify distribution of configurations • Use multiple addresses in a CONNAME ‒ Could potentially point at different queue managers ‒ More likely pointing at the same queue manager in a multi-instance setup • Use automatic reconnection • Pre-connect Exit from V7.0.1.4 • Use IP routers to select address from a list ‒ Based on workload or anything else known to the router • Can use all of these in combination!
  • 39. Application Patterns for availability • Article describing examples of how to build a hub topology supporting: ‒ Continuous availability to send MQ messages, with no single point of failure ‒ Linear horizontal scale of throughput, for both MQ and the attaching applications ‒ Exactly once delivery, with high availability of individual persistent messages ‒ Three messaging styles: Request/response, fire-and-forget, and pub/sub • http://www.ibm.com/developerworks/websphere/library/techarticles/1303_broadhurst/1303_broadhurst.html
  • 41. What makes a Queue Manager on Dist? ini files Registry System ini files Registry QMgr Recovery Logs QFiles /var/mqm/log/QMGR /var/mqm/qmgrs/QMGR Obj Definiitions Security Cluster etc SSL Store
  • 42. Backups • At minimum, backup definitions at regular intervals ‒ Include ini files and security settings • One view is there is no point to backing up messages ‒ They will be obsolete if they ever need to be restored ‒ Distributed platforms – data backup only possible when qmgr stopped • Use rcdmqimg on Distributed platforms to take images ‒ Channel sync information is recovered even for circular logs • Backup everything before upgrading code levels ‒ On Distributed, you cannot go back • Exclude queue manager data from normal system backups ‒ Some backup products interfere with MQ processing
  • 43. What makes a Queue Manager on z/OS? ACTIVE LOGSARCHIVED LOGS LOG BSDS VSAM DS HIGHEST RBA CHKPT LIST LOG INVENTORY VSAM Linear DS Tapes VSAM DS PagesetsPrivate Objects Private Messages RECOV RBAs CPCPCP CF Shared Messages Shared Objects Group Objects DB2 ......
  • 44. What makes up a Queue Manager? • Queue manager started task procedure ‒ Specifies MQ libraries to use, location of BSDS and pagesets and INP1, INP2 members start up processing • System Parameter Module – zParm ‒ Configuration settings for logging, trace and connection environments for MQ • BSDS: Vital for Queue Manager start up ‒ Contains info about log RBAs, checkpoint information and log dataset names • Active and Archive Logs: Vital for Queue Manager start up ‒ Contain records of all recoverable activity performed by the Queue Manager • Pagesets ‒ Updates made “lazily” and brought “up to date” from logs during restart ‒ Start up with an old pageset (restored backup) is not really any different from start up after queue manager failure ‒ Backup needs to copy page 0 of pageset first (don’t do volume backup!) • DB2 Configuration information & Group Object Definitions • Coupling Facility Structures ‒ Hold QSG control information and MQ messages
  • 45. Backing Up a z/OS Queue Manager • Keep copies of ZPARM, MSTR procedure, product datasets and INP1/INP2 members • Use dual BSDS, dual active and dual archive logs • Take backups of your pagesets ‒ This can be done while the queue manager is running (fuzzy backups) ‒ Make sure you backup Page 0 first, REPRO or ADRDSSU logical copy • DB2 data should be backed up as part of the DB2 backup procedures • CF application structures should be backed up on a regular basis ‒ These are made in the logs of the queue manager where the backup was issued
  • 47. Topologies • Sometimes a data centre is kept PURELY as the DR site • Sometimes 2 data centres are in daily use; back each other up for disasters ‒ Normal workload distributed to the 2 sites ‒ These sites are probably geographically distant • Another variation has 2 data centres “near” each other ‒ Often synchronous replication ‒ With a 3rd site providing a long-distance backup • And of course further variations and combinations of these
  • 48. Queue Manager Connections • DR topologies have little difference for individual queue managers • But they do affect overall design ‒ Where do applications connect to ‒ How are messages routed • Clients need ClntConn definitions that reach any machine • Will be affected by how you manage network ‒ Do DNS names move with the site? ‒ Do IP addresses move with the site? • Some sites always put IP addresses in CONNAME; others use hostname ‒ No rule on which is better
  • 49. Disk replication • Disk replication can be used for MQ disaster recovery • Either synchronous or asynchronous disk replication is OK ‒ Synchronous: ➢ No data loss if disaster occurs ➢ Performance is impacted by replication delay ➢ Limited by distance (eg 100km) ‒ Asynchronous: ➢ Some limited data loss if disaster occurs ➢ It is critical that queue manager data and logs are replicated in the same consistency group if replicating both • Disk replication cannot be used between the active and standby instances of a multi-instance queue manager ‒ Could be used to replicate to a DR site in addition though
  • 50. Combining HA and DR QM1 Machine A Active instance Machine B Standby instance shared storage Machine C Backup instance QM1Replication Primary Site Backup Site HA Pair
  • 51. Combining HA and DR – “Active/Active” QM1 Machine A Active instance Machine B Standby instance Machine C Backup instance QM1 Replication Site 1 Site 2 HA Pair Machine C Active instance Machine D Standby instance Machine A Backup instance QM2 QM2 Replication HA Pair
  • 52. Use of "spare" machines • Don't use DR queue managers for "other" work when not needed ‒ Using the machines may be OK, though not recommended ‒ Definitely do not try to use the queue managers as test/dev environments
  • 53. Primary Secondary DR for the MQ Appliance • IBM MQ Appliances can now be deployed with DR option • Similar to HA design ‒ But using async replication for longer-distance DR Role Primary Secondary DR Status Normal Sync in progress Partitioned Remote Appliance(s) Unavailable Inactive Inconsistent Remote appliance(s) not configured DR Synchronization progress XX% complete DR estimated synchronization time Estimated absolute time at which the synchronization will complete Out of sync data Amount of data in KB written to this instance since the Partition
  • 54. DR Replication Asynchronous (10 Gb Ethernet) 8.0.0.4 introduced DR but with one major restriction – appliances and the queue managers they host can participate either in HA Groups, or DR but not both at the same time The DR appliance is asynchronously updated from whichever HA node is active HA Replication Synchronous (10Gb Ethernet) Note that this does still not (yet) allow symmetrical HA pair to HA pair replication Disaster Recovery for HA groups
  • 55. Integration with other products • May want to have consistency with other data resources ‒ For example, databases and app servers • Only way for guaranteed consistency is disk replication where all logs are in same group ‒ Otherwise transactional state might be out of sync
  • 56. DR in Cloud configurations • Don't assume that using a cloud absolves you from DR considerations ‒ Recent AWS outage is good demonstration (but not the only one) • "I think we have grown accustomed to the idea that the cloud has become a panacea for a lot of companies. It is important for businesses to recognize that the cloud is their new legacy system, and if the worst does occur the situation can be worse for businesses using cloud than those choosing their own private data centers, because they have less visibility and control." • Some published articles/code on configuring replication for MQ within clouds ‒ https://www.ibm.com/developerworks/community/blogs/messaging/entry/Using_DRBD_to_replicate_data_for_ a_queue_manager?lang=en • Basic concepts are the same, as you would expect ‒ Selection of technologies may be restricted by the environments
  • 58. Planning for Recovery • Write a DR plan ‒ Document everything – to tedious levels of detail ‒ Include actual commands, not just a description of the operation ➢ Not “Stop MQ”, but “as mqm, run /usr/local/bin/stopmq.sh US.PROD.01” • And test it frequently ‒ Recommend twice a year ‒ Record time taken for each task • Remember that the person executing the plan in a real emergency might be under-skilled and over- pressured ‒ Plan for no access to phones, email, online docs … • Each test is likely to show something you’ve forgotten ‒ Update the plan to match ‒ You’re likely to have new applications, hardware, software … • May have different plans for different disaster scenarios
  • 59. Example Exercises from MQ Development • Different groups have different activities that must continue ‒ Realistic scenarios can help show what might not be available • From the MQ development lab … • Most of the change team were told there was a virulent disease and they had to work from home ‒ Could they continue to support customers • If Hursley machine room was taken out by a plane missing its landing at Southampton airport ‒ Could we carry on developing the MQ product ‒ Source code libraries, build machines, test machines … ‒ Could fixes be produced • (A common one) Someone hit emergency power-off button • Not just paper exercises
  • 60. Networking Considerations • DNS - You will probably redirect hostnames to a new site ‒ But will you also keep the same IP addresses? ‒ Including NAT when routing to external partners? ‒ Affects CONNAME • Include external organisations in your testing ‒ 3rd parties may have firewalls that do not recognize your DR servers • LOCLADDR configuration ‒ Not normally used by MQ, but firewalls, IPT and channel exits may inspect it ‒ May need modification if a machine changes address • Clustering needs special consideration ‒ Easy to accidentally join the real cluster and start stealing messages ‒ Ideally keep network separated, but can help by: ➢ Not giving backup ‘live’ security certs ➢ Not starting chinit address space (z/OS); not allowing channel initiators to start (distributed) ➢ Use CHLAUTH rules • Backup will be out of sync with the cluster ‒ REFRESH CLUSTER() resolves updates
  • 61. A Real MQ Network Story • Customer did an IP move during a DR test • Forgot to do the IP move back when they returned to prime systems • Didn’t have monitoring in place that picked this up until users complained about lack of response
  • 62. Other Resources • Applications may need to deal with replay or loss of data. ‒ Decide whether to clear queues down to a known state, or enough information elsewhere to manage replays • Order of recovery may change with different product releases ‒ Every time you install a new version of a product revisit your DR plan • What do you really need to recover ‒ DR site might be lower-power than primary site ‒ Some apps might not be critical to the business ‒ But some might be unrecognised prereqs
  • 63. If a Real Disaster Hits • Hopefully you never need it. But if the worst happens: • Follow your tested plan ‒ Don’t try shortcuts • But also, if possible: ‒ Get someone to take notes and keep track of the time tasks took ‒ Prepare to attend post mortem meetings on steps you took to recover ‒ Accept all offers of assistance • And afterwards: ‒ Update your plan for the next time
  • 64. Summary • Various ways of recovering queue managers • Plan what you need to recover for MQ • Plan the relationship with other resources • Test your plan
  • 65. Notices and disclaimers Copyright © 2017 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights — use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. This document is distributed “as is” without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.” Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
  • 66. Notices and disclaimers continued Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a particular, purpose. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services®, Global Technology Services®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli® Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X- Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.