• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Active ims replication ims ug hartford & boston 10-2013
 

Active ims replication ims ug hartford & boston 10-2013

on

  • 350 views

 

Statistics

Views

Total Views
350
Views on SlideShare
350
Embed Views
0

Actions

Likes
0
Downloads
2
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Active ims replication ims ug hartford & boston 10-2013 Active ims replication ims ug hartford & boston 10-2013 Presentation Transcript

    • IBM GDPS Active-Active and IMS Replication Greg Vance IMS Development, STSM gvance@us.ibm.com 1
    • Acknowledgements and Disclaimers Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results. © Copyright IBM Corporation 2013. All rights reserved. – U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. IBM, the IBM logo, ibm.com, IMS, DB2, CICS and WebSphere MQ are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Other company, product, or service names may be trademarks or service marks of others. 2
    • Agenda • Level set • Concepts • Sample Scenarios • IMS Replication • Summary 3
    • Suite of GDPS service products to meet various business requirements for availability and disaster recovery Continuous Availability of Data within a Data Center Continuous Availability with DR within Metropolitan Region Disaster Recovery Extended Distance CA Regionally and Disaster Recovery Extended Distance GDPS/PPRC HM GDPS/PPRC GDPS/GM & GDPS/XRC GDPS/MGM & GDPS/MzGM RPO=0 [RTO secs] RPO=0 RTO mins / RTO<1h RPO secs, RTO<1h RPO=0,RTO mins/<1h & RPO secs, RTO<1h for disk only (<20km) (>20km) Single Data Center Two Data Centers Two Data Centers Three Data Centers Applications remain active Systems remain active High availability for site disasters Continuous access to data in the event of a storage outage Multi-site workloads can withstand site and/or storage failures Rapid Systems D/R w/ “seconds” of data loss RPO – recovery point objective 4 Disaster Recovery for out of region interruptions RTO – recovery time objective Disaster recovery for regional disasters
    • How Much Interruption can your Business Tolerate? Standby Ensuring Business Continuity: • Disaster Recovery – Restore business after an unplanned outage • High-Availability – Meet Service Availability objectives e.g., 99.9% availability or 8.8 hours of down-time a year • Continuous Availability – No downtime (planned or not) Active/Active Global Enterprises that operate across timezones no longer have any ‘off-hours’ window. Continuous Availability is required. What is the cost of 1 hour of downtime during core business hours? 5
    • Evolving customer requirements • Shift focus from failover model to near-continuous availability model (RTO near zero) • Access data from any site (unlimited distance between sites) • Multi-sysplex, multi-platform solution – “Recover my business rather than my platform technology” • Ensure successful recovery via automated processes (similar to GDPS technology today) – Can be handled by less-skilled operators • Provide workload distribution between sites (route around failed sites, dynamically select sites based on ability of site to handle additional workload) • Provide application level granularity – Some workloads may require immediate access from every site, other workloads may only need to update other sites every 24 hours (less critical data) – Current solutions employ an all-or-nothing approach (complete disk mirroring, requiring extra network capacity) 6
    • From High Availability to Continuous Availability GDPS/PPRC GDPS/XRC or GDPS/GM GDPS/Active-Active Failover Model Failover Model Near CA model Recovery Time ≈ 2 min Recovery Time < 1 hour Recovery time < 1 minute Distance < 20 km Unlimited distance Unlimited distance • GDPS/Active-Active is for mission critical workloads that have stringent recovery objectives that can not be achieved using existing GDPS solutions – RTO approaching zero, measured in seconds for unplanned outages – RPO approaching zero, measured in seconds for unplanned outages – Non-disruptive site switch of workloads for planned outages – At any distance • Active-Active is NOT intended to substitute for local availability solution such as Parallel Sysplex 7
    • Agenda • Level set • Concepts • Sample Scenarios • IMS Replication • Summary 8
    • Terminology • Active/Active Sites – This is the overall concept of the shift from a failover model to a continuous availability model • GDPS active/active continuous availability – This is the formal name of the overall solution under which IBM will deliver capabilities over a period of time – While IBM currently provides the GDPS active-standby configuration, our future road map includes additional configurations that can lead to full active-active function • GDPS/Active-Active – The name of the GDPS product which provides, along with the other products that make up the solution, the capabilities mentioned in this presentation such as workload, replication and routing management and so on. This can be shortened to GDPS/A-A 9
    • Active/Active Sites concept • Two or more sites, separated by unlimited distances, running the same applications and having the same data to provide: Transactions – Cross-site Workload Balancing – Continuous Availability – Disaster Recovery • Data at geographically dispersed sites kept in sync via s/w replication Workload Distributor Replication Workloads are managed by a client and routed to one of many replicas, depending upon workload weight and latency constraints; extends workload balancing to SYSPLEXs across multiple sites 10 Monitoring spans the sites and now becomes an essential element of the solution for site health checks, performance tuning, etc
    • Active/Active Sites Configurations • Configurations 1. Active/Standby – GA date 30th June 2011 2. Active/Query – statement of direction 3. Active/Active – intended direction • A configuration is specified on a workload basis • A workload is the aggregation of these components – – – 11 Software: user written applications (eg: COBOL programs) and the middleware run time environment (eg: CICS regions, InfoSphere Replication Server instances and DB2 subsystems) Data: related set of objects that must preserve transactional consistency and optionally referential integrity constraints (eg: DB2 Tables, IMS Databases) Network connectivity: one or more TCP/IP addresses & ports (eg: 10.10.10.1:80)
    • Active/Standby configuration Transactions Static Routing Automatic Failover Wordload A, B standby Application A, B active Workload A, B active A B Workload Distributor queued >> IMS site1 >> DB2 Replication << DB2 << IMS site2 This is a fundamental paradigm shift from a failover model to a continuous availability model 12
    • Active/Query configuration (SOD) Transactions Appl B (grey) is in active/query configuration Appl A (gold) is in active/standby configuration • using same data as Appl A but read only • active to both site1 & site2, but favor site1 • Appl B query routing according to Appl A latency policy • performing updates in active site [site2] Workload Distributor A B B B B B latency<3; [A] latency=2; as latency is less latency>5; “max latency” than “resetbeen exceeded. policy policy has latency”, policy, route “max latency” follow route more queries to site1 to skew queries to site1 all queries to site2 M IMS site1 Replication DB2 DB2 << << IMS site2 Read-only or query transactions to be routed to both sites, while update transactions are routed only to the active site All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. 13
    • What S/W makes up a GDPS/Active-Active environment? • GDPS/Active-Active • IBM Tivoli NetView for z/OS – IBM Tivoli NetView for z/OS Enterprise Management Agent (NetView agent) • IBM Tivoli Monitoring • System Automation for z/OS • Multi-site Workload Lifeline for z/OS (SA z/OS) • Middleware – DB2, IMS, CICS… • Replication Software – IBM InfoSphere Replication Server for z/OS (DB2) – IBM InfoSphere IMS Replication for z/OS • Optionally the Tivoli OMEGAMON XE suite of monitoring products Integration of a number of software products 14
    • GDPS/A-A configuration Network TEP Interface GDPS Web Interface Primary Controller Backup Controller LB 1st Tier AAC1 CSM NetView Backup NetView Master LLAdvisor Primary LB 2nd Tier Production 1 A1P1 A1 LLAdvisor Secondary LB 2nd Tier Sysplex Distrib TEMA A1 AAC2 Sysplex Distrib Production 2 A2 A1P2 TEMA Production 2 A2P2 A2 Production 1 LLAgent LLAgent LLAgent LLAgent MQ / TCPIP MQ / TCPIP MQ / TCPIP A2P1 MQ / TCPIP Workload 1 Workload 3 Workload 1 Workload 3 Active Active Active Active Workload 1 Workload 3 Workload 1 Workload 3 Standby Standby Standby Standby DB2 Rep IMS Rep DB2 Rep IMS Rep DB2 Rep IMS Rep DB2 Rep IMS Rep CICS/DB2 Appl IMS Appl CICS/DB2 Appl IMS Appl CICS/DB2 Appl IMS Appl CICS/DB2 Appl IMS Appl DB2 IMS DB2 IMS Site 1 15 S/W Replication Site 2
    • GDPS/Active-Active (the product) • Automation code is an extension on many of the techniques tried and tested in other GDPS products and with many client environments for management of their mainframe CA & DR requirements • Control code only runs on Controller systems • Workload management - start/stop components of a workload in a given Sysplex • Replication management - start/stop replication for a given workload between sites • Routing management - start/stop routing of transactions to a site • System and Server management - STOP (graceful shutdown) of a system, LOAD, RESET, ACTIVATE, DEACTIVATE the LPAR for a system, and capacity on demand actions such as CBU/OOCoD • Monitoring the environment and alerting for unexpected situations • Planned/Unplanned situation management and control - planned or unplanned site or workload switches; automatic actions such as automatic workload switch (policy dependent) • Powerful scripting capability for complex/compound scenario automation 16
    • Agenda • Level set • Concepts • Sample Scenarios • IMS Replication • Summary 17
    • Sample scenario – all workloads active in one site Network SASP-compliant Routers LB 1st Tier Routing for WKLD 1, 2 & 3 AA Controller CSM AA Controller [AAC1] [AAC2] LB 2nd Tier Sysplex Distrib Primary LB 2nd Tier Sysplex Distrib Backup A2 Prod-sys [A1P1] [A1P2] [A2P1] [A2P2] wkld-1 active [AAPLEX1] Sysplex-A1 A2 Prod-sys wkld-1 active wkld-1 standby wkld-1 standby wkld-2 standby wkld-2 active wkld-3 active wkld-3 active IMS Site 1 18 [AAPLEX2] A1 Prod-sys DB2 IMS wkld-3 standby wkld-3 standby S/W Replication IMS DB2 IMS Site 2 Sysplex-A2 A1 Prod-sys
    • Sample scenario – both sites active for individual workloads Network SASP-compliant Routers LB 1st Tier Routing for WKLD 1 & 3 AA Controller CSM Routing for WKLD 2 AA Controller [AAC1] [AAC2] LB 2nd Tier Sysplex Distrib Primary LB 2nd Tier Sysplex Distrib Backup A2 Prod-sys [A1P1] [A1P2] [A2P1] [A2P2] wkld1 active [AAPLEX1] Sysplex-A1 A2 Prod-sys wkld1 active wkld1 standby wkld1 standby wkld2 active wkld2 standby wkld3 active wkld3 active IMS Site 1 19 [AAPLEX2] A1 Prod-sys DB2 IMS wkld3 standby wkld3 standby S/W Replication IMS DB2 IMS Site 2 Sysplex-A2 A1 Prod-sys
    • Planned workload/site switch Initiate [clickcontinue here] [click here] Network SS Y all data has been drained TEP Interface LB 1st Tier GDPS Web Interface CSM STOP ROUTING AAC1 START ROUTING AAC2 LB 2nd Tier LB 2nd Tier Sysplex Distrib primary Sysplex Distrib backup WKLD3 WKLD3 WKLD2 WKLD2 WKLD2 WKLD2 standby active standby active WKLD1 WKLD1 WKLD1 WKLD1 IMS Appl AAPLEX1 standby active IMS Appl IMS Appl IMS Appl IMS Rep [IMS Rep] A1P1 <<<< >>>> IMS Rep A1P2 IMS DB2 IMS Rep] A2P2 >> IMS << IMS DB2 Site1 A2P1 IMS Site2 Note: multiple workloads and needed infrastructure resources are not shown for clarity sake 20 AAPLEX2 standby active
    • Unplanned site failure Network TEP Interface LB 1st Tier GDPS Web Interface CSM START ROUTING AAC1 LB 2nd Tier LB 2nd Tier Sysplex Distrib WKLD3 WKLD2 WKLD1 WKLD1 IMS Appl IMS Appl IMS Rep [IMS Rep] WKLD2 active DB2 active WKLD1 etection Interval = 60 sec WKLD1 Failure D IMS IMS SITE_FAILURE = Automatic Appl Appl logged <<<< IMS Rep A1P2 IMS WKLD2 IMS Rep] A2P2 >> IMS << IMS DB2 Site1 A2P1 IMS Site2 Note: multiple workloads and needed infrastructure resources are not shown for clarity sake 21 AAPLEX2 standby active backup WKLD3 Automatic switch WKLD2 standby active AAPLEX1 AAC2 Sysplex Distrib primary A1P1 STOP ROUTING
    • GDPS/A-A – testing results* 120 120 min 45 30 60 60 15 0 20 sec GDPS/A-A planned site switch 15 min 0 GDPS/A-A sec unplanned site switch • Configuration – 9 * CICS/DB2 and 1 * IMS workload Site Failover Time (mins) Switch Time (secs) 60 GDPS/XRC, GDPS/GM site failover • Planned site switch – Distance between sites 300 miles (≈500 km) – Operations initiated switch of the workloads in a site to the other site took 20 seconds – Site failure detection interval is 60 seconds – Current GDPS and disk replication will take ≈1-2 hours • Unplanned site switch – Automatic switch of failed site workloads to the surviving site took 15 seconds – Current GDPS & disk replication will take about ≈1 hour * IBM laboratory results; actual results may vary 22
    • Agenda • Level set • Concepts • Sample Scenarios • IMS Replication • Summary 23
    • IMS Software-Based Data Mirroring InfoSphere IMS Replication IMS • Unidirectional Replication of IMS data – Subscription Level Replication • • • • • DB/TM, DBCTL, Batch DL/I, FDBR Capture x’99’ log records • Increase in log volume due to change data capture records Uses IMS Database Resource Adapter interface Parallel Apply Conflicts will be detected • Manual resolution will be required Classic Data Architect – – 24 IMS IMS “Apply” – – – • InfoSphere IMS Replication IMS “Capture” – – • Transaction consistency All or nothing at DB level Basic replication monitoring TCP/IP for data transmission Administration (some administration can be done via z/OS console commands) Basic replication monitoring
    • IMS Replication Architecture Classic Data Architect Source IMS Databases Target IMS Databases Replication Metadata IMS Replication Metadata IMS Administration Administration IMS TCP/IP IMS Logs Log Read/ Merge UOR Capture SOURCE SERVER 25 Bookmark Database UOR Analysis UOR Apply One Session Per Subscription TARGET SERVER IMS DRA Interface
    • Multi-Target Configuration Source IMS Databases Target Server IMS Source Server IMS IMS IMS Target Server IMS IMS Logs 26 • Multiple Subscriptions • Each subscription associated with one of the target servers • Single server sends updates to appropriate target
    • Sample Two-way Replication Configuration Workload A Update/Query IMS Target Server Source Server IMS Workload A Query IMS IMS IMS Workload B Update/Query 27 Workload B Query Source Server Target Server IMS
    • Source Server Details IMS TM / DB IMS Logs Partner Product Exit TCP/IP IMS DBCTL Partner Product Exit IMS Logs TCP/IP IMS Logs BATCH DL/I Replication Metadata TCP/IP Change Stream Ordering Capture Services IMS Logger Exit RECON Log Info 28 SOURCE SERVER Source IMS Databases •User exits to notify server of new IMS instance •Merge Waits for Batch DL/I to complete •Idle IMS regions can slow processing
    • Target Server Details DRA thread Staged Unit-of-Recovery Data Writer Services Change Messages Apply Service Writer Services Dependency Analysis TARGET SERVER • Parallelism based on dependency analysis within a subscription • Database and root key used for analysis 29 IMS
    • Classic Data Architect – Replication Management 30
    • Classic Data Architect - Monitoring Throughput 31
    • Classic Data Architect - Monitoring Latency 32
    • Adaptive Apply • Adaptive apply error handling is the default behavior – Can be set to standard apply, which does not tolerate conflicts • If a conflict is detected, the action will be to ignore update • Conflicts are: – Before image mismatch – Unable to locate segment to process update • All conflicts are logged in the event log • Manual resolution will be required 33
    • Current Restrictions • All segments for a DB must have change capture logging enabled and will be replicated – Must augment the DBD with the EXIT=(…,LOG) specification – IMS change capture restrictions • DEDB FLD calls not supported • Subset Pointers not managed • ISRT HERE -> ISRT FIRST • Workload Restrictions – All logically related DBs must be in the same subscription – Workload with logically related DBs will be serialized – UORs with unkeyed or non-unique keyed segments will be serialized • External load of target DB – Must be a static image copy 34
    • Performance considerations • Transactional consistency vs. Parallelism – All updates for a given UR processed as a single transaction during apply – All transactions involving the same ‘resource’ will be serially processed in commit order – Running transactions in parallel can have application consistency implications • Increase in log data • Multiple source IMSs to 1 Apply Target implications • Internally achieved 53K updates per second – ~116,000 updates per second when deploying two apply servers – sustained <2sec latency – your results may vary 35
    • Agenda • Level set • Concepts • Sample Scenarios • IMS Replication • Summary 36
    • Replication Summary • Asynchronous Replication – Allows for unlimited distance support • Low Latency through parallelism – Allows for almost immediate data availability and low RTO • Transaction Consistency – Access with integrity on target system and low RTO • Subscription independence – Switch can be at a workload level vs. system level 37
    • GDPS Active/Active Summary • Manages availability at a workload level • Provides a central point of monitoring & control • Manages replication between sites • Provides the ability to perform a controlled workload site switch • Provides near-continuous data and systems availability and helps simplify disaster recovery with an automated, customized solution • Reduces recovery time and recovery point objectives – measured in seconds • Facilitates regulatory compliance management with a more effective business continuity plan • Simplifies system resource management GDPS/Active-Active is the next generation of GDPS 38
    • There are multiple GDPS service products under the GDPS solution umbrella to meet various customer requirements for Availability and Disaster Recovery Continuous Availability of Data within a Data Center Continuous Availability with DR within Metropolitan Region Disaster Recovery Extended Distance CA Regionally and Disaster Recovery Extended Distance CA, DR, & Crosssite Workload Balancing Extended Distance GDPS/PPRC HM GDPS/PPRC GDPS/GM & GDPS/XRC GDPS/MGM & GDPS/MzGM GDPS/Active-Active RPO=0 [RTO secs] RPO=0 RTO mins / RTO<1h for disk only (<20km) RPO secs, RTO<1h RPO=0,RTO mins/<1h RPO secs, RTO secs & RPO secs, RTO<1h (>20km) Single Data Center Two Data Centers Two Data Centers Three Data Centers Applications remain active Systems remain active Continuous access to data in the event of a storage outage Multi-site workloads can withstand site and/or storage failures Rapid Systems D/R w/ “seconds” of data loss High availability for site disasters Automatic workload switch in seconds; Disaster recovery seconds of data for regional loss disasters Disaster Recovery for out of region interruptions RPO – recovery point objective 39 RTO – recovery time objective Two Active Data Centers
    • Thank You 40
    • Trademarks The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. IBM* IBM (logo)* Ibm.com* AIX* DB2* DS6000 DS8000 Dynamic Infrastructure* ESCON* FlashCopy* GDPS* HyperSwap IBM* IBM logo* Parallel Sysplex* POWER5 Redbooks* Sysplex Timer* System p* System z* Tivoli* z/OS* z/VM* * Registered trademarks of IBM Corporation The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license there from. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. InfiniBand is a trademark and service mark of the InfiniBand Trade Association. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce. * All other products may be trademarks or registered trademarks of their respective companies. Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. 41