Always on high availability best practices for informix



Best Practices for high availability with the IBM Informix database

Published in: Technology

  1. 1. © 2015 IBM Corporation DMX-2628 – Always On: High Availability Best Practices for Informix Nagaraju Inturi Scott Lashley
  2. 2. Please Note:
     • IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
     • Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
     • The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.
     • The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
     Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  3. 3. Industry Terms
     • Recovery Point Objective (RPO)
       - How much data are you willing to lose?
     • Recovery Time Objective (RTO)
       - How much time does it take to recover from a failure?
     • Example
       - ONCONFIG parameter RTO_SERVER_RESTART
         Monitors transaction activity and coordinates checkpoints so that, in the event of a server crash, the server can restart within the time specified by RTO_SERVER_RESTART.
  4. 4. Hot Standby
     • Fred wants to implement an RTO policy of 15 seconds in the event of a failure.
     [Diagram: Primary, Secondary]
  5. 5. Updatable Secondary
     • Fred wants to extend his HDR solution to utilize the secondary.
     [Diagram: Primary, Secondary]
  6. 6. Updatable Secondary
     • How do updates on the secondary work?
       - Row locks are acquired on the secondary as updates are applied from the primary
       - The initial read is done on the secondary
       - The update is forwarded to the primary
         • If row versioning is defined in the schema for the table, the version is compared to determine whether the update can be applied
         • Otherwise, the whole row is compared to determine whether the update can be applied
     • What isolation levels are supported on a secondary?
       - Dirty Read
       - Committed Read
       - Committed Read Last Committed
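
The version check described on this slide can be sketched in a few lines of Python. This is only a toy model of the idea, not Informix code: the `Row` class and the `can_apply` function are hypothetical names introduced for illustration, and a missing `version` stands for a table without row versioning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    """Toy row image; `version` is None when the table has no row versioning."""
    key: int
    payload: str
    version: Optional[int] = None


def can_apply(read_on_secondary: Row, current_on_primary: Row) -> bool:
    """Return True if a forwarded update may be applied on the primary."""
    if read_on_secondary.version is not None and current_on_primary.version is not None:
        # Row versioning is defined: comparing the version is enough.
        return read_on_secondary.version == current_on_primary.version
    # No row versioning: compare the whole row image.
    return read_on_secondary == current_on_primary


seen = Row(key=1, payload="blue", version=1)
print(can_apply(seen, Row(key=1, payload="blue", version=1)))   # True
print(can_apply(seen, Row(key=1, payload="green", version=2)))  # False
```

Comparing a single version value is cheaper than comparing the full row image, which is presumably why the slide calls out row versioning separately.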
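  7. 7. Application Perspective - Locking & Queries
     [Diagram: Anatomy of an Update]
     • Transaction steps: Begin Work; Read Row (V1); Update Row (V2); Apply Update Sec; Commit Work; Apply Commit Sec
     • Lock labels: ulock; xlock; release lock Pri; xlock Sec; release lock Sec
     • Query Primary: row(V1); DR=row(V2); CRLC=row(V1); CR, CS, RR=block
     • Query Secondary: row(V1); DR=row(V2); CRLC=row(V1); CR=block
     (DR = Dirty Read, CR = Committed Read, CRLC = Committed Read Last Committed, CS = Cursor Stability, RR = Repeatable Read)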
  8. 8. Application Perspective - Locking & Updates
     [Diagram: Anatomy of an Update]
     • Transaction steps: Begin Work; Read Row (V1); Update Row (V2); Apply Update; Commit Work; Apply Commit
     • Lock labels: ulock; xlock; release lock; xlock Sec; release lock Sec
     • Update Primary: row(V1); block
     • Update Secondary: if hot row, push to primary; otherwise row(V1); block
  9. 9. Application behavior
     • I’m on an updatable secondary and my application just updated a row, but the update is not committed yet. If I read the row, which version of the row will I see?
       - When my session (or any other session) attempts to read a recently updated row, it waits for the secondary server I’m connected to to replay that row update before reading the row.
     [Diagram: Begin Work; Read Row (V1); Forward Update Sec; Wait; Apply Update Sec; Commit Work; Apply Commit Sec; Read Row Again (block until row is applied)]
  10. 10. Application behavior
      • When I get error 7350, “Attempt to update a stale version of a row”, what happened?
        - My application read a row from the secondary node, and between the time the row was read and the time the update was forwarded to the primary, another transaction completed an update to the row.
      [Diagram: Update Secondary: Read row (V1); Forward update (V3). Update Primary: Read row (V1); Update row (V2); Commit row (V2).]
      At this point, the forwarded update is based on a different version from the one that is committed, so an error is returned.
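
One common way for an application to cope with a stale-row error is to re-read the row and try again. The sketch below simulates that pattern in plain Python under stated assumptions: `StaleRowError` is a hypothetical stand-in for error 7350, `ToyPrimary` is an in-memory stand-in for the primary, and the retry limit of 3 is arbitrary. It does not use any Informix driver API.

```python
class StaleRowError(Exception):
    """Hypothetical stand-in for Informix error 7350."""


class ToyPrimary:
    """In-memory stand-in for the primary: one row with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 1

    def read(self):
        return self.value, self.version

    def apply_forwarded_update(self, new_value, version_read):
        # Reject the update if the row changed since the caller read it.
        if version_read != self.version:
            raise StaleRowError("attempt to update a stale version of a row")
        self.value = new_value
        self.version += 1


def increment_with_retry(primary, before_forward=None, max_attempts=3):
    """Read, compute, forward; on a stale-row error re-read and try again."""
    for attempt in range(1, max_attempts + 1):
        value, version = primary.read()
        if before_forward is not None:
            before_forward(attempt)  # hook used to simulate a competing writer
        try:
            primary.apply_forwarded_update(value + 1, version)
            return attempt
        except StaleRowError:
            continue
    raise StaleRowError(f"gave up after {max_attempts} attempts")


primary = ToyPrimary(value=10)


def competing_writer(attempt):
    # Another transaction commits between our read and our forwarded update,
    # but only during the first attempt.
    if attempt == 1:
        current, version = primary.read()
        primary.apply_forwarded_update(current + 100, version)


attempts = increment_with_retry(primary, before_forward=competing_writer)
print(attempts, primary.value)  # 2 111
```

Whether a blind retry is acceptable depends on the application: if the new value was computed from data that is now out of date, the business logic has to be re-run as well, which is what the re-read in the loop models.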
  11. 11. What’s new?
      • My application is using the UPDATABLE_SECONDARY configuration to perform queries and updates on all the members of my HDR cluster. How do I coordinate transactions across the HDR cluster?
      • CLUSTER_TXN_SCOPE is an ONCONFIG and session parameter that controls when the application receives an acknowledgement of the commit of a user’s transaction.

      CLUSTER_TXN_SCOPE | Connected to Primary | Connected to Secondary
      SESSION | ACK when commit is complete | ACK when commit is complete on primary
      SERVER (default) | ACK when commit is complete | ACK when commit is complete on primary and processed on the node I’m connected to
      CLUSTER | ACK when commit has been applied to all nodes | ACK when commit has been applied to all nodes
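
The table on this slide can be read as a small decision function. The following is a sketch that only restates the table; `commit_acknowledged` and its boolean arguments are hypothetical names, and the real server of course tracks this state internally.

```python
def commit_acknowledged(scope, connected_to, committed_on_primary,
                        applied_locally, applied_everywhere):
    """Return True when the client would receive the commit ACK, per the table."""
    if scope not in ("SESSION", "SERVER", "CLUSTER"):
        raise ValueError(f"unknown CLUSTER_TXN_SCOPE: {scope}")
    if connected_to not in ("primary", "secondary"):
        raise ValueError(f"unknown node type: {connected_to}")

    if scope == "CLUSTER":
        # ACK only after the commit has been applied to all nodes.
        return applied_everywhere
    if connected_to == "primary" or scope == "SESSION":
        # ACK as soon as the commit is complete on the primary.
        return committed_on_primary
    # SERVER scope on a secondary: the commit must also be processed locally.
    return committed_on_primary and applied_locally


# Connected to a secondary, commit done on the primary but not yet replayed locally:
print(commit_acknowledged("SESSION", "secondary", True, False, False))  # True
print(commit_acknowledged("SERVER", "secondary", True, False, False))   # False
```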
  12. 12. What’s new?
      • DRINTERVAL & HDR_TXN_SCOPE
        These parameters work together to determine synchronization between primary and secondary nodes
      • FULL_SYNC is new

      DRINTERVAL | HDR_TXN_SCOPE | Buffered logging | Unbuffered logging
      -1 | n/a | Async | Near sync
      0 | FULL_SYNC | Full sync | Full sync
      0 | ASYNC | Async | Async
      0 | NEAR_SYNC | Near sync | Near sync
      >0 | n/a | Async | Async
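
To make the table easier to check against a given configuration, here is a minimal Python sketch that encodes it row by row. The function name `effective_hdr_mode` is hypothetical; the behavior it returns is taken only from the table above.

```python
def effective_hdr_mode(drinterval, hdr_txn_scope=None, buffered_logging=False):
    """Return 'async', 'near sync' or 'full sync' for a parameter combination."""
    if drinterval > 0:
        return "async"  # HDR_TXN_SCOPE is not applicable
    if drinterval == -1:
        # HDR_TXN_SCOPE is not applicable; the database logging mode decides.
        return "async" if buffered_logging else "near sync"
    if drinterval == 0:
        modes = {"FULL_SYNC": "full sync", "ASYNC": "async", "NEAR_SYNC": "near sync"}
        if hdr_txn_scope not in modes:
            raise ValueError(f"unknown HDR_TXN_SCOPE: {hdr_txn_scope}")
        return modes[hdr_txn_scope]
    raise ValueError(f"DRINTERVAL value not covered by the table: {drinterval}")


print(effective_hdr_mode(-1, buffered_logging=True))   # async
print(effective_hdr_mode(-1, buffered_logging=False))  # near sync
print(effective_hdr_mode(0, "FULL_SYNC"))              # full sync
```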
  13. 13. DRINTERVAL & HDR_TXN_SCOPE
      • My RPO is 0 for a single point of failure
          DRINTERVAL=0
          HDR_TXN_SCOPE=NEAR_SYNC
        This setting makes sure that committed transactions are received by the secondary. If the primary fails, all committed transactions are guaranteed to be at least in volatile memory on the secondary.
      • My RPO is 10 seconds for a single point of failure
          DRINTERVAL=10
        This setting makes sure that a buffer is sent to the secondary at least every 10 seconds.
      • My RPO is 0 for multiple points of failure
          DRINTERVAL=0
          HDR_TXN_SCOPE=FULL_SYNC
        This setting makes sure that committed transactions are received and written to disk by the secondary. If the primary fails, all committed transactions are guaranteed to be hardened to disk on the secondary.
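
The three scenarios on this slide go from an RPO to a pair of settings. A small helper can capture that direction, as sketched below. The function name and the `survive_multiple_failures` flag are hypothetical, and the slide does not cover a non-zero RPO combined with multiple points of failure, so the sketch ignores the flag in that case.

```python
def hdr_settings_for_rpo(rpo_seconds, survive_multiple_failures=False):
    """Map an RPO in seconds to the ONCONFIG values shown on the slide."""
    if rpo_seconds < 0:
        raise ValueError("RPO cannot be negative")
    if rpo_seconds == 0:
        # Zero data loss: near sync keeps commits in the secondary's memory,
        # full sync also hardens them to the secondary's disk.
        scope = "FULL_SYNC" if survive_multiple_failures else "NEAR_SYNC"
        return {"DRINTERVAL": 0, "HDR_TXN_SCOPE": scope}
    # Non-zero RPO: send a buffer at least every `rpo_seconds` seconds.
    return {"DRINTERVAL": rpo_seconds}


print(hdr_settings_for_rpo(0))        # {'DRINTERVAL': 0, 'HDR_TXN_SCOPE': 'NEAR_SYNC'}
print(hdr_settings_for_rpo(10))       # {'DRINTERVAL': 10}
print(hdr_settings_for_rpo(0, True))  # {'DRINTERVAL': 0, 'HDR_TXN_SCOPE': 'FULL_SYNC'}
```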
  14. 14. Offsite disaster
      • Fred wants to extend his HDR solution to include offsite replication in case of site disaster.
      [Diagram: Primary, Secondary, RSS Secondary]
  15. 15. Remote Standalone Secondary (RSS)
      • You want our remote site located in Timbuktu? How’s the network connectivity to that site?
      • You dropped what database?
        - DELAY_APPLY
      • You’re planning to do what maintenance this weekend?
        - Stop Apply command
      • RSS Limitations
        - Can only be promoted to HDR secondary, not primary
        - SYNC mode not supported
  16. 16. Improved Network performance
      • SMX_NUMPIPES
        - There is a limit on how many TCP buffers can be in flight across a wire between a pair of ports until a TCP ACK is sent to the sender. This is referred to as the TCP window. SMX can be configured to have multiple pairs of ports between two given servers, in effect filling in the gaps that would otherwise occur on the network wire. This is especially advantageous if the network connection is over a WAN or is of less than the best quality. In such conditions, setting SMX_NUMPIPES to 2 can result in twice as much data being sent across the wire.
        - SMX reorders the transmissions on the target node so that they appear to have been received across a single serial connection.
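
The "twice as much data" claim follows from the usual rule of thumb that a single TCP connection cannot carry more than one window of data per round trip. A rough worked example in Python, with assumed numbers (a 64 KiB window and a 125 ms round trip are illustrative values, not measurements from the slides):

```python
def max_throughput_bytes_per_s(window_bytes, rtt_seconds, pipes=1,
                               link_bytes_per_s=None):
    """Upper bound on throughput: one TCP window per round trip, per pipe."""
    total = (window_bytes / rtt_seconds) * pipes
    if link_bytes_per_s is not None:
        # More pipes cannot push past the capacity of the link itself.
        total = min(total, link_bytes_per_s)
    return total


window = 64 * 1024   # assumed TCP window, in bytes
rtt = 0.125          # assumed WAN round-trip time, in seconds

one_pipe = max_throughput_bytes_per_s(window, rtt, pipes=1)
two_pipes = max_throughput_bytes_per_s(window, rtt, pipes=2)
print(one_pipe, two_pipes)  # 524288.0 1048576.0
```

The doubling only holds while the link has spare capacity. On a low-latency LAN the window is rarely the bottleneck, so extra pipes would be expected to help much less there.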
  17. 17. What’s new (and really cool)?
      • Informix Warehouse Accelerator (IWA)
  18. 18. What’s really cool?
      “Hey Scott, we are having an online sale this weekend and we expect a huge influx of internet activity on our web site. I might have forgotten to tell you that. Can our infrastructure handle that?”
      • Shared Disk Secondary (SDS)
        - Adjust capacity as demand changes
        - Does not duplicate disk space
        - No special hardware
          • Cluster mgr or SDS_LOGCHECK
        - Coexist with ER, HDR & RSS
        - Primary can fail over to any SDS
      • ifxclone
        - Make a quick copy
  19. 19. What’s improved?
      • Index page logging (IPL)
        - Copies a newly created index from primary to secondary using the logical log.
        - Required for RSS secondary servers
        - Big performance boost (4x)
  20. 20. Best Practices for HDR, RSS, SDS
      • All nodes that are candidates for failover (HDR secondary & SDS) should have similar specs in case there is a failover
      • Use unbuffered database logging to minimize lost transactions
      • ONCONFIG parameter OFF_RECVRY_THREADS should be set to prime (# of cpus) * 3
      • Turn on AUTO_READAHEAD on the secondary
      • A larger BUFFERPOOL can alleviate some random I/O
      • ONCONFIG parameter TEMPTAB_NOLOG=1 to default temp tables to non-logging
      • ONCONFIG parameter HA_ALIAS= TCP network-based server alias
        - Tells the server which network interface and port to use for server-to-server replication traffic.
  21. 21. Best practices for HDR
      • ONCONFIG parameter DRINTERVAL=0 and use HDR_TXN_SCOPE (ASYNC, NEAR_SYNC or FULL_SYNC)
      • ONCONFIG parameter DRAUTO=3 and use Connection Manager to arbitrate failover
      • ONCONFIG parameter LOG_STAGING_DIR always set
        - Some log records, like CHECKPOINT, require serialized processing, which can block the primary from sending log data. When an HDR secondary is configured with a log staging directory, the logs can be spooled to disk while the serialized log record is applied on the secondary. Once the log record has been applied, the secondary applies the spooled log until it catches up with the primary. This can alleviate back pressure from the secondary to the primary.
  22. 22. Best practices for RSS
      • ONCONFIG parameter RSS_FLOW_CONTROL
        - This ONCONFIG parameter controls RPO (units=amount of data rather than time) for the RSS node so it doesn’t fall too far behind
      • ONCONFIG parameter SMX_NUMPIPES
        - Take advantage of parallel data transmission using multiple network pipes
  23. 23. Best practices for SDS
      • ONCONFIG parameter SDS_LOGCHECK
        User scenario: I’m using HDR SDS with no cluster manager. How do I avoid disk corruption and split brain in a failover scenario?
        - SDS_LOGCHECK is used to watch the log space in a failover scenario. If no log activity is seen after waiting N seconds, the SDS secondary will take over.
        - 10 is a good starting value
      • ONCONFIG parameter SDS_FLOW_CONTROL
        - This ONCONFIG parameter controls RTO (units=amount of data rather than time) for the SDS node so it doesn’t fall too far behind
          • No data will be lost because the disks are shared!
          • By not falling too far behind, the SDS node maintains RTO in the event of a failover, because there isn’t too much log to apply in order to catch up
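
The SDS_LOGCHECK idea, watching the shared log for a while before deciding that the primary is really gone, can be sketched as a simple loop. This is an illustration only: `should_take_over` is a hypothetical function, polling once per second is an assumption, and the log position is supplied by a caller-provided function rather than read from a real shared disk.

```python
import time


def should_take_over(read_log_position, logcheck_seconds, sleep=time.sleep):
    """Watch the shared log for `logcheck_seconds`; take over only if it never moves."""
    start = read_log_position()
    for _ in range(logcheck_seconds):
        sleep(1)  # assumed polling interval of one second
        if read_log_position() != start:
            # The primary is still writing to the shared disk: taking over now
            # would risk two writers on the same disks (split brain).
            return False
    return True


# Simulated log positions; the no-op sleep keeps the example instantaneous.
silent_primary = iter([100] * 11)
print(should_take_over(lambda: next(silent_primary), 10, sleep=lambda s: None))  # True

active_primary = iter([100, 100, 101])
print(should_take_over(lambda: next(active_primary), 10, sleep=lambda s: None))  # False
```

A larger wait makes a false takeover less likely but lengthens the failover, so the value trades split-brain safety against RTO.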
  24. 24. Connection Manager
  25. 25. Connection Manager
      • Route client connection…
      [Diagram: ?, Cluster, Flexible Grid / ER]
  26. 26. Connection Manager
      • Failover arbitration
      [Diagram: New Primary, Cluster]
  27. 27. Connection Manager
      • Act as a proxy
      [Diagram: Port Blocked; CM as Proxy; CM-used port allowed; Client that cannot be recompiled]
  28. 28. Connection Manager
      • Connection unit types
        1) CLUSTER
        2) REPLSET
        3) GRID
        4) SERVERSET
      [Diagram: Primary, HDR, RSS, Enterprise Replication]
  29. 29. Connection Manager – Best Practices
      • Avoid single point of failure
      Client’s INFORMIXSQLHOSTS:
          g_mySLA    group     -        -        c=1,i=123456
          cm1_mySLA  onsoctcp  cm1Host  cm1Port  g=g_mySLA
          cm2_mySLA  onsoctcp  cm2Host  cm2Port  g=g_mySLA
          cm3_mySLA  onsoctcp  cm3Host  cm3Port  g=g_mySLA
  30. 30. c=1?
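
As far as the sqlhosts group options are documented, `c=1` asks the client to pick a random starting member of the group, while the default `c=0` starts at the first entry. The sketch below shows why that spreads clients across the Connection Managers. It is a model, not client library code: `connection_order` is a hypothetical function, and treating the remaining members as a rotation after the starting point is an assumption.

```python
import random


def connection_order(members, c=0, rng=None):
    """Return the order in which a client would try the members of a group."""
    if not members:
        return []
    if c == 0:
        start = 0  # always begin with the first entry in the group
    elif c == 1:
        start = (rng or random).randrange(len(members))  # random starting point
    else:
        raise ValueError("c must be 0 or 1")
    # Assumption: after the starting point, members are tried in list order.
    return members[start:] + members[:start]


group = ["cm1_mySLA", "cm2_mySLA", "cm3_mySLA"]
print(connection_order(group, c=0))  # ['cm1_mySLA', 'cm2_mySLA', 'cm3_mySLA']
print(connection_order(group, c=1, rng=random.Random()))  # some rotation of the group
```

With `c=0`, every client would reach `cm1_mySLA` first and the other two entries would only serve as fallbacks. With `c=1`, the initial connection load is spread over all three.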
  31. 31. Network paths offer perspective
      [Diagram: PRI, HDR, switch: “Is PRI down?” Yes; vs. PRI, HDR: “Is PRI down?” No]
  32. 32. We Value Your Feedback! Don’t forget to submit your Insight session and speaker feedback! Your feedback is very important to us – we use it to continually improve the conference. Access your surveys at to quickly submit your surveys from your smartphone, laptop or conference kiosk. 31
  33. 33. 32 Notices and Disclaimers Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. 
All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
  34. 34. 33 Notices and Disclaimers (con’t) Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. •  IBM, the IBM logo,, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DB2® , DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, IMS™, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. 
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at:
  35. 35. © 2015 IBM Corporation Thank You