
IBM z/OS V2R2 Networking Technologies Update

Presentation given during the IBM ITSO z Systems world tour 2015 in São Paulo, Brazil, from October 16 to 22, 2015. Presentation created by the IBM ITSO technical team.

Published in: Technology

  1. IBM z/OS V2R2 Networking Technologies Update
     International Technical Support Organization, Global Content Services, IBM Inside Sales
     ibm.com www.ibm.com/redbooks
     Chris Meyer – meyerchr@us.ibm.com
     Doris Bunn – dbunn@us.ibm.com
     Howie Odishoo – odishoo@us.ibm.com
     Mike Fox – mjfox@us.ibm.com
     Pat Brown – patbrown@us.ibm.com
     Todd Valler – tevaller@us.ibm.com
  2. 2. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-2 Notices This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. 
IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurement may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. 
COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
  3. 3. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-3 Trademarks The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: IBM has two registered trademarks for the branding of ITSO publications. These registered marks are for the text word "IBM Redbooks" and the Redbooks logo. In a nutshell, the term Redbooks must always be used in the plural form (for both text and logo) since IBM only owns the registered mark for the plural form. Usage must follow the guidelines below: Using the term Redbooks in written text Redbooks are only to be referred to in the plural form, NEVER in the singular. For the initial reference (first occurrence), you must use "IBM Redbooks®" and include "IBM" as well as the ®. For instances thereafter you may use "Redbooks" without "IBM" preceding the word or ® following it. Correct usage for written text : In this IBM Redbooks® publication we will explore…..(® symbol required for 1st usage) This Redbooks publication will show you…..(2nd usage or later - no ® or "IBM" needed) Using the logo: OTHER ITSO PUBLICATIONS - Marks not yet registered Trademark registration is a lengthy process and until we are officially registered, we cannot use the ® symbol. For those terms/logos in process, we will be using the ™ symbol. In contrast to the ® symbol (placed in the lower right hand corner), the ™ symbol is placed in the upper right hand corner. Please see examples below: Redpaper ™ Redpapers ™ Redwiki ™ Redwikis ™ The following terms are trademarks of other companies: UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others. Redbooks (logo)
  4. Session objectives
     • Provide an overview of the z/OS Communications Server features and enhancements delivered in V2R2
     • The following areas will be described for each item where appropriate
       – Background information
       – Business problem
       – Solution
       – Enablement actions
       – Externals
       – Migration considerations
  5. Release content themes
     • The release content is grouped into 4 major categories
       – Availability
       – Scalability and Performance
       – Security
       – Simplification and Usability
  6. Availability
     • Reordering of cached Resolver results
     • Activate trace resolver without restarting applications
     • CICS sockets support for CICS TS 4.2 transaction tracking
  7. REORDERING OF CACHED RESOLVER RESULTS
  8. Background information
     • System Resolver caching was introduced in z/OS V1R11 Communications Server
       – Resolver will only cache response data from Domain Name System (DNS) servers
       – Information obtained from local data files is not cached
       – Resolver maintains separate IPv4 and IPv6 entries for the same resource
     • Primary advantage of caching is the improved performance
       – Eliminates repetitive DNS queries
     • Caching activated on a system-wide basis
       – Individual applications can turn off caching independently
  9. Background information (continued)
     • Host name to IP address resolution options
       – Getaddrinfo, which supports both IPv4 and IPv6 addresses
       – Gethostbyname, which supports only IPv4 addresses
     • IP address to host name resolution options
       – Getnameinfo, which supports both IPv4 and IPv6 addresses
       – Gethostbyaddr, which supports only IPv4 addresses
  10. Business problem
      • Some DNS implementations reorder the list of IP addresses returned for a given host name in a round robin fashion
        – Provides a basic level of load balancing of IP addresses used by clients
      • Resolver caching does not reorder the list of IP addresses
        – IP addresses cached in the order received from the DNS server
        – Same order used for all subsequent requests for the life of the cache information
        – Any load balancing that might have been provided by the DNS server is eliminated
  11. Solution
      • Resolver can now reorder cached information
        – Both system-wide and application levels of control are provided
        – Only applicable to host name to IP address resolution (Getaddrinfo and Gethostbyname)
          – IP addresses resolve to a single host name, so there is nothing to be reordered
        – System-wide caching must be active
  12. Solution (continued)
      • Resolver reorders the cached information on a resolution query basis
        – Reordering is independent of which application issues the query
        – Reordering is independent of which type of query (Gethostbyname or Getaddrinfo) is issued
      • Resolver reorders IPv4 and IPv6 resource information separately
      • Resolver reorders the list before performing any sorting
        – Gethostbyname results sorted based on SORTLIST configuration statement
        – Getaddrinfo results sorted based on default destination address selection algorithm
  13. Solution (continued)
      • Application X issues Getaddrinfo request for aaa.com, and Resolver caches this list of IP addresses for aaa.com:
        [Figure: cached IP address list for aaa.com; not reproduced in this transcript]
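The per-query reordering described on the Solution slides can be sketched as follows. This is a minimal model, assuming the reordering is a simple round-robin rotation applied once per resolution query; the slides do not spell out the exact algorithm. The class name, the rotation step, and the sample addresses (taken from documentation ranges) are illustrative assumptions, and the later SORTLIST or destination address selection sorting is not modeled.

```python
from collections import deque


class ReorderingCacheEntry:
    """Toy model of one cached host name with separate IPv4 and IPv6 lists."""

    def __init__(self, ipv4, ipv6, reorder=True):
        self._lists = {"ipv4": deque(ipv4), "ipv6": deque(ipv6)}
        self._reorder = reorder

    def lookup(self, family):
        """Return the cached list for one family, then rotate it for the next query."""
        cached = self._lists[family]
        result = list(cached)
        if self._reorder and len(cached) > 1:
            cached.rotate(-1)  # assumed policy: the first address moves to the end
        return result


# Hypothetical cached data for a single host name.
entry = ReorderingCacheEntry(
    ipv4=["192.0.2.1", "192.0.2.2", "192.0.2.3"],
    ipv6=["2001:db8::1", "2001:db8::2"],
)
first = entry.lookup("ipv4")   # could come from any application or query type
second = entry.lookup("ipv4")  # the next query sees a rotated list
v6 = entry.lookup("ipv6")      # the IPv6 list rotates independently of IPv4
```

In this model the rotation state belongs to the cache entry, not to the caller, which mirrors the statement that reordering is independent of which application or which query type issues the request.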
  14. Enablement actions: Resolver setup statements
      • Use CACHEREORDER to activate cache reordering
      • Use NOCACHEREORDER to stop cache reordering
        – NOCACHEREORDER is the default
      • Resolver ignores either statement when the NOCACHE setup statement is also specified
      • You can modify the setting dynamically
        – Update setting in resolver setup file, then issue MODIFY <resolver>,REFRESH,SETUP=<setup file name>
  15. Enablement actions: TCPIP.DATA file
      • Use the new NOCACHEREORDER statement to stop cache reordering for any application using this profile
        – NOCACHEREORDER is meaningless if either system-wide caching or cache reordering is not active
        – Specifying NOCACHEREORDER in the GLOBALTCPIPDATA data set is the equivalent of coding the NOCACHEREORDER setup statement
      • You can modify the setting dynamically
        – Update setting in TCPIP.DATA file, then issue MODIFY <resolver>,REFRESH
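The way the setup file and TCPIP.DATA statements combine can be expressed as a small decision function. This is only a sketch of the rules stated on the two enablement slides; the function and parameter names are hypothetical, and it does not model an application turning off caching for itself.

```python
def cache_reorder_active(setup_nocache=False,
                         setup_cachereorder=False,
                         tcpipdata_nocachereorder=False):
    """Return True if cached results would be reordered for an application.

    setup_nocache            -- NOCACHE coded in the resolver setup file
    setup_cachereorder       -- CACHEREORDER coded in the resolver setup file
    tcpipdata_nocachereorder -- NOCACHEREORDER coded in the TCPIP.DATA file
                                used by the application
    """
    if setup_nocache:
        return False  # reorder statements are ignored when caching is off
    if not setup_cachereorder:
        return False  # NOCACHEREORDER is the setup file default
    return not tcpipdata_nocachereorder  # per-application opt-out


default_setting = cache_reorder_active()
system_wide = cache_reorder_active(setup_cachereorder=True)
opted_out = cache_reorder_active(setup_cachereorder=True,
                                 tcpipdata_nocachereorder=True)
```

The defaults of the keyword arguments follow the slide: with nothing coded, reordering is off.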
  16. Externals: MODIFY RESOLVER display changes
      • CACHEREORDER (or NOCACHEREORDER) setting included in MODIFY RESOLVER,DISPLAY output
  17. Externals: Trace RESOLVER changes
      • CACHEREORDER (or NOCACHEREORDER) setting included in res_init Trace Resolver output
  18. Externals
      • Resolver NMI (EZBREIFR) available starting with z/OS V1R13 Communications Server
        – Updated to include new setup file setting
        – Updated to include GLOBALTCPIPDATA file setting, if any
      • IPCS Resolver output also updated to include new setup file setting
  19. ACTIVATE TRACE RESOLVER WITHOUT RESTARTING APPLICATIONS
  20. Background information
      • Trace Resolver is useful for diagnosing problems in resolving host names to IP addresses, or IP addresses to host names
      • Trace Resolver traces information on a per-application basis
      • Trace Resolver can be enabled using one of these methods:
        – z/OS UNIX RESOLVER_TRACE environment variable
        – SYSTCPT DD allocation in the MVS batch job or TSO environment
        – TRACE RESOLVER or OPTIONS DEBUG statement in the TCPIP.DATA file
        – Debug option (resDebug) in an application $__res_state structure
  21. Background information (continued)
      • Trace Resolver output can be written to a variety of locations
        – TSO user terminal screen
        – Existing MVS sequential data set
        – New or existing HFS file
        – JES SYSOUT (for MVS batch job)
      • Record length can be between 80 and 256 characters
        – If the record length is 128 or larger, the last six print positions are the storage address of the MVS TCB that issued the resolver call
  22. Background information (continued)
      • Component Trace (CTRACE) is useful for collecting additional Resolver debug information
        – Resolver CTRACE component is SYSTCPRE
      • Unlike Trace Resolver, Resolver CTRACE shows resolver actions for all applications
        – Information can be filtered by JOBNAME, ASID, or both
        – All Resolver CTRACE records written to a common output location
      • Only two Resolver CTRACE options
        – ALL, MINIMUM
  23. Business problem
      • Dynamically starting or stopping Trace Resolver takes two steps:
        – Setting TRACE RESOLVER or OPTIONS DEBUG in the TCPIP.DATA file
        – Issuing the MODIFY RESOLVER,REFRESH command
      • This approach is not possible for long-running Started Task Control (STC) servers
        – STC servers use SYSTCPT DD allocation method or z/OS UNIX RESOLVER_TRACE environment variable to start trace
        – Modifying the setting of the Trace Resolver requires stopping and restarting the server
        – Extremely disruptive to users and typically requires scheduled outage
  24. Solution
      • Use Resolver CTRACE to collect Trace Resolver information as CTRACE records
        – New CTRACE option (TRACERES) defined
        – Supports ASID and JOBNAME filtering
        – Allows Trace Resolver information to still be collected on an individual application basis
        – Allows Trace Resolver information to be collected without stopping and restarting the server
      • Use IPCS CTRACE subcommand processing to view the formatted component trace data
  25. Enablement actions: Activate tracing
      • Use TRACE CT,ON command to enable the collection of Trace Resolver output as Resolver CTRACE records
        – Full syntax: TRACE CT,ON,COMP=SYSTCPRE,SUB=(resolver jobname)
        – Specify OPTION=(TRACERES) in response text, plus any additional filters
  26. Enablement actions (continued)
      • Example of starting TRACERES collection using the TRACE CT command
  27. Enablement actions: Disable tracing
      • Use TRACE CT,ON command to disable the collection of Trace Resolver output as Resolver CTRACE records
        – Full syntax: TRACE CT,ON,COMP=SYSTCPRE,SUB=(resolver jobname)
        – Specify OPTION=() in response text, plus any additional filters
        – OPTION=(ALL) or OPTION=(MINIMUM) can also be used
  28. Externals
      • Use IPCS CTRACE subcommand processing to view the formatted component trace data from a dump or an external CTRACE data set
  29. Externals (continued)
      • Examples of formatted CTRACE TRACERES records
  30. CICS SOCKETS SUPPORT FOR CICS TS 4.2 TRANSACTION TRACKING
  31. Business problem
      • CICS Transaction Server V4R2 introduced a new function to supply metadata to identify transaction Point of Origin information
        – CICS Explorer can display the Point of Origin information
        – CICS SMF records include the Point of Origin information
      • Point of Origin information is useful for problem determination
      • CICS TCP/IP sockets support does not register Point of Origin information
        – The CICS TCP/IP sockets listener transaction (CSKL) is commonly used to initiate CICS transactions
        – The missing Point of Origin information for CSKL-initiated transactions reduces the value of CICS transaction tracking and adds complexity to problem diagnosis
  32. Solution
      • Add support for transaction tracking to CICS TCP/IP sockets
        – Listener program EZACIC02 (CSKL) makes Point of Origin information available to the TRUE (task-related user exit)
        – TRUE program EZACIC01 uses CICS facilities to register Point of Origin information for the transaction
        – CICS Transaction Server for z/OS Version 4.2 and later allow resource managers to register tracking information in their TRUE
        – No Point of Origin information registered for other transactions
          – Transactions acting as clients
          – Non-IBM provided listeners (i.e. vendor or home grown listeners)
  33. Enablement actions
      • None
  34. Externals
      • Transaction tracking fields
        – Origin Adapter Data 1 → TCPIP Jobname
        – Origin Adapter Data 2 → Local IP address and local port (Listener)
        – Origin Adapter Data 3 → Remote IP address and remote port
        – Origin Adapter ID → IBM zOS CommServer supplied listener name
  35. Externals: CICS Explorer
  36. Externals: CICS SMF 110 subtype 001 record
  37. Scalability and Performance
      • 64 bit enablement of the TCP/IP stack
      • Enterprise Extender scalability
      • Enhanced IKED scalability
      • Shared memory communications over RDMA enhancements
      • Shared memory communications over RDMA adapter (RoCE) virtualization
      • SMC applicability tool (SMCAT)
      • Increase single stack DVIPA limit to 4096
      • Removed support for legacy devices
      • VIPAROUTE fragmentation avoidance
      • TCP autonomic tuning
  38. 64 BIT ENABLEMENT OF THE TCP/IP STACK
  39. Background information: z/OS storage map
      [Figure: z/OS virtual storage map; diagram not reproduced. Labels recovered from the slide: 24-bit storage below 16 MB (PSA, private area, LSQA, common: LPA/CSA and Nucleus/SQA); 31-bit storage from 16 MB to 2 GB (extended common: ESQA/ENucleus and ECSA/ELPA, extended private area, ELSQA); reserved area from 2 GB to 4 GB; 64-bit storage up to 16 EB (user extended private area, shared area between 2 TB and 512 TB, user extended private area).]
  40. Background information: Prior 64-bit usage
      • z/OS V1R11 Communications Server
        – Socket Control Blocks (SCBs)
      • z/OS V1R13 Communications Server
        – VTAM Internal Trace (VIT)
        – TCP/IP CTRACE Area
        – TN3270 CTRACE Area
      • z/OS V2R1 Communications Server
        – Shared Memory Communications for RDMA (SMC-R) control blocks and network data
  41. Business problem
      • Workload consolidation and larger systems
        – Increases demand for ECSA
        – Increases demand on TCP/IP private area
      • Performance implications
        – AMODE switching to reference 64 bit storage
        – Use of 31 bit addressing in AMODE(64)
        – Access Register (AR) mode switching to reference dataspace storage
  42. Solution
      • Convert the TCP/IP stack to run in AMODE(64)
      • Convert the TCP/IP stack to use 64 bit addresses
      • Move 31 bit data areas to 64 bit storage
        – Run time work areas and save areas (DUCB/DUSA)
          – Moved from ECSA/private
        – Network data (CSM)
          – Moved from ECSA/dataspace
          – Reduce switches to AR mode to reference dataspace
        – Transmission control block (TCB)
          – Moved from private
  43. Solution: Network connectivity and 64 bit storage
      • Interface types which exploit 64-bit virtual data in z/OS CS V2R2
        – OSA-Express QDIO
          – Inbound Enterprise Extender (EE) traffic with Inbound Workload Queueing (IWQ) still uses 31-bit CSM dataspace
        – HiperSockets
        – RoCE Express (for SMC-R)
      • All other supported TCP/IP network connectivity (such as MPCPTP, LCS, CTC) is compatible with 64-bit virtual memory
        – These are referred to as 31-bit network interface types
        – z/OS CS still uses 31-bit CSM dataspace for these types
  44. Solution: Storage results
      • Lab results for 128,000 TN3270 sessions (in KB)

                             V2R1       V2R2    % change
       TN3270 ECSA          1,575        145        -91%
       TN3270 Private     440,054    541,618        +23%
       TCP/IP ECSA          9,188      6,593        -28%
       TCP/IP Private     275,338     43,332        -84%
       TCP/IP HVCOMMON     63,000     70,000        +11%
       TCP/IP HVPRIVATE     1,000    513,000    +51,200%
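The percent-change column can be recomputed from the V2R1 and V2R2 figures on this slide. The sketch below uses only the KB values shown above; the function and variable names are illustrative. Note that for TCP/IP HVPRIVATE the figures correspond to a 512-fold increase, which is 51,200 when expressed as a percentage.

```python
def percent_change(before_kb, after_kb):
    """Relative change from before to after, expressed in percent."""
    return (after_kb - before_kb) / before_kb * 100.0


# (V2R1, V2R2) storage in KB, copied from the lab results slide.
lab_results_kb = {
    "TN3270 ECSA": (1575, 145),
    "TN3270 Private": (440054, 541618),
    "TCP/IP ECSA": (9188, 6593),
    "TCP/IP Private": (275338, 43332),
    "TCP/IP HVCOMMON": (63000, 70000),
    "TCP/IP HVPRIVATE": (1000, 513000),
}

changes = {
    name: round(percent_change(v2r1, v2r2))
    for name, (v2r1, v2r2) in lab_results_kb.items()
}

for name, change in changes.items():
    print(f"{name:<18}{change:>+8}%")
```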
  45. Enablement actions: New IVTPRM00 parameter
      • HVCOMM maxhvcommM
        – Defines the maximum amount of storage dedicated to High Virtual Common storage CSM buffers.
        – maxhvcommM
          – A decimal integer specifying the maximum bytes of HVCOMM storage dedicated to CSM use.
          – Valid Range: 100M to 999999M
          – Default Value: 2000M
        – Notes:
          – M indicates megabytes
          – Defined in megabytes only
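The range and default rules for the HVCOMM value can be illustrated with a small validator. This is a sketch under stated assumptions, not how CSM parses IVTPRM00: the function name is hypothetical, and the strict "digits followed by an uppercase M" format is an assumption drawn from the slide's note that the value is defined in megabytes only.

```python
import re

HVCOMM_MIN_MB = 100        # lower bound from the slide
HVCOMM_MAX_MB = 999999     # upper bound from the slide
HVCOMM_DEFAULT_MB = 2000   # default from the slide


def parse_hvcomm(value=None):
    """Return the HVCOMM limit in megabytes, applying the default and range."""
    if value is None:
        return HVCOMM_DEFAULT_MB  # parameter not coded
    match = re.fullmatch(r"(\d+)M", value.strip())
    if match is None:
        raise ValueError(f"HVCOMM must be a decimal integer followed by M: {value!r}")
    megabytes = int(match.group(1))
    if not HVCOMM_MIN_MB <= megabytes <= HVCOMM_MAX_MB:
        raise ValueError(
            f"HVCOMM must be between {HVCOMM_MIN_MB}M and {HVCOMM_MAX_MB}M: {value!r}"
        )
    return megabytes


default_mb = parse_hvcomm()
coded_mb = parse_hvcomm("1000M")
```

Values such as "50M" (below the range) or "2G" (not in megabytes) raise ValueError in this sketch.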
  46. Externals: Modify CSM
      • MODIFY CSM command to update CSM storage values dynamically or to activate changes made to the CSM parmlib member IVTPRM00 without requiring an IPL

        MODIFY proc,CSM[,ECSA=mecsa][,FIXED=mfix][,HVCOMM=mhvcomm]

        – mhvcomm specifies the maximum number of bytes of high virtual common (HVCOMM) storage for CSM buffers
        – Valid Range: 100M to 999999M
  47. Externals: Display NET,CSM
      • Display NET,CSM command output showing 64 bit storage values

        **** Continued ***
        IVT5532I ------------------------------------------------------
        IVT5533I    4K HVCOMM      24K     1000K        1M
        IVT5533I   16K HVCOMM      96K      928K        1M
        IVT5533I   32K HVCOMM     192K      832K        1M
        IVT5533I   60K HVCOMM     360K      660K     1020K
        IVT5533I  180K HVCOMM     720K     1080K     1800K
        IVT5535I TOTAL HVCOMM    1392K     4500K     5892K
        IVT5532I ------------------------------------------------------
        """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        IVT5538I FIXED MAXIMUM = 2048M FIXED CURRENT = 5949K
        IVT5541I FIXED MAXIMUM USED = 5949K SINCE LAST DISPLAY CSM
        IVT5594I FIXED MAXIMUM USED = 5949K SINCE IPL
  48. Externals: Display NET,CSM (continued)
      • Display NET,CSM command output showing 64 bit storage values

        **** Continued ***
        IVT5539I ECSA MAXIMUM = 100M ECSA CURRENT = 5073K
        IVT5541I ECSA MAXIMUM USED = 5073K SINCE LAST DISPLAY CSM
        IVT5594I ECSA MAXIMUM USED = 5073K SINCE IPL
        IVT5604I HVCOMM MAXIMUM = 1000M HVCOMM CURRENT = 9M
        IVT5541I HVCOMM MAXIMUM USED = 9M SINCE LAST DISPLAY CSM
        IVT5594I HVCOMM MAXIMUM USED = 9M SINCE IPL
        IVT5559I CSM DATA SPACE 1 NAME: CSM64001
        IVT5559I CSM DATA SPACE 2 NAME: CSM31002
        IVT5599I END
  49. Externals: VTAM Internal Trace (VIT)
      • New and changed VIT records with 64-bit addresses
        – New records
          – IUT6 (outbound QDIO)
          – XB61, XB62, XB63 (inbound/outbound QDIO)
          – QAP6 (QDIO Accelerator)
          – GCE6 (64-bit CSM)
        – Changed records
          – ODPK (inbound/outbound QDIO)
  50. Migration considerations
      • IVTPRM00 default value for FIXED changed to 200M
        – Defines the maximum amount of storage dedicated to fixed CSM buffers.
      • Use VIPAROUTE over OSA-Express QDIO or HiperSockets to optimize sysplex distributor (SD) traffic
        – Forwarding over 31 bit network interface types (XCF) involves additional data copy
      • Use the IWQ function for OSA-Express QDIO to optimize EE inbound traffic (INBPERF WORKLOADQ)
        – EE inbound traffic will be staged in 31 bit storage
      • Display NET,CSM displays new HVCOMM information
  51. ENTERPRISE EXTENDER SCALABILITY
  52. Background information
      • An Enterprise Extender local node communicates with a remote node using UDP over an IP network
        – A local node is defined within a TCP/IP stack by a local IP address, typically a static VIPA, and 5 UDP sockets (5 UCB control blocks).
        – Each UDP socket is bound to the static VIPA and one of 5 UDP ports (default 12000-12004). The ports map to 4 SNA routing priorities for data traffic, plus one port for LLC commands
        – An EE link represents the “connection” between a local node and remote node. The link has 5 routes through the IP network, one for each port
      [Figure: EE local node (VIPA, UCB table with ports 12000 to 12004, EE route cache) connected to the EE remote node by five routes; diagram not reproduced]
  53. Business problem: Scaling issues
      • As packets are processed:
        – Serialization on one of 5 UCBs causes performance bottlenecks and storage constraints for suspended threads (suspended DUCBs)
        – Increased cache misses on IPSEC and Policy rules cause higher CPU utilization
      • As an EE link to a remote node is created, extra processing time is needed to find open slots in IP MAIN's route cache (lesser concern since this is per connection)
  54. Solution
      • Create a new “remote UCB” structure for each EE port
        – IPSEC rules and Policy moved from main UCB
        – Outbound flows – new “remote UCB” lock accessed instead of local UCB lock
        – Inbound flows – EE policy lock replaced by remote UCB lock
      • Access remote UCB – using one of 5 new hash tables added to UCB table (one per local port)
        – Hash key to access remote UCB is remote node's IP and port
      • Move route cache to remote UCB
      [Figure: EE local node (VIPA, UCB table with ports 12000 to 12004) and EE remote node; each remote UCB is reached by hash key and holds IPSEC rules (inbound and outbound filter rules), inbound and outbound policy, and route info; diagram not reproduced]
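The lookup structure described on this slide, one hash table per local EE port keyed by the remote node's IP address and port, can be sketched as below. This is a toy model only: the class and field names are hypothetical, Python dictionaries stand in for the hash tables, and the per-remote-UCB locking that the slide mentions is not modeled.

```python
from dataclasses import dataclass, field

EE_LOCAL_PORTS = (12000, 12001, 12002, 12003, 12004)  # default EE UDP ports


@dataclass
class RemoteUcb:
    """State kept per remote endpoint instead of on the main (local) UCB."""
    remote_ip: str
    remote_port: int
    ipsec_rules: dict = field(default_factory=dict)
    policy: dict = field(default_factory=dict)
    route_info: dict = field(default_factory=dict)


class UcbTable:
    def __init__(self):
        # One hash table per local EE port, keyed by (remote IP, remote port).
        self._tables = {port: {} for port in EE_LOCAL_PORTS}

    def remote_ucb(self, local_port, remote_ip, remote_port):
        """Find or create the remote UCB for a local port and remote endpoint."""
        table = self._tables[local_port]
        key = (remote_ip, remote_port)
        if key not in table:
            table[key] = RemoteUcb(remote_ip, remote_port)
        return table[key]


# Hypothetical remote nodes using documentation addresses.
ucb_table = UcbTable()
a = ucb_table.remote_ucb(12001, "192.0.2.10", 12001)
b = ucb_table.remote_ucb(12001, "192.0.2.10", 12001)  # same remote node
c = ucb_table.remote_ucb(12001, "192.0.2.20", 12001)  # different remote node
```

Because each remote endpoint gets its own object, traffic for different remote nodes no longer contends on one of only five shared structures, which is the scaling point the slide makes.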
  55. Enablement actions
      • None
  56. Migration considerations
      • None
  57. ENHANCED IKED SCALABILITY
  58. Background information: Internet Key Exchange
      • IKE peers negotiate an IKE (“phase 1”) tunnel (one bidirectional SA) over an unprotected UDP socket
        – The peers authenticate each other using digital signatures based on digital certificates or pre-shared keys
        – Peers agree on a set of cryptographic algorithms to use to protect the subsequent IKE messages that will flow between the two (phase 2 SA negotiations, informational exchanges and notifications)
      • IKE peers negotiate an IPSec (“phase 2”) tunnel (two unidirectional SAs) under protection of the IKE tunnel. These SAs are installed into the TCP/IP stack
        – A series of IKE messages is exchanged under the protection of the phase 1 tunnel. This includes encryption, authentication and integrity protections for every IKE message
        – Upon completion, the phase 2 SAs are installed in the TCP/IP stack
      • Data flows through the IPSec tunnel using the Authentication Header (AH) and/or Encapsulating Security Payload (ESP) protocol
        – Data packets are sent between the IPsec endpoints under the protection of the phase 2 tunnels. This includes encryption, authentication and integrity protections for every data packet
        – The IKE daemon is not involved until it is time to refresh or delete one of the security associations
  59. 59. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-59 Business problem • When a very large number (multiple thousands) of remote IKE peers simultaneously initiate negotiations with a single z/OS IKED, the z/OS daemon struggles to keep up with the load • Symptoms: – A large portion of the remote IKE peers retransmit messages due to timeouts (per the IKE protocol) – Inbound IKE messages are discarded by z/OS TCP/IP stack as capacity of UDP queues is reached – z/OS IKED spends more and more time handling retransmitted messages from peers (per the IKE protocol) – IKED takes a significant amount of time to recover to a stable state – A “stairstep” effect in the rate of negotiation activity – Bursts interleaved with increasingly longer quiet intervals – Dropped inbound IKE messages and IKE protocol's geometric back-off
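The increasingly long quiet intervals follow directly from a geometric back-off. The sketch below is a generic illustration in Python; the initial timeout and the back-off factor are assumptions chosen for the example and are not taken from the IKE protocol or from IKED.

```python
def retransmit_times(initial_timeout: float, factor: float,
                     attempts: int) -> list:
    """Return the elapsed time at which each retransmission is sent.

    Each wait is `factor` times longer than the previous one, so the
    gaps between bursts of retransmissions keep growing.
    """
    times = []
    elapsed = 0.0
    timeout = initial_timeout
    for _ in range(attempts):
        elapsed += timeout
        times.append(elapsed)
        timeout *= factor
    return times


# Assumed values: 1 second initial timeout, doubling each time.
schedule = retransmit_times(1.0, 2.0, 4)
gaps = [later - earlier for earlier, later in zip(schedule, schedule[1:])]
```

When thousands of peers start at about the same moment, their retransmissions arrive in synchronized bursts at these times, which matches the burst-and-quiet pattern the slide describes.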
  60. 60. Business problem (continued) • “Stairstep effect” of large numbers of remote IKE peer retransmissions (Figure: completed tunnels plotted against time.)
  61. 61. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-61 Solution • A new thread pool with appropriate serialization is added to IKED – IKE negotiations are now handled by this pool (vs. a single thread per the previous design) – Inbound IKE protocol messages – Other internal events required to complete the negotiations – No permanent affinity between a given IKE peer and any thread within IKED. • Inbound IKE messages now prioritized – Duplicate (retransmitted) IKE messages are detected and discarded upon receipt – significantly reduces workload – “Later” IKE messages prioritized ahead of “earlier” ones – promotes completion of in-progress negotiation before starting new ones
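The two prioritization rules on this slide (discard duplicates on receipt, serve “later” messages before “earlier” ones) can be sketched with a priority queue. This Python example is a simplified illustration, not IKED's design: the `step` number that says how far a negotiation has progressed, and the way a duplicate is recognized, are assumptions made for the sketch.

```python
import heapq
import itertools


class InboundIkeQueue:
    """Hypothetical inbound queue: drops duplicates, favors later steps."""

    def __init__(self) -> None:
        self._heap = []
        self._seen = set()
        self._arrival = itertools.count()  # tie-breaker: arrival order

    def offer(self, peer: str, message_id: int, step: int) -> bool:
        """Queue a message; return False if it is a retransmitted duplicate."""
        key = (peer, message_id, step)
        if key in self._seen:
            return False  # duplicate detected and discarded upon receipt
        self._seen.add(key)
        # Negate step so that higher (later) steps pop first.
        heapq.heappush(self._heap,
                       (-step, next(self._arrival), peer, message_id))
        return True

    def take(self):
        """Return the next (peer, message_id, step) for a pool thread."""
        neg_step, _, peer, message_id = heapq.heappop(self._heap)
        return peer, message_id, -neg_step


queue = InboundIkeQueue()
accepted_a = queue.offer("peerA", 1, step=1)   # new negotiation
accepted_b = queue.offer("peerB", 7, step=3)   # negotiation nearly done
duplicate = queue.offer("peerA", 1, step=1)    # retransmission of the first
next_msg = queue.take()
```

Any worker thread can call `take()`, which reflects the slide's point that no peer has a permanent affinity to a particular thread.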
  62. 62. Solution (continued) • Initial scalability testing has been very positive - generally linear scalability as the number of CPUs is increased • Changes will be transparent to the vast majority of z/OS IKED users – significant improvements will be more noticeable under heavier workloads (Figure: V2R1 compared with V2R2.)
  63. 63. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-63 Enablement actions • None
  64. 64. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-64 Externals • One new level added to IkeSyslogLevel parameter of the iked.conf file 128 – IKE_SYSLOG_LEVEL_DEBUGPTP Show additional information regarding primary thread pool scheduling • Syslogd output: – New messages for log level DEBUGPTP – Messages might now be interleaved (up until now, they have appeared in an order that was fairly representative of the actual order of events) – IKED thread ID will now appear in the syslogd message header
  65. 65. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-65 Migration considerations • Those with multiple thousands of IKE peers might need to adjust specific resources: – Virtual storage available to IKED – Maximum number of messages allowed on z/OS message queues – Limitations on number of messages allowed on inbound UDP queues • Automated processing of SYSLOGD messages may need to be adjusted for the thread id
  66. 66. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-66 SHARED MEMORY COMMUNICATIONS OVER RDMA ENHANCEMENTS
  67. 67. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-67 Background information: SMC-R • SMC-R is a “hybrid” solution: – Existing TCP connection establishment flows still used – SMC-R option exchanged as TCP option in connection establishment – SMC-R usage negotiated similarly to how SSL usage is negotiated – Application data flows “out-of-band” using RDMA protocols – RoCE Express MTUs 1024 and 2048 supported – Peers negotiate and use the smallest size supported • Preserves critical existing operational and network management features of TCP/IP
  68. 68. Background information: View of SMC-R • Shared Memory Communications over RDMA (SMC-R) defines a means to exploit Remote Direct Memory Access (RDMA) technology for communications transparently to the applications (Diagram: a client and a server, each in a virtual server instance in an OS image on an SMC-R enabled platform, use sockets over SMC; their shared memory is connected through RDMA enabled (RoCE) RNICs across clustered systems.)
  69. 69. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-69 Business problem: MTU • The RoCE Express can support sending data in three different MTU sizes: 1024, 2048 and 4096 – z/OS V2R1 SMCR implementation supported PFID configuration of just two of the sizes: 1024 and 2048 – For large data sends, a larger MTU can improve throughput
  70. 70. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-70 Solution: Support 4K MTU • GLOBALCONFIG SMCR PFID configuration now supports 4K MTU • Existing displays will show new value – Netstat,CONFIG/-f command shows configured value – Netstat,DEvlinks/-d,SMC command will show actual value in use for SMCR link
  71. 71. Enablement actions: MTU configuration • Configure the SMCR PFID MTU value with GLOBALCONFIG – Default MTU value is 1024
      >>-GLOBALCONFig------------------------------------------------->
      ...
         '-SMCR--+-----------------------------------------------+-'
                 | .-------------------------------------------. |
                 | |                 .-----------------------. | |
                 | V                 V                       | | |
                 '---PFID--pfid------+-+------------------+--+-+-'
                                       | .-PORTNum--1---. |
                                       +-+--------------+-+
                                       | '-PORTNum--num-' |
                                       | .-MTU--1024----. |
                                       '-+--------------+-'
                                         '-MTU--mtusize-'
      ...
  72. 72. Externals: Netstat CONFIG/-f • Confirm PFID MTU is configured correctly
      GLOBAL CONFIGURATION INFORMATION:
      TCPIPSTATS: YES  ECSALIMIT: 2096128K  POOLLIMIT: 2096128K
      MLSCHKTERM: NO  XCFGRPID: 11  IQDVLANID: 27
      SYSPLEXWLMPOLL: 060  MAXRECS: 100
      EXPLICITBINDPORTRANGE: 05000-06023  IQDMULTIWRITE: YES
      WLMPRIORITYQ: YES
        IOPRI1 0 1
        IOPRI2 2
        IOPRI3 3 4
        IOPRI4 5 6 FWD
      SYSPLEX MONITOR:
        TIMERSECS: 0060  RECOVERY: YES  DELAYJOIN: NO  AUTOREJOIN: YES
        MONINTF: YES  DYNROUTE: YES  JOIN: YES
      zIIP:
        IPSECURITY: YES  IQDIOMULTIWRITE: YES
      SMCR: YES
        FIXEDMEMORY: 200M  TCPKEEPMININT: 00000300
        PFID: 001C  PORTNUM: 1  MTU: 1024
        PFID: 0015  PORTNUM: 2  MTU: 4096
  73. 73. Externals: Netstat DEVLINKS,SMC • Confirm actual MTU for SMC-R link is correct
      D TCPIP,TCPCS1,NETSTAT,DEVLINKS,SMC
      EZD0101I NETSTAT CS V2R2 TCPCS1
      INTFNAME: EZARIUT1001C  INTFTYPE: RNIC  INTFSTATUS: READY
        PFID: 001C  PORTNUM: 1  TRLE: IUT1001C
        PNETID: ZOSNET
        VMACADDR: 02000035F740
        GIDADDR: FE80::200:FF:FE35:F740
        INTERFACE STATISTICS:
          BYTESIN = 160
          INBOUND OPERATIONS = 5
          BYTESOUT = 344
          OUTBOUND OPERATIONS = 11
          SMC LINKS = 1
          TCP CONNECTIONS = 1
          INTF RECEIVE BUFFER INUSE = 64K
        SMC LINK INFORMATION:
          LOCALSMCLINKID: 2D8F0101  REMOTESMCLINKID: 729D0101
            SMCLINKGROUPID: 2D8F0100  VLANID: 100  MTU: 4096
            LOCALGID: FE80::200:FF:FE35:F740
            LOCALMACADDR: 02000035F740  LOCALQP: 000040
            REMOTEGID: FE80::200:1FF:FE35:F740
            REMOTEMACADDR: 02000135F740  REMOTEQP: 000041
  74. 74. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-74 Migration considerations • None – When an SMC-R link is initially established between two peer hosts, the MTU size is exchanged and negotiated to the lowest value for both hosts
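The negotiation rule stated here is simply “use the lowest MTU that both hosts are configured for”. A minimal Python sketch follows; the function name `negotiate_mtu` is hypothetical and the real exchange happens inside the SMC-R link setup flows.

```python
# MTU sizes the RoCE Express feature can use, per the earlier slide.
SUPPORTED_MTUS = (1024, 2048, 4096)


def negotiate_mtu(local_mtu: int, peer_mtu: int) -> int:
    """Return the MTU an SMC-R link would use: the lower of the two values."""
    for mtu in (local_mtu, peer_mtu):
        if mtu not in SUPPORTED_MTUS:
            raise ValueError("unsupported MTU: %d" % mtu)
    return min(local_mtu, peer_mtu)


# A V2R2 host configured for 4096 talking to a peer still at 1024.
mixed = negotiate_mtu(4096, 1024)
# Both peers configured for 4096.
both_large = negotiate_mtu(4096, 4096)
```

This is why no migration action is needed: a peer that has not been reconfigured keeps the link at the smaller size.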
  75. 75. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-75 Business problem: Repeated SMC-R failures • Every SMC-R eligible TCP connection will attempt to connect to its peer using SMC-R • Examples of reasons a TCP connection cannot use SMC-R – IPSEC – Mismatching subnets (two peers not in same subnet or vlan) – Link layer issues prevent connectivity over RoCE fabric – Config problem – Connection setup delays possible • In these cases the stack attempts to use SMC-R then generally falls back to TCP • These conditions can exist for extended periods of time affecting numerous TCP connections – Even if they fallback to using TCP they incur overhead
  76. 76. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-76 Solution: Cache SMC-R failures • Cache IP destinations with persistent SMC-R establishment failures – Cached when we encounter three consecutive failures in an interval (approximately twenty minutes) – While cached, connections will use TCP – Cached destinations cleared approximately every interval – Gives new connections opportunity to exploit SMC-R periodically – Cache can also be cleared by disabling AUTOCACHE function • Enabled with new GLOBALCONFig SMCGlobal AUTOCACHE configuration statement – Enabled by default – Disabled with GLOBALCONFig SMCGlobal NOAUTOCACHE
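The caching behaviour can be sketched as a small state machine. This Python example is an illustration under stated assumptions, not the stack's implementation: the class name `SmcFailureCache` is hypothetical, a successful SMC-R setup is assumed to reset the consecutive-failure count, and the interval timer is represented by an explicit `end_of_interval()` call.

```python
class SmcFailureCache:
    """Hypothetical cache of destinations with persistent SMC-R failures."""

    FAILURE_THRESHOLD = 3  # three consecutive failures, per the slide

    def __init__(self) -> None:
        self._consecutive = {}
        self._cached = set()

    def record_result(self, dest_ip: str, smc_ok: bool) -> None:
        if smc_ok:
            # Assumption: a success breaks the run of consecutive failures.
            self._consecutive.pop(dest_ip, None)
            return
        count = self._consecutive.get(dest_ip, 0) + 1
        self._consecutive[dest_ip] = count
        if count >= self.FAILURE_THRESHOLD:
            self._cached.add(dest_ip)

    def should_try_smc(self, dest_ip: str) -> bool:
        # While a destination is cached, new connections go straight to TCP.
        return dest_ip not in self._cached

    def end_of_interval(self) -> None:
        # Clearing the cache gives new connections another chance at SMC-R.
        self._cached.clear()
        self._consecutive.clear()


cache = SmcFailureCache()
cache.record_result("10.1.1.24", smc_ok=False)
cache.record_result("10.1.1.24", smc_ok=False)
after_two = cache.should_try_smc("10.1.1.24")
cache.record_result("10.1.1.24", smc_ok=False)
after_three = cache.should_try_smc("10.1.1.24")
cache.end_of_interval()
after_clear = cache.should_try_smc("10.1.1.24")
```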
  77. 77. Enablement actions • Configured with GLOBALCONFig SMCGlobal statement • Default value is AUTOCACHE (function enabled)
      >>-GLOBALCONFig-------------------------------------------------><
         :
         |             .-------------------------.
         |             V  .-AUTOCACHE---.        |
         +-SMCGlobal------+-------------+--------+--
                       |  '-NOAUTOCACHE-'
                       |  .-AUTOSMC-----.
                       '--+-------------+
                          '-NOAUTOSMC---'
  78. 78. Externals: Netstat CONFIG/-f • Confirm AUTOCACHE is configured correctly
      GLOBAL CONFIGURATION INFORMATION:
      TCPIPSTATS: NO  ECSALIMIT: 0000000K  POOLLIMIT: 0000000K
      MLSCHKTERM: NO  XCFGRPID:  IQDVLANID: 0
      SYSPLEXWLMPOLL: 060  MAXRECS: 100
      EXPLICITBINDPORTRANGE: 00000-00000  IQDMULTIWRITE: NO
      AUTOIQDX: ALLTRAFFIC
      ADJUSTDVIPAMSS: AUTO
      WLMPRIORITYQ: NO
      SYSPLEX MONITOR:
        TIMERSECS: 0060  RECOVERY: NO  DELAYJOIN: NO  AUTOREJOIN: NO
        MONINTF: NO  DYNROUTE: NO  JOIN: YES
      ZIIP:
        IPSECURITY: NO  IQDIOMULTIWRITE: NO
      SMCGLOBAL:
        AUTOCACHE: YES  AUTOSMC: NO
      SMCR: YES
        FIXEDMEMORY: 200M  TCPKEEPMININT: 00000300
        PFID: 001C  PORTNUM: 1  MTU: 1024
        PFID: 0015  PORTNUM: 2  MTU: 4096
  79. 79. Externals: Netstat ALL/-A • Determine if connection was cached not to use SMC-R – Asterisk (*) after reason code indicates destination IP address was cached
      D TCPIP,TCPCS1,NETSTAT,ALL,IPPORT=10.1.1.14+21
      EZD0101I NETSTAT CS V2R2 TCPCS1
      CLIENT NAME: FTPDOE34  CLIENT ID: 0000003B
        LOCAL SOCKET: ::FFFF:10.1.1.14..21
        FOREIGN SOCKET: ::FFFF:10.1.1.24..1024
        ...
        SMC INFORMATION:
          SMCSTATUS: INACTIVE
          SMCREASON: 00005301* - PEER DID NOT ACCEPT SMC-R REQUEST
      ----
      1 OF 1 RECORDS DISPLAYED
      END OF THE REPORT
  80. 80. Migration considerations • SMCGLOBAL AUTOCACHE is the default value – Configure SMCGLOBAL NOAUTOCACHE to preserve the existing behavior • Netstat ALL/-A and CONFIG/-f reports include new SMC information
  81. 81. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-81 Business problem: SMC-R short-lived connections • Short-lived TCP connections that exchange small amounts of data might be better suited for TCP instead of SMC-R – Impacted by extra packet flows creating SMC-R connection – PORT/PORTRANGE configuration provides the NOSMC subparameter – Inbound TCP connections using this port will not use SMC-R – Useful if user knowledgeable about the workload to particular servers – Many users are not aware of the workload patterns or the patterns can change over time
  82. 82. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-82 Solution: SMC-R workload monitoring • Enables the stack to analyze incoming TCP connections to dynamically determine whether SMC-R is beneficial for a local TCP server application – Identifies short-lived connections exchanging little data • Results of this monitoring influences whether TCP connections to a particular server (port) use SMC-R • Ensures TCP connections use the most appropriate communications protocol (TCP or SMC-R) • Workload data analyzed every interval so results reflect most recent activity
  83. 83. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-83 Enablement actions • Enabled with new GLOBALCONFig SMCGlobal AUTOSMC configuration statement – Enabled by default • New PORT/PORTRANGE SMC configuration option added – PORT/PORTRANGE NOSMC added in z/OS V2R1 – PORT/PORTRANGE configuration will override AUTOSMC monitoring
  84. 84. Enablement actions: Configuring AUTOSMC • Configured with GLOBALCONFig SMCGlobal statement • Default value is AUTOSMC (function enabled)
      >>-GLOBALCONFig-------------------------------------------------><
         :
         |             .-------------------------.
         |             V  .-AUTOCACHE---.        |
         +-SMCGlobal------+-------------+--------+--
                       |  '-NOAUTOCACHE-'
                       |  .-AUTOSMC-----.
                       '--+-------------+
                          '-NOAUTOSMC---'
  85. 85. Externals: Netstat CONFIG/-f • Confirm AUTOSMC is configured correctly
      GLOBAL CONFIGURATION INFORMATION:
      TCPIPSTATS: NO  ECSALIMIT: 0000000K  POOLLIMIT: 0000000K
      MLSCHKTERM: NO  XCFGRPID:  IQDVLANID: 0
      SYSPLEXWLMPOLL: 060  MAXRECS: 100
      EXPLICITBINDPORTRANGE: 00000-00000  IQDMULTIWRITE: NO
      AUTOIQDX: ALLTRAFFIC
      ADJUSTDVIPAMSS: AUTO
      WLMPRIORITYQ: NO
      SYSPLEX MONITOR:
        TIMERSECS: 0060  RECOVERY: NO  DELAYJOIN: NO  AUTOREJOIN: NO
        MONINTF: NO  DYNROUTE: NO  JOIN: YES
      ZIIP:
        IPSECURITY: NO  IQDIOMULTIWRITE: NO
      SMCGLOBAL:
        AUTOCACHE: YES  AUTOSMC: YES
      SMCR: YES
        FIXEDMEMORY: 200M  TCPKEEPMININT: 00000300
        PFID: 001C  PORTNUM: 1  MTU: 1024
        PFID: 0015  PORTNUM: 2  MTU: 4096
  86. 86. Externals: Netstat ALL/-A • View current server details – 90% of monitored connections over last interval had ideal workload for SMC-R – AutoSMC% must be >= 50% for UseSMC to be YES
      CLIENT NAME: USER19  CLIENT ID: 00000052
        LOCAL SOCKET: 0.0.0.0..4206
        FOREIGN SOCKET: 0.0.0.0..0
        BYTESIN: 00000000000000000000
        BYTESOUT: 00000000000000000000
        SEGMENTSIN: 00000000000000000000
        SEGMENTSOUT: 00000000000000000000
        STARTDATE: 01/30/2015  STARTTIME: 19:02:04
        LAST TOUCHED: 19:02:05  STATE: LISTEN
        ........
        CONNECTIONSIN: 0000000200  CONNECTIONSDROPPED: 0000000000
        MAXIMUMBACKLOG: 0000000010  CONNECTIONFLOOD: NO
        CURRENTBACKLOG: 0000000000
        SERVERBACKLOG: 0000000000  FRCABACKLOG: 0000000000
        CURRENTCONNECTIONS: 0000000050  SEF: 100
        QUIESCED: NO
        SMC INFORMATION:
          SMCRCURRCONNS: 0000000025  SMCRTOTALCONNS: 0000000100
          UseSMC: Yes  Source: AutoSMC  AutoSMC%: 090
  87. 87. Externals: Netstat PORTlist/-o • Confirm port configuration – M indicates port explicitly enabled for SMC (new function) – N indicates port explicitly configured with NOSMC (existing function) – These settings will override AUTOSMC for these ports
      NETSTAT PORTLIST
      MVS TCP/IP NETSTAT CS V2R2 TCPIP Name: TCPCS 15:24:23
      Port# Prot User     Flags  Range        SAF Name
      ----- ---- ----     -----  -----        --------
      .....
      04002 TCP  OMVS     DABU
      04020 TCP  DCICSTS  DAN
      05000 TCP  *        DARN   05000-05001
      06020 TCP  *        DAM
      06000 TCP  *        DARM   06000-06001
      UNRSV UDP  *        FI                  GENERIC
      .....
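The precedence between the port flags and AUTOSMC can be summarized in a few lines of Python. This is a reading of the slides rather than stack code: the function name `use_smc` is hypothetical, the 50% threshold comes from the Netstat ALL slide, and the assumption that SMC-R is attempted when neither a port setting nor AUTOSMC applies reflects the pre-V2R2 behavior described earlier.

```python
AUTOSMC_THRESHOLD = 50  # AutoSMC% must be >= 50 for UseSMC to be Yes


def use_smc(port_flags: str, autosmc_enabled: bool,
            autosmc_percent: int) -> bool:
    """Decide whether inbound connections to a server port should use SMC-R."""
    if "M" in port_flags:   # PORT/PORTRANGE SMC overrides AUTOSMC
        return True
    if "N" in port_flags:   # PORT/PORTRANGE NOSMC overrides AUTOSMC
        return False
    if autosmc_enabled:
        return autosmc_percent >= AUTOSMC_THRESHOLD
    # Assumption: with no override and no monitoring, SMC-R is attempted.
    return True


forced_on = use_smc("DAM", autosmc_enabled=True, autosmc_percent=10)
forced_off = use_smc("DARN", autosmc_enabled=True, autosmc_percent=90)
monitored_yes = use_smc("DABU", autosmc_enabled=True, autosmc_percent=90)
monitored_no = use_smc("DABU", autosmc_enabled=True, autosmc_percent=49)
```

The flag strings used here are the ones shown in the PORTlist report above.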
  88. 88. Migration considerations • SMCGLOBAL AUTOSMC is the default value – Configure SMCGLOBAL NOAUTOSMC to preserve the existing behavior • Netstat reports that include new SMC information – ALL/-A – CONFIG/-f – PORTList/-o
  89. 89. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-89 SHARED MEMORY COMMUNICATIONS OVER RDMA ADAPTER (ROCE) VIRTUALIZATION
  90. 90. Background information: View of SMC-R • Shared Memory Communications over RDMA (SMC-R) defines a means to exploit Remote Direct Memory Access (RDMA) technology for communications transparently to the applications (Diagram: a client and a server, each in a virtual server instance in an OS image on an SMC-R enabled platform, use sockets over SMC; their shared memory is connected through RDMA enabled (RoCE) RNICs across clustered systems.)
  91. 91. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-91 Background information: RoCE Express feature • System/z provides a physically separate 10GbE RoCE Express feature to exploit RoCE (RDMA over Converged Ethernet) functionality – Used in conjunction with the existing Ethernet connectivity provided by OSA – Provides access to the same physical Ethernet fabric used for traditional IP connectivity – Provides two 10GbE ports – Sometimes referred to as “RNIC adapter” • For redundancy, at a minimum two 10GbE RoCE Express features should be configured for each physical network you configure • RoCE Express features are supported using a converged interface model
  92. 92. Background information: View of dedicated RoCE (Diagram: one CPC with PR/SM and six LPARs, LP 1 through LP 6, running z/OS 1 through z/OS 6. Two RoCE Express features, PCHID 100 (FID 01) in I/O drawer 1 and PCHID 200 (FID 16) in I/O drawer 2, each connect ports 1 and 2 to physical network ID ‘NETA’. The z/OS images reach the features through PFIDs 1, 2, 16 and 17 and interfaces If 1 and If 2.) • VMAC for each PFID (per TCP stack) • Physical Network IDs are configured in HCD (IOCDS) for each physical port • Up to 16 PCHIDs per CPC
  93. 93. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-93 Business problem: Dedicated RoCE • Inability to share RoCE Express features between multiple LPARs – Up to eight TCP/IP stacks on one LPAR can share a feature – VTAM provides the virtualization – Redundancy requirements can quickly increase the number of RoCE Express features required for SMC-R – Limit of 16 features per CPC • Only one port on a given RoCE Express feature could be used – Could switch between ports, but still only use one at a time
  94. 94. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-94 Solution: Shared RoCE • RoCE Express features can be shared across LPARs – Up to 31 operating system instances can share one feature • Both RoCE Express ports can be used simultaneously • No additional RNIC definitions in z/OS Comm Server – PFID values are still defined on TCP/IP profile GLOBALCONFIG statement – PFID value must be unique if the RoCE Express feature is being shared by multiple TCP/IP stacks • No change in RNIC activation – RoCE Express features are still activated when the first SMC-R capable OSA interface is activated
  95. 95. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-95 Solution: Shared RoCE (continued) • z/OS V2R2 supports both dedicated and shared RoCE environments, depending on the hardware: – IBM zEnterprise EC12 (zEC12) with driver 15 or an IBM zEnterprise BC12 (zBC12) support dedicated RoCE environment only – IBM z13 or later supports shared RoCE environment only • z/OS V2R1 also supports both environments – APARs OA44576 and PI12223 • z/OS Communications Server detects the working environment during activation of the first RoCE Express feature
  96. 96. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-96 Solution: VLAN considerations • Each RoCE Express feature still supports 126 VLANs • The 126 VLANs must be shared across all virtual functions using the feature – Each VF is guaranteed at least two VLANs on a given RoCE Express feature – Each VF can use at most 16 VLANs on a given RoCE Express feature – Note: If two, or more, VFs share a RoCE Express feature, and use the same VLANID, that counts as only one of the 126 available VLANs • OSA (and RNIC) interfaces that use VLANs can now co-exist with OSA (and RNIC) interfaces that do not use VLANs on the same RoCE Express feature – Requires APAR OA44679 in z/OS V2R1
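The VLAN accounting rules on this slide lend themselves to a quick planning check. The Python sketch below is a hypothetical helper, not an IBM tool; it applies only the limits stated above (126 distinct VLANs per feature, at most 16 per VF, and a VLAN ID shared by several VFs counted once).

```python
MAX_VLANS_PER_FEATURE = 126
MAX_VLANS_PER_VF = 16


def distinct_vlans(vf_vlans: dict) -> set:
    """VLAN IDs in use on the feature; a shared VLAN ID counts once."""
    used = set()
    for vlans in vf_vlans.values():
        used |= set(vlans)
    return used


def check_vlan_usage(vf_vlans: dict) -> list:
    """Return a list of problems for one RoCE Express feature.

    `vf_vlans` maps a VF number to the set of VLAN IDs it uses.
    """
    problems = []
    for vf, vlans in sorted(vf_vlans.items()):
        if len(set(vlans)) > MAX_VLANS_PER_VF:
            problems.append("VF %d uses more than %d VLANs"
                            % (vf, MAX_VLANS_PER_VF))
    if len(distinct_vlans(vf_vlans)) > MAX_VLANS_PER_FEATURE:
        problems.append("feature uses more than %d VLANs"
                        % MAX_VLANS_PER_FEATURE)
    return problems


# Two VFs sharing VLAN 100: only two distinct VLANs are consumed.
plan = {10: {100, 200}, 11: {100}}
used = distinct_vlans(plan)
ok = check_vlan_usage(plan)
# A VF that asks for 17 VLANs exceeds the per-VF limit.
too_many = check_vlan_usage({10: set(range(1, 18))})
```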
  97. 97. Solution: Redundancy considerations • Full SMC-R redundancy requires two unique physical paths – Different RoCE Express features – Different I/O drawers – Different internal support structures • You must be careful to configure your system to ensure that the TCP/IP stack uses RoCE Express features that provide full redundancy – Less than full redundancy can result in TCP connection failures if a RoCE Express failure is encountered
  98. 98. Solution: View of shared RoCE using 2 ports (Diagram: one CPC with PR/SM and six LPARs, LP 1 through LP 6, running z/OS 1 through z/OS 6. The RoCE Express feature at PCHID 100 in I/O drawer 1 is defined as FID 01 (VF 10) and FID 02 (VF 11); the feature at PCHID 200 in I/O drawer 2 is defined as FID 16 (VF 22) and FID 17 (VF 23). Both features connect ports 1 and 2 to physical network ID ‘NETA’. The z/OS images use PFIDs 1, 2, 16 and 17, with VFs 10 and 22 paired and VFs 11 and 23 paired.) • VMAC for each VF per PFID • Physical Network IDs are configured in HCD (IOCDS) for each physical port • Up to 16 PCHIDs per CPC
  99. 99. Enablement actions: HCD changes • Must provide a virtual function (VF) number
        Goto  Filter  Backup  Query  Help
      ------------------------------------------------------------------------------
                          PCIe Function List          Row 28 of 500 More: >
      Command ===> _______________________________________________ Scroll ===> CSR
      Select one or more PCIe functions, then press Enter. To add, use F11.
      Processor ID . . . . : S88       z13 S88
      / FID  PCHID  VF+  Type+  Description
      _ 028  108    28   ROCE   S3E
      _ 029  108    29   ROCE   S3E
      _ 030  108    30   ROCE   S3E
      _ 031  108    31   ROCE   S3E
      _ 032  13C    1    ROCE   S36
      _ 033  13C    2    ROCE   S36
      _ 034  13C    3    ROCE   S36
      _ 035  13C    4    ROCE   S36
      _ 036  13C    5    ROCE   S36
  100. 100. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-100 Enablement actions: Selecting the PFID • The PFID value on the TCPIP profile GLOBALCONFIG SMCR statement has a slightly different meaning in a shared environment: – In a dedicated environment, the PFID directly identifies the RoCE Express feature, and all TCP/IP stacks sharing the feature use the same PFID – In a shared environment, each TCP/IP stack has its own unique PFID value to represent the RoCE Express feature • RoCE Express ports can be shared – Same or different TCP/IP stacks can use the two ports – Different PFID values must be defined for each usage of the port
  101. 101. Externals: Netstat DEvlinks/-d • SMC-R link group information included after all SMC-R links • New redundancy values defined: – Partial (Single local PCHID, unique ports) – Partial (Single local PCHID and port)
      SMC LINK GROUP INFORMATION:
        SMCLINKGROUPID: 2D8F0100  PNETID: NETID1
          REDUNDANCY: PARTIAL (SINGLE LOCAL PCHID AND PORT)
          LINK GROUP RECEIVE BUFFER TOTAL: 3M
            64K BUFFER TOTAL: 1M
          LOCALSMCLINKID  REMOTESMCLINKID
          --------------  ---------------
          2D8F0101        729D0101
          2D8F0102        729D0102
      2 OF 2 RECORDS DISPLAYED
      END OF THE REPORT
  102. 102. Externals: Display TRL,TRLE=rnic_trlename • When using shared RoCE environment: – VF number is displayed but the code level is not available
      D NET,TRL,TRLE=IUT1001C
      IST097I DISPLAY ACCEPTED
      IST075I NAME = IUT1001C, TYPE = TRLE
      IST1954I TRL MAJOR NODE = ISTTRL
      IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
      IST087I TYPE = *NA* , CONTROL = ROCE, HPDT = *NA*
      IST2361I SMCR PFID = 001C PCHID = 0130 PNETID = NETID1
      IST2362I PORTNUM = 1 RNIC CODE LEVEL = ***NA***
      IST2389I PFIP = 01000300
      IST2417I VFN = 0001
      IST924I ------------------------------------------------------------
      IST1717I ULPID = TCPCS1 ULP INTERFACE = EZARIUT1001C
      IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
      IST314I END
  103. 103. Externals: CSDUMP • New MSGVALUE option for RNICTRLE operand – Allows capture of diagnostic information when error message is generated for any RoCE Express feature – Only valid for MESSAGE=IST2406I or MESSAGE=IST2391I • A dump of the RoCE Express feature by one virtual function is NOT disruptive to other virtual functions that are using the feature
      |_,MESSAGE=_message_id_numbers_________________________________________|
                  |_,TCPNM=TCPIP_Jobname_|  |_,RNICTRLE=_ _MSGVALUE_____ _|
                                                          |_RNICTRLEName_|
      (MSGVALUE applies when the message ID is IST2406I or IST2391I)
  104. 104. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-104 Externals: VTAM Internal Trace (VIT) records • New VIT records were defined – VHCR, VHC2, VHC3, VHC4, and VHC5 – Similar to existing HCR records, but for shared RoCE environment command processing – CCR and CCR2 – Communication channel operation in shared RoCE environment
  105. 105. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-105 Externals: VTAM dump formatting support • SNASMCR – Formats VTAM control blocks used to manage TCP/IP ownership of the RoCE Express feature, including associated RMB, VLAN, and QP information • SNAROCE – Formats VTAM control blocks used to manage the RoCE Express feature • Function rolled back to z/OS V2R1 using APAR OA44576
  106. 106. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-106 Externals: VTAM dump formatting support (continued)
  107. 107. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-107 Migration considerations • Assign VF values in HCD for each FID • If you currently have multiple TCP/IP stacks sharing a RoCE Express feature in a dedicated RoCE environment, you must: – Define unique FID values in HCD for the stacks to use as PFIDs on the TCPIP profile GLOBALCONFIG SMCR statement • Ensure you have full redundancy with your shared RoCE Express features or SMC-R fail-over processing can be compromised
  108. 108. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-108 SMC APPLICABILITY TOOL (SMCAT)
  109. 109. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-109 Business problem: Is SMC-R applicable to my environment • SMC-R requires a new RDMA capable NIC – 10GbE RoCE Express feature introduced in zEC12 GA2 and zBC12 – Each LPAR requires two RoCE Express features for High Availability • Useful to know if workload will exploit SMC-R beforehand – Some users are aware of the significant traffic patterns that can benefit from SMC-R – Others are unsure of how much of their traffic is able to use SMC-R – z/OS-z/OS – Workload patterns ideal for SMC-R – Not IPSec encrypted
  110. 110. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-110 Business problem: Is SMC-R applicable to my environment (continued) • Can use SMF records, Netstat displays, and reports from network management products – Helps users determine if their environments will benefit from the SMC-R function – This type of analysis is time consuming and requires significant expertise
  111. 111. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-111 Solution: SMC Applicability Tool (SMCAT) • A new tool that helps show the potential benefits of implementing SMC-R – Controlled by the Vary TCPIP,,SMCAT command – Monitors a stack's TCP traffic – For a set of configured destination IP addresses and subnets/prefixes – For a configured interval of time
  112. 112. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-112 Solution: SMCAT • SMCAT does not require SMC-R to be enabled • SMCAT is integrated within the TCP/IP stack and gathers new statistics that are used to project SMC-R applicability – Minimal system overhead, no changes in TCP/IP network flows – Produces report on potential benefits of enabling SMC-R • Available via the service stream on existing z/OS releases as well – V1R13 - Apar PI27252/PTF UI24872 – V2R1 - Apar PI29165/PTFs UI24762 and UI24763
  113. 113. Solution: SMCAT (continued) • At the end of the interval a summary report is generated that includes: – Percent of traffic “eligible” for SMC-R – All traffic that matches the configured IP addresses and does not use IPSec or FRCA – Percent of traffic “well suited” for SMC-R – Eligible traffic that excludes workloads with very short-lived TCP connections and trivial payloads – Includes a breakout of application send sizes – How large the payload of each send request is
  114. 114. IBM Inside Sales International Technical Support Organization Global Content Services © 2015 IBM CorporationITSO-114 Solution: SMCAT (continued) • The summary report contains two sections: – First section contains data for all eligible TCP connections – Includes connections that are not directly connected – Traffic between the hosts requires traversal of a router which is not supported by the SMC-R protocol – Indicates total amount of workload that can exploit SMC communications – Some connections might require network topology changes – The second section contains data for just the directly connected eligible (match configuration) TCP connections – Network traffic between the hosts does not require traversal of any IP routers – Indicates amount of workload that can immediately exploit SMC communications after SMC-R enablement – This section is a subset of the first section
  115. Enablement actions: Configure the data set • SMCAT data set configuration – INTERVAL defaults to 60 minutes – Max is 1440 minutes (24 hours) – IPADDR is a list of IPv4 and IPv6 addresses and subnets – 256 max combination of addresses and subnets • Syntax (INTERVAL defaults to 60): SMCATCFG [INTERVAL minutes] [IPADDR {ipv4_address | ipv4_address/num_mask_bits | ipv6_address | ipv6_address/prefix_length} ...]
  116. Enablement actions: Configure data set example • SMCAT data set configuration example – Monitor workload for two hours – Monitor workload for configured IPv4 address and IPv6 prefix SMCATCFG INTERVAL 120 IPADDR 192.168.1.1 192.168.3.0/24 C5::1:2:3:4/126
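The IPADDR notation in the example above mixes single addresses with IPv4 and IPv6 subnets. The following is a minimal Python sketch of that notation only, using the standard `ipaddress` module; the helper names `parse_entry` and `is_monitored` are hypothetical, and how SMCAT itself applies the list to a connection is not described on the slide.

```python
import ipaddress

# IPADDR entries copied from the SMCATCFG example on the slide.
CONFIGURED = ["192.168.1.1", "192.168.3.0/24", "C5::1:2:3:4/126"]

def parse_entry(entry):
    # A bare address becomes a single-host network (/32 or /128).
    # strict=False tolerates entries whose host bits are not all zero.
    return ipaddress.ip_network(entry, strict=False)

NETWORKS = [parse_entry(e) for e in CONFIGURED]

def is_monitored(address):
    # Hypothetical helper: True when the address falls inside any entry.
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net for net in NETWORKS)

if __name__ == "__main__":
    for candidate in ("192.168.1.1", "192.168.3.77", "c5::1:2:3:6", "10.0.0.1"):
        print(candidate, is_monitored(candidate))
```

A /126 prefix covers four consecutive IPv6 addresses, so `c5::1:2:3:4` through `c5::1:2:3:7` all match the third entry in this sketch.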
  117. Enablement actions: Start/Stop SMCAT • Vary TCPIP,,SMCAT command starts and stops the monitoring tool – datasetname value indicates that SMCAT is being turned on – datasetname contains the SMCATCFG statement that specifies monitoring interval and IP addresses or subnets to be monitored – OFF will stop SMCAT monitoring and generate report • Syntax: Vary TCPIP,[procname],SMCAT,{datasetname | ,OFF} • Example: VARY TCPIP,TCPPROC,SMCAT,USER99.TCPIP.SMCAT1
  118. Externals: Report • Key messages – Operator console – EZD2031I SMC APPLICABILITY TOOL HAS STARTED COLLECTING DATA – EZD2032I SMC APPLICABILITY TOOL HAS STOPPED COLLECTING DATA • Configuration information and the SMCAT report are sent to the system log STC06578 EZD2040I TCP/IP CS V2R2 TCPIP Name: TCPIP 080 SMC Applicability Configuration Parameters- 02/04/2015, 10:09:49.08 080 Interval: 3 minutes 080 IP addresses/subnets being monitored 080 080 9.67.113.61 080 C5::1:2:3:4/126 080 End of configuration parameters
  119. Externals: Report example SMC Applicability Interval Report- 10/08/2014, 14:07:32.06 Configured Interval Duration: 3 minutes Actual Interval Duration: 3 minutes TCP SMC-R traffic analysis for matching direct connections ---------------------------------------------------------- Connections meeting direct connectivity requirements 50% of connections can use SMC-R (eligible) 67% of eligible connections are well-suited for SMC-R 79% of total traffic (segments) is well-suited for SMC-R 81% of outbound traffic (segments) is well-suited for SMC-R 75% of inbound traffic (segments) is well-suited for SMC-R Interval Details: Total TCP Connections: 6 Total SMC-R eligible connections: 3 Total SMC-R well-suited connections: 2 Total outbound traffic (in segments) 274 SMC-R well-suited outbound traffic (in segments) 222 Total inbound traffic (in segments) 211 SMC-R well-suited inbound traffic (in segments) 159 Application send sizes used for well-suited connections: Size # sends Percentage ---- ------- ---------- 1500 (<=1500): 1 20% 4K (>1500 and <=4k): 1 20% 8K (>4k and <= 8k): 0 0% 16K (>8k and <= 16k): 0 0% 32K (>16k and <= 32k): 0 0% 64K (>32k and <= 64k): 1 20% 256K (>64K and <= 256K): 2 40% >256K: 0 0% End of report • Slide callouts: How much of my TCP workload can benefit from SMC-R? What kind of CPU savings can I expect from SMC-R?
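The percentages at the top of the sample report can be reproduced from the counts under "Interval Details". The following Python sketch shows that arithmetic; rounding to the nearest whole percent is an assumption that happens to match the figures shown, and the function and key names are hypothetical rather than anything SMCAT exposes.

```python
def pct(part, whole):
    # Nearest whole percent; 0 when there is nothing to measure (assumed rule).
    return round(100 * part / whole) if whole else 0

def summarize(d):
    # d holds the counts from the "Interval Details" part of the report.
    return {
        "eligible_connections": pct(d["eligible"], d["connections"]),
        "well_suited_of_eligible": pct(d["well_suited"], d["eligible"]),
        "well_suited_total_traffic": pct(d["out_ws"] + d["in_ws"],
                                         d["out_total"] + d["in_total"]),
        "well_suited_outbound": pct(d["out_ws"], d["out_total"]),
        "well_suited_inbound": pct(d["in_ws"], d["in_total"]),
    }

# Counts copied from the sample report on the slide.
DETAILS = {
    "connections": 6, "eligible": 3, "well_suited": 2,
    "out_total": 274, "out_ws": 222,
    "in_total": 211, "in_ws": 159,
}

if __name__ == "__main__":
    for name, value in summarize(DETAILS).items():
        print(f"{name}: {value}%")
```

With the sample counts this yields 50, 67, 79, 81, and 75 percent, the same values the report prints.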
  120. Migration considerations • None
  121. INCREASE SINGLE STACK DVIPA LIMIT TO 4096
  122. Background information • Configuration defined dynamic virtual IP addresses – VIPADEFINE – VIPABACKUP – VIPADISTRIBUTE target stacks • Application instance dynamic virtual IP addresses – VIPARANGE to define a range of IP addresses – Application binds to an IP address – Application issues an SIOCVIPA ioctl()
  123. Business problem • Application instance dynamic virtual IP addresses – Continue to increase – Need to follow the application – Higher utilization – CICS – dynamic virtual IP addresses for every region • Systems and sysplexes – Growing wider – Horizontal workload growth
  124. Solution • Application instance dynamic virtual IP addresses – Increase limit to 4096 • Dynamic virtual IP addresses defined with VIPADEFINE and VIPABACKUP – Limit remains unchanged at 1024
  125. Enablement actions • None
  126. Externals • Existing message: – EZZ8309I TOO MANY VIPAS - [ip address] REJECTED • New message: – EZD2030I TOO MANY VIPADEFINE AND VIPABACKUP VIPAS - [ip address] REJECTED – Count includes both IPv4 and IPv6 dynamic virtual IP addresses
  127. Migration considerations • None
  128. REMOVED SUPPORT FOR LEGACY DEVICES
  129. Background information: Legacy devices • Configured to the TCP/IP stack by using DEVICE and LINK profile statements • VTAM device drivers have these attributes: – Support an attachment to “legacy” hardware that is based on: – SSCH (CCWs) architecture – ESCON channel hardware (z196 is the last to support ESCON)
  130. Business problem • Inability to test older or unsupported hardware – Most hardware no longer exists – Restricts product's exploitation of 64-bit storage – Risk to support software for non-existent hardware • Little or no customer usage of legacy devices – zBLC and SHARE surveys and PMR analysis
  131. Solution • Remove support for legacy DEVICE and LINK statements – ATM – CDLC – CLAW – HCH – SNAIUCV SNALINK – SNALU62 – X25NPSI • Remove ZOSMIGV2R1_CS_LEGACYDEVICE Health Check
  132. Solution (continued) • Remove support for other profile statements – Unsupported ATM related statements – ATMARPSV, ATMLIS, and ATMPVC – Unsupported TRANSLATE statement parameters – NSAP (for ATM) and HCH – Unsupported IPCONFIG statement parameters – CLAWUSEDOUBLENOP and STOPONCLAWERROR • Unsupported server applications – SNALINK LU0 and LU6.2 – X.25 NPSI – NCPROUTE
  133. Enablement actions • None
  134. Externals • Legacy device type DEVICE and LINK statements: – EZZ0318I ATM WAS FOUND ON LINE 3 AND DEVICE TYPE WAS EXPECTED – EZZ0318I ATM WAS FOUND ON LINE 4 AND LINK TYPE WAS EXPECTED • ATM related statements: – EZZ0324I UNRECOGNIZED STATEMENT ATMARPSV FOUND ON LINE 1 • TRANSLATE statement parameters: – EZZ0318I NSAP WAS FOUND ON LINE 1 AND ETHERNET, IBMTR, OR FDDI WAS EXPECTED
  135. Migration considerations • Migrate to strategic devices, such as OSA-Express QDIO and HiperSockets • Update automation for unsupported server applications
  136. VIPAROUTE FRAGMENTATION AVOIDANCE
  137. Background information: Generic Routing Encapsulation • Generic Routing Encapsulation header added for VIPAROUTE – Additional header can cause fragmentation • Ways to avoid fragmentation: – Use path MTU discovery – Use jumbo frames between distributor and targets
  138. Background information: Sysplex distributor with VIPAROUTE • GRE encapsulation: IPv4 delivery header (20 bytes) + GRE header (4 bytes) + original IP packet • [Figure: sysplex distributor (SD) on LPAR1 and targets on LPAR2 and LPAR3, spread across CPC1 and CPC2 and connected through OSA, HiperSockets, and XCF connectivity; the original IP packet and the GRE-encapsulated packet are shown against MTU values of 1492 and 8092]
  139. Business problem • Many benefits to enabling VIPAROUTE • Fragmentation is a common problem • Alternative options not always viable – Firewalls can prevent Path MTU discovery from working – Enabling Path MTU discovery on large number of clients can be problematic – Enabling Jumbo frames requires reconfiguration
  140. Solution • Adjust TCP maximum segment size – Connections being forwarded using VIPAROUTE – Exchanged on TCP handshake – TCP hosts cannot exceed the maximum segment size advertised by the peer – Works across firewalls – Sometimes referred to as maximum segment size clamping
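The effect of maximum segment size clamping can be illustrated with simple arithmetic. The Python sketch below assumes minimal 20-byte IPv4 and TCP headers (no options) and uses the 20-byte delivery header plus 4-byte GRE header shown on the sysplex distributor slide; the exact adjustment that the z/OS stack applies is not stated in the slides, so the function names and the formula are illustrative only.

```python
IPV4_HEADER = 20          # bytes, assuming no IP options
TCP_HEADER = 20           # bytes, assuming no TCP options
GRE_OVERHEAD = 20 + 4     # IPv4 delivery header + GRE header, from the slides

def plain_mss(mtu):
    # Largest TCP payload that fits in one unencapsulated packet.
    return mtu - IPV4_HEADER - TCP_HEADER

def clamped_mss(mtu):
    # Reduce the advertised MSS so the GRE-encapsulated packet still fits.
    return plain_mss(mtu) - GRE_OVERHEAD

def fragments_after_gre(mss, mtu):
    # True when a full-sized segment no longer fits once GRE is added.
    packet = mss + IPV4_HEADER + TCP_HEADER
    return packet + GRE_OVERHEAD > mtu

if __name__ == "__main__":
    mtu = 1492
    print("plain MSS:", plain_mss(mtu), "fragments:", fragments_after_gre(plain_mss(mtu), mtu))
    print("clamped MSS:", clamped_mss(mtu), "fragments:", fragments_after_gre(clamped_mss(mtu), mtu))
```

Because the smaller value is advertised during the TCP handshake, the peer never builds a segment that would need fragmenting on the forwarded path, which is why the approach does not depend on ICMP messages passing through firewalls.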
  141. Enablement actions: GLOBALCONFIG • New GLOBALCONFIG parameter - ADJUSTDVIPAMSS – Specified on all target stacks – Specified on all stacks initiating outbound connections – Implemented on the initial connection packet – Done even if no fragmentation – Outgoing connections: generic routing encapsulation might be used on the return path – Incoming connections: Inbound routing paths can change over the life of a connection
  142. Enablement actions: ADJUSTDVIPAMSS • AUTO (default) – Maximum segment size is adjusted for inbound connections if: – Local stack is a target and VIPAROUTE is being used – Local stack is both a distributor and a target and VIPAROUTE is defined – Maximum segment size is adjusted for outbound connections if: – Source IP address is a distributed dynamic virtual IP address • ALL – Maximum segment size is adjusted for all connections where – Source IP address is a dynamic virtual IP address – Both distributed and non-distributed • NONE – Maximum segment size is not adjusted for any connections
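The three ADJUSTDVIPAMSS settings amount to a small decision table. The Python sketch below is one reading of the bullets above, not the stack's implementation: the `Connection` fields are hypothetical, and treating the connection's local address as the "source IP address" for the ALL case is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    inbound: bool
    local_addr_kind: str              # "distributed_dvipa", "dvipa", or "other"
    stack_is_target: bool = False
    stack_is_distributor: bool = False
    viparoute_in_use: bool = False
    viparoute_defined: bool = False

def adjust_mss(setting, conn):
    # Returns True when the MSS would be adjusted under this reading of the slide.
    if setting == "NONE":
        return False
    if setting == "ALL":
        # Assumption: applies to any connection whose local address is a DVIPA.
        return conn.local_addr_kind in ("distributed_dvipa", "dvipa")
    if setting == "AUTO":
        if conn.inbound:
            target_with_route = conn.stack_is_target and conn.viparoute_in_use
            both_roles = (conn.stack_is_distributor and conn.stack_is_target
                          and conn.viparoute_defined)
            return target_with_route or both_roles
        return conn.local_addr_kind == "distributed_dvipa"
    raise ValueError(f"unknown ADJUSTDVIPAMSS setting: {setting}")

if __name__ == "__main__":
    inbound = Connection(inbound=True, local_addr_kind="distributed_dvipa",
                         stack_is_target=True, viparoute_in_use=True)
    print(adjust_mss("AUTO", inbound), adjust_mss("NONE", inbound))
```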
  143. Externals: netstat CONFIG/-f • A sample netstat config/-f display command is shown below
  144. Externals: netstat ALL/-A • A sample netstat ALL/-A display command is shown below – MaximumSegmentSize displays the maximum segment size value
  145. Externals: NMI, SMF, IPCS • TCP/IP callable NMI – GetProfile request output provides values for new parameters • SMF 119 records – Subtype 4 TCP/IP profile record provides values for new parameters • TCPIPCS command – The TCPIPCS PROFILE command displays the values for the new parameters
  146. Migration considerations • Preserving existing behavior – Code GLOBALCONFIG ADJUSTDVIPAMSS NONE
  147. TCP AUTONOMIC TUNING
  148. Background information: Dynamic Right-Sizing (DRS) • Dynamic right-sizing (DRS) introduced in V1R11 – Automatically increases the receive window size beyond the “maximum” window size for qualifying connections – Goal is to keep more data moving in the network – Receiving application must be able to keep pace with incoming data • [Figure: sender/receiver timeline showing data and ACK flows, the window size, and the round trip time (RTT)]
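The reason a larger receive window keeps more data moving follows from the standard relationship between window size and round trip time: a sender can have at most one window of unacknowledged data in flight per RTT. The Python sketch below works through that arithmetic; the sample window, RTT, and link speed are illustrative values, not figures from the slides.

```python
def max_throughput_bytes_per_sec(window_bytes, rtt_ms):
    # At most one full window can be delivered per round trip.
    return window_bytes * 1000 / rtt_ms

def window_needed_bytes(bandwidth_bits_per_sec, rtt_ms):
    # Bandwidth-delay product: data that must be in flight to fill the link.
    return bandwidth_bits_per_sec * rtt_ms / 8000

if __name__ == "__main__":
    # Illustrative values: 65535-byte window, 20 ms RTT, 1 Gbit/s link.
    print(max_throughput_bytes_per_sec(65535, 20))       # bytes per second
    print(window_needed_bytes(1_000_000_000, 20))         # bytes
```

Under these sample values a 65535-byte window caps the connection at roughly 3.3 MB per second, while filling the link would take about 2.5 MB in flight, which is the gap that growing the window is meant to close.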
  149. Background information: Dynamic Outbound Right-Sizing (ORS) • Automatic attempt to deal with FRR (Fast Retransmit and Fast Recovery) impacts to streaming workloads – Outbound data becomes serialized to reduce risk of “out of order” packets – Send buffer size is allowed to grow to 1MB to keep value greater than the congestion window – FRR is suppressed when possible – Write-blocked applications are resumed sooner
  150. Business problem: DRS • Disabled if receiving application is unable to keep up with data arrival – Never turned back on for the life of the connection • Storage status not taken into consideration • DRS eligibility is only determined once during the initial phase of the connection
  151. Business problem: ORS • Solution targeted for a very narrow set of connections – RTT must be 20 ms or more – TCPCONFIG QUEUEDRTT operand created in V2R1 • Send buffer size grows with no consideration of receiver status – Once increased, send buffer size never shrinks
  152. Solution: DRS • Allow DRS usage to be restarted on a connection – DRS detection can be re-initiated after a certain number of packets are processed • When CSM storage is not constrained: – Continue using DRS on a connection even if the application falls behind • When CSM storage is constrained: – If application falls behind, stop DRS on the connection temporarily – Do not activate DRS for connection, either initially or during “restart conditions”
  153. Solution: ORS • Altered logic for growing the send buffer size – Only increase when current send buffer size is almost constrained – Do not increase send buffer size if retransmitting – Do not increase send buffer size when CSM storage constrained • Allow send buffer size to shrink dynamically – Determining factor is whether the sender is actually filling, or almost filling, the buffer • RTT requirement matches DRS value – 2 milliseconds
  154. Solution: Autonomic outbound serialization • Change TCPCONFIG QUEUEDRTT default to 0 – Allow outbound serialization for all TCP connections • Connection must have a send buffer size of 64K or larger • Connection must be experiencing out of order packets
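The conditions on this slide can be read as a simple eligibility check. The Python sketch below is an interpretation, not the stack's logic: it assumes QUEUEDRTT acts as a minimum round trip time in milliseconds (consistent with the earlier "RTT must be 20 ms or more" bullet), and the function and parameter names are hypothetical.

```python
SEND_BUFFER_MINIMUM = 64 * 1024   # "64K or larger", taken as bytes

def serialization_eligible(rtt_ms, send_buffer_bytes, sees_out_of_order,
                           queuedrtt_ms=0):
    # queuedrtt_ms=0 models the new default; 20 models the earlier threshold.
    return (rtt_ms >= queuedrtt_ms
            and send_buffer_bytes >= SEND_BUFFER_MINIMUM
            and sees_out_of_order)

if __name__ == "__main__":
    # A low-latency connection qualifies only under the new default.
    print(serialization_eligible(5, 65536, True))                    # new default
    print(serialization_eligible(5, 65536, True, queuedrtt_ms=20))   # old threshold
```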
  155. Enablement actions • None
  156. Externals: Netstat ALL/-A • TcpPrf and TcpPrf2 indicate status of DRS and ORS MVS TCP/IP NETSTAT CS V2R1 TCPIP Name: TCPCS 22:24:30 Client Name: FTPD1 Client Id: 000000F9 Local Socket: 9.42.104.43..21 Foreign Socket: 9.42.103.165..1035 BytesIn: 0000000035 BytesOut: 0000000265 SegmentsIn: 0000000017 SegmentsOut: 0000000014 StartDate: 01/09/2012 StartTime: 22:04:11 Last Touched: 22:04:18 State: Establsh RcvNxt: 0214444666 SndNxt: 0216505563 ... MaximumSegmentSize: 0000000524 DSField: 00 Round-trip information: Smooth trip time: 102.000 SmoothTripVariance: 286.000 ReXmt: 0000000000 ReXmtCount: 0000000000 DupACKs: 0000000000 RcvWnd: 0000032730 SockOpt: 85 TcpTimer: 00 TcpSig: 84 TcpSel: 60 TcpDet: E0 TcpPol: 00 TcpPrf: C0 TcpPrf2: 70 QOSPolicy: No ...
  157. Migration considerations • QUEUEDRTT default changed to 0 – Specify TCPCONFIG QUEUEDRTT 20 to retain the previous default behavior – Best practice is to use the new default value of 0 • Netstat All / -a • SMF type 119 records
  158. Background information: FRR • Fast Recovery and Retransmit (FRR) allows a given TCP connection to continue sending new packets even as it is attempting to retransmit unacknowledged packets – Triggered upon receipt of certain number of duplicate ACKs – Causes application's “slow start threshold” and “congestion window” values to be reduced • Purpose is to recover from lost packets without waiting for retransmit timeout to occur • “FRR ambiguity” modifies the duplicate ACKs threshold – Requires that timestamps be included in the TCP packets – TCP uses timestamps in retransmitted packet and received ACK
  159. Business problem: FRR for out of order packets • When “basic” FRR recovery is performed, the application cannot ramp back up to previous transmission rate – Permanent decrease in the growth rate of the congestion window • “FRR ambiguity” helps, but has its own problems – Requires timestamps to be present, so not universally available – Manipulation of FRR suppression threshold can mask real problems
  160. Solution: FRR tuning • Utilize internal timestamps when timestamps in packets are not available • Modified FRR algorithm to be less punitive for out of order packets – Restore congestion window and slow start threshold – Eliminate FRR suppression logic so that FRR is performed after three duplicate ACKs – Lost packet behavior is unchanged
  161. Enablement actions • None
  162. Externals • None
  163. Migration considerations • None
  164. Background information: Delay ACK • TCP/IP will delay before sending an ACK until: – Have a response to send – Receive two packets from sender – 200 ms has expired • Default is to delay ACKs but numerous controls exist today to set or prevent delay ACK processing – TCPCONFIG DELAYAcks|NODELAYAcks – PORT(RANGE) DELAYAcks|NODELAYAcks – Various statements used to configure a route used by a TCP connection
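The three release conditions for a delayed ACK listed above can be expressed as a single predicate. The Python sketch below mirrors those bullets under simplifying assumptions: timing is passed in as elapsed milliseconds rather than driven by a real timer, and the function and parameter names are hypothetical.

```python
DELAY_LIMIT_MS = 200   # delay ceiling stated on the slide

def should_send_ack(has_response, unacked_packets, ms_since_first_unacked):
    # Nothing received since the last ACK means nothing to acknowledge.
    if unacked_packets == 0:
        return False
    return (has_response                                  # piggyback on outgoing data
            or unacked_packets >= 2                       # two packets from the sender
            or ms_since_first_unacked >= DELAY_LIMIT_MS)  # 200 ms has expired

if __name__ == "__main__":
    # One small packet, no reply ready: the ACK waits until the timer expires.
    print(should_send_ack(False, 1, 50))
    print(should_send_ack(False, 1, 200))
```

The first case in the usage example is the one that hurts when the sender is itself waiting for that ACK before transmitting more data.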
  165. Business problem • Delayed ACK processing generally works very well – Significant saving in request/response workloads – Likewise in streaming workloads • Occasionally, a workload incurs significant performance penalties because of delayed ACKs – Sender waiting for an ACK before sending the next packet (200 ms delay is incurred) – Can often occur due to interactions with Nagle's algorithm – Often hard to diagnose this delay
  166. Solution: Autonomic delay ACK • Provide autonomic controls to monitor effectiveness of delay ACK processing • Do not delay sending the ACK if it repeatedly prevents the partner from sending more data • Do not keep sending ACKs to every packet if the sender is sending its next packet anyway • [Figure: data from sender, followed by an ACK that may or may not be delayed, followed by more data from sender]
  167. Enablement actions • Enhance TCPCONFIG statement to request autonomic delayed ACK processing – New parameter AUTODELAYAcks to request autonomic delayed ACKs – Default remains DELAYAcks – AUTODELAYAcks is voided if DELAYAcks | NODELAYAcks is specified on any configuration statement related to this connection • Syntax (DELAYAcks is the default): TCPCONFIG [DELAYAcks | NODELAYAcks | AUTODELAYAcks]
  168. Externals: Netstat CONFIG/-f • Also include new AUTODELAYAcks setting in NMI and SMF configuration/profile reports NETSTAT CONFIG MVS TCP/IP NETSTAT CS V2R1 TCPIP Name: TCPCS 11:37:31 TCP Configuration Table: DefaultRcvBufSize: 00016384 DefaultSndBufSize: 00016384 DefltMaxRcvBufSize: 00262144 SoMaxConn: 0000001024 MaxReTransmitTime: 120.000 MinReTransmitTime: 0.500 RoundTripGain: 0.125 VarianceGain: 0.250 VarianceMultiplier: 2.000 MaxSegLifeTime: 30.000 DefaultKeepALive: 00000120 DelayAck: Auto RestrictLowPort: Yes SendGarbage: No TcpTimeStamp: Yes FinWait2Time: 010 TTLS: No EphemeralPorts: 1024-65535 SelectiveACK: Yes TimeWaitInterval: 30 DefltMaxSndBufSize 262144 RetransmitAttempt: 15 ConnectTimeOut: 0120 ConnectInitIntval: 1000 KeepAliveProbes: 10 KAProbeInterval: 060 Nagle: No QueuedRTT: 20 FRRThreshold: 3
  169. Externals: Netstat ALL/-A NETSTAT ALL MVS TCP/IP NETSTAT CS V2R1 TCPIP Name: TCPCS 22:24:30 Client Name: FTPD1 Client Id: 000000F9 Local Socket: 9.42.104.43..21 Foreign Socket: 9.42.103.165..1035 BytesIn: 0000000035 BytesOut: 0000000265 SegmentsIn: 0000000017 SegmentsOut: 0000000014 StartDate: 01/09/2012 StartTime: 22:04:11 Last Touched: 22:04:18 State: Establsh RcvNxt: 0214444666 SndNxt: 0216505563 ... MaximumSegmentSize: 0000000524 DSField: 00 Round-trip information: Smooth trip time: 102.000 SmoothTripVariance: 286.000 ReXmt: 0000000000 ReXmtCount: 0000000000 DupACKs: 0000000000 RcvWnd: 0000032730 SockOpt: 85 TcpTimer: 00 TcpSig: 84 TcpSel: 60 TcpDet: E0 TcpPol: 00 TcpPrf: C0 TcpPrf2: 70 TcpPrf3: 00 DelayAck: AutoYes QOSPolicy: No ... • DelayAck values shown on the slide: AutoYes, AutoNo, Yes, No
  170. Migration considerations • Netstat ALL / -a and CONFIG / -f
  171. Security • Simplified access permissions to ICSF cryptographic functions for IPSec • TCP/IP profile IP security filter enhancements • AT-TLS certificate processing enhancements • TLS session reuse support for FTP and AT-TLS applications • AT-TLS enablement for DCAS • TLS security enhancements for sendmail • TLS security enhancements for policy agent • Network security enhancements for SNMP
  172. SIMPLIFIED ACCESS PERMISSIONS TO ICSF CRYPTOGRAPHIC FUNCTIONS FOR IPSEC
  173. Background information: ICSF • Integrated Cryptographic Services Facility (ICSF) – Primary cryptographic provider on z/OS, including many crypto algorithms and access to all z Systems hardware crypto features – Offers a FIPS 140 mode through its PKCS#11 interface – SAF CSFSERV class resources control access to ICSF's many callable services – When CSFSERV class defined and CHECKAUTH(YES) specified in ICSF options dataset – Calling user ID must have READ permission to a SAF profile that covers the resource protecting the given callable service
  174. Background information: IPSec and ICSF • TCP/IP stack's IPSec support – Uses 24 different ICSF callable services (both FIPS and non-FIPS mode) to perform many cryptographic operations – Often runs under the SAF credentials (ACEE) of the calling application (most commonly for send operations) – Therefore, IPSec operations run under caller's ACEE – As a result, in some cases, the user ID under which any application generates IPSec-protected traffic must be permitted to appropriate CSFSERV resources
  175. Business problem • When CSFSERV class is defined and CHECKAUTH(YES) is specified in the ICSF options data set, every user ID under which IPSec-protected traffic is generated must be permitted to a long list of CSFSERV resources (in addition to permitting the TCP/IP stack's user ID) • Since the stack operates on behalf of the application and associated user ID, it makes sense that the TCP/IP stack's permissions to those resources should be sufficient • Prior to V2R2, ICSF did not provide a way for a service provider like the TCP/IP stack to specify the credentials under which the ICSF callable service should execute, so the stack had no way to avoid the issue • Note that CHECKAUTH(YES) tells ICSF to perform access control checks for supervisor state and system key callers – both of which describe the TCP/IP stack. The problem scenario does not exist if CHECKAUTH(NO) is in effect – and this is the default value.
  176. Solution • In V2R2, ICSF provides a new CSFACEE (and CSFACEE6, for 64-bit) function that allows an authorized caller (either system key or supervisor state) to provide a SAF ENVR structure to use in place of the default ACEE for SAF checks • The TCP/IP stack's IPSec support is updated to use this new interface – Means that all ICSF calls within the TCP/IP stack can now be made under the TCP/IP stack's credentials instead of the calling application's – Covers both FIPS 140 and non-FIPS 140 mode • As a result, customers that use CHECKAUTH(YES) can eliminate all of the application-specific permissions to ICSF resources that were previously required due to IPSec protection. (Since the stack's user ID already required the same permissions, there are no additional permissions that need to be defined).
  177. Enablement actions • None
  178. Externals • None
  179. Migration considerations • Optional action – Customers who have permitted application user IDs to CSFSERV resources because of IPSec protection can choose to remove those permissions – This is not mandatory – just a “clean up” and simplification task since the TCP/IP stack's user ID already must have the same permissions – Note that any new IPSec-generating applications do not have to be permitted to CSFSERV resources
  180. TCP/IP PROFILE IP SECURITY FILTER ENHANCEMENTS
  181. Background information: IP filtering • IP filters permit traffic, deny traffic, or require that it be protected with IPSec • [Figure: an inbound or outbound packet (SrcIP, DstIP, Proto, SrcPort, DstPort) is compared against the IP filter table in the stack, where each rule lists SrcIP, DstIP, Proto, SrcPort, DstPort, and Action; the action of the first filter to match is performed, and all other traffic reaches a final DENY rule] • An implied “deny all” rule always exists at the bottom of the filter list
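First-match filtering with an implied "deny all" can be sketched in a few lines of Python. This is an illustration of the evaluation order only: the rule representation, the exact-match comparison, and the action names are assumptions, and the real stack's traffic descriptors are richer than this.

```python
FIELDS = ("src_ip", "dst_ip", "proto", "src_port", "dst_port")

def matches(rule, packet):
    # A field left out of a rule acts as a wildcard (assumed convention).
    return all(rule.get(f) is None or rule[f] == packet[f] for f in FIELDS)

def filter_action(rules, packet):
    # Rules are checked top to bottom; the first match decides the action.
    for rule in rules:
        if matches(rule, packet):
            return rule["action"]
    return "deny"   # implied "deny all" at the bottom of the list

# Hypothetical filter table for illustration.
RULES = [
    {"dst_ip": "10.1.1.1", "proto": "tcp", "dst_port": 22, "action": "permit"},
    {"dst_ip": "10.1.1.1", "proto": "tcp", "action": "ipsec"},
]

if __name__ == "__main__":
    ssh = {"src_ip": "10.2.2.2", "dst_ip": "10.1.1.1", "proto": "tcp",
           "src_port": 40000, "dst_port": 22}
    dns = dict(ssh, proto="udp", dst_port=53)
    print(filter_action(RULES, ssh), filter_action(RULES, dns))
```

Rule order matters in this model: if the two rules were swapped, the broader TCP rule would match first and the port 22 rule would never be reached.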
  182. Background information: IP filter configuration • IP security is enabled in the TCP/IP profile: – IPSECURITY on IPCONFIG statement – IPSECURITY on IPCONFIG6 statement, to enable for IPv6 traffic • Default IP filters are defined in the TCP/IP profile on the IPSEC statement – Provides limited filtering capability – Protects the TCP/IP stack during initialization until Policy Agent installs an IPSec policy – Provides a “lockdown” option (ipsec -f default) • Policy IP filters are defined in an IPSec policy that is installed by Policy Agent – Provides full filtering and IPSec capability
  183. Background information: Default IP filter policy • Used to permit a limited set of traffic • All traffic that is not explicitly permitted is denied • Traffic selection parameters for default filter rules are more limited than traffic descriptions provided for policy rules – For example, a range of ports cannot be specified • Address ranges are not supported for the source and destination address • All rules are bidirectional
  184. Business problem: Configuration Assistant TCP/IP profile support • Configuration Assistant (CA) in V2R2 introduces TCP/IP profile support – Includes default filter rules • CA allows reusable object traffic descriptors to be defined for IPSec policies • Default filter rules do not support all traffic descriptor options provided for policy filter rules • CA profile support unable to share reusable objects defined for IPSec policies
