© 2013 IBM Corporation
DB2 Data Sharing Performance
Martin Packer, IBM
martin_packer@uk.ibm.com
Abstract
■ This presentation shows how to interpret DB2 Data Sharing performance numbers
  – from both a z/OS / RMF and a DB2 perspective
■ Performance topics include:
  – XCF
  – Coupling Facility
  – Data Sharing Structures
  – and the application's perspective
Agenda
■ XCF
■ Coupling Facility
  – CPU
  – Links
■ DB2 Data Sharing Structures
■ Application Perspective
■ Some Parting Thoughts
Major DB2 Data Sharing Components
[Diagram: two z/OS images, each with a DB2 member, its IRLM and XCF, connected to a Coupling Facility containing the XCF structures, GBPs, LOCK1 and SCA, with shared data on disk behind both images]
Major DB2 Data Sharing Components
■ Locking
  – DB2 subsystems in a group in one z/OS image share an IRLM address space
  – IRLMs communicate through their LOCK1 structure
    • groupname_LOCK1
  – IRLMs also communicate via XCF
    • DXRnnn groups
    • XES locking services also use XCF
      – IXCLOnnn groups
■ Group Buffer Pools
  – 1 CF structure per GBP
    • groupname_GBPn
    • Members connect directly to GBP structures
■ Shared Communications Area (SCA)
  – Status sharing
    • groupname_SCA
    • Members connect directly to the SCA structure
  – Much lower activity than LOCK1 and GBPs, so not usually considered for tuning
XCF
XCF
■ General signalling mechanism
  – Introduced before the other Parallel Sysplex functions
■ Traffic divided into transport classes
  – These use either Coupling Facility structures or CTCs to pass messages
    • Dynamically routed, based on XCF observing performance
    • Dedicated CTCs
      – Originally faster than CF structures
      – Must define a pair of paths for each connection
      – Definition can get quite complex
  – Transport classes have a maximum message size
    • Fitting messages to classes is a significant tuning item
■ Applications use specific XCF group names
  – Example: IXCLOnnn is XES lock resolution
  – Application address spaces connect as members of the group
XCF Tuning
■ Aim to reduce transfer times
  – "Mean Transfer Time" (MXFER TIME) in RMF
■ Aim to minimise traffic
  – Rates at all levels in RMF
  – E.g. minimise Locking False Contention
  – E.g. set up GRS Star in a way that minimises ENQs
■ Optimise the use of links
  – More modern CF-based links tend to outperform CTCs
    • But CTCs are still better for small messages
    • CTCs drive SAP utilisation
  – RMF counts the number of times each path was chosen
    • Understand why signals use the paths in the ratio they do
■ Transport Class buffer sizes (see the sketch after this list)
  – Buffers that are too big waste memory
  – Buffers that are too small have to be expanded
  – RMF has counts of "Fit", "Small", "Big" and "Big With Overhead" messages
  – RMF lists transport classes and their maximum buffer size values
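The "Fit" / "Small" / "Big" counts lend themselves to a quick model. Below is a minimal sketch, assuming a 4KB buffer granularity and deliberately simplified classification rules (the real XCF algorithm is more involved); it shows how a class whose buffer is much larger than its typical signal racks up "Small" counts.

```python
# Illustrative sketch (not the actual XCF algorithm): classify signal
# lengths against a transport class's buffer size the way RMF's
# "Fit" / "Small" / "Big" counts suggest. The 4K granularity and the
# classification rules here are simplifying assumptions.
from collections import Counter

BUFFER_INCREMENT = 4096  # assumed buffer granularity

def classify(msg_len: int, class_buf_len: int) -> str:
    """Return a rough Fit/Small/Big classification for one signal."""
    if msg_len > class_buf_len:
        return "BIG"            # buffer must be expanded for this signal
    if msg_len <= class_buf_len - BUFFER_INCREMENT:
        return "SMALL"          # a smaller buffer would have done; memory wasted
    return "FIT"                # signal matches the class buffer well

# Example: a class defined with the 62,464-byte maximum buffer receiving
# mostly 8KB signals logs many SMALL counts -- a hint the class buffer
# size is too generous for its traffic.
signals = [8192] * 90 + [61440] * 10
print(Counter(classify(s, 62464) for s in signals))
```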
The XCF Traffic "Guessing Game"
■ To manage XCF traffic down, it's useful to know who talks to whom
  – SMF 74-2 just says how many messages a member sends and receives
    • Not who their partners are
DB2 Data Sharing XCF Groups – Between 2 Systems
Interesting “Hourly Spike” Behaviour
Coupling Facility
Parallel Sysplex SMF Records
■ XCF – 74-2 (RMF XCF Activity Report)
  – Applications
  – Groups
  – Paths
    • CTCs are treated like real devices, so SMF 74-1, 73 and 78-3 can also be useful
  – Members and Job Names
■ Coupling Facility – 74-4 (RMF Coupling Facility Activity Report)
  – Usage Summary Section – structure sizes and CPU usage
  – Structure Activity Section
  – Subchannel Activity Section – path / subchannel information
  – CF-to-CF Section – duplexing traffic at the CF level
SMF 74-4 Field: R744SETM "Structure Execution Time" – Structure-Level CPU
■ Always a 100% capture ratio
  – Adds up to R744PBSY
■ Multiple uses:
  – Capacity planning for changing request rates
  – Examining which structures are large consumers
  – Computing the CPU cost of a request and comparing it to service time
    • The interesting number is the "non-CPU" element of service time
  – Understanding whether CPU per request has degraded
  – Estimating Structure Duplexing CPU cost
■ NOTE:
  – You need to collect 74-4 data from all sharing z/OS systems to get the total request rate
    • Otherwise the "CPU per request" calculation will overestimate, as the sketch below shows
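A minimal sketch of that calculation, assuming you have already summed R744SETM (converted to microseconds) and the per-system request counts in your own SMF reduction; the data layout here is invented for illustration.

```python
# Hedged sketch: derive structure CPU-per-request from SMF 74-4 style
# inputs. R744SETM is the field the slide names; everything else is an
# illustrative stand-in.

def cpu_per_request(setm_microseconds: float, requests_by_system: dict) -> float:
    """CPU cost per request for one structure over one interval.

    requests_by_system must cover ALL sharing z/OS systems --
    missing a system inflates the result (the overestimate the
    slide warns about).
    """
    total_requests = sum(requests_by_system.values())
    return setm_microseconds / total_requests

# 9 CPU-seconds of structure execution time over an interval, with
# requests counted from both members of a two-way sysplex:
reqs = {"SYSA": 1_500_000, "SYSB": 1_200_000}
print(f"{cpu_per_request(9_000_000, reqs):.2f} us/request")  # ~3.33
```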
ISGLOCK Requests
[Chart: CPU Time and Service Time in microseconds (0–16) plotted against Requests / Second (0–70); CPU Time is roughly flat, annotated "3us?"]
Structure CPU Time – By Time Of Day
Coupling Facility Path Information
■ Dramatically improved in CFLEVEL 18 (zEC12)
  – RMF APAR OA37826
    • SMF 74-4
    • Coupling Facility Activity Report
  – Configuration:
    • Detailed adapter and link type, PCHID, CHPID
      – OA37826 gives the CHPID even without CFLEVEL 18, for InfiniBand and ISC links only
  – Performance:
    • "Degraded" flag
    • Channel Path Latency Time (R744HLAT)
      – Divide by 10us (per km, round trip) to get the distance estimate in the Postprocessor Report – see the sketch below
      – Would be interesting if it degraded over time
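A worked version of that rule of thumb, assuming a round-trip fibre latency of roughly 10 microseconds per kilometre:

```python
# Minimal worked example of the slide's rule of thumb: round-trip
# signal latency in fibre is roughly 10 microseconds per kilometre,
# so R744HLAT / 10 approximates CF link distance in km.

def estimated_distance_km(r744hlat_us: float) -> float:
    return r744hlat_us / 10.0

print(estimated_distance_km(150))  # 150us of latency suggests ~15 km
# Track this per interval: a latency (hence apparent "distance") that
# grows over time is the degradation the slide says would be interesting.
```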
R744HOPM – Channel path operation mode

Value  Meaning
X'01'  CFP path supporting a 1.0625 Gbit/s data rate
X'02'  CFP path supporting a 2.125 Gbit/s data rate
X'10'  CIB path operating at 1x bandwidth using the IFB protocol, adapter type HCA2-O LR
X'11'  CIB path operating at 12x bandwidth using the IFB protocol, adapter type HCA2-O
X'20'  CIB path operating at 1x bandwidth using the IFB protocol, adapter type HCA3-O LR
X'21'  CIB path operating at 12x bandwidth using the IFB protocol, adapter type HCA3-O
X'30'  CIB path operating at 12x bandwidth using the IFB3 protocol, adapter type HCA3-O
DB2 Data Sharing Structures
Synchronous and Asynchronous CF Requests
■ Synchronous (Sync)
  – The z/OS engine waits for completion
    • Each microsecond of request service time is a microsecond of lost engine capacity
  – E.g. GRS Star
■ Asynchronous (Async)
  – The z/OS engine does not wait for completion
  – Response times are usually longer than for Sync requests
  – E.g. XCF signalling
■ Automatic Sync-to-Async Conversion (sketched below)
  – Algorithm introduced in z/OS 1.2
  – Requests are converted wholesale
  – With conversion, an occasional request is still tried as Sync
    • This governs whether conversion remains the right thing to do
  – Factors:
    • Larger data transfers
    • Longer / slower links
    • Processor speed
    • Duplexing
  – Thresholds recently refined
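The real XES heuristic and its thresholds are not public, so the sketch below illustrates only the idea: sampled Sync service times are compared against a processor-speed-dependent threshold, and requests are converted when spinning costs more engine time than the longer Async response is worth. The 26us threshold is purely an assumption.

```python
# Illustrative-only sketch of the *idea* behind sync-to-async
# conversion -- the real XES algorithm differs, so treat every number
# here as an assumption.

def should_run_async(sampled_sync_times_us, threshold_us=26.0):
    """Convert when the observed Sync service time exceeds a
    CPU-speed-dependent threshold: the engine time burned spinning
    on a Sync request outweighs the longer Async response time."""
    avg = sum(sampled_sync_times_us) / len(sampled_sync_times_us)
    return avg > threshold_us

# Occasional requests are still tried as Sync (the probes the slide
# mentions) so the decision can be revisited as links or load change:
print(should_run_async([8, 9, 10]))    # fast local CF: stay Sync -> False
print(should_run_async([40, 55, 45]))  # distant or duplexed: go Async -> True
```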
Structure Duplexing
■ Two copies of the same structure in different CFs
  – Maintained in sync
  – Higher resilience to component failures
    • Loss of a z/OS image and an ICF on the same footprint is less likely to cause an outage
    • Faster than structure rebuild
■ Bidirectional links are required between the two CFs
  – Preferably more than one
■ User-Managed
  – The "user" is DB2
  – DB2 writes data to both structures
    • Async write to the primary, then Sync write to the secondary
    • Completion when both writes succeed
    • Reads come only from the primary
    • In the event of failure, reads come from the secondary
■ System-Managed
  – XES writes to primary and secondary
    • The two CFs communicate through a separate link to ensure status is shared
    • A request completes only when both structures have been accessed
Locking - LOCK1 Structure
■ Locks must be known and respected between members
  – Data Sharing uses global locks to achieve this
■ But not all locks need to be propagated
  – Only the most restrictive state needs to be
■ Locking is propagated from IRLM, via XES, to the LOCK1 CF structure
  – IRLM knows about locking states that XES doesn't
    • XES only knows about "shared" and "exclusive" locks
    • DB2 had many more states, even before Data Sharing
  – The sketch below illustrates both ideas
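A minimal sketch, assuming a simplified subset of IRLM states and one plausible IRLM-to-XES mapping (the real mapping is richer, and DB2 Version 8 changed it, as a later slide notes):

```python
# Sketch of two ideas from this slide: (1) a member propagates only its
# most restrictive held state, and (2) rich IRLM states collapse to the
# two states XES understands. The state subset and mapping are
# simplifying assumptions.

# Ordered least -> most restrictive (simplified subset of IRLM states)
RESTRICTIVENESS = ["IS", "IX", "S", "U", "SIX", "X"]

def most_restrictive(held_states):
    """The single state worth propagating for this member."""
    return max(held_states, key=RESTRICTIVENESS.index)

def xes_view(irlm_state):
    """Collapse an IRLM state to the shared/exclusive world of XES
    (one plausible mapping -- the real one changed in DB2 V8)."""
    return "SHARED" if irlm_state in ("IS", "S") else "EXCLUSIVE"

held = ["IS", "IX", "S"]
print(most_restrictive(held))             # S
print(xes_view(most_restrictive(held)))   # SHARED
```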
LOCK1 Requests
[Chart: CPU Time and Service Time in microseconds (0–12) plotted against Requests / Second (750–900); CPU Time is roughly flat, annotated "3.5us?"]
Locking - LOCK1 Structure
■ Contention types:
  – Real Contention
    • Different members really do need to use the same resource at the same time
    • Real application delay is inherent while the holder retains the lock
  – XES Contention
    • When XES believes there is contention but IRLM, with its more comprehensive view of locking, knows there isn't
    • The IRLMs have to talk via XCF to resolve this – the DXRnnn groups
  – False Contention
    • When the lock table hashing algorithm produces the same hash value for two different resources
    • The XESs have to talk via XCF to resolve this – the IXCLOnnn groups
  – A health-check sketch on these counters follows
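A hedged sketch, expressing each contention type as a percentage of total LOCK1 requests; a rule of thumb often quoted is to keep False Contention below about 1%. The counter names are illustrative stand-ins for the RMF / DB2 statistics fields.

```python
# Health check on the contention counters: what share of LOCK1
# requests ends in each contention bucket. Counter names are
# illustrative, not literal field names.

def contention_rates(total_requests, real, xes_only, false_):
    return {
        "real %":  100.0 * real / total_requests,
        "xes %":   100.0 * xes_only / total_requests,
        "false %": 100.0 * false_ / total_requests,
    }

rates = contention_rates(total_requests=5_000_000,
                         real=20_000, xes_only=15_000, false_=60_000)
print(rates)  # false % of 1.2 -- above the ~1% rule of thumb, so a
              # bigger lock table (see Locking Tuning) is worth trying
```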
Group Buffer Pool Tuning
■ GBP tuning has similarities to local pool tuning
  – But with some twists
■ It's important to minimise traffic
  – Application GETPAGEs in general
  – Traffic to the GBPs
■ It's also important to minimise response times
  – Mainly a matter of tuning the underlying CF access
■ Minimising the amount of data actually shared may be practical
  – Though for many designs it isn't
■ It's important to avoid invalidations due to too few directory entries
  – GBP space is divided into directory entries and data elements
    • Too few directory entries cause directory entry reclaims
      – Which invalidate local buffers
    • The installation can alter the balance between the two
    • The installation can increase the size of the group buffer pool
  – A sizing sketch follows
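The sizing check this implies can be sketched as below, using the common rule of thumb that eliminating reclaims needs a directory entry for every page that could be registered: roughly all members' local buffers plus the GBP's own data elements. A hedged approximation, not the exact CFSIZER arithmetic.

```python
# Rule-of-thumb sizing sketch: directory entries needed to avoid
# reclaims (and hence the local-buffer invalidations they cause).
# Treat this as an approximation, not the exact sizing formula.

def directory_entries_needed(local_buffers_per_member, data_elements):
    return sum(local_buffers_per_member) + data_elements

need = directory_entries_needed([50_000, 50_000], data_elements=40_000)
print(need)  # 140,000 entries; compare with the entry count RMF reports
             # and adjust the ratio / pool size if the GBP falls short
```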
[Chart: request CPU Time versus Service Time, annotated "25us?"]
Group Buffer Pool Tuning - Traffic
■ Reads:
  – Cross-invalidation reads
    • Data returned, i.e. the page is in the GBP
    • Data not returned, i.e. the page is known to be down-level but is not in the GBP
      – Requires a disk read
  – Buffer pool miss reads
    • Data returned, i.e. the page is not in the local pool but is in the GBP
    • Data not returned, i.e. the page is in neither the local pool nor the GBP
  – A bigger GBP ought to provide more hits and fewer misses
    • But I rarely see high GBP hit ratios (see the hit ratio sketch below)
■ Writes:
  – Avoid GBPCACHE(NONE), as writes are then SYNCHRONOUS TO DISK at Commit time
    • Harmful to the committing unit of work's performance
  – Writes can also be caused by the LOCAL pool's Deferred Write thresholds being hit
    • In this case Commits aren't waited for
■ Castouts:
  – Dribbling out is a good idea
    • Just like for local pools
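The four read counters combine into that ratio. A minimal sketch, with descriptive names standing in for the literal DB2 statistics fields:

```python
# GBP read hit ratio from the four read counters on this slide.
# xi_* = cross-invalidation reads, nf_* = local-pool miss reads;
# the *_miss cases fall through to DASD.

def gbp_read_hit_ratio(xi_hit, xi_miss, nf_hit, nf_miss):
    """Fraction of GBP reads satisfied from the GBP itself."""
    hits = xi_hit + nf_hit
    total = hits + xi_miss + nf_miss
    return hits / total if total else 0.0

print(f"{gbp_read_hit_ratio(30_000, 10_000, 5_000, 55_000):.1%}")
# 35.0% -- the kind of modest ratio the slide says is common; a bigger
# GBP ought to raise it, per the bullet above.
```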
Locking Tuning
■ It's important to reduce locking traffic at all levels
  – Application
  – DB2 subsystem
■ It's also important to reduce False Contention
  – Usually by increasing the Lock Table portion of the LOCK1 structure
    • The number of entries will be a power of 2
      – A 4-byte lock table entry means fewer entries for the same size than a 2-byte one, as the arithmetic below shows
■ It's nice that in DB2 Version 8 there's a remapping of IRLM lock states to XES ones
  – This may reduce XES lock contention
■ CF request response times are also important
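The power-of-2 arithmetic behind the lock table is easy to sketch. The 32MB lock table size below is an assumption; the 2-byte versus 4-byte entry widths are from the slide (the width is driven by the number of members).

```python
# Worked sketch of lock table sizing: the entry count is the largest
# power of 2 that fits in the space given to the lock table.

def lock_table_entries(lock_table_bytes: int, entry_width: int) -> int:
    """Largest power-of-2 entry count that fits."""
    raw = lock_table_bytes // entry_width
    return 1 << (raw.bit_length() - 1)

lock_table = 32 * 1024 * 1024              # assume 32MB of LOCK1 for the table
print(lock_table_entries(lock_table, 2))   # 16,777,216 entries
print(lock_table_entries(lock_table, 4))   #  8,388,608 -- half the entries
                                           #  for the same size, so more False
                                           #  Contention, as the slide notes
```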
Application Perspective
Data Sharing Instrumentation
■ Accounting Trace
  – Generally provides a time breakdown for each application
    • At the Plan, Correlation ID and Package level
    • Excellent tuning instrumentation for applications
  – Timings:
    • Global Lock Wait
    • Time to retrieve pages from the GBP
      – Subsumed within Sync DB Wait and Async Read
    • Time for Commits
      – Can involve GBP traffic
  – Activities:
    • Group Buffer Pool
    • Global Locking
■ Statistics Trace
  – Activities:
    • Group Buffer Pool
      – Note: MXG change 25.075 is required to support the incompatibly-changed DB2 Version 8 GBP statistics
    • Global Locking
■ A wait-time breakdown sketch follows
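As an illustration of what the Accounting Trace enables, the sketch below computes how much of an application's wait time is data sharing related. Field names are descriptive, not the literal QWAC fields, and breaking GBP read time out as its own bucket is a simplification: as noted above, it is really subsumed within Sync DB Wait and Async Read.

```python
# Illustrative per-plan breakdown of Accounting Trace wait times.
# Bucket names are descriptive stand-ins, not literal trace fields.

def data_sharing_share(waits_us: dict) -> float:
    """Fraction of total wait time spent on data sharing effects
    (Global Lock Wait plus GBP-related reads)."""
    ds = waits_us["global_lock"] + waits_us["gbp_reads"]
    return ds / sum(waits_us.values())

waits = {"sync_db_io": 400_000, "async_read": 150_000,
         "global_lock": 90_000, "gbp_reads": 60_000, "commit": 50_000}
print(f"{data_sharing_share(waits):.1%}")  # 20.0% of wait is data sharing
```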
Some Parting Thoughts
■ Parallel Sysplex has many benefits
  – More fully realised with Data Sharing
■ Performance and cost need to be managed carefully
■ Configuration choices make an enormous difference
■ Avoid shared coupling facilities for Production
■ Good monitoring tools exist for both z/OS / hardware and DB2
■ Tune not only DB2 structure and XCF performance
  – But also the other structures and users of XCF
