Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and Tuning Practices
1. Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and Tuning Practices
Session Number IDZ-1155B
Cüneyt Göksu, VBT
Cuneyt.Goksu@vbt.com.tr
#ibmiod
2. General Tuning Process
• Application Performance
o Application Logic
o SQL
o Application Server
• Database Design
o Tablespace, Table & Index Design
o Normalization
• DB2 Subsystem
o Pools, zParms, Logs, Locks, DDF
• Others
o Network
o Operating System
o DASD
3. Two Options
• Manual
o Explain
o DB2 Commands
o RTS
Or
• Tool
o Data Studio
o OMEGAMON Family
o Other DB2 Tools such as Query Monitor, Query Tuner
5. DB2 System and Application Monitoring & Tuning
• Online Monitoring
o Real Time or Near Real time
o Snapshot (Interval Based)
o Good for real-time problem solving such as lock conflicts and pool shortages
EDM Pool Tuning: zParm EDM_SKELETON_POOL
DB2 V10 CM8
6. DB2 System and Application Monitoring & Tuning
• Batch Reporting
o DB2 Traces
o SMF Configuration
o Reporting and Operation Structure
• DB2 Traces
7. DB2 System and Application Monitoring & Tuning
• DB2 Traces –START TRACE Command
PERFM : performance analysis and tuning (specific events in the system)
ACCTG : accounting for a particular program or authorization ID (for each thread)
STAT : statistical data for various components of DB2 at time intervals
AUDIT: audit data from various components of DB2
MONITOR: trace data available to DB2 monitor application programs
• DB2 Traces – Start Automatically from zPARM
For Audit : AUDITST (YES|NO|List of Classes)
For Accounting: SMFACCT (YES|NO|List of Classes)
For Statistics: SMFSTAT (YES|NO|List of Classes)
SMFACCT = 1,2,3,7,8,10
SMFSTAT = 1,3,4,5,6,8
AUDITST = 2
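The trace types and class lists above can also be turned into explicit START TRACE commands. The following is a minimal sketch, not an official tool: the helper name `start_trace_command` is hypothetical, and routing every trace to `DEST(SMF)` is an assumption made for this illustration (monitor traces in particular are often sent elsewhere).

```python
# Sketch: build -START TRACE command strings from a trace type and class list.
# The helper name and the default DEST(SMF) are assumptions for illustration.
VALID_TRACE_TYPES = {"PERFM", "ACCTG", "STAT", "AUDIT", "MONITOR"}


def start_trace_command(trace_type, classes, dest="SMF"):
    """Return a DB2 -START TRACE command string for the given type and classes."""
    if trace_type not in VALID_TRACE_TYPES:
        raise ValueError(f"unknown trace type: {trace_type}")
    class_list = ",".join(str(c) for c in classes)
    return f"-START TRACE({trace_type}) CLASS({class_list}) DEST({dest})"


# Class lists taken from the zPARM values shown on the slide.
accounting = start_trace_command("ACCTG", [1, 2, 3, 7, 8, 10])
statistics = start_trace_command("STAT", [1, 3, 4, 5, 6, 8])
audit = start_trace_command("AUDIT", [2])

for command in (accounting, statistics, audit):
    print(command)
```

Generating the strings from one list of classes keeps the manually started traces consistent with what the zPARM settings would start automatically.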
8. DB2 System and Application Monitoring & Tuning
• SMF Configuration
SMF Record        IFCIDs
100 statistics    (1) System Services, (2) Database Services, (202) System Parameters
100 statistics    (225) System Services, (230) Data Sharing
101 accounting    (3) Agent accounting
102 performance
• SMF Configuration
SMF must be active & SMFPRMxx member allows 100-102
DSNW133I csect TRACE DATA LOST, dest NOT ACCESSIBLE RC=code
9. DB2 System and Application Monitoring & Tuning
• SMF Configuration
What are OMEGAMON for DB2 recommendations for preventing DSNW133I?
http://www-01.ibm.com/support/docview.wss?uid=swg21187080
• SUBSYS(STC,TYPE(100:102))
• ACTIVE
• SMFCOMP (OFF|ON): DB2 zPARM in V10, uses the z/OS compression service CSRCESRV
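A quick way to confirm that a SUBSYS statement lets record types 100-102 through is to expand its TYPE operand and compare. This is a minimal sketch under stated assumptions: the function names are hypothetical, only the `TYPE(a:b,c)` form shown above is handled, and a `NOTYPE(...)` operand is simply treated as "not allowed" rather than interpreted.

```python
import re


def smf_types_allowed(subsys_statement):
    """Return the set of SMF record types listed in a TYPE(...) operand."""
    # The lookbehind skips NOTYPE(...); this sketch does not interpret it.
    match = re.search(r"(?<!NO)TYPE\(([^)]*)\)", subsys_statement)
    if match is None:
        return set()
    allowed = set()
    for part in match.group(1).split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            low, high = part.split(":")
            allowed.update(range(int(low), int(high) + 1))
        else:
            allowed.add(int(part))
    return allowed


def db2_records_allowed(subsys_statement):
    """True when SMF types 100, 101 and 102 are all included."""
    return {100, 101, 102} <= smf_types_allowed(subsys_statement)


print(db2_records_allowed("SUBSYS(STC,TYPE(100:102))"))
```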
10. DB2 System and Application Monitoring & Tuning
• Reporting and Operation Structure
//OMEGA    EXEC PGM=DB2PM,PARM='DATEFORMAT=DD-MM-YY'
//* SMF input
//INPUTDD  DD DISP=SHR,DSN=ISB.SMFBACK.SYSZ.Y2012.A07.G16
//SYSIN    DD *
GLOBAL
TIMEZONE (-2)
* Type of report
STATISTICS
REDUCE
* FROM (21-06-11,11:00),TO(21-06-11,11:30)
* INTERVAL (15)
INCLUDE(MEMBER(PR4B))
REPORT
* Layout
LAYOUT(LONG)
* LAYOUT(SHORT)
* DSETSTAT
EXEC
12. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
1 LOCATION: PR0BPLOC OMEGAMON XE FOR DB2 PERFORMANCE MONITOR ( V4 ) PAGE: 1-1
GROUP: DSNPR0B ACCOUNTING REPORT - SHORT REQUESTED FROM: NOT SPECIFIED
MEMBER: PR1B TO: NOT SPECIFIED
SUBSYSTEM: PR1B ORDER: PRIMAUTH-PLANNAME INTERVAL FROM: 15-07-12 23:01:05.38
DB2 VERSION: V9 SCOPE: MEMBER TO: 16-07-12 23:04:47.05
#OCCURS #COMMIT INSERTS OPENS PREPARE CLASS2 EL.TIME BUF.UPDT LOCK SUS
PRIMAUTH #DISTRS SELECTS UPDATES CLOSES CLASS1 EL.TIME CLASS2 CPUTIME SYN.READ #LOCKOUT
PLANNAME #ROLLBK FETCHES MERGES DELETES CLASS1 CPUTIME GETPAGES TOT.PREF
--------------------------- ------- ------- ------- ------- -------------- -------------- -------- --------
AINTRNAA 23 23 0.00 10.13 0.00 0.014619 1.09 0.04
POFATURA 0 0.00 1.04 9.61 2.917138 0.004550 0.17 0
1 15.26 0.00 0.00 0.114770 40.13 0.48
----------------------------------------------------------------------------------------------------------------
!PROGRAM NAME TYPE #OCCURS #ALLOCS SQLSTMT CL7 ELAP.TIME CL7 CPU TIME CL8 SUSP.TIME CL8 SUSP!
!TIBSQLGV PACKAGE 23 71 2.04 0.001293 0.000336 0.000381 0.17!
!TIBSQLHS PACKAGE 8 150 20.75 0.006328 0.001538 0.003464 2.00!
!TIBSQLIS PACKAGE 22 220 9.00 0.003570 0.000663 0.002437 1.32!
!GET_AP#1 PACKAGE 22 33 6.00 0.000564 0.000277 0.000000 0.00!
!GET_CU#1 PACKAGE 8 142 19.25 0.000375 0.000236 0.000000 0.00!
!GET_EX#3 PACKAGE 22 33 6.00 0.000524 0.000316 0.000050 0.14!
!GET_IF#1 PACKAGE 22 33 6.00 0.000524 0.000336 0.000000 0.00!
---------------------------------------------------------------------------------------------------------------------------
!TRUNCATED VALUE FULL VALUE
!GET_AP#1 GET_APP_TYPE_AND_ACC_REST
!GET_CU#1 GET_CUSTOMER_ACCOUNTS
!GET_EX#3 GET_EXTRA_RIGHT_INFO_FROM_GROUP_ID
!GET_IF#1 GET_IF_CHILD_OF_ROOT
Callouts on the report:
• INTERVAL FROM / TO: this time is the period used to gather information.
• PROGRAM NAME block: list of the packages used in this period.
13. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
• #OCCURS is the number of transactions.
• CLASS2 EL.TIME is the average elapsed time spent in DB2 (the DB2 application server or DB2 server).
• CLASS2 CPUTIME is the average CPU time used in DB2.
• Multiplying #OCCURS by CLASS2 EL.TIME gives the total elapsed time in DB2 for this period (job).
• Multiplying #OCCURS by CLASS2 CPUTIME gives the total CPU time DB2 used in this period (job).
• This analysis shows where the performance bottleneck is.
#OCCURS * CLASS2 EL.TIME    #OCCURS * CLASS2 CPUTIME
(Part of the DAY3 report)
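As a worked example of the two products, the sketch below uses the values from the accounting report on the earlier slide (23 occurrences, CLASS2 EL.TIME 0.014619, CLASS2 CPUTIME 0.004550). The function name `total_time` is an assumption made for this illustration.

```python
# Sketch: total in-DB2 elapsed and CPU time = number of occurrences * average.
def total_time(occurs, average_seconds):
    """Return the total time in seconds for a plan over the reporting period."""
    return occurs * average_seconds


occurs = 23                    # #OCCURS from the accounting report
class2_elapsed_avg = 0.014619  # CLASS2 EL.TIME (average, seconds)
class2_cpu_avg = 0.004550      # CLASS2 CPUTIME (average, seconds)

total_elapsed = total_time(occurs, class2_elapsed_avg)
total_cpu = total_time(occurs, class2_cpu_avg)

print(f"total class 2 elapsed: {total_elapsed:.6f} s")
print(f"total class 2 CPU:     {total_cpu:.6f} s")
```

Ranking plans by these totals rather than by the averages keeps a rarely executed but slow plan from hiding a cheap plan that runs very often.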
14. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
• PROGRAM NAME is the name of a package used in this period (this job).
• CL7 ELAP.TIME is the average elapsed time of the package.
• CL7 CPU TIME is the average CPU time used by the package.
• #OCCURS is the number of times the package was called.
• Multiplying #OCCURS by CL7 ELAP.TIME gives the total elapsed time of the package.
• Multiplying #OCCURS by CL7 CPU TIME gives the total CPU time used by the package.
• This analysis shows which package is the performance bottleneck.
#OCCURS * CL7 ELAP.TIME    #OCCURS * CL7 CPU TIME
(Part of the DAY3 report)
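The same multiplication can rank packages. This is a minimal sketch using three of the package rows from the accounting report shown earlier; the tuple layout and variable names are assumptions made for the illustration.

```python
# Sketch: rank packages by total class 7 elapsed time (occurrences * average).
# Rows: (package, occurrences, CL7 ELAP.TIME average, CL7 CPU TIME average)
packages = [
    ("TIBSQLGV", 23, 0.001293, 0.000336),
    ("TIBSQLHS", 8, 0.006328, 0.001538),
    ("TIBSQLIS", 22, 0.003570, 0.000663),
]

totals = [
    (name, occurs * elapsed_avg, occurs * cpu_avg)
    for name, occurs, elapsed_avg, cpu_avg in packages
]

# Highest total elapsed time first: the most likely bottleneck candidate.
ranked = sorted(totals, key=lambda row: row[1], reverse=True)

for name, total_elapsed, total_cpu in ranked:
    print(f"{name}: elapsed {total_elapsed:.6f} s, CPU {total_cpu:.6f} s")
```

Note that TIBSQLHS has the highest average elapsed time, but TIBSQLIS accumulates more total time because it is called more often.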
15. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
• Local Response Time = Class 1 elapsed Time
• SQL = Class 2 elapsed time
• Non-SQL = Class 1 – Class 2 elapsed time
• Lock Wait = Class 3 Lock Susp
• CPU Time = Class 2 CPU Time
• Sync Read = Class 3 Sync I/O Susp
• Wait for Prefetch = Class 3 Other read I/O Susp
• Other = SQL - (LOCK WAIT + CPU TIME + SYNCHRONOUS READ + WAIT FOR PREFETCH)
16. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
ELAPSED TIME           2.917138   Local Response Time (Class 1)
ELAPSED TIME           0.014619   SQL Elapsed Time (Class 2)
ELAPSED TIME           2.902519   Non-SQL (Class 1 – Class 2)
LOCK/LATCH(DB2+IRLM)   0.000225   Class 3 Lock Susp
CPU TIME               0.004550   Class 2 CPU Time
SYNCHRON. I/O          0.003362   Class 3 Sync I/O Susp
OTHER READ I/O         0.000000   Class 3 Other read I/O Susp
OTHER = 0.014619 - 0.008137 = 0.006482
First thing to look at:
If most time spent on non-SQL activities
– Look for reason for bad performance outside DB2
• CICS / IMS-DC
• Access to other databases
• File processing
• Program instructions, etc. etc...
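The breakdown above can be reproduced with a few subtractions. The sketch below uses the values from this slide; the variable names are chosen for the illustration, and `Decimal` is used only so the six-digit report values subtract exactly.

```python
from decimal import Decimal

# Values from the accounting report (seconds).
class1_elapsed = Decimal("2.917138")  # local response time
class2_elapsed = Decimal("0.014619")  # SQL elapsed time
lock_wait = Decimal("0.000225")       # class 3 lock/latch suspension
cpu_time = Decimal("0.004550")        # class 2 CPU time
sync_read = Decimal("0.003362")       # class 3 synchronous I/O suspension
prefetch_wait = Decimal("0.000000")   # class 3 other read I/O suspension

non_sql = class1_elapsed - class2_elapsed
accounted = lock_wait + cpu_time + sync_read + prefetch_wait
other = class2_elapsed - accounted
non_sql_share = non_sql / class1_elapsed

print(f"Non-SQL: {non_sql} s ({non_sql_share:.1%} of class 1)")
print(f"Other:   {other} s")
```

With almost all of the response time outside DB2, this transaction falls into the "look outside DB2" case.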
17. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Accounting Report
Second thing to look at:
What is the major contributor to class 2 elapsed time?
• High ‘OTHER’ means system-related causes such as CPU queuing, paging, ...
• High Class 3 Sync I/O Susp time
o If (Sync I/O elapsed time / # Sync I/O events) > 10 ms, then it is a system-related problem
o Here: SYNCHRON. I/O 0.003362 s / 2.22 events ≈ 1.5 ms, which is < 10 ms
• High Class 3 lock/latch suspension times (application-related)
• High Class 2 CPU time (application-related)
• For application-related problems, check:
– Missing matching columns
– Nonindexable predicates
– Non-Boolean term predicates
– Avoidable sort
– Missing denormalization
– Not index-only
• And for locking-related problems, check:
– Duration of commit interval
– Lock avoidance
– Isolation level
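The synchronous I/O rule of thumb is a single division. This is a minimal sketch using the two numbers quoted on the slide (0.003362 seconds of suspension over 2.22 events); the function name and the way the 10 ms limit is passed in are assumptions for the illustration.

```python
# Sketch: average synchronous I/O time per event, compared with a 10 ms limit.
def avg_sync_io_ms(suspension_seconds, events):
    """Return the average synchronous I/O suspension per event in milliseconds."""
    return suspension_seconds / events * 1000.0


def is_system_related(suspension_seconds, events, limit_ms=10.0):
    """True when the average I/O time per event exceeds the limit."""
    return avg_sync_io_ms(suspension_seconds, events) > limit_ms


average_ms = avg_sync_io_ms(0.003362, 2.22)
print(f"average sync I/O: {average_ms:.2f} ms")
print("system-related" if is_system_related(0.003362, 2.22) else "within limit")
```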
18. DB2 System and Application Monitoring & Tuning
• Fundamental Report Analysis with Statistics Report
Class  Data Collected                                        IFCIDs Activated
1      Statistics data                                       1, 2, 105, 106, 202, 225
2      Installation-defined statistics record                152
3      Deadlock, lock escalation, group buffer pool,         172, 196, 250, 258, 261, 262, 313, 330,
       data set extension information, indications of        335, 337
       long-running URs, and active log space shortages
4      DB2 exceptional conditions                            173, 191-195, 203-210, 235, 236, 238,
                                                             267, 268, 343
5      DB2 data sharing statistics record                    230
6      Storage usage details                                 225
8      Data set I/O statistics                               199
• CPU overhead of the DB2 Statistics Traces is negligible
• SMFSTAT YES (default) starts the trace for the default classes (1, 3, 4, 5, 6)
• With STATIME set to 1 minute, there are only 1440 intervals per day
• DB2 Statistics Records are written as SMF 100 records
• Recommendation to copy SMF records, and to keep them separately
19. DB2 System and Application Monitoring & Tuning
• Dataset Open/Close – Tuning
OPEN/CLOSE ACTIVITY QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
OPEN DATASETS - HWM 15004.00 N/A N/A N/A
OPEN DATASETS 14145.65 N/A N/A N/A
DS NOT IN USE,NOT CLOSE-HWM 14786.00 N/A N/A N/A
DS NOT IN USE,NOT CLOSED 13104.75 N/A N/A N/A
IN USE DATA SETS 1040.89 N/A N/A N/A
DSETS CLOSED-THRESH.REACHED 400.00 0.00 0.00 0.00
DSETS CONVERTED R/W -> R/O 16695.00 0.30 0.04 0.01
TOTAL GENERAL QUANTITY /SECOND /THREAD /COMMIT
-------------------------- -------- ------- ------- ------
...
NUMBER OF DATASET OPENS 20528.00 0.24 2.59 0.01
NUMBER OF DATASET OPENS < 0.1 per second
• Frequent dataset opens mean high DBM1 TCB time and/or high Accounting Class 3 Dataset Open time
• Could be caused by hitting DSMAX too frequently
o Increase DSMAX in small increments
20. DB2 System and Application Monitoring & Tuning
• Dataset Open/Close – Tuning
DSETS CONVERTED R/W -> R/O < 10-15 per minute
• Possible side effects of pseudo close activity (RW->RO switching)
o Expensive SYSLGRNX processing
o Growth in SYSLGRNX pageset
o Frequent dataset close and re-open
o Ping-pong in and out of GBP dependency
• Expensive scan of XXXL local buffer pools
• Recommendations
o Take frequent system checkpoints
• Set CHKFREQ=2-5 (minutes)
o Adjust PCLOSEN/T to avoid too frequent pseudo closes
o Use CLOSE(YES) as a design default
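The two dataset open/close rules of thumb (dataset opens below 0.1 per second, R/W to R/O conversions below 10-15 per minute) can be checked against the /SECOND column of the statistics report. This is a minimal sketch: the function name is hypothetical, and using 15 per minute as the upper limit is an assumption taken from the top of the quoted range.

```python
# Sketch: compare per-second rates from the statistics report with the
# rules of thumb quoted on the slides.
def dataset_open_close_findings(opens_per_second, rw_to_ro_per_second):
    """Return a list of findings for the dataset open/close indicators."""
    findings = []
    if opens_per_second >= 0.1:
        findings.append(
            f"dataset opens {opens_per_second:.2f}/s exceed 0.1/s: review DSMAX"
        )
    rw_to_ro_per_minute = rw_to_ro_per_second * 60
    if rw_to_ro_per_minute > 15:
        findings.append(
            f"R/W->R/O conversions {rw_to_ro_per_minute:.0f}/min exceed 15/min: "
            "review PCLOSEN/PCLOSET"
        )
    return findings


# /SECOND values from the report: 0.24 dataset opens, 0.30 R/W -> R/O.
findings = dataset_open_close_findings(0.24, 0.30)
for finding in findings:
    print(finding)
```

In the sample report both indicators are above their limits: 0.24 opens per second and about 18 conversions per minute.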
21. DB2 System and Application Monitoring & Tuning
• Lock Avoidance
LOCKING ACTIVITY QUANTITY /SECOND /THREAD /COMMIT
------------------------ -------- ------- ------- -------
...
LOCK REQUESTS 1007.5M 11.7K 127.1K 40.35
UNLOCK REQUESTS 102.1M 1181.27 12.9K 4.09
• Benefits of lock avoidance
o Increase in concurrency
o Decrease in lock / unlock requests, with an associated decrease in CPU resource consumption and data sharing overhead
• Plans, packages have a better chance for lock avoidance if they are bound with
ISOLATION(CS) and CURRENTDATA(NO)
Lock avoidance may not be working effectively if Unlock requests/commit is high, e.g. >5
• High Unlock requests/commit could also be possible from
o Large number of relocated rows after update of a compressed or variable-length row
o Large number of pseudo-deleted entries in unique indexes
o Both can be eliminated by REORG
22. DB2 System and Application Monitoring & Tuning
• Lock Tuning – From Accounting Trace!
DDOC510B               AVERAGE     TOTAL
---------------------  --------  --------
TIMEOUTS                   0.00         0
DEADLOCKS                  0.00         0
ESCAL.(SHARED)             0.00         0
ESCAL.(EXCLUS)             0.00         0
MAX PG/ROW LOCKS HELD      0.69     10685
• MAX PG/ROW LOCKS HELD / MAX PG/ROW LCK HELD: the maximum number of page or row locks concurrently held against all table spaces by a single application during its execution. It is a high-water mark. It cannot exceed the “Locks per user” parameter on panel DSNTIPJ.
• MAX PG/ROW LOCKS HELD is a useful indicator of commit frequency
o AVERAGE is the average of MAX, TOTAL is the max of MAX (across Accounting records)
o In general, try to issue Commit often enough to keep max. locks held below 100
• The left column shows the average of "MAX PG/ROW LOCKS HELD" across all occurrences; the right column contains the aggregated value across all threads reported under this ACCTG Report.
• So if transaction A had max. locks of 10 and transaction B had 20, then AVERAGE (avg. of max.) = 15 and TOTAL (max. of max.) = 20
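The two-transaction example can be written out directly. This is a minimal sketch of the "average of max" and "max of max" aggregation as described on the slide; the dictionary of per-transaction values is the slide's own example, not real trace data.

```python
from statistics import mean

# High-water mark of page/row locks held, one value per accounting record.
max_locks_held = {"A": 10, "B": 20}

average_of_max = mean(max_locks_held.values())  # AVERAGE column
max_of_max = max(max_locks_held.values())       # TOTAL column

print(f"AVERAGE (avg. of max.) = {average_of_max}")
print(f"TOTAL (max. of max.)   = {max_of_max}")
```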
23. DB2 System and Application Monitoring & Tuning
• Sort Indicators
BP7 SORT/MERGE               QUANTITY  /SECOND  /THREAD  /COMMIT
---------------------------  --------  -------  -------  -------
MAX WORKFILES CONCURR. USED   1175.19      N/A      N/A      N/A
MERGE PASSES REQUESTED        3045.00     0.04     0.09     0.00
MERGE PASS DEGRADED-LOW BUF      6.00     0.00     0.00     0.00
WORKFILE REQ.REJCTD-LOW BUF  71599.00     0.84     2.13     0.00
WORKFILE REQ-ALL MERGE PASS    102.9K     1.20     3.07     0.00
WORKFILE NOT CREATED-NO BUF      0.00     0.00     0.00     0.00
WORKFILE PRF NOT SCHEDULED       0.00     0.00     0.00     0.00
• Sort Pool
• Sort Buffer Pool
[Figure: sort data flow. Input phase (Sort Pool: sort tree and runs, up to 64 M, one per thread): rows to be sorted are written to work files. Sort phase (Buffer Pool): work files (LWF) are merged into sorted rows.]
24. DB2 System and Application Monitoring & Tuning
• Sort Indicators
BP7 SORT/MERGE QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
MERGE PASSES REQUESTED 3045.00 0.04 0.09 0.00
MERGE PASS DEGRADED-LOW BUF 6.00 0.00 0.00 0.00
• MERGE PASSES DEGRADED = number of times more than one merge pass was needed because workfile requests were rejected due to the buffer pool being too small.
• Merge Passes Degraded should be < 1 to 5% of Merge Passes Requested (5% of 3045 is about 152 in this case; the observed value is 6)
BP7 SORT/MERGE QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
WORKFILE REQ.REJCTD-LOW BUF 71599.00 0.84 2.13 0.00
• WORKFILE REQUESTS REJECTED = number of workfile requests rejected because buffer pool is
too small to support concurrent sort activity
• Workfile Req. Rejected < 1 to 5% of Workfile Req. All Merge Passes
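Both sort ratios can be computed from the BP7 SORT/MERGE figures shown above. This is a minimal sketch: the function name is hypothetical, and 102.9K is treated as 102900, which is an approximation because the report rounds the value.

```python
# Sketch: express a sort counter as a percentage of its base counter.
def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100.0


merge_passes_requested = 3045
merge_passes_degraded = 6
workfile_requests_all = 102_900  # report shows 102.9K (rounded)
workfile_requests_rejected = 71_599

merge_degraded_pct = percent_of(merge_passes_degraded, merge_passes_requested)
workfile_rejected_pct = percent_of(workfile_requests_rejected, workfile_requests_all)

print(f"merge passes degraded:      {merge_degraded_pct:.2f}%")
print(f"workfile requests rejected: {workfile_rejected_pct:.2f}%")
```

With these figures the degraded merge passes stay well under 1%, while the rejected workfile requests are far above the 1 to 5% guideline.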
25. DB2 System and Application Monitoring & Tuning
• Sort Indicators
BP7 READ OPERATIONS QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
SYNCHRONOUS READS 505.4K 5.91 15.06 0.00
...
SEQUENTIAL PREFETCH REQUEST 646.6K 7.56 19.27 0.00
SEQUENTIAL PREFETCH READS 633.0K 7.40 18.87 0.00
PAGES READ VIA SEQ.PREFETCH 3969.4K 46.43 118.31 0.02
S.PRF.PAGES READ/S.PRF.READ 6.27
• SYNCHRONOUS READS >> buffer pool shortage or too few physical work files
• Sync Reads should be < 1 to 5% of pages read by prefetch (about 12% in this case: roughly 500K / 4000K)
• Prefetch quantity = PRF.PAGES READ / PRF.READ
• If prefetch quantity < 4, increase the buffer pool size (6.27 in this case)
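The two workfile buffer pool checks reduce to two divisions on the BP7 READ OPERATIONS figures. This is a minimal sketch with values in thousands as printed in the report; the variable names are chosen for the illustration.

```python
# Values from the BP7 READ OPERATIONS report, in thousands (K).
synchronous_reads_k = 505.4
sequential_prefetch_reads_k = 633.0
pages_read_via_prefetch_k = 3969.4

# Synchronous reads as a share of pages read by prefetch (guideline: 1 to 5%).
sync_read_ratio = synchronous_reads_k / pages_read_via_prefetch_k

# Average pages per prefetch read (guideline: at least 4).
prefetch_quantity = pages_read_via_prefetch_k / sequential_prefetch_reads_k

print(f"sync reads / prefetch pages: {sync_read_ratio:.1%}")
print(f"prefetch quantity:           {prefetch_quantity:.2f}")
```

The exact ratio is closer to 12.7%; the slide's 12% comes from rounding the inputs to 500 and 4000.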
26. DB2 System and Application Monitoring & Tuning
• LOCKING REPORT - SUSPENSION
LOCKING CONFLICTS
XLOK
+Stat Plan Corrid Type Lvl Resource
+---- -------- ------------ ---- --- ------------------------------------------
+ OWN DISTSERV w3wp.exe ROW X Res ID = 0C0000180158002800022C1C
+WAIT DISTSERV w3wp.exe ROW S Res ID = 0C0000180158002800022C1C
.....
Thread: Plan=DISTSERV Connid=SERVER Corrid=w3wp.exe Authid=USRTIB
Thread Status = IN-SQL-CALL SQL Request Type = DYNAMIC
SQL DBRM Name = SYSSH200 SQL Statement Number = 00000
+ insert into SCTIB.LOGON_HST(CORPORATE_USER_CUSTOMER_ID, CUSTOMER_ID, WEB
....
• Problem is noticed during Real time Monitoring!...
27. DB2 System and Application Monitoring & Tuning
• LOCKING REPORT - SUSPENSION
a) LOCKING REPORT LEVEL (SUSPENSION) INTERVAL FROM: 16-07-12 11:41:25.85
b) ORDER: PRIMAUTH-PLANNAME TO: 16-07-12 11:41:32.63
7 sec!...
USRTIB
DISTSERV
--- L O C K R E S O U R C E --- TOTAL
TYPE NAME SUSPENDS
--------- ----------------------- --------
PAGE DB =DBTIB 248
OB =LOGONHST
PAGE=X'25DFDA'
BPID=BP0
BP0 = 5000 / BP8K = 80000
28. DB2 System and Application Monitoring & Tuning
• AUDIT REPORT - DETAIL
AUDIT REPORT LEVEL(DETAIL) TYPE(ALL)
PRIMAUTH CORRNAME CONNTYPE
ORIGAUTH CORRNMBR INSTANCE
PLANNAME CONNECT MEMBER TIMESTAMP TYPE DETAIL
-------- -------- ------------ ----------- -------- ---------------------------------------
DB2SYS1 DB2SQL TSO 13:30:15.51 AUTHCNTL GRANTOR: IS95914 OWNER TYPE: PRIM/SECOND
DB2SYS1 'BLANK' C9D256E9A77A SQLCODE: -551
DSNTEP91 BATCH PR4B OBJECT TYPE: DATABASE
TEXT: GRANT DBADM ON DATABASE DSN8D91A TO PUBLIC
AUDIT Trace – Class 2 is enough!
IFCID 4 & 141
DB2 VERSION: V10
INPUT INPUT PROCESSED PROCESSED INPUT INPUT PROCESSED PROCESSED
IFCID COUNT PCT OF TOTAL COUNT PCT OF TOTAL IFCID COUNT PCT OF TOTAL COUNT PCT OF TOTAL
------- ---------- ------------ ---------- ------------ ------- ---------- ------------ ---------- ------------
4 1 25.00% 1 25.00% 141 3 75.00% 3 75.00%
29. DB2 System and Application Monitoring & Tuning
• I/O ACTIVITY REPORT - SUMMARY
IOACTIVITY REPORT
BUFFER POOL                    TOTALS       AET
----------------------------  --------  ---------
TOTAL I/O REQUESTS                2306   0.003214
TOTAL READ I/O REQUESTS           2114   0.003400
NON-PREFETCH READS                1900
PREFETCH READS
  WITHOUT I/O                      193
  WITH I/O                          21
  PAGES READ                       349
  PAGES READ / SUCC READ         16.62
• PREFETCH READS WITHOUT I/O: all the pages requested by a prefetch read were already in the buffer pool.
• PREFETCH READS WITH I/O: the number of successful prefetch reads.
• PREFETCH READS: an aggregate of all types of prefetch: sequential prefetch (bind time) + list prefetch + dynamic prefetch (run time)
• VPSEQT: sequential steal threshold for the buffer pool
31. DB2 System and Application Monitoring & Tuning
• Stored Procedure Monitoring
Issues with Plan and Package Level SP Analysis
• Multiple SPs called in a transaction are summed at the plan level. By definition this affects the analysis of nested SPs.
• Package level analysis can be difficult if an SP executes different paths and SQL based on parameters. How do you differentiate between the invocations?
• Package level analysis does not apply to SPs that do not execute SQL
32. Enhanced Instrumentation for Stored Procedure
Performance Analysis
• PM53243 (DB2 10): new IFCIDs 380 and 381 are created for Stored Procedure and User-Defined Function detail respectively. These records:
o Identify the stored procedure or UDF beginning or ending
o Include the current CP, specialty engine, and elapsed time details for nested activity
• These records can be used to determine the CP, specialty engine, and elapsed time for a given SP or UDF invocation
• Additionally, PM53243 (DB2 10) added IFCIDs 497, 498 and 499 for SQL drill-down analysis. These records contain the dynamic or static statement IDs for non-nested, UDF, and SP work respectively.
• The statement IDs can be correlated to IFCID 316 dynamic statement or IFCID 401 static statement cache data.
• ACCOUNTING REPORT ORDER (PACKAGE) versus ACCOUNTING REPORT ORDER (ACTNAME) (IFCID 233 data needed!) UK70741
33. Enhanced Instrumentation for Stored Procedure
Performance Analysis
[Figure: flow of a stored procedure call across Client, DDF, DBM1 and SPAS: Connect, CALL mySP (:p1), SQL1, SQL2, Insert, Open, Return, Fetch to fill row buffer, Commit]
• IFCID 380 is written at mySP begin. It will contain 0’s for current CP, specialty engine and elapsed times.
• IFCID 499 is written with all statement IDs executed in the SP (i.e., SQL1, SQL2).
• IFCID 380 is written at mySP end. It will contain values that can be compared to the begin IFCID 380 record for mySP.
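Comparing the begin and end IFCID 380 records amounts to subtracting one set of counters from the other. The sketch below is purely illustrative: the `SpRecord` class, its field names and the sample numbers are all hypothetical and do not reflect the real IFCID 380 record layout or any measured data.

```python
from dataclasses import dataclass


@dataclass
class SpRecord:
    """Hypothetical, simplified view of the timers in a begin or end record."""
    cp_time: float                # seconds on general-purpose CPs
    specialty_engine_time: float  # seconds on specialty engines
    elapsed_time: float           # elapsed seconds


def invocation_cost(begin, end):
    """Return the time attributed to one SP invocation (end minus begin)."""
    return SpRecord(
        cp_time=end.cp_time - begin.cp_time,
        specialty_engine_time=end.specialty_engine_time - begin.specialty_engine_time,
        elapsed_time=end.elapsed_time - begin.elapsed_time,
    )


# Illustrative values only: the begin record carries zeros, as on the slide.
begin_record = SpRecord(cp_time=0.0, specialty_engine_time=0.0, elapsed_time=0.0)
end_record = SpRecord(cp_time=0.002, specialty_engine_time=0.0, elapsed_time=0.010)

cost = invocation_cost(begin_record, end_record)
print(cost)
```

Taking the difference rather than reading the end record alone also works for nested calls, where the begin record would not be all zeros.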
34. DB2 Tuning and Automobile
• DB2 is like an automobile
• When it needs tuning, it slows down as workloads increase and consumes more MIPS.
• Tune it up and it takes on more workload, performing well within the limits of SLAs.
• A modern automobile engine is highly complex, with a multitude of interconnected components. It requires an experienced mechanic using sophisticated tools.
• Likewise, DB2 is highly complex. Tuning it requires a skilled technician using sophisticated tools.*
* Get More Speed With Less Fuel Through DB2 Tuning by Kevin Baker in Enterprise Tech Journal