DB2 SQL Tuning and BMC Catalog Manager


DB2 for z/OS Version 8
An introduction for beginners to SQL coding strategies and guidelines, optimization, filters and predicates.

Published in: Technology

  1. 1. DB2 z/OS v8 - SQL Tuning
  2. 2. Overview: Understanding the DB2 Optimizer; SQL Coding Strategies & Guidelines; Filter Factor; Stage 1 & Stage 2 Predicates; Explain table; How to interpret the Explain tables; Using monitoring tools to understand the performance of SQL statements (BMC Apptune, BMC SQL Explorer)
  3. 3. SQL Coding Strategies & Guidelines [Diagram: SQL enters the DB2 Optimizer, which is cost-based and uses query cost formulas and the DB2 Catalog to produce an optimized access path] The DB2 Optimizer:  Determines database navigation  Parses SQL statements for the tables and columns which must be accessed  Queries statistics from the DB2 Catalog (populated by the RUNSTATS utility)  Determines the least expensive access path  Checks authorization  The DB2 Optimizer is cost-based and chooses the least expensive access path
  4. 4. SQL Coding Strategies & Guidelines  Avoid unnecessary execution of SQL  Consider accomplishing as much as possible with a single call, so as to minimize table access as far as possible.  Limit the data selected (rows & columns) using SQL and avoid filtering using Application programs.  As far as possible, Code predicates on Indexable columns  Use equivalent data types for comparison. This avoids the data type conversion overhead.  JOIN tables on Indexed columns.  Avoid Cartesian Products.  The DISTINCT, ORDER BY, GROUP BY, UNION clauses involve a SORT operation. Use these clauses only if absolutely necessary.
  5. 5. SQL Coding Strategies & Guidelines. Cursor Usage Tips:  Use a singleton SELECT statement (SELECT … INTO :<host variables>) if you need to retrieve only one row. This performs far better than a cursor.  Use cursors when more than one row must be retrieved. Cursors carry the overhead of OPEN, FETCH and CLOSE.  To update rows through a cursor, use the FOR UPDATE OF clause.  Use the FOR FETCH ONLY clause when the cursor is used for data retrieval only. The FOR READ ONLY clause provides the same functionality and is ODBC compliant.  Use the WITH HOLD clause if you don't want DB2 to automatically close the cursor when the application issues a COMMIT statement. Static vs. Dynamic SQL:  The access path for dynamic SQL is determined at run time, which results in additional overhead. Also, users need direct access to the tables.  The access path for static SQL is determined at bind time and reused at run time. Users need only EXECUTE authority on the plan.
  6. 6. SQL Coding Strategies & Guidelines. UNION and UNION ALL:  OR predicates across different columns often cannot use a single matching index. Consider rewriting the query as the union of two SELECT statements to make index access possible.  UNION ALL keeps duplicates and therefore does not need a sort to remove them. The BETWEEN clause:  BETWEEN is usually more efficient than using the <= and >= operators, except when comparing a host variable to two columns.  Stage 2: WHERE :hostvar BETWEEN col1 AND col2  Stage 1: WHERE col1 <= :hostvar AND col2 >= :hostvar
  7. 7. SQL Coding Strategies & Guidelines. Use IN instead of LIKE:  If you know that only a certain number of values exist and can be put in a list, use IN or BETWEEN, for example IN ('Value1', 'Value2', 'Value3') or BETWEEN :valuelow AND :valuehigh, rather than LIKE 'Value_'. Use LIKE with care:  Avoid the % or the _ at the beginning of the pattern because it prevents DB2 from using a matching index and may cause a scan.  Use the % or the _ at the end of the pattern to encourage index usage.
  8. 8. SQL Coding Strategies & Guidelines. Use the NOT operator with care:  Predicates formed using NOT (except NOT EXISTS) are Stage 1, but are not indexable.  For subqueries that use negation logic, use NOT EXISTS instead of NOT IN. Code the most restrictive predicate first:  After the indexable predicates, place the predicate that will eliminate the greatest number of rows. Avoid arithmetic in predicates:  An index is not used for a column when the column appears in an arithmetic expression.  Such a predicate is not indexable (and is evaluated at Stage 2 in DB2 V8).
  9. 9. SQL Coding Strategies & Guidelines. Nested loop join is efficient when:  The outer table is small. Predicates with a small filter factor reduce the number of qualifying rows in the outer table.  The number of data pages accessed in the inner table is also small.  A highly clustered index is available on the join columns of the inner table.  Filtering is high for both the outer and the inner table.  This is the most common join method. Merge scan join is used when:  The numbers of qualifying rows in the inner and outer tables are large and the join predicates do not provide much filtering.  The tables are large and have no indexes with matching columns. Hybrid join is used when:  A non-clustered index is available on the join column of the inner table and there are duplicate qualifying rows in the outer table.
  10. 10. SQL Coding Strategies & Guidelines. Join Types & Join Predicate Considerations:  Provide accurate JOIN predicates.  Avoid a JOIN without a predicate (Cartesian join).  Join on indexed columns.  Prefer joins over subqueries.  When the results of a join must be sorted, limiting the ORDER BY to columns of a single table can sometimes avoid a sort; specifying columns from multiple tables always involves a sort.  Favor coding LEFT OUTER joins over RIGHT OUTER joins, as DB2 always converts RIGHT joins to LEFT joins before executing them.
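The BETWEEN rewrite on this slide can be checked for result equivalence with a small sketch. It uses Python's standard `sqlite3` module purely as a stand-in: SQLite has no Stage 1 / Stage 2 distinction, so this only confirms that the two predicate forms return the same rows, not that DB2 processes them differently. The table `price_band` and its data are hypothetical.

```python
import sqlite3

# In-memory stand-in database; table name and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_band (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO price_band VALUES (?, ?)",
    [(1, 10), (5, 20), (15, 30), (25, 40)],
)

hostvar = 12  # plays the role of :hostvar

# Form the slide labels Stage 2 in DB2: host variable BETWEEN two columns.
between_form = conn.execute(
    "SELECT col1, col2 FROM price_band WHERE ? BETWEEN col1 AND col2 ORDER BY col1",
    (hostvar,),
).fetchall()

# Form the slide labels Stage 1 in DB2: two simple range predicates.
range_form = conn.execute(
    "SELECT col1, col2 FROM price_band "
    "WHERE col1 <= ? AND col2 >= ? ORDER BY col1",
    (hostvar, hostvar),
).fetchall()

print(between_form, range_form)
```

Both queries select the single band that contains the value 12, so the rewrite changes only how the predicate is expressed, not the result.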
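One caveat when replacing NOT IN with NOT EXISTS, added here as an editorial aside rather than something the slide states: under standard SQL rules the two forms return the same rows only when the subquery column contains no NULLs. The sketch below shows this with Python's standard `sqlite3` module as a stand-in for DB2; the `dept` and `emp` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (deptno TEXT);
    CREATE TABLE emp (empno INTEGER, workdept TEXT);
    INSERT INTO dept VALUES ('A00'), ('B01'), ('C01');
    INSERT INTO emp VALUES (1, 'A00'), (2, 'B01');
    """
)

NOT_IN = (
    "SELECT deptno FROM dept "
    "WHERE deptno NOT IN (SELECT workdept FROM emp) ORDER BY deptno"
)
NOT_EXISTS = (
    "SELECT deptno FROM dept d "
    "WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.workdept = d.deptno) "
    "ORDER BY deptno"
)


def run(sql):
    """Return the first column of every result row as a list."""
    return [row[0] for row in conn.execute(sql)]


# No NULLs in emp.workdept: both forms find the department without employees.
without_nulls = (run(NOT_IN), run(NOT_EXISTS))

# One NULL in the subquery column makes NOT IN evaluate to unknown for
# every non-matching row, so it returns nothing; NOT EXISTS is unaffected.
conn.execute("INSERT INTO emp VALUES (3, NULL)")
with_nulls = (run(NOT_IN), run(NOT_EXISTS))

print(without_nulls, with_nulls)
```

If the subquery column is nullable, check which behaviour the application expects before swapping one form for the other.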
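The difference between the first two join methods can be sketched in plain Python. This is a toy model, not DB2's implementation: rows are `(join_key, payload)` tuples, a dictionary stands in for an index on the inner table's join column, and all names and data are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner):
    """For each outer row, probe the inner table through an 'index' on the key."""
    inner_index = defaultdict(list)  # stand-in for an index on the join column
    for key, payload in inner:
        inner_index[key].append(payload)
    return [
        (key, outer_payload, inner_payload)
        for key, outer_payload in outer
        for inner_payload in inner_index.get(key, [])
    ]


def merge_scan_join(outer, inner):
    """Sort both inputs on the join key, then advance through them in step."""
    outer_sorted = sorted(outer, key=lambda row: row[0])
    inner_sorted = sorted(inner, key=lambda row: row[0])
    result = []
    i = j = 0
    while i < len(outer_sorted) and j < len(inner_sorted):
        outer_key, inner_key = outer_sorted[i][0], inner_sorted[j][0]
        if outer_key < inner_key:
            i += 1
        elif outer_key > inner_key:
            j += 1
        else:
            # Find the run of inner rows sharing this key, then pair it
            # with every outer row that has the same key.
            j_end = j
            while j_end < len(inner_sorted) and inner_sorted[j_end][0] == outer_key:
                j_end += 1
            while i < len(outer_sorted) and outer_sorted[i][0] == outer_key:
                for k in range(j, j_end):
                    result.append((outer_key, outer_sorted[i][1], inner_sorted[k][1]))
                i += 1
            j = j_end
    return result


outer_rows = [(10, "A"), (20, "B"), (20, "C"), (30, "D")]
inner_rows = [(20, "x"), (20, "y"), (30, "z"), (40, "w")]

nested = nested_loop_join(outer_rows, inner_rows)
merged = merge_scan_join(outer_rows, inner_rows)
print(nested)
print(merged)
```

Both functions return the same joined rows. The nested loop version does one probe per outer row, which is why a small outer table and a good inner index matter; the merge scan version pays for two sorts up front but then reads each input only once.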
  11. 11. SQL Coding Strategies & Guidelines Sub-Query Guidelines – If there are efficient indexes available on the tables in the subquery, then a correlated subquery is likely to be the most efficient kind of subquery. – If there are no efficient indexes available on the tables in the subquery, then a non-correlated subquery would likely perform better. – If there are multiple subqueries in any parent query, make sure that the subqueries are ordered in the most efficient manner.
  12. 12. SQL Coding Strategies & Guidelines. Enhanced Techniques for Performance Improvement.  Use OPTIMIZE FOR n ROWS:  DB2 assumes that only the stated number of rows will be retrieved by the query when it chooses the access path. It is effectively a hint to the DB2 Optimizer.  This does not stop the user from accessing the entire result set.  This is not useful when DB2 has to gather the whole result set before returning the first n rows.  With this clause, DB2 optimizes the query for quicker response.  Updating catalog tables:  If RUNSTATS is too costly or cannot be executed, the catalog statistics can be updated manually.
  13. 13. SQL Coding Strategies & Guidelines. Enhanced Techniques for Performance Improvement. Influencing the access path by adding an extra predicate:  DB2 evaluates the access path based on information available in the catalog tables.  Wrong or unavailable catalog information may result in selection of the wrong access path.  A wrong access path could be a wrong index selection, or an index selected where a tablespace scan would be more effective.  Code extra predicates or change the predicate to make DB2 use a different access path.  Adding an extra predicate may also influence the selection of the join method.  If you add an extra predicate, nested loop join may be selected because DB2 assumes that more rows will be filtered out. The proper type of predicate to add is WHERE T1.C1 = T1.C1.  Hybrid join is a costlier method. Outer joins do not use hybrid join, so if DB2 chooses hybrid join, convert the inner join to an outer join and add extra predicates to remove unneeded rows.
  14. 14. SQL Coding Strategies & Guidelines. Summary: general recommendations. Make sure that:  The queries are as simple as possible.  Unused rows are not fetched. Filtering is done by DB2, not in the application program.  Unused columns are not selected.  There is no unnecessary ORDER BY or GROUP BY clause.  Page-level locking is used and lock duration is minimized.  Mass updates are avoided.  Indexable predicates are used wherever possible.  Redundant predicates are not coded.  The declared length of a host variable is not greater than the length attribute of the data column.  If there are efficient indexes available on the tables in the subquery, a correlated subquery is used; otherwise a non-correlated subquery is likely to perform better.  If there are multiple subqueries, they are ordered in an efficient manner.
  15. 15. Filter Factor (FF)  The optimizer assigns a “Filter Factor” (FF) to each predicate or predicate combination. – A number between 0 and 1 that gives the estimated fraction of rows that qualify.  An FF of 0.25 means 25% of the rows are estimated to qualify. – Calculated using available statistics from the catalog tables: • Column cardinality (COLCARDF) • HIGH2KEY/LOW2KEY • Frequency statistics (FREQUENCYF in SYSCOLDIST)
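As a rough illustration of how these statistics turn into a filter factor, the sketch below applies two simplified estimates: 1/COLCARDF for an equality predicate on uniformly distributed values, and linear interpolation between LOW2KEY and HIGH2KEY for a range predicate. The function names and sample numbers are assumptions made for this example; the real optimizer also takes frequency statistics and other rules into account.

```python
def equality_filter_factor(colcardf):
    """FF for COL = value, assuming values are uniformly distributed."""
    return 1.0 / colcardf


def less_than_filter_factor(value, low2key, high2key):
    """FF for COL < value, interpolated linearly between LOW2KEY and HIGH2KEY."""
    ff = (value - low2key) / (high2key - low2key)
    return min(1.0, max(0.0, ff))  # keep the estimate within 0..1


def estimated_rows(cardf, filter_factor):
    """Rows expected to qualify: table cardinality times the filter factor."""
    return cardf * filter_factor


# Hypothetical statistics: a column with 4 distinct values in a 10,000-row table.
ff_equal = equality_filter_factor(colcardf=4)
ff_range = less_than_filter_factor(value=250, low2key=0, high2key=1000)
rows = estimated_rows(cardf=10_000, filter_factor=ff_equal)

print(ff_equal, ff_range, rows)
```

With four distinct values the equality predicate gets an FF of 0.25, matching the slide's example of 25% of rows qualifying.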
  16. 16. RUNSTATS  RUNSTATS is a DB2 utility which provides catalog statistics used by the optimizer and statistics related to the organization of an object (TS / TB / IX / CO)  Accurate Statistics are a critical factor for performance of the SQL.  Updates the DB2 catalog and reports the statistics.  Some catalog statistics updated by RUNSTATS for use by the optimizer can be manually updated with appropriate authorization (SYSADM).
  18. 18. Stats Used for Access Path Determination  SYSINDEXPART – LIMITKEY  SYSTABLES – CARDF – EDPROC – NPAGES – PCTROWCOMP
  19. 19. Stage 1 vs. Stage 2 Predicates  Stage 1 predicates may use an available Index.  Stage 2 predicates cannot use any Index.
  20. 20. Stage 1 vs. Stage 2 Predicates. Wherever possible, prefer Stage 1 (sargable) predicates in the WHERE clause. These are conditions that can be evaluated in the Data Manager (DM) of DB2, before the results are passed to the Relational Data System (RDS). The more conditions that can be evaluated early on, the more efficient data retrieval is. Stage 1 refers to the DM; for a Stage 1 predicate to use an index, a suitable index must exist, which reduces disk I/O and buffer pool activity. Stage 2 refers to the RDS.
  21. 21. How does the optimizer calculate Filter Factors?  The lower the filter factor, the lower the cost and, in general, the more efficient the query will be.
  22. 22. Explain  A tool that shows the access path used by a query.  Results of Explain are stored in the table PLAN_TABLE.  Explain can be run for a single query outside a program or for all queries in a program.  For all queries in a program, use the EXPLAIN(YES) parameter during BIND. [Sample Explain table output]
  23. 23. Explain  Explain can be run at bind time using the parameter value EXPLAIN(YES).  A PLAN_TABLE must already exist under the OWNER parameter value on BIND, or under the current SQLID for dynamic SQL.  Explain can also be run against dynamic SQL: DELETE FROM PLAN_TABLE WHERE QUERYNO = 999; EXPLAIN PLAN SET QUERYNO = 999 FOR <SELECT STATEMENT GOES HERE - USE ? IN PLACE OF HOST VARIABLES>; SELECT * FROM PLAN_TABLE WHERE QUERYNO = 999 ORDER BY QBLOCKNO, PLANNO;  Don't forget to Explain everything.  PLAN_TABLE is where all the tuning starts.
  24. 24. Interpreting the Plan Table/Analyzing Access Paths  Non-matching index scan (ACCESSTYPE = I and MATCHCOLS = 0): Scans all leaf pages of the index selected by the optimizer, selecting one or more qualifying rows. The scan can be with or without data access. The predicate does not match the leading columns of the index. Examples: SELECT COUNT(*) FROM TABLEA; SELECT MAX(COL1) FROM TABLEA; SELECT COL1 FROM TABLEA WHERE COL2 = :HV
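For compound predicates, a common textbook simplification is to multiply filter factors for AND and to use inclusion-exclusion for OR, both under the assumption that the predicates are statistically independent. The sketch below applies that assumption; the predicate texts and FF values are hypothetical, and the real optimizer may use column correlation statistics instead.

```python
import math


def and_filter_factor(*filter_factors):
    """Combined FF for predicates joined by AND, assuming independence."""
    return math.prod(filter_factors)


def or_filter_factor(ff1, ff2):
    """Combined FF for two predicates joined by OR, assuming independence."""
    return ff1 + ff2 - ff1 * ff2


# Hypothetical predicates with made-up filter factors.
predicates = {
    "STATUS = 'A'": 0.5,
    "REGION = :HV": 0.1,
    "ORDER_DATE > :HV": 0.3,
}

# Most restrictive first: the lowest filter factor removes the most rows.
ranked = sorted(predicates, key=predicates.get)

combined_and = and_filter_factor(0.5, 0.1)
combined_or = or_filter_factor(0.5, 0.1)

print(ranked, combined_and, combined_or)
```

Ranking by filter factor is one way to see why a predicate with a low FF tends to lead to a cheaper access path.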
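The three-statement sequence on this slide can be generated for any statement with a small helper. This is a hypothetical convenience function, not part of DB2 or any BMC tool, and the host-variable pattern is deliberately naive: it does not handle indicator variables or colons inside string literals.

```python
import re

# Naive pattern for host variables such as :HV or :WS-CUST-ID (an assumption,
# not a full SQL parser).
HOST_VARIABLE = re.compile(r":[A-Za-z][A-Za-z0-9_-]*")


def build_explain_script(queryno, statement):
    """Return the DELETE / EXPLAIN / SELECT statements for one query number."""
    explainable = HOST_VARIABLE.sub("?", statement.strip().rstrip(";"))
    return [
        f"DELETE FROM PLAN_TABLE WHERE QUERYNO = {queryno}",
        f"EXPLAIN PLAN SET QUERYNO = {queryno} FOR {explainable}",
        f"SELECT * FROM PLAN_TABLE WHERE QUERYNO = {queryno} "
        "ORDER BY QBLOCKNO, PLANNO",
    ]


script = build_explain_script(999, "SELECT COL1 FROM TABLEA WHERE COL2 = :HV;")
for line in script:
    print(line + ";")
```

The output can be pasted into whatever interface is used to run dynamic SQL against the subsystem.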
  25. 25. [Diagram: non-matching index scan across the root page, non-leaf pages 1-2 and leaf pages 1-4]
  26. 26. Interpreting the Plan Table/Analyzing Access Paths  Matching index scan (MATCHCOLS > 0): Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows. The index match is based on one or more key columns of the selected index. The scan can be with or without data access. The predicates match the leading columns of the index. Examples: SELECT COL1 FROM TABLEA WHERE COL2 = :HV; SELECT COL2 FROM TABLEA WHERE COL1 = :HV (host variable length longer than COL1)
  27. 27. Interpreting the Plan Table/Analyzing Access Paths [Diagram: matching index scan, from the root page through non-leaf pages 1-2 and leaf pages 1-4 to the data pages]
  28. 28. Interpreting the Plan Table/Analyzing Access Paths One-fetch index access (ACCESSTYPE = I1): In certain circumstances this can be the most efficient access path in DB2. It may need to access only one leaf page, but may still need to traverse the index tree. It requires that only one row be retrieved (MIN or MAX column function). Examples: SELECT MIN(COL1) FROM TABLEA; SELECT MIN(COL2) FROM TABLEA WHERE COL1 = :HV (still I1, but with MATCHCOLS = 1)
  29. 29. Interpreting the Plan Table/Analyzing Access Paths  IN-list index scan (ACCESSTYPE = N):  Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.  The index match is based on one or more key columns of the selected index.  At least one key column is matched by an IN list. Examples: SELECT * FROM TABLEA WHERE COL1 = :HV AND COL2 IN ('A','B','C'); SELECT COL3 FROM TABLEA WHERE COL1 IN ('12345','56789') AND COL2 = :HV
  30. 30. Interpreting the Plan Table/Analyzing Access Paths Tablespace scan (ACCESSTYPE = R): A scan against a partitioned tablespace, or a simple tablespace with one table, reads all pages, including pages that are empty or contain only deleted rows. A scan against a simple tablespace containing more than one table also reads the rows of tables in that tablespace that are not part of the query. A scan against a segmented tablespace reads only pages containing data. Examples: SELECT * FROM TABLEA; SELECT * FROM TABLEA WHERE COL6 = 0; SELECT * FROM TABLEA WHERE COL1 <> :HV
  31. 31. Interpreting the Plan Table/Analyzing Access Paths [Diagram: tablespace scan across data pages 1-4]
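The difference between a matching and a non-matching index scan can be modelled with a sorted list standing in for the leaf level of a composite index on (COL1, COL2). This is a toy model with hypothetical entries and RIDs: a predicate on the leading column can position directly into the sorted entries, while a predicate on COL2 alone has to read every entry.

```python
import bisect

# Leaf-level entries of a hypothetical index on (COL1, COL2): (col1, col2, rid).
index_entries = sorted(
    [(1, "A", 101), (1, "B", 102), (2, "A", 103), (3, "C", 104), (3, "A", 105)]
)


def matching_scan(entries, col1_value):
    """Predicate on the leading column: position with a search, then stop early."""
    start = bisect.bisect_left(entries, (col1_value,))
    rids, entries_read = [], 0
    for col1, _col2, rid in entries[start:]:
        entries_read += 1
        if col1 != col1_value:
            break  # past the matching range, nothing further can qualify
        rids.append(rid)
    return rids, entries_read


def non_matching_scan(entries, col2_value):
    """Predicate on a non-leading column: every entry must be examined."""
    rids = [rid for _col1, col2, rid in entries if col2 == col2_value]
    return rids, len(entries)


matching_result = matching_scan(index_entries, 2)
non_matching_result = non_matching_scan(index_entries, "A")
print(matching_result, non_matching_result)
```

The matching scan reads two entries to find one RID, while the non-matching scan reads all five entries regardless of how many qualify, which mirrors the MATCHCOLS > 0 and MATCHCOLS = 0 cases above.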
  32. 32. DB2 I/O Assisted Mechanisms  Prefetch: Reading data ahead in anticipation of its use. Prefetch can read up to 32 4K pages for applications, and up to 64 4K pages for utilities.  Sequential prefetch: In DB2 UDB for OS/390, a mechanism that triggers consecutive asynchronous I/O operations. Pages are fetched before they are required, and several pages are read with a single I/O operation. This action is determined at bind time and can be detected by a value of “S” in the PREFETCH column of the plan table. If both index and data are required for the SQL, prefetch can occur for both object types.  Dynamic prefetch: Using the same approach as sequential prefetch, this mechanism is triggered at run time if DB2 detects that access to the index and/or data pages is sequential in nature but the pages are distributed in a nonconsecutive manner.  List prefetch: An access method that takes advantage of prefetching even in queries that do not access data sequentially. DB2 scans the index and collects RIDs before accessing any data pages. The RIDs are then sorted in page number order, and the data is prefetched using this list.
  33. 33. DB2 Explain Columns  QUERY Number (QUERYNO) – Identifies the SQL statement in the PLAN_TABLE (any number you assign; the example uses the numeric part of the userid).  BLOCK (QBLOCKNO) – Query block within the query number, where 1 is the top-level SELECT. Subselects, unions, materialized views, and nested table expressions will show multiple query blocks. Each query block has its own access path.  PLAN (PLANNO) – Indicates the order in which the tables will be accessed.
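The list prefetch steps (collect RIDs from the index, sort them by page number, then read the pages in that order) can be sketched as follows. RIDs are modelled as hypothetical `(page, slot)` pairs and the index entries are made up; real RIDs and prefetch quantities are internal to DB2.

```python
from itertools import groupby

# Qualifying index entries in key order: (key, (page, slot)). Hypothetical data.
index_entries = [
    ("ADAMS", (7, 1)),
    ("BAKER", (2, 3)),
    ("CLARK", (7, 2)),
    ("DAVIS", (2, 1)),
    ("EVANS", (5, 4)),
]


def list_prefetch_plan(entries):
    """Collect RIDs, sort by page number, and group the slots by page."""
    rids = sorted(rid for _key, rid in entries)
    return [
        (page, [slot for _page, slot in group])
        for page, group in groupby(rids, key=lambda rid: rid[0])
    ]


# Page visits if rows were fetched in index key order, without list prefetch.
pages_in_key_order = [rid[0] for _key, rid in index_entries]

plan = list_prefetch_plan(index_entries)
print(pages_in_key_order)
print(plan)
```

In key order the same pages are revisited (7, 2, 7, 2, 5), whereas the sorted RID list visits each of the three pages once, in ascending page order.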
  34. 34. DB2 Explain Columns  METHOD – Shows which JOIN technique was used: 00- First table accessed, continuation of previous table accessed, or not used. 01- Nested Loop Join. For each row of the present composite table, matching rows of a new table are found and joined 02- Merge Scan Join. The present composite table and the new table are scanned in the order of the join columns, and matching rows are joined. 03- Sorts needed by ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified predicate, or an IN predicate. This step does not access a new table. 04- Hybrid Join. The current composite table is scanned in the order of the join-column rows of the new table. The new table accessed using list prefetch.
  35. 35. DB2 Explain Columns  TNAME – name of the table whose access this row refers to. Either a table in the FROM clause, or a materialized VIEW name.  TYPE (ACCESS TYPE) – indicates whether an index was chosen:  I = INDEX  R = TABLESPACE SCAN (reads every data page of the table once)  I1 = ONE-FETCH INDEX SCAN  N = INDEX USING IN LIST  M = MULTIPLE INDEX SCAN  MX = NAMES ONE OF INDEXES USED  MI = INTERSECT MULT. INDEXES  MU = UNION MULT. INDEXES
  36. 36. DB2 Explain Columns  MC (MATCHCOLS) - number of index key columns used in a matching index scan  ANAME (ACCESS NAME) - name of the index  IO (INDEX ONLY) - Y = the index alone satisfies the data request; N = the table must be accessed also  Sort flags: there are eight sort flags in two groups of four, each indicating why a sort is necessary. Usually, a sort will cause the statement to run longer.  UNIQ - DISTINCT option or UNION was part of the query, or IN list for subselect  JOIN - sort for join  ORDERBY - ORDER BY option was part of the query  GROUPBY - GROUP BY option was part of the query
  37. 37. DB2 Explain Columns Sort flags for 'new' (inner) tables:  SNU - SORTN_UNIQ - Y = remove duplicates, N = no sort  SNJ - SORTN_JOIN - Y = sort table for join, N = no sort  SNO - SORTN_ORDERBY - Y = sort for order by, N = no sort  SNG - SORTN_GROUPBY - Y = sort for group by, N = no sort Sort flags for 'composite' (outer) tables:  SCU - SORTC_UNIQ - Y = remove duplicates, N = no sort  SCJ - SORTC_JOIN - Y = sort table for join, N = no sort  SCO - SORTC_ORDERBY - Y = sort for order by, N = no sort  SCG - SORTC_GROUPBY - Y = sort for group by, N = no sort  PF - PREFETCH - Indicates whether data pages were read in advance by prefetch.  S = pure sequential PREFETCH  L = PREFETCH through a RID list  Blank = unknown, or not applicable
  38. 38. DB2 Explain Columns  MIXOPSEQ: The sequence number of a step in a multiple index operation.  PAGE_RANGE: Whether the table qualifies for page range screening, so that plans scan only the partitions that are needed. Y = yes; blank = no.  COLUMN_FN_EVAL: When an SQL aggregate function is evaluated. R = while the data is being read from the table or index; S = while performing a sort to satisfy a GROUP BY clause; blank = after data retrieval and after any sorts.  QBLOCK_TYPE: For each query block, an indication of the type of SQL operation performed.  JOIN_TYPE: The type of join. F = FULL OUTER JOIN; L = LEFT OUTER JOIN; S = STAR JOIN; blank = INNER JOIN or no join. A RIGHT OUTER JOIN is converted to a LEFT OUTER JOIN, so JOIN_TYPE contains L. [Attached document: EXPLAIN Statements with examples.doc]
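The column meanings from the preceding slides can be collected into a small lookup that turns one PLAN_TABLE row into a readable sentence. This is an illustrative helper, not a BMC or IBM tool; the row is passed as a plain dictionary keyed by PLAN_TABLE column name, and only the codes listed on the slides are covered.

```python
JOIN_METHODS = {
    0: "first table accessed",
    1: "nested loop join",
    2: "merge scan join",
    3: "sort step",
    4: "hybrid join",
}

ACCESS_TYPES = {
    "I": "index scan",
    "I1": "one-fetch index scan",
    "N": "index scan using IN list",
    "R": "tablespace scan",
    "M": "multiple index scan",
    "MX": "index used in multiple index scan",
    "MI": "intersection of multiple indexes",
    "MU": "union of multiple indexes",
}

PREFETCH_TYPES = {"S": "sequential prefetch", "L": "list prefetch"}


def describe_step(row):
    """Summarize one PLAN_TABLE row (given as a dict) in plain words."""
    parts = [JOIN_METHODS.get(row["METHOD"], "unknown method")]
    access = ACCESS_TYPES.get(row["ACCESSTYPE"], "unknown access type")
    if row["ACCESSTYPE"] == "I":
        # MATCHCOLS separates matching from non-matching index scans.
        prefix = "matching " if row.get("MATCHCOLS", 0) > 0 else "non-matching "
        access = prefix + access
    parts.append(access)
    if row.get("INDEXONLY") == "Y":
        parts.append("index only")
    prefetch = PREFETCH_TYPES.get((row.get("PREFETCH") or "").strip())
    if prefetch:
        parts.append(prefetch)
    return ", ".join(parts)


step = describe_step(
    {"METHOD": 1, "ACCESSTYPE": "I", "MATCHCOLS": 2, "INDEXONLY": "N", "PREFETCH": "L"}
)
scan = describe_step(
    {"METHOD": 0, "ACCESSTYPE": "R", "MATCHCOLS": 0, "INDEXONLY": "N", "PREFETCH": "S"}
)
print(step)
print(scan)
```

A helper like this is mainly useful when reviewing many PLAN_TABLE rows at once, since tablespace scans and non-matching index scans stand out in the text.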
  39. 39. Performance Tools Overview  BMC APPTUNE  BMC SQL EXPLORER
  40. 40. BMC APPTUNE: Use Option 4 - Performance Products
  41. 41. BMC APPTUNE: Use Option Q - Apptune and Index components
  42. 42. BMC APPTUNE: Option 1 - SQL Workload
  43. 43. Setting Options in BMC APPTUNE Use Workload Analysis Choose 6. Data source 5. Time interval
  44. 44. Viewing Reports in APPTUNE Use Various Options To Generate Reports Reports Generated for Programs
  45. 45. Viewing SQL Statements in APPTUNE: Use Option S to show SQL statements. Use Option X to EXPLAIN SQL statements.
  46. 46. Example of EXPLAIN Result in BMC APPTUNE Cost Calculated by Optimizer Matching Index scan Performed Matching Columns used by index Table & Index names Used by access path
  47. 47. BMC SQL EXPLORER: Use Option S - SQL Explorer. Use Option 1 - Explain.
  48. 48. Setting Options in BMC SQL EXPLORER Plans or Packages or DBRMS can be analyzed Package options Analysis run in Batch Mode
  49. 49. More references BMC SQL EXPLORER.doct steps to get to Apptune.doc
  50. 50. Run-through of an Actual SQL Tuning Exercise
  51. 51. Set up Development Environment
  52. 52. Example of the SQL Tuning Process - Development. Step 1.3: Import Statistics From Production to Development. Use Option 7 - Migrate Access Path Statistics.
  53. 53. Example of the SQL Tuning Process - Development. Step 2: Identification of Problem SQL – Identify problem SQL. The SQL statement being analysed is shown. The tool warns that cardinality is missing. A predicate mismatch is also detected.
  54. 54. Example of the SQL Tuning Process - Development. Step 2: Identification of Problem SQL – Check SQL Best Practices. No tool is available for checking best practices. This needs to be checked manually using the published SQL Best Practices document. [Screenshot: a snippet of the related best practice from the SQL Guidelines document]
  55. 55. Example of the SQL Tuning Process - Development. Step 3: SQL Optimization – SQL Rewrite. No tool is available to rewrite SQL statements automatically. The statement needs to be rewritten manually, and the subsequent steps for checking the new access path then performed.
  56. 56. Example of the SQL Tuning Process - Development. Step 3: SQL Optimization – Compare Access Paths. Access paths can be compared. Notice the change in the estimated indicative cost. A different index is now being used.
  57. 57. Bibliography  Redbooks at www.redbooks.ibm.com: DB2 UDB for z/OS V8 Everything you ever wanted to know… (SG24-6079); DB2 UDB for z/OS V8 Performance Topics (SG24-6465); DB2 for z/OS Application Design for High Performance and Availability (SG24-7134, 10/05).  DB2 UDB for z/OS V8 Application Programming and SQL Guide.  SQL Tuning Best Practices & Guidelines document, in the IM Project & Document Database, Process Document section: 1) Open database 'IM Project and Document Database' 2) Select the 'Process Document' section 3) Select 'By Process Category' 4) Select 'Best Practices' 5) View 'Table of Contents' 6) Select document 'Database Access - SQL Tuning Best Practice & Guidelines'