 IBM Corporation 2003 Tampa Bay Relational Users Group Query diagnosis IBM Silicon Valley Lab, U.S.A.
Join indexes A <ul><li>Join access available through join the ‘D’ table only </li></ul><ul><ul><li>Via PART_NUM if ‘D’ is ...
Join indexes B SELECT COLS FROM PART  A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 ,  CONTRACTOR B CARDF=34,728 QUALIF...
Join indexes B <ul><li>Join access available through join the ‘C’ table only </li></ul><ul><ul><li>Via CONTRACTOR_ID if ‘C...
Join indexes C SELECT COLS FROM PART  A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFI...
Join indexes C <ul><li>Join access available through join the ‘B’, ‘D’, and ‘E’ tables </li></ul><ul><ul><li>Via CONTRACTO...
Join indexes D SELECT COLS FROM PART  A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFI...
Join indexes D <ul><li>‘ D’ is accessed in multiple directions </li></ul><ul><ul><li>Via PART_NUM if ‘A’ is the outer </li...
Join indexes E <ul><li>Join access available through C and E tables </li></ul><ul><ul><li>Both tables join on PRODUCT_ID c...
Join fan-out SELECT COLS FROM  PART  A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 ,  CONTRACTOR B CARDF=34,728 QUALIFI...
Explain <ul><li>Join  sequence </li></ul><ul><ul><li>Access ‘A’ via index IXPRT01 ( PART_CD, COUNTRY_CD , …) ~67 rows </li...
Issues – A as outer? <ul><li>Is local filtering to ‘A’ table accurate? </li></ul><ul><ul><li>There is skew, but use of mar...
Issues – B / C as outer? <ul><li>B as outer </li></ul><ul><ul><li>Less skew on B.PART_NUM = ? – less uncertainty in cost e...
Summary Query 2 <ul><li>Bottom line: </li></ul><ul><ul><li>Uniform distribution estimate on ‘A’ table allows it to compete...
Commentary <ul><li>How to perform SQL analysis </li></ul><ul><ul><li>Format query so it’s readable </li></ul></ul><ul><ul>...
  1. 1.  IBM Corporation 2003 Tampa Bay Relational Users Group Query diagnosis IBM Silicon Valley Lab, U.S.A.
  2. 2. Query analysis and tuning <ul><li>Format the SQL statement </li></ul><ul><ul><li>Prepare the statement for human tuning </li></ul></ul><ul><li>Separate sections for: </li></ul><ul><ul><li>SELECT list </li></ul></ul><ul><ul><li>FROM clause </li></ul></ul><ul><ul><li>WHERE clause </li></ul></ul><ul><ul><li>… </li></ul></ul><ul><li>Tools support </li></ul><ul><ul><li>Data Studio fixpack 2.2.0.1 includes SQL formatting </li></ul></ul><ul><ul><ul><li>Show transformed SQL text </li></ul></ul></ul>
  3. 3. Sample unformatted query <ul><li>EXPLAIN PLAN SET QUERYNO = 1 FOR </li></ul><ul><li>SELECT DISTINCT ITEM.ITEM_NBR AS ITEM_NBR, ITEM.PRDT_ID, STOREITEM.WK_STRT_DT AS WK_STRT_DT ,STOREITEM.DC_ID AS DC_ID FROM PROD.TIPA004_STITM_PROJ AS STOREITEM , PROD.TITM001_ITEM AS ITEM WHERE ITEM.BUS_UNIT_ID = 'GS' AND ITEM.BUS_UNIT_ID = STOREITEM.BUS_UNIT_ID AND ITEM.MJR_CATG_ID = '00754' AND ITEM.INTMD_CATG_ID = '00043' AND ITEM.ITEM_NBR = STOREITEM.ITEM_NBR AND ITEM.MJR_CATG_ID = STOREITEM.MJR_CATG_ID AND ITEM.INTMD_CATG_ID = STOREITEM.INTMD_CATG_ID AND STOREITEM.RTL_DEPT_NBR = 1 AND AD_ITEM_FLG = 'Y' AND WK_STRT_DT = '2002-02-08'; </li></ul>Unformatted SQL, where to start?
  4. 4. Formatted <ul><li>EXPLAIN PLAN SET QUERYNO = 1 FOR </li></ul><ul><li>SELECT DISTINCT ITEM.ITEM_NBR AS ITEM_NBR, ITEM.PRDT_ID, STOREITEM.WK_STRT_DT AS WK_STRT_DT ,STOREITEM.DC_ID AS DC_ID </li></ul><ul><li>FROM PROD.TIPA004_STITM_PROJ AS STOREITEM </li></ul><ul><li>,PROD.TITM001_ITEM AS ITEM </li></ul><ul><li>WHERE ITEM.BUS_UNIT_ID = STOREITEM.BUS_UNIT_ID </li></ul><ul><li>AND ITEM.MJR_CATG_ID = STOREITEM.MJR_CATG_ID </li></ul><ul><li>AND ITEM.INTMD_CATG_ID = STOREITEM.INTMD_CATG_ID </li></ul><ul><li>AND ITEM.ITEM_NBR = STOREITEM.ITEM_NBR </li></ul><ul><li>AND ITEM.BUS_UNIT_ID = 'GS' </li></ul><ul><li>AND ITEM.MJR_CATG_ID = '00754' </li></ul><ul><li>AND ITEM.INTMD_CATG_ID = '00043' </li></ul><ul><li>AND STOREITEM.AD_ITEM_FLG = 'Y' </li></ul><ul><li>AND STOREITEM.RTL_DEPT_NBR = 1 </li></ul><ul><li>AND STOREITEM.WK_STRT_DT = '2002-02-08'; </li></ul>
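The formatting step on the two slides above can be roughed out in a few lines of Python. This is only a sketch and not how Data Studio formats SQL: `format_sql` is a hypothetical helper that starts a new line before each SELECT, FROM, WHERE and AND keyword, and it assumes those words never appear inside a quoted literal or a BETWEEN expression.

```python
import re


def format_sql(statement: str) -> str:
    """Put SELECT, FROM, WHERE and each AND predicate on its own line.

    Simplifying assumption: the keywords never occur inside a quoted
    literal, and the statement contains no BETWEEN ... AND ... predicate.
    """
    # Collapse all runs of whitespace so the input layout does not matter.
    flat = " ".join(statement.split())
    # Replace the whitespace in front of each clause keyword with a newline.
    return re.sub(
        r"\s+(?=(?:SELECT|FROM|WHERE|AND)\b)", "\n", flat, flags=re.IGNORECASE
    )
```

Once every predicate sits on its own line, join predicates and local predicates can be grouped by hand, as the formatted slide does.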
  5. 5. Analyzing query <ul><li>Observe “interesting predicates” </li></ul><ul><ul><li>Optimizer may produce inaccurate filter factor estimate </li></ul></ul><ul><ul><li>Range predicates with parameter markers </li></ul></ul><ul><ul><li>Predicates using interesting literals </li></ul></ul><ul><ul><ul><li>Probable defaults </li></ul></ul></ul><ul><ul><li>Complex predicates </li></ul></ul><ul><ul><ul><li>Complex OR expressions </li></ul></ul></ul><ul><ul><ul><li>Negation predicates </li></ul></ul></ul><ul><ul><ul><li>Column expressions </li></ul></ul></ul><ul><ul><ul><li>Non-column expressions </li></ul></ul></ul>
  6. 6. Sample query Pat’s diagnosis
  7. 7. Query breakdown <ul><li>SELECT … </li></ul><ul><li>FROM SETL_TRANS S </li></ul><ul><li>,BRANCH CUST </li></ul><ul><li>,BRANCH_ADDR A </li></ul><ul><li>WHERE S.ADV_ABA_R = ? </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' </li></ul><ul><li>AND S.TYPE_CD IN ('A', 'C', 'X') </li></ul><ul><li>AND S.CLR_CYCLE_CD IN ('EOD', 'IMD', 'OPN') </li></ul><ul><li>AND S.STLMT_DT = ? </li></ul><ul><li>AND S.ACCT_NUM = CUST.ACCT_NUM </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? </li></ul><ul><li>AND A.ACCT_NUM = CUST.ACCT_NUM </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? </li></ul><ul><li>AND A.CUST_INACTV_DT > ? </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' </li></ul>
  8. 8. Identify peculiar predicates <ul><li>SELECT … </li></ul><ul><li>FROM SETL_TRANS S </li></ul><ul><li>,BRANCH CUST </li></ul><ul><li>,BRANCH_ADDR A </li></ul><ul><li>WHERE S.ADV_ABA_R = ? </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' -- MAX DATE </li></ul><ul><li>AND S.TYPE_CD IN ('A', 'C', 'X', 'Z') </li></ul><ul><li>AND S.CLR_CYCLE_CD IN ('EOD', 'IMD', 'OPN') </li></ul><ul><li>AND S.STLMT_DT = ? </li></ul><ul><li>AND S.ACCT_NUM = CUST.ACCT_NUM </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? -- Range with marker </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? -- Range with marker </li></ul><ul><li>AND A.ACCT_NUM = CUST.ACCT_NUM </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? -- Range with marker </li></ul><ul><li>AND A.CUST_INACTV_DT > ? -- Range with marker </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' -- COL = blank </li></ul>
  9. 9. Why are they peculiar? <ul><li>Predicates on typical default values are often skewed. </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' -- MAX DATE </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' -- COL = blank </li></ul><ul><li>Range predicates with parameter markers </li></ul><ul><li>- Impossible to estimate accurately without a literal </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? -- Range with marker </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? -- Range with marker </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? -- Range with marker </li></ul><ul><li>AND A.CUST_INACTV_DT > ? -- Range with marker </li></ul>
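The three kinds of peculiar predicate named on this slide can be spotted mechanically. The sketch below is an illustration only: `flag_predicate` is a hypothetical helper, and its patterns cover just the cases shown in this deck (a range operator followed by a parameter marker, the 9999-12-31 date, and an all-blank literal).

```python
import re


def flag_predicate(predicate: str) -> list[str]:
    """Return the reasons a predicate deserves a closer look (empty if none)."""
    reasons = []
    # A range operator compared with a parameter marker gives the optimizer
    # no literal to compare against the column statistics.
    if re.search(r"(<=|>=|<|>)\s*\?", predicate):
        reasons.append("range with marker")
    # 9999-12-31 is a typical "no end date" default value.
    if "9999-12-31" in predicate:
        reasons.append("max date")
    # An all-blank literal is a typical "not set" default value.
    if re.search(r"=\s*'\s+'", predicate):
        reasons.append("col = blank")
    return reasons
```

For example, `flag_predicate("CUST.CUST_EFCT_DT <= ?")` reports a range with marker, while the equals predicate `S.ADV_ABA_R = ?` is not flagged.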
  10. 10. Range predicate interpolation Table 104. Default filter factors for interpolation
      COLCARDF          Filter factor for Op    Filter factor for LIKE / BETWEEN
      >= 100,000,000    1 / 10,000              3 / 100,000
      >= 10,000,000     1 / 3,000               1 / 10,000
      >= 1,000,000      1 / 1,000               3 / 10,000
      >= 100,000        1 / 300                 1 / 1,000
      >= 10,000         1 / 100                 3 / 1,000
      >= 1,000          1 / 30                  1 / 100
      >= 100            1 / 10                  3 / 100
      >= 2              1 / 3                   1 / 10
      = 1               1                       1
      <= 0              1 / 3                   1 / 10
      Note: Op is one of these operators: <, <=, >, >=. COMMENT: This is DB2’s documented guess for an impossible-to-estimate filter factor.
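The default table reads naturally as a lookup on COLCARDF. The following is a minimal sketch of that lookup, not DB2 code: `default_filter_factor` is a hypothetical name, and the fallback row is read here as applying when COLCARDF is zero or unknown, which is an assumption about the last row of the table.

```python
from fractions import Fraction

# (COLCARDF lower bound, factor for Op, factor for LIKE / BETWEEN),
# ordered from the largest bound to the smallest.
DEFAULT_FACTORS = [
    (100_000_000, Fraction(1, 10_000), Fraction(3, 100_000)),
    (10_000_000, Fraction(1, 3_000), Fraction(1, 10_000)),
    (1_000_000, Fraction(1, 1_000), Fraction(3, 10_000)),
    (100_000, Fraction(1, 300), Fraction(1, 1_000)),
    (10_000, Fraction(1, 100), Fraction(3, 1_000)),
    (1_000, Fraction(1, 30), Fraction(1, 100)),
    (100, Fraction(1, 10), Fraction(3, 100)),
    (2, Fraction(1, 3), Fraction(1, 10)),
    (1, Fraction(1), Fraction(1)),
]


def default_filter_factor(colcardf: int, like_or_between: bool = False) -> Fraction:
    """Look up the default filter factor for a range predicate with a marker."""
    for lower_bound, op_factor, like_factor in DEFAULT_FACTORS:
        if colcardf >= lower_bound:
            return like_factor if like_or_between else op_factor
    # Assumption: COLCARDF of zero or less uses the same defaults as ">= 2".
    return Fraction(1, 10) if like_or_between else Fraction(1, 3)
```

With the statistics used later in the deck, a COLCARDF of 2,496 gives 1/30 and a COLCARDF of 279 gives 1/10 for the Op column.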
  11. 11. Analyzing query <ul><li>Embed information within statement </li></ul><ul><ul><li>Table information </li></ul></ul><ul><ul><ul><li>CARDF </li></ul></ul></ul><ul><ul><ul><li>NPAGES </li></ul></ul></ul><ul><ul><li>Column information for predicates </li></ul></ul><ul><ul><ul><li>Local predicates </li></ul></ul></ul><ul><ul><ul><li>Join predicates </li></ul></ul></ul><ul><ul><li>Observe where the filtering is </li></ul></ul><ul><ul><ul><li>Selectivity of a predicate is relative to table cardinality </li></ul></ul></ul><ul><li>Investigate “suspicious” predicates </li></ul><ul><ul><li>Determine actual versus estimated filtering </li></ul></ul><ul><ul><li>If there is a problem, identify options </li></ul></ul>
  12. 12. Embed statistics <ul><li>SELECT … </li></ul><ul><li>FROM SETL_TRANS S CARDF 1,600,254 NPAGES 21,627 </li></ul><ul><li>,BRANCH CUST CARDF 31,696 NPAGES 1,132 </li></ul><ul><li>,BRANCH_ADDR A CARDF 58,627 NPAGES 2,791 </li></ul><ul><li>WHERE S.ADV_ABA_R = ? COLCARDF 19,712 </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' COLCARDF 11 </li></ul><ul><li>LOW2KEY 2004-03-24 HIGH2KEY 2004-04-05 </li></ul><ul><li>AND S.TYPE_CD IN ('A', 'C', 'X', 'Z') COLCARDF 4 </li></ul><ul><li>AND S.CLR_CYCLE_CD IN ('EOD', 'IMD', 'OPN') COLCARDF 3 </li></ul><ul><li>AND S.STLMT_DT = ? COLCARDF 13 </li></ul><ul><li>AND S.ACCT_NUM = CUST.ACCT_NUM COLCARDF 15,360 / 26,527 </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? COLCARDF 279 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>AND A.ACCT_NUM = CUST.ACCT_NUM COLCARDF 26,527 / 26,527 </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>AND A.CUST_INACTV_DT > ? COLCARDF 274 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' COLCARDF 5 </li></ul>
  13. 13. Suspicious predicate analysis <ul><li>1) The first range predicate looks for all values less than '9999-12-31'. </li></ul><ul><li>That literal is far greater than the HIGH2KEY, so essentially all of the rows qualify. </li></ul><ul><li>(Since the optimizer has the literal value, it KNOWS that all rows qualify.) </li></ul><ul><li>2) For the column = blank predicate, I don’t believe a skew search was ever done. </li></ul><ul><li>You could check how many values are blank. With COLCARDF 5 the uniform estimate is 1/5 = 20%; is the actual share greater than 20%? </li></ul><ul><li>1) AND S.PROCESS_DT < '9999-12-31' COLCARDF 11 </li></ul><ul><li>LOW2KEY 2004-03-24 HIGH2KEY 2004-04-05 </li></ul><ul><li>2) AND A.ADDR_TYP_CD = ' ' COLCARDF 5 </li></ul><ul><li>Conclusion: The first predicate should not be causing this SQL statement any problems. </li></ul>
  14. 14. Suspicious predicate analysis <ul><li>The literal value used for each of the parameter markers in this case happened to be the same: 2004-04-06. </li></ul><ul><li>I determined the ESTIMATED FF WITH LITERAL by comparing the literal value to the HIGH2KEY and working out what range would qualify. </li></ul><ul><li>The ESTIMATED FF WITH MARKER is from the chart in the Admin guide. </li></ul><ul><li>The “error” is how different the optimizer’s DEFAULT estimate is from ACTUAL filtering. </li></ul><ul><li>3) AND CUST.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>ESTIMATED FF WITH LITERAL: = 100% </li></ul><ul><li>ESTIMATED FF WITH MARKER: 1/30 = 3% ( 97% error ) </li></ul><ul><li>4) AND CUST.CUST_INACTV_DT > ? COLCARDF 279 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>ESTIMATED FF WITH LITERAL: = 99% </li></ul><ul><li>ESTIMATED FF WITH MARKER: 1/10 = 10% ( 89% error ) </li></ul><ul><li>5) AND A.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>ESTIMATED FF WITH LITERAL: = 100% </li></ul><ul><li>ESTIMATED FF WITH MARKER: 1/30 = 3% ( 97% error ) </li></ul><ul><li>6) AND A.CUST_INACTV_DT > ? COLCARDF 274 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>ESTIMATED FF WITH LITERAL: = 99% </li></ul><ul><li>ESTIMATED FF WITH MARKER: 1/10 = 10% ( 89% error ) </li></ul>
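The error figures on this slide are simply the gap between the two estimates, expressed in percentage points. A small sketch of that arithmetic follows; `estimate_error_points` is a hypothetical helper name, and the inputs are the values quoted on the slide.

```python
from fractions import Fraction


def estimate_error_points(ff_with_literal: float, ff_with_marker: Fraction) -> int:
    """Gap between the two filter factor estimates, in whole percentage points."""
    return round((ff_with_literal - float(ff_with_marker)) * 100)


# CUST_EFCT_DT <= ?   : 100% with the literal versus the 1/30 default.
error_efct_dt = estimate_error_points(1.00, Fraction(1, 30))
# CUST_INACTV_DT > ?  : 99% with the literal versus the 1/10 default.
error_inactv_dt = estimate_error_points(0.99, Fraction(1, 10))
```

These evaluate to 97 and 89 points, matching the slide.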
  15. 15. Suspicious predicate analysis <ul><li>Conclusion </li></ul><ul><ul><li>The range predicates with parameter markers introduce significant filter factor error. So we should recognize that this filter factor error can cause significant cost estimation problems for the optimizer – possibly resulting in poor access path choice. </li></ul></ul>
  16. 16. Where’s the filtering? <ul><li>WHERE S.ADV_ABA_R = ? COLCARDF 19,712 </li></ul><ul><li>(Very selective predicate) </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' COLCARDF 11 </li></ul><ul><li>(This predicate doesn’t filter anything, known from suspicious predicate analysis) </li></ul><ul><li>AND S.TYPE_CD IN ('A', 'C', 'X', 'Z') COLCARDF 4 </li></ul><ul><li>(In-list looking for 4 values, COLCARDF 4 – not filtering) </li></ul><ul><li>AND S.CLR_CYCLE_CD IN ('EOD', 'IMD', 'OPN') COLCARDF 3 </li></ul><ul><li>(In-list looking for 3 values, COLCARDF 3 – not filtering) </li></ul><ul><li>AND S.STLMT_DT = ? COLCARDF 13 </li></ul><ul><li>(COL = LIT, COLCARDF 13 – somewhat filtering, but not great selectivity) </li></ul><ul><li>AND S.ACCT_NUM = CUST.ACCT_NUM COLCARDF 15,360 / 26,527 </li></ul><ul><li>(For the range predicates, we know that the optimizer PERCEIVES them to be selective, but in reality they are not. This was determined during suspicious predicate analysis) </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? COLCARDF 279 </li></ul><ul><li>AND A.ACCT_NUM = CUST.ACCT_NUM COLCARDF 26,527 / 26,527 </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>AND A.CUST_INACTV_DT > ? COLCARDF 274 </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' COLCARDF 5 </li></ul><ul><li>(COL = blank. Probably this column is skewed on blank. COLCARDF 5, not typically very filtering) </li></ul>
  17. 17. Where’s the filtering? <ul><li>SELECT … </li></ul><ul><li>FROM SETL_TRANS S CARDF 1,600,254 NPAGES 21,627 </li></ul><ul><li>,BRANCH CUST CARDF 31,696 NPAGES 1,132 </li></ul><ul><li>,BRANCH_ADDR A CARDF 58,627 NPAGES 2,791 </li></ul><ul><li>WHERE S.ADV_ABA_R = ? COLCARDF 19,712 </li></ul><ul><li>AND S.PROCESS_DT < '9999-12-31' COLCARDF 11 </li></ul><ul><li>LOW2KEY 2004-03-24 HIGH2KEY 2004-04-05 </li></ul><ul><li>AND S.TYPE_CD IN ('A', 'C', 'X', 'Z') COLCARDF 4 </li></ul><ul><li>AND S.CLR_CYCLE_CD IN ('EOD', 'IMD', 'OPN') COLCARDF 3 </li></ul><ul><li>AND S.STLMT_DT = ? COLCARDF 13 </li></ul><ul><li>AND S.ACCT_NUM = CUST.ACCT_NUM COLCARDF 15,360 / 26,527 </li></ul><ul><li>AND CUST.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>AND CUST.CUST_INACTV_DT > ? COLCARDF 279 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>AND A.ACCT_NUM = CUST.ACCT_NUM COLCARDF 26,527 / 26,527 </li></ul><ul><li>AND A.CUST_EFCT_DT <= ? COLCARDF 2,496 </li></ul><ul><li>LOW2KEY 1994-09-02 HIGH2KEY 2004-04-06 </li></ul><ul><li>AND A.CUST_INACTV_DT > ? COLCARDF 274 </li></ul><ul><li>LOW2KEY 2004-03-04 HIGH2KEY 2004-04-07 </li></ul><ul><li>AND A.ADDR_TYP_CD = ' ' COLCARDF 5 </li></ul>Most selective by far: S.ADV_ABA_R = ?
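To see where the filtering is, each local predicate on SETL_TRANS can be turned into a rough row count. The sketch below assumes a uniform distribution, so an equals predicate qualifies 1/COLCARDF of the rows and an in-list qualifies (list size)/COLCARDF; the function names are hypothetical and the numbers come from the statistics embedded above.

```python
def equal_filter_factor(colcardf: int) -> float:
    """Uniform-distribution estimate for COL = value."""
    return 1.0 / colcardf


def in_list_filter_factor(list_size: int, colcardf: int) -> float:
    """Uniform-distribution estimate for COL IN (...), capped at 1.0."""
    return min(1.0, list_size / colcardf)


SETL_TRANS_CARDF = 1_600_254

# S.ADV_ABA_R = ?  with COLCARDF 19,712: about 81 rows qualify.
rows_adv_aba_r = SETL_TRANS_CARDF * equal_filter_factor(19_712)
# S.STLMT_DT = ?   with COLCARDF 13: about 123,000 rows qualify.
rows_stlmt_dt = SETL_TRANS_CARDF * equal_filter_factor(13)
# S.TYPE_CD IN (4 values) with COLCARDF 4: every row qualifies.
type_cd_ff = in_list_filter_factor(4, 4)
```

Set against 1.6 million rows in the table, the ADV_ABA_R predicate is the only one that narrows the result to a handful of rows.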
  18. 18. Index analysis <ul><li>One significant input to the optimizer is… </li></ul><ul><ul><li>Available indexes </li></ul></ul><ul><ul><li>What join sequence they encourage </li></ul></ul><ul><li>Some index performance considerations </li></ul><ul><ul><li>Provide efficient access for local predicates </li></ul></ul><ul><ul><ul><li>Encourages table to be outer table </li></ul></ul></ul><ul><ul><li>Provide efficient access for join predicates </li></ul></ul><ul><ul><ul><li>Encourage access to table as INNER table of join </li></ul></ul></ul><ul><ul><li>Provide ordering to avoid sort </li></ul></ul><ul><li>Analysis: </li></ul><ul><ul><li>Are there appropriate indexes to support this query? </li></ul></ul>
  19. 19. Identify indexes <ul><li>Table: SETL_TRANS </li></ul><ul><li>INDEX IXSTRN01 </li></ul><ul><li>(PROCESS_DT, CLR_CYCLE_CD, ADV_ABA_R, TYPE_CD, ACCT_NUM, STLMT_DT) </li></ul><ul><li>TABLE: BRANCH </li></ul><ul><li>INDEX: IXBRNC01 </li></ul><ul><li>(CUST_INACTV_DT, CUST_EFCT_DT) </li></ul><ul><li>INDEX: IXBRNC02 </li></ul><ul><li>(ACCT_NUM, CUST_EFCT_DT) </li></ul><ul><li>TABLE: BRANCH_ADDR </li></ul><ul><li>INDEX: IXBRAD01 </li></ul><ul><li>(CUST_INACTV_DT, CUST_EFCT_DT) </li></ul><ul><li>INDEX: IXBRAD02 </li></ul><ul><li>(ACCT_NUM, ADDR_TYP_CD, CUST_EFCT_DT) </li></ul>
  20. 20. Index candidate usage <ul><li>Table: SETL_TRANS </li></ul><ul><li>INDEX IXSTRN01 </li></ul><ul><li>( PROCESS_DT , CLR_CYCLE_CD , ADV_ABA_R , TYPE_CD , ACCT_NUM , STLMT_DT ) </li></ul><ul><li>TABLE: BRANCH </li></ul><ul><li>INDEX: IXBRNC01 </li></ul><ul><li>( CUST_INACTV_DT, CUST_EFCT_DT ) </li></ul><ul><li>INDEX IXBRNC02 </li></ul><ul><li>( ACCT_NUM , CUST_EFCT_DT ) </li></ul><ul><li>TABLE: BRANCH_ADDR </li></ul><ul><li>INDEX: IXBRAD01 </li></ul><ul><li>( CUST_INACTV_DT, CUST_EFCT_DT ) </li></ul><ul><li>INDEX: IXBRAD02 </li></ul><ul><li>( ACCT_NUM , ADDR_TYP_CD , CUST_EFCT_DT ) </li></ul><ul><li>Key: </li></ul><ul><li>RED = Range predicate, stops matching </li></ul><ul><li>BLUE: Join predicate </li></ul><ul><li>GREEN: Local equals predicate / in-list </li></ul>
  21. 21. Index design analysis (by table) <ul><li>BRANCH table (Index design OK!) </li></ul><ul><ul><li>Index IXBRNC02 supports local access </li></ul></ul><ul><ul><ul><li>CONCERN: The predicate on this column has its filtering grossly overestimated, so the optimizer will perceive access to this table as more efficient than it really is! </li></ul></ul></ul><ul><ul><li>Index IXBRNC01 supports join access </li></ul></ul><ul><li>BRANCH_ADDR table (Index design OK!) </li></ul><ul><ul><li>Index IXBRAD01 leading column on local filtering </li></ul></ul><ul><ul><ul><li>The predicate on this column has its filtering grossly overestimated </li></ul></ul></ul><ul><ul><ul><li>Allows table to be considered as inner table efficiently </li></ul></ul></ul><ul><ul><li>Index IXBRAD02 leading column supports join </li></ul></ul><ul><ul><ul><li>Allows table to be an efficient inner table </li></ul></ul></ul>
  22. 22. Index design analysis (by table) <ul><li>SETL_TRANS table (Not OK!) </li></ul><ul><ul><li>The table has one index, IXSTRN01. </li></ul></ul><ul><ul><ul><li>No efficient index for join access </li></ul></ul></ul><ul><ul><ul><ul><li>(the join predicate needs to be on the leading column) </li></ul></ul></ul></ul><ul><ul><ul><li>No efficient index for outer access </li></ul></ul></ul><ul><ul><ul><ul><li>Leading column of index qualifies ALL rows </li></ul></ul></ul></ul>
  23. 23. Overlay table size <ul><li>Table: SETL_TRANS CARDF 1,600,254 NPAGES 21,627 </li></ul><ul><li>INDEX IXSTRN01 </li></ul><ul><li>( PROCESS_DT , CLR_CYCLE_CD , ADV_ABA_R , TYPE_CD , ACCT_NUM , STLMT_DT ) </li></ul><ul><li>TABLE: BRANCH CARDF 31,696 NPAGES 1132 </li></ul><ul><li>INDEX: IXBRNC02 </li></ul><ul><li>( CUST_INACTV_DT, CUST_EFCT_DT ) </li></ul><ul><li>INDEX: IXBRNC01 </li></ul><ul><li>( ACCT_NUM , CUST_EFCT_DT ) </li></ul><ul><li>TABLE: BRANCH_ADDR CARDF 58,627 NPAGES 2791 </li></ul><ul><li>INDEX: IXBRAD01 </li></ul><ul><li>( CUST_INACTV_DT, CUST_EFCT_DT ) </li></ul><ul><li>INDEX: IXBRAD02 </li></ul><ul><li>( ACCT_NUM , ADDR_TYP_CD , CUST_EFCT_DT ) </li></ul><ul><li>Key: </li></ul><ul><li>RED = Range predicate, stops matching </li></ul><ul><li>BLUE: Join predicate </li></ul><ul><li>GREEN: Local equals predicate / in-list </li></ul>Biggest table, worst index options. Must scan 1.6 million rows!
  24. 24. Possible new indexes <ul><li>Existing index </li></ul><ul><li>IXSTRN01 </li></ul><ul><li>( PROCESS_DT , CLR_CYCLE_CD , ADV_ABA_R , TYPE_CD , ACCT_NUM , STLMT_DT ) </li></ul><ul><li>Efficient outer table access </li></ul><ul><li>INDEX opt_1 </li></ul><ul><li>( ADV_ABA_R, STLMT_DT , ACCT_NUM ) </li></ul><ul><li>Efficient inner table access: </li></ul><ul><li>INDEX opt_2 </li></ul><ul><li>( ACCT_NUM ) </li></ul>
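Why the two proposed indexes help can be checked by counting how many leading index columns are usable for matching access. The sketch below uses deliberately simplified rules taken from the colour key on the earlier slides (equals and in-list predicates keep matching, a range predicate matches but stops further matching, a join predicate only matches when the table is the inner table). `matching_columns` is a hypothetical helper, and real optimizer rules have more cases than this.

```python
def matching_columns(index_columns, predicate_kinds, as_inner=False):
    """Count leading index columns usable for matching index access.

    predicate_kinds maps a column name to "equal", "in", "range" or "join".
    """
    count = 0
    for column in index_columns:
        kind = predicate_kinds.get(column)
        if kind in ("equal", "in"):
            count += 1
        elif kind == "join" and as_inner:
            count += 1
        elif kind == "range":
            count += 1
            break  # a range predicate stops matching on later columns
        else:
            break  # no usable predicate on this column
    return count


SETL_TRANS_PREDICATES = {
    "ADV_ABA_R": "equal", "PROCESS_DT": "range", "TYPE_CD": "in",
    "CLR_CYCLE_CD": "in", "STLMT_DT": "equal", "ACCT_NUM": "join",
}
IXSTRN01 = ["PROCESS_DT", "CLR_CYCLE_CD", "ADV_ABA_R",
            "TYPE_CD", "ACCT_NUM", "STLMT_DT"]
OPT_1 = ["ADV_ABA_R", "STLMT_DT", "ACCT_NUM"]
OPT_2 = ["ACCT_NUM"]
```

Under these rules IXSTRN01 matches on one column only (the range predicate that qualifies every row), opt_1 matches on two selective columns when SETL_TRANS is the outer table, and opt_2 matches on the join column when it is the inner table.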
  25. 25. Summary of this SQL <ul><li>Indexes on BRANCH, BRANCH_ADDR look better than they are </li></ul><ul><ul><li>Range predicate with parameter marker estimates 3% of rows qualify </li></ul></ul><ul><ul><li>In reality, 99% qualify </li></ul></ul><ul><li>Only an inefficient index is available on the SETL_TRANS table </li></ul><ul><ul><li>No efficient outer table index available </li></ul></ul><ul><ul><li>No efficient inner table index available </li></ul></ul><ul><ul><li>This is the biggest table, with the best filter!!! </li></ul></ul><ul><li>Optimizer chose a bad join method due to the combination of the above factors </li></ul><ul><ul><li>Performed full scan of transaction index 26,000 times </li></ul></ul><ul><li>Resolution: </li></ul><ul><ul><li>Providing a new index on SETL_TRANS should provide more stable, faster access than ever before </li></ul></ul><ul><ul><li>REOPT, or providing literal values, avoids the disaster without a new index </li></ul></ul>
  26. 26. SQL 2
      SELECT COLS
      FROM PART A  CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467
         , CONTRACTOR B  CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724
         , CONT_PARTS C  CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189
         , PARTS_PROD_ASMBLY D  CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644
         , PARTS_PROD_ASM_DTL E  CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490
      WHERE A.COUNTRY_CD = ?  COLCARDF=208 MAX_FREQ=36.408% FF=0.005
        AND A.PART_CD = ?  COLCARDF=5 MAX_FREQ=47.199% FF=0.2
        AND A.PART_TYPE IN ( 'F', 'I', 'P' )  COLCARDF=8 MAX_FREQ=79.867% FF=0.958
        AND B.PART_NUM = ?  COLCARDF=260 MAX_FREQ=3.032% FF=0.004
        AND B.SUB_CONTRACTOR = 'Y'  COLCARDF=2 MAX_FREQ=87.402% FF=0.126
            LOW2KEY=N HIGH2KEY=Y
        AND B.SUSPENDED = 'N'  COLCARDF=2 MAX_FREQ=99.833% FF=0.998
            LOW2KEY=N HIGH2KEY=Y
        AND C.PREFERRED = 'Y'  COLCARDF=3 MAX_FREQ=76.832% FF=0.018
        AND B.CONTRACTOR_ID = C.CONTRACTOR_ID  COLCARDF=1,047/316 FF=9.551E-4
        AND D.PART_NUM = A.PART_NUM  COLCARDF=6,132/17,598 FF=5.682E-5
        AND C.PRODUCT_ID = D.PRODUCT_ID  COLCARDF=1,391,650/7,058,356 FF=1.417E-7
        AND C.PRODUCT_ID = E.PRODUCT_ID  COLCARDF=1,391,650/21,366,326 FF=4.68E-8
        AND E.PRODUCT_ID = D.PRODUCT_ID  COLCARDF=21,366,326/7,058,356 FF=4.68E-8
  27. 27. Local predicate analysis
      SELECT COLS
      FROM PART A  CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467
         , CONTRACTOR B  CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724
         , CONT_PARTS C  CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189
         , PARTS_PROD_ASMBLY D  CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644
         , PARTS_PROD_ASM_DTL E  CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490
      WHERE A.COUNTRY_CD = ?  COLCARDF=208 MAX_FREQ=36.408% FF=0.005  -- ??? 'FR' = 36.4%, 'GB' = 17%, 'DE' = 10%
        AND A.PART_CD = ?  COLCARDF=5 MAX_FREQ=47.199% FF=0.2  -- ??? 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1%
        AND A.PART_TYPE IN ( 'F', 'I', 'P' )  COLCARDF=8 MAX_FREQ=79.867% FF=0.958  -- skewed, not selective
        AND B.PART_NUM = ?  COLCARDF=260 MAX_FREQ=3.032% FF=0.004
        AND B.SUB_CONTRACTOR = 'Y'  COLCARDF=2 MAX_FREQ=87.402% FF=0.126  -- skewed, selective
            LOW2KEY=N HIGH2KEY=Y
        AND B.SUSPENDED = 'N'  COLCARDF=2 MAX_FREQ=99.833% FF=0.998  -- skewed, not selective
            LOW2KEY=N HIGH2KEY=Y
        AND C.PREFERRED = 'Y'  COLCARDF=3 MAX_FREQ=76.832% FF=0.018  -- skewed, selective
        AND B.CONTRACTOR_ID = C.CONTRACTOR_ID  COLCARDF=1,047/316 FF=9.551E-4
        AND D.PART_NUM = A.PART_NUM  COLCARDF=6,132/17,598 FF=5.682E-5
        AND C.PRODUCT_ID = D.PRODUCT_ID  COLCARDF=1,391,650/7,058,356 FF=1.417E-7
        AND C.PRODUCT_ID = E.PRODUCT_ID  COLCARDF=1,391,650/21,366,326 FF=4.68E-8
        AND E.PRODUCT_ID = D.PRODUCT_ID  COLCARDF=21,366,326/7,058,356 FF=4.68E-8
      <ul><li>Both ‘A’ and ‘B’ tables have selective predicates. </li></ul><ul><li>COUNTRY_CD and PART_CD predicates – there is skew, optimizer assumes uniform distribution </li></ul><ul><li>B.PART_NUM – Slightly skewed. 3% one value. Uniform estimate is 0.4%. </li></ul><ul><li>PREFERRED – skewed, query searches for an infrequently occurring value. </li></ul><ul><li>Without looking at indexes, seems ‘A’ and ‘B’ will compete to be outer table </li></ul><ul><ul><li>Qualified rows of 67.1 and 77.8 pretty close </li></ul></ul>
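How far off the uniform-distribution assumption can be is easy to quantify from COLCARDF and MAX_FREQ. The sketch below compares the uniform estimate 1/COLCARDF with the share of the most frequent value; `underestimate_factor` is a hypothetical helper, and the inputs are the statistics shown on the slide.

```python
def underestimate_factor(colcardf: int, max_freq_pct: float) -> float:
    """How many times more rows the most frequent value returns
    than the uniform estimate 1/COLCARDF predicts."""
    uniform_ff = 1.0 / colcardf
    return (max_freq_pct / 100.0) / uniform_ff


country_cd = underestimate_factor(208, 36.408)  # roughly 76 times
part_cd = underestimate_factor(5, 47.199)       # roughly 2.4 times
part_num = underestimate_factor(260, 3.032)     # roughly 7.9 times
```

If the parameter marker on COUNTRY_CD happens to carry the most frequent value, the 'A' table returns far more rows than the estimate of 67 suggests, which is the risk the slide points to.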
  28. 28. Local index analysis – ‘A’
      SELECT COLS
      FROM PART A  CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467
      WHERE A.COUNTRY_CD = ?  COLCARDF=208 MAX_FREQ=36.408% FF=0.005  -- ??? 'FR' = 36.4%, 'GB' = 17%, 'DE' = 10%
        AND A.PART_CD = ?  COLCARDF=5 MAX_FREQ=47.199% FF=0.2  -- ??? 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1%
        AND A.PART_TYPE IN ( 'F', 'I', 'P' )  COLCARDF=8 MAX_FREQ=79.867% FF=0.958  -- skewed, not selective

      INDEX    CLU  UR  NLEAF  NLEVEL  CR     KEYCOLNAME  COLCARDF  MCARDF
      IXPRT01  Y    U   151    3       0.999  PART_CD            5       5
                                              COUNTRY_CD       208     251
                                              FILE            2496    3054
                                              DR                46    3176
                                              SECTOR           178    3548
                                              PDV            16830   17598
      IXPRT02  N    D   128    2       0.794  PART_CD            5       5
                                              PART_TYPE          8      28
                                              PDV            16830   16850
                                              FILE            2496   16905
      IXPRT03  N    D   26     2       0.998  PART_TYPE          8       8
                                              PART_CD            5      28
                                              COUNTRY_CD       208     579
      IXPRT04  N    D   99     2       0.782  PART_TYPE          8       8
                                              PART_NUM       17598   17598
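The slide does not say how QUALIFIED_ROWS=67.1 for PART was derived. One reading, offered here as an assumption rather than a statement of what the optimizer did, is that the two equals predicates were estimated together from the multi-column cardinality of (PART_CD, COUNTRY_CD) in IXPRT01, MCARDF 251, instead of multiplying the two single-column filter factors. The arithmetic below compares the two approaches using the rounded figures from the slide.

```python
PART_CARDF = 17_598
MCARDF_PART_CD_COUNTRY_CD = 251  # IXPRT01, first two key columns
FF_PART_TYPE = 0.958             # in-list filter factor shown on the slide

# Combined estimate for PART_CD = ? AND COUNTRY_CD = ? via MCARDF,
# then the in-list filter factor applied on top.
qualified_rows = PART_CARDF / MCARDF_PART_CD_COUNTRY_CD * FF_PART_TYPE

# Treating the columns as independent: 1/5 for PART_CD, 1/208 for COUNTRY_CD.
single_column_rows = PART_CARDF * (1 / 5) * (1 / 208) * FF_PART_TYPE
```

The first figure comes out close to the 67.1 on the slide, while the independent-column product gives only about 16 rows, so the multi-column statistic appears to account for the correlation between the two columns.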
  30. 30. Local index analysis B
      SELECT COLS
      FROM CONTRACTOR B  CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724
      WHERE B.PART_NUM = ?  COLCARDF=260 MAX_FREQ=3.032% FF=0.004
        AND B.SUB_CONTRACTOR = 'Y'  COLCARDF=2 MAX_FREQ=87.402% FF=0.126  -- skewed, selective
            LOW2KEY=N HIGH2KEY=Y
        AND B.SUSPENDED = 'N'  COLCARDF=2 MAX_FREQ=99.833% FF=0.998  -- skewed, not selective
            LOW2KEY=N HIGH2KEY=Y

      INDEX    CLU  UR  NLEAF  NLEVEL  CR     KEYCOLNAME     COLCARDF  MCARDF
      IXCTR01  Y    P   210    3       0.962  PART_NUM            260     278
                                              CONTRACTOR_ID      1047   34722
                                              CONT_TYPE             7   34728
      IXCTR02  N    D   50     2       0.624  PART_NUM            260     278
      IXCTR03  N    D   56     2       0.348  BEGIN_DT           1015    1015
                                              CONTRACTOR_ID      1047    2555
      IXCTR04  N    D   316    3       0.927  CONTRACTOR_ID      1047    1047
                                              PART_NUM            260   34722
                                              BEGIN_DT           1015   34722
                                              END_DT             2656   34722
                                              CONT_TYPE             7   34728
      IXCTR05  N    D   250    3       0.896  CONTRACTOR_ID      1047    1047
                                              BEGIN_DT           1015    2555
                                              PART_NUM            260   34722
  31. 31. Local index analysis B <ul><li>Note: SUB_CONTRACTOR is selective because the query searches for the least frequent value. It is not in any candidate index. </li></ul><ul><li>Otherwise, local index support looks good. </li></ul><ul><li>May be able to drop IXCTR02 with reverse index scan support. </li></ul>
      SELECT COLS
      FROM CONTRACTOR B  CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724
      WHERE B.PART_NUM = ?  COLCARDF=260 MAX_FREQ=3.032% FF=0.004
        AND B.SUB_CONTRACTOR = 'Y'  COLCARDF=2 MAX_FREQ=87.402% FF=0.126  -- skewed, selective
            LOW2KEY=N HIGH2KEY=Y
        AND B.SUSPENDED = 'N'  COLCARDF=2 MAX_FREQ=99.833% FF=0.998  -- skewed, not selective
            LOW2KEY=N HIGH2KEY=Y

      INDEX    CLU  UR  NLEAF  NLEVEL  CR     KEYCOLNAME     COLCARDF  MCARDF
      IXCTR01  Y    P   210    3       0.962  PART_NUM            260     278
                                              CONTRACTOR_ID      1047   34722
                                              CONT_TYPE             7   34728
      IXCTR02  N    D   50     2       0.624  PART_NUM            260     278
      IXCTR04  N    D   316    3       0.927  CONTRACTOR_ID      1047    1047
                                              PART_NUM            260   34722
                                              BEGIN_DT           1015   34722
                                              END_DT             2656   34722
                                              CONT_TYPE             7   34728
      IXCTR05  N    D   250    3       0.896  CONTRACTOR_ID      1047    1047
                                              BEGIN_DT           1015    2555
                                              PART_NUM            260   34722
  32. 32. Local index analysis C <ul><li>Table C </li></ul><ul><ul><li>There is index support for local filtering. </li></ul></ul><ul><ul><li>Trailing join column (good) </li></ul></ul>
      SELECT COLS
      FROM CONT_PARTS C  CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189
      WHERE C.PREFERRED = 'Y'  COLCARDF=3 MAX_FREQ=76.832% FF=0.018  -- skewed, selective

      INDEX    CLU  UR  NLEAF  NLEVEL  CR     KEYCOLNAME     COLCARDF   MCARDF
      IXCPR04  N    D   15352  3       0.998  PREFERRED             3        3
                                              CONTRACTOR_ID       316      552
                                              PRODUCT_ID      1391650  1808887
  33. 33. Indexes for local summary <ul><li>Each table with local filtering had efficient indexes to support local filtering </li></ul><ul><ul><li>Positives: </li></ul></ul><ul><ul><ul><li>Efficient access paths exist. </li></ul></ul></ul><ul><ul><li>Negatives: </li></ul></ul><ul><ul><ul><li>Each table will compete for the outer </li></ul></ul></ul><ul><ul><ul><li>More “apparently efficient” choices, more stress on optimizer, opportunity for incorrect choice </li></ul></ul></ul>
  34. 34. Join graph C B D E A <ul><li>Two most selective tables ‘A’ and ‘B’ are not joined directly </li></ul><ul><li>C – D – E each join on same column (PRODUCT_ID) </li></ul><ul><li>Shaping up as a contest between ‘A’ as outer (about 67 qualified rows) and ‘B’ as outer (about 77 qualified rows) </li></ul>
  35. 35. Join considerations <ul><li>Index support for certain join sequences </li></ul><ul><ul><li>Indexes available to support matching index access for different desirable join sequences? </li></ul></ul><ul><li>Join reduction / fan-out considerations </li></ul><ul><ul><li>Consider expansion / contraction of result size through different join sequences </li></ul></ul>
  36. 36. Join indexes A SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘ FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXPRT01 N P 83 2 0.782 PART_NUM 17598 17598 IXPRT02 N D 112 2 0.782 PART_NUM 17598 17598 PART_TYPE 8 17598 PART_CD 5 17598 IXPRT04 N D 99 2 0.782 PART_TYPE 8 8 PART_NUM 17598 17598 IXPRTxx N D 122 2 0.782 PART_NUM 17598 17603 PART_TYPE 8 17598 PART_CD 5 -1 COUNTRY_CD 208 17603
  37. 37. Join indexes A <ul><li>Join access available through a join to the ‘D’ table only </li></ul><ul><ul><li>Via PART_NUM if ‘D’ is the outer </li></ul></ul><ul><li>There are multiple indexes to support ‘A’ as inner </li></ul><ul><ul><li>IXPRT02 and IXPRTxx appear redundant </li></ul></ul><ul><ul><li>IXPRTxx is a superset of IXPRT02, with the same column sequence </li></ul></ul>INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXPRT01 N P 83 2 0.782 PART_NUM 17598 17598 IXPRT02 N D 112 2 0.782 PART_NUM 17598 17598 PART_TYPE 8 17598 PART_CD 5 17598 IXPRT04 N D 99 2 0.782 PART_TYPE 8 8 PART_NUM 17598 17598 IXPRTxx N D 122 2 0.782 PART_NUM 17598 17603 PART_TYPE 8 17598 PART_CD 5 -1 COUNTRY_CD 208 17603
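The redundancy observation on this slide (one index's key columns form the leading columns of another) can be checked mechanically. The following is a minimal sketch, not part of the original deck: the helper name `is_leading_prefix` is hypothetical, and the check looks only at key column order, ignoring uniqueness, clustering, and ascending/descending attributes that would also matter before dropping an index.

```python
def is_leading_prefix(shorter, longer):
    """Return True if the key columns of `shorter` are exactly the
    leading key columns of `longer`, in the same order."""
    return len(shorter) <= len(longer) and list(longer[:len(shorter)]) == list(shorter)


# Key column sequences copied from the index report on the slide.
ixprt02 = ["PART_NUM", "PART_TYPE", "PART_CD"]
ixprt04 = ["PART_TYPE", "PART_NUM"]
ixprtxx = ["PART_NUM", "PART_TYPE", "PART_CD", "COUNTRY_CD"]

# IXPRT02 is a leading prefix of IXPRTxx, so it is a redundancy candidate.
print(is_leading_prefix(ixprt02, ixprtxx))
# IXPRT04 leads with PART_TYPE, so it is not covered by IXPRTxx.
print(is_leading_prefix(ixprt04, ixprtxx))
```

IXPRT01 (PART_NUM only) would also pass this check, but it is the primary key index (UR = P), which is why a column-order test alone is not enough to decide what to drop.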
  38. 38. Join indexes B SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘ FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXCTR01 Y P 210 3 0.962 PART_NUM 260 278 CONTRACTOR_ID 1047 34722 CONT_TYPE 7 34728 IXCTR04 N D 316 3 0.927 CONTRACTOR_ID 1047 1047 PART_NUM 260 34722 BEGIN_DT 1015 34722 END_DT 2656 34722 CONT_TYPE 7 34728 IXCTR05 N D 250 3 0.896 CONTRACTOR_ID 1047 1047 BEGIN_DT 1015 2555 PART_NUM 260 34722
  39. 39. Join indexes B <ul><li>Join access available through a join to the ‘C’ table only </li></ul><ul><ul><li>Via CONTRACTOR_ID if ‘C’ is the outer </li></ul></ul><ul><li>There are multiple indexes to support ‘B’ as inner </li></ul><ul><ul><li>IXCTR01 has PART_NUM as leading local </li></ul></ul><ul><ul><ul><li>Join from outer will hit far fewer leaf pages due to leading local predicate </li></ul></ul></ul><ul><ul><ul><li>Smaller “swath” of leaf pages: NLEAF * 1/PART_NUM COLCARDF </li></ul></ul></ul><ul><ul><ul><li>210 * (1/260) ~= 1 leaf page </li></ul></ul></ul><ul><ul><ul><li>Makes this index “outstanding” from inner index access perspective </li></ul></ul></ul><ul><ul><ul><li>Also an effective “outer” index since it provides good local filtering and join order for a join to ‘C’ table as inner </li></ul></ul></ul><ul><ul><li>IXCTR04, IXCTR05 lead with join predicate </li></ul></ul><ul><ul><ul><li>Support the join effectively </li></ul></ul></ul><ul><ul><ul><li>Join scattered over all leaf pages </li></ul></ul></ul>INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXCTR01 Y P 210 3 0.962 PART_NUM 260 278 CONTRACTOR_ID 1047 34722 CONT_TYPE 7 34728 IXCTR04 N D 316 3 0.927 CONTRACTOR_ID 1047 1047 PART_NUM 260 34722 BEGIN_DT 1015 34722 END_DT 2656 34722 CONT_TYPE 7 34728 IXCTR05 N D 250 3 0.896 CONTRACTOR_ID 1047 1047 BEGIN_DT 1015 2555 PART_NUM 260 34722
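The leaf page “swath” arithmetic on this slide can be written as a one-line estimate. This is a rough sketch under the slide's own assumption of uniformly distributed values in the leading column; the function name `leaf_swath` is hypothetical and is not a DB2 formula name.

```python
def leaf_swath(nleaf, leading_colcardf):
    """Estimated leaf pages bounded by an equals predicate on the
    leading index column, assuming values are uniformly distributed."""
    return nleaf / leading_colcardf


# IXCTR01: NLEAF = 210, leading column PART_NUM with COLCARDF = 260.
swath = leaf_swath(210, 260)
print(round(swath, 2))  # 0.81, i.e. roughly one leaf page per PART_NUM value
```

By contrast, IXCTR04 and IXCTR05 lead with the join column, so no local predicate narrows the range and probes from the outer can land anywhere in their 316 or 250 leaf pages.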
  40. 40. Join indexes C SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘ FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXCPR01 Y U 21367 3 1.0 PRODUCT_ID 1391650 1391650 CONTRACTOR_ID 316 1794093 CO_DTHRCONTACT 1645213 2093750 IXCPR02 N D 14771 3 0.999 CONTRACTOR_ID 316 316 PRODUCT_ID 1391650 1794093 IXCPR03 N D 16188 3 0.998 CONTRACTOR_ID 316 316 CO_PHASECONTACT 4 783 PRODUCT_ID 1391650 1931232 IXCPR04 N D 15352 3 0.998 PREFERRED 3 3 CONTRACTOR_ID 316 552 PRODUCT_ID 1391650 1808887
  41. 41. Join indexes C <ul><li>Join access available through joins to the ‘B’, ‘D’, and ‘E’ tables </li></ul><ul><ul><li>Via CONTRACTOR_ID if ‘B’ is the outer composite </li></ul></ul><ul><ul><li>Via PRODUCT_ID if ‘D’ or ‘E’ are in the outer composite </li></ul></ul><ul><li>There is support for either join sequence. </li></ul><ul><ul><li>IXCPR01 has PRODUCT_ID as leading column to support ‘D’ or ‘E’ in outer composite </li></ul></ul><ul><ul><li>IXCPR02 and IXCPR03 have CONTRACTOR_ID as leading join column if ‘B’ is the outer composite </li></ul></ul><ul><ul><ul><li>IXCPR03 would also be a candidate if ‘B’ were Cartesian joined with ‘D’ or ‘E’, though that is unlikely. </li></ul></ul></ul><ul><ul><li>IXCPR04 would likely be the preferred index if ‘B’ were in outer composite </li></ul></ul><ul><ul><ul><li>Selective leading local on PREFERRED bounds the leaf pages that would be hit to < 2% of all leaf pages </li></ul></ul></ul><ul><ul><ul><li>Makes ‘C’ a possible efficient outer – good local filtering, provides join ordering for join to ‘B’ table </li></ul></ul></ul>INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXCPR01 Y U 21367 3 1.0 PRODUCT_ID 1391650 1391650 CONTRACTOR_ID 316 1794093 CO_DTHRCONTACT 1645213 2093750 IXCPR02 N D 14771 3 0.999 CONTRACTOR_ID 316 316 PRODUCT_ID 1391650 1794093 IXCPR03 N D 16188 3 0.998 CONTRACTOR_ID 316 316 CO_PHASECONTACT 4 783 PRODUCT_ID 1391650 1931232 IXCPR04 N D 15352 3 0.998 PREFERRED 3 3 CONTRACTOR_ID 316 552 PRODUCT_ID 1391650 1808887
  42. 42. Join indexes D SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘ FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXPDA05 Y P 44900 4 0.975 PRODUCT_ID 7058356 7058356 IXPDA02 N D 70586 4 0.868 PART_NUM 6132 6132 PRODUCT_ID 7058356 7058356 IXPDA06 N D 66590 4 0.975 PRODUCT_ID 7058356 7058356 PART_NUM 6132 7058356
  43. 43. Join indexes D <ul><li>‘D’ is accessed in multiple directions </li></ul><ul><ul><li>Via PART_NUM if ‘A’ is the outer </li></ul></ul><ul><ul><li>Via PRODUCT_ID if accessed through ‘C’ or ‘E’ </li></ul></ul><ul><li>Both join directions are supported by matching index access. </li></ul><ul><ul><li>PART_NUM is the leading column of IXPDA02 </li></ul></ul><ul><ul><li>PRODUCT_ID is the leading column of IXPDA05, IXPDA06 </li></ul></ul><ul><li>The non-primary key indexes are defined as allowing duplicates – but they cannot contain any. </li></ul><ul><ul><li>PRODUCT_ID is the primary key and is included in a unique index. </li></ul></ul><ul><ul><li>Any index which contains PRODUCT_ID is therefore unique. Defining it as unique would save some space in the index. Duplicate indexes have slightly larger control structures to allow for duplicate RIDs. </li></ul></ul><ul><ul><li>DB2 must allow for duplicates if the index is not explicitly defined as unique, since you could drop the unique index. </li></ul></ul>
  44. 44. Join indexes E <ul><li>Join access available through C and D tables </li></ul><ul><ul><li>Both tables join on PRODUCT_ID column </li></ul></ul><ul><li>Join is supported via IXPDA01 index </li></ul><ul><ul><li>PRODUCT_ID is the only column </li></ul></ul><ul><ul><li>Unique index (no fan-out when joining to this table) </li></ul></ul>SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 INDEX CLU UR NLEAF NLEVEL CR KEYCOLNAME COLCARDF MCARDF IXPPA01 N U 141499 4 0.609 PRODUCT_ID 21366326 21366326
  45. 45. Join fan-out SELECT COLS FROM PART A CARDF=17,598 QUALIFIED_ROWS=67.1 NPAGESF=1,467 , CONTRACTOR B CARDF=34,728 QUALIFIED_ROWS=77.8 NPAGESF=724 , CONT_PARTS C CARDF=2,093,750 QUALIFIED_ROWS=38,382 NPAGESF=52,189 , PARTS_PROD_ASMBLY D CARDF=7,058,356 QUALIFIED_ROWS=7,058,356 NPAGESF=68,644 , PARTS_PROD_ASM_DTL E CARDF=21,366,326 QUALIFIED_ROWS=21,366,320 NPAGESF=1,236,490 WHERE A.COUNTRY_CD = ? COLCARDF=208 MAX_FREQ=36.408% FF=0.005 ‘FR’ = 36.4% ‘GB’ = 17% ‘DE’=10% AND A.PART_CD = ? COLCARDF=5 MAX_FREQ=47.199% FF=0.2 4 = 47%, 2 = 27%, 6 = 17%, 1 = 8%, blank = < 1% AND A.PART_TYPE IN ( 'F', 'I', 'P' ) COLCARDF=8 MAX_FREQ=79.867% FF=0.958 AND B.PART_NUM = ? COLCARDF=260 MAX_FREQ=3.032% FF=0.004 AND B.SUB_CONTRACTOR = 'Y' COLCARDF=2 MAX_FREQ=87.402% FF=0.126 LOW2KEY=N HIGH2KEY=Y AND B.SUSPENDED = 'N' COLCARDF=2 MAX_FREQ=99.833% FF=0.998 LOW2KEY=N HIGH2KEY=Y AND C.PREFERRED = 'Y' COLCARDF=3 MAX_FREQ=76.832% FF=0.018 AND B.CONTRACTOR_ID = C.CONTRACTOR_ID COLCARDF=1,047/316 FF=9.551E-4 AND D.PART_NUM = A.PART_NUM COLCARDF=6,132/17,598 FF=5.682E-5 AND C.PRODUCT_ID = D.PRODUCT_ID COLCARDF=1,391,650/7,058,356 FF=1.417E-7 AND C.PRODUCT_ID = E.PRODUCT_ID COLCARDF=1,391,650/21,366,326 FF=4.68E-8 AND E.PRODUCT_ID = D.PRODUCT_ID COLCARDF=21,366,326/7,058,356 FF=4.68E-8 <ul><li>Look at join fan-out issues </li></ul><ul><ul><ul><li>Qualified outer rows * (CARDF of inner / MAX(join COLCARDF)) </li></ul></ul></ul><ul><ul><li>A → D </li></ul></ul><ul><ul><ul><li>67.1 rows * (7,058,356 / 17,598) ~= 27,000 rows </li></ul></ul></ul><ul><ul><li>B → C or C → B </li></ul></ul><ul><ul><ul><li>77.8 rows * (2,093,750 / 1,047) ~= 155,500 rows (after local filtering on C, down to 38K) </li></ul></ul></ul><ul><ul><li>So B → C is expected to fan out far more. </li></ul></ul>
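The fan-out formula on this slide can be reproduced directly from the annotated statistics. The sketch below assumes nothing beyond the slide's formula; the function name `join_fanout` and its parameter names are hypothetical.

```python
def join_fanout(qualified_outer_rows, inner_cardf, outer_join_colcardf, inner_join_colcardf):
    """Estimated rows after joining outer to inner:
    qualified outer rows * (CARDF of inner / MAX(join COLCARDF))."""
    return qualified_outer_rows * inner_cardf / max(outer_join_colcardf, inner_join_colcardf)


# A -> D on PART_NUM: COLCARDF is 17,598 on 'A' and 6,132 on 'D'.
a_to_d = join_fanout(67.1, 7_058_356, 17_598, 6_132)

# B -> C on CONTRACTOR_ID: COLCARDF is 1,047 on 'B' and 316 on 'C'.
b_to_c = join_fanout(77.8, 2_093_750, 1_047, 316)

print(round(a_to_d))  # close to the slide's ~27,000
print(round(b_to_c))  # close to the slide's ~155,500, before local filtering on 'C'
```

The estimate for B → C is taken before the local predicate on C.PREFERRED is applied, which matches how the slide presents it.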
  46. 46. Explain <ul><li>Join sequence </li></ul><ul><ul><li>Access ‘A’ via index IXPRT01 ( PART_CD, COUNTRY_CD , …) ~67 rows </li></ul></ul><ul><ul><li>Nested loop join to ‘D’ using index IXPDA02 ( PART_NUM , PRODUCT_ID) ~27,000 rows </li></ul></ul><ul><ul><li>Sort merge join to ‘C’ </li></ul></ul><ul><ul><ul><li>Sorting composite into PRODUCT_ID sequence </li></ul></ul></ul><ul><ul><ul><li>Access ‘C’ via IXCPR04 ( PREFERRED, CONTRACTOR_ID ) </li></ul></ul></ul><ul><ul><ul><li>Sorting new table into PRODUCT_ID sequence ~7,900 rows </li></ul></ul></ul><ul><ul><li>Nested loop join to ‘B’ via index IXCTR01 ~7,900 rows </li></ul></ul><ul><ul><ul><li>( PART_NUM , CONTRACTOR_ID , CONT_TYPE) </li></ul></ul></ul><ul><ul><li>Nested loop join to ‘E’ via index IXPDA01 ~7,900 rows </li></ul></ul><ul><ul><ul><li>( PRODUCT_ID ) </li></ul></ul></ul><ul><li>Blue = local predicate </li></ul><ul><li>Green = join predicate </li></ul>
PLANNO METHOD MERGE_COLS TB_NAME ACCESS_TYPE ACCESS_NAME IX_ONLY SORTN_JOIN SORTC_JOIN MATCH_COLS
1 0 - PART I IXPRT01 N N N 2
2 1 - PARTS_PROD_ASSEMBLY I IXPPA02 Y N N 1
3 2 1 CONT_PARTS I IXPRD04 N Y Y 1
4 1 - CONTRACTOR I IXCTR01 N N N 2
5 1 - PARTS_PROD_ASM_DTL I IXPDA01 N N N 1
  47. 47. Issues – A as outer? <ul><li>Is the local filtering estimate for the ‘A’ table accurate? </li></ul><ul><ul><li>There is skew, but use of parameter markers precludes recognition of skew </li></ul></ul><ul><ul><li>Qualified rows and fan-out could be much worse than estimated </li></ul></ul><ul><ul><li>‘A’ as outer could be underestimated, depending on which values are used </li></ul></ul><ul><li>Sort merge join to ‘C’ to avoid 27K probes </li></ul><ul><ul><li>Does not want to probe 27K times, matching plus fan-out on PRODUCT_ID </li></ul></ul><ul><ul><li>Uses efficient local index instead </li></ul></ul><ul><ul><ul><li>1 probe to scan 38K rows via PREFERRED </li></ul></ul></ul><ul><ul><ul><li>Instead of 27K probes * 2 rows per inner via index on PRODUCT_ID </li></ul></ul></ul><ul><ul><li>Index on PREFERRED, PRODUCT_ID would likely avert SMJ in this context </li></ul></ul><ul><ul><li>Hesitant to recommend index, since A → D → C could be an inefficient sequence. </li></ul></ul>
  48. 48. Issues – B / C as outer? <ul><li>B as outer </li></ul><ul><ul><li>Less skew on B.PART_NUM = ? – less uncertainty in cost estimate </li></ul></ul><ul><ul><li>Fan-out to 38K rows is discouraging </li></ul></ul><ul><ul><li>B → C supported by efficient local + equals index </li></ul></ul><ul><ul><ul><li>(PREFERRED, CONTRACTOR_ID, PRODUCT_ID) </li></ul></ul></ul><ul><li>C also a desirable outer </li></ul><ul><ul><li>Index on (PREFERRED, CONTRACTOR_ID, PRODUCT_ID) provides good local filter </li></ul></ul><ul><ul><li>Could access B via local filtering on B.PART_NUM = ?, materialize 77 rows into workfile for sort merge join </li></ul></ul>
  49. 49. Summary Query 2 <ul><li>Bottom line: </li></ul><ul><ul><li>Uniform distribution estimate on ‘A’ table allows it to compete very favorably. </li></ul></ul><ul><ul><li>If ‘FR’, ‘GB’, or ‘DE’ is used for COUNTRY_CD – ‘A’ as outer is no longer desirable. </li></ul></ul><ul><ul><ul><li>Are ‘FR’, ‘GB’, ‘DE’ values frequently used for this query? </li></ul></ul></ul><ul><ul><li>If PART_CD = ‘4’ is used frequently – ‘A’ as outer is no longer desirable. </li></ul></ul><ul><ul><ul><li>Is ‘4’ used frequently? </li></ul></ul></ul><ul><ul><li>Split query, REOPT, OPTHINTS… </li></ul></ul><ul><li>Multiple choices </li></ul><ul><ul><li>Local filtering spread across several tables </li></ul></ul><ul><ul><li>Estimated filtering looks good </li></ul></ul><ul><ul><li>Efficient access paths (index to support local, join predicates) exist </li></ul></ul><ul><ul><li>More difficult for optimizer to identify the cheapest path </li></ul></ul><ul><ul><li>Scenario more regression prone </li></ul></ul><ul><ul><li>Optimizer may need more statistics, or the ability to use more statistics (REOPT), to identify the cheapest path </li></ul></ul>
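To see why the skewed values matter, the uniform filter factor used for a parameter marker (1 / COLCARDF) can be compared with the actual frequency of the value. This is a rough sketch using the statistics quoted on the slides; the function names are hypothetical, and the ratio says only how far the single-predicate estimate is off, not what the optimizer's final row estimate would be.

```python
def uniform_ff(colcardf):
    """Filter factor for an equals predicate when values are assumed uniform."""
    return 1.0 / colcardf


def underestimate_ratio(actual_frequency, colcardf):
    """How many times more rows qualify than the uniform estimate predicts."""
    return actual_frequency / uniform_ff(colcardf)


# A.COUNTRY_CD = ?  COLCARDF=208, most frequent value 'FR' at 36.408%.
country_ratio = underestimate_ratio(0.36408, 208)

# A.PART_CD = ?  COLCARDF=5, most frequent value '4' at 47.199%.
part_cd_ratio = underestimate_ratio(0.47199, 5)

print(round(country_ratio, 1))  # about 75.7 times the uniform estimate
print(round(part_cd_ratio, 2))  # about 2.36 times the uniform estimate
```

A ratio this large on COUNTRY_CD is what turns ‘A’ from an attractive outer into a poor one when ‘FR’ is supplied at run time.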
  50. 50. Commentary <ul><li>How to perform SQL analysis </li></ul><ul><ul><li>Format query so it’s readable </li></ul></ul><ul><ul><li>Annotate with important statistics </li></ul></ul><ul><ul><ul><li>Tables: </li></ul></ul></ul><ul><ul><ul><ul><li>Table cardinality, NPAGES, qualified number of rows </li></ul></ul></ul></ul><ul><ul><ul><li>Predicates: </li></ul></ul></ul><ul><ul><ul><ul><li>COLCARDF, LOW2KEY, HIGH2KEY, filter factor estimate </li></ul></ul></ul></ul><ul><ul><ul><li>Are table level estimates reasonable based on your knowledge? </li></ul></ul></ul><ul><ul><ul><ul><li>If you don’t know – perform counts to find out if estimates are accurate </li></ul></ul></ul></ul><ul><ul><ul><ul><li>If you don’t know how selective things are, how will you know what the best path should be? </li></ul></ul></ul></ul><ul><ul><ul><li>Are predicate level filtering estimates reasonable? </li></ul></ul></ul><ul><ul><li>Reference table, index, indexed columns report </li></ul></ul><ul><ul><ul><li>Is the best local filtering supported through matching index access? </li></ul></ul></ul><ul><ul><ul><li>Is any mis-estimated local filtering also matching indexable? (May cause one path to look far more efficient than it really is) </li></ul></ul></ul><ul><ul><ul><li>Are there trailing join predicates to provide order to the next desired table? (bonus) </li></ul></ul></ul><ul><ul><ul><li>Is there adequate (matching) index support for desired join sequences? </li></ul></ul></ul><ul><ul><li>Develop understanding of “plausible” and “desirable” access paths </li></ul></ul><ul><ul><li>Examine EXPLAIN output </li></ul></ul><ul><ul><ul><li>Does optimizer choose the path you expect? </li></ul></ul></ul><ul><ul><ul><li>If not, you should have a better understanding of what makes other access paths competitive, so tuning can be more targeted </li></ul></ul></ul><ul><ul><ul><ul><li>E.g., a certain predicate appears to filter, but does not. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Can use REOPT, or a trick, targeted to solve a specific problem. </li></ul></ul></ul></ul><ul><ul><ul><li>Skilled, targeted tuning is less likely to regress again than blind tuning (where the problem is not understood) </li></ul></ul></ul>
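The advice to “perform counts” when an estimate is in doubt can be sketched as a small helper that measures the actual filter factor of an equals predicate. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for DB2, the helper name `actual_filter_factor` is hypothetical, and the 100-row PART table with the placeholder value 'XX' is toy data, not the data behind the slides.

```python
import sqlite3


def actual_filter_factor(conn, table, column, value):
    """Fraction of rows in `table` where `column` = `value`.
    Table and column names are interpolated into the SQL, so pass
    only trusted identifiers; the value is bound as a parameter."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    matched = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()[0]
    return matched / total if total else 0.0


# Toy data: 36 'FR' rows, 17 'GB' rows, 47 rows with a placeholder code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PART (PART_NUM INTEGER, COUNTRY_CD TEXT)")
rows = (
    [(i, "FR") for i in range(36)]
    + [(i, "GB") for i in range(36, 53)]
    + [(i, "XX") for i in range(53, 100)]
)
conn.executemany("INSERT INTO PART VALUES (?, ?)", rows)

ff = actual_filter_factor(conn, "PART", "COUNTRY_CD", "FR")
print(ff)  # 0.36, far above a uniform 1 / COLCARDF estimate
```

Comparing a measured value like this with the annotated FF on each predicate shows quickly which estimates deserve attention.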